The Data Engineering Bootcamp: Zero to Mastery
Learn how to build streaming pipelines with Apache Kafka and Flink, create data lakes on AWS, run ML workflows on Spark, and integrate LLMs into production systems. This course is designed to kickstart your career and make you a sought-after data engineer.
Why is Data Engineering becoming a major profession in IT?
Data Engineering is rapidly becoming one of the fastest-growing and most in-demand professions in tech. With the rise of AI products, analytical systems, and real-time applications, companies are actively building out their data infrastructure, which drives demand for these specialists.
Just last year, more than 20,000 new data engineer positions were created, and the total number of open positions in North America approached 150,000, clearly demonstrating the explosive growth of the industry.
Moreover, the salaries are impressive:
- Entry level - $80,000 to $110,000 per year
- Mid and senior level - $190,000 to $200,000+ per year
Furthermore, data engineers play a strategic role: they build the foundation for machine learning systems, analytics, and AI, without which modern tech products are impossible. With the further growth of AI, the demand for data engineers will only increase, creating excellent opportunities for a long-term career and financial stability.
Why this particular bootcamp?
Our bootcamp is designed to be as comprehensive and practical as possible, without unnecessary theory or outdated tutorials. You will learn step by step and build real projects using the same tools that professionals use.
You will start with Apache Spark, processing real Airbnb data and mastering large-scale computation. Then you will create a modern data lake on AWS using S3, EMR, Glue, and Athena. You will learn pipeline orchestration with Apache Airflow, build streaming systems on Kafka and Flink, and even integrate machine learning and LLMs (large language models) directly into your pipelines.
As a result, you will learn to build end-to-end production-level systems - the exact skills employers are looking for.
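To give a concrete taste, here is a minimal PySpark sketch in the spirit of the first module. The file path and column names are illustrative placeholders, not the course's actual dataset:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start a local Spark session; the course later runs similar jobs on EMR.
spark = SparkSession.builder.appName("airbnb_demo").getOrCreate()

# Load a CSV of Airbnb listings. The path and columns are placeholders.
listings = spark.read.csv("data/listings.csv", header=True, inferSchema=True)

# Average nightly price per neighbourhood, highest first.
(listings
    .groupBy("neighbourhood")
    .agg(F.avg("price").alias("avg_price"))
    .orderBy(F.desc("avg_price"))
    .show(10))
```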
What's inside the course?
- Introduction to Data Engineering: understand how modern data engineering works and what you need to get started.
- Big Data Processing with Apache Spark: work with large datasets using the DataFrame API, UDFs, aggregations, and job optimization.
- Building a Data Lake on AWS: build scalable data storage using S3, EMR, and Athena.
- Pipelines with Apache Airflow: automate and manage tasks, handle errors, and schedule and run Spark jobs.
- ML with Spark MLlib: embed machine learning into your pipelines, including classification, regression, and model selection.
- AI and LLMs in Data Engineering: use Hugging Face and other tools to integrate LLMs into data processing.
- Stream Processing with Apache Kafka and Flink: build real-time systems that process events and continuous streams.

The short, illustrative sketches below show roughly what several of these tools look like in practice; they are simplified examples, not the course's actual code.
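For the data lake module, Spark (assumed here to run on EMR, where S3 access is preconfigured) can write partitioned Parquet into S3, which Athena then queries through the Glue catalog. The bucket, database, and table names are hypothetical, and the listings DataFrame comes from the Spark sketch above:

```python
import boto3

# Write the listings as partitioned Parquet to S3 so Glue/Athena can
# prune partitions at query time. "my-data-lake" is a placeholder bucket.
(listings.write
    .mode("overwrite")
    .partitionBy("neighbourhood")
    .parquet("s3://my-data-lake/listings/"))

# Kick off an Athena query against the catalog table (names hypothetical).
athena = boto3.client("athena")
athena.start_query_execution(
    QueryString="SELECT neighbourhood, avg(price) FROM listings GROUP BY 1",
    QueryExecutionContext={"Database": "data_lake"},
    ResultConfiguration={"OutputLocation": "s3://my-data-lake/athena-results/"},
)
```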
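For the Airflow module, a pipeline is defined as a DAG of tasks. This minimal sketch assumes Airflow 2.x, and the task logic is a placeholder:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pulling data...")

def transform():
    print("cleaning data...")

# A minimal daily DAG; "schedule" is the Airflow 2.4+ spelling
# (older versions use schedule_interval).
with DAG(
    dag_id="example_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task  # run extract before transform
```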
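For the MLlib module, feature preparation and model training chain together into a single pipeline. This sketch reuses the hypothetical listings DataFrame from the Spark example; the feature columns are invented:

```python
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

# Split the (hypothetical) listings DataFrame into train and test sets.
train_df, test_df = listings.randomSplit([0.8, 0.2], seed=42)

# Assemble numeric columns into one feature vector, then fit a linear
# regression that predicts price. Column names are illustrative.
assembler = VectorAssembler(
    inputCols=["bedrooms", "bathrooms", "accommodates"],
    outputCol="features",
)
lr = LinearRegression(featuresCol="features", labelCol="price")

model = Pipeline(stages=[assembler, lr]).fit(train_df)
predictions = model.transform(test_df)  # adds a "prediction" column
```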
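For the LLM module, Hugging Face's transformers library wraps model download and inference behind a one-line pipeline. The review texts here are invented:

```python
from transformers import pipeline

# Score guest reviews with a sentiment-analysis pipeline; the default
# model is downloaded on first use.
sentiment = pipeline("sentiment-analysis")

reviews = ["Great place, spotless and quiet!", "The heating never worked."]
for review, result in zip(reviews, sentiment(reviews)):
    print(f'{result["label"]} ({result["score"]:.2f}): {review}')
```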
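For the streaming module, the Kafka side can be sketched with the kafka-python client (the course also covers Flink for the processing side). The broker address and topic name are placeholders:

```python
import json
from kafka import KafkaProducer, KafkaConsumer

# Produce one JSON event (broker address and topic are placeholders).
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("orders", {"order_id": 1, "status": "created"})
producer.flush()

# Read the topic back from the beginning.
consumer = KafkaConsumer(
    "orders",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for message in consumer:
    print(message.value)
    break  # one record is enough for this demo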
Outcome
After completing the course, you won't just have watched some videos - you'll be a true data engineer, ready to build the systems companies need today.
Thousands of our graduates already work at Google, Tesla, Amazon, Apple, IBM, JP Morgan, Facebook, Shopify, and other top companies.
Many of them started from scratch. So why not become the next one?
Watch The Data Engineering Bootcamp: Zero to Mastery Online
# | Title | Duration |
---|---|---|
1 | The Data Engineering Bootcamp: Zero to Mastery | 01:35 |
2 | Introduction to Data Engineering | 04:17 |
3 | Who Are Data Engineers? | 04:43 |
4 | Prerequisites | 03:19 |
5 | Source Code for This Bootcamp | 01:19 |
6 | Plan for This Bootcamp | 04:38 |
7 | [Optional] What Is a Virtualenv? | 06:37 |
8 | [Optional] What Is Docker? | 11:03 |
9 | Introduction | 04:08 |
10 | Apache Spark | 03:44 |
11 | How Spark Works | 04:24 |
12 | Spark Application | 07:41 |
13 | DataFrames | 06:43 |
14 | Installing Spark | 05:51 |
15 | Inside Airbnb Data | 07:02 |
16 | Writing Your First Spark Job | 07:05 |
17 | Lazy Processing | 02:16 |
18 | [Exercise] Basic Functions | 01:29 |
19 | [Exercise] Basic Functions - Solution | 06:41 |
20 | Aggregating Data | 04:00 |
21 | Joining Data | 04:40 |
22 | Aggregations and Joins with Spark | 06:10 |
23 | Complex Data Types | 05:09 |
24 | [Exercise] Aggregate Functions | 00:50 |
25 | [Exercise] Aggregate Functions - Solution | 05:54 |
26 | User Defined Functions | 03:25 |
27 | Data Shuffle | 06:14 |
28 | Data Accumulators | 03:42 |
29 | Optimizing Spark Jobs | 07:39 |
30 | Submitting Spark Jobs | 04:29 |
31 | Other Spark APIs | 05:16 |
32 | Spark SQL | 04:33 |
33 | [Exercise] Advanced Spark | 02:10 |
34 | [Exercise] Advanced Spark - Solution | 05:26 |
35 | Summary | 03:08 |
36 | Introduction | 04:26 |
37 | What Is a Data Lake? | 09:08 |
38 | Amazon Web Services (AWS) | 07:47 |
39 | Simple Storage Service (S3) | 05:45 |
40 | Setting Up an AWS Account | 09:29 |
41 | Data Partitioning | 03:24 |
42 | Using S3 | 07:49 |
43 | EMR Serverless | 02:59 |
44 | IAM Roles | 02:52 |
45 | Running a Spark Job | 08:49 |
46 | Parquet Data Format | 07:41 |
47 | Implementing a Data Catalog | 05:32 |
48 | Data Catalog Demo | 06:42 |
49 | Querying a Data Lake | 04:00 |
50 | Summary | 03:39 |
51 | Introduction | 05:53 |
52 | What Is Apache Airflow? | 05:19 |
53 | Airflow’s Architecture | 03:15 |
54 | Installing Airflow | 06:33 |
55 | Defining an Airflow DAG | 08:03 |
56 | Error Handling | 03:38 |
57 | Idempotent Tasks | 04:54 |
58 | Creating a DAG - Part 1 | 04:58 |
59 | Creating a DAG - Part 2 | 04:42 |
60 | Handling Failed Tasks | 04:09 |
61 | [Exercise] Data Validation | 04:31 |
62 | [Exercise] Data Validation - Solution | 03:27 |
63 | Spark with Airflow | 03:02 |
64 | Using Spark with Airflow - Part 1 | 07:39 |
65 | Using Spark with Airflow - Part 2 | 05:52 |
66 | Sensors in Airflow | 04:46 |
67 | Using File Sensors | 04:08 |
68 | Data Ingestion | 05:50 |
69 | Reading Data from Postgres - Part 1 | 06:03 |
70 | Reading Data from Postgres - Part 2 | 05:40 |
71 | [Exercise] Average Customer Review | 03:53 |
72 | [Exercise] Average Customer Review - Solution | 04:33 |
73 | Advanced DAGs | 04:26 |
74 | Summary | 02:27 |
75 | Introduction | 05:28 |
76 | What Is Machine Learning? | 06:06 |
77 | Regression Algorithms | 05:38 |
78 | Building a Regression Model | 05:04 |
79 | Training a Model | 09:46 |
80 | Model Evaluation | 07:26 |
81 | Testing a Regression Model | 03:57 |
82 | Model Lifecycle | 02:12 |
83 | Feature Engineering | 08:44 |
84 | Improving a Regression Model | 07:34 |
85 | Machine Learning Pipelines | 03:56 |
86 | Creating a Pipeline | 02:41 |
87 | [Exercise] House Price Estimation | 01:59 |
88 | [Exercise] House Price Estimation - Solution | 03:12 |
89 | [Exercise] Imposter Syndrome | 02:57 |
90 | Classification | 07:37 |
91 | Classifiers Evaluation | 04:27 |
92 | Training a Classifier | 08:31 |
93 | Hyperparameters | 08:06 |
94 | Optimizing a Model | 03:02 |
95 | [Exercise] Loan Approval | 02:34 |
96 | [Exercise] Loan Approval - Solution | 02:33 |
97 | Deep Learning | 06:56 |
98 | Summary | 03:23 |
99 | Introduction | 05:07 |
100 | Natural Language Processing (NLP) before LLMs | 06:11 |
101 | Transformers | 06:21 |
102 | Types of LLMs | 07:40 |
103 | Hugging Face | 02:19 |
104 | Databricks Set Up | 10:38 |
105 | Using an LLM | 07:36 |
106 | Structured Output | 03:42 |
107 | Producing JSON Output | 05:10 |
108 | LLMs With Apache Spark | 05:20 |
109 | Summary | 02:48 |
110 | Introduction | 06:06 |
111 | What Is Apache Kafka? | 07:00 |
112 | Partitioning Data | 08:56 |
113 | Kafka API | 07:42 |
114 | Kafka Architecture | 03:15 |
115 | Set Up Kafka | 05:53 |
116 | Writing to Kafka | 06:07 |
117 | Reading from Kafka | 07:37 |
118 | Data Durability | 06:39 |
119 | Kafka vs Queues | 02:11 |
120 | [Exercise] Processing Records | 03:44 |
121 | [Exercise] Processing Records - Solution | 02:59 |
122 | Delivery Semantics | 05:53 |
123 | Kafka Transactions | 04:34 |
124 | Log Compaction | 03:23 |
125 | Kafka Connect | 06:59 |
126 | Using Kafka Connect | 09:44 |
127 | Outbox Pattern | 04:31 |
128 | Schema Registry | 08:01 |
129 | Using Schema Registry | 08:10 |
130 | Tiered Storage | 03:28 |
131 | [Exercise] Track Order Status Changes | 04:27 |
132 | [Exercise] Track Order Status Changes - Solution | 05:06 |
133 | Summary | 04:41 |
134 | Introduction | 05:40 |
135 | What Is Apache Flink? | 05:24 |
136 | Kafka Application | 08:11 |
137 | Multiple Streams | 03:11 |
138 | Installing Apache Flink | 05:46 |
139 | Processing Individual Records | 07:22 |
140 | [Exercise] Stream Processing | 04:02 |
141 | [Exercise] Stream Processing - Solution | 02:40 |
142 | Time Windows | 06:49 |
143 | Keyed Windows | 02:40 |
144 | Using Time Windows | 05:18 |
145 | Watermarks | 10:06 |
146 | Advanced Window Operations | 06:17 |
147 | Stateful Stream Processing | 07:50 |
148 | Using Local State | 04:42 |
149 | [Exercise] Anomaly Detection | 04:35 |
150 | [Exercise] Anomaly Detection - Solution | 03:34 |
151 | Joining Streams | 05:50 |
152 | Summary | 03:10 |
153 | Thank You! | 01:18 |
Similar courses to The Data Engineering Bootcamp: Zero to Mastery

- Business Intelligence with Excel (zerotomastery.io)
- PyTorch for Deep Learning with Python Bootcamp (Udemy)
- Data Platform & Pipeline Design (Andreas Kretz)
- Building APIs with FastAPI (Andreas Kretz)
- Complete linear algebra: theory and implementation (Udemy)
- Data Analysis for Beginners: Python & Statistics (zerotomastery.io)
- Data Analysis for Beginners: Excel & Pivot Tables (zerotomastery.io)
- Choosing Data Stores (Andreas Kretz)
- Becoming a Better Data Engineer (Andreas Kretz)
