Fundamentals of Apache Airflow

2h 21m 18s
English
Paid

Course description

This practical course starts with the basics and guides you step by step through building real orchestration scenarios, from task retries to integrating with Spark and loading external data.

Moving data from point A to point B is only a small part of the job. The data must also be delivered accurately, reliably, and automatically, and this is where Apache Airflow comes in.

You will learn how to turn chaotic, manually configured pipelines into well-organized workflows. We'll begin with the architecture of Airflow and its key components, then move on to more advanced techniques: setting up retries, handling failures, using sensors, working with Apache Spark, and automatically loading data from external sources into a data lake.

The course suits both beginner data engineers and those who want to improve their orchestration skills. You will come away with practical tools for building scalable and reliable data processing systems.

All Course Lessons (27)

#   Lesson Title                                   Duration  Access
1   Introduction                                   07:20     Demo
2   What Is Apache Airflow?                        05:19
3   Airflow’s Architecture                         03:15
4   [Optional] What Is a Virtualenv?               06:37
5   [Optional] What Is Docker?                     11:03
6   Installing Spark                               05:51
7   Installing Airflow                             06:33
8   Defining an Airflow DAG                        08:03
9   Errors Handling                                03:38
10  Idempotent Tasks                               04:54
11  Creating a DAG - Part 1                        04:58
12  Creating a DAG - Part 2                        04:42
13  Handling Failed Tasks                          04:09
14  [Exercise] Data Validation                     04:31
15  [Exercise] Data Validation - Solution          03:27
16  Spark with Airflow                             03:02
17  Using Spark with Airflow - Part 1              07:39
18  Using Spark with Airflow - Part 2              05:52
19  Sensors In Airflow                             04:46
20  Using File Sensors                             04:08
21  Data Ingestion                                 05:50
22  Reading Data From Postgres - Part 1            06:03
23  Reading Data from Postgres - Part 2            05:40
24  [Exercise] Average Customer Review             03:53
25  [Exercise] Average Customer Review - Solution  04:33
26  Advanced DAGs                                  04:26
27  Let's Keep Learning Together!                  01:06

Similar courses

dbt for Data Engineers

Sources: Andreas Kretz
dbt (data build tool) is a data transformation tool with a priority on SQL. It allows for simple and transparent transformation, testing, and documentation...
1 hour 52 minutes 55 seconds
Statistics Bootcamp (with Python): Zero to Mastery

Sources: zerotomastery.io
Master statistics with Python through projects and quizzes. Learn with fun from industry experts. Ideal for careers in Data Analytics and Machine Learning.
20 hours 50 minutes 51 seconds
Getting Started with Embedded AI | Edge AI

Sources: udemy
You may have heard keywords such as Embedded AI, Embedded ML, and Edge AI. They all mean the same thing, i.e. to make an AI algorithm or model
3 hours 33 minutes 42 seconds
Machine Learning & Containers on AWS

Sources: Andreas Kretz
In this practical course, you will learn how to build a complete data pipeline on the AWS platform - from obtaining data from the Twitter API to analysis, stora
1 hour 33 minutes 34 seconds
Modern Data Warehouses & Data Lakes

Sources: Andreas Kretz
As a data engineer, you will regularly work with analytics platforms where companies store data in Data Lakes and Data Warehouses for building...
58 minutes 9 seconds