
Fundamentals of Apache Airflow

2h 21m 18s
English
Paid

Course description

This practical course starts with the basics and guides you step by step toward building real orchestration scenarios - from task retries to Spark integration and loading data from external sources.

Moving data from point A to point B is only a small part of the job. The data also has to arrive accurately, reliably, and automatically - and this is where Apache Airflow comes in.

You will learn how to turn chaotic, manually configured pipelines into well-organized workflows. We'll begin with Airflow's architecture and key components, then move on to more advanced techniques: setting up retries, handling failures, using sensors, working with Apache Spark, and automatically loading data from external sources into a data lake.
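
To give a flavor of what these pieces look like in practice, here is a minimal sketch of the kind of DAG covered in the course. It assumes Airflow 2.4+ with the bundled FileSensor and the default "fs_default" filesystem connection; the DAG id, file path, and ingestion function are illustrative placeholders, not code from the lessons.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.sensors.filesystem import FileSensor


def load_to_lake():
    # Placeholder for the ingestion logic developed in the lessons.
    print("loading the new file into the data lake")


with DAG(
    dag_id="daily_ingest",                     # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={
        "retries": 3,                          # retry a failed task up to 3 times
        "retry_delay": timedelta(minutes=5),   # wait 5 minutes between attempts
    },
) as dag:
    # Sensor: wait until an external system drops the expected file
    # (uses the default "fs_default" filesystem connection).
    wait_for_file = FileSensor(
        task_id="wait_for_file",
        filepath="/data/incoming/export.csv",  # illustrative path
        poke_interval=60,                      # re-check once per minute
    )

    # Task: load the file into the data lake once it appears.
    ingest = PythonOperator(
        task_id="ingest",
        python_callable=load_to_lake,
    )

    wait_for_file >> ingest
```

Running Spark jobs from a DAG follows the same pattern, with an operator such as SparkSubmitOperator from the Apache Spark provider taking the place of the PythonOperator.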

The course suits both beginner data engineers and those who want to sharpen their orchestration skills, giving you practical tools for building scalable and reliable data processing systems.

Watch Online

This is a demo lesson (10:00 remaining)

You can watch up to 10 minutes for free. Subscribe to unlock all 27 lessons in this course and access 10,000+ hours of premium content across all courses.


All Course Lessons (27)

#    Lesson Title                                       Duration    Access
1    Introduction                                       07:20       Demo
2    What Is Apache Airflow?                            05:19
3    Airflow’s Architecture                             03:15
4    [Optional] What Is a Virtualenv?                   06:37
5    [Optional] What Is Docker?                         11:03
6    Installing Spark                                   05:51
7    Installing Airflow                                 06:33
8    Defining an Airflow DAG                            08:03
9    Error Handling                                     03:38
10   Idempotent Tasks                                   04:54
11   Creating a DAG - Part 1                            04:58
12   Creating a DAG - Part 2                            04:42
13   Handling Failed Tasks                              04:09
14   [Exercise] Data Validation                         04:31
15   [Exercise] Data Validation - Solution              03:27
16   Spark with Airflow                                 03:02
17   Using Spark with Airflow - Part 1                  07:39
18   Using Spark with Airflow - Part 2                  05:52
19   Sensors in Airflow                                 04:46
20   Using File Sensors                                 04:08
21   Data Ingestion                                     05:50
22   Reading Data from Postgres - Part 1                06:03
23   Reading Data from Postgres - Part 2                05:40
24   [Exercise] Average Customer Review                 03:53
25   [Exercise] Average Customer Review - Solution      04:33
26   Advanced DAGs                                      04:26
27   Let's Keep Learning Together!                      01:06

Unlock unlimited learning

Get instant access to all 27 lessons in this course, plus thousands of other premium courses. One subscription, unlimited knowledge.

Learn more about subscription


Similar courses

Machine Learning & Containers on AWS

Sources: Andreas Kretz
In this practical course, you will learn how to build a complete data pipeline on the AWS platform - from obtaining data from the Twitter API to analysis...
1 hour 33 minutes 34 seconds
dbt for Data Engineers

Sources: Andreas Kretz
dbt (data build tool) is a data transformation tool with a priority on SQL. It allows for simple and transparent transformation, testing, and documentation...
1 hour 52 minutes 55 seconds
Statistics for Data Science and Business Analysis

Sources: udemy
Is statistics a driving force in the industry you want to enter? Do you want to work as a Marketing Analyst, a Business Intelligence Analyst, a Data Analyst...
4 hours 49 minutes 30 seconds
Azure Data Pipelines with Terraform

Sources: Andreas Kretz
Azure is becoming an increasingly popular platform for companies using the Microsoft 365 ecosystem. If you want to enhance your data engineering skills...
4 hours 20 minutes 29 seconds
Data Platform & Pipeline Design

Sources: Andreas Kretz
Data pipelines are a key component of any Data Science platform. Without them, data loading and machine learning model deployment would not be possible. This...
1 hour 59 minutes 5 seconds