
Apache Airflow Workflow Orchestration

1h 18m 41s
English
Paid

Apache Airflow Workflow Orchestration is a 21-lesson, self-paced course by Andreas Kretz with a total runtime of 1 hour 18 minutes. Apache Airflow is a versatile, platform-independent tool for workflow orchestration, offering extensive capabilities for creating and monitoring both streaming and batch pipelines.

Course facts

Lessons: 21
Duration: 1 hour 18 minutes
Level: All levels
Language: English
Updated:
Instructor: Andreas Kretz
Price: Premium

Apache Airflow is a versatile, platform-independent tool for workflow orchestration, offering extensive capabilities for creating and monitoring both streaming and batch pipelines. With its comprehensive features, even the most complex processes can be implemented seamlessly. Airflow is supported by key platforms and tools in the Data Engineering world, such as AWS and Google Cloud.

Airflow not only provides scheduling and management of processes but also enables real-time tracking of job execution, allowing for swift identification and resolution of errors.

In brief: Airflow is currently one of the most in-demand and "hyped" tools in pipeline orchestration. It is widely adopted by companies globally, and knowledge of Airflow is fast becoming an essential skill for data engineers, particularly for students beginning their career in this field.

Basic Concepts of Airflow

This section introduces you to the fundamentals of working with Airflow. You will learn how DAGs (Directed Acyclic Graphs) are created, what they consist of (operators, tasks), and how the architecture of Airflow is structured, including the database, scheduler, and web interface. We will also examine examples of event-driven pipelines that can be implemented using Airflow.
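To make the idea of a DAG concrete, here is a minimal sketch that uses only Python's standard library, not Airflow itself. It models tasks as nodes and "must run before" relations as edges, then derives a valid execution order. The task names are hypothetical and only loosely follow an extract, transform, load flow.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a task; its set holds the tasks that must finish first.
# Task names are hypothetical examples, not taken from the course.
dependencies = {
    "fetch_weather": set(),
    "transform": {"fetch_weather"},
    "load_to_db": {"transform"},
    "notify": {"load_to_db"},
}


def execution_order(deps: dict) -> list:
    """Return one valid run order in which every task follows its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps: dict) -> bool:
    """A workflow graph is only a DAG if it contains no dependency cycle."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
print(is_acyclic({"a": {"b"}, "b": {"a"}}))
```

The "acyclic" part matters because a scheduler can only order tasks when no task depends, directly or indirectly, on itself.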

Installation and Environment Setup

In this practical module, you will work on a project involving weather data processing. The DAG will fetch data from a weather API, transform it, and store it in a Postgres database. You will gain skills in:

  • Configuring the environment using Docker;
  • Verifying the web interface and container operations;
  • Configuring the API and creating the necessary tables in the database.
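The transform and store steps of such a project can be sketched without any Airflow code. The example below is an illustration under stated assumptions: the payload shape and the table columns are hypothetical (a real weather API defines its own field names), and an in-memory SQLite database stands in for Postgres so the snippet runs without a server.

```python
import sqlite3

# Hypothetical API payload; a real weather API will use its own field names.
sample_payload = {
    "location": {"name": "Berlin"},
    "current": {"temp_c": 21.5, "humidity": 40, "last_updated": "2024-01-01 12:00"},
}

# Hypothetical table layout for the flattened readings.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS weather (
    city TEXT NOT NULL,
    observed_at TEXT NOT NULL,
    temp_c REAL,
    humidity INTEGER
)
"""


def transform(payload: dict) -> tuple:
    """Flatten the nested payload into one row matching the table columns."""
    current = payload["current"]
    return (
        payload["location"]["name"],
        current["last_updated"],
        current["temp_c"],
        current["humidity"],
    )


def store(conn: sqlite3.Connection, row: tuple) -> None:
    """Create the table if needed and insert one row."""
    conn.execute(CREATE_TABLE)
    conn.execute("INSERT INTO weather VALUES (?, ?, ?, ?)", row)
    conn.commit()


conn = sqlite3.connect(":memory:")  # SQLite stand-in for the Postgres database
store(conn, transform(sample_payload))
rows = conn.execute("SELECT city, temp_c FROM weather").fetchall()
print(rows)
```

Against Postgres the same structure applies, but the driver and placeholder syntax differ (psycopg2, for example, uses `%s` instead of `?`).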

Practice: Creating DAGs

In this hands-on practice session, you will delve into the Airflow interface and learn to monitor task statuses effectively. You will:

  • Create DAGs based on Airflow 2.0 that retrieve and process data;
  • Master the TaskFlow API, a modern approach to building DAGs with more convenient syntax;
  • Implement parallel task execution (fan-out) to run multiple processes simultaneously.
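A fan-out with the TaskFlow API might look like the sketch below. It is not the course's code: the DAG name, task names, and the hard-coded temperature are hypothetical, and the import path follows Airflow 2.x. The Airflow import is guarded so the plain conversion functions still run where Airflow is not installed.

```python
from datetime import datetime


def c_to_f(temp_c: float) -> float:
    """Convert Celsius to Fahrenheit."""
    return temp_c * 9 / 5 + 32


def c_to_k(temp_c: float) -> float:
    """Convert Celsius to Kelvin."""
    return temp_c + 273.15


try:
    from airflow.decorators import dag, task
except ImportError:
    # Airflow is not installed; only the plain functions above are usable.
    dag = task = None

if dag is not None:
    # No schedule argument is passed because its name differs between
    # Airflow releases (schedule_interval in 2.0, schedule from 2.4 onward).
    @dag(start_date=datetime(2024, 1, 1), catchup=False)
    def weather_fanout():
        @task()
        def fetch_temp_c() -> float:
            # Hypothetical stand-in for the API call made in the project.
            return 21.5

        @task()
        def to_fahrenheit(temp_c: float) -> float:
            return c_to_f(temp_c)

        @task()
        def to_kelvin(temp_c: float) -> float:
            return c_to_k(temp_c)

        temp = fetch_temp_c()
        # Fan-out: both tasks depend only on fetch_temp_c, so the scheduler
        # may run them in parallel if the executor has free slots.
        to_fahrenheit(temp)
        to_kelvin(temp)

    # Assign the DAG to a module-level name so Airflow can discover it.
    weather_fanout_dag = weather_fanout()
```

Passing the return value of one decorated task into another is what declares the dependency; the value itself travels between tasks via XCom.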

Who teaches Apache Airflow Workflow Orchestration? Andreas Kretz


Andreas Kretz is a German data engineer and one of the most widely followed independent voices on data engineering as a career discipline. He runs the Plumbers of Data Science brand and has been publishing tutorial material continuously since the field consolidated around the modern lake-house stack (Spark, Kafka, Snowflake, Databricks, Airflow).

His CourseFlix listing is the largest single-author catalog under this source — over thirty courses spanning data-pipeline construction, streaming architectures, the cloud-native data stack on AWS / Azure / GCP, the Python and Scala tooling that dominates the field, and the soft-skills / career side of breaking into data engineering. Material is paid and aimed at engineers transitioning into data work or already-working data engineers picking up specific tools.

What lessons are included in Apache Airflow Workflow Orchestration?

All Course Lessons (21)

 #  Lesson Title                                       Duration  Access
 1  Introduction                                       01:37     Demo
 2  Airflow Usage                                      03:20
 3  Fundamental Concepts                               02:48
 4  Airflow Architecture                               03:10
 5  Example Pipelines                                  04:50
 6  Spotlight 3rd Party Operators                      02:18
 7  Airflow XComs                                      04:33
 8  Project Setup                                      01:44
 9  Docker Setup Explained                             02:07
10  Docker Compose & Starting Containers               04:24
11  Checking Services                                  01:49
12  Setup WeatherAPI                                   01:34
13  Setup Postgres DB                                  01:59
14  Airflow Webinterface                               04:38
15  Creating DAG With Airflow 2.0                      09:47
16  Running our DAG                                    04:16
17  Creating DAG With TaskflowAPI                      07:00
18  Getting Data From the API With SimpleHTTPOperator  03:39
19  Writing into Postgres                              04:13
20  Parallel Processing                                04:16
21  Recap & Outlook                                    04:39


Frequently asked questions

What are the prerequisites for this course?
The course assumes a basic understanding of data engineering concepts and familiarity with Python programming. Prior experience with Docker is beneficial, as the course includes a module on Docker setup and using Docker Compose to start containers.
What projects will I build during the course?
You will work on a project involving weather data processing, where you will create a DAG to fetch data from a weather API, transform it, and store it in a Postgres database. This project helps in understanding the practical implementation of Airflow's capabilities.
Who is the target audience for this course?
The course is designed for aspiring data engineers and professionals looking to enhance their skills in workflow orchestration using Apache Airflow. It is particularly beneficial for those beginning their careers or transitioning into data engineering roles.
What specific platforms are covered in the course?
The hands-on project runs Airflow locally in Docker containers started with Docker Compose. AWS and Google Cloud are mentioned as key platforms that support Airflow, but the lesson list contains no dedicated cloud setup lessons.
How does the depth of this course compare to similar courses?
This course offers comprehensive coverage of Airflow, including its architecture, DAG creation, and real-time tracking of job execution. It includes practical modules on using third-party operators and parallel processing, providing a thorough grounding in Airflow's capabilities.
What is the estimated time commitment for this course?
The course comprises 21 lessons with a total runtime of 1 hour 18 minutes. Students should allocate additional time for practical exercises, particularly the hands-on project involving weather data processing, which requires setting up a Postgres database and Docker containers.
What topics are not covered in this course?
The course does not cover advanced data engineering topics beyond the scope of workflow orchestration with Apache Airflow, such as detailed machine learning model deployment or deep dive into cloud-specific data services beyond the basics of AWS and Google Cloud integration.