Apache Airflow Workflow Orchestration

1h 18m 41s
English
Paid

Apache Airflow is a platform-independent workflow orchestration tool for building, scheduling, and monitoring data pipelines. Airflow itself is batch-oriented rather than a stream processor, but it can schedule and monitor jobs that feed or consume streaming systems. Complex processes can be expressed with it, and it integrates with key platforms and tools in the world of Data Engineering, including AWS, Google Cloud, and others.

Airflow lets you schedule and manage processes, track job execution in near real time, and quickly identify and resolve errors.

In brief: today, Airflow is one of the most in-demand and "hyped" tools in the field of pipeline orchestration. It is actively used by companies worldwide, and knowledge of Airflow is becoming an important skill for any data engineer. This is especially relevant for students starting their journey in this field.

Basic Concepts of Airflow

An introduction to the fundamentals of working with Airflow: you will learn how DAGs (Directed Acyclic Graphs) are created, what they consist of (operators and tasks), and how the Airflow architecture is structured, including the database, scheduler, and web interface. We will also look at examples of event-driven pipelines that can be implemented using Airflow.
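To make the "directed acyclic graph" idea concrete, here is a minimal sketch that uses only Python's standard library, not Airflow itself. The task names are hypothetical and loosely mirror the course project; the point is only that tasks run after everything they depend on, and that a cycle makes the graph invalid.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it depends on.
dependencies = {
    "fetch_weather": set(),
    "transform": {"fetch_weather"},
    "store_in_postgres": {"transform"},
}


def execution_order(deps):
    """Return one valid run order: every task comes after its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True when the dependency graph contains no cycle."""
    try:
        TopologicalSorter(deps).prepare()
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
print(is_acyclic({"a": {"b"}, "b": {"a"}}))  # two tasks waiting on each other
```

In Airflow the scheduler plays a comparable role: it works out which tasks are ready to run based on the dependencies declared in the DAG.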

Installation and Environment Setup

In practice, you will work on a project dealing with weather data processing. The DAG will fetch data from a weather API, transform it, and store it in a Postgres database. You will learn how to:

  • configure the environment using Docker;
  • verify the web interface and container operations;
  • configure the API and create the necessary tables in the database.
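The fetch, transform, and store steps can be sketched in plain Python before they are wired into a DAG. This is an illustration under stated assumptions, not the course's code: the payload shape, the table name `temperature`, and its columns are all hypothetical, and an in-memory SQLite database stands in for Postgres so the snippet runs without any services.

```python
import sqlite3

# Hypothetical payload; a real weather API will return different fields.
SAMPLE_RESPONSE = {
    "location": {"name": "Berlin"},
    "current": {"temp_c": 21.5, "last_updated": "2024-01-01 12:00"},
}


def transform(payload):
    """Flatten the nested payload into one row: (city, temperature, timestamp)."""
    return (
        payload["location"]["name"],
        float(payload["current"]["temp_c"]),
        payload["current"]["last_updated"],
    )


def create_table(conn):
    """Create the target table if it does not exist yet (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS temperature ("
        "city TEXT NOT NULL, temp_c REAL NOT NULL, observed_at TEXT NOT NULL)"
    )


def store(conn, row):
    """Insert one transformed row using a parameterized query."""
    conn.execute(
        "INSERT INTO temperature (city, temp_c, observed_at) VALUES (?, ?, ?)",
        row,
    )
    conn.commit()


# SQLite in memory is only a stand-in for the Postgres container.
conn = sqlite3.connect(":memory:")
create_table(conn)
store(conn, transform(SAMPLE_RESPONSE))
rows = conn.execute("SELECT city, temp_c, observed_at FROM temperature").fetchall()
print(rows)
```

Keeping each step a small function makes it straightforward to turn them into separate Airflow tasks later.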

Practice: Creating DAGs

You will get to know the Airflow interface in detail and learn to monitor task statuses. Then you will:

  • create DAGs based on Airflow 2.0 that retrieve and process data;
  • master the TaskFlow API, a modern approach to building DAGs with more convenient syntax;
  • implement parallel task execution (fan-out) to run multiple processes simultaneously.
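The fan-out pattern from the last bullet can be illustrated without Airflow: one upstream result is handed to several independent branches that run at the same time. The sketch below uses a standard-library thread pool as a stand-in; in Airflow, the configured executor decides how tasks actually run in parallel. The function names and readings are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch():
    """Upstream task: stand-in for the API call, returns hypothetical readings."""
    return {"city": "Berlin", "temp_c": 20.0, "wind_kph": 12.0}


def to_fahrenheit(reading):
    """Branch 1: convert the temperature to Fahrenheit."""
    return {"city": reading["city"], "temp_f": reading["temp_c"] * 9 / 5 + 32}


def to_wind_ms(reading):
    """Branch 2: convert the wind speed from km/h to m/s."""
    return {"city": reading["city"], "wind_ms": round(reading["wind_kph"] / 3.6, 2)}


def run_fan_out(reading, branches):
    """Run every branch on the same upstream result; collect outputs in branch order."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(branch, reading) for branch in branches]
        return [future.result() for future in futures]


results = run_fan_out(fetch(), [to_fahrenheit, to_wind_ms])
print(results)
```

The branches do not depend on each other, which is what makes them safe to run side by side; anything that needs both outputs would be a downstream task that waits for all branches to finish.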

Watch Online Apache Airflow Workflow Orchestration

#   Title                                              Duration
1   Introduction                                       01:37
2   Airflow Usage                                      03:20
3   Fundamental Concepts                               02:48
4   Airflow Architecture                               03:10
5   Example Pipelines                                  04:50
6   Spotlight 3rd Party Operators                      02:18
7   Airflow XComs                                      04:33
8   Project Setup                                      01:44
9   Docker Setup Explained                             02:07
10  Docker Compose & Starting Containers               04:24
11  Checking Services                                  01:49
12  Setup WeatherAPI                                   01:34
13  Setup Postgres DB                                  01:59
14  Airflow Web Interface                              04:38
15  Creating DAG With Airflow 2.0                      09:47
16  Running our DAG                                    04:16
17  Creating DAG With TaskFlow API                     07:00
18  Getting Data From the API With SimpleHttpOperator  03:39
19  Writing into Postgres                              04:13
20  Parallel Processing                                04:16
21  Recap & Outlook                                    04:39