Apache Airflow Workflow Orchestration

1h 18m 41s
English
Paid

Course description

Apache Airflow is a platform-independent workflow orchestration tool with extensive capabilities for creating and monitoring data pipelines. It is built primarily for batch workloads, although it is often run alongside streaming systems. Even complex processes can be implemented with it, and it integrates with key platforms and tools in the world of Data Engineering, including AWS, Google Cloud, and others.

Airflow lets you not only schedule and manage processes but also track job execution in real time and quickly identify and resolve errors.

In brief: today, Airflow is one of the most in-demand and "hyped" tools in the field of pipeline orchestration. It is actively used by companies worldwide, and knowledge of Airflow is becoming an important skill for any data engineer. This is especially relevant for students starting their journey in this field.

Basic Concepts of Airflow

Introduction to the fundamentals of working with Airflow: you will learn how DAGs (Directed Acyclic Graphs) are created, what they consist of (tasks, each defined by an operator), and how the architecture of Airflow is structured, including the metadata database, scheduler, and web interface. We will also look at examples of event-driven pipelines that can be implemented using Airflow.
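
To make the "directed acyclic graph" idea concrete, here is a minimal sketch that uses only Python's standard library, not Airflow itself. The task names are hypothetical and simply mirror the fetch, transform, and store steps of the course project. It shows how dependencies between tasks determine a valid run order, and why a cycle makes a pipeline impossible to schedule.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
# These task names are hypothetical, chosen to mirror the course project.
PIPELINE = {
    "fetch_weather": set(),
    "transform_weather": {"fetch_weather"},
    "store_in_postgres": {"transform_weather"},
}


def execution_order(dependencies: dict) -> list:
    """Return one valid run order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def is_acyclic(dependencies: dict) -> bool:
    """Return True if the graph has no cycles, i.e. it is a valid DAG."""
    try:
        TopologicalSorter(dependencies).prepare()
        return True
    except CycleError:
        return False


print(execution_order(PIPELINE))
print(is_acyclic({"a": {"b"}, "b": {"a"}}))  # two tasks waiting on each other
```

An orchestrator such as Airflow performs a similar ordering step before deciding which tasks are ready to run, but it also handles scheduling, retries, and state tracking, which this sketch leaves out.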

Installation and Environment Setup

In practice, you will work on a project dealing with weather data processing. The DAG will fetch data from a weather API, transform it, and store it in a Postgres database. You will learn how to:

  • configure the environment using Docker;
  • verify the web interface and container operations;
  • configure the API and create the necessary tables in the database.
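
The transform-and-store part of such a project can be sketched as follows. This is an illustration under stated assumptions, not the course's actual code: the payload shape, the `weather` table, and its columns are hypothetical, and the standard-library `sqlite3` module stands in for Postgres so the example runs without a database server.

```python
import sqlite3

# Hypothetical API response; the real weather API payload may differ.
SAMPLE_RESPONSE = {
    "location": {"name": "Berlin"},
    "current": {"temp_c": 18.5, "last_updated": "2024-01-01 12:00"},
}


def transform(payload: dict) -> tuple:
    """Flatten the nested payload into one row: (city, observed_at, temp_c)."""
    return (
        payload["location"]["name"],
        payload["current"]["last_updated"],
        payload["current"]["temp_c"],
    )


def store(conn: sqlite3.Connection, row: tuple) -> None:
    """Create the (hypothetical) target table if needed and insert one row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weather "
        "(city TEXT, observed_at TEXT, temp_c REAL)"
    )
    conn.execute("INSERT INTO weather VALUES (?, ?, ?)", row)
    conn.commit()


# In-memory SQLite database used as a stand-in for the Postgres container.
conn = sqlite3.connect(":memory:")
store(conn, transform(SAMPLE_RESPONSE))
rows = conn.execute("SELECT city, observed_at, temp_c FROM weather").fetchall()
print(rows)
```

Against a real Postgres instance the same two steps apply, but the connection, placeholder syntax, and column types would follow the Postgres driver in use.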

Practice: Creating DAGs

You will thoroughly understand the Airflow interface and learn to monitor task statuses. Then you will:

  • create DAGs based on Airflow 2.0 that retrieve and process data;
  • master the TaskFlow API, a modern approach to building DAGs with more convenient syntax;
  • implement parallel task execution (fan-out) to run multiple processes simultaneously.
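
As a rough illustration of the TaskFlow style and of fan-out, the sketch below defines a DAG in which two tasks depend on the same upstream task and can therefore run in parallel. It assumes Airflow 2.4 or later (for the `schedule` argument). The DAG name, task names, hard-coded reading, and `classify` rule are all hypothetical. The Airflow import is guarded so that the plain helper functions can still be run and tested where Airflow is not installed.

```python
from datetime import datetime

try:
    # Provided by Airflow 2.x; unavailable when Airflow is not installed.
    from airflow.decorators import dag, task
except ImportError:
    dag = task = None


def c_to_f(temp_c: float) -> float:
    """Convert Celsius to Fahrenheit."""
    return temp_c * 9 / 5 + 32


def classify(temp_c: float) -> str:
    """Hypothetical labelling rule that gives the second branch some work."""
    if temp_c < 10:
        return "cold"
    return "mild" if temp_c < 25 else "hot"


if dag is not None:

    @dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
    def weather_fanout_sketch():
        @task
        def extract():
            # Hard-coded stand-in for the weather API call.
            return {"city": "Berlin", "temp_c": 20.0}

        @task
        def to_fahrenheit(reading):
            return c_to_f(reading["temp_c"])

        @task
        def label(reading):
            return classify(reading["temp_c"])

        @task
        def load(temp_f, category):
            print(f"temp_f={temp_f}, category={category}")

        reading = extract()
        # Fan-out: both tasks depend only on extract(), so the scheduler
        # may run them in parallel; load() waits for both results.
        load(to_fahrenheit(reading), label(reading))

    weather_fanout_sketch()
```

Passing one task's return value into another is what creates the dependency here. Whether the two middle tasks actually run at the same time depends on the executor and its parallelism settings.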

All Course Lessons (21)

#   Lesson Title                                        Duration  Access
1   Introduction                                        01:37     Demo
2   Airflow Usage                                       03:20
3   Fundamental Concepts                                02:48
4   Airflow Architecture                                03:10
5   Example Pipelines                                   04:50
6   Spotlight 3rd Party Operators                       02:18
7   Airflow XComs                                       04:33
8   Project Setup                                       01:44
9   Docker Setup Explained                              02:07
10  Docker Compose & Starting Containers                04:24
11  Checking Services                                   01:49
12  Setup WeatherAPI                                    01:34
13  Setup Postgres DB                                   01:59
14  Airflow Webinterface                                04:38
15  Creating DAG With Airflow 2.0                       09:47
16  Running our DAG                                     04:16
17  Creating DAG With TaskflowAPI                       07:00
18  Getting Data From the API With SimpleHTTPOperator   03:39
19  Writing into Postgres                               04:13
20  Parallel Processing                                 04:16
21  Recap & Outlook                                     04:39
