Azure Data Pipelines with Terraform

4h 20m 29s
English
Paid

Azure is becoming an increasingly popular platform for companies using the Microsoft 365 ecosystem. If you want to enhance your data engineering skills, working with Azure and automating infrastructure with Terraform are key competencies. That is why we created this course, "Azure ETL with Terraform".

Course Overview

In a practical project, you will learn how to build a comprehensive data processing solution in Azure, combining the capabilities of Terraform, Azure Data Factory, Synapse Analytics, and Power BI.

Automated ETL Process

You will create a fully automated ETL process:

  • Extract data from an external API
  • Process it using powerful Azure tools
  • Prepare the data for visualization

Data Architecture Implementation

During the course, you will implement Lakehouse and Medallion architecture (Bronze, Silver, Gold layers) to make your pipeline efficient and scalable.
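
To make the three layers concrete, here is a minimal Python sketch. It is not the course's implementation: the sample records, the field names, and the functions `to_silver` and `to_gold` are hypothetical, and the course builds these layers with Azure services rather than plain Python. The idea it illustrates is that Bronze keeps data as received, Silver holds cleaned and typed records, and Gold holds aggregates ready for reporting.

```python
from collections import defaultdict

# Bronze: hypothetical raw API payload, stored as received (strings, gaps and all).
bronze = [
    {"match_id": "1", "team": "Lions", "goals": "2"},
    {"match_id": "2", "team": "Lions", "goals": "1"},
    {"match_id": "3", "team": "Tigers", "goals": None},  # incomplete record
    {"match_id": "4", "team": "Tigers", "goals": "3"},
]


def to_silver(rows):
    """Silver: drop incomplete records and cast fields to proper types."""
    return [
        {"match_id": int(r["match_id"]), "team": r["team"], "goals": int(r["goals"])}
        for r in rows
        if r.get("goals") is not None
    ]


def to_gold(rows):
    """Gold: aggregate to a reporting-ready shape (total goals per team)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["team"]] += r["goals"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the Bronze data untouched means the Silver and Gold steps can be re-run later if the cleaning or aggregation rules change.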

By the end of the course, you will not only master the principles of building modern data pipelines and infrastructure automation but also gain a comprehensive practical project for your portfolio.

What You Will Learn in the Course

Introduction to Azure and Terraform

Get acquainted with Azure's role in the modern data landscape and key services for data engineers: Data Factory, Data Lake, and Synapse Analytics. Understand how Terraform helps manage infrastructure resources as code (IaC), making their creation and maintenance scalable and reliable.

Practical Setup

Install Terraform and configure it to work with Azure. Create a Service Principal, set up authentication for secure automated resource deployment, and prepare a working environment for resource management.
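
As a rough illustration of the authentication step, the sketch below checks that Service Principal credentials are present before Terraform runs. It assumes client-secret authentication through the `ARM_*` environment variables that the `azurerm` provider reads; the helper `missing_credentials` is hypothetical, and the course may configure credentials differently.

```python
import os

# Environment variables the azurerm provider reads for Service Principal
# (client secret) authentication.
REQUIRED_VARS = (
    "ARM_CLIENT_ID",
    "ARM_CLIENT_SECRET",
    "ARM_TENANT_ID",
    "ARM_SUBSCRIPTION_ID",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All Service Principal variables are set.")
```

Supplying the secret through the environment rather than writing it into `.tf` files keeps it out of version control.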

Basics of Terraform

Understand the structure of a Terraform project and learn the basic commands and principles of modular development. You will learn to:

  • Deploy Azure Data Factory for pipeline orchestration
  • Configure Azure Data Lake Storage for data storage (Bronze layer)
  • Deploy Synapse Analytics for data processing
  • Write reusable and scalable code in Terraform

Real Deployment

Start deploying pipeline components: connect Azure Data Factory to an external Soccer API for data loading, and configure Azure Data Lake for storing raw data. You will learn to combine manual and automated approaches as done in real projects.

CI/CD for Infrastructure

Understand how to apply CI/CD principles to infrastructure using Terraform and Azure DevOps. You will learn:

  • Continuous Integration (CI): automatic build, testing, and code verification
  • Continuous Deployment (CD): automatic infrastructure deployment and application updates
  • Terraform integration: adding Terraform to CI/CD pipelines so that your deployments are stable, repeatable, and fast
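
The split between CI and CD can be sketched as two ordered lists of Terraform CLI commands. This is only an illustration in Python: the course implements its pipeline as Azure DevOps YAML, and the step grouping and the helpers `run_steps` and `dry_run` are assumptions made for this example. The commands themselves are standard Terraform CLI calls.

```python
import subprocess

# CI stage: checks that do not change any infrastructure.
CI_STEPS = [
    ["terraform", "fmt", "-check"],
    ["terraform", "init", "-input=false"],
    ["terraform", "validate"],
    ["terraform", "plan", "-input=false", "-out=tfplan"],
]

# CD stage: apply exactly the plan file that the CI stage produced.
CD_STEPS = [
    ["terraform", "apply", "-input=false", "tfplan"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; stop at the first failure."""
    for cmd in steps:
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        runner(cmd, check=True)


def dry_run(steps):
    """Return the commands as strings without executing them."""
    return [" ".join(cmd) for cmd in steps]


print("\n".join(dry_run(CI_STEPS + CD_STEPS)))
```

Applying the saved plan file, rather than planning again at deploy time, means the change that was reviewed in CI is the change that gets deployed.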

What's Next

In the next parts of the course, you will dive deeper into:

  • API integration (using the Soccer API as an example)
  • Advanced features of Azure Data Factory for batch data processing
  • Advanced data processing in Synapse Spark
  • Optimizing Lakehouse architecture for handling large volumes of data and team collaboration
  • Full automation of deployment pipelines for replicating infrastructure across different environments

Additional

https://github.com/team-data-science/azure-terraform

About the Author: Andreas Kretz

Andreas Kretz is a German data engineer and one of the most widely followed independent voices on data engineering as a career discipline. He runs the Plumbers of Data Science brand and has been publishing tutorial material continuously since the field consolidated around the modern lakehouse stack (Spark, Kafka, Snowflake, Databricks, Airflow).

His CourseFlix listing is the largest single-author catalog under this source: more than thirty courses spanning data-pipeline construction, streaming architectures, the cloud-native data stack on AWS, Azure, and GCP, the Python and Scala tooling that dominates the field, and the soft-skills and career side of breaking into data engineering. The material is paid and aimed at engineers transitioning into data work and at working data engineers picking up specific tools.

All Course Lessons (39)

#   Lesson Title                                    Duration  Access
1   Introduction                                    01:52     Demo
2   Software Setup                                  04:32
3   Introduction to Azure                           01:44
4   Managing Azure                                  10:52
5   Introduction to Terraform                       02:38
6   Terraform Setup on Azure                        03:49
7   Terraform Project Structure                     06:44
8   Terraform Commands                              09:03
9   Backend Deployment                              01:40
10  Terraform Modules                               09:39
11  Service Principal Deployment                    05:18
12  Why CI/CD                                       05:18
13  CI/CD Process Basics                            04:55
14  CI/CD Steps                                     05:28
15  CI/CD Workflow Example                          05:24
16  CI/CD Basics Summary                            01:23
17  Azure CI/CD Pipelines Terminology               10:22
18  Single YAML Pipeline Approach                   07:31
19  Azure DevOps & Azure Cloud setup                08:27
20  CI/CD Pipeline Implementation                   11:58
21  Pipeline Source Code explained & Job Analysis   14:08
22  Executing the CI/CD Pipeline                    02:20
23  API Introduction                                11:03
24  Azure Data Factory Introduction                 05:39
25  Azure Data Factory Components                   04:05
26  Working with Data Factory - 1                   04:47
27  Working with Data Factory - 2                   08:00
28  Working with Data Factory - 3                   10:38
29  Working with Data Factory - 4                   08:43
30  Introduction to Databricks                      06:27
31  Databricks Infrastructure Setup - 1             11:03
32  Databricks Infrastructure Setup - 2             04:22
33  Databricks Infrastructure Setup - 3             04:52
34  The Databricks User Interface                   08:10
35  End-To-End Pipeline Execution - 1               05:28
36  End-To-End Pipeline Execution - 2               04:34
37  End-To-End Pipeline Execution - 3               15:09
38  End-To-End Pipeline Execution - 4               06:41
39  End-To-End Pipeline Execution - 5               05:43

Frequently asked questions

What are the prerequisites for enrolling in this course?
Before enrolling, it's beneficial to have a basic understanding of cloud platforms, especially Azure, as the course involves working with Azure Data Factory, Synapse Analytics, and other Azure services. Familiarity with data engineering concepts and some experience with Terraform or infrastructure as code (IaC) will also help you follow along more effectively.
What practical skills will I gain from completing this course?
You will learn how to build a fully automated ETL process using Azure tools, implement Lakehouse and Medallion architecture to optimize data pipelines, and manage infrastructure with Terraform. These skills will be reinforced through a practical project that includes using Azure Data Factory, Synapse Analytics, and Power BI.
Who is the target audience for this course?
The course is designed for data engineers and IT professionals who want to enhance their skills in cloud-based data processing and infrastructure automation using Azure and Terraform. It's particularly suited for those working within the Microsoft 365 ecosystem or looking to expand their capabilities in data pipeline construction.
How does the scope of this course compare to other data engineering courses?
This course offers a focused exploration of building data pipelines in Azure using Terraform. Unlike broader data engineering courses, it specifically teaches the integration of Azure services like Data Factory and Synapse Analytics with Terraform's infrastructure automation capabilities, providing a project-based learning experience.
What specific tools and platforms will I use in this course?
The course utilizes Azure's Data Factory, Synapse Analytics, and Power BI for data processing and visualization. You'll also work extensively with Terraform for infrastructure management and Azure DevOps for continuous integration and deployment pipelines.
What topics are not covered in this course?
The course does not cover non-Azure cloud platforms such as AWS, or non-Terraform infrastructure-as-code tools such as Ansible. It also doesn't delve into non-data engineering aspects of Azure, such as Azure Machine Learning or Azure Kubernetes Service.
How much time should I expect to commit to this course?
The course consists of 39 lessons that cover various components of building and managing data pipelines in Azure. The total video runtime is about 4 hours 20 minutes; expect to spend additional time on practical exercises and project work.