Azure is becoming an increasingly popular platform for companies in the Microsoft 365 ecosystem. If you want to enhance your data engineering skills, working with Azure and automating infrastructure with Terraform are key competencies. That is why we created this course, "Azure ETL with Terraform".
Course Overview
In a practical project, you will learn how to build a comprehensive data processing solution in Azure, combining the capabilities of Terraform, Azure Data Factory, Synapse Analytics, and Power BI.
Automated ETL Process
You will create a fully automated ETL process:
- Extract data from an external API
- Process it using powerful Azure tools
- Prepare the data for visualization
Data Architecture Implementation
During the course, you will implement a Lakehouse based on the Medallion architecture (Bronze, Silver, and Gold layers) to make your pipeline efficient and scalable.
By the end of the course, you will not only master the principles of building modern data pipelines and infrastructure automation but also gain a comprehensive practical project for your portfolio.
What You Will Learn in the Course
Introduction to Azure and Terraform
Get acquainted with Azure's role in the modern data landscape and the key services for data engineers: Data Factory, Data Lake, and Synapse Analytics. Understand how Terraform lets you manage infrastructure as code (IaC), making resource creation and maintenance scalable and reliable.
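To give a feel for what "infrastructure as code" means in practice, here is a minimal Terraform sketch: a provider configuration and a single resource group. The resource group name, location, and provider version are illustrative placeholders, not values prescribed by the course.

```hcl
# Minimal Terraform configuration (names and versions are placeholders).
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# A resource group to hold the data platform resources described in this course.
resource "azurerm_resource_group" "etl" {
  name     = "rg-azure-etl-course"
  location = "westeurope"
}
```

Running `terraform plan` against a file like this shows exactly what will be created before anything changes in Azure, which is the core of the IaC workflow.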
Practical Setup
Install Terraform and configure it to work with Azure. Create a Service Principal, set up authentication for secure automated resource deployment, and prepare a working environment for resource management.
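One common way to wire a Service Principal into Terraform is to pass its credentials to the azurerm provider as variables (or via the equivalent `ARM_*` environment variables). The sketch below assumes the Service Principal has already been created, for example with the Azure CLI's `az ad sp create-for-rbac` command; all variable names here are illustrative.

```hcl
# Service Principal authentication for the azurerm provider.
# In practice these values often come from ARM_CLIENT_ID, ARM_CLIENT_SECRET,
# ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID environment variables instead.
provider "azurerm" {
  features {}

  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
  client_id       = var.client_id
  client_secret   = var.client_secret
}

variable "subscription_id" { type = string }
variable "tenant_id"       { type = string }
variable "client_id"       { type = string }

variable "client_secret" {
  type      = string
  sensitive = true # keep the secret out of plans and logs
}
```

Marking the secret as `sensitive` keeps it out of Terraform's console output, which matters once deployments run unattended in automation.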
Basics of Terraform
Understand the structure of a Terraform project and learn the basic commands and principles of modular development. You will learn to:
- Deploy Azure Data Factory for pipeline orchestration
- Configure Azure Data Lake Storage for data storage (Bronze layer)
- Deploy Synapse Analytics for data processing
- Write reusable and scalable code in Terraform
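The three deployments above can be sketched in Terraform roughly as follows. This assumes a resource group already managed in the same configuration (referenced here as `azurerm_resource_group.etl`); all resource names and the SQL credentials are placeholders.

```hcl
# Azure Data Factory for pipeline orchestration.
resource "azurerm_data_factory" "adf" {
  name                = "adf-etl-course"
  location            = azurerm_resource_group.etl.location
  resource_group_name = azurerm_resource_group.etl.name
}

# Storage account with hierarchical namespace enabled = Data Lake Storage Gen2.
resource "azurerm_storage_account" "lake" {
  name                     = "stetlcoursedata"
  resource_group_name      = azurerm_resource_group.etl.name
  location                 = azurerm_resource_group.etl.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  is_hns_enabled           = true
}

# The Bronze layer: a filesystem (container) for raw ingested data.
resource "azurerm_storage_data_lake_gen2_filesystem" "bronze" {
  name               = "bronze"
  storage_account_id = azurerm_storage_account.lake.id
}

# Synapse Analytics workspace for data processing, attached to the lake.
resource "azurerm_synapse_workspace" "synapse" {
  name                                 = "syn-etl-course"
  resource_group_name                  = azurerm_resource_group.etl.name
  location                             = azurerm_resource_group.etl.location
  storage_data_lake_gen2_filesystem_id = azurerm_storage_data_lake_gen2_filesystem.bronze.id
  sql_administrator_login              = "sqladminuser"
  sql_administrator_login_password     = var.synapse_sql_password # placeholder variable

  identity {
    type = "SystemAssigned"
  }
}
```

Splitting these resources into modules (one per service) is what makes the code reusable across environments, which is the modular-development principle this section introduces.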
Real Deployment
Start deploying pipeline components: connect Azure Data Factory to an external Soccer API for data ingestion, and configure Azure Data Lake to store the raw data. You will learn to combine manual and automated approaches, as is done in real projects.
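The Data Factory side of that connection can be expressed as a REST linked service in Terraform. The sketch below uses the provider's generic custom linked service resource; the API URL is a placeholder (the course's actual Soccer API endpoint and authentication details may differ).

```hcl
# A REST linked service pointing Data Factory at the external API.
# The URL and authentication type are illustrative placeholders.
resource "azurerm_data_factory_linked_custom_service" "soccer_api" {
  name            = "ls-soccer-api"
  data_factory_id = azurerm_data_factory.adf.id
  type            = "RestService"

  type_properties_json = jsonencode({
    url                               = "https://example-soccer-api.test/v1"
    enableServerCertificateValidation = true
    authenticationType                = "Anonymous"
  })
}
```

A copy activity in a Data Factory pipeline can then read from this linked service and land the responses in the Bronze container, which is the manual-plus-automated split this section describes.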
CI/CD for Infrastructure
Understand how to apply CI/CD principles for infrastructure using Terraform and Azure DevOps. Learn:
- Continuous Integration (CI): automatic build, testing, and code verification
- Continuous Deployment (CD): automatic infrastructure deployment and application updates
- Terraform in CI/CD pipelines: integrating Terraform so your deployments are stable, repeatable, and fast
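Those three ideas map onto an Azure DevOps pipeline roughly as follows. This YAML is a sketch, not the course's actual pipeline: the trigger branch, backend settings, and stage split are assumptions, and a real pipeline would also inject Service Principal credentials via a service connection or secret variables.

```yaml
# azure-pipelines.yml - illustrative Terraform CI/CD sketch.
trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

steps:
  - script: terraform init
    displayName: Initialize Terraform

  # CI: verify formatting and configuration validity on every push.
  - script: terraform fmt -check && terraform validate
    displayName: CI - verify code

  - script: terraform plan -out=tfplan
    displayName: CI - build execution plan

  # CD: apply the reviewed plan, only on the main branch.
  - script: terraform apply -auto-approve tfplan
    displayName: CD - deploy infrastructure
    condition: and(succeeded(), eq(variables['Build.SourceBranchName'], 'main'))
```

Applying a saved plan file (`tfplan`) rather than re-planning at deploy time guarantees that what was reviewed in CI is exactly what gets deployed in CD.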
What's Next
In the next parts of the course, you will dive deeper into:
- API integration (using the Soccer API as an example)
- Advanced features of Azure Data Factory for batch data processing
- Advanced data processing in Synapse Spark
- Optimizing Lakehouse architecture for handling large volumes of data and team collaboration
- Full automation of deployment pipelines for replicating infrastructure across different environments