Data Platform & Pipeline Design
Data pipelines are a key element of any data science platform. Without them, you can neither load data nor run machine learning models. This practical 170-minute course teaches you how to build streaming, batch, and ML pipelines using proven templates and examples for popular cloud platforms.
Basic Module
Fundamentals of Platforms and Pipelines
You will get acquainted with platform architectures and different types of pipelines. You will learn how they differ, how they work, what a machine learning pipeline looks like, and how to integrate them within a single system.
Platform Architecture and End-to-End Pipeline
You will understand the structure of a typical platform architecture: connection, buffering, processing, storage, and data visualization. By examining an end-to-end pipeline, you will learn how to apply this structure in your work.
Push and Pull Pipelines
You will understand the difference between the push and pull models of data transmission: the source sending data versus the consumer fetching it. The lessons include illustrative examples and diagrams.
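To make the distinction concrete, here is a minimal sketch in plain Python. The `Source` class and its `subscribe`, `emit`, and `fetch` methods are hypothetical names invented for this illustration, not part of the course material or of any library.

```python
from typing import Callable, Dict, List

Event = Dict[str, int]


class Source:
    """Hypothetical data source that supports both delivery models."""

    def __init__(self) -> None:
        self._log: List[Event] = []
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        """Register a consumer that wants events pushed to it."""
        self._subscribers.append(callback)

    def emit(self, event: Event) -> None:
        """Push model: the source delivers the event as soon as it occurs."""
        self._log.append(event)
        for callback in self._subscribers:
            callback(event)

    def fetch(self, offset: int) -> List[Event]:
        """Pull model: the consumer asks for everything from `offset` onward."""
        return self._log[offset:]


source = Source()

# Push: the consumer registers once and then simply receives events.
pushed: List[Event] = []
source.subscribe(pushed.append)
source.emit({"id": 1})
source.emit({"id": 2})

# Pull: the consumer decides when to ask and tracks its own position.
pulled = source.fetch(offset=0)
```

In the push case the source controls timing; in the pull case the consumer does, which is why pull consumers usually keep track of an offset or timestamp.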
Batch and Streaming Pipelines
This is one of the most important blocks for a data engineer. You will learn to distinguish and apply batch and streaming processing depending on the scenario.
Data Streams Visualization
You will learn how to visualize data from the processing and storage layers, even when you don't have direct access to them. An example with Apache Spark reinforces the material.
Lambda Architecture
You will learn how batch and stream pipelines are integrated within a single platform. This is especially important for ML, where models are trained on batch data and applied to streaming data.
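One common way to picture the Lambda architecture is a batch view recomputed from historical data, a speed view updated per event, and a query that merges the two. The sketch below assumes a simple per-user event count; `build_batch_view`, `SpeedLayer`, and `query` are hypothetical names chosen for this illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def build_batch_view(historical_events: Iterable[Dict[str, str]]) -> Counter:
    """Batch layer: recompute the full view from all stored events."""
    return Counter(event["user"] for event in historical_events)


class SpeedLayer:
    """Speed layer: update a small view incrementally as events arrive."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def on_event(self, event: Dict[str, str]) -> None:
        self.view[event["user"]] += 1


def query(user: str, batch_view: Counter, speed_view: Counter) -> int:
    """Serving layer: merge the batch result with the recent increments."""
    return batch_view[user] + speed_view[user]


# Events already stored when the last batch job ran.
historical = [{"user": "a"}, {"user": "a"}, {"user": "b"}]
batch_view = build_batch_view(historical)

# An event that arrived after the batch job finished.
speed = SpeedLayer()
speed.on_event({"user": "a"})

total_for_a = query("a", batch_view, speed.view)
```

The same split applies to ML: the batch side plays the role of training on accumulated data, while the speed side corresponds to applying the result to events as they arrive.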
Platform Examples
You will study architecture templates on AWS, GCP, Azure, and Hadoop, where you will see how tools like Lambda, API Gateway, and DynamoDB fit into the real infrastructure.
Advanced Module
Processing Models: Event-Driven, Batch, and Stream
You will understand the differences between event-driven, batch, micro-batch, and streaming processing. You will learn how to choose the appropriate processing type for tasks such as analytics, transactions, and reverse ETL.
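The difference between record-at-a-time streaming and micro-batching can be sketched with two generators. This is a simplified illustration: it groups by record count, whereas real engines often trigger micro-batches on a time interval. The function names and the doubling step are placeholders.

```python
from typing import Iterable, Iterator, List


def stream(events: Iterable[int]) -> Iterator[int]:
    """Streaming: handle each record on its own as soon as it arrives."""
    for event in events:
        yield event * 2  # placeholder for the real processing step


def micro_batch(events: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Micro-batching: collect a few records, then process them together."""
    batch: List[int] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield [x * 2 for x in batch]
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield [x * 2 for x in batch]


streamed = list(stream([1, 2, 3]))
batched = list(micro_batch([1, 2, 3, 4, 5], batch_size=2))
```

Streaming minimizes per-record latency, while micro-batching trades a little latency for processing several records in one step.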
Targeted Design and Platform Schema Replication
You will revisit the platform schema and learn to align business goals and data types with architectural solutions. Instead of choosing tools "by feel," you will learn to design the system starting from the task.
Modern Architectures: Lakehouse and Medallion
You will learn how Lakehouse combines file storage and transactional tables, and how bronze-silver-gold layers in the Medallion architecture help maintain order and scalability.
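As a rough illustration of the bronze-silver-gold idea, the sketch below moves a few made-up order records through three layers using plain Python lists. The field names, sample data, and cleaning rules are assumptions for this example; a real lakehouse would use transactional tables rather than in-memory lists.

```python
from typing import Dict, List

# Made-up raw input, as it might arrive from a source system.
raw_records = [
    {"order_id": "1", "amount": "10.50", "country": "de"},
    {"order_id": "2", "amount": "oops", "country": "DE"},
    {"order_id": "3", "amount": "4.50", "country": "fr"},
]


def to_bronze(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Bronze: store the raw data unchanged."""
    return [dict(r) for r in records]


def to_silver(bronze: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Silver: parse types, normalize values, drop unparseable rows."""
    silver = []
    for r in bronze:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # skip rows whose amount is not a number
        silver.append(
            {
                "order_id": int(r["order_id"]),
                "amount": amount,
                "country": r["country"].upper(),
            }
        )
    return silver


def to_gold(silver: List[Dict[str, object]]) -> Dict[str, float]:
    """Gold: business-level aggregate, here revenue per country."""
    totals: Dict[str, float] = {}
    for r in silver:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount"]
    return totals


bronze = to_bronze(raw_records)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping the raw bronze copy means the silver and gold layers can be rebuilt if the cleaning rules change.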
Machine Learning and Generative AI (GenAI)
You will learn how machine learning pipelines integrate into the platform: where training, inference, and deployment occur. You will also get acquainted with semantic search and Retrieval-Augmented Generation (RAG), the foundation of modern AI applications.
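The retrieval step behind semantic search and RAG can be sketched with cosine similarity over embedding vectors. In the example below the three-dimensional vectors are hand-written stand-ins for real embeddings, and `retrieve` and `build_prompt` are hypothetical helpers; a production system would use an embedding model and a vector store instead.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings": hand-picked vectors, not output of a real model.
documents: Dict[str, List[float]] = {
    "Batch pipelines process data on a schedule.": [0.9, 0.1, 0.0],
    "Streaming pipelines process events as they arrive.": [0.1, 0.9, 0.0],
    "The gold layer holds business-level aggregates.": [0.0, 0.1, 0.9],
}


def retrieve(query_vec: List[float], k: int = 1) -> List[str]:
    """Semantic search: return the k documents closest to the query vector."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, documents[doc]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, query_vec: List[float]) -> str:
    """RAG: put the retrieved text in front of the question for a generator."""
    context = "\n".join(retrieve(query_vec, k=1))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Assume this vector is the embedding of a question about streaming.
top_hit = retrieve([0.2, 0.8, 0.0])
prompt = build_prompt("How are events handled?", [0.2, 0.8, 0.0])
```

The generation step is omitted here: in a RAG setup the resulting prompt would be passed to a language model.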
Platform Testing
A brief but important module covering testing strategies for pipelines at every stage, from loading and processing to data transformation.
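As a small illustration of testing at two stages, the sketch below checks an ingestion validator and a transformation with plain assertions. The record layout and the function names `validate_ingested` and `normalize_temperature` are assumptions made up for this example.

```python
from typing import Dict


def validate_ingested(record: Dict[str, object]) -> bool:
    """Ingestion check: required fields exist and have the expected types."""
    return isinstance(record.get("sensor"), str) and isinstance(
        record.get("fahrenheit"), (int, float)
    )


def normalize_temperature(record: Dict[str, object]) -> Dict[str, object]:
    """Transformation: convert a Fahrenheit reading to Celsius."""
    celsius = round((record["fahrenheit"] - 32) * 5 / 9, 1)
    return {"sensor": record["sensor"], "celsius": celsius}


def test_ingestion_rejects_incomplete_record() -> None:
    assert not validate_ingested({"sensor": "s1"})


def test_transformation_converts_units() -> None:
    result = normalize_temperature({"sensor": "s1", "fahrenheit": 212})
    assert result == {"sensor": "s1", "celsius": 100.0}


# Run the checks directly; a test runner such as pytest would
# discover the test_ functions on its own.
test_ingestion_rejects_incomplete_record()
test_transformation_converts_units()
```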
This course will give you a comprehensive understanding of platforms and pipelines and will teach you how to build efficient architectures applicable to real cloud solutions. It is ideal for both beginner engineers and those who want to advance to the next level.
Watch Online Data Platform & Pipeline Design
# | Title | Duration |
---|---|---|
1 | Introduction & Contents | 03:14 |
2 | The Platform Blueprint | 10:12 |
3 | Data Engineering Tools Guide | 02:45 |
4 | End to End Pipeline Example | 06:19 |
5 | Push Ingestion Pipelines | 03:43 |
6 | Pull Ingestion Pipelines | 03:35 |
7 | Batch Pipelines | 03:08 |
8 | Streaming Pipelines | 03:35 |
9 | Stream Analytics | 02:27 |
10 | Lambda Architecture | 04:03 |
11 | Visualization Pipelines | 03:48 |
12 | Visualization with Hive & Spark on Hadoop | 06:22 |
13 | Visualization Data via Spark Thrift Server | 03:28 |
14 | Part 2 introduction | 01:17 |
15 | Core Use Cases in Platform Design: Transactions, Analytics, and Reverse ETL | 02:58 |
16 | Blueprint Recap: Mapping Tools Across the Modern Data Platform | 03:32 |
17 | Demystifying Event-Driven, Batch, and Streaming Workflows in Data Platforms | 08:11 |
18 | Micro-Batching vs. Streaming: What’s the Real Difference? | 04:56 |
19 | Connecting Sources to Goals: Batch and Stream Processing in a Data Platform | 06:29 |
20 | Building Blocks of a Modern Data Platform: Components, Storage, and Processing | 03:10 |
21 | Before the Tech: How Data and Goals Shape Your Data Platform | 10:10 |
22 | Lakehouse Architecture Explained: From Raw Files to Transactional Tables | 03:35 |
23 | How Machine Learning Fits into Data Platforms: Training, Inference, and Deployment | 06:24 |
24 | From Embeddings to Answers: Understanding Semantic Search and Retrieval-Augmented Generation | 06:07 |
25 | Testing in the Modern Data Platform: From Ingestion to Transformation | 03:11 |
26 | Understanding the Medallion Architecture: Bronze, Silver, and Gold Layers in Data Warehousing | 02:26 |
Read Book Data Platform & Pipeline Design
# | Title |
---|---|
1 | Hadoop Course Contents |
2 | GCP Course Contents.key |
3 | Platform & Pipeline Design questions |
4 | Tools Guide Academy |