Apache Iceberg Fundamentals
33m 32s
English
Paid
Course description
Modern data platforms need the flexibility of data lakes and the reliability of warehouses. Apache Iceberg combines both approaches. In this course, you will understand how this powerful open table format works, study its architecture, and learn to use its key features: schema evolution, "time travel," and high-performance analytics in Lakehouse systems.
The course is built on practical data engineering examples. You will set up a local lab with Docker, Spark, and MinIO, then create and manage Iceberg tables. From writing data and inspecting metadata to query optimization and partition restructuring, you will gain the experience needed to work confidently with Iceberg in production.
By the end of the course, you will not only understand how Iceberg is structured internally but also have a working environment, ready-made notebooks for projects, and a deep understanding of table operations that are critically important for Lakehouse architecture.
Why Iceberg?
Iceberg addresses long-standing issues of big data: slow queries, complex schema changes, and the tight coupling of storage with computing systems. You'll learn why companies like Netflix, Stripe, and Apple have chosen Iceberg for their platforms and how to apply these approaches in your own setup.
What you will do:
- Build a local Lakehouse lab based on Iceberg using Docker Compose, Spark, REST catalog, and MinIO.
- Create your first Iceberg table using a fun dataset (like one with Pokémon), define the schema, write data through PySpark, and explore how Iceberg manages metadata.
- Master schema evolution: adding and renaming columns, changing column types, and applying advanced partitioning techniques.
- Learn to perform row-level operations (such as deleting rows) and use the "time travel" feature to query past versions of the data.
- Dive into Iceberg's architecture: Parquet files, manifests, snapshots, and catalogs.
- Use the MinIO UI to see how data and metadata are physically stored.
- Run analytical SQL queries on Iceberg tables through PySpark, using familiar operations like join, group by, and filter.
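The schema evolution, row deletion, and time travel items above map to ordinary Spark SQL statements on an Iceberg table. The following is a minimal sketch, not material from the course: the table name `demo.db.pokemon` and the column names are hypothetical, and the helpers only build SQL strings, so the block runs without a Spark installation.

```python
from typing import List, Optional

# Hypothetical catalog.namespace.table name, used for illustration only.
TABLE = "demo.db.pokemon"


def schema_evolution_statements(table: str) -> List[str]:
    """Build DDL for adding, renaming, and widening columns (column names are made up)."""
    return [
        f"ALTER TABLE {table} ADD COLUMN generation INT",
        f"ALTER TABLE {table} RENAME COLUMN hp TO hit_points",
        # Iceberg only allows safe type promotions, such as INT to BIGINT.
        f"ALTER TABLE {table} ALTER COLUMN attack TYPE BIGINT",
    ]


def delete_rows_statement(table: str, predicate: str) -> str:
    """Build a row-level delete; each committed delete creates a new snapshot."""
    return f"DELETE FROM {table} WHERE {predicate}"


def list_snapshots_query(table: str) -> str:
    """Query the `snapshots` metadata table to find snapshot ids and commit times."""
    return f"SELECT snapshot_id, committed_at, operation FROM {table}.snapshots"


def time_travel_query(
    table: str,
    snapshot_id: Optional[int] = None,
    as_of: Optional[str] = None,
) -> str:
    """Build a time travel read by snapshot id or by timestamp (exactly one is required)."""
    if (snapshot_id is None) == (as_of is None):
        raise ValueError("pass exactly one of snapshot_id or as_of")
    if snapshot_id is not None:
        return f"SELECT * FROM {table} VERSION AS OF {snapshot_id}"
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{as_of}'"


if __name__ == "__main__":
    for statement in schema_evolution_statements(TABLE):
        print(statement)
    print(delete_rows_statement(TABLE, "name = 'MissingNo'"))
    print(list_snapshots_query(TABLE))
    print(time_travel_query(TABLE, as_of="2024-01-01 00:00:00"))
```

In a PySpark session configured with an Iceberg catalog, each string could be passed to `spark.sql(...)`. The `VERSION AS OF` and `TIMESTAMP AS OF` clauses assume Spark 3.3 or newer.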
All Course Lessons (12)
| # | Lesson Title | Duration | Access |
|---|---|---|---|
| 1 | Intro Demo | 01:07 | |
| 2 | Goals | 01:03 | |
| 3 | Challenges | 04:10 | |
| 4 | Iceberg & Lakehouses | 01:42 | |
| 5 | Architecture Deep Dive | 02:02 | |
| 6 | Iceberg Features | 02:45 | |
| 7 | Architecture & Summary | 02:51 | |
| 8 | Setup & Docker | 03:31 | |
| 9 | Spark Iceberg Config | 02:31 | |
| 10 | Write data to Iceberg | 01:32 | |
| 11 | Inspect metadata & schema eval | 08:41 | |
| 12 | Inspect data on MinIO & Outro | 01:37 | |
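For orientation on the setup and Spark configuration lessons, here is a rough sketch of the catalog settings such a lab typically needs. It is not the course's actual configuration: the catalog name `demo`, the host names `rest` and `minio`, the ports, and the warehouse path are all assumptions that would have to match your own Docker Compose file. The property keys themselves are standard Iceberg Spark catalog options.

```python
from typing import Dict


def iceberg_spark_conf(
    catalog: str = "demo",                    # assumed catalog name
    rest_uri: str = "http://rest:8181",       # assumed REST catalog address
    warehouse: str = "s3://warehouse/",       # assumed bucket in MinIO
    s3_endpoint: str = "http://minio:9000",   # assumed MinIO address
) -> Dict[str, str]:
    """Return Spark properties that register an Iceberg REST catalog backed by S3-compatible storage."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Enables Iceberg-specific SQL such as stored procedures and partition DDL.
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the catalog under the chosen name.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
        f"{prefix}.warehouse": warehouse,
        # Reads and writes files through Iceberg's S3 client instead of Hadoop's.
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        f"{prefix}.s3.endpoint": s3_endpoint,
        # MinIO is usually addressed as host/bucket rather than bucket.host.
        f"{prefix}.s3.path-style-access": "true",
    }


if __name__ == "__main__":
    # Print the settings in spark-defaults.conf style for inspection.
    for key, value in iceberg_spark_conf().items():
        print(f"{key} {value}")
```

With PySpark installed, each pair could be applied through `SparkSession.builder.config(key, value)` before calling `getOrCreate()`. The Iceberg Spark runtime and AWS bundle jars, along with the MinIO credentials, would also need to be available to Spark; how they are supplied depends on the lab setup.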