Apache Iceberg Fundamentals
Why Iceberg?
Iceberg addresses long-standing problems in big data: slow queries, painful schema changes, and the tight coupling of storage and compute. You'll learn why companies like Netflix, Stripe, and Apple have chosen Iceberg for their platforms and how to apply the same approaches in your own setup.
What you will do:
- Build a local Iceberg-based Lakehouse lab using Docker Compose, Spark, a REST catalog, and MinIO.
- Create your first Iceberg table from a fun dataset (such as a Pokémon dataset): define the schema, write data with PySpark, and explore how Iceberg manages metadata.
- Master schema evolution (adding and renaming columns, changing column types) as well as advanced partitioning techniques.
- Learn to perform row-level operations (such as deleting rows) and use the "time travel" feature to query past versions of the data.
- Dive into Iceberg's architecture: Parquet files, manifests, snapshots, and catalogs.
- Use the MinIO UI to see how data and metadata are physically stored.
- Run analytical SQL queries on Iceberg tables through PySpark, using familiar operations like join, group by, and filter.
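
To give a feel for the schema evolution, row deletion, and time travel steps listed above, here is a minimal sketch of the kind of Spark SQL statements involved. It is not taken from the course: the table name `demo.pokedex.pokemon` and the column names are hypothetical, and the sketch only builds the statements as strings so it runs without a Spark cluster. In a lab like the one described, each string would typically be passed to `spark.sql(...)` on a session configured with the Iceberg extensions.

```python
# Hypothetical table and column names, chosen only for illustration;
# the course's actual catalog, namespace, and schema are not given.
TABLE = "demo.pokedex.pokemon"


def schema_evolution_statements(table):
    """Return Spark SQL DDL that adds, renames, and widens columns."""
    return [
        f"ALTER TABLE {table} ADD COLUMNS (generation int)",
        f"ALTER TABLE {table} RENAME COLUMN name TO pokemon_name",
        # Iceberg only allows safe type promotions, e.g. int -> bigint.
        f"ALTER TABLE {table} ALTER COLUMN hp TYPE bigint",
    ]


def delete_statement(table, predicate):
    """Return a row-level DELETE for rows matching the predicate."""
    return f"DELETE FROM {table} WHERE {predicate}"


def snapshots_query(table):
    """Return a query against the table's `snapshots` metadata table."""
    return (
        f"SELECT snapshot_id, committed_at, operation "
        f"FROM {table}.snapshots ORDER BY committed_at"
    )


def time_travel_query(table, snapshot_id):
    """Return a query that reads the table as of a given snapshot ID."""
    return f"SELECT * FROM {table} VERSION AS OF {int(snapshot_id)}"


if __name__ == "__main__":
    statements = schema_evolution_statements(TABLE)
    statements.append(delete_statement(TABLE, "hp < 10"))
    statements.append(snapshots_query(TABLE))
    # The snapshot ID below is a placeholder; real IDs come from the
    # snapshots query above.
    statements.append(time_travel_query(TABLE, 1234567890))
    for sql in statements:
        print(sql)
```

Reading a snapshot ID from the `snapshots` metadata table and then querying with `VERSION AS OF` is one common way to compare the table before and after a delete.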
About the Author: David Reger
David Reger is a Cloud Data Engineer at MSG Systems, where he develops scalable Lakehouse platforms based on Azure, Databricks, and open-source technologies such as Apache Spark and Iceberg. His experience spans IoT, data integration, and architecture design, enabling him to combine deep theoretical knowledge with practical approaches. David is passionate about helping engineers master modern data tools and sharing knowledge gained from real projects.
Watch Online: 12 lessons
| # | Lesson Title | Duration | Access |
|---|---|---|---|
| 1 | Intro Demo | 01:07 | |
| 2 | Goals | 01:03 | |
| 3 | Challenges | 04:10 | |
| 4 | Iceberg & Lakehouses | 01:42 | |
| 5 | Architecture Deep Dive | 02:02 | |
| 6 | Iceberg Features | 02:45 | |
| 7 | Architecture & Summary | 02:51 | |
| 8 | Setup & Docker | 03:31 | |
| 9 | Spark Iceberg Config | 02:31 | |
| 10 | Write data to Iceberg | 01:32 | |
| 11 | Inspect metadata & schema evolution | 08:41 | |
| 12 | Inspect data on MinIO & Outro | 01:37 | |