Modern Data Warehouses & Data Lakes

Duration: 58m 9s
Language: English
Access: Paid
As a data engineer, you will regularly work with analytical platforms in which companies store data in Data Lakes and Data Warehouses to build visualizations and create machine learning models. Modern data warehouses such as AWS Redshift, Google BigQuery, and Snowflake can load data directly from files in a Data Lake, which makes them flexible and convenient for analytical tasks.

In this course you will learn:

  • How to use Data Lakes, Data Warehouses, and BI tools in a unified system
  • How to load data into Data Lakes and visualize it in reports
  • How to build integrations in Google Cloud Platform and AWS
  • How ETL/ELT architecture works and how to apply it in modern data warehouses

Basics of Data Warehouses and Data Lakes

  • The role of data warehouses in analytical platforms
  • How data is loaded into a Data Warehouse through ETL/ELT
  • What Data Lakes are and how to use them
  • How to work with files directly in the data lake
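
To make the ETL/ELT distinction in the list above concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the course's own material: a temporary directory stands in for the data lake, an in-memory SQLite database stands in for the warehouse, and the sample rows, table names, and function names are all hypothetical. ETL cleans the rows in application code before loading them; ELT loads the raw strings first and transforms them with SQL inside the warehouse.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample data standing in for a raw file in a data lake.
RAW_ROWS = [
    {"order_id": "1", "country": "de", "amount": "10.50"},
    {"order_id": "2", "country": "us", "amount": "4.00"},
    {"order_id": "3", "country": "de", "amount": "2.25"},
]
FIELDS = ["order_id", "country", "amount"]


def write_raw_file(lake_dir: Path) -> Path:
    """Write the sample rows as a CSV file into the stand-in lake directory."""
    path = lake_dir / "orders.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(RAW_ROWS)
    return path


def etl(path: Path, conn: sqlite3.Connection) -> None:
    """ETL: transform in application code, then load the cleaned rows."""
    conn.execute(
        "CREATE TABLE orders_etl (order_id INTEGER, country TEXT, amount REAL)"
    )
    with path.open(newline="") as f:
        cleaned = [
            (int(r["order_id"]), r["country"].upper(), float(r["amount"]))
            for r in csv.DictReader(f)
        ]
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def elt(path: Path, conn: sqlite3.Connection) -> None:
    """ELT: load raw strings first, then transform with SQL in the warehouse."""
    conn.execute(
        "CREATE TABLE orders_raw (order_id TEXT, country TEXT, amount TEXT)"
    )
    with path.open(newline="") as f:
        raw = [tuple(r[k] for k in FIELDS) for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", raw)
    conn.execute(
        """
        CREATE TABLE orders_elt AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               UPPER(country)            AS country,
               CAST(amount AS REAL)      AS amount
        FROM orders_raw
        """
    )


def revenue_by_country(conn: sqlite3.Connection, table: str) -> list:
    """Aggregate revenue per country from one of the two demo tables."""
    query = (
        f"SELECT country, SUM(amount) FROM {table} "
        "GROUP BY country ORDER BY country"
    )
    return conn.execute(query).fetchall()


def run_demo() -> tuple:
    """Run both pipelines on the same raw file and return both aggregates."""
    with tempfile.TemporaryDirectory() as tmp:
        path = write_raw_file(Path(tmp))
        conn = sqlite3.connect(":memory:")
        etl(path, conn)
        elt(path, conn)
        result = (
            revenue_by_country(conn, "orders_etl"),
            revenue_by_country(conn, "orders_elt"),
        )
        conn.close()
        return result


if __name__ == "__main__":
    etl_result, elt_result = run_demo()
    print(etl_result)
    print(elt_result)
```

Both pipelines end with the same aggregated table. The difference is where the transformation runs and whether the untouched raw data (`orders_raw` here) stays available in the warehouse for later reprocessing.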

Practice on GCP: Cloud Storage, BigQuery, and Data Studio

  • Setting up Cloud Storage, creating a table in BigQuery
  • Data visualization in Data Studio
  • Understanding the general principles of cloud platforms

Practice on AWS: S3, Athena, Glue, and QuickSight

  • Creating a data integration through S3, Athena, and QuickSight
  • Setting up Glue Data Catalog for data management
  • Detailed setup and integration of Glue

Summary and bonus lesson: AWS Redshift Spectrum

  • Course summary
  • Additional module on working with Redshift Spectrum using the prepared Data Catalog from the AWS project

Required knowledge

  • Basics of working with Data Warehouses (it is recommended to take the "Data Warehouses" course in the academy)
  • Basic knowledge of AWS Athena and Redshift (for the block with Redshift Spectrum, a prepared Data Catalog from the AWS project is used)

This course will help you master modern approaches to building data storage and processing systems and learn how to effectively use the capabilities of Data Lakes and Data Warehouses for analytics.

Watch Online Modern Data Warehouses & Data Lakes

# Title Duration
1 Introduction 02:14
2 Data Science Platform 04:11
3 ETL & ELT Data Warehouse 06:23
4 Data Lake & Data Warehouse integration 03:30
5 GCP & AWS Pipelines we build 03:15
6 GCP hands on Cloud Storage & BigQuery 08:36
7 GCP hands on create Data Studio dashboard 07:34
8 GCP Recap & AWS goals 02:13
9 AWS Setup & upload data to S3 02:13
10 Athena Data Lake manual table configuration 03:49
11 Creating a QuickSight dashboard 05:06
12 Athena configuration using AWS Glue data catalog 03:30
13 Course recap 02:37
14 BONUS Configure Redshift Spectrum table with S3 02:58

Similar courses to Modern Data Warehouses & Data Lakes

Complete linear algebra: theory and implementation (udemy)
Category: Python, Data processing and analysis
Duration: 32 hours 53 minutes 26 seconds

Deep Learning A-Z™: Hands-On Artificial Neural Networks (udemy)
Category: Python, Data processing and analysis
Duration: 22 hours 36 minutes 30 seconds

Data Analysis for Beginners: Excel & Pivot Tables (zerotomastery.io)
Category: Data processing and analysis
Duration: 2 hours 10 minutes 21 seconds

Storing & Visualizing Time Series Data (Andreas Kretz)
Category: Data processing and analysis
Duration: 2 hours 11 minutes 34 seconds

The Data Science Course: Complete Data Science Bootcamp 2023 (udemy)
Category: Data processing and analysis
Duration: 31 hours 14 minutes 14 seconds

Time Series Analysis, Forecasting, and Machine Learning (udemy)
Category: Python, Data processing and analysis
Duration: 22 hours 47 minutes 45 seconds

Relational Data Modeling (Eka Ponkratova)
Category: Data processing and analysis
Duration: 1 hour 52 minutes

Machine Learning: Natural Language Processing in Python (V2) (udemy)
Category: Python, Data processing and analysis
Duration: 22 hours 4 minutes 2 seconds

Introduction to Data Engineering 2025 (Andreas Kretz)
Category: Data processing and analysis
Duration: 44 minutes 26 seconds

PyTorch for Deep Learning and Computer Vision (udemy)
Category: Data processing and analysis
Duration: 10 hours 20 minutes 51 seconds