Data Engineering on AWS

4h 46m 38s
English
Paid

This course is a starting point for those who want to learn cloud technologies and begin working with Amazon Web Services (AWS), one of the most widely used platforms for data processing. It is aimed at beginner data engineers and those seeking their first job in the field.

Throughout the course, you will create a fully-fledged end-to-end project based on data from an online store. Step by step, you will learn to model data, build pipelines, and work with key AWS tools: Lambda, API Gateway, Kinesis, DynamoDB, Redshift, Glue, and S3.


What to expect in the course:

  • Data Work
    • Learn the structure and types of data you will be working with, and define the project goals, an important step toward a successful implementation.
  • Platform and Pipeline Design
    • Get acquainted with the platform architecture and design the pipelines: data ingestion, raw storage in S3 (Data Lake), storage in DynamoDB (NoSQL), and analytics in Redshift (Data Warehouse). Learn to build pipelines for the access API and for data streaming.
  • Basics of AWS
    • Create an AWS account, understand Identity & Access Management (IAM), and get introduced to CloudWatch for logging and to Boto3, the AWS library for Python.
  • Data Ingestion Pipeline
    • Create an API via API Gateway, send data to Kinesis, configure IAM, and develop an ingestion pipeline in Python.
  • Data Transfer to S3 (Data Lake)
    • Set up a Lambda function to receive data from Kinesis and save it to S3.
  • Data Transfer to DynamoDB
    • Implement a pipeline that transfers data from Kinesis to DynamoDB, a fast NoSQL database.
  • API for Data Access
    • Create an API for reading data from the database. Learn why connecting a visualization tool directly to the database is bad practice.
  • Data Visualization in Redshift
    • Create a Redshift cluster, configure its security, and create the tables. Then set up Kinesis Firehose to deliver the streaming data to Redshift, and connect Power BI to Redshift for data analysis.
  • Batch Processing: AWS Glue, S3, and Redshift
    • Work through batch data processing: set up and run Glue jobs that write data from S3 to Redshift, understand Glue Crawlers and the Data Catalog, and learn to debug the process.
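
The ingestion step described above (a Lambda function behind API Gateway that forwards incoming data to Kinesis) might look roughly like the following minimal sketch. The stream name `example-ingest-stream`, the `order_id` field, and the injectable `kinesis_client` parameter are assumptions made for illustration and are not taken from the course; the only AWS call used is Boto3's `put_record`.

```python
import json


def build_kinesis_record(payload: dict, partition_field: str = "order_id") -> dict:
    """Turn one JSON payload into keyword arguments for Kinesis put_record.

    `order_id` is a hypothetical field name chosen for this sketch.
    """
    return {
        "Data": json.dumps(payload).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": str(payload.get(partition_field, "unknown")),
    }


def lambda_handler(event, context, kinesis_client=None,
                   stream_name="example-ingest-stream"):
    """Handle an API Gateway proxy POST and forward its body to Kinesis."""
    if kinesis_client is None:
        # Imported lazily so the sketch can be exercised without AWS access.
        import boto3
        kinesis_client = boto3.client("kinesis")

    try:
        payload = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
    if not isinstance(payload, dict):
        return {"statusCode": 400, "body": json.dumps({"error": "expected a JSON object"})}

    kinesis_client.put_record(StreamName=stream_name, **build_kinesis_record(payload))
    return {"statusCode": 200, "body": json.dumps({"status": "queued"})}
```

Passing the client in as a parameter is a design choice of this sketch: it keeps the handler testable with a stand-in object, while a deployed function would fall back to the real Boto3 client.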

This course will help you gain practical experience in creating streaming and batch pipelines in AWS, as well as mastering key tools for working with cloud data.
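
As an illustration of the streaming side, a Lambda function triggered by Kinesis that saves each record to S3 could be sketched as follows. The bucket name `example-raw-data-lake`, the `raw/` key prefix, and the injectable `s3_client` parameter are assumptions for this sketch, not details from the course. Kinesis delivers record payloads base64-encoded under `Records[].kinesis.data`, and the write uses Boto3's `put_object`.

```python
import base64
import json


def decode_kinesis_records(event: dict) -> list:
    """Decode the base64 payload of every record in a Kinesis-triggered event.

    Returns a list of (sequence_number, payload) tuples.
    """
    decoded = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        decoded.append((record["kinesis"]["sequenceNumber"], json.loads(raw)))
    return decoded


def lambda_handler(event, context, s3_client=None, bucket="example-raw-data-lake"):
    """Write each decoded Kinesis record to S3 as its own JSON object."""
    if s3_client is None:
        # Imported lazily so the sketch can be exercised without AWS access.
        import boto3
        s3_client = boto3.client("s3")

    written = []
    for sequence_number, payload in decode_kinesis_records(event):
        # The sequence number is unique within a shard, so it serves as a simple key.
        key = f"raw/{sequence_number}.json"
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=json.dumps(payload).encode("utf-8"),
        )
        written.append(key)
    return {"written": written}
```

Writing one object per record keeps the sketch short but produces many small files; grouping a batch of records into a single object is a common alternative.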

Watch Online Data Engineering on AWS

# Title Duration
1 Important: Before you start! 00:31
2 Introduction 02:22
3 Data Engineering 04:16
4 Data Science Platform 05:21
5 Data Types You Encounter 03:04
6 What Is A Good Dataset 02:55
7 The Dataset We Use 03:17
8 Defining The Purpose 06:28
9 Relational Storage Possibilities 03:47
10 NoSQL Storage Possibilities 06:29
11 Selecting The Tools 03:50
12 Client 03:06
13 Connect 01:19
14 Buffer 01:30
15 Process 02:43
16 Store 03:42
17 Visualize 03:02
18 Data Ingestion Pipeline 03:01
19 Stream To Raw Storage Pipeline 02:20
20 Stream To DynamoDB Pipeline 03:10
21 Visualization API Pipeline 02:57
22 Visualization Redshift Data Warehouse Pipeline 05:30
23 Batch Processing Pipeline 03:20
24 Create An AWS Account 01:59
25 Things To Keep In Mind 02:46
26 IAM Identity & Access Management 04:08
27 Logging 02:23
28 AWS Python API Boto3 02:58
29 Development Environment 04:03
30 Create Lambda for API 02:34
31 Create API Gateway 08:31
32 Setup Kinesis 01:39
33 Setup IAM for API 05:01
34 Create Ingestion Pipeline (Code) 06:10
35 Create Script to Send Data 05:47
36 Test The Pipeline 04:54
37 Setup S3 Bucket 03:43
38 Configure IAM For S3 03:22
39 Create Lambda For S3 Insert 07:17
40 Test The Pipeline 04:02
41 Setup DynamoDB 09:01
42 Setup IAM For DynamoDB Stream 03:37
43 Create DynamoDB Lambda 09:21
44 Create API & Lambda For Access 06:11
45 Test The API 04:48
46 Setup Redshift Data Warehouse 08:09
47 Security Group For Firehose 03:13
48 Create Redshift Tables 05:52
49 S3 Bucket & jsonpaths.json 03:03
50 Configure Firehose 07:59
51 Debug Redshift Streaming 07:44
52 Bug-fixing 05:59
53 Power BI 12:17
54 AWS Glue Basics 05:15
55 Glue Crawlers 13:10
56 Glue Jobs 13:44
57 Redshift Insert & Debugging 07:17
58 What We Achieved & Improvements 10:41
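
Lesson 49 refers to a `jsonpaths.json` file. Redshift's COPY command can use such a file to map JSON fields to table columns, with the paths listed in the same order as the columns. A minimal sketch of generating one is shown below; the helper name `build_jsonpaths` and the column names are hypothetical and not taken from the course.

```python
import json


def build_jsonpaths(columns: list) -> str:
    """Build the contents of a Redshift JSONPaths file for the given columns.

    The order of `columns` must match the column order of the target table.
    """
    paths = [f"$['{name}']" for name in columns]
    return json.dumps({"jsonpaths": paths}, indent=2)


# Hypothetical column names for an online-store table.
document = build_jsonpaths(["order_id", "customer_id", "total"])
print(document)
```

The resulting text would typically be uploaded to S3 next to the data so that the COPY statement can reference it.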

Similar courses to Data Engineering on AWS

  • Deep Learning A-Z™: Hands-On Artificial Neural Networks (Udemy)
    • Category: Python, Data processing and analysis
    • Duration: 22 hours 36 minutes 30 seconds
  • Case Study in Causal Analysis (LunarTech)
    • Category: Data processing and analysis
    • Duration: 2 hours 3 minutes 34 seconds
  • Data Engineering on Databricks (Andreas Kretz)
    • Category: Data processing and analysis
    • Duration: 1 hour 27 minutes 29 seconds
  • Complete Machine Learning and Data Science: Zero to Mastery (Udemy, zerotomastery.io)
    • Category: Data processing and analysis
    • Duration: 43 hours 22 minutes 23 seconds
  • Platform & Pipeline Security (Andreas Kretz)
    • Category: Data processing and analysis
    • Duration: 34 minutes 46 seconds
  • AWS AppSync & Amplify with React & GraphQL - Complete Guide (Udemy)
    • Category: React.js, AWS, GraphQL
    • Duration: 11 hours 11 minutes 36 seconds
  • Becoming a Better Data Engineer (Andreas Kretz)
    • Category: Data processing and analysis
    • Duration: 1 hour 46 minutes 10 seconds
  • Storing & Visualizing Time Series Data (Andreas Kretz)
    • Category: Data processing and analysis
    • Duration: 2 hours 11 minutes 34 seconds
  • dbt for Data Engineers (Andreas Kretz)
    • Category: Data processing and analysis
    • Duration: 1 hour 52 minutes 55 seconds