Data Engineering on AWS
4h 46m 38s
English
Paid
This course is a starting point for those who want to learn cloud technologies and begin working with Amazon Web Services (AWS), one of the most popular platforms for data processing. It is especially useful for beginner data engineers and those seeking their first job in the field.
Throughout the course, you will create a fully-fledged end-to-end project based on data from an online store. Step by step, you will learn to model data, build pipelines, and work with key AWS tools: Lambda, API Gateway, Kinesis, DynamoDB, Redshift, Glue, and S3.
What to expect in the course:
- Data Work
  - Learn the structure and types of data you will be working with. Define the project goals, an important step for a successful implementation.
- Platform and Pipeline Design
  - Get acquainted with the platform architecture and design the pipelines: data ingestion, storage in S3 (Data Lake), processing in DynamoDB (NoSQL), and analysis in Redshift (Data Warehouse). Learn to build pipelines for interfaces and data streaming.
- Basics of AWS
  - Create an AWS account, understand access and security management (IAM), and get introduced to CloudWatch and the Boto3 library for working with AWS from Python.
- Data Ingestion Pipeline
  - Create an API via API Gateway, send data to Kinesis, configure IAM, and develop an ingestion pipeline in Python.
- Data Transfer to S3 (Data Lake)
  - Set up a Lambda function to receive data from Kinesis and save it to S3.
- Data Transfer to DynamoDB
  - Implement a pipeline for transferring data from Kinesis to DynamoDB, a fast NoSQL database.
- API for Data Access
  - Create an API for working with data in the database. Learn why giving a visualization tool direct access to the database is bad practice.
- Data Visualization in Redshift
  - Create a Redshift cluster, configure security, create tables, and set up Kinesis Firehose to send streaming data to Redshift. Connect Power BI to Redshift for data analysis.
- Batch Processing: AWS Glue, S3, and Redshift
  - Learn batch data processing: set up and run Glue to write data from S3 to Redshift, understand Crawlers and the Data Catalog, and learn to debug the process.
This course will help you gain practical experience in creating streaming and batch pipelines in AWS, as well as mastering key tools for working with cloud data.
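To give a feel for the streaming steps outlined above, here is a minimal sketch of a Lambda handler that reads records delivered by Kinesis. It is not code from the course. It assumes the ingestion side wrote each record as a JSON document, and the field names (`InvoiceNo`, `Quantity`) and the helper `make_test_event` are hypothetical. The only AWS-specific detail it relies on is that Lambda passes Kinesis records under `Records[*].kinesis.data` as base64-encoded strings. Writing to S3 or DynamoDB is left as a comment so the sketch runs without an AWS account.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis delivers the record body base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def make_test_event(items):
    """Hypothetical helper: build a minimal Kinesis-shaped event for local testing."""
    records = []
    for item in items:
        data = base64.b64encode(json.dumps(item).encode("utf-8")).decode("ascii")
        records.append({"kinesis": {"partitionKey": "demo", "data": data}})
    return {"Records": records}


def lambda_handler(event, context):
    rows = decode_kinesis_records(event)
    # Assumption: in a real pipeline, this is where each row would be
    # persisted, for example with boto3 to an S3 bucket or a DynamoDB table.
    return {"processed": len(rows)}


if __name__ == "__main__":
    event = make_test_event([{"InvoiceNo": "1", "Quantity": 2}])
    print(lambda_handler(event, None))
```

Keeping the decoding in its own function makes it possible to test the handler locally with a hand-built event before attaching it to a real stream.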
Watch Online Data Engineering on AWS
No. | Title | Duration |
---|---|---|
1 | Important: Before you start! | 00:31 |
2 | Introduction | 02:22 |
3 | Data Engineering | 04:16 |
4 | Data Science Platform | 05:21 |
5 | Data Types You Encounter | 03:04 |
6 | What Is A Good Dataset | 02:55 |
7 | The Dataset We Use | 03:17 |
8 | Defining The Purpose | 06:28 |
9 | Relational Storage Possibilities | 03:47 |
10 | NoSQL Storage Possibilities | 06:29 |
11 | Selecting The Tools | 03:50 |
12 | Client | 03:06 |
13 | Connect | 01:19 |
14 | Buffer | 01:30 |
15 | Process | 02:43 |
16 | Store | 03:42 |
17 | Visualize | 03:02 |
18 | Data Ingestion Pipeline | 03:01 |
19 | Stream To Raw Storage Pipeline | 02:20 |
20 | Stream To DynamoDB Pipeline | 03:10 |
21 | Visualization API Pipeline | 02:57 |
22 | Visualization Redshift Data Warehouse Pipeline | 05:30 |
23 | Batch Processing Pipeline | 03:20 |
24 | Create An AWS Account | 01:59 |
25 | Things To Keep In Mind | 02:46 |
26 | IAM Identity & Access Management | 04:08 |
27 | Logging | 02:23 |
28 | AWS Python API Boto3 | 02:58 |
29 | Development Environment | 04:03 |
30 | Create Lambda for API | 02:34 |
31 | Create API Gateway | 08:31 |
32 | Setup Kinesis | 01:39 |
33 | Setup IAM for API | 05:01 |
34 | Create Ingestion Pipeline (Code) | 06:10 |
35 | Create Script to Send Data | 05:47 |
36 | Test The Pipeline | 04:54 |
37 | Setup S3 Bucket | 03:43 |
38 | Configure IAM For S3 | 03:22 |
39 | Create Lambda For S3 Insert | 07:17 |
40 | Test The Pipeline | 04:02 |
41 | Setup DynamoDB | 09:01 |
42 | Setup IAM For DynamoDB Stream | 03:37 |
43 | Create DynamoDB Lambda | 09:21 |
44 | Create API & Lambda For Access | 06:11 |
45 | Test The API | 04:48 |
46 | Setup Redshift Data Warehouse | 08:09 |
47 | Security Group For Firehose | 03:13 |
48 | Create Redshift Tables | 05:52 |
49 | S3 Bucket & jsonpaths.json | 03:03 |
50 | Configure Firehose | 07:59 |
51 | Debug Redshift Streaming | 07:44 |
52 | Bug-fixing | 05:59 |
53 | Power BI | 12:17 |
54 | AWS Glue Basics | 05:15 |
55 | Glue Crawlers | 13:10 |
56 | Glue Jobs | 13:44 |
57 | Redshift Insert & Debugging | 07:17 |
58 | What We Achieved & Improvements | 10:41 |