
Machine Learning & Containers on AWS

1h 33m 34s
English
Paid

Machine Learning & Containers on AWS is a self-paced course by Andreas Kretz with 25 lessons and a total running time of 1 hour 33 minutes.

Course facts

Lessons: 25
Duration: 1 hour 33 minutes
Level: All levels
Language: English
Instructor: Andreas Kretz
Price: Premium

Embark on a comprehensive journey to build a complete data pipeline on the AWS platform. In this practical course, you will gain hands-on experience, from acquiring data with the Twitter API to analysis, storage, and visualization.

Course Overview

You will learn how to deploy a machine learning algorithm for sentiment analysis on AWS using Lambda. The course also covers setting up a Postgres database with Amazon RDS. For result visualization, you'll develop an interactive dashboard with Streamlit and learn to deploy it using Elastic Container Registry (ECR) and Elastic Container Service (ECS). Furthermore, you'll be introduced to the Poetry tool for effective project dependency management.

Course Structure

Twitter API Integration

The Twitter API provides an excellent gateway for accessing open data. You will learn to configure API access and retrieve tweets from a user's feed for further processing. Delve into API configuration details and understand the data format (payload) it returns.
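
As a rough illustration of the payload mentioned above, the sketch below parses a hand-written sample shaped like a Twitter API v1.1 user-timeline response and keeps only the fields a pipeline like this one would likely need. The sample values and the helper name extract_fields are invented for illustration; the payload used in the course may carry different or additional fields.

```python
import json

# Hand-written sample shaped like a v1.1 user-timeline response.
# All values are invented for illustration.
SAMPLE_PAYLOAD = """
[
  {"id": 1, "created_at": "Wed Oct 10 20:19:24 +0000 2018",
   "text": "Data pipelines are fun",
   "user": {"screen_name": "example_user"}},
  {"id": 2, "created_at": "Thu Oct 11 08:02:10 +0000 2018",
   "text": "Debugging is less fun",
   "user": {"screen_name": "example_user"}}
]
"""


def extract_fields(raw_json: str) -> list[dict]:
    """Parse a timeline payload and keep a flat subset of each tweet."""
    tweets = json.loads(raw_json)
    return [
        {
            "id": tweet["id"],
            "created_at": tweet["created_at"],
            "text": tweet["text"],
            "author": tweet["user"]["screen_name"],
        }
        for tweet in tweets
    ]


if __name__ == "__main__":
    for row in extract_fields(SAMPLE_PAYLOAD):
        print(row)
```

Flattening the nested user object early keeps the later storage and analysis steps simple, at the cost of discarding fields you may want later. Storing the full JSON alongside the flat fields avoids that trade-off.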

Setting Up RDS Database

Data storage is crucial for any platform. You will set up a Postgres database in Amazon RDS and understand the rationale behind storing JSON tweets in it. Get hands-on practice configuring virtual private cloud (VPC) inbound rules so that your database is reachable from the internet. Learn to use pgAdmin to create tables and execute database queries.
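
One possible way to store JSON tweets in Postgres is a table with a JSONB column, sketched below. The table name tweets, its columns, and the helper names are assumptions made for this example, not the schema used in the course. The save_tweet function expects a psycopg2 connection that you create yourself with your RDS endpoint and credentials.

```python
import json

# Hypothetical schema: one row per tweet, full payload kept as JSONB.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS tweets (
    id      BIGINT PRIMARY KEY,
    payload JSONB  NOT NULL
)
"""

# %s placeholders follow the psycopg2 parameter style.
INSERT_SQL = """
INSERT INTO tweets (id, payload)
VALUES (%s, %s::jsonb)
ON CONFLICT (id) DO NOTHING
"""


def insert_params(tweet: dict) -> tuple:
    """Build the parameter tuple for INSERT_SQL from a tweet dict."""
    return (tweet["id"], json.dumps(tweet))


def save_tweet(conn, tweet: dict) -> None:
    """Create the table if needed and insert one tweet.

    conn is assumed to be an open psycopg2 connection.
    """
    with conn.cursor() as cur:
        cur.execute(CREATE_TABLE_SQL)
        cur.execute(INSERT_SQL, insert_params(tweet))
    conn.commit()
```

Keeping the raw payload in a JSONB column means new fields can be queried later without a schema migration, while the primary key on id makes repeated inserts of the same tweet harmless.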

Implementing NLP with Lambda

Use the Natural Language Toolkit (NLTK) library to perform text analysis with a pre-built machine learning algorithm. You will create a Lambda function to retrieve tweets, analyze their sentiment, and save the results in your database. Learn to connect necessary dependencies through layers, including how to import pre-made layers from Klayers and create custom layers. Discover how to set up automatic Lambda function triggers using EventBridge.
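
The retrieve, analyze, save flow described above could be structured roughly as follows. This is a sketch under stated assumptions, not the course's Lambda code: the three injected callables (fetch_tweets, score_text, save_result) and the label_from_score helper are hypothetical names, and the cutoffs of 0.05 and -0.05 are a commonly used convention for compound sentiment scores rather than something the source specifies. In a real deployment, score_text might wrap NLTK's VADER analyzer, for example SentimentIntensityAnalyzer().polarity_scores(text)["compound"], but which NLTK analyzer the course uses is an assumption here.

```python
def label_from_score(compound: float) -> str:
    """Map a compound score in [-1, 1] to a coarse label.

    The cutoffs are an assumed convention, not taken from the course.
    """
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


def make_handler(fetch_tweets, score_text, save_result):
    """Build a Lambda handler from three injected dependencies.

    fetch_tweets() -> iterable of dicts with "id" and "text"
    score_text(text) -> float sentiment score
    save_result(tweet_id, score, label) -> None
    """

    def lambda_handler(event, context):
        processed = 0
        for tweet in fetch_tweets():
            score = score_text(tweet["text"])
            save_result(tweet["id"], score, label_from_score(score))
            processed += 1
        return {"statusCode": 200, "processed": processed}

    return lambda_handler


if __name__ == "__main__":
    # Stub dependencies so the sketch runs without AWS, Twitter, or NLTK.
    results = []
    handler = make_handler(
        fetch_tweets=lambda: [{"id": 1, "text": "great"}],
        score_text=lambda text: 0.6,
        save_result=lambda tid, score, label: results.append((tid, label)),
    )
    print(handler({}, None), results)
```

Injecting the dependencies keeps the handler testable on a laptop. When EventBridge invokes the function on a schedule, the event argument carries the scheduling metadata, which this sketch ignores.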

Dependency Management and Streamlit App Development

Visualize results by creating a Streamlit application. Establish a local development environment with Anaconda3 and create a conda virtual environment. Manage project dependencies using Poetry as you navigate through the provided Git repository. We will guide you step-by-step through the application code, demonstrating how to run it in a new virtual environment for testing.

Deploying Streamlit Application in ECS

Upon completing your visualization, learn to handle Docker images and containers on AWS. Create an Elastic Container Registry (ECR) and set up AWS CLI. Understand how to create user groups and set access restrictions with IAM. After building your Docker image, upload it to ECR, configure an ECS Fargate cluster, and successfully deploy your Streamlit application as a task on the platform.
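
When the image is pushed to ECR and later referenced from the ECS task definition, it is addressed by a registry URI built from the AWS account ID, region, and repository name. The small helper below sketches that format for the standard AWS partition. The account ID, region, and repository name in the usage line are placeholders, not values from the course.

```python
def ecr_image_uri(account_id: str, region: str, repository: str,
                  tag: str = "latest") -> str:
    """Return the image URI for a private ECR repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(ecr_image_uri("123456789012", "eu-central-1", "streamlit-app"))
    # prints 123456789012.dkr.ecr.eu-central-1.amazonaws.com/streamlit-app:latest
```

The same string is what you tag the local Docker image with before pushing, and what the container definition of the Fargate task points to.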

Additional

GitHub repository for this project: https://github.com/team-data-science/ML-on-AWS-1

Who teaches Machine Learning & Containers on AWS? Andreas Kretz

Andreas Kretz is a German data engineer and one of the most widely followed independent voices on data engineering as a career discipline. He runs the Plumbers of Data Science brand and has been publishing tutorial material continuously since the field consolidated around the modern lakehouse stack (Spark, Kafka, Snowflake, Databricks, Airflow).

His CourseFlix listing is the largest single-author catalog under this source — over thirty courses spanning data-pipeline construction, streaming architectures, the cloud-native data stack on AWS / Azure / GCP, the Python and Scala tooling that dominates the field, and the soft-skills / career side of breaking into data engineering. Material is paid and aimed at engineers transitioning into data work or already-working data engineers picking up specific tools.

What lessons are included in Machine Learning & Containers on AWS?

All Course Lessons (25)
# | Lesson Title | Duration | Access
1 | Introduction video | 02:39 | Demo
2 | Project architecture explained | 02:07 |
3 | Relational DB | 01:27 |
4 | RDS setup | 02:38 |
5 | Setting VPC inbound rules for internet access | 02:13 |
6 | PG Admin installation & S3 config | 04:06 |
7 | Lambda intro & IAM setup | 03:12 |
8 | Create Lambda function | 01:25 |
9 | The Lambda function code explained | 08:23 |
10 | Insert the code into your Lambda function | 00:57 |
11 | Add layers to Lambda from Klayers | 05:33 |
12 | Create & configure custom layers for twython & psycopg2 | 04:41 |
13 | Test Lambda & set environment variables | 04:54 |
14 | Schedule your Lambda with Event Bridge | 03:16 |
15 | Setup virtual conda environment | 04:08 |
16 | Poetry dependency installs & run Streamlit UI locally | 05:58 |
17 | Streamlit app code explained | 07:53 |
18 | Setup container registry ECR | 01:53 |
19 | AWS CLI install and ECR login | 05:20 |
20 | Dockerfile explained, Docker image build & push image to ECR | 02:53 |
21 | Create ECS Fargate cluster | 01:35 |
22 | ECS task IAM configuration & Streamlit task creation | 05:00 |
23 | Fixing the ECS task | 05:15 |
24 | Stopping the task on ECS after you are finished | 01:00 |
25 | Conclusion & outlook | 05:08 |

What courses are similar to Machine Learning & Containers on AWS?

  • Semantic Log Indexing & Search (updated 7mo ago)

    By: Andreas Kretz
    Master semantic search with our course on generative AI. Learn to build a complete pipeline using FastAPI, Qdrant, and Streamlit for advanced data processing.
    Duration: 53m
  • Schema Design Data Stores (updated 1y ago)

    By: Andreas Kretz
    Schema design is a vital topic in data management, repeatedly highlighted during coaching sessions.
    Duration: 2h 30m, rating: 5/5
  • Data Engineering on GCP (updated 11mo ago)

    By: Andreas Kretz
    Google Cloud Platform (GCP) is one of the most popular cloud platforms in the world, providing an extensive set of tools and services for building, managing.
    Duration: 1h 17m, rating: 5/5

Frequently asked questions

What prerequisites should I have before enrolling in the course?
Before starting the course, you should have a basic understanding of Python programming, as you will be working with libraries such as NLTK and using tools like Poetry for dependency management. Familiarity with AWS services and concepts like Lambda, RDS, and VPCs is beneficial but not mandatory, as the course provides guidance on setting these up.
What specific projects will I work on during the course?
Throughout the course, you will build a complete data pipeline on AWS. Key projects include integrating with the Twitter API to retrieve data, setting up a Postgres database with Amazon RDS, deploying a machine learning algorithm using AWS Lambda, and creating an interactive dashboard with Streamlit that you will deploy using Elastic Container Registry (ECR) and Elastic Container Service (ECS).
How does this course differ in scope from other machine learning courses?
This course uniquely combines machine learning with containerization and cloud deployment. Unlike courses focused solely on machine learning models, this course covers the end-to-end pipeline, including data acquisition, storage, model deployment, and visualization on AWS. It also emphasizes practical skills such as setting up cloud infrastructure and using Docker for containerization.
What AWS services and tools will I learn to use?
You will learn to use several AWS services and tools including Amazon RDS for database management, AWS Lambda for running machine learning models, Elastic Container Registry (ECR) and Elastic Container Service (ECS) for deploying applications, and EventBridge for scheduling tasks. Additionally, you'll use AWS CLI for managing these services from the command line.
What topics are not covered in this course?
The course does not cover in-depth machine learning theory or advanced algorithm development. It focuses on practical implementation and deployment. Topics such as deep learning, advanced data science techniques, or detailed security practices on AWS are outside the scope of this course.
How much time should I expect to commit to this course?
The course consists of 25 lessons totaling about 1 hour 33 minutes of video, each designed to build on the previous one. Plan additional time for implementing the project in your own AWS account and reviewing the course materials; how much depends on your prior experience.
How can the skills learned in this course be applied to other careers or courses?
The skills acquired in this course are highly transferable to roles in data engineering, cloud computing, and software development. Understanding AWS services, containerization with Docker, and data pipeline development can be beneficial for courses focused on cloud architecture, DevOps, and scalable application deployment, as well as in careers that require cloud infrastructure management and automation.