The Real-World ML Tutorial

Duration: 4h 3m 44s
Language: English
Access: Paid

Hello! I am Pau, a machine learning engineer with extensive experience in developing real ML products. Are you ready to design, develop, and implement your own ML product?

This course will guide you in creating fully functional ML solutions from concept to production, enabling startups and large companies to tackle business challenges effectively.

Course Highlights

What awaits you inside:

  • From Business to ML
    • Learn to translate a business problem into an ML task.
    • Master the four key steps that take you from raw data to a complete ML solution, not just a simple prototype.
  • Data Preparation
    • Transform raw data into ready features and target variables.
    • Create a reliable data pipeline in Python, covering collection, validation, transformation, and generation of training data.
  • Model Prototyping
    • Quickly create and improve basic models.
    • Enhance them step-by-step using feature engineering and boosting.
    • Master hyperparameter optimization to get the most out of your data.
  • Deploying and Monitoring the Model
    • Turn a prototype into a functioning batch-scoring system using a Feature Store and CI/CD.
    • Create a dashboard to display live forecasts.
    • Set up a system to monitor the quality and stability of the model.
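
As a rough illustration of the Data Preparation step above (turning a time series into features and a target), here is a minimal sketch in plain Python. It assumes hourly ride counts and uses the previous few hours as features for predicting the next hour. The function name `make_training_windows`, the window size, and the sample numbers are hypothetical and are not taken from the course code, which may well use different tooling.

```python
from typing import List, Tuple


def make_training_windows(
    series: List[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Slide a window over the series.

    Each feature row holds the previous `n_lags` values;
    the matching target is the value that follows them.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")

    features: List[List[float]] = []
    targets: List[float] = []
    for end in range(n_lags, len(series)):
        features.append(series[end - n_lags:end])  # the n_lags hours before `end`
        targets.append(series[end])                # the hour to predict
    return features, targets


# Made-up hourly ride counts, for illustration only.
rides_per_hour = [12, 15, 9, 20, 25, 18]
X, y = make_training_windows(rides_per_hour, n_lags=3)

for row, target in zip(X, y):
    print(row, "->", target)
```

Each row of `X` pairs with one entry of `y`, which is the tabular shape a supervised model such as a gradient-boosted tree expects.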

Target Audience

Who is this course for?

  • Individuals who can prepare data and train models in a notebook but don't know how to turn them into a working service.
  • Specialists aiming to learn how to design, implement, and deploy a real ML solution from start to finish.
  • Aspiring ML engineers who want to master the practice of building complete ML systems.

Course Benefits

What you will get:

  • More than 3 hours of video lectures and presentations, regularly updated.
  • Access to the full source code in a GitHub repository.

Course Project

What you will build:

Throughout the course, you will develop a complete ML service that forecasts taxi demand in New York. The methods and tools you will master, such as pipelines, MLOps, and monitoring, are applicable across various industries.
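
To make the forecasting task concrete, here is a small sketch of a naive baseline: predict each hour's demand as the demand observed in the previous hour, then score the forecasts with mean absolute error (MAE). This is an assumption made for illustration. The course lists three baseline models without describing them here, so this is not necessarily one of them, and the ride counts below are invented.

```python
from typing import List


def mean_absolute_error(actual: List[float], predicted: List[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def last_value_baseline(series: List[float]) -> List[float]:
    """Forecast every hour (except the first) as the previous hour's value."""
    return series[:-1]


# Invented hourly ride counts, for illustration only.
rides = [10, 12, 15, 11, 14]

predictions = last_value_baseline(rides)  # forecasts for hours 1..4
actuals = rides[1:]                       # what actually happened in hours 1..4
mae = mean_absolute_error(actuals, predictions)
print(f"Baseline MAE: {mae:.2f} rides per hour")
```

A baseline like this gives a reference error that any trained model has to beat before it is worth deploying.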

Move beyond "toy" projects and learn to build truly effective ML systems that deliver business value.

About the Author: Pau Labarta Bajo

Pau Labarta Bajo is a Spanish ML engineer and educator who runs the Real-World Machine Learning teaching brand and a popular newsletter on production ML systems. His material focuses on the engineering side of ML (feature stores, model serving, monitoring) rather than on the math-heavy theory that dominates academic ML coursework.

CourseFlix lists two of his courses: The Real-World ML Tutorial and Let's Rust (ML + Rust). Both are paid and aimed at engineers transitioning into production ML work.

Watch Online 45 lessons

All Course Lessons (45)
# | Lesson Title | Duration | Access
1 | Welcome | 00:50 | Demo
2 | Development tools: VSCode, Python Poetry and Git/GitHub | 00:57 |
3 | Install VSCode and Python Poetry in your local machine | 02:18 |
4 | Create the Project Structure | 02:19 |
5 | Create local git repository | 02:05 |
6 | Create remote GitHub repository and connect it to the local one | 01:55 |
7 | Let's understand the business problem | 07:34 |
8 | 3 Steps to go from raw data to training data | 01:46 |
9 | Our raw data source: the NYC taxi website | 01:39 |
10 | Step 1. Fetch raw data and validate it | 09:22 |
11 | Step 2. Transform raw validated data into time-series data | 06:13 |
12 | Plot the time-series data | 02:44 |
13 | Step 3. Transform time-series data into training data | 09:38 |
14 | Steps 1, 2 and 3. From raw data to training data + Code re-factoring! | 09:56 |
15 | Plot the training data | 05:10 |
16 | How do you build a Supervised Machine Learning model? | 09:32 |
17 | Split the dataset into training and test datasets | 01:47 |
18 | Baseline model 1 | 06:28 |
19 | Baseline model 2 | 02:28 |
20 | Baseline model 3 | 02:40 |
21 | XGBoost model | 05:27 |
22 | LightGBM model | 03:08 |
23 | LightGBM + Feature engineering | 11:40 |
24 | LightGBM + Feature engineering + Hyper-parameter tuning | 10:33 |
25 | Batch-scoring ML service with a Feature Store | 03:56 |
26 | What is a Feature Store? | 01:18 |
27 | Create a Serverless Feature Store with Hopsworks | 01:42 |
28 | Backfill Feature Store with Historical Data | 08:52 |
29 | Build the Feature Pipeline | 05:22 |
30 | Automate the execution of the Feature Pipeline using a GitHub action | 04:47 |
31 | Build the Model Training Pipeline | 09:15 |
32 | ML Frontend app using Streamlit | 02:15 |
33 | Inference functions | 04:13 |
34 | Build the Streamlit app - Part 1: inference code | 09:43 |
35 | Build the Streamlit app - Part 2: build the UI | 05:41 |
36 | Deploy the Streamlit app to Streamlit Cloud | 04:54 |
37 | Our plan | 02:35 |
38 | Create an inference pipeline to generate and store predictions in the store | 07:24 |
39 | The new (and way simpler) frontend Streamlit app - frontend.py | 02:56 |
40 | Create a monitoring dashboard with Streamlit | 04:07 |
41 | Deploy the monitoring dashboard to Streamlit Cloud | 02:33 |
42 | Why model re-training? | 03:02 |
43 | Implementation | 10:29 |
44 | 2024_07_02_Karthikeya | 05:11 |
45 | 2024-07-09-Karthikeya | 25:20 |

Books

Read Book The Real-World ML Tutorial

# | Title | Type
1 | The_Real-World_Machine_Learning_Tutorial_Slides | PDF

Frequently asked questions

What prerequisites are needed before taking this course?
This course is designed for individuals who are comfortable preparing data and have a basic understanding of Python programming. Familiarity with development tools like VSCode, Python Poetry, and Git/GitHub is also beneficial, as these tools are used throughout the course to develop and manage the project.

What will I build during the course?
During the course, you'll build a fully functional ML product, starting from understanding a business problem and translating it into an ML task. You'll create a data pipeline to transform raw data into training data, develop models using XGBoost and LightGBM, and deploy a batch-scoring system. Additionally, you'll create a dashboard to monitor model performance using Streamlit.

Who is the ideal audience for this course?
The course targets individuals who can prepare data and have a foundational understanding of machine learning concepts. It is particularly suited for those looking to apply ML solutions to real-world business challenges, whether in a startup environment or within larger companies.

How does the scope of this course compare to other ML courses?
Unlike many introductory machine learning courses that focus solely on model creation, this course covers the entire lifecycle of developing an ML product. It includes data preparation, model prototyping, deployment, and monitoring, with a strong emphasis on translating business problems into ML solutions.

Which specific tools and platforms are used in this course?
The course uses VSCode, Python Poetry, Git/GitHub, XGBoost, LightGBM, and Streamlit. For deployment and automation, it relies on a Feature Store (Hopsworks) and CI/CD with GitHub Actions. These tools are integral to building and maintaining a real-world ML product.

What topics are not covered in this course?
The course does not delve into advanced topics such as deep learning, natural language processing, or computer vision. Instead, it focuses on traditional machine learning techniques, feature engineering, and the practical deployment of ML systems in business contexts.

What is the expected time commitment for this course?
The course comprises 45 lessons with a total running time of 4h 3m 44s. The actual time commitment will depend on your pace, particularly when coding and implementing the project. Because the course covers every step from data preparation to deployment, expect to invest significant additional time in practice and application.