Data Preparation & Cleaning for ML
3h 7m 23s
English
Paid
Course description
Have you ever heard the expression "data preparation and cleaning"? This is perhaps the most important part of the entire machine learning process. Real-world data is often "messy": it can contain errors, missing values, duplicates, and outliers, all of which can distort results and degrade model performance. That is why it is crucial that data is cleaned and ready for analysis.
Simply put, data preparation and cleaning put the principle of "garbage in, garbage out" into practice: a model can only be as good as the data it learns from. Identifying and correcting errors, removing damaged and duplicate records, filling in missing values, and handling outliers are all essential steps in preparation. This process can be labor-intensive, but data quality largely determines a project's success. Even the most advanced machine learning algorithms cannot learn reliably from unstructured or "dirty" data.
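As a rough illustration of the steps named above (removing duplicates, filling in missing values, and handling outliers), here is a minimal pandas sketch. It is not taken from the course materials: the toy data, the column names, the `clean` helper, the median fill, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the course): one duplicated record,
# one missing age, and one implausibly large income.
raw = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 3, 4, 5],
        "age": [34.0, 41.0, 41.0, np.nan, 29.0, 52.0],
        "income": [52_000.0, 61_000.0, 61_000.0, 58_000.0, 49_000.0, 1_000_000.0],
    }
)


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass; the rules used here are assumptions."""
    # 1. Remove exact duplicate records.
    out = df.drop_duplicates().reset_index(drop=True)

    # 2. Fill missing ages with the column median (one common choice).
    out["age"] = out["age"].fillna(out["age"].median())

    # 3. Flag, rather than drop, incomes outside 1.5 * IQR of the quartiles.
    q1, q3 = out["income"].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    out["income_outlier"] = ~out["income"].between(lower, upper)
    return out


cleaned = clean(raw)
print(cleaned)
```

Flagging outliers in a separate column instead of deleting them keeps the decision reversible; whether to drop, cap, or keep such rows depends on the dataset and the model.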
To ensure you feel confident in your ML projects, this mini-course will cover everything you need to know about data preparation.
- We'll start with a checklist of 8 key steps to keep in mind when launching any project.
- We'll delve into theory, including missing values, outliers, feature selection, and more.
- We'll move on to practice, where for each segment you'll complete tasks in Python, working with real data.
# | Title | Duration |
---|---|---|
1 | Introduction | 01:02 |
2 | ML Prep Checklist | 07:18 |
3 | Theory Missing Values | 08:48 |
4 | Missing Values with Pandas | 12:43 |
5 | Missing Values with SimpleImputer | 11:06 |
6 | Missing Values with KNNImputer | 11:50 |
7 | Theory Categorical Variables | 08:19 |
8 | Categorical Variables One-Hot-Encoding | 10:51 |
9 | Theory Outliers | 08:56 |
10 | Outliers hands-on | 13:35 |
11 | Theory Feature Scaling | 09:20 |
12 | Feature Scaling hands-on | 08:19 |
13 | Theory Feature Selection | 12:05 |
14 | Practical Correlation Matrix | 04:27 |
15 | Practical Univariate Testing | 17:54 |
16 | Practical RFECV | 13:49 |
17 | Theory Model Validation | 08:54 |
18 | Practical Model Validation | 18:07 |
Books
Read Book Data Preparation & Cleaning for ML
# | Title |
---|---|
1 | Note to students from Andrew Jones |
Similar courses

Learn to Build Machine Learning Systems That Don't Suck
Sources: Santiago Valdarrama
A live, interactive course that will teach you from scratch how to design, create, and implement ready-to-use ML systems - no fluff and academic...
32 hours 6 minutes 40 seconds

The Real-World ML Tutorial
Sources: Pau Labarta Bajo
Hello! I am Pau, a machine learning engineer with many years of experience in developing real-world ML products. Do you want to design, develop, and...
4 hours 3 minutes 44 seconds

Predictive Analytics & Machine Learning
Sources: LunarTech
Predictive analytics and machine learning is a course that will help you master key concepts and practical skills in data forecasting...
55 minutes 15 seconds

Machine Learning & Containers on AWS
Sources: Andreas Kretz
In this practical course, you will learn how to build a complete data pipeline on the AWS platform - from obtaining data from the Twitter API to analysis, stora
1 hour 33 minutes 34 seconds

Machine Learning System Design
Sources: Arseny Kravchenko, Valerii Babushkin
Machine Learning System Design is a practical guide to designing effective and reliable machine learning systems. The book covers the entire cycle...