Data Science Jumpstart with 10 Projects Course

3h 12m 21s
English
Paid

This course teaches essential data science skills using Python. No prior data science experience is required, though a basic understanding of Python is assumed. You will work with the tools and techniques that data scientists, engineers, and analysts use to address real-world problems.

Course Highlights

Throughout this course, you will:

  1. Understand Data Handling: Delve into loading, cleaning, and summarizing data, and perform basic statistics with both CSV and Excel files. These skills form the foundation of any data analysis task.
  2. Retail Data Insights Project: Master the art of combining and reshaping datasets to uncover hidden patterns in retail data. This project will enhance your ability to derive meaningful insights from large datasets.
  3. Health Data Deep Dives: Learn to handle missing data, recognize abnormalities, and apply foundational machine learning techniques. This knowledge is crucial for building robust models in healthcare or any other industry.
  4. Model Building: Create models to explore intriguing analyses such as Air Quality Trends and Movie Reviews. Through these projects, you will gain experience in predictive modeling and evaluation.
  5. Interactive Dashboards & SQL Exploration: Construct interactive dashboards using Plotly, and explore SQL databases for deeper insights. This combination will empower you to present data dynamically and connect with back-end data sources.
  6. Utilize Powerful Libraries: Harness the power of essential Python libraries like Pandas, Matplotlib, and Plotly, among others. Mastery of these libraries is essential for efficient data manipulation and visualization.
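
The data-handling steps in the list above (loading, summarizing, counting missing values, and combining datasets) can be sketched in Pandas roughly as follows. This is a minimal illustration rather than course material: the `sales_csv` and `stores_csv` data, the column names, and the values are all hypothetical, and the data is read from in-memory strings so the snippet runs without any files.

```python
import io

import pandas as pd

# Hypothetical data, inlined so the sketch needs no files on disk.
sales_csv = """store_id,month,revenue
1,2024-01,100.0
1,2024-02,
2,2024-01,250.0
2,2024-02,300.0
"""
stores_csv = """store_id,country
1,Italy
2,France
"""

sales = pd.read_csv(io.StringIO(sales_csv))
stores = pd.read_csv(io.StringIO(stores_csv))

# Summary statistics for the numeric columns.
summary = sales.describe()

# Fraction of missing values per column.
missing_fraction = sales.isna().mean()

# Combine the two tables; validate raises if a store_id repeats in `stores`.
merged = sales.merge(stores, on="store_id", how="left", validate="many_to_one")

# Total revenue per country (missing values are skipped by sum()).
revenue_by_country = merged.groupby("country")["revenue"].sum()

print(summary)
print(missing_fraction)
print(revenue_by_country)
```

The `validate` argument is one way to catch unexpected duplicate keys before a merge silently inflates the row count.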

Additional

https://github.com/talkpython/data-science-jumpstart-with-10-projects-course

About the Author: Talk Python Training

Talk Python Training is the paid course platform of Michael Kennedy, the host of the long-running Talk Python To Me podcast — one of the most-listened-to podcasts in the Python ecosystem. The course platform extends Michael's interview-based knowledge of the field into structured video courses taught by Michael and a curated set of guest instructors.

The course catalog covers the full Python landscape: web development with Django, Flask, FastAPI, and the broader async-Python stack; data science and pandas; LLM / RAG application development; testing and CI/CD; deployment patterns; the data-engineering side of Python; and a long list of practical Python patterns aimed at working developers. Few platforms cover the language with this much breadth from inside the Python community itself.

The CourseFlix listing under this source carries over 18 Talk Python Training courses spanning that range. Material is paid; Talk Python Training runs on per-course pricing on the original platform. Courses are aimed at developers using Python as a serious primary language rather than as a scripting tool.

All Course Lessons (104)

  1. Welcome (00:51, demo)
  2. Installing Jupyter in a Virtual Environment (02:01)
  3. Running in GitHub Codespaces (01:37)
  4. How to use Jupyter (02:09)
  5. How to use VS Code (01:11)
  6. Remember the Exercises (00:27)
  7. Intro csv v2 (00:34)
  8. Loading CSV data from a ZIP file with Pandas and Pyarrow (05:26)
  9. Summary stats in Pandas using describe, dtypes, and quantile (06:35)
  10. Pearson and Spearman Correlations in Pandas and Heatmaps (05:36)
  11. Understanding Pandas Categoricals with value_counts and Cross Tabulations (04:50)
  12. Visualizations in Pandas, with Histograms, Scatterplots, and Barplots (08:37)
  13. Summary (00:25)
  14. Intro excel (00:42)
  15. Create an Excel in Pandas with to_excel (01:46)
  16. Read Excel file in Pandas with read_excel and Pyarrow (01:31)
  17. Understanding Counts and Frequencies of Missing Data in Pandas with isna, any, sum, and mean (03:03)
  18. Quantifying Strings with filter and value_counts (02:07)
  19. Understanding Numbers with Correlations, Scatterplots, and Histograms (03:33)
  20. Writing and Formatting Excel Sheets in Pandas with to_excel and XlsxWriter add_format (01:49)
  21. Summary (00:11)
  22. Intro (00:15)
  23. Loading Data for Merging with Pyarrow (00:57)
  24. Merging Dataframes with the merge method and left_on, right_on parameters (01:34)
  25. Validating one to one and one to many merges (02:51)
  26. Debugging Merging by piping dataframe size (02:36)
  27. Cleanup columns after merging with loc (02:19)
  28. Export Merged data to Excel (00:56)
  29. Merging summary (00:31)
  30. Intro grouping (00:38)
  31. Loading Retail Data from Excel into Pandas Dataframe (00:33)
  32. Using Feather and Pyarrow to Speed up loading Retail Data in Pandas (00:49)
  33. Exploratory Data Analysis (EDA) in Pandas with describe, histograms, and value_counts (03:48)
  34. Aggregating in Pandas to Calculate Sales by Year (02:44)
  35. Using Groupby in Pandas to visualize Sales by country (06:06)
  36. Using Grouper in Pandas to Groupby by Month Frequency (03:36)
  37. Grouping by Month and Country and Visualizing with a Line Plot (05:31)
  38. Summary (00:26)
  39. Intro cleaning (00:37)
  40. Loading Multiple Files into a Single Pandas Dataframe with Glob (00:47)
  41. Understanding the Heart Data to Cleanup (02:47)
  42. Fixing the Age Column Type to Int8 (00:44)
  43. Converting the Numeric Sex Column into a String (01:18)
  44. Converting the Chest Pain Column into an Int8 (00:49)
  45. Dealing with ? Characters in the Trestbps Numeric Column (02:25)
  46. Creating a Function to Repeat Common Cleanup in the Chol Column (03:08)
  47. Using the Cleanup Function for the Fbs Column (01:05)
  48. Fixing the Restecg Column (01:28)
  49. Fixing the Thalach Column (00:14)
  50. Fixing the Exang Column (00:15)
  51. Updating the Cleanup Function to Clean the Oldpeak Column (00:23)
  52. Cleaning the Slope Column (00:19)
  53. Cleaning the Ca Column (00:18)
  54. Converting Numeric Values to Categoricals with the Thal Column (00:39)
  55. Fixing the Num Column (01:07)
  56. Comparing Memory usage in Pandas with memory_usage (00:50)
  57. Refactoring to a Function in Pandas for Cleanup (04:19)
  58. Cleaning summary (00:06)
  59. Intro time series air quality dataset (00:31)
  60. Load CSV file from a Zip file with Pandas (00:51)
  61. Checking for Missing Values and Shape in Pandas (00:52)
  62. Parsing Dates Using Format Strings and to_datetime (02:04)
  63. Rename columns in Pandas to Remove Invalid Characters (02:36)
  64. Make a Function to Clean up Pandas Data (00:52)
  65. Converting Dates to UTC in Pandas (00:57)
  66. Converting Dates to Italian time in Pandas and pytz (01:30)
  67. Making Line Plots for Time Series Data in Pandas (03:24)
  68. Interpolating and Filling in Missing values in Pandas (03:27)
  69. Resampling Time Series Data in Pandas with resample (02:30)
  70. Creating 7 Day Rolling Averages in Pandas with rolling (01:45)
  71. Updating the Function with Cleanup Functionality (00:16)
  72. Summary (00:22)
  73. Intro text v2 (00:25)
  74. Load movie review text data from a directory (01:32)
  75. Exploring the str attribute in Pandas for String manipulation (00:55)
  76. Using spaCy to Remove Stop words in Pandas (02:44)
  77. Using scikit-learn to calculate Tfidf for Pandas text (01:44)
  78. Using XGBoost to Create a Classification Model (02:40)
  79. Predicting Values with XGBoost and Pandas (01:40)
  80. Intro v2 (00:21)
  81. Combining Multiple Datasets with Pandas and concat (02:00)
  82. Exploring heart disease with aggregations and scatterplots (05:01)
  83. Preparing a Pandas Dataset to Create an XGBoost Model (04:59)
  84. Tuning an XGBoost Model with Hyperopt (06:02)
  85. Using a Confusion matrix to Understand the Model (01:48)
  86. ML summary (00:09)
  87. Intro SQL (00:13)
  88. Load CSV data into a Pandas dataframe and cleaning it (01:32)
  89. Using SQLAlchemy to Connect to a SQLite Database (00:55)
  90. Create a database table with Pandas using to_sql (00:31)
  91. Query a SQLite table from Pandas using read_sql (01:19)
  92. Query a SQLite table with Pandas (01:57)
  93. Visualize SQLite Data using Pandas (01:54)
  94. Summary SQL (00:27)
  95. Intro plotly (00:11)
  96. Load CSV data into Pandas dataframe (00:22)
  97. Clean Pandas data with a function for plotly (01:45)
  98. Creating a Line Plot in Plotly for Pandas (02:01)
  99. Creating a Bar plot in Plotly (02:29)
  100. Creating a Scatter plot in Plotly (03:41)
  101. Creating a Dashboard with Dash and Plotly Graphs (01:43)
  102. Creating a Plotly Dashboard using Dash with Widgets (01:10)
  103. Summary plotly (00:08)
  104. Conclusion (01:17)

Frequently asked questions

What are the prerequisites for enrolling in this course?
The course assumes a basic understanding of Python programming, which is necessary for working with tools like Pandas and Pyarrow. However, no prior experience in data science is required, making it accessible for beginners who are eager to dive into the field of data science.
What types of projects will I work on during the course?
The course includes 10 projects that cover a variety of topics, such as the Retail Data Insights Project, where you will learn to combine and reshape datasets, and Health Data Deep Dives, focusing on handling missing data and foundational machine learning techniques. Other projects involve model building for Air Quality Trends and Movie Reviews, and creating interactive dashboards using Plotly.
Who is the target audience for this course?
This course is designed for individuals who are interested in starting a career in data science, data analysis, or related fields. It is suitable for beginners with some Python programming knowledge who want to learn data handling, model building, and interactive data presentation.
How does this course compare to other data science courses in terms of depth and scope?
This course provides a practical introduction to data science with a focus on real-world applications. It covers essential topics such as data handling, model building, and interactive dashboards, providing a hands-on approach with 10 projects. It may not delve as deeply into advanced topics as more specialized or longer courses but offers a solid foundation for beginners.
What specific tools and platforms will I learn to use in this course?
You will learn to use tools such as Pandas and Pyarrow for data handling, Plotly for creating interactive dashboards, and explore SQL databases. Additionally, the course covers using Jupyter and VS Code as development environments, as well as working within GitHub Codespaces.
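
As a rough illustration of the SQL exploration mentioned above, the sketch below creates a small in-memory SQLite table and queries it. It uses only Python's standard-library `sqlite3` module rather than the Pandas and SQLAlchemy workflow named in the lesson titles, and the `readings` table, its columns, and its values are hypothetical.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table of air-quality style readings.
conn.execute("CREATE TABLE readings (city TEXT, co REAL)")
conn.executemany(
    "INSERT INTO readings (city, co) VALUES (?, ?)",
    [("Rome", 2.0), ("Rome", 4.0), ("Milan", 3.0)],
)

# Aggregate in SQL: average reading per city, sorted by city name.
rows = conn.execute(
    "SELECT city, AVG(co) FROM readings GROUP BY city ORDER BY city"
).fetchall()
conn.close()

for city, avg_co in rows:
    print(city, avg_co)
```

With Pandas installed, a query like this can also be loaded into a dataframe with `pandas.read_sql` while the connection is still open.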
What topics are not covered in this course?
The course does not cover advanced machine learning algorithms beyond foundational techniques, nor does it delve deeply into big data technologies or cloud computing platforms. It focuses on imparting fundamental data science skills and techniques suitable for beginners.
What is the expected time commitment for completing the course?
The course consists of 104 lessons with a total video runtime of 3h 12m 21s. Completing the projects and exercises takes additional time, which depends on your familiarity with Python and how quickly you pick up new concepts; students may spend several weeks working through them. Regular study sessions help in absorbing the material.