Fundamentals of Apache Spark and PySpark

2h 20m 54s
English
Paid

Apache Spark is an essential tool for any aspiring Data Engineer or Data Scientist, and PySpark allows you to harness the full power of Spark using the familiar Python programming language.

Course Overview

This comprehensive course is designed for individuals eager to confidently explore the world of big data. You will delve into Spark's architecture, learn to write clear and efficient PySpark code, and gain the skills to create scalable data processing pipelines.

Hands-On Learning Experience

Our training is practice-based, ensuring you work with real datasets, tackle practical tasks, and develop skills that are in high demand among employers.

Key Learning Objectives

  • Understand the architecture and components of Apache Spark.
  • Write efficient and maintainable PySpark code.
  • Create and manage scalable data processing pipelines.

Why Enroll in This Course?

If your goal is to learn how to analyze massive amounts of data, swiftly clean and transform information, and master the tools used by industry leaders like Netflix and Amazon, this course is the perfect fit for you.

About the Author: zerotomastery.io

Whether you are just starting to learn to code or want to advance your skills, Zero To Mastery Academy will teach you React, JavaScript, Python, CSS, and more to help you advance your career, get hired, and succeed at some of the top companies in the world.

All Course Lessons (29)
#   Lesson Title                                Duration  Access
1   Introduction                                07:30     Demo
2   [Optional] What Is a Virtualenv?            06:37
3   Apache Spark                                03:44
4   How Spark Works                             04:24
5   Spark Application                           07:41
6   DataFrames                                  06:43
7   Installing Spark                            05:51
8   Inside Airbnb Data                          07:02
9   Writing Your First Spark Job                07:05
10  Lazy Processing                             02:16
11  [Exercise] Basic Functions                  01:29
12  [Exercise] Basic Functions - Solution       06:41
13  Aggregating Data                            04:00
14  Joining Data                                04:40
15  Aggregations and Joins with Spark           06:10
16  Complex Data Types                          05:09
17  [Exercise] Aggregate Functions              00:50
18  [Exercise] Aggregate Functions - Solution   05:54
19  User Defined Functions                      03:25
20  Data Shuffle                                06:14
21  Data Accumulators                           03:42
22  Optimizing Spark Jobs                       07:39
23  Submitting Spark Jobs                       04:29
24  Other Spark APIs                            05:16
25  Spark SQL                                   04:33
26  [Exercise] Advanced Spark                   02:10
27  [Exercise] Advanced Spark - Solution        05:26
28  Summary                                     03:08
29  Let's Keep Learning Together!               01:06