Apache Spark Certification Training

15h 13m 1s
English
Paid

Course description

Apache Spark is a core data skill – here is how to show you've got it!

Learn Apache Spark from the ground up and show off your knowledge with the Databricks Certified Associate Developer for Apache Spark certification. This course will transform you into a PySpark professional and get you ready to pass the popular Databricks Spark certification.

Join me for an easy-to-understand and engaging look into Spark and take your big data career to the next level!

What will you learn?

The goal of this course is to teach you fundamental PySpark skills and prepare you to earn the Databricks Certified Associate Developer for Apache Spark certification.

The course includes 18 modules to help you understand how Apache Spark works internally and how to use it in practice. You can find all topics covered below, but here is an overview:

  • Become a seasoned expert at coding with Spark DataFrames
  • Get confident with the Databricks certification exam content
  • Discover Spark's distributed, fault-tolerant data processing
  • Master how to work with Spark in Databricks
  • Understand the Spark cluster architecture
  • Learn when and how Spark evaluates code
  • Grasp Spark's efficient memory management mechanisms
  • Analyze typical Spark problems like out-of-memory errors
  • See how Spark executes complex operations like joins
  • Become proficient in navigating through the Spark UI
  • ...and many more topics – check out the full list below!

Who is this for?

This course is for anyone with basic Python skills who wants to develop big data processing skills, and for anyone who would like to pass the popular Databricks Certified Associate Developer for Apache Spark certification using PySpark.

  • If you want to learn how to use Apache Spark with the Scala programming language, this course isn't a fit. We focus on Python and PySpark exclusively, but the fundamental Spark concepts taught are applicable to both languages.
  • Data analysts and developers who want to add verified big data skills and Databricks experience to their portfolio
  • Data engineers who want or need proof of their Apache Spark skills via a certification to boost their career
  • Data scientists wanting to work efficiently and frustration-free with large data sets in Apache Spark
  • Companies that want to enable their data staff to use Apache Spark in a professional, time- and cost-efficient way
  • Anyone wanting to brush up their Apache Spark skills with a solid understanding of how it works under the hood


All Course Lessons (99)

# | Lesson Title | Duration | Access
1 | Introduction | 09:48 | Demo
2 | Certification Exam Overview | 05:03 |
3 | Signing up for Databricks Community Edition | 01:44 |
4 | Loading Data Into Databricks | 02:43 |
5 | Overview of the Spark Cluster Architecture and its Components | 08:24 |
6 | Getting to Know the Spark Driver | 11:55 |
7 | Getting to Know Executors | 07:37 |
8 | Discovering Execution Modes | 17:33 |
9 | Overview | 05:38 |
10 | Internal Types, DataFrames, Datasets, RDDs, and the Spark SQL API | 19:10 |
11 | Hands-on Session: Exploring Data APIs on Databricks Community Edition | 08:26 |
12 | Intro to Labs | 01:12 |
13 | Intro & Creating DataFrames | 06:58 |
14 | Exercise: Creating a DataFrame | 01:07 |
15 | Exercise: Creating a DataFrame - Solution | 01:59 |
16 | Working with Schemas | 26:10 |
17 | Exercise: Building a Simple Schema | 01:46 |
18 | Exercise: Building a Simple Schema - Solution | 05:13 |
19 | Exercise: Building a Complex Schema | 02:28 |
20 | Exercise: Building a Complex Schema - Solution | 05:53 |
21 | Type Conversion of DataFrame Columns | 07:20 |
22 | Exercise: Changing the Type of a Column | 01:50 |
23 | Exercise: Changing the Type of a Column - Solution | 04:20 |
24 | Overview | 09:18 |
25 | Shuffles | 07:52 |
26 | Data Skew | 13:15 |
27 | Spark Configurations for Partitions | 03:47 |
28 | Hands-on Session: The Power of Partitions | 30:18 |
29 | Storage Layout | 17:39 |
30 | Caching and Storage Levels | 10:29 |
31 | Memory in Action | 30:59 |
32 | Hands-on Session: Executor Memory Management - Part 1 | 10:27 |
33 | Hands-on Session: Executor Memory Management - Part 2 | 13:08 |
34 | Intro & How to Get Help in PySpark | 04:01 |
35 | Partitioning Recap | 09:45 |
36 | Exercise: Repartitioning | 01:31 |
37 | Exercise: Repartitioning - Solution | 06:08 |
38 | Caching Recap | 03:27 |
39 | Exercise: Caching | 01:13 |
40 | Exercise: Caching - Solution | 03:20 |
41 | Overview | 07:40 |
42 | Hands-on Session: Actions vs. Transformations | 06:47 |
43 | Intro & Reading Data | 18:36 |
44 | Exercise: Reading Parquet Files | 02:20 |
45 | Exercise: Reading Parquet Files - Solution | 03:44 |
46 | Reading from CSV Files | 17:18 |
47 | Exercise: Reading CSV Files | 02:29 |
48 | Exercise: Reading CSV Files - Solution | 03:55 |
49 | Reading from JSON Files | 05:16 |
50 | Writing Data | 10:57 |
51 | Exercise: Writing to Parquet Files | 02:08 |
52 | Exercise: Writing to Parquet Files - Solution | 04:27 |
53 | Writing to CSV Files | 02:53 |
54 | Exercise: Writing to CSV Files | 02:16 |
55 | Exercise: Writing to CSV Files - Solution | 03:12 |
56 | Writing to JSON Files | 01:58 |
57 | Using PySpark with SQL | 05:01 |
58 | Exercise: SQL in PySpark | 00:46 |
59 | Exercise: SQL in PySpark - Solution | 02:16 |
60 | Overview | 16:33 |
61 | Hands-on Session: Discovering the Spark UI | 12:27 |
62 | Intro & Removing Data | 16:58 |
63 | Exercise: Removing Data | 00:59 |
64 | Exercise: Removing Data - Solution | 03:16 |
65 | Modifying Data | 30:49 |
66 | Exercise: Modifying Data | 02:08 |
67 | Exercise: Modifying Data - Solution | 07:22 |
68 | Analyzing Data | 18:14 |
69 | Exercise: Analyzing Data | 01:39 |
70 | Exercise: Analyzing Data - Solution | 06:30 |
71 | The Catalyst Optimizer | 18:32 |
72 | Adaptive Query Execution | 15:32 |
73 | Dynamic Partition Pruning | 10:08 |
74 | The DAG: Achieving Fault Tolerance | 12:25 |
75 | Intro & Working With Dates and Times | 33:30 |
76 | Exercise: Working With Dates and Times | 02:10 |
77 | Exercise: Working With Dates and Times - Solution | 08:00 |
78 | Working With Strings | 15:30 |
79 | Exercise: Working With Strings | 03:20 |
80 | Exercise: Working With Strings - Solution | 07:47 |
81 | Working with Arrays | 14:38 |
82 | Exercise: Working With Arrays | 05:17 |
83 | Exercise: Working With Arrays - Solution | 13:19 |
84 | Accumulator and Broadcast Variables | 11:14 |
85 | Joins | 34:02 |
86 | Hands-on Session: Cross-Cluster Communication | 42:39 |
87 | Intro & Grouping and Aggregating | 19:16 |
88 | Exercise: Grouping and Aggregating | 01:43 |
89 | Exercise: Grouping and Aggregating - Solution | 07:19 |
90 | Joining | 15:06 |
91 | Exercise: Joining | 03:58 |
92 | Exercise: Joining - Solution | 03:58 |
93 | User-Defined Functions (UDFs) | 20:29 |
94 | Exercise: UDFs | 04:06 |
95 | Exercise: UDFs - Solution | 17:51 |
96 | Signing up for the Exam | 02:24 |
97 | Last Minute Preparations | 01:34 |
98 | Introduction | 04:36 |
99 | Congratulations! | 00:50 |

Books

# | Title
1 | 1-Proposed Timeline
2 | Apache Spark Certification Exam Guide
3 | Mastery Map 1 - Cluster Components
4 | Mastery Map 2 - Spark Execution Modes
5 | Mastery Map 3 - Spark Data APIs
6 | Mastery Map 4 - Executor Memory Layout
7 | Mastery Map 5 - PySpark Storage Levels
8 | Mastery Map 6 - Executor Out-of-Memory Errors
9 | Mastery Map 7 - Actions Vs. Transformations
10 | Mastery Map 8 - Execution Hierarchy
11 | Mastery Map 9 - A Query, From Plan to Execution
12 | Mastery Map 10 - Adaptive Query Execution Strategies
13 | Mastery Map 11 - Dynamic Partition Pruning
14 | Mastery Map 12 - Joins
