Apache Spark Certification Training

15h 13m 1s
English
Paid
December 30, 2023

Apache Spark is a core data skill – here is how to show you've got it!

Learn Apache Spark from the ground up and show off your knowledge with the Databricks Certified Associate Developer for Apache Spark certification. This course will turn you into a PySpark professional and get you ready to pass this popular Databricks certification.

Join me for an easy-to-understand and engaging look into Spark and take your big data career to the next level!

What will you learn?

The goal of this course is to teach you fundamental PySpark skills and prepare you to earn the Databricks Certified Associate Developer for Apache Spark certification.

The course includes 18 modules to help you understand how Apache Spark works internally and how to use it in practice. You can find all topics covered below, but here is an overview:

  • Become a seasoned expert at coding with Spark DataFrames
  • Get confident with the Databricks certification exam content
  • Discover Spark's distributed, fault-tolerant data processing
  • Master how to work with Spark in Databricks
  • Understand the Spark cluster architecture
  • Learn when and how Spark evaluates code
  • Grasp Spark's efficient memory management mechanisms
  • Analyze typical Spark problems like out-of-memory errors
  • See how Spark executes complex operations like joins
  • Become proficient in navigating through the Spark UI
  • ...and many more topics – check out the full list below!

Who is this for?

Anyone with basic Python skills who wants to develop their big data processing skills, as well as anyone who would like to pass the popular Databricks Certified Associate Developer for Apache Spark certification using PySpark.

  • If you want to learn how to use Apache Spark with the Scala programming language, this course isn't a fit. We focus on Python and PySpark exclusively, but the fundamental Spark concepts taught are applicable to both languages.
  • Data analysts and developers who want to add verified big data skills and Databricks experience to their portfolio
  • Data engineers who want or need proof of their Apache Spark skills via a certification to boost their career
  • Data scientists wanting to work efficiently and frustration-free with large data sets in Apache Spark
  • Companies that want to enable their data staff to use Apache Spark in a professional, time- and cost-efficient way
  • Anyone wanting to brush up on their Apache Spark skills with a solid understanding of how it works under the hood


Watch Online Apache Spark Certification Training

# Title Duration
1 Introduction 09:48
2 Certification Exam Overview 05:03
3 Signing up for Databricks Community Edition 01:44
4 Loading Data Into Databricks 02:43
5 Overview of the Spark Cluster Architecture and its Components 08:24
6 Getting to Know the Spark Driver 11:55
7 Getting to Know Executors 07:37
8 Discovering Execution Modes 17:33
9 Overview 05:38
10 Internal Types, DataFrames, Datasets, RDDs, and the Spark SQL API 19:10
11 Hands-on Session: Exploring Data APIs on Databricks Community Edition 08:26
12 Intro to Labs 01:12
13 Intro & Creating DataFrames 06:58
14 Exercise: Creating a DataFrame 01:07
15 Exercise: Creating a DataFrame - Solution 01:59
16 Working with Schemas 26:10
17 Exercise: Building a Simple Schema 01:46
18 Exercise: Building a Simple Schema - Solution 05:13
19 Exercise: Building a Complex Schema 02:28
20 Exercise: Building a Complex Schema - Solution 05:53
21 Type Conversion of DataFrame Columns 07:20
22 Exercise: Changing the Type of a Column 01:50
23 Exercise: Changing the Type of a Column - Solution 04:20
24 Overview 09:18
25 Shuffles 07:52
26 Data Skew 13:15
27 Spark Configurations for Partitions 03:47
28 Hands-on Session: The Power of Partitions 30:18
29 Storage Layout 17:39
30 Caching and Storage Levels 10:29
31 Memory in Action 30:59
32 Hands-on Session: Executor Memory Management - Part 1 10:27
33 Hands-on Session: Executor Memory Management - Part 2 13:08
34 Intro & How to Get Help in PySpark 04:01
35 Partitioning Recap 09:45
36 Exercise: Repartitioning 01:31
37 Exercise: Repartitioning - Solution 06:08
38 Caching Recap 03:27
39 Exercise: Caching 01:13
40 Exercise: Caching - Solution 03:20
41 Overview 07:40
42 Hands-on Session: Actions vs. Transformations 06:47
43 Intro & Reading Data 18:36
44 Exercise: Reading Parquet Files 02:20
45 Exercise: Reading Parquet Files - Solution 03:44
46 Reading from CSV Files 17:18
47 Exercise: Reading CSV Files 02:29
48 Exercise: Reading CSV Files - Solution 03:55
49 Reading from JSON Files 05:16
50 Writing Data 10:57
51 Exercise: Writing to Parquet Files 02:08
52 Exercise: Writing to Parquet Files - Solution 04:27
53 Writing to CSV Files 02:53
54 Exercise: Writing to CSV Files 02:16
55 Exercise: Writing to CSV Files - Solution 03:12
56 Writing to JSON Files 01:58
57 Using PySpark with SQL 05:01
58 Exercise: SQL in PySpark 00:46
59 Exercise: SQL in PySpark - Solution 02:16
60 Overview 16:33
61 Hands-on Session: Discovering the Spark UI 12:27
62 Intro & Removing Data 16:58
63 Exercise: Removing Data 00:59
64 Exercise: Removing Data - Solution 03:16
65 Modifying Data 30:49
66 Exercise: Modifying Data 02:08
67 Exercise: Modifying Data - Solution 07:22
68 Analyzing Data 18:14
69 Exercise: Analyzing Data 01:39
70 Exercise: Analyzing Data - Solution 06:30
71 The Catalyst Optimizer 18:32
72 Adaptive Query Execution 15:32
73 Dynamic Partition Pruning 10:08
74 The DAG: Achieving Fault Tolerance 12:25
75 Intro & Working With Dates and Times 33:30
76 Exercise: Working With Dates and Times 02:10
77 Exercise: Working With Dates and Times - Solution 08:00
78 Working With Strings 15:30
79 Exercise: Working With Strings 03:20
80 Exercise: Working With Strings - Solution 07:47
81 Working with Arrays 14:38
82 Exercise: Working With Arrays 05:17
83 Exercise: Working With Arrays - Solution 13:19
84 Accumulator and Broadcast Variables 11:14
85 Joins 34:02
86 Hands-on Session: Cross-Cluster Communication 42:39
87 Intro & Grouping and Aggregating 19:16
88 Exercise: Grouping and Aggregating 01:43
89 Exercise: Grouping and Aggregating - Solution 07:19
90 Joining 15:06
91 Exercise: Joining 03:58
92 Exercise: Joining - Solution 03:58
93 User-Defined Functions (UDFs) 20:29
94 Exercise: UDFs 04:06
95 Exercise: UDFs - Solution 17:51
96 Signing up for the Exam 02:24
97 Last Minute Preparations 01:34
98 Introduction 04:36
99 Congratulations! 00:50

Read Book Apache Spark Certification Training

# Title
1 Proposed Timeline
2 Apache Spark Certification Exam Guide
3 Mastery Map 1 - Cluster Components
4 Mastery Map 2 - Spark Execution Modes
5 Mastery Map 3 - Spark Data APIs
6 Mastery Map 4 - Executor Memory Layout
7 Mastery Map 5 - PySpark Storage Levels
8 Mastery Map 6 - Executor Out-of-Memory Errors
9 Mastery Map 7 - Actions Vs. Transformations
10 Mastery Map 8 - Execution Hierarchy
11 Mastery Map 9 - A Query, From Plan to Execution
12 Mastery Map 10 - Adaptive Query Execution Strategies
13 Mastery Map 11 - Dynamic Partition Pruning
14 Mastery Map 12 - Joins

Similar courses to Apache Spark Certification Training

Django Masterclass : Build Web Apps With Python & Django
Duration 15 hours 42 minutes 28 seconds

Developing LLM App Frontends with Streamlit
Duration 1 hour 43 minutes 52 seconds

Build a Python REST API with the Django Rest Framework
Duration 10 hours 8 minutes 56 seconds

The Ultimate Django Series: Part 1
Duration 4 hours 49 minutes 19 seconds

Complete Backend (API) Development with Python A-Z
Duration 12 hours 35 minutes 9 seconds

Modern APIs with FastAPI and Python Course
Duration 3 hours 53 minutes 18 seconds

DS4B 101-P: Python for Data Science Automation
Duration 27 hours 6 minutes 1 second

Python 3: Deep Dive (Part 1 - Functional)
Duration 44 hours 40 minutes 37 seconds

The Automation Bootcamp: Zero to Mastery
Duration 22 hours 39 minutes 15 seconds