Apache Spark Certification Training

15h 13m 1s
English
Paid

Apache Spark is a core data skill – here is how to show you've got it!

Learn Apache Spark from the ground up and show off your knowledge with the Databricks Associate Developer for Apache Spark certification. This course will transform you into a PySpark professional and get you ready to pass the popular Databricks Spark certification.

Join me for an easy-to-understand and engaging look into Spark and take your big data career to the next level!


What will you learn?

The goal of this course is to teach you fundamental PySpark skills and prepare you for the Databricks Certified Associate Developer for Apache Spark certification exam.

The course includes 18 modules to help you understand how Apache Spark works internally and how to use it in practice. You can find all topics covered below, but here is an overview:

  • Become a seasoned expert at coding with Spark DataFrames
  • Get confident with the Databricks certification exam content
  • Discover Spark's distributed, fault-tolerant data processing
  • Master how to work with Spark in Databricks
  • Understand the Spark cluster architecture
  • Learn when and how Spark evaluates code
  • Grasp Spark's efficient memory management mechanisms
  • Analyze typical Spark problems like out-of-memory errors
  • See how Spark executes complex operations like joins
  • Become proficient in navigating through the Spark UI
  • ...and many more topics – check out the full list below!

Who is this for?

Anyone with basic Python skills who wants to develop their big data processing skills, and anyone who would like to pass the popular Databricks Certified Associate Developer for Apache Spark certification using PySpark.

  • If you want to learn how to use Apache Spark with the Scala programming language, this course isn't a fit. We focus on Python and PySpark exclusively, but the fundamental Spark concepts taught are applicable to both languages.
  • Data analysts and developers who want to add verified big data skills and Databricks experience to their portfolio
  • Data engineers who want or need proof of their Apache Spark skills via a certification to boost their career
  • Data scientists wanting to work efficiently and frustration-free with large data sets in Apache Spark
  • Companies that want to enable their data staff to use Apache Spark in a professional, time- and cost-efficient way
  • Anyone wanting to brush up on their Apache Spark skills with a solid understanding of how it works under the hood


Watch Online Apache Spark Certification Training

# Title Duration
1 Introduction 09:48
2 Certification Exam Overview 05:03
3 Signing up for Databricks Community Edition 01:44
4 Loading Data Into Databricks 02:43
5 Overview of the Spark Cluster Architecture and its Components 08:24
6 Getting to Know the Spark Driver 11:55
7 Getting to Know Executors 07:37
8 Discovering Execution Modes 17:33
9 Overview 05:38
10 Internal Types, DataFrames, Datasets, RDDs, and the Spark SQL API 19:10
11 Hands-on Session: Exploring Data APIs on Databricks Community Edition 08:26
12 Intro to Labs 01:12
13 Intro & Creating DataFrames 06:58
14 Exercise: Creating a DataFrame 01:07
15 Exercise: Creating a DataFrame - Solution 01:59
16 Working with Schemas 26:10
17 Exercise: Building a Simple Schema 01:46
18 Exercise: Building a Simple Schema - Solution 05:13
19 Exercise: Building a Complex Schema 02:28
20 Exercise: Building a Complex Schema - Solution 05:53
21 Type Conversion of DataFrame Columns 07:20
22 Exercise: Changing the Type of a Column 01:50
23 Exercise: Changing the Type of a Column - Solution 04:20
24 Overview 09:18
25 Shuffles 07:52
26 Data Skew 13:15
27 Spark Configurations for Partitions 03:47
28 Hands-on Session: The Power of Partitions 30:18
29 Storage Layout 17:39
30 Caching and Storage Levels 10:29
31 Memory in Action 30:59
32 Hands-on Session: Executor Memory Management - Part 1 10:27
33 Hands-on Session: Executor Memory Management - Part 2 13:08
34 Intro & How to Get Help in PySpark 04:01
35 Partitioning Recap 09:45
36 Exercise: Repartitioning 01:31
37 Exercise: Repartitioning - Solution 06:08
38 Caching Recap 03:27
39 Exercise: Caching 01:13
40 Exercise: Caching - Solution 03:20
41 Overview 07:40
42 Hands-on Session: Actions vs. Transformations 06:47
43 Intro & Reading Data 18:36
44 Exercise: Reading Parquet Files 02:20
45 Exercise: Reading Parquet Files - Solution 03:44
46 Reading from CSV Files 17:18
47 Exercise: Reading CSV Files 02:29
48 Exercise: Reading CSV Files - Solution 03:55
49 Reading from JSON Files 05:16
50 Writing Data 10:57
51 Exercise: Writing to Parquet Files 02:08
52 Exercise: Writing to Parquet Files - Solution 04:27
53 Writing to CSV Files 02:53
54 Exercise: Writing to CSV Files 02:16
55 Exercise: Writing to CSV Files - Solution 03:12
56 Writing to JSON Files 01:58
57 Using PySpark with SQL 05:01
58 Exercise: SQL in PySpark 00:46
59 Exercise: SQL in PySpark - Solution 02:16
60 Overview 16:33
61 Hands-on Session: Discovering the Spark UI 12:27
62 Intro & Removing Data 16:58
63 Exercise: Removing Data 00:59
64 Exercise: Removing Data - Solution 03:16
65 Modifying Data 30:49
66 Exercise: Modifying Data 02:08
67 Exercise: Modifying Data - Solution 07:22
68 Analyzing Data 18:14
69 Exercise: Analyzing Data 01:39
70 Exercise: Analyzing Data - Solution 06:30
71 The Catalyst Optimizer 18:32
72 Adaptive Query Execution 15:32
73 Dynamic Partition Pruning 10:08
74 The DAG: Achieving Fault Tolerance 12:25
75 Intro & Working With Dates and Times 33:30
76 Exercise: Working With Dates and Times 02:10
77 Exercise: Working With Dates and Times - Solution 08:00
78 Working With Strings 15:30
79 Exercise: Working With Strings 03:20
80 Exercise: Working With Strings - Solution 07:47
81 Working with Arrays 14:38
82 Exercise: Working With Arrays 05:17
83 Exercise: Working With Arrays - Solution 13:19
84 Accumulator and Broadcast Variables 11:14
85 Joins 34:02
86 Hands-on Session: Cross-Cluster Communication 42:39
87 Intro & Grouping and Aggregating 19:16
88 Exercise: Grouping and Aggregating 01:43
89 Exercise: Grouping and Aggregating - Solution 07:19
90 Joining 15:06
91 Exercise: Joining 03:58
92 Exercise: Joining - Solution 03:58
93 User-Defined Functions (UDFs) 20:29
94 Exercise: UDFs 04:06
95 Exercise: UDFs - Solution 17:51
96 Signing up for the Exam 02:24
97 Last Minute Preparations 01:34
98 Introduction 04:36
99 Congratulations! 00:50

Read Book Apache Spark Certification Training

# Title
1 1-Proposed Timeline
2 Apache Spark Certification Exam Guide
3 Mastery Map 1 - Cluster Components
4 Mastery Map 2 - Spark Execution Modes
5 Mastery Map 3 - Spark Data APIs
6 Mastery Map 4 - Executor Memory Layout
7 Mastery Map 5 - PySpark Storage Levels
8 Mastery Map 6 - Executor Out-of-Memory Errors
9 Mastery Map 7 - Actions Vs. Transformations
10 Mastery Map 8 - Execution Hierarchy
11 Mastery Map 9 - A Query, From Plan to Execution
12 Mastery Map 10 - Adaptive Query Execution Strategies
13 Mastery Map 11 - Dynamic Partition Pruning
14 Mastery Map 12 - Joins
