Master Apache Spark, one of the leading big data technologies, and learn to use it with Python, one of the most popular programming languages! In the modern job market, the ability to analyze huge data sets is invaluable, and this course is designed to teach you one of the best technologies for the task. Organizations such as Google, Facebook, Netflix, Airbnb, Amazon, and NASA use Spark to tackle their big data challenges!
Course Overview
Spark can run in-memory workloads up to 100 times faster than Hadoop MapReduce, and demand for Spark skills has grown with it! The Spark 2.0 DataFrame API is relatively new, which gives you the chance to quickly become a sought-after professional in the job market.
This course covers everything from a crash course in Python to using Spark DataFrames with the latest Spark 2.0 syntax. From there, you'll learn how to use MLlib, Spark's machine learning library, with the DataFrame syntax. Throughout the course, you'll work through exercises and Mock Consulting Projects that simulate real-world scenarios, so you can apply your new skills to real problems!
We also cover Spark SQL and Spark Streaming, along with advanced models such as Gradient Boosted Trees! After completing this course, you will be confident in adding Spark and PySpark to your resume!
If you're ready to step into the realm of Python, Spark, and Big Data, this is the course for you!
Requirements
- General Programming Skills in any Language (preferably Python)
- 20 GB of free space on your local computer (or a reliable internet connection if you use AWS instead)
Who This Course Is For
- Individuals proficient in Python looking to apply it to Big Data
- Those skilled in another programming language needing to learn Spark
What You'll Learn
- Utilize Python and Spark in unison to analyze Big Data
- Master the new Spark 2.0 DataFrame Syntax
- Engage in Consulting Projects mimicking real-world situations
- Classify Customer Churn using Logistic Regression
- Apply Spark with Random Forests for Classification
- Use Spark's Gradient Boosted Trees
- Create robust Machine Learning Models with Spark's MLlib
- Understand the Databricks Platform
- Get started with Amazon Web Services EC2 for Big Data Analysis
- Learn how to use the AWS Elastic MapReduce (EMR) Service
- Leverage the power of Linux in a Spark Environment
- Create a Spam filter using Spark and Natural Language Processing
- Analyze Tweets in Real Time with Spark Streaming