Data Engineering on GCP

1h 17m 33s
English
Paid

Google Cloud Platform (GCP) is one of the most popular cloud platforms in the world, providing an extensive set of tools and services for building, managing, and optimizing data pipelines. GCP enables efficient storage, processing, analysis, and visualization of data, helping data engineers create scalable and high-performance solutions.

What You Will Learn in the Course

In this hands-on course, you will build your own project on GCP step by step:

  • Extract data from an external weather API
  • Process it through a pipeline using GCP cloud services
  • Store the data in a MySQL database
  • Create visualizations using Looker Studio

The course will help you master GCP from scratch, and the skills you acquire will also be useful when working with other cloud platforms such as AWS, as many of the services are quite similar.

In addition to the course, you will have access to a GitHub repository with a project overview and ready-made code snippets to help you quickly replicate the training examples.

Course Structure

Project Data and Goals

We will walk through the pipeline architecture, define the project goals, and get acquainted with the API used to obtain weather data. You will also learn how to set up a Google Cloud account and activate the necessary services. (By the way, Google provides $300 in free credit for trying out the platform!)

Project Preparation

Create a project in Google Cloud, activate APIs, and set up schedules for task automation.

Pipeline Creation: Extracting Data from API

Set up the necessary resources for pipeline operation:

  • A MySQL database hosted on Cloud SQL
  • A Linux virtual machine on Compute Engine for database management
  • Cloud Scheduler for scheduling API calls
  • Cloud Functions for data processing
  • Pub/Sub message queue for data transfer between services
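
To make the extraction step concrete, here is a minimal sketch of what the function triggered by Cloud Scheduler might do: fetch one weather reading and pass it to the message queue as bytes. It is not the course's actual code. The names `encode_message`, `pull_weather`, `fetch`, and `publish`, and the fields `city` and `temp_c`, are assumptions for illustration. The API call and the Pub/Sub client are injected as plain callables so the sketch runs without cloud credentials.

```python
import json
from typing import Callable


def encode_message(reading: dict) -> bytes:
    """Serialize one weather reading to UTF-8 JSON bytes (message payloads are bytes)."""
    return json.dumps(reading, sort_keys=True).encode("utf-8")


def pull_weather(fetch: Callable[[], dict], publish: Callable[[bytes], object]) -> bytes:
    """Fetch one reading and hand it to the message queue.

    `fetch` and `publish` are hypothetical stand-ins: in a deployed function,
    `fetch` would wrap the HTTP call to the weather API and `publish` would
    wrap the Pub/Sub client's publish call for the project's topic.
    """
    reading = fetch()
    message = encode_message(reading)
    publish(message)
    return message


if __name__ == "__main__":
    # Fake collaborators so the sketch runs locally.
    sent = []
    pull_weather(lambda: {"city": "Berlin", "temp_c": 21.5}, sent.append)
    print(sent[0].decode("utf-8"))
```

Keeping the fetch and publish steps behind simple callables also makes the function easy to test before it is deployed.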

Writing Data to the Database

Learn to write Cloud Functions that store data in MySQL, test the write path, and verify that the data is stored correctly.
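
As a rough illustration of the write step, the sketch below decodes a Pub/Sub-style event and inserts the reading with a parameterized query. It assumes the function receives the message body base64-encoded under `event["data"]`. The table `weather_readings` and its columns are hypothetical, as are the function names. The cursor is any Python DB-API cursor whose driver uses `%s` placeholders, as common MySQL drivers do. The course repository contains the actual code.

```python
import base64
import json

# Hypothetical table and column names, chosen only for this sketch.
INSERT_SQL = (
    "INSERT INTO weather_readings (city, temp_c, observed_at) "
    "VALUES (%s, %s, %s)"
)


def decode_event(event: dict) -> dict:
    """Decode the base64-encoded JSON payload carried in event['data']."""
    raw = base64.b64decode(event["data"])
    return json.loads(raw.decode("utf-8"))


def write_reading(cursor, event: dict) -> tuple:
    """Insert one reading through a DB-API cursor and return the bound parameters."""
    reading = decode_event(event)
    params = (reading["city"], reading["temp_c"], reading["observed_at"])
    # Parameters are passed separately so the driver handles escaping.
    cursor.execute(INSERT_SQL, params)
    return params


class RecordingCursor:
    """Stand-in cursor that records calls instead of talking to a database."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params):
        self.calls.append((sql, params))


if __name__ == "__main__":
    payload = {"city": "Berlin", "temp_c": 21.5, "observed_at": "2024-01-01T12:00:00Z"}
    event = {"data": base64.b64encode(json.dumps(payload).encode("utf-8"))}
    cursor = RecordingCursor()
    print(write_reading(cursor, event))
```

Testing against a recording cursor first, then against the real database from the VM client, mirrors the "test the write path" step described above.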

Data Visualization

Set up Looker Studio to create clear visualizations: build bubble charts and time series charts, and set up monitoring of the weather data.

This course will give you practical experience with Google Cloud Platform tools and help develop key skills for working as a data engineer.

Additional

GitHub repository: https://github.com/team-data-science/Data-Engineering-On-GCP

About the Author: Andreas Kretz

Andreas Kretz is a German data engineer and one of the most widely followed independent voices on data engineering as a career discipline. He runs the Plumbers of Data Science brand and has been publishing tutorial material continuously since the field consolidated around the modern lakehouse stack (Spark, Kafka, Snowflake, Databricks, Airflow).

His CourseFlix listing is the largest single-author catalog on the site: over thirty courses spanning data-pipeline construction, streaming architectures, the cloud-native data stack on AWS, Azure, and GCP, the Python and Scala tooling that dominates the field, and the soft-skills and career side of breaking into data engineering. The material is paid and aimed at engineers transitioning into data work or working data engineers picking up specific tools.

All Course Lessons (23)
#   Lesson Title                                      Duration  Access
1   Introduction                                      01:14     Demo
2   GitHub & the team                                 01:31
3   Architecture of this project                      03:20
4   Introduction Weather API                          02:19
5   Setup Google Cloud Account                        02:13
6   Creating the project                              02:36
7   Enabling the required APIs                        01:35
8   Configure scheduling                              02:21
9   Setup VM for database interaction                 02:54
10  Setup MySQL database                              02:17
11  Setup VM client and create database               02:47
12  Creating Pub/Sub message queue                    01:42
13  Create cloud function to pull data from API       04:18
14  Explanation code pull from API                    04:21
15  Create function to write to db                    07:48
16  Explanation code write data to db                 05:57
17  Testing the function                              05:52
18  Create function write data to db - pull           03:54
19  Explanation code write data to db - pull          04:34
20  Setup Looker Studio and create bubble chart       02:21
21  Setup Looker Studio and create time series chart  01:58
22  Pipeline Monitoring                               06:21
23  Conclusion & Challenges                           03:20

Frequently asked questions

What are the prerequisites for enrolling in the course?
The course does not explicitly list prerequisites, but familiarity with cloud platforms and basic programming concepts will be beneficial. The course starts with setting up a Google Cloud account and progresses through pipeline creation, which may require some technical understanding.

What types of projects will I work on during the course?
During the course, you will work on a project that involves extracting data from an external weather API, processing it through a pipeline using Google Cloud Platform services, storing it in a MySQL database, and creating visualizations with Looker Studio.

Who is the target audience for this course?
The course is designed for aspiring and current data engineers who want to learn about building data pipelines on Google Cloud Platform. It is also suitable for professionals looking to expand their skills to include cloud-based data processing and visualization.

How does the depth of this course compare to similar courses?
This course provides a project-based approach to learning about Google Cloud Platform, allowing students to gain practical experience in data extraction, processing, storage, and visualization. The focus on a real-world project distinguishes it from other courses that might only cover theoretical aspects.

Which specific tools and platforms are covered in the course?
The course covers several Google Cloud Platform services such as Pub/Sub, Cloud Functions, and Looker Studio. Students will also use MySQL for database storage and learn to set up virtual machines for database interaction.

What topics are not covered in this course?
The course does not cover advanced machine learning algorithms or detailed network security practices. It focuses on building and managing data pipelines using GCP rather than in-depth data science or security topics.

How can the skills learned in this course be applied to other careers or platforms?
The skills learned in the course are applicable to other cloud platforms such as AWS, as many of the concepts and services are similar. Understanding how to build data pipelines and create visualizations is valuable for roles in data engineering, analytics, and cloud architecture across various industries.