Dockerized ETL With AWS, TDengine & Grafana

Duration: 29m 12s, Language: English, Access: Paid

Data engineers often need to set up a simple ETL script quickly, one that just gets the job done. In this project, you will learn how to implement such an ETL on AWS: pull live data from a weather API and write it to a TDengine time-series database.

Course Overview

This course shows how to combine Docker, AWS, TDengine, and Grafana into a working ETL pipeline. Each part of the pipeline is built hands-on, from reading the API to visualizing the stored data.

Learning Objectives

The Basics of Time-Series Databases

You will get acquainted with the basics of time-series databases: their architecture, their advantages, and their use cases.

Working with a Public Weather API

Learn to set up and explore an external weather API, and write a Python script to read real-time data from the API.
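
A minimal sketch of such a script, assuming a JSON API reachable over HTTPS. The endpoint URL, the `key` and `q` query parameters, and the `current.last_updated_epoch`, `current.temp_c`, and `current.humidity` fields are placeholders chosen for illustration, not necessarily the API used in the course; adjust them to your provider's documentation.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; replace with the weather API you signed up for.
BASE_URL = "https://api.example.com/v1/current.json"


def build_url(api_key: str, city: str) -> str:
    """Build the request URL with URL-encoded query parameters."""
    query = urllib.parse.urlencode({"key": api_key, "q": city})
    return f"{BASE_URL}?{query}"


def parse_reading(payload: dict) -> dict:
    """Pick the fields to store (field names are assumed, see above)."""
    current = payload["current"]
    return {
        "ts": int(current["last_updated_epoch"]) * 1000,  # epoch milliseconds
        "temperature": float(current["temp_c"]),
        "humidity": float(current["humidity"]),
    }


def fetch_reading(api_key: str, city: str, timeout: float = 10.0) -> dict:
    """Call the API once and return a flat reading."""
    with urllib.request.urlopen(build_url(api_key, city), timeout=timeout) as resp:
        return parse_reading(json.load(resp))
```

Keeping `parse_reading` separate from the network call makes it possible to check the parsing logic against a saved sample response without hitting the API.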

Docker ETL on AWS

Discover how to package the script into a Docker container and deploy it as a serverless ETL using Amazon Elastic Container Registry (ECR), Lambda, and EventBridge.
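
The entry point of such a container can be sketched as a plain Python function that follows Lambda's `handler(event, context)` convention. The `make_handler` helper and its `fetch` and `write` arguments are hypothetical names introduced here for illustration; the course repository may structure this differently.

```python
import json
from typing import Callable


def make_handler(fetch: Callable[[], dict], write: Callable[[dict], None]):
    """Return a Lambda-style handler that runs one extract-and-load cycle."""

    def handler(event, context):
        reading = fetch()  # extract: one call to the weather API
        write(reading)     # load: one insert into the database
        return {"statusCode": 200, "body": json.dumps(reading)}

    return handler


# Example wiring with stand-in functions; in the real container these would
# call the weather API and the TDengine client.
handler = make_handler(lambda: {"temperature": 12.5}, lambda reading: None)
```

With an AWS-provided Python base image, the image's command names this function (for example `app.handler` if the file is `app.py`), and an EventBridge rule with a schedule expression such as `rate(15 minutes)` invokes it periodically. The file name and interval are assumptions.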

TDengine Setup

Get familiar with TDengine, set up an instance via the TDengine Cloud, and configure the database for optimum performance.
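
To illustrate what a write to TDengine looks like, here is a sketch that builds an `INSERT ... USING ... TAGS` statement, which creates the per-city subtable on first use. The database name `weather`, the supertable `readings`, and its columns are assumptions made for this example, not the schema from the course; sending the statement through a TDengine client is left out.

```python
import re

# Assumed schema (hypothetical), created once beforehand:
#   CREATE STABLE weather.readings
#     (ts TIMESTAMP, temperature FLOAT, humidity FLOAT)
#     TAGS (city BINARY(64));


def build_insert(city: str, ts_ms: int, temperature: float, humidity: float) -> str:
    """Build one INSERT statement for a single weather reading."""
    # Accept only letters and spaces so the city can be embedded safely
    # in both the subtable name and the quoted tag value.
    if not re.fullmatch(r"[A-Za-z ]+", city):
        raise ValueError(f"unsupported city name: {city!r}")
    table = "t_" + city.lower().replace(" ", "_")
    return (
        f"INSERT INTO weather.{table} USING weather.readings "
        f"TAGS ('{city}') VALUES ({ts_ms}, {temperature}, {humidity})"
    )


print(build_insert("New York", 1700000000000, 12.5, 80.0))
```

One subtable per city under a shared supertable is the usual TDengine modelling pattern: the tag identifies the source, and the timestamp column orders the readings.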

Data Visualization in Grafana

Learn how to visualize data from the API stored in TDengine using Grafana. Connect TDengine to Grafana and create a comprehensive dashboard for data analysis.

Course Benefits

  • Hands-on experience with real-world data integration and visualization.
  • Skills to implement Dockerized ETL projects on the AWS cloud.
  • A solid understanding of time-series databases and their applications.
  • Ability to leverage Grafana for impactful data visualization.

Additional

https://github.com/team-data-science/dockerized-etl-aws-tdengine

https://github.com/team-data-science/dockerized-etl-aws-tdengine/blob/main/src/writer_json.py

About the Author: Andreas Kretz

Andreas Kretz is a German data engineer and one of the most widely followed independent voices on data engineering as a career discipline. He runs the Plumbers of Data Science brand and has been publishing tutorial material continuously since the field consolidated around the modern lake-house stack (Spark, Kafka, Snowflake, Databricks, Airflow).

His CourseFlix listing is the largest single-author catalog under this source — over thirty courses spanning data-pipeline construction, streaming architectures, the cloud-native data stack on AWS / Azure / GCP, the Python and Scala tooling that dominates the field, and the soft-skills / career side of breaking into data engineering. Material is paid and aimed at engineers transitioning into data work or already-working data engineers picking up specific tools.

All Course Lessons (16)

  #   Lesson Title                                 Duration  Access
  1   Quick note from Andreas before you begin     00:44     Demo
  2   Introduction                                 01:27
  3   Setup Of The Project                         02:53
  4   Time Series Data Basics                      02:21
  5   Big Pros Of Timeseries Databases             02:07
  6   About TDengine                               01:23
  7   Setup Weather API                            01:05
  8   Code query API                               02:42
  9   TDengine Setup                               03:05
  10  Connect Python To TDengine                   01:51
  11  Lambda Docker Container & Push To ECR        01:56
  12  AWS Setup                                    01:37
  13  Create Lambda Function Using Docker image    01:05
  14  Schedule Function With EventBridge           01:26
  15  Cloud Watch Lambda Events                    00:28
  16  Grafana Setup                                03:02
Frequently asked questions

What prerequisites are needed before enrolling in this course?
Before enrolling, students should have a basic understanding of Python scripting and familiarity with cloud computing concepts. Some experience with Docker and AWS services is beneficial but not mandatory, as the course guides you through setting up Docker containers and deploying them on AWS.
What kind of project will I work on during the course?
The course focuses on building a Dockerized ETL project that involves connecting live weather data from a public API, storing it in a TDengine time-series database, and visualizing it using Grafana. You'll learn to automate these processes using AWS services like Lambda and EventBridge.
Who is the target audience for this course?
This course is designed for data engineers and developers who want to enhance their skills in data integration and visualization using modern tools and platforms. It is also suitable for those interested in learning about Dockerized ETL processes on AWS.
How does the depth of this course compare to similar courses?
The course offers a hands-on approach with 16 lessons, focusing on practical implementation of ETL processes using Docker, AWS, TDengine, and Grafana. Unlike theoretical courses, it emphasizes real-world application and integration of these technologies.
Which specific tools and platforms are covered in the course?
The course covers Docker for containerization, AWS for deploying serverless ETL processes (using Lambda, ECR, and EventBridge), TDengine for managing time-series data, and Grafana for data visualization. It also involves working with a public weather API and Python scripting.
What topics are not covered in this course?
The course does not cover advanced programming techniques, in-depth AWS architecture beyond the services used, or detailed Grafana dashboard customization. It remains focused on the integration and deployment of ETL processes using the specified tools.
What is the expected time commitment for completing the course?
The course comprises 16 short lessons with a total runtime of 29m 12s. Beyond watching the videos, students should allocate time for the practical work, such as setting up AWS, TDengine, and Grafana and running the code themselves.