Data processing and analysis

Courses on data analysis and processing, artificial intelligence, machine learning, big data sorting, and more.

Courses in Data processing and analysis

  • The Data Engineering Bootcamp: Zero to Mastery

    Learn to build streaming pipelines with Apache Kafka and Flink, create data lakes on AWS, run ML workflows on Spark, and integrate LLM models into...
    16 hours 46 minutes 22 seconds
  • Fundamentals of Apache Spark and PySpark

    Study Apache Spark and PySpark for big data processing. Practical assignments will help you acquire key skills of a data engineer.
    2 hours 20 minutes 54 seconds
  • Analytics Engineering for Data Professionals

    Analytics Engineering is the foundation of Data Science and artificial intelligence. This approach represents a dynamic combination of data engineering and...
    12 hours 46 minutes 13 seconds
  • Semantic Log Indexing & Search

    Semantic search is one of the most practical ways to apply generative AI in real-world data processing projects. In this course, we go beyond...
    53 minutes 37 seconds
  • Apache Iceberg Fundamentals

    Modern data platforms need the flexibility of data lakes and the reliability of warehouses. Apache Iceberg combines both approaches. In this course, you will...
    33 minutes 32 seconds
  • Fundamentals of Apache Airflow

    This practical course starts with the basics and guides you step by step through building real orchestration scenarios - from task retry executions to...
    2 hours 21 minutes 18 seconds
  • Azure Data Pipelines with Terraform

    Azure is becoming an increasingly popular platform for companies using the Microsoft 365 ecosystem. If you want to enhance your data engineering skills...
    4 hours 20 minutes 29 seconds
  • Machine Learning & Containers on AWS

    In this practical course, you will learn how to build a complete data pipeline on the AWS platform - from obtaining data from the Twitter API to analysis, stora
    1 hour 33 minutes 34 seconds
  • Storing & Visualizing Time Series Data

    Processing, storing, and visualizing time series data is becoming an increasingly important task. From IoT data and system logs to statistics...
    2 hours 11 minutes 34 seconds
  • Data Engineering with Hadoop

    Big Data is not just a buzzword but a real phenomenon. Every day, companies around the world collect and process massive volumes of data at a high...
    7 hours 3 minutes
  • Dockerized ETL With AWS, TDengine & Grafana

    Data engineers often need to quickly set up a simple ETL script that just does its job. In this project, you will learn how to easily implement...
    29 minutes 12 seconds
  • Streaming with Kafka & Spark

    This course is a comprehensive project with a full cycle of real-time data processing. You will work with data from an online store, including invoices...
    2 hours 46 minutes 25 seconds
  • Data Engineering on AWS

    This course is the perfect start for those who want to learn cloud technologies and start working with Amazon Web Services (AWS), one of the most popular...
    4 hours 46 minutes 38 seconds
  • Data Engineering on Azure

    Microsoft Azure is a cloud platform offering more than 200 products and services for data storage, management, virtual machine deployment, and...
    1 hour 20 minutes 57 seconds
  • Data Engineering on GCP

    Google Cloud Platform (GCP) is one of the most popular cloud platforms in the world, providing an extensive set of tools and services for building...
    1 hour 17 minutes 33 seconds
  • Modern Data Warehouses & Data Lakes

    As a data engineer, you will regularly work with analytics platforms where companies store data in Data Lakes and Data Warehouses for building...
    58 minutes 9 seconds
  • Data Analysis for Beginners: Python & Statistics

    This course is your first step into the world of data analysis using one of the main tools for analysts - Python. Without complicated terms, advanced...
    6 hours 34 minutes 20 seconds
  • dbt for Data Engineers

    dbt (data build tool) is a SQL-first data transformation tool. It allows for simple and transparent transformation, testing, and documentation...
    1 hour 52 minutes 55 seconds
  • Snowflake for Data Engineers

    Snowflake is a next-generation cloud data warehouse that everyone is talking about today. The platform operates 100% in the cloud, providing flexible access...
    2 hours 4 minutes 8 seconds
  • Apache Kafka Fundamentals

    In this course, you will acquire the basic knowledge necessary for confidently starting to work with Apache Kafka. You will learn how to set up a message...
    1 hour 4 minutes 52 seconds
  • Data Engineering on Databricks

    Databricks is one of the most popular platforms for data processing using Apache Spark and creating modern data warehouses (Lakehouse).
    1 hour 27 minutes 29 seconds
  • Schema Design Data Stores

    During my coaching sessions, one important topic repeatedly comes up - designing schemas. Therefore, I decided to create a separate course in the academy to...
    2 hours 30 minutes 25 seconds
  • Learning Apache Spark

    After building data pipelines, data processing is one of the most important tasks in Data Engineering. As a data engineer, you constantly encounter...
    1 hour 44 minutes 4 seconds
  • Choosing Data Stores

    One of the key tasks when creating a data platform and pipelines is the selection of appropriate data storage systems. This course is dedicated to that topic.
    1 hour 25 minutes 31 seconds
  • Dimensional Data Modeling

    In today's world, where data plays a key role, effective organization of information is the foundation for quality analytics and report building.
    1 hour 37 minutes 57 seconds
  • Apache Airflow Workflow Orchestration

    Apache Airflow is a platform-independent tool for workflow orchestration that provides extensive capabilities for creating and...
    1 hour 18 minutes 41 seconds
  • Building APIs with FastAPI

    APIs are the foundation of any modern data platform. You either provide an API for clients or use external APIs yourself. In any case, it's important to be...
    1 hour 35 minutes 40 seconds
  • Platform & Pipeline Security

    A reliable security concept for platforms and pipelines is critically important. Almost anyone can put together a Proof of Concept without an adequate level...
    34 minutes 46 seconds
  • Relational Data Modeling

    Relational modeling is widely used in building transactional databases. You might say, "But I'm not planning to become a backend engineer."
    1 hour 52 minutes
  • Data Platform & Pipeline Design

    Data pipelines are a key component of any Data Science platform. Without them, data loading and machine learning model deployment would not be possible. This...
    1 hour 59 minutes 5 seconds