Contact Tracing with Elasticsearch

1h 37m 3s
English
Paid

Contact Tracing with Elasticsearch is a 19-lesson, self-paced course by Andreas Kretz with a total runtime of 1 hour 37 minutes. In this engineering project, you'll learn to trace user movements through their phone scans using Elasticsearch.

Course facts

Lessons: 19
Duration: 1 hour 37 minutes
Level: All levels
Language: English
Instructor: Andreas Kretz
Price: Premium

Embark on an intriguing journey in this engineering project where you'll learn to trace user movements through their phone scans using Elasticsearch. This project's goal is to employ Elasticsearch as a search system to analyze a comprehensive dataset in which 100,000 users visit stores and make 1,000,000 scans.

Create Your Dataset

Utilize Python and Pandas to craft your own dataset from an open San Francisco stores dataset, which features over 140,000 stores with their names and coordinates. Learn to refine this dataset down to 10,000 selected stores and generate 100,000 fictional users, each performing an average of 10 check-ins. Once the data preparation is complete, you'll upload it to Elasticsearch and build a vibrant user interface with Streamlit for robust data visualization.
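
The data-generation step described above could be sketched roughly as follows. This is a minimal illustration, not the course's actual code: the column names (`business_id`, `name`, `latitude`, `longitude`), the `device-000001` ID format, and the helper `make_checkins` are assumptions, and every user gets exactly the same number of scans rather than an average of 10.

```python
import numpy as np
import pandas as pd


def make_checkins(stores: pd.DataFrame, n_users: int,
                  scans_per_user: int, seed: int = 42) -> pd.DataFrame:
    """Create fictional users and give each one a fixed number of store scans."""
    rng = np.random.default_rng(seed)
    # Hypothetical device IDs: zero-padded sequential strings.
    device_ids = np.array([f"device-{i:06d}" for i in range(n_users)])
    n_scans = n_users * scans_per_user
    scans = pd.DataFrame({
        # Repeat each device ID once per scan.
        "device_id": np.repeat(device_ids, scans_per_user),
        # Pick a random store for every scan.
        "business_id": rng.choice(stores["business_id"].to_numpy(), size=n_scans),
    })
    # Attach store name and coordinates to every scan.
    return scans.merge(stores, on="business_id", how="left")


# Tiny stand-in for the filtered San Francisco stores table.
stores = pd.DataFrame({
    "business_id": [101, 102, 103],
    "name": ["Cafe A", "Bakery B", "Market C"],
    "latitude": [37.77, 37.78, 37.79],
    "longitude": [-122.42, -122.41, -122.40],
})
checkins = make_checkins(stores, n_users=5, scans_per_user=10)
```

At the scale mentioned in the course description, the same call would use a 10,000-row stores table with `n_users=100_000` and `scans_per_user=10`, yielding 1,000,000 rows.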

Application Interface Features

  • Search by store name
  • Search by ZIP code to filter stores by area
  • Search by business ID for visit analysis
  • Search and track by Device ID to observe specific user movements
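
Each of these four searches maps naturally onto a small Elasticsearch Query DSL body. The sketch below only builds the request dictionaries; the field names (`name`, `zip`, `business_id`, `device_id`, `scan_timestamp`) and the function names are hypothetical, and the `term` queries assume those fields are mapped as keyword or numeric types rather than analyzed text.

```python
from typing import Any, Dict


def store_name_query(text: str) -> Dict[str, Any]:
    """Free-text search on the store name (analyzed full-text match)."""
    return {"query": {"match": {"name": text}}}


def zip_query(zip_code: str) -> Dict[str, Any]:
    """Exact match on the ZIP code to filter stores by area."""
    return {"query": {"term": {"zip": zip_code}}}


def business_query(business_id: str) -> Dict[str, Any]:
    """All scans recorded at a single business."""
    return {"query": {"term": {"business_id": business_id}}}


def device_query(device_id: str, size: int = 100) -> Dict[str, Any]:
    """All scans by one device, ordered in time to follow its movements."""
    return {
        "size": size,
        "query": {"term": {"device_id": device_id}},
        "sort": [{"scan_timestamp": "asc"}],
    }


example = device_query("device-000001")
```

Sorting the device query by timestamp is what turns a plain list of hits into a movement trail that can be drawn on a map.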

Skills and Learning Outcomes

Throughout this project, you'll develop the ability to:

  • Transform and upload data in Parquet format to Elasticsearch
  • Utilize Kibana for effective index management and document search
  • Design an interactive interface using Streamlit with controls, Folium maps, and tables
  • Configure pages and execute intricate queries on Elasticsearch
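
As a rough idea of what the upload step might involve, the sketch below turns plain row dictionaries into the action format accepted by the bulk helper of the official Python client. The index name `scans`, the sample fields, and the helper `to_bulk_actions` are assumptions for illustration; the course may structure its upload differently.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def to_bulk_actions(records: Iterable[Dict[str, Any]], index_name: str,
                    id_field: Optional[str] = None) -> Iterator[Dict[str, Any]]:
    """Wrap each record in a bulk action targeting the given index."""
    for record in records:
        action: Dict[str, Any] = {"_index": index_name, "_source": record}
        if id_field is not None:
            # Reuse a column as the document ID so re-uploads overwrite rows.
            action["_id"] = record[id_field]
        yield action


# Hypothetical sample rows; in practice they could come from
# pd.read_parquet(...).to_dict("records").
sample = [
    {"scan_id": 1, "device_id": "device-000001", "business_id": 101},
    {"scan_id": 2, "device_id": "device-000001", "business_id": 103},
]
actions = list(to_bulk_actions(sample, "scans", id_field="scan_id"))

# With a running cluster, the actions could then be sent like this:
#   from elasticsearch import Elasticsearch, helpers
#   helpers.bulk(Elasticsearch("http://localhost:9200"), actions)
```

Yielding actions from a generator keeps memory use flat, which matters once the input grows to a million scans.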

Course Program Overview

  • Preparation of the San Francisco dataset with 10,000 stores
  • Generation of 100,000 fictional user profiles
  • Integration of user data with store information
  • Creation of 1,000,000 app check-ins
  • Data preparation for Elasticsearch upload
  • Data upload to Elasticsearch
  • Streamlit application development including maps, filters, and tables
  • Page configuration and Elasticsearch query execution

Prerequisites and Recommendations

It is recommended to complete the “Log Analysis in Elasticsearch” course to gain foundational knowledge in Elasticsearch. Additionally, consider taking the Pandas lessons from the course “Python for Data Engineers” to enhance your data manipulation skills.

This project is best suited for systems equipped with at least 8 GB of RAM.

Additional Resources

https://github.com/team-data-science/ElasticSearch-contact-tracing

Who teaches Contact Tracing with Elasticsearch? Andreas Kretz

Andreas Kretz is a German data engineer and one of the most widely followed independent voices on data engineering as a career discipline. He runs the Plumbers of Data Science brand and has been publishing tutorial material continuously since the field consolidated around the modern lake-house stack (Spark, Kafka, Snowflake, Databricks, Airflow).

His CourseFlix listing is the largest single-author catalog under this source — over thirty courses spanning data-pipeline construction, streaming architectures, the cloud-native data stack on AWS / Azure / GCP, the Python and Scala tooling that dominates the field, and the soft-skills / career side of breaking into data engineering. Material is paid and aimed at engineers transitioning into data work or already-working data engineers picking up specific tools.

What lessons are included in Contact Tracing with Elasticsearch?

All Course Lessons (19)
#   Lesson Title                                        Duration  Access
1   Introduction                                        03:01     Demo
2   Setup & Goals                                       03:28
3   San Francisco dataset                               03:49
4   Relational database vs Elasticsearch                06:49
5   Preparing the dev environment                       02:05
6   Prepare the SF dataset 1                            09:48
7   Preparing the SF dataset 2                          08:47
8   Creating 100k fake users                            08:59
9   Merging 100k users with SF dataset                  06:02
10  Creating app scans for users                        08:22
11  Preparing Elasticsearch and loading the data        04:41
12  Creating the Streamlit app basics and Folium maps   02:27
13  Page setup and querying from Elasticsearch          05:28
14  Creating free text search                           04:58
15  Zip code search                                     02:24
16  Business_id search                                  04:03
17  Search by device ID & tracking people               03:38
18  Summary                                             03:53
19  Outlook                                             04:21

Frequently asked questions

What are the prerequisites for this course?
There are no hard prerequisites, but the course recommends completing “Log Analysis in Elasticsearch” first for foundational Elasticsearch knowledge, along with the Pandas lessons from “Python for Data Engineers”. Familiarity with Python and Pandas is beneficial because both are used to create and refine the dataset. A basic understanding of Elasticsearch and Streamlit also helps, although the course walks through how each is used in the project.
What will I build by the end of this course?
By the end of the course, you will have developed a fully functional application capable of conducting analyses on user movement data. This includes creating a dataset from a San Francisco store dataset, uploading it to Elasticsearch, and building an interactive user interface with Streamlit that allows for searching and tracking user activity by store, ZIP code, business ID, and device ID.
Who is the target audience for this course?
This course is aimed at individuals interested in data engineering and analysis, particularly those who want to learn about integrating Elasticsearch into data-driven applications. It is suitable for those who wish to explore building user interfaces for data visualization using Streamlit.
How does this course compare in depth and scope to other Elasticsearch courses?
While many Elasticsearch courses focus on search and data indexing, this course distinguishes itself by providing a hands-on project that integrates data preparation, uploading, and visualization using Streamlit. It combines practical application development with data analysis, making it more comprehensive for those interested in end-to-end solutions.
What specific tools and platforms are used in this course?
The course uses Python, Pandas, Elasticsearch, Kibana, Streamlit, and Folium. Python and Pandas handle data manipulation, Elasticsearch provides data storage and search, Kibana is used for index management and document search, and Streamlit with Folium maps powers the user interface.
What topics are not covered in this course?
The course does not cover advanced Elasticsearch features such as machine learning or security configurations. It focuses primarily on setting up basic search and data visualization functionalities. Topics like relational database management or detailed Python programming are also not covered in depth.
What is the expected time commitment for completing this course?
The 19 lessons run 1 hour 37 minutes in total. Given the hands-on nature of the project, students should expect to spend additional time outside the video lessons on data preparation, application development, and getting familiar with the tools used.