
Contact Tracing with Elasticsearch

Duration: 1h 37m 3s | Language: English | Access: Paid

In this engineering project, you'll learn to trace user movements through their phone scans using Elasticsearch. The goal is to use Elasticsearch as a search system over a dataset in which 100,000 users visit stores and make 1,000,000 scans.

Create Your Dataset

Use Python and Pandas to build your own dataset from an open San Francisco stores dataset, which lists over 140,000 stores with their names and coordinates. You'll reduce this dataset to 10,000 selected stores and generate 100,000 fictional users, each performing an average of 10 check-ins. Once the data is prepared, you'll upload it to Elasticsearch and build a Streamlit user interface to visualize it.
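
To make the data generation step more concrete, here is a minimal sketch of creating fictional users and check-ins against a list of stores. It uses only the Python standard library so it runs without extra packages, whereas the course itself works with Pandas. The three sample stores, the field names (`business_id`, `device_id`, `scan_timestamp`), the date range, and the fixed 10 scans per user are all illustrative assumptions, not the course's actual code.

```python
import random
import uuid
from datetime import datetime, timedelta

# Hypothetical placeholder rows; the real dataset has its own columns.
SAMPLE_STORES = [
    {"business_id": 1, "store_name": "Sample Cafe", "zip": "94103"},
    {"business_id": 2, "store_name": "Sample Books", "zip": "94110"},
    {"business_id": 3, "store_name": "Sample Market", "zip": "94103"},
]


def make_users(n_users, seed=42):
    """Return n_users fictional users, each with a random device ID."""
    rng = random.Random(seed)
    return [
        {"user_id": i, "device_id": str(uuid.UUID(int=rng.getrandbits(128)))}
        for i in range(n_users)
    ]


def make_scans(users, stores, scans_per_user=10, seed=42):
    """Return one scan record per check-in, with a random store and time."""
    rng = random.Random(seed)
    start = datetime(2021, 1, 1)  # assumed start of a 30-day window
    scans = []
    for user in users:
        for _ in range(scans_per_user):
            store = rng.choice(stores)
            offset = timedelta(minutes=rng.randrange(60 * 24 * 30))
            scans.append(
                {
                    "device_id": user["device_id"],
                    "business_id": store["business_id"],
                    "scan_timestamp": (start + offset).isoformat(),
                }
            )
    return scans


if __name__ == "__main__":
    users = make_users(100)  # scaled down from 100,000
    scans = make_scans(users, SAMPLE_STORES)
    print(len(users), "users,", len(scans), "scans")
```

Scaling `n_users` to 100,000 with 10 scans each would give the 1,000,000 scans described above. Using exactly 10 scans per user is a simplification of "an average of 10".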

Application Interface Features

  • Search by store name
  • Search by ZIP code to filter stores by area
  • Search by business ID for visit analysis
  • Search and track by Device ID to observe specific user movements
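
The four searches above could map to Elasticsearch Query DSL bodies along the following lines. This is a sketch under assumptions: the field names (`store_name`, `zip`, `business_id`, `device_id`, `scan_timestamp`) are hypothetical, and the `term` queries assume those fields are mapped as exact-value types such as `keyword`. The functions only build request bodies and do not contact a cluster.

```python
def store_name_query(text):
    """Full-text search on the (assumed) store_name field."""
    return {"query": {"match": {"store_name": text}}}


def zip_query(zip_code):
    """Exact-match filter on the (assumed) zip field."""
    return {"query": {"term": {"zip": zip_code}}}


def business_query(business_id):
    """All scans recorded at one business."""
    return {"query": {"term": {"business_id": business_id}}}


def device_track_query(device_id, size=100):
    """All scans for one device, oldest first, to follow its movements."""
    return {
        "query": {"term": {"device_id": device_id}},
        "sort": [{"scan_timestamp": {"order": "asc"}}],
        "size": size,
    }


if __name__ == "__main__":
    print(device_track_query("example-device-id"))
```

Sorting the device query by timestamp is what turns a plain lookup into a movement trail: the hits come back in visit order and can be drawn on a map one after another. With the official Python client, bodies like these can typically be passed to its `search` method, though the exact call depends on the client version.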

Skills and Learning Outcomes

Throughout this project, you'll develop the ability to:

  • Transform data stored in Parquet format and upload it to Elasticsearch
  • Use Kibana for index management and document search
  • Design an interactive interface using Streamlit with controls, Folium maps, and tables
  • Configure pages and run queries against Elasticsearch
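
As an illustration of the upload step, the sketch below turns a list of documents into the newline-delimited JSON body that the Elasticsearch `_bulk` API expects: one action line followed by one source line per document, with a trailing newline. The index name `scans` and the sample documents are hypothetical, and the course may load the data differently (for example through a client helper).

```python
import json


def to_bulk_ndjson(docs, index_name):
    """Serialize docs as a _bulk request body of index actions."""
    lines = []
    for doc in docs:
        # Action line: index this document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        {"device_id": "example-device-id", "business_id": 1},
        {"device_id": "example-device-id", "business_id": 2},
    ]
    print(to_bulk_ndjson(sample, "scans"), end="")
```

For a million scans, the documents would normally be sent in batches rather than as one request body.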

Course Program Overview

  • Preparation of the San Francisco dataset with 10,000 stores
  • Generation of 100,000 fictional user profiles
  • Integration of user data with store information
  • Creation of 1,000,000 app check-ins
  • Data preparation for Elasticsearch upload
  • Data upload to Elasticsearch
  • Streamlit application development including maps, filters, and tables
  • Page configuration and Elasticsearch query execution

Prerequisites and Recommendations

It is recommended to complete the “Log Analysis in Elasticsearch” course to gain foundational knowledge in Elasticsearch. Additionally, consider taking the Pandas lessons from the course “Python for Data Engineers” to enhance your data manipulation skills.

This project is best suited for systems equipped with at least 8 GB of RAM.

About the Author: Andreas Kretz


I am a senior data engineer and trainer, a tech enthusiast, and a father. For more than ten years, I have been passionate about Data Engineering. Initially, I became a self-taught data engineer and then led a team of data engineers at a large company. When I realized the great demand for education in this field, I followed my passion and founded my own Data Engineering Academy. Since then, I have helped over 2,000 students achieve their goals.

All Course Lessons (19)

#   Lesson Title                                        Duration  Access
1   Introduction                                        03:01     Demo
2   Setup & Goals                                       03:28
3   San Francisco dataset                               03:49
4   Relational database vs Elasticsearch                06:49
5   Preparing the dev environment                       02:05
6   Prepare the SF dataset 1                            09:48
7   Preparing the SF dataset 2                          08:47
8   Creating 100k fake users                            08:59
9   Merging 100k users with SF dataset                  06:02
10  Creating app scans for users                        08:22
11  Preparing Elasticsearch and loading the data        04:41
12  Creating the Streamlit app basics and Folium maps   02:27
13  Page setup and querying from Elasticsearch          05:28
14  Creating free text search                           04:58
15  Zip code search                                     02:24
16  Business_id search                                  04:03
17  Search by device ID & tracking people               03:38
18  Summary                                             03:53
19  Outlook                                             04:21