Contact Tracing with Elasticsearch

1h 37m 3s
English
Paid

Course description

In this engineering project, you will learn to track user movements through the scans they make with their phones. The aim of the project is to use Elasticsearch as a search engine to analyze a dataset in which 100,000 users visit stores and make 1,000,000 scans.

You will create your own dataset using Python and Pandas, starting from an open dataset of more than 140,000 San Francisco stores with their names and coordinates. From this dataset, you will select 10,000 stores and generate 100,000 fictional users, each of whom will perform an average of 10 check-ins, for a total of about 1,000,000 scans. After uploading the data to Elasticsearch, you will create a user interface with Streamlit for data visualization.
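The generation step described above can be sketched at a small scale. This is a minimal illustration and not the course's own code: it uses only the Python standard library instead of Pandas, and the field names (`device_id`, `business_id`, `business_name`, `zip`, `scan_timestamp`), the sample stores, and the date range are all assumptions made for the example.

```python
import random
import uuid
from datetime import datetime, timedelta


def make_users(n_users, rng):
    """Create n_users fictional users, each identified by a random device ID."""
    return [
        {"device_id": str(uuid.UUID(int=rng.getrandbits(128)))}
        for _ in range(n_users)
    ]


def make_check_ins(users, stores, mean_per_user, rng, start, days):
    """Give every user a random number of check-ins at random stores."""
    check_ins = []
    for user in users:
        # Uniform integers on [1, 2 * mean - 1] average to mean_per_user.
        n_scans = rng.randint(1, 2 * mean_per_user - 1)
        for _ in range(n_scans):
            store = rng.choice(stores)
            offset = timedelta(seconds=rng.randrange(days * 24 * 3600))
            check_ins.append({
                "device_id": user["device_id"],
                "business_id": store["business_id"],
                "scan_timestamp": (start + offset).isoformat(),
            })
    return check_ins


# Hypothetical stand-ins for rows selected from the store dataset.
stores = [
    {"business_id": "b-001", "business_name": "Example Cafe", "zip": "94103"},
    {"business_id": "b-002", "business_name": "Example Books", "zip": "94110"},
]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
users = make_users(1_000, rng)
check_ins = make_check_ins(users, stores, 10, rng, datetime(2021, 1, 1), 30)
print(len(users), "users,", len(check_ins), "check-ins")
```

Scaling `make_users` to 100,000 with the same mean of 10 yields roughly the 1,000,000 check-ins the project works with.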

Your application interface includes:

  • Search by store name
  • Search by ZIP code to filter stores by area
  • Search by business ID for visit analysis
  • Search and track by Device ID to see where a specific user has been
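The four searches listed above map naturally onto Elasticsearch query DSL bodies. The sketch below only builds those bodies as Python dictionaries; it is not taken from the course. The field names are hypothetical, and the `term` queries assume that the ZIP code, business ID, and device ID fields are mapped as `keyword`.

```python
def store_name_query(text):
    """Full-text search on the store name (analyzed `match` query)."""
    return {"query": {"match": {"business_name": text}}}


def zip_code_query(zip_code):
    """Exact filter on the ZIP code; assumes a keyword-mapped field."""
    return {"query": {"term": {"zip": zip_code}}}


def business_id_query(business_id):
    """All scans recorded at a single business."""
    return {"query": {"term": {"business_id": business_id}}}


def device_track_query(device_id):
    """All scans of one device, oldest first, to follow its path."""
    return {
        "query": {"term": {"device_id": device_id}},
        "sort": [{"scan_timestamp": {"order": "asc"}}],
    }


print(device_track_query("example-device-id"))
```

A body like this can be sent to the index's `_search` endpoint. How it is passed through the Python client depends on the client version, so that call is left out here.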

In the course of working on the project, you will learn to:

  • Transform data and upload it in Parquet format to Elasticsearch
  • Work with Kibana for index management and document search
  • Create an interactive interface with Streamlit featuring controls, Folium maps, and tables
  • Configure pages and execute queries to Elasticsearch
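As an illustration of the upload step, documents can be serialized into the newline-delimited JSON that the Elasticsearch Bulk API expects: one action line followed by one document line, with a trailing newline at the end. This is a sketch under assumptions, not the course's method. The index name `scans` and the document fields are hypothetical, and the course may rely on the Python client's bulk helper instead.

```python
import json


def to_bulk_ndjson(docs, index_name):
    """Serialize docs as a Bulk API payload of action/document line pairs."""
    lines = []
    for doc in docs:
        # Action line: index the following document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The Bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical scan documents.
docs = [
    {"device_id": "d-1", "business_id": "b-001", "zip": "94103"},
    {"device_id": "d-2", "business_id": "b-002", "zip": "94110"},
]

payload = to_bulk_ndjson(docs, "scans")
print(payload)
```

The resulting string can be posted to the `_bulk` endpoint with the `application/x-ndjson` content type. For a million documents, the payload would normally be sent in batches rather than as a single request.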

Course Program

  • Preparing the San Francisco dataset with 10,000 stores
  • Generating 100,000 fictional users
  • Merging user data with stores
  • Creating 1,000,000 app check-ins
  • Preparing data for upload to Elasticsearch
  • Uploading data to Elasticsearch
  • Developing a Streamlit application: maps, filters, tables
  • Page setup and working with Elasticsearch queries

Requirements

Before starting, it is recommended to take the course “Log Analysis in Elasticsearch” to understand the basics of working with Elasticsearch. Additionally, due to extensive work with data, it's advisable to complete the lessons on Pandas from the course “Python for Data Engineers”.

The project is designed for a computer with 8 GB of RAM.

Watch Online

# Title Duration
1 Introduction 03:01
2 Setup & Goals 03:28
3 San Francisco dataset 03:49
4 Relational database vs Elasticsearch 06:49
5 Preparing the dev environment 02:05
6 Preparing the SF dataset 1 09:48
7 Preparing the SF dataset 2 08:47
8 Creating 100k fake users 08:59
9 Merging 100k users with SF dataset 06:02
10 Creating app scans for users 08:22
11 Preparing Elasticsearch and loading the data 04:41
12 Creating the Streamlit app basics and Folium maps 02:27
13 Page setup and querying from Elasticsearch 05:28
14 Creating free text search 04:58
15 Zip code search 02:24
16 Business_id search 04:03
17 Search by device ID & tracking people 03:38
18 Summary 03:53
19 Outlook 04:21
