Semantic Log Indexing & Search

53m 37s
English
Paid

This course covers semantic search in practice, applying generative AI techniques to a real-world data processing project. It builds on the course The Hidden Foundation of GenAI and puts embeddings to work: you will build a complete semantic search pipeline, from generating embeddings and storing them in a vector database to running natural language queries.

Course Overview

This course is built around a data observability project. You will build a pipeline that collects logs, processes them with FastAPI, and stores the resulting embeddings in Qdrant, a high-performance vector database. You will also build a Streamlit dashboard that searches logs by meaning rather than by keyword, and compare its results with conventional SQL queries in DuckDB.

Key Course Steps

  1. From Embeddings to Search: Revisit the basics of embeddings and delve into how they enable semantic search functionality.
  2. Building a Pipeline: Implement an API with FastAPI for processing logs and generating embeddings.
  3. Working with Qdrant: Explore collections, points, and cosine similarity search, and optimize the embedding structure.
  4. Streamlit Interface: Develop a user-friendly search interface and compare the semantic search approach with traditional SQL.
  5. Improving Accuracy: Discover methods for optimizing embeddings, refining query formulations, and configuring searches.
  6. Launching in Docker: Deploy the entire stack (FastAPI, Qdrant, Streamlit, DuckDB) using Docker Compose.
  7. Bonus: Use DuckDB for analytics, implement a write-ahead log (WAL), handle DuckDB data in Docker, and contrast SQL with vector search.
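
To make the idea behind steps 1 and 3 concrete, here is a minimal sketch of cosine similarity search next to a plain keyword match. It is not the course's code and does not use the Qdrant client: the `POINTS` list is a hypothetical in-memory stand-in for a collection, the three-dimensional vectors are hand-made toy values rather than real embeddings, and the log messages are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


# Hypothetical points: each has an id, a vector, and a payload.
# Real embeddings have hundreds of dimensions; these toy vectors have three.
POINTS = [
    {"id": 1, "vector": [0.9, 0.1, 0.0],
     "payload": {"message": "Connection to database timed out"}},
    {"id": 2, "vector": [0.8, 0.2, 0.1],
     "payload": {"message": "Postgres did not respond within 30s"}},
    {"id": 3, "vector": [0.0, 0.1, 0.9],
     "payload": {"message": "User login succeeded"}},
]


def search(query_vector, points, limit=2):
    """Return the `limit` points most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, p["vector"]), p) for p in points]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(round(score, 3), p["id"], p["payload"]["message"])
            for score, p in scored[:limit]]


def keyword_search(term, points):
    """Case-insensitive substring match, similar in spirit to SQL LIKE."""
    return [p["id"] for p in points
            if term.lower() in p["payload"]["message"].lower()]


# Toy vector standing in for the embedding of a query like "database timeout".
query_vector = [0.85, 0.15, 0.05]

semantic_hits = search(query_vector, POINTS)
keyword_hits = keyword_search("database", POINTS)

for score, point_id, message in semantic_hits:
    print(point_id, score, message)
print("keyword hits:", keyword_hits)
```

With these toy values, the keyword match finds only point 1, because the second message never contains the word "database", while the vector search ranks points 1 and 2 ahead of the unrelated login message. A vector database performs the same kind of ranking, but over indexed, high-dimensional embeddings.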

Course Outcomes

By the end of the course, you will understand how semantic search works and have a working project that you can adapt for your own AI-driven solutions.

About the Author: Andreas Kretz

I am a senior data engineer and trainer, a tech enthusiast, and a father. For more than ten years, I have been passionate about Data Engineering. Initially, I became a self-taught data engineer and then led a team of data engineers at a large company. When I realized the great demand for education in this field, I followed my passion and founded my own Data Engineering Academy. Since then, I have helped over 2,000 students achieve their goals.

All Course Lessons (16)

| #  | Lesson Title | Duration | Access |
|----|--------------|----------|--------|
| 1  | Intro | 00:44 | Demo |
| 2  | Getting Started: Semantic Search for Your Logs | 03:08 | |
| 3  | Dissecting the Pipeline Monitor Architecture: FastAPI, Qdrant & DuckDB | 03:50 | |
| 4  | Beginner’s Guide to Qdrant Collections and Similarity Search | 03:28 | |
| 5  | Your First Glimpse at the Project Code Structure on GitHub | 02:55 | |
| 6  | Building and Launching the Pipeline with Docker Compose | 04:37 | |
| 7  | Writing JSON Logs to FastAPI: Bulk Upload Explained | 01:42 | |
| 8  | How FastAPI Parses LogEntry Models and Prepares Embeddings | 04:37 | |
| 9  | Embeddings 101: Turning Your Logs into Searchable Vectors | 02:06 | |
| 10 | Querying Qdrant: From Playground to Streamlit Dashboard | 03:55 | |
| 11 | Hands-On Embedding Tuning: Boost Your Log Search Accuracy | 03:54 | |
| 12 | Deploying Improved Embeddings and Measuring Improvement | 05:35 | |
| 13 | What We Built and Why It Matters | 02:53 | |
| 14 | How DuckDB Fits into Your Data Observability Stack | 01:28 | |
| 15 | Writing to DuckDB with a Write-Ahead Log | 05:03 | |
| 16 | Docker & DuckDB: Implementing WAL to Solve File Lock Errors | 03:42 | |