Semantic Log Indexing & Search
Course description
Semantic search is one of the most practical applications of generative AI in real data processing projects. In this course, we go beyond the basic introduction to embeddings (from the course The Hidden Foundation of GenAI) and start using them in practice. You will learn to build a complete semantic search pipeline from scratch: from creating embeddings and storing them in a vector database to performing natural language queries.
The course is built around a real data observability project. You will create a pipeline that collects logs, processes them with FastAPI, and stores the embeddings in Qdrant, a high-performance vector database. Then you will build a Streamlit dashboard that lets you search logs by meaning rather than by keywords and compare the results with traditional SQL queries in DuckDB.
Key steps of the course:
- From embeddings to search: review the basics of embeddings and see how they enable semantic search.
- Building a pipeline: implementing a FastAPI service for log processing and embedding generation.
- Working with Qdrant: collections, points, cosine similarity search, and optimization of embedding structure.
- Streamlit interface: creating a user-friendly search page and comparing the semantic approach with classic SQL.
- Improving accuracy: methods for optimizing embeddings, query formulation, and search configuration.
- Launching in Docker: deploying the entire stack (FastAPI, Qdrant, Streamlit, DuckDB) using Docker Compose.
- Bonus: using DuckDB for analytics, including implementing a write-ahead log (WAL), working with data in Docker, and comparing the capabilities of SQL and vector search.
Upon completion of the course, you will not only understand the mechanics of semantic search but also have a ready-to-use working project that can be adapted for your own AI-based solutions.
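To make the "search by meaning" idea in the steps above concrete, here is a minimal sketch of cosine similarity search in plain Python. It is not the course's code and does not call Qdrant or FastAPI: the three-dimensional vectors, the `POINTS` list, and the `search` helper are all hypothetical stand-ins for real embeddings and a real vector database, which would do the same ranking at scale.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, points, top_k=2):
    """Rank stored points by similarity to the query and keep the best top_k."""
    scored = [
        (cosine_similarity(query_vector, point["vector"]), point["payload"])
        for point in points
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: in a real pipeline the vectors would come from an
# embedding model and have hundreds of dimensions.
POINTS = [
    {"vector": [0.9, 0.1, 0.0], "payload": "connection to database timed out"},
    {"vector": [0.8, 0.3, 0.1], "payload": "db query exceeded deadline"},
    {"vector": [0.0, 0.2, 0.9], "payload": "user logged in successfully"},
]

if __name__ == "__main__":
    # Pretend this vector is the embedding of the query "database is slow".
    for score, message in search([1.0, 0.2, 0.0], POINTS):
        print(f"{score:.3f}  {message}")
```

The two database-related messages rank above the login message even though they share few words, which is the behavior a keyword-based SQL `LIKE` query would not give you.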
All Course Lessons (16)
| # | Lesson Title | Duration | Access |
|---|---|---|---|
| 1 | Intro Demo | 00:44 | |
| 2 | Getting Started: Semantic Search for Your Logs | 03:08 | |
| 3 | Dissecting the Pipeline Monitor Architecture: FastAPI, Qdrant & DuckDB | 03:50 | |
| 4 | Beginner’s Guide to Qdrant Collections and Similarity Search | 03:28 | |
| 5 | Your First Glimpse at the Project Code Structure on GitHub | 02:55 | |
| 6 | Building and Launching the Pipeline with Docker Compose | 04:37 | |
| 7 | Writing JSON Logs to FastAPI: Bulk Upload Explained | 01:42 | |
| 8 | How FastAPI Parses LogEntry Models and Prepares Embeddings | 04:37 | |
| 9 | Embeddings 101: Turning Your Logs into Searchable Vectors | 02:06 | |
| 10 | Querying Qdrant: From Playground to Streamlit Dashboard | 03:55 | |
| 11 | Hands-On Embedding Tuning: Boost Your Log Search Accuracy | 03:54 | |
| 12 | Deploying Improved Embeddings and Measuring Improvement | 05:35 | |
| 13 | What We Built and Why It Matters | 02:53 | |
| 14 | How DuckDB Fits into Your Data Observability Stack | 01:28 | |
| 15 | Writing to DuckDB with a Write-Ahead Log | 05:03 | |
| 16 | Docker & DuckDB: Implementing WAL to Solve File Lock Errors | 03:42 | |