AI Engineering Course
1h 36m 46s
English
Paid
This course is designed to help programmers and developers transition into artificial intelligence engineering. You will explore vector databases, indexing, large language models (LLMs), and the attention mechanism in depth.
By the end of the course, you will understand how LLMs work and be able to use them to create real applications.
What you will learn:
- Develop mental models of how GPT-style LLMs work
- Understand processes such as tokenization, embeddings, attention, and masking
- Optimize LLM inference using caching, batching, and quantization
- Design and deploy RAG pipelines using vector databases
- Compare methods: prompt engineering, fine-tuning, and agent-based architectures
- Debug, monitor, and scale LLM systems in production
About the Author: get.interviewready.io (Gaurav Sen)
I am a software engineer working on distributed systems and interesting algorithms.
All Course Lessons (21)
| # | Lesson Title | Duration | Access |
|---|---|---|---|
| 1 | Course Intro Demo | 02:01 | |
| 2 | Usecase | 01:48 | |
| 3 | How are vectors constructed | 06:43 | |
| 4 | Choosing the right DB | 03:27 | |
| 5 | Vector compression | 03:27 | |
| 6 | Vector Search | 06:59 | |
| 7 | Milvus DB | 05:38 | |
| 8 | LLM Intro | 00:43 | |
| 9 | How LLMs work | 08:31 | |
| 10 | LLM text generation | 03:08 | |
| 11 | LLM improvements | 05:10 | |
| 12 | Attention | 05:28 | |
| 13 | Transformer Architecture | 03:40 | |
| 14 | KV Cache | 08:28 | |
| 15 | Paged Attention | 04:38 | |
| 16 | Mixture Of Experts | 04:01 | |
| 17 | Flash Attention | 03:40 | |
| 18 | Quantization | 03:33 | |
| 19 | Sparse Attention | 05:14 | |
| 20 | SLM and Distillation | 05:31 | |
| 21 | Speculative Decoding | 04:58 | |
Books
Read Book AI Engineering Course
| # | Title |
|---|---|
| 1 | Vector Embeddings & Semantic Space |
| 2 | Compression & Quantization: Scaling Vectors Efficiently |
| 3 | Indexing Techniques: Making Vector Search Scale |
| 4 | Search Execution Flow: From Query to Result |
| 5 | LLMs and RAG |
| 6 | What is Attention and Why Does It Matter |
| 7 | Paged Attention |
| 8 | Quantization Summary |