Master and Build Large Language Models

17h 15m 55s
English
Paid

The best way to understand how Large Language Models (LLMs) work is to build your own, and that is exactly what you will do in this course. AI expert Sebastian Raschka guides you step by step through every stage of creating an LLM, with hands-on practice and explanations in liveVideo format. You will implement a project from his bestseller Build a Large Language Model (From Scratch) alongside the author.


In this course, you will learn to:

  • Plan the architecture and write code for all LLM components
  • Prepare a dataset suitable for training a language model
  • Fine-tune an LLM for text classification tasks and work with your own data
  • Utilize human feedback to improve instruction following
  • Load pre-trained weights into your model

This video course is perfect for:

  • Developers who want to take initiative in AI-related projects
  • Data scientists and ML researchers who need to configure or create LLMs from scratch

The course also includes a block of six mandatory introductory videos by Abhinav Kimothi, an artificial intelligence expert and author of A Simple Guide to Retrieval Augmented Generation. He covers everything you need to know before starting, from Python fundamentals to advanced operations in PyTorch. Regardless of your level of preparation, you will gain a solid foundation for working with large language models.

Watch Online: Master and Build Large Language Models

# Title Duration
1 1.1. Python Environment Setup 21:10
2 1.2. Foundations to Build a Large Language Model (From Scratch) 06:28
3 2.1. Prerequisites to Chapter 2 01:07:40
4 2.2. Tokenizing text 14:10
5 2.3. Converting tokens into token IDs 09:59
6 2.4. Adding special context tokens 06:36
7 2.5. Byte pair encoding 13:40
8 2.6. Data sampling with a sliding window 23:16
9 2.7. Creating token embeddings 08:37
10 2.8. Encoding word positions 12:23
11 3.1. Prerequisites to Chapter 3 01:14:17
12 3.2. A simple self-attention mechanism without trainable weights | Part 1 41:10
13 3.3. A simple self-attention mechanism without trainable weights | Part 2 11:43
14 3.4. Computing the attention weights step by step 20:00
15 3.5. Implementing a compact self-attention Python class 08:31
16 3.6. Applying a causal attention mask 11:37
17 3.7. Masking additional attention weights with dropout 05:38
18 3.8. Implementing a compact causal self-attention class 08:53
19 3.9. Stacking multiple single-head attention layers 12:05
20 3.10. Implementing multi-head attention with weight splits 16:47
21 4.1. Prerequisites to Chapter 4 01:11:23
22 4.2. Coding an LLM architecture 14:00
23 4.3. Normalizing activations with layer normalization 22:14
24 4.4. Implementing a feed forward network with GELU activations 16:19
25 4.5. Adding shortcut connections 10:52
26 4.6. Connecting attention and linear layers in a transformer block 12:14
27 4.7. Coding the GPT model 12:45
28 4.8. Generating text 17:47
29 5.1. Prerequisites to Chapter 5 23:58
30 5.2. Using GPT to generate text 17:32
31 5.3. Calculating the text generation loss: cross entropy and perplexity 27:14
32 5.4. Calculating the training and validation set losses 24:52
33 5.5. Training an LLM 27:04
34 5.6. Decoding strategies to control randomness 03:37
35 5.7. Temperature scaling 13:43
36 5.8. Top-k sampling 08:20
37 5.9. Modifying the text generation function 10:51
38 5.10. Loading and saving model weights in PyTorch 04:24
39 5.11. Loading pretrained weights from OpenAI 20:04
40 6.1. Prerequisites to Chapter 6 39:21
41 6.2. Preparing the dataset 26:58
42 6.3. Creating data loaders 16:08
43 6.4. Initializing a model with pretrained weights 10:11
44 6.5. Adding a classification head 15:38
45 6.6. Calculating the classification loss and accuracy 22:32
46 6.7. Fine-tuning the model on supervised data 33:36
47 6.8. Using the LLM as a spam classifier 11:07
48 7.1. Preparing a dataset for supervised instruction fine-tuning 15:48
49 7.2. Organizing data into training batches 23:45
50 7.3. Creating data loaders for an instruction dataset 07:31
51 7.4. Loading a pretrained LLM 07:48
52 7.5. Fine-tuning the LLM on instruction data 20:02
53 7.6. Extracting and saving responses 09:40
54 7.7. Evaluating the fine-tuned LLM 21:57

Similar courses to Master and Build Large Language Models

RAG (Retrieval)
Author: Mckay Wrigley (takeoff)
Category: Other (AI)
Duration: 4 hours 33 minutes 19 seconds

Build AI Agents with CrewAI
Author: zerotomastery.io
Category: Other (AI)
Duration: 2 hours 51 minutes 42 seconds

Design and Code User Interfaces with Galileo and Claude AI
Author: designcode.io
Category: Other (AI)
Duration: 3 hours 42 minutes 41 seconds

Building Apps with o1 Pro Template System: Part 1
Author: Mckay Wrigley (takeoff)
Category: Other (AI)
Duration: 4 hours 4 minutes 38 seconds

Learn how to use MCP (Model Context Protocol)
Author: Kevin Kern (instructa.ai)
Category: Other (AI)
Duration: 3 hours 10 minutes 2 seconds

Build SwiftUI apps for iOS 18 with Cursor and Xcode
Author: designcode.io
Category: Other (Mobile Apps Development), Swift, Other (AI)
Duration: 4 hours 35 minutes 14 seconds

AI Engineering: Fine-Tuning LLMs
Author: zerotomastery.io
Category: Other (AI)
Duration: 1 hour 35 minutes 46 seconds

The Dark Side of AI: Jailbreaking, Injections, Hallucinations & more
Author: zerotomastery.io
Category: Other (AI)
Duration: 3 hours 3 minutes 38 seconds