AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS)

7h 12m 10s
English
Paid

Master the in-demand skill that companies are actively seeking: developing and implementing custom Large Language Models (LLMs). In this course, you will learn how to fine-tune open LLMs on corporate data, deploy your models efficiently with AWS tools like SageMaker, Lambda, and API Gateway, and build intuitive Streamlit interfaces for employees and clients.

Course Overview

This is not "just another introductory AI course." Instead, it is a practical, in-depth exploration of the skills that distinguish AI engineers on real-world projects. You'll fine-tune a model using QLoRA—an approach that dramatically reduces memory and compute requirements—and then turn that model into a production-ready service.
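To see where those resource savings come from, here is a back-of-the-envelope sketch (not taken from the course; the dimensions and rank are illustrative) of the low-rank idea behind LoRA, which QLoRA builds on: instead of updating a full weight matrix, you train two much smaller factor matrices.

```python
# LoRA-style low-rank updates: instead of updating every entry of a
# d_out x d_in weight matrix W, train two small matrices B (d_out x r)
# and A (r x d_in) and use W + B @ A, with rank r much smaller than
# d_out and d_in.

def lora_trainable_params(d_out: int, d_in: int, r: int) -> tuple[int, int]:
    """Return (full_finetune_params, lora_params) for one weight matrix."""
    full = d_out * d_in          # every entry of W is trainable
    lora = d_out * r + r * d_in  # only B and A are trainable
    return full, lora

# Illustrative numbers: a 4096 x 4096 attention projection with rank r = 8.
full, lora = lora_trainable_params(4096, 4096, 8)
print(full, lora, full // lora)  # 16777216 65536 256
```

On these (made-up) dimensions the low-rank update trains 256x fewer parameters per weight matrix; QLoRA adds 4-bit quantization of the frozen base weights on top of this.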

Key Learning Outcomes

What you will master:

  1. Fine-tuning open-source LLMs with your own datasets, including corporate data.
  2. Hands-on experience with QLoRA, bfloat16 training, dataset chunking, and attention masks.
  3. Utilizing the Hugging Face ecosystem, including the Estimator API, and setting up an MLOps pipeline on AWS.
  4. Model deployment and integration with tools like SageMaker endpoints, Lambda, API Gateway, and monitoring.
  5. Creation of a simple business UI using Streamlit.
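As a flavor of outcome #2, here is a plain-Python sketch (no Hugging Face; token ids and the pad id of 0 are made up for the example) of two of the preprocessing ideas the course covers: padding a batch with an attention mask that marks real tokens versus padding, and concatenating tokenized examples into one stream that is re-chunked into fixed-size blocks.

```python
from itertools import chain

def pad_batch(batch, pad_id=0):
    """Pad sequences to equal length; mask real tokens 1, padding 0."""
    max_len = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return input_ids, attention_mask

def chunk(sequences, block_size):
    """Flatten all sequences into one stream, then slice into equal blocks."""
    stream = list(chain(*sequences))  # star unpacking + chain iterator
    n = (len(stream) // block_size) * block_size  # drop trailing partial block
    return [stream[i:i + block_size] for i in range(0, n, block_size)]

ids, mask = pad_batch([[5, 6, 7], [8, 9]])
print(ids)   # [[5, 6, 7], [8, 9, 0]]
print(mask)  # [[1, 1, 1], [1, 1, 0]]
print(chunk([[1, 2, 3], [4, 5, 6, 7]], block_size=2))  # [[1, 2], [3, 4], [5, 6]]
```

The course implements these ideas against a real Hugging Face dataset and tokenizer; this sketch only shows the underlying mechanics.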

Course Outcomes

From theory to code to production—experience the complete development cycle of applied AI tailored for business scenarios.

Target Audience and Career Paths

This course prepares professionals for roles such as:

  1. AI Engineer / ML Engineer: designing, fine-tuning, and producing models.
  2. AI Specialist: developing applied AI solutions.
  3. Data Scientist: preparing data, performing EDA, and building models for organizational tasks.
  4. AI Research Scientist: focusing on attention mechanisms and in-depth LLM work.
  5. Cloud Engineer: crafting architecture and implementing best deployment practices on AWS.
  6. DevOps Engineer: automating, releasing, and monitoring ML services using tools like CloudWatch.
  7. Software Engineer: integrating scalable models into applications.
  8. Data Engineer: constructing data pipelines, managing storage (e.g., S3), and preprocessing.
  9. Technical Product Manager: planning and releasing ML products, focusing on metrics and monitoring.

If you're looking to ride the "AI wave," customizing LLMs for business needs offers a strong entry point and real growth opportunity.

Additional Resources

https://github.com/patrikszepesi/qlora-course

About the Author: Zero To Mastery


Zero To Mastery (ZTM) is a Toronto-based online coding academy founded by Andrei Neagoie, originally a senior developer at large Canadian tech firms before turning to teaching full-time. The academy's signature is the cohort-based bootcamp track combined with a deep self-paced course library, all aimed at career-changers and self-taught developers preparing to land software-engineering roles at top companies.

The instructor roster has grown well beyond Andrei to include other senior practitioners: Daniel Bourke (machine learning), Aleksa Tešić (DevOps), Jacinto Wong, and others. Courses cover the full software-engineering career path: web development with React and Next.js, Python, machine learning and deep learning, DevOps and cloud, system design, mobile, and the algorithm / data-structure interview prep that gates engineering jobs.

The CourseFlix listing under this source carries over 120 ZTM courses spanning that full range. Material is paid; ZTM itself runs on a monthly / annual membership model. The teaching style favours long-form, project-based courses where students build complete portfolio-quality applications rather than disconnected feature tutorials.

Watch Online (58 lessons)

All Course Lessons (58)
  1. Course Introduction (What We're Building) · 05:20 · free demo
  2. Signing in to AWS · 04:31
  3. Creating an IAM User · 05:30
  4. Using our new IAM User · 03:13
  5. What To Do In Case You Get Hacked! · 01:31
  6. Creating a SageMaker Domain · 02:29
  7. Logging in to our SageMaker Environment · 04:54
  8. Introduction to JupyterLab · 07:38
  9. SageMaker Sessions, Regions, and IAM Roles · 07:51
  10. Examining Our Dataset from HuggingFace · 13:30
  11. Tokenization and Word Embeddings · 09:09
  12. HuggingFace Authentication with SageMaker · 04:22
  13. Applying the Templating Function to our Dataset · 08:44
  14. Attention Masks and Padding · 15:56
  15. Star Unpacking with Python · 04:04
  16. Chain Iterator, List Constructor and Attention Mask Example with Python · 10:23
  17. Understanding Batching · 08:12
  18. Slicing and Chunking our Dataset · 07:32
  19. Creating our Custom Chunking Function · 16:07
  20. Tokenizing our Dataset · 09:31
  21. Running our Chunking Function · 04:31
  22. Understanding the Entire Chunking Process · 08:33
  23. Uploading the Training Data to AWS S3 · 05:54
  24. Setting Up Hyperparameters for the Training Job · 06:48
  25. Creating our HuggingFace Estimator in SageMaker · 06:46
  26. Introduction to Low-Rank Adaptation (LoRA) · 08:12
  27. LoRA Numerical Example · 10:56
  28. LoRA Summarization and Cost Saving Calculation · 09:09
  29. (Optional) Matrix Multiplication Refresher · 04:46
  30. Understanding LoRA Programmatically Part 1 · 12:33
  31. Understanding LoRA Programmatically Part 2 · 05:49
  32. Bfloat16 vs Float32 · 08:11
  33. Comparing Bfloat16 vs Float32 Programmatically · 06:33
  34. Setting up Imports and Libraries for the Train Script · 07:20
  35. Argument Parsing Function Part 1 · 07:57
  36. Argument Parsing Function Part 2 · 10:55
  37. Understanding Trainable Parameters Caveats · 14:31
  38. Introduction to Quantization · 07:36
  39. Identifying Trainable Layers for LoRA · 07:20
  40. Setting up Parameter-Efficient Fine-Tuning · 04:36
  41. Implementing LoRA Configuration and Mixed Precision Training · 10:35
  42. Understanding Double Quantization · 04:22
  43. Creating the Training Function Part 1 · 14:15
  44. Creating the Training Function Part 2 · 07:17
  45. Exercise: Imposter Syndrome · 02:57
  46. Finishing our SageMaker Script · 05:09
  47. Gaining Access to Powerful GPUs with AWS Quotas · 05:11
  48. Final Fixes Before Training · 03:55
  49. Starting our Training Job · 07:16
  50. Inspecting the Results of our Training Job and Monitoring with CloudWatch · 11:24
  51. Deploying our LLM to a SageMaker Endpoint · 17:58
  52. Testing our LLM in SageMaker Locally · 08:19
  53. Creating the Lambda Function to Invoke our Endpoint · 08:56
  54. Creating API Gateway to Deploy the Model Through the Internet · 02:37
  55. Implementing our Streamlit App · 05:12
  56. Streamlit App Correction · 03:27
  57. Congratulations and Cleaning up AWS Resources · 02:39
  58. Thank You! · 01:18
Unlock unlimited learning

Get instant access to all 58 lessons in this course, plus thousands of other premium courses. One subscription, unlimited knowledge.

Learn more about subscription



Frequently asked questions

What is AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) about?
Master the in-demand skill that companies are actively seeking: developing and implementing custom Large Language Models (LLMs). In this course, you will learn how to fine-tune open LLMs using corporate data and deploy your models…
Who teaches AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS)?
AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) is taught by Zero To Mastery. You can find more courses by this instructor on the corresponding source page.
How long is AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS)?
AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) contains 58 lessons with a total runtime of 7 hours 12 minutes. All lessons are available to watch online at your own pace.
Is AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) free to watch?
AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) is part of CourseFlix's premium catalog. A CourseFlix subscription unlocks the full video player; the course description, table of contents, and preview information are available to everyone.
Where can I watch AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) online?
AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS) is available to watch online on CourseFlix at https://courseflix.net/course/ai-engineering-customizing-llms-for-business-fine-tuning-llms-with-qlora-aws. The page hosts every lesson with the integrated video player; no download is required.