
AI Engineering: Customizing LLMs for Business (Fine-Tuning LLMs with QLoRA & AWS)

7h 12m 10s
English
Paid

Course description

Master a skill companies are actively looking for: developing and implementing custom LLMs. In this course, you will learn how to fine-tune open-source large language models on closed/corporate data and deploy your models using AWS (SageMaker, Lambda, API Gateway), with a Streamlit app providing a convenient interface for employees and clients.

This is not "just another introductory AI course." It is a practical deep dive into the skills that set AI engineers apart on real projects. You will perform fine-tuning using QLoRA, a method that drastically reduces resource consumption, and then turn the model into a production service.
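
To give a flavor of the fine-tuning code, here is a minimal QLoRA sketch assuming the Hugging Face transformers, peft, and bitsandbytes libraries; the base model id and the LoRA hyperparameters are illustrative placeholders, not the course's exact configuration.

```python
# Minimal QLoRA setup: load the base model with 4-bit (NF4) quantized weights and
# attach small LoRA adapters, so only a tiny fraction of parameters is trained.
# The model id and hyperparameters below are placeholders for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights: the "Q" in QLoRA
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections to adapt
    bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Only the low-rank adapter matrices are updated while the quantized base weights stay frozen, which is where the large memory and cost savings come from.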


What you will master:

  1. Fine-tuning open-source LLMs on your own datasets (including corporate ones).
  2. Practice with QLoRA, bfloat16 training, dataset chunking, and attention masks.
  3. The Hugging Face ecosystem (including the SageMaker HuggingFace Estimator API) and an MLOps pipeline on AWS (see the training-job sketch after this list).
  4. Model deployment and integration: SageMaker endpoints, Lambda, API Gateway, and monitoring (see the Lambda sketch after the Outcome note).
  5. Creating a simple business UI with Streamlit.
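
As an illustration of item 3, the sketch below shows how a training job might be launched with the SageMaker HuggingFace Estimator. The entry point, instance type, framework versions, hyperparameters, and S3 path are assumptions chosen for the example; substitute your own values.

```python
# Hedged sketch: launching a SageMaker training job via the HuggingFace Estimator.
# entry_point, instance_type, framework versions, hyperparameters, and the S3 URI
# are illustrative placeholders.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

estimator = HuggingFace(
    entry_point="train.py",          # your QLoRA training script
    source_dir="scripts",
    instance_type="ml.g5.2xlarge",   # GPU instance (subject to AWS service quotas)
    instance_count=1,
    role=role,
    transformers_version="4.28",     # example framework versions
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters={"epochs": 1, "lr": 2e-4, "per_device_train_batch_size": 2},
)

# Start the job; the tokenized, chunked dataset is expected to already be in S3.
estimator.fit({"training": "s3://your-bucket/train-data/"})
```

SageMaker pulls the script and data, runs training on the requested instance, and writes the resulting model artifacts back to S3.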


Outcome: from theory to code to production - the complete development cycle of applied AI for business use cases.
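
On the deployment side (item 4 above), the Lambda function essentially relays a request from API Gateway to the SageMaker endpoint. Here is a minimal sketch, with the endpoint name and payload shape as illustrative assumptions.

```python
# AWS Lambda handler that forwards an API Gateway request to a SageMaker endpoint.
# The endpoint name and the request/response shapes are illustrative assumptions.
import json
import os
import boto3

runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "my-llm-endpoint")  # placeholder

def lambda_handler(event, context):
    # API Gateway (proxy integration) delivers the request body as a JSON string.
    body = json.loads(event.get("body") or "{}")
    payload = {"inputs": body.get("prompt", "")}

    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read().decode("utf-8"))

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```

A Streamlit front end (item 5) then only needs to POST the user's prompt to the API Gateway URL and render the returned text.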

Who it benefits and what roles it prepares for:

  1. AI Engineer / ML Engineer - designing, fine-tuning, and productionizing models.
  2. AI Specialist - creating applied solutions based on AI.
  3. Data Scientist - data preparation, EDA, and building models for company tasks.
  4. AI Research Scientist - in-depth work with attention mechanisms and LLMs.
  5. Cloud Engineer - architecture and best deployment practices in AWS.
  6. DevOps Engineer - automation, release, and monitoring of ML services (CloudWatch, etc.).
  7. Software Engineer - integrating models into applications with scalability in mind.
  8. Data Engineer - data pipelines, storage (S3), preprocessing.
  9. Technical Product Manager - planning and releasing ML products, metrics, and monitoring.


If you want to catch the "AI wave," customizing LLMs for business tasks is a great entry point and growth opportunity.

Watch Online

This is a demo lesson (10:00 remaining)

You can watch up to 10 minutes for free. Subscribe to unlock all 58 lessons in this course and access 10,000+ hours of premium content across all courses.

#1: Course Introduction (What We're Building)

All Course Lessons (58)

# | Lesson Title | Duration | Access
1 | Course Introduction (What We're Building) | 05:20 | Demo
2 | Signing in to AWS | 04:31
3 | Creating an IAM User | 05:30
4 | Using our new IAM User | 03:13
5 | What To Do In Case You Get Hacked! | 01:31
6 | Creating a SageMaker Domain | 02:29
7 | Logging in to our SageMaker Environment | 04:54
8 | Introduction to JupyterLab | 07:38
9 | SageMaker Sessions, Regions, and IAM Roles | 07:51
10 | Examining Our Dataset from HuggingFace | 13:30
11 | Tokenization and Word Embeddings | 09:09
12 | HuggingFace Authentication with SageMaker | 04:22
13 | Applying the Templating Function to our Dataset | 08:44
14 | Attention Masks and Padding | 15:56
15 | Star Unpacking with Python | 04:04
16 | Chain Iterator, List Constructor and Attention Mask example with Python | 10:23
17 | Understanding Batching | 08:12
18 | Slicing and Chunking our Dataset | 07:32
19 | Creating our Custom Chunking Function | 16:07
20 | Tokenizing our Dataset | 09:31
21 | Running our Chunking Function | 04:31
22 | Understanding the Entire Chunking Process | 08:33
23 | Uploading the Training Data to AWS S3 | 05:54
24 | Setting Up Hyperparameters for the Training Job | 06:48
25 | Creating our HuggingFace Estimator in SageMaker | 06:46
26 | Introduction to Low-Rank Adaptation (LoRA) | 08:12
27 | LoRA Numerical Example | 10:56
28 | LoRA Summarization and Cost Saving Calculation | 09:09
29 | (Optional) Matrix Multiplication Refresher | 04:46
30 | Understanding LoRA Programmatically Part 1 | 12:33
31 | Understanding LoRA Programmatically Part 2 | 05:49
32 | Bfloat16 vs Float32 | 08:11
33 | Comparing Bfloat16 vs Float32 Programmatically | 06:33
34 | Setting up Imports and Libraries for the Train Script | 07:20
35 | Argument Parsing Function Part 1 | 07:57
36 | Argument Parsing Function Part 2 | 10:55
37 | Understanding Trainable Parameters Caveats | 14:31
38 | Introduction to Quantization | 07:36
39 | Identifying Trainable Layers for LoRA | 07:20
40 | Setting up Parameter Efficient Fine Tuning | 04:36
41 | Implementing LoRA Configuration and Mixed Precision Training | 10:35
42 | Understanding Double Quantization | 04:22
43 | Creating the Training Function Part 1 | 14:15
44 | Creating the Training Function Part 2 | 07:17
45 | Exercise: Imposter Syndrome | 02:57
46 | Finishing our SageMaker Script | 05:09
47 | Gaining Access to Powerful GPUs with AWS Quotas | 05:11
48 | Final Fixes Before Training | 03:55
49 | Starting our Training Job | 07:16
50 | Inspecting the Results of our Training Job and Monitoring with CloudWatch | 11:24
51 | Deploying our LLM to a SageMaker Endpoint | 17:58
52 | Testing our LLM in SageMaker Locally | 08:19
53 | Creating the Lambda Function to Invoke our Endpoint | 08:56
54 | Creating API Gateway to Deploy the Model Through the Internet | 02:37
55 | Implementing our Streamlit App | 05:12
56 | Streamlit App Correction | 03:27
57 | Congratulations and Cleaning up AWS Resources | 02:39
58 | Thank You! | 01:18

Unlock unlimited learning

Get instant access to all 58 lessons in this course, plus thousands of other premium courses. One subscription, unlimited knowledge.

Learn more about subscription


Similar courses

Planning with Claude Code

Sources: Mckay Wrigley (takeoff)
In the workshop, participants create a simple front-end project, a landing page generator (marketing pages), using Claude Code and Next.js. The focus is not...
47 minutes 32 seconds
Master AI for Work

Sources: Towards AI, Louis-François Bouchard
The course "Master AI for Work" is designed for those who want to achieve real results from using large language models (LLM) in their professional...
2 hours 27 minutes 56 seconds
Build AI-Powered Apps – An AI Course for Developers

Sources: codewithmosh (Mosh Hamedani)
AI is everywhere - but can you really create applications with it? Most developers have tried ChatGPT. Some have even inserted pieces...
7 hours 3 minutes 31 seconds
React Node AWS - Build infinitely Scaling MERN Stack App

Sources: udemy
Master the art of building a highly scalable real world project using MERN Stack for a new startup that will scale infinitely. I will demonstrate how you could launch a project ...
25 hours 1 minute 19 seconds
Production-Ready Serverless

Sources: Yan Cui
The Production-Ready Serverless course teaches how to build resilient and scalable serverless applications, ready for production deployment. It covers...
13 hours 37 minutes 6 seconds