
Building LLMs for Production


Building LLMs for Production is a comprehensive 470-page guide (updated in October 2024) for developers and specialists who want to move beyond prototyping and build robust, industry-ready applications based on large language models.

Understanding the Fundamentals of LLMs

This guide elucidates the core principles of how large language models (LLMs) operate, providing a solid foundation for building advanced applications.

Key Techniques Explored

The book delves into a variety of essential techniques needed to harness the full potential of LLMs, including:

  • Advanced Prompting: Mastering the art of effectively instructing LLMs to achieve precise outcomes.
  • Retrieval-Augmented Generation (RAG): Exploring techniques that combine information retrieval with generative models (a minimal sketch follows this list).
  • Model Fine-Tuning: Detailed methods for customizing LLMs to specific tasks or domains.
  • Evaluation Methods: Comprehensive approaches for assessing model performance and accuracy.
  • Deployment Strategies: Proven tactics for integrating LLMs into live production environments.
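
To give a concrete flavor of the retrieve-then-generate pattern behind RAG, here is a minimal, hypothetical Python sketch. It is not taken from the book (which builds a full pipeline in its "Building a Basic RAG Pipeline from Scratch" chapter): a toy bag-of-words similarity stands in for real vector embeddings, and the retrieved text is simply folded into the prompt that would be sent to an LLM API.

    from collections import Counter
    import math
    import re

    documents = [
        "The book covers prompt engineering, fine-tuning, and RAG.",
        "LoRA is a parameter-efficient fine-tuning technique.",
        "LangChain and LlamaIndex are popular LLM frameworks.",
    ]

    def bow(text):
        # Toy bag-of-words "embedding": word counts instead of a learned vector.
        return Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def cosine(a, b):
        # Cosine similarity between two sparse word-count vectors.
        dot = sum(a[w] * b[w] for w in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def retrieve(query, docs):
        # Retrieval step: pick the document most similar to the query.
        q = bow(query)
        return max(docs, key=lambda d: cosine(q, bow(d)))

    query = "What is LoRA fine-tuning?"
    context = retrieve(query, documents)

    # Generation step (stubbed): ground the model by injecting the retrieved
    # context into the prompt before calling any chat/completions LLM API.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    print(prompt)

A production system would replace the toy similarity with embedding models and a vector store, and add the evaluation, monitoring, and cost controls discussed later in the book.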

Practical Tools and Resources

Readers benefit from hands-on resources, including:

  • Interactive Colab notebooks for practical experimentation.
  • Real-world code examples to illustrate key concepts in action.
  • Case studies that demonstrate how to successfully integrate LLMs into existing products and workflows.

Addressing Critical Challenges

This guide also pays special attention to:

  • Security: Safeguarding applications and data when using LLMs.
  • Monitoring: Keeping track of model performance and system health.
  • Optimization: Enhancing efficiency and effectiveness of LLM implementations.
  • Cost Reduction: Strategies to minimize operational costs in LLM deployment.

About the Authors

Louis-François Bouchard

My journey into AI began in 2019, during my studies in systems engineering, when I won an emoji-classification competition and realized I wanted to apply research to real-world problems. In 2020, I enrolled in a master's program in artificial intelligence, led the AI division at a startup, and launched a YouTube channel dedicated to explaining key AI concepts. These experiences revealed a substantial gap between academic research and industry requirements, and in 2022 I co-founded Towards AI to help bridge it. In 2024, I paused my PhD in medical AI to focus on building practical solutions.

Experience showed that successful AI products require more than research: they need well-structured technologies and processes. Together with the expert team at Towards AI, we identified an optimal technology stack for adapting large language models to specific tasks, achieving the accuracy and reliability metrics needed for a scalable product. Through Towards AI Academy and key projects such as the course "From Beginner to Advanced LLM Developer" and an upcoming book, I aim to share these tools and help you build truly effective AI solutions.

Towards AI


Towards AI Academy is an expert online school founded in 2019 with the goal of making building applications with AI accessible to everyone. Our...

Books

Building LLMs for Production

1. Table of Contents
2. About The Book
3. Introduction
4. Why Prompt Engineering, Fine-Tuning, and RAG?
5. Coding Environment and Packages
6. A Brief History of Language Models
7. What are Large Language Models?
8. Building Blocks of LLMs
9. Tutorial: Translation with LLMs (GPT-3.5 API)
10. Tutorial: Control LLMs Output with Few-Shot Learning
11. Recap
12. Understanding Transformers
13. Transformer Model's Design Choices
14. Transformer Architecture Optimization Techniques
15. The Generative Pre-trained Transformer (GPT) Architecture
16. Introduction to Large Multimodal Models
17. Proprietary vs. Open Models vs. Open-Source Language Models
18. Applications and Use-Cases of LLMs
19. Recap
20. Understanding Hallucinations and Bias
21. Reducing Hallucinations by Controlling LLM Outputs
22. Evaluating LLM Performance
23. Recap
24. Prompting and Prompt Engineering
25. Prompting Techniques
26. Prompt Injection and Security
27. Recap
28. Why RAG?
29. Building a Basic RAG Pipeline from Scratch
30. Recap
31. LLM Frameworks
32. LangChain Introduction
33. Tutorial 1: Building LLM-Powered Applications with LangChain
34. Tutorial 2: Building a News Articles Summarizer
35. LlamaIndex Introduction
36. LangChain vs. LlamaIndex vs. OpenAI Assistants
37. Recap
38. What are LangChain Prompt Templates
39. Few-Shot Prompts and Example Selectors
40. What are LangChain Chains
41. Tutorial 1: Managing Outputs with Output Parsers
42. Tutorial 2: Improving Our News Articles Summarizer
43. Tutorial 3: Creating Knowledge Graphs from Textual Data: Finding Hidden Connections
44. Recap
45. LangChain's Indexes and Retrievers
46. Data Ingestion
47. Text Splitters
48. Similarity Search and Vector Embeddings
49. Tutorial 1: A Customer Support Q&A Chatbot
50. Tutorial 2: A YouTube Video Summarizer Using Whisper and LangChain
51. Tutorial 3: A Voice Assistant for Your Knowledge Base
52. Tutorial 4: Preventing Undesirable Outputs with the Self-Critique Chain
53. Tutorial 5: Preventing Undesirable Outputs from a Customer Service Chatbot
54. Recap
55. From Proof of Concept to Product: Challenges of RAG Systems
56. Advanced RAG Techniques with LlamaIndex
57. RAG - Metrics & Evaluation
58. LangChain LangSmith and LangChain Hub
59. Recap
60. What are Agents: Large Models as Reasoning Engines
61. An Overview of AutoGPT and BabyAGI
62. The Agent Simulation Projects in LangChain
63. Tutorial 1: Building Agents for Analysis Report Creation
64. Tutorial 2: Query and Summarize a DB with LlamaIndex
65. Tutorial 3: Building Agents with OpenAI Assistants
66. Tutorial 4: LangChain OpenGPT
67. Tutorial 5: Multimodal Financial Document Analysis from PDFs
68. Recap
69. Understanding Fine-Tuning
70. Low-Rank Adaptation (LoRA)
71. Tutorial 1: SFT with LoRA
72. Tutorial 2: Using SFT and LoRA for Financial Sentiment
73. Tutorial 3: Fine-Tuning a Cohere LLM with Medical Data
74. Reinforcement Learning from Human Feedback
75. Tutorial 4: Improving LLMs with RLHF
76. Recap
77. Model Distillation and Teacher-Student Models
78. LLM Deployment Optimization: Quantization, Pruning, and Speculative Decoding
79. Tutorial: Deploying a Quantized LLM on a CPU on Google Cloud Platform (GCP)
80. Deploying Open-Source LLMs on Cloud Providers
81. Recap
82. Conclusion
83. Further Reading and Courses