
The RLHF Book. Reinforcement learning from human feedback, alignment, and post-training LLMs

English
Paid

Course description

This book is dedicated to a key task in modern AI engineering: aligning models with human preferences. Reinforcement Learning from Human Feedback (RLHF) makes models safer, easier to understand, more user-friendly, and precisely tailored to a developer's specific style. In this book, Nathan Lambert combines philosophical and economic ideas with the fundamental mathematics and computer science behind RLHF, offering a practical guide to applying these methods to your own models.

You will learn how modern models are trained on human preferences, how to collect and curate large-scale preference datasets, and get a detailed explanation of the core training methods based on policy-gradient algorithms. The book covers Direct Preference Optimization (DPO) and other direct alignment algorithms, simplified methods for preference fine-tuning, and explains how the evolution of RLHF led to a new approach: Reinforcement Learning with Verifiable Rewards (RLVR). The author examines industrial post-training practices: training for character and personality, using feedback from AI, complex quality-assessment schemes, and modern recipes for combining instruction tuning with RLHF. Lambert shares first-hand experience creating open models such as Llama-Instruct, Zephyr, OLMo, and Tülu.
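To give a flavor of the preference-optimization methods the book covers, here is a minimal sketch of the DPO objective on a single preference pair. The function name and the scalar log-probability inputs are illustrative simplifications for this catalog page, not code from the book:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each margin is the policy's log-probability of a response minus the
    frozen reference model's log-probability of the same response.
    """
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    # Numerically this is -log(sigmoid(beta * margin)).
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When policy and reference agree, the margin is zero and the loss is log(2);
# the loss shrinks as the policy pulls the chosen response above the rejected one.
```

The `beta` parameter controls how far the policy may drift from the reference model; larger values penalize deviation more strongly.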

After ChatGPT became an industrial product thanks to RLHF, the technology spread rapidly. In this book, Nathan Lambert offers a first inside look at modern RLHF pipelines, their advantages and trade-offs, supporting the explanations with practical experiments and minimal implementations. Readers gain a comprehensive understanding of the foundations of RLHF, optimization methods, constitutional AI, synthetic data, and new approaches to model evaluation, as well as insight into the unresolved issues the community is working on today. The book helps readers join the forefront of those creating and aligning the next generation of models.

Books


1. The RLHF Book v1 MEAP
2. The RLHF Book v2 MEAP


Similar courses

Build and Deploy a SaaS AI Agent Platform

Sources: Code With Antonio
In this video course, you will create a video call application with AI support from scratch. You will learn how to set up real-time video communication...
13 hours 24 minutes 14 seconds
Learn to build Web Apps with Bolt.new and AI

Sources: Kevin Kern (instructa.ai)
The course "Creating Web Applications with Bolt.new and AI" offers a comprehensive guide on creating, editing, and launching web applications using Bolt.new...
3 hours 8 minutes 36 seconds
10-Hour LLM Fundamentals

Sources: Towards AI, Louis-François Bouchard
The intensive course "10-Hour LLM Fundamentals" will teach you how to understand and use large language models in real projects. You will learn when it is...
10 hours 30 minutes 55 seconds
Beginner Python Primer for AI Engineering

Sources: Towards AI, Louis-François Bouchard
Don't just interact with LLMs - create your own AI solutions in Python. This course will take you from beginner to confident proficiency in Python...
1 hour 41 minutes 58 seconds
Build Your First Product with LLMs, Prompting, RAG

Sources: Towards AI, Louis-François Bouchard
This practical intensive course will provide you with all the necessary skills to create a fully functional advanced product based on large language models...
2 hours 25 minutes 20 seconds