This course shows you how AI reasoning models work in clear, simple steps. You learn what they do, how they think, and where they fail.
What Reasoning Models Do
Reasoning models use a step-by-step scratchpad to solve tasks. This process is slow and careful, like human System 2 thinking. It can look like magic at first, but it is not. Here, you learn what happens inside the model as it works through a problem.
How They Form a Reasoning Chain
You study how a model builds each step in its chain of thought. You see how it deals with hard tasks and where it breaks down. Short hands-on tasks help you spot patterns in the model’s behavior.
How These Models Learn
You explore the training methods that shape the model’s skills. You learn how these methods guide the model toward better answers.
Reinforcement Learning
You examine how feedback helps a model improve. This includes reinforcement learning from human feedback (RLHF) and newer training ideas.
Reward Models and Data
You look at process reward models (PRMs), which score each reasoning step rather than only the final answer, and the PRM800K dataset of step-level labels. You see how these tools change model behavior.
Scaling and Test-Time Compute
You learn how model size and the compute spent at inference time affect reasoning quality. This helps you anticipate where the field is going.
When Models Mislead You
Some models hide parts of their reasoning or act strategically. You study real cases where a model presents a misleading line of reasoning or masks its inner steps.
How to Spot Problems
You learn simple checks to catch these issues. These skills help you judge whether the model’s answer is sound or risky.