CourseFlix

Build a DeepSeek Model (From Scratch)

English
Paid

This course shows you how to build a small DeepSeek model from scratch. You learn each idea in clear steps. You write code, test it, and understand why each part works.

What You Will Build

You create a compact DeepSeek model that runs on a laptop. You start with core LLM ideas and the limits of a standard transformer. You then use the main DeepSeek methods to build a fast and lean model.

Core Ideas You Learn

Latent Attention

You compress the keys and values of attention into a smaller latent vector instead of storing them in full. This cuts memory use, mainly by shrinking the key-value cache, and helps the model run faster.
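
One way to see where the memory saving comes from is to count the values cached per token. The sketch below is a back-of-the-envelope comparison for a single layer, not the course's implementation. The function names and all sizes are hypothetical, and it leaves out details such as the extra positional key that full latent attention designs also cache.

```python
# Hypothetical sizes chosen for illustration only; not the course's settings.

def kv_cache_floats_standard(seq_len, n_heads, head_dim):
    # Standard attention caches one key and one value vector per head per token.
    return seq_len * n_heads * head_dim * 2


def kv_cache_floats_latent(seq_len, latent_dim):
    # Latent attention caches one compressed vector per token;
    # keys and values are re-projected from it when needed.
    return seq_len * latent_dim


standard = kv_cache_floats_standard(seq_len=1024, n_heads=8, head_dim=64)
latent = kv_cache_floats_latent(seq_len=1024, latent_dim=128)
ratio = standard / latent  # how many times smaller the latent cache is
```

With these assumed sizes the latent cache holds one eighth as many values. The trade-off is the extra projection work needed to rebuild keys and values from the latent vector.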

Mixture of Experts

You add Mixture of Experts (MoE) layers. These layers route each token to a small set of expert networks. This gives you more model capacity without raising the compute per token by much, because only the selected experts run.
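
The routing idea can be sketched in a few lines of plain Python. This is a minimal illustration of generic top-k routing with softmax weights, which is an assumption here. The course's router may differ, for example in how it scores experts or balances load. The names `route_top_k` and `moe_forward` and the toy experts are hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, router_scores, experts, k=2):
    # Only the selected experts run, so compute scales with k, not len(experts).
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Four toy "experts" (hypothetical): each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
scores = [0.1, 2.0, 0.3, 1.5]  # made-up router scores for one token
routes = route_top_k(scores, k=2)
output = moe_forward(1.0, scores, experts, k=2)
```

Here the token is sent to experts 1 and 3 only. Adding more experts to the list would raise capacity but leave the work done for this token unchanged.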

Multi-Token Prediction

You train the model to predict several future tokens at once instead of only the next one. Each position then provides more training signal, which helps the model learn stronger patterns.
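
To make "several future tokens" concrete, the sketch below builds the training targets for a toy sentence. It only shows how the targets are laid out. The function name `multi_token_targets` and the example tokens are hypothetical, and the prediction heads and loss are not shown.

```python
def multi_token_targets(tokens, n_future):
    """For each position, pair the current token with the next n_future tokens."""
    examples = []
    for i in range(len(tokens) - n_future):
        current = tokens[i]
        targets = tokens[i + 1 : i + 1 + n_future]
        examples.append((current, targets))
    return examples


tokens = ["the", "cat", "sat", "on", "the", "mat"]
pairs = multi_token_targets(tokens, n_future=2)
# With n_future=1 this reduces to ordinary next-token prediction.
```

Each pair holds a position and the two tokens that follow it, so one pass over the sentence yields two prediction targets per position instead of one.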

Quantization and Efficient Training

You set up an FP8 (8-bit floating point) pipeline. You also learn how to use parallel training methods to train on limited hardware.
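
The core trick in low-precision training is to scale values into the narrow range a small format can hold, then round. The sketch below simulates this in plain Python under stated assumptions: it assumes the E4M3 flavor of FP8 (largest finite value 448, 4 significant bits), uses one scale for the whole list, and ignores subnormals and special values. A real pipeline would rely on hardware and library support instead. All names and values are hypothetical.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the assumed E4M3 format


def round_to_4_significant_bits(x):
    # E4M3 keeps 1 implicit + 3 explicit mantissa bits.
    # Simplification: subnormals and the exponent floor are ignored.
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)    # x = mantissa * 2**exponent
    mantissa = round(mantissa * 16) / 16  # keep 4 significant bits
    value = math.ldexp(mantissa, exponent)
    return max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, value))


def quantize(values):
    # Per-tensor scaling: map the largest magnitude onto the FP8 maximum.
    absmax = max(abs(v) for v in values)
    scale = FP8_E4M3_MAX / absmax if absmax > 0 else 1.0
    return [round_to_4_significant_bits(v * scale) for v in values], scale


def dequantize(quantized, scale):
    return [q / scale for q in quantized]


weights = [0.02, -1.5, 3.7, 0.41]  # hypothetical values
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)
```

Comparing `restored` with `weights` shows the rounding error this format introduces, at most about 6% per value in this simplified model.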

Post-Training Steps

Supervised Fine-Tuning

You guide the model with labeled examples. This helps shape its style and fix common errors.

Reinforcement Learning for Reasoning

You try simple RL steps to improve the model’s decisions. You see how reward design changes the model’s behavior.

How You Learn

The course uses short code blocks, drawings, and a clear problem-then-solution flow. You see each idea, try it, and check how it changes the model.

What You Get in the End

You finish with a working mini DeepSeek model. You know how to scale it, shrink it, and adapt it for research or small production tasks.

About the Authors

Dr. Sreedath Pana

Dr. Sreedath Pana is an engineering researcher and entrepreneur known for his work in AI and sustainable technologies:

  • Holds a PhD from the Massachusetts Institute of Technology (MIT), where he studied applied methods in mechanics, machine learning, and artificial intelligence.
  • Graduated from IIT Madras (dual degree, BTech) before attending MIT.
  • Co-founded Vizuara AI Labs, where he serves as an engineer and AI product strategist.
  • Known as the inventor of a self-cleaning, AI-driven solar technology that uses intelligent systems to optimize the cleaning and efficiency of solar panels.
  • Alongside his engineering work, develops educational programs on AI, delivers technical lectures, and shares practical knowledge on ML and computer vision.

Naman Dwivedi

Naman Dwivedi is a researcher and machine learning engineer associated with Vizuara AI Labs:

  • Works at Vizuara AI Labs as an AI researcher, specializing in turning advanced deep learning concepts into practical, working code.
  • Described as one of the young team members who develop ML exercises and projects, including modules and practical assignments on deep learning models.
  • Publishes educational and technical content on ML, NLP, MLOps, and LLM development (based on professional profiles and social media posts).

Rajat Dandekar

Dr. Rajat Dandekar is a researcher and entrepreneur in artificial intelligence and machine learning:

  • Received a PhD in Mechanical Engineering from Purdue University (USA), where he worked on applying machine learning methods to complex physical systems.
  • Also holds BTech and MTech degrees from IIT Madras.
  • Specializes in machine learning models and their application to engineering and scientific computing tasks.
  • Co-founded Vizuara AI Labs (and also took part in the projects Videsh and FirstPrincipleLabs.ai), where he develops educational and research initiatives to democratize AI education and to create AI labs, courses, and tools.
  • Actively publishes research on scientific machine learning and participates in international conferences.

Books

Read Book Build a DeepSeek Model (From Scratch)

1. Build a DeepSeek Model (From_Scratch) v2 MEAP
