LLMs (large language models) are neural networks trained on enormous text corpora to predict the next token given a context. The category covers fundamentals — how transformers work, what attention does, why scaling laws matter — and the practical engineering on top: fine-tuning, evaluation, inference optimization, and prompt design.
The frontier moved fast through 2024-2025. Closed-model providers (OpenAI, Anthropic, Google, xAI) compete on benchmark performance, while open-weight models (Llama, Qwen, Mistral, DeepSeek) are now competitive on many tasks and can cost a fraction as much when self-hosted. The actual job for an engineer is choosing the right model for each use case, not assuming GPT-4 is always the answer.
What you'll work with in these 26 courses
- Transformer architecture: attention, positional encoding, KV-cache
- Tokenization: BPE, SentencePiece, why it matters for non-English text
- Training paradigms: pre-training, supervised fine-tuning, RLHF, DPO
- Inference: vLLM, llama.cpp, quantization (GPTQ and AWQ methods, the GGUF file format)
- Embedding models and vector search
- Evaluation: MMLU, HumanEval, custom benchmarks, LLM-as-judge
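
To give a flavor of the first topic, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration only, not code from any course: the function names `softmax` and `attention` are made up for this example, there is a single head with no masking or learned projections, and real implementations use tensor libraries rather than nested lists.

```python
import math


def softmax(xs):
    """Turn a list of scores into weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors.

    Returns (outputs, weights): one output vector and one weight
    vector per query. Hypothetical helper for illustration only.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0, 0.0], [0.0, 10.0]]
    out, w = attention(q, k, v)
    print("weights:", w[0])
    print("output:", out[0])
```

The query matches the first key more closely, so the first value vector gets the larger weight in the output. A KV-cache, also listed above, amounts to keeping `keys` and `values` from earlier tokens so they are not recomputed at each generation step.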