LLM Engineering
68 courses · 6 categories
Part of Learn Data & AI
LLM engineering is the applied discipline of shipping production systems on top of large language models — the API-side, infra-side work that lives between prompt-writing and pre-training. Unlike the broader AI hub, this topic focuses narrowly on the application layer above model providers: building retrieval-augmented pipelines, designing agentic loops, evaluating model outputs at scale, defending against prompt injection, and keeping inference cost predictable. It is the engineering layer that turns a model API into a service that survives real traffic.
The toolchain in 2026 has stabilized around a recognizable stack. Orchestration runs on LangGraph, the OpenAI Agents SDK, CrewAI, or hand-rolled state machines. Retrieval lives on pgvector, Qdrant, Pinecone, Weaviate, or Turbopuffer, with hybrid search and reranking via Cohere or open models. MCP (Model Context Protocol) has become the standard for exposing tools and resources to agents across providers. Evals run continuously through LangSmith, Braintrust, Langfuse, or in-house golden-dataset rigs, with LLM-as-judge for fuzzy assertions and exact-match for the rest.
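To make the hybrid-search step in that stack concrete, here is a minimal sketch in plain Python. The toy corpus, the hand-made three-dimensional "embeddings", and the blending weight are all invented for illustration; a real system would use an embedding model and a vector store such as pgvector or Qdrant.

```python
import math

# Toy corpus; in production the vectors would come from an embedding
# model and live in pgvector, Qdrant, or similar.
DOCS = {
    "doc1": ("Reset your password from the account settings page.", [0.9, 0.1, 0.0]),
    "doc2": ("Invoices are emailed on the first of each month.", [0.1, 0.8, 0.2]),
    "doc3": ("Password rules: minimum 12 characters, one symbol.", [0.8, 0.2, 0.1]),
}

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 if either is zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query, text):
    # Fraction of query terms that appear in the document (lexical side).
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0

def hybrid_search(query, query_vec, alpha=0.5, k=2):
    """Blend lexical and vector scores; alpha weights the vector side."""
    scored = []
    for doc_id, (text, vec) in DOCS.items():
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

print(hybrid_search("password reset", [0.85, 0.1, 0.05]))
# → ['doc1', 'doc3']
```

A reranker (Cohere or an open cross-encoder) would then re-score this candidate list before it reaches the model.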
What you'll find under this topic
- RAG architecture: chunking strategies, embeddings, hybrid search, reranking, query rewriting
- Agent design: tool-calling, state management, error recovery, multi-agent patterns
- MCP servers and clients: exposing tools, resources, and prompts across providers
- Production eval harnesses: regression suites, LLM-as-judge, trace-based debugging
- Prompt-injection defense: input sanitization, output filtering, indirect-injection mitigation
- Cost and latency control: model routing, prompt caching, structured outputs, batch API
- Provider integration patterns: OpenAI, Anthropic, Gemini, open-weight via vLLM / Together
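The agent-design bullet above can be sketched as a bare tool-calling loop. The planner here is a hard-coded stub and the message and action shapes are invented for illustration; they are not any provider's real API, but the loop structure (plan, call tool, feed result back, stop on a final answer or a step cap) is the common skeleton.

```python
def get_weather(city):
    # Stand-in for a real tool; a production version would hit an API.
    return {"paris": "18C, cloudy"}.get(city.lower(), "unknown")

TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    """Stub planner: request the tool once, then answer from its result."""
    last = messages[-1]
    if last["role"] == "user":
        return {"type": "tool_call", "name": "get_weather", "args": {"city": "Paris"}}
    return {"type": "final", "text": f"Weather in Paris: {last['content']}"}

def run_agent(user_msg, max_steps=5):
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):  # hard step cap: the simplest error recovery
        action = fake_model(messages)
        if action["type"] == "final":
            return action["text"]
        result = TOOLS[action["name"]](**action["args"])
        messages.append({"role": "tool", "content": str(result)})
    return "step limit reached"

print(run_agent("What's the weather in Paris?"))
# → Weather in Paris: 18C, cloudy
```

Real frameworks (LangGraph, the OpenAI Agents SDK) add state persistence, retries, and multi-agent routing on top of exactly this loop.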
The hiring market for LLM engineers in 2026 includes every SaaS company with an AI feature roadmap, dedicated applied-AI teams at OpenAI, Anthropic, and Google, and a long tail of startups built on top of foundation models. The skill set is distinct from ML research and from generic backend work — it sits at the intersection.
Categories (6)
Courses (68)
- New · By: Academind Pro (Maximilian Schwarzmüller) · Learn to develop AI applications and agent systems with practical examples and theory. Master content generation and process automation using AI. · 4 h 2 min 13 s
- New · By: Nomad Coders · Learn how to develop autonomous AI agents for business in practice. The course covers current frameworks and real projects. · 24 h 27 min 34 s
- New · By: Zero To Mastery · This course shows you how to build smarter AI apps with RAG. You use RAG to give LLMs fresh facts from your own data. · 17 h 51 min 59 s · 5/5
- New · By: Tech with Lucy (Lucy Wang) · Unlock your potential as an AI/ML engineer with five hands-on projects on AWS. This course is designed to give you practical experience. · 5/5
- New · By: Carlos Marcial · Unlock the full potential of AI chatbots with ChatRAG, a comprehensive Next.js build designed for launching a successful SaaS business.
- Updated 1 mo ago · By: WebDevCody · Master agent-based programming using AI models and tools like Claude and GPT-5.1. Create applications faster and become a systems architect. · 11 h 58 min 41 s · 5/5
- Updated 1 mo ago · By: Matt Pocock · Learn to create a personal AI assistant using TypeScript in 5 days. Work with data, customize it to your needs, and apply modern techniques. · 3 h 38 min 48 s · 5/5
- Updated 1 mo ago · By: Ishan Anand · Learn how AI and LLM models work in just a few hours. The course helps you master the technical aspects without delving into programming. Ideal for developers. · 8 h 9 min 42 s
- Updated 1 mo ago · By: IndyDevDan · Study the transition from AI coding to agent engineering. Create autonomous systems that design and test themselves, applying advanced practices. · 12 h 53 min 36 s · 5/5
- Updated 3 mo ago · By: Zero To Mastery · Stop memorizing random prompts. Instead, learn how large language models (LLMs) actually work and how to use them effectively. This course will take you from… · 31 h 45 min 3 s · 5/5
- Updated 3 mo ago · By: Zero To Mastery · Learn how to create AI agents in n8n without coding. Discover how to integrate language models, configure triggers, and set up nodes for task automation. · 2 h 51 min 16 s · 5/5
- Updated 5 mo ago · By: Talk Python Training · Learn how to use agentic AI to create and improve Python applications. Discover how it differs from chatbots and customize AI for your tasks. · 2 h 38 min 10 s · 5/5
- Updated 5 mo ago · By: ByteSizeGo · Learn to integrate AI with Go: create projects, enhance skills, and deploy AI apps. Includes LLM APIs, vector databases, and model interactions. · 11 h 13 min 5 s
- Updated 5 mo ago · By: Andreas Kretz · Learn how to develop a local RAG system for processing PDFs with LlamaIndex and Ollama, using Elasticsearch and Mistral. Master the creation of chat interfaces. · 1 h 49 min 50 s
- Updated 5 mo ago · By: Design Gurus · Learn the foundations of modern AI with practical examples and ethical insights. Ideal for beginners and those seeking to deepen their AI understanding.
- Updated 5 mo ago · By: Vue School, Justin Schroeder, Daniel Kelly, Garrison Snelling · Find out how to create custom AI agents and develop universal rules to enhance your productivity and address all aspects of a project. · 1 h 2 min 46 s
- Updated 6 mo ago · By: Vue School, Justin Schroeder, Daniel Kelly, Garrison Snelling · Study the RAG approach to enhance AI with your own data. Learn about vectors, embeddings, and integration. Apply the approach in real projects. · 26 min 55 s
- Updated 6 mo ago · By: Towards AI, Louis-François Bouchard · Learn Python from the ground up and use it to build your own AI tools. You start with the basics and grow the skills you need to work with LLMs in real… · 1 h 41 min 58 s
- Updated 7 mo ago · By: Andreas Kretz · Master semantic search with this course on generative AI. Learn to build a complete pipeline using FastAPI, Qdrant, and Streamlit for advanced data processing. · 53 min 37 s
- Updated 7 mo ago · By: Andreas Kretz · The Hidden Foundation of GenAI gives you a clear start in embeddings. You learn what sits under LLMs, vector search, and semantic tools. · 20 min 42 s · 5/5
- Updated 7 mo ago · By: Kent C. Dodds · The most interesting thing in software right now is MCP. It's a protocol that turns applications into smart conversational partners: instead of… · 7 h 23 min 25 s · 5/5
- Updated 7 mo ago · By: Towards AI, Louis-François Bouchard · Unlock the potential of large language models with this intensive course, "LLM Basics in 10 Hours". · 10 h 30 min 55 s · 5/5
- Updated 7 mo ago · By: Antonio Erdeljac (Code With Antonio) · In this course, we will build a customer support platform powered by AI from scratch: we will set up a live chat using Convex Agents, add voice support through… · 22 h 20 min 55 s · 5/5
- Updated 7 mo ago · By: Zero To Mastery · Master an in-demand skill that companies are looking for: the development and implementation of custom LLMs. In the course, you will learn how to fine-tune open… · 7 h 12 min 10 s · 5/5
- Updated 7 mo ago · By: Prompt Engineering · Explore the cutting-edge world of Retrieval-Augmented Generation (RAG) in this comprehensive course designed to deepen your understanding of both the… · 2 h 40 min 48 s · 5/5
- Updated 7 mo ago · By: Newline (ex-Fullstack.io) · In this comprehensive course, you will master the powerful capabilities of n8n, an open platform designed for creating robust, AI-powered workflows. · 49 min 8 s
- Updated 7 mo ago · By: Newline (ex-Fullstack.io) · If you are a freelancer or indie hacker for whom speed of implementation is just as important as quality, this course could be the most exciting one this year. · 28 min 5 s · 5/5
- Updated 7 mo ago · By: Newline (ex-Fullstack.io) · Prompt engineering helps you guide AI models with clear and useful inputs. LLMs can write, plan, explain, and code. · 45 min 54 s · 5/5
- Updated 7 mo ago · By: Newline (ex-Fullstack.io) · You will learn how MCP works and how to use it in real projects. This course keeps things clear and practical so you can build and test your own tools fast. · 1 h 10 min 6 s · 5/5
- Updated 7 mo ago · By: Newline (ex-Fullstack.io) · Unlock the potential to create visually stunning and fully functional web applications with the power of Replit Agent, an advanced AI agent dedicated to… · 44 min 5 s · 5/5
Frequently asked questions
- What does an LLM engineer actually do?
- Designs prompts and system messages, builds RAG pipelines and agents, integrates models via API or self-hosted inference, writes evaluation harnesses and guardrails, controls cost and latency, defends against prompt injection, and works closely with product on what models can and can't reliably do. Most of the work is engineering around the model, not training it.
- LLM engineering vs Prompt engineering — what's the difference?
- Prompt engineering is a sub-skill — writing the actual instructions the model receives. LLM engineering is the broader role: prompts plus retrieval, evaluation, deployment, observability, cost, security, and orchestration. Pure prompt-engineering job titles have largely faded; the durable role is LLM engineer or AI engineer, with prompting as one component.
- Do I need to understand transformers at the math level?
- Not for applied LLM engineering — knowing what attention, tokens, embeddings, and context length mean conceptually is enough. Math-level understanding becomes relevant only if you're fine-tuning at scale, designing new architectures, or doing research. Most production LLM work succeeds on solid software engineering plus model literacy.
- Closed models vs open weights — which to use?
- Closed (OpenAI, Anthropic, Google, xAI) for the strongest quality, easy onboarding, and frontier capability. Open weights (Llama, Qwen, Mistral, DeepSeek) for cost at high volume, data residency, on-prem requirements, and full customization. Most production stacks mix both — frontier model for hard tasks, smaller open model for cheap high-volume calls.
- How important are evaluations?
- Critical and chronically underdone. Without an evaluation harness you can't tell whether a prompt change is an improvement or a regression, and prompt engineering devolves into vibes-based iteration. Invest early in eval datasets, automated grading (LLM-as-judge or rule-based), and a way to compare runs side by side. This is where most LLM projects succeed or fail.
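The eval advice above can be sketched as a tiny harness: a golden dataset, an exact-match grader, and side-by-side scores for two prompt versions. Both "prompt versions" here are stubbed lookup tables standing in for real model calls; real harnesses (LangSmith, Braintrust, in-house rigs) add tracing and LLM-as-judge graders for fuzzy cases.

```python
# Golden dataset: known inputs paired with expected outputs.
GOLDEN = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3*3", "expected": "9"},
]

def prompt_v1(q):  # stub for a model call using prompt version 1
    return {"2+2": "4", "capital of France": "paris", "3*3": "9"}.get(q, "")

def prompt_v2(q):  # stub for a model call using prompt version 2
    return {"2+2": "4", "capital of France": "Paris", "3*3": "6"}.get(q, "")

def exact_match(got, expected):
    # Rule-based grader; an LLM-as-judge would replace this for fuzzy tasks.
    return got.strip().lower() == expected.strip().lower()

def score(system):
    # Fraction of golden cases the system gets right.
    return sum(exact_match(system(c["input"]), c["expected"]) for c in GOLDEN) / len(GOLDEN)

for name, system in [("v1", prompt_v1), ("v2", prompt_v2)]:
    print(f"{name}: {score(system):.0%}")
# → v1: 100%
# → v2: 67%
```

Run against every prompt change: a score that drops (v2's arithmetic regression here) blocks the change, which is exactly the regression signal vibes-based iteration never produces.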
Top instructors in LLM Engineering
Authors with the most LLM Engineering courses on CourseFlix.