Building a Virtual Machine for Programming Language

4h 27m 8s
English
Paid

How do programming languages work under the hood? What's the difference between a compiler and an interpreter? What is a virtual machine, and what is a JIT compiler? And what about the difference between functional and imperative programming? There are so many questions when it comes to implementing a programming language! The problem with "compiler classes" in school is that they are usually presented as "hardcore rocket science" reserved for advanced engineers.

Moreover, classic compiler books start from the least significant topic, such as lexical analysis, and go straight down to the theoretical aspects of formal grammars. By the time they implement their first tokenizer module, students simply lose interest in the topic, without having had a chance to start implementing a programming language itself. All of this is spread over a whole semester of messing with tokenizers and BNF grammars, without any understanding of the actual semantics of programming languages.

I believe we should be able to build and understand the full semantics of a programming language, end-to-end, in 4-6 hours, with content that goes straight to the point, is shown in live coding sessions as pair programming, and is described in a comprehensible way.

In the Building a Virtual Machine class we focus specifically on runtime semantics, and build a stack-based VM for a programming language very similar to JavaScript or Python. Working closely with the bytecode level you will understand how lower-level interpretation works in production VMs today.
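
To give a feel for what "stack-based" means at the bytecode level, here is a minimal sketch of an evaluation loop. It is not the course's actual code: the opcode names (`OP_CONST`, `OP_ADD`, and so on), their numeric values, and the `run` function are assumptions made up for illustration. The idea it shows is the general one: each instruction either pushes a value onto an operand stack or pops its operands and pushes a result.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical instruction set, chosen only for this illustration.
enum Op : uint8_t {
  OP_HALT = 0x00,   // stop and return the value on top of the stack
  OP_CONST = 0x01,  // push constants[next byte]
  OP_ADD = 0x02,
  OP_SUB = 0x03,
  OP_MUL = 0x04,
  OP_DIV = 0x05,
};

// Executes `code` against a constant pool and returns the final stack top.
double run(const std::vector<uint8_t>& code,
           const std::vector<double>& constants) {
  std::vector<double> stack;  // operand stack
  std::size_t ip = 0;         // instruction pointer

  auto pop = [&stack]() {
    if (stack.empty()) throw std::runtime_error("stack underflow");
    double value = stack.back();
    stack.pop_back();
    return value;
  };

  for (;;) {
    if (ip >= code.size()) throw std::runtime_error("missing HALT");
    switch (code[ip++]) {
      case OP_HALT:
        return pop();
      case OP_CONST: {
        if (ip >= code.size()) throw std::runtime_error("missing operand");
        uint8_t index = code[ip++];
        if (index >= constants.size())
          throw std::runtime_error("bad constant index");
        stack.push_back(constants[index]);
        break;
      }
      // Binary operations pop the right operand first, then the left.
      case OP_ADD: { double b = pop(); double a = pop(); stack.push_back(a + b); break; }
      case OP_SUB: { double b = pop(); double a = pop(); stack.push_back(a - b); break; }
      case OP_MUL: { double b = pop(); double a = pop(); stack.push_back(a * b); break; }
      case OP_DIV: { double b = pop(); double a = pop(); stack.push_back(a / b); break; }
      default:
        throw std::runtime_error("unknown opcode");
    }
  }
}
```

Under these assumptions, an expression like (2 + 3) * 4 would be encoded as `OP_CONST 0, OP_CONST 1, OP_ADD, OP_CONST 2, OP_MUL, OP_HALT` with the constant pool `{2, 3, 4}`. A register-based VM would instead name its operands explicitly in each instruction rather than taking them from the stack.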

Implementing a programming language will also make your practical work in other programming languages more professional.

Prerequisites

There are two prerequisites for this class.

The Building a Virtual Machine course is a natural extension of the previous class, Building an Interpreter from scratch (aka Essentials of Interpretation), where we also build a full programming language, but at a higher, AST level. Unless you already understand how programming languages work at this level, i.e. what eval, a closure, a scope chain, environments, and other constructs are, you have to take the interpreters class as a prerequisite.

Also, since we go down to the lower (bytecode) level where production VMs live, you need basic C++ experience. This class, however, is not about C++, so we use only very basic constructs that transfer to other languages.

Watch the introduction video for the details.

Who is this class for?

This class is for any curious engineer who would like to gain skills in building complex systems (and building a programming language is an advanced engineering task!) and obtain transferable knowledge for building such systems.

If you are interested specifically in compilers, bytecode interpreters, virtual machines, and source code transformation, then this class is also for you.

What is used for implementation?

Since lower-level VMs are about performance, they are usually implemented in a low-level language such as C or C++. This is exactly what we use as well, though mainly basic C++ features, without getting distracted by C++ specifics. The code should be easy to convert and port to any other language, e.g. to Rust, or even to higher-level languages such as JavaScript, leveraging typed arrays to mimic the concept of memory. Using C++ also makes it easier to implement a JIT compiler later.

Note: we want our students to actually follow, understand, and implement every detail of the VM themselves, instead of just copy-pasting from the final solution. Even though the full source code for the language is presented in the video lectures, the code repository for the project contains /* Implement here */ assignments, which students have to solve.

About the Authors

Dmitry Soshnikov

Dmitry Soshnikov is a Russian software engineer and educator focused on programming-language internals, compiler construction, JavaScript engine architecture, and the theoretical computer-science foundations underneath modern software development. His independent course catalog is one of the deepest sources of long-form material on language implementation available outside university CS programs.

His CourseFlix listing carries nine courses spanning parser combinators, interpreter construction, garbage-collection algorithm internals, the design of pattern-matching engines, and JavaScript object-model deep dives. Material is paid and aimed at engineers who want to understand how the languages they use every day actually work under the hood.

Udemy

Udemy is the largest open marketplace for online courses on the internet. Founded in 2010 by Eren Bali, Oktay Caglar, and Gagan Biyani and headquartered in San Francisco, the company went public on the Nasdaq in 2021 under the ticker UDMY. The platform hosts well over two hundred thousand courses across software development, IT and cloud, data science, design, business, marketing, and creative skills, taught by tens of thousands of independent instructors. Roughly seventy million learners use it worldwide, and the corporate arm — Udemy Business — supplies a curated subset of that catalog to enterprise customers.

Because Udemy is a marketplace rather than a single editorial publisher, the catalog is uneven by design. The strongest material lives in the long-form, project-based courses authored by working engineers — full-stack JavaScript, React, Node.js, Python data science, AWS, Docker and Kubernetes, mobile development with Flutter and React Native, and cloud certification preparation. The CourseFlix listing under this source is the slice of that catalog that has been mirrored here for offline-friendly viewing, organized by topic and updated as new releases land. Pricing on Udemy itself swings dramatically with the site's near-permanent sales, which is why the platform is best treated as a deep reference catalog: pick instructors with strong reviews and a track record of updating their material rather than buying on the headline price alone.

All Course Lessons (29)
#   Lesson Title                                Duration  Access
1   Introduction to Virtual Machines            19:23     Demo
2   Stack-based vs. Register-based VMs          09:37
3   Logger implementation                       04:01
4   Numbers | Introduction to Stack             08:01
5   Math binary operations                      07:08
6   Strings | Introduction to Heap and Objects  06:40
7   Syntax | Parser implementation              09:21
8   Compiler | Bytecode                         09:14
9   Complex expressions                         05:27
10  Comparison | Booleans                       05:19
11  Control flow | Branch instruction           08:19
12  Disassembler                                08:46
13  Global variables                            11:08
14  Blocks | Local variables                    14:41
15  Control flow | While-loops                  03:35
16  Native functions                            07:53
17  User-defined functions                      12:08
18  Call stack | Return address                 06:38
19  Lambda functions                            05:28
20  Bytecode optimizations                      05:32
21  Closures | Scope analysis                   17:56
22  Closures | Compilation                      09:32
23  Closures | Runtime                          12:25
24  Tracing heap | Object header                11:01
25  Mark-Sweep GC                               15:04
26  Class objects | Methods storage             13:24
27  Instance objects | Property access          10:58
28  Super classes | Inheritance                 02:14
29  Final VM executable                         06:15

Frequently asked questions

What prerequisites should I have before enrolling in this course?
Before enrolling, it's recommended to have a solid understanding of programming fundamentals, as the course involves complex topics like parser implementation and bytecode compilation. Familiarity with data structures and algorithms will also be beneficial, given the focus on stack and heap management, as well as control flow mechanisms like loops and conditionals.
What will I build by the end of this course?
By the end of the course, you will have built a fully functional virtual machine capable of interpreting a custom programming language. This includes implementing components such as a parser, compiler, and disassembler, as well as features like control flow management, function handling, and garbage collection using the Mark-Sweep algorithm.
Who is the target audience for this course?
This course is targeted at intermediate to advanced programmers interested in understanding how programming languages and virtual machines operate. It is particularly suitable for those who want to deepen their knowledge of compilers, interpreters, and the underlying mechanics of programming languages.
How does the depth of this course compare to similar courses?
The course offers a detailed exploration of virtual machines and language implementation, focusing on both theoretical concepts and practical applications. Topics covered include bytecode optimization, garbage collection, and closures, which are often considered advanced subjects, making it more comprehensive than introductory compiler courses.
What specific tools or platforms will I use in this course?
Throughout the course, you'll implement the virtual machine yourself in C++, using mainly basic language features. The focus is on building components like the parser, compiler, and garbage collector by hand, rather than on using pre-existing tools or platforms.
What topics are not covered in this course?
The course does not cover topics unrelated to virtual machine and programming language implementation, such as network programming, databases, or front-end development. It strictly focuses on the mechanics of virtual machines and language parsing/compilation.
How much time should I expect to commit to this course?
The video lessons run 4h 27m 8s in total, split across 29 lessons that delve into complex topics like bytecode compilation and garbage collection. It's advisable to allocate several weeks to digest the material fully, especially if working part-time, to ensure a comprehensive understanding of the concepts taught.