Parsing, or syntactic analysis, is a crucial initial stage in designing and implementing a compiler. A well-designed syntax can significantly influence users to choose your programming language. This course is a practical class on building a manual Recursive-descent parser. If you're interested in parsing theory and automated algorithms, you might also consider our [Parsing Algorithms] class.
Why Choose Recursive Descent Parsers?
Recursive descent parsers are widely used in many production programming languages. Unlike automated parsing algorithms, manual implementation allows you to have full control over the parsing process, tackling complex constructs that may not be achievable with automatic parsers.
Implementing a manual parser from scratch provides an in-depth understanding of the process, demystifying internal structures, turning parser building into an engaging engineering task.
Course Overview
In the Building a Parser from Scratch class, we focus on the practical implementation and exploration of various parser aspects. You will learn the concept of Recursive descent parsing, understand what a Tokenizer is and how it works with the Parser module. You'll also discover what an Abstract Syntax Tree (AST) is, how to format these ASTs, and learn about “lookahead” and predictive parsing. Ultimately, you'll build a parser for a complete programming language similar to Java or JavaScript.
Building a parser will also enhance your practical skills in other programming languages.
Target Audience
Who should take this class?
This class is designed for any curious engineer who wants to acquire skills in building complex systems—such as creating a parser for a programming language, which is a challenging engineering task—and gain transferable knowledge for developing such systems. If you are particularly interested in compilers, interpreters, and source code transformation tools, this class is also ideal for you.
Prerequisites include understanding basic data structures and algorithms: trees, lists, traversal, and regular expressions.
Implementation Tools
We use JavaScript due to its elegant multi-paradigm structure, combining functional programming, class-based, and prototype-based OOP, which fit perfectly for our purposes.
JavaScript is familiar to many engineers, making it easier to start coding immediately. However, our approach ensures that JS-specific constructs are minimal, facilitating transferability to other programming languages of your choice.
Note: Our objective is for students to actively follow, understand, and implement each detail of the parser themselves, rather than relying on direct copy-pasting from a final solution. The complete source code is available in video lectures, with guidance on structuring specific modules.
Unique Features of This Class
- Concise and focused content. Each lecture stays self-contained, concise, and covers directly relevant material, avoiding distractions.
- Animated presentations paired with live-editing notes, enhancing topic comprehension and illustrating object structure connections dynamically.
- End-to-end live coding sessions include assignments. The entire source code, from scratch to completion, is demonstrated in our video lectures.
Course Structure
The course is divided into four parts, comprising 18 lectures and numerous sub-topics. Below is the table of contents and curriculum:
Part 1: Basic Expressions and Tokenizer
We explore basic expressions such as Numbers and Strings and develop Tokenizer modules using regular expressions.
Part 2: Program Structure
This section covers program structures, including statements, statement lists, blocks, and recursive production rules. We discuss various AST formats and begin constructing more complex expressions.
Part 3: Control Flow and Functions
Here, we implement variables, assignments, and operator precedence, introducing function abstractions. Control structures like the If-statement and iteration loops are also defined.
Part 4: Object-Oriented Programming
In the final part, we implement classes and objects, discussing property and array access. We also implement generic function and method calls, culminating in building the final parser executable.