System Design for Interviews and Beyond

7h 53m 5s
English
Paid

System Design for Interviews and Beyond is a 75-lesson, self-paced course by Mikhail Smarshchok with a total runtime of 7 hours 53 minutes. The instructor has over 15 years of industry experience and has spent the last 9 years building scalable, highly available, low-latency distributed systems.

Course facts

Lessons: 75
Duration: 7 hours 53 minutes
Level: All levels
Language: English
Instructor: Mikhail Smarshchok
Price: Premium

I have over 15 years of industry experience, and for the last 9 years I have worked on building scalable, highly available, and low-latency distributed systems. For a long time, I have wondered what the best way to learn system design is. While there are many excellent resources for learning individual concepts, few provide a holistic view of how to design systems. And even after you've invested a lot of time and gained a lot of knowledge, it's still hard to develop true system design thinking. That is the kind of thinking that helps answer questions such as: where do I start my design; where do I go next; how do I break this big, obscure problem into sub-problems that I know how to solve; and, even if I don't know the answer, can I make an educated guess? So I challenged myself to create a course that can help build and improve system design thinking. Two years later, you can see the result of this work.

System requirements (functional and non-functional requirements)
Functional requirements (how to define, working backwards approach)
High availability (time-based and count-based availability, design principles behind high availability, processes behind high availability, SLO, SLA)
Fault tolerance, resilience, reliability (error, fault, failure, fault tolerance, resilience, game day vs chaos engineering, expected and unexpected failures, reliability)
Scalability (vertical and horizontal scaling, elasticity vs scalability)
Performance (latency, throughput, percentiles, how to increase write and read throughput, bandwidth)
Durability (backup (full, differential, incremental), RAID, replication, checksum, availability vs durability)
Consistency (consistency models, eventual consistency, linearizability, monotonic reads, read-your-writes (read-after-write), consistent prefix reads)
Maintainability, security, cost (maintainability aspects (failure modes and mitigations, monitoring, testing, deployment), security aspects (CIA triad, identity and permissions management, infrastructure protection, data protection), cost aspects (engineering, maintenance, hardware, software))
Summary of system requirements (a single list of the most popular non-functional requirements)
Regions, availability zones, data centers, racks, servers (how hardware helps to achieve certain qualities)
Physical servers, virtual machines, containers, serverless (pros and cons of different computing environments, what are they good for)
Synchronous vs asynchronous communication (synchronous and asynchronous request-response models, asynchronous messaging)
Asynchronous messaging patterns (message queuing, publish/subscribe, competing consumers, request/response messaging, priority queue, claim check)
Network protocols (TCP, UDP, HTTP, HTTP request and response)
Blocking vs non-blocking I/O (socket (blocking and non-blocking), connection, thread per connection model, thread per request with non-blocking I/O model, event loop model, concurrency vs parallelism)
Data encoding formats (textual vs binary formats, schema sharing options, backward compatibility, forward compatibility)
Message acknowledgment (safe and unsafe acknowledgment modes)
Deduplication cache (local vs external cache, adding data to cache (explicitly, implicitly), cache data eviction (size-based, time-based, explicit), expiration vs refresh)
Metadata cache (cache-aside pattern, read-through and write-through patterns, write-behind (write-back) pattern)
Queue (bounded and unbounded queues, circular buffer (ring buffer) and its applications)
Full and empty queue problems (load shedding, rate limiting, what to do with failed requests, backpressure, elastic scaling)
Start with something simple (similarities between single machine and distributed system concepts, interview tip)
Blocking queue and producer-consumer pattern (producer-consumer pattern, wait and notify, semaphores, blocking queue applications)
Thread pool (pros and cons, CPU-bound and I/O-bound tasks, graceful shutdown)
Big compute architecture (batch computing model, embarrassingly parallel problems)
Log (memory vs disk, log segmentation, message position (offset))
Index (how to implement an efficient index for a messaging system)
Time series data (how to store and retrieve time series data at scale and with low latency)
Simple key-value database (how to build a simple key-value database, log compaction)
B-tree index (how databases and messaging systems use B-tree indexes)
Embedded database (embedded vs remote database)
RocksDB (memtable, write-ahead log, sorted strings table (SSTable))
LSM-tree vs B-tree (log-structured merge-tree data structure, write amplification, read amplification)
Page cache (how to increase disk throughput (batching, zero-copy read))
Push vs pull (pros and cons of both models)
Host discovery (DNS, anycast)
Service discovery (server-side and client-side discovery patterns, service registry and its applications)
Peer discovery (peer discovery options, membership and failure detection problems, seed node, how gossip protocol works and its applications)
How to choose a network protocol (when and how to choose between TCP, UDP and HTTP)
Network protocols in real-life systems (quiz: what network protocol would you choose for various system design problems)
Video over HTTP (adaptive streaming)
CDN (how to use it, how it works, point of presence (POP), benefits)
Push and pull technologies (short polling, long polling, websocket, server-sent events)
Push and pull technologies in real-life systems (quiz: what technology would you choose for various system design problems)
Large-scale push architectures (C10K and C10M problems, examples of large-scale push architectures, the most noticeable problems of handling long-lived connections at large scale)
What else to know to build reliable, scalable, and fast systems (a list of common problems in distributed systems, a list of system design concepts that help solve these problems, three-tier architecture)
Timeouts (fast failures, slow failures, connection and request timeouts)
What to do with failed requests (strategies for handling failed requests (cancel, retry, failover, fallback))
When to retry (idempotency, quiz: which AWS API failures are safe to retry)
How to retry (exponential backoff, jitter)
Message delivery guarantees (at-most-once, at-least-once, exactly-once)
Consumer offsets (log-based messaging systems, checkpointing)
Batching (pros and cons, how to handle batch requests)
Compression (pros and cons, compression algorithms and the trade-offs they make)
How to scale message consumption (single consumer vs multiple consumers, problems with multiple consumers (order of message processing, double processing))
Partitioning in real-life systems (pros and cons, applications of partitioning)
Partitioning strategies (lookup strategy, range strategy, hash strategy)
Request routing (physical and virtual shards, request routing options)
Rebalancing partitions (how to rebalance partitions)
Consistent hashing (how to implement, advantages and disadvantages, virtual nodes, applications of consistent hashing)
System overload (why it is important to protect the system from overload)
Autoscaling (scaling policies (metric-based, schedule-based, predictive))
Autoscaling system design (how to design an autoscaling system)
Load shedding (how to implement it in distributed systems, important considerations)
Rate limiting (how to use the knowledge gained in the course for solving the problem of rate limiting (step by step guide))
Synchronous and asynchronous clients (admission control systems, blocking I/O and non-blocking I/O clients)
Circuit breaker (circuit breaker finite-state machine, important considerations)
Fail-fast design principle (problems with slow services (chain reactions, cascading failures) and ways to solve them)
Bulkhead (how to implement this pattern in distributed systems)
Shuffle sharding (how to implement this pattern in distributed systems)
The end (a list of topics that we will cover in the next module of the course)

Who teaches System Design for Interviews and Beyond? Mikhail Smarshchok

Mikhail Smarshchok is a software engineer and educator behind the SystemDesignFightClub brand, focused on system design interviews and the architectural patterns behind large-scale production systems. His material is widely cited in the system design interview prep community.

His course on CourseFlix is System Design for Interviews and Beyond, a comprehensive treatment of system design that covers both the interview-prep curriculum and the broader architectural foundations of production systems.

Material is paid and aimed at engineers preparing for senior-level system-design interviews. For broader content, see CourseFlix's System Design & Architecture category page.

What lessons are included in System Design for Interviews and Beyond?

All Course Lessons (75)
# | Lesson Title | Duration
1 | 01 Introduction - Course Introduction (Demo) | 03:59
2 | 02 Introduction - Who will benefit from the course and how | 03:05
3 | 03 Introduction - Course overview | 04:51
4 | 04 System requirements | 06:04
5 | 05 Functional requirements | 04:39
6 | 06 High availability | 10:01
7 | 07 Fault tolerance, resilience, reliability | 08:24
8 | 08 Scalability | 06:17
9 | 09 Performance | 09:30
10 | 10 Durability | 09:13
11 | 11 Consistency | 10:37
12 | 12 Maintainability, security, cost | 09:36
13 | 13 Summary of system requirements | 02:11
14 | 14 Regions, availability zones, data centers, servers | 08:53
15 | 15 Physical servers, virtual machines, containers, serverless | 07:47
16 | 16 Synchronous vs asynchronous communication | 03:30
17 | 17 Asynchronous messaging patterns | 06:28
18 | 18 Network protocols | 07:33
19 | 19 Blocking vs non-blocking IO | 10:41
20 | 20 Data encoding formats | 06:45
21 | 21 Message acknowledgment | 03:01
22 | 22 Deduplication cache | 07:46
23 | 23 Metadata cache | 06:01
24 | 24 Queue | 03:31
25 | 25 Full and empty queue problems | 09:56
26 | 26 Start with something simple | 01:39
27 | 27 Blocking queue and producer-consumer pattern | 04:05
28 | 28 Thread pool | 05:34
29 | 29 Big compute architecture | 03:48
30 | 30 Log | 05:00
31 | 31 Data store internals - Index | 04:02
32 | 32 Data store internals - Time series data | 03:10
33 | 33 Data store internals - Simple key-value database | 05:27
34 | 34 Data store internals - B-tree index | 05:55
35 | 35 Data store internals - Embedded database | 04:05
36 | 36 Data store internals - RocksDB | 06:19
37 | 37 Data store internals - LSM-tree and B-tree | 04:37
38 | 38 Data store internals - Page cache | 07:16
39 | 39 Push vs pull | 04:06
40 | 40 Host discovery | 08:47
41 | 41 Service discovery | 06:32
42 | 42 Peer discovery | 09:05
43 | 43 How to choose a network protocol | 07:42
44 | 44 Network protocols in real-life systems | 09:53
45 | 45 Video over HTTP | 07:20
46 | 46 CDN | 05:15
47 | 47 Push and pull technologies | 08:04
48 | 48 Push and pull technologies in real-life systems | 06:25
49 | 49 Large-scale push architectures | 08:54
50 | 50 What else to know to build reliable, scalable, and fast systems | 04:58
51 | 51 How to deliver data reliably - Timeouts | 03:35
52 | 52 How to deliver data reliably - What to do with failed requests | 05:40
53 | 53 How to deliver data reliably - When to retry | 06:21
54 | 54 How to deliver data reliably - How to retry | 03:49
55 | 55 How to deliver data reliably - Message delivery guarantees | 06:51
56 | 56 How to deliver data reliably - Consumer offsets | 08:08
57 | 57 How to deliver data quickly - Batching | 07:25
58 | 58 How to deliver data quickly - Compression | 04:23
59 | 59 How to deliver data at large scale - How to scale message consumption | 08:07
60 | 60 How to deliver data at large scale - Partitioning in real-life systems | 05:03
61 | 61 How to deliver data at large scale - Partitioning strategies | 06:03
62 | 62 How to deliver data at large scale - Request routing | 06:28
63 | 63 How to deliver data at large scale - Rebalancing partitions | 07:46
64 | 64 How to deliver data at large scale - Consistent hashing | 09:58
65 | 65 How to protect servers from clients - System overload | 02:14
66 | 66 How to protect servers from clients - Autoscaling | 04:38
67 | 67 How to protect servers from clients - Autoscaling system design | 05:10
68 | 68 How to protect servers from clients - Load shedding | 09:11
69 | 69 How to protect servers from clients - Rate limiting | 11:48
70 | 70 How to protect clients from servers - Synchronous and asynchronous clients | 07:25
71 | 71 How to protect clients from servers - Circuit breaker | 04:08
72 | 72 How to protect clients from servers - Fail-fast design principle | 07:32
73 | 73 How to protect clients from servers - Bulkhead | 04:57
74 | 74 How to protect clients from servers - Shuffle sharding | 07:27
75 | 75 Epilogue - The end (but not quite) | 00:41

Frequently asked questions

What are the prerequisites for enrolling in the system design course?
The course does not specify strict prerequisites, but a foundational understanding of computer science concepts and experience with software development would be beneficial. Familiarity with topics like network protocols, data encoding formats, and database internals will help, as these are covered in detail in the lessons.
What projects or exercises are included in the course?
The course does not focus on specific projects or exercises but emphasizes system design thinking through lessons on topics such as high availability, fault tolerance, and scalability. Students will engage with concepts like synchronous vs asynchronous communication, message acknowledgement, and service discovery to build a comprehensive understanding of system design.
Who is the target audience for this system design course?
The course is designed for software engineers, system architects, and IT professionals seeking to enhance their understanding of system design. It is particularly beneficial for those preparing for technical interviews that focus on designing scalable and reliable systems.
How does this course compare in depth and scope to similar system design courses?
This course offers a holistic view of system design by covering a wide range of topics from system requirements to real-life application of network protocols. Unlike some courses that focus solely on theoretical concepts, it provides insights into practical aspects like message delivery guarantees and partitioning in large-scale systems.
What specific tools or platforms are covered in the course?
The course explores various system design tools and platforms, including physical servers, virtual machines, containers, and serverless architectures. It also delves into data store internals with an examination of technologies like RocksDB and B-trees, providing a thorough understanding of data management within systems.
What topics are not covered in this system design course?
While the course provides extensive coverage of system design principles, it does not delve into specific programming languages or frameworks. The focus is on understanding design concepts rather than implementation details, so students will not find tutorials on specific coding practices or language-specific tools.
What is the expected time commitment for completing the course?
The course consists of 75 lessons with a total runtime of approximately 7 hours and 53 minutes. Students should allocate additional time for reviewing concepts and engaging with the material to fully grasp system design thinking. The flexible format allows learners to progress at their own pace.