System Design Foundations
System design has crept down into new-grad and mid-level loops, and most resources dump senior-level reference diagrams on you. This track does the opposite: one building block at a time, taught through the judgment an interviewer is really testing. When do you reach for a cache? Why this database and not that one? You will be able to make the call and defend it.
Every lesson uses the same prediction-first format as the DSA path: read a concrete scenario, predict the move, then see why. No diagrams to memorize.
New here? Start at the beginning.
The track is ordered. Begin with "How system design interviews work" and work straight down, one building block at a time.
Part 1 · The building blocks (20 units)
Caching, queues, sharding, and friends: when to reach for each, and what it costs.
1. How system design interviews work
Walk into a system design interview with a repeatable plan: clarify what you're building, size it, then design it in a fixed order instead of freezing.
7 steps
- Lesson · Why it feels impossible: there's no single right answer · Read, then predict the move · Free
- Lesson · Start by clarifying, then estimate the scale · Read, then predict the move · Free
- Lesson · Then design in order: API, data, high-level, deep dive · Read, then predict the move · Free
- Lesson · Finish with bottlenecks and tradeoffs · Read, then predict the move · Free
- Lesson · The habit that ties it together: think out loud · Read, then predict the move · Free
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Free
- Test · Graded test · 5 questions · 4 to pass · Free
2. The request path
Trace what happens between a user clicking a link and getting a response, and name where each system-design component slots into that path.
7 steps
- Lesson · The journey, end to end · Read, then predict the move · Free
- Lesson · DNS: turning a name into an address · Read, then predict the move · Free
- Lesson · The load balancer: one front door, many servers · Read, then predict the move · Free
- Lesson · Why app servers should be stateless · Read, then predict the move · Free
- Lesson · Where caches and CDNs slot in · Read, then predict the move · Free
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Free
- Test · Graded test · 5 questions · 4 to pass · Free
3. API design basics
Define a clean API for a system: pick the right verb for each action, keep it stateless, and make operations safe to retry.
6 steps
- Lesson · An API is a contract · Read, then predict the move · Premium
- Lesson · Resources and verbs (REST) · Read, then predict the move · Premium
- Lesson · Idempotency: safe to retry · Read, then predict the move · Premium
- Lesson · Keep the API stateless · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
4. Back-of-envelope estimation
Produce rough scale numbers (requests per second, storage, read/write ratio) fast, and use them to justify design choices instead of guessing.
7 steps
- Lesson · Round hard, aim for the order of magnitude · Read, then predict the move · Premium
- Lesson · Estimating requests per second (QPS) · Read, then predict the move · Premium
- Lesson · Estimating storage · Read, then predict the move · Premium
- Lesson · The read-to-write ratio drives the design · Read, then predict the move · Premium
- Lesson · Latency numbers worth knowing · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
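To give a flavor of the estimation this unit covers, here is a minimal sketch of a requests-per-second and storage estimate. Every input (10 million daily users, 20 requests per user per day, 1 million new 1 KB records per day) is a made-up assumption for illustration, not a figure from the track, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def average_qps(daily_users: int, requests_per_user_per_day: int) -> float:
    """Average requests per second, spread evenly over a day."""
    return daily_users * requests_per_user_per_day / SECONDS_PER_DAY


def yearly_storage_gb(new_records_per_day: int, bytes_per_record: int) -> float:
    """Raw storage added per year, in decimal gigabytes (no replication or indexes)."""
    return new_records_per_day * bytes_per_record * 365 / 1e9


# Assumed inputs: 10M daily users, 20 requests each per day.
avg = average_qps(10_000_000, 20)

# Assumed inputs: 1M new records per day at about 1 KB each.
storage = yearly_storage_gb(1_000_000, 1_000)

print(f"average QPS is about {avg:,.0f}; round it to a couple of thousand")
print(f"storage grows by about {storage:,.0f} GB per year")
```

The exact digits matter less than the order of magnitude: a few thousand requests per second and a few hundred gigabytes per year point to a very different design than millions of requests per second or petabytes of data.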
5. SQL vs NoSQL
Choose between a relational and a non-relational database based on data shape and access patterns, and defend the choice instead of guessing.
7 steps
- Lesson · Two shapes of database · Read, then predict the move · Premium
- Lesson · What SQL is great at · Read, then predict the move · Premium
- Lesson · What NoSQL is great at · Read, then predict the move · Premium
- Lesson · Choose by access pattern, not by hype · Read, then predict the move · Premium
- Lesson · You can use both · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
6. Transactions & ACID
Spot when a set of writes must succeed or fail together, explain what a transaction guarantees, and say when you genuinely need one.
7 steps
- Lesson · The problem: multi-step changes, and the world can stop mid-step · Read, then predict the move · Premium
- Lesson · Atomicity: all or nothing · Read, then predict the move · Premium
- Lesson · Isolation: concurrent actions that don't trample each other · Read, then predict the move · Premium
- Lesson · Durability and consistency: committed means safe, rules stay true · Read, then predict the move · Premium
- Lesson · The judgment call: when you actually need this · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
7. Database indexing
Explain why a query is fast or slow, what an index speeds up and what it costs, and decide when a column deserves one.
7 steps
- Lesson · The problem: finding a row can mean scanning every row · Read, then predict the move · Premium
- Lesson · An index is like the index of a book · Read, then predict the move · Premium
- Lesson · The cost: writes get slower · Read, then predict the move · Premium
- Lesson · When to add an index · Read, then predict the move · Premium
- Lesson · Don't over-index · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
8. Caching
Spot when a cache belongs in a design, say what it buys you and what it costs, and pick what to cache and where.
7 steps
- Lesson · The problem: the same expensive work, over and over · Read, then predict the move · Premium
- Lesson · What a cache buys you: speed and breathing room · Read, then predict the move · Premium
- Lesson · The catch: the cache can be wrong · Read, then predict the move · Premium
- Lesson · Managing staleness: expiry and invalidation · Read, then predict the move · Premium
- Lesson · Where caches live · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
9. Load balancing
Explain how a load balancer spreads traffic, routes around dead servers, and why you run more than one of everything, including the balancer itself.
7 steps
- Lesson · The job: one front door, many servers · Read, then predict the move · Premium
- Lesson · Health checks: routing around failure · Read, then predict the move · Premium
- Lesson · How it decides: distribution strategies · Read, then predict the move · Premium
- Lesson · Why stateless servers make this work · Read, then predict the move · Premium
- Lesson · Isn't the load balancer a single point of failure? · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
10. Availability & Failover
Find the single points of failure in a design, add redundancy where it counts, and explain how failover actually switches to the backup.
7 steps
- Lesson · Availability is measured in nines · Read, then predict the move · Premium
- Lesson · The single point of failure · Read, then predict the move · Premium
- Lesson · Redundancy: a second one, ready to go · Read, then predict the move · Premium
- Lesson · Failover: detecting death and switching · Read, then predict the move · Premium
- Lesson · The chain: you are only as available as your dependencies · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
11. Replication
Explain how copying a database across machines scales reads and survives failures, and name the price: replication lag.
7 steps
- Lesson · The problem: one database is a bottleneck and a risk · Read, then predict the move · Premium
- Lesson · Leader and followers · Read, then predict the move · Premium
- Lesson · Read replicas scale reads · Read, then predict the move · Premium
- Lesson · Replication lag: the catch · Read, then predict the move · Premium
- Lesson · Availability: promote a follower · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
12. Sharding / partitioning
Explain how splitting data across machines scales writes and storage, how a shard key decides the split, and the hotspots and cross-shard costs it introduces.
7 steps
- Lesson · When replicas aren't enough · Read, then predict the move · Premium
- Lesson · Sharding: split the data across machines · Read, then predict the move · Premium
- Lesson · The shard key decides where data lives · Read, then predict the move · Premium
- Lesson · Hotspots: when a shard key goes wrong · Read, then predict the move · Premium
- Lesson · The cost: cross-shard operations get hard · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
13. Consistent hashing
Explain why hash(key) mod N falls apart when servers are added or removed, how the hash ring fixes it, and what virtual nodes smooth out.
7 steps
- Lesson · The problem: mod N reshuffles almost everything · Read, then predict the move · Premium
- Lesson · The ring: hash servers and keys into the same space · Read, then predict the move · Premium
- Lesson · Membership changes now move only a sliver · Read, then predict the move · Premium
- Lesson · Virtual nodes: fixing the lumpy ring · Read, then predict the move · Premium
- Lesson · Where you actually meet this · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
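The contrast this unit draws can be sketched in a few lines. The code below is an illustrative assumption, not the track's own material: it counts how many of 10,000 made-up keys change servers when a fifth server joins, first under hash(key) mod N and then on a simple hash ring with virtual nodes. The server names, key names, and the choice of 100 virtual nodes per server are all hypothetical.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def mod_owner(key: str, servers: list[str]) -> str:
    """Naive placement: hash(key) mod N."""
    return servers[stable_hash(key) % len(servers)]


def build_ring(servers: list[str], vnodes: int = 100) -> list[tuple[int, str]]:
    """Place each server on the ring at `vnodes` hashed positions, sorted by position."""
    return sorted((stable_hash(f"{s}#{v}"), s) for s in servers for v in range(vnodes))


def ring_owner(key: str, ring: list[tuple[int, str]], points: list[int]) -> str:
    """A key belongs to the first server position clockwise from its own hash."""
    i = bisect.bisect(points, stable_hash(key)) % len(points)
    return ring[i][1]


keys = [f"user:{i}" for i in range(10_000)]
old_servers = ["s1", "s2", "s3", "s4"]
new_servers = old_servers + ["s5"]

# Fraction of keys that change owner under mod N when going from 4 to 5 servers.
mod_moved = sum(mod_owner(k, old_servers) != mod_owner(k, new_servers) for k in keys) / len(keys)

# Same membership change on the ring.
old_ring, new_ring = build_ring(old_servers), build_ring(new_servers)
old_points = [p for p, _ in old_ring]
new_points = [p for p, _ in new_ring]
ring_moved = sum(
    ring_owner(k, old_ring, old_points) != ring_owner(k, new_ring, new_points) for k in keys
) / len(keys)

print(f"mod N moved {mod_moved:.0%} of keys; the ring moved {ring_moved:.0%}")
```

Going from 4 to 5 servers, mod N keeps a key in place only when its hash gives the same remainder for both divisors, which happens for about one key in five, so roughly 80% move. On the ring, only keys that land just before one of the new server's positions move, which is roughly one fifth of them, and they all move to the new server.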
14. Message queues & async processing
Recognize when to move work off the request path with a queue, and explain the three things a queue buys you: fast responses, spike absorption, and decoupling.
7 steps
- Lesson · The problem: slow work makes users wait · Read, then predict the move · Premium
- Lesson · A queue: hand off work to do later · Read, then predict the move · Premium
- Lesson · Absorbing spikes · Read, then predict the move · Premium
- Lesson · Decoupling producers and consumers · Read, then predict the move · Premium
- Lesson · The gotcha: messages can arrive more than once · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
15. Pub/sub vs queues
Tell a task apart from an event, pick a work queue or a pub/sub topic accordingly, and explain how the two combine in real systems.
7 steps
- Lesson · Tasks and events are different animals · Read, then predict the move · Premium
- Lesson · Why a single queue can't broadcast · Read, then predict the move · Premium
- Lesson · Pub/sub: publish once, deliver to everyone · Read, then predict the move · Premium
- Lesson · The payoff: adding features without touching producers · Read, then predict the move · Premium
- Lesson · Real systems use both, layered · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
16. CAP & consistency
Tell strong from eventual consistency, explain the availability tradeoff during a network partition, and choose the right consistency per piece of data.
7 steps
- Lesson · Strong vs eventual consistency · Read, then predict the move · Premium
- Lesson · Why not always strong? The cost · Read, then predict the move · Premium
- Lesson · The partition tradeoff (the CAP idea) · Read, then predict the move · Premium
- Lesson · Choosing consistency per piece of data · Read, then predict the move · Premium
- Lesson · It's a spectrum, and mostly a choice you make on purpose · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
17. Rate limiting
Explain why systems cap how often a client can call them, the token-bucket intuition for allowing bursts, and where a rate limiter belongs.
7 steps
- Lesson · The problem: one client can hurt everyone · Read, then predict the move · Premium
- Lesson · Rate limiting: a cap per client per window · Read, then predict the move · Premium
- Lesson · The token bucket: allowing bursts, capping the rate · Read, then predict the move · Premium
- Lesson · Where the rate limiter lives · Read, then predict the move · Premium
- Lesson · What to return when a client is limited · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
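The token-bucket intuition named in this unit can be sketched as follows. This is a minimal single-process illustration under assumed parameters (a bucket of 3 tokens refilling at 1 token per second); the class name, its arguments, and the injected fake clock are hypothetical and not taken from the track.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, then cap the sustained rate at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.now = now                  # injectable clock, so the demo is deterministic
        self.tokens = float(capacity)   # start full: the first burst is allowed
        self.last = now()

    def allow(self) -> bool:
        """Spend one token if available; otherwise reject the request."""
        current = self.now()
        elapsed = current - self.last
        self.last = current
        # Refill for the time that passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Demo with a fake clock instead of real sleeping.
fake_time = [0.0]
bucket = TokenBucket(capacity=3, refill_per_sec=1, now=lambda: fake_time[0])

burst = [bucket.allow() for _ in range(4)]       # three pass, the fourth is rejected
fake_time[0] += 2.0                              # two seconds later: two tokens refilled
after_wait = [bucket.allow() for _ in range(3)]  # two pass, the third is rejected

print(burst, after_wait)
```

The capacity sets how large a burst is tolerated and the refill rate sets the long-run cap. A production limiter would keep one bucket per client and usually store it somewhere shared across servers.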
18. Real-time & push
Explain why servers can't push over plain HTTP, compare polling, long polling, and WebSockets, and pick the cheapest mechanism that meets the freshness need.
7 steps
- Lesson · The wall: the server cannot start the conversation · Read, then predict the move · Premium
- Lesson · Polling: ask on a timer · Read, then predict the move · Premium
- Lesson · Long polling: ask, and the server holds the question open · Read, then predict the move · Premium
- Lesson · WebSockets: a two-way line that stays open · Read, then predict the move · Premium
- Lesson · Choosing: buy the cheapest freshness that meets the need · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
19. Search & the inverted index
Explain why text search defeats a normal database index, how an inverted index answers word queries instantly, and where a search service fits next to your database.
7 steps
- Lesson · Why LIKE '%term%' cannot be saved · Read, then predict the move · Premium
- Lesson · The inverted index: organize by word instead · Read, then predict the move · Premium
- Lesson · Tokenizing: why "Running" matches "runs" · Read, then predict the move · Premium
- Lesson · Multiple words, and who comes first · Read, then predict the move · Premium
- Lesson · Where search lives in your design · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium
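As a rough picture of what "organize by word instead" means, here is a minimal inverted-index sketch. The three sample documents and all function names are invented for illustration. The tokenizer only lowercases and splits on non-alphanumerics; it does not stem, so matching "Running" to "runs" would need an extra normalization step that this sketch leaves out, and ranking is not shown.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric words (no stemming in this sketch)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each word to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[int]], query: str) -> set[int]:
    """Return ids of documents containing every word in the query (AND semantics)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data.
docs = {
    1: "Running shoes for trail running",
    2: "Trail maps and guides",
    3: "Shoes on sale",
}
index = build_index(docs)

hits = search(index, "trail shoes")
print(hits)
```

Each query word is a single dictionary lookup, and a multi-word query is a set intersection, which is why this stays fast where a LIKE '%term%' scan has to read every row.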
20. Blob storage & CDNs
Decide where files and media live (object storage, not the database) and how to serve them fast worldwide with a CDN, keeping only metadata in your database.
7 steps
- Lesson · The problem: files don't belong in your database · Read, then predict the move · Premium
- Lesson · Object (blob) storage · Read, then predict the move · Premium
- Lesson · The pattern: metadata in the database, file in blob storage · Read, then predict the move · Premium
- Lesson · Serving globally: the CDN · Read, then predict the move · Premium
- Lesson · Putting it together: upload and serve · Read, then predict the move · Premium
- Practice · Applied drills · 3 scenario calls: would you reach for this here? · Premium
- Test · Graded test · 5 questions · 4 to pass · Premium