System Design · Unit 19
Search & the inverted index
Your marketplace has a million product listings, and the product manager asks for the obvious feature: a search box. The obvious implementation is one line of SQL: WHERE description LIKE '%running shoes%'. It works in the demo. It gets slower every week after launch, and no amount of ordinary database tuning fixes it.
Here is why. The indexing unit taught you that a database index is like a book's index: sorted, so you can jump straight to what you need. But sorted by what? By the column's value, from its first character. "running shoes" buried in the middle of a description is invisible to that sorting, so the database has no choice but to read all million descriptions, every search, checking each for the substring. And even when it limps through, the results are bad: "shoe" does not match "shoes", and there is no notion of which of 4,000 matches is most relevant.
Search is not a slow query problem; it is a different data structure problem. The structure is called an inverted index, and it is the machine inside every search engine you have ever used. The good news for the interview: you already understand its core idea from the DSA side. It is a hash map from word to list of documents.
The rest of the System Design course is premium
The first two units are free, and this is where the gate sits. Unlocking premium opens this unit and everything else in both courses:
- ✓This unit: 5 prediction-first lessons, 3 applied drills, and a 5-question graded test
- ✓All 20 System Design units, caching to CAP & consistency
- ✓The full DSA course: every unit, guided problem, and drill
Cancel anytime. Not useful within 7 days? Email for a full refund.