Hey folks, Rahul here.
You know that search box on Google, Amazon, or YouTube where suggestions magically appear as you type? That's a typeahead (or autocomplete) widget, and it's deceptively hard to build well.
I've seen candidates nail the basic debounce-and-fetch approach, then completely fall apart when asked "What happens when the user types faster than the network?" or "How do you handle 10 billion possible suggestions?" Let's make sure that doesn't happen to you.
R: Requirements
Functional Requirements
- Display suggestions as the user types in a search input
- Highlight matched portions of each suggestion
- Support keyboard navigation (↑/↓/Enter/Escape)
- Show recent searches for authenticated users
- Support multi-section results (products, categories, articles)
- Handle selection → navigate to the result or fill the input
Non-Functional Requirements
- Latency: Suggestions must appear within 100ms of the last keystroke (perceived)
- Network efficiency: Minimize redundant API calls
- Accessibility: Full ARIA combobox pattern with screen reader support
- Scalability: Handle suggestion pools of billions of entries
- Resilience: Graceful degradation on network failures
A: Architecture
Let me walk you through the component architecture. There are three approaches candidates typically consider:
Approach 1: Naive Fetch-on-Keystroke
Fire an API call on every keystroke. Simple, but it generates ~10 requests for "javascript", most of which are wasted. ❌ Don't do this.
Approach 2: Debounced Fetch
Classic approach: wait N ms after the last keystroke, then fetch. This reduces calls dramatically but introduces perceived latency: the user finishes typing and waits 200-300ms before seeing anything.
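To make the debounce idea concrete, here's a minimal sketch in TypeScript. The `debounce` helper and its `call`/`cancel` shape are my own illustrative names, not a specific library's API, and the delay is whatever N you pick.

```typescript
// Trailing-edge debounce: only the last call in a burst runs, delayMs after it.
// The helper name and return shape are illustrative assumptions.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number,
): { call: (...args: A) => void; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout> | undefined;

  return {
    call: (...args: A) => {
      // A new keystroke resets the countdown.
      if (timer !== undefined) clearTimeout(timer);
      timer = setTimeout(() => {
        timer = undefined;
        fn(...args);
      }, delayMs);
    },
    cancel: () => {
      // Useful on unmount so a pending fetch never fires.
      if (timer !== undefined) clearTimeout(timer);
      timer = undefined;
    },
  };
}
```

In a widget, you'd call `call(input.value)` from the input handler and `cancel()` when the component unmounts.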
Approach 3: Debounce + Request Deduplication + Client Cache ✅
This is the production pattern. Combine debouncing with an LRU cache and request deduplication (in-flight tracking). The cache means repeated prefixes are instant. Let me show you:
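Here's roughly what the cache plus in-flight tracking could look like. Treat it as a sketch under assumptions: `SuggestionClient`, `Fetcher`, and `peek` are names I made up for illustration, the fetcher is injected so the example runs without a network, and a plain `Map` stands in for the LRU cache discussed under LRU Cache Design. Debouncing would wrap the call to `get`.

```typescript
type Suggestion = { id: string; text: string };
type Fetcher = (query: string) => Promise<Suggestion[]>;

class SuggestionClient {
  private fetcher: Fetcher;
  private cache = new Map<string, Suggestion[]>(); // stand-in for an LRU
  private inFlight = new Map<string, Promise<Suggestion[]>>();

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  // Synchronous cache lookup so the UI can render without awaiting.
  peek(query: string): Suggestion[] | undefined {
    return this.cache.get(query);
  }

  get(query: string): Promise<Suggestion[]> {
    const cached = this.cache.get(query);
    if (cached) return Promise.resolve(cached);

    // Deduplication: reuse the promise of an identical request still in flight.
    const pending = this.inFlight.get(query);
    if (pending) return pending;

    const request = this.fetcher(query)
      .then((results) => {
        this.cache.set(query, results);
        return results;
      })
      .finally(() => {
        // Clear tracking on success and failure; failures are not cached.
        this.inFlight.delete(query);
      });
    this.inFlight.set(query, request);
    return request;
  }
}
```

Two components asking for the same prefix at the same time share one network call, and a repeated prefix resolves straight from the cache.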
Why 150ms debounce?
Research shows the average inter-keystroke interval for proficient typists is ~100-150ms. Setting the debounce at 150ms means we fire after most rapid typing bursts while keeping perceived latency low. Google actually uses ~100ms; they can afford it with their edge infrastructure.
Component Tree
D: Data Model
Client-Side State
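One plausible shape for the widget's state, based on the requirements above (multi-section results, keyboard navigation, recent searches). Every field name here is my assumption for illustration; adapt it to your own store.

```typescript
type Section = "products" | "categories" | "articles";
type Suggestion = { id: string; text: string; section: Section };

type TypeaheadState = {
  query: string;
  suggestions: Suggestion[];
  activeIndex: number; // -1 means no suggestion is highlighted
  isOpen: boolean;
  status: "idle" | "loading" | "error";
  recentSearches: string[]; // only populated for authenticated users
};

const initialState: TypeaheadState = {
  query: "",
  suggestions: [],
  activeIndex: -1,
  isOpen: false,
  status: "idle",
  recentSearches: [],
};

// Groups a flat suggestion list for multi-section rendering,
// preserving the order in which sections first appear.
function groupBySection(suggestions: Suggestion[]): Map<Section, Suggestion[]> {
  const groups = new Map<Section, Suggestion[]>();
  for (const s of suggestions) {
    const bucket = groups.get(s.section);
    if (bucket) bucket.push(s);
    else groups.set(s.section, [s]);
  }
  return groups;
}
```

Keeping the list flat and grouping at render time means `activeIndex` can stay a single number across sections.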
LRU Cache Design
Why LRU and not just a plain object? Memory. If a user explores many queries, an unbounded cache grows forever. An LRU capped at 100 entries keeps memory at roughly 50KB while retaining the most recently used prefixes.
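A compact way to get LRU behavior in JavaScript is to lean on `Map`'s insertion-order iteration. The sketch below is one possible implementation, not the only one; the class name and the capacity you pass in are up to you.

```typescript
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Both `get` and `set` are O(1), which matters because the cache sits on the keystroke path.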
I: Interface Definition
API Contract
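Here's a hedged sketch of what the contract might look like. The parameter names (`q`, `limit`), the field names, and the `/api/suggest` path in the usage line are hypothetical; the one piece grounded in this post is that the response echoes back the `query` it answers.

```typescript
type Section = "products" | "categories" | "articles";

interface SuggestRequest {
  query: string;
  limit?: number;
}

interface SuggestionItem {
  id: string;
  text: string;
  section: Section;
}

interface SuggestResponse {
  query: string; // echoed back so the client can detect stale responses
  suggestions: SuggestionItem[];
}

// Builds the request URL; the query is always percent-encoded.
function buildSuggestUrl(endpoint: string, req: SuggestRequest): string {
  const params = [`q=${encodeURIComponent(req.query)}`];
  if (req.limit !== undefined) params.push(`limit=${req.limit}`);
  return `${endpoint}?${params.join("&")}`;
}

// Shallow runtime check before trusting a parsed JSON body.
function isSuggestResponse(value: unknown): value is SuggestResponse {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { query?: unknown; suggestions?: unknown };
  return typeof candidate.query === "string" && Array.isArray(candidate.suggestions);
}
```

For example, `buildSuggestUrl("/api/suggest", { query: "c++ & go", limit: 5 })` keeps the `+` and `&` characters from being misread as URL syntax.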
The Stale Response Problem
This is a classic gotcha. The user types "rea" → a request fires → they type "react" → another fires. If the "rea" response arrives after the "react" response, you'd show the wrong suggestions. Solution:
Alternatively, have the response echo back the query field, so you can compare it against the current input without closure tricks.
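One common fix is a "latest request wins" guard: each call captures a ticket in its closure and drops its response if a newer call has started since. The sketch below assumes an injected `fetcher` and `render` callback (both hypothetical names) so it runs without a DOM or network.

```typescript
type Fetcher = (query: string) => Promise<string[]>;

function createSearch(
  fetcher: Fetcher,
  render: (results: string[]) => void,
): (query: string) => Promise<void> {
  let latestTicket = 0;

  return async function search(query: string): Promise<void> {
    const ticket = ++latestTicket; // captured by this call's closure
    const results = await fetcher(query);
    // A newer search started while we were waiting: this response is stale.
    if (ticket !== latestTicket) return;
    render(results);
  };
}
```

In a browser you could additionally pass an `AbortController` signal to `fetch` to cancel the superseded request, which saves bandwidth as well as avoiding the wrong render.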
ARIA Combobox Pattern
The `aria-activedescendant` pattern is crucial: it tells screen readers which suggestion is "focused" without actually moving DOM focus out of the input. This lets the user keep typing while navigating suggestions.
O: Optimizations
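To show how the attributes fit together, here's a framework-free sketch that computes them from state. The helper names and the `-option-` id scheme are my assumptions; the attribute names themselves are standard ARIA. The listbox element would carry `role="listbox"` and the id passed as `listboxId`.

```typescript
type ComboboxOptions = { listboxId: string; isOpen: boolean; activeIndex: number };

// Each option needs a stable id so the input can point at it.
function optionId(listboxId: string, index: number): string {
  return `${listboxId}-option-${index}`;
}

// Attributes for the text input, which keeps DOM focus the whole time.
function inputAria(opts: ComboboxOptions): Record<string, string> {
  const attrs: Record<string, string> = {
    role: "combobox",
    "aria-autocomplete": "list",
    "aria-expanded": String(opts.isOpen),
    "aria-controls": opts.listboxId,
  };
  // Only reference an option while the list is open and something is highlighted.
  if (opts.isOpen && opts.activeIndex >= 0) {
    attrs["aria-activedescendant"] = optionId(opts.listboxId, opts.activeIndex);
  }
  return attrs;
}

// Attributes for the suggestion at `index` inside the listbox.
function optionAria(opts: ComboboxOptions, index: number): Record<string, string> {
  return {
    role: "option",
    id: optionId(opts.listboxId, index),
    "aria-selected": String(index === opts.activeIndex),
  };
}
```

Because these are plain objects, you can spread them onto elements in whatever rendering layer you use.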
1. Prefix-Based Cache Warming
Here's a trick Google uses: if you have cached results for "reac", you can use them as provisional results for "react" while the network request is in-flight:
This makes the UI feel instant. The user sees filtered results from cache while the real results load in the background.
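A minimal sketch of the idea, assuming the cache maps lowercased prefixes to plain suggestion strings and that matching is by prefix (both are simplifications I chose for the example; `provisionalResults` is a hypothetical name):

```typescript
// Finds the longest cached prefix of `query` and filters its results
// down to the ones that still match the full query.
function provisionalResults(
  cache: Map<string, string[]>,
  query: string,
): string[] | undefined {
  const q = query.toLowerCase();
  for (let len = q.length; len > 0; len--) {
    const cached = cache.get(q.slice(0, len));
    if (cached) {
      return cached.filter((s) => s.toLowerCase().startsWith(q));
    }
  }
  return undefined; // nothing usable; show a loading state instead
}
```

The filtered list can be incomplete, because the server only returned its top results for the shorter prefix. That's why it's provisional and gets replaced once the real response lands.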
2. Highlight Matching with Fuzzy Support
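One way to do this is to split each suggestion into matched and unmatched segments, then render the matched ones in bold. The sketch below tries a case-insensitive substring match first and falls back to a greedy in-order character match as a simple form of fuzzy matching. The `Segment` type and `highlight` function are illustrative names, and the code assumes text where lowercasing doesn't change string length (plain ASCII is safe).

```typescript
type Segment = { text: string; match: boolean };

function highlight(text: string, query: string): Segment[] {
  if (query === "" || text === "") return [{ text, match: false }];
  const lowerText = text.toLowerCase();
  const lowerQuery = query.toLowerCase();
  const flags = new Array<boolean>(text.length).fill(false);

  const start = lowerText.indexOf(lowerQuery);
  if (start >= 0) {
    // Exact (case-insensitive) substring match.
    flags.fill(true, start, start + lowerQuery.length);
  } else {
    // Fuzzy fallback: every query character must appear in order.
    let from = 0;
    const positions: number[] = [];
    for (let i = 0; i < lowerQuery.length; i++) {
      const at = lowerText.indexOf(lowerQuery.charAt(i), from);
      if (at < 0) return [{ text, match: false }];
      positions.push(at);
      from = at + 1;
    }
    for (const p of positions) flags[p] = true;
  }

  // Merge adjacent characters with the same flag into segments.
  const segments: Segment[] = [];
  for (let i = 0; i < text.length; i++) {
    const match = flags[i] === true;
    const last = segments[segments.length - 1];
    if (last && last.match === match) last.text += text.charAt(i);
    else segments.push({ text: text.charAt(i), match });
  }
  return segments;
}
```

Rendering segments as text nodes (rather than injecting HTML strings) also avoids XSS from suggestion content.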
3. Keyboard Navigation State Machine
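Modeling navigation as a pure transition function keeps it easy to test. This is a sketch with assumptions baked in: arrow keys wrap around at the ends, ArrowDown/ArrowUp open a closed list, and the type and function names are mine.

```typescript
type NavState = { isOpen: boolean; activeIndex: number; count: number };
type NavKey = "ArrowDown" | "ArrowUp" | "Enter" | "Escape";
type NavResult = { state: NavState; selected?: number };

function onKey(state: NavState, key: NavKey): NavResult {
  switch (key) {
    case "ArrowDown": {
      if (state.count === 0) return { state };
      if (!state.isOpen) return { state: { ...state, isOpen: true, activeIndex: 0 } };
      // From -1 this lands on 0; from the last item it wraps to 0.
      return { state: { ...state, activeIndex: (state.activeIndex + 1) % state.count } };
    }
    case "ArrowUp": {
      if (state.count === 0) return { state };
      const last = state.count - 1;
      if (!state.isOpen) return { state: { ...state, isOpen: true, activeIndex: last } };
      const prev = state.activeIndex <= 0 ? last : state.activeIndex - 1;
      return { state: { ...state, activeIndex: prev } };
    }
    case "Enter": {
      // With nothing highlighted, let the form submit the raw query instead.
      if (!state.isOpen || state.activeIndex < 0) return { state };
      return {
        state: { ...state, isOpen: false, activeIndex: -1 },
        selected: state.activeIndex,
      };
    }
    case "Escape":
      return { state: { ...state, isOpen: false, activeIndex: -1 } };
  }
}
```

The caller maps `selected` back to a suggestion and decides whether to navigate or fill the input.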
4. Mobile-First: Full-Screen Takeover
On mobile, the dropdown pattern breaks because virtual keyboards eat half the screen. The production pattern is a full-screen search overlay.
5. Analytics & Search Intelligence
The selectedIndex is gold for ranking โ if users consistently pick the 3rd suggestion, your ranking model needs tuning.
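As a rough illustration of what you might log and how you'd read it, here's a sketch. Only `selectedIndex` comes from this post; the other event fields and the function name are hypothetical.

```typescript
type SelectionEvent = {
  query: string;         // what the user had typed at selection time
  selectedIndex: number; // 0-based position of the chosen suggestion
  suggestionId: string;
  timestamp: number;
};

// Share of selections per position. A healthy ranking concentrates at index 0.
function selectionShareByIndex(events: SelectionEvent[]): Map<number, number> {
  const counts = new Map<number, number>();
  for (const e of events) {
    counts.set(e.selectedIndex, (counts.get(e.selectedIndex) ?? 0) + 1);
  }
  const shares = new Map<number, number>();
  counts.forEach((count, index) => {
    shares.set(index, count / events.length);
  });
  return shares;
}
```

If index 2 holds most of the share for a given query, that's the signal to revisit the ranking for it.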
6. Rate Limiting & Graceful Degradation
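One way to approach this on the client is a backoff gate: after a failure (for example an HTTP 429), stop sending requests for an exponentially growing window and keep showing cached results in the meantime. The class below is a sketch; its name, the default delays, and the injectable clock are my assumptions.

```typescript
class BackoffGate {
  private failures = 0;
  private blockedUntil = 0;
  private baseMs: number;
  private maxMs: number;
  private now: () => number;

  constructor(baseMs = 500, maxMs = 30000, now: () => number = () => Date.now()) {
    this.baseMs = baseMs;
    this.maxMs = maxMs;
    this.now = now;
  }

  // Check this before fetching; when false, serve cached suggestions instead.
  canRequest(): boolean {
    return this.now() >= this.blockedUntil;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.blockedUntil = 0;
  }

  // retryAfterMs would come from a Retry-After header, if the server sends one.
  recordFailure(retryAfterMs?: number): void {
    this.failures += 1;
    const backoff = Math.min(this.maxMs, this.baseMs * 2 ** (this.failures - 1));
    this.blockedUntil = this.now() + Math.max(backoff, retryAfterMs ?? 0);
  }
}
```

The search input itself should keep working while the gate is closed: the user can still type and submit, they just don't get fresh suggestions.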
7. Server-Side: Trie + Ranking
On the backend (brief overview for completeness), suggestions are typically served from a trie (prefix tree) stored in Redis, or from a dedicated service such as Elasticsearch's completion suggester:
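To illustrate the trie-plus-ranking idea, here's a small in-memory sketch. It is not how Redis or Elasticsearch implement it; the class names and the numeric `score` are assumptions, and it collects matches at query time, whereas a production system would more likely precompute the top results per node.

```typescript
class TrieNode {
  children = new Map<string, TrieNode>();
  score: number | undefined = undefined; // defined only where a term ends
}

class SuggestionTrie {
  private root = new TrieNode();

  insert(term: string, score: number): void {
    const normalized = term.toLowerCase();
    let node = this.root;
    for (let i = 0; i < normalized.length; i++) {
      const ch = normalized.charAt(i);
      let next = node.children.get(ch);
      if (!next) {
        next = new TrieNode();
        node.children.set(ch, next);
      }
      node = next;
    }
    node.score = score;
  }

  // Returns up to k terms under `prefix`, highest score first.
  suggest(prefix: string, k: number): string[] {
    const normalized = prefix.toLowerCase();
    let node = this.root;
    for (let i = 0; i < normalized.length; i++) {
      const next = node.children.get(normalized.charAt(i));
      if (!next) return [];
      node = next;
    }

    // Depth-first walk of the subtree below the prefix node.
    const found: { term: string; score: number }[] = [];
    const stack: { node: TrieNode; term: string }[] = [{ node, term: normalized }];
    while (stack.length > 0) {
      const current = stack.pop();
      if (!current) break;
      if (current.node.score !== undefined) {
        found.push({ term: current.term, score: current.node.score });
      }
      current.node.children.forEach((child, ch) => {
        stack.push({ node: child, term: current.term + ch });
      });
    }

    found.sort((a, b) => b.score - a.score || a.term.localeCompare(b.term));
    return found.slice(0, k).map((f) => f.term);
  }
}
```

Lookup cost depends on the prefix length plus the size of the matching subtree, which is why short prefixes are the expensive case and the usual target for precomputation.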
Production Gotchas Rahul Has Debugged
- IME Composition: For CJK (Chinese/Japanese/Korean) input, don't trigger searches between `compositionstart` and `compositionend`, because the input is still incomplete. Wait for `compositionend` before fetching.
- Dropdown Positioning: Use `position: fixed` + `floating-ui` to handle scroll containers and viewport edges. CSS `position: absolute` gets clipped inside `overflow: hidden` ancestors.
- Click Outside vs. Mousedown: Use `onMouseDown` on suggestion items, not `onClick`. Why? Because `onBlur` on the input fires before `onClick` on the item, closing the dropdown before the click registers.
- URL Encoding: Always `encodeURIComponent` the query. Users will paste emojis, special characters, and even SQL injection attempts into your search box. Encoding keeps the URL well-formed; the server still has to treat the query as untrusted input.
- Flash of Empty State: When switching from cached results to loading fresh ones, don't clear the UI. Show stale results with a subtle loading indicator until fresh data arrives.
Summary Comparison Table
| Aspect | Naive | Debounce Only | Production (Cache + Dedup) |
|---|---|---|---|
| API calls for "javascript" | 10 | 2-3 | 1 (rest from cache) |
| Perceived latency | Network RTT per key | Debounce + RTT | ~0ms (cache) or Debounce + RTT |
| Stale responses | Frequent | Possible | Handled via query check |
| Memory usage | Low | Low | Bounded (LRU) |
| Offline support | None | None | Cache serves stale results |
Next up: #7: Design a Calendar/Date Picker, where we'll tackle date math nightmares, timezone handling, range selection state machines, and why Date is the worst API in JavaScript. Stay tuned!