Design Uber
Streaming driver locations over WebSockets and matching riders to nearby drivers with a geospatial index (Redis GEO, geohash / S2 cells).
What you'll learn
- WebSockets for Real-Time GPS
- Redis GEO for Geospatial Indexing
- Geohash / S2 Cells for Spatial Partitioning
- Driver-Rider Matching Algorithm
- Surge Pricing (Supply/Demand Ratio)
- Kafka + PostgreSQL for Trip Lifecycle
TL;DR
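Drivers stream GPS coordinates over WebSockets every 4 seconds. A geospatial index (Redis GEO, built on geohash) finds nearby drivers, and a matching service ranks them by ETA and sends ride offers within 2 seconds. Surge pricing is computed per cell, and trip events flow through Kafka into PostgreSQL.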
Visual System Topology
[Figure: Uber real-time ride matching architecture. Surge annotation: demand/supply ratio per geohash cell → multiplier applied → recalculated every 30 sec.]
Concept Overview
Uber is a real-time, geo-distributed ride-matching platform. Its core challenge: match each rider to one of 1M+ active drivers within 2 seconds, using live GPS data that every active driver sends every 4 seconds.
Functional Requirements:
- Rider requests a ride (pickup location, destination)
- Match rider to nearest available driver in < 2 seconds
- Real-time GPS tracking of the driver's location
- Dynamic pricing (surge when demand > supply)
- Trip management (start, complete, cancel)
- Ratings, payments, driver earnings
Non-Functional Requirements:
- Driver location updates: 250K updates/sec (1M drivers × once/4 sec)
- Matching latency: < 2 seconds from ride request to driver offer
- 99.9% availability
- Location data freshness: < 5 seconds stale
Capacity Estimation:
- Rides/month: 100M ≈ 3.3M/day ≈ 38 rides/sec on average
- Active drivers at peak: 1M globally
- GPS updates: 1M drivers × 1 update/4 sec = 250K writes/sec to Redis
- Rider requests: ~38/sec globally on average (peak traffic is higher, but still orders of magnitude below the location update rate)
- Location storage (Cassandra history): 250K × 100 bytes = 25 MB/sec ≈ 2.2 TB/day
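The estimates above can be reproduced with a few lines of arithmetic. This is only a back-of-envelope sketch: the 30-day month and the 100-byte record size are assumptions taken from, or implied by, the figures in the list.

```python
# Back-of-envelope capacity check using the figures quoted in the list above.
SECONDS_PER_DAY = 24 * 60 * 60

rides_per_month = 100_000_000
rides_per_day = rides_per_month / 30          # assumes a 30-day month
rides_per_sec = rides_per_day / SECONDS_PER_DAY

active_drivers = 1_000_000
update_interval_sec = 4
gps_writes_per_sec = active_drivers / update_interval_sec

bytes_per_update = 100                        # assumed size of one location record
history_bytes_per_sec = gps_writes_per_sec * bytes_per_update
history_tb_per_day = history_bytes_per_sec * SECONDS_PER_DAY / 1e12

print(f"rides/sec (average): {rides_per_sec:.1f}")
print(f"GPS writes/sec:      {gps_writes_per_sec:,.0f}")
print(f"history MB/sec:      {history_bytes_per_sec / 1e6:.0f}")
print(f"history TB/day:      {history_tb_per_day:.2f}")
```

The takeaway is the gap between the two rates: location writes outnumber ride requests by several thousand to one, so the location path drives the design.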
Key Architectural Pillars
WebSockets for Real-Time GPS
Drivers send GPS coordinates every 4 seconds. Sending each update as a separate HTTP request would add per-request overhead (headers, plus a TCP/TLS handshake whenever the connection is not reused) 250K times per second, which is wasteful. WebSockets keep one persistent connection per driver for the duration of the shift. The Location Service receives the stream of {driver_id, lat, lng, heading, speed} updates and writes them to Redis GEO in real time. Riders also receive driver location updates over a WebSocket for the live map view.
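As an illustration of what the Location Service might do with each incoming frame, here is a minimal sketch that decodes and validates one update. The JSON encoding, the `LocationUpdate` type, the `parse_location_update` helper, and the units for heading and speed are all assumptions for the example; the source only names the five fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationUpdate:
    driver_id: str
    lat: float
    lng: float
    heading: float  # degrees clockwise from north (assumed convention)
    speed: float    # metres per second (assumed unit)


def parse_location_update(raw: str) -> LocationUpdate:
    """Decode one WebSocket text frame and reject out-of-range coordinates."""
    msg = json.loads(raw)
    update = LocationUpdate(
        driver_id=str(msg["driver_id"]),
        lat=float(msg["lat"]),
        lng=float(msg["lng"]),
        heading=float(msg.get("heading", 0.0)),
        speed=float(msg.get("speed", 0.0)),
    )
    if not -90.0 <= update.lat <= 90.0:
        raise ValueError(f"latitude out of range: {update.lat}")
    if not -180.0 <= update.lng <= 180.0:
        raise ValueError(f"longitude out of range: {update.lng}")
    return update


frame = '{"driver_id": "d-42", "lat": 37.7749, "lng": -122.4194, "heading": 90, "speed": 8.3}'
update = parse_location_update(frame)
print(update)
```

Validating at the edge keeps malformed coordinates out of the geospatial index, where they would be harder to detect.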
Redis GEO for Geospatial Indexing
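A SQL table with plain lat/lng columns answers proximity queries with WHERE lat BETWEEN X AND Y AND lng BETWEEN A AND B. A composite B-tree index can narrow only one of the two ranges efficiently, so at scale this degrades into a large range scan. Redis GEO stores driver locations in a sorted set whose scores are 52-bit geohash values. The GEORADIUS command (deprecated since Redis 6.2 in favor of GEOSEARCH) returns all drivers within a radius of a point in O(N+log(M)), where N is the number of entries inside the bounding box of the search area and M is the number of entries in the index. With 1M active drivers, a small-radius query typically completes in < 10 ms.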
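To make the semantics of a radius query concrete without requiring a Redis server, the sketch below uses a brute-force in-memory stand-in. `InMemoryGeoIndex` and its methods are hypothetical names, not a Redis API, and the scan is O(M) rather than the indexed lookup Redis performs; only the result (members within the radius, nearest first) is meant to match.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class InMemoryGeoIndex:
    """Hypothetical stand-in for a Redis GEO key; brute force, not indexed."""

    def __init__(self):
        self._positions = {}

    def add(self, member, lat, lng):
        # A later add for the same member overwrites its position,
        # which is how a 4-second GPS update would behave.
        self._positions[member] = (lat, lng)

    def search_radius(self, lat, lng, radius_km):
        hits = []
        for member, (mlat, mlng) in self._positions.items():
            d = haversine_km(lat, lng, mlat, mlng)
            if d <= radius_km:
                hits.append((member, d))
        return sorted(hits, key=lambda h: h[1])  # nearest first


index = InMemoryGeoIndex()
index.add("driver:1", 37.7749, -122.4194)   # downtown San Francisco
index.add("driver:2", 37.8044, -122.2712)   # Oakland, roughly 13 km away
index.add("driver:3", 37.7790, -122.4140)   # a few blocks from driver:1
nearby = index.search_radius(37.7749, -122.4194, radius_km=5)
print(nearby)
```

Against a real Redis instance the same two steps map to GEOADD and GEOSEARCH. Note that Redis takes longitude before latitude in both commands.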
Geohash / S2 Cells for Spatial Partitioning
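The Earth's surface is divided into cells at multiple precision levels. A geohash string encodes a rectangular region, and each extra character splits that region into 32 smaller cells: the 4-character cell "9q8y" is roughly 39 km × 20 km, the 5-character cell "9q8yy" roughly 5 km × 5 km, and a 6-character cell roughly 1.2 km × 0.6 km (sizes at the equator; cells get narrower toward the poles). Drivers in the same cell are nearby. Benefits: (1) Cells that share a prefix lie inside the same parent cell, so a prefix match implies proximity (geohash "9q8yy" and "9q8yz" are adjacent). The converse does not hold: neighbors on opposite sides of a parent-cell boundary can have very different prefixes, so proximity lookups should also check the surrounding cells. (2) Cell-based surge pricing calculation (demand/supply ratio per cell). (3) Sharding: each geohash cell can be handled by a different server. Uber has described using Google S2 cells, which are more uniform in area than geohash cells.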
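The encoding itself is short enough to write out. The sketch below follows the standard geohash scheme: bisect the longitude and latitude ranges alternately, starting with longitude, and emit one base-32 character per 5 bits. The function name `geohash_encode` is just a label for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lng: float, precision: int = 5) -> str:
    """Encode a coordinate as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # bits alternate between longitude and latitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


print(geohash_encode(37.7749, -122.4194, precision=5))  # "9q8yy"
print(geohash_encode(37.7749, -122.4194, precision=4))  # "9q8y"
```

Because a shorter hash is always a prefix of a longer one for the same point, truncating a stored geohash is a cheap way to move to a coarser cell, for example when aggregating surge metrics.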
Driver-Rider Matching Algorithm
When a rider requests a ride, the Matching Service: (1) queries Redis GEORADIUS for drivers within 5 km, (2) filters by availability (online, no active trip) and car type, (3) estimates the ETA for each candidate using real-time traffic data, (4) ranks candidates by ETA and rating score, (5) sends the ride offer to the top 3 drivers simultaneously. The first driver to accept gets the trip. Each offer expires after 10 seconds if the driver does not respond. If all 3 reject or time out, the service expands the radius and retries.
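Steps (2), (4), and (5) can be sketched as a filter followed by a sort. The source does not say how ETA and rating are combined, so the scoring rule here is an assumption: each rating point below 5.0 adds `rating_weight_sec` seconds of effective ETA. `Candidate` and `rank_candidates` are hypothetical names, and the ETA values are assumed to come from a separate ETA service.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    driver_id: str
    available: bool
    car_type: str
    eta_sec: float   # assumed to be supplied by an ETA service
    rating: float    # 1.0 to 5.0


def rank_candidates(candidates, car_type, top_n=3, rating_weight_sec=60.0):
    """Filter by availability and car type, then rank by rating-adjusted ETA.

    rating_weight_sec is a hypothetical tuning knob, not a value from the source.
    """
    eligible = [c for c in candidates if c.available and c.car_type == car_type]

    def score(c):
        # Lower is better: raw ETA plus a penalty for lower ratings.
        return c.eta_sec + (5.0 - c.rating) * rating_weight_sec

    return sorted(eligible, key=score)[:top_n]


candidates = [
    Candidate("d1", True, "standard", eta_sec=240, rating=4.9),
    Candidate("d2", True, "standard", eta_sec=180, rating=4.2),
    Candidate("d3", False, "standard", eta_sec=60, rating=5.0),   # already on a trip
    Candidate("d4", True, "xl", eta_sec=90, rating=4.8),          # wrong car type
    Candidate("d5", True, "standard", eta_sec=200, rating=4.8),
    Candidate("d6", True, "standard", eta_sec=400, rating=5.0),
]
offers = rank_candidates(candidates, car_type="standard")
print([c.driver_id for c in offers])  # ['d5', 'd2', 'd1']
```

In this example d2 has the shortest raw ETA but drops to second place because of its lower rating. How strongly rating should influence the order is a product decision.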
Surge Pricing (Supply/Demand Ratio)
Surge pricing is calculated per geohash cell every 30 seconds: ratio = demand_rate ÷ supply_rate, where demand_rate is the number of ride requests in the cell over the last 5 minutes and supply_rate is the number of available drivers currently in the cell. If the ratio exceeds a threshold, a surge multiplier is applied (1.2x, 1.5x, 2.0x, up to a configured maximum). The Surge Pricing Service computes this from aggregated rider-request metrics in Kafka and available-driver counts per cell from Redis GEO. The result is stored in Redis as surge:{cell_id} → multiplier with a 30-second TTL.
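A minimal sketch of the ratio-to-multiplier step follows. The source gives the multiplier values but not the ratio thresholds that trigger them, so the `tiers` below are illustrative assumptions, as is the handling of a cell with zero available drivers.

```python
def surge_multiplier(requests_last_5_min, available_drivers,
                     tiers=((1.0, 1.2), (1.5, 1.5), (2.0, 2.0)),
                     max_multiplier=2.0):
    """Map a demand/supply ratio to a multiplier using step tiers.

    tiers is a sequence of (ratio_threshold, multiplier) pairs in ascending
    order. The thresholds and the cap are assumed values for illustration.
    """
    if available_drivers == 0:
        # No supply at all: use the cap if anyone is requesting, else no surge.
        return max_multiplier if requests_last_5_min > 0 else 1.0
    ratio = requests_last_5_min / available_drivers
    multiplier = 1.0
    for threshold, tier_multiplier in tiers:
        if ratio > threshold:
            multiplier = tier_multiplier  # keep the highest tier exceeded
    return min(multiplier, max_multiplier)


# Per-cell results, keyed the way the text describes the Redis entries.
cells = {"9q8yy": (30, 20), "9q8yz": (10, 20), "9q8yv": (50, 20)}
surge = {f"surge:{cell}": surge_multiplier(d, s) for cell, (d, s) in cells.items()}
print(surge)
```

Step tiers keep prices stable between recalculations. A continuous function of the ratio would track demand more closely but would change the quoted fare on almost every 30-second cycle.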
Kafka + PostgreSQL for Trip Lifecycle
Every significant trip event (requested, driver_assigned, trip_started, trip_completed, payment_processed, rated) is published to Kafka. Downstream consumers: Analytics Service (build dashboards), Billing Service (finalize fare), Notification Service (send receipts), ML Training (improve ETA models). PostgreSQL stores the persistent trip record with all state transitions. This event-driven design decouples the core trip flow from all the downstream operations.
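One way to picture the trip record and its event stream together is a small state machine that publishes an event on every transition. This is a sketch under assumptions: the strict linear ordering of states, the `cancelled` state (inferred from the cancel requirement listed earlier), and the `Trip` class are not specified by the source, and a plain list stands in for a Kafka topic.

```python
# Allowed next states for each trip state (assumed ordering).
ALLOWED_TRANSITIONS = {
    "requested": {"driver_assigned", "cancelled"},
    "driver_assigned": {"trip_started", "cancelled"},
    "trip_started": {"trip_completed"},
    "trip_completed": {"payment_processed"},
    "payment_processed": {"rated"},
    "rated": set(),
    "cancelled": set(),
}


class Trip:
    def __init__(self, trip_id, publish):
        self.trip_id = trip_id
        self.state = "requested"
        self._publish = publish  # stand-in for a Kafka producer call
        self._publish({"trip_id": trip_id, "event": "requested"})

    def transition(self, new_state):
        """Move to new_state if allowed, then publish the matching event."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self._publish({"trip_id": self.trip_id, "event": new_state})


event_log = []  # in-memory list standing in for a Kafka topic
trip = Trip("t-1", event_log.append)
for state in ("driver_assigned", "trip_started", "trip_completed"):
    trip.transition(state)
print([e["event"] for e in event_log])
```

Rejecting illegal transitions at the source means downstream consumers such as billing can rely on the order of events they see for a given trip.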
