System Design: Uber
A deep dive into designing Uber's ride-hailing platform covering real-time matching, GPS tracking, surge pricing, and global scale. Essential reading for engineers preparing for system design interviews.
Requirements
Functional Requirements:
- Riders can request a ride by specifying pickup and dropoff locations
- System matches riders with nearby available drivers in real time
- Drivers and riders can track each other's location on a live map
- System calculates fare estimates before and final cost after the ride
- Surge pricing activates automatically during high demand periods
- Riders and drivers can rate each other after trip completion
Non-Functional Requirements:
- Match latency under 2 seconds for 99th percentile requests
- 99.99% availability — downtime means lost revenue and stranded riders
- Support 20 million concurrent users across 70+ countries
- Location updates processed at under 500ms end-to-end latency
- Strong consistency for payments; eventual consistency acceptable for location data
Scale Estimation
Uber processes roughly 25 million trips per day globally. With ~5 million active drivers, each sending GPS pings every 4 seconds, that's ~1.25 million location updates per second. Each update is ~100 bytes, yielding ~125 MB/s inbound location data. Trip records at ~1 KB each produce ~25 GB/day of trip data. A 5-year retention policy demands ~45 TB of trip storage, easily managed with columnar storage on S3 + Redshift for analytics.
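The arithmetic is worth sanity-checking in code. The snippet below only reproduces the estimates above; every constant is the rough figure quoted in the text, not a measured value.

```python
# Back-of-envelope check of the figures above.
TRIPS_PER_DAY = 25_000_000
ACTIVE_DRIVERS = 5_000_000
PING_INTERVAL_S = 4
BYTES_PER_PING = 100
BYTES_PER_TRIP = 1_000
RETENTION_YEARS = 5

pings_per_second = ACTIVE_DRIVERS / PING_INTERVAL_S               # ~1.25M updates/s
ingest_mb_per_s = pings_per_second * BYTES_PER_PING / 1e6         # ~125 MB/s
trip_gb_per_day = TRIPS_PER_DAY * BYTES_PER_TRIP / 1e9            # ~25 GB/day
trip_storage_tb = trip_gb_per_day * 365 * RETENTION_YEARS / 1e3   # ~45 TB

print(f"{pings_per_second:,.0f} location updates/s")
print(f"{ingest_mb_per_s:.0f} MB/s inbound location data")
print(f"{trip_gb_per_day:.0f} GB/day of trip records")
print(f"{trip_storage_tb:.1f} TB over {RETENTION_YEARS} years")
```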
High-Level Architecture
Uber's architecture is organized around three core flows: supply tracking (driver locations), demand intake (rider requests), and the matching engine that joins them. Drivers run a mobile SDK that emits location pings via a persistent WebSocket or gRPC connection to a fleet of Location Ingestion Services. These services write to a geospatial index (Redis keyed by geohash cell, sharded across nodes with Ringpop, Uber's open-source consistent-hashing and membership library) so that at any moment the system knows which drivers are within a given bounding box.
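To make the supply-tracking write path concrete, here is a minimal in-memory sketch. The dict stands in for Redis, and the rounding-based cell_id() is a crude stand-in for the geohash/H3 partitioning covered under Core Components.

```python
# Minimal sketch of the supply-tracking write path; not a production index.
import time
from dataclasses import dataclass

@dataclass
class LocationPing:
    driver_id: str
    lat: float
    lng: float
    ts: float

# cell_id -> {driver_id: latest ping}
supply_index: dict[str, dict[str, LocationPing]] = {}

def cell_id(lat: float, lng: float) -> str:
    """Stand-in cell function: ~0.01-degree (~1 km) grid buckets."""
    return f"{round(lat, 2)}:{round(lng, 2)}"

def ingest(ping: LocationPing) -> None:
    """Upsert the driver's latest position into its current cell.
    A real index would also evict the driver from its previous cell."""
    supply_index.setdefault(cell_id(ping.lat, ping.lng), {})[ping.driver_id] = ping

ingest(LocationPing("driver-42", 37.7749, -122.4194, time.time()))
```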
When a rider submits a request, the Dispatch Service queries the geospatial index for nearby available drivers, scores them by ETA (computed via a routing engine like OSRM or Google Maps Platform), and issues a match offer to the best candidate. The driver app receives the offer via a push notification or persistent connection and accepts or declines within a timeout window; if accepted, the trip FSM (finite state machine) advances through CREATED → MATCHING → ACCEPTED → ARRIVING → IN_PROGRESS → COMPLETED (detailed under Trip State Machine below).
A separate Pricing Service runs surge calculations every 60 seconds by comparing supply/demand ratios in each geofenced zone. Completed trips flow into a Payment Service that charges the rider's stored card through Stripe or Braintree, then queues a driver payout via ACH or Instant Pay.
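A back-of-the-envelope sketch of the per-zone surge computation is below. The linear demand/supply mapping and the 3.0x cap are illustrative assumptions, not Uber's actual pricing model.

```python
# Illustrative per-zone surge model: no surge until demand outstrips supply,
# then scale with the demand/supply ratio, capped at max_surge.
def surge_multiplier(open_requests: int, available_drivers: int,
                     max_surge: float = 3.0) -> float:
    if available_drivers == 0:
        return max_surge
    ratio = open_requests / available_drivers
    return round(max(1.0, min(max_surge, ratio)), 1)

# Example zone snapshots, recomputed every 60 seconds:
print(surge_multiplier(120, 40))  # 3.0 (demand 3x supply, capped)
print(surge_multiplier(55, 40))   # 1.4
print(surge_multiplier(30, 40))   # 1.0 (no surge)
```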
Core Components
Geospatial Index
The geospatial index is the beating heart of Uber's matching. Drivers are partitioned into geohash cells (roughly 1.2 km × 0.6 km at precision 6). Redis GEOADD/GEORADIUS commands provide O(N + log M) nearest-neighbor queries, where N is the number of candidates in the search radius and M is the index size. At Uber's scale, a dedicated H3 hexagonal grid (Uber's open-source library) shards the planet into roughly 700 million cells at resolution 8, each owning a small sorted set of driver IDs. Writes are extremely hot, so the index is sharded across multiple Redis clusters behind a consistent-hash ring, with read replicas absorbing query fan-out.
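To show what a geohash cell actually is, here is a self-contained encoder sketch; production uses the H3 library rather than hand-rolled geohashing.

```python
# Self-contained geohash encoder: interleave longitude/latitude bits and emit
# base32 characters. Only a sketch of the geohash variant mentioned above.
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat: float, lng: float, precision: int = 6) -> str:
    lat_rng, lng_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, even = [], 0, 0, True
    while len(chars) < precision:
        rng, val = (lng_rng, lng) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        bits <<= 1
        if val >= mid:
            bits |= 1
            rng[0] = mid
        else:
            rng[1] = mid
        even, bit_count = not even, bit_count + 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

# Precision 6 gives ~1.2 km x 0.6 km cells.
print(geohash(37.7749, -122.4194))  # '9q8yyk' (downtown San Francisco)
```

Drivers whose encoded strings share a prefix fall in the same or adjacent cells, which is what makes prefix-sharded lookups cheap.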
Dispatch & Matching Engine
The Dispatch Service runs as a stateless, horizontally scaled microservice. On each rider request, it fans out to the geospatial index for a candidate set (typically the 20 nearest drivers), computes an ETA for each via a pre-computed road graph, ranks by a blend of ETA, acceptance rate, and rating, and sends an offer to the top candidate. If the offer is declined or times out (8 seconds), it moves to the next candidate. The entire flow is orchestrated with a saga pattern to handle partial failures.
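A minimal sketch of the ranking plus offer cascade is below; the scoring weights, the send_offer callback, and the sample data are illustrative assumptions, not Uber's actual tuning.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    driver_id: str
    eta_seconds: float      # from the routing engine
    acceptance_rate: float  # 0.0 - 1.0
    rating: float           # 1.0 - 5.0

def score(c: Candidate) -> float:
    # Lower is better: ETA dominates; a strong history earns a small discount.
    return c.eta_seconds - 60 * c.acceptance_rate - 10 * (c.rating - 4.0)

def dispatch(candidates, send_offer, offer_timeout_s=8.0):
    """Offer the trip to drivers in score order until one accepts."""
    for c in sorted(candidates, key=score):
        if send_offer(c.driver_id, offer_timeout_s):  # blocks up to the timeout
            return c.driver_id
    return None  # surface "no drivers available" to the rider

# Usage with a fake offer channel in which only driver "d2" accepts.
cands = [Candidate("d1", 240, 0.70, 4.9), Candidate("d2", 300, 0.95, 4.8)]
print(dispatch(cands, send_offer=lambda driver_id, timeout: driver_id == "d2"))  # d2
```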
Trip State Machine
Each trip is a document in a distributed key-value store (Cassandra for scale) with a well-defined FSM: CREATED → MATCHING → ACCEPTED → ARRIVING → IN_PROGRESS → COMPLETED | CANCELLED. State transitions are idempotent and write-ahead-logged. Kafka topics carry transition events downstream to analytics, billing, and notification services.
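The FSM itself is small; the sketch below uses the state names above and shows why idempotent transitions make retried messages harmless. The persistence and event-publishing hooks are only indicated in comments.

```python
from enum import Enum

class TripState(Enum):
    CREATED = "CREATED"
    MATCHING = "MATCHING"
    ACCEPTED = "ACCEPTED"
    ARRIVING = "ARRIVING"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"

# Legal transitions; anything else is rejected.
TRANSITIONS = {
    TripState.CREATED:     {TripState.MATCHING, TripState.CANCELLED},
    TripState.MATCHING:    {TripState.ACCEPTED, TripState.CANCELLED},
    TripState.ACCEPTED:    {TripState.ARRIVING, TripState.CANCELLED},
    TripState.ARRIVING:    {TripState.IN_PROGRESS, TripState.CANCELLED},
    TripState.IN_PROGRESS: {TripState.COMPLETED},
}

def transition(trip: dict, target: TripState) -> dict:
    """Apply a transition idempotently: re-applying the same target is a no-op."""
    current = trip["state"]
    if target == current:
        return trip  # retried message, nothing to do
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    trip["state"] = target
    # In production this write is write-ahead-logged and a Kafka event is emitted here.
    return trip

trip = {"trip_id": "t-123", "state": TripState.CREATED}
transition(trip, TripState.MATCHING)
transition(trip, TripState.MATCHING)  # duplicate delivery: harmless
```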
Database Design
Trip data lives in Cassandra partitioned by trip_id (UUID), with denormalized query tables keyed by rider_id and driver_id for history lookups (native secondary indexes scale poorly at this volume). Location history uses a time-series approach: each driver gets a row in Cassandra keyed by (driver_id, date) with a map of timestamp → coordinates, capped at 24 hours before archival to S3 Parquet. User profiles and payment methods live in a sharded relational cluster (CockroachDB, which speaks the PostgreSQL wire protocol, or Vitess over MySQL) for ACID guarantees. Geospatial queries use the Redis geohash layer rather than PostGIS to avoid latency.
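As a rough illustration of the location-history keying scheme (one partition per driver per day, entries clustered by timestamp), here is a small sketch; the helper names are hypothetical and the real schema would be declared in CQL.

```python
# Hypothetical helpers showing how location-history writes are keyed.
from datetime import datetime, timezone

def location_partition_key(driver_id: str, ts: datetime) -> tuple[str, str]:
    """A day of pings for one driver lands in a single partition."""
    return (driver_id, ts.strftime("%Y-%m-%d"))

def location_entry(ts: datetime, lat: float, lng: float):
    """Clustering entry: epoch-millis timestamp mapped to coordinates."""
    return (int(ts.timestamp() * 1000), (lat, lng))

now = datetime.now(timezone.utc)
print(location_partition_key("driver-42", now))   # ('driver-42', '<utc date>')
print(location_entry(now, 37.7749, -122.4194))
```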
API Design
- POST /v1/rides — Rider submits pickup_lat, pickup_lng, dropoff_lat, dropoff_lng; returns ride_id, estimated fare, and driver ETA
- GET /v1/rides/{ride_id} — Polls trip state, driver location, and dynamic ETA; used by the rider app to update the live map
- PATCH /v1/drivers/location — Driver SDK sends current lat/lng + heading + speed every 4 seconds via an authenticated, batched update endpoint
- POST /v1/rides/{ride_id}/rating — Either party submits a 1–5 star rating with optional text after trip completion
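A hedged client-side sketch of the first endpoint is below; the field names follow the list above, while the host, bearer-token auth, and exact response shape are assumptions.

```python
# Client-side sketch of POST /v1/rides using only the standard library.
import json
import urllib.request

def request_ride(pickup, dropoff, token):
    body = json.dumps({
        "pickup_lat": pickup[0], "pickup_lng": pickup[1],
        "dropoff_lat": dropoff[0], "dropoff_lng": dropoff[1],
    }).encode()
    req = urllib.request.Request(
        "https://api.example.com/v1/rides",   # placeholder host
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)   # expected: ride_id, fare estimate, driver ETA

# ride = request_ride((37.7749, -122.4194), (37.6213, -122.3790), token="...")
```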
Scaling & Bottlenecks
The geospatial index is the primary write bottleneck at 1.25 million updates/second. Mitigation involves sharding Redis clusters by geographic region (Americas, EMEA, APAC) and within each region by city, so no single Redis instance handles more than ~50k updates/second. Read replicas serve ETA fan-out queries. Location updates are batched client-side every 4 seconds to reduce connection overhead.
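The consistent-hash ring behind these shards can be sketched in a few lines. Shard names, virtual-node counts, and the choice of MD5 are illustrative; the point is that a given cell's writes always land on the same shard, and adding a shard only remaps a small fraction of cells.

```python
# Minimal consistent-hash ring for spreading location writes across shards.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node); vnodes smooth the distribution
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the first virtual node."""
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

# One ring per region; keying by cell keeps a cell's writes on one shard.
ring = HashRing([f"redis-americas-{i}" for i in range(8)])
print(ring.node_for("9q8yyk"))  # cell from the geohash example above
```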
The matching engine scales horizontally since it is stateless — each Dispatch pod handles ~2,000 concurrent match attempts. The routing ETA service caches road-graph segments in memory (OSRM pre-computes contraction hierarchies) so ETA lookups are sub-millisecond. Kafka decouples the high-throughput event stream from downstream consumers like analytics and billing, so backpressure from slower services never bleeds into trip latency.
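A sketch of that Kafka decoupling is below, assuming the kafka-python client; the topic name, broker address, and delivery settings are placeholders.

```python
# Publish trip transition events so slow consumers (billing, analytics)
# never sit on the dispatch path.
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
    acks="all",  # billing events must not be lost
)

def publish_transition(trip_id: str, old: str, new: str) -> None:
    """Fire-and-forget from the dispatch path; consumers read at their own pace."""
    producer.send("trip-transitions", {"trip_id": trip_id, "from": old, "to": new})

publish_transition("t-123", "IN_PROGRESS", "COMPLETED")
producer.flush()
```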
Key Trade-offs
- Eventual consistency for location vs. strong consistency for payments — location data can lag 4 seconds acceptably, but double-charges are catastrophic, so payments use synchronous ACID transactions
- ETA accuracy vs. computation cost — pre-computed contraction hierarchies sacrifice real-time traffic precision for speed; a live traffic layer (Waze data feed) is blended in as a correction factor
- Push vs. poll for driver offers — persistent WebSocket connections reduce latency but require more server-side state; connection pools are managed by a dedicated gateway service
- Cell resolution vs. index size — finer H3 resolution improves match quality near cell boundaries but multiplies index storage; resolution 8 (~0.7 km² cells) is the production sweet spot (see the cell-count sketch after this list)
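To put numbers on that last trade-off: H3 has 122 base cells, and each finer resolution subdivides every cell into roughly 7 children, so cell count (and worst-case index size) grows about 7x per resolution step.

```python
# Rough H3 cell counts per resolution (aperture-7 subdivision of 122 base cells).
BASE_CELLS = 122
for res in range(5, 10):
    print(f"resolution {res}: ~{BASE_CELLS * 7 ** res / 1e6:,.0f}M cells")
# Resolution 8 works out to roughly 700M cells of ~0.74 km^2 each.
```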