System Design: Order Routing & Assignment

Requirements

Functional Requirements:

Match incoming delivery orders to the optimal available driver in real-time
Support order batching — grouping multiple orders for a single driver when pickups or drop-offs are nearby
Handle driver offer acceptance/rejection with automatic reassignment on timeout
Priority routing for scheduled orders, premium customers, and time-sensitive items
Dynamic adjustment of search radius based on driver supply/demand in an area
Support multi-stop routes for batched deliveries with optimized sequencing

Non-Functional Requirements:

Process 5,000 order assignments per second at peak
Driver assignment within 30 seconds of order readiness for 95th percentile
Assignment decision latency under 200ms per batch computation cycle
99.99% availability — failed assignment means lost revenue and poor customer experience
Fair distribution of orders across drivers to maintain driver satisfaction and retention

Scale Estimation

At peak, 5,000 orders/sec need assignment against a pool of 500K available drivers across all markets. Each market (city) operates independently, with the largest markets (NYC, LA, Chicago) having 20K-50K active drivers and 200-500 orders/sec. The matching algorithm runs every 2 seconds, processing all unassigned orders in a batch. Each cycle evaluates ~1,000 unassigned orders against ~30K candidate drivers per market, producing a matching matrix of 30M candidate pairs that must be scored and optimized. Driver location data updates at 4-second intervals, requiring the spatial index to handle 125K updates/sec (500K drivers / 4 seconds).

High-Level Architecture

The order routing system is structured as a three-stage pipeline: Candidate Generation, Scoring, and Assignment. When an order enters the assignment pool (either immediately upon placement or when the restaurant prep timer indicates food will be ready soon), the Candidate Generation stage queries a geospatial index to find all drivers within a dynamic radius (typically 3-8 km, expanding if no candidates are found). The Scoring stage evaluates each candidate driver using a multi-factor scoring model. The Assignment stage solves a global optimization problem to find the best overall matching.

The system operates on a batch cycle architecture rather than processing orders individually. Every 2 seconds, a Batch Coordinator collects all unassigned orders and available drivers in a market, runs the full matching pipeline, and emits assignment decisions. This batched approach produces globally better assignments than greedy first-come-first-served matching — an order that arrived 1 second ago might be better served by a driver that just became available, which greedy matching would have already assigned to an older order.

A separate Offer Management Service handles the driver-facing flow: sending assignment offers to driver apps, managing the 30-second acceptance timer, processing accept/decline responses, and recycling declined orders back into the assignment pool. The offer is sent via push notification and in-app alert simultaneously for reliability.

Core Components

Geospatial Candidate Generator

The Candidate Generator maintains a real-time spatial index of all available drivers using S2 Geometry cells at level 14 (approximately 300m × 300m cells). The index is stored in Redis using sorted sets keyed by S2 cell ID, with driver_id as the member and last_update_timestamp as the score. To find candidate drivers for an order, the system computes the S2 cell covering for a circle centered on the restaurant at the configured radius, then queries all covered cells via Redis ZRANGEBYSCORE (filtering out stale entries older than 15 seconds). The dynamic radius starts at 3 km and expands by 1 km every 10 seconds if insufficient candidates are found, up to a maximum of 12 km. This spatial index handles 125K updates/sec with p99 latency under 5ms.

Multi-Factor Scoring Engine

Each candidate driver-order pair receives a score computed as a weighted sum of factors: (1) estimated pickup time — driving time from driver's current location to restaurant, weighted 0.35; (2) estimated delivery time — driving time from restaurant to customer, weighted 0.25; (3) driver idle time — how long the driver has been waiting for an order, weighted 0.15 (promotes fairness); (4) batch potential — score bonus if the driver is already carrying an order and the new order's pickup or drop-off is within 500m of an existing stop, weighted 0.15; (5) driver acceptance rate — historical probability this driver accepts orders from this restaurant/area, weighted 0.10. Driving time estimates come from a pre-computed road network travel time matrix indexed by H3 hexagons, refreshed every 15 minutes with real-time traffic data.

Global Assignment Optimizer

The optimizer takes the scored candidate matrix and produces the globally optimal assignment using a modified Hungarian algorithm. The standard Hungarian algorithm has O(n³) complexity, which is prohibitive for 1,000 orders × 30K drivers. The system uses a two-phase approach: Phase 1 prunes the candidate matrix by keeping only the top 20 candidates per order (reducing the matrix to 1,000 × 20), then Phase 2 runs a modified auction algorithm (Bertsekas' auction algorithm) that converges in O(n·log(n)) time for sparse matrices. The optimizer also enforces constraints: maximum 2 active orders per driver, no assignment to drivers in a different zone than the restaurant, and priority reservations for scheduled orders.

Database Design

The assignment system is primarily in-memory for performance. Active driver state (location, availability, current assignments) is stored in Redis with the following structures: a geospatial sorted set for spatial queries, a hash per driver (driver:{id} with fields: lat, lng, status, current_orders, idle_since, acceptance_rate), and a list per driver for their current route plan. Assignment decisions are persisted to PostgreSQL for audit: an assignments table with assignment_id, order_id, driver_id, score, factors_json, batch_cycle_id, created_at, accepted_at, and outcome (accepted/declined/timed_out).

The travel time matrix is stored in Redis as a hash: key is travel:{origin_h3}:{destination_h3}, value is estimated seconds. This matrix covers the top 10,000 H3 hexagons per market (~20K entries per market = ~200M entries globally). It is pre-computed by a background job that queries the OSRM routing engine and updated every 15 minutes. Entries not in the cache fall back to a Haversine distance estimate with a speed factor based on the area's average traffic speed.

API Design

POST /api/v1/assignments/enqueue — Add an order to the assignment pool; body contains order_id, restaurant_location, delivery_location, priority, ready_at_estimate
POST /api/v1/assignments/drivers/{driver_id}/available — Mark driver as available for assignments; body contains location, vehicle_type, max_distance_preference
POST /api/v1/assignments/offers/{offer_id}/respond — Driver accepts or declines an assignment offer; body contains decision and optional decline_reason
GET /api/v1/assignments/metrics?market={city}&window=5m — Real-time assignment metrics: avg assignment time, match rate, batch utilization

Scaling & Bottlenecks

The batch optimization cycle is the critical bottleneck. Each 2-second cycle must complete candidate generation, scoring, and optimization within ~200ms to leave headroom for offer delivery and processing. The system achieves this by partitioning each market into zones (using H3 hexagons at resolution 4, ~20 km diameter) and running the optimizer independently per zone in parallel on separate threads. Cross-zone orders (where restaurant and delivery address are in different zones) are handled by a second pass that runs on boundary orders after the zone-local passes complete.

Redis is a single point of failure for the spatial index. This is mitigated with Redis Cluster (6 nodes per market, 3 masters + 3 replicas) and a circuit breaker that falls back to a degraded mode: if Redis is unavailable, the system uses a local in-memory spatial index rebuilt from driver location events on Kafka (stale by up to 10 seconds but functional). The Kafka consumer group for driver locations uses exactly-once semantics to prevent duplicate driver entries in the index.

Key Trade-offs

Batch matching over greedy first-come-first-served: Batching every 2 seconds produces globally optimal assignments (15% better delivery times in simulation) but adds 0-2 seconds of latency before a driver is assigned — acceptable given food prep time typically exceeds 2 seconds
S2 cells over PostGIS for spatial indexing: S2 cells in Redis provide sub-5ms spatial queries at 125K updates/sec, which PostGIS cannot match — the trade-off is less precise distance calculations (cell-level granularity) corrected by exact distance computation in the scoring phase
Pre-computed travel time matrix over live routing API: Caching travel times in Redis avoids 30M routing API calls per batch cycle, but introduces staleness — the 15-minute refresh interval balances accuracy against computation cost
Auction algorithm over Hungarian algorithm: The auction algorithm's O(n·log(n)) sparse complexity enables real-time optimization at scale, but it finds an approximately optimal solution rather than the exact optimum — the approximation gap is under 3% in practice

System Design: Order Routing & Assignment

Requirements

Scale Estimation

High-Level Architecture

Core Components

Geospatial Candidate Generator

Multi-Factor Scoring Engine

Global Assignment Optimizer

Database Design

API Design

Scaling & Bottlenecks

Key Trade-offs

Master this topic in our 12-week cohort

System Design: Ride Matching Algorithm

How to Design a URL Shortener (TinyURL)

System Design: Instagram

System Design: Twitter/X

System Design: Facebook

System Design: TikTok