
System Design: Flight Search System

Design a flight search system like Google Flights or Kayak covering fare data aggregation, complex itinerary search, price caching, and calendar-view cheapest-day computation.

17 min read · Updated Jan 15, 2025

Tags: system-design, flight-search, gds, fare-aggregation, search, travel

Requirements

Functional Requirements:

  • Search direct and connecting flights between any two airports for specific dates
  • Display fares with filtering by airline, stops, departure time, and cabin class
  • Calendar view showing cheapest fare for each day across a month
  • Price alerts: notify users when fare drops below a target price
  • Multi-city (open-jaw) itinerary support
  • Deep link to airline or OTA for final booking

Non-Functional Requirements:

  • Flight search results returned in under 3 seconds at the 95th percentile
  • Fare data freshness: airline fares change frequently; cache TTL of 15 minutes
  • Support 500 million searches/year; 20,000 searches/second at peak
  • Calendar cheapest-day queries must be pre-computed (not real-time)
  • 99.9% availability for the search path

Scale Estimation

500 million searches/year ≈ 16/second average; peak (Monday morning corporate travel, Sunday leisure planning) ~20,000/second. Each search touches ~100 airlines × multiple fare combinations = potentially thousands of fare lookups. Pre-computed fare cache size: 50,000 top routes × 90-day booking window × 5 cabin classes × avg 3 airlines/route = 67.5 million cache entries × ~200 bytes = ~13.5 GB — fits in a single Redis cluster. GDS (Global Distribution System: Amadeus, Sabre) API calls are expensive and rate-limited, so they are minimized via aggressive caching.
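The back-of-envelope numbers above can be checked with a few lines of arithmetic (the constants mirror the estimates in this section):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

searches_per_year = 500_000_000
avg_qps = searches_per_year / SECONDS_PER_YEAR  # ~15.9/second

# Fare cache sizing: routes x booking window x cabins x airlines per route
routes, window_days, cabins, airlines = 50_000, 90, 5, 3
entries = routes * window_days * cabins * airlines  # 67,500,000 entries
cache_gb = entries * 200 / 1e9                      # ~13.5 GB at ~200 B/entry
```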

High-Level Architecture

Flight search aggregates inventory from airline APIs, GDSes (Amadeus, Sabre, Travelport), and LCC (low-cost carrier) direct APIs. The core architecture is a three-tier pipeline: Data Acquisition (crawling fares from sources), Fare Cache (serving most searches without GDS calls), and Live Search (calling GDS for non-cached queries).

The Fare Acquisition Service runs continuous background crawls: it submits dummy searches to airline and GDS APIs for all major route-date combinations, stores results in the Fare Cache, and schedules re-crawls every 15 minutes. ~80% of user searches are served from cache; the remaining 20% (rare routes, specific dates) trigger live GDS API calls. A Search Orchestrator manages both paths: check cache first, fall back to live search, merge and deduplicate results.
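The orchestrator's cache-first flow can be sketched as follows. This is illustrative only: the in-memory dict stands in for Redis, `live_search` stands in for the GDS client, and the fare-dict fields are assumed, not taken from a real schema.

```python
def search(query, cache, live_search):
    """Cache-first search: hit the fare cache, fall back to live GDS,
    then merge and deduplicate before returning."""
    key = (query["origin"], query["destination"], query["departure"])
    fares = cache.get(key)
    if fares is None:              # cache miss -> live GDS path
        fares = live_search(query)
        cache[key] = fares         # write back for future requests
    # Deduplicate: keep the cheapest fare per unique itinerary
    best = {}
    for f in fares:
        k = (f["airline"], tuple(f["flights"]))
        if k not in best or f["price"] < best[k]["price"]:
            best[k] = f
    return sorted(best.values(), key=lambda f: f["price"])
```

In production the dedupe key would also include fare-class and ticketing source, since the same physical flights can carry distinct fares.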

For the calendar cheapest-day view, a nightly batch job computes the minimum fare per day for every popular origin-destination pair and stores results in a dedicated Calendar Fare Store (Redis sorted set per route). Calendar queries are served entirely from this pre-computed store.
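The nightly min-fare computation reduces to a group-by-minimum. A minimal sketch, using a plain dict per route where production would use a Redis sorted set (member = date, score = min price):

```python
from collections import defaultdict

def build_calendar(fares):
    """Nightly job: minimum fare per (route, departure_date)."""
    store = defaultdict(dict)
    for f in fares:
        route = (f["origin"], f["destination"])
        day = f["departure_date"]
        current = store[route].get(day)
        if current is None or f["price"] < current:
            store[route][day] = f["price"]
    return store

def cheapest_days(store, origin, dest, n=3):
    """Serve a calendar query entirely from the pre-computed store."""
    days = store[(origin, dest)]
    return sorted(days.items(), key=lambda kv: kv[1])[:n]
```

With a sorted set, "cheapest n days" becomes a single `ZRANGE ... WITHSCORES`, which is why that structure fits the access pattern.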

Core Components

Fare Acquisition & Crawling Service

The acquisition service models routes by a priority tier: Tier 1 (top 1,000 routes by search volume, crawled every 15 minutes), Tier 2 (next 10,000 routes, crawled every 2 hours), Tier 3 (long-tail, crawled once per day or on-demand). Crawl jobs run on a distributed task queue (Celery + Redis). GDS API calls are proxied through a rate-limiter that respects per-carrier API quotas. Results are parsed from GDS response formats (EDIFACT or JSON depending on GDS), normalized into a canonical Fare object, and written to the Fare Cache (Redis).
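The tier-to-interval mapping above translates directly into a scheduling rule. A sketch, assuming routes are ranked by search volume (the rank thresholds follow the tiers described in this section):

```python
from datetime import datetime, timedelta

# (max search-volume rank, re-crawl interval) per tier
TIERS = [
    (1_000, timedelta(minutes=15)),      # Tier 1
    (11_000, timedelta(hours=2)),        # Tier 2 (next 10,000 routes)
    (float("inf"), timedelta(days=1)),   # Tier 3 long-tail
]

def next_crawl(route_rank, last_crawled):
    """Return when this route should next be crawled."""
    for max_rank, interval in TIERS:
        if route_rank <= max_rank:
            return last_crawled + interval
```

Each crawl job would then enqueue itself (e.g. via Celery's `apply_async(eta=...)`) for its computed next-crawl time.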

Fare Cache Service

Fares are stored in Redis, keyed by the composite (origin, destination, departure_date, return_date, cabin_class, adults), each entry holding the list of Fare objects sorted by price. Entries carry a TTL of 15 minutes for Tier 1 routes and 2 hours for Tier 2. A cache hit is a single Redis read with sub-millisecond latency. On a miss, the query is forwarded to the Live Search Service and the result is written back to the cache for future requests. A cache-warming job pre-populates high-traffic routes at system startup and after scheduled purges.
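The key layout and tiered TTLs might look like the following sketch. The key naming is illustrative, and fares are stored as a JSON string via a Redis-like `set(key, value, ex=ttl)` for brevity; the TTL logic is the same regardless of the value encoding.

```python
import json

TTL_SECONDS = {1: 15 * 60, 2: 2 * 3600}  # Tier 1: 15 min, Tier 2: 2 h

def fare_cache_key(origin, dest, dep, ret, cabin, adults):
    # One cache entry per full query tuple; '-' marks one-way searches.
    return f"fares:{origin}:{dest}:{dep}:{ret or '-'}:{cabin}:{adults}"

def cache_fares(client, key, fares, tier):
    # client: any object exposing a Redis-style set(key, value, ex=ttl).
    # Unknown tiers (long-tail routes) default to a 24 h TTL here.
    client.set(key, json.dumps(fares), ex=TTL_SECONDS.get(tier, 24 * 3600))
```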

Itinerary Builder

For connecting flights, fares from individual flight segments (leg 1, leg 2) must be combined into valid itineraries. The Itinerary Builder receives all fare responses for an origin-destination pair, breaks multi-leg journeys into segments, and assembles valid combinations, enforcing minimum connection time (30–90 minutes depending on the airport), same-day travel, and logical airport sequencing. It applies proration rules (how the total fare is split between legs for display). Combinatorial explosion for multi-city searches is controlled by pruning: only the top 5 cheapest options per connection point are retained.

Database Design

Fare cache entirely in Redis Cluster (8 shards for 13.5 GB fare data + 50% headroom). Calendar fare store: Redis sorted set per route keyed by departure_date with score = min_price. Price alert subscriptions in PostgreSQL: (alert_id, user_id, origin, destination, target_price, cabin_class, created_at, last_triggered_at). Historical fare data for price prediction models in Redshift (partitioned by route + month). Airline master data (codes, routes, baggage policies, change fees) in PostgreSQL with a read replica, refreshed daily from OAG or Cirium data feeds.
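The price-alert table above implies a simple background monitor: compare each subscription's target against the current cheapest fare on its route. A sketch, where the alert dicts mirror the PostgreSQL columns and `current_min_fare` is an assumed lookup from the fare cache:

```python
def due_alerts(alerts, current_min_fare):
    """Return alert_ids whose target price has been met or beaten.
    current_min_fare maps (origin, destination, cabin_class) -> price."""
    fired = []
    for a in alerts:
        fare = current_min_fare.get(
            (a["origin"], a["destination"], a["cabin_class"]))
        if fare is not None and fare <= a["target_price"]:
            fired.append(a["alert_id"])
    return fired
```

A real job would also consult `last_triggered_at` to debounce repeat notifications.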

API Design

  • GET /v1/flights/search?origin={}&destination={}&departure={}&return={}&adults={}&cabin={} — Returns sorted list of itineraries with price, stops, airline, and duration
  • GET /v1/flights/calendar?origin={}&destination={}&month={} — Returns cheapest fare for each day in the month from pre-computed calendar store
  • POST /v1/alerts — User sets price alert: origin, destination, target_price, cabin_class; stored in PostgreSQL for background monitoring
  • GET /v1/flights/{itinerary_id}/deeplink — Returns redirect URL to airline or OTA booking page for the selected itinerary

Scaling & Bottlenecks

GDS API rate limits are the fundamental bottleneck for live search. Amadeus, Sabre, and Travelport each have strict QPS limits and per-call costs ($0.01–$0.10/query). The cache strategy (80% cache hit target) is critical to economics. During peak (20,000 searches/second), even 20% live-search rate = 4,000 GDS calls/second — far exceeding typical enterprise GDS quotas. This forces either aggressive pre-crawling (Google Flights uses its own crawled inventory) or request throttling/queuing for non-cached queries.
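The per-carrier rate limiter mentioned earlier is commonly built as a token bucket; excess requests are rejected (or queued) instead of blowing through a GDS quota. A minimal sketch, with the rate/capacity numbers left as parameters since actual quotas are contract-specific:

```python
import time

class TokenBucket:
    """Per-carrier GDS call limiter: allow at most `rate` calls/second
    with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should queue or fall back to cache-only
```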

Calendar view pre-computation (min fare per day) runs as a nightly Spark job over historical fare data in Redshift, computing cheapest fares for the top 50,000 routes × 90-day window. Job runtime: ~45 minutes with 50 Spark executors. Results are written to Redis; calendar queries never hit the live search path.

Key Trade-offs

  • Cached fares vs. real-time accuracy — a 15-minute stale fare shown as available may have been sold out or price-jumped; the final booking step at the airline always shows current price; this gap causes "bait and switch" perception that damages trust
  • Comprehensive coverage vs. API cost — querying all 750 global airlines for every search is cost-prohibitive; smart prioritization (show airlines with >5% market share on the route) covers >95% of bookings
  • Metasearch vs. OTA model — metasearch (deep-link to airlines) earns CPC revenue and avoids ticketing liability; OTA model (book directly) earns higher margin but requires airline ticketing contracts and customer service infrastructure
  • Personalization vs. price discrimination — showing different prices to different users based on browsing history improves conversion but violates consumer trust; most metasearch sites avoid personalized pricing
