System Design: Twitter/X

Comprehensive system design of Twitter/X covering tweet ingestion, real-time timeline generation, trending topics, and the unique challenge of celebrity fan-out at 500M-user scale.

Requirements

Functional Requirements:

  • Users can post tweets (280 chars), retweet, and reply
  • Users follow other users and see a home timeline
  • Tweets support media attachments, polls, and threads
  • Trending topics surface globally and locally
  • Full-text search over tweets in near real-time
  • Push notifications for mentions, likes, and follows

Non-Functional Requirements:

  • 500M DAU; ~6,000 tweets/sec average, 150K tweets/sec at peak
  • Home timeline load under 100ms (p99)
  • 99.99% uptime; tweets must be durable once acknowledged
  • Search index latency: new tweets indexed within 15 seconds

Scale Estimation

500M DAU each reading ~50 timeline items daily = 25B timeline reads/day = 290K reads/sec. At 6K tweets/sec with an average of 200 followers, naïve fan-out requires 1.2M DB writes/sec. Elon Musk tweeting to 130M followers would require 130M writes per tweet — this is why celebrity accounts require a special path. Storage: 6K tweets/sec × 280 bytes × 86400 = ~145GB text/day; media adds another several TB/day.
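
These figures are easy to re-derive or adjust with a quick back-of-envelope script; the inputs below mirror the assumptions above rather than measured values.

  # Back-of-envelope capacity estimation for the figures quoted above.
  # All inputs are assumptions taken from the requirements, not measurements.

  DAU = 500_000_000            # daily active users
  READS_PER_USER_PER_DAY = 50  # timeline items read per user per day
  TWEETS_PER_SEC = 6_000       # average tweet ingest rate
  AVG_FOLLOWERS = 200          # average followers per account
  TWEET_BYTES = 280            # text payload per tweet (upper bound)
  SECONDS_PER_DAY = 86_400

  timeline_reads_per_day = DAU * READS_PER_USER_PER_DAY
  timeline_reads_per_sec = timeline_reads_per_day / SECONDS_PER_DAY
  naive_fanout_writes_per_sec = TWEETS_PER_SEC * AVG_FOLLOWERS
  text_bytes_per_day = TWEETS_PER_SEC * TWEET_BYTES * SECONDS_PER_DAY

  print(f"timeline reads/day:      {timeline_reads_per_day:,}")          # 25,000,000,000
  print(f"timeline reads/sec:      {timeline_reads_per_sec:,.0f}")       # ~289,352
  print(f"naive fanout writes/sec: {naive_fanout_writes_per_sec:,}")     # 1,200,000
  print(f"text storage/day:        {text_bytes_per_day / 1e9:.0f} GB")   # ~145 GB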

High-Level Architecture

Twitter's architecture centers on the timeline service. The write path flows: Tweet API → Kafka topic tweets-raw → Fanout Service → Redis timeline caches. The Fanout Service reads the tweeter's follower list from a Social Graph Service (backed by a FlockDB-style adjacency list store) and writes the tweet ID into each follower's Redis timeline list. For accounts under ~100K followers this fan-out completes synchronously within a bounded latency; above that threshold it runs asynchronously or is skipped entirely in favor of read-time merging (see Scaling & Bottlenecks).

The read path hits the Timeline Service, which reads tweet IDs from Redis, then hydrates tweet content from a Tweet Store (in production this is Manhattan, Twitter's distributed key-value store). The hydration step fetches user metadata from a User Service cache and assembles the full timeline response. A separate Earlybird search cluster (Lucene-based) maintains an in-memory inverted index of recent tweets, updated via Kafka consumers, enabling sub-second search.

Core Components

Fanout Service

The Fanout Service is the heart of timeline generation. It consumes from the tweets Kafka topic, looks up the poster's follower list, and writes the tweet ID (not the full tweet content) to each follower's Redis timeline. This indirection (storing IDs, not content) means a retweet or edit only requires updating one record. The service uses a worker pool sized to handle the fan-out load; celebrity tweets are routed to a dedicated high-throughput lane.
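
A minimal sketch of the fan-out loop, assuming a kafka-python consumer and a redis-py client; get_follower_ids, enqueue_celebrity_fanout, the host names, and the 100K threshold are illustrative placeholders, not Twitter's actual code.

  import json
  import redis
  from kafka import KafkaConsumer  # kafka-python package

  CELEBRITY_THRESHOLD = 100_000  # assumed cutoff for the dedicated celebrity lane
  TIMELINE_CAP = 800             # max items kept per home timeline

  r = redis.Redis(host="timeline-cache", port=6379)
  consumer = KafkaConsumer("tweets-raw", bootstrap_servers="kafka:9092",
                           value_deserializer=lambda b: json.loads(b))

  def get_follower_ids(user_id):
      """Hypothetical call to the Social Graph Service; returns follower IDs."""
      raise NotImplementedError

  def enqueue_celebrity_fanout(tweet):
      """Hypothetical hand-off to the high-throughput celebrity lane."""
      raise NotImplementedError

  for msg in consumer:
      tweet = msg.value                      # {"tweet_id": ..., "user_id": ..., "ts": ...}
      followers = get_follower_ids(tweet["user_id"])

      if len(followers) > CELEBRITY_THRESHOLD:
          enqueue_celebrity_fanout(tweet)    # skip per-follower writes on this path
          continue

      pipe = r.pipeline(transaction=False)   # batch writes into one round trip
      for follower_id in followers:
          key = f"timeline:{follower_id}"
          pipe.zadd(key, {tweet["tweet_id"]: tweet["ts"]})   # ID only, not content
          pipe.zremrangebyrank(key, 0, -(TIMELINE_CAP + 1))  # evict oldest beyond cap
      pipe.execute()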

Timeline Service

Timelines are stored as Redis sorted sets with tweet timestamp as the score. The Timeline Service reads the top N tweet IDs from the sorted set, then batch-fetches tweet content from the Tweet Cache (Redis) and falls back to Manhattan KV store on misses. The final response merges home timeline tweets with injected ads and algorithmic recommendations from a separate Recommendation Service.
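
A hedged sketch of this read path, again using redis-py; fetch_from_manhattan stands in for the real Manhattan client, and the key naming is assumed.

  import json
  import redis

  r = redis.Redis(host="timeline-cache", port=6379)
  tweet_cache = redis.Redis(host="tweet-cache", port=6379)

  def fetch_from_manhattan(tweet_ids):
      """Hypothetical batch read from the Manhattan KV store on cache misses."""
      raise NotImplementedError

  def load_home_timeline(user_id, count=20):
      # Newest-first tweet IDs from the precomputed sorted set (score = timestamp).
      ids = r.zrevrange(f"timeline:{user_id}", 0, count - 1)

      # Batch-hydrate tweet content from the tweet cache in a single MGET.
      cached = tweet_cache.mget([f"tweet:{tid.decode()}" for tid in ids])

      tweets, misses = [], []
      for tid, blob in zip(ids, cached):
          if blob is None:
              misses.append(tid.decode())
          else:
              tweets.append(json.loads(blob))

      # Fall back to the durable store for anything the cache did not have.
      if misses:
          tweets.extend(fetch_from_manhattan(misses))

      # Return the merged results newest-first.
      return sorted(tweets, key=lambda t: t["ts"], reverse=True)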

Trending Service

Trending topics are computed using a sliding window count of hashtag occurrences in a Storm/Heron topology. Each bolt maintains a count-min sketch of hashtag frequencies over 5-minute windows. A top-K heap extracts trending candidates. Geo-filtering applies a separate locality layer to surface regional trends. Results are materialized into a Redis sorted set refreshed every 30 seconds.
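
A toy version of the counting primitive shows the idea; a production Storm/Heron bolt would shard this across tasks and rotate windows, but the count-min update and top-K extraction look roughly like this.

  import heapq
  import hashlib

  class CountMinSketch:
      """Approximate hashtag counter: d hash rows, w counters per row."""
      def __init__(self, d=4, w=2048):
          self.d, self.w = d, w
          self.table = [[0] * w for _ in range(d)]

      def _index(self, item, row):
          h = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
          return int(h, 16) % self.w

      def add(self, item, count=1):
          for row in range(self.d):
              self.table[row][self._index(item, row)] += count

      def estimate(self, item):
          # Taking the minimum across rows bounds the overcount from collisions.
          return min(self.table[row][self._index(item, row)] for row in range(self.d))

  def top_k(sketch, candidates, k=10):
      # Candidates are the distinct hashtags seen in the current 5-minute window.
      return heapq.nlargest(k, candidates, key=sketch.estimate)

  # Example: count hashtags from one window of tweets, then extract trends.
  window_hashtags = ["#ai", "#ai", "#worldcup", "#ai", "#worldcup", "#python"]
  cms = CountMinSketch()
  for tag in window_hashtags:
      cms.add(tag)
  print(top_k(cms, set(window_hashtags), k=2))   # ['#ai', '#worldcup']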

Database Design

Tweet content is stored in Manhattan, Twitter's distributed NoSQL store (similar to Dynamo), keyed by tweet_id. The tweet_id is a 64-bit Snowflake ID encoding timestamp + datacenter ID + sequence number. This allows time-ordered scans without sorting. The social graph (follows) lives in a separate service backed by a custom graph store with adjacency lists sharded by user_id. User profiles use MySQL with read replicas. A separate media blob store (backed by HDFS + S3) stores image/video assets.
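
The Snowflake layout can be made concrete; the split below (41-bit millisecond timestamp, 5-bit datacenter ID, 5-bit worker ID, 12-bit sequence) and the custom epoch follow the widely documented scheme.

  TWITTER_EPOCH_MS = 1288834974657  # Twitter's published custom epoch (Nov 4, 2010 UTC)

  def make_snowflake(timestamp_ms, datacenter_id, worker_id, sequence):
      """Pack a 64-bit ID: 41-bit timestamp | 5-bit DC | 5-bit worker | 12-bit sequence."""
      return (
          ((timestamp_ms - TWITTER_EPOCH_MS) & ((1 << 41) - 1)) << 22
          | (datacenter_id & 0x1F) << 17
          | (worker_id & 0x1F) << 12
          | (sequence & 0xFFF)
      )

  def decode_snowflake(snowflake_id):
      return {
          "timestamp_ms": (snowflake_id >> 22) + TWITTER_EPOCH_MS,
          "datacenter_id": (snowflake_id >> 17) & 0x1F,
          "worker_id": (snowflake_id >> 12) & 0x1F,
          "sequence": snowflake_id & 0xFFF,
      }

  # Because the timestamp occupies the high bits, numerically sorting IDs also
  # sorts tweets by creation time, which is the "time-ordered scans" property.
  tid = make_snowflake(1736899200000, datacenter_id=3, worker_id=7, sequence=42)
  print(tid, decode_snowflake(tid))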

Timeline data in Redis uses ZADD timeline:{user_id} {timestamp} {tweet_id} with a max capacity of 800 items (older items are evicted). For users who haven't logged in recently (cold users), the timeline is rebuilt on first access by querying the followed accounts' recent tweets — a process called timeline reconstruction.
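
A rough sketch of timeline reconstruction for a cold user, assuming hypothetical helpers get_followee_ids and get_recent_tweet_refs for the social graph and tweet store; the merge keeps the newest 800 references and writes them back to Redis.

  import heapq
  import redis

  r = redis.Redis(host="timeline-cache", port=6379)
  TIMELINE_CAP = 800

  def get_followee_ids(user_id):
      """Hypothetical Social Graph Service call: accounts this user follows."""
      raise NotImplementedError

  def get_recent_tweet_refs(author_id, limit=100):
      """Hypothetical Tweet Store scan: newest (tweet_id, timestamp) pairs per author."""
      raise NotImplementedError

  def reconstruct_timeline(user_id):
      # Gather recent tweet references from every followed account.
      candidates = []
      for followee in get_followee_ids(user_id):
          candidates.extend(get_recent_tweet_refs(followee))

      # Keep only the newest TIMELINE_CAP items across all authors.
      newest = heapq.nlargest(TIMELINE_CAP, candidates, key=lambda ref: ref[1])

      # Materialize the rebuilt timeline back into the Redis sorted set.
      key = f"timeline:{user_id}"
      pipe = r.pipeline(transaction=False)
      pipe.delete(key)
      pipe.zadd(key, {tweet_id: ts for tweet_id, ts in newest})
      pipe.execute()
      return newest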

API Design

  • POST /api/v2/tweets — Create a tweet; body contains text, media_ids, reply_to_id; returns tweet object with Snowflake ID (example call after this list)
  • GET /api/v2/timelines/home?max_results=20&pagination_token={token} — Fetch home timeline with cursor pagination
  • GET /api/v2/tweets/search/recent?query={q}&max_results=100 — Search recent tweets via Earlybird index
  • GET /api/v2/trends/place?woeid={id} — Fetch trending topics for a location
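
To illustrate the first endpoint, a tweet-creation call might look like the following; the host and exact field names are assumptions modeled on the description above, not the official X API schema.

  import requests  # third-party HTTP client

  # Hypothetical tweet-creation request; placeholder host and bearer token.
  resp = requests.post(
      "https://api.example.com/api/v2/tweets",
      headers={"Authorization": "Bearer <token>"},
      json={
          "text": "Designing the timeline fan-out path",
          "media_ids": [],
          "reply_to_id": None,
      },
  )
  tweet = resp.json()
  print(tweet["id"])   # 64-bit Snowflake ID assigned by the tweet service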

Scaling & Bottlenecks

The celebrity fan-out problem is Twitter's most famous scaling challenge. The solution is a hybrid model: for users with >~100K followers, fan-out is done lazily at read time. When a user loads their timeline, the Timeline Service detects which followed accounts are celebrities and fetches their recent tweets directly from the Tweet Store, merging them with the precomputed timeline from Redis. This eliminates the write amplification for celebrity tweets.
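
A sketch of that read-time merge, assuming the same Redis timeline plus hypothetical helpers for detecting celebrity followees and pulling their recent tweets straight from the Tweet Store.

  import redis

  r = redis.Redis(host="timeline-cache", port=6379)

  def get_celebrity_followees(user_id):
      """Hypothetical: followed accounts above the ~100K-follower threshold."""
      raise NotImplementedError

  def get_recent_tweet_refs(author_id, limit=20):
      """Hypothetical Tweet Store read: newest (tweet_id, timestamp) pairs."""
      raise NotImplementedError

  def load_merged_timeline(user_id, count=20):
      # Precomputed portion: tweets already fanned out to this user's Redis timeline.
      precomputed = [
          (tid.decode(), score)
          for tid, score in r.zrevrange(f"timeline:{user_id}", 0, count - 1, withscores=True)
      ]

      # Lazy portion: celebrity tweets that were never fanned out at write time.
      lazy = []
      for celeb in get_celebrity_followees(user_id):
          lazy.extend(get_recent_tweet_refs(celeb, limit=count))

      # Merge both sources and keep the newest `count` items overall.
      merged = sorted(precomputed + lazy, key=lambda ref: ref[1], reverse=True)
      return [tweet_id for tweet_id, _ in merged[:count]]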

Kafka partitioning is done by user_id mod N to keep a user's tweets on the same partition, preserving ordering. The fanout worker fleet scales horizontally; each worker handles a shard of the follower ID space. The Redis cluster for timelines uses consistent hashing with 200 virtual nodes per physical node, and cross-datacenter replication is asynchronous, with read-your-writes guarantees provided by sticky routing.
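
Both placement rules can be sketched concretely; the partition count, node names, and hash choice below are illustrative assumptions.

  import bisect
  import hashlib

  NUM_PARTITIONS = 64  # assumed Kafka partition count for tweets-raw

  def kafka_partition(user_id):
      # user_id mod N keeps every tweet from one author on the same partition,
      # preserving per-author ordering.
      return user_id % NUM_PARTITIONS

  class ConsistentHashRing:
      """Consistent hashing with virtual nodes for the timeline Redis cluster."""
      def __init__(self, nodes, vnodes=200):
          self._ring = sorted(
              (self._hash(f"{node}#{i}"), node)
              for node in nodes for i in range(vnodes)
          )
          self._keys = [h for h, _ in self._ring]

      @staticmethod
      def _hash(value):
          return int(hashlib.md5(value.encode()).hexdigest(), 16)

      def node_for(self, key):
          # Walk clockwise to the first virtual node at or after the key's hash.
          idx = bisect.bisect(self._keys, self._hash(key)) % len(self._ring)
          return self._ring[idx][1]

  ring = ConsistentHashRing(["redis-a", "redis-b", "redis-c"])
  print(kafka_partition(123456789))             # stable partition for this author
  print(ring.node_for("timeline:123456789"))    # Redis node owning this timeline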

Key Trade-offs

  • Hybrid fan-out: Synchronous fan-out for normal users (fast reads) + lazy fan-out for celebrities (avoids write storms) — complexity cost is worth the scalability gain
  • Tweet IDs not content in timelines: Storing only IDs in Redis keeps memory usage 10x lower and makes content edits trivial — hydration latency is the cost
  • Eventual consistency for timelines: Followers may see a new tweet 1-2 seconds after posting; the simplicity and throughput gains outweigh strict consistency
  • Earlybird in-memory index: Keeping recent tweets in RAM for search gives sub-second latency but limits the searchable window; older tweets require a separate Hadoop-backed index
