System Design: Watch History & Resume Playback

System design of watch history and resume playback covering cross-device position sync, efficient timeline storage, continue-watching feeds, and privacy controls for video streaming platforms.

Requirements

Functional Requirements:

  • Record every video watched by a user with timestamp, duration watched, and completion percentage
  • Resume playback: when a user starts a partially-watched video on any device, playback resumes from the last position
  • Continue-watching row on the home screen showing in-progress videos sorted by last watched
  • Watch history page with search, filtering by date range, and deletion of individual entries
  • Cross-device synchronization: position updates on one device are reflected on all others within 5 seconds
  • Privacy controls: users can pause history recording, delete all history, or delete specific entries

Non-Functional Requirements:

  • 200 million DAU, each watching an average of 2 hours of content
  • Position update events: every 10 seconds during playback = 200M users × 720 updates/session = 144B events/day = 1.67M events/sec
  • Resume position retrieval in under 100ms (it's in the playback start critical path)
  • 99.99% availability for position reads (failure means poor user experience: video starts from the beginning)
  • Storage: 3 years of watch history per user

Scale Estimation

  • Viewing volume: 200M DAU × 2 hours average = 400M viewing hours/day
  • Position updates: 1 event every 10 seconds × 7,200 seconds per 2-hour session = 720 events/user/day; 200M × 720 = 144B events/day ≈ 1.67M events/sec. Only the latest position per user-video pair needs to be persisted; the intermediate events are transient.
  • Persistent writes (last position per video at session end): 200M × 3 videos/day = 600M writes/day ≈ 6,944/sec
  • Watch history records: 200M users × 3 videos/day × 365 days × 3 years = 657B records; at 100 bytes per record ≈ 65.7TB
  • Continue-watching feed: 200M users × 10 in-progress videos = 2B entries in hot storage
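
These figures can be sanity-checked with a few lines of Python (a throwaway sketch; the constants simply restate the assumptions above):

  # Back-of-the-envelope check of the estimates above.
  DAU = 200_000_000
  WATCH_SECONDS_PER_DAY = 2 * 3600        # 2 hours of playback per user per day
  HEARTBEAT_INTERVAL_S = 10
  VIDEOS_PER_USER_PER_DAY = 3
  BYTES_PER_HISTORY_RECORD = 100
  RETENTION_DAYS = 3 * 365

  heartbeats_per_user = WATCH_SECONDS_PER_DAY // HEARTBEAT_INTERVAL_S       # 720
  heartbeat_events_per_sec = DAU * heartbeats_per_user / 86_400             # ~1.67M
  session_writes_per_sec = DAU * VIDEOS_PER_USER_PER_DAY / 86_400           # ~6,944
  history_records = DAU * VIDEOS_PER_USER_PER_DAY * RETENTION_DAYS          # ~657B
  history_storage_tb = history_records * BYTES_PER_HISTORY_RECORD / 1e12    # ~65.7 TB

  print(f"{heartbeat_events_per_sec:,.0f} heartbeats/sec, "
        f"{session_writes_per_sec:,.0f} session writes/sec, "
        f"{history_storage_tb:.1f} TB of history")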

High-Level Architecture

The architecture separates the hot path (real-time position tracking during active playback) from the warm path (watch history storage and retrieval). The Hot Path handles position updates during active playback. The video player sends a heartbeat every 10 seconds to the Position Service: {user_id, video_id, position_seconds, duration_seconds, device_id}. The Position Service writes the latest position to an in-memory cache (Redis) keyed by user_id:video_id. This cache serves resume requests — when a user starts a video, the player fetches the last position from Redis in under 10ms. Only the latest position per user-video pair is stored in Redis, keeping memory usage bounded.
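
A minimal sketch of the two hot-path operations, assuming redis-py and the user:{uid}:video:{vid}:pos key layout described in the database design section (host and service names are illustrative):

  import json
  import time
  import redis

  r = redis.Redis(host="position-cache", port=6379)   # illustrative cluster endpoint
  POSITION_TTL_S = 30 * 24 * 3600                      # inactive positions expire after 30 days

  def record_heartbeat(user_id: str, video_id: str, position_s: int, device_id: str) -> None:
      """Called every 10 seconds by the player; keeps only the latest position per user-video pair."""
      key = f"user:{user_id}:video:{video_id}:pos"
      value = json.dumps({"position_seconds": position_s,
                          "updated_at": int(time.time()),
                          "device_id": device_id})
      r.set(key, value, ex=POSITION_TTL_S)

  def get_resume_position(user_id: str, video_id: str) -> dict | None:
      """Serves the resume point at playback start; a cache miss means start from the beginning."""
      raw = r.get(f"user:{user_id}:video:{video_id}:pos")
      return json.loads(raw) if raw else None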

The Warm Path persists watch history for long-term storage. When a viewing session ends (detected by position update timeout — no heartbeat for 60 seconds), a Session Finalizer writes the session summary to Cassandra: {user_id, video_id, watched_at, position_seconds, total_duration, completion_pct, device}. This write also updates the continue-watching list for the user. Cassandra is chosen for its write-heavy optimization and time-series access pattern (user_id partition key, watched_at clustering key for chronological history retrieval).
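
A sketch of the session finalizer, assuming the Python cassandra-driver and the watch_history schema shown under Core Components; detecting the 60-second heartbeat timeout is assumed to happen upstream (for example via a timer or key-expiry mechanism):

  import datetime
  from cassandra.cluster import Cluster

  cass = Cluster(["cassandra-1", "cassandra-2"]).connect("watch")   # illustrative keyspace

  insert_history = cass.prepare(
      "INSERT INTO watch_history "
      "(user_id, watched_at, video_id, position_seconds, total_duration, completion_pct, device) "
      "VALUES (?, ?, ?, ?, ?, ?, ?)")

  def finalize_session(user_id, video_id, last_heartbeat: dict, total_duration: int) -> None:
      """Runs when no heartbeat has arrived for 60 seconds; persists the session summary."""
      completion = last_heartbeat["position_seconds"] / total_duration
      cass.execute(insert_history, (
          user_id,
          datetime.datetime.now(datetime.timezone.utc),
          video_id,
          last_heartbeat["position_seconds"],
          total_duration,
          completion,
          last_heartbeat["device_id"],
      ))
      # Also refresh the continue-watching feed (see the feed sketch below).
      update_continue_watching(user_id, video_id, completion)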

The Continue-Watching Service maintains the home screen feed. When a session is finalized with completion < 90%, the video is added to (or updated in) the user's continue-watching list in Redis (sorted set: user_id:continue_watching, scored by watched_at timestamp). When completion >= 90%, the entry is removed (the user finished the video). The home screen fetches the top 20 entries from this sorted set.

Core Components

Position Sync Service

The Position Sync Service handles 1.67M events/sec. Each event is a lightweight HTTP PUT (or WebSocket message) containing ~100 bytes. The service is stateless, reading and writing to Redis. Redis deployment: a cluster of 30 shards (hash-partitioned by user_id) with 3 replicas per shard. Each shard handles ~56K writes/sec, well within Redis's throughput capacity. Memory sizing: 2B active user-video position entries × 120 bytes (key + value) = 240GB — spread across 30 shards = 8GB per shard. Cross-device sync is inherently handled: all devices read from the same Redis key. A short-polling approach (player checks for remote position updates every 30 seconds) detects if another device has advanced the position, offering the user a prompt to jump to the newer position.
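
The client-side short-polling check might look roughly like this (a sketch: the player object and its prompt_jump_to method are hypothetical, and get_resume_position is the hot-path read from the earlier sketch):

  import threading

  POLL_INTERVAL_S = 30     # polling cadence from the design above
  JUMP_THRESHOLD_S = 30    # only prompt when the remote position is meaningfully ahead (assumed value)

  def poll_remote_position(player, user_id: str, video_id: str) -> None:
      """Detects whether another device has advanced playback and offers to jump there."""
      remote = get_resume_position(user_id, video_id)
      if (remote
              and remote["device_id"] != player.device_id
              and remote["position_seconds"] - player.position_seconds > JUMP_THRESHOLD_S):
          player.prompt_jump_to(remote["position_seconds"])
      # Re-arm the poll while playback continues.
      if player.is_playing():
          threading.Timer(POLL_INTERVAL_S, poll_remote_position,
                          args=(player, user_id, video_id)).start()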

Watch History Store

Cassandra stores the complete watch history with a schema optimized for the primary access pattern: "show me my watch history, most recent first." Table: watch_history (user_id UUID, watched_at TIMESTAMP, video_id UUID, position_seconds INT, total_duration INT, completion_pct FLOAT, device VARCHAR, PRIMARY KEY (user_id, watched_at)) WITH CLUSTERING ORDER BY (watched_at DESC). This enables efficient range scans for paginated history retrieval. A secondary index on video_id enables deletion of all history for a specific video (e.g., when a video is removed from the catalog). Compaction strategy: TimeWindowCompactionStrategy with 7-day windows, optimized for time-series write patterns. TTL: 3 years (configured at table level).
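
Expressed as executable CQL (here issued through the Python cassandra-driver), the table, compaction settings, TTL, and the paginated history read might look like this sketch; the keyspace and host names are placeholders:

  from cassandra.cluster import Cluster

  cass = Cluster(["cassandra-1"]).connect("watch")   # illustrative keyspace

  cass.execute("""
      CREATE TABLE IF NOT EXISTS watch_history (
          user_id          UUID,
          watched_at       TIMESTAMP,
          video_id         UUID,
          position_seconds INT,
          total_duration   INT,
          completion_pct   FLOAT,
          device           VARCHAR,
          PRIMARY KEY (user_id, watched_at)
      ) WITH CLUSTERING ORDER BY (watched_at DESC)
        AND compaction = {'class': 'TimeWindowCompactionStrategy',
                          'compaction_window_unit': 'DAYS',
                          'compaction_window_size': 7}
        AND default_time_to_live = 94608000   -- 3 years, in seconds
  """)

  cass.execute("CREATE INDEX IF NOT EXISTS ON watch_history (video_id)")

  def history_page(user_id, cursor, limit=50):
      """Paginated history read, most recent first; cursor is the last watched_at already shown."""
      return cass.execute(
          "SELECT video_id, watched_at, position_seconds, completion_pct "
          "FROM watch_history WHERE user_id = %s AND watched_at < %s LIMIT %s",
          (user_id, cursor, limit))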

Continue-Watching Feed

The continue-watching feed is a critical home screen component. It is maintained as a Redis sorted set per user: ZADD user:{uid}:cw watched_at_timestamp video_id. When the home screen loads, ZREVRANGE returns the most recently watched in-progress videos. The feed is updated on every session finalization: if completion < 90%, ZADD updates the entry's score (latest watch time); if completion >= 90%, ZREM removes it. Edge cases: TV series — completing an episode should remove that episode and potentially add the next episode (handled by a content-aware post-processing step that queries the catalog for the next unwatched episode). The sorted set is capped at 50 entries (ZREMRANGEBYRANK removing the oldest beyond 50) to bound memory usage.
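
A sketch of the feed maintenance logic with redis-py, following the 90% completion rule and the 50-entry cap described above (key names follow the user:{uid}:cw convention; the host is illustrative):

  import time
  import redis

  r = redis.Redis(host="feed-cache", port=6379)   # illustrative endpoint
  MAX_ENTRIES = 50
  COMPLETION_THRESHOLD = 0.90

  def update_continue_watching(user_id: str, video_id: str, completion: float) -> None:
      """Runs on session finalization: upsert in-progress videos, drop finished ones."""
      key = f"user:{user_id}:cw"
      if completion >= COMPLETION_THRESHOLD:
          r.zrem(key, video_id)                           # the user finished the video
      else:
          r.zadd(key, {video_id: time.time()})            # score = last-watched timestamp
          r.zremrangebyrank(key, 0, -(MAX_ENTRIES + 1))   # keep only the 50 most recent entries

  def continue_watching_feed(user_id: str, limit: int = 20):
      """Home-screen read: most recently watched in-progress videos first."""
      return r.zrevrange(f"user:{user_id}:cw", 0, limit - 1)

The series-aware step mentioned above (replacing a finished episode with the next unwatched one) would hook in right after the ZREM, once the catalog lookup returns.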

Database Design

Redis stores hot data: Position cache: STRING key user:{uid}:video:{vid}:pos → JSON {position_seconds, updated_at, device_id}, TTL 30 days (inactive positions expire). Continue-watching: SORTED SET user:{uid}:cw → {video_id: last_watched_timestamp}. Total Redis memory: position cache (240GB) + continue-watching (2B entries × 50 bytes = 100GB) = 340GB across 30 shards.

Cassandra stores the full watch history. Cluster sizing: 657B records × 100 bytes = 65.7TB. With replication factor 3: ~200TB total. A 20-node cluster therefore needs roughly 15TB of NVMe SSD per node: 10TB per node would exactly match the replicated footprint, leaving no headroom for compaction overhead or growth. Write throughput: 6,944 session finalizations/sec distributed across 20 nodes = 347 writes/sec per node — well within Cassandra's capacity. Read throughput for history pages: 200M users × 1% accessing history page daily = 2M reads/day = 23 reads/sec (low; history browsing is infrequent).

PostgreSQL stores privacy settings and deletion requests: privacy_settings (user_id PK, history_paused BOOLEAN, paused_at nullable). Deletion requests are processed asynchronously: a deletion job scans Cassandra for the user's entries and removes them in batches (processing 1M deletions takes ~10 minutes). Redis entries are deleted immediately.
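
A sketch of the asynchronous delete-all-history job, assuming redis-py and the Python cassandra-driver; a production job would group the per-row deletes into partition-local batches, but the flow is the same:

  import redis
  from cassandra.cluster import Cluster
  from cassandra.query import SimpleStatement

  cache = redis.Redis(host="position-cache", port=6379)   # illustrative endpoints
  cass = Cluster(["cassandra-1"]).connect("watch")

  def delete_all_history(user_id) -> None:
      """Privacy job: remove hot data immediately, then purge the user's Cassandra partition."""
      # 1. Redis entries are deleted right away (positions and the continue-watching feed).
      for key in cache.scan_iter(f"user:{user_id}:video:*:pos"):
          cache.delete(key)
      cache.delete(f"user:{user_id}:cw")

      # 2. Page through the user's partition and delete by clustering key.
      rows = cass.execute(
          SimpleStatement("SELECT watched_at FROM watch_history WHERE user_id = %s",
                          fetch_size=1000),
          (user_id,))
      for row in rows:
          cass.execute("DELETE FROM watch_history WHERE user_id = %s AND watched_at = %s",
                       (user_id, row.watched_at))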

API Design

  • PUT /api/v1/playback/{video_id}/position — Update playback position; body contains position_seconds, device_id; called every 10 seconds during playback
  • GET /api/v1/playback/{video_id}/position — Fetch last known position for a video (resume point); returns position_seconds and device_id (client flow sketched after this list)
  • GET /api/v1/continue-watching?limit=20 — Fetch the continue-watching feed for the home screen
  • GET /api/v1/history?cursor={cursor}&limit=50 — Fetch watch history with cursor-based pagination (most recent first)
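
From the player's point of view, the resume and heartbeat flow over these endpoints looks roughly like this (a sketch using the requests library; the base URL, auth header, and player object are placeholders):

  import time
  import requests

  API = "https://api.example.com/api/v1"           # placeholder base URL
  HEADERS = {"Authorization": "Bearer <token>"}    # auth scheme assumed

  def play_with_resume(player, video_id: str, device_id: str) -> None:
      # Resume lookup sits on the playback-start critical path (<100ms budget).
      try:
          resp = requests.get(f"{API}/playback/{video_id}/position",
                              headers=HEADERS, timeout=0.5)
          position = resp.json().get("position_seconds", 0) if resp.ok else 0
      except requests.RequestException:
          position = 0    # on any failure, fall back to starting from the beginning
      player.start(video_id, at_seconds=position)

      # Heartbeat loop: report the current position every 10 seconds while playing.
      while player.is_playing():
          requests.put(f"{API}/playback/{video_id}/position",
                       headers=HEADERS,
                       json={"position_seconds": player.position_seconds,
                             "device_id": device_id})
          time.sleep(10)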

Scaling & Bottlenecks

The position update path at 1.67M events/sec is the primary throughput challenge. The stateless Position Service scales horizontally behind a load balancer. Each instance handles 50K requests/sec (async I/O with Redis pipeline batching). A fleet of 40 instances provides the required throughput with 20% headroom. Redis pipeline batching (grouping multiple SET commands into a single network round-trip) reduces network overhead by 10x.
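
The pipeline batching mentioned above amounts to grouping many SET commands into one round-trip; with redis-py it might look like this sketch (micro-batching of incoming heartbeats is assumed to happen in the service's async loop):

  import redis

  r = redis.Redis(host="position-cache", port=6379)   # illustrative endpoint
  POSITION_TTL_S = 30 * 24 * 3600

  def flush_position_batch(updates: list[tuple[str, str]]) -> None:
      """Write a micro-batch of (key, json_value) position updates in a single round-trip."""
      pipe = r.pipeline(transaction=False)   # no atomicity needed, just fewer network hops
      for key, value in updates:
          pipe.set(key, value, ex=POSITION_TTL_S)
      pipe.execute()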

Cassandra write throughput for session finalization (6,944/sec) is modest for a 20-node cluster. The real challenge is storage management over 3 years: 200TB with compaction overhead requires careful disk planning. TimeWindowCompactionStrategy minimizes compaction I/O for time-series data (no need to merge old windows). Data older than 3 years is automatically TTL-expired. For delete-all-history requests (GDPR), a batch deletion job runs asynchronously, scanning the user's partition and issuing tombstone deletes — Cassandra handles tombstones efficiently with compaction, but excessive deletes on a single partition require gc_grace_seconds tuning.

Key Trade-offs

  • Redis for hot position data vs database only: Redis provides sub-10ms position retrieval (critical for playback start latency), while a database-only approach would add 20-50ms — the additional infrastructure is justified for 200M DAU playback starts
  • Position heartbeat every 10 seconds vs more/less frequent: 10-second intervals balance position accuracy (max 10 seconds lost on crash) with event volume — 5-second intervals would double the 1.67M events/sec rate, while 30-second intervals lose too much progress on unexpected exits
  • Cassandra vs DynamoDB for watch history: Cassandra's TimeWindowCompactionStrategy is ideal for time-series write patterns and provides better cost efficiency at 200TB scale — DynamoDB would be simpler operationally but significantly more expensive for this access pattern
  • Session-end persistence vs every-heartbeat persistence: Persisting only at session end (6,944 writes/sec) is 240x cheaper than persisting every heartbeat (1.67M/sec) — the trade-off is that a crash loses the last session's intermediate positions, which is acceptable (video position is approximate anyway)
