System Design: Social Fitness Platform (Strava-style)
Design a Strava-scale social fitness platform supporting GPS activity uploads, route mapping, segment leaderboards, and a social feed. Covers geospatial data, activity processing pipelines, and social-graph feed generation at a scale of tens of millions of users.
Requirements
Functional Requirements:
- Upload GPS activity files (GPX/FIT format) with automatic sport detection and stats calculation
- Route visualization on interactive maps with elevation profiles
- Segment detection: match user routes against known segments and rank on leaderboards
- Social feed: see activities from followed athletes with kudos and comments
- Clubs and group challenges with leaderboards among members
- Route creation and discovery: plan routes and find popular routes in any area
Non-Functional Requirements:
- Activity processing (stats, segments, map rendering) completes within 60 seconds of upload
- Social feed loads within 500ms at the 99th percentile
- Segment leaderboards with up to 1M ranked athletes serve reads with sub-second latency
- 99.9% uptime; athletes upload activities immediately post-workout
- GPS track data encrypted at rest; athlete privacy zones (home/work) applied before route display
Scale Estimation
At Strava scale: 100M registered athletes, 2M activities uploaded/day = ~23/second. Each activity: average 1 hour × 1 GPS point/second = 3,600 points × 40 bytes = ~144KB per activity before compression (polyline encoding plus gzip shrinks this further, so 144KB is a safe upper bound). Storage: 2M × 144KB/day = 288GB/day = ~105TB/year. Segment matching: each activity is matched against ~10k potential segments = 2M × 10k = 20B segment-match evaluations/day. Social feed requests, treating every registered athlete as daily active for an upper bound: 100M × 5 feed loads = 500M/day = ~5,790/second.
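These figures are simple arithmetic; a back-of-the-envelope script using the same assumptions makes them easy to re-derive:

```python
SECONDS_PER_DAY = 86_400

activities_per_day = 2_000_000
points_per_activity = 3_600            # 1 hour at 1 GPS point/second
bytes_per_point = 40

upload_rate = activities_per_day / SECONDS_PER_DAY             # ~23/s
activity_bytes = points_per_activity * bytes_per_point         # 144 KB (uncompressed)
daily_storage_gb = activities_per_day * activity_bytes / 1e9   # ~288 GB/day
yearly_storage_tb = daily_storage_gb * 365 / 1e3               # ~105 TB/year

feed_qps = 100_000_000 * 5 / SECONDS_PER_DAY                   # ~5,790/s upper bound

print(f"{upload_rate:.0f} uploads/s, {activity_bytes/1e3:.0f} KB/activity, "
      f"{daily_storage_gb:.0f} GB/day, {yearly_storage_tb:.0f} TB/yr, "
      f"{feed_qps:.0f} feed QPS")
```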
High-Level Architecture
The platform is organized around an Activity Processing Pipeline, a Geospatial Service, a Social Graph & Feed Service, and a Leaderboard Service. Activity upload triggers an async pipeline: parse GPS file → compute stats → snap track to road/trail network → detect segments → generate map tiles → publish to social feed. This pipeline decouples upload acknowledgment (immediate) from full activity enrichment (within 60 seconds).
Segment detection is the most computationally intensive operation. The algorithm: for each activity track, identify sub-sequences that match known segment polylines within a configurable spatial tolerance (e.g., 10 meters). This is equivalent to subsequence matching on a 2D coordinate sequence, accelerated by a spatial index. Matched segments trigger leaderboard updates — each match is a potential new rank on the segment leaderboard.
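A minimal sketch of the tolerance-based subsequence match, brute force and with a hypothetical haversine helper — a real implementation would project coordinates and use the spatial index described below rather than scanning every point:

```python
import math

def haversine_m(p: tuple, q: tuple) -> float:
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6_371_000 * 2 * math.asin(math.sqrt(a))

def match_segment(track: list, segment: list, tol_m: float = 10.0):
    """Return (start, end) track indices of the first sub-sequence that visits
    every segment vertex in order within tol_m, else None. Assumes the track
    is sampled at least as densely as the segment polyline."""
    start, seg_idx = None, 0
    for i, point in enumerate(track):
        if haversine_m(point, segment[seg_idx]) <= tol_m:
            if seg_idx == 0:
                start = i
            seg_idx += 1
            if seg_idx == len(segment):
                return start, i  # segment fully traversed
    return None
```

Even this linear scan touches 3,600 points per candidate for a one-hour ride, which is why the bounding-box pre-filter in the Geospatial Service matters.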
The social feed uses a fan-out-on-write model for users with fewer than 10k followers. When an athlete completes an activity, the Feed Service writes the activity reference to each follower's feed store (a Redis sorted set scored by activity timestamp). For athletes with 10k+ followers (celebrities and professional athletes), fan-out-on-read is used instead — their activities are not written to each follower's feed; the feed service fetches them at read time. This hybrid approach (the same one Twitter popularized) balances write throughput against read latency.
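A sketch of the write-side fan-out under this hybrid policy, assuming a redis-py client and that the caller has already enumerated followers (threshold and key names are illustrative):

```python
import time

import redis

r = redis.Redis()
FANOUT_THRESHOLD = 10_000   # illustrative; matches the 10k cutoff above
FEED_MAX_LEN = 1_000

def publish_activity(activity_id: int, follower_ids: list[int]) -> None:
    """Write-side fan-out: push the activity into each follower's feed,
    unless the author is over the celebrity threshold."""
    if len(follower_ids) >= FANOUT_THRESHOLD:
        return  # fan-out-on-read: feed reads merge this athlete's timeline instead
    now = time.time()
    pipe = r.pipeline(transaction=False)
    for fid in follower_ids:
        key = f"feed:{fid}"
        pipe.zadd(key, {str(activity_id): now})            # score = timestamp
        pipe.zremrangebyrank(key, 0, -(FEED_MAX_LEN + 1))  # trim to newest 1,000
    pipe.execute()
```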
Core Components
Activity Processing Pipeline
A Kafka-driven pipeline of workers. Stages: (1) Parser: decodes GPX/FIT file, extracts GPS track, timestamps, heart rate, power if available. (2) Stats Calculator: computes distance, elevation gain/loss, pace/speed, heart rate zones, power metrics. (3) Privacy Filter: removes GPS points within user-defined privacy zones (50-meter radius around home/work address). (4) Segment Matcher: spatial subsequence matching against segment library. (5) Map Tile Generator: renders the route on a raster map using Mapbox or Google Static Maps API. Each stage is an independent worker pool consuming from and producing to Kafka, enabling independent scaling and failure isolation.
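Every stage can share one consume-process-produce skeleton; a sketch of the Stats Calculator stage with kafka-python (topic names and the stats logic are placeholders):

```python
import json

from kafka import KafkaConsumer, KafkaProducer

# Illustrative topic names: each stage consumes the previous stage's output.
IN_TOPIC, OUT_TOPIC = "activities.parsed", "activities.stats"

consumer = KafkaConsumer(
    IN_TOPIC,
    group_id="stats-calculator",
    value_deserializer=lambda b: json.loads(b),
)
producer = KafkaProducer(value_serializer=lambda d: json.dumps(d).encode())

def compute_stats(track: list) -> dict:
    """Placeholder for distance, elevation gain/loss, pace, and HR-zone math."""
    return {"point_count": len(track)}

for msg in consumer:
    activity = msg.value
    activity["stats"] = compute_stats(activity["track"])
    # The next stage (Privacy Filter) consumes activities.stats.
    producer.send(OUT_TOPIC, activity)
```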
Geospatial & Segment Service
Maintains the segment library (polyline representations of popular routes). The spatial index (PostGIS or a custom R-tree) enables fast segment candidate lookup: given an activity bounding box, find all segments whose bounding box intersects it. Candidate segments are then evaluated for track similarity using the Hausdorff distance or a custom similarity measure. Each match is recorded with the athlete's effort time — this becomes a leaderboard entry. Segment creation is a user-facing feature: any athlete can define a segment from their activity track.
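An in-process stand-in for the candidate lookup, using Shapely's STRtree in place of PostGIS (coordinates assumed pre-projected to meters; thresholds illustrative):

```python
from shapely.geometry import LineString
from shapely.strtree import STRtree

# Segment library as polylines in a projected CRS (units: meters).
segment_lib = [LineString([(0, 0), (500, 0)]),
               LineString([(0, 100), (400, 120)])]
index = STRtree(segment_lib)

def candidate_segments(track: LineString, pad_m: float = 50.0) -> list[LineString]:
    """Bounding-box pre-filter: segments whose envelope intersects the padded
    activity envelope. Shapely 2.x returns integer indices into the tree."""
    return [segment_lib[i] for i in index.query(track.buffer(pad_m))]

def plausible_match(sub_track: LineString, segment: LineString,
                    tol_m: float = 10.0) -> bool:
    """Verify a candidate against the matched sub-track: a small Hausdorff
    distance means the two polylines stay close along their whole length."""
    return segment.hausdorff_distance(sub_track) <= tol_m
```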
Feed & Social Graph Service
Maintains the follow graph in a graph database (Neo4j) or a denormalized PostgreSQL adjacency list for fast follower enumeration. Fan-out-on-write populates per-user Redis sorted sets (score = activity timestamp) with activity IDs. Feed reads fetch activity IDs from the sorted set and hydrate with activity data from a cache (Redis) backed by PostgreSQL. Kudos and comments are stored in PostgreSQL and aggregated into activity objects for feed display. Comment notifications are delivered via push with a Kafka-backed notification fan-out service.
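The corresponding read path, sketched with redis-py (the celebrity-timeline merge and the PostgreSQL fallback on cache miss are elided; key names are illustrative):

```python
import json

import redis

r = redis.Redis()

def read_feed(athlete_id: int, limit: int = 20,
              before: float | None = None) -> list[dict]:
    """Newest-first page of activity IDs, hydrated from the activity cache.
    Cache misses (and celebrity-timeline merging) would fall through to
    PostgreSQL; both are elided here."""
    max_score = f"({before}" if before is not None else "+inf"  # exclusive cursor
    ids = r.zrevrangebyscore(f"feed:{athlete_id}", max_score, "-inf",
                             start=0, num=limit)
    cached = r.mget([f"activity:{i.decode()}" for i in ids])
    return [json.loads(c) for c in cached if c is not None]
```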
Database Design
Activities: activity_id UUID, athlete_id, sport_type ENUM, started_at, elapsed_time, distance_meters, elevation_gain, avg_speed, avg_heart_rate, avg_power, map_polyline GEOMETRY(LineString), map_tile_url, privacy ENUM(public, followers, private). GPS tracks stored as compressed binary in S3 (Polyline encoding, gzip). Segments: segment_id, name, sport_type, polyline GEOMETRY(LineString), distance, elevation_gain, kom_time_seconds, kom_athlete_id.
Leaderboard entries: segment_efforts (effort_id, segment_id, athlete_id, activity_id, elapsed_time, achieved_at) with an index on (segment_id, elapsed_time) for fast rank computation. Top-10 leaderboard per segment is cached in Redis and refreshed on each new effort. Social: follows (follower_id, following_id, created_at). Clubs: clubs (club_id, name, sport, privacy), club_members (club_id, athlete_id, role). Activity kudos: kudos (activity_id, athlete_id, given_at) — count is denormalized on the activity record.
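With that index, an effort's rank reduces to a count of strictly faster efforts; a sketch with psycopg (deduplicating to each athlete's best effort is omitted for brevity):

```python
import psycopg  # psycopg 3

def effort_rank(conn: psycopg.Connection, segment_id: str,
                elapsed_time: int) -> int:
    """1-based rank = 1 + number of strictly faster efforts on the segment.
    The (segment_id, elapsed_time) index makes this an index-only scan."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT COUNT(*) + 1
            FROM segment_efforts
            WHERE segment_id = %s AND elapsed_time < %s
            """,
            (segment_id, elapsed_time),
        )
        return cur.fetchone()[0]
```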
API Design
POST /api/v1/activities/upload — accepts GPX/FIT file; returns {activity_id, status: "processing"}; async pipeline starts immediately.
GET /api/v1/activities/{activityId} — returns full activity data including stats, map URL, and segment efforts.
GET /api/v1/feed?before={cursor}&limit=20 — returns paginated feed of followed athletes' activities.
GET /api/v1/segments/{segmentId}/leaderboard?gender=&age_group=&date_range= — returns filtered leaderboard.
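An illustrative client flow against these endpoints (requests library; the host, auth token, and any field names beyond those shown above are placeholders):

```python
import time

import requests

BASE = "https://api.example.com/api/v1"        # placeholder host
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder auth

# Upload acknowledges immediately; enrichment happens asynchronously.
with open("morning_ride.fit", "rb") as f:
    resp = requests.post(f"{BASE}/activities/upload",
                         files={"file": f}, headers=HEADERS)
activity_id = resp.json()["activity_id"]

# Poll until the pipeline finishes (a webhook would avoid polling in production).
while True:
    activity = requests.get(f"{BASE}/activities/{activity_id}",
                            headers=HEADERS).json()
    if activity.get("status") != "processing":
        break
    time.sleep(2)

print(activity["distance_meters"], activity.get("segment_efforts", []))
```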
Scaling & Bottlenecks
Segment matching is the most CPU-intensive operation. At ~10k candidate segments per activity, the 20B evaluations/day (~230k/second) are dominated by cheap bounding-box checks; the pre-filter whittles each activity down to ~50 candidates (~100M/day) that need the full geometric comparison. Vectorized geometric operations (NumPy or a custom C extension) and batch processing within a worker bring per-evaluation cost down to microseconds. The segment worker pool auto-scales on Kafka consumer lag.
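The vectorized inner loop alluded to above might look like the following: a single NumPy broadcast computes every segment-vertex-to-track-point distance at once (coordinates assumed pre-projected to meters; names are illustrative):

```python
import numpy as np

def min_dist_to_track(track_xy: np.ndarray, seg_xy: np.ndarray) -> np.ndarray:
    """For each segment vertex, distance to the nearest track point.
    track_xy: (N, 2) and seg_xy: (M, 2), both in a projected CRS (meters).
    One broadcasted subtraction replaces an M x N Python loop."""
    diff = track_xy[None, :, :] - seg_xy[:, None, :]  # (M, N, 2)
    dists = np.hypot(diff[..., 0], diff[..., 1])      # (M, N)
    return dists.min(axis=1)                          # (M,)

def covers_segment(track_xy: np.ndarray, seg_xy: np.ndarray,
                   tol_m: float = 10.0) -> bool:
    """Directed-Hausdorff-style check: every segment vertex lies within
    tol_m of some track point."""
    return bool(min_dist_to_track(track_xy, seg_xy).max() <= tol_m)
```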
Feed reads at ~5,790/second require Redis to serve feeds for 100M users. Each user's feed is a sorted set of up to 1,000 activity IDs — roughly 50-100KB per materialized feed with skiplist encoding, or single-digit terabytes even if every feed were materialized (in practice far less, since only recently active users need one); this shards comfortably across a Redis Cluster. Feed write fan-out is the write bottleneck: an athlete with 1M followers would generate 1M Redis writes per activity. The hybrid fan-out strategy (fan-out-on-read for high-follower athletes) bounds the maximum writes per activity at ~10k.
Key Trade-offs
- Fan-out-on-write vs. fan-out-on-read: Fan-out-on-write gives instant feed reads but expensive writes for popular athletes; fan-out-on-read scales writes but adds latency for reading celebrity activities; the hybrid model handles both cases at the cost of implementation complexity.
- Segment matching precision vs. cost: Strict geometric matching (Fréchet distance) is more accurate for determining if a route truly covers a segment but is computationally expensive; looser bounding-box + Hausdorff distance is faster but may produce false positives on parallel routes.
- Privacy zones complexity: Removing GPS points near home/work requires knowing the user's privacy zone polygons at render time, not just storage time; rendering-time masking adds latency to map display; storage-time masking is simpler but means if a user changes their zone, old activities don't update automatically.
- Leaderboard freshness vs. consistency: Real-time leaderboard updates on every new effort are possible with Redis but create thundering herds on popular segments; a 1-minute batched leaderboard refresh is sufficient for most users and dramatically reduces Redis write contention.