
System Design: Personalized Workout Recommendation

Design a personalized workout recommendation engine that adapts to user fitness levels, goals, available equipment, and recovery state. Covers ML model serving, user profile management, and real-time feedback loops.

12 min read · Updated Jan 15, 2025

Tags: system-design, fitness, recommendation, machine-learning, personalization, feedback-loop

Requirements

Functional Requirements:

  • Generate personalized daily workout plans based on user goals, fitness level, and equipment availability
  • Adapt recommendations based on feedback: completed workouts, skipped exercises, difficulty ratings
  • Support multiple goal types: weight loss, muscle gain, endurance, flexibility, general fitness
  • Integrate with wearable data to factor in recovery metrics (HRV, sleep quality) for intensity recommendations
  • Library of 5,000+ exercises with video demonstrations, muscle group targeting, and substitution suggestions
  • Schedule management: rest day detection, progressive overload programming over weeks

Non-Functional Requirements:

  • Workout recommendation generated within 500ms for 99th percentile
  • Recommendation model updated daily with new feedback data
  • 99.9% uptime; users rely on the app for daily workout schedules
  • Cold start support: provide useful recommendations for new users with minimal initial data
  • A/B testing framework for evaluating recommendation algorithm variants

Scale Estimation

For a platform with 5M active users:

  • Workout requests: 5M/day ≈ 58/second average; a 3x morning peak ≈ 174/second
  • Feedback events (exercise completions, ratings): 5M users × 20 exercises/workout × 30% completion = 30M events/day ≈ 347/second
  • Model training: daily batch over 30M events from the past day, ~100M from the past week
  • Exercise library: 5,000 exercises × 2MB video = 10GB of video content, served via CDN
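The back-of-envelope numbers above can be reproduced as a quick sanity check:

```python
# Scale estimation from the figures stated above.
ACTIVE_USERS = 5_000_000
SECONDS_PER_DAY = 86_400

avg_rps = ACTIVE_USERS / SECONDS_PER_DAY            # ~58 requests/second average
peak_rps = avg_rps * 3                              # ~174 requests/second at the morning peak

feedback_events_per_day = ACTIVE_USERS * 20 * 0.30  # 30M feedback events/day
feedback_rps = feedback_events_per_day / SECONDS_PER_DAY  # ~347 events/second

video_gb = 5_000 * 2 / 1_000                        # 10 GB of video content on the CDN
```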

High-Level Architecture

The system is organized around a User Profile Service, a Recommendation Engine, a Content Library Service, and a Feedback Loop. The Recommendation Engine is the core: it takes a user profile snapshot (goals, history, equipment, recovery state) and generates a ranked list of workout plans. The recommendation logic is a hybrid of rule-based programming (progressive overload rules, rest day enforcement, muscle group balance) and ML-based personalization (collaborative filtering to identify users with similar progression patterns and recommend exercises they found effective).

The User Profile is a mutable document combining: stated preferences (goals, equipment, available time), computed fitness metrics (estimated 1-rep max per exercise, aerobic capacity estimate), recent workout history (last 30 days), and biometric context (today's HRV, sleep score from wearable sync). This profile snapshot is assembled at recommendation time from multiple sources and passed to the recommendation engine.

The Feedback Loop processes post-workout ratings and completion data as a Kafka stream. A batch job aggregates feedback into the user profile (updating estimated 1-rep maxes, difficulty ratings per exercise type, adherence patterns). The ML model is retrained daily on the accumulated feedback corpus using a scheduled Spark job, and the new model artifact is deployed to the Model Serving Service via a blue-green deployment.
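The aggregation step of the feedback loop might look like the sketch below, assuming each event is a dict with user_id, exercise_id, a 1-5 difficulty rating, and a completed flag; in production these arrive on a Kafka topic, but a plain iterable stands in here:

```python
from collections import defaultdict

def aggregate_feedback(events):
    """Roll up per-(user, exercise) completion rate and mean difficulty rating."""
    stats = defaultdict(lambda: {"n": 0, "completed": 0, "difficulty_sum": 0})
    for ev in events:
        s = stats[(ev["user_id"], ev["exercise_id"])]
        s["n"] += 1
        s["completed"] += int(ev["completed"])
        s["difficulty_sum"] += ev["difficulty"]
    # These aggregates are what the batch job folds back into the user profile.
    return {
        key: {"completion_rate": v["completed"] / v["n"],
              "avg_difficulty": v["difficulty_sum"] / v["n"]}
        for key, v in stats.items()
    }
```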

Core Components

User Profile Service

Maintains a composite profile assembled from multiple data sources. The profile document is stored in MongoDB (schema flexibility for evolving fitness attributes). Key profile sections: goals, equipment, fitness_levels {exercise_id: current_level}, recent_workouts (last 30 days), recovery_state {hrv_score, sleep_score, fatigue_level}. Profile reads are served from a Redis cache (5-minute TTL) to avoid a MongoDB read on every recommendation request. Profile updates from workout completions and wearable syncs are written to MongoDB and invalidate the cache.
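The cache-aside read path described above can be sketched as follows, with in-memory dicts standing in for the Redis and MongoDB clients:

```python
import json
import time

PROFILE_TTL_SECONDS = 300  # the 5-minute TTL described above

cache: dict[str, tuple[float, str]] = {}            # key -> (expiry timestamp, JSON payload)
mongo: dict[str, dict] = {"u1": {"goals": ["endurance"]}}  # stand-in for the profile collection

def get_profile(user_id: str) -> dict:
    """Cache-aside read: serve from Redis if fresh, else read MongoDB and populate."""
    key = f"profile:{user_id}"
    hit = cache.get(key)
    if hit and hit[0] > time.time():
        return json.loads(hit[1])
    doc = mongo[user_id]                            # authoritative read
    cache[key] = (time.time() + PROFILE_TTL_SECONDS, json.dumps(doc))
    return doc

def invalidate_profile(user_id: str) -> None:
    """Called after a workout completion or wearable sync writes to MongoDB."""
    cache.pop(f"profile:{user_id}", None)
```

With a real Redis client this is a GET, then SETEX on miss, and a DEL on invalidation.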

Recommendation Engine

A two-stage pipeline: (1) Candidate Generation: rule-based workout template selection based on goals, equipment availability, scheduled muscle group rotation, and rest day detection. Templates are pre-built by certified trainers and stored in the Content Library. (2) Personalization: ML model re-ranks exercises within each candidate template based on predicted adherence (will the user actually do this?) and predicted effectiveness (based on similar user progression data). The ML model is a gradient boosted trees model (LightGBM) with features: user fitness tier, exercise category preference scores, recent difficulty feedback, time-of-day, and similar-user engagement rates.

Content Library & Exercise Service

Maintains the exercise database with metadata: name, description, muscle groups targeted, difficulty tier, equipment required, video URL (CDN), and substitution alternatives for each exercise. Video content is stored in S3 and served via CloudFront with HLS adaptive streaming. Exercise data is rarely updated and is cached aggressively (24-hour TTL in Redis). A Content Management System allows fitness content editors to add exercises, update instructions, and flag deprecated exercises. Exercise updates trigger cache invalidation via a Pub/Sub event.
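The substitution lookup is straightforward once exercise metadata carries equipment requirements; a sketch with a hypothetical in-memory library (in production this is the PostgreSQL exercises table behind the Redis cache):

```python
EXERCISES = {
    "barbell_squat": {"equipment": ["barbell", "rack"],
                      "substitutions": ["goblet_squat", "bodyweight_squat"]},
    "goblet_squat": {"equipment": ["dumbbell"], "substitutions": []},
    "bodyweight_squat": {"equipment": [], "substitutions": []},
}

def alternatives(exercise_id: str, available_equipment: list[str]) -> list[str]:
    """Return only those substitutions the user can perform with their equipment."""
    have = set(available_equipment)
    return [
        sub for sub in EXERCISES[exercise_id]["substitutions"]
        if set(EXERCISES[sub]["equipment"]) <= have
    ]
```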

Database Design

User profiles: MongoDB collection user_profiles with document per user. Workout history: completed_workouts (workout_id, user_id, plan_id, started_at, completed_at, exercises JSONB), exercise_feedback (feedback_id, user_id, exercise_id, difficulty_rating 1-5, completed BOOL, notes) in PostgreSQL. These feed the ML feature store.

Feature store: Apache Hive or BigQuery tables storing pre-computed user features (weekly exercise frequency, preferred workout duration, exercise category engagement rates) updated daily by Spark jobs. The online feature store (Redis) serves these features at recommendation time with sub-millisecond latency. Exercise library: exercises (exercise_id, name, category, muscle_groups[], difficulty_tier, equipment_required[], video_s3_key, thumbnail_url, substitutions[]) in PostgreSQL. Workout templates: workout_templates (template_id, goal_type, difficulty, duration_minutes, exercise_sequence JSONB) maintained by content team.
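At recommendation time, the online store lookup is a single keyed read that assembles the model's input vector; a sketch with a dict standing in for the Redis hash, and with assumed feature names:

```python
# Stand-in for the online feature store (Redis hashes in production),
# populated daily by the Spark batch jobs.
online_features = {
    "features:u1": {"weekly_frequency": 4.0,
                    "preferred_duration_min": 45.0,
                    "strength_engagement": 0.8},
}

FEATURE_NAMES = ["weekly_frequency", "preferred_duration_min", "strength_engagement"]

def fetch_feature_vector(user_id: str, default: float = 0.0) -> list[float]:
    """Build the model input in a fixed feature order. Missing features fall back
    to a default, which also covers new users before their first batch run."""
    row = online_features.get(f"features:{user_id}", {})
    return [row.get(name, default) for name in FEATURE_NAMES]
```

Keeping the feature order fixed in one place matters: the serving-time vector must match the column order the model was trained on.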

API Design

GET /api/v1/recommendations/today — returns today's personalized workout plan for the authenticated user; includes exercises, sets, reps, and video links.

POST /api/v1/workouts/{workoutId}/complete — submits completion data with per-exercise ratings; updates profile and enqueues feedback events.

GET /api/v1/exercises/{exerciseId}/alternatives?equipment=[] — returns equipment-appropriate substitutions for an exercise.

PUT /api/v1/profile/goals — updates user fitness goals; triggers recommendation recalibration.
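The article does not specify the response payloads; the shape below for GET /api/v1/recommendations/today is an illustrative assumption, with a minimal client-side check before rendering:

```python
# Hypothetical response body for GET /api/v1/recommendations/today.
sample_response = {
    "plan_id": "plan_2025_01_15_u1",
    "date": "2025-01-15",
    "exercises": [
        {"exercise_id": "goblet_squat", "sets": 3, "reps": 10,
         "video_url": "https://cdn.example.com/goblet_squat.m3u8"},
    ],
    "estimated_duration_minutes": 45,
}

def validate_plan(resp: dict) -> bool:
    """Reject empty plans and exercises missing the fields the client must render."""
    return bool(resp.get("exercises")) and all(
        {"exercise_id", "sets", "reps"} <= ex.keys() for ex in resp["exercises"]
    )
```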

Scaling & Bottlenecks

The 174 recommendations/second peak is modest for a stateless service, but each recommendation assembles a profile from Redis (fast) and runs the ML model inference (50-100ms). Model serving uses a dedicated inference service (TorchServe or BentoML) with the model loaded in memory on each instance. Horizontal scaling of the inference service handles the peak load. Caching completed recommendations for the day ("today's workout") in Redis means repeat requests (user opens app multiple times) don't re-invoke the model.

Daily model retraining on 100M+ feedback events uses a scheduled Spark cluster that spins up on demand (EMR or Databricks), trains in 2-4 hours during off-peak hours, evaluates model performance, and publishes the artifact to an MLflow model registry. The Model Serving Service polls the registry for new artifacts and hot-swaps the model with zero downtime via a blue-green deployment.
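The hot-swap mechanics reduce to an atomic model reference: the poller loads the new artifact fully, then swaps the pointer the serving threads read. A sketch (the registry polling and artifact loading are abstracted behind load_fn):

```python
import threading

class ModelHolder:
    """Serving threads call current(); a single background poller calls maybe_swap()
    when the model registry advertises a newer version."""
    def __init__(self, model, version: int):
        self._lock = threading.Lock()
        self._model = model
        self._version = version

    def current(self):
        with self._lock:
            return self._model

    def maybe_swap(self, registry_version: int, load_fn) -> bool:
        if registry_version <= self._version:
            return False
        new_model = load_fn(registry_version)  # load fully BEFORE swapping: zero downtime
        with self._lock:
            self._model = new_model
            self._version = registry_version
        return True
```

Requests in flight keep the old model object; once the swap completes, new requests see the new artifact and the old one is garbage-collected.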

Key Trade-offs

  • Rule-based vs. pure ML recommendations: Rule-based programming (progressive overload, rest days) ensures safety and training science correctness; pure ML maximizes engagement but may learn patterns that feel good short-term but are poor long-term programming; the hybrid ensures safety guarantees while personalizing within those constraints.
  • Cold start strategy: New users have no feedback history; options include onboarding questionnaire (explicit preferences), population-based defaults (average user progression for the stated fitness level), or diversity sampling (explore different workout types to gather feedback quickly).
  • Daily vs. per-request model updates: Per-request model updates based on today's workout would give the most current recommendations but require online learning models (complex); daily batch retraining is simpler and sufficient since fitness adaptation is week-scale, not hour-scale.
  • Strict vs. flexible workout plans: Strict plans (same workout every Monday) are easier to program but low adherence; flexible adaptive plans that adjust to user schedule and energy are more complex but significantly improve long-term retention.
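The cold start options above can be combined: explore untried workout types with some probability, otherwise fall back to the population default for the stated goal. A sketch of that epsilon-greedy-style diversity sampling:

```python
import random

def cold_start_plan(goal: str, templates: list[dict], tried: list[str],
                    epsilon: float = 0.3, rng=random) -> dict:
    """With probability epsilon, pick a workout type the new user hasn't tried
    (to gather feedback quickly); otherwise use the population-based default."""
    pool = [t for t in templates if t["goal_type"] == goal]
    untried = [t for t in pool if t["template_id"] not in tried]
    if untried and rng.random() < epsilon:
        return rng.choice(untried)          # exploration branch
    return pool[0]                          # population default for this goal
```

As feedback accumulates, epsilon can be decayed toward zero and the personalized ranker takes over.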
