System Design: Performance Review System
System design of a performance review platform covering goal setting, 360-degree feedback collection, calibration workflows, and analytics for enterprise organizations.
Requirements
Functional Requirements:
- Employees set quarterly/annual goals with measurable key results (OKR framework)
- Managers and peers submit structured performance reviews during review cycles (semiannual or annual)
- 360-degree feedback: self-review, manager review, peer reviews (3-5 nominated peers), and optional upward reviews
- Calibration workflow where department leaders normalize ratings across teams to eliminate bias
- Historical performance data with trend visualization across review cycles
- Integration with compensation systems to link performance ratings to merit increases
Non-Functional Requirements:
- Support 5,000 enterprise customers with up to 500K employees each; total 50M employee profiles
- Handle review cycle surge: 80% of an organization's reviews submitted within a 2-week window
- 99.9% availability during review submission windows (business-critical deadlines)
- Strong consistency for review submissions and calibration decisions
- Data encryption and strict access controls (reviews visible only to authorized parties)
Scale Estimation
50M employee profiles across 5,000 enterprises. Review cycles: average 2 per year per employee = 100M review cycles/year. Each cycle produces a self-review, a manager review, and 4 peer reviews = 600M review documents/year. Submissions are concentrated: 300M documents per cycle, with 80% of them (~240M) landing in the 2-week submission window ≈ 200 submissions/sec sustained at peak (assuming customer windows largely overlap at quarter boundaries). Goal updates: 50M employees × 4 quarterly updates = 200M goal events/year. Each review document averages 2KB (structured scores + text feedback) → 1.2TB/year of review data. Calibration sessions: 100K departments × 2 sessions/year = 200K calibration events/year.
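A quick sanity check of the peak-rate arithmetic, in plain Python using only the figures above:

```python
# Back-of-envelope check of the scale estimates (no external dependencies).
employees = 50_000_000
cycles_per_year = 2
docs_per_cycle_per_employee = 6           # self + manager + 4 peers

docs_per_year = employees * cycles_per_year * docs_per_cycle_per_employee
docs_per_cycle = docs_per_year // cycles_per_year     # 300M per cycle
surge_docs = int(docs_per_cycle * 0.8)                # 240M in the 2-week window

window_seconds = 14 * 24 * 3600
print(docs_per_year)                       # 600,000,000 documents/year
print(surge_docs / window_seconds)         # ~198 submissions/sec sustained
print(docs_per_year * 2_000 / 1e12)        # ~1.2 TB/year at 2KB per document
```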
High-Level Architecture
The system follows an event-driven architecture built on a multi-tenant SaaS platform. The Review Cycle Engine is the orchestrator: HR administrators configure a review cycle (dates, review types, participant rules, rating scales) and launch it. The engine generates review tasks for all participants based on organizational hierarchy and peer nomination rules. Each task has a deadline and reminder schedule. The engine tracks completion percentages and sends escalation notifications to managers for incomplete reviews.
The Feedback Collection Service handles review form rendering and submission. Review forms are configurable per organization: structured rubrics (1-5 rating scales on competencies like "Technical Skills", "Leadership", "Communication") plus free-text sections ("Strengths", "Areas for Growth", "Key Accomplishments"). Form configurations are stored as JSON schemas, enabling each customer to customize without code changes. Submissions are validated against the schema, encrypted, and stored in PostgreSQL.
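As a sketch of how schema-driven validation could work, the snippet below uses Python's jsonschema package; the competency names, scale bounds, and required fields are illustrative stand-ins for a tenant's actual form configuration, not the product's real schema.

```python
from jsonschema import validate, ValidationError

# Per-tenant form configuration, stored as a JSON Schema (illustrative fields).
review_form_schema = {
    "type": "object",
    "properties": {
        "ratings": {
            "type": "object",
            "properties": {
                "technical_skills": {"type": "integer", "minimum": 1, "maximum": 5},
                "leadership":       {"type": "integer", "minimum": 1, "maximum": 5},
                "communication":    {"type": "integer", "minimum": 1, "maximum": 5},
            },
            "required": ["technical_skills", "leadership", "communication"],
        },
        "strengths":        {"type": "string", "maxLength": 5000},
        "areas_for_growth": {"type": "string", "maxLength": 5000},
    },
    "required": ["ratings", "strengths"],
}

def validate_submission(payload: dict) -> list[str]:
    """Validate a review submission against the tenant's form schema."""
    try:
        validate(instance=payload, schema=review_form_schema)
        return []
    except ValidationError as e:
        return [e.message]   # surfaced to the client as a form error
```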
The Calibration Service supports collaborative rating normalization. Department leaders view a scatter plot or 9-box grid of their team's ratings, discuss outliers, and adjust ratings to ensure consistency across managers. Calibration sessions use real-time collaboration (WebSocket-based) where multiple leaders can view and discuss changes simultaneously. The session state is persisted after consensus, and adjusted ratings flow back to individual reviews.
Core Components
Review Cycle Engine
The engine models each review cycle as a state machine with phases: Configuration → Nomination → Submission → Calibration → Release. Each phase has entry/exit criteria (e.g., Submission cannot start until Nomination is 90% complete). The engine runs as a scheduled job (every 5 minutes) that evaluates all active cycles, sends reminders for approaching deadlines, and transitions phases when criteria are met. Task generation uses the organizational hierarchy graph: for each employee, the engine creates tasks for self-review, manager-review, and peer-reviews (nominated peers approved by the manager). The engine handles edge cases: employee transfers mid-cycle, manager changes, leaves of absence.
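A condensed sketch of that state machine and the scheduled evaluation loop follows. Only the 90%-nomination rule comes from the design above; the launch and calibration thresholds are assumptions for illustration, and reminders, escalations, and edge-case handling are omitted.

```python
from enum import Enum

class Phase(Enum):
    CONFIGURATION = 1
    NOMINATION = 2
    SUBMISSION = 3
    CALIBRATION = 4
    RELEASE = 5

# Entry criteria per phase. Only the 90% nomination rule is from the design
# text; the other predicates are assumed placeholders.
ENTRY_CRITERIA = {
    Phase.NOMINATION:  lambda c: c.launched,                 # HR admin launches the cycle
    Phase.SUBMISSION:  lambda c: c.nomination_pct >= 0.90,   # rule from the design
    Phase.CALIBRATION: lambda c: c.submission_pct >= 0.80,   # assumed threshold
    Phase.RELEASE:     lambda c: c.calibration_done,
}

class ReviewCycle:
    def __init__(self) -> None:
        self.phase = Phase.CONFIGURATION
        self.launched = False
        self.nomination_pct = 0.0
        self.submission_pct = 0.0
        self.calibration_done = False

    def try_advance(self) -> bool:
        """Move to the next phase if its entry criteria are satisfied."""
        if self.phase is Phase.RELEASE:
            return False
        nxt = Phase(self.phase.value + 1)
        if ENTRY_CRITERIA.get(nxt, lambda c: True)(self):
            self.phase = nxt
            return True
        return False

def evaluate_active_cycles(cycles: list[ReviewCycle]) -> None:
    """Body of the 5-minute scheduled job (reminders/escalations omitted)."""
    for cycle in cycles:
        while cycle.try_advance():   # a cycle may clear several phases at once
            pass
```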
Calibration Workflow
Calibration is modeled as a hierarchical process: team-level calibration (manager + skip-level) → department-level (director + managers) → organization-level (VP + directors). At each level, leaders see aggregated ratings for their scope, identify outliers, and discuss adjustments. The system provides statistical aids: distribution charts showing rating spread vs expected distribution (e.g., forced ranking curves), comparisons across teams, and historical trends. Calibration changes are tracked with full audit trails (original rating, adjusted rating, adjusted_by, reason). A locking mechanism prevents further changes to calibrated ratings unless a senior leader explicitly unlocks them.
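The audit-and-lock rules might look something like the sketch below. The audit fields mirror those listed above (original rating, adjusted rating, adjusted_by, reason); the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CalibratedRating:
    employee_id: str
    rating: int
    locked: bool = False                               # set when calibration finalizes
    audit_log: list = field(default_factory=list)      # append-only trail

    def adjust(self, new_rating: int, adjusted_by: str, reason: str) -> None:
        """Apply a calibration adjustment, recording the full audit trail."""
        if self.locked:
            raise PermissionError("rating is locked; a senior leader must unlock it")
        self.audit_log.append({
            "original_rating": self.rating,
            "adjusted_rating": new_rating,
            "adjusted_by": adjusted_by,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.rating = new_rating
```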
Access Control & Privacy
Performance reviews contain highly sensitive data requiring fine-grained access controls. The system implements a hierarchical permission model: employees see their own reviews (after release), managers see their direct reports' reviews, skip-level managers see aggregated data for their organization, and HR administrators have full access for compliance purposes. Peer review content is anonymized by default (the reviewer's identity is hidden from the reviewee). All review data is encrypted at rest with per-tenant keys and in transit with TLS 1.3. Access to review data is logged in an immutable audit trail.
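A simplified sketch of the hierarchical permission check: the roles and rules follow the paragraph above, while the function signature and data shapes are assumptions.

```python
from enum import Enum, auto

class Role(Enum):
    EMPLOYEE = auto()
    MANAGER = auto()
    SKIP_LEVEL = auto()
    HR_ADMIN = auto()

def can_read_review(viewer_id: str, viewer_role: Role, reviewee_id: str,
                    direct_reports: set[str], released: bool) -> bool:
    """Hierarchical permission check, simplified from the rules above."""
    if viewer_role is Role.HR_ADMIN:
        return True                                   # full access for compliance
    if viewer_id == reviewee_id:
        return released                               # own reviews, only after release
    if viewer_role is Role.MANAGER and reviewee_id in direct_reports:
        return True                                   # direct reports' reviews
    return False   # skip-levels see aggregated data, never raw review documents
```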
Database Design
Core tables live in PostgreSQL with row-level security (RLS) for tenant isolation:
- review_cycles (cycle_id, tenant_id, name, start_date, end_date, phases JSONB, status)
- review_tasks (task_id, cycle_id, reviewer_id, reviewee_id, review_type ENUM(self, manager, peer, upward), status, deadline, submitted_at)
- review_responses (response_id, task_id, ratings JSONB, feedback_text_encrypted, submitted_at, calibrated_rating nullable, calibrated_by nullable)
- goals (goal_id, employee_id, tenant_id, title, description, key_results JSONB, status ENUM(draft, active, completed, deferred), quarter, year, progress_pct)
Indexes: (tenant_id, cycle_id, status) for dashboard queries, (reviewer_id, cycle_id) for "my pending reviews", (reviewee_id, cycle_id) for individual review summaries. The organizational hierarchy is stored as an adjacency list: org_relationships (employee_id, manager_id, effective_date, end_date) enabling point-in-time hierarchy reconstruction (important when a manager changes mid-cycle). Calibration session state uses a separate table with JSONB snapshots of the rating grid at each save point.
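For example, the point-in-time manager lookup against org_relationships could be written as below. This is a sketch using psycopg2; the column names follow the table definition above, while the connection handling and date format are assumptions.

```python
import psycopg2  # assumes a reachable PostgreSQL instance

# Reconstruct who managed an employee on a given date: the relationship row
# whose validity interval [effective_date, end_date) contains as_of.
POINT_IN_TIME_MANAGER = """
    SELECT manager_id
    FROM org_relationships
    WHERE employee_id = %(employee_id)s
      AND effective_date <= %(as_of)s
      AND (end_date IS NULL OR end_date > %(as_of)s)
"""

def manager_as_of(conn, employee_id: str, as_of: str) -> str | None:
    """Return the employee's manager on the given date, or None."""
    with conn.cursor() as cur:
        cur.execute(POINT_IN_TIME_MANAGER,
                    {"employee_id": employee_id, "as_of": as_of})
        row = cur.fetchone()
        return row[0] if row else None
```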
API Design
- POST /api/v1/cycles — Create a review cycle; body contains dates, review types, rating schema; returns cycle_id
- GET /api/v1/tasks?reviewer_id={id}&cycle_id={id}&status=pending — Fetch pending review tasks for a reviewer
- POST /api/v1/tasks/{task_id}/submit — Submit a review; body contains ratings JSONB and feedback text; returns response_id
- PUT /api/v1/calibration/{cycle_id}/adjustments — Submit calibration rating adjustments; body contains [{employee_id, adjusted_rating, reason}]
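A hypothetical client call for the submission endpoint: the host, auth scheme, task id, and payload field names are illustrative; only the route and the ratings-plus-feedback shape come from the API above.

```python
import requests

resp = requests.post(
    "https://api.example.com/api/v1/tasks/task_123/submit",  # hypothetical host/id
    headers={"Authorization": "Bearer <token>"},
    json={
        "ratings": {"technical_skills": 4, "leadership": 3, "communication": 5},
        "feedback": {
            "strengths": "Drove the Q3 migration to completion.",
            "areas_for_growth": "Delegate more during incident response.",
        },
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["response_id"])   # returned per the API contract above
```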
Scaling & Bottlenecks
The review submission surge is the primary scaling challenge. During a 2-week window, 80% of an organization's employees submit reviews, creating a 10x traffic spike compared to baseline. The system pre-provisions database connections and application server capacity 1 week before known cycle deadlines. Write-heavy workload during submission is handled by batching review submissions in an in-memory queue and flushing to PostgreSQL in bulk every 500ms, reducing database round-trips by 50x. Read replicas serve dashboard queries (completion rates, aggregate statistics) to offload the primary.
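A minimal sketch of the 500ms micro-batching write path, assuming an execute_values()-style bulk insert behind bulk_insert; the names and the in-process queue are illustrative.

```python
import queue
import threading
import time

submission_queue: "queue.Queue[dict]" = queue.Queue()

def flusher(bulk_insert, interval: float = 0.5) -> None:
    """Collect submissions for `interval` seconds, then write them in one batch."""
    while True:
        deadline = time.monotonic() + interval
        batch: list[dict] = []
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(submission_queue.get(timeout=remaining))
            except queue.Empty:
                break
        if batch:
            bulk_insert(batch)   # one DB round-trip instead of len(batch)

# `print` stands in for the real bulk INSERT for demonstration purposes.
threading.Thread(target=flusher, args=(print,), daemon=True).start()
```

One caveat worth noting with this design: a purely in-memory buffer loses unflushed submissions on a crash, so requests should only be acknowledged after their batch commits.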
Calibration sessions with 50+ participants create WebSocket fan-out challenges. Each change to a rating must be broadcast to all session participants in real-time. The system uses a Redis Pub/Sub channel per calibration session; each WebSocket server subscribes to relevant channels. For large sessions, rate limiting UI updates to 2 per second (batching intermediate changes) prevents visual noise and reduces WebSocket traffic.
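A sketch of the per-session fan-out with coalescing, using redis-py; the channel naming and message shape are assumptions, and a production relay would also flush pending changes on a timer rather than only on arrival.

```python
import json
import time
import redis  # assumes a reachable Redis instance

r = redis.Redis()

def publish_adjustment(session_id: str, employee_id: str, rating: int) -> None:
    """Broadcast one rating change to every WebSocket server in the session."""
    r.publish(f"calibration:{session_id}",
              json.dumps({"employee_id": employee_id, "rating": rating}))

def relay(session_id: str, send_to_clients, max_rate_hz: float = 2.0) -> None:
    """Subscriber loop on a WebSocket server: coalesce bursts to ~2 updates/sec."""
    sub = r.pubsub()
    sub.subscribe(f"calibration:{session_id}")
    pending: dict[str, dict] = {}
    last_sent = 0.0
    for msg in sub.listen():
        if msg["type"] != "message":
            continue
        change = json.loads(msg["data"])
        pending[change["employee_id"]] = change      # keep only the latest per employee
        now = time.monotonic()
        if now - last_sent >= 1.0 / max_rate_hz:
            send_to_clients(list(pending.values()))  # one batched UI update
            pending.clear()
            last_sent = now
```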
Key Trade-offs
- Configurable review forms (JSON schema) vs fixed templates: JSON schemas allow each enterprise to customize rating scales, competencies, and feedback sections without code changes, but increase UI rendering complexity and make cross-customer analytics harder
- Hierarchical calibration vs flat calibration: Multiple calibration levels (team → department → org) produce fairer ratings but extend the review cycle timeline by 2-3 weeks — most enterprises accept this trade-off for rating consistency
- Anonymized peer feedback vs attributed: Anonymity encourages honest feedback but prevents follow-up conversations — the system lets each organization configure their policy, with anonymity as the default
- Surge-provisioned infrastructure vs always-on capacity: Pre-provisioning for review cycles saves 60% on infrastructure costs during the 48 non-surge weeks but requires capacity planning coordination with customer success teams