System Design: Employee Directory

Requirements

Functional Requirements:

Search employees by name, title, department, skills, location, and manager
View organizational hierarchy as an interactive org chart (tree visualization)
Employee profiles with contact info, reporting chain, team members, office location, and expertise tags
Self-service profile editing for fields like bio, skills, profile photo, and preferred pronouns
Integration with HRIS (Workday, BambooHR), Active Directory/Okta for SSO, and Slack for presence status
Team pages grouping employees by department/project with aggregate views

Non-Functional Requirements:

Support organizations up to 500K employees with 10-level deep hierarchies
Search results returned in under 200ms with typo tolerance and fuzzy matching
99.9% availability; the directory is accessed throughout the workday
Near-real-time sync with HRIS: employee changes reflected within 5 minutes
GDPR-compliant: employees can control visibility of personal information fields

Scale Estimation

Users: 500K employees in the largest organization. DAU: 60% of employees = 300K/day.
Searches: average 3 per active user per day = 900K searches/day = 10.4 searches/sec.
Profile views: 5 per user per day = 1.5M/day = 17.4/sec.
Profile updates: 1% of employees update profiles daily = 5K updates/day.
HRIS sync: batch of 500K records reconciled every 6 hours; delta sync of changed records every 5 minutes.
Org chart rendering: 50K renders/day with average scope of 200 nodes.
Data size: 500K employees × 5KB profile = 2.5GB (easily fits in memory).
Search index: 500K documents × 2KB = 1GB.
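The estimates above can be reproduced with a few lines of arithmetic. This is only a sanity-check sketch: the variable names are illustrative, rates are averaged over a full 24-hour day, and KB/GB are treated as decimal units to match the figures in the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

employees = 500_000
dau = employees * 60 // 100                  # 60% of employees = 300,000

searches_per_day = dau * 3                   # 900,000
profile_views_per_day = dau * 5              # 1,500,000
profile_updates_per_day = employees // 100   # 1% of employees = 5,000

searches_per_sec = searches_per_day / SECONDS_PER_DAY
views_per_sec = profile_views_per_day / SECONDS_PER_DAY

# Decimal units: 1 GB = 1,000,000 KB.
profile_data_gb = employees * 5 / 1_000_000  # 5 KB per profile
search_index_gb = employees * 2 / 1_000_000  # 2 KB per indexed document

print(f"searches/sec: {searches_per_sec:.1f}")
print(f"profile views/sec: {views_per_sec:.1f}")
print(f"profile data: {profile_data_gb} GB, search index: {search_index_gb} GB")
```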

High-Level Architecture

The directory uses a read-optimized architecture: by the estimates above, searches and profile views (about 2.4M/day) outnumber self-service profile updates (5K/day) by roughly 500:1. The Search Layer uses Elasticsearch with a single index containing all employee profiles. The index is optimized for the directory use case: name fields use edge-ngram tokenization for typeahead, title and department use keyword + text multi-fields, skills use a nested object for faceted filtering, and location uses geo_point for proximity search. Typo tolerance uses Elasticsearch's fuzzy matching with the fuzziness parameter set to 2, the maximum edit distance it allows. The search API supports compound queries: "engineering manager in Seattle with Python experience" is parsed into structured filters plus a full-text query.
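One way the index described above might be defined is sketched below as a Python dict in the shape of an Elasticsearch create-index body. The analyzer and filter names (name_typeahead, name_edge_ngram), the gram sizes, and the shape of the skills objects are assumptions for illustration, not taken from the design. The sketch also uses an edge_ngram token filter after the standard tokenizer, which is one of several ways to get edge-ngram behavior.

```python
import json


def text_with_keyword() -> dict:
    """Text field with a keyword sub-field for exact filters and aggregations."""
    return {"type": "text", "fields": {"keyword": {"type": "keyword"}}}


def build_employee_index_body() -> dict:
    """Return settings and mappings for a hypothetical employee index."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Assumed gram sizes; tune against real name lengths.
                    "name_edge_ngram": {"type": "edge_ngram", "min_gram": 1, "max_gram": 20}
                },
                "analyzer": {
                    "name_typeahead": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "name_edge_ngram"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "tenant_id": {"type": "keyword"},
                "name": {
                    "type": "text",
                    "analyzer": "name_typeahead",   # index-time prefixes
                    "search_analyzer": "standard",  # do not n-gram the user's input
                    "fields": {"keyword": {"type": "keyword"}},
                },
                "title": text_with_keyword(),
                "department": text_with_keyword(),
                # Assumes each skill is stored as an object such as {"name": "python"}.
                "skills": {"type": "nested", "properties": {"name": {"type": "keyword"}}},
                "location": {"type": "geo_point"},
            }
        },
    }


print(json.dumps(build_employee_index_body()["mappings"], indent=2))
```

Using a separate search_analyzer keeps the query text from being expanded into prefixes, so "jo" matches the indexed prefixes of "john" without every prefix of the query matching everything.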

The Profile Service manages employee data in PostgreSQL as the source of truth. Profiles combine HRIS-sourced fields (name, title, department, manager, hire date — read-only for employees) and self-service fields (bio, skills, profile photo, social links — editable by the employee). A sync engine runs every 5 minutes, polling the HRIS API for changed records (using a modified_since cursor) and updating PostgreSQL. Changes trigger CDC events to Kafka, which are consumed by the Elasticsearch indexer and the cache invalidation service.
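The cursor-based delta sync can be sketched in memory as follows. Everything here is a stand-in: the dict plays the role of PostgreSQL, the list plays the role of the Kafka topic, fake_hris replaces a real HRIS API call, and integer timestamps replace real modification times. In the design above the events come from CDC on the database; the sketch appends them directly to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DeltaSync:
    store: dict = field(default_factory=dict)   # stands in for PostgreSQL
    events: list = field(default_factory=list)  # stands in for the Kafka topic
    cursor: int = 0                             # highest modified_at applied so far

    def run_once(self, fetch_modified_since: Callable[[int], list]) -> int:
        """Apply every record changed since the cursor; return how many were applied."""
        changed = fetch_modified_since(self.cursor)
        for record in sorted(changed, key=lambda r: r["modified_at"]):
            self.store[record["hris_id"]] = record
            self.events.append({"type": "employee_changed", "hris_id": record["hris_id"]})
            # Advance only after the record is applied, so a crash re-fetches it.
            self.cursor = max(self.cursor, record["modified_at"])
        return len(changed)


# Hypothetical HRIS data for the demo.
HRIS_RECORDS = [
    {"hris_id": "E1", "title": "Engineer", "modified_at": 100},
    {"hris_id": "E2", "title": "Manager", "modified_at": 160},
]


def fake_hris(since: int) -> list:
    """Return records modified strictly after the cursor."""
    return [r for r in HRIS_RECORDS if r["modified_at"] > since]


sync = DeltaSync()
print(sync.run_once(fake_hris), "records applied on first run")
print(sync.run_once(fake_hris), "records applied on second run")
```

A strictly-greater comparison can skip records that share a timestamp with the cursor, which is one reason a periodic full reconciliation is still useful.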

The Org Chart Service builds and serves hierarchical views. The full org tree is pre-computed nightly from the manager relationship graph and stored as a materialized JSON structure in Redis. When a user views the org chart, the service returns the pre-computed subtree rooted at the requested employee, pruned to the requested depth (typically 3 levels down, full path to root up). Interactive expansion (clicking to load more children) fetches additional levels on demand.
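The view described above (a depth-limited subtree below the requested employee plus the full path up to the root) can be sketched with plain dictionaries. The function names and the sample organization are hypothetical, and the sketch works on an in-memory manager map rather than the Redis structure.

```python
from collections import defaultdict


def build_children_index(manager_of: dict) -> dict:
    """Invert an employee -> manager map into manager -> [direct reports]."""
    children = defaultdict(list)
    for employee, manager in manager_of.items():
        if manager is not None:
            children[manager].append(employee)
    return children


def subtree(root: str, children: dict, depth: int) -> dict:
    """Nested node for root, including at most `depth` levels of reports."""
    node = {"id": root, "children": []}
    if depth > 0:
        node["children"] = [
            subtree(child, children, depth - 1)
            for child in sorted(children.get(root, []))
        ]
    return node


def path_to_root(employee: str, manager_of: dict) -> list:
    """Managers from the direct manager up to the root; stops if a cycle appears."""
    path, seen = [], {employee}
    current = manager_of.get(employee)
    while current is not None and current not in seen:
        path.append(current)
        seen.add(current)
        current = manager_of.get(current)
    return path


def org_chart_view(employee: str, manager_of: dict, depth: int = 3) -> dict:
    children = build_children_index(manager_of)
    return {
        "ancestors": path_to_root(employee, manager_of),
        "tree": subtree(employee, children, depth),
    }


# Hypothetical six-person organization.
MANAGER_OF = {
    "ceo": None, "vp1": "ceo", "vp2": "ceo",
    "dir1": "vp1", "mgr1": "dir1", "eng1": "mgr1",
}

print(org_chart_view("vp1", MANAGER_OF, depth=1))
```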

Core Components

People Search

The search experience prioritizes speed and relevance. As the user types, a debounced request (150ms delay) hits the search API, which executes a multi_match query across name (boosted 3x), title (boosted 2x), department, skills, and location fields. Results are sorted by a combination of text relevance and organizational proximity (employees in the searcher's department are boosted). For exact name searches, a prefix query on the keyword field provides instant results. The search also supports structured queries via recognized patterns: "manager:John" filters by manager name, "location:NYC" filters by office. Search analytics track popular queries and zero-result queries to improve synonym configuration.
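A minimal sketch of how the query text might be split into structured filters and a boosted multi_match body follows. The field names manager_name, office, skills_text, and location_name are hypothetical: they assume the index carries plain text copies of those values for full-text matching. The department boost of 2.0 is likewise an illustrative value.

```python
import re

FILTER_PATTERN = re.compile(r"\b(\w+):(\S+)")
# Recognized prefixes mapped to hypothetical index fields.
FILTER_FIELDS = {"manager": "manager_name", "location": "office"}
SEARCH_FIELDS = ["name^3", "title^2", "department", "skills_text", "location_name"]


def parse_query(raw: str):
    """Split 'manager:John python' into free text and a dict of field filters."""
    filters = {}

    def take(match):
        key, value = match.group(1).lower(), match.group(2)
        if key in FILTER_FIELDS:
            filters[FILTER_FIELDS[key]] = value
            return ""            # remove the recognized pattern from the text
        return match.group(0)    # leave unknown prefixes as ordinary text

    text = FILTER_PATTERN.sub(take, raw)
    return " ".join(text.split()), filters


def build_search_body(raw: str, searcher_department: str, limit: int = 20) -> dict:
    """Build an Elasticsearch-style request body for a directory search."""
    text, filters = parse_query(raw)
    if text:
        must = [{"multi_match": {"query": text, "fields": SEARCH_FIELDS, "fuzziness": 2}}]
    else:
        must = [{"match_all": {}}]
    return {
        "size": limit,
        "query": {
            "bool": {
                "must": must,
                "filter": [{"match": {name: value}} for name, value in filters.items()],
                # Organizational proximity: favor the searcher's own department.
                "should": [
                    {"term": {"department.keyword": {"value": searcher_department, "boost": 2.0}}}
                ],
            }
        },
    }


print(parse_query("manager:John python location:NYC"))
```

Filters sit in the bool filter clause so they narrow the result set without affecting relevance scores, while the should clause only nudges ranking.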

HRIS Integration Engine

The integration engine pulls employee data from multiple HRIS providers into the directory. An adapter pattern provides a unified interface: each HRIS (Workday, BambooHR, SAP SuccessFactors) has a specific adapter that handles authentication (OAuth2 or API key), pagination, field mapping, and error handling. The sync process: (1) fetch records changed since the last sync cursor, (2) map HRIS fields to the internal schema using a configurable field mapping (JSON), (3) resolve conflicts between an HRIS change and an employee self-service edit on the same field (HRIS wins for managed fields, the employee wins for self-service fields), (4) apply changes to PostgreSQL and emit CDC events. A reconciliation job runs every 6 hours, doing a full comparison to catch any changes the delta syncs missed.
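Steps (2) and (3) of the sync process can be sketched as two small functions. The external field names in EXAMPLE_FIELD_MAP are invented for illustration and do not correspond to any real provider's API. The rule that an HRIS value may seed an empty self-service field is also an assumption.

```python
MANAGED_FIELDS = {"name", "title", "department", "manager_id", "hire_date"}
SELF_SERVICE_FIELDS = {"bio", "skills", "profile_photo", "social_links"}

# Hypothetical mapping for one provider: external field -> internal field.
EXAMPLE_FIELD_MAP = {
    "full_name": "name",
    "job_title": "title",
    "dept": "department",
    "about_me": "bio",
}


def map_fields(raw: dict, field_map: dict) -> dict:
    """Translate a raw HRIS record into the internal schema, dropping unmapped keys."""
    return {
        internal: raw[external]
        for external, internal in field_map.items()
        if external in raw
    }


def merge_record(current: dict, incoming: dict) -> dict:
    """Apply an HRIS update on top of the stored profile.

    Managed fields always take the HRIS value. Self-service fields keep the
    employee's value; the HRIS value is used only when the field is empty
    (an assumption of this sketch).
    """
    merged = dict(current)
    for field_name, value in incoming.items():
        if field_name in MANAGED_FIELDS:
            merged[field_name] = value
        elif field_name in SELF_SERVICE_FIELDS and not current.get(field_name):
            merged[field_name] = value
    return merged


raw = {"full_name": "Ada Li", "job_title": "Staff Engineer", "about_me": "From HRIS"}
stored = {"name": "Ada Li", "title": "Engineer", "bio": "Written by Ada"}
print(merge_record(stored, map_fields(raw, EXAMPLE_FIELD_MAP)))
```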

Org Chart Rendering

The org chart is rendered client-side using D3.js with a collapsible tree layout. The backend serves hierarchical JSON: each node contains an employee summary (name, title, photo_url, direct_report_count) and a children array. For performance, only 3 levels of children are loaded initially; deeper levels are fetched via API calls on node expansion. The reporting structure for a 500K-person organization is stored as an adjacency list in PostgreSQL (employee_id, manager_id), and the full tree is materialized as nested JSON in Redis. Subtrees for common starting points (CEO, VP-level) are served from the Redis cache; other roots are extracted on demand with a recursive CTE in PostgreSQL.
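The on-demand path can be illustrated with a depth-limited recursive CTE over the adjacency list. To keep the sketch runnable without a database server, it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the WITH RECURSIVE form is standard SQL, though parameter placeholders differ between drivers. The two-column table and the sample rows are simplifications.

```python
import sqlite3

SUBTREE_SQL = """
WITH RECURSIVE subtree(employee_id, manager_id, depth) AS (
    SELECT employee_id, manager_id, 0
    FROM employees
    WHERE employee_id = :root
    UNION ALL
    SELECT e.employee_id, e.manager_id, s.depth + 1
    FROM employees e
    JOIN subtree s ON e.manager_id = s.employee_id
    WHERE s.depth < :max_depth
)
SELECT employee_id, manager_id, depth
FROM subtree
ORDER BY depth, employee_id
"""


def fetch_subtree(conn: sqlite3.Connection, root: str, max_depth: int = 3):
    """Return nested {"employee_id", "children"} JSON-style data, or None if root is unknown."""
    rows = conn.execute(SUBTREE_SQL, {"root": root, "max_depth": max_depth}).fetchall()
    nodes, tree = {}, None
    for employee_id, manager_id, depth in rows:
        node = {"employee_id": employee_id, "children": []}
        nodes[employee_id] = node
        if depth == 0:
            tree = node
        else:
            # Rows are ordered by depth, so the parent node already exists.
            nodes[manager_id]["children"].append(node)
    return tree


def demo_connection() -> sqlite3.Connection:
    """In-memory adjacency list with a hypothetical five-person organization."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (employee_id TEXT PRIMARY KEY, manager_id TEXT)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?)",
        [("ceo", None), ("vp1", "ceo"), ("vp2", "ceo"), ("mgr1", "vp1"), ("eng1", "mgr1")],
    )
    return conn


print(fetch_subtree(demo_connection(), "ceo", max_depth=2))
```

The depth bound also caps the recursion if bad data ever introduces a reporting cycle.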

Database Design

PostgreSQL schema:
employees (employee_id UUID PK, tenant_id, hris_id, name, email, title, department_id, manager_id FK→employees, hire_date, location_id, office, bio TEXT, skills ARRAY, profile_photo_s3_path, pronouns, social_links JSONB, visibility_settings JSONB, updated_at, hris_synced_at). The manager_id self-referential FK enables hierarchy traversal.
departments (dept_id, tenant_id, name, parent_dept_id, head_employee_id).
locations (location_id, name, address, city, country, latitude, longitude, timezone).

Indexes: (tenant_id, name) for direct lookups, (tenant_id, manager_id) for org chart traversal, (tenant_id, department_id) for team views. The visibility_settings JSONB field controls per-field visibility: {"phone": "team_only", "personal_email": "hidden", "bio": "public"} — the API layer filters response fields based on the requester's relationship to the profile owner. A materialized view aggregates team statistics (headcount by department, location distribution) refreshed every hour.
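The per-field filtering in the API layer might look like the sketch below. The three visibility levels come from the example settings above; the relationship rules (same manager or direct reporting line counts as "team"), the default of "public" for unlisted fields, and hiding fields with an unrecognized level are assumptions of this sketch.

```python
# Which requester relationships may see a field at each visibility level.
ALLOWED_AUDIENCES = {
    "public": {"self", "team", "org"},
    "team_only": {"self", "team"},
    "hidden": {"self"},
}


def relationship(requester: dict, owner: dict) -> str:
    """Classify the requester as 'self', 'team', or 'org' relative to the owner."""
    if requester["employee_id"] == owner["employee_id"]:
        return "self"
    requester_manager = requester.get("manager_id")
    owner_manager = owner.get("manager_id")
    same_manager = requester_manager is not None and requester_manager == owner_manager
    reporting_line = (
        requester_manager == owner["employee_id"]
        or owner_manager == requester["employee_id"]
    )
    return "team" if same_manager or reporting_line else "org"


def filter_profile(owner: dict, requester: dict) -> dict:
    """Return only the profile fields the requester is allowed to see."""
    rel = relationship(requester, owner)
    settings = owner.get("visibility_settings", {})
    visible = {}
    for field_name, value in owner.items():
        if field_name == "visibility_settings":
            continue
        level = settings.get(field_name, "public")
        # Unknown levels fall back to owner-only, so mistakes fail closed.
        if rel in ALLOWED_AUDIENCES.get(level, {"self"}):
            visible[field_name] = value
    return visible


owner = {
    "employee_id": "e1", "manager_id": "m1", "name": "Ada",
    "phone": "555-0100", "personal_email": "ada@example.com", "bio": "Hi",
    "visibility_settings": {"phone": "team_only", "personal_email": "hidden", "bio": "public"},
}
print(filter_profile(owner, {"employee_id": "e9", "manager_id": "m7"}))
```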

API Design

GET /api/v1/search?q={query}&department={dept}&location={loc}&limit=20 — Search employees with optional filters
GET /api/v1/employees/{employee_id}/profile — Fetch employee profile (fields filtered by visibility settings and requester relationship)
GET /api/v1/employees/{employee_id}/org-chart?depth=3 — Fetch org chart subtree rooted at employee
PUT /api/v1/employees/{employee_id}/profile — Update self-service profile fields; body contains editable fields
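As a small illustration of how a client might address the read endpoints above, the sketch below builds request URLs with the standard library. The host name is a placeholder and the helper names are hypothetical; authentication and the HTTP call itself are omitted.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://directory.example.com"  # placeholder host, not from the design


def search_url(query: str, department: str = None, location: str = None, limit: int = 20) -> str:
    """URL for GET /api/v1/search with optional department and location filters."""
    params = {"q": query}
    if department:
        params["department"] = department
    if location:
        params["location"] = location
    params["limit"] = limit
    return f"{BASE_URL}/api/v1/search?{urlencode(params)}"


def org_chart_url(employee_id: str, depth: int = 3) -> str:
    """URL for GET /api/v1/employees/{employee_id}/org-chart."""
    safe_id = quote(employee_id, safe="")
    return f"{BASE_URL}/api/v1/employees/{safe_id}/org-chart?{urlencode({'depth': depth})}"


print(search_url("jane doe", location="NYC"))
print(org_chart_url("3f2b9c1e-0000-4000-8000-000000000001"))
```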

Scaling & Bottlenecks

Search performance is the primary concern. With 500K documents and 10 searches/sec, a single Elasticsearch node handles the load comfortably. The challenge is search relevance, not throughput. Continuous A/B testing of search ranking parameters (field boosts, fuzzy distance, proximity scoring) optimizes result quality. For multi-tenant deployments (SaaS serving many organizations), tenant-level indexes provide data isolation; smaller tenants share an index with tenant_id filtering.
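The tenant isolation rule could be expressed as a small routing function that picks the index and any mandatory filter for a request. The 50,000-employee cutoff and the index naming scheme are assumptions made for this sketch; the design does not specify either.

```python
DEDICATED_INDEX_THRESHOLD = 50_000  # hypothetical cutoff for a dedicated index
SHARED_INDEX = "employees-shared"   # hypothetical shared index name


def index_target(tenant_id: str, employee_count: int) -> dict:
    """Return the index to query and the filters that must be added to every search."""
    if employee_count >= DEDICATED_INDEX_THRESHOLD:
        # Large tenants get their own index, so no tenant filter is needed.
        return {"index": f"employees-{tenant_id}", "filter": []}
    # Small tenants share an index and are separated by a mandatory term filter.
    return {"index": SHARED_INDEX, "filter": [{"term": {"tenant_id": tenant_id}}]}


print(index_target("acme", 500_000))
print(index_target("smallco", 120))
```

Keeping this decision in one place makes it harder for a query path to forget the tenant_id filter on the shared index.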

HRIS sync reliability is the operational bottleneck. HRIS APIs have rate limits, occasional downtime, and inconsistent data formats. The adapter layer implements exponential backoff retry (up to 5 retries), idempotent upserts (using hris_id as the deduplication key), and a dead-letter queue for records that fail validation (e.g., missing required fields). A monitoring dashboard tracks sync freshness per tenant and alerts when the last successful sync is more than 30 minutes old.
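The three reliability mechanisms can be sketched together. The exception type, the one-second base delay, the list of required fields, and the in-memory store and dead-letter list are all illustrative assumptions; a real adapter would also add jitter and honor provider-specific rate-limit headers.

```python
import time


class TransientHrisError(Exception):
    """Rate limit or outage; the call is worth retrying."""


def call_with_backoff(call, max_retries: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Run call(); on transient failure wait 1s, 2s, 4s, ... and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TransientHrisError:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller alert
            sleep(base_delay * 2 ** attempt)


REQUIRED_FIELDS = ("hris_id", "name", "email")  # assumed validation rule


def upsert_batch(records: list, store: dict, dead_letters: list) -> None:
    """Idempotent upsert keyed by hris_id; invalid records go to the dead-letter list."""
    for record in records:
        missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
        if missing:
            dead_letters.append({"record": record, "missing": missing})
            continue
        store[record["hris_id"]] = record  # same key overwrites, so replays are safe


class FlakyCall:
    """Demo callable that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int):
        self.remaining = failures

    def __call__(self):
        if self.remaining > 0:
            self.remaining -= 1
            raise TransientHrisError("rate limited")
        return "ok"


delays = []
print(call_with_backoff(FlakyCall(2), sleep=delays.append), delays)
```

Passing the sleep function as a parameter keeps the retry logic testable without real waiting.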

Key Trade-offs

Elasticsearch over PostgreSQL full-text search: Elasticsearch provides superior fuzzy matching, typeahead, and relevance tuning, but adds infrastructure complexity — justified for the directory's search-heavy access pattern
Pre-computed org chart in Redis vs on-demand recursive queries: Pre-computation provides sub-50ms response times for any subtree but requires a nightly rebuild job and accepts up to 24 hours of staleness — acceptable since org changes are infrequent
HRIS as source of truth for managed fields: Employee self-service edits to HRIS-managed fields (e.g., title) are overwritten on the next sync — this ensures consistency with HR records but may frustrate employees who want immediate updates
Visibility controls per field vs all-or-nothing profiles: Per-field visibility respects employee privacy preferences but adds complexity to every API response — the privacy benefit outweighs the engineering cost