System Design: Job Board (LinkedIn Jobs-scale)
System design of a LinkedIn Jobs-scale job board covering job posting, candidate matching with ML, application tracking, and search infrastructure serving millions of job seekers and recruiters.
Requirements
Functional Requirements:
- Employers create job postings with structured fields (title, description, requirements, salary range, location, remote policy)
- Job seekers search and filter jobs by keyword, location, salary, experience level, and job type
- ML-powered job recommendations personalized to each candidate's profile and application history
- One-click apply using a stored profile/resume; external apply redirect for ATS-hosted applications
- Job alerts: automated notifications when new jobs match saved search criteria
- Recruiter dashboard with applicant pipeline views and messaging
Non-Functional Requirements:
- 20 million active job postings, 200 million registered users, 40 million DAU job seekers
- Search results returned in under 500ms (p95) with relevance ranking
- 99.95% availability for search and apply flows
- Near-real-time indexing: new job postings searchable within 60 seconds
- Handle seasonal spikes (January, September hiring seasons) with 2x baseline traffic
Scale Estimation
40M DAU performing an average of 5 searches/day = 200M searches/day = 2,315 searches/sec average, 6,000/sec peak. 500K new job postings/day = 5.8/sec. Applications submitted: 10M/day = 116/sec. Job alert matching: 100M saved alerts checked against 500K new postings/day requires evaluating 50 trillion alert-job pairs — requiring smart indexing to avoid brute-force matching. Stored profiles/resumes: 200M × 500KB average = 100TB. Search index: 20M active postings × 10KB average = 200GB (fits comfortably in Elasticsearch memory).
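As a sanity check, the numbers above reduce to a few lines of arithmetic (Python is used for all sketches in this writeup):

```python
# Back-of-envelope numbers from the estimates above (pure arithmetic, no I/O).
SECONDS_PER_DAY = 86_400

searches_per_day = 40_000_000 * 5             # 40M DAU x 5 searches/day = 200M
print(searches_per_day / SECONDS_PER_DAY)     # ~2,315 searches/sec average

postings_per_day = 500_000
print(postings_per_day / SECONDS_PER_DAY)     # ~5.8 new postings/sec

applications_per_day = 10_000_000
print(applications_per_day / SECONDS_PER_DAY) # ~116 applications/sec

alert_job_pairs = 100_000_000 * postings_per_day  # 5e13 = 50 trillion pairs/day
resume_storage_tb = 200_000_000 * 500 / 1e9       # 200M x 500KB = 100 TB
index_size_gb = 20_000_000 * 10 / 1e6             # 20M x 10KB = 200 GB
print(alert_job_pairs, resume_storage_tb, index_size_gb)
```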
High-Level Architecture
The platform uses a search-centric architecture with three major subsystems. The Ingestion Layer processes job postings from employers and partner ATS systems. New postings arrive via the Job API or bulk feed ingestion (XML/JSON feeds from ATS platforms like Workday, Greenhouse, and Lever). Each posting is normalized to a canonical schema, enriched with derived fields (standardized job title via an NLP classifier, geo-coordinates from location text, salary range estimation when not provided), and indexed in Elasticsearch. A CDC pipeline from PostgreSQL (source of truth for job data) to Elasticsearch ensures the search index reflects the latest state within 1 second.
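A minimal sketch of the normalization step, with hypothetical stand-ins (`classify_title`, `geocode`, `estimate_salary`) for the NLP title classifier, geocoder, and salary-estimation model described above:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

def classify_title(title: str) -> str:
    # Stand-in for the NLP classifier that maps raw titles to a standard taxonomy.
    return {"swe ii": "Software Engineer"}.get(title.lower(), title)

def geocode(location_text: str) -> Tuple[Optional[float], Optional[float]]:
    # Stand-in for a geocoding service: location text -> (lat, lon).
    return (47.61, -122.33) if location_text else (None, None)

def estimate_salary(title: str) -> Tuple[int, int]:
    # Stand-in for the salary-range estimation model, used when a feed omits salary.
    return (90_000, 140_000)

@dataclass
class CanonicalJob:
    title: str
    title_normalized: str
    description: str
    latitude: Optional[float]
    longitude: Optional[float]
    salary_min: int
    salary_max: int

def normalize(raw: dict) -> CanonicalJob:
    """Map one raw feed record (Workday/Greenhouse/Lever shapes vary) to the canonical schema."""
    title = raw["title"].strip()
    lat, lon = geocode(raw.get("location_text", ""))
    salary = raw.get("salary_range") or estimate_salary(title)  # enrich when absent
    return CanonicalJob(
        title=title,
        title_normalized=classify_title(title),
        description=raw.get("description", ""),
        latitude=lat,
        longitude=lon,
        salary_min=salary[0],
        salary_max=salary[1],
    )
```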
The Search & Ranking Layer handles queries from job seekers. A query hits the Search API, which constructs an Elasticsearch query combining full-text matching (BM25 on title and description), structured filters (location radius, salary range, experience level), and personalized re-ranking. The re-ranking model (a LambdaMART learning-to-rank model) scores results using features: text relevance, job-seeker profile match (skills overlap, experience level fit, location proximity), posting freshness, employer quality score, and historical click-through rate for similar queries. The model is trained weekly on click and apply logs.
The Matching & Recommendations Layer proactively surfaces relevant jobs to candidates. A nightly batch pipeline computes candidate-job embeddings: candidate profiles are encoded into 256-dimensional vectors using a BERT-based model fine-tuned on job-resume pairs. Job postings are similarly encoded. Approximate nearest neighbor search (FAISS with HNSW index) retrieves the top 100 matching jobs per candidate. These recommendations are stored in Redis and served via the Feed API. Real-time recommendation updates are triggered when a candidate updates their profile or applies to a job.
Core Components
Search Infrastructure
Elasticsearch powers the search backend: a 30-node cluster holding the 20M active job documents across 10 primary shards. The index mapping uses multi-field types: job_title is indexed as both a text field (for BM25 full-text search) and a keyword field (for exact matching and aggregations). Location search uses geo_point fields with distance-based filtering. Synonym expansion (e.g., "SWE" → "Software Engineer") is applied at query time using a curated synonym dictionary. Autocomplete uses an edge-ngram analyzer on the job_title field, providing instant suggestions as the user types. Search results are cached in Redis (query hash → result IDs) with a 5-minute TTL; the hit rate is only about 40% because of the long-tail nature of search queries.
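An illustrative slice of what that mapping could look like; field and analyzer names are assumptions:

```python
JOBS_MAPPING = {
    "settings": {
        "analysis": {
            "analyzer": {
                "autocomplete": {  # edge-ngram analyzer for as-you-type suggestions
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "title_edge_ngram"],
                }
            },
            "filter": {
                "title_edge_ngram": {"type": "edge_ngram", "min_gram": 2, "max_gram": 15}
            },
        }
    },
    "mappings": {
        "properties": {
            "job_title": {
                "type": "text",  # BM25 full-text search
                "fields": {
                    "raw": {"type": "keyword"},  # exact matching and aggregations
                    "suggest": {
                        "type": "text",
                        "analyzer": "autocomplete",      # ngrams at index time...
                        "search_analyzer": "standard",   # ...whole terms at query time
                    },
                },
            },
            "location": {"type": "geo_point"},  # distance-based filtering
            "salary_min": {"type": "integer"},
            "salary_max": {"type": "integer"},
            "status": {"type": "keyword"},
        }
    },
}
```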
Job Alert Engine
The alert system matches 100M saved alerts against 500K new postings daily. Instead of evaluating each alert against each posting (50T pairs), the system uses inverted matching: each alert is decomposed into filter predicates (location, keywords, job type). New postings are routed through a matching pipeline that indexes them against alert predicates using a reverse search approach — Elasticsearch Percolator queries. Each saved alert is stored as a percolator query; when a new job is indexed, the percolator evaluates all matching alerts efficiently using the inverted index. Matched alerts trigger email/push notifications via a batched notification service (aggregated into daily digest emails to avoid spam).
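A sketch of the percolator flow with the official Python client; index and field names are assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# The alerts index maps `query` as a percolator field, plus the posting fields
# that stored alert queries reference.
es.indices.create(index="job-alerts", mappings={
    "properties": {
        "query": {"type": "percolator"},
        "job_title": {"type": "text"},
        "location_region": {"type": "keyword"},
    }
})

# 1. Each saved alert is stored as a query document.
es.index(index="job-alerts", id="alert-123", document={
    "query": {"bool": {
        "must": {"match": {"job_title": "data engineer"}},
        "filter": {"term": {"location_region": "berlin"}},
    }}
})

# 2. A newly indexed posting is percolated to find every matching alert.
matches = es.search(index="job-alerts", query={
    "percolate": {
        "field": "query",  # the percolator-typed field holding stored queries
        "document": {"job_title": "Senior Data Engineer", "location_region": "berlin"},
    }
})
for hit in matches["hits"]["hits"]:
    print("notify subscribers of", hit["_id"])  # hand off to the notification service
```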
Application Pipeline
When a candidate applies, the Application Service creates a record in PostgreSQL (application_id, job_id, candidate_id, resume_version_id, cover_letter, status, applied_at) and publishes an event to Kafka. Downstream consumers include: (1) the ATS Integration Service, which forwards applications to the employer's ATS via webhooks or API; (2) the Recommendation Service, which updates the candidate's profile with the applied job's features for improved future recommendations; (3) the Analytics Service, which tracks conversion funnels (view → click → apply → interview → hire). Resume parsing (extraction of skills, experience, education) runs asynchronously using an NLP pipeline and stores structured data in the candidate profile.
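A sketch of the apply path under these assumptions (table and topic names are hypothetical): write the source-of-truth row first, then publish the event that fans out to the three consumers. A production version would likely use a transactional outbox rather than this non-atomic dual write.

```python
import json
import uuid
from datetime import datetime, timezone

from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def submit_application(conn, job_id: str, candidate_id: str,
                       resume_version_id: str, cover_letter: str) -> str:
    """Persist the application in PostgreSQL, then emit the Kafka event."""
    application_id = str(uuid.uuid4())
    applied_at = datetime.now(timezone.utc)
    with conn.cursor() as cur:  # source-of-truth write first
        cur.execute(
            """INSERT INTO applications
               (application_id, job_id, candidate_id, resume_version_id,
                cover_letter, status, applied_at)
               VALUES (%s, %s, %s, %s, %s, 'submitted', %s)""",
            (application_id, job_id, candidate_id, resume_version_id,
             cover_letter, applied_at),
        )
    conn.commit()
    producer.send("application-events", {  # fans out to ATS sync, recs, analytics
        "application_id": application_id,
        "job_id": job_id,
        "candidate_id": candidate_id,
        "applied_at": applied_at.isoformat(),
    })
    return application_id
```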
Database Design
Job data is stored in PostgreSQL: jobs (job_id UUID PK, employer_id, title, title_normalized, description TEXT, requirements TEXT, salary_min, salary_max, currency, location_text, latitude, longitude, remote_policy ENUM, experience_level, job_type ENUM, status ENUM(active, paused, closed, expired), posted_at, expires_at). Indexes on (status, posted_at DESC) for recent job queries and a trigram GIN index on title for fuzzy matching at the database level.
Candidate profiles live in a separate PostgreSQL instance: candidates (candidate_id, name, email, headline, location, skills ARRAY, experience_years, education JSONB, resume_s3_path, profile_embedding VECTOR(256)). The pgvector extension enables ANN searches directly in PostgreSQL for smaller-scale similarity matching. For production-scale matching (200M candidates), FAISS runs on dedicated GPU instances with periodic index rebuilds from the candidate table.
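A sketch of the pgvector path, assuming a hypothetical job_embeddings table reachable from the same instance; `<=>` is pgvector's cosine-distance operator, and the fetched embedding round-trips through a `::vector` cast:

```python
import psycopg2

conn = psycopg2.connect("dbname=candidates")
with conn.cursor() as cur:
    # Fetch the candidate's stored profile embedding.
    cur.execute(
        "SELECT profile_embedding FROM candidates WHERE candidate_id = %s",
        ("c-42",),
    )
    embedding = cur.fetchone()[0]

    # Nearest job embeddings by cosine distance, smallest distance first.
    cur.execute(
        """SELECT job_id, embedding <=> %s::vector AS distance
           FROM job_embeddings
           ORDER BY distance
           LIMIT 20""",
        (embedding,),
    )
    for job_id, distance in cur.fetchall():
        print(job_id, distance)
```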
Application data uses a DynamoDB table: PK=candidate_id, SK=application_id, with a GSI on job_id for employer-side queries. This schema supports efficient queries both from the candidate perspective ("my applications") and the employer perspective ("applicants for this job").
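Both access patterns in boto3 (table and index names are assumptions):

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("applications")

# Candidate side ("my applications"): query the partition key directly.
mine = table.query(KeyConditionExpression=Key("candidate_id").eq("c-42"))

# Employer side ("applicants for this job"): query the GSI on job_id.
applicants = table.query(
    IndexName="job_id-index",
    KeyConditionExpression=Key("job_id").eq("j-1001"),
)
print(len(mine["Items"]), len(applicants["Items"]))
```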
API Design
- GET /api/v1/jobs/search?q={query}&location={loc}&radius=50km&salary_min=100000&page=1 — Search jobs with filters and pagination
- POST /api/v1/jobs — Create a job posting; body contains structured job data; returns job_id
- POST /api/v1/jobs/{job_id}/apply — Submit application; body contains candidate_id, resume_version_id, cover_letter; returns application_id
- GET /api/v1/candidates/{candidate_id}/recommendations?limit=20 — Fetch personalized job recommendations
Scaling & Bottlenecks
Search latency is the critical bottleneck. The learning-to-rank re-ranking step adds 50-100ms on top of Elasticsearch's base query time. To stay under 500ms p95, the re-ranker only scores the top 200 results from Elasticsearch (not the full result set). Caching frequent queries in Redis reduces average latency to 50ms. During hiring-season spikes (2x traffic), the cluster scales out by raising the replica count and adding nodes to host the new replica shards, which takes minutes. The search cluster also maintains 40% headroom capacity to absorb bursts without degradation.
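A sketch of that two-stage flow, with `ltr_model` standing in for the trained LambdaMART scorer (its `score` interface is an assumption):

```python
RERANK_DEPTH = 200  # bound re-ranking cost regardless of total result-set size

def search_and_rerank(es_client, ltr_model, query_body: dict, user_features: dict):
    """First stage: BM25 retrieval. Second stage: LTR re-scoring of the top 200."""
    query_body["size"] = RERANK_DEPTH  # cap first-stage candidates
    hits = es_client.search(index="jobs", body=query_body)["hits"]["hits"]
    scored = [
        (ltr_model.score({**user_features,      # personalization features
                          "bm25": h["_score"],  # text-relevance feature
                          "job_id": h["_id"]}), h)
        for h in hits
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best LTR score first
    return [h for _, h in scored]
```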
The job alert percolator is computationally expensive: 100M stored queries evaluated against each new posting. The percolator index is sharded across 50 Elasticsearch nodes, with each node handling 2M alert queries. New postings are broadcast to all shards in parallel, and matching alerts are collected and deduplicated. The total matching time for a single new posting is under 2 seconds across the cluster. Notification delivery is rate-limited and batched into daily digests to avoid overwhelming users and email providers.
Key Trade-offs
- Elasticsearch Percolator for alerts vs periodic batch matching: Percolator provides near-real-time alert matching (new jobs matched within minutes) but requires storing 100M queries in the index, consuming significant memory — batch matching would be cheaper but delay alerts by hours
- Learning-to-rank over BM25-only search: LTR provides 30% better click-through rates by incorporating personalization and engagement signals, but adds 50-100ms latency and requires continuous model training infrastructure
- Unified search index vs per-geography shards: A single global index simplifies the architecture but makes geo-filtered queries scan irrelevant documents — geo-based sharding (US, EU, APAC) would improve query efficiency at the cost of cross-region search complexity
- One-click apply vs always redirect to employer ATS: One-click apply increases application volume 5x (better candidate experience) but gives employers less control over the application flow and data collection — offering both options lets employers choose