System Design: Medical Appointment Scheduling

Requirements

Functional Requirements:

Patients can search available appointment slots by provider, specialty, location, and insurance accepted
Book, reschedule, and cancel appointments with automated confirmation and reminder notifications (SMS, email, push)
Support complex provider availability templates: recurring schedules, block time, multi-location providers, and visit type durations
Waitlist management with automatic slot offering when cancellations create openings
Real-time insurance eligibility verification before booking via EDI 270/271 transactions
Multi-resource scheduling: book a provider, exam room, and equipment (e.g., ultrasound machine) atomically

Non-Functional Requirements:

Handle 50,000 scheduling searches/minute during peak hours (Monday mornings)
Prevent double-booking with strong consistency guarantees on slot reservation
Appointment booking latency under 2 seconds end-to-end including insurance verification
HIPAA-compliant: patient appointment data encrypted at rest (AES-256) and in transit (TLS 1.3)
99.95% availability — scheduling downtime disrupts clinical operations across all facilities

Scale Estimation

A large health system with 5,000 providers averaging 20 patients/day schedules 100,000 appointments daily. Each provider has a 90-day booking horizon with 15-minute slot granularity = 2,160 slots per provider (which implies 24 slots, or 6 bookable hours, per provider per day) × 5,000 providers = 10.8M total schedulable slots. Availability searches must scan open slots across multiple providers filtered by specialty, location, and insurance; at 50,000 searches/minute, the availability index is queried about 833 times/sec. Appointment writes: 100K bookings + 15K cancellations + 10K reschedules = 125K writes/day (about 1.4 TPS average, 15 TPS peak). Reminder notifications: 3 reminders per appointment (1 week, 1 day, 1 hour before) = 300K notifications/day. Insurance eligibility checks: 100K/day via EDI 270/271 with a 1.5-second average response time from payers.
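The figures above can be re-derived with a few lines of arithmetic. The sketch below only restates numbers already given in this section; the variable names are illustrative.

```python
# Back-of-the-envelope check of the scale figures quoted above.
PROVIDERS = 5_000
PATIENTS_PER_PROVIDER_PER_DAY = 20
HORIZON_DAYS = 90
SLOT_MINUTES = 15
SLOTS_PER_PROVIDER = 2_160           # figure taken from the text
SEARCHES_PER_MINUTE = 50_000
SECONDS_PER_DAY = 86_400

appointments_per_day = PROVIDERS * PATIENTS_PER_PROVIDER_PER_DAY      # 100,000
slots_per_provider_day = SLOTS_PER_PROVIDER // HORIZON_DAYS           # 24
bookable_hours_per_day = slots_per_provider_day * SLOT_MINUTES / 60   # 6.0
total_slots = SLOTS_PER_PROVIDER * PROVIDERS                          # 10,800,000
search_qps = SEARCHES_PER_MINUTE / 60                                 # about 833.3
writes_per_day = 100_000 + 15_000 + 10_000                            # 125,000
avg_write_tps = writes_per_day / SECONDS_PER_DAY                      # about 1.45
notifications_per_day = 3 * appointments_per_day                      # 300,000
```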

High-Level Architecture

The scheduling system uses an event-driven architecture with a CQRS (Command Query Responsibility Segregation) pattern to handle the read-heavy availability search workload separately from the lower-volume, consistency-critical booking path. The Write Model runs on PostgreSQL with row-level locking to prevent double-booking. The Read Model maintains a denormalized availability index in Elasticsearch, updated asynchronously within 500ms of any write via Kafka events.

The booking flow: Patient searches for available slots → Availability Service queries Elasticsearch, returning matching open slots with provider details → Patient selects a slot → Booking Service attempts to reserve the slot using a PostgreSQL advisory lock on the (provider_id, slot_time) pair → if the lock is acquired, the service runs insurance eligibility verification in parallel with room/equipment reservation → if all checks pass, the booking is committed atomically → the Booking Service publishes a BookingCreated event to Kafka → downstream consumers update the Elasticsearch availability index (removing the booked slot), trigger confirmation notifications, and sync to the EHR.
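PostgreSQL advisory locks are keyed by integers, so the (provider_id, slot_time) pair has to be mapped to a lock key somehow. The design does not say how; one possible approach, shown here purely as an assumption, is to hash the pair into a signed 64-bit integer and pass it to pg_advisory_xact_lock, which releases the lock automatically when the transaction ends. The function name and the key format are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def advisory_lock_key(provider_id: str, slot_start: datetime) -> int:
    """Map (provider_id, slot_start) to a signed 64-bit advisory lock key.

    Hypothetical helper: the hashing scheme is an assumption, not part of
    the original design.
    """
    if slot_start.tzinfo is None:
        raise ValueError("slot_start must be timezone-aware")
    # Normalize to UTC so the same instant always produces the same key.
    canonical = f"{provider_id}|{slot_start.astimezone(timezone.utc).isoformat()}"
    digest = hashlib.blake2b(canonical.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, byteorder="big", signed=True)


# Inside the booking transaction the key would be used roughly like this
# (parameter placeholder style depends on the database driver):
LOCK_SQL = "SELECT pg_advisory_xact_lock(%s)"
```

A hash collision between two different slots would only make their bookings wait on each other briefly; it would not allow a double-booking, because the slot status is still checked after the lock is acquired.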

The Provider Schedule Service manages complex availability rules. Provider templates define recurring weekly patterns (e.g., Monday 8AM-5PM at Location A, Tuesday 9AM-3PM at Location B) with exceptions for PTO, conferences, and administrative time. A nightly batch job materializes templates into concrete available slots for the next 90 days. Real-time overrides (provider leaves early, adds same-day slots) are handled synchronously.
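Template materialization can be illustrated with a small sketch. It is a simplification under stated assumptions: exceptions are modeled as whole blocked days only, times are naive local datetimes, and the TemplateRule, Slot, and materialize names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta


@dataclass(frozen=True)
class TemplateRule:
    weekday: int          # 0 = Monday, matching date.weekday()
    location_id: str
    start: time
    end: time


@dataclass(frozen=True)
class Slot:
    provider_id: str
    location_id: str
    start_time: datetime
    end_time: datetime
    status: str = "OPEN"


def materialize(provider_id, rules, blocked_dates, first_day,
                horizon_days=90, slot_minutes=15):
    """Expand weekly template rules into concrete slots for the horizon."""
    step = timedelta(minutes=slot_minutes)
    slots = []
    for offset in range(horizon_days):
        day = first_day + timedelta(days=offset)
        if day in blocked_dates:          # e.g. PTO or a conference day
            continue
        for rule in rules:
            if rule.weekday != day.weekday():
                continue
            cursor = datetime.combine(day, rule.start)
            closing = datetime.combine(day, rule.end)
            # Emit only slots that fit entirely inside the template window.
            while cursor + step <= closing:
                slots.append(Slot(provider_id, rule.location_id, cursor, cursor + step))
                cursor += step
    return slots


rules = [
    TemplateRule(0, "location-a", time(8, 0), time(17, 0)),   # Monday 8AM-5PM
    TemplateRule(1, "location-b", time(9, 0), time(15, 0)),   # Tuesday 9AM-3PM
]
week = materialize("prov-1", rules, blocked_dates=set(),
                   first_day=date(2025, 2, 3), horizon_days=7)
```

With the example template from the text, one week yields 36 Monday slots plus 24 Tuesday slots.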

Core Components

Availability Engine

The Availability Engine maintains a real-time view of all schedulable slots across the health system. Slots are materialized from provider templates by a nightly batch job and stored in PostgreSQL as rows: slot_id, provider_id, location_id, start_time, end_time, visit_type, status (OPEN/BOOKED/BLOCKED/HELD). The Elasticsearch availability index is a denormalized projection: each document contains the slot plus provider specialty, location coordinates (for geo-search), accepted insurance plans, and languages spoken. Searches support compound queries: "find open cardiology appointments within 10 miles accepting Blue Cross in the next 2 weeks" executes as an Elasticsearch bool query with geo_distance filter. The index updates within 500ms of any slot status change via a Kafka consumer, ensuring near-real-time accuracy. A short-lived slot hold mechanism (90-second TTL in Redis) prevents two patients from booking the same slot simultaneously during the checkout flow.
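The example search above might translate into a query body along these lines. The index field names (status, specialty, accepted_insurance, start_time, location) are assumptions about the document mapping, which this design does not spell out; the sketch only builds the request body and does not call Elasticsearch.

```python
import json


def build_availability_query(specialty, lat, lon, radius, insurance_plan,
                             date_from, date_to, size=20):
    """Build a bool query whose clauses all run in non-scoring filter context."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"status": "OPEN"}},
                    {"term": {"specialty": specialty}},
                    {"term": {"accepted_insurance": insurance_plan}},
                    {"range": {"start_time": {"gte": date_from, "lte": date_to}}},
                    {
                        "geo_distance": {
                            "distance": radius,
                            "location": {"lat": lat, "lon": lon},
                        }
                    },
                ]
            }
        },
        # Earliest slots first.
        "sort": [{"start_time": "asc"}],
    }


query = build_availability_query(
    specialty="cardiology", lat=40.7, lon=-74.0, radius="10mi",
    insurance_plan="bcbs_ppo", date_from="2025-02-01", date_to="2025-02-14",
)
print(json.dumps(query, indent=2))
```

Putting every clause in filter context skips relevance scoring and lets Elasticsearch cache the filters, which suits a search where results are ordered by start time rather than by score.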

Booking & Reservation Service

The Booking Service orchestrates the multi-step reservation process with ACID guarantees. When a patient confirms a slot, the service: (1) acquires a PostgreSQL advisory lock on the provider+timeslot, (2) verifies the slot is still OPEN (optimistic concurrency check via version column), (3) initiates parallel requests to the Insurance Verification Service and the Resource Allocation Service, (4) if both succeed, atomically updates the slot status to BOOKED, creates the appointment record, and releases the lock within a single database transaction. If insurance verification fails (patient not eligible), the slot is released and the patient is informed. The entire flow is budgeted to complete within the 2-second latency target; the payer round trip (1.5 seconds on average) dominates that budget. Idempotency is enforced via a unique constraint on (patient_id, provider_id, slot_start_time, idempotency_key), preventing duplicate bookings from retried requests.
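The four steps can be sketched in memory to show the control flow. This is not the production design: a per-slot threading.Lock stands in for the PostgreSQL advisory lock, a dict stands in for the slots table, the idempotency check is a dict lookup rather than a unique constraint, and the two verification callables are hypothetical placeholders for the Insurance Verification and Resource Allocation services.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SlotUnavailable(Exception):
    """Raised when the slot is no longer OPEN at booking time."""


class BookingService:
    def __init__(self, slots, verify_insurance, reserve_resources):
        self._slots = slots                          # slot_id -> {"status", "version"}
        self._verify_insurance = verify_insurance    # callable(patient_id) -> bool
        self._reserve_resources = reserve_resources  # callable(slot_id) -> bool
        self._appointments = {}                      # idempotency_key -> appointment
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, slot_id):
        with self._locks_guard:
            return self._locks.setdefault(slot_id, threading.Lock())

    def book(self, slot_id, patient_id, idempotency_key):
        with self._lock_for(slot_id):                          # step 1: take the lock
            if idempotency_key in self._appointments:          # retried request
                return self._appointments[idempotency_key]
            slot = self._slots[slot_id]
            if slot["status"] != "OPEN":                       # step 2: still open?
                raise SlotUnavailable(slot_id)
            with ThreadPoolExecutor(max_workers=2) as pool:    # step 3: parallel checks
                eligible = pool.submit(self._verify_insurance, patient_id)
                reserved = pool.submit(self._reserve_resources, slot_id)
                checks = [eligible.result(), reserved.result()]
            if not all(checks):
                # Slot stays OPEN. A real service would also release any
                # room or equipment that was reserved before the failure.
                return {"status": "REJECTED", "slot_id": slot_id, "patient_id": patient_id}
            slot["status"] = "BOOKED"                          # step 4: commit
            slot["version"] += 1
            appointment = {"status": "BOOKED", "slot_id": slot_id, "patient_id": patient_id}
            self._appointments[idempotency_key] = appointment
            return appointment
```

Note that the lock is held across the payer call, so a slow eligibility response delays any competing request for the same slot until the first attempt finishes.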

Waitlist & Cancellation Manager

The Waitlist Service manages patients who want appointments sooner than the next available slot. When a cancellation creates an opening, the service queries the waitlist for matching patients (same provider or specialty, compatible location, matching insurance) ordered by waitlist position and urgency. The top-matched patient receives an SMS/push notification with a 30-minute claim window. If unclaimed, the offer cascades to the next patient. The waitlist is stored in Redis Sorted Sets keyed by (provider_id, visit_type) with scores combining waitlist timestamp and urgency weight. Cancellation events trigger the matching algorithm via a Kafka consumer. Smart cancellation prediction uses a logistic regression model trained on historical no-show data (features: appointment lead time, patient history, day of week, weather) to pre-identify likely cancellations and proactively offer waitlist patients tentative slots.
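The text does not give the formula that combines waitlist timestamp and urgency weight, so the sketch below assumes one: the score is the join time in epoch seconds minus a fixed bonus per urgency level, and the lowest score is offered the slot first (the order ZRANGE returns from a Redis sorted set). The 24-hour bonus and the key format are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical weight: each urgency level offsets one day of waiting.
URGENCY_BONUS_SECONDS = 24 * 3600


def waitlist_key(provider_id: str, visit_type: str) -> str:
    """Sorted-set key for one (provider_id, visit_type) waitlist."""
    return f"waitlist:{provider_id}:{visit_type}"


def waitlist_score(joined_at: datetime, urgency: int) -> float:
    """Lower score means earlier offer; urgency pulls a patient forward."""
    return joined_at.timestamp() - urgency * URGENCY_BONUS_SECONDS


def offer_order(entries):
    """entries: iterable of (patient_id, joined_at, urgency) tuples."""
    ranked = sorted(entries, key=lambda e: waitlist_score(e[1], e[2]))
    return [patient_id for patient_id, _, _ in ranked]


entries = [
    ("patient-a", datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc), 0),
    ("patient-b", datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc), 1),
    ("patient-c", datetime(2025, 1, 3, 0, 0, tzinfo=timezone.utc), 1),
]
order = offer_order(entries)
```

Here patient-b joined 12 hours after patient-a but has one urgency level, so it is ranked first; patient-c joined two days later and stays behind patient-a despite the same urgency. With redis-py, the same score could be written with zadd(key, {patient_id: score}) and the next candidate read with zrange(key, 0, 0).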

Database Design

PostgreSQL serves as the system of record. Core tables: appointments (appointment_id, patient_id, provider_id, location_id, room_id, start_time, end_time, visit_type, status, insurance_verification_id, cancellation_reason, created_at, updated_at, version), provider_schedules (schedule_id, provider_id, location_id, day_of_week, start_time, end_time, visit_types, effective_from, effective_to), schedule_exceptions (exception_id, provider_id, exception_date, exception_type BLOCK/OVERRIDE, start_time, end_time, reason), slots (slot_id, provider_id, location_id, start_time, end_time, visit_type, status, appointment_id, version). Indexes on (provider_id, start_time, status) and (location_id, start_time, status) support efficient slot lookups.

The waitlist table: waitlist_entry_id, patient_id, preferred_provider_id (nullable), specialty, preferred_locations, preferred_time_ranges (JSONB), insurance_plan_id, urgency, created_at, status (WAITING/OFFERED/CLAIMED/EXPIRED). A GIN index on preferred_locations and preferred_time_ranges supports the matching query.

API Design

GET /v1/availability?specialty=cardiology&location=40.7,-74.0&radius=10mi&insurance=bcbs_ppo&from=2025-02-01&to=2025-02-14&limit=20 — Search available slots with filtering; returns paginated list of slots with provider details
POST /v1/appointments — Book an appointment; body contains slot_id, patient_id, visit_reason, insurance_member_id, idempotency_key; returns appointment confirmation with details
POST /v1/appointments/{id}/cancel — Cancel an appointment; body contains cancellation_reason; triggers waitlist matching for the freed slot
POST /v1/waitlist — Add patient to waitlist; body contains patient_id, specialty, preferred_providers, preferred_locations, preferred_times, urgency

Scaling & Bottlenecks

The availability search is the primary scalability challenge at 833 QPS. Elasticsearch handles this with a 3-node cluster in which each shard has 1 primary and 2 replica copies, with slot documents sharded by location_id for geo-locality. Cache hit rates are high: a Redis cache stores the top 100 most-searched specialty+location combinations with a 60-second TTL, absorbing 70% of search traffic. The remaining 250 QPS (30% of 833) hitting Elasticsearch is well within a modest cluster's capacity.

Insurance eligibility verification is the latency bottleneck at 1.5 seconds average response time from payer clearinghouses. This is mitigated by: (1) caching eligibility responses for 24 hours keyed by (patient_id, insurance_plan_id, service_type), reducing payer calls by 60%, (2) running verification in parallel with room reservation rather than sequentially, (3) offering a "book pending verification" flow for returning patients with recently verified eligibility. The slot hold mechanism in Redis (90-second TTL) prevents phantom availability — without it, two patients could see the same slot as available and both attempt booking.
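The slot hold maps naturally onto the Redis SET command with the NX and EX options, which sets a key only if it does not exist and expires it automatically. The sketch below assumes a redis-py style client whose set(name, value, nx=..., ex=...) returns a truthy value on success and None otherwise; the key format is hypothetical, and the InMemoryClient is only a stand-in so the example runs without a Redis server.

```python
import time

HOLD_TTL_SECONDS = 90


def hold_key(slot_id: str) -> str:
    return f"slot_hold:{slot_id}"


def try_hold(client, slot_id: str, patient_id: str, ttl: int = HOLD_TTL_SECONDS) -> bool:
    """Place a hold if none exists; equivalent to SET key value NX EX ttl."""
    return bool(client.set(hold_key(slot_id), patient_id, nx=True, ex=ttl))


class InMemoryClient:
    """Local stand-in exposing the same set(..., nx=, ex=) shape as redis-py."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}   # name -> (value, expires_at or None)

    def set(self, name, value, nx=False, ex=None):
        now = self._clock()
        current = self._data.get(name)
        # Treat an expired entry as absent, as Redis would after the TTL.
        if current is not None and current[1] is not None and current[1] <= now:
            del self._data[name]
            current = None
        if nx and current is not None:
            return None
        expires_at = now + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True
```

The hold is a user-experience aid rather than the correctness mechanism: the status check under the advisory lock at booking time remains the guard against double-booking, for example if a hold expires while the patient is still on the confirmation screen.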

Key Trade-offs

CQRS with Elasticsearch over single PostgreSQL: Separating reads (Elasticsearch) from writes (PostgreSQL) enables fast availability search without impacting booking transaction performance, but introduces eventual consistency. A slot booked 500ms ago may still appear available in search results; the slot status check performed under the advisory lock at booking time catches this case and rejects the stale request
Materialized slots over dynamic availability calculation: Pre-computing individual slot rows simplifies search and booking logic, but requires a nightly materialization job and consumes more storage (10.8M slot rows) — the alternative of computing availability from schedule templates at query time is too slow for 833 QPS
90-second slot holds over immediate booking: The hold mechanism provides a shopping-cart-like UX where patients can review details before confirming, but temporarily reduces apparent availability — the short TTL and automatic release minimize impact
Proactive waitlist matching over patient-initiated search: Automatically notifying waitlisted patients of cancellations fills slots faster (reducing provider idle time) but adds notification fatigue risk — the 30-minute claim window and cascading offers balance urgency with patient convenience