System Design: Cryptocurrency Exchange
Design a cryptocurrency exchange handling spot and derivatives trading, hot/cold wallet management, blockchain settlement, and 24/7 operations with institutional-grade security and performance.
Requirements
Functional Requirements:
- Spot trading with order matching for 500+ trading pairs (BTC/USDT, ETH/USDT, etc.)
- Deposit and withdrawal of cryptocurrencies with on-chain confirmation tracking
- Hot wallet and cold wallet management with automated rebalancing
- Real-time order book, trade history, and candlestick chart data via WebSocket
- Margin trading with liquidation engine and funding rate calculation
- Staking and earn products for supported proof-of-stake assets
Non-Functional Requirements:
- Order matching latency under 1ms for co-located engines (24/7 operation, no market close)
- Support 1M concurrent WebSocket connections for market data
- 99.99% availability — every minute of downtime is lost trading volume
- Cold wallet stores 95% of assets offline with multi-signature approval for withdrawals
- Regulatory compliance: KYC/AML, travel rule (FATF), proof-of-reserves auditing
Scale Estimation
A major crypto exchange processes 5M trades/day (about 58 trades/sec on average) across 500 trading pairs, peaking at 50,000 orders/sec during volatile market events (BTC flash crashes, major economic announcements). Market data: 500 pairs × 200 updates/sec = 100K quote messages/sec, or about 8.6B messages/day at the source. With 1M WebSocket subscribers, fanout produces roughly 100B delivered messages/day, which averages a little over one message per second per subscriber. Deposits/withdrawals: 500K blockchain transactions/day across 50 supported blockchains, each requiring confirmation monitoring (6 confirmations for BTC = ~60 minutes, 30 for ETH = ~6 minutes). Hot wallet rebalancing: 200 transfers/day between hot and cold storage. Total assets under custody: $10B+ requiring proof-of-reserves attestation.
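The estimates above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch; the ~10 minute Bitcoin block time and ~12 second Ethereum slot time are assumptions used to reproduce the quoted confirmation waits.

```python
# Back-of-envelope check of the scale figures quoted in the text.
SECONDS_PER_DAY = 24 * 60 * 60

trades_per_day = 5_000_000
avg_trades_per_sec = trades_per_day / SECONDS_PER_DAY      # about 58 trades/sec

pairs = 500
updates_per_pair_per_sec = 200
quote_msgs_per_sec = pairs * updates_per_pair_per_sec      # 100,000 messages/sec
quote_msgs_per_day = quote_msgs_per_sec * SECONDS_PER_DAY  # 8.64 billion/day before fanout

# 100B delivered messages/day implies this average delivery rate.
delivered_per_day = 100_000_000_000
subscribers = 1_000_000
delivered_per_sec = delivered_per_day / SECONDS_PER_DAY    # about 1.16M messages/sec
per_subscriber_per_sec = delivered_per_sec / subscribers   # about 1.16 messages/sec each

# Confirmation waits (assumed: 10 min per BTC block, 12 s per ETH slot).
btc_wait_min = 6 * 10
eth_wait_min = 30 * 12 / 60
```

The gap between the 50,000 orders/sec peak and the ~58 trades/sec average shows how bursty the load is, which is why capacity is sized for volatility events rather than the daily mean.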
High-Level Architecture
The exchange architecture is divided into the Trading Engine (performance-critical), the Wallet Infrastructure (security-critical), and the Data Platform (market data and analytics). These run on physically separate infrastructure with different security postures.
The Trading Engine cluster runs the matching engines, one per trading pair (or grouped for low-volume pairs). Each engine is a single-threaded C++ process using the LMAX Disruptor pattern with a ring buffer for order ingestion. Orders arrive from the API layer via an internal message bus (Aeron) and are matched using price-time priority. Execution reports flow back through the message bus to the Account Service, which updates user balances in a PostgreSQL ledger with ACID guarantees. The entire trading flow (order in → match → balance update → execution report out) targets sub-5ms end-to-end.
The Wallet Infrastructure runs in a network segment isolated from the trading systems. Hot wallets (holding ~5% of each asset for instant withdrawals) run on hardened, network-connected servers with HSMs for key management, since they must broadcast transactions. Cold wallets (95% of assets) are kept offline in an air-gapped environment and use multi-signature schemes (3-of-5 for BTC, multi-sig smart contracts for EVM chains) with key ceremonies requiring physical presence of key holders. Deposit detection runs blockchain-specific listener services that monitor the mempool and confirmed blocks for incoming transactions to exchange-assigned deposit addresses.
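The 5%/95% hot/cold split implies some rule for deciding when to move funds. One possible rule is sketched below: do nothing while the hot share stays inside a tolerance band, otherwise transfer enough to return to the 5% target. The band limits (3% and 8%) and the function name are hypothetical, not values from this design.

```python
from decimal import Decimal

TARGET_HOT_RATIO = Decimal("0.05")  # 5% hot target from the text


def rebalance_transfer(hot, cold, low=Decimal("0.03"), high=Decimal("0.08")):
    """Return the amount to move for one asset.

    Positive means cold -> hot (needs a multi-sig ceremony),
    negative means hot -> cold (sweep), zero means no action.
    The low/high band is an assumed tolerance, not from the source.
    """
    total = hot + cold
    if total == 0:
        return Decimal("0")
    ratio = hot / total
    if low <= ratio <= high:
        return Decimal("0")
    return TARGET_HOT_RATIO * total - hot
```

A band avoids triggering a cold-wallet signing ceremony for every small drift, which matters because cold-to-hot transfers need physical key holders.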
Core Components
Order Matching Engine
Each trading pair has a dedicated matching engine maintaining an in-memory order book as two Red-Black Trees (bids sorted descending, asks sorted ascending by price, FIFO within each price level). The engine processes orders sequentially from a ring buffer: limit orders are inserted into the book or matched against resting orders; market orders walk the book consuming liquidity until filled. Self-trade prevention (STP) is implemented to prevent wash trading — if both sides of a match belong to the same user, the resting order is cancelled. The engine emits trade events to Kafka for downstream consumption. State is journaled to NVMe for recovery and replicated to a warm standby for failover within 100ms.
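The matching loop described above can be sketched in a few lines. This is a simplified illustration, not the engine itself: it uses plain dictionaries with `min`/`max` lookups instead of Red-Black Trees, handles limit orders only, and the `Order` and `OrderBook` names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    order_id: int
    user_id: str
    side: str        # "BUY" or "SELL"
    price: Decimal
    qty: Decimal


class OrderBook:
    def __init__(self):
        self.bids = {}  # price -> deque of resting orders (FIFO within a level)
        self.asks = {}

    def submit(self, order):
        """Match a limit order with price-time priority and rest any remainder.

        Returns a list of (maker_order_id, taker_order_id, price, qty) trades.
        """
        trades = []
        buying = order.side == "BUY"
        own, opposite = (self.bids, self.asks) if buying else (self.asks, self.bids)
        while order.qty > 0 and opposite:
            best = min(opposite) if buying else max(opposite)
            crosses = order.price >= best if buying else order.price <= best
            if not crosses:
                break
            level = opposite[best]
            resting = level[0]
            if resting.user_id == order.user_id:
                level.popleft()  # self-trade prevention: cancel the resting order
            else:
                fill = min(order.qty, resting.qty)
                trades.append((resting.order_id, order.order_id, best, fill))
                order.qty -= fill
                resting.qty -= fill
                if resting.qty == 0:
                    level.popleft()
            if not level:
                del opposite[best]
        if order.qty > 0:
            own.setdefault(order.price, deque()).append(order)
        return trades
```

Because a single thread drains the ring buffer and calls something like `submit` for each order in sequence, no locking is needed and replaying the journal reproduces the same book.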
Wallet & Custody Service
The Wallet Service manages a hierarchical deterministic (HD) wallet infrastructure. Each user gets a unique deposit address derived from a master public key using non-hardened BIP-32 derivation paths. The private key never exists on the hot server, only the extended public key (xpub). Deposit detection services monitor each blockchain: a Bitcoin listener runs a full node and watches for transactions to known deposit addresses; an Ethereum listener follows new blocks and ERC-20 Transfer events via a Geth archive node. After sufficient confirmations, the Deposit Service credits the user's exchange balance. Withdrawals are processed in batches every 15 minutes: the Withdrawal Aggregator collects pending requests, performs AML screening, and submits a batch signing request to the HSM-backed signing service. Withdrawals above a threshold ($50K) require manual approval from the security team.
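The batching step of the Withdrawal Aggregator might look like the sketch below. It takes one 15-minute window of pending requests and splits it into per-asset signing batches, a manual-review queue, and an AML-rejected list. The `WithdrawalRequest` type, the `is_sanctioned` callback, and valuing the threshold in USD are all assumptions for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal

MANUAL_APPROVAL_THRESHOLD_USD = Decimal("50000")  # $50K threshold from the text


@dataclass
class WithdrawalRequest:
    withdrawal_id: int
    asset: str
    amount_usd: Decimal
    destination_address: str


def build_batches(pending, is_sanctioned):
    """Partition one window of requests.

    Returns (batches, manual_review, rejected) where batches maps
    asset -> list of requests ready for the signing service.
    is_sanctioned is a hypothetical AML screening callback.
    """
    batches, manual_review, rejected = {}, [], []
    for req in pending:
        if is_sanctioned(req.destination_address):
            rejected.append(req)
        elif req.amount_usd > MANUAL_APPROVAL_THRESHOLD_USD:
            manual_review.append(req)
        else:
            batches.setdefault(req.asset, []).append(req)
    return batches, manual_review, rejected
```

Screening before the size check means a sanctioned address is rejected outright instead of consuming the security team's review time.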
Liquidation Engine
For margin trading, the Liquidation Engine continuously monitors all leveraged positions against their maintenance margin. It consumes mark price updates (calculated as a combination of the exchange's last traded price and external index prices to prevent manipulation) and computes the margin ratio for each position. When a position's margin ratio falls below the maintenance threshold, the engine initiates a graduated liquidation: first, it attempts to reduce the position by 25% at market price; if the mark price continues to move adversely, it escalates to 50%, then 100%. If the liquidated position results in a loss exceeding the user's margin, the deficit is covered by the Insurance Fund (funded by liquidation penalties). The engine processes 10K position checks per mark price update, using vectorized computation across position arrays.
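A minimal sketch of the margin check and the graduated escalation follows. The margin ratio definition (equity divided by notional at mark price, for a long position) and the idea of tracking a per-position escalation stage are assumptions; the text does not specify either, nor whether the 25%/50%/100% steps apply to the original or the remaining position.

```python
from decimal import Decimal

# Graduated steps from the text: 25%, then 50%, then 100%.
LIQUIDATION_STEPS = (Decimal("0.25"), Decimal("0.50"), Decimal("1.00"))


def margin_ratio(collateral, entry_price, mark_price, size):
    """Equity over notional for a long position (assumed definition)."""
    unrealized_pnl = (mark_price - entry_price) * size
    equity = collateral + unrealized_pnl
    notional = mark_price * size
    return equity / notional


def next_liquidation_fraction(ratio, maintenance, stage):
    """Return (fraction_to_close, new_stage).

    A healthy position returns a zero fraction and resets the stage;
    an unhealthy one escalates one step per call, capped at 100%.
    """
    if ratio >= maintenance:
        return Decimal("0"), 0
    step = LIQUIDATION_STEPS[min(stage, len(LIQUIDATION_STEPS) - 1)]
    return step, stage + 1
```

In production the same arithmetic would run over arrays of positions per mark price update, which is what makes the vectorized approach mentioned above a natural fit.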
Database Design
The trading ledger uses PostgreSQL with double-entry bookkeeping. Tables: accounts (account_id, user_id, asset, available_balance NUMERIC(28,18), locked_balance NUMERIC(28,18)), ledger_entries (entry_id, transaction_id, account_id, direction, amount, created_at). NUMERIC(28,18) keeps 18 decimal places (matching ERC-20 tokens with 18 decimals) and leaves 10 digits for the integer part, so a single balance must stay below 10^10 units; assets with very large supplies would need a wider precision. Each trade creates 4 ledger entries: debit buyer's quote currency, credit buyer's base currency, debit seller's base currency, credit seller's quote currency.
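The four entries per trade can be generated and checked as below. This is a sketch that ignores fees and uses plain tuples rather than database rows; the function names are hypothetical.

```python
from decimal import Decimal


def trade_ledger_entries(trade_id, buyer, seller, base, quote, qty, price):
    """Build the four double-entry rows for one trade (fees ignored).

    Each row is (transaction_id, account_owner, asset, direction, amount).
    """
    notional = qty * price
    return [
        (trade_id, buyer, quote, "DEBIT", notional),
        (trade_id, buyer, base, "CREDIT", qty),
        (trade_id, seller, base, "DEBIT", qty),
        (trade_id, seller, quote, "CREDIT", notional),
    ]


def is_balanced(entries):
    """Check that debits equal credits for every asset in the transaction."""
    totals = {}
    for _, _, asset, direction, amount in entries:
        sign = 1 if direction == "CREDIT" else -1
        totals[asset] = totals.get(asset, Decimal("0")) + sign * amount
    return all(total == 0 for total in totals.values())
```

Using `Decimal` rather than floats mirrors the NUMERIC columns, so the per-asset sums cancel exactly instead of leaving rounding residue.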
Blockchain transaction tracking uses a separate PostgreSQL database: deposits (deposit_id, user_id, asset, amount, tx_hash, block_number, confirmations, status PENDING/CONFIRMED/CREDITED), withdrawals (withdrawal_id, user_id, asset, amount, destination_address, tx_hash, status PENDING/SIGNED/BROADCAST/CONFIRMED, batch_id). Market data history uses TimescaleDB for candlestick data: candles (trading_pair, interval 1m/5m/1h/1d, open_time, open, high, low, close, volume, quote_volume).
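To illustrate how rows in the candles table relate to the trade stream, the sketch below folds time-ordered trades into fixed-interval candles. It is a simplified in-memory version; the `build_candles` name and the tuple layout of a trade are assumptions, and a real deployment could instead rely on database-side aggregation.

```python
from decimal import Decimal


def build_candles(trades, interval_sec=60):
    """Aggregate (unix_ts, price, qty) trades, sorted by time, into candles.

    Returns a dict keyed by open_time (interval start, in seconds).
    """
    candles = {}
    for ts, price, qty in trades:
        open_time = ts - ts % interval_sec
        candle = candles.get(open_time)
        if candle is None:
            candles[open_time] = {
                "open_time": open_time,
                "open": price, "high": price, "low": price, "close": price,
                "volume": qty,
                "quote_volume": price * qty,
            }
        else:
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["close"] = price  # last trade seen in the interval
            candle["volume"] += qty
            candle["quote_volume"] += price * qty
    return candles
```

Longer intervals (5m, 1h, 1d) can be produced the same way by changing `interval_sec`, or by rolling up the 1m candles.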
API Design
- POST /v1/orders: Place an order; body contains trading_pair, side (BUY/SELL), type (LIMIT/MARKET/STOP_LIMIT), quantity, price, time_in_force; HMAC-signed request with API key and timestamp for replay protection
- GET /v1/orderbook/{trading_pair}?depth=50: Snapshot of the current order book with the specified number of depth levels
- WS /v1/ws: WebSocket multiplexed stream; client subscribes to channels: orderbook@btcusdt, trades@ethusdt, account (private); server streams updates
- POST /v1/withdrawals: Request withdrawal; body contains asset, amount, address, network; requires 2FA confirmation; subject to withdrawal limits and AML screening
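The HMAC signing with timestamp-based replay protection mentioned for POST /v1/orders could work roughly as follows. The payload layout (timestamp, method, path, body concatenated), the SHA-256 digest, and the 5-second acceptance window are assumptions; the text only says requests are HMAC-signed with an API key and timestamp.

```python
import hashlib
import hmac


def sign_request(secret, method, path, body, timestamp_ms):
    """Client side: HMAC-SHA256 over an assumed canonical payload string."""
    payload = f"{timestamp_ms}{method}{path}{body}"
    return hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()


def verify_request(secret, method, path, body, timestamp_ms, signature,
                   now_ms, window_ms=5000):
    """Server side: reject stale timestamps, then compare in constant time."""
    if abs(now_ms - timestamp_ms) > window_ms:
        return False  # outside the replay window
    expected = sign_request(secret, method, path, body, timestamp_ms)
    return hmac.compare_digest(expected, signature)
```

The timestamp window only bounds how long a captured request stays valid; blocking replays inside the window would additionally need a nonce or request-id cache, which is not shown here.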
Scaling & Bottlenecks
The matching engine scales by sharding across trading pairs — each pair runs independently. However, cross-pair operations (e.g., liquidation selling BTC to cover a margin call on an ETH position) require coordination. This is handled by a Cross-Engine Coordinator that submits market orders to multiple engines and aggregates results. The WebSocket layer is the distribution bottleneck: 100K messages/sec to 1M clients requires a multi-tier fanout tree (matching engine → Kafka → regional aggregators → edge WebSocket servers). Each edge server holds 10K connections and subscribes only to channels its connected clients have requested.
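The last tier of the fanout tree, where an edge server subscribes upstream only to channels its clients asked for, can be sketched as below. The `EdgeServer` class and its method names are hypothetical, and actual socket handling is left out; `fan_out` just returns the deliveries it would make.

```python
from collections import defaultdict


class EdgeServer:
    """Tracks which connected clients want which channels."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # channel -> set of client ids

    def subscribe(self, client_id, channel):
        self.subscribers[channel].add(client_id)

    def unsubscribe(self, client_id, channel):
        clients = self.subscribers.get(channel)
        if clients is not None:
            clients.discard(client_id)
            if not clients:
                del self.subscribers[channel]  # drop the upstream subscription too

    def upstream_channels(self):
        """Channels this server needs from the regional aggregator."""
        return set(self.subscribers)

    def fan_out(self, channel, message):
        """Return (client_id, message) deliveries for one upstream message."""
        return [(c, message) for c in sorted(self.subscribers.get(channel, ()))]
```

At 10K connections per edge server, 1M clients need on the order of 100 such servers, and each one receives only the subset of the 100K messages/sec that its own clients subscribed to.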
Blockchain infrastructure is a unique bottleneck. Supporting 50 blockchains requires running full nodes for each, consuming significant disk (Bitcoin: 500GB, Ethereum: 2TB archive node) and bandwidth. Block reorganizations (reorgs) can reverse confirmed deposits — the confirmation threshold (6 for BTC, 30 for ETH) balances deposit speed against reorg risk. During network congestion (high gas fees on Ethereum), withdrawal batching becomes critical — the Aggregator bundles multiple withdrawals into a single on-chain transaction to amortize gas costs.
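A reorg-aware confirmation check might look like the following sketch. It compares the block hash recorded at deposit time against the node's current best chain before counting confirmations. The `REORGED` status is a hypothetical addition (the deposits table above lists only PENDING/CONFIRMED/CREDITED), and representing the chain as a height-to-hash dict is a simplification.

```python
# Confirmation thresholds from the text.
REQUIRED_CONFIRMATIONS = {"BTC": 6, "ETH": 30}


def deposit_status(asset, deposit_block_hash, deposit_height, canonical_chain):
    """Classify a deposit against the current best chain.

    canonical_chain maps block height -> block hash as reported by the node.
    Returns "REORGED" if the deposit's block is no longer on the best chain.
    """
    if canonical_chain.get(deposit_height) != deposit_block_hash:
        return "REORGED"
    tip_height = max(canonical_chain)
    confirmations = tip_height - deposit_height + 1
    if confirmations >= REQUIRED_CONFIRMATIONS[asset]:
        return "CONFIRMED"
    return "PENDING"
```

Checking the hash, not only the height difference, is what catches a reorg: after a reorganization the height still exists but points to a different block.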
Key Trade-offs
- Hot/cold wallet split (5%/95%) over all-hot: Cold storage with multi-sig dramatically reduces hack exposure, but introduces withdrawal delays when hot wallet reserves are depleted — automated rebalancing with threshold alerts mitigates this
- Single-threaded matching engine over multi-threaded: Deterministic ordering eliminates race conditions in matching but limits per-pair throughput — acceptable since even BTC/USDT rarely exceeds 10K orders/sec
- Mark price anchored to an external index over exchange last price alone: Prevents liquidation manipulation by large traders on the exchange, but introduces dependency on external price feeds; a median of 5 external exchanges with outlier filtering provides robustness
- Batch withdrawal processing over instant: Batching every 15 minutes reduces on-chain transaction costs and enables batch AML screening, but increases withdrawal latency — users expecting instant crypto withdrawals may churn
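The median-with-outlier-filtering idea from the mark price trade-off can be sketched as follows. The 5% deviation cutoff and the `index_price` name are assumptions; the text specifies only a median of 5 external exchanges with outlier filtering.

```python
from statistics import median


def index_price(quotes, max_deviation=0.05):
    """Median of external quotes after dropping outliers.

    A quote is dropped when it deviates from the raw median by more
    than max_deviation (an assumed 5% cutoff).
    """
    raw = median(quotes)
    kept = [q for q in quotes if abs(q - raw) / raw <= max_deviation]
    return median(kept)
```

For example, with quotes of 100, 101, 99, 100.5, and 150, the 150 quote is discarded and the index is the median of the remaining four. A median already resists a single bad feed; the filter also guards against a stale or manipulated venue shifting the result when several feeds are missing.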