System Design: Digital Wallet (Venmo/PayPal)
Deep dive into designing a digital wallet system like Venmo or PayPal, covering peer-to-peer payments, stored value accounts, social feed, instant transfers, and regulatory compliance at scale.
Requirements
Functional Requirements:
- Users can send and receive money to/from other users (P2P payments)
- Maintain stored-value wallet balances with top-up from bank accounts or cards
- Instant transfer to linked bank accounts via RTP, or to linked debit cards via push-to-debit
- Social feed showing friends' payment activity with privacy controls
- Pay merchants using wallet balance, linked cards, or split payments
- Transaction history with search, filtering, and downloadable statements
Non-Functional Requirements:
- 80M monthly active users with 5M daily P2P transactions
- Payment completion latency under 2 seconds end-to-end
- ACID compliance for all balance mutations — no double-spending
- Regulatory compliance: FinCEN MSB registration, state money transmitter licenses, BSA/AML
- 99.99% availability for the payment critical path
Scale Estimation
5M daily P2P transactions = 58 TPS average (5M / 86,400 s), peaking at about 300 TPS during Friday evenings and holidays (Venmo sees 5x spikes around Thanksgiving and the Super Bowl). Each P2P transaction involves 2 balance mutations (sender debit, receiver credit) plus ledger entries = 10M balance writes/day. Social feed: each payment generates a feed item visible to both users' friend networks; with an average of 150 friends per user, each payment fans out to up to 300 feed items (2 × 150) = 1.5B feed items/day written to feed storage. Wallet top-ups and withdrawals add another 2M transactions/day, at two ledger entries each. Total ledger entries: ~14M/day (10M + 4M). User profile reads for search-by-name during P2P: 50M queries/day.
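The arithmetic behind these estimates can be reproduced with a short script. This is a sketch using only the figures quoted above; the variable names are illustrative, and the "two ledger entries per top-up or withdrawal" factor is an assumption chosen to match the ~14M/day total.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_p2p = 5_000_000
peak_multiplier = 5            # 5x spike quoted in the text
avg_friends = 150
topups_and_withdrawals = 2_000_000

avg_tps = daily_p2p / SECONDS_PER_DAY
peak_tps = avg_tps * peak_multiplier          # the text rounds this up to 300

balance_writes = daily_p2p * 2                # sender debit + receiver credit
feed_items_per_payment = 2 * avg_friends      # both users' friend networks
feed_items_per_day = daily_p2p * feed_items_per_payment

# Assumption: each top-up/withdrawal also produces two ledger entries.
ledger_entries = balance_writes + topups_and_withdrawals * 2

print(f"average TPS:      {avg_tps:.1f}")
print(f"peak TPS:         {peak_tps:.0f}")
print(f"balance writes:   {balance_writes:,}")
print(f"feed items/day:   {feed_items_per_day:,}")
print(f"ledger entries:   {ledger_entries:,}")
```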
High-Level Architecture
The digital wallet architecture separates the financial core (money movement) from the social layer (feed, notifications) with different consistency and availability requirements. The financial core runs on PostgreSQL with serializable isolation and double-entry bookkeeping. The social layer uses Cassandra for the feed timeline and Redis for real-time notifications.
The payment flow for a P2P transfer:
- The sender initiates a payment.
- The API Gateway authenticates and rate-limits the request.
- The Payment Service validates the recipient and checks the sender's balance or funding source.
- If the payment is funded by wallet balance, the Ledger Service atomically debits the sender and credits the receiver in a single PostgreSQL transaction.
- If it is funded by a linked bank account, the Funding Service initiates an ACH pull (settles in 1-3 days) and immediately credits the receiver's wallet from a platform funding account; the platform assumes the settlement risk.
- The Social Service publishes a feed item to the activity feed.
- The Push Notification Service alerts the receiver.
For instant withdrawals, the Disbursement Service uses push-to-debit (Visa Direct or Mastercard Send) to deliver funds to a linked debit card within 30 minutes, or RTP (Real-Time Payments) network for bank account delivery within seconds. Standard withdrawals use ACH (1-3 business days) at no cost.
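The rail choice described above can be sketched as a small routing function. The names `Rail` and `choose_rail`, the string values for `speed` and `destination_type`, and the decision to reject standard withdrawals to a debit card are all assumptions made for illustration, not part of the original design.

```python
from enum import Enum


class Rail(Enum):
    PUSH_TO_DEBIT = "push_to_debit"   # Visa Direct / Mastercard Send
    RTP = "rtp"                       # Real-Time Payments network
    ACH = "ach"                       # standard, 1-3 business days


def choose_rail(speed: str, destination_type: str) -> Rail:
    """Pick an outbound rail for a withdrawal request (hypothetical helper)."""
    if speed == "instant":
        if destination_type == "debit_card":
            return Rail.PUSH_TO_DEBIT
        if destination_type == "bank_account":
            return Rail.RTP
    elif speed == "standard":
        # Assumption: ACH only delivers to bank accounts.
        if destination_type == "bank_account":
            return Rail.ACH
    raise ValueError(f"unsupported combination: {speed!r} to {destination_type!r}")
```

A production router would also need to check whether the destination bank participates in RTP and fall back to another rail when it does not.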
Core Components
Ledger & Balance Service
The ledger implements double-entry bookkeeping with account types: USER_WALLET, PLATFORM_FUNDING, PLATFORM_FEE, SETTLEMENT_RESERVE, and SUSPENSE. Every money movement creates balanced entries. A P2P payment from Alice to Bob creates: DEBIT Alice's USER_WALLET, CREDIT Bob's USER_WALLET. A bank-funded payment creates: DEBIT PLATFORM_FUNDING (platform fronts the money), CREDIT Bob's USER_WALLET, and a pending entry for the ACH pull from Alice's bank. Balance checks use SELECT FOR UPDATE with NOWAIT to fail fast on contention rather than blocking. Double-spend prevention is enforced at the database level with a CHECK constraint on balance >= 0 and serializable isolation.
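A minimal sketch of the wallet-funded case follows. It uses Python's built-in SQLite as a stand-in for PostgreSQL, so `SELECT ... FOR UPDATE NOWAIT` and serializable isolation are not shown; `BEGIN IMMEDIATE` is only a rough analogue. The table layout and the `transfer` helper are illustrative assumptions. What it does demonstrate is the balanced DEBIT/CREDIT pair and the CHECK constraint rejecting an overdraft.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode so that
    # transactions can be controlled explicitly with BEGIN/COMMIT/ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE accounts (
            account_id   TEXT PRIMARY KEY,
            account_type TEXT NOT NULL,
            balance      INTEGER NOT NULL CHECK (balance >= 0)  -- minor units
        );
        CREATE TABLE ledger_entries (
            entry_id       INTEGER PRIMARY KEY,
            transaction_id TEXT NOT NULL,
            account_id     TEXT NOT NULL REFERENCES accounts(account_id),
            direction      TEXT NOT NULL CHECK (direction IN ('DEBIT', 'CREDIT')),
            amount         INTEGER NOT NULL CHECK (amount > 0)
        );
    """)
    return conn


def transfer(conn, txn_id: str, sender: str, receiver: str, amount: int) -> bool:
    """Move funds atomically; return False if the sender would overdraw."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                     (amount, sender))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                     (amount, receiver))
        conn.executemany(
            "INSERT INTO ledger_entries (transaction_id, account_id, direction, amount) "
            "VALUES (?, ?, ?, ?)",
            [(txn_id, sender, "DEBIT", amount), (txn_id, receiver, "CREDIT", amount)],
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The CHECK (balance >= 0) constraint fired: undo everything.
        conn.execute("ROLLBACK")
        return False


conn = open_ledger()
conn.execute("INSERT INTO accounts VALUES ('alice', 'USER_WALLET', 5000)")
conn.execute("INSERT INTO accounts VALUES ('bob', 'USER_WALLET', 0)")
first = transfer(conn, "t1", "alice", "bob", 3000)    # succeeds
second = transfer(conn, "t2", "alice", "bob", 3000)   # rejected, would overdraw
balances = dict(conn.execute("SELECT account_id, balance FROM accounts"))
```

The second transfer leaves no trace in either table, which is the property the ledger relies on to rule out double-spending.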
Social Feed Service
The social feed is a fanout-on-write architecture using Cassandra. When a payment is made with visibility set to "friends" or "public," the Social Service reads the sender's and receiver's friend lists and writes a feed item to each friend's timeline partition in Cassandra. The feed item contains: payment_id, actor_id, recipient_id, note (emoji/text), amount (hidden by default per privacy settings), timestamp. For high-follower users (celebrities with 100K+ friends), fanout switches to pull-based: their payments are stored in a separate "celebrity feed" table and merged at read time. Feed reads are paginated at 20 items per page using Cassandra's paging state, which is returned to the client as an opaque cursor token.
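The hybrid fanout can be illustrated with an in-memory sketch. Dictionaries stand in for the Cassandra tables, and `FeedStore`, `FeedItem`, and the configurable threshold are hypothetical names introduced here. Deduplicating by `payment_id` at read time is an added assumption, needed because a mutual friend of both parties would otherwise see the item twice.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedItem:
    payment_id: str
    actor_id: str
    recipient_id: str
    note: str
    timestamp: int


class FeedStore:
    def __init__(self, friends: dict[str, set[str]], celebrity_threshold: int = 100_000):
        self.friends = friends
        self.celebrity_threshold = celebrity_threshold
        self.timelines = defaultdict(list)       # user_id -> pre-materialized items
        self.celebrity_feed = defaultdict(list)  # celebrity_id -> their own items

    def is_celebrity(self, user_id: str) -> bool:
        return len(self.friends.get(user_id, ())) >= self.celebrity_threshold

    def publish(self, item: FeedItem) -> None:
        for participant in (item.actor_id, item.recipient_id):
            if self.is_celebrity(participant):
                # Pull path: store once, merge at read time.
                self.celebrity_feed[participant].append(item)
            else:
                # Push path: write to every friend's timeline.
                for friend in self.friends.get(participant, ()):
                    self.timelines[friend].append(item)

    def read(self, user_id: str, limit: int = 20) -> list[FeedItem]:
        items = list(self.timelines.get(user_id, ()))
        for friend in self.friends.get(user_id, ()):
            if self.is_celebrity(friend):
                items.extend(self.celebrity_feed.get(friend, ()))
        unique = {item.payment_id: item for item in items}
        return sorted(unique.values(), key=lambda i: i.timestamp, reverse=True)[:limit]


# Small demo with a threshold of 3 friends so "star" counts as a celebrity.
friends = {"star": {"a", "b", "c"}, "a": {"star", "d"}, "b": {"star"},
           "c": {"star"}, "d": {"a"}}
store = FeedStore(friends, celebrity_threshold=3)
store.publish(FeedItem("p1", "a", "d", "pizza", timestamp=1))
store.publish(FeedItem("p2", "star", "b", "tickets", timestamp=2))
feed_for_a = [item.payment_id for item in store.read("a")]
feed_for_c = [item.payment_id for item in store.read("c")]
```

User `c` never receives a pushed write, yet still sees the celebrity's payment because it is merged in at read time.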
Funding & Disbursement Service
The Funding Service manages connections to external financial rails: ACH (via Nacha file generation, submitted through a partner bank acting as the ODFI to an ACH operator such as the Federal Reserve's FedACH), push-to-debit (via Visa Direct API and Mastercard Send API), and card networks for merchant payments. ACH processing is batch-oriented: the service accumulates funding requests throughout the day and generates a Nacha file at 5 PM ET for next-day settlement. Failed ACH pulls (insufficient funds) trigger a reversal flow: debit the receiver's wallet and notify both parties. The Disbursement Service handles outbound money movement with retry logic and reconciliation against bank settlement files received daily via SFTP.
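The 5 PM ET cutoff implies a simple bucketing rule for funding requests. The sketch below assumes timestamps are already expressed in Eastern Time and ignores weekends and bank holidays, which a real scheduler would have to handle; `nacha_batch_date` and `group_into_batches` are hypothetical helper names.

```python
from collections import defaultdict
from datetime import date, datetime, time, timedelta

CUTOFF_ET = time(17, 0)  # 5 PM ET file generation, per the text


def nacha_batch_date(requested_at_et: datetime) -> date:
    """Return the date of the Nacha file a request lands in.

    Requests before the cutoff go into the same day's file; requests at or
    after the cutoff roll to the next day's file.
    """
    if requested_at_et.time() < CUTOFF_ET:
        return requested_at_et.date()
    return requested_at_et.date() + timedelta(days=1)


def group_into_batches(requests: list[tuple[str, datetime]]) -> dict[date, list[str]]:
    """Group (request_id, requested_at_et) pairs by the file they belong to."""
    batches: dict[date, list[str]] = defaultdict(list)
    for request_id, requested_at in requests:
        batches[nacha_batch_date(requested_at)].append(request_id)
    return dict(batches)


batches = group_into_batches([
    ("r1", datetime(2024, 3, 4, 9, 30)),
    ("r2", datetime(2024, 3, 4, 16, 59)),
    ("r3", datetime(2024, 3, 4, 17, 0)),
])
```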
Database Design
The financial core uses PostgreSQL (Citus for horizontal scaling by user_id). Core tables: accounts (account_id, user_id, account_type, balance, currency, status, created_at), ledger_entries (entry_id, transaction_id, account_id, direction DEBIT/CREDIT, amount BIGINT, currency, created_at), transactions (transaction_id, type P2P/TOPUP/WITHDRAWAL, sender_id, receiver_id, amount, currency, status, idempotency_key, note, visibility, created_at). A unique index on (sender_id, idempotency_key) prevents duplicate payments.
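The effect of the unique index on (sender_id, idempotency_key) can be shown with a small sketch. SQLite stands in for PostgreSQL, the table is trimmed to the columns needed for the demonstration, and `create_payment` is a hypothetical helper: a retried request with the same key returns the original transaction instead of creating a second one.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        transaction_id  TEXT PRIMARY KEY,
        sender_id       TEXT NOT NULL,
        receiver_id     TEXT NOT NULL,
        amount          INTEGER NOT NULL,
        idempotency_key TEXT NOT NULL
    );
    CREATE UNIQUE INDEX uq_sender_idem ON transactions (sender_id, idempotency_key);
""")


def create_payment(conn, sender_id: str, receiver_id: str, amount: int,
                   idempotency_key: str) -> tuple[str, bool]:
    """Insert a payment; return (transaction_id, created)."""
    txn_id = str(uuid.uuid4())
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
                (txn_id, sender_id, receiver_id, amount, idempotency_key),
            )
        return txn_id, True
    except sqlite3.IntegrityError:
        # Duplicate (sender_id, idempotency_key): hand back the original.
        row = conn.execute(
            "SELECT transaction_id FROM transactions "
            "WHERE sender_id = ? AND idempotency_key = ?",
            (sender_id, idempotency_key),
        ).fetchone()
        return row[0], False


first_id, first_created = create_payment(conn, "alice", "bob", 2500, "key-123")
retry_id, retry_created = create_payment(conn, "alice", "bob", 2500, "key-123")
```

Scoping the key to the sender means two different users can reuse the same client-generated key without colliding.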
The social layer uses Cassandra with a feed_items table partitioned by user_id and clustered by timestamp DESC. Columns: user_id, timestamp, payment_id, actor_id, recipient_id, note, visibility. A separate friendships table in PostgreSQL manages the social graph with columns: user_id, friend_id, status (PENDING/ACCEPTED/BLOCKED), created_at. User search uses Elasticsearch indexing user profiles (name, username, email hash) for fast lookup during P2P recipient selection.
API Design
- POST /v1/payments: Initiate a P2P payment; body contains recipient_id (or phone/email for non-users), amount, currency, note, funding_source_id, visibility (private/friends/public), idempotency_key
- POST /v1/wallet/topup: Add funds from a linked bank or card; body contains funding_source_id, amount, idempotency_key
- POST /v1/wallet/withdraw: Withdraw to linked bank account or debit card; body contains destination_id, amount, speed (instant/standard), idempotency_key
- GET /v1/feed?cursor={token}&limit=20: Retrieve the authenticated user's social feed with cursor-based pagination
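As an illustration of the POST /v1/payments body, here is a hedged validation sketch. The field names come from the endpoint description; everything else is an assumption, including treating `amount` as an integer in minor units (to match the BIGINT ledger column), making `note` optional, and the `parse_payment_request` helper itself.

```python
from dataclasses import dataclass

VISIBILITIES = {"private", "friends", "public"}
REQUIRED_FIELDS = ("recipient_id", "amount", "currency",
                   "funding_source_id", "visibility", "idempotency_key")


@dataclass(frozen=True)
class PaymentRequest:
    recipient_id: str
    amount: int              # assumed to be minor units, e.g. cents
    currency: str
    funding_source_id: str
    visibility: str
    idempotency_key: str
    note: str = ""


def parse_payment_request(body: dict) -> PaymentRequest:
    """Validate a decoded JSON body and return a typed request."""
    missing = [name for name in REQUIRED_FIELDS if name not in body]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    amount = body["amount"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        raise ValueError("amount must be a positive integer in minor units")
    if body["visibility"] not in VISIBILITIES:
        raise ValueError("visibility must be private, friends, or public")
    fields = {name: body[name] for name in REQUIRED_FIELDS}
    return PaymentRequest(note=body.get("note", ""), **fields)


example = parse_payment_request({
    "recipient_id": "user_42",
    "amount": 1250,
    "currency": "USD",
    "note": "dinner",
    "funding_source_id": "wallet",
    "visibility": "friends",
    "idempotency_key": "c0ffee",
})
```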
Scaling & Bottlenecks
The ledger database is the primary bottleneck during peak P2P activity. At 300 TPS peak with serializable isolation, hot user accounts (users receiving many payments simultaneously — e.g., splitting a dinner bill with 20 people) create lock contention. This is mitigated by Citus sharding by user_id (co-locating a user's account and ledger entries on the same shard) and using advisory locks scoped to the sender's account rather than table-level locks. For truly hot accounts (merchant accounts receiving thousands of payments), a balance pre-splitting technique distributes the balance across N sub-accounts, and incoming credits round-robin across sub-accounts.
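The pre-splitting idea for hot accounts can be sketched in memory. In a real ledger each sub-account would be its own row, so concurrent credits lock different rows; here a list stands in for those rows. The `SplitBalance` class and its debit strategy (draining sub-accounts in order) are assumptions for illustration only.

```python
import itertools


class SplitBalance:
    """A hot account's balance spread across N sub-accounts."""

    def __init__(self, sub_account_count: int):
        self.sub_balances = [0] * sub_account_count
        self._next_index = itertools.cycle(range(sub_account_count))

    def credit(self, amount: int) -> int:
        """Apply a credit to the next sub-account in round-robin order."""
        index = next(self._next_index)
        self.sub_balances[index] += amount
        return index

    def total(self) -> int:
        """The user-visible balance is the sum over all sub-accounts."""
        return sum(self.sub_balances)

    def debit(self, amount: int) -> None:
        """Withdraw by draining sub-accounts in order (assumed strategy)."""
        if amount > self.total():
            raise ValueError("insufficient funds")
        remaining = amount
        for index, balance in enumerate(self.sub_balances):
            take = min(balance, remaining)
            self.sub_balances[index] -= take
            remaining -= take
            if remaining == 0:
                break


merchant = SplitBalance(sub_account_count=4)
for _ in range(10):
    merchant.credit(100)
after_credits = list(merchant.sub_balances)
merchant.debit(650)
```

The trade-off is that debits and balance reads must touch several rows, which is acceptable for merchant accounts where credits vastly outnumber debits.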
The social feed fanout generates significant write amplification: a single payment between users with 150 friends each produces 300 Cassandra writes. During peak periods (holiday weekends), at the 300 TPS payment peak, that is roughly 90K feed writes/sec (300 × 300) before replication, or about 270K replica writes/sec at RF=3. The cluster is sized for 3x this peak, using LeveledCompactionStrategy to maintain consistent read latency. Feed read latency is kept under 50ms by limiting timeline partitions to 10,000 entries with TTL-based expiration of items older than 1 year.
Key Trade-offs
- Platform-funded instant credits over waiting for ACH settlement: Crediting recipients immediately when the sender uses a bank account dramatically improves UX, but the platform assumes settlement risk — mitigated by ACH return rate monitoring and limiting instant credit for new users
- Fanout-on-write for social feed over fanout-on-read: Pre-materializing each user's feed in Cassandra enables fast reads (single partition scan) but creates massive write amplification — the hybrid approach (fanout-on-write for normal users, fanout-on-read for celebrity accounts) balances the trade-off
- Serializable isolation over optimistic concurrency: Serializable isolation prevents all balance anomalies, including write skew, but limits throughput and requires callers to retry transactions that abort with a serialization failure; the sharding strategy ensures each shard operates within its throughput ceiling
- ACH batch processing over real-time rail: ACH is essentially free but settles in 1-3 days; instant transfers via push-to-debit cost $0.25+ per transaction — offering both options lets users choose their cost/speed preference