How to Design a URL Shortener (TinyURL)
Complete system design breakdown of a URL shortener service like TinyURL or Bitly, covering high-level architecture, database design, hash generation, and scaling strategies.
A URL shortener takes a long URL and creates a short, unique alias. When users visit the short URL, they are redirected to the original long URL. Services like TinyURL, Bitly, and short.io handle billions of redirections every month. Let's walk through how to design one from scratch.
Requirements
Functional Requirements
- Given a URL, generate a shorter unique alias
- When users access the short link, redirect to the original URL
- Users can optionally pick a custom short link
- Links expire after a default timespan
Non-Functional Requirements
- High availability
- Minimal latency for URL redirection
- Short links should not be predictable
Back-of-Envelope Estimation
- 100M new URLs per day
- Read:Write ratio = 100:1
- Write QPS: 100M / 86,400 s ≈ 1,160 writes/sec
- Read QPS: 1,160 × 100 ≈ 116,000 reads/sec
- Storage per URL: ~500 bytes
- Storage for 5 years: 100M × 365 × 5 × 500 bytes ≈ 91 TB
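The arithmetic behind these estimates can be reproduced with a short script. This is only a sketch of the calculation using the figures listed above; the variable names are illustrative, and terabytes are taken as decimal (10^12 bytes).

```python
SECONDS_PER_DAY = 86_400

# Inputs taken from the estimates above.
new_urls_per_day = 100_000_000
read_write_ratio = 100          # 100 reads for every write
bytes_per_url = 500             # approximate size of one stored record
retention_years = 5

write_qps = new_urls_per_day / SECONDS_PER_DAY
read_qps = write_qps * read_write_ratio
total_urls = new_urls_per_day * 365 * retention_years
storage_tb = total_urls * bytes_per_url / 1e12  # decimal terabytes

print(f"write QPS ~ {write_qps:,.0f}")
print(f"read QPS  ~ {read_qps:,.0f}")
print(f"URLs stored over {retention_years} years: {total_urls:,}")
print(f"storage   ~ {storage_tb:.2f} TB")
```

The exact write rate is slightly below 1,160 per second; the list above rounds it for convenience.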
High-Level Architecture
The system consists of:
- API Gateway — handles incoming requests, rate limiting, authentication
- URL Shortening Service — generates short URLs, stores mappings
- Redirection Service — looks up short URLs, returns 301/302 redirects
- Database — stores URL mappings (short → long)
- Cache Layer — Redis cache for hot URLs
- Analytics Service — tracks click counts, geographic data
Database Design
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-incrementing primary key |
| short_url | VARCHAR(7) | The generated short code |
| long_url | TEXT | Original URL |
| created_at | TIMESTAMP | Creation time |
| expires_at | TIMESTAMP | Expiration time |
| user_id | BIGINT | Optional creator |
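As a rough illustration, the table above could be created and queried as follows. This sketch uses Python's built-in sqlite3 module purely for convenience; a production system would use a different database. The table name `urls`, the UNIQUE constraint on short_url, and the sample values are assumptions not stated in the design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Columns mirror the table above. The UNIQUE constraint on short_url is an
# assumption: it rejects duplicate codes and gives lookups an index to use.
conn.execute(
    """
    CREATE TABLE urls (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        short_url  VARCHAR(7) NOT NULL UNIQUE,
        long_url   TEXT NOT NULL,
        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        expires_at TIMESTAMP,
        user_id    BIGINT
    )
    """
)

# Store one mapping (sample values are hypothetical).
conn.execute(
    "INSERT INTO urls (short_url, long_url, expires_at) VALUES (?, ?, ?)",
    ("abc1234", "https://example.com/some/long/path", "2030-01-01 00:00:00"),
)

# The redirect path only needs the long URL for a given short code.
row = conn.execute(
    "SELECT long_url FROM urls WHERE short_url = ?", ("abc1234",)
).fetchone()
print(row[0])
```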
Hash Generation Approaches
Approach 1: Base62 Encoding
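Convert the auto-incrementing ID to base62 (a-z, A-Z, 0-9). A 7-character string gives 62^7 ≈ 3.5 trillion unique codes. Note that codes derived directly from sequential IDs are easy to guess, which conflicts with the requirement that short links not be predictable unless the ID is obfuscated first.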
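A minimal sketch of base62 encoding and decoding is shown below. The alphabet order (a-z, then A-Z, then 0-9) follows the listing above; the function names are illustrative.

```python
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 symbols.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Recover the integer ID from a base62 string."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

The largest ID that still fits in 7 characters is 62^7 - 1, so the code length only grows past 7 after about 3.5 trillion URLs.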
Approach 2: MD5/SHA256 + Truncation
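Hash the long URL and take the first 7 characters of the encoded digest. If that code is already taken by a different URL, append a counter to the input and hash again until a free code is found.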
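The sketch below shows one way this approach might look. It assumes SHA-256, a base62 encoding of the digest, and a counter appended to the input on each retry; the in-memory dict stands in for the database, and all names are hypothetical.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def to_base62(n: int) -> str:
    """Encode a non-negative integer in base62."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def short_code(long_url: str, attempt: int = 0, length: int = 7) -> str:
    """Hash the URL (plus a counter on retries) and truncate to `length` chars."""
    data = long_url if attempt == 0 else f"{long_url}{attempt}"
    digest = hashlib.sha256(data.encode("utf-8")).digest()
    code = to_base62(int.from_bytes(digest, "big"))
    # Pad in the unlikely case that the encoded digest is shorter than `length`.
    return code.rjust(length, ALPHABET[0])[:length]


def shorten(long_url: str, store: dict[str, str]) -> str:
    """Return a code for long_url, retrying with a counter on collision."""
    attempt = 0
    while True:
        code = short_code(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:
            return code  # same URL submitted again: reuse its code
        attempt += 1  # code belongs to a different URL: try the next counter
```

One consequence of this design is that the same long URL always maps to the same short code, which may or may not be desirable when links have per-user expiry or analytics.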
Approach 3: Pre-generated Keys
Generate keys offline and store in a key database. When a new URL comes in, assign the next available key.
Deep Dive: Key Generation Service
The Key Generation Service (KGS) pre-generates random 7-character strings and stores them in a database. Two tables are used:
- keys_available — pool of unused keys
- keys_used — keys that have been assigned
When a new URL arrives, KGS moves a key from keys_available to keys_used and assigns it. This avoids runtime collision handling entirely.
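To avoid concurrency issues, KGS loads a batch of keys into memory and marks them as used immediately, so the same key is never handed out twice. If a KGS server dies before assigning its batch, those keys are lost, but with roughly 3.5 trillion possible keys this waste is acceptable.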
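A simplified, single-process sketch of the KGS follows. Python sets stand in for the keys_available and keys_used tables, a lock serializes access, and keys are moved to keys_used as soon as a batch is loaded; that last detail, the batch size, and the class and method names are assumptions made for illustration.

```python
import secrets
import string
import threading

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def generate_keys(count: int, length: int = 7) -> set[str]:
    """Pre-generate `count` distinct random keys (the offline step)."""
    keys: set[str] = set()
    while len(keys) < count:
        keys.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return keys


class KeyGenerationService:
    def __init__(self, keys: set[str], batch_size: int = 100) -> None:
        self.keys_available = set(keys)   # stand-in for the keys_available table
        self.keys_used: set[str] = set()  # stand-in for the keys_used table
        self._batch: list[str] = []       # keys currently held in memory
        self._batch_size = batch_size
        self._lock = threading.Lock()

    def _load_batch(self) -> None:
        # Keys are marked used the moment they are loaded, so a crash
        # wastes them instead of risking a duplicate assignment.
        while self.keys_available and len(self._batch) < self._batch_size:
            key = self.keys_available.pop()
            self.keys_used.add(key)
            self._batch.append(key)

    def next_key(self) -> str:
        with self._lock:
            if not self._batch:
                self._load_batch()
            if not self._batch:
                raise RuntimeError("key pool exhausted")
            return self._batch.pop()
```

In a real deployment the two sets would be database tables and the batch move would happen in a single transaction.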
Caching Strategy
We use Redis as a caching layer between the application servers and the database:
- Cache the most frequently accessed URLs (80/20 rule)
- Use LRU eviction policy
- Cache size: 20% of the 100M URLs created per day × 500 bytes ≈ 10 GB
- On cache miss, query the database and update the cache
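The lookup path described above is often called cache-aside. The sketch below imitates it with an in-memory LRU cache built on OrderedDict in place of Redis, and a plain dict in place of the database; both stand-ins and all names are assumptions for illustration only.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Tiny LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict[str, str] = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


def resolve(short_code: str, cache: LRUCache, database: dict[str, str]) -> Optional[str]:
    """Return the long URL, reading the cache first and the database on a miss."""
    long_url = cache.get(short_code)
    if long_url is not None:
        return long_url
    long_url = database.get(short_code)
    if long_url is not None:
        cache.put(short_code, long_url)  # populate the cache for next time
    return long_url
```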
Scaling Strategies
- Database Sharding — shard by hash of short URL
- Read Replicas — separate read/write traffic
- Caching — cache frequently accessed URLs in Redis
- CDN — serve redirects from edge locations
- Rate Limiting — prevent abuse
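To make the sharding item concrete, here is a minimal sketch of picking a shard from a hash of the short code. The choice of SHA-256 and simple modulo placement are assumptions; the function name is hypothetical.

```python
import hashlib


def shard_for(short_code: str, num_shards: int) -> int:
    """Map a short code to a shard index in the range [0, num_shards)."""
    # Use a stable hash. Python's built-in hash() is randomized per process
    # for strings, so it would route the same code to different shards.
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(shard_for("abc1234", 4))
```

Modulo placement remaps most keys whenever the number of shards changes, so systems that expect to add shards often use consistent hashing instead.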
URL Cleanup and Expiration
A background cleanup service runs periodically to:
- Remove expired URLs from the database
- Return expired keys back to the key pool
- Update analytics for expired links
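The first two cleanup steps might look like the sketch below. A dict of records stands in for the URL table and a set stands in for the key pool; the record layout and function name are assumptions, and the analytics update is omitted.

```python
from datetime import datetime, timezone


def cleanup_expired(urls: dict[str, dict], keys_available: set[str], now: datetime) -> int:
    """Delete expired mappings and return their codes to the key pool.

    Returns the number of mappings removed.
    """
    # Collect first, then delete, to avoid mutating the dict while iterating.
    expired = [
        code
        for code, record in urls.items()
        if record["expires_at"] is not None and record["expires_at"] <= now
    ]
    for code in expired:
        del urls[code]
        keys_available.add(code)  # the key can be assigned again
    return len(expired)


urls = {
    "old0001": {"long_url": "https://example.com/old",
                "expires_at": datetime(2020, 1, 1, tzinfo=timezone.utc)},
    "new0001": {"long_url": "https://example.com/new",
                "expires_at": datetime(2999, 1, 1, tzinfo=timezone.utc)},
    "keep001": {"long_url": "https://example.com/keep", "expires_at": None},
}
key_pool: set[str] = set()
removed = cleanup_expired(urls, key_pool, datetime(2025, 1, 1, tzinfo=timezone.utc))
print(removed, sorted(urls), sorted(key_pool))
```

Recycled keys should only be reissued once cached copies of the old mapping have been evicted; otherwise a stale cache entry could redirect to the previous destination.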
Trade-offs
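- 301 vs 302 Redirect: 301 is permanent, so browsers cache it and repeat visits may skip the service; 302 is temporary, so every visit reaches the service, which is better for analytics but adds load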
- SQL vs NoSQL: SQL for ACID guarantees, NoSQL for horizontal scaling
- Custom aliases: Allow but validate to prevent abuse
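Two of these trade-offs translate directly into small pieces of logic, sketched below: choosing the redirect status code and validating a custom alias. The alias rules (4 to 30 characters, letters, digits, hyphen and underscore, plus a short reserved-word list) are purely illustrative assumptions, as are the function names.

```python
import re
from http import HTTPStatus

# Hypothetical policy: these names and limits are not part of the design above.
RESERVED_ALIASES = {"api", "admin", "login"}
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9_-]{4,30}")


def is_valid_alias(alias: str) -> bool:
    """Accept only aliases that match the pattern and are not reserved."""
    return (
        ALIAS_PATTERN.fullmatch(alias) is not None
        and alias.lower() not in RESERVED_ALIASES
    )


def build_redirect(long_url: str, track_clicks: bool) -> tuple[int, dict[str, str]]:
    """Return (status code, headers) for a redirect response.

    302 keeps every click flowing through the service for analytics;
    301 lets browsers cache the redirect and skip the service later.
    """
    status = HTTPStatus.FOUND if track_clicks else HTTPStatus.MOVED_PERMANENTLY
    return int(status), {"Location": long_url}
```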
Summary
A URL shortener is deceptively simple on the surface but involves interesting design decisions around hash generation, caching, database sharding, and analytics. The key insight is that reads vastly outnumber writes, so the system should be optimized for fast lookups with aggressive caching and database read replicas.