
Database Design Interview Questions for Senior Engineers (2026)

Top database design interview questions with detailed answer frameworks covering schema design, indexing, partitioning, replication, and choosing the right database for your use case.

20 min read · Updated Apr 19, 2026
Tags: interview-questions, database-design, system-design, senior-engineer

Why Database Design Matters in Senior Engineering Interviews

Database design decisions are among the most impactful and difficult-to-reverse architectural choices a senior engineer makes. The wrong database choice, schema design, or indexing strategy can lead to performance cliffs that are expensive to fix after data has accumulated. Interviewers test database design to assess whether candidates can reason about data modeling, access patterns, consistency requirements, and operational concerns.

At companies like Amazon and Google, database design questions often appear within broader system design problems. The interviewer will ask you to design the data layer for a system, and your choices reveal how deeply you understand storage engines, query optimization, and the trade-offs between different database paradigms.

This guide covers the most critical database design questions for senior engineering interviews. For foundational reading, see our database design concepts and the system design interview guide. Explore our learning paths for structured preparation.

1. How do you choose between a relational database and a NoSQL database?

What the interviewer is really asking: Can you articulate the trade-offs beyond the surface level and match database characteristics to specific requirements?

Answer framework:

Avoid the trap of listing generic pros and cons. Instead, reason from requirements.

Choose relational (PostgreSQL, MySQL) when: you need ACID transactions across multiple tables, your data has complex relationships with many joins, your query patterns are diverse and unpredictable (ad hoc queries), you need strong consistency guarantees, or your data model is well-defined and relatively stable. Relational databases are the right default choice for most applications.

Choose a document store (MongoDB, DynamoDB) when: your data is naturally hierarchical (documents with nested objects), you need flexible schemas that evolve frequently, your access pattern is primarily key-value or document retrieval, and you need horizontal scalability with predictable latency at any scale.

Choose a wide-column store (Cassandra, HBase) when: you have massive write throughput requirements, your data has a natural time-series component, you need multi-region replication with tunable consistency, and your query patterns are known in advance (Cassandra requires you to model your tables around your queries).

Choose a graph database (Neo4j, Amazon Neptune) when: your core queries involve traversing relationships (social networks, recommendation engines, fraud detection), the relationships themselves have properties, and you need queries like "find all friends of friends who bought product X."

Discuss the CAP theorem implications: most relational databases are CP (consistent but unavailable during partitions), while many NoSQL databases are AP (available but eventually consistent). DynamoDB lets you choose per read: eventually consistent reads by default, or strongly consistent reads via the ConsistentRead parameter.

Common mistake: choosing NoSQL for scalability alone. Modern relational databases like CockroachDB, Spanner, and Aurora can scale horizontally while providing SQL semantics and ACID transactions. The scalability gap between SQL and NoSQL has narrowed significantly. See our database comparison guide for detailed benchmarks.

Follow-up questions:

  • When would you use both a relational and a NoSQL database in the same system?
  • How does NewSQL (CockroachDB, Spanner) change this decision?
  • What database would you choose for an e-commerce platform and why?

2. Explain database indexing strategies and their trade-offs

What the interviewer is really asking: Do you understand indexes beyond "they make queries faster" and can you reason about when indexes help, when they hurt, and which type to use?

Answer framework:

B-tree indexes (the default in most databases): balanced tree structure that keeps data sorted. O(log n) for point lookups and range queries. Every SQL database supports these. The trade-off: each index adds overhead to write operations (the index must be updated on every insert, update, and delete). A table with 10 indexes will have significantly slower writes than one with 2.

Hash indexes: O(1) for point lookups but cannot support range queries or sorting. Use when you only need exact-match lookups (e.g., looking up a user by session token). In-memory hash indexes (as in Bitcask, which keeps the entire key directory in memory) provide extremely fast reads.

Composite indexes: an index on multiple columns (e.g., (country, city, name)). The column order matters critically: the index can be used for queries that filter on country, or country+city, or country+city+name, but NOT for queries that filter only on city or name (the leftmost prefix rule). Design composite indexes based on your most common query patterns.

Covering indexes: an index that includes all columns needed by a query, so the database can answer the query from the index alone without reading the table (an index-only scan). This eliminates the random I/O of going from the index to the table rows, dramatically improving performance for read-heavy queries.

Partial indexes: an index on a subset of rows (e.g., CREATE INDEX ON orders(user_id) WHERE status = 'active'). Smaller than a full index, faster to maintain, and useful when you only query a specific subset of data frequently.
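To make these concrete, here is a minimal PostgreSQL sketch of composite, covering, and partial indexes. The orders table and index names are illustrative, not from a specific system.

  -- Hypothetical table used only for illustration.
  CREATE TABLE orders (
      id         bigserial PRIMARY KEY,
      user_id    bigint        NOT NULL,
      country    text          NOT NULL,
      city       text          NOT NULL,
      status     text          NOT NULL,
      total      numeric(10,2) NOT NULL,
      created_at timestamptz   NOT NULL DEFAULT now()
  );

  -- Composite index: serves filters on (country), (country, city),
  -- and (country, city, status), but not on city or status alone.
  CREATE INDEX idx_orders_country_city_status
      ON orders (country, city, status);

  -- Covering index (PostgreSQL 11+): INCLUDE stores total in the index,
  -- so the commented query can run as an index-only scan.
  CREATE INDEX idx_orders_user_created
      ON orders (user_id, created_at) INCLUDE (total);
  -- SELECT created_at, total FROM orders WHERE user_id = 42;

  -- Partial index: indexes only the subset of rows you query frequently.
  CREATE INDEX idx_orders_active_user
      ON orders (user_id) WHERE status = 'active';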

Full-text indexes: inverted indexes for text search (GIN indexes in PostgreSQL, FULLTEXT indexes in MySQL). These tokenize text content and support keyword search with relevance ranking. For more advanced search, consider a dedicated search engine like Elasticsearch.

Discuss index monitoring: look at slow query logs to identify missing indexes, use EXPLAIN ANALYZE to verify the query planner uses your indexes, monitor index size and bloat, and periodically review unused indexes (they cost write performance without benefiting reads).
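Continuing the hypothetical orders table above, a short PostgreSQL sketch of that monitoring loop: verify the plan with EXPLAIN ANALYZE and look for indexes that are never used.

  -- Confirm the planner uses the index: expect an Index Scan or
  -- Index Only Scan in the output rather than a Seq Scan.
  EXPLAIN ANALYZE
  SELECT created_at, total FROM orders WHERE user_id = 42;

  -- Indexes with zero scans are candidates for removal (largest first).
  SELECT schemaname, relname, indexrelname, idx_scan
  FROM pg_stat_user_indexes
  WHERE idx_scan = 0
  ORDER BY pg_relation_size(indexrelid) DESC;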

Follow-up questions:

  • How do you decide which columns to index?
  • What is index bloat and how do you handle it in PostgreSQL?
  • How do indexes interact with database caching?

3. Design a schema for a social media platform

What the interviewer is really asking: Can you make practical schema design decisions that balance normalization, query performance, and scalability?

Answer framework:

Start with the core entities: users, posts, comments, likes, follows, and messages. Identify the key relationships: users create posts (one-to-many), users follow users (many-to-many), posts have comments (one-to-many), users like posts (many-to-many).

For the users table: id (primary key), username (unique index), email (unique index), display_name, bio, avatar_url, created_at. Keep the user profile lean. Store extended profile data in a separate user_profiles table to keep the frequently-accessed users table small.

For the follows table (social graph): follower_id, following_id, created_at. Primary key on (follower_id, following_id). Add a secondary index on (following_id, follower_id) for "who follows me" queries. This is a many-to-many relationship and will be one of the largest tables.

For posts: id, user_id (indexed), content, media_urls (JSON array), created_at (indexed), like_count (denormalized counter), comment_count (denormalized counter). Discuss the denormalization decision: keeping counters on the post avoids expensive COUNT queries but introduces the challenge of keeping counters accurate.

For the timeline (the most interesting design challenge): discuss fan-out on write vs fan-out on read as covered in our news feed system design. For the database layer, a materialized timeline table (user_id, post_id, created_at) with a compound index on (user_id, created_at DESC) enables fast timeline retrieval.
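A condensed PostgreSQL sketch of the tables described above. Names and types are illustrative; columns such as bio, avatar_url, and media_urls are omitted for brevity.

  CREATE TABLE users (
      id           bigserial PRIMARY KEY,
      username     text NOT NULL UNIQUE,
      email        text NOT NULL UNIQUE,
      display_name text,
      created_at   timestamptz NOT NULL DEFAULT now()
  );

  CREATE TABLE follows (
      follower_id  bigint NOT NULL REFERENCES users(id),
      following_id bigint NOT NULL REFERENCES users(id),
      created_at   timestamptz NOT NULL DEFAULT now(),
      PRIMARY KEY (follower_id, following_id)
  );
  -- Secondary index for "who follows me" queries.
  CREATE INDEX idx_follows_following ON follows (following_id, follower_id);

  CREATE TABLE posts (
      id            bigserial PRIMARY KEY,
      user_id       bigint NOT NULL REFERENCES users(id),
      content       text   NOT NULL,
      like_count    int    NOT NULL DEFAULT 0,  -- denormalized counter
      comment_count int    NOT NULL DEFAULT 0,  -- denormalized counter
      created_at    timestamptz NOT NULL DEFAULT now()
  );
  CREATE INDEX idx_posts_user_created ON posts (user_id, created_at DESC);

  -- Materialized timeline, written at fan-out time.
  CREATE TABLE timelines (
      user_id    bigint      NOT NULL,
      post_id    bigint      NOT NULL,
      created_at timestamptz NOT NULL,
      PRIMARY KEY (user_id, created_at, post_id)
  );
  -- Timeline read: SELECT post_id FROM timelines
  -- WHERE user_id = $1 ORDER BY created_at DESC LIMIT 50;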

Discuss database sharding: shard users and their posts by user_id. This co-locates a user's data for efficient single-shard queries. Cross-shard queries (e.g., a post's comments from many users) require scatter-gather. Consider denormalization to avoid cross-shard joins.

Discuss eventual consistency trade-offs: like counts can be slightly stale (update asynchronously). Follow counts can lag. But user authentication data must be strongly consistent.

For scalability: separate read and write paths. Write to the primary, read from replicas. Use caching for hot data (popular posts, user profiles). Store media in object storage, only URLs in the database.

Follow-up questions:

  • How would you handle the migration when adding a new feature like Stories?
  • How do you shard the follows table when both follower and following queries are common?
  • What changes when the platform grows from 1M to 1B users?

4. How does database sharding work and when should you shard?

What the interviewer is really asking: Do you understand the operational complexity of sharding and can you evaluate it against alternatives?

Answer framework:

Sharding is horizontal partitioning: distributing rows across multiple database instances based on a shard key. Each shard holds a subset of the data and can be on a separate server.

When to shard: NOT as a first resort. Before sharding, exhaust other options: read replicas (scale reads), caching (reduce database load), query optimization (indexes, query rewriting), vertical scaling (bigger machine), and table partitioning within a single instance. Shard when your write volume exceeds what a single machine can handle, or your data size exceeds what fits on a single machine's storage.

Shard key selection is the most critical decision. A good shard key has high cardinality (many distinct values for even distribution), is present in most queries (avoids cross-shard queries), distributes writes evenly (no hot shards), and does not change frequently (resharding is expensive). Example: user_id is often a good shard key for user-centric applications.

Sharding strategies: hash-based (hash(shard_key) mod N, even distribution but no range queries across shards), range-based (key ranges assigned to shards, supports range queries but can have hot spots), and directory-based (a lookup table maps keys to shards, most flexible but the directory is a single point of failure).
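Hash partitioning inside a single PostgreSQL instance is a convenient way to show the key-to-shard mapping; across servers the same idea applies, with a router in front. A minimal sketch (table and partition names are illustrative):

  CREATE TABLE events (
      user_id    bigint      NOT NULL,
      payload    jsonb,
      created_at timestamptz NOT NULL DEFAULT now()
  ) PARTITION BY HASH (user_id);

  CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
  CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
  CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
  CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);

  -- A query that includes the shard key is pruned to one partition;
  -- a query without it has to touch all four.
  SELECT count(*) FROM events WHERE user_id = 12345;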

Cross-shard challenges: joins across shards are not supported by the database, you must do them in the application layer. Global unique constraints require coordination across shards. Transactions spanning multiple shards require distributed transactions (2PC) or the saga pattern. Aggregation queries (COUNT, SUM across all data) require scatter-gather from all shards.

Resharding (adding or removing shards): this is operationally painful. Use consistent hashing to minimize data movement. Tools like Vitess (for MySQL) and Citus (for PostgreSQL) help manage sharding transparently.

Discuss alternatives to manual sharding: managed databases that handle sharding automatically (DynamoDB, Spanner, CockroachDB). These provide the scalability of sharding with a simpler operational model. The trade-off is less control and higher cost.

Follow-up questions:

  • How would you reshard a production database without downtime?
  • What happens when one shard becomes much larger than others?
  • How do you handle reporting queries that need data from all shards?

5. Explain transaction isolation levels and when to use each

What the interviewer is really asking: Do you understand the practical implications of isolation levels on data correctness and performance?

Answer framework:

Transaction isolation levels control what data a transaction can see from concurrent transactions. From weakest to strongest:

Read Uncommitted: a transaction can read data written by other uncommitted transactions (dirty reads). Almost never used in practice. One scenario: real-time analytics where approximate results are acceptable and performance is critical.

Read Committed (default in PostgreSQL, Oracle): a transaction only sees data committed before the current statement (no dirty reads). But if the same query runs twice within a transaction, it might return different results (non-repeatable reads) because another transaction committed between the two queries. Use for: most OLTP workloads where perfect consistency within a transaction is not required.

Repeatable Read (default in MySQL InnoDB): a transaction sees a snapshot of the database as of the transaction start. The same query always returns the same results within a transaction. Prevents dirty reads and non-repeatable reads. But phantom reads can still occur (new rows matching a query condition appear between queries). In practice, InnoDB largely prevents phantoms as well: snapshot reads do not see newly inserted rows, and gap locks block phantoms for locking reads.

Serializable (strongest): transactions execute as if they were serial (one at a time). Prevents all anomalies. Implementations use either two-phase locking (pessimistic, blocks concurrent access) or Serializable Snapshot Isolation (SSI, optimistic, detects conflicts and aborts). SSI is used by PostgreSQL and CockroachDB. Trade-off: highest correctness but highest abort rate and potential for deadlocks.

Practical guidance: use Read Committed for most operations. Use Serializable for financial transactions, inventory management, and any operation where inconsistency causes real harm. The performance difference is often smaller than expected, especially with SSI, because the common case (no conflicts) is fast.

Discuss the write skew anomaly (only prevented by Serializable): two transactions read the same data, make decisions based on it, and write non-conflicting rows. Each transaction individually is correct, but together they violate an invariant. Example: two doctors check that at least one is on call, both see the other is on call, and both go off call.
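A sketch of that write skew in PostgreSQL, assuming a hypothetical on_call table. Under Read Committed or Repeatable Read both transactions commit and the invariant is broken; under Serializable one of them is aborted.

  CREATE TABLE on_call (doctor text PRIMARY KEY, is_on_call boolean NOT NULL);
  INSERT INTO on_call VALUES ('alice', true), ('bob', true);

  -- Session 1:
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM on_call WHERE is_on_call;  -- sees 2, decides it is safe to leave
  UPDATE on_call SET is_on_call = false WHERE doctor = 'alice';

  -- Session 2, interleaved before session 1 commits:
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM on_call WHERE is_on_call;  -- also sees 2
  UPDATE on_call SET is_on_call = false WHERE doctor = 'bob';

  -- When both sessions COMMIT, PostgreSQL's SSI aborts one of them with a
  -- serialization_failure (SQLSTATE 40001); the application retries it.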

Follow-up questions:

  • How does MVCC relate to isolation levels?
  • What is the performance impact of Serializable isolation in practice?
  • How do isolation levels interact with database replication?

6. How do you design for high write throughput?

What the interviewer is really asking: Do you know techniques beyond basic optimization for handling write-intensive workloads?

Answer framework:

Write amplification is the enemy. Every write to a B-tree database can cause multiple disk I/O operations (updating the data page, updating each index, writing to the WAL). The strategies below reduce it.

Batch writes: instead of inserting rows one at a time, batch them (INSERT INTO table VALUES (v1), (v2), ..., (v1000)). This amortizes the transaction overhead. Most databases see 10-100x throughput improvement with batching.

Use an LSM-tree based storage engine: LSM trees (used by Cassandra, RocksDB, LevelDB) buffer writes in memory (memtable) and flush to disk as sorted files (SSTables). Writes are sequential (fast) rather than random (slow). Trade-off: reads require checking multiple levels, so read performance is lower than B-trees.

Write-behind with a message queue: decouple the API response from the database write. Accept the write into Kafka or a similar queue, respond to the client immediately, and write to the database asynchronously. This trades strong consistency for throughput. Use for: analytics events, logs, non-critical data.

Partitioning writes: split a high-contention resource across multiple rows. A single counter row that all writes contend on is a bottleneck. Split it into 100 counter rows and sum them on read. This trades read complexity for write throughput.
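A small PostgreSQL sketch of counter partitioning; the table name and the slot count of 100 are arbitrary choices for illustration.

  CREATE TABLE page_view_counters (
      page_id bigint NOT NULL,
      slot    int    NOT NULL,
      views   bigint NOT NULL DEFAULT 0,
      PRIMARY KEY (page_id, slot)
  );

  -- Writers pick a random slot, so concurrent increments rarely contend.
  INSERT INTO page_view_counters (page_id, slot, views)
  VALUES (42, floor(random() * 100)::int, 1)
  ON CONFLICT (page_id, slot) DO UPDATE
      SET views = page_view_counters.views + 1;

  -- Readers pay a little extra and sum the slots.
  SELECT sum(views) AS views FROM page_view_counters WHERE page_id = 42;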

Minimize indexes: each index adds write overhead. Audit your indexes and remove unused ones. Consider different index strategies for write-heavy tables (fewer, more targeted indexes).

Use append-only data models: instead of updating rows in place, append new versions. The latest version is the current state. This eliminates update contention and enables time-travel queries. Compact old versions in the background. Event sourcing takes this to the extreme by storing every state change as an immutable event.

Discuss database design patterns for write-heavy systems: CQRS (Command Query Responsibility Segregation) separates write-optimized storage from read-optimized storage, connected by event-driven synchronization.

Follow-up questions:

  • How does PostgreSQL's HOT (Heap Only Tuple) optimization help with updates?
  • When would you choose an LSM-tree over a B-tree storage engine?
  • How do you monitor and diagnose write bottlenecks?

7. Design a database schema for a multi-tenant SaaS application

What the interviewer is really asking: Can you balance tenant isolation, operational simplicity, and cost efficiency in a multi-tenant database design?

Answer framework:

Three isolation models, from strongest to weakest:

Database-per-tenant: each tenant gets their own database instance. Strongest isolation (security, performance, compliance). Simplest schema (no tenant_id everywhere). But highest cost (one instance per tenant) and hardest to manage operationally (schema migrations must be applied to every database). Use for: enterprise SaaS with strict compliance requirements, tenants with very different data volumes.

Schema-per-tenant: shared database instance, but each tenant has their own schema (namespace). Good isolation (queries cannot accidentally cross schemas). Moderate cost. Schema migrations still must be applied per schema. Use for: moderate number of tenants (hundreds, not thousands) with different customization needs.

Shared tables with tenant_id: all tenants share the same tables, with a tenant_id column on every table. Most cost-efficient. Simplest to manage (one schema, one migration). But weakest isolation: a missing WHERE tenant_id = ? clause leaks data between tenants. Use Row-Level Security (PostgreSQL) to enforce tenant isolation at the database level as a safety net. Use for: high-volume SaaS (thousands of tenants), tenants with similar and small data volumes.

For the shared-table approach, design considerations: add tenant_id to every primary key and index. For composite indexes, always lead with tenant_id (this ensures each tenant's queries only scan their own data). Use connection pooling with tenant-aware routing.
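A PostgreSQL sketch of the shared-table model with Row-Level Security as the safety net. The invoices table, the app.current_tenant setting, and the role setup are illustrative assumptions.

  CREATE TABLE invoices (
      tenant_id  bigint        NOT NULL,
      id         bigserial,
      amount     numeric(12,2) NOT NULL,
      created_at timestamptz   NOT NULL DEFAULT now(),
      PRIMARY KEY (tenant_id, id)
  );
  -- Composite indexes lead with tenant_id so each tenant scans only its own rows.
  CREATE INDEX idx_invoices_tenant_created ON invoices (tenant_id, created_at);

  -- Row-Level Security: even a query missing WHERE tenant_id = ... only
  -- sees the current tenant's rows.
  ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
  ALTER TABLE invoices FORCE ROW LEVEL SECURITY;  -- apply the policy to the table owner too
  CREATE POLICY tenant_isolation ON invoices
      USING (tenant_id = current_setting('app.current_tenant')::bigint);

  -- The application sets the tenant per connection or per transaction:
  SET app.current_tenant = '42';
  SELECT sum(amount) FROM invoices;  -- implicitly scoped to tenant 42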

Discuss noisy neighbor problems: one tenant running expensive queries can affect all other tenants. Solutions: query timeout limits, per-tenant connection limits, resource quotas (PostgreSQL resource groups), or workload isolation (route large tenants to dedicated read replicas).

For scalability: when the shared database becomes a bottleneck, shard by tenant_id. Large tenants get dedicated shards. Small tenants share shards. This hybrid approach optimizes cost while providing isolation where needed.

For data retention: support per-tenant data retention policies and deletion. GDPR requires the ability to delete all of a tenant's data on request. With database-per-tenant, this is a simple DROP DATABASE. With shared tables, it requires carefully deleting rows while holding appropriate locks.

Follow-up questions:

  • How would you migrate from shared tables to database-per-tenant for a large customer?
  • How do you handle cross-tenant analytics and reporting?
  • How do you manage schema migrations across thousands of tenant databases?

8. How do you design for database high availability?

What the interviewer is really asking: Do you understand replication topologies, failover mechanisms, and the operational practices needed for database reliability?

Answer framework:

The foundation is replication: maintain multiple copies of the database so that if one node fails, another can take over.

Synchronous replication: the primary waits for at least one replica to confirm the write before acknowledging the client. Guarantees no data loss on failover (zero RPO). Trade-off: every write has the latency of the slowest required replica. PostgreSQL synchronous_standby_names configures this.

Asynchronous replication: the primary acknowledges writes immediately and replicates in the background. Lower write latency. But failover can lose recent writes (non-zero RPO). The replication lag is the window of potential data loss. Monitor replication lag continuously.
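A PostgreSQL-flavored sketch of both settings in practice; the standby names are placeholders, and ALTER SYSTEM requires superuser privileges.

  -- Synchronous replication: require any one of the named standbys to
  -- confirm each commit before it is acknowledged to the client.
  ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (replica_a, replica_b)';
  SELECT pg_reload_conf();

  -- Monitor replication lag per standby from the primary.
  SELECT application_name,
         state,
         pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
         replay_lag
  FROM pg_stat_replication;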

Semi-synchronous: wait for one replica (out of several) to confirm. Balances durability and latency. If the synchronous replica is slow, promote another replica to synchronous.

Automatic failover: use a consensus-based system (etcd, ZooKeeper, Patroni for PostgreSQL) to detect primary failure and promote a replica. The key challenge is avoiding split-brain: two nodes both believing they are primary. Fencing mechanisms (cutting off the old primary's access to storage or the network) prevent this.

For high availability targets: 99.9% (8.7 hours downtime per year), 99.99% (52 minutes per year), 99.999% (5 minutes per year). Each nine requires significantly more investment. For 99.99%+ you need multi-region deployment.

Multi-region database deployment: active-passive (one region handles writes, the other is a warm standby), active-active with conflict resolution (both regions accept writes, conflicts resolved via CRDTs or last-writer-wins), and globally distributed databases like Spanner that use synchronized clocks for consistent cross-region transactions.

Backup strategy (defense in depth): continuous WAL archiving (point-in-time recovery to any second), daily full backups to a separate storage system, cross-region backup replication, and regular restore testing. A backup that has never been tested is not a backup.

Discuss disaster recovery planning: define RTO (Recovery Time Objective: how quickly you need to recover) and RPO (Recovery Point Objective: how much data loss is acceptable). These drive your replication and backup strategy.

Follow-up questions:

  • How do you handle failover without losing in-flight transactions?
  • What is the impact of replication lag on read consistency?
  • How would you test your disaster recovery plan?

9. Explain database connection pooling and why it matters at scale

What the interviewer is really asking: Do you understand the operational bottleneck that database connections create and how to manage them?

Answer framework:

Database connections are expensive: each connection consumes memory (5-10MB in PostgreSQL for work_mem, temp buffers, etc.), a process or thread, and file descriptors. A PostgreSQL server with 16GB of RAM can typically handle 200-500 connections. If you have 50 application servers each opening 20 connections, you need 1,000 connections, which can overwhelm the database.

Application-level pooling (HikariCP, SQLAlchemy pool): each application instance maintains a pool of reusable connections. A request checks out a connection, uses it, and returns it. Pool sizes of 10-20 per instance are typical. Advantages: simple, no additional infrastructure. Limitations: total connections = pool_size x number_of_instances, which still grows with horizontal scaling.

External connection pooler (PgBouncer, ProxySQL): sits between applications and the database. All application connections go to the pooler, which maintains a smaller pool of actual database connections. Multiplexes many client connections onto few database connections. This decouples application scaling from database connection scaling.

PgBouncer modes: session pooling (connection assigned for the session duration, least aggressive), transaction pooling (connection assigned for the transaction duration, returned between transactions, most common), and statement pooling (connection assigned per statement, most aggressive but breaks multi-statement transactions).

Sizing the pool: too small and requests queue waiting for connections (latency spikes). Too large and you overwhelm the database (context switching, memory pressure). A good starting point for the database-side pool: 2 x CPU cores + disk spindles. Monitor queue wait time and adjust.
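A quick PostgreSQL sketch for checking how connections are actually being used on the database side:

  SHOW max_connections;

  SELECT state, count(*)
  FROM pg_stat_activity
  WHERE datname = current_database()
  GROUP BY state;
  -- Many 'idle' rows usually mean oversized application pools; many
  -- 'idle in transaction' rows usually mean connection or transaction leaks.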

Discuss connection management in microservices: each service has its own connection pool. With dozens of services, total connections multiply quickly. An external pooler or a shared database proxy becomes essential. Consider whether each service needs its own database (microservice pattern) or shares a database (reduces total connections but increases coupling).

Discuss serverless challenges: serverless functions (Lambda) can spin up thousands of instances simultaneously, each opening database connections. Solutions: Amazon RDS Proxy (connection pooling for Lambda), using DynamoDB instead (HTTP-based, no persistent connections), or implementing connection limits at the application layer.

Follow-up questions:

  • What happens when the connection pool is exhausted?
  • How do you diagnose connection leaks?
  • How does connection pooling interact with load balancing across read replicas?

10. How do you approach database migration and schema evolution?

What the interviewer is really asking: Can you evolve a production database schema safely without causing downtime or data loss?

Answer framework:

The golden rule: every migration must be backward compatible with the currently running application code. This enables zero-downtime deployments where the migration and code deployment are separate steps.

Safe operations: adding a nullable column (instant in PostgreSQL 11+, MySQL 8.0+), adding an index concurrently (CREATE INDEX CONCURRENTLY in PostgreSQL, no table lock), adding a new table, and adding a column with a default (instant in PostgreSQL 11+).

Dangerous operations: dropping a column (the running code might still reference it), renaming a column (breaks existing queries), changing a column type (requires rewriting the table), and adding a NOT NULL constraint (needs to validate all existing rows).

Safe pattern for dropping a column: (1) deploy new code that stops reading or writing the column, (2) wait for all old instances to be replaced, (3) run the migration to drop the column. This two-phase approach ensures the column is no longer in use before it is removed.

Safe pattern for renaming a column: (1) add the new column, (2) deploy code that writes to both old and new columns, (3) backfill the new column from the old column, (4) deploy code that reads from the new column, (5) deploy code that stops writing to the old column, (6) drop the old column. This is a multi-step process spanning multiple deployments.
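A PostgreSQL sketch of the database-side steps for that rename, using a hypothetical customers table where phone is being renamed to phone_number; steps 2, 4, and 5 are application deployments, not SQL.

  -- Step 1 (expand): adding a nullable column is instant.
  ALTER TABLE customers ADD COLUMN phone_number text;

  -- Step 3 (backfill): copy values in small batches, repeated until zero
  -- rows are updated, so no long-lived locks are held.
  UPDATE customers
  SET phone_number = phone
  WHERE id IN (
      SELECT id FROM customers WHERE phone_number IS NULL LIMIT 10000
  );

  -- Step 6 (contract): only after no running code references the old column.
  ALTER TABLE customers DROP COLUMN phone;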

For large table migrations: ALTER TABLE on a table with billions of rows can lock the table for hours. Tools like gh-ost (GitHub) and pt-online-schema-change (Percona) create a copy of the table, apply the migration to the copy, sync ongoing writes, and swap the tables atomically. This keeps the original table available throughout.

Discuss migration testing: always test migrations on a copy of production data (anonymized if needed). Measure how long the migration takes. Ensure the rollback migration works. Include both up and down migrations.

For distributed systems: schema migrations across a sharded database must be coordinated carefully. Apply the migration to all shards. If one shard fails, you must be able to roll back all shards. Tools like Vitess handle this orchestration.

Follow-up questions:

  • How do you handle a migration that fails halfway through?
  • How would you migrate a column from one type to another on a table with 10 billion rows?
  • How do you coordinate schema changes across microservices with shared databases?

11. Design a time-series data storage system

What the interviewer is really asking: Do you understand the unique access patterns and optimization opportunities for time-series data?

Answer framework:

Time-series data has unique characteristics: writes are append-only (no updates), queries are almost always time-range based, recent data is accessed far more frequently than old data, and data can be downsampled (aggregated) for older time ranges.

Storage engine considerations: LSM-tree based engines are ideal for time-series because writes are sequential (appending to the current time). B-tree engines are less efficient because they require random I/O to insert into the sorted tree.

Compression is critical for time-series: timestamps are highly compressible (delta-of-delta encoding, since timestamps arrive at regular intervals, reduces each timestamp to 1-2 bits), values use XOR encoding (consecutive values are often similar, XOR produces many leading zeros), and dictionary encoding for tag values (status=ok appears millions of times, store it once and reference by ID). These techniques achieve 10-15x compression.

Partitioning strategy: time-based partitioning (each partition covers a time range, e.g., one day). This enables efficient range queries (only scan relevant partitions) and efficient deletion of old data (drop an entire partition instead of deleting rows). Add tag-based sub-partitioning for queries that filter by specific tags.

Downsampling and retention: store raw data for the recent window (e.g., 30 days at 1-second resolution), downsample to 1-minute resolution for the last year, 1-hour resolution for the last 5 years. Run downsampling jobs in the background. This dramatically reduces storage costs while preserving long-term trends.

Query optimization: pre-compute common aggregations (rollups) at multiple time resolutions. A query for "average CPU over the last year" should read pre-computed hourly rollups, not scan billions of raw data points.
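A PostgreSQL sketch pulling these ideas together: daily range partitions, partition-drop retention, and an hourly rollup table. Table and partition names, dates, and resolutions are illustrative.

  CREATE TABLE metrics (
      ts     timestamptz      NOT NULL,
      series text             NOT NULL,
      value  double precision NOT NULL
  ) PARTITION BY RANGE (ts);

  CREATE TABLE metrics_2026_04_19 PARTITION OF metrics
      FOR VALUES FROM ('2026-04-19') TO ('2026-04-20');

  -- Retention: dropping a day is a metadata operation, not a huge DELETE.
  DROP TABLE metrics_2026_04_10;

  -- Pre-computed hourly rollups for older data.
  CREATE TABLE metrics_hourly (
      bucket    timestamptz NOT NULL,
      series    text        NOT NULL,
      avg_value double precision,
      max_value double precision,
      PRIMARY KEY (bucket, series)
  );

  INSERT INTO metrics_hourly (bucket, series, avg_value, max_value)
  SELECT date_trunc('hour', ts), series, avg(value), max(value)
  FROM metrics
  WHERE ts >= '2026-04-19' AND ts < '2026-04-20'
  GROUP BY 1, 2;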

Discuss purpose-built time-series databases: InfluxDB, TimescaleDB (PostgreSQL extension), and Prometheus. Compare with using a general-purpose database: PostgreSQL with TimescaleDB is a good middle ground that gives you SQL compatibility with time-series optimizations.

Follow-up questions:

  • How would you handle out-of-order data arriving late?
  • How do you query across multiple time partitions efficiently?
  • When would you use Prometheus vs InfluxDB vs TimescaleDB?

12. How do you design for data consistency in a microservices architecture?

What the interviewer is really asking: Do you understand the challenges of maintaining data consistency when each service owns its own database?

Answer framework:

In a microservices architecture, each service owns its data. There is no shared database and no distributed transactions (in practice). Yet many business operations span multiple services. How do you maintain consistency?

The saga pattern: decompose a distributed transaction into a sequence of local transactions. Each service performs its local transaction and publishes an event. The next service reacts to the event. If any step fails, compensating transactions undo previous steps. Orchestration sagas (a central coordinator) are easier to reason about than choreography sagas (each service reacts independently).

Event sourcing: store the sequence of events that led to the current state, not just the current state. The event log is the source of truth. Any service can rebuild its state by replaying events. This naturally provides an audit trail and enables temporal queries. Store events in an append-only event store and use projections to build materialized views for queries.

CQRS: separate the write model (optimized for validation and business rules) from the read model (optimized for queries). Update the read model asynchronously from the write model using events. This accepts eventual consistency between writes and reads but enables each to be optimized independently.

Outbox pattern: when a service needs to update its database AND publish an event atomically (to avoid the dual-write problem where one succeeds and the other fails), write the event to an outbox table in the same database transaction as the business data. A separate process (Debezium CDC, polling publisher) reads the outbox and publishes to Kafka.
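A PostgreSQL sketch of the outbox pattern, assuming a hypothetical orders table with (id, user_id, total):

  CREATE TABLE outbox (
      id           bigserial   PRIMARY KEY,
      aggregate_id bigint      NOT NULL,
      event_type   text        NOT NULL,
      payload      jsonb       NOT NULL,
      created_at   timestamptz NOT NULL DEFAULT now(),
      published_at timestamptz
  );

  -- One local transaction writes the business row AND the event, so either
  -- both happen or neither does: no dual write.
  BEGIN;
  INSERT INTO orders (id, user_id, total) VALUES (1001, 42, 99.00);
  INSERT INTO outbox (aggregate_id, event_type, payload)
  VALUES (1001, 'OrderCreated', '{"order_id": 1001, "total": 99.00}');
  COMMIT;

  -- A relay (Debezium CDC or a polling publisher) reads unpublished rows,
  -- publishes them to Kafka, and then marks them as published:
  UPDATE outbox SET published_at = now() WHERE id = 7;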

Discuss the practical implications of eventual consistency: users might see stale data for a few seconds after a write. Design the UI to account for this. After a user creates an order, show the order from the local cache rather than querying the read model that might not have been updated yet. This is read-your-own-writes consistency at the application level.

Follow-up questions:

  • How do you handle a compensating transaction that fails?
  • How does event sourcing handle schema evolution of events?
  • What is the dual-write problem and how does the outbox pattern solve it?

13. Explain query optimization strategies for complex queries

What the interviewer is really asking: Can you go beyond adding indexes and reason about query execution at a deeper level?

Answer framework:

Start with EXPLAIN ANALYZE: this shows the actual execution plan, not just the estimated one. Look for sequential scans on large tables (often means a missing index), nested loop joins on large tables (consider hash join or merge join), high row estimates vs actual (statistics might be stale, run ANALYZE), and sort operations that spill to disk (increase work_mem or add an index that provides the ordering).

Query rewriting techniques: replace correlated subqueries with JOINs (subqueries execute once per row, JOINs execute once), use CTEs wisely (in PostgreSQL pre-12, CTEs were optimization fences, in 12+ they can be inlined), replace IN (SELECT ...) with EXISTS for large subquery results, and use window functions instead of self-joins for running totals and ranking.
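One example of such a rewrite, with hypothetical users and orders tables. Modern planners can sometimes decorrelate the first form automatically, but not always.

  -- Before: the correlated subquery conceptually runs once per user row.
  SELECT u.id,
         (SELECT count(*) FROM orders o WHERE o.user_id = u.id) AS order_count
  FROM users u;

  -- After: one grouped outer join produces the same result in a single pass
  -- (count(o.id) ignores the NULLs from users with no orders).
  SELECT u.id, count(o.id) AS order_count
  FROM users u
  LEFT JOIN orders o ON o.user_id = u.id
  GROUP BY u.id;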

Join optimization: understand the three join algorithms. Nested loop join: for each row in table A, scan table B. O(nm) but fast when the inner table has an index. Hash join: build a hash table of the smaller table, probe with the larger table. O(n+m) but requires memory for the hash table. Merge join: sort both tables and merge. O(n log n + m log m) but efficient when data is already sorted (from an index).

Partition pruning: if tables are partitioned (by date, tenant, region), ensure queries include the partition key so the database only scans relevant partitions. Without the partition key, the database scans all partitions.

Materialized views: pre-compute expensive queries and store the results. Refresh periodically or on demand. Useful for dashboards, reports, and aggregations. Trade-off: storage space and refresh latency vs query speed.

Denormalization for performance: join-heavy queries on normalized schemas can be slow. Selectively denormalize by adding redundant columns or pre-joined tables. Trade-off: faster reads but complex write logic (must update all copies). Use triggers or application-level logic to maintain consistency.

Discuss caching: for queries that do not need real-time data, cache the results in Redis. Cache invalidation is the challenge: use TTL-based expiration for approximate freshness, or event-based invalidation for stronger freshness guarantees.

Follow-up questions:

  • How do you optimize a query that must scan a billion-row table?
  • How do you handle slow queries in a production environment without affecting users?
  • What is the difference between query optimization in OLTP vs OLAP databases?

14. When and how would you use a graph database?

What the interviewer is really asking: Do you understand when the relational model breaks down and graph databases provide genuine advantages?

Answer framework:

Graph databases shine when the core queries involve traversing relationships of variable depth. In a relational database, each level of traversal requires an additional JOIN, and the query complexity grows exponentially. In a graph database, traversal is a constant-time operation per hop.
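For contrast, a variable-depth traversal in plain SQL needs a recursive CTE; this sketch reuses the follows table from question 3 and caps the traversal at three hops.

  WITH RECURSIVE reachable AS (
      SELECT following_id AS user_id, 1 AS depth
      FROM follows
      WHERE follower_id = 42
    UNION
      SELECT f.following_id, r.depth + 1
      FROM follows f
      JOIN reachable r ON f.follower_id = r.user_id
      WHERE r.depth < 3
  )
  SELECT DISTINCT user_id FROM reachable;

  -- Each additional hop is effectively another self-join; a graph database
  -- replaces this with constant-cost traversal per hop.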

Use cases where graph databases excel: social networks (find friends of friends, degrees of separation), fraud detection (find circular payment patterns, connected suspicious accounts), recommendation engines (users who liked similar items), knowledge graphs (entity relationships with properties), and access control (can user X access resource Y through any chain of permissions?).

The two main graph models: property graphs (nodes and edges have properties, used by Neo4j) and RDF/triple stores (subject-predicate-object triples, used by Amazon Neptune, Stardog). Property graphs are more intuitive for application developers. RDF is more suited for semantic web and knowledge management.

Query languages: Cypher (Neo4j, intuitive, SQL-like), Gremlin (Apache TinkerPop, imperative traversal), and SPARQL (RDF stores, pattern matching). Cypher is the most developer-friendly.

When NOT to use a graph database: simple CRUD operations with well-defined schemas (use relational), write-heavy workloads (graph databases are typically optimized for reads), tabular or document data without complex relationships, or when you need strong ACID transactions across the entire graph (though Neo4j supports ACID).

Performance characteristics: graph traversal is O(k^d) where k is the average degree and d is the depth. For shallow traversals (2-3 hops), graph databases are dramatically faster than relational databases. For deep traversals (6+ hops), even graph databases can be slow.

Hybrid approaches: use a relational database as the primary data store and sync a subset of the data (entities and relationships) to a graph database for graph-specific queries. Use event-driven architecture to keep them in sync.

Follow-up questions:

  • How would you model a social network graph for efficient "mutual friends" queries?
  • How do graph databases handle horizontal scaling?
  • When would you use a graph database vs a recursive CTE in PostgreSQL?

15. Design a database archival and data lifecycle management strategy

What the interviewer is really asking: Can you think about the long-term operational aspects of database management, including cost, compliance, and performance?

Answer framework:

Data lifecycle management addresses the fact that not all data is equally valuable or equally accessed over time. Recent data is hot (frequently accessed, needs fast storage). Older data is warm (occasionally accessed) or cold (rarely accessed, kept for compliance).

Tiered storage: keep hot data in the primary database (fast SSD, fully indexed). Move warm data to a read-optimized store (columnar format, compressed, cheaper storage). Move cold data to archive storage (S3, Glacier, tape). Define retention policies per data type: transactional data might be hot for 90 days, warm for 1 year, cold for 7 years.

Archival strategies: partition tables by date and detach old partitions (PostgreSQL detach partition). This is fast (metadata operation) and does not disrupt the active table. Export detached partitions to compressed storage (Parquet format on S3). Maintain a metadata catalog so queries can find archived data if needed.
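A PostgreSQL sketch of that flow, continuing the partitioned metrics table from question 11; the partition name and file path are illustrative, and COPY to a server-side file requires appropriate privileges.

  -- Detach the old partition: a fast metadata change, the data stays intact.
  ALTER TABLE metrics DETACH PARTITION metrics_2025_q4;

  -- Export it for archival (a follow-up job would convert to Parquet and push to S3).
  COPY metrics_2025_q4 TO '/archive/metrics_2025_q4.csv' WITH (FORMAT csv);

  -- After the export is verified, reclaim the space.
  DROP TABLE metrics_2025_q4;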

For compliance: GDPR requires data deletion on request. If data is archived, you must be able to find and delete specific records in archives. Use a deletion log: when a deletion request comes in, check if the user's data has been archived, mark it for deletion in the archive, and process during the next archive maintenance window.

For database design: plan for archival from the start. Use partition-friendly schemas (include a date column in every table that generates high-volume data). Avoid foreign keys that reference archived tables (use logical references instead).

Cost optimization: hot storage (SSD) costs roughly 10x more per GB than warm storage (HDD) and 100x more than cold storage (S3 Glacier). For a database with 10TB of data where only 1TB is actively used, proper tiering can reduce storage costs by 80% or more.

Discuss the operational aspect: automate the lifecycle management. Schedule archival jobs during low-traffic periods. Monitor that archival is keeping pace with data growth. Alert if the primary database grows beyond expected thresholds.

Follow-up questions:

  • How do you handle queries that span hot and archived data?
  • How do you test that archived data can be successfully restored?
  • How would you handle GDPR deletion requests for data that has been archived to immutable storage?

Common Mistakes in Database Design Interviews

  1. Choosing a database based on hype. Picking MongoDB because it is popular without understanding if a document model fits the data. Always start with requirements and access patterns.

  2. Over-normalizing. Pure third normal form leads to joins on every query. Controlled denormalization for performance-critical paths is a sign of pragmatic engineering.

  3. Ignoring write patterns. Focusing only on read optimization while the system is actually write-heavy. Consider the write amplification of your schema and index choices.

  4. Not planning for growth. A schema that works at 1M rows can fail catastrophically at 1B rows. Discuss how your design scales and when you would need to make changes.

  5. Forgetting operational concerns. Not discussing backups, monitoring, migration strategies, and disaster recovery. These matter as much as the schema itself.

How to Prepare for Database Design Interviews

Practice designing schemas for common systems: e-commerce, social media, messaging, and analytics platforms. For each, think about the data model, access patterns, indexes, and scaling strategy.

Study the internals of at least one database deeply: how the storage engine works, how the query planner makes decisions, and how replication works. PostgreSQL's documentation is exceptional for this.

Review our database design concepts, SQL vs NoSQL comparison, and the system design interview guide. For hands-on practice, explore our learning paths. For staff-level preparation, review the senior to staff engineer transition guide.
