REST API Interview Questions for Senior Engineers (2026)
Top REST API interview questions with detailed answer frameworks covering HTTP methods, status codes, versioning, security, pagination, and real-world API design patterns used at FAANG companies.
Why REST API Design Matters in Senior Engineering Interviews
REST API design interviews assess whether a senior engineer can create interfaces that are intuitive, scalable, secure, and maintainable over years of evolution. APIs are the contracts between services, between teams, and between companies and their customers. A poorly designed API creates compounding technical debt as every consumer builds against its quirks, making changes increasingly expensive and risky.
At companies like Google and Amazon, API design is treated as a first-class engineering discipline. AWS's success is built on well-designed APIs that have remained stable for over a decade while growing in capability. Senior engineers are expected to design APIs that other teams can consume without constant support, that handle edge cases gracefully, and that can evolve without breaking existing clients.
Interviewers evaluate your understanding of how REST APIs work at a deep level: not just GET and POST, but idempotency guarantees, cache semantics, content negotiation, HATEOAS, and the subtle distinctions between REST and RPC-style APIs. They want to see that you can make pragmatic decisions, knowing when to follow REST orthodoxy and when to deviate for practical reasons.
This guide covers 15 commonly asked REST API questions with structured answer frameworks that demonstrate the depth expected at the senior level. For related topics, explore REST vs GraphQL and REST vs gRPC comparisons. For a complete preparation strategy, see our system design interview guide and explore learning paths tailored to senior engineers.
1. What makes an API truly RESTful versus just HTTP-based?
What the interviewer is really asking: Do you understand the architectural constraints that define REST, and can you articulate why most APIs claiming to be RESTful are actually RPC-over-HTTP?
Answer framework:
REST (Representational State Transfer) is an architectural style introduced by Roy Fielding in his 2000 dissertation. It is defined by six architectural constraints, not by the use of HTTP or JSON. Many APIs labeled RESTful satisfy only one or two of those constraints.
The six constraints are: Client-Server separation (independent evolution of client and server), Statelessness (each request contains all information needed to process it, no server-side session state), Cacheability (responses explicitly declare whether they can be cached), Uniform Interface (the most important and most misunderstood constraint), Layered System (client cannot tell whether it communicates directly with the server or through intermediaries), and Code on Demand (optional, server can extend client functionality by sending executable code).
The Uniform Interface constraint has four sub-constraints: resource identification through URIs (every resource has a unique identifier), resource manipulation through representations (clients work with representations like JSON, not the resource itself), self-descriptive messages (each message contains enough information to describe how to process it, including content type and cache directives), and Hypermedia as the Engine of Application State (HATEOAS, responses contain links that tell the client what actions are available next).
Most production APIs stop at resource identification and manipulation. They use nouns in URLs and HTTP methods correctly, but lack self-descriptive messages (relying on out-of-band documentation) and completely ignore HATEOAS. This makes them HTTP-based APIs rather than truly RESTful APIs. The distinction matters because HATEOAS enables independent client-server evolution. Without it, clients are tightly coupled to URL structures documented externally.
Pragmatically, most teams adopt REST Level 2 on the Richardson Maturity Model (resources plus HTTP verbs) and add selective HATEOAS for navigational links. Full Level 3 HATEOAS compliance is rare outside of hypermedia-native domains. Understanding this spectrum shows maturity: you know the ideal, you know the practical, and you can articulate why you chose a specific level.
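To make selective HATEOAS concrete, here is a minimal sketch of a Level 3-style response rendered as a Python dict. The resource fields, link names, and URLs are illustrative rather than taken from any specific API; the point is that available actions are advertised in the representation instead of out-of-band documentation.

```python
# A hypothetical order representation with hypermedia links.
# The client discovers what it can do next from _links instead of
# hard-coding URL patterns from external documentation.
order_representation = {
    "id": "ord_123",
    "status": "placed",
    "total": 99.99,
    "_links": {
        "self":  {"href": "/orders/ord_123"},
        "items": {"href": "/orders/ord_123/items"},
        # Advertised only while the order is still cancellable; a shipped
        # order would simply omit this link.
        "cancel": {"href": "/orders/ord_123/cancellation", "method": "POST"},
    },
}
```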
For deeper understanding of how these principles compare to alternative approaches, see REST vs GraphQL and REST vs gRPC.
Follow-up questions:
- When would you choose to deviate from RESTful principles and why?
- How does HATEOAS work in practice and when is it worth the implementation cost?
- What is the Richardson Maturity Model and where does your typical API design fall?
2. How do you design resource URLs for a complex domain?
What the interviewer is really asking: Can you model a domain as resources with clear hierarchies, handle relationships between resources, and create intuitive URL structures that scale?
Answer framework:
Resource URL design starts with identifying the nouns in your domain. For an e-commerce system: products, orders, customers, reviews, inventory. Each becomes a resource collection: /products, /orders, /customers.
For hierarchical relationships, use nesting judiciously. /customers/123/orders makes sense because orders belong to a customer. But limit nesting to one or two levels. /customers/123/orders/456/items/789/reviews is too deep. Instead, promote deeply nested resources to top-level: /order-items/789/reviews. The rule: nest when the child cannot exist without the parent and clients commonly access children through the parent.
For non-hierarchical relationships, use query parameters or links. A product has reviews from many customers: /products/456/reviews (acceptable nesting, reviews belong to a product). But an order involves multiple products: don't nest orders under products. Instead, /orders/456 returns a representation that links to its products.
Resource naming conventions: use plural nouns (/products not /product), lowercase with hyphens for multi-word (/order-items not /orderItems), and no verbs in URLs. Actions that don't fit CRUD naturally (like sending an email or cancelling an order) can use sub-resource actions: POST /orders/456/cancellation rather than POST /cancel-order. This models the action as creating a resource (a cancellation record).
For filtering, sorting, and field selection, use query parameters: /products?category=electronics&sort=-price&fields=id,name,price. The resource URL identifies what, query parameters modify how.
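As a rough illustration of how these conventions look in a handler, here is a sketch using FastAPI. The endpoint paths, parameter names, and defaults are illustrative choices, not a prescribed implementation.

```python
from typing import Optional
from fastapi import FastAPI, Query

app = FastAPI()

# Collection resource: plural noun; filtering, sorting, and field selection
# arrive as query parameters that modify "how", not "what".
@app.get("/products")
def list_products(
    category: Optional[str] = None,            # ?category=electronics
    sort: str = Query("-created_at"),          # ?sort=-price (leading '-' means descending)
    fields: Optional[str] = None,              # ?fields=id,name,price
    limit: int = Query(20, le=100),
):
    requested_fields = fields.split(",") if fields else None
    # ... query the data store with these filters (omitted in this sketch) ...
    return {"data": [], "applied": {"category": category, "sort": sort, "fields": requested_fields}}

# A non-CRUD action modeled as creating a sub-resource (a cancellation record).
@app.post("/orders/{order_id}/cancellation", status_code=201)
def cancel_order(order_id: str):
    return {"order_id": order_id, "status": "cancellation_requested"}
```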
Handle API versioning in the URL (/v1/products) or via headers (Accept: application/vnd.api.v1+json). URL versioning is more discoverable and easier to route at the infrastructure level. Header versioning is more RESTful but creates operational complexity.
Always design for the consumer's mental model. If your customers think in terms of "my orders," provide /me/orders as a convenience alias for /customers/{authenticated_user_id}/orders.
Follow-up questions:
- How would you handle a resource that belongs to multiple parents?
- When do you use query parameters versus path parameters?
- How do you design URLs for actions that do not map to CRUD operations?
3. Explain idempotency in REST APIs and why it matters
What the interviewer is really asking: Do you understand which HTTP methods are idempotent, why this matters for reliability, and how to implement idempotency for non-idempotent operations?
Answer framework:
An operation is idempotent if performing it multiple times produces the same result as performing it once. HTTP defines GET, PUT, DELETE, and HEAD as idempotent. POST and PATCH are not inherently idempotent.
Why idempotency matters in distributed systems: networks are unreliable. A client sends a request, the server processes it, but the response is lost due to a network timeout. The client does not know if the operation succeeded. With an idempotent operation, the client can safely retry without fear of duplicate side effects. Without idempotency, a retry might charge a credit card twice or create duplicate orders.
GET is naturally idempotent as reading data does not change state. PUT is idempotent because setting a resource to a specific state repeatedly yields the same state. DELETE is idempotent because deleting something that is already deleted has no additional effect (though you might return 404 on subsequent calls, the server state is unchanged).
POST is the problem. Creating a resource with POST is not idempotent because each call creates a new resource. To make POST idempotent, implement an idempotency key pattern. The client generates a unique key (UUID) and sends it in a header (Idempotency-Key: abc-123). The server: (1) checks if this key was seen before, (2) if yes, returns the stored response without re-executing, (3) if no, executes the operation, stores the response against the key, and returns it. Stripe uses this pattern for payment APIs.
Implementation details matter. Store idempotency keys in a fast store like Redis with a TTL (24-48 hours). The stored value includes the response status code, headers, and body. Handle race conditions: if two identical requests arrive simultaneously, use a lock on the idempotency key so only one executes while the other waits for the result.
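Here is a minimal sketch of that flow, assuming a Redis client; the key prefix, TTL, and in-flight handling are illustrative choices rather than a canonical implementation.

```python
import json
import redis

r = redis.Redis()                  # assumed local Redis instance
TTL_SECONDS = 24 * 60 * 60         # retain idempotency keys for 24 hours

def handle_with_idempotency(idempotency_key: str, execute_operation):
    storage_key = f"idem:{idempotency_key}"

    # Atomically claim the key: SET NX succeeds only for the first request,
    # so concurrent duplicates cannot both execute the operation.
    claimed = r.set(storage_key, json.dumps({"state": "in_progress"}),
                    nx=True, ex=TTL_SECONDS)
    if not claimed:
        stored = json.loads(r.get(storage_key))
        if stored.get("state") == "in_progress":
            # The original request is still executing; tell the client to retry shortly.
            return 409, {"error": {"code": "REQUEST_IN_FLIGHT"}}
        return stored["status"], stored["body"]     # replay the stored response

    status, body = execute_operation()              # run the real operation exactly once
    r.set(storage_key,
          json.dumps({"state": "done", "status": status, "body": body}),
          ex=TTL_SECONDS)
    return status, body
```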
PATCH idempotency depends on the patch format. A JSON Merge Patch (set field X to value Y) is idempotent. A JSON Patch with relative operations (increment counter by 1) is not. Choose your patch format based on whether idempotency is required.
For the system design perspective on how idempotency enables reliable distributed systems, see how REST APIs work and the URL shortener system design which discusses idempotency in URL creation.
Follow-up questions:
- How long should you retain idempotency keys?
- What happens if the server crashes after processing but before storing the idempotency response?
- How does idempotency interact with concurrent requests using the same key?
4. How do you handle API versioning and breaking changes?
What the interviewer is really asking: Can you balance API evolution with backward compatibility, and do you have a strategy for managing multiple API versions in production?
Answer framework:
API versioning strategies fall into four categories, each with distinct trade-offs.
URL path versioning (/v1/users, /v2/users): most common in practice (used by Stripe, Twitter). Advantages: explicit, easy to route at the load balancer level, simple for consumers to understand. Disadvantages: not RESTful (the URI should identify the resource, not the representation version), leads to code duplication if versions share most logic.
Query parameter versioning (/users?version=2): similar to URL versioning but keeps the resource path clean. Less common, harder to cache because caches often ignore query parameters by default.
Header versioning (Accept: application/vnd.myapi.v2+json): most RESTful approach, uses content negotiation. Advantages: resource URLs remain stable, versioning is a representation concern. Disadvantages: harder to test in a browser, less discoverable, complicates routing.
No explicit versioning with additive changes only: the ideal but requires discipline. Never remove fields, never change field types, never change field semantics. Add new fields freely. Use feature flags or capability discovery for new behaviors. This works well for internal APIs with controlled consumers but poorly for public APIs with unknown consumers.
For managing breaking changes, define what constitutes a breaking change: removing a field, renaming a field, changing a field's type, changing the meaning of a value, removing an endpoint, changing authentication requirements, or changing error response formats. Adding optional fields, adding new endpoints, and adding new optional parameters are non-breaking.
Sunset strategy: when deprecating a version, set a Sunset HTTP header with the deprecation date. Provide migration guides. Monitor traffic per version and reach out to consumers still on deprecated versions. Maintain deprecated versions for a minimum period (6-12 months for public APIs).
Internally, implement versioning with an adapter layer. The core business logic is unversioned. Version-specific adapters transform between the versioned API contract and the internal representation. This avoids duplicating business logic across versions.
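A small sketch of that adapter idea, with purely illustrative types and field names: the internal model stays unversioned, and each version's contract is produced by a thin transform at the edge.

```python
from dataclasses import dataclass

@dataclass
class User:                      # internal, unversioned representation
    id: str
    full_name: str
    email: str
    created_at: str

def to_v1(user: User) -> dict:
    # v1 exposed a single "name" field.
    return {"id": user.id, "name": user.full_name, "email": user.email}

def to_v2(user: User) -> dict:
    # v2 renamed "name" to "full_name" and added "created_at"; only the
    # adapter changed, not the business logic or the internal model.
    return {"id": user.id, "full_name": user.full_name,
            "email": user.email, "created_at": user.created_at}
```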
Follow-up questions:
- How do you handle versioning in a microservices architecture where services consume each other's APIs?
- What is your strategy for database schema changes that affect multiple API versions?
- How do you communicate deprecation effectively to API consumers?
5. Design pagination for a REST API that handles millions of records
What the interviewer is really asking: Do you understand the trade-offs between pagination strategies, and can you handle edge cases like concurrent modifications during pagination?
Answer framework:
Three primary pagination strategies exist, each suited to different use cases.
Offset-based pagination (/products?offset=40&limit=20): the simplest approach. The database skips offset rows and returns limit rows. Advantages: supports jumping to any page, simple to implement. Disadvantages: performance degrades with large offsets (the database must scan and discard offset rows), inconsistent results when data is inserted or deleted during pagination (items can be skipped or duplicated).
Cursor-based pagination (/products?after=eyJpZCI6MTAwfQ&limit=20): the cursor is an opaque token encoding the position (usually the last seen ID or sort key). The server decodes the cursor and uses it in a WHERE clause (WHERE id > 100 LIMIT 20). Advantages: consistent O(1) performance regardless of position, stable results during concurrent modifications. Disadvantages: cannot jump to arbitrary pages, only forward/backward navigation.
Keyset pagination (a specific implementation of cursor-based): use the natural sort key as the cursor. For /products?sort=price, the cursor is the last seen price and ID: WHERE (price > 29.99 OR (price = 29.99 AND id > 500)) LIMIT 20. Handles non-unique sort keys by including the primary key as a tiebreaker.
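A sketch of the cursor mechanics under those assumptions: an opaque base64 cursor wrapping the last seen (price, id) pair, and a keyset predicate using PostgreSQL-style row comparison. Table and column names are illustrative.

```python
import base64
import json
from typing import Optional

def encode_cursor(last_price: float, last_id: int) -> str:
    # Opaque to clients: just the last seen position, base64-encoded.
    return base64.urlsafe_b64encode(
        json.dumps({"price": last_price, "id": last_id}).encode()
    ).decode()

def decode_cursor(cursor: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def next_page_query(cursor: Optional[str], limit: int = 20):
    # Keyset predicate: strictly after the last seen (price, id) position,
    # with id as the tiebreaker for non-unique prices. Always index-friendly.
    if cursor is None:
        return ("SELECT id, name, price FROM products "
                "ORDER BY price, id LIMIT %s", (limit,))
    pos = decode_cursor(cursor)
    return ("SELECT id, name, price FROM products "
            "WHERE (price, id) > (%s, %s) "
            "ORDER BY price, id LIMIT %s", (pos["price"], pos["id"], limit))
```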
Response format should include pagination metadata: total count (optional, expensive for large datasets), links to next/previous pages (HATEOAS), and whether more results exist. Example response structure: {data: [...], pagination: {next_cursor: "...", has_more: true}}. Including total_count is optional because COUNT() on millions of rows is expensive. Provide it only if the UI needs it, and consider caching approximate counts.
For system design at scale, cursor-based pagination is strongly preferred. Offset pagination with OFFSET 1000000 forces the database to scan a million rows. Cursor pagination with WHERE id > last_id always uses the index efficiently.
Handle edge cases: what if the item pointed to by a cursor is deleted? The query still works because it is finding items after that position, not the item itself. What about sorting by non-unique fields? Always include a unique tiebreaker (primary key) in the sort to ensure deterministic ordering.
For real-time feeds where new items are constantly added, cursor-based pagination naturally handles this: paginating forward shows newer items, and the cursor ensures no items are missed.
Follow-up questions:
- How would you implement pagination for a search endpoint with relevance scoring?
- How do you handle pagination when the sort order can change between requests?
- What is your strategy for paginated exports of very large datasets?
6. How do you design error responses for a REST API?
What the interviewer is really asking: Can you create a consistent error format that helps consumers debug issues, uses HTTP status codes correctly, and handles different error scenarios appropriately?
Answer framework:
Start with correct HTTP status code usage. The status code is the first thing clients check and it drives retry logic. 4xx means client error (do not retry without changes). 5xx means server error (generally retryable with backoff, provided the request is idempotent). Specific codes matter: 400 (malformed request), 401 (not authenticated), 403 (authenticated but not authorized), 404 (resource not found), 409 (conflict, like duplicate creation), 422 (syntactically valid but semantically invalid), 429 (rate limited).
Design a consistent error response body. Every error response should follow the same structure regardless of the error type. A proven format: {error: {code: "VALIDATION_ERROR", message: "Human-readable description", details: [...], request_id: "req_abc123", documentation_url: "https://docs.example.com/errors/VALIDATION_ERROR"}}.
The code field is a machine-readable string that consumers can switch on. The message is human-readable and may change without notice. The details array provides field-level information for validation errors: [{field: "email", code: "INVALID_FORMAT", message: "Must be a valid email address"}].
Always include a request_id in error responses. This allows consumers to reference specific failures when contacting support, and allows the API team to trace the request through their systems.
For validation errors (422), return all validation failures at once, not just the first one. Consumers should be able to fix all issues and retry once rather than playing whack-a-mole with sequential errors.
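A small sketch of building such a response; the helper name, error codes, and request_id format are illustrative.

```python
import uuid

def validation_error_response(field_errors: list) -> tuple:
    # field_errors: e.g. [{"field": "email", "code": "INVALID_FORMAT", "message": "..."}]
    # Return every failure at once so the consumer can fix everything and retry once.
    return 422, {
        "error": {
            "code": "VALIDATION_ERROR",
            "message": "One or more fields failed validation",
            "details": field_errors,
            "request_id": f"req_{uuid.uuid4().hex[:12]}",   # correlation key for server-side logs
        }
    }
```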
Never leak implementation details in error messages. Do not expose stack traces, database column names, or internal service names in production. These are security risks. Log detailed error information server-side with the request_id as the correlation key.
For rate limiting (429), include a Retry-After header telling the client when to retry. Include rate limit headers on all responses: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset. This lets clients self-throttle before hitting the limit.
Design error codes to be actionable. PAYMENT_METHOD_EXPIRED tells the consumer exactly what to fix. INTERNAL_ERROR tells them nothing. Invest in specific, actionable error codes for common failure modes.
Follow-up questions:
- How do you handle errors in batch/bulk operations where some items succeed and some fail?
- How do you version error response formats without breaking error-handling code in consumers?
- How do you handle errors that originate in downstream services?
7. How do you secure a REST API?
What the interviewer is really asking: Do you understand authentication, authorization, rate limiting, input validation, and the full spectrum of API security concerns at a senior level?
Answer framework:
API security is defense in depth with multiple layers.
Authentication: verify who is making the request. For machine-to-machine: API keys (simple but no built-in expiration, rotate regularly) or OAuth 2.0 client credentials flow (more complex but supports scoping and automatic expiration). For user-facing: OAuth 2.0 authorization code flow with PKCE for SPAs and mobile apps. Never accept plain passwords in API requests. Use short-lived access tokens (15-60 minutes) with refresh tokens for renewal.
JWT implementation details: include minimal claims (user ID, roles, expiration), sign with RS256 (asymmetric) for distributed systems so services can verify without the signing key, validate issuer/audience/expiration on every request, and handle token revocation through short expiry plus a revocation list for compromised tokens.
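A sketch of that validation step using PyJWT; the issuer, audience, and key loading are assumptions (in practice the public key usually comes from a JWKS endpoint rather than a local file).

```python
import jwt                      # PyJWT
from jwt import InvalidTokenError

# Verification key only; the private signing key never leaves the token issuer.
PUBLIC_KEY = open("jwt_public_key.pem").read()

def verify_access_token(token: str) -> dict:
    try:
        # Verifies the RS256 signature plus exp, iss, and aud in one call.
        return jwt.decode(
            token,
            PUBLIC_KEY,
            algorithms=["RS256"],               # never accept "none" or unexpected algorithms
            issuer="https://auth.example.com",  # assumed issuer
            audience="orders-api",              # assumed audience
        )
    except InvalidTokenError as exc:
        raise PermissionError(f"invalid token: {exc}")
```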
Authorization: verify what the authenticated entity is allowed to do. Implement at multiple levels: endpoint-level (can this role access this endpoint at all?), resource-level (can this user access this specific resource?), and field-level (can this user see salary fields?). Use RBAC (Role-Based Access Control) for coarse-grained permissions and ABAC (Attribute-Based Access Control) for fine-grained rules.
Rate limiting: protect against abuse and ensure fair usage. Implement multiple tiers: per-IP rate limits for unauthenticated traffic, per-API-key limits for authenticated traffic, per-endpoint limits for expensive operations. Use the token bucket algorithm for smooth rate limiting. Return 429 with Retry-After header when limits are exceeded.
Input validation: never trust client input. Validate content types (reject unexpected Content-Type headers), validate request body against a schema, sanitize string inputs against injection attacks, enforce maximum request body sizes, and validate path/query parameters. Use allowlists over denylists.
Transport security: require HTTPS everywhere (HSTS header), use TLS 1.3 minimum, pin certificates for mobile clients, and disable insecure cipher suites.
Additional defenses: CORS configuration (restrict allowed origins to an explicit allowlist), content security headers, request signing for webhook callbacks (HMAC-SHA256), and audit logging of all authentication and authorization events for forensic analysis.
Follow-up questions:
- How do you handle API key rotation without downtime?
- How do you implement fine-grained authorization for multi-tenant APIs?
- How do you secure webhooks that your API sends to consumer endpoints?
8. How do you design a REST API for long-running operations?
What the interviewer is really asking: Can you handle asynchronous workflows over a synchronous protocol, provide status tracking, and handle timeouts and failures gracefully?
Answer framework:
Long-running operations (report generation, video transcoding, data migrations) cannot return results within a typical HTTP timeout (30-60 seconds). The standard pattern is asynchronous request-response.
The flow: Client sends POST /reports (to start the operation). Server immediately returns 202 Accepted with a Location header pointing to a status resource: Location: /operations/op_abc123. The response body includes the operation ID and initial status.
The client polls the status endpoint: GET /operations/op_abc123. The response includes: status (PENDING, IN_PROGRESS, COMPLETED, FAILED), progress percentage if available, estimated time remaining, and when completed, a link to the result resource.
Status resource example: {id: "op_abc123", status: "IN_PROGRESS", progress: 65, estimated_completion: "2026-04-20T14:30:00Z", created_at: "...", result_url: null}. When completed: {id: "op_abc123", status: "COMPLETED", progress: 100, result_url: "/reports/rpt_xyz789"}.
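A rough FastAPI-style sketch of the flow; the in-memory store, resource names, and background runner are illustrative stand-ins for a real job queue and database.

```python
import uuid
from fastapi import BackgroundTasks, FastAPI, Response

app = FastAPI()
operations = {}          # illustrative in-memory store; use durable storage in practice

def generate_report(op_id: str, params: dict):
    # ... long-running work happens here ...
    operations[op_id].update(status="COMPLETED", progress=100,
                             result_url=f"/reports/rpt_{op_id[:8]}")

@app.post("/reports", status_code=202)
def start_report(params: dict, background: BackgroundTasks, response: Response):
    op_id = f"op_{uuid.uuid4().hex[:10]}"
    operations[op_id] = {"id": op_id, "status": "PENDING", "progress": 0, "result_url": None}
    background.add_task(generate_report, op_id, params)
    response.headers["Location"] = f"/operations/{op_id}"    # where the client polls
    return operations[op_id]

@app.get("/operations/{op_id}")
def operation_status(op_id: str):
    return operations[op_id]
```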
Alternatives to polling: webhooks (server calls a client-provided URL when the operation completes) and Server-Sent Events (SSE, server pushes status updates over a persistent HTTP connection). Webhooks are preferred for server-to-server scenarios. SSE is good for browser clients that need real-time progress updates.
Design for failure: long-running operations can fail mid-way. Provide clear failure information: {status: "FAILED", error: {code: "INSUFFICIENT_DATA", message: "Source dataset contains no records for the specified date range"}}. Allow retry: POST /operations/op_abc123/retry to restart a failed operation.
Idempotency is critical here. If a client submits the same report request twice, should it create two operations or return the existing one? Use idempotency keys (as discussed in question 3) to deduplicate. For naturally idempotent operations (generate report for date range X), use the parameters as a natural deduplication key.
For cancellation, support DELETE /operations/op_abc123 or POST /operations/op_abc123/cancellation. The operation transitions to CANCELLING then CANCELLED. Cancellation may not be immediate if the operation is mid-processing.
This pattern appears extensively in cloud APIs. AWS uses it for CloudFormation stacks, GCP for long-running operations, and Azure for async resource provisioning. For a deeper look at how this fits into larger architectures, see our system design interview guide.
Follow-up questions:
- How do you handle operations that take hours or days?
- How do you implement progress reporting for operations with unpredictable duration?
- What is your strategy for cleaning up resources from abandoned operations?
9. How do you design API rate limiting?
What the interviewer is really asking: Can you implement rate limiting that is fair, transparent, and handles edge cases like distributed systems and burst traffic?
Answer framework:
Rate limiting protects API availability, ensures fair resource allocation among consumers, and prevents abuse. Design decisions include the algorithm, the limit dimensions, and the consumer communication strategy.
Algorithms: Token bucket (most common): a bucket holds N tokens, refills at a fixed rate. Each request consumes one token. When empty, requests are rejected. This naturally allows bursts (up to bucket capacity) while enforcing an average rate.
Sliding window: track request count in a time window that slides with each request. More precise than fixed windows, avoids the boundary burst problem (where a consumer uses their limit at the end of one window and start of the next, getting 2x in a short period).
Leaky bucket: requests enter a queue processed at a fixed rate. Smooths traffic but adds latency.
Limit dimensions: per API key (primary), per endpoint (expensive endpoints get lower limits), per IP (for unauthenticated traffic), per user (for multi-user API keys). Apply the most restrictive applicable limit. Example: global limit of 1000 req/min per key, plus endpoint-specific limit of 10 req/min for /reports (expensive operation).
Communication via headers on every response: X-RateLimit-Limit (the limit ceiling), X-RateLimit-Remaining (requests remaining in current window), X-RateLimit-Reset (Unix timestamp when the window resets). On 429 responses, include Retry-After header with seconds to wait.
Distributed rate limiting: in a multi-server deployment, rate limits must be shared across instances. Use Redis with atomic increment operations. The INCR command with EXPIRE provides a simple fixed-window counter. For token bucket, store the bucket state (token count and last refill timestamp) in Redis. Use Lua scripts for atomic check-and-decrement to prevent race conditions.
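A sketch of a Redis-backed token bucket; the Lua script keeps refill-and-consume atomic across API servers, and the bucket sizing, key naming, and expiry are illustrative.

```python
import time
import redis

r = redis.Redis()

# Refill tokens based on elapsed time, then try to consume one, atomically.
TOKEN_BUCKET_LUA = """
local key      = KEYS[1]
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2])   -- tokens added per second
local now      = tonumber(ARGV[3])

local state  = redis.call('HMGET', key, 'tokens', 'ts')
local tokens = tonumber(state[1]) or capacity
local ts     = tonumber(state[2]) or now

tokens = math.min(capacity, tokens + (now - ts) * rate)
local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end
redis.call('HSET', key, 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', key, 3600)
return allowed
"""

token_bucket = r.register_script(TOKEN_BUCKET_LUA)

def allow_request(api_key: str, capacity: int = 100, refill_per_sec: float = 10.0) -> bool:
    allowed = token_bucket(keys=[f"rl:{api_key}"],
                           args=[capacity, refill_per_sec, time.time()])
    return allowed == 1
```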
Graceful degradation: instead of hard rejection at the limit, consider a degraded response mode. Allow requests slightly over the limit to proceed but with reduced functionality (cached results instead of fresh queries, lower quality responses). This improves user experience during brief spikes.
Consider tiered limits based on pricing plans (relevant for commercial APIs). Free tier: 100 req/hour. Professional: 10,000 req/hour. Enterprise: custom limits. This connects to pricing strategy for API products.
For truly fair rate limiting in a multi-tenant system, use weighted fair queuing: allocate capacity proportional to each consumer's tier, ensuring that one heavy user cannot starve others.
Follow-up questions:
- How do you handle rate limiting for webhook deliveries where you are the client?
- How do you implement dynamic rate limiting that adjusts based on system load?
- How do you design rate limits for APIs with both synchronous and async operations?
10. How do you handle partial updates in a REST API?
What the interviewer is really asking: Do you understand the nuances of PATCH versus PUT, the different patch formats, and the challenges of partial updates in complex domains?
Answer framework:
PUT replaces the entire resource. The client sends a complete representation, and the server replaces whatever is stored. PUT is idempotent by definition. PATCH applies a partial modification. The client sends only the changes, and the server applies them to the existing resource.
The challenge with PATCH is defining what "the changes" means. Two standard formats exist.
JSON Merge Patch (RFC 7396): send a JSON object with only the fields to change. {name: "New Name", email: "new@example.com"} updates name and email, leaving other fields unchanged. To remove a field, set it to null: {middle_name: null}. Limitation: because null means remove, you cannot explicitly set a nullable field to null with a merge patch. This format works well for simple flat objects.
JSON Patch (RFC 6902): send an array of operations. [{op: "replace", path: "/name", value: "New Name"}, {op: "remove", path: "/middle_name"}, {op: "add", path: "/addresses/2", value: {...}}]. More powerful: supports add, remove, replace, move, copy, and test (conditional). The test operation enables optimistic concurrency: [{op: "test", path: "/version", value: 5}, {op: "replace", path: "/status", value: "active"}]. If the version is not 5, the entire patch fails.
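To make the merge semantics concrete, here is a short sketch of the RFC 7396 merge rules, showing recursive merging of objects and null-means-remove; the example resource is hypothetical.

```python
def apply_merge_patch(target, patch):
    # RFC 7396: objects merge recursively; null removes a member; anything
    # that is not an object (including arrays) replaces the target wholesale.
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)          # null means "remove this field"
        else:
            result[key] = apply_merge_patch(result.get(key), value)
    return result

# Only "city" changes, "street" is preserved, and "middle_name" is removed.
user = {"name": "Ada", "middle_name": "M",
        "address": {"street": "1 Main St", "city": "Oslo"}}
patch = {"middle_name": None, "address": {"city": "Boston"}}
assert apply_merge_patch(user, patch) == {
    "name": "Ada",
    "address": {"street": "1 Main St", "city": "Boston"},
}
```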
Choosing between them: JSON Merge Patch for simple APIs with flat resources. JSON Patch for complex resources, arrays, or when you need atomic conditional updates.
Concurrency control for partial updates: two users editing the same resource simultaneously can overwrite each other's changes. Solutions: ETag/If-Match headers (server returns ETag with GET, client sends If-Match with PATCH, server rejects if resource changed), version fields (include a version number in the resource, reject patches with stale versions), or field-level timestamps (track when each field was last modified, reject changes to fields modified since the client last fetched).
Validation for partial updates: validate the resulting state, not just the patch. A patch might be syntactically valid but create an invalid resource state (removing a required field, setting conflicting values). Apply business rules to the complete post-patch state.
For nested resources, consider whether PATCH should apply recursively. Patching {address: {city: "Boston"}} should not remove address.street and address.zip. JSON Merge Patch specifies recursive merge for objects but replacement for other types including arrays. Document this behavior clearly.
Follow-up questions:
- How do you handle PATCH on a collection (bulk partial updates)?
- How do you validate that a partial update does not violate business rules?
- How do you design PATCH for resources with complex nested structures?
11. How do you design API authentication for a microservices architecture?
What the interviewer is really asking: Can you handle the complexity of authentication and authorization across service boundaries, token propagation, and the performance implications of validation at every hop?
Answer framework:
In a microservices architecture, the challenge is authenticating the external user at the edge and propagating identity securely to downstream services without creating a performance bottleneck.
The API Gateway pattern: a single entry point handles authentication for all external requests. The gateway validates the token (JWT signature verification, expiration check, revocation check), extracts identity claims, and forwards the request to internal services with identity information in headers or a validated JWT. This centralizes authentication logic and keeps internal services simpler.
Token propagation approaches: (1) Forward the original JWT to downstream services. Each service re-validates (fast for JWTs since only signature verification is needed). Advantage: each service independently verifies authenticity. Disadvantage: if the token is compromised, every service trusts it until expiration. (2) Gateway replaces the external token with an internal token (token exchange). The internal token contains enriched claims (full permissions, organizational context). Advantage: external and internal token concerns are separated. Disadvantage: added latency for token exchange.
Service-to-service authentication: internal services must also authenticate each other to prevent unauthorized internal access. Use mutual TLS (mTLS) where each service has a certificate, and services verify each other's certificates. Alternatively, use service-level JWTs issued by an internal authority with short expiration (5 minutes).
Authorization in microservices: centralized versus decentralized. Centralized: a Policy Decision Point (PDP) service evaluates authorization rules. Services call the PDP with (subject, action, resource) and receive allow/deny. Advantage: consistent policy enforcement. Disadvantage: PDP becomes a latency-adding bottleneck. Decentralized: each service enforces its own authorization rules based on the claims in the propagated token. Advantage: no extra network call. Disadvantage: policy changes require redeploying services.
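As a sketch of the decentralized option, an internal service can enforce resource-level rules purely from the claims propagated by the gateway. The claim names, order fields, and ownership rule below are hypothetical.

```python
def authorize_order_access(claims: dict, order: dict) -> bool:
    # claims arrive already verified upstream, e.g.
    # {"sub": "user_123", "roles": ["support"], "org": "org_9"}
    if "admin" in claims.get("roles", []):
        return True                                   # coarse, role-based check
    if order.get("customer_id") == claims.get("sub"):
        return True                                   # resource-level ownership check
    # attribute-based rule: support agents may view orders within their own org
    return "support" in claims.get("roles", []) and order.get("org") == claims.get("org")
```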
Performance optimization: cache authorization decisions at the gateway level for read operations. Use short-lived caches (30-60 seconds) for permission checks. Pre-compute permission sets into the JWT claims so services do not need to look up permissions per request.
For the broader system architecture considerations, see our system design interview guide and how REST APIs work for the fundamentals of request authentication flows.
Follow-up questions:
- How do you handle token revocation in a stateless JWT-based system?
- How do you implement cross-service authorization for complex workflows?
- How do you handle authentication for background jobs and event-driven communication?
12. How do you design a REST API for real-time data?
What the interviewer is really asking: Can you reconcile the request-response nature of REST with real-time requirements, and do you know when to extend beyond pure REST?
Answer framework:
REST is fundamentally request-response: the client asks, the server answers. Real-time data (live scores, stock prices, collaborative editing) requires server-to-client push. Several approaches bridge this gap.
Short polling: the client repeatedly requests the resource at intervals (every 2-5 seconds). Simple and REST-compatible, but wasteful (most responses are "no change") and limited in freshness (updates are delayed by the polling interval). Use conditional requests (If-None-Match with ETag) to reduce bandwidth. Appropriate for: low-frequency updates where seconds of delay are acceptable.
Long polling: the client sends a request, the server holds the connection open until data is available (or timeout), then responds. The client immediately sends another request. Near-real-time delivery, REST-compatible, works through all proxies and firewalls. More efficient than short polling but complex to implement correctly (handling timeouts, reconnection, message ordering). Appropriate for: moderate-frequency updates, environments where WebSockets are blocked.
Server-Sent Events (SSE): the server sends a stream of events over a persistent HTTP connection. Standardized format (event: type, data: JSON), automatic reconnection with Last-Event-ID, works through HTTP/2 multiplexing. Unidirectional (server to client only). Appropriate for: one-way notifications, live feeds, progress updates. Fits the REST model better than WebSockets since it uses standard HTTP.
WebSockets: full-duplex communication over a persistent connection. Not REST (it is a different protocol), but necessary for bidirectional real-time communication (chat, collaborative editing, gaming). Appropriate for: high-frequency bidirectional communication.
The pragmatic REST approach for real-time: provide a standard REST endpoint for current state (GET /stocks/AAPL returns the current price) and a separate SSE or WebSocket endpoint for updates (GET /stocks/AAPL/stream). The REST endpoint serves as the source of truth and recovery mechanism, the stream provides real-time updates. Clients subscribe to the stream but can always fall back to polling the REST endpoint if the stream disconnects.
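A minimal sketch of the streaming side using FastAPI's StreamingResponse; the price source, event payload, and update interval are illustrative. The id: line in each event is what lets clients resume with Last-Event-ID after a reconnect.

```python
import asyncio
import json
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def fetch_latest_price(symbol: str) -> float:
    return 123.45                       # placeholder so the sketch is self-contained

async def price_events(symbol: str):
    event_id = 0
    while True:
        event_id += 1
        price = await fetch_latest_price(symbol)
        # SSE wire format: optional "id:" line, "data:" line(s), then a blank line.
        yield f"id: {event_id}\ndata: {json.dumps({'symbol': symbol, 'price': price})}\n\n"
        await asyncio.sleep(1)

@app.get("/stocks/{symbol}/stream")
async def stream_prices(symbol: str):
    return StreamingResponse(price_events(symbol), media_type="text/event-stream")
```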
For comparing REST with other approaches, note that GraphQL has built-in subscription support for real-time data, which is one area where it provides a more cohesive solution than REST.
Follow-up questions:
- How do you handle reconnection and catch up on missed events with SSE?
- How do you authenticate persistent connections like WebSockets?
- What is your strategy for graceful degradation when real-time delivery fails?
13. How do you design bulk and batch operations in a REST API?
What the interviewer is really asking: Can you handle the tension between RESTful resource-per-request semantics and the practical need for efficient bulk operations?
Answer framework:
Bulk operations are essential for performance: creating 1000 resources one-by-one means 1000 HTTP round trips. At intercontinental latencies (100-200ms), that is 100-200 seconds of latency alone. Batch endpoints reduce this to a single request.
Approach 1 - Batch resource endpoint: POST /products/batch with a body containing an array of resources to create. Returns an array of results (one per input). This is the simplest approach and works well for homogeneous operations (all creates, all updates). Response includes per-item status: [{id: "p1", status: "created"}, {id: "p2", status: "failed", error: {...}}].
Approach 2 - Batch operations endpoint: POST /batch with a body containing multiple heterogeneous operations: [{method: "POST", path: "/products", body: {...}}, {method: "PATCH", path: "/products/123", body: {...}}, {method: "DELETE", path: "/products/456"}]. The server processes each as if it were an individual request. Google's API uses this pattern. Response mirrors the request array with individual responses.
Semantics decisions: should batch operations be atomic (all succeed or all fail) or partial (each item processed independently)? Atomic is simpler for the client but limits throughput and fails entirely if one item has a validation error. Partial success is more practical for large batches but requires the client to handle mixed results. Most APIs choose partial success for creates/updates and atomic for operations that must be consistent (transferring money between multiple accounts).
Response format for partial success: return 207 Multi-Status (from WebDAV, increasingly used for REST batches) or 200 with a body that details per-item results. Each item result includes its status code, any created resource representation, or error details.
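A sketch of a partial-success batch create with per-item results; the size cap, validation rule, and response shape are illustrative choices, and each item is assumed to be a JSON object.

```python
from fastapi import FastAPI, Response

app = FastAPI()
MAX_BATCH_SIZE = 100

@app.post("/products/batch")
def create_products_batch(items: list, response: Response):
    if len(items) > MAX_BATCH_SIZE:
        response.status_code = 413
        return {"error": {"code": "BATCH_TOO_LARGE",
                          "message": f"Maximum {MAX_BATCH_SIZE} items per batch"}}

    results = []
    for index, item in enumerate(items):
        if not item.get("name"):                       # illustrative per-item validation
            results.append({"index": index, "status": 422,
                            "error": {"code": "VALIDATION_ERROR",
                                      "message": "name is required"}})
            continue
        created = {"id": f"p_{index}", **item}         # stand-in for the real create
        results.append({"index": index, "status": 201, "data": created})

    response.status_code = 207                         # Multi-Status: per-item outcomes in the body
    return {"results": results}
```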
Performance considerations: set a maximum batch size (100-1000 items) to prevent timeout and memory issues. For very large batches (millions of items), use the async pattern from question 8: accept the batch, return 202, process in background, report results via status endpoint.
Idempotency for batches: support per-item idempotency keys so that retrying a partially failed batch does not re-process already-succeeded items.
For how this relates to system design, batch APIs are critical in data pipelines, ETL processes, and administrative operations that manage resources at scale.
Follow-up questions:
- How do you handle ordering dependencies within a batch (item B references item A)?
- How do you design batch endpoints that interact with rate limiting?
- What is your strategy for batch operations that trigger expensive side effects (emails, webhooks)?
14. How do you design content negotiation and API response formats?
What the interviewer is really asking: Do you understand how HTTP content negotiation works, when to support multiple formats, and how to design flexible response structures?
Answer framework:
Content negotiation allows a single resource to be represented in multiple formats. The client requests a preferred format via the Accept header, and the server responds in that format with the Content-Type header confirming.
Standard mechanism: Client sends Accept: application/json (prefers JSON). Server responds with Content-Type: application/json. If the server cannot produce the requested format, it returns 406 Not Acceptable. The Accept header supports quality factors for preference ordering: Accept: application/json;q=1.0, application/xml;q=0.8, text/csv;q=0.5.
When to support multiple formats: JSON is the universal default for web APIs. Add XML support only if enterprise consumers require it (SOAP legacy systems). CSV is valuable for data export endpoints consumed by spreadsheets and data tools. Protocol Buffers (via gRPC) for internal high-performance communication.
Response envelope design: the debate between naked responses ({id: 1, name: "Product"}) and enveloped responses ({data: {id: 1, name: "Product"}, meta: {request_id: "..."}}) is important. Envelopes provide a consistent place for metadata (pagination, rate limit info, warnings) but add nesting. Compromise: use envelopes for collections (where pagination metadata is essential) and naked responses for single resources (where metadata can go in headers).
Field filtering (sparse fieldsets): let clients request only fields they need. /products/123?fields=id,name,price returns only those fields. This reduces payload size and can improve server performance if expensive fields (like computed aggregations) are omitted. Implement with a field selection layer that filters the response before serialization.
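A sketch of such a field-selection layer; the resource shape and parameter parsing are illustrative.

```python
from typing import Optional

def select_fields(resource: dict, fields_param: Optional[str]) -> dict:
    # ?fields=id,name,price -> only those keys; no parameter -> full representation.
    if not fields_param:
        return resource
    requested = {f.strip() for f in fields_param.split(",")}
    return {key: value for key, value in resource.items() if key in requested}

product = {"id": "p_123", "name": "Keyboard", "price": 49.0,
           "description": "Long marketing copy...", "popularity_rank": 7}
assert select_fields(product, "id,name,price") == {
    "id": "p_123", "name": "Keyboard", "price": 49.0}
```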
Field expansion (embedding related resources): /orders/456?expand=customer,items returns the order with customer and item details inline rather than just IDs. This reduces the number of round trips (solving the N+1 problem for API consumers). Define which fields are expandable and limit expansion depth to prevent performance issues.
Compression: always support gzip/br encoding for responses. The Accept-Encoding/Content-Encoding headers handle this transparently. For large JSON responses, compression typically achieves 80-90 percent size reduction.
For how content negotiation relates to API evolution, custom media types (application/vnd.myapi.product.v2+json) combine versioning with content negotiation, enabling per-resource versioning. This is more granular than path-based versioning.
Follow-up questions:
- How do you handle content negotiation for error responses?
- How do you design response formats for APIs consumed by both mobile and web clients with different bandwidth constraints?
- When would you use a custom media type versus standard application/json?
15. How do you design webhooks as part of a REST API?
What the interviewer is really asking: Can you design a reliable event delivery system, handle the challenges of calling external systems, and ensure webhook consumers receive events exactly once?
Answer framework:
Webhooks invert the communication direction: instead of the consumer polling for changes, the API pushes events to consumer-provided URLs. This is essential for event-driven integrations where timeliness matters and polling is wasteful.
Webhook registration API: consumers register webhooks via your REST API. POST /webhooks with body: {url: "https://consumer.com/webhook", events: ["order.created", "order.updated"], secret: "..."}. The server validates the URL is reachable (send a verification challenge), stores the subscription, and returns the webhook ID.
Event delivery format: standardize on a consistent envelope. {id: "evt_abc123", type: "order.created", created_at: "2026-04-20T10:00:00Z", data: {order_id: "ord_789", total: 99.99, currency: "USD"}}. Include enough data for the consumer to act without calling back to your API (fat events), or include minimal data forcing them to fetch details (thin events). Fat events are preferred for reducing load on your API and improving consumer latency.
Security: sign every webhook payload with HMAC-SHA256 using the webhook secret. Include the signature in a header (X-Webhook-Signature). Consumers verify the signature before processing to ensure the event is authentic and untampered. Include a timestamp in the signed payload to prevent replay attacks. Consumers should reject events with timestamps older than 5 minutes.
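A sketch of signing on the producer side and verifying on the consumer side; the signed-message layout, tolerance window, and hex signature format are illustrative.

```python
import hashlib
import hmac
import time

def sign_webhook(secret: str, payload: bytes, timestamp: int) -> str:
    # Sign the timestamp together with the payload so a captured request
    # cannot simply be replayed later.
    message = f"{timestamp}.".encode() + payload
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()

def verify_webhook(secret: str, payload: bytes, timestamp: int,
                   received_signature: str, tolerance_seconds: int = 300) -> bool:
    if abs(time.time() - timestamp) > tolerance_seconds:
        return False                                   # reject stale or replayed events
    expected = sign_webhook(secret, payload, timestamp)
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(expected, received_signature)
```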
Reliability and retry: delivery will fail (consumer down, network issues, timeouts). Implement exponential backoff retry: 1 minute, 5 minutes, 30 minutes, 2 hours, 8 hours, 24 hours. After exhausting retries, mark the webhook as failing and alert the consumer. Provide a webhook event log endpoint (GET /webhooks/{id}/deliveries) showing delivery attempts with status codes and response times.
Idempotency: consumers must handle duplicate deliveries (retries may deliver the same event multiple times). Include a unique event ID that consumers use for deduplication. Document this requirement clearly.
Ordering: events may arrive out of order, especially with retries. Include a sequence number or timestamp that consumers can use to detect and handle out-of-order delivery. For critical ordering, include the previous event ID to form a chain.
For scaling webhook delivery, see the URL shortener system design which discusses event fan-out patterns, and explore learning paths for deeper system design preparation.
Follow-up questions:
- How do you handle a consumer that is consistently slow to respond?
- How do you implement webhook replay for consumers that experienced downtime?
- How do you test webhooks during consumer development?
Common Mistakes in REST API Interviews
- Using HTTP methods incorrectly. POST for everything is the most common mistake. Understand the semantic meaning: GET for retrieval (safe, idempotent, cacheable), POST for creation (unsafe, non-idempotent), PUT for full replacement (idempotent), PATCH for partial update, DELETE for removal (idempotent). Using the wrong method breaks caching, proxy behavior, and client expectations.
- Ignoring backward compatibility. Senior engineers must think about API evolution from day one. Design for additive changes: use optional fields, default values, and extension points. Never remove or rename fields in existing versions without a deprecation strategy.
- Over-fetching and under-fetching without solutions. Returning the entire resource graph on every request wastes bandwidth and processing. Not providing expansion or inclusion mechanisms forces consumers into N+1 request patterns. Design pagination, field selection, and resource expansion from the start.
- Inconsistent naming and conventions. Mixing camelCase and snake_case, using plural for some resources and singular for others, inconsistent error formats across endpoints. Define and enforce API design guidelines before building. Consistency is more important than any individual convention choice.
- Not considering the consumer experience. Designing solely from the server's domain model perspective rather than the consumer's use cases. APIs should model consumer workflows, not internal database schemas. Talk to consumers, understand their use cases, and design resources around how they will be used.
How to Prepare for REST API Design Interviews
Start by deeply understanding how REST APIs work beyond the surface level. Read Roy Fielding's dissertation chapter on REST (chapter 5). Understand why each constraint exists and what problem it solves.
Study APIs you use daily: Stripe, GitHub, Twilio. These are widely regarded as well-designed. Notice their conventions: how they handle errors, pagination, versioning, and authentication. Stripe's API documentation is particularly educational for error handling and idempotency.
Practice designing APIs for real-world scenarios. Pick a domain (hotel booking, food delivery, social media) and design the full API: resources, endpoints, request/response formats, error handling, authentication, and pagination. Time yourself to 30 minutes per exercise.
Understand the alternatives: study REST vs GraphQL and REST vs gRPC deeply. Knowing when REST is NOT the right choice demonstrates senior judgment. GraphQL excels when consumers have diverse data needs. gRPC excels for internal service communication requiring high performance.
Study real-world API design guidelines from companies like Google (Google API Design Guide) and Microsoft (Azure REST API Guidelines). These codify decades of API design experience into actionable rules.
For comprehensive preparation across all interview types, explore the learning paths on our platform. Structured preparation through our pricing plans provides access to detailed walkthroughs, mock interview practice, and expert feedback on your API designs.