Security Interview Questions for Senior Engineers (2026)
Top security interview questions with detailed answer frameworks covering authentication, authorization, encryption, network security, application security, and threat modeling for senior engineering interviews.
Why Security Matters in Senior Engineering Interviews
Security has moved from a specialized concern to a core competency expected of every senior engineer. Modern interview loops at top technology companies increasingly include dedicated security rounds or weave security considerations into system design discussions. The reason is straightforward: a single vulnerability can compromise millions of user records, destroy customer trust, and create regulatory liability that threatens the entire business.
For senior and staff-level candidates, interviewers expect more than surface-level awareness of the OWASP Top Ten. They want to see that you can perform threat modeling on a system you are designing, reason about trust boundaries, make informed decisions about cryptographic primitives, and build defense-in-depth architectures that degrade gracefully under attack. You should be able to articulate the trade-offs between security controls and usability, explain why certain design patterns are vulnerable, and propose mitigations that are practical to implement and operate.
The questions in this guide reflect what companies like Google, Stripe, and other security-conscious organizations ask in real interviews. Each question includes the underlying intent, a structured answer framework, and follow-up questions that probe deeper understanding. For a broader interview preparation strategy, see our system design interview guide and explore learning paths tailored to senior engineers preparing for security-focused roles.
1. How would you design a secure authentication system for a large-scale web application?
What the interviewer is really asking: Can you reason about the full lifecycle of identity management, from credential storage to session management to account recovery, while balancing security with usability?
Answer framework:
Start by clarifying the scope: is this consumer-facing (millions of users, self-service registration) or enterprise (SSO integration, compliance requirements)? The answer changes significantly based on the threat model. For a consumer application handling sensitive data, begin with credential storage.
For password storage, never store plaintext or simple hashes. Use bcrypt, scrypt, or Argon2id with a work factor that takes at least 250ms on current hardware. Argon2id is the current recommendation because it is resistant to both GPU-based and side-channel attacks. Each password gets a unique salt generated from a cryptographically secure random number generator. Store the salt alongside the hash since it does not need to be secret.
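As a concrete illustration of per-password salts and constant-time verification, here is a minimal sketch using Python's standard-library scrypt (the text recommends Argon2id, which requires a third-party package such as argon2-cffi; scrypt is used here only to keep the example self-contained):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Hash a password with a unique random salt; tune n upward until
    hashing takes the target ~250ms on your hardware."""
    salt = os.urandom(16)  # per-password salt from a CSPRNG
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison
```

The salt is stored alongside the hash, as the text notes, since its purpose is uniqueness rather than secrecy.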
For session management after authentication, generate session tokens using a CSPRNG with at least 128 bits of entropy. Store sessions server-side in Redis with a TTL. Set cookies with Secure, HttpOnly, SameSite=Strict flags. Implement absolute session timeouts (for example, 24 hours) and idle timeouts (30 minutes of inactivity). For a detailed walkthrough of building this, see our authentication system design.
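A sketch of the token-generation and cookie-flag portion of this design, using Python's secrets module (the cookie attribute values mirror the text; the 1800-second Max-Age is an example idle timeout, not a prescription):

```python
import secrets

def create_session_token() -> str:
    # 32 random bytes = 256 bits of entropy from a CSPRNG, URL-safe encoded
    return secrets.token_urlsafe(32)

def session_cookie_header(token: str, max_age: int = 1800) -> str:
    # Flags from the text: Secure, HttpOnly, SameSite=Strict; idle timeout via Max-Age
    return (f"session={token}; Secure; HttpOnly; "
            f"SameSite=Strict; Max-Age={max_age}; Path=/")
```

The server-side Redis entry (keyed by the token, with a TTL) is what actually enforces the absolute timeout; the cookie attributes control browser behavior.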
Add multi-factor authentication as a required option for sensitive operations. Support TOTP (time-based one-time passwords via apps like Google Authenticator), WebAuthn/FIDO2 hardware keys (the strongest option, phishing-resistant), and SMS as a fallback only (vulnerable to SIM swapping). Store MFA recovery codes hashed, not in plaintext.
Account recovery is often the weakest link. Avoid security questions since the answers are frequently guessable or discoverable via social media. Use email-based recovery with time-limited, single-use tokens. Rate-limit recovery attempts aggressively and notify the user on all registered channels when a recovery is initiated.
Implement brute-force protection using progressive delays after failed attempts, CAPTCHA after three failures, and account lockout after ten failures with a 30-minute cooldown. Use a distributed rate limiter to prevent credential stuffing across different IP addresses.
Follow-up questions:
- How would you migrate from a legacy password hashing algorithm to Argon2id without forcing all users to reset passwords?
- What is the security trade-off between stateless JWTs and server-side sessions?
- How would you detect and respond to a credential stuffing attack in real-time?
2. Explain how OAuth 2.0 and OpenID Connect work and their security implications.
What the interviewer is really asking: Do you understand delegated authorization and federated identity deeply enough to implement them securely and identify common misconfigurations?
Answer framework:
OAuth 2.0 is an authorization framework that allows a third-party application to obtain limited access to a user's resources without exposing credentials. OpenID Connect (OIDC) is an identity layer built on top of OAuth 2.0 that adds authentication. Understanding the distinction is critical: OAuth tells you what a user can access, OIDC tells you who the user is.
Walk through the Authorization Code flow with PKCE, which is the recommended flow for all clients as of current best practices. The client generates a code_verifier (random string) and code_challenge (SHA-256 hash of the verifier). The user is redirected to the authorization server with the code_challenge. After authentication and consent, the authorization server redirects back with an authorization code. The client exchanges the code plus the code_verifier for tokens. The authorization server verifies that SHA-256(code_verifier) matches the stored code_challenge before issuing tokens. PKCE prevents authorization code interception attacks even for public clients.
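The verifier/challenge derivation described above can be sketched directly from RFC 7636's S256 method (base64url without padding over a SHA-256 digest):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Client side: generate a PKCE code_verifier and its S256 code_challenge."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

def server_check(verifier: str, stored_challenge: str) -> bool:
    """Authorization server side: recompute and compare before issuing tokens."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    recomputed = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return secrets.compare_digest(recomputed, stored_challenge)
```

Because only the challenge travels in the front-channel redirect, an attacker who intercepts the authorization code still cannot redeem it without the verifier.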
For a comprehensive walkthrough, see how OAuth works and our OAuth/SSO platform design.
Discuss token types: the access token (short-lived, typically 15-60 minutes, used to access APIs), the refresh token (long-lived, stored securely, used to obtain new access tokens), and the ID token (JWT containing user identity claims, consumed only by the client, never sent to resource servers). A common mistake is sending the ID token to APIs as a bearer token instead of the access token.
For security considerations, always validate the state parameter to prevent CSRF. Validate the redirect_uri exactly (no wildcard matching). Use short-lived authorization codes (under 60 seconds). Store refresh tokens encrypted at rest and bind them to the client. Implement token rotation for refresh tokens: each use of a refresh token invalidates it and issues a new one, so a stolen token can only be used once.
Discuss the security of JWTs: always verify the signature algorithm (reject "none" algorithm), validate the issuer and audience claims, check expiration, and use asymmetric signing (RS256 or ES256) so that resource servers can validate tokens without sharing a secret key.
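The validation checks listed above can be sketched in standard-library Python. This uses HS256 purely because HMAC is available without a crypto dependency; in production you would use a vetted library (for example PyJWT) and the asymmetric algorithms the text recommends. The issuer and audience names here are hypothetical:

```python
import base64, hashlib, hmac, json, time

def b64url_decode(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def sign_jwt(claims: dict, key: bytes) -> str:
    enc = lambda b: base64.urlsafe_b64encode(b).rstrip(b"=").decode()
    h = enc(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    p = enc(json.dumps(claims).encode())
    sig = hmac.new(key, f"{h}.{p}".encode(), hashlib.sha256).digest()
    return f"{h}.{p}.{enc(sig)}"

def verify_jwt(token: str, key: bytes, issuer: str, audience: str) -> dict:
    """Enforce the checks from the text: pinned algorithm, signature, exp, iss, aud."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":  # reject "none" and algorithm-confusion attacks
        raise ValueError("unexpected algorithm")
    expected = hmac.new(key, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    if claims.get("iss") != issuer or claims.get("aud") != audience:
        raise ValueError("wrong issuer or audience")
    return claims
```

Note that the algorithm is pinned by the verifier, never trusted from the token header alone.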
Follow-up questions:
- How would you implement token revocation for JWTs given that they are stateless?
- What are the security risks of the implicit flow, and why is it deprecated?
- How do you handle cross-tenant token confusion in a multi-tenant OIDC provider?
3. How does TLS work, and what are common vulnerabilities in its deployment?
What the interviewer is really asking: Do you understand the cryptographic handshake, certificate validation chain, and practical deployment pitfalls that lead to real-world breaches?
Answer framework:
TLS provides three guarantees: confidentiality (encryption), integrity (tamper detection), and authentication (server identity verification). Walk through the TLS 1.3 handshake since it is the current standard and simpler than 1.2. For the full protocol details, see how TLS handshake works and how HTTPS works.
In TLS 1.3, the handshake completes in one round trip (1-RTT). The client sends a ClientHello with supported cipher suites and key shares (Diffie-Hellman public values). The server selects a cipher suite and sends its own key share in the ServerHello; its certificate follows in the Certificate message, which in TLS 1.3 is already encrypted under the handshake keys. Both sides compute the shared secret from the DH exchange. The client verifies the server certificate against its trust store, checking the entire chain up to a trusted root CA, validating that the certificate is not expired, and checking revocation status via OCSP stapling or CRL.
TLS 1.3 removed vulnerable features: no RSA key exchange (only ephemeral Diffie-Hellman, providing forward secrecy), no CBC mode ciphers (only AEAD ciphers like AES-GCM and ChaCha20-Poly1305), no compression (preventing the CRIME attack), and no renegotiation.
Common deployment vulnerabilities include allowing TLS 1.0 or 1.1 (vulnerable to BEAST, POODLE), using weak cipher suites, misconfigured certificate chains (missing intermediate certificates cause validation failures in some clients), failing to implement HSTS (HTTP Strict Transport Security) allowing SSL stripping attacks, certificate pinning mistakes (pinning leaf certificates that rotate frequently causes outages), and not implementing OCSP stapling (allowing revocation check failures to be silently ignored).
Discuss certificate transparency (CT) logs: all publicly trusted CAs must submit certificates to CT logs. Monitor CT logs for your domains to detect unauthorized certificate issuance. This is how companies detect CA compromises and misissuance.
For operational best practices: automate certificate rotation using ACME/Let's Encrypt, test configurations with tools like SSL Labs, implement CAA DNS records to restrict which CAs can issue certificates for your domain.
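On the client side, several of these practices translate into a few lines of configuration. A sketch using Python's ssl module, assuming you control both endpoints (requiring 1.3-only may be too strict against servers you do not operate):

```python
import ssl

# Hardened client context: TLS 1.3 only, full chain and hostname verification.
ctx = ssl.create_default_context()            # loads the system trust store
ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # refuse TLS 1.0/1.1/1.2 entirely

# These are the secure defaults; asserting them documents the intent.
assert ctx.check_hostname                      # verify the server name matches
assert ctx.verify_mode == ssl.CERT_REQUIRED    # require a valid certificate chain

# ctx.wrap_socket(sock, server_hostname="example.com") would then enforce all of this.
```

Misconfiguration usually means loosening exactly these settings (disabling hostname checks or accepting any certificate), which is why they are worth asserting in integration tests.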
Follow-up questions:
- What is the difference between forward secrecy and regular key exchange, and why does it matter?
- How would you debug a TLS connection failure between two internal microservices?
- What are the trade-offs of mutual TLS (mTLS) for service-to-service authentication?
4. How would you implement authorization in a microservices architecture?
What the interviewer is really asking: Can you design a scalable, consistent authorization system that works across service boundaries without creating a bottleneck or security gaps?
Answer framework:
Authorization in a monolith is relatively straightforward since a single process has access to all context needed to make access decisions. In a microservices architecture, the challenge multiplies: each service needs to make authorization decisions but may lack the full context, and consistency across services is critical.
Discuss the fundamental approaches. First, gateway-level authorization: an API gateway or service mesh enforces coarse-grained access control (does this user have access to this API endpoint?). This centralizes policy but cannot handle fine-grained decisions (can this user edit this specific document?). Second, service-level authorization: each service makes its own authorization decisions using local data. This handles fine-grained access but risks inconsistency.
The recommended approach is layered authorization. The API gateway handles authentication (validating the JWT, extracting user claims) and coarse-grained authorization (role-based endpoint access). Individual services handle fine-grained authorization using a policy engine. Consider a dedicated authorization service using a policy language like OPA (Open Policy Agent) or Cedar. Policies are defined declaratively, and each service queries the policy engine with the subject, action, and resource.
For the data model, discuss RBAC (Role-Based Access Control) as the baseline: users are assigned roles, roles have permissions. Explain its limitations (role explosion when you need fine-grained control). Then discuss ABAC (Attribute-Based Access Control): decisions based on attributes of the user, resource, action, and environment. Example: "allow if user.department == resource.department AND user.clearance_level >= resource.sensitivity_level." ABAC is more flexible but harder to audit.
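The quoted ABAC rule is simple enough to state directly as a predicate over attribute dictionaries; this toy evaluator is illustrative only, and the attribute names come from the rule in the text:

```python
def abac_allow(user: dict, resource: dict) -> bool:
    # "allow if user.department == resource.department
    #  AND user.clearance_level >= resource.sensitivity_level"
    return (user["department"] == resource["department"]
            and user["clearance_level"] >= resource["sensitivity_level"])

# A matching department with sufficient clearance is allowed:
assert abac_allow({"department": "finance", "clearance_level": 3},
                  {"department": "finance", "sensitivity_level": 2})
```

Real ABAC engines (OPA, Cedar) express such rules declaratively, which is what makes them auditable outside application code; the auditing difficulty the text mentions comes from the combinatorial space of attribute values, not from the rule syntax.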
For relationship-based access control (ReBAC), inspired by Google Zanzibar: model permissions as relationships in a graph. "User X is an editor of Document Y" is a relationship. Check permissions by traversing the graph. This naturally handles hierarchical permissions (editor of a folder implies editor of all documents in that folder).
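The folder-inheritance example can be sketched as a breadth-first walk over Zanzibar-style relation tuples. The tuple data here is invented for illustration:

```python
from collections import deque

# Toy ReBAC store: (subject, relation, object) tuples.
tuples = {
    ("alice", "editor", "folder:reports"),
    ("folder:reports", "parent", "doc:q3"),  # doc:q3 lives in folder:reports
}

def check(subject: str, relation: str, obj: str) -> bool:
    """Allow if the relation holds directly on the object, or on any
    ancestor folder (editor of a folder implies editor of its contents)."""
    queue, seen = deque([obj]), set()
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if (subject, relation, node) in tuples:
            return True
        # Walk up the hierarchy: enqueue anything that is a parent of this node.
        for (s, r, o) in tuples:
            if r == "parent" and o == node:
                queue.append(s)
    return False
```

Production systems (Zanzibar, SpiceDB, OpenFGA) add consistency tokens and heavy caching on top of exactly this traversal.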
Address performance: authorization checks happen on every request. Cache authorization decisions with short TTLs (seconds to minutes depending on sensitivity). Pre-compute and materialize permission sets for hot paths. Use bloom filters for quick negative lookups.
Follow-up questions:
- How do you handle authorization for cross-service data aggregation where no single service has complete context?
- How would you audit and test authorization policies to ensure there are no privilege escalation paths?
- How do you handle the propagation delay when a user's permissions are revoked?
5. What is your approach to threat modeling a new system?
What the interviewer is really asking: Do you have a structured methodology for identifying and prioritizing security risks before writing code, and can you communicate threats to non-security stakeholders?
Answer framework:
Threat modeling is the practice of systematically identifying what can go wrong, who might attack the system, and what the impact would be. It should happen during the design phase, not after deployment. The most widely used framework is STRIDE, developed at Microsoft.
STRIDE categorizes threats into six types. Spoofing: pretending to be someone or something else (forged tokens, IP spoofing). Tampering: modifying data or code in transit or at rest (SQL injection, man-in-the-middle). Repudiation: denying an action was performed (insufficient logging). Information Disclosure: exposing data to unauthorized parties (misconfigured S3 buckets, verbose error messages). Denial of Service: making the system unavailable (volumetric DDoS, resource exhaustion). Elevation of Privilege: gaining unauthorized capabilities (exploiting a vulnerability to gain admin access).
The process begins by creating a data flow diagram showing trust boundaries, processes, data stores, and external entities. For each element crossing a trust boundary, apply STRIDE to identify potential threats. For each identified threat, assess risk using DREAD (Damage potential, Reproducibility, Exploitability, Affected users, Discoverability) or a simpler likelihood-times-impact matrix.
Prioritize mitigations based on risk score and implementation cost. Not every threat needs mitigation since some risks are acceptable given the cost of mitigation. Document accepted risks with justification.
Walk through a concrete example: threat modeling a payment API. Trust boundaries exist between the client and API gateway, between internal services, and between services and the database. At the client-to-gateway boundary, threats include credential theft (Spoofing), request tampering (Tampering), and DDoS. Mitigations include TLS for transport security, request signing for integrity, rate limiting for availability, and strong authentication.
Discuss integrating threat modeling into the development lifecycle: require threat models for new features above a certain risk threshold, review threat models in design reviews, and update them when the system architecture changes.
Follow-up questions:
- How do you decide which threats to accept versus mitigate?
- How would you threat model a system that processes PII under GDPR?
- How do you keep threat models current as the system evolves?
6. How would you design a secrets management system for a cloud-native application?
What the interviewer is really asking: Do you understand the lifecycle of secrets, from generation to rotation to revocation, and can you build a system that prevents secret sprawl and exposure?
Answer framework:
Secrets include API keys, database credentials, TLS certificates, encryption keys, and service account tokens. The fundamental principle is that secrets should never be stored in code, configuration files, or version control. This seems obvious but remains one of the most common vulnerability classes.
For the architecture, use a dedicated secrets manager (HashiCorp Vault, AWS Secrets Manager, or GCP Secret Manager). The secrets manager provides a centralized, audited, encrypted store. Applications authenticate to the secrets manager at runtime to retrieve secrets. This eliminates static secrets in configuration files.
Discuss authentication to the secrets manager itself. This is the bootstrapping problem: how does a service prove its identity to get secrets? In cloud environments, use platform identity (AWS IAM roles, GCP service accounts) so the cloud provider attests the service identity. In Kubernetes, use service account tokens with projected volumes. For VMs, use the instance metadata service with role-based access.
For encryption architecture, discuss envelope encryption: the secrets manager holds a master key (which may itself be stored in an HSM). Each secret is encrypted with a unique data encryption key (DEK). The DEK is encrypted with the master key and stored alongside the ciphertext. To decrypt, first unwrap the DEK using the master key, then decrypt the secret with the DEK. This limits the blast radius of a single key compromise.
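The wrap/unwrap structure can be sketched as follows. The stream cipher here is a SHA-256 keystream XOR used only so the example runs on the standard library; a real system would use an AEAD cipher such as AES-GCM for both layers:

```python
import hashlib
import os

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher from a SHA-256 keystream -- structure illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

def encrypt_secret(master_key: bytes, secret: bytes) -> dict:
    dek = os.urandom(32)                       # unique data encryption key per secret
    nonce, dek_nonce = os.urandom(16), os.urandom(16)
    return {
        "ciphertext": keystream_xor(dek, nonce, secret),          # secret under the DEK
        "nonce": nonce,
        "wrapped_dek": keystream_xor(master_key, dek_nonce, dek),  # DEK under the master key
        "dek_nonce": dek_nonce,
    }

def decrypt_secret(master_key: bytes, blob: dict) -> bytes:
    dek = keystream_xor(master_key, blob["dek_nonce"], blob["wrapped_dek"])  # unwrap first
    return keystream_xor(dek, blob["nonce"], blob["ciphertext"])
```

Only the wrapped DEK ever touches the master key, so rotating the master key means re-wrapping small DEKs rather than re-encrypting every stored secret.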
Secret rotation is critical and often neglected. Design for automated rotation: database credentials rotated every 24 hours, API keys rotated every 30 days, TLS certificates rotated before expiration. The rotation process must be zero-downtime: generate a new credential, update the secrets manager, applications pick up the new credential, verify the new credential works, then revoke the old one.
For operational security, implement comprehensive audit logging of all secret access, set up alerts for unusual access patterns (a service accessing secrets it has never accessed before), and use dynamic secrets where possible (Vault generates a unique, short-lived database credential per request rather than sharing a static credential).
Follow-up questions:
- How do you handle secrets for local development without exposing production credentials?
- What is your strategy for responding to a leaked secret in a public GitHub commit?
- How do you implement secret zero: the initial credential that bootstraps access to the secrets manager?
7. Explain Cross-Site Scripting (XSS) and how to prevent it at the architecture level.
What the interviewer is really asking: Can you go beyond input validation to design a system-level defense against XSS, including CSP, output encoding, and architectural patterns that make XSS structurally impossible?
Answer framework:
XSS occurs when an attacker injects malicious scripts into content that is rendered in another user's browser. There are three types: Stored XSS (malicious script persisted in the database and served to all users), Reflected XSS (malicious script in a URL parameter reflected in the response), and DOM-based XSS (client-side JavaScript processes untrusted data unsafely).
The impact of XSS is severe: session hijacking, credential theft, phishing via trusted domains, cryptocurrency mining, and defacement. XSS effectively gives the attacker full control of the user's session.
For prevention, build defense in depth with multiple layers. The first layer is output encoding: encode all dynamic content based on the context where it appears. HTML context requires HTML entity encoding, JavaScript context requires JavaScript encoding, URL context requires URL encoding, CSS context requires CSS encoding. Use a templating engine that auto-escapes by default (React's JSX, Go's html/template). The most common XSS bugs occur when developers bypass auto-escaping (React's dangerouslySetInnerHTML, Angular's bypassSecurityTrustHtml).
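Context-specific encoding is easiest to see side by side. A minimal sketch using Python's standard library, with a classic attribute-breaking payload as the input:

```python
import html
from urllib.parse import quote

user_input = '<img src=x onerror="alert(1)">'

# HTML context: entity-encode so the payload renders as inert text, not markup.
html_safe = html.escape(user_input)

# URL context: percent-encode before embedding in a query-string value.
url_safe = quote(user_input, safe="")

assert "<img" not in html_safe   # the tag can no longer be parsed as markup
assert " " not in url_safe       # every unsafe character is percent-encoded
```

The point the text makes is that these encoders are not interchangeable: HTML-escaping a value that lands inside a JavaScript string, or vice versa, leaves an injection path open.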
The second layer is Content Security Policy (CSP). CSP is an HTTP header that tells the browser which sources of content are allowed. A strict CSP looks like: Content-Security-Policy: default-src 'self'; script-src 'self' 'nonce-{random}'; style-src 'self'; img-src 'self' data:; frame-ancestors 'none'. The nonce-based approach allows inline scripts only if they contain a server-generated nonce, effectively blocking injected scripts.
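Generating the per-response nonce and the matching header is a few lines; the policy string below mirrors the one in the text:

```python
import secrets

def csp_header() -> tuple[str, str]:
    """Build a fresh per-response nonce and the matching strict CSP value."""
    nonce = secrets.token_urlsafe(16)
    policy = ("default-src 'self'; "
              f"script-src 'self' 'nonce-{nonce}'; "
              "style-src 'self'; img-src 'self' data:; frame-ancestors 'none'")
    return nonce, policy

# The same nonce must be stamped onto every legitimate inline tag:
# <script nonce="...">, which an injected script cannot predict.
```

The nonce must be freshly random per response; a static or guessable nonce defeats the whole mechanism.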
The third layer is input validation and sanitization. For fields that must accept rich text (like a blog editor), use a whitelist-based HTML sanitizer like DOMPurify that strips all tags and attributes except explicitly allowed ones. Never use blacklist-based filtering since attackers will find bypasses.
The fourth layer is architectural isolation. Use the Same-Origin Policy effectively: serve user-generated content from a separate domain (like googleusercontent.com for Google) so that XSS in user content cannot access the main application's cookies. Set authentication cookies with the HttpOnly flag so they are inaccessible to JavaScript.
Follow-up questions:
- How does CSP reporting work, and how would you use it to detect XSS attempts?
- What is the difference between a reflected XSS and a DOM-based XSS from a defense perspective?
- How would you handle XSS prevention in a single-page application that renders content entirely client-side?
8. How would you design a zero-trust network architecture?
What the interviewer is really asking: Do you understand modern network security principles beyond perimeter-based security, and can you implement identity-based access controls for service-to-service and user-to-service communication?
Answer framework:
The traditional perimeter model assumes that everything inside the network is trusted. Zero trust assumes that no entity, whether inside or outside the network, is inherently trusted. Every request must be authenticated, authorized, and encrypted regardless of its network origin. This model was formalized by Google's BeyondCorp and NIST SP 800-207.
The core principles are: verify explicitly (authenticate and authorize every request based on all available data points), use least-privilege access (grant the minimum permissions needed for the specific task), and assume breach (design as if the attacker is already inside the network).
For implementation, start with strong identity for every entity. Every user gets a cryptographic identity via certificates or platform tokens. Every service gets a cryptographic identity via mTLS certificates or SPIFFE IDs. Every device gets a posture assessment (OS version, security patches, disk encryption status).
For service-to-service communication, implement mutual TLS (mTLS) using a service mesh like Istio or Linkerd. Every service has a certificate issued by an internal CA. When Service A calls Service B, both present certificates. The service mesh sidecar handles TLS termination transparently. Authorization policies define which services can communicate: "Service A can call Service B's /api/orders endpoint with GET method." This is a dramatic improvement over IP-based firewall rules.
For user-to-service access, replace VPNs with an identity-aware proxy. Users authenticate using their corporate identity (SSO via OIDC), and the proxy makes access decisions based on user identity, device posture, request context, and resource sensitivity. No network-level access is granted; the proxy mediates every connection.
Discuss micro-segmentation: instead of flat networks where any compromised host can reach any other host, segment the network so that services can only communicate with their declared dependencies. Implement this using network policies in Kubernetes or security groups in cloud environments.
For monitoring in a zero-trust architecture, log every access decision (allowed and denied), analyze patterns for anomalies (a service making unusual API calls), and implement continuous verification (re-evaluate access decisions periodically, not just at initial connection).
Follow-up questions:
- How do you handle the performance overhead of mTLS on every service-to-service call?
- How would you implement zero trust for a hybrid cloud environment with on-premises and cloud workloads?
- What is the migration path from a perimeter-based network to zero trust without disrupting existing services?
9. How would you prevent and detect SQL injection in a large codebase?
What the interviewer is really asking: Beyond parameterized queries, can you design a comprehensive strategy that includes code analysis, runtime protection, monitoring, and architectural patterns that make injection structurally impossible?
Answer framework:
SQL injection remains in the OWASP Top Ten because despite being well-understood, it persists in real-world applications. The root cause is mixing code (SQL) with data (user input) in the same channel.
The primary defense is parameterized queries (prepared statements). In parameterized queries, the SQL structure is defined separately from the data values. The database engine parses and compiles the query structure first, then binds the data values. Since the data is never interpreted as SQL, injection is impossible regardless of the input content. Every database driver in every major programming language supports parameterized queries. There is no legitimate performance or usability reason to use string concatenation for SQL.
For dynamic queries where parameterized queries are insufficient (dynamic column names, dynamic ORDER BY, dynamic table names), use a whitelist approach: map user input to a fixed set of allowed values. For example, if the user selects a sort column, map their input to a predefined enum rather than interpolating it into the query.
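Both defenses fit in a short sqlite3 sketch; the table and data are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'a@example.com')")

# Parameterized query: the input is bound as data, never parsed as SQL.
user_input = "alice' OR '1'='1"
rows = conn.execute("SELECT email FROM users WHERE name = ?", (user_input,)).fetchall()
assert rows == []  # the injection payload matches nothing; it is just a weird name

# Dynamic ORDER BY: map the input through a whitelist, never interpolate it raw.
ALLOWED_SORT = {"name": "name", "email": "email"}
requested = "name"                             # hypothetical user-chosen sort key
sort_col = ALLOWED_SORT.get(requested, "name") # unknown keys fall back to a default
rows = conn.execute(f"SELECT name FROM users ORDER BY {sort_col}").fetchall()
```

The f-string in the last query is safe only because sort_col can never be anything outside the whitelist's values.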
For an ORM strategy, ORMs like Hibernate, ActiveRecord, and SQLAlchemy use parameterized queries by default. However, they typically offer an escape hatch for raw SQL. Establish a coding standard that raw SQL must go through a security review. Use static analysis tools (Semgrep, CodeQL) to detect raw SQL construction patterns in CI/CD.
For defense in depth, apply the principle of least privilege to database accounts: the application's database user should have only the permissions it needs (SELECT, INSERT, UPDATE on specific tables), not DBA privileges. This limits the damage if injection occurs. Use stored procedures with parameterized inputs for complex operations. Implement a Web Application Firewall (WAF) as a secondary layer that detects common injection patterns, but never rely on it as the primary defense.
For detection, log all database errors with full context (sanitized). SQL injection attempts often trigger syntax errors. Set up alerts for unusual patterns: a sudden spike in database errors from a specific endpoint, queries against system tables (information_schema), or UNION-based patterns.
Follow-up questions:
- How do you handle SQL injection risks in a system that uses dynamically generated queries for a search feature?
- What is the difference between first-order and second-order SQL injection?
- How would you remediate a SQL injection vulnerability discovered in production?
10. How would you design an API rate limiting and abuse prevention system?
What the interviewer is really asking: Can you build a system that protects APIs from abuse while maintaining good user experience for legitimate traffic, handling distributed rate limiting across multiple server instances?
Answer framework:
Rate limiting serves multiple purposes: protecting backend services from overload, ensuring fair usage across clients, preventing abuse (scraping, brute-force attacks), and managing costs for metered APIs.
Discuss the common algorithms. Token bucket: a bucket holds tokens up to a maximum capacity. Each request consumes a token. Tokens are added at a fixed rate. When the bucket is empty, requests are rejected. This allows bursts up to the bucket size while enforcing an average rate. Sliding window log: store the timestamp of each request. When a new request arrives, count requests in the past window (for example, 60 seconds). If the count exceeds the limit, reject. This is precise but memory-intensive. Sliding window counter: divide time into fixed windows, maintain a counter per window. Approximate the current rate using a weighted combination of the current and previous window. Less precise but very memory-efficient.
For distributed rate limiting (multiple API server instances), the state must be shared. Use Redis with atomic INCR and EXPIRE operations. The token bucket can be implemented with a Redis key storing the token count and last refill timestamp. Use Lua scripts in Redis for atomic check-and-update operations.
Design the rate limiting hierarchy: global limits (protect the overall system), per-API-key limits (fair usage per customer), per-endpoint limits (protect expensive endpoints), per-IP limits (prevent anonymous abuse). Apply limits in order from most specific to least specific.
For the response, return HTTP 429 (Too Many Requests) with a Retry-After header indicating when the client can retry. Include rate limit headers in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset. This helps well-behaved clients self-regulate.
For abuse prevention beyond rate limiting, implement progressive challenges: normal traffic passes through, suspicious traffic gets CAPTCHA, and clearly abusive traffic gets blocked. Use behavioral analysis: a legitimate user browses pages with variable timing, while a bot makes requests at perfectly regular intervals. Consider device fingerprinting for anonymous abuse detection.
For DDoS protection, local rate limiting is insufficient since the traffic volume overwhelms your infrastructure before rate limits apply. Use a CDN or DDoS mitigation service (Cloudflare, AWS Shield) that absorbs traffic at the edge. Design your system to degrade gracefully under load: serve cached responses, disable expensive features, and prioritize authenticated users.
Follow-up questions:
- How do you rate limit WebSocket connections differently from HTTP requests?
- How would you implement rate limiting that accounts for the computational cost of different API endpoints?
- How do you handle rate limiting for microservices where a single user request fans out to many internal calls?
11. How would you implement end-to-end encryption for a messaging application?
What the interviewer is really asking: Do you understand the Signal Protocol or similar E2EE schemes, key management challenges, and the trade-offs between security and features like search and backup?
Answer framework:
End-to-end encryption (E2EE) ensures that only the communicating parties can read messages. The server transports ciphertext but cannot decrypt it. This protects against server compromise, insider threats, and lawful interception.
The Signal Protocol is the gold standard, used by Signal, WhatsApp, and Google Messages. It combines three cryptographic constructs. The Double Ratchet algorithm: combines a Diffie-Hellman ratchet (new DH key pair per message exchange, providing break-in recovery) with a symmetric ratchet (hash chain derivation, providing forward secrecy per message). If an attacker compromises a key, they can only decrypt a limited number of messages before the DH ratchet advances.
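The symmetric half of the ratchet is a hash chain, sketched below with HMAC-SHA256. This shows only the forward-secrecy mechanism; the real protocol derives keys with HKDF and interleaves the Diffie-Hellman ratchet steps described above, and the starting secret here is a placeholder for the X3DH output:

```python
import hashlib
import hmac

def kdf(chain_key: bytes, label: bytes) -> bytes:
    return hmac.new(chain_key, label, hashlib.sha256).digest()

def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive a one-time message key, then advance the chain. Because the hash
    is one-way, learning the current chain key reveals nothing about keys
    already consumed -- per-message forward secrecy."""
    message_key = kdf(chain_key, b"msg")
    next_chain_key = kdf(chain_key, b"chain")
    return message_key, next_chain_key

ck = b"\x00" * 32          # placeholder shared secret (X3DH output in reality)
mk1, ck = ratchet_step(ck)  # key for message 1, chain advances
mk2, ck = ratchet_step(ck)  # key for message 2
assert mk1 != mk2           # every message gets a distinct key
```

Break-in recovery, by contrast, comes from the DH ratchet: fresh key pairs periodically re-seed this chain so a compromise does not persist forward either.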
X3DH (Extended Triple Diffie-Hellman) key agreement: enables asynchronous key exchange so that Alice can send an encrypted message to Bob even if Bob is offline. Bob publishes pre-keys to the server. Alice uses her identity key, an ephemeral key, and Bob's pre-keys to derive a shared secret without interaction.
Sesame for multi-device support: each device has its own set of keys. A message to Bob is encrypted separately for each of Bob's devices. This multiplies ciphertext size by the number of devices.
For group messaging, discuss Sender Keys: the sender distributes a symmetric sender key to each group member via pairwise Signal sessions, then uses that key for all group messages, reducing the per-message overhead from O(N) encryptions to O(1). The trade-off is weaker forward secrecy, since compromising the sender key reveals all subsequent messages until the key is rotated.
For key verification, implement a safety number (fingerprint) comparison feature. Users can verify each other's identity keys out of band to detect man-in-the-middle attacks by the server.
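A toy version of the safety-number idea can be sketched in a few lines. This is an illustrative scheme only (Signal's real format uses iterated hashing over versioned key material and 60-digit numbers); the point is that sorting the two identity keys makes both devices render the same number:

```python
import hashlib

def safety_number(identity_key_a: bytes, identity_key_b: bytes) -> str:
    # Sort the keys so both parties derive the identical fingerprint,
    # then render the digest as six groups of five decimal digits.
    material = b"".join(sorted([identity_key_a, identity_key_b]))
    digest = hashlib.sha256(material).digest()
    digits = "".join(str(b % 10) for b in digest[:30])
    return " ".join(digits[i:i + 5] for i in range(0, 30, 5))

alice_view = safety_number(b"alice-identity-key", b"bob-identity-key")
bob_view = safety_number(b"bob-identity-key", b"alice-identity-key")
assert alice_view == bob_view  # both users compare the same number
```

If the server substitutes its own key in a man-in-the-middle attack, the fingerprints computed on the two devices diverge, which an out-of-band comparison reveals.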
Discuss the feature trade-offs of E2EE: server-side search is impossible (the server cannot read messages), message backup requires encrypting the backup with a user-held key (losing the key means losing all message history), and link previews require client-side fetching.
Follow-up questions:
- How do you handle key management when a user gets a new device?
- What happens to message history when a new member joins an E2EE group?
- How would you implement disappearing messages in an E2EE system?
12. Explain CSRF attacks and modern prevention techniques.
What the interviewer is really asking: Do you understand why CSRF works, how the browser's same-origin policy interacts with cookies, and what the modern defenses are beyond the classic synchronizer token?
Answer framework:
CSRF (Cross-Site Request Forgery) tricks a user's browser into making an unintended request to a target site where the user is authenticated. The attack works because browsers automatically attach cookies (including session cookies) to every request to a domain, regardless of which site initiated the request.
Example: a user is logged into their bank at bank.com. They visit evil.com which contains a hidden form that submits a transfer request to bank.com. The browser attaches the bank.com session cookie, and the bank processes the transfer as a legitimate authenticated request.
Modern prevention uses multiple layers. The first and most effective is the SameSite cookie attribute. SameSite=Strict: the cookie is never sent on cross-site requests. This prevents CSRF entirely but breaks legitimate cross-site navigation (clicking a link to bank.com from an email will not include the cookie). SameSite=Lax: the cookie is sent on top-level navigations (clicking links) but not on cross-site POST requests, cross-origin form POSTs, or subresource and API calls. This is the default in modern browsers and prevents most CSRF while preserving usability.
The second layer is the synchronizer token pattern: include a random, per-session CSRF token in every form as a hidden field. The server verifies the token on form submission. Since the attacker's site cannot read tokens from bank.com (Same-Origin Policy prevents cross-origin reads), they cannot include the correct token in their forged request.
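The synchronizer token pattern reduces to a few lines of server-side logic. A minimal sketch, assuming an in-memory session store for illustration (a real application would use its session backend):

```python
import hmac
import secrets

# Illustrative in-memory store mapping session id -> CSRF token.
_sessions: dict[str, str] = {}

def issue_csrf_token(session_id: str) -> str:
    # Random per-session token, embedded in every form as a hidden field.
    token = secrets.token_urlsafe(32)
    _sessions[session_id] = token
    return token

def verify_csrf_token(session_id: str, submitted: str) -> bool:
    # Constant-time comparison avoids leaking the token via timing.
    expected = _sessions.get(session_id)
    return expected is not None and hmac.compare_digest(expected, submitted)

token = issue_csrf_token("session-abc")
assert verify_csrf_token("session-abc", token)
assert not verify_csrf_token("session-abc", "forged-token")
```

The attacker's page cannot read the hidden field cross-origin, so the forged request arrives without a valid token and is rejected.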
The third layer is checking the Origin and Referer headers. On every state-changing request, verify that the Origin header matches your domain. This is effective but some browsers omit the Origin header in certain scenarios, so it should not be the sole defense.
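The Origin/Referer check is a short allowlist comparison. A hedged sketch (the allowed origin is a hypothetical deployment domain; note it fails closed when both headers are absent, which is the safer default for state-changing requests):

```python
from urllib.parse import urlparse

ALLOWED_ORIGINS = {"https://bank.example.com"}  # hypothetical app origin

def origin_allowed(headers: dict[str, str]) -> bool:
    # Prefer the Origin header; fall back to Referer; fail closed if
    # neither is present on a state-changing request.
    origin = headers.get("Origin")
    if origin is not None:
        return origin in ALLOWED_ORIGINS
    referer = headers.get("Referer")
    if referer is not None:
        parsed = urlparse(referer)
        return f"{parsed.scheme}://{parsed.netloc}" in ALLOWED_ORIGINS
    return False

assert origin_allowed({"Origin": "https://bank.example.com"})
assert not origin_allowed({"Origin": "https://evil.example.net"})
assert not origin_allowed({})  # missing headers: reject
```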
For APIs that use token-based authentication (Bearer tokens in the Authorization header) rather than cookies, CSRF is not a concern because the token is explicitly added by the client code, not automatically by the browser. This is one security advantage of token-based authentication for SPAs.
Discuss edge cases: login CSRF (forcing a user to log into the attacker's account, capturing subsequent activity), JSON CSRF (exploiting endpoints that accept form-encoded data by mistake), and cross-subdomain CSRF.
Follow-up questions:
- Why is SameSite=Lax not sufficient for all CSRF prevention?
- How does CSRF interact with CORS, and why is CORS not a CSRF defense?
- How would you protect a legacy application that cannot easily add CSRF tokens?
13. How would you secure a CI/CD pipeline?
What the interviewer is really asking: Do you understand the software supply chain security risks and can you design a pipeline that prevents unauthorized code from reaching production?
Answer framework:
The CI/CD pipeline is a high-value target because compromising it provides a path to inject malicious code into production. The SolarWinds attack demonstrated the catastrophic impact of a compromised build pipeline.
Start with source code security. Enforce branch protection rules: require code review from at least two approved reviewers, require signed commits (GPG or SSH signatures), prevent force pushes to main branches, and require status checks (CI passing) before merge. Use CODEOWNERS files to ensure security-sensitive code changes are reviewed by security team members.
For build integrity, use hermetic builds: the build environment should be fully defined and reproducible. Pin all dependency versions (not just direct dependencies, but the entire dependency tree). Use lock files (package-lock.json, go.sum) and verify checksums. Scan dependencies for known vulnerabilities using tools like Dependabot, Snyk, or Grype. Implement the SLSA (Supply-chain Levels for Software Artifacts) framework: at Level 3, the build process is fully auditable, and the provenance of every artifact is cryptographically verifiable.
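The checksum-verification step can be sketched as a simple gate in the build. The pinned digests below are hypothetical stand-ins for what a lock file would record:

```python
import hashlib

# Hypothetical pinned checksums, as recorded in a lock file at pin time.
PINNED = {
    "libfoo-1.2.3.tar.gz": hashlib.sha256(b"trusted build of libfoo").hexdigest(),
}

def verify_artifact(name: str, data: bytes) -> bool:
    # Fail the build unless the downloaded artifact's digest matches the
    # pinned value; unpinned artifacts are rejected outright.
    expected = PINNED.get(name)
    return expected is not None and hashlib.sha256(data).hexdigest() == expected

assert verify_artifact("libfoo-1.2.3.tar.gz", b"trusted build of libfoo")
assert not verify_artifact("libfoo-1.2.3.tar.gz", b"tampered build")
assert not verify_artifact("unknown.tar.gz", b"anything")
```

This is what `go.sum` verification and npm's integrity field do under the hood: a mismatch means the registry served different bytes than were originally pinned.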
For secret management in CI/CD, never store secrets in pipeline configuration files or environment variables visible in logs. Use the CI platform's secret management (GitHub Actions secrets, GitLab CI variables) with masking enabled. Better yet, use OIDC federation: the CI platform assumes a cloud IAM role using short-lived tokens, eliminating long-lived credentials entirely.
For artifact security, sign container images using Cosign (Sigstore) or Notary. Before deployment, the deployment pipeline verifies the signature, ensuring the image was built by the trusted CI system and has not been tampered with. Scan images for vulnerabilities before pushing to the registry.
For runtime protection, implement admission controllers in Kubernetes that reject unsigned images, images with critical vulnerabilities, or images not from the trusted registry. Use network policies to limit what deployed workloads can access.
For auditing, log every pipeline execution with who triggered it, what code was built, what dependencies were used, and what artifacts were produced. Implement anomaly detection: alert on unusual patterns like pipeline executions at odd hours or from unusual locations.
Follow-up questions:
- How do you handle a compromised dependency that passes all vulnerability scans because the vulnerability is not yet publicly known?
- How would you implement canary deployments from a security perspective?
- What is the risk of self-hosted CI runners, and how do you mitigate it?
14. How would you design a system that is compliant with data privacy regulations like GDPR?
What the interviewer is really asking: Can you translate legal requirements into technical architecture decisions, particularly around data classification, consent management, retention policies, and the right to deletion?
Answer framework:
GDPR (and similar regulations like CCPA) imposes technical requirements that must be baked into the system architecture, not bolted on afterward. This is the principle of privacy by design.
Start with data classification. Categorize all data your system processes: PII (personally identifiable information: name, email, phone, IP address), sensitive PII (financial data, health data, biometrics), and non-personal data. Maintain a data inventory that maps each data element to its purpose, legal basis for processing, retention period, and storage location.
For consent management, build a consent service that records explicit, granular consent. Users must be able to consent to specific purposes independently (marketing emails, analytics, third-party sharing). Consent must be freely given, specific, informed, and unambiguous. Store consent records immutably with timestamps for audit purposes.
For the right to erasure (Article 17), this is architecturally the most challenging requirement. When a user requests deletion, you must delete or anonymize their data from all systems: primary databases, caches, search indexes, backups, analytics warehouses, third-party systems, and logs. Design for this from the start: use a consistent user identifier across all systems, build a deletion pipeline that orchestrates removal from each system, and verify completeness. For backups, the accepted approach is to maintain a deletion log and apply it when backups are restored.
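The orchestration pattern described above can be sketched as a fan-out with per-system verification. This is a toy model: each real connector would call the downstream system's deletion API and the report would feed a retry queue:

```python
from dataclasses import dataclass, field

@dataclass
class ErasureOrchestrator:
    # Hypothetical registry: system name -> set of user ids it holds.
    systems: dict[str, set[str]] = field(default_factory=dict)

    def delete_user(self, user_id: str) -> dict[str, bool]:
        # Fan the deletion out to every registered system, then verify
        # completeness; per-system results let failures be retried.
        results: dict[str, bool] = {}
        for name, store in self.systems.items():
            store.discard(user_id)               # idempotent delete
            results[name] = user_id not in store  # verification step
        return results

orchestrator = ErasureOrchestrator({
    "primary_db": {"u1", "u2"},
    "search_index": {"u1"},
    "analytics": {"u1", "u3"},
})
report = orchestrator.delete_user("u1")
assert all(report.values())                       # u1 removed everywhere
assert "u2" in orchestrator.systems["primary_db"]  # others untouched
```

Idempotency matters: deletion requests will be retried, and re-running the pipeline against a system that already deleted the user must succeed.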
For data minimization, collect only the data you need for the stated purpose. Implement automatic data retention policies: data older than the retention period is automatically deleted or anonymized. Use pseudonymization where possible: replace direct identifiers with pseudonyms, store the mapping separately with stricter access controls.
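Keyed pseudonymization is a small but important detail: a plain hash of an email address can be reversed by dictionary attack, whereas an HMAC cannot without the key. A minimal sketch, with an illustrative key that in practice would live in a KMS, separate from the data it protects:

```python
import hmac
import hashlib

# Illustrative key; store it separately from the pseudonymized data,
# under stricter access controls (e.g. a KMS).
PSEUDONYM_KEY = b"keep-me-in-a-separate-kms"

def pseudonymize(identifier: str) -> str:
    # Deterministic keyed pseudonym: the same input always maps to the
    # same pseudonym (so joins still work), but without the key an
    # attacker cannot brute-force the original identifier.
    return hmac.new(PSEUDONYM_KEY, identifier.encode(), hashlib.sha256).hexdigest()

assert pseudonymize("alice@example.com") == pseudonymize("alice@example.com")
assert pseudonymize("alice@example.com") != pseudonymize("bob@example.com")
```

Rotating or destroying the key is also a practical anonymization lever: once the key is gone, the pseudonyms can no longer be linked back to individuals.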
For cross-border data transfers, understand where your data is stored and processed. Use data residency controls to keep EU user data in EU regions. For transfers outside the EU, implement Standard Contractual Clauses and ensure adequate encryption.
For data breach notification, GDPR requires notification within 72 hours. Build monitoring and alerting that can detect unauthorized data access quickly. Maintain an incident response playbook specific to data breaches that includes notification templates and regulatory contact information.
Follow-up questions:
- How do you handle the right to erasure for data that has been processed through ML training pipelines?
- How would you implement data portability (Article 20) for a complex application?
- How do you balance data retention for fraud prevention with the right to erasure?
15. How would you detect and respond to a security incident in a production system?
What the interviewer is really asking: Do you have practical experience with incident response, and can you design detection systems and response procedures that minimize the blast radius and recovery time?
Answer framework:
Incident response follows a lifecycle: preparation, detection, containment, eradication, recovery, and lessons learned. Senior engineers should be able to design systems that support each phase.
For detection, implement multiple signal sources. Application-level: anomalous error rates, unusual API call patterns, authentication failures, privilege escalation attempts. Infrastructure-level: unexpected network connections, unusual process execution, file integrity changes. Security-specific: WAF alerts, IDS/IPS alerts, threat intelligence feeds. Aggregate all signals into a SIEM (Security Information and Event Management) system. Use correlation rules: a single failed login is noise, but 1,000 failed logins from different IPs in 5 minutes is an attack in progress: spread across many accounts it indicates credential stuffing, concentrated on a single account it indicates a distributed brute-force attempt.
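A correlation rule of this kind is essentially a sliding-window counter. A minimal in-memory sketch (thresholds are illustrative; a production SIEM would evaluate this over a streaming pipeline):

```python
from collections import deque, defaultdict

class FailedLoginDetector:
    """Sliding-window correlation rule: alert when failed logins for one
    account exceed a threshold from many distinct IPs within the window.
    Thresholds here are illustrative, not recommendations."""

    def __init__(self, window_seconds: int = 300, threshold: int = 1000, min_ips: int = 50):
        self.window = window_seconds
        self.threshold = threshold
        self.min_ips = min_ips
        self.events: dict[str, deque] = defaultdict(deque)  # account -> (ts, ip)

    def record_failure(self, account: str, source_ip: str, ts: float) -> bool:
        q = self.events[account]
        q.append((ts, source_ip))
        while q and q[0][0] < ts - self.window:
            q.popleft()  # expire events that fell out of the window
        distinct_ips = {ip for _, ip in q}
        return len(q) >= self.threshold and len(distinct_ips) >= self.min_ips

detector = FailedLoginDetector(threshold=3, min_ips=3)
assert not detector.record_failure("victim", "10.0.0.1", 0)
assert not detector.record_failure("victim", "10.0.0.2", 1)
assert detector.record_failure("victim", "10.0.0.3", 2)  # rule fires
```

The distinct-IP condition is what separates a user fumbling their password from a distributed attack.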
For alerting, reduce false positives ruthlessly since alert fatigue is the enemy of effective detection. Classify alerts by severity: P1 (active data exfiltration, requires immediate response), P2 (confirmed unauthorized access, requires response within 1 hour), P3 (suspicious activity, requires investigation within 24 hours). Define escalation paths for each severity level.
For containment, have pre-built runbooks for common scenarios. Compromised user account: force session revocation, reset credentials, disable account, notify user. Compromised service: isolate the service by restricting network access (do not shut it down since you may lose forensic evidence), rotate all secrets the service had access to, and deploy a known-good version. Data exfiltration: identify the exfiltration channel, block it at the network level, and assess the scope of data accessed.
For forensics, ensure sufficient logging is in place before an incident occurs. Log authentication events, authorization decisions, data access patterns, administrative actions, and network flows. Retain logs for at least 90 days in tamper-proof storage (write-once, append-only). Use structured logging with consistent fields (timestamp, user_id, action, resource, source_ip) to enable efficient querying during an incident.
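A structured audit record with the consistent fields named above might look like this minimal sketch (field names follow the list in the text; the storage layer enforcing write-once retention is out of scope here):

```python
import json
import time

def audit_log(action: str, user_id: str, resource: str, source_ip: str) -> str:
    # One structured record per security-relevant event; consistent field
    # names let responders query by user, resource, or source IP.
    record = {
        "timestamp": time.time(),
        "user_id": user_id,
        "action": action,
        "resource": resource,
        "source_ip": source_ip,
    }
    return json.dumps(record, sort_keys=True)

line = audit_log("read", "u42", "accounts/123", "203.0.113.7")
parsed = json.loads(line)
assert parsed["action"] == "read" and parsed["source_ip"] == "203.0.113.7"
```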
For recovery, restore from known-good backups if necessary. Rotate all potentially compromised credentials. Conduct a thorough review to ensure the attacker has not established persistence (backdoor accounts, scheduled tasks, modified binaries).
For lessons learned, conduct a blameless post-incident review within 72 hours. Document the timeline, root cause, detection gap (how could we have detected this sooner?), and remediation actions. Track remediation items to completion.
Follow-up questions:
- How do you balance the need for forensic evidence preservation with the urgency of containment?
- How would you design a system to detect insider threats?
- What is the role of chaos engineering in security incident preparedness?
Common Mistakes in Security Interviews
- Treating security as an afterthought. Senior engineers are expected to integrate security considerations into the design from the start, not propose bolting on a WAF at the end. Interviewers notice when security is mentioned only when prompted.
- Recommending encryption without specifying what and how. Saying "we will encrypt the data" is insufficient. Specify encryption at rest (AES-256-GCM with envelope encryption and a KMS-managed master key) and in transit (TLS 1.3 with certificate pinning for mobile clients). Discuss key management, rotation, and what happens when keys are compromised.
- Ignoring the human element. Many breaches start with social engineering, phishing, or insider threats. A complete security answer acknowledges that technical controls must be complemented by training, access reviews, and monitoring for anomalous human behavior.
- Over-relying on a single defense. Defense in depth means that if one control fails, others still protect the system. Proposing only input validation for XSS prevention, without CSP, output encoding, and architectural isolation, shows shallow understanding.
- Not considering operational complexity. A security solution that is too complex to operate correctly will be misconfigured or bypassed. The best security architectures are simple enough that the team can maintain them reliably. When comparing solutions, see our comparison of Auth0 vs Clerk for a practical example of evaluating security tooling trade-offs.
How to Prepare for Security Interviews
Build your security knowledge systematically over 4-6 weeks. Start with foundational cryptography: understand symmetric vs asymmetric encryption, hashing, digital signatures, and key exchange. Then study authentication and authorization protocols in depth, including OAuth 2.0, OIDC, SAML, and mTLS.
Study the OWASP Top Ten thoroughly, but go beyond memorization. For each vulnerability class, understand the root cause, practice identifying it in code, and know the defense-in-depth approach. Build a small vulnerable application and practice exploiting and fixing each vulnerability type.
Read post-mortems from real-world breaches. The Equifax breach (unpatched vulnerability), the Capital One breach (SSRF exploiting cloud metadata), and the SolarWinds attack (supply chain compromise) each illustrate different attack patterns and defensive lessons.
Practice threat modeling on system design problems. Take any system design question and identify the trust boundaries, enumerate threats using STRIDE, and propose mitigations. This skill transfers directly to system design interviews where security is a key evaluation criterion.
For hands-on practice, explore distributed systems security patterns and review how security integrates with system design by working through our learning paths. Understanding how companies like Stripe approach security provides valuable real-world context for interview discussions. Visit our pricing page to explore premium preparation materials.
Related Resources
GO DEEPER
Master this topic in our 12-week cohort
Our Advanced System Design cohort covers this and 11 other deep-dive topics with live sessions, assignments, and expert feedback.