Networking Interview Questions for Senior Engineers (2026)

Master TCP/UDP/DNS and networking interview questions with structured answer frameworks. Covers transport protocols, DNS resolution, socket programming, and network debugging for senior engineering interviews.

20 min read · Updated Apr 19, 2026
Tags: interview-questions, networking, senior-engineer

Networking knowledge separates senior engineers from mid-level ones. While junior developers can build applications without understanding TCP handshakes or DNS resolution, senior engineers are expected to debug production issues that span network layers, design systems that handle millions of concurrent connections, and make informed decisions about transport protocols.

In senior-level interviews, networking questions test whether you can reason about distributed system behavior under real-world conditions — packet loss, latency spikes, DNS failures, and connection exhaustion. Interviewers want to see that you understand not just the theory but the practical implications for system design.

This guide covers the most frequently asked networking interview questions at companies like Google, Meta, Amazon, and Netflix. Each question includes the hidden intent behind it, a structured answer framework, and follow-up questions you should be prepared for. Whether you are preparing for a system design interview or a deep-dive technical round, these questions will help you demonstrate senior-level networking expertise.

For a broader view of distributed systems concepts, see our distributed systems guide.


Questions

1. Explain the TCP three-way handshake and why it is necessary.

What the interviewer is really asking: Do you understand connection-oriented protocols at a fundamental level, and can you explain why reliability requires this overhead?

Answer framework:

The TCP three-way handshake establishes a reliable, bidirectional communication channel between two hosts. The process involves three steps:

  1. SYN: The client sends a SYN (synchronize) segment with an initial sequence number (ISN). This ISN is randomly generated to prevent spoofing and collisions with old connections.
  2. SYN-ACK: The server responds with its own SYN and an ACK of the client's ISN+1. The server also sends its own randomly generated ISN.
  3. ACK: The client acknowledges the server's ISN+1, completing the handshake.

The handshake is necessary for several reasons. First, both sides need to agree on initial sequence numbers for reliable, ordered delivery. Second, it confirms that both sides can send and receive — without the final ACK, the server would never learn that its own segments actually reach the client. Third, it prevents old duplicate SYN segments from opening phantom connections (the ISN validation handles this).

A key performance consideration is that the handshake adds one round-trip time (RTT) of latency before any data can be sent. For short-lived connections (like HTTP/1.0 requests), this overhead is significant. This is why HTTP/1.1 introduced keep-alive connections, HTTP/2 uses multiplexing over a single connection, and HTTP/3 uses QUIC (over UDP) which combines the transport and TLS handshake into a single round trip.

In production systems, you can optimize by using TCP Fast Open (TFO), which allows data to be sent in the SYN packet on subsequent connections using a cached cookie. Linux supports this with the TCP_FASTOPEN socket option.
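
As a rough sketch (not production code), this is what TFO looks like at the socket level on Linux, assuming the kernel's net.ipv4.tcp_fastopen setting permits server-side TFO; the port and queue length are placeholders:

```python
import socket

# Minimal sketch (Linux): enable TCP Fast Open on a listening socket. Assumes the
# kernel permits server-side TFO (net.ipv4.tcp_fastopen has the server bit set).
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
# The option value is the length of the queue of pending TFO connections.
srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_FASTOPEN, 16)
srv.listen(128)
# On the client side (also Linux), data can ride in the SYN with
# sock.sendto(payload, socket.MSG_FASTOPEN, (host, port)).
```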

Connection establishment also involves negotiating parameters like Maximum Segment Size (MSS), window scaling, and selective acknowledgments (SACK) through TCP options in the SYN segments.

Follow-up questions:

  • What happens if the final ACK is lost? (Server retransmits SYN-ACK; connection eventually times out if ACK never arrives)
  • How does TCP Fast Open work and what are its security implications?
  • Why are initial sequence numbers randomized?

2. What are the differences between TCP and UDP, and when would you choose each?

What the interviewer is really asking: Can you make informed protocol choices based on application requirements rather than defaulting to TCP for everything?

Answer framework:

TCP and UDP are both transport-layer protocols but serve fundamentally different purposes.

TCP provides reliable, ordered, connection-oriented communication. It guarantees delivery through acknowledgments and retransmissions, maintains ordering through sequence numbers, and provides flow control (receiver window) and congestion control (slow start, congestion avoidance). The cost is higher latency and overhead — the three-way handshake, per-segment ACKs, and head-of-line blocking.

UDP is a minimal, connectionless protocol. It provides no delivery guarantees, no ordering, and no congestion control. Each datagram is independent. The benefit is lower latency, lower overhead (8-byte header vs TCP's 20-byte minimum), and no head-of-line blocking.

Choose TCP when: Data integrity matters more than latency — web traffic (HTTP), file transfers, email, database connections. Any application where losing a single byte corrupts the entire message.

Choose UDP when: Latency matters more than reliability — real-time video/voice (WebRTC), online gaming (player position updates), DNS queries (small, idempotent), IoT sensor data (high volume, individual losses acceptable). Also when you need multicast/broadcast capabilities.
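
To make the contrast concrete, here is a compressed sketch of both APIs; example.com and UDP port 9999 are placeholders:

```python
import socket

# TCP: connection setup (three-way handshake) before any data flows.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.com", 80))          # blocks for the handshake (~1 RTT)
tcp.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
reply = tcp.recv(4096)                    # delivery and ordering guaranteed by TCP
tcp.close()

# UDP: no handshake, no delivery guarantee; each datagram stands alone.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"ping", ("example.com", 9999))  # fire-and-forget; may be silently lost
udp.close()
```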

Modern nuance: Many applications build reliability on top of UDP to get the best of both worlds. QUIC (used by HTTP/3) runs over UDP but implements its own reliability, ordering, and congestion control — with the advantage of avoiding head-of-line blocking between multiplexed streams. Google's experience showed that moving to QUIC reduced search latency by 8% and YouTube rebuffering by 18%.

In system design interviews, protocol choice matters when designing real-time systems or video streaming architectures.

Follow-up questions:

  • What is head-of-line blocking and how does QUIC solve it?
  • How does a streaming service like Netflix decide between TCP and UDP?
  • What congestion control algorithms does TCP use and why do they matter?

3. Walk me through what happens when you type a URL into a browser.

What the interviewer is really asking: Can you demonstrate end-to-end understanding of the network stack, from application layer to physical layer, and identify where performance optimizations occur?

Answer framework:

This is the classic networking question. A strong answer walks through each layer systematically:

  1. URL Parsing: Browser parses the URL into protocol (HTTPS), hostname (example.com), port (443), and path. It checks HSTS preload list to determine if HTTPS is required.

  2. DNS Resolution: Browser checks its DNS cache, then OS cache, then the resolver. The resolver performs recursive resolution: root servers → TLD servers (.com) → authoritative nameservers. If the zone is signed, DNSSEC lets the resolver validate the response. The result is an IP address (or multiple for load balancing). See how DNS works for details.

  3. TCP Connection: Browser initiates a TCP three-way handshake to the resolved IP on port 443. For HTTPS, this is followed by a TLS handshake (TLS 1.3 requires just one additional round trip). With TLS 1.3 and session resumption, subsequent connections can use 0-RTT.

  4. HTTP Request: Browser sends an HTTP GET request with headers including Host, User-Agent, Accept, Accept-Encoding (gzip, br), cookies, and cache-control directives. For HTTP/2, the request is framed as a HEADERS frame on a new stream.

  5. Server Processing: The request may hit a CDN edge server first. If not cached, it reaches a load balancer, then an application server. The server processes the request, potentially querying databases and microservices.

  6. Response: Server returns HTML with status code, headers (cache-control, content-type, content-encoding), and body. The browser parses HTML, discovers additional resources (CSS, JS, images), and makes additional requests — potentially using HTTP/2 multiplexing.

  7. Rendering: Browser constructs the DOM tree, CSSOM tree, render tree, performs layout, paint, and compositing. JavaScript execution can block rendering.

Key optimizations at each layer: DNS prefetching, connection prewarming, HTTP/2 server push, compression, CDN caching, browser caching.
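
To make steps 2–4 concrete, the sequence can be reproduced by hand with the Python standard library — a sketch of the mechanics, not how browsers actually implement it (example.com is a placeholder):

```python
import socket
import ssl

host = "example.com"

# Step 2: DNS resolution. getaddrinfo() consults /etc/hosts and any local cache
# before asking the configured recursive resolver.
ip = socket.getaddrinfo(host, 443, type=socket.SOCK_STREAM)[0][4][0]

# Step 3: TCP three-way handshake, then the TLS handshake layered on top (SNI, cert check).
ctx = ssl.create_default_context()
with socket.create_connection((ip, 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname=host) as tls:
        # Step 4: a minimal HTTP/1.1 request.
        tls.sendall(f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode())
        # Step 6: read the status line, headers, and body until the server closes.
        response = b""
        while chunk := tls.recv(4096):
            response += chunk

print(response.split(b"\r\n", 1)[0])   # e.g. b'HTTP/1.1 200 OK'
```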

Follow-up questions:

  • Where would you add monitoring to debug a slow page load?
  • How does HTTP/2 multiplexing differ from HTTP/1.1 pipelining?
  • What role does the OS TCP buffer play in performance?

4. How does DNS resolution work, and what are common DNS failure modes in production?

What the interviewer is really asking: Have you debugged real DNS issues in production, and do you understand DNS beyond basic name resolution?

Answer framework:

DNS resolution translates domain names to IP addresses through a hierarchical, distributed system.

Resolution process:

  1. Application calls getaddrinfo() (or similar), which checks /etc/hosts, then the local DNS cache.
  2. If not cached, the stub resolver queries the configured recursive resolver (corporate DNS, ISP DNS, or public resolvers like 8.8.8.8).
  3. The recursive resolver checks its cache. On a miss, it queries root nameservers (13 logical server identities, each replicated across many physical instances worldwide via anycast).
  4. Root servers direct to TLD nameservers (.com, .org, etc.).
  5. TLD nameservers direct to the domain's authoritative nameservers.
  6. Authoritative nameservers return the final answer with a TTL.

Record types that matter: A (IPv4), AAAA (IPv6), CNAME (alias), MX (mail), SRV (service discovery), TXT (verification, SPF, DKIM), NS (delegation).

Common production failure modes:

  • TTL expiry during failover: If you use DNS for failover with a 300s TTL, clients may take up to 5 minutes to switch. Some resolvers ignore low TTLs. Mitigation: use lower TTLs proactively before planned changes.
  • Negative caching: Failed lookups (NXDOMAIN) are cached per the SOA record's negative TTL. This can delay recovery after fixing a DNS misconfiguration.
  • Resolver capacity: Under high traffic, the local resolver may become a bottleneck. Connection-oriented DNS (DoT, DoH) can exhaust resolver connections.
  • CNAME chains: Deep CNAME chains add latency and can create loops. Each CNAME requires an additional lookup.
  • Split-horizon DNS: Internal vs external DNS returning different results can cause confusing bugs in hybrid environments.

For production reliability, use multiple DNS providers, monitor resolution latency, and keep TTLs appropriate for your failover requirements. Companies like Netflix use custom DNS infrastructure to handle their scale.
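
A quick way to see resolution behavior from the application's perspective is to time getaddrinfo() — a rough sketch; the standard library cannot show TTLs or individual record types, for which you would reach for dig or a dedicated DNS library:

```python
import socket
import time

# Sketch: measure resolution latency as the application sees it. getaddrinfo() goes
# through /etc/hosts and any local cache (systemd-resolved, nscd) before the resolver,
# so a second call is often much faster than the first when a cache is in play.
def timed_resolve(name: str):
    start = time.perf_counter()
    infos = socket.getaddrinfo(name, None)
    elapsed_ms = (time.perf_counter() - start) * 1000
    addrs = sorted({info[4][0] for info in infos})
    return elapsed_ms, addrs

for attempt in range(2):
    ms, addrs = timed_resolve("example.com")
    print(f"attempt {attempt + 1}: {ms:.1f} ms -> {addrs}")
```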

Follow-up questions:

  • How does DNS-based global load balancing work?
  • What is the difference between authoritative and recursive DNS servers?
  • How would you debug a DNS issue where some users cannot reach your service?

5. Explain TCP congestion control and why it matters for distributed systems.

What the interviewer is really asking: Do you understand how TCP adapts to network conditions, and can you reason about its impact on application performance?

Answer framework:

TCP congestion control prevents senders from overwhelming the network. Without it, simultaneous senders would cause congestion collapse — a state where the network is busy but almost no useful data gets through (as happened on the early Internet in 1986).

Core mechanisms:

  • Congestion Window (cwnd): Limits how much unacknowledged data can be in flight. Starts small and grows.
  • Slow Start: cwnd starts at 1-10 MSS and doubles every RTT (exponential growth). Despite the name, it grows quickly.
  • Congestion Avoidance: After reaching the slow-start threshold (ssthresh), cwnd grows linearly — one MSS per RTT.
  • Fast Retransmit/Recovery: Three duplicate ACKs trigger retransmission without waiting for timeout. cwnd is halved rather than reset.

Modern algorithms:

  • Reno/NewReno: Classic loss-based. Halves cwnd on packet loss. Fair but underutilizes high-bandwidth links.
  • CUBIC: Default in Linux. Uses a cubic function for window growth. Better for high-bandwidth, high-latency networks.
  • BBR (Bottleneck Bandwidth and RTT): Developed by Google. Measures actual bandwidth and RTT rather than inferring congestion from loss. Significantly better for long-distance and lossy links.
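
On Linux, the algorithm can also be selected per socket rather than system-wide — a minimal sketch, assuming the chosen algorithm is compiled into or loaded by the kernel:

```python
import socket

# Linux-only sketch: pick the congestion control algorithm per socket.
# 'bbr' requires the tcp_bbr module; the system-wide default lives in
# net.ipv4.tcp_congestion_control.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))  # b'bbr' (possibly NUL-padded)
```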

Impact on distributed systems:

  1. Cold connections: A new TCP connection starts with a small cwnd (typically 10 segments = ~14KB). Transferring 1MB requires multiple RTTs of ramping up. This is why connection pooling matters.
  2. Incast: In data center architectures, many servers responding simultaneously to a single aggregator can overwhelm the switch buffer, causing synchronized packet loss and timeout. Solutions include staggering responses and using ECN.
  3. Cross-datacenter replication: High-latency links with loss suffer dramatically with loss-based congestion control. BBR helps, but can be unfair to Reno/CUBIC flows.

In system design, understanding congestion control helps you reason about why connection pooling, batching, and compression improve performance.

Follow-up questions:

  • What is the TCP incast problem and how do data centers solve it?
  • How does BBR differ from loss-based congestion control?
  • Why does connection pooling improve throughput beyond reducing handshake overhead?

6. What is a TCP connection pool and why is it critical for microservices?

What the interviewer is really asking: Do you understand the performance implications of connection management in distributed architectures?

Answer framework:

A TCP connection pool maintains a set of pre-established, reusable connections to a remote host. Instead of opening a new connection per request (incurring handshake latency and slow-start overhead), applications borrow a connection from the pool, use it, and return it.

Why it is critical for microservices:

  1. Latency reduction: Each new TCP+TLS connection costs 2-3 RTTs. In a microservice call chain of 5 services, that is 10-15 RTTs of pure overhead per request without pooling.
  2. Warm congestion windows: Reused connections maintain their cwnd, allowing immediate high-throughput transfers.
  3. Port exhaustion prevention: Each new connection uses an ephemeral port. Under high load, you can exhaust the 16-bit port range (64K ports minus reserved ones). With TIME_WAIT sockets lingering for 60 seconds or more after close, this becomes critical.
  4. File descriptor conservation: Each connection consumes a file descriptor. Connection pools limit the maximum concurrent connections.

Pool configuration parameters:

  • Max pool size: Maximum connections to a single destination. Too small causes queuing; too large wastes resources.
  • Min idle: Minimum warm connections to maintain. Prevents cold-start latency.
  • Idle timeout: How long unused connections live before being closed. Must be less than the server's idle timeout to avoid sending on closed connections.
  • Connection lifetime: Maximum age of a connection before forced recycling. Prevents issues with stale DNS and load balancer draining.
  • Health checking: Periodic validation that pooled connections are still alive.

Common pitfalls:

  • Not setting a connection lifetime, causing connections to stick to old server instances after deploys.
  • Setting pool size too large, preventing load balancer rebalancing when new instances are added.
  • Not handling DNS changes — the pool connects to resolved IPs, so DNS TTL changes are invisible until connections recycle.

In Kubernetes environments, connection pooling interacts with service meshes and sidecar proxies. Service mesh proxies like Envoy manage their own connection pools, which must be tuned in coordination with application-level pools.
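
A toy pool illustrating the core mechanics (bounded size, idle-timeout recycling) might look like the sketch below; real pools also need health checks, a maximum connection lifetime, TLS, and careful thread-safety, and in practice you would use the pooling built into your HTTP or database client rather than rolling your own:

```python
import queue
import socket
import time

# Illustrative sketch only. Host, port, and limits are placeholders.
class ConnectionPool:
    def __init__(self, host: str, port: int, max_size: int = 10, idle_timeout: float = 30.0):
        self.host, self.port = host, port
        self.idle_timeout = idle_timeout
        self._idle = queue.Queue(maxsize=max_size)   # holds (socket, last_used) pairs

    def _connect(self) -> socket.socket:
        return socket.create_connection((self.host, self.port), timeout=5)

    def acquire(self) -> socket.socket:
        while True:
            try:
                sock, last_used = self._idle.get_nowait()
            except queue.Empty:
                return self._connect()                # pool empty: open a fresh connection
            if time.monotonic() - last_used > self.idle_timeout:
                sock.close()                          # too old: the server may have closed it
                continue
            return sock                               # reuse: keeps the warm congestion window

    def release(self, sock: socket.socket) -> None:
        try:
            self._idle.put_nowait((sock, time.monotonic()))
        except queue.Full:
            sock.close()                              # over max size: discard
```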

Follow-up questions:

  • How do you size a connection pool for a service handling 10K RPS?
  • What happens when the destination service deploys and connections break?
  • How does HTTP/2 multiplexing change the connection pooling strategy?

7. Explain the differences between L4 and L7 load balancing.

What the interviewer is really asking: Do you understand the trade-offs in load balancing strategies and when to use each?

Answer framework:

L4 (Transport Layer) Load Balancing operates at the TCP/UDP level. It sees IP addresses, ports, and TCP flags, but not application-layer content. Decisions are based on connection metadata. Technologies: Linux IPVS, AWS NLB, HAProxy in TCP mode.

  • Routes entire TCP connections, not individual requests.
  • Very fast — can handle millions of connections per second with minimal CPU.
  • Uses techniques like Direct Server Return (DSR) where return traffic bypasses the load balancer.
  • Cannot make routing decisions based on HTTP headers, paths, or cookies.
  • Health checks are limited to TCP connect or simple probe.

L7 (Application Layer) Load Balancing operates at the HTTP/gRPC level. It fully parses requests and can make routing decisions based on headers, paths, cookies, request body, and more. Technologies: Nginx, Envoy, AWS ALB, HAProxy in HTTP mode.

  • Can route /api/users to one service and /api/orders to another.
  • Supports content-based routing, A/B testing, canary deployments.
  • Can modify requests and responses (add headers, rewrite URLs, compress).
  • Terminates TLS, so it can inspect and route on request content that is encrypted on the wire.
  • Higher CPU cost per request due to protocol parsing.
  • Can distribute individual requests across backends for HTTP/2 multiplexed connections.

When to choose each:

  • L4 for raw throughput, TCP/UDP services, or when you need the lowest possible latency.
  • L7 for HTTP microservices, path-based routing, header-based routing, or when you need request-level observability.
  • Many architectures use both: L4 at the edge (handling millions of connections) fronting multiple L7 load balancers (handling application routing).

For cloud architecture design, understanding this distinction is essential. See our system design interview guide for more architectural patterns.
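
A toy sketch of the difference in what each layer can decide; the backend names and routing table are placeholders:

```python
# An L4 balancer only sees the connection tuple; an L7 balancer has parsed the request,
# so it can route on path, headers, or cookies.

def l4_pick_backend(client_ip: str, client_port: int, backends: list[str]) -> str:
    # Connection-level decision: hash the tuple, pick a backend, then forward every
    # byte of that TCP connection to it unchanged.
    return backends[hash((client_ip, client_port)) % len(backends)]

def l7_pick_backend(request_line: str, routes: dict[str, str], default: str) -> str:
    # Request-level decision: the HTTP request line has been parsed, so the path is visible.
    path = request_line.split(" ")[1]                 # e.g. "GET /api/orders/7 HTTP/1.1"
    for prefix, backend in routes.items():
        if path.startswith(prefix):
            return backend
    return default

print(l7_pick_backend("GET /api/orders/7 HTTP/1.1",
                      {"/api/users": "users-svc", "/api/orders": "orders-svc"},
                      default="web-frontend"))        # -> orders-svc
```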

Follow-up questions:

  • How does consistent hashing improve L4 load balancing for stateful protocols?
  • What is the connection draining problem and how do L7 load balancers handle it?
  • How does gRPC load balancing differ from HTTP/1.1 load balancing?

8. What is the TCP TIME_WAIT state and how can it cause production issues?

What the interviewer is really asking: Have you encountered real-world TCP state machine issues in production?

Answer framework:

When a TCP connection is closed, the side that initiates the close (sends the first FIN) enters the TIME_WAIT state. This state lasts for 2 × MSL (Maximum Segment Lifetime), typically 60 seconds on Linux.

Why TIME_WAIT exists:

  1. Reliable connection termination: If the final ACK is lost, the peer will retransmit its FIN. The TIME_WAIT state allows the closer to resend the ACK.
  2. Prevent segment confusion: Old segments from a previous connection on the same 4-tuple (source IP, source port, dest IP, dest port) could be misinterpreted as belonging to a new connection. TIME_WAIT ensures all old segments have expired.

Production issues:

  • Port exhaustion: A server making many outbound connections (to databases, caches, microservices) accumulates TIME_WAIT sockets. With ~28K ephemeral ports and 60-second TIME_WAIT, you are limited to ~470 new connections/second per destination IP and port. This is a common cause of "Cannot assign requested address" errors.
  • Memory consumption: Each TIME_WAIT socket consumes kernel memory (~160 bytes on Linux). Millions of TIME_WAIT sockets use significant memory.

Solutions:

  • Connection pooling: Reuse connections instead of creating new ones. This is the best solution.
  • tcp_tw_reuse: Allow reusing TIME_WAIT sockets for new outgoing connections (safe for clients, uses TCP timestamps to prevent confusion).
  • Increase ephemeral port range: net.ipv4.ip_local_port_range can be expanded to 15000-65535.
  • SO_LINGER with timeout 0: Forces RST instead of graceful close, skipping TIME_WAIT entirely. Use with extreme caution — this can cause data loss.
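
For completeness, this is what that last option looks like at the socket level — a sketch only, and as noted above, rarely the right choice:

```python
import socket
import struct

# Sketch only: SO_LINGER with a zero timeout makes close() send RST instead of FIN,
# so the socket skips TIME_WAIT entirely. Unsent data is discarded and the peer sees
# a connection reset, which is why this is a last resort.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("example.com", 80))
sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))  # l_onoff=1, l_linger=0
sock.close()   # sends RST; no TIME_WAIT is left behind
```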

This is particularly relevant when designing high-throughput systems that make many short-lived outbound connections. Understanding TIME_WAIT is essential for SRE-level debugging.

Follow-up questions:

  • What is the difference between tcp_tw_reuse and the deprecated tcp_tw_recycle?
  • How does HTTP keep-alive reduce TIME_WAIT accumulation?
  • How would you diagnose a port exhaustion issue in production?

9. How does TLS 1.3 work, and what improvements does it offer over TLS 1.2?

What the interviewer is really asking: Do you understand modern transport security and the performance-security trade-offs?

Answer framework:

TLS (Transport Layer Security) provides encryption, authentication, and integrity for TCP connections. TLS 1.3 (RFC 8446) was a major redesign.

TLS 1.3 Handshake (1-RTT):

  1. Client sends ClientHello with supported cipher suites AND key shares (guessing which key exchange the server will choose).
  2. Server responds with ServerHello, its key share, encrypted certificate, and Finished message — all in one flight.
  3. Client sends Finished. Data can flow.

Compared to TLS 1.2's 2-RTT handshake, TLS 1.3 saves one round trip by sending key shares speculatively.

Key improvements:

  • Reduced latency: 1-RTT for new connections, 0-RTT for resumed connections (with replay protection caveats).
  • Simplified cipher suites: Removed insecure algorithms (RC4, SHA-1, RSA key exchange, static DH). Only supports AEAD ciphers (AES-GCM, ChaCha20-Poly1305).
  • Forward secrecy mandatory: All key exchanges use ephemeral Diffie-Hellman (ECDHE). Compromising the server's private key does not decrypt past traffic.
  • Encrypted certificates: The server certificate is encrypted, preventing passive observers from identifying which site is being accessed (though SNI still leaks this unless using ECH).
  • 0-RTT resumption: Clients can send data in the first flight using a pre-shared key from a previous session. However, 0-RTT data is vulnerable to replay attacks, so it should only be used for idempotent requests.

Production considerations:

  • 0-RTT replay risk: Never allow 0-RTT for state-changing operations (POST, PUT, DELETE).
  • Certificate management: Automate with ACME/Let's Encrypt. Short-lived certificates (90 days) reduce compromise window.
  • OCSP stapling: Server attaches proof of certificate validity, eliminating client-side OCSP lookups.

For security interview questions, TLS understanding is foundational. See also our guide on OAuth and authentication.
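
With Python's standard library you can require TLS 1.3 and inspect what was negotiated — a small sketch, with example.com as a placeholder and OpenSSL 1.1.1+ assumed on the client:

```python
import socket
import ssl

# Sketch: force TLS 1.3 and print the negotiated protocol and cipher suite.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3          # refuse anything older

with socket.create_connection(("example.com", 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname="example.com") as tls:
        print(tls.version())                          # 'TLSv1.3'
        print(tls.cipher())                           # e.g. ('TLS_AES_256_GCM_SHA384', 'TLSv1.3', 256)
```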

Follow-up questions:

  • What is the replay attack risk with 0-RTT and how do you mitigate it?
  • How does certificate pinning work and when should you use it?
  • What is Encrypted Client Hello (ECH) and why does it matter for privacy?

10. Explain how NAT works and its implications for peer-to-peer communication.

What the interviewer is really asking: Do you understand network address translation at a practical level, especially for real-time communication systems?

Answer framework:

NAT (Network Address Translation) allows multiple devices on a private network to share a single public IP address. The NAT device (typically a router) rewrites source IP/port on outgoing packets and maintains a mapping table to route return traffic back to the correct internal host.

Types of NAT:

  • Full Cone: Once a mapping exists, any external host can send to the mapped address. Most permissive.
  • Address-Restricted Cone: External host must have received a packet from the internal host first (IP-level restriction).
  • Port-Restricted Cone: Like address-restricted, but also restricts by port.
  • Symmetric: Creates a different mapping for each destination. Most restrictive.

Implications for P2P:

NAT breaks the end-to-end principle — hosts behind NAT cannot receive unsolicited incoming connections. This is a fundamental challenge for P2P applications like video calls, file sharing, and multiplayer gaming.

NAT traversal techniques:

  1. STUN (Session Traversal Utilities for NAT): Client sends a request to a public STUN server, which reports back the client's public IP and port. Works for Full Cone and Restricted Cone NAT.
  2. TURN (Traversal Using Relays around NAT): When direct connection fails, relay all traffic through a public server. Always works but adds latency and server cost.
  3. ICE (Interactive Connectivity Establishment): Framework that tries multiple candidates (local, STUN-derived, TURN relay) and selects the best working path. Used by WebRTC.
  4. Hole Punching: Both peers simultaneously send UDP packets to each other's public address. The outgoing packets create NAT mappings that allow the incoming packets through. Works for most NAT types except symmetric.
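
The core of hole punching fits in a few lines — a conceptual sketch that assumes a signaling channel (not shown) has already exchanged the peers' public addresses; the address below is a documentation placeholder:

```python
import socket
import threading
import time

# Conceptual sketch of UDP hole punching. Both peers run the same loop against each other.
LOCAL_PORT = 40000                       # the local port our NAT mapping is created from
PEER_ADDR = ("203.0.113.7", 41234)       # placeholder: the other peer's public IP:port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", LOCAL_PORT))

def keep_punching():
    # Outgoing packets create and refresh the NAT mapping that lets the peer's packets in.
    while True:
        sock.sendto(b"punch", PEER_ADDR)
        time.sleep(1.0)

threading.Thread(target=keep_punching, daemon=True).start()

while True:
    data, addr = sock.recvfrom(1500)
    print(f"received {data!r} from {addr}")  # once both sides punch through, traffic flows directly
```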

WebRTC uses ICE with STUN and TURN fallback. In production, approximately 80-90% of connections succeed with STUN, and 10-20% require TURN relaying.

For designing real-time communication systems, understanding NAT traversal is essential.

Follow-up questions:

  • Why is symmetric NAT the hardest to traverse?
  • How does IPv6 affect the need for NAT?
  • What is the cost model for running TURN servers at scale?

11. How would you debug a network latency issue between two microservices?

What the interviewer is really asking: Do you have a systematic debugging methodology for production network issues?

Answer framework:

A systematic approach to network latency debugging moves from application layer down to network layer:

Step 1: Characterize the problem

  • Is it all requests or specific ones? (If specific, likely application-level)
  • Is it constant or intermittent? (Intermittent suggests congestion or GC pauses)
  • When did it start? (Correlate with deploys, traffic changes, infrastructure changes)
  • What percentile is affected? (p50 vs p99 tells different stories)

Step 2: Application-layer analysis

  • Check distributed tracing (Jaeger, Zipkin) for the slow span.
  • Examine connection pool metrics: are requests queuing for a connection?
  • Check thread pool saturation: are requests waiting for a worker thread?
  • Look at serialization/deserialization time for large payloads.

Step 3: Transport-layer analysis

  • Use ss -ti to examine TCP connection stats: RTT, retransmissions, cwnd, send buffer.
  • High retransmissions suggest packet loss. Check netstat -s | grep retransmit.
  • Small cwnd suggests recent packet loss or new connections without pooling.
  • Check for TIME_WAIT accumulation with ss -s.

Step 4: Network-layer analysis

  • traceroute/mtr to identify which hop introduces latency.
  • tcpdump or Wireshark to capture and analyze actual packet timing.
  • Check for MTU issues causing fragmentation: ping -M do -s 1472.
  • Verify DNS resolution time: dig +stats example.com.

Step 5: Infrastructure analysis

  • Check NIC error counters: ethtool -S eth0 (drops, overruns).
  • Verify no bandwidth throttling or noisy neighbor issues (common in cloud).
  • Check if the issue is same-AZ vs cross-AZ traffic.
  • Review security group and ACL rules for unintended bottlenecks.

For production observability best practices, structured monitoring at each layer is essential. See our SRE guide for more debugging frameworks.
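
A crude but useful trick during triage is to time only the TCP handshake, which isolates network-level latency from application processing — a sketch using the standard library, with example.com as a placeholder target:

```python
import socket
import statistics
import time

# Sketch: time only the TCP connect. Comparing p50 to p99 over many samples shows
# whether latency is a constant offset (path/propagation) or a tail problem
# (loss and retransmits, congestion, or pauses on the peer).
def connect_times(host: str, port: int, samples: int = 50) -> list[float]:
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=2):
            results.append((time.perf_counter() - start) * 1000)
        time.sleep(0.05)                      # don't hammer the target
    return results

rtts = sorted(connect_times("example.com", 443))
p99 = rtts[int(0.99 * (len(rtts) - 1))]       # crude percentile, fine for triage
print(f"p50={statistics.median(rtts):.1f} ms  p99={p99:.1f} ms")
```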

Follow-up questions:

  • How would you distinguish between network latency and application latency in traces?
  • What is bufferbloat and how can it cause latency issues?
  • How does TCP Nagle's algorithm affect latency and when should you disable it?

12. What is BGP and why do BGP issues cause large-scale outages?

What the interviewer is really asking: Do you understand Internet routing at a high level and can you reason about large-scale infrastructure failures?

Answer framework:

BGP (Border Gateway Protocol) is the routing protocol that glues the Internet together. It enables autonomous systems (AS) — networks operated by a single organization — to exchange routing information and determine paths to reach any IP prefix.

How BGP works:

  • Each AS announces the IP prefixes it owns to its BGP peers.
  • Announcements propagate across the Internet. Each AS prepends its AS number, creating an AS path.
  • When multiple routes exist for a prefix, BGP runs its best-path algorithm (local preference, then shortest AS path, then further tie-breakers). Forwarding itself uses longest prefix match, so a more-specific announcement attracts the traffic regardless of path attributes.
  • BGP is a trust-based protocol — ASes trust that their peers announce legitimate routes.
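
The longest-prefix-match rule is easy to demonstrate, and it explains why more-specific announcements are so disruptive — a small sketch with documentation prefixes and AS numbers as placeholders:

```python
import ipaddress

# Sketch of longest-prefix-match selection. A leaked or hijacked /24 "wins" over the
# legitimate covering /16 for every address it contains.
routes = {
    ipaddress.ip_network("198.51.0.0/16"): "AS64500 (legitimate covering prefix)",
    ipaddress.ip_network("198.51.100.0/24"): "AS64511 (more-specific announcement)",
}

dst = ipaddress.ip_address("198.51.100.42")
best = max((net for net in routes if dst in net), key=lambda net: net.prefixlen)
print(best, "->", routes[best])   # 198.51.100.0/24 -> AS64511 (more-specific announcement)
```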

Why BGP causes large-scale outages:

  1. Route leaks: An AS accidentally announces routes it should not, attracting traffic it cannot handle. In 2019, a small ISP leaked routes for Cloudflare, AWS, and others, causing widespread outages.
  2. Route hijacking: Malicious or accidental announcement of someone else's IP prefixes. Traffic gets misdirected to the wrong network.
  3. Configuration errors: Withdrawing routes (like Facebook's October 2021 outage) makes entire networks unreachable. Facebook's DNS servers became unreachable because the BGP routes to them were withdrawn.
  4. Convergence delays: When routes change, it takes minutes for the entire Internet to converge on new paths. During convergence, traffic can loop or blackhole.

Mitigations:

  • RPKI (Resource Public Key Infrastructure): Cryptographically validates that an AS is authorized to announce specific prefixes.
  • BGP route filtering: Apply strict import/export policies at peering boundaries.
  • Monitoring: Use BGP monitoring services (RIPE RIS, RouteViews) to detect anomalous announcements.
  • Multiple providers: Use multiple transit providers and peering to reduce single-AS dependency.

Understanding BGP is relevant for cloud architecture and designing globally distributed systems.

Follow-up questions:

  • How did the Facebook October 2021 outage happen at a technical level?
  • What is RPKI and how does it prevent route hijacking?
  • How do CDNs use BGP anycast for global load distribution?

13. Explain the concept of network partitions and their impact on distributed systems.

What the interviewer is really asking: Do you understand the CAP theorem at a practical level and how network failures affect system behavior?

Answer framework:

A network partition occurs when nodes in a distributed system cannot communicate with each other, even though the individual nodes are still operational. The network splits into two or more groups that can communicate internally but not across groups.

Types of partitions:

  • Complete partition: Two groups with zero communication between them.
  • Partial partition: Some nodes can communicate across the split, others cannot.
  • Asymmetric partition: Node A can send to node B, but B cannot send to A.
  • Transient partition: Brief network interruption (seconds to minutes).

Impact on distributed systems (CAP theorem):

The CAP theorem states that during a network partition, a system must choose between consistency and availability:

  • CP (Consistency + Partition tolerance): System rejects writes or reads that cannot be verified as consistent. Examples: ZooKeeper, etcd, HBase. The minority partition becomes unavailable.
  • AP (Availability + Partition tolerance): Both sides continue serving requests, potentially returning stale or conflicting data. Examples: Cassandra, DynamoDB, CouchDB. Conflicts are resolved after the partition heals.

Real-world partition scenarios:

  1. Cross-datacenter link failure: Two data centers lose connectivity. If your database uses synchronous replication, one DC stops accepting writes.
  2. Switch failure: A rack becomes isolated from the rest of the cluster. Nodes in the rack may elect a new leader, causing split-brain.
  3. Cloud availability zone isolation: An AZ loses connectivity to other AZs. Services with strict quorum requirements may lose availability.

Handling partitions in practice:

  • Use consensus protocols (Raft, Paxos) that require a majority quorum — the minority side becomes read-only or unavailable.
  • Implement partition detection with heartbeats and failure detectors.
  • Design for partition recovery: conflict resolution strategies (last-write-wins, merge functions, CRDTs).
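
As a concrete example of the last point, a grow-only counter (G-counter) is the simplest CRDT — a sketch showing how replicas that diverged during a partition converge once they exchange state again; node names are placeholders:

```python
# Each node increments only its own slot; merging takes the per-node maximum,
# so merge order does not matter and all replicas converge to the same value.
class GCounter:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

# Two replicas take writes independently during a partition...
a, b = GCounter("node-a"), GCounter("node-b")
a.increment(3); b.increment(5)
# ...and agree after the partition heals, regardless of merge order.
a.merge(b); b.merge(a)
assert a.value() == b.value() == 8
```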

For more on distributed systems trade-offs, see our distributed systems guide and consistent hashing concepts.

Follow-up questions:

  • How does a Raft cluster behave during a network partition?
  • What are CRDTs and how do they provide partition tolerance without conflict?
  • Can you give an example of a system that sacrifices availability during partitions and explain why?

14. What is socket programming and how do modern servers handle millions of concurrent connections?

What the interviewer is really asking: Do you understand the evolution of server architectures and the C10K/C10M problem?

Answer framework:

A socket is an endpoint for bidirectional communication. The traditional server pattern is: socket() → bind() → listen() → accept() in a loop, spawning a thread per connection.

Evolution of connection handling:

  1. Thread-per-connection: Simple model. One OS thread per client. Limited by thread memory overhead (~1MB stack per thread). Practical limit: ~1K-10K connections.

  2. Process-per-connection (Apache prefork): Even heavier than threads. Each connection gets its own process with separate memory space.

  3. Event-driven I/O (select/poll): Single thread monitors multiple file descriptors. select() is limited to 1024 FDs. poll() removes this limit but still scales O(n) per call.

  4. epoll/kqueue (Linux/BSD): Kernel maintains the interest set. epoll_wait() returns only ready file descriptors — O(1) per event. This enables the C10K breakthrough. Used by Nginx, Node.js, Redis.

  5. io_uring (Linux 5.1+): Submission and completion ring buffers shared between user space and kernel. Eliminates system call overhead for I/O operations. Enables millions of IOPS with minimal CPU.
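
Python's selectors module (epoll on Linux, kqueue on BSD/macOS) makes the event-driven model easy to sketch — a minimal single-threaded echo server, port and limits as placeholders:

```python
import selectors
import socket

# Sketch: one thread serves many connections by touching only sockets the kernel
# reports as ready. Production servers (Nginx, Envoy) run one such loop per core and
# add write buffering, timeouts, and backpressure on top.
sel = selectors.DefaultSelector()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 9000))
listener.listen(1024)
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ)

while True:
    for key, _events in sel.select():                 # blocks until something is ready
        if key.fileobj is listener:
            conn, _addr = listener.accept()           # new connection
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            conn = key.fileobj
            data = conn.recv(4096)
            if data:
                conn.sendall(data)                    # echo; a real server would buffer writes
            else:
                sel.unregister(conn)                  # peer closed the connection
                conn.close()
```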

Modern architecture for millions of connections:

  • Non-blocking sockets with epoll/kqueue for event notification.
  • Event loop per CPU core (Nginx worker model).
  • User-space networking (DPDK, XDP) to bypass the kernel network stack entirely.
  • Connection-level optimizations: TCP_NODELAY, SO_REUSEPORT (kernel-level load balancing across workers).
  • Memory optimization: minimize per-connection state, use memory pools.

Practical considerations:

  • File descriptor limits: ulimit -n and fs.file-max.
  • Ephemeral port range for outbound connections.
  • Socket buffer tuning: net.core.rmem_max, net.core.wmem_max.
  • Accept queue size: net.core.somaxconn.

This knowledge is essential for designing high-performance systems and understanding the architecture behind load balancers and reverse proxies.

Follow-up questions:

  • What is the difference between edge-triggered and level-triggered epoll?
  • How does SO_REUSEPORT help with the thundering herd problem?
  • Why does Redis use a single-threaded event loop and still achieve high performance?

15. How do service discovery and health checking work in modern distributed systems?

What the interviewer is really asking: Do you understand how services find and communicate with each other in dynamic environments?

Answer framework:

Service discovery is the mechanism by which services locate the network addresses of other services. In dynamic environments where instances are constantly created and destroyed (containers, auto-scaling), hardcoded addresses are impractical.

Approaches:

  1. DNS-based discovery: Services register with a DNS server (e.g., Consul DNS, AWS Cloud Map). Clients resolve payment-service.internal to get instance IPs. Simple but limited by DNS TTL caching and lack of real-time updates.

  2. Registry-based discovery: A dedicated registry (Consul, etcd, ZooKeeper) maintains service instance lists. Two sub-patterns:

    • Client-side discovery: Client queries the registry and load-balances across instances. More efficient but couples clients to the registry.
    • Server-side discovery: Client talks to a load balancer that queries the registry. Simpler clients but adds a hop.
  3. Platform-level discovery: Kubernetes Services provide built-in discovery through kube-dns and kube-proxy. Service meshes like Istio and Linkerd add L7-aware discovery.

Health checking patterns:

  • Active health checks: The registry or load balancer periodically probes instances (HTTP GET /health, TCP connect, gRPC health check protocol).
  • Passive health checks: Monitor real traffic for errors. If error rate exceeds a threshold, mark unhealthy.
  • Self-reporting: Instances send heartbeats to the registry. Missing heartbeats trigger deregistration.

Health check design:

  • Liveness: Is the process running? (Restart if not)
  • Readiness: Can the service handle traffic? (Remove from load balancing if not)
  • Startup: Has the service finished initializing? (Prevents premature health check failures)

Common pitfalls:

  • Health check that only verifies the process is alive but not that it can serve traffic (e.g., database connection pool exhausted but health check returns 200).
  • Cascading failures from overly aggressive health checks: under load, health check timeouts cause instances to be removed, increasing load on remaining instances.
  • Stale service registry entries due to missed deregistration.
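
A sketch of the liveness/readiness split, which also addresses the first pitfall above — check_database() is a placeholder for whatever dependency check applies, and the port is arbitrary:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def check_database() -> bool:
    # Placeholder: e.g. run "SELECT 1" on a pooled connection with a short timeout.
    return True

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/live":
            self.send_response(200)                   # process is up; restart me if this fails
        elif self.path == "/ready":
            # Readiness reflects the ability to serve traffic, not just process liveness.
            self.send_response(200 if check_database() else 503)
        else:
            self.send_response(404)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8081), HealthHandler).serve_forever()
```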

For more on service communication patterns, see our service mesh interview questions and microservices architecture.

Follow-up questions:

  • How does Kubernetes service discovery work under the hood?
  • What is the difference between client-side and server-side load balancing?
  • How would you handle service discovery across multiple data centers?

Common Mistakes

  1. Memorizing protocol details without understanding trade-offs. Knowing that TCP has a 20-byte header is less useful than understanding why TCP's reliability guarantees add latency and how this affects your design choices.

  2. Ignoring the application layer. Many networking performance issues are actually application-level problems — inefficient serialization, missing connection pooling, or chatty API designs that make too many round trips.

  3. Assuming the network is reliable. The fallacies of distributed computing exist for a reason. Always design for network failures, latency variance, and bandwidth limitations.

  4. Not knowing your debugging tools. Senior engineers should be comfortable with tcpdump, Wireshark, ss, dig, mtr, and strace. Practice using them before your interview.

  5. Overlooking DNS in system design. DNS is often the first point of failure and a common source of latency. Always consider DNS resolution time, TTL implications, and failover behavior.

  6. Confusing throughput with latency. High bandwidth does not mean low latency. A satellite link has high bandwidth but 600ms RTT. Understanding this distinction is critical for distributed system design.


How to Prepare

Week 1-2: Fundamentals

  • Review the TCP/IP protocol stack, focusing on transport and application layers.
  • Implement a simple TCP client and server using sockets.
  • Practice with networking tools: tcpdump, Wireshark, dig, ss, netstat.

Week 3: Applied Networking

  • Study how HTTP/2, HTTP/3, and gRPC work at the protocol level.
  • Understand TLS 1.3 handshake and certificate management.
  • Learn DNS internals: resolution, caching, record types, DNSSEC.

Week 4: Production Networking

  • Study load balancing strategies (L4 vs L7, consistent hashing).
  • Understand connection pooling, keep-alive, and multiplexing.
  • Practice debugging scenarios: high latency, packet loss, DNS failures.

Ongoing:

  • Read post-mortems of networking-related outages (Cloudflare, Facebook, AWS).
  • Practice explaining concepts clearly — interviewers value communication as much as knowledge.

For a comprehensive preparation plan, see our system design interview guide and explore our learning paths.

