By Rahul — Google Frontend Engineer
The Problem Keep-Alive Solves
In HTTP/1.0, every request opened a new TCP connection by default. Loading a page with 30 resources meant 30 TCP handshakes, and each handshake costs one round-trip time (RTT) before any HTTP data can be sent. On a connection with 100 ms of latency, that adds up to 3 seconds spent on handshakes alone if they run one after another.
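The arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not a measurement: `handshake_overhead_ms` is a hypothetical helper, and it assumes each handshake costs exactly one RTT, that handshakes run one after another, and that TLS is not involved.

```python
def handshake_overhead_ms(resources: int, rtt_ms: float, reuse: bool) -> float:
    """Time spent on TCP handshakes, assuming they run one after another."""
    # With a reused connection, only the first request pays for a handshake.
    handshakes = 1 if reuse else resources
    return handshakes * rtt_ms


print(handshake_overhead_ms(30, 100, reuse=False))  # 3000 ms, i.e. 3 seconds
print(handshake_overhead_ms(30, 100, reuse=True))   # 100 ms
```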
How It Works
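In HTTP/1.1, persistent connections are the default; the explicit Connection: keep-alive header was the opt-in mechanism for HTTP/1.0 clients. The TCP connection stays open after a response, and subsequent requests reuse it. The server closes the connection after an idle timeout or after a maximum number of requests, and either side can end it early by sending Connection: close.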
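To see connection reuse in action, here is a minimal sketch using only the Python standard library. It starts a throwaway HTTP/1.1 server on localhost and counts how many TCP connections the server accepts for five requests. The names `CountingHandler` and `count_connections` are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables persistent connections
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # Content-Length lets the client find the end of the response
        # without the server closing the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def count_connections(reuse: bool, requests: int = 5) -> int:
    """Send `requests` GETs and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        conn = None
        for _ in range(requests):
            if conn is None:
                conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing
            if not reuse:
                conn.close()  # simulate the HTTP/1.0 one-request-per-connection model
                conn = None
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


print(count_connections(reuse=True))   # 1
print(count_connections(reuse=False))  # 5
```

The server side does nothing special beyond speaking HTTP/1.1 and sending a Content-Length; the reuse comes from the client keeping its socket open between requests.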
Keep-Alive vs HTTP/2 Multiplexing
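Keep-Alive in HTTP/1.1 has a limitation: head-of-line blocking. Responses on a connection must come back in request order, so in practice you wait for one response before sending the next request on the same connection, and one slow response holds up everything queued behind it. HTTP/2 solves this at the HTTP layer with multiplexing: multiple requests and responses can be in flight simultaneously on the same connection.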
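A toy timing model makes the difference concrete. This is a simplification with assumptions of my own: bandwidth is unlimited, every request costs one RTT plus a server processing time, and the function names are hypothetical. Real connections also contend for bandwidth and can stall on packet loss.

```python
from typing import Sequence


def serialized_ms(server_times_ms: Sequence[float], rtt_ms: float) -> float:
    """One HTTP/1.1 connection: each request waits for the previous response."""
    return sum(rtt_ms + t for t in server_times_ms)


def multiplexed_ms(server_times_ms: Sequence[float], rtt_ms: float) -> float:
    """One HTTP/2 connection: all requests go out together, the slowest one dominates."""
    return rtt_ms + max(server_times_ms)


# Three requests; the middle one is slow on the server (200 ms).
times = [20, 200, 20]
print(serialized_ms(times, rtt_ms=100))   # 540
print(multiplexed_ms(times, rtt_ms=100))  # 300
```

In the serialized case the two fast responses are stuck behind the slow one; in the multiplexed case they complete independently of it.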
Real Production Impact
Connection Limits
Browsers typically allow only about 6 concurrent TCP connections per host in HTTP/1.1. With Keep-Alive, these 6 connections are reused for request after request. Without it, every request would need a new connection and pay the handshake cost again, while still being capped at 6 in parallel.
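As a rough illustration of what the limit means for a page with 30 resources, the sketch below counts handshakes and "rounds" of parallel requests. It assumes a limit of 6, equal-sized resources fetched in lockstep, and a hypothetical helper name `connection_plan`; real browsers schedule requests less neatly.

```python
import math


def connection_plan(resources: int, max_connections: int = 6, keep_alive: bool = True):
    """Return (handshakes, rounds) for fetching `resources` files from one host."""
    # Requests proceed in rounds of at most `max_connections` parallel fetches.
    rounds = math.ceil(resources / max_connections)
    # With Keep-Alive each connection is opened once; without it, once per request.
    handshakes = min(resources, max_connections) if keep_alive else resources
    return handshakes, rounds


print(connection_plan(30))                    # (6, 5)
print(connection_plan(30, keep_alive=False))  # (30, 5)
```

The number of rounds is the same either way; what Keep-Alive removes is the 24 extra handshakes.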
CDN and Domain Sharding (Legacy)
Before HTTP/2, engineers used domain sharding: serving assets from multiple subdomains so that the browser's per-host limit applied to each subdomain separately, yielding more parallel connections.
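A sharding helper of that era might have looked something like the sketch below. The hostnames are hypothetical placeholders, and hashing the path with CRC32 is just one way to pick a shard; the point is that a given asset always maps to the same subdomain, so browser caches stay effective.

```python
import zlib

# Hypothetical shard hostnames, for illustration only.
SHARDS = ["static1.example.com", "static2.example.com", "static3.example.com"]


def shard_url(path: str) -> str:
    """Map an asset path to a shard hostname, always the same one for a given path."""
    index = zlib.crc32(path.encode("utf-8")) % len(SHARDS)
    return f"https://{SHARDS[index]}{path}"


for asset in ["/img/logo.png", "/css/site.css", "/js/app.js"]:
    print(shard_url(asset))
```

With three shards and a limit of 6 connections per host, the browser could open up to 18 connections in parallel.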
With HTTP/2, domain sharding actually hurts performance: each extra hostname costs its own DNS lookup and connection setup, and you lose the benefit of multiplexing over a single connection.
When to Disable Keep-Alive
Almost never. But there are edge cases:
- Load balancers that need to redistribute connections
- Server under memory pressure with too many open connections
- Single-request APIs where the overhead of maintaining connections is not worth it
Best Practices
- Use HTTP/2: it still reuses one long-lived connection, and multiplexing removes HTTP/1.1's request-level head-of-line blocking
- Do not use domain sharding with HTTP/2
- Set reasonable idle timeouts on your servers (5-15 seconds)
- Monitor open connection counts in production
Summary
Keep-Alive reuses TCP connections to avoid repeated handshakes. It is the default in HTTP/1.1. HTTP/2 takes this further with multiplexing. As frontend engineers, we benefit from this automatically, but understanding it helps debug slow page loads.