WebSockets at Scale: nginx, HAProxy, and Sticky-Session Load Balancing That Actually Works

Your app works perfectly on a single server. You add a second node to handle the load. Within minutes, half your users start seeing disconnections and random errors — but nobody can reproduce the bug consistently. You check the logs, everything looks fine. The other half of your users are perfectly happy.

Welcome to the WebSocket scaling trap.

This isn’t a bug in your code. It’s a fundamental mismatch between how HTTP load balancers work by default and what WebSockets actually need. HTTP is stateless — send a request to any server, get a response, done. WebSockets are the opposite: a single TCP connection that stays open for minutes, hours, or the lifetime of the user session. The moment your load balancer sends a reconnect attempt to a different backend than the one holding the active connection state, everything falls apart.

This article is a practical breakdown of how to solve this correctly using two of the most common load balancers in the self-hosted and production world: nginx and HAProxy. We’ll cover the HTTP Upgrade handshake, persistent upstream connections, sticky sessions by cookie and IP hash, and a few production patterns that most tutorials skip entirely.

Why WebSockets Break Under Standard Load Balancing

Before touching config files, it’s worth understanding the failure mode precisely.

A WebSocket connection starts as a normal HTTP/1.1 request with an Upgrade header. The server responds with 101 Switching Protocols, and from that point on the connection carries WebSocket frames over the same TCP socket, with no more HTTP request/response framing. The load balancer has to proxy this bidirectional stream for as long as the connection lives.
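To make the handshake concrete, here is a minimal sketch of the one computation a server performs before it replies with 101: deriving the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key, as RFC 6455 defines it. The function name computeAcceptKey is made up for illustration; libraries such as ws do this for you.

```javascript
const crypto = require("node:crypto");

// Fixed GUID defined by RFC 6455; it is appended to the client's key before hashing.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Returns the value the server sends back in the Sec-WebSocket-Accept header.
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash("sha1")
    .update(secWebSocketKey + WS_GUID)
    .digest("base64");
}

// Sample key taken from RFC 6455
console.log(computeAcceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

A proxy never computes this value itself. It only has to pass the Upgrade request and the 101 response through untouched.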

Round-robin load balancing assigns each new connection to the next available backend. For HTTP APIs this is fine, because each request is independent and any backend can answer it. But WebSocket clients reconnect all the time: dropped connections, network hiccups, mobile clients going in and out of coverage. Every reconnect is a new TCP connection, and round-robin will route it to whichever backend is next in rotation, which is probably not the one holding the user’s session state.

If your backend is completely stateless (room lists in Redis, presence in a shared pub/sub), you can technically route to any node. Most apps aren’t that clean. Even if your core data is in Redis, there’s often in-process state: subscription filters, rate limit counters, queued messages waiting to flush. Stateless WebSocket backends are an architectural ideal, not a default reality.

The fix comes in two forms:

IP hash or cookie-based sticky sessions — route each client consistently to the same backend
Shared external state — move all session state out of process so any backend can serve any client

You’ll likely need elements of both. Let’s start with the load balancer config.

nginx: WebSocket Proxying Done Right

nginx doesn’t proxy WebSockets out of the box. By default it won’t forward the Upgrade and Connection headers, so the handshake silently fails or degrades to HTTP. Here’s a minimal working configuration:

# /etc/nginx/conf.d/websocket.conf

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

upstream ws_backends {
    # Simple round-robin — fine only if your backend is truly stateless
    server backend1:8080;
    server backend2:8080;
    server backend3:8080;

    # Pool of idle upstream connections. It only helps plain HTTP requests;
    # upgraded WebSocket connections are never returned to this pool.
    keepalive 64;
}

server {
    listen 80;
    server_name ws.example.com;

    location / {
        proxy_pass http://ws_backends;

        # The two headers that make WebSockets work
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;

        # Pass real client IP to the backend
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Increase timeouts — WebSocket connections are long-lived
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
        proxy_connect_timeout 10s;

        # Disable response buffering (mainly helps streaming HTTP fallbacks;
        # upgraded connections are tunnelled without it)
        proxy_buffering off;
    }
}

The map directive is the key detail most guides gloss over. nginx sends Connection: upgrade to the backend when the client request carries an Upgrade header, and Connection: close otherwise (for normal HTTP requests on the same vhost). Hardcoding Connection: upgrade instead would put that header on every proxied request, including the ones that never asked for an upgrade.
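If the map syntax is unfamiliar, its logic can be written as a two-line function. This is only a restatement of the nginx config above in JavaScript, with a made-up function name, not something you would deploy.

```javascript
// Mirrors the nginx map: an empty or missing Upgrade header yields "close",
// any other value yields "upgrade".
function connectionHeaderFor(upgradeHeader) {
  return upgradeHeader ? "upgrade" : "close";
}

console.log(connectionHeaderFor("websocket")); // prints "upgrade"
console.log(connectionHeaderFor(undefined));   // prints "close"
console.log(connectionHeaderFor(""));          // prints "close"
```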

proxy_http_version 1.1 is mandatory. The Upgrade mechanism only exists in HTTP/1.1, and nginx has historically defaulted to HTTP/1.0 for upstream requests. If the version isn’t set explicitly, the upgrade can silently fail.

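proxy_buffering off matters less than most guides claim. Once the 101 response has gone through, nginx tunnels bytes in both directions and response buffering no longer applies. The directive is still worth keeping if the same location serves streaming HTTP fallbacks such as Server-Sent Events or long polling, where buffering adds latency and can make the connection appear stalled.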

Sticky Sessions in nginx

nginx’s built-in ip_hash directive is the simplest form of sticky sessions:

upstream ws_backends {
    ip_hash;  # Routes each client IP consistently to the same backend

    server backend1:8080;
    server backend2:8080;
    server backend3:8080;

    keepalive 64;
}

ip_hash hashes on the first three octets of the client’s IPv4 address (so a /24 subnet always goes to the same backend). This works reasonably well except for two scenarios: clients behind CGNAT or corporate proxies (thousands of users sharing one IP, all pinned to one backend), and clients whose address changes between reconnects. The second case is common on mobile networks and with IPv6, where nginx hashes the entire address and privacy extensions rotate it periodically.
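The effect of keying on three octets is easy to see in a toy version. The sketch below is an illustration only: nginx uses its own internal hash function, and pickBackendByIpHash is a hypothetical name. What it shares with ip_hash is the key, the first three octets of the IPv4 address.

```javascript
// Illustration only: nginx's real hash function differs. The point is the
// key, which ignores the last octet of the IPv4 address.
function pickBackendByIpHash(clientIp, backends) {
  const key = clientIp.split(".").slice(0, 3).join(".");
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return backends[hash % backends.length];
}

const backends = ["backend1", "backend2", "backend3"];

// Two clients in the same /24 always land on the same backend.
console.log(
  pickBackendByIpHash("203.0.113.7", backends) ===
    pickBackendByIpHash("203.0.113.250", backends)
); // prints true
```

The same property that keeps a client sticky also pins an entire office or carrier NAT pool to a single backend.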

For proper cookie-based sticky sessions, you need the nginx-sticky-module-ng module or the commercial nginx Plus sticky cookie directive. In the open-source world, the pragmatic alternative is to handle stickiness at the application layer (encode the backend ID in your connection token) or push the problem to HAProxy, which handles it natively.
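One way to read "encode the backend ID in your connection token" is a small signed token that names the backend a user was first assigned to. The sketch below assumes a shared HMAC secret and invents the names issueToken and backendFromToken. How the token reaches the routing layer (a query parameter, a per-backend hostname, a thin gateway) is left open.

```javascript
const crypto = require("node:crypto");

// Hypothetical shared secret; in practice load it from configuration.
const SECRET = "change-me";

function sign(payload) {
  return crypto.createHmac("sha256", SECRET).update(payload).digest("base64url");
}

// Issue a token that pins a user to the backend that first served them.
function issueToken(userId, backendId) {
  const payload = Buffer.from(JSON.stringify({ userId, backendId })).toString("base64url");
  return `${payload}.${sign(payload)}`;
}

// Returns the backend ID if the signature checks out, otherwise null.
function backendFromToken(token) {
  const [payload, signature] = String(token).split(".");
  if (!payload || !signature) return null;
  const expected = Buffer.from(sign(payload));
  const given = Buffer.from(signature);
  if (expected.length !== given.length || !crypto.timingSafeEqual(expected, given)) {
    return null;
  }
  return JSON.parse(Buffer.from(payload, "base64url").toString()).backendId;
}

const token = issueToken("user-42", "backend2");
console.log(backendFromToken(token));       // prints "backend2"
console.log(backendFromToken(token + "x")); // prints null
```

Signing matters here. Without it a client could pick its own backend, which turns a routing hint into a way to target one node.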

Gotcha: nginx Worker Connections

Each proxied WebSocket connection holds two file descriptors open — one for the client-facing socket, one for the backend. With worker_connections 1024 (a common default), nginx can handle 512 simultaneous WebSocket connections per worker. With 4 workers that’s 2048 total — fine for small deployments, a ceiling you’ll hit suddenly in production.
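The arithmetic is worth keeping next to your capacity plan. A minimal sketch, with a hypothetical function name, that ignores the handful of descriptors nginx needs for logs and listening sockets:

```javascript
// Each proxied WebSocket uses two connection slots: one facing the client,
// one facing the backend.
function maxProxiedWebSockets(workerConnections, workerProcesses) {
  return Math.floor(workerConnections / 2) * workerProcesses;
}

console.log(maxProxiedWebSockets(1024, 4));  // prints 2048
console.log(maxProxiedWebSockets(65535, 4)); // prints 131068
```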

# /etc/nginx/nginx.conf
events {
    worker_connections 65535;
    use epoll;
    multi_accept on;
}

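Also raise the file descriptor limit for the nginx user. Note that limits.conf only applies to PAM login sessions. If nginx runs under systemd, set worker_rlimit_nofile 65535; in nginx.conf or LimitNOFILE=65535 in the unit file. For setups that do honor limits.conf: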

# /etc/security/limits.conf
nginx soft nofile 65535
nginx hard nofile 65535

HAProxy: The Better Tool for Sticky WebSockets

HAProxy was designed for persistent TCP connections. It has native cookie-based sticky sessions, active health checks, and proper connection draining. For WebSocket workloads specifically, it handles the Upgrade protocol transparently — no special header manipulation required.

Here’s a production-ready HAProxy configuration:

# /etc/haproxy/haproxy.cfg

global
    log /dev/log local0
    maxconn 100000
    # Double the 16384-byte default buffer; raises memory use per connection
    tune.bufsize 32768

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect     5s
    # Short timeouts for the plain HTTP phase (handshake and regular requests)
    timeout client      30s
    timeout server      30s
    # Once the connection upgrades, timeout tunnel replaces both of the above
    timeout tunnel      1h
    option  forwardfor
    option  http-server-close

frontend ws_frontend
    bind *:80
    bind *:443 ssl crt /etc/ssl/private/example.pem alpn h2,http/1.1

    # Detect WebSocket upgrade requests
    acl is_websocket hdr(Upgrade) -i websocket

    # Route WebSocket and regular HTTP to different backends if needed
    use_backend ws_cluster if is_websocket
    default_backend http_cluster

backend ws_cluster
    balance leastconn   # Better than roundrobin for long-lived connections

    # Cookie-based sticky sessions — the right way to do this
    # HAProxy inserts a cookie called SERVERID on the first response
    # Subsequent requests with that cookie go to the same backend
    cookie SERVERID insert indirect nocache

    option httpchk GET /health
    http-check expect status 200

    server backend1 10.0.0.1:8080 check cookie backend1 inter 5s fall 3 rise 2
    server backend2 10.0.0.2:8080 check cookie backend2 inter 5s fall 3 rise 2
    server backend3 10.0.0.3:8080 check cookie backend3 inter 5s fall 3 rise 2

backend http_cluster
    balance roundrobin

    option httpchk GET /health
    http-check expect status 200

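    server backend1 10.0.0.1:8080 check inter 5s
    server backend2 10.0.0.2:8080 check inter 5s
    server backend3 10.0.0.3:8080 check inter 5s

# Stats page: http://<host>:8404/haproxy_stats
listen stats
    bind *:8404
    stats enable
    stats uri /haproxy_stats
    stats refresh 5s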

A few things worth unpacking here.

timeout tunnel is HAProxy-specific and applies after the HTTP Upgrade completes. The timeout client and timeout server values control the initial HTTP phase. Once the connection upgrades to a WebSocket tunnel, timeout tunnel takes over. Without this, HAProxy will kill long-idle WebSocket connections using the shorter HTTP timeouts — a bug that manifests as random disconnections every few minutes in low-traffic deployments.

balance leastconn distributes new connections to the backend with the fewest active connections rather than cycling through backends. For WebSockets this is almost always better than round-robin: if backend2 has 500 long-lived connections and backend3 just had 200 users disconnect, you want new connections going to backend3, not cycling based on request count.
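The difference between the two algorithms shows up in a few lines. This is a simplified sketch with invented names and connection counts. HAProxy’s real leastconn also takes server weights into account.

```javascript
// Pick the backend with the fewest active connections (ties go to the first).
function pickLeastConn(backends) {
  return backends.reduce((best, b) => (b.active < best.active ? b : best));
}

// Round-robin ignores load entirely and just cycles through the list.
function makeRoundRobin(backends) {
  let next = 0;
  return () => backends[next++ % backends.length];
}

const pool = [
  { name: "backend1", active: 480 },
  { name: "backend2", active: 500 },
  { name: "backend3", active: 300 }, // just had 200 users disconnect
];

console.log(pickLeastConn(pool).name);    // prints "backend3"
console.log(makeRoundRobin(pool)().name); // prints "backend1"
```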

When cookie SERVERID insert is configured, HAProxy intercepts the first HTTP response from the backend and injects a Set-Cookie: SERVERID=backend1 header before the connection upgrades. Every subsequent request from that client includes the SERVERID cookie, and HAProxy routes it directly to the named backend — bypassing the load balancing algorithm entirely.

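The nocache flag makes HAProxy mark any response that carries the inserted cookie as non-cacheable, so a shared cache between the client and HAProxy can’t hand one user’s SERVERID to another. indirect means HAProxy doesn’t re-send the cookie to a client that already presents a valid one, and it strips the cookie from requests before they reach the backend, which keeps the mechanism invisible to the application. If you want the backend to be able to set SERVERID itself, add the preserve option.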

This is fundamentally more reliable than IP hash for mobile clients and users behind shared NAT.
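As a mental model, the routing decision looks roughly like the sketch below. It is a simplified illustration with invented names, not HAProxy’s implementation. In particular it simply falls back to the balancing algorithm whenever the cookie is missing or names a server that is down.

```javascript
// Parse a Cookie request header into a plain object.
function parseCookies(header = "") {
  const out = {};
  for (const part of header.split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue;
    out[part.slice(0, idx).trim()] = part.slice(idx + 1).trim();
  }
  return out;
}

// servers: map of cookie value -> { up: boolean }
// pickByBalance: fallback used when there is no usable cookie
function routeRequest(cookieHeader, servers, pickByBalance) {
  const id = parseCookies(cookieHeader).SERVERID;
  if (id && servers[id] && servers[id].up) {
    return { server: id, setCookie: null }; // sticky hit, no new cookie needed
  }
  const chosen = pickByBalance();
  return { server: chosen, setCookie: `SERVERID=${chosen}; path=/` };
}

const servers = { backend1: { up: true }, backend2: { up: true } };
console.log(routeRequest("theme=dark; SERVERID=backend2", servers, () => "backend1").server);
// prints "backend2"
console.log(routeRequest("", servers, () => "backend1").setCookie);
// prints "SERVERID=backend1; path=/"
```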

Here’s a subtlety: the sticky cookie can only travel in HTTP headers, so it has to be set no later than the handshake response. After the upgrade, no more HTTP headers are exchanged, only WebSocket frames. The stickiness is therefore established during the handshake and holds for the lifetime of that TCP connection. This is correct behavior.

The failure mode is when clients don’t preserve cookies across reconnects. Some WebSocket client libraries don’t implement cookie jars. If your client is a browser using the native WebSocket API, cookies work automatically. If it’s a custom client in Go, Python, or C++, you need to explicitly store the cookie from the Set-Cookie response header and send it back in a Cookie request header on reconnect.

Test this by checking the HAProxy stats page (it has to be enabled with a stats uri such as /haproxy_stats) and watching which backends your test client lands on across multiple reconnects. If it’s random, your client isn’t sending the cookie.
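For a non-browser client, the bookkeeping is small. The helper below is a sketch with a made-up name: it turns the Set-Cookie values from a handshake response into a Cookie header for the next attempt. The commented usage assumes the ws package is installed and is not executed here.

```javascript
// Turn the Set-Cookie values from a handshake response into a Cookie
// request header. Attributes such as Path or HttpOnly are dropped.
function cookieHeaderFrom(setCookieValues = []) {
  return setCookieValues
    .map((value) => value.split(";")[0].trim())
    .filter(Boolean)
    .join("; ");
}

console.log(cookieHeaderFrom(["SERVERID=backend2; path=/; HttpOnly"]));
// prints "SERVERID=backend2"

// Usage with the ws package (assumed installed; not run here):
//   const WebSocket = require("ws");
//   let cookie = "";
//   function connect(url) {
//     const socket = new WebSocket(url, { headers: cookie ? { Cookie: cookie } : {} });
//     socket.on("upgrade", (res) => {
//       const fresh = cookieHeaderFrom(res.headers["set-cookie"]);
//       if (fresh) cookie = fresh; // keep the old cookie if none was sent
//     });
//     socket.on("close", () => setTimeout(() => connect(url), 1000));
//   }
```

Keeping the old value when no new Set-Cookie arrives matters with indirect, because HAProxy does not repeat the cookie to a client that already sent a valid one.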

Docker Compose: Running the Stack Locally

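Here’s a Docker Compose setup to test this locally with HAProxy in front of three Node.js WebSocket servers. The haproxy.cfg above needs two changes to run here: the server lines should point at backend1:8080, backend2:8080, and backend3:8080 instead of the 10.0.0.x addresses, and the bind *:443 line should be removed unless you mount a certificate.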

# docker-compose.yml
version: "3.9"

services:
  haproxy:
    image: haproxy:2.9-alpine
    ports:
      - "80:80"
      - "8404:8404"   # Stats page
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    depends_on:
      - backend1
      - backend2
      - backend3
    networks:
      - ws_net

  backend1:
    image: node:20-alpine
    working_dir: /app
    volumes:
      - ./server:/app
    command: node server.js
    environment:
      - SERVER_ID=backend1
      - PORT=8080
    networks:
      - ws_net

  backend2:
    image: node:20-alpine
    working_dir: /app
    volumes:
      - ./server:/app
    command: node server.js
    environment:
      - SERVER_ID=backend2
      - PORT=8080
    networks:
      - ws_net

  backend3:
    image: node:20-alpine
    working_dir: /app
    volumes:
      - ./server:/app
    command: node server.js
    environment:
      - SERVER_ID=backend3
      - PORT=8080
    networks:
      - ws_net

networks:
  ws_net:
    driver: bridge

And a minimal WebSocket server that identifies which backend you’re connected to. It depends on the ws package, so run npm install ws in ./server before starting the stack:

// server/server.js
const http = require("http");
const { WebSocketServer } = require("ws");

const PORT = process.env.PORT || 8080;
const SERVER_ID = process.env.SERVER_ID || "unknown";

const server = http.createServer((req, res) => {
  if (req.url === "/health") {
    res.writeHead(200);
    res.end("ok");
    return;
  }
  res.writeHead(404);
  res.end();
});

const wss = new WebSocketServer({ server });

wss.on("connection", (ws, req) => {
  console.log(`[${SERVER_ID}] client connected from ${req.socket.remoteAddress}`);

  ws.send(JSON.stringify({ server: SERVER_ID, ts: Date.now() }));

  ws.on("message", (data) => {
    ws.send(JSON.stringify({ echo: data.toString(), server: SERVER_ID }));
  });

  ws.on("close", () => {
    console.log(`[${SERVER_ID}] client disconnected`);
  });
});

server.listen(PORT, () => {
  console.log(`${SERVER_ID} listening on :${PORT}`);
});

The /health endpoint is what HAProxy polls every 5 seconds. With fall 3, a dead backend is taken out of rotation after three consecutive failed checks, roughly 15 seconds. If the endpoint is missing, every check gets a 404 and HAProxy marks all backends down.

Production Patterns You Won’t Find in the Docs

Connection Draining on Deploy

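Rolling deploys are brutal for WebSocket servers. When you bring down backend2 to deploy a new version, all connected clients get disconnected. HAProxy supports graceful removal via the DRAIN state, which takes a backend out of load balancing while leaving existing connections open. Two caveats: a drained server still accepts clients that present its sticky cookie (use the maint state to refuse those too), and WebSocket connections can stay open for hours, so the backend usually has to close them itself rather than wait for them to finish. The commands below assume the runtime API is enabled with a stats socket /var/run/haproxy/admin.sock level admin line in the global section.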

# Mark backend2 as draining before deploy
echo "set server ws_cluster/backend2 state drain" | socat stdio /var/run/haproxy/admin.sock

# Watch the connection counters for each server until backend2 reaches zero
watch 'echo "show servers conn ws_cluster" | socat stdio /var/run/haproxy/admin.sock'

# Take it down, deploy, bring it back
echo "set server ws_cluster/backend2 state ready" | socat stdio /var/run/haproxy/admin.sock

For open-source nginx, there’s no runtime equivalent (NGINX Plus has a drain parameter). You’d need to implement drain logic in your application or use a sidecar.
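Application-level drain logic can be as simple as closing clients gradually once the process is told to stop, so they don’t all reconnect in the same second. The sketch below computes the schedule. closeDelays is a hypothetical helper, and the commented usage assumes the ws server from earlier in this article.

```javascript
// Spread clientCount disconnects evenly across windowMs so clients do not
// all reconnect at once.
function closeDelays(clientCount, windowMs) {
  const delays = [];
  for (let i = 0; i < clientCount; i++) {
    delays.push(Math.floor((i * windowMs) / clientCount));
  }
  return delays;
}

console.log(closeDelays(4, 10000)); // prints [ 0, 2500, 5000, 7500 ]

// Usage with the ws server shown earlier (not run here):
//   process.on("SIGTERM", () => {
//     const clients = [...wss.clients];
//     closeDelays(clients.length, 30000).forEach((delay, i) => {
//       // 1001 is the "going away" close code
//       setTimeout(() => clients[i].close(1001, "server restarting"), delay);
//     });
//   });
```

Take the backend out of rotation first. Otherwise the clients you just closed can be routed straight back to the node that is shutting down.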

Health Checks That Actually Reflect WebSocket Readiness

An HTTP 200 from /health tells you the process is running, not that it can accept WebSocket connections. A backend whose Redis connection is dropped or whose internal message queue is full will still return 200 for HTTP while silently dropping or queueing WebSocket messages.

A better health check sends an actual WebSocket upgrade request and expects a 101 in reply. HAProxy’s tcp-check mode supports this (it takes the place of the option httpchk lines in ws_cluster):

backend ws_cluster
    option tcp-check
    tcp-check connect
    tcp-check send "GET /ws-health HTTP/1.1\r\nHost: localhost\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==\r\nSec-WebSocket-Version: 13\r\n\r\n"
    tcp-check expect string "101"

Your backend needs a /ws-health endpoint that completes the handshake only after verifying whatever shared state it depends on. HAProxy drops the TCP connection once the check has matched, so the handler should treat an abrupt close as normal.
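Whichever endpoint you use, the decision behind it can be one small function. This is a sketch under assumptions: the dependency names (redis, queueHasRoom) are placeholders for whatever your backend actually relies on, and readiness is a made-up name.

```javascript
// checks: object of name -> boolean (true means the dependency is usable)
function readiness(checks) {
  const failing = Object.keys(checks).filter((name) => !checks[name]);
  return { status: failing.length === 0 ? 200 : 503, failing };
}

console.log(readiness({ redis: true, queueHasRoom: true }));
// prints { status: 200, failing: [] }
console.log(readiness({ redis: false, queueHasRoom: true }));
// prints { status: 503, failing: [ 'redis' ] }
```

The HTTP /health handler can return this status directly, and a /ws-health handler can refuse the upgrade whenever the status is not 200.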

Gotcha: Sticky Sessions and Backend Failures

When backend2 goes down, HAProxy has to route the sticky SERVERID=backend2 clients somewhere. With option redispatch enabled, HAProxy will automatically reroute them to another available backend on the next request:

defaults
    option redispatch   # Reroute if sticky backend is unavailable
    retries 3

Without redispatch, those clients get a 503 until they reconnect and receive a new cookie. With it, they lose stickiness temporarily — which may cause state inconsistency depending on your application. There’s no universally correct answer here; it depends on what failing over to a different backend means for your users. If your state is purely in Redis, redispatch is safe. If there’s meaningful in-process state, you want to control the disconnect explicitly from the client side.

TLS Termination and WebSocket

Terminate TLS at the load balancer, not at the backend. This is standard advice for HTTP and it applies equally here. HAProxy handles this cleanly:

frontend ws_frontend
    bind *:443 ssl crt /etc/ssl/private/example.pem alpn h2,http/1.1
    # wss:// connections arrive here as WSS, leave to backend as ws://

One debugging wrinkle: because TLS ends at HAProxy and the backends are reached over plain HTTP, the backend logs show ws:// connections even though every client connected with wss://. To avoid confusion, have HAProxy add the header (http-request set-header X-Forwarded-Proto https if { ssl_fc }) and log it on the backend side.
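On the backend, recovering the original scheme for logging takes a few lines. This sketch assumes the load balancer is the only thing that can reach the backend, since the header is trivially spoofable otherwise. clientScheme is a hypothetical helper name.

```javascript
// Work out which scheme the client used. X-Forwarded-Proto is trusted only
// because the load balancer in front is assumed to set it.
function clientScheme(headers) {
  const proto = String(headers["x-forwarded-proto"] || "")
    .split(",")[0]
    .trim()
    .toLowerCase();
  return proto === "https" ? "wss" : "ws";
}

console.log(clientScheme({ "x-forwarded-proto": "https" })); // prints "wss"
console.log(clientScheme({}));                               // prints "ws"
```

Inside the connection handler of the sample server, req.headers can be passed in as is, because Node lowercases incoming header names.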

WebSockets Behind nginx + HAProxy (Layered Proxy)

If nginx handles SSL termination and HAProxy does the load balancing behind it, you need to be careful about the Upgrade header. Each proxy in the chain needs to pass it through. A common mistake is having nginx translate the connection to HTTP/1.0 internally (which drops the Upgrade header), then passing a broken request to HAProxy.

The chain should be: client → nginx (TLS termination, HTTP/1.1 keepalive) → HAProxy (sticky session routing) → backend. Both proxies must preserve Upgrade and Connection headers. nginx’s map directive shown earlier handles this correctly.

Choosing Between nginx and HAProxy

For pure WebSocket workloads at scale, HAProxy wins. It was built for this — connection-level visibility, cookie stickiness without modules, proper drain states, and a statistics dashboard that shows you active connections per backend in real time. The config syntax is more verbose than nginx, but the operational capabilities are worth it.

nginx is the right choice when you’re also serving static files, handling HTTP/2, or running as a reverse proxy for multiple services on the same host. Its WebSocket support is solid, but cookie-based sticky sessions require a third-party module, the commercial version, or routing to HAProxy. The combination of nginx for TLS/HTTP termination and HAProxy for backend distribution is common in production and works well.

If you’re on Kubernetes, neither applies directly — you’d use an ingress controller (nginx-based or similar) with session affinity annotations, or skip sticky sessions entirely in favor of a fully stateless backend with Redis-backed session state.

The real architectural goal is to not need sticky sessions. Sticky sessions are a load balancer workaround for application-level statefulness. Every new feature you build on top of shared Redis state instead of in-process state is one less thing that breaks when a backend restarts. The load balancer configs here buy you time and reliability; the Redis migration is what actually solves the problem permanently.
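What moving state out of process looks like in code: the sketch below keeps per-user subscription filters behind a tiny store interface instead of in a per-process Map. The in-memory store is only a stand-in so the example runs anywhere. The class and method names are invented, and a real deployment would back the same two methods with a Redis client.

```javascript
// Stand-in for Redis so the sketch runs without any dependency.
class InMemorySharedStore {
  constructor() {
    this.data = new Map();
  }
  async get(key) {
    return this.data.get(key) ?? null;
  }
  async set(key, value) {
    this.data.set(key, value);
  }
}

// Subscription filters live in the shared store, so any backend can pick up
// a reconnecting client.
class SubscriptionState {
  constructor(store) {
    this.store = store;
  }
  async subscribe(userId, topic) {
    const topics = await this.topicsFor(userId);
    if (!topics.includes(topic)) topics.push(topic);
    await this.store.set(`subs:${userId}`, JSON.stringify(topics));
  }
  async topicsFor(userId) {
    return JSON.parse((await this.store.get(`subs:${userId}`)) || "[]");
  }
}

(async () => {
  const shared = new InMemorySharedStore();
  const backendA = new SubscriptionState(shared); // handled the first connection
  const backendB = new SubscriptionState(shared); // handles the reconnect
  await backendA.subscribe("user-42", "orders");
  console.log(await backendB.topicsFor("user-42")); // prints [ 'orders' ]
})();
```

The read-modify-write in subscribe is not atomic. With real Redis, a set type updated with SADD would avoid the race.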

Start with the HAProxy config, enable the stats page on port 8404, and watch the connection counts as you connect and disconnect clients. That real-time visibility alone will teach you more about your WebSocket traffic patterns than any monitoring dashboard you’ll set up later.