WebSocket Server Internals: HTTP Upgrade, Frame Protocol, Connection Management & Scaling

May 3, 2026 · 72 min read

Why Understanding WebSocket Internals Matters

WebSocket provides full-duplex communication over a single TCP connection. Unlike HTTP's request-response model, either side can send messages at any time — no polling, no long-polling hacks. This makes WebSocket the protocol of choice for real-time applications: chat, live dashboards, collaborative editors, gaming, and financial data streams. But WebSocket introduces challenges HTTP doesn't have: persistent connections consume server resources, scaling requires sticky sessions or pub/sub, and the frame-level protocol has subtleties around masking, fragmentation, and control frames that most developers never see.

HTTP vs WebSocket:

HTTP (Request-Response):
Client ──GET──► Server    Client initiates. Server responds.
Client ──GET──► Server    Need data? Poll again.
Client ──GET──► Server    Waste bandwidth polling for no change.

WebSocket (Full-Duplex):
Client ──Upgrade──► Server    One-time handshake
Client ◄═══════════► Server   Bidirectional messages, any time
       Low per-message overhead (2-14 byte frame header vs ~800 bytes of HTTP headers)

Long-Polling (Hack):
Client ──GET──► Server    Server holds connection open
                          ... waits for event ...
Client ◄──200──  Server   Event! Response sent. Connection closed.
Client ──GET──► Server    Immediately reconnect. Repeat.
                          Works, but: connection churn, timeouts, latency.
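
The 2-14 byte header figure in the WebSocket comparison above follows directly from the frame layout. Here is a minimal sketch of that arithmetic; `frameHeaderSize` is a hypothetical helper written for this illustration, not part of any library.

```typescript
// Size in bytes of a WebSocket frame header for a given payload length.
// `masked` is true for client→server frames, which carry a 4-byte masking key.
function frameHeaderSize(payloadLength: number, masked: boolean): number {
  let size = 2; // FIN/RSV/opcode byte + MASK/7-bit length byte
  if (payloadLength > 65535) {
    size += 8; // 64-bit extended payload length
  } else if (payloadLength > 125) {
    size += 2; // 16-bit extended payload length
  }
  if (masked) size += 4; // masking key
  return size;
}

console.log(frameHeaderSize(5, false));     // 2
console.log(frameHeaderSize(1000, true));   // 8
console.log(frameHeaderSize(100000, true)); // 14
```

The minimum of 2 bytes applies to small server→client frames; the maximum of 14 applies to client→server frames larger than 65535 bytes.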

WebSocket Architecture

┌──────────────────────────────────────────────────────────────────┐
│ WebSocket Connection Lifecycle                                   │
│                                                                  │
│  Client                              Server                     │
│    │                                   │                         │
│    │── HTTP GET /ws ─────────────────►│                         │
│    │   Upgrade: websocket              │                         │
│    │   Connection: Upgrade             │                         │
│    │   Sec-WebSocket-Key: dGhlIH...    │                         │
│    │   Sec-WebSocket-Version: 13       │                         │
│    │                                   │                         │
│    │◄─ 101 Switching Protocols ───────│                         │
│    │   Upgrade: websocket              │                         │
│    │   Connection: Upgrade             │                         │
│    │   Sec-WebSocket-Accept: s3pP...   │ (SHA-1(key + GUID))    │
│    │                                   │                         │
│    │═══ WebSocket frames ═════════════│ Full-duplex             │
│    │                                   │                         │
│    │── Text Frame: "hello" ──────────►│                         │
│    │◄── Text Frame: "world" ──────────│                         │
│    │── Binary Frame: <bytes> ────────►│                         │
│    │── Ping Frame ───────────────────►│                         │
│    │◄── Pong Frame ───────────────────│ (automatic)             │
│    │                                   │                         │
│    │── Close Frame (code=1000) ─────►│                         │
│    │◄── Close Frame (code=1000) ──────│                         │
│    │                                   │                         │
│    │   TCP FIN ──────────────────────►│                         │
│    │◄── TCP FIN ──────────────────────│                         │
└──────────────────────────────────────────────────────────────────┘

Building a WebSocket Server from Scratch

HTTP Upgrade Handshake

import * as http from 'http';
import * as net from 'net';
import * as crypto from 'crypto';

const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-5AB5DC085B11';

class WebSocketServer {
  private httpServer: http.Server;
  private connections: Map<string, WebSocketConnection> = new Map();
  private rooms: Map<string, Set<string>> = new Map();
  private onConnectionCallback: ((ws: WebSocketConnection) => void) | null = null;

  constructor(server: http.Server) {
    this.httpServer = server;
    
    // Listen for HTTP Upgrade requests
    this.httpServer.on('upgrade', (req, socket, head) => {
      this.handleUpgrade(req, socket as net.Socket, head);
    });
  }

  onConnection(callback: (ws: WebSocketConnection) => void): void {
    this.onConnectionCallback = callback;
  }

  private handleUpgrade(req: http.IncomingMessage, socket: net.Socket, head: Buffer): void {
    // Validate WebSocket upgrade request
    const upgradeHeader = req.headers['upgrade'];
    if (!upgradeHeader || upgradeHeader.toLowerCase() !== 'websocket') {
      socket.write('HTTP/1.1 400 Bad Request\r\n\r\n');
      socket.destroy();
      return;
    }
    
    const key = req.headers['sec-websocket-key'];
    if (!key) {
      socket.write('HTTP/1.1 400 Bad Request\r\n\r\n');
      socket.destroy();
      return;
    }
    
    const version = req.headers['sec-websocket-version'];
    if (version !== '13') {
      socket.write('HTTP/1.1 426 Upgrade Required\r\nSec-WebSocket-Version: 13\r\n\r\n');
      socket.destroy();
      return;
    }
    
    // Generate accept key: SHA-1(client-key + GUID), base64 encoded
    const acceptKey = crypto
      .createHash('sha1')
      .update(key + WEBSOCKET_GUID)
      .digest('base64');
    
    // Send 101 Switching Protocols
    const headers = [
      'HTTP/1.1 101 Switching Protocols',
      'Upgrade: websocket',
      'Connection: Upgrade',
      `Sec-WebSocket-Accept: ${acceptKey}`,
      '', ''  // Empty line terminates headers
    ].join('\r\n');
    
    socket.write(headers);
    
    // Create WebSocket connection
    const connId = crypto.randomUUID();
    const ws = new WebSocketConnection(connId, socket, req);
    this.connections.set(connId, ws);
    
    ws.onClose(() => {
      this.connections.delete(connId);
      // Remove from all rooms
      for (const [room, members] of this.rooms) {
        members.delete(connId);
        if (members.size === 0) this.rooms.delete(room);
      }
    });
    
    // Let the application attach its message handlers first...
    this.onConnectionCallback?.(ws);
    
    // ...then process any bytes that arrived together with the upgrade request,
    // so frames contained in `head` are not dropped for lack of a listener.
    if (head.length > 0) {
      ws.handleData(head);
    }
  }

  // Room management for pub/sub
  joinRoom(connId: string, room: string): void {
    if (!this.rooms.has(room)) {
      this.rooms.set(room, new Set());
    }
    this.rooms.get(room)!.add(connId);
  }

  leaveRoom(connId: string, room: string): void {
    this.rooms.get(room)?.delete(connId);
    if (this.rooms.get(room)?.size === 0) {
      this.rooms.delete(room);
    }
  }

  broadcastToRoom(room: string, data: string | Buffer, excludeId?: string): void {
    const members = this.rooms.get(room);
    if (!members) return;
    
    for (const connId of members) {
      if (connId === excludeId) continue;
      const conn = this.connections.get(connId);
      if (conn && conn.isOpen()) {
        conn.send(data);
      }
    }
  }

  broadcast(data: string | Buffer, excludeId?: string): void {
    for (const [id, conn] of this.connections) {
      if (id === excludeId) continue;
      if (conn.isOpen()) conn.send(data);
    }
  }

  getConnectionCount(): number {
    return this.connections.size;
  }
}
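
The `handleUpgrade` method above checks the Upgrade header, the key, and the version. RFC 6455 also requires a GET request, a Connection header containing the `upgrade` token, and a key that decodes to 16 bytes. A stricter check could look roughly like the sketch below. `checkUpgradeRequest` and `HeaderMap` are hypothetical names for this example, and the headers are assumed to be lower-cased, as Node's `req.headers` are.

```typescript
type HeaderMap = Record<string, string | undefined>;

// Returns null when the request is a valid WebSocket upgrade, otherwise the
// HTTP status code the server should answer with.
function checkUpgradeRequest(method: string, headers: HeaderMap): number | null {
  if (method !== 'GET') return 400;

  if ((headers['upgrade'] ?? '').toLowerCase() !== 'websocket') return 400;

  // Connection is a comma-separated token list, e.g. "keep-alive, Upgrade".
  const hasUpgradeToken = (headers['connection'] ?? '')
    .split(',')
    .some((token) => token.trim().toLowerCase() === 'upgrade');
  if (!hasUpgradeToken) return 400;

  // The key must be the base64 encoding of a 16-byte nonce.
  const key = headers['sec-websocket-key'] ?? '';
  if (Buffer.from(key, 'base64').length !== 16) return 400;

  // Only protocol version 13 is supported.
  if (headers['sec-websocket-version'] !== '13') return 426;

  return null;
}
```

Keeping the check as a pure function makes it easy to unit-test without opening a socket.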

WebSocket Frame Parser

/*
  WebSocket Frame Format (RFC 6455):
  
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-------+-+-------------+-------------------------------+
  |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
  |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
  |N|V|V|V|       |S|             |   (if payload len==126/127)   |
  | |1|2|3|       |K|             |                               |
  +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
  |     Extended payload length continued, if payload len == 127  |
  + - - - - - - - - - - - - - - - +-------------------------------+
  |                               |Masking-key, if MASK set to 1  |
  +-------------------------------+-------------------------------+
  | Masking-key (continued)       |          Payload Data         |
  +-------------------------------- - - - - - - - - - - - - - - - +
  :                     Payload Data continued ...                :
  + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
  |                     Payload Data (continued)                  |
  +---------------------------------------------------------------+
  
  Opcodes:
  0x0: Continuation
  0x1: Text frame
  0x2: Binary frame
  0x8: Connection close
  0x9: Ping
  0xA: Pong
  
  Client→Server frames MUST be masked (XOR with 4-byte key).
  Server→Client frames MUST NOT be masked.
*/

enum Opcode {
  CONTINUATION = 0x0,
  TEXT = 0x1,
  BINARY = 0x2,
  CLOSE = 0x8,
  PING = 0x9,
  PONG = 0xA
}

interface WebSocketFrame {
  fin: boolean;        // Is this the final fragment?
  opcode: Opcode;
  masked: boolean;
  payload: Buffer;
}

class FrameParser {
  // Parse a WebSocket frame from raw bytes
  static parse(buffer: Buffer): { frame: WebSocketFrame; bytesConsumed: number } | null {
    if (buffer.length < 2) return null;
    
    let offset = 0;
    
    // Byte 0: FIN + opcode
    const byte0 = buffer[offset++];
    const fin = !!(byte0 & 0x80);
    const opcode = byte0 & 0x0f;
    
    // Byte 1: MASK + payload length
    const byte1 = buffer[offset++];
    const masked = !!(byte1 & 0x80);
    let payloadLength = byte1 & 0x7f;
    
    // Extended payload length
    if (payloadLength === 126) {
      if (buffer.length < offset + 2) return null;
      payloadLength = buffer.readUInt16BE(offset);
      offset += 2;
    } else if (payloadLength === 127) {
      if (buffer.length < offset + 8) return null;
      // JavaScript numbers are exact only up to 2^53 - 1; that is ample for any realistic frame
      const high = buffer.readUInt32BE(offset);
      const low = buffer.readUInt32BE(offset + 4);
      payloadLength = high * 0x100000000 + low;
      offset += 8;
    }
    
    // Masking key (4 bytes, only if masked)
    let maskingKey: Buffer | null = null;
    if (masked) {
      if (buffer.length < offset + 4) return null;
      maskingKey = buffer.subarray(offset, offset + 4);
      offset += 4;
    }
    
    // Payload
    if (buffer.length < offset + payloadLength) return null;
    let payload = buffer.subarray(offset, offset + payloadLength);
    
    // Unmask payload
    if (masked && maskingKey) {
      payload = Buffer.from(payload); // Copy to avoid modifying original
      for (let i = 0; i < payload.length; i++) {
        payload[i] ^= maskingKey[i % 4];
      }
    }
    
    return {
      frame: { fin, opcode, masked, payload },
      bytesConsumed: offset + payloadLength
    };
  }

  // Build a WebSocket frame for sending (server→client, no masking)
  static build(opcode: Opcode, payload: Buffer, fin: boolean = true): Buffer {
    const parts: Buffer[] = [];
    
    // Byte 0: FIN + opcode
    const byte0 = (fin ? 0x80 : 0x00) | opcode;
    
    // Payload length encoding
    if (payload.length < 126) {
      const header = Buffer.alloc(2);
      header[0] = byte0;
      header[1] = payload.length; // No mask bit (server→client)
      parts.push(header);
    } else if (payload.length < 65536) {
      const header = Buffer.alloc(4);
      header[0] = byte0;
      header[1] = 126;
      header.writeUInt16BE(payload.length, 2);
      parts.push(header);
    } else {
      const header = Buffer.alloc(10);
      header[0] = byte0;
      header[1] = 127;
      header.writeUInt32BE(Math.floor(payload.length / 0x100000000), 2); // High 32 bits
      header.writeUInt32BE(payload.length >>> 0, 6);                     // Low 32 bits
      parts.push(header);
    }
    
    parts.push(payload);
    return Buffer.concat(parts);
  }
}
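
To exercise a parser like the one above, it helps to have masked frames of the kind a client would send. The sketch below builds one for small payloads. `buildMaskedFrame` is a hypothetical helper, and the masking key is passed in rather than generated so the output is reproducible; a real client must use a fresh random key for every frame.

```typescript
// Builds a masked, single-frame message the way a client would.
function buildMaskedFrame(opcode: number, payload: Buffer, maskingKey: Buffer): Buffer {
  if (maskingKey.length !== 4) throw new Error('masking key must be 4 bytes');
  if (payload.length > 125) throw new Error('sketch only covers payloads up to 125 bytes');

  const frame = Buffer.alloc(2 + 4 + payload.length);
  frame[0] = 0x80 | opcode;         // FIN = 1
  frame[1] = 0x80 | payload.length; // MASK = 1, 7-bit length
  maskingKey.copy(frame, 2);        // bytes 2..5: masking key
  for (let i = 0; i < payload.length; i++) {
    frame[6 + i] = payload[i] ^ maskingKey[i % 4];
  }
  return frame;
}

// The masked "Hello" text frame from RFC 6455, section 5.7.
const helloFrame = buildMaskedFrame(
  0x1,
  Buffer.from('Hello', 'utf8'),
  Buffer.from([0x37, 0xfa, 0x21, 0x3d])
);
console.log(helloFrame.toString('hex')); // 818537fa213d7f9f4d5158
```

Feeding `helloFrame` to `FrameParser.parse` should yield a final text frame whose payload decodes to "Hello".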

WebSocket Connection

class WebSocketConnection {
  readonly id: string;
  private socket: net.Socket;
  private request: http.IncomingMessage;
  private buffer: Buffer = Buffer.alloc(0);
  private state: 'open' | 'closing' | 'closed' = 'open';
  private fragmentBuffer: Buffer[] = [];
  private fragmentOpcode: Opcode | null = null;
  
  private messageCallbacks: ((data: string | Buffer) => void)[] = [];
  private closeCallbacks: ((code: number, reason: string) => void)[] = [];
  private errorCallbacks: ((error: Error) => void)[] = [];
  
  private pingInterval: ReturnType<typeof setInterval> | null = null;
  private pongReceived: boolean = true;
  private lastActivity: number = Date.now();

  constructor(id: string, socket: net.Socket, request: http.IncomingMessage) {
    this.id = id;
    this.socket = socket;
    this.request = request;
    
    socket.on('data', (data) => this.handleData(data));
    socket.on('close', () => this.handleSocketClose());
    socket.on('error', (err) => this.handleError(err));
    
    // Start ping/pong heartbeat (detect dead connections)
    this.startHeartbeat();
  }

  onMessage(callback: (data: string | Buffer) => void): void {
    this.messageCallbacks.push(callback);
  }

  onClose(callback: (code: number, reason: string) => void): void {
    this.closeCallbacks.push(callback);
  }

  onError(callback: (error: Error) => void): void {
    this.errorCallbacks.push(callback);
  }

  send(data: string | Buffer): void {
    if (this.state !== 'open') return;
    
    const isText = typeof data === 'string';
    const payload = isText ? Buffer.from(data, 'utf8') : data;
    const opcode = isText ? Opcode.TEXT : Opcode.BINARY;
    
    // Fragment large messages (>64KB)
    const MAX_FRAME_SIZE = 65536;
    
    if (payload.length <= MAX_FRAME_SIZE) {
      const frame = FrameParser.build(opcode, payload, true);
      this.socket.write(frame);
    } else {
      // Send as fragments
      let offset = 0;
      let isFirst = true;
      
      while (offset < payload.length) {
        const chunk = payload.subarray(offset, offset + MAX_FRAME_SIZE);
        const isFinal = offset + MAX_FRAME_SIZE >= payload.length;
        const frameOpcode = isFirst ? opcode : Opcode.CONTINUATION;
        
        const frame = FrameParser.build(frameOpcode, chunk, isFinal);
        this.socket.write(frame);
        
        offset += MAX_FRAME_SIZE;
        isFirst = false;
      }
    }
    
    this.lastActivity = Date.now();
  }

  close(code: number = 1000, reason: string = ''): void {
    if (this.state !== 'open') return;
    
    this.state = 'closing';
    
    // Send close frame
    const reasonBytes = Buffer.from(reason, 'utf8');
    const payload = Buffer.alloc(2 + reasonBytes.length);
    payload.writeUInt16BE(code, 0);
    reasonBytes.copy(payload, 2);
    
    const frame = FrameParser.build(Opcode.CLOSE, payload, true);
    this.socket.write(frame);
    
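    // Give the peer time to echo the close frame, then force the socket shut.
    // State change, heartbeat cleanup, and close callbacks are left to
    // handleSocketClose, which runs when the socket emits 'close'.
    setTimeout(() => {
      if (this.state !== 'closed') {
        this.socket.destroy();
      }
    }, 5000);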
  }

  isOpen(): boolean {
    return this.state === 'open';
  }

  getClientIp(): string {
    return this.request.socket.remoteAddress || 'unknown';
  }

  handleData(data: Buffer): void {
    this.buffer = Buffer.concat([this.buffer, data]);
    this.lastActivity = Date.now();
    
    // Parse all complete frames in the buffer
    while (this.buffer.length > 0) {
      const result = FrameParser.parse(this.buffer);
      if (!result) break; // Incomplete frame
      
      this.buffer = this.buffer.subarray(result.bytesConsumed);
      this.handleFrame(result.frame);
    }
  }

  private handleFrame(frame: WebSocketFrame): void {
    switch (frame.opcode) {
      case Opcode.TEXT:
      case Opcode.BINARY:
        if (frame.fin) {
          // Complete single-frame message
          const data = frame.opcode === Opcode.TEXT
            ? frame.payload.toString('utf8')
            : frame.payload;
          this.emitMessage(data);
        } else {
          // Start of fragmented message
          this.fragmentOpcode = frame.opcode;
          this.fragmentBuffer = [frame.payload];
        }
        break;
      
      case Opcode.CONTINUATION:
        if (this.fragmentOpcode !== null) {
          this.fragmentBuffer.push(frame.payload);
          
          if (frame.fin) {
            // Final fragment — reassemble
            const fullPayload = Buffer.concat(this.fragmentBuffer);
            const data = this.fragmentOpcode === Opcode.TEXT
              ? fullPayload.toString('utf8')
              : fullPayload;
            
            this.fragmentOpcode = null;
            this.fragmentBuffer = [];
            this.emitMessage(data);
          }
        }
        break;
      
      case Opcode.PING:
        // Must respond with pong containing the same payload
        if (this.state === 'open') {
          const pong = FrameParser.build(Opcode.PONG, frame.payload, true);
          this.socket.write(pong);
        }
        break;
      
      case Opcode.PONG:
        this.pongReceived = true;
        break;
      
      case Opcode.CLOSE:
        this.handleCloseFrame(frame);
        break;
    }
  }

  private handleCloseFrame(frame: WebSocketFrame): void {
    let code = 1005; // No status code present
    let reason = '';
    
    if (frame.payload.length >= 2) {
      code = frame.payload.readUInt16BE(0);
      reason = frame.payload.toString('utf8', 2);
    }
    
    if (this.state === 'open') {
      // Peer initiated close — echo it back
      this.state = 'closing';
      const closeFrame = FrameParser.build(Opcode.CLOSE, frame.payload, true);
      this.socket.write(closeFrame);
    }
    
    this.state = 'closed';
    this.stopHeartbeat();
    this.socket.end();
    
    for (const cb of this.closeCallbacks) cb(code, reason);
  }

  private emitMessage(data: string | Buffer): void {
    for (const cb of this.messageCallbacks) cb(data);
  }

  private handleSocketClose(): void {
    if (this.state !== 'closed') {
      this.state = 'closed';
      this.stopHeartbeat();
      for (const cb of this.closeCallbacks) cb(1006, 'Connection lost');
    }
  }

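  private handleError(error: Error): void {
    for (const cb of this.errorCallbacks) cb(error);
    // Do not mark the connection closed here: net.Socket emits 'close' after
    // 'error', and handleSocketClose must still run so that the heartbeat is
    // stopped and the close callbacks fire.
    this.socket.destroy();
  }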

  // Heartbeat: detect dead connections
  private startHeartbeat(): void {
    this.pingInterval = setInterval(() => {
      if (!this.pongReceived) {
        // No pong received for last ping — connection is dead
        this.close(1001, 'Ping timeout');
        return;
      }
      
      this.pongReceived = false;
      const ping = FrameParser.build(Opcode.PING, Buffer.alloc(0), true);
      this.socket.write(ping);
    }, 30000); // Every 30 seconds
  }

  private stopHeartbeat(): void {
    if (this.pingInterval) clearInterval(this.pingInterval);
  }
}
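
The fragmentation rule used in `send` and undone in `handleFrame` is easier to see in isolation: the first frame carries the message opcode, later frames use the continuation opcode, and only the last frame sets FIN. The sketch below shows just that rule. `planFragments`, `reassemble`, and `FragmentPlan` are hypothetical names for this illustration.

```typescript
interface FragmentPlan {
  opcode: number; // 0x1 or 0x2 on the first frame, 0x0 (continuation) afterwards
  fin: boolean;   // true only on the last frame
  chunk: Buffer;
}

function planFragments(opcode: number, payload: Buffer, maxFrameSize: number): FragmentPlan[] {
  if (maxFrameSize <= 0) throw new Error('maxFrameSize must be positive');
  const plans: FragmentPlan[] = [];
  let offset = 0;
  // do/while so that an empty payload still produces one (empty, final) frame.
  do {
    const chunk = payload.subarray(offset, offset + maxFrameSize);
    offset += maxFrameSize;
    plans.push({
      opcode: plans.length === 0 ? opcode : 0x0,
      fin: offset >= payload.length,
      chunk,
    });
  } while (offset < payload.length);
  return plans;
}

// Receiver side: concatenate the chunks in arrival order.
function reassemble(plans: FragmentPlan[]): Buffer {
  return Buffer.concat(plans.map((plan) => plan.chunk));
}

const fragments = planFragments(0x1, Buffer.from('0123456789'), 4);
console.log(fragments.map((f) => `${f.opcode}/${f.fin}/${f.chunk.length}`).join(' '));
// 1/false/4 0/false/4 0/true/2
```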

Scaling WebSockets with Pub/Sub

/*
  Scaling Problem:
  
  With 1 server, all WebSocket connections are in one process.
  Broadcasting to a room is just iterating local connections.
  
  With N servers, connections are distributed. A message sent
  to Server A must reach clients on Server B and C.
  
  Solution: Redis Pub/Sub as a message bus between servers.
  
  ┌──────────┐  ┌──────────┐  ┌──────────┐
  │ Server A │  │ Server B │  │ Server C │
  │ 1000     │  │ 1000     │  │ 1000     │
  │ clients  │  │ clients  │  │ clients  │
  └────┬─────┘  └────┬─────┘  └────┬─────┘
       │              │              │
       └──────────────┼──────────────┘
                      │
              ┌───────▼───────┐
              │  Redis Pub/Sub │
              │               │
              │  Channel:     │
              │  room:general │
              └───────────────┘
  
  Server A publishes message to "room:general".
  Redis broadcasts to all subscribers (B, C).
  B and C forward to their local clients in that room.
*/

interface PubSubAdapter {
  publish(channel: string, message: string): Promise<void>;
  subscribe(channel: string, callback: (message: string) => void): Promise<void>;
  unsubscribe(channel: string): Promise<void>;
}

class ScalableWebSocketServer {
  private wsServer: WebSocketServer;
  private pubsub: PubSubAdapter;
  private serverId: string;
  private localRooms: Map<string, Set<string>> = new Map();

  constructor(wsServer: WebSocketServer, pubsub: PubSubAdapter) {
    this.wsServer = wsServer;
    this.pubsub = pubsub;
    this.serverId = crypto.randomUUID();
  }

  async joinRoom(connId: string, room: string): Promise<void> {
    // Track locally
    if (!this.localRooms.has(room)) {
      this.localRooms.set(room, new Set());
      
      // First local member — subscribe to Redis channel
      await this.pubsub.subscribe(`room:${room}`, (message) => {
        const parsed = JSON.parse(message);
        
        // Don't re-broadcast our own messages
        if (parsed.serverId === this.serverId) return;
        
        // Forward to local clients in this room
        this.wsServer.broadcastToRoom(room, parsed.data);
      });
    }
    
    this.localRooms.get(room)!.add(connId);
    this.wsServer.joinRoom(connId, room);
  }

  async leaveRoom(connId: string, room: string): Promise<void> {
    this.wsServer.leaveRoom(connId, room);
    this.localRooms.get(room)?.delete(connId);
    
    // Last local member — unsubscribe from Redis
    if (this.localRooms.get(room)?.size === 0) {
      this.localRooms.delete(room);
      await this.pubsub.unsubscribe(`room:${room}`);
    }
  }

  async broadcastToRoom(room: string, data: string, excludeId?: string): Promise<void> {
    // Send to local clients
    this.wsServer.broadcastToRoom(room, data, excludeId);
    
    // Publish to Redis for other servers
    await this.pubsub.publish(`room:${room}`, JSON.stringify({
      serverId: this.serverId,
      data,
      timestamp: Date.now()
    }));
  }

  getStats(): {
    connections: number;
    rooms: number;
    serverId: string;
  } {
    return {
      connections: this.wsServer.getConnectionCount(),
      rooms: this.localRooms.size,
      serverId: this.serverId
    };
  }
}
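
For local development and tests, the `PubSubAdapter` interface can be backed by an in-process bus instead of Redis. The sketch below is one way to do that. `InMemoryPubSub`, `Listener`, and `Bus` are hypothetical names, and the interface is repeated only to keep the example self-contained. Each adapter instance stands in for one server, and all instances share one `Bus`.

```typescript
interface PubSubAdapter {
  publish(channel: string, message: string): Promise<void>;
  subscribe(channel: string, callback: (message: string) => void): Promise<void>;
  unsubscribe(channel: string): Promise<void>;
}

type Listener = (message: string) => void;
type Bus = Map<string, Set<Listener>>;

class InMemoryPubSub implements PubSubAdapter {
  private bus: Bus;
  private own = new Map<string, Listener>(); // this adapter's listener per channel

  constructor(bus: Bus) {
    this.bus = bus;
  }

  async publish(channel: string, message: string): Promise<void> {
    // Delivered to every current subscriber, including the publisher's own
    // subscription, which is why the server filters on serverId.
    this.bus.get(channel)?.forEach((listener) => listener(message));
  }

  async subscribe(channel: string, callback: Listener): Promise<void> {
    this.remove(channel); // replace any previous listener from this adapter
    if (!this.bus.has(channel)) this.bus.set(channel, new Set());
    this.bus.get(channel)!.add(callback);
    this.own.set(channel, callback);
  }

  async unsubscribe(channel: string): Promise<void> {
    this.remove(channel);
  }

  private remove(channel: string): void {
    const listener = this.own.get(channel);
    if (!listener) return;
    const listeners = this.bus.get(channel);
    listeners?.delete(listener);
    if (listeners && listeners.size === 0) this.bus.delete(channel);
    this.own.delete(channel);
  }
}
```

Two `ScalableWebSocketServer` instances constructed with adapters that share the same `Bus` behave like two processes behind Redis, which makes the cross-server forwarding path testable in a single process.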

Comparison Table

┌──────────────────┬───────────────┬───────────────┬───────────────┬───────────────┐
│                  │ WebSocket     │ Server-Sent   │ Long Polling  │ HTTP/2 Push   │
│                  │               │ Events (SSE)  │               │               │
├──────────────────┼───────────────┼───────────────┼───────────────┼───────────────┤
│ Direction        │ Bidirectional │ Server→Client │ Bidirectional │ Server→Client │
│ Protocol         │ ws:// or wss: │ HTTP (text/   │ HTTP          │ HTTP/2        │
│                  │ (upgrade)     │ event-stream) │               │               │
│ Connection       │ Persistent    │ Persistent    │ Per-request   │ Multiplexed   │
│                  │ TCP           │ HTTP          │ (reconnect)   │ streams       │
│ Message overhead │ 2-14 bytes    │ ~50 bytes     │ ~800 bytes    │ ~20 bytes     │
│ Binary support   │ Yes           │ No (text only)│ Yes (base64)  │ Yes           │
│ Auto-reconnect   │ Manual        │ Built-in      │ Manual        │ N/A           │
│ Proxy-friendly   │ Needs upgrade │ Standard HTTP │ Standard HTTP │ Standard HTTP │
│                  │ support       │ (works thru   │ (works thru   │               │
│                  │               │ proxies)      │ all proxies)  │               │
│ Best for         │ Chat, gaming, │ Live feeds,   │ Fallback when │ Preloading    │
│                  │ collaboration │ notifications │ WS/SSE        │ resources     │
│                  │               │               │ unavailable   │               │
│ Max connections  │ ~65K per IP   │ 6 per domain  │ 6 per domain  │ 1 connection  │
│ per browser      │ (per server)  │ (HTTP/1.1)    │ (HTTP/1.1)    │ per domain    │
└──────────────────┴───────────────┴───────────────┴───────────────┴───────────────┘

Interview Questions

Q1: How does the WebSocket handshake work and why does it start with HTTP?

WebSocket connections begin with an HTTP Upgrade request. The client sends a standard HTTP GET with special headers: Upgrade: websocket, Connection: Upgrade, Sec-WebSocket-Key (a base64-encoded random 16-byte nonce), and Sec-WebSocket-Version: 13. The server validates these, computes Sec-WebSocket-Accept = Base64(SHA-1(key + "258EAFA5-E914-47DA-95CA-5AB5DC085B11")), and responds with 101 Switching Protocols. After this, the TCP connection switches from HTTP to the WebSocket frame protocol. Why HTTP first? (1) It works through HTTP infrastructure: proxies, load balancers, and firewalls that understand HTTP can forward the upgrade. (2) The same port (80/443) is used, avoiding firewall issues with non-standard ports. (3) Existing HTTP auth (cookies, headers) can be sent with the upgrade request for authentication. The accept key proves that the responder actually processed this particular handshake: because it is derived from the client's random key, a caching proxy cannot replay an old 101 response, and a server that does not speak WebSocket cannot produce a valid one by accident.
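
The accept-key computation described above can be checked against the sample values given in RFC 6455. This is a minimal sketch; `computeAcceptKey` and `WS_GUID` are names chosen for the example.

```typescript
import { createHash } from 'node:crypto';

// Fixed GUID defined by RFC 6455 for the handshake.
const WS_GUID = '258EAFA5-E914-47DA-95CA-5AB5DC085B11';

// Sec-WebSocket-Accept = Base64(SHA-1(Sec-WebSocket-Key + GUID))
function computeAcceptKey(clientKey: string): string {
  return createHash('sha1').update(clientKey + WS_GUID).digest('base64');
}

// Sample key and expected result from RFC 6455, section 1.3.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ==')); // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client performs the same computation on its side and rejects the connection if the server's header does not match.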

Q2: How do you scale WebSocket connections across multiple server instances?

Each WebSocket connection is a persistent TCP connection tied to one server. When you have N server instances, clients are distributed across them. A message intended for all users in a "room" must reach clients on every server. Solution: Pub/Sub backbone. Each server subscribes to a Redis Pub/Sub channel for each room that has at least one local member. When a server receives a message for a room, it (1) broadcasts to local clients and (2) publishes to Redis. Other servers receive the publication and forward to their local clients. Sticky sessions: Once established, a WebSocket connection stays on the server that accepted it, so a plain WebSocket handshake (a single HTTP request) can be routed to any instance. Affinity at the load balancer (IP-hash or cookie-based) matters when the handshake spans several HTTP requests, as with libraries that start on long-polling and upgrade later, or when a reconnecting client should return to a server that holds its session state. Connection limits: A single server can typically handle on the order of 50K-100K concurrent WebSocket connections, limited by file descriptors and per-connection memory (roughly 5-20KB). At that density, 1M connections need 10-20 servers. Alternatives to Redis Pub/Sub: NATS, Kafka, or a custom gossip protocol.
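
The sizing figures above reduce to simple arithmetic. Here is a rough sketch, with `estimateCapacity` as a hypothetical helper; the per-server limit and per-connection memory are inputs you would measure for your own workload, not constants.

```typescript
interface CapacityEstimate {
  servers: number;           // instances needed
  memoryPerServerMB: number; // connection-state memory per instance
}

function estimateCapacity(
  totalConnections: number,
  perServerLimit: number,
  bytesPerConnection: number
): CapacityEstimate {
  const servers = Math.ceil(totalConnections / perServerLimit);
  // Assume the load balancer spreads connections evenly.
  const connectionsPerServer = Math.ceil(totalConnections / servers);
  const memoryPerServerMB = (connectionsPerServer * bytesPerConnection) / (1024 * 1024);
  return { servers, memoryPerServerMB };
}

console.log(estimateCapacity(1_000_000, 50_000, 20 * 1024).servers);  // 20
console.log(estimateCapacity(1_000_000, 100_000, 5 * 1024).servers);  // 10
```

The estimate covers connection state only; application buffers, TLS state, and headroom for reconnect storms come on top.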

Q3: Why must client-to-server WebSocket frames be masked and server-to-client frames not?

This is a security requirement from RFC 6455, designed to prevent cache poisoning attacks on intermediary proxies. Without masking: An attacker's script running in a browser could craft WebSocket payloads that, viewed as raw bytes by a proxy that does not understand WebSocket, look like valid HTTP requests. The proxy might cache the attacker-controlled response under a URL of the attacker's choosing, and when a real client requests that URL, the proxy serves the attacker's content. Masking (XOR-ing payload bytes with a random 4-byte key) ensures the attacker cannot predict the raw bytes on the wire, so they cannot be made to resemble HTTP. The masking key is sent in the frame header, so the server can unmask trivially; masking is not encryption. Server→client frames aren't masked because the attack depends on untrusted code choosing the bytes that a client sends; the server's output is not attacker-controlled in the same way. Performance note: Masking adds CPU overhead for large payloads (the XOR loop), often quoted at around 10%. Implementations reduce this with SIMD instructions or word-level XOR. The mask must be unpredictable and freshly generated for every frame, since predictable masks defeat the purpose.
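
The word-level XOR mentioned above can be sketched with plain `Buffer` calls: XOR four bytes at a time against the key read as a 32-bit integer, then finish the tail byte by byte. `unmaskBytewise` and `unmaskWordwise` are hypothetical helpers, and whether the word-level variant is actually faster depends on the runtime, so measure before adopting it.

```typescript
// Reference implementation: one XOR per byte.
function unmaskBytewise(payload: Buffer, key: Buffer): Buffer {
  const out = Buffer.from(payload); // copy, leave the input untouched
  for (let i = 0; i < out.length; i++) {
    out[i] ^= key[i % 4];
  }
  return out;
}

// Word-level variant: one XOR per 4 bytes, plus a byte-wise tail.
function unmaskWordwise(payload: Buffer, key: Buffer): Buffer {
  const out = Buffer.from(payload);
  const key32 = key.readUInt32BE(0);
  const wordEnd = out.length - (out.length % 4);
  for (let i = 0; i < wordEnd; i += 4) {
    // ^ yields a signed 32-bit value; >>> 0 converts it back to unsigned.
    out.writeUInt32BE((out.readUInt32BE(i) ^ key32) >>> 0, i);
  }
  for (let i = wordEnd; i < out.length; i++) {
    out[i] ^= key[i % 4]; // i % 4 stays aligned because wordEnd is a multiple of 4
  }
  return out;
}

// Masked "Hello" payload and key from RFC 6455, section 5.7.
const maskedHello = Buffer.from([0x7f, 0x9f, 0x4d, 0x51, 0x58]);
const helloKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
console.log(unmaskWordwise(maskedHello, helloKey).toString('utf8')); // Hello
```

Because XOR is its own inverse, the same functions also mask a payload.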

Q4: How do you detect and handle dead WebSocket connections?

WebSocket runs over TCP, and TCP has a known problem: if one side crashes or the network path fails, the other side may not detect it for hours (TCP keepalive, when enabled at all, waits 2 hours before its first probe by default on Linux). Solution: Application-level heartbeat using WebSocket Ping/Pong frames. The server sends a Ping frame every 30 seconds. The client must respond with a Pong (browsers do this automatically). If no Pong has arrived by the time the next Ping is due, the connection is considered dead and is closed. Implementation: Track pongReceived per connection. Before sending a new Ping, check if the previous Pong arrived. If not, close with code 1001 (Going Away). Additional signals: Monitor TCP CLOSE_WAIT state (peer closed but we haven't). Track lastActivity timestamp and close connections idle beyond a threshold. Client side: Implement automatic reconnection with exponential backoff (1s, 2s, 4s, 8s, max 30s). Include a jitter factor to prevent all clients from reconnecting simultaneously after a server restart.
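
The client-side backoff schedule described above (1s, 2s, 4s, 8s, capped at 30s, with jitter) can be sketched as a pure function. `reconnectDelayMs` and its parameters are hypothetical; the ±20% jitter ratio is an assumption for the example, and the random source is injectable so the schedule can be tested deterministically.

```typescript
function reconnectDelayMs(
  attempt: number,                        // 0 for the first retry
  baseMs: number = 1000,
  maxMs: number = 30000,
  jitterRatio: number = 0.2,              // assumed: spread delays by ±20%
  random: () => number = Math.random
): number {
  // Exponential growth, capped at maxMs.
  const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  // Map random() in [0, 1) onto [-jitterRatio, +jitterRatio).
  const jitter = jitterRatio * (2 * random() - 1);
  return Math.round(capped * (1 + jitter));
}

// With the jitter pinned to its midpoint, the schedule is the plain one.
const midpoint = () => 0.5;
console.log([0, 1, 2, 3, 4, 5].map((n) => reconnectDelayMs(n, 1000, 30000, 0.2, midpoint)));
// [ 1000, 2000, 4000, 8000, 16000, 30000 ]
```

Note that jitter is applied after the cap, so an individual delay can exceed `maxMs` by up to the jitter ratio; clamp again if a hard upper bound is required.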

Q5: When should you use WebSocket vs Server-Sent Events (SSE)?

Use WebSocket when: You need bidirectional communication (chat, multiplayer games, collaborative editing). The client sends frequent messages to the server. You need binary data support. Low per-message overhead matters (a 2-14 byte frame header vs ~50 bytes for SSE). Use SSE when: Communication is primarily server→client (live feeds, notifications, dashboards). You want automatic reconnection (SSE has it built-in, WebSocket doesn't). You're behind restrictive proxies that don't support WebSocket upgrade. You want simpler server implementation (SSE is just long-lived HTTP with text/event-stream content type). SSE works with standard HTTP infrastructure without any upgrade mechanism. SSE limitations: Server→client only (client sends via regular HTTP POST). Text-only (no binary). Limited to 6 connections per domain on HTTP/1.1 (solved by HTTP/2). Practical choice: For real-time dashboards and notifications, SSE is simpler and sufficient. For chat, gaming, and collaboration, WebSocket is necessary.
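
To make "SSE is just long-lived HTTP with text/event-stream" concrete, the sketch below formats a single event in the SSE wire format. `formatSseEvent` is a hypothetical helper; a server would write its output to a response whose Content-Type is `text/event-stream`.

```typescript
// Formats one Server-Sent Event. Each line of `data` becomes its own
// "data:" field, and a blank line terminates the event.
function formatSseEvent(data: string, event?: string, id?: string): string {
  let out = '';
  if (id !== undefined) out += `id: ${id}\n`;
  if (event !== undefined) out += `event: ${event}\n`;
  for (const line of data.split('\n')) {
    out += `data: ${line}\n`;
  }
  return out + '\n';
}

console.log(JSON.stringify(formatSseEvent('hello')));
// "data: hello\n\n"
console.log(JSON.stringify(formatSseEvent('a\nb', 'tick', '7')));
// "id: 7\nevent: tick\ndata: a\ndata: b\n\n"
```

The `id` field is what enables the built-in reconnection: the browser sends the last seen id back in a `Last-Event-ID` header when it reconnects.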


Key Takeaways

  1. WebSocket begins with an HTTP Upgrade handshake on port 80/443: This ensures compatibility with existing HTTP infrastructure (proxies, load balancers, firewalls).

  2. The frame protocol has only 2-14 bytes of overhead: Compared to ~800 bytes for HTTP headers. This makes WebSocket efficient for frequent small messages.

  3. Client→server frames must be masked (XOR with random 4-byte key): This prevents cache poisoning attacks on WebSocket-unaware intermediary proxies.

  4. Ping/Pong heartbeat at 30-second intervals detects dead connections: TCP alone can take hours to detect a dead peer. Application-level heartbeat is mandatory.

  5. Scaling requires a Pub/Sub backbone (Redis, NATS, Kafka): Messages from one server's clients must reach clients connected to other servers.

  6. A single server handles ~50K-100K concurrent WebSocket connections: Limited by file descriptors and memory (~5-20KB per connection).

  7. Use Server-Sent Events (SSE) for server→client-only streaming: SSE is simpler, has built-in reconnection, and works through all HTTP proxies.

  8. Fragment large messages into multiple frames: Sending a 10MB file as a single frame blocks the connection. Fragment into 64KB frames for interleaving with control frames.

  9. Implement exponential backoff with jitter for client reconnection: Prevents thundering herd when a server restarts and all clients reconnect simultaneously.

  10. Close codes communicate the reason for disconnection: 1000 = normal, 1001 = going away, 1008 = policy violation, 1011 = unexpected server error. Always send meaningful close codes.
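
As a companion to takeaway 10, the payload of a Close frame is a 2-byte big-endian status code followed by an optional UTF-8 reason. The sketch below encodes and decodes it. `encodeClosePayload` and `decodeClosePayload` are hypothetical helpers, and the checks reflect RFC 6455: control frame payloads are limited to 125 bytes, and codes 1005, 1006, and 1015 are reserved for local use and must not be sent.

```typescript
function encodeClosePayload(code: number, reason: string = ''): Buffer {
  if (code === 1005 || code === 1006 || code === 1015) {
    throw new Error(`close code ${code} must not be sent on the wire`);
  }
  const reasonBytes = Buffer.from(reason, 'utf8');
  // 125-byte control frame limit: 2 bytes for the code, 123 for the reason.
  if (reasonBytes.length > 123) throw new Error('close reason exceeds 123 bytes');

  const payload = Buffer.alloc(2 + reasonBytes.length);
  payload.writeUInt16BE(code, 0);
  reasonBytes.copy(payload, 2);
  return payload;
}

function decodeClosePayload(payload: Buffer): { code: number; reason: string } {
  // An empty payload is legal; 1005 is the local "no status code present" marker.
  if (payload.length < 2) return { code: 1005, reason: '' };
  return { code: payload.readUInt16BE(0), reason: payload.toString('utf8', 2) };
}

console.log(encodeClosePayload(1000, 'bye').toString('hex')); // 03e8627965
console.log(decodeClosePayload(Buffer.from('03e8627965', 'hex')));
// { code: 1000, reason: 'bye' }
```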

© 2026 Vidhya Sagar Thakur. All rights reserved.