Backend Process & Thread Management: How Servers Actually Handle Concurrency
Most backend developers have a mental model that goes something like: "request comes in, server handles it, response goes out." But what actually happens at the operating system level? How does Node.js handle 10,000 concurrent connections on a single thread? Why does Python's GIL exist? How do Go's goroutines differ from OS threads? And when you type cluster.fork(), what syscalls fire underneath?
This post tears apart process and thread management from the OS primitives up through the application frameworks. We'll build a multi-process server, a thread pool, a work-stealing scheduler, and an event loop — all from scratch in TypeScript — to understand exactly how modern backend runtimes manage concurrency.
The Operating System Foundation
Before we touch any application code, we need to understand what the OS provides:
┌─────────────────────────────────────────────────────────────────┐
│ USER SPACE │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Process 1 │ │ Process 2 │ │ Process 3 │ │ Process N │ │
│ │ │ │ │ │ │ │ │ │
│ │ ┌──┐ ┌──┐│ │ ┌──┐ ┌──┐│ │ ┌──┐ │ │ ┌──┐ ┌──┐│ │
│ │ │T1│ │T2││ │ │T1│ │T2││ │ │T1│ │ │ │T1│ │T2││ │
│ │ └──┘ └──┘│ │ └──┘ └──┘│ │ └──┘ │ │ └──┘ └──┘│ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
├──────────────────────────────────────────────────────────────────┤
│ KERNEL SPACE │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ SCHEDULER (CFS) │ │
│ │ │ │
│ │ Run Queue: [T1:P1] [T2:P1] [T1:P2] [T2:P2] [T1:P3] │ │
│ │ │ │
│ │ CPU 0: ──▶ T1:P1 ──▶ T2:P2 ──▶ T1:P3 ──▶ ... │ │
│ │ CPU 1: ──▶ T1:P2 ──▶ T2:P1 ──▶ T1:PN ──▶ ... │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Virtual Memory │ │ File Descriptors │ │
│ │ Management │ │ & I/O Subsystem │ │
│ └──────────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
Process vs Thread: The Real Difference
PROCESS:
┌──────────────────────────────────┐
│ PID: 1234 │
│ ┌────────────────────────────┐ │
│ │ Virtual Address Space │ │ ← Each process gets its own
│ │ ┌──────┐ ┌──────┐ │ │
│ │ │ Code │ │ Data │ │ │
│ │ └──────┘ └──────┘ │ │
│ │ ┌──────┐ ┌──────┐ │ │
│ │ │ Heap │ │Stack │ │ │
│ │ └──────┘ └──────┘ │ │
│ └────────────────────────────┘ │
│ File Descriptors: [0,1,2,3...] │ ← Own FD table
│ Signal Handlers: {...} │ ← Own signal disposition
│ Environment: {...} │ ← Own env vars
└──────────────────────────────────┘
THREADS (within a process):
┌──────────────────────────────────────────┐
│ PID: 1234 │
│ ┌────────────────────────────────────┐ │
│ │ SHARED: Code, Data, Heap, FDs │ │ ← All threads share
│ └────────────────────────────────────┘ │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Thread 1 │ │ Thread 2 │ │ Thread 3 │ │
│ │ ┌─────┐ │ │ ┌─────┐ │ │ ┌─────┐ ││
│ │ │Stack│ │ │ │Stack│ │ │ │Stack│ ││ ← Each thread: own stack
│ │ └─────┘ │ │ └─────┘ │ │ └─────┘ ││
│ │ TID:1001 │ │ TID:1002 │ │ TID:1003 ││ ← Own thread ID
│ │ Regs:{} │ │ Regs:{} │ │ Regs:{} ││ ← Own CPU registers
│ └─────────┘ └─────────┘ └─────────┘ │
└──────────────────────────────────────────┘
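The "shared heap" box is directly observable in Node. In this sketch, a worker thread writes into an Int32Array backed by a SharedArrayBuffer, and the main thread sees the write without any message passing — the same memory, not a copy. A forked process would need explicit IPC or shared memory instead.

```typescript
import { Worker } from "worker_threads";

// Threads share the process's address space: a SharedArrayBuffer handed to a
// worker via workerData is the *same* memory, not a structured-clone copy.
const shared = new Int32Array(new SharedArrayBuffer(4));

const worker = new Worker(
  `const { workerData } = require("worker_threads");
   Atomics.store(workerData, 0, 42);`,
  { eval: true, workerData: shared }
);

worker.on("exit", () => {
  console.log(Atomics.load(shared, 0)); // 42 — written by the other thread
});
```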
Building a Process Manager from Scratch
Let's build the core abstraction that manages child processes — similar to what Node.js's cluster module does internally:
import { EventEmitter } from "events";
// ─── Process Descriptor ───────────────────────────────────────
interface ProcessDescriptor {
pid: number;
state: "running" | "stopped" | "zombie" | "sleeping";
cpuTime: number; // milliseconds of CPU used
memoryRSS: number; // resident set size in bytes
startTime: number; // when process started
exitCode: number | null;
signal: string | null;
restarts: number;
}
// ─── IPC Channel Types ────────────────────────────────────────
type IPCMessage = {
type: string;
payload: unknown;
source: number; // sender PID
timestamp: number;
};
type IPCChannel = {
send(msg: IPCMessage): boolean;
onMessage(handler: (msg: IPCMessage) => void): void;
close(): void;
};
// ─── Process Manager ─────────────────────────────────────────
class ProcessManager extends EventEmitter {
private processes: Map<number, ProcessDescriptor> = new Map();
private ipcChannels: Map<number, IPCChannel> = new Map();
private maxProcesses: number;
private restartPolicy: "always" | "on-failure" | "never";
private maxRestarts: number;
private restartDelay: number;
constructor(options: {
maxProcesses?: number;
restartPolicy?: "always" | "on-failure" | "never";
maxRestarts?: number;
restartDelay?: number;
} = {}) {
super();
this.maxProcesses = options.maxProcesses ?? require("os").cpus().length;
this.restartPolicy = options.restartPolicy ?? "on-failure";
this.maxRestarts = options.maxRestarts ?? 5;
this.restartDelay = options.restartDelay ?? 1000;
}
// Fork a new child process
fork(modulePath: string, args: string[] = [], env: Record<string, string> = {}): number {
if (this.processes.size >= this.maxProcesses) {
throw new Error(
`Max processes (${this.maxProcesses}) reached. Cannot fork more.`
);
}
const { fork: cpFork } = require("child_process");
const child = cpFork(modulePath, args, {
env: { ...process.env, ...env },
stdio: ["pipe", "pipe", "pipe", "ipc"], // stdin, stdout, stderr, ipc
serialization: "advanced", // Use V8 serialization for structured clone
});
const descriptor: ProcessDescriptor = {
pid: child.pid!,
state: "running",
cpuTime: 0,
memoryRSS: 0,
startTime: Date.now(),
exitCode: null,
signal: null,
restarts: 0,
};
this.processes.set(child.pid!, descriptor);
this.setupIPC(child);
this.setupEventHandlers(child, modulePath, args, env);
this.emit("fork", { pid: child.pid, modulePath });
return child.pid!;
}
// Setup IPC channel for a child process
private setupIPC(child: any): void {
const channel: IPCChannel = {
send(msg: IPCMessage): boolean {
if (!child.connected) return false;
try {
child.send(msg);
return true;
} catch {
return false;
}
},
onMessage(handler: (msg: IPCMessage) => void): void {
child.on("message", handler);
},
close(): void {
if (child.connected) {
child.disconnect();
}
},
};
this.ipcChannels.set(child.pid!, channel);
}
// Setup lifecycle event handlers
private setupEventHandlers(
child: any,
modulePath: string,
args: string[],
env: Record<string, string>
): void {
child.on("exit", (code: number | null, signal: string | null) => {
const descriptor = this.processes.get(child.pid!);
if (!descriptor) return;
descriptor.state = "zombie";
descriptor.exitCode = code;
descriptor.signal = signal;
this.emit("exit", {
pid: child.pid,
code,
signal,
uptime: Date.now() - descriptor.startTime,
});
// Restart policy evaluation
this.evaluateRestart(child.pid!, code, signal, modulePath, args, env);
});
child.on("error", (err: Error) => {
this.emit("error", { pid: child.pid, error: err });
});
// Monitor memory and CPU periodically
const monitorInterval = setInterval(() => {
if (!child.connected) {
clearInterval(monitorInterval);
return;
}
try {
// ChildProcess exposes no portable CPU/memory API; a real monitor reads
// /proc/<pid>/stat or uses a library like pidusage. Fall back to zeros here
// rather than process.cpuUsage(), which measures the *parent*, not the child.
const usage = child.usage?.() ?? { user: 0, system: 0 };
const descriptor = this.processes.get(child.pid!);
if (descriptor) {
descriptor.cpuTime = (usage.user + usage.system) / 1000;
}
} catch {
clearInterval(monitorInterval);
}
}, 5000);
}
// Evaluate whether to restart a process
private evaluateRestart(
pid: number,
code: number | null,
signal: string | null,
modulePath: string,
args: string[],
env: Record<string, string>
): void {
const descriptor = this.processes.get(pid);
if (!descriptor) return;
const shouldRestart = (() => {
switch (this.restartPolicy) {
case "always":
return true;
case "on-failure":
return code !== 0;
case "never":
return false;
}
})();
if (!shouldRestart) {
this.processes.delete(pid);
this.ipcChannels.delete(pid);
return;
}
if (descriptor.restarts >= this.maxRestarts) {
this.emit("max-restarts", { pid, restarts: descriptor.restarts });
this.processes.delete(pid);
this.ipcChannels.delete(pid);
return;
}
// Exponential backoff for restarts
const delay = this.restartDelay * Math.pow(2, descriptor.restarts);
this.emit("restarting", {
pid,
attempt: descriptor.restarts + 1,
delay,
});
setTimeout(() => {
this.processes.delete(pid);
this.ipcChannels.delete(pid);
const newPid = this.fork(modulePath, args, env);
const newDescriptor = this.processes.get(newPid)!;
newDescriptor.restarts = descriptor.restarts + 1;
}, delay);
}
// Send message to a specific process
sendTo(pid: number, type: string, payload: unknown): boolean {
const channel = this.ipcChannels.get(pid);
if (!channel) return false;
return channel.send({
type,
payload,
source: process.pid,
timestamp: Date.now(),
});
}
// Broadcast message to all processes
broadcast(type: string, payload: unknown): Map<number, boolean> {
const results = new Map<number, boolean>();
for (const [pid] of this.processes) {
results.set(pid, this.sendTo(pid, type, payload));
}
return results;
}
// Graceful shutdown of a specific process
async shutdown(pid: number, timeout: number = 5000): Promise<boolean> {
const descriptor = this.processes.get(pid);
if (!descriptor || descriptor.state !== "running") return false;
// Send shutdown signal via IPC first
this.sendTo(pid, "shutdown", { deadline: Date.now() + timeout });
return new Promise<boolean>((resolve) => {
const timer = setTimeout(() => {
// Force kill if graceful shutdown times out
try {
process.kill(pid, "SIGKILL");
} catch {}
resolve(false);
}, timeout);
const onExit = (event: { pid: number }) => {
if (event.pid === pid) {
clearTimeout(timer);
this.removeListener("exit", onExit);
resolve(true);
}
};
this.on("exit", onExit);
// Send SIGTERM
try {
process.kill(pid, "SIGTERM");
} catch {
clearTimeout(timer);
resolve(false);
}
});
}
// Get status of all processes
getStatus(): ProcessDescriptor[] {
return Array.from(this.processes.values());
}
}
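The primitive our ProcessManager wraps can be seen directly. This sketch spawns a child Node process with an IPC channel in stdio slot 3 (the same `"ipc"` slot we passed to fork above) and round-trips a message; the inline `-e` script is an assumption for illustration:

```typescript
import { spawn } from "child_process";

// Spawning with "ipc" in stdio slot 3 gives the parent child.send()/on("message")
// and the child process.send() — messages are serialized over a pipe.
const child = spawn(
  process.execPath,
  ["-e", `process.on("message", (m) => process.send({ echo: m, pid: process.pid }));`],
  { stdio: ["ignore", "inherit", "inherit", "ipc"] }
);

child.on("message", (msg: any) => {
  console.log(msg.pid !== process.pid); // true — a separate process answered
  child.kill();
});
child.send("ping");
```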
The Event Loop: How Single-Threaded Servers Work
Node.js runs JavaScript on a single thread but handles thousands of concurrent connections. The secret is the event loop backed by libuv:
┌─────────────────────────────────────────────────────────┐
│ NODE.JS EVENT LOOP │
│ │
│ ┌─── Phase 1 ──────────────────────────────────────┐ │
│ │ TIMERS │ │
│ │ Execute setTimeout() and setInterval() callbacks │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 2 ──────────────────────────────────────┐ │
│ │ PENDING CALLBACKS │ │
│ │ I/O callbacks deferred to next loop iteration │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 3 ──────────────────────────────────────┐ │
│ │ IDLE / PREPARE │ │
│ │ Internal use only (libuv internals) │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 4 ──────────────────────────────────────┐ │
│ │ POLL │ │
│ │ Retrieve new I/O events │ │
│ │ Execute I/O callbacks (almost all except close) │ │
│ │ *** Will block here when no work pending *** │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 5 ──────────────────────────────────────┐ │
│ │ CHECK │ │
│ │ Execute setImmediate() callbacks │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 6 ──────────────────────────────────────┐ │
│ │ CLOSE CALLBACKS │ │
│ │ socket.on('close', ...) etc. │ │
│ └──────────────────────────┬───────────────────────┘ │
│ │ │
│ ◄──── loop back ──────┘ │
│ │
│ Between EVERY phase: │
│ ┌──────────────────────────────────────────────────┐ │
│ │ MICROTASK QUEUE │ │
│ │ process.nextTick() → Promise callbacks → ... │ │
│ └──────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Let's build our own event loop to understand the mechanics:
// ─── Event Loop Implementation ────────────────────────────────
type TimerEntry = {
id: number;
callback: () => void;
executeAt: number;
interval: number | null; // null for setTimeout, ms for setInterval
};
type IOCallback = {
fd: number;
event: "readable" | "writable" | "error";
callback: () => void;
};
type MicrotaskEntry = {
callback: () => void;
priority: "nextTick" | "promise";
};
class EventLoop {
private timers: TimerEntry[] = [];
private pendingCallbacks: Array<() => void> = [];
private ioWatchers: Map<number, IOCallback[]> = new Map();
private checkCallbacks: Array<() => void> = [];
private closeCallbacks: Array<() => void> = [];
private microtaskQueue: MicrotaskEntry[] = [];
private nextTimerId = 1;
private running = false;
private iteration = 0;
// ─── Timer Registration ──────────────────────────────────
setTimeout(callback: () => void, delay: number): number {
const id = this.nextTimerId++;
this.timers.push({
id,
callback,
executeAt: Date.now() + delay,
interval: null,
});
this.timers.sort((a, b) => a.executeAt - b.executeAt);
return id;
}
setInterval(callback: () => void, interval: number): number {
const id = this.nextTimerId++;
this.timers.push({
id,
callback,
executeAt: Date.now() + interval,
interval,
});
this.timers.sort((a, b) => a.executeAt - b.executeAt);
return id;
}
clearTimer(id: number): void {
this.timers = this.timers.filter((t) => t.id !== id);
}
// ─── Microtask Registration ──────────────────────────────
nextTick(callback: () => void): void {
this.microtaskQueue.push({ callback, priority: "nextTick" });
}
queueMicrotask(callback: () => void): void {
this.microtaskQueue.push({ callback, priority: "promise" });
}
// ─── I/O Registration ───────────────────────────────────
watchFD(fd: number, event: "readable" | "writable" | "error", callback: () => void): void {
if (!this.ioWatchers.has(fd)) {
this.ioWatchers.set(fd, []);
}
this.ioWatchers.get(fd)!.push({ fd, event, callback });
}
// ─── setImmediate equivalent ─────────────────────────────
setImmediate(callback: () => void): void {
this.checkCallbacks.push(callback);
}
// ─── Close callback registration ────────────────────────
onClose(callback: () => void): void {
this.closeCallbacks.push(callback);
}
// ─── Drain microtask queue ──────────────────────────────
private drainMicrotasks(): void {
// nextTick has higher priority than promise callbacks
// Process all nextTick first, then all promise callbacks
// But new callbacks added during execution are also processed
while (this.microtaskQueue.length > 0) {
// Sort: nextTick before promise
this.microtaskQueue.sort((a, b) => {
if (a.priority === "nextTick" && b.priority === "promise") return -1;
if (a.priority === "promise" && b.priority === "nextTick") return 1;
return 0; // preserve insertion order within same priority
});
const entry = this.microtaskQueue.shift()!;
try {
entry.callback();
} catch (err) {
console.error("Microtask error:", err);
}
}
}
// ─── Phase 1: Timers ────────────────────────────────────
private processTimers(): void {
const now = Date.now();
const expired: TimerEntry[] = [];
const remaining: TimerEntry[] = [];
for (const timer of this.timers) {
if (timer.executeAt <= now) {
expired.push(timer);
} else {
remaining.push(timer);
}
}
this.timers = remaining;
for (const timer of expired) {
try {
timer.callback();
} catch (err) {
console.error("Timer callback error:", err);
}
// Re-schedule intervals
if (timer.interval !== null) {
this.timers.push({
...timer,
executeAt: now + timer.interval,
});
}
}
this.timers.sort((a, b) => a.executeAt - b.executeAt);
}
// ─── Phase 2: Pending Callbacks ─────────────────────────
private processPendingCallbacks(): void {
const callbacks = this.pendingCallbacks.splice(0);
for (const cb of callbacks) {
try {
cb();
} catch (err) {
console.error("Pending callback error:", err);
}
}
}
// ─── Phase 4: Poll ──────────────────────────────────────
private poll(): void {
// In a real implementation, this would call epoll_wait/kqueue/IOCP
// and block until I/O events arrive or a timer expires
// Simulate: check all watched FDs for readiness
for (const [fd, watchers] of this.ioWatchers) {
for (const watcher of watchers) {
// In reality, this would check epoll readiness
// For simulation, we'll process any registered callbacks
try {
watcher.callback();
} catch (err) {
this.pendingCallbacks.push(() => {
console.error(`IO error on fd ${fd}:`, err);
});
}
}
}
}
// ─── Phase 5: Check ─────────────────────────────────────
private processCheck(): void {
const callbacks = this.checkCallbacks.splice(0);
for (const cb of callbacks) {
try {
cb();
} catch (err) {
console.error("Check callback error:", err);
}
}
}
// ─── Phase 6: Close Callbacks ───────────────────────────
private processCloseCallbacks(): void {
const callbacks = this.closeCallbacks.splice(0);
for (const cb of callbacks) {
try {
cb();
} catch (err) {
console.error("Close callback error:", err);
}
}
}
// ─── Has pending work? ──────────────────────────────────
private hasPendingWork(): boolean {
return (
this.timers.length > 0 ||
this.pendingCallbacks.length > 0 ||
this.ioWatchers.size > 0 ||
this.checkCallbacks.length > 0 ||
this.closeCallbacks.length > 0 ||
this.microtaskQueue.length > 0
);
}
// ─── Main Loop ──────────────────────────────────────────
async run(): Promise<void> {
this.running = true;
while (this.running && this.hasPendingWork()) {
this.iteration++;
// Phase 1: Timers
this.processTimers();
this.drainMicrotasks();
// Phase 2: Pending callbacks
this.processPendingCallbacks();
this.drainMicrotasks();
// Phase 3: Idle/Prepare (internal, skip)
// Phase 4: Poll
this.poll();
this.drainMicrotasks();
// Phase 5: Check (setImmediate)
this.processCheck();
this.drainMicrotasks();
// Phase 6: Close callbacks
this.processCloseCallbacks();
this.drainMicrotasks();
// Yield to prevent starving the actual event loop
// (since we're running inside Node.js)
await new Promise((resolve) => global.setTimeout(resolve, 0));
}
this.running = false;
}
stop(): void {
this.running = false;
}
getStats(): {
iteration: number;
timers: number;
ioWatchers: number;
microtasks: number;
} {
return {
iteration: this.iteration,
timers: this.timers.length,
ioWatchers: this.ioWatchers.size,
microtasks: this.microtaskQueue.length,
};
}
}
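The phase ordering our loop models has a famous observable consequence in real Node: scheduled from inside a poll-phase (I/O) callback, setImmediate() deterministically beats a 0 ms timer, because the loop proceeds from poll to check before wrapping back to timers. (Scheduled from the top level of a script, the order is nondeterministic.)

```typescript
import { stat } from "fs";

const order: string[] = [];

// stat's completion callback runs in the poll phase; check (setImmediate)
// comes next, and the timers phase only runs on the following iteration.
stat(".", () => {
  setTimeout(() => order.push("timeout"), 0);
  setImmediate(() => order.push("immediate"));
  setTimeout(() => console.log(order), 20); // ["immediate", "timeout"]
});
```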
Thread Pool: How libuv Offloads Blocking Work
Node.js uses a libuv thread pool (default 4 threads, configurable via UV_THREADPOOL_SIZE up to 1024) for operations that can't be done asynchronously at the OS level — file system calls, dns.lookup(), zlib compression, and CPU-heavy crypto such as pbkdf2 and scrypt:
┌───────────────────────────────────────────────────────────┐
│ MAIN THREAD │
│ │
│ JavaScript ──▶ N-API ──▶ libuv │
│ │ │
│ ┌─────────┴─────────┐ │
│ │ Is it async at OS? │ │
│ └─────────┬─────────┘ │
│ YES │ │ NO │
│ ▼ ▼ │
│ ┌─────────┐ ┌────────────────────┐ │
│ │ epoll / │ │ THREAD POOL │ │
│ │ kqueue │ │ (UV_THREADPOOL_SIZE)│ │
│ │ │ │ │ │
│ │ Network │ │ ┌───┐ ┌───┐ ┌───┐ │ │
│ │ sockets │ │ │ W1│ │ W2│ │ W3│ │ │
│ │ pipes │ │ └───┘ └───┘ └───┘ │ │
│ │ signals │ │ ┌───┐ │ │
│ │ │ │ │ W4│ (default 4) │ │
│ └─────────┘ │ └───┘ │ │
│ │ │ │
│ │ FS ops, DNS lookup, │ │
│ │ compression, crypto │ │
│ └─────────────────────┘ │
└───────────────────────────────────────────────────────────┘
Let's build a thread pool from scratch. One caveat up front: plain JavaScript can't run two functions in parallel, so the "workers" below are cooperative async slots — what we're modeling is the queueing, priority, timeout, and draining logic; real parallelism would come from worker_threads:
// ─── Task Queue for Thread Pool ───────────────────────────────
type TaskPriority = "high" | "normal" | "low";
interface WorkerTask<T = unknown> {
id: string;
fn: () => T | Promise<T>;
priority: TaskPriority;
timeout: number;
enqueuedAt: number;
resolve: (value: T) => void;
reject: (error: Error) => void;
}
interface WorkerStats {
id: number;
state: "idle" | "busy" | "draining";
tasksCompleted: number;
totalBusyTime: number;
currentTaskId: string | null;
currentTaskStartedAt: number | null;
}
// ─── Thread Pool Implementation ──────────────────────────────
class ThreadPool {
private taskQueue: WorkerTask[] = [];
private workers: WorkerStats[] = [];
private activeWorkers: Map<number, { task: WorkerTask; abortController: AbortController }> = new Map();
private poolSize: number;
private maxQueueSize: number;
private draining = false;
constructor(options: {
poolSize?: number;
maxQueueSize?: number;
} = {}) {
this.poolSize = options.poolSize ?? 4;
this.maxQueueSize = options.maxQueueSize ?? 1024;
// Initialize worker descriptors
for (let i = 0; i < this.poolSize; i++) {
this.workers.push({
id: i,
state: "idle",
tasksCompleted: 0,
totalBusyTime: 0,
currentTaskId: null,
currentTaskStartedAt: null,
});
}
}
// Submit a task to the pool
submit<T>(
fn: () => T | Promise<T>,
options: {
priority?: TaskPriority;
timeout?: number;
} = {}
): Promise<T> {
if (this.draining) {
return Promise.reject(new Error("Thread pool is draining"));
}
if (this.taskQueue.length >= this.maxQueueSize) {
return Promise.reject(new Error("Task queue is full"));
}
return new Promise<T>((resolve, reject) => {
const task: WorkerTask<T> = {
id: `task_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`,
fn,
priority: options.priority ?? "normal",
timeout: options.timeout ?? 30000,
enqueuedAt: Date.now(),
resolve: resolve as (value: unknown) => void,
reject,
};
// Insert based on priority
this.insertByPriority(task);
// Try to schedule immediately
this.scheduleNext();
});
}
// Priority-based insertion
private insertByPriority(task: WorkerTask): void {
const priorityOrder: Record<TaskPriority, number> = {
high: 0,
normal: 1,
low: 2,
};
const taskPriority = priorityOrder[task.priority];
let insertIndex = this.taskQueue.length;
for (let i = 0; i < this.taskQueue.length; i++) {
if (priorityOrder[this.taskQueue[i].priority] > taskPriority) {
insertIndex = i;
break;
}
}
this.taskQueue.splice(insertIndex, 0, task);
}
// Schedule next task on an idle worker
private scheduleNext(): void {
if (this.taskQueue.length === 0) return;
const idleWorker = this.workers.find((w) => w.state === "idle");
if (!idleWorker) return;
const task = this.taskQueue.shift()!;
this.executeOnWorker(idleWorker, task);
}
// Execute task on a specific worker
private async executeOnWorker(worker: WorkerStats, task: WorkerTask): Promise<void> {
worker.state = "busy";
worker.currentTaskId = task.id;
worker.currentTaskStartedAt = Date.now();
const abortController = new AbortController();
this.activeWorkers.set(worker.id, { task, abortController });
// Setup timeout. Note: aborting rejects the caller's promise, but JavaScript
// can't preempt a running task — the function itself still runs to completion
const timeoutHandle = setTimeout(() => {
abortController.abort();
}, task.timeout);
try {
const result = await Promise.race([
// Actual task execution
(async () => {
return await task.fn();
})(),
// Abort signal
new Promise<never>((_, reject) => {
abortController.signal.addEventListener("abort", () => {
reject(new Error(`Task ${task.id} timed out after ${task.timeout}ms`));
});
}),
]);
task.resolve(result);
} catch (error) {
task.reject(error instanceof Error ? error : new Error(String(error)));
} finally {
clearTimeout(timeoutHandle);
const busyTime = Date.now() - (worker.currentTaskStartedAt ?? Date.now());
worker.totalBusyTime += busyTime;
worker.tasksCompleted++;
worker.state = this.draining ? "draining" : "idle";
worker.currentTaskId = null;
worker.currentTaskStartedAt = null;
this.activeWorkers.delete(worker.id);
// Schedule next task if not draining
if (!this.draining) {
this.scheduleNext();
}
}
}
// Drain the pool: finish current tasks, reject queued tasks
async drain(): Promise<void> {
this.draining = true;
// Reject all queued tasks
for (const task of this.taskQueue) {
task.reject(new Error("Thread pool is draining"));
}
this.taskQueue = [];
// Wait for active tasks to complete
while (this.activeWorkers.size > 0) {
await new Promise((resolve) => setTimeout(resolve, 100));
}
for (const worker of this.workers) {
worker.state = "draining";
}
}
// Get pool statistics
getStats(): {
poolSize: number;
busyWorkers: number;
idleWorkers: number;
queuedTasks: number;
workers: WorkerStats[];
} {
return {
poolSize: this.poolSize,
busyWorkers: this.workers.filter((w) => w.state === "busy").length,
idleWorkers: this.workers.filter((w) => w.state === "idle").length,
queuedTasks: this.taskQueue.length,
workers: [...this.workers],
};
}
}
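You can watch the real libuv pool at work. crypto.pbkdf2 is dispatched to the thread pool, so firing 8 CPU-bound hashes at the default 4 threads should complete them in roughly two waves — the completion timestamps cluster in pairs of ~4 (exact timings vary by machine):

```typescript
import { pbkdf2 } from "crypto";

// 8 CPU-bound hashes on a 4-thread pool: only 4 run at once, the rest queue.
const start = Date.now();
const completions: number[] = [];

for (let i = 0; i < 8; i++) {
  pbkdf2("secret", "salt", 100_000, 64, "sha512", (err) => {
    if (err) throw err;
    completions.push(Date.now() - start);
    if (completions.length === 8) console.log(completions);
  });
}
```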
Work-Stealing Scheduler
Go's runtime and many high-performance servers use work-stealing schedulers. Each thread has its own local queue, and idle threads "steal" work from busy threads:
┌──────────────────────────────────────────────────────────────┐
│ WORK-STEALING SCHEDULER │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Processor P0 │ │ Processor P1 │ │
│ │ │ │ │ │
│ │ Local Queue: │ │ Local Queue: │ │
│ │ ┌──┬──┬──┬──┐ │ │ ┌──┐ │ │
│ │ │G1│G2│G3│G4│ │ │ │G5│ ← only 1 │ │
│ │ └──┴──┴──┴──┘ │ │ └──┘ │ │
│ │ ▲ push/pop │ │ │ │ │
│ │ │ (LIFO) │ │ │ idle! │ │
│ └──┼───────────────┘ └───────┼───────────┘ │
│ │ │ │
│ │ ┌──────▼──────┐ │
│ │ │ STEAL from │ │
│ │◄────────────────────│ P0's queue │ │
│ │ steal G1 (FIFO) │ (bottom) │ │
│ │ └─────────────┘ │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ GLOBAL RUN QUEUE │ │
│ │ ┌──┬──┬──┬──┬──┐ │ │
│ │ │G6│G7│G8│G9│..│ ← overflow │ │
│ │ └──┴──┴──┴──┴──┘ from locals │ │
│ └─────────────────────────────────────┘ │
│ │
│ Steal Order: 1. Local queue (LIFO) │
│ 2. Global queue (FIFO) │
│ 3. Other processor's queue (FIFO from bottom) │
│ 4. Network poller │
└──────────────────────────────────────────────────────────────┘
// ─── Work-Stealing Scheduler ──────────────────────────────────
interface Goroutine {
id: number;
fn: () => Promise<void>;
state: "runnable" | "running" | "blocked" | "dead";
processor: number | null;
createdAt: number;
totalRunTime: number;
}
// Deque (double-ended queue) for local run queues
class Deque<T> {
private items: T[] = [];
pushBack(item: T): void {
this.items.push(item);
}
popBack(): T | undefined {
return this.items.pop();
}
popFront(): T | undefined {
return this.items.shift();
}
stealHalf(): T[] {
const half = Math.ceil(this.items.length / 2);
return this.items.splice(0, half);
}
get length(): number {
return this.items.length;
}
isEmpty(): boolean {
return this.items.length === 0;
}
}
interface Processor {
id: number;
localQueue: Deque<Goroutine>;
currentGoroutine: Goroutine | null;
state: "running" | "idle" | "syscall" | "stopped";
stealAttempts: number;
successfulSteals: number;
tasksExecuted: number;
}
class WorkStealingScheduler {
private processors: Processor[] = [];
private globalQueue: Goroutine[] = [];
private allGoroutines: Map<number, Goroutine> = new Map();
private nextGoroutineId = 1;
private numProcessors: number;
private maxLocalQueueSize = 256;
private running = false;
private scheduleCount = 0;
constructor(numProcessors: number = 4) {
this.numProcessors = numProcessors;
for (let i = 0; i < numProcessors; i++) {
this.processors.push({
id: i,
localQueue: new Deque<Goroutine>(),
currentGoroutine: null,
state: "idle",
stealAttempts: 0,
successfulSteals: 0,
tasksExecuted: 0,
});
}
}
// Spawn a new goroutine (submit work)
spawn(fn: () => Promise<void>): number {
const g: Goroutine = {
id: this.nextGoroutineId++,
fn,
state: "runnable",
processor: null,
createdAt: Date.now(),
totalRunTime: 0,
};
this.allGoroutines.set(g.id, g);
// Place on the least-loaded processor's local queue (Go itself uses the current P's run queue)
const currentP = this.findLeastLoadedProcessor();
if (currentP && currentP.localQueue.length < this.maxLocalQueueSize) {
currentP.localQueue.pushBack(g);
g.processor = currentP.id;
} else {
// Overflow to global queue
this.globalQueue.push(g);
}
return g.id;
}
private findLeastLoadedProcessor(): Processor | null {
let min = Infinity;
let result: Processor | null = null;
for (const p of this.processors) {
if (p.localQueue.length < min) {
min = p.localQueue.length;
result = p;
}
}
return result;
}
// Find next goroutine for a processor
private findWork(p: Processor): Goroutine | null {
this.scheduleCount++;
// Every 61st schedule, check global queue first to prevent starvation
if (this.scheduleCount % 61 === 0 && this.globalQueue.length > 0) {
const g = this.globalQueue.shift()!;
// Also grab a batch from global queue for local
const batchSize = Math.min(
this.globalQueue.length,
Math.floor(this.maxLocalQueueSize / 2)
);
for (let i = 0; i < batchSize; i++) {
p.localQueue.pushBack(this.globalQueue.shift()!);
}
return g;
}
// 1. Check local queue (LIFO — pop from back)
const local = p.localQueue.popBack();
if (local) return local;
// 2. Check global queue
if (this.globalQueue.length > 0) {
const g = this.globalQueue.shift()!;
// Grab batch
const batchSize = Math.min(
this.globalQueue.length,
Math.floor(this.maxLocalQueueSize / 2)
);
for (let i = 0; i < batchSize; i++) {
p.localQueue.pushBack(this.globalQueue.shift()!);
}
return g;
}
// 3. Steal from other processors
return this.stealFromOthers(p);
}
// Steal work from another processor's queue
private stealFromOthers(thief: Processor): Goroutine | null {
// Randomize start to avoid all processors stealing from the same target
const start = Math.floor(Math.random() * this.numProcessors);
for (let i = 0; i < this.numProcessors; i++) {
const targetIdx = (start + i) % this.numProcessors;
const target = this.processors[targetIdx];
if (target.id === thief.id) continue;
if (target.localQueue.isEmpty()) continue;
thief.stealAttempts++;
// Steal half from the front (FIFO — oldest tasks)
const stolen = target.localQueue.stealHalf();
if (stolen.length === 0) continue;
thief.successfulSteals++;
// Take first stolen item as work, put rest in local queue
const work = stolen[0];
for (let j = 1; j < stolen.length; j++) {
thief.localQueue.pushBack(stolen[j]);
}
return work;
}
return null;
}
// Run the scheduler
async run(): Promise<void> {
this.running = true;
// Run each processor as a concurrent "thread"
const processorLoops = this.processors.map((p) => this.processorLoop(p));
await Promise.all(processorLoops);
}
private async processorLoop(p: Processor): Promise<void> {
p.state = "running";
while (this.running) {
const g = this.findWork(p);
if (!g) {
p.state = "idle";
// Park briefly before looking for work again
await new Promise((resolve) => setTimeout(resolve, 1));
// Check if all work is done
if (this.isAllWorkDone()) {
break;
}
continue;
}
p.state = "running";
p.currentGoroutine = g;
g.state = "running";
g.processor = p.id;
const startTime = Date.now();
try {
await g.fn();
g.state = "dead";
} catch (err) {
console.error(`Goroutine ${g.id} panicked:`, err);
g.state = "dead";
}
g.totalRunTime += Date.now() - startTime;
p.currentGoroutine = null;
p.tasksExecuted++;
}
p.state = "stopped";
}
private isAllWorkDone(): boolean {
if (this.globalQueue.length > 0) return false;
for (const p of this.processors) {
if (!p.localQueue.isEmpty()) return false;
if (p.currentGoroutine !== null) return false;
}
return true;
}
stop(): void {
this.running = false;
}
getStats(): {
processors: Array<{
id: number;
state: string;
localQueueSize: number;
tasksExecuted: number;
stealAttempts: number;
successfulSteals: number;
}>;
globalQueueSize: number;
totalGoroutines: number;
completedGoroutines: number;
} {
return {
processors: this.processors.map((p) => ({
id: p.id,
state: p.state,
localQueueSize: p.localQueue.length,
tasksExecuted: p.tasksExecuted,
stealAttempts: p.stealAttempts,
successfulSteals: p.successfulSteals,
})),
globalQueueSize: this.globalQueue.length,
totalGoroutines: this.allGoroutines.size,
completedGoroutines: Array.from(this.allGoroutines.values()).filter(
(g) => g.state === "dead"
).length,
};
}
}
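The stealing interaction at the heart of the scheduler can be distilled to a two-"processor" sketch (an illustration, not Go's actual algorithm: cooperative async loops stand in for OS threads). P1 starts with all the work; P0's queue is empty, so it steals the front half of P1's queue and both drain in parallel:

```typescript
type Task = { id: number };

const queues: Task[][] = [[], []];
const executedBy: number[][] = [[], []];

for (let i = 0; i < 8; i++) queues[1].push({ id: i });

async function processor(id: number): Promise<void> {
  const victim = 1 - id;
  while (true) {
    let task = queues[id].pop(); // local queue: LIFO
    if (!task) {
      // Steal half of the victim's queue from the front (its oldest tasks)
      const stolen = queues[victim].splice(0, Math.ceil(queues[victim].length / 2));
      if (stolen.length === 0) break; // no work anywhere: stop
      queues[id].push(...stolen);
      task = queues[id].pop()!;
    }
    executedBy[id].push(task.id);
    await new Promise((r) => setImmediate(r)); // yield, like a preempted thread
  }
}

const done = Promise.all([processor(0), processor(1)]);
done.then(() => console.log(executedBy.map((q) => q.length))); // all 8 ran, split across both
```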
Node.js Cluster Mode: Multi-Process Architecture
The cluster module forks multiple processes that share the same server port:
┌──────────────────────────────────────────────────────────────────┐
│ CLUSTER ARCHITECTURE │
│ │
│ ┌──────────────────────────────────────┐ │
│ │ MASTER PROCESS │ │
│ │ │ │
│ │ ┌────────────────────────────────┐ │ │
│ │ │ Load Balancer (Round-Robin) │ │ │
│ │ │ OR OS-level (SO_REUSEPORT) │ │ │
│ │ └────────────┬───────────────────┘ │ │
│ │ │ │ │
│ │ Listening on port 3000 │ │
│ └───────────────┼──────────────────────┘ │
│ │ │
│ ┌──────────────┼──────────────┬──────────────┐ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │Worker│ │Worker│ │Worker│ │Worker│ │
│ │ #1 │ │ #2 │ │ #3 │ │ #4 │ │
│ │ │ │ │ │ │ │ │ │
│ │ pid: │ │ pid: │ │ pid: │ │ pid: │ │
│ │ 1001 │ │ 1002 │ │ 1003 │ │ 1004 │ │
│ │ │ │ │ │ │ │ │ │
│ │ Own │ │ Own │ │ Own │ │ Own │ │
│ │ V8 │ │ V8 │ │ V8 │ │ V8 │ │
│ │ heap │ │ heap │ │ heap │ │ heap │ │
│ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ │
│ │ │ │ │ │
│ └─────┬─────┴─────┬─────┘ │ │
│ │ │ │ │
│ ┌─────▼─────┐ ┌──▼──────────┐ ┌──▼──────┐ │
│ │ IPC Pipe │ │ Shared FD │ │ Signals │ │
│ │ (messages)│ │ (SO_REUSEPORT)│ │ SIGTERM │ │
│ └───────────┘ └─────────────┘ └─────────┘ │
└──────────────────────────────────────────────────────────────────┘
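The diagram maps onto a few lines of Node's real cluster API. Here is the minimal skeleton as a sketch (the port, worker count, and restart policy are illustrative), which the manager below simulates rather than calls:

```typescript
import cluster from "node:cluster";
import * as os from "node:os";
import * as http from "node:http";

// Minimal real cluster: the primary forks one worker per CPU; each
// worker calls listen() on the same port, and the primary distributes
// accepted connections to workers (round-robin by default on Linux).
function startCluster(port: number): void {
  if (cluster.isPrimary) {
    for (let i = 0; i < os.cpus().length; i++) {
      cluster.fork(); // child_process fork + IPC channel under the hood
    }
    // Replace workers that die with a non-zero exit code
    cluster.on("exit", (_worker, code) => {
      if (code !== 0) cluster.fork();
    });
  } else {
    http
      .createServer((_req, res) => res.end(`handled by pid ${process.pid}`))
      .listen(port);
  }
}
```

Everything the manager below adds (health checks, draining, rolling restarts) layers on top of exactly this skeleton.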
Let's build a cluster manager that models the production control plane. The fork itself is simulated here; in a real deployment you would swap in child_process.fork():
// ─── Cluster Manager ──────────────────────────────────────────
import { EventEmitter } from "events";
interface WorkerInfo {
id: number;
pid: number;
state: "online" | "listening" | "disconnected" | "dead";
connections: number;
requestsHandled: number;
memoryUsage: number;
startTime: number;
lastHealthCheck: number;
healthStatus: "healthy" | "unhealthy" | "unknown";
}
interface ClusterConfig {
workers: number;
schedulingPolicy: "round-robin" | "os-level";
healthCheckInterval: number;
healthCheckTimeout: number;
maxMemoryPerWorker: number; // bytes
gracefulShutdownTimeout: number;
rollingRestartDelay: number;
}
class ClusterManager extends EventEmitter {
private config: ClusterConfig;
private workers: Map<number, WorkerInfo> = new Map();
private roundRobinIndex = 0;
private shuttingDown = false;
private healthCheckTimer: ReturnType<typeof setInterval> | null = null;
constructor(config: Partial<ClusterConfig> = {}) {
super();
this.config = {
workers: config.workers ?? require("os").cpus().length,
schedulingPolicy: config.schedulingPolicy ?? "round-robin",
healthCheckInterval: config.healthCheckInterval ?? 10000,
healthCheckTimeout: config.healthCheckTimeout ?? 5000,
maxMemoryPerWorker: config.maxMemoryPerWorker ?? 512 * 1024 * 1024,
gracefulShutdownTimeout: config.gracefulShutdownTimeout ?? 30000,
rollingRestartDelay: config.rollingRestartDelay ?? 2000,
};
}
// Initialize cluster
async start(): Promise<void> {
console.log(`Starting cluster with ${this.config.workers} workers`);
console.log(`Scheduling policy: ${this.config.schedulingPolicy}`);
// Fork initial workers
for (let i = 0; i < this.config.workers; i++) {
await this.forkWorker(i);
}
// Start health checking
this.startHealthChecks();
this.emit("cluster:ready", {
workers: this.config.workers,
pids: Array.from(this.workers.values()).map((w) => w.pid),
});
}
// Fork a new worker
private async forkWorker(id: number): Promise<WorkerInfo> {
// In production, this would use child_process.fork()
// Simulating the worker info structure
const info: WorkerInfo = {
id,
pid: process.pid + id + 1, // simulated PID
state: "online",
connections: 0,
requestsHandled: 0,
memoryUsage: 0,
startTime: Date.now(),
lastHealthCheck: Date.now(),
healthStatus: "unknown",
};
this.workers.set(id, info);
// Simulate worker becoming ready
await new Promise((resolve) => setTimeout(resolve, 100));
info.state = "listening";
info.healthStatus = "healthy";
this.emit("worker:online", { id, pid: info.pid });
return info;
}
// Round-robin request distribution
getNextWorker(): WorkerInfo | null {
const healthyWorkers = Array.from(this.workers.values()).filter(
(w) => w.state === "listening" && w.healthStatus === "healthy"
);
if (healthyWorkers.length === 0) return null;
const worker = healthyWorkers[this.roundRobinIndex % healthyWorkers.length];
this.roundRobinIndex++;
return worker;
}
// Least-connections routing
getLeastConnectionsWorker(): WorkerInfo | null {
const healthyWorkers = Array.from(this.workers.values()).filter(
(w) => w.state === "listening" && w.healthStatus === "healthy"
);
if (healthyWorkers.length === 0) return null;
return healthyWorkers.reduce((min, worker) =>
worker.connections < min.connections ? worker : min
);
}
// Health check loop
private startHealthChecks(): void {
this.healthCheckTimer = setInterval(async () => {
for (const [id, worker] of this.workers) {
if (worker.state === "dead" || worker.state === "disconnected") continue;
try {
const healthy = await this.checkWorkerHealth(worker);
worker.lastHealthCheck = Date.now();
if (!healthy) {
worker.healthStatus = "unhealthy";
this.emit("worker:unhealthy", { id, pid: worker.pid });
// Replace unhealthy worker
if (!this.shuttingDown) {
await this.replaceWorker(id);
}
} else {
worker.healthStatus = "healthy";
}
// Check memory limits
if (worker.memoryUsage > this.config.maxMemoryPerWorker) {
this.emit("worker:memory-limit", {
id,
usage: worker.memoryUsage,
limit: this.config.maxMemoryPerWorker,
});
if (!this.shuttingDown) {
await this.replaceWorker(id);
}
}
} catch (err) {
worker.healthStatus = "unhealthy";
}
}
}, this.config.healthCheckInterval);
}
// Check individual worker health
private async checkWorkerHealth(worker: WorkerInfo): Promise<boolean> {
// In production: send IPC health check message and wait for response
// Check if worker responds within timeout
return new Promise<boolean>((resolve) => {
const timeout = setTimeout(() => resolve(false), this.config.healthCheckTimeout);
// Simulate health check via IPC
// In real implementation: worker.process.send({ type: 'health-check' })
setTimeout(() => {
clearTimeout(timeout);
resolve(worker.state === "listening");
}, 50);
});
}
// Replace a worker (graceful)
private async replaceWorker(id: number): Promise<void> {
const oldWorker = this.workers.get(id);
if (!oldWorker) return;
this.emit("worker:replacing", { id, pid: oldWorker.pid });
// 1. Fork new worker first (zero-downtime)
const newWorker = await this.forkWorker(id + 1000); // temp ID
// 2. Stop sending traffic to old worker
oldWorker.state = "disconnected";
// 3. Wait for existing connections to drain
await this.drainConnections(oldWorker);
// 4. Clean up old worker
oldWorker.state = "dead";
this.workers.delete(id);
// 5. Reassign new worker to proper ID
this.workers.delete(newWorker.id);
newWorker.id = id;
this.workers.set(id, newWorker);
this.emit("worker:replaced", { id, newPid: newWorker.pid });
}
// Drain connections from a worker
private async drainConnections(worker: WorkerInfo): Promise<void> {
const deadline = Date.now() + this.config.gracefulShutdownTimeout;
while (worker.connections > 0 && Date.now() < deadline) {
await new Promise((resolve) => setTimeout(resolve, 100));
}
if (worker.connections > 0) {
this.emit("worker:force-kill", {
id: worker.id,
remainingConnections: worker.connections,
});
}
}
// Rolling restart all workers
async rollingRestart(): Promise<void> {
const workerIds = Array.from(this.workers.keys());
for (const id of workerIds) {
if (this.shuttingDown) break;
this.emit("rolling-restart:worker", { id, total: workerIds.length });
await this.replaceWorker(id);
await new Promise((resolve) =>
setTimeout(resolve, this.config.rollingRestartDelay)
);
}
this.emit("rolling-restart:complete");
}
// Graceful shutdown of entire cluster
async shutdown(): Promise<void> {
this.shuttingDown = true;
if (this.healthCheckTimer) {
clearInterval(this.healthCheckTimer);
}
this.emit("cluster:shutting-down");
// Stop all workers gracefully
const shutdownPromises = Array.from(this.workers.entries()).map(
async ([id, worker]) => {
worker.state = "disconnected";
await this.drainConnections(worker);
worker.state = "dead";
this.emit("worker:stopped", { id, pid: worker.pid });
}
);
await Promise.all(shutdownPromises);
this.workers.clear();
this.emit("cluster:shutdown");
}
// Get cluster status
getStatus(): {
totalWorkers: number;
healthyWorkers: number;
totalConnections: number;
totalRequests: number;
workers: WorkerInfo[];
} {
const workerList = Array.from(this.workers.values());
return {
totalWorkers: workerList.length,
healthyWorkers: workerList.filter((w) => w.healthStatus === "healthy").length,
totalConnections: workerList.reduce((sum, w) => sum + w.connections, 0),
totalRequests: workerList.reduce((sum, w) => sum + w.requestsHandled, 0),
workers: workerList,
};
}
}
Worker Threads: Shared Memory Concurrency in Node.js
Worker threads (Node.js worker_threads) run in the same process but with separate V8 isolates, allowing true parallelism:
┌──────────────────────────────────────────────────────────────┐
│ SINGLE PROCESS │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ SHARED MEMORY (SharedArrayBuffer) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ 0x00 0x04 0x08 0x0C 0x10 ... │ │ │
│ │ │ [counter] [flag] [lock] [data] [...] │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────┬───────────┬───────────┬───────────────────┘ │
│ │ │ │ │
│ ┌────────────▼───┐ ┌───▼────────┐ ┌▼──────────────┐ │
│ │ Main Thread │ │ Worker #1 │ │ Worker #2 │ │
│ │ │ │ │ │ │ │
│ │ V8 Isolate A │ │ V8 Isolate B│ │ V8 Isolate C │ │
│ │ │ │ │ │ │ │
│ │ Own heap │ │ Own heap │ │ Own heap │ │
│ │ Own stack │ │ Own stack │ │ Own stack │ │
│ │ Own event loop │ │ Own evloop │ │ Own evloop │ │
│ │ │ │ │ │ │ │
│ │ Atomics.wait() │ │ Atomics │ │ Atomics │ │
│ │ Atomics.notify()│ │ operations │ │ operations │ │
│ └────────────────┘ └────────────┘ └───────────────┘ │
│ │
│ Communication: │
│ ├── SharedArrayBuffer (zero-copy, requires Atomics) │
│ ├── MessagePort (structured clone, ownership transfer) │
│ └── BroadcastChannel (pub/sub between workers) │
└──────────────────────────────────────────────────────────────┘
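Before building abstractions on top, it helps to see the raw worker_threads plus SharedArrayBuffer flow. This sketch uses an inline eval-mode worker (an assumption made so the example stays in one file):

```typescript
import { Worker } from "node:worker_threads";

// The worker receives the SharedArrayBuffer via workerData (zero-copy:
// both threads see the same memory), bumps slot 0 atomically, and
// reports the new value back over the MessagePort.
function runWorker(shared: Int32Array): Promise<number> {
  const workerSource = `
    const { parentPort, workerData } = require("node:worker_threads");
    const view = new Int32Array(workerData);
    Atomics.add(view, 0, 1);                 // visible to the main thread
    parentPort.postMessage(Atomics.load(view, 0));
  `;
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: shared.buffer, // SharedArrayBuffer is shared, not copied
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}
```

Note that a regular ArrayBuffer passed the same way would be cloned; only SharedArrayBuffer gives both isolates the same backing memory.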
// ─── Worker Thread Pool with Shared Memory ────────────────────
interface SharedMemoryLayout {
// Offset 0: Lock (Mutex)
LOCK_OFFSET: 0;
// Offset 4: Task counter
TASK_COUNTER_OFFSET: 4;
// Offset 8: Completed task counter
COMPLETED_COUNTER_OFFSET: 8;
// Offset 12: Active workers
ACTIVE_WORKERS_OFFSET: 12;
// Offset 16: Shutdown flag
SHUTDOWN_FLAG_OFFSET: 16;
// Offset 20 onward: ring buffer (head, tail, then task-ID slots)
RING_BUFFER_OFFSET: 20;
RING_BUFFER_SIZE: 256;
}
const LAYOUT: SharedMemoryLayout = {
LOCK_OFFSET: 0,
TASK_COUNTER_OFFSET: 4,
COMPLETED_COUNTER_OFFSET: 8,
ACTIVE_WORKERS_OFFSET: 12,
SHUTDOWN_FLAG_OFFSET: 16,
RING_BUFFER_OFFSET: 20,
RING_BUFFER_SIZE: 256,
};
// Mutex using Atomics
class AtomicMutex {
private buffer: Int32Array;
private offset: number;
constructor(sharedBuffer: SharedArrayBuffer, offset: number) {
this.buffer = new Int32Array(sharedBuffer);
this.offset = offset / 4; // Convert byte offset to Int32 index
}
lock(): void {
while (true) {
// Try to acquire: CAS from 0 (unlocked) to 1 (locked)
const old = Atomics.compareExchange(
this.buffer,
this.offset,
0, // expected: unlocked
1 // desired: locked
);
if (old === 0) {
// Successfully acquired lock
return;
}
// Lock is held, wait
Atomics.wait(this.buffer, this.offset, 1);
}
}
unlock(): void {
Atomics.store(this.buffer, this.offset, 0);
Atomics.notify(this.buffer, this.offset, 1);
}
tryLock(): boolean {
const old = Atomics.compareExchange(this.buffer, this.offset, 0, 1);
return old === 0;
}
}
// Atomic counter
class AtomicCounter {
private buffer: Int32Array;
private index: number;
constructor(sharedBuffer: SharedArrayBuffer, byteOffset: number) {
this.buffer = new Int32Array(sharedBuffer);
this.index = byteOffset / 4;
}
increment(): number {
return Atomics.add(this.buffer, this.index, 1) + 1;
}
decrement(): number {
return Atomics.sub(this.buffer, this.index, 1) - 1;
}
get(): number {
return Atomics.load(this.buffer, this.index);
}
set(value: number): void {
Atomics.store(this.buffer, this.index, value);
}
}
// Ring buffer for task distribution over shared memory.
// Note: safe without locks only for a single producer and a single
// consumer; the pool below serializes access with AtomicMutex.
class AtomicRingBuffer {
private buffer: Int32Array;
private startIndex: number;
private capacity: number;
private headIndex: number; // Int32 index holding the write counter
private tailIndex: number; // Int32 index holding the read counter
constructor(
sharedBuffer: SharedArrayBuffer,
byteOffset: number,
capacity: number
) {
this.buffer = new Int32Array(sharedBuffer);
this.startIndex = byteOffset / 4;
this.capacity = capacity;
// Head and tail are stored at the beginning of the region
this.headIndex = this.startIndex;
this.tailIndex = this.startIndex + 1;
// Data starts at startIndex + 2
}
push(value: number): boolean {
const head = Atomics.load(this.buffer, this.headIndex);
const tail = Atomics.load(this.buffer, this.tailIndex);
const nextHead = (head + 1) % this.capacity;
if (nextHead === tail) {
return false; // Buffer full
}
const dataIndex = this.startIndex + 2 + head;
Atomics.store(this.buffer, dataIndex, value);
Atomics.store(this.buffer, this.headIndex, nextHead);
return true;
}
pop(): number | null {
const head = Atomics.load(this.buffer, this.headIndex);
const tail = Atomics.load(this.buffer, this.tailIndex);
if (head === tail) {
return null; // Buffer empty
}
const dataIndex = this.startIndex + 2 + tail;
const value = Atomics.load(this.buffer, dataIndex);
const nextTail = (tail + 1) % this.capacity;
Atomics.store(this.buffer, this.tailIndex, nextTail);
return value;
}
size(): number {
const head = Atomics.load(this.buffer, this.headIndex);
const tail = Atomics.load(this.buffer, this.tailIndex);
return (head - tail + this.capacity) % this.capacity;
}
}
// ─── Coordinated Worker Thread Pool ──────────────────────────
class WorkerThreadPool {
private sharedBuffer: SharedArrayBuffer;
private mutex: AtomicMutex;
private taskCounter: AtomicCounter;
private completedCounter: AtomicCounter;
private activeWorkers: AtomicCounter;
private ringBuffer: AtomicRingBuffer;
private workerCount: number;
constructor(workerCount: number = 4) {
// Allocate shared memory (4KB should be enough)
this.sharedBuffer = new SharedArrayBuffer(4096);
this.mutex = new AtomicMutex(this.sharedBuffer, LAYOUT.LOCK_OFFSET);
this.taskCounter = new AtomicCounter(this.sharedBuffer, LAYOUT.TASK_COUNTER_OFFSET);
this.completedCounter = new AtomicCounter(this.sharedBuffer, LAYOUT.COMPLETED_COUNTER_OFFSET);
this.activeWorkers = new AtomicCounter(this.sharedBuffer, LAYOUT.ACTIVE_WORKERS_OFFSET);
this.ringBuffer = new AtomicRingBuffer(
this.sharedBuffer,
LAYOUT.RING_BUFFER_OFFSET,
LAYOUT.RING_BUFFER_SIZE
);
this.workerCount = workerCount;
}
// Submit a task
submitTask(taskId: number): boolean {
this.mutex.lock();
try {
const success = this.ringBuffer.push(taskId);
if (success) {
this.taskCounter.increment();
}
return success;
} finally {
this.mutex.unlock();
}
}
// Worker: pick up a task
pickupTask(): number | null {
this.mutex.lock();
try {
const taskId = this.ringBuffer.pop();
if (taskId !== null) {
this.activeWorkers.increment();
}
return taskId;
} finally {
this.mutex.unlock();
}
}
// Worker: mark task complete
completeTask(): void {
this.completedCounter.increment();
this.activeWorkers.decrement();
}
// Get shared buffer for passing to worker threads
getSharedBuffer(): SharedArrayBuffer {
return this.sharedBuffer;
}
getStats(): {
submitted: number;
completed: number;
active: number;
queued: number;
} {
return {
submitted: this.taskCounter.get(),
completed: this.completedCounter.get(),
active: this.activeWorkers.get(),
queued: this.ringBuffer.size(),
};
}
}
Concurrency Models Compared
┌─────────────────────────────────────────────────────────────────────┐
│ BACKEND CONCURRENCY MODELS COMPARED │
│ │
│ MODEL 1: Multi-Process (Pre-fork) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Apache httpd, PHP-FPM, unicorn │ │
│ │ │ │
│ │ Master ──fork()──▶ Worker (own memory) │ │
│ │ ──fork()──▶ Worker (own memory) │ │
│ │ ──fork()──▶ Worker (own memory) │ │
│ │ │ │
│ │ Pros: Isolation, stability │ │
│ │ Cons: Memory overhead, IPC cost │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 2: Multi-Thread (Thread-per-request) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Java (Tomcat), C# (ASP.NET) │ │
│ │ │ │
│ │ Process ──▶ Thread Pool │ │
│ │ ├── Thread 1 (shared memory) │ │
│ │ ├── Thread 2 (shared memory) │ │
│ │ └── Thread N (shared memory) │ │
│ │ │ │
│ │ Pros: Shared memory, lower overhead │ │
│ │ Cons: Race conditions, deadlocks │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 3: Event Loop (Single-thread + async I/O) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Node.js, nginx, Redis │ │
│ │ │ │
│ │ Main Thread ──▶ Event Loop │ │
│ │ ├── epoll/kqueue for I/O │ │
│ │ └── Thread pool for FS/DNS │ │
│ │ │ │
│ │ Pros: No context switching, no locks │ │
│ │ Cons: Can't use multiple cores (alone) │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 4: Green Threads / Coroutines │
│ ┌─────────────────────────────────────────────┐ │
│ │ Go (goroutines), Erlang, Java 21 (Loom) │ │
│ │ │ │
│ │ M goroutines ──mapped──▶ N OS threads │ │
│ │ Cooperative scheduling in userspace │ │
│ │ Work-stealing across processors │ │
│ │ │ │
│ │ Pros: Millions of concurrent tasks │ │
│ │ Cons: Runtime complexity, stack management │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 5: Actor Model │
│ ┌─────────────────────────────────────────────┐ │
│ │ Erlang/OTP, Akka, Orleans │ │
│ │ │ │
│ │ Actor ◄──message──▶ Actor │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ Mailbox Mailbox │ │
│ │ (sequential) (sequential) │ │
│ │ │ │
│ │ Pros: No shared state, supervision trees │ │
│ │ Cons: Message overhead, ordering complexity │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Building an Actor System
// ─── Actor System Implementation ──────────────────────────────
type ActorMessage = {
type: string;
payload: unknown;
sender: string | null;
correlationId: string;
timestamp: number;
};
type ActorBehavior = (
context: ActorContext,
message: ActorMessage
) => Promise<void>;
interface ActorRef {
name: string;
// Optional correlationId lets a behavior route an "__ask_response__"
// reply back to the pending ask that carried the same id
send(type: string, payload: unknown, sender?: string, correlationId?: string): void;
ask<T>(type: string, payload: unknown, timeout?: number): Promise<T>;
}
class ActorContext {
constructor(
public readonly self: ActorRef,
public readonly system: ActorSystem,
public readonly parent: ActorRef | null
) {}
// Spawn a child actor
spawn(name: string, behavior: ActorBehavior): ActorRef {
return this.system.createActor(
`${this.self.name}/${name}`,
behavior,
this.self
);
}
// Send message to another actor
send(target: string, type: string, payload: unknown): void {
this.system.send(target, type, payload, this.self.name);
}
// Stop self
stop(): void {
this.system.stopActor(this.self.name);
}
}
class Actor {
public readonly name: string;
private behavior: ActorBehavior;
private mailbox: ActorMessage[] = [];
private processing = false;
private context: ActorContext;
private parent: ActorRef | null;
private children: Set<string> = new Set();
private pendingAsks: Map<
string,
{ resolve: (value: unknown) => void; reject: (error: Error) => void }
> = new Map();
constructor(
name: string,
behavior: ActorBehavior,
system: ActorSystem,
parent: ActorRef | null
) {
this.name = name;
this.behavior = behavior;
this.parent = parent;
const ref: ActorRef = {
name: this.name,
send: (type: string, payload: unknown, sender?: string, correlationId?: string) => {
this.enqueue({
type,
payload,
sender: sender ?? null,
// Reuse the caller's correlationId (ask replies) or mint a new one
correlationId:
correlationId ?? `${Date.now()}_${Math.random().toString(36).slice(2)}`,
timestamp: Date.now(),
});
},
ask: <T>(type: string, payload: unknown, timeout = 5000) => {
return this.ask<T>(type, payload, timeout);
},
};
this.context = new ActorContext(ref, system, parent);
}
getRef(): ActorRef {
return this.context.self;
}
addChild(name: string): void {
this.children.add(name);
}
removeChild(name: string): void {
this.children.delete(name);
}
getChildren(): Set<string> {
return this.children;
}
private enqueue(message: ActorMessage): void {
this.mailbox.push(message);
this.processNext();
}
private async processNext(): Promise<void> {
if (this.processing || this.mailbox.length === 0) return;
this.processing = true;
while (this.mailbox.length > 0) {
const message = this.mailbox.shift()!;
try {
// Check if this is a response to an ask
if (message.type === "__ask_response__") {
const pending = this.pendingAsks.get(message.correlationId);
if (pending) {
pending.resolve(message.payload);
this.pendingAsks.delete(message.correlationId);
continue;
}
}
await this.behavior(this.context, message);
} catch (error) {
// Supervision: notify parent of failure
if (this.parent) {
this.parent.send(
"__child_failure__",
{
child: this.name,
error: error instanceof Error ? error.message : String(error),
message,
},
this.name
);
}
}
}
this.processing = false;
}
private ask<T>(type: string, payload: unknown, timeout: number): Promise<T> {
return new Promise<T>((resolve, reject) => {
const correlationId = `ask_${Date.now()}_${Math.random().toString(36).slice(2)}`;
const timer = setTimeout(() => {
this.pendingAsks.delete(correlationId);
reject(new Error(`Ask timed out after ${timeout}ms`));
}, timeout);
this.pendingAsks.set(correlationId, {
resolve: (value: unknown) => {
clearTimeout(timer);
resolve(value as T);
},
reject: (error: Error) => {
clearTimeout(timer);
reject(error);
},
});
this.enqueue({
type,
payload,
sender: null,
correlationId,
timestamp: Date.now(),
});
});
}
}
// ─── Supervision Strategies ──────────────────────────────────
type SupervisionStrategy = "one-for-one" | "one-for-all" | "rest-for-one";
type SupervisionDirective = "resume" | "restart" | "stop" | "escalate";
class Supervisor {
private strategy: SupervisionStrategy;
private maxRestarts: number;
private withinMs: number;
private restartHistory: number[] = [];
constructor(options: {
strategy?: SupervisionStrategy;
maxRestarts?: number;
withinMs?: number;
} = {}) {
this.strategy = options.strategy ?? "one-for-one";
this.maxRestarts = options.maxRestarts ?? 3;
this.withinMs = options.withinMs ?? 60000;
}
handleFailure(
failedChild: string,
siblings: Set<string>,
error: string
): { directive: SupervisionDirective; targets: string[] } {
// Check restart frequency
const now = Date.now();
this.restartHistory.push(now);
this.restartHistory = this.restartHistory.filter(
(t) => now - t < this.withinMs
);
if (this.restartHistory.length > this.maxRestarts) {
return { directive: "escalate", targets: [failedChild] };
}
switch (this.strategy) {
case "one-for-one":
// Only restart the failed child
return { directive: "restart", targets: [failedChild] };
case "one-for-all":
// Restart all children
return {
directive: "restart",
targets: [failedChild, ...siblings],
};
case "rest-for-one":
// Restart failed child and all children started after it
// (simplified: restart failed + all siblings)
return {
directive: "restart",
targets: [failedChild, ...siblings],
};
}
}
}
// ─── Actor System ────────────────────────────────────────────
class ActorSystem {
private actors: Map<string, Actor> = new Map();
private supervisors: Map<string, Supervisor> = new Map();
private deadLetters: ActorMessage[] = [];
createActor(
name: string,
behavior: ActorBehavior,
parent?: ActorRef | null
): ActorRef {
if (this.actors.has(name)) {
throw new Error(`Actor "${name}" already exists`);
}
const actor = new Actor(name, behavior, this, parent ?? null);
this.actors.set(name, actor);
// Register as child of parent
if (parent) {
const parentActor = this.actors.get(parent.name);
if (parentActor) {
parentActor.addChild(name);
}
}
return actor.getRef();
}
send(
target: string,
type: string,
payload: unknown,
sender?: string
): void {
const actor = this.actors.get(target);
if (actor) {
actor.getRef().send(type, payload, sender);
} else {
// Dead letter
this.deadLetters.push({
type,
payload,
sender: sender ?? null,
correlationId: `dead_${Date.now()}`,
timestamp: Date.now(),
});
}
}
stopActor(name: string): void {
const actor = this.actors.get(name);
if (!actor) return;
// Stop all children first (depth-first teardown)
for (const childName of Array.from(actor.getChildren())) {
this.stopActor(childName);
}
this.actors.delete(name);
// Detach the name from any parent still tracking it as a child
for (const remaining of this.actors.values()) {
remaining.removeChild(name);
}
}
setSupervisor(actorName: string, supervisor: Supervisor): void {
this.supervisors.set(actorName, supervisor);
}
getStats(): {
totalActors: number;
deadLetters: number;
actorNames: string[];
} {
return {
totalActors: this.actors.size,
deadLetters: this.deadLetters.length,
actorNames: Array.from(this.actors.keys()),
};
}
}
The GIL Problem: Why Python Threads Don't Run in Parallel
THE GLOBAL INTERPRETER LOCK (GIL):
Python (CPython):
┌────────────────────────────────────────────────┐
│ Thread 1 ──▶ [HOLDS GIL] ── executing ──▶ │
│ Thread 2 ──▶ [WAITING] ────────────────▶ │
│ Thread 3 ──▶ [WAITING] ────────────────▶ │
│ │
│ Time: ═══█████═════════█████═════════█████═══ │
│ T1: ████▓ ████▓ │
│ T2: ████▓ ████▓ │
│ T3: ████▓ │
│ (only one thread runs Python at a time) │
└────────────────────────────────────────────────┘
Node.js (V8):
┌────────────────────────────────────────────────┐
│ Main Thread ──▶ All JS execution │
│ libuv Thread 1 ──▶ FS / DNS / crypto │
│ libuv Thread 2 ──▶ FS / DNS / crypto │
│ libuv Thread 3 ──▶ FS / DNS / crypto │
│ libuv Thread 4 ──▶ FS / DNS / crypto │
│ │
│ JS is single-threaded BY DESIGN │
│ (no GIL needed — there's only one thread) │
│ │
│ Worker Threads: separate V8 isolates │
│ (no shared JS objects, only SharedArrayBuffer) │
└────────────────────────────────────────────────┘
Go:
┌────────────────────────────────────────────────┐
│ Goroutine 1 ──▶ OS Thread 1 ──▶ CPU Core 0 │
│ Goroutine 2 ──▶ OS Thread 2 ──▶ CPU Core 1 │
│ Goroutine 3 ──▶ OS Thread 1 ──▶ CPU Core 0 │
│ Goroutine 4 ──▶ OS Thread 3 ──▶ CPU Core 2 │
│ │
│ No GIL. True parallelism. │
│ M:N scheduling (M goroutines : N OS threads) │
│ Channel-based communication (CSP) │
└────────────────────────────────────────────────┘
Java (21+ with Virtual Threads / Loom):
┌────────────────────────────────────────────────┐
│ Virtual Thread 1 ──▶ Carrier Thread 1 ──▶ CPU │
│ Virtual Thread 2 ──▶ Carrier Thread 2 ──▶ CPU │
│ Virtual Thread 3 ──▶ (parked / waiting) │
│ ... │
│ Virtual Thread 1M ──▶ Carrier Thread N │
│ │
│ Millions of virtual threads │
│ Mounted/unmounted on carrier (platform) threads │
│ Blocking calls automatically park the VT │
└────────────────────────────────────────────────┘
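Node's side of this picture is easy to verify: since all JS shares one thread, synchronous work starves the event loop, so even a 0 ms timer fires late. A small sketch (the 50 ms busy-wait duration is arbitrary):

```typescript
// Spin the CPU without yielding; no callbacks can run meanwhile.
function blockFor(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy-wait on the one JS thread */ }
}

// Schedule a 0 ms timer, then immediately hold the thread: the timer
// callback cannot fire until the synchronous work returns, so the
// measured lateness is roughly the length of the busy-wait.
function measureTimerLateness(blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const due = Date.now();
    setTimeout(() => resolve(Date.now() - due), 0);
    blockFor(blockMs);
  });
}
```

The analogous experiment with two Python threads shows them taking turns holding the GIL rather than overlapping on bytecode.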
Comparison Table: Process vs Thread vs Coroutine
| Characteristic | Process | OS Thread | Green Thread / Coroutine |
|---|---|---|---|
| Memory isolation | Full (own address space) | Shared (within process) | Shared (within runtime) |
| Creation cost | ~1-10ms, ~MBs | ~10-100μs, ~1-8MB stack | ~1-10μs, ~2-8KB stack |
| Context switch cost | ~1-10μs (TLB flush) | ~1-10μs (kernel) | ~100ns (userspace) |
| Max concurrent | ~1,000s | ~10,000s | ~1,000,000s |
| Communication | IPC (pipe, socket, shm) | Shared memory + locks | Channels, message passing |
| Failure isolation | Excellent (crash one, others live) | Poor (crash = process dead) | Runtime-dependent |
| CPU utilization | Multi-core native | Multi-core native | Multi-core with M:N |
| Scheduling | OS preemptive | OS preemptive | Cooperative / hybrid |
| Debugging | Easier (isolated) | Hard (race conditions) | Runtime-specific tools |
| Examples | Nginx workers, PHP-FPM | Java threads, pthreads | Go goroutines, Erlang procs |
Backpressure and Flow Control
When producers are faster than consumers, you need backpressure mechanisms:
// ─── Backpressure Controller ─────────────────────────────────
interface BackpressureConfig {
highWaterMark: number;
lowWaterMark: number;
strategy: "drop" | "block" | "buffer-and-signal";
maxBufferSize: number;
}
class BackpressureController<T> {
private buffer: T[] = [];
private config: BackpressureConfig;
private paused = false;
private waiters: Array<{
resolve: () => void;
reject: (error: Error) => void;
}> = [];
private draining = false;
private totalProcessed = 0;
private totalDropped = 0;
constructor(config: Partial<BackpressureConfig> = {}) {
this.config = {
highWaterMark: config.highWaterMark ?? 1000,
lowWaterMark: config.lowWaterMark ?? 250,
strategy: config.strategy ?? "buffer-and-signal",
maxBufferSize: config.maxBufferSize ?? 10000,
};
}
// Producer calls this to push items
async push(item: T): Promise<boolean> {
if (this.draining) {
return false;
}
// Check backpressure
if (this.buffer.length >= this.config.highWaterMark) {
switch (this.config.strategy) {
case "drop":
this.totalDropped++;
return false;
case "block":
// Block the producer until space available
if (this.buffer.length >= this.config.maxBufferSize) {
throw new Error("Buffer overflow: max buffer size reached");
}
await this.waitForDrain();
break;
case "buffer-and-signal":
if (this.buffer.length >= this.config.maxBufferSize) {
this.totalDropped++;
return false;
}
if (!this.paused) {
this.paused = true;
}
break;
}
}
this.buffer.push(item);
return true;
}
// Consumer calls this to pull items
pull(batchSize: number = 1): T[] {
const items = this.buffer.splice(0, batchSize);
this.totalProcessed += items.length;
// Resume producers once we drain below the low-water mark. Checking
// waiters as well wakes producers parked by the "block" strategy,
// which never sets the paused flag.
if (
this.buffer.length <= this.config.lowWaterMark &&
(this.paused || this.waiters.length > 0)
) {
this.paused = false;
this.resumeWaiters();
}
return items;
}
private waitForDrain(): Promise<void> {
return new Promise<void>((resolve, reject) => {
this.waiters.push({ resolve, reject });
});
}
private resumeWaiters(): void {
const waiters = this.waiters.splice(0);
for (const waiter of waiters) {
waiter.resolve();
}
}
isPaused(): boolean {
return this.paused;
}
drain(): void {
this.draining = true;
// Reject all waiting producers
for (const waiter of this.waiters) {
waiter.reject(new Error("Draining"));
}
this.waiters = [];
}
getStats(): {
bufferSize: number;
totalProcessed: number;
totalDropped: number;
isPaused: boolean;
waitingProducers: number;
} {
return {
bufferSize: this.buffer.length,
totalProcessed: this.totalProcessed,
totalDropped: this.totalDropped,
isPaused: this.paused,
waitingProducers: this.waiters.length,
};
}
}
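Node's streams bake this same watermark protocol into the platform: Writable.write() returns false once buffered bytes pass highWaterMark, and 'drain' fires when the buffer empties, which is the pause/resume cycle above without a custom controller. A sketch of a producer that honors it (writeAll is our helper name):

```typescript
import { Writable } from "node:stream";

// Write chunks one at a time, stopping whenever write() signals
// backpressure and resuming on 'drain'. Ignoring the false return
// value would still work, but buffers unboundedly in memory.
function writeAll(dest: Writable, chunks: string[]): Promise<void> {
  return new Promise((resolve) => {
    let i = 0;
    const writeNext = (): void => {
      while (i < chunks.length) {
        const ok = dest.write(chunks[i++]);
        if (!ok) {
          // Backpressure: stop producing until the buffer drains
          dest.once("drain", writeNext);
          return;
        }
      }
      dest.end(resolve);
    };
    writeNext();
  });
}
```

The controller above generalizes the same idea to any producer/consumer pair, streams or not.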
Signal Handling and Process Lifecycle
UNIX SIGNAL LIFECYCLE:
Process Start
│
▼
┌──────────────┐
│ RUNNING │
│ │◄──────── SIGCONT (resume)
└──────┬───────┘
│
┌────┼────┬────────────────────┐
│ │ │ │
▼ ▼ ▼ ▼
SIGTERM SIGINT SIGQUIT SIGSTOP/SIGTSTP
(soft) (Ctrl+C) (core dump) (suspend)
│ │ │ │
▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐
│ Signal Handler│ │ STOPPED │
│ (trappable) │ │ (T state) │
└──────┬───────┘ └──────────────┘
│
┌────┼────┐
│ │
▼ ▼
Handle Default
(cleanup) (terminate)
│ │
▼ ▼
┌──────────────┐
│ ZOMBIE (Z) │
│ (exit code │
│ stored) │
└──────┬───────┘
│
parent calls
waitpid()
│
▼
┌──────────────┐
│ DEAD │
│ (reaped) │
└──────────────┘
SIGKILL (9) and SIGSTOP: CANNOT be caught or ignored
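The trappable path in that diagram is a few lines in Node. A minimal SIGTERM/SIGINT trap as a sketch (the timeout value is arbitrary), which the lifecycle manager below generalizes with ordered hooks:

```typescript
import * as http from "node:http";

// server.close() stops accepting new connections; in-flight requests
// finish naturally, then the 'close' callback runs and we exit cleanly.
function installGracefulShutdown(server: http.Server, timeoutMs = 10_000): void {
  const onTerm = (): void => {
    server.close(() => process.exit(0));
    // Hard deadline in case connections never drain; unref() keeps the
    // timer itself from holding the event loop open.
    setTimeout(() => process.exit(1), timeoutMs).unref();
  };
  process.on("SIGTERM", onTerm); // what orchestrators send first
  process.on("SIGINT", onTerm);  // Ctrl+C during local development
}
```

If the process is still alive when the orchestrator's grace period ends, it gets SIGKILL, which no handler can intercept.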
// ─── Process Lifecycle Manager ────────────────────────────────
type SignalHandler = (signal: string) => Promise<void> | void;
interface ShutdownHook {
name: string;
priority: number; // lower = earlier
handler: () => Promise<void>;
timeout: number;
}
class ProcessLifecycle {
private state: "starting" | "running" | "stopping" | "stopped" = "starting";
private shutdownHooks: ShutdownHook[] = [];
private signalHandlers: Map<string, SignalHandler[]> = new Map();
private shutdownInProgress = false;
private startTime = Date.now();
constructor() {
this.registerDefaultSignalHandlers();
}
private registerDefaultSignalHandlers(): void {
// SIGTERM: Graceful shutdown (what orchestrators send)
this.onSignal("SIGTERM", async () => {
console.log("[lifecycle] Received SIGTERM, initiating graceful shutdown");
await this.shutdown("SIGTERM");
});
// SIGINT: User interrupt (Ctrl+C)
this.onSignal("SIGINT", async () => {
console.log("[lifecycle] Received SIGINT (Ctrl+C)");
await this.shutdown("SIGINT");
});
// SIGUSR1: Convention for custom actions (e.g., heap dump)
this.onSignal("SIGUSR1", () => {
console.log("[lifecycle] Received SIGUSR1");
// Could trigger heap snapshot, log rotation, etc.
});
// SIGUSR2: Convention for custom actions (e.g., reopen log files)
this.onSignal("SIGUSR2", () => {
console.log("[lifecycle] Received SIGUSR2");
});
// Register with Node.js process
for (const signal of ["SIGTERM", "SIGINT", "SIGUSR1", "SIGUSR2"]) {
process.on(signal, () => {
this.handleSignal(signal);
});
}
// Uncaught exceptions
process.on("uncaughtException", (error) => {
console.error("[lifecycle] Uncaught exception:", error);
this.shutdown("uncaughtException").finally(() => {
process.exit(1);
});
});
// Unhandled promise rejections
process.on("unhandledRejection", (reason) => {
console.error("[lifecycle] Unhandled rejection:", reason);
// In newer Node.js, this will crash by default
});
}
// Register a signal handler
onSignal(signal: string, handler: SignalHandler): void {
if (!this.signalHandlers.has(signal)) {
this.signalHandlers.set(signal, []);
}
this.signalHandlers.get(signal)!.push(handler);
}
// Dispatch signal to handlers
private async handleSignal(signal: string): Promise<void> {
const handlers = this.signalHandlers.get(signal) ?? [];
for (const handler of handlers) {
try {
await handler(signal);
} catch (err) {
console.error(`[lifecycle] Signal handler error (${signal}):`, err);
}
}
}
// Register a shutdown hook
addShutdownHook(hook: ShutdownHook): void {
this.shutdownHooks.push(hook);
this.shutdownHooks.sort((a, b) => a.priority - b.priority);
}
// Mark process as ready
markReady(): void {
this.state = "running";
console.log(
`[lifecycle] Process ready (startup took ${Date.now() - this.startTime}ms)`
);
}
// Execute graceful shutdown
async shutdown(reason: string): Promise<void> {
if (this.shutdownInProgress) {
console.log("[lifecycle] Shutdown already in progress, ignoring");
return;
}
this.shutdownInProgress = true;
this.state = "stopping";
console.log(`[lifecycle] Shutdown initiated. Reason: ${reason}`);
const shutdownStart = Date.now();
for (const hook of this.shutdownHooks) {
console.log(`[lifecycle] Running shutdown hook: ${hook.name}`);
try {
await Promise.race([
hook.handler(),
new Promise<never>((_, reject) =>
setTimeout(
() => reject(new Error(`Hook "${hook.name}" timed out`)),
hook.timeout
)
),
]);
console.log(`[lifecycle] ✓ ${hook.name} completed`);
} catch (err) {
console.error(`[lifecycle] ✗ ${hook.name} failed:`, err);
}
}
this.state = "stopped";
console.log(
`[lifecycle] Shutdown complete (took ${Date.now() - shutdownStart}ms)`
);
}
getState(): string {
return this.state;
}
getUptime(): number {
return Date.now() - this.startTime;
}
}
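The per-hook Promise.race in shutdown() has a subtle leak: when a hook finishes quickly, the losing setTimeout keeps ticking and holds the event loop open until it fires. A cleaned-up helper sketch (withTimeout is our name, not part of the class above):

```typescript
// Run a task, rejecting if it exceeds timeoutMs. Unlike a bare
// Promise.race, the timer is always cleared, so a fast task doesn't
// leave a pending timeout keeping the process alive.
function withTimeout<T>(
  task: () => Promise<T>,
  timeoutMs: number,
  name: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Hook "${name}" timed out`)),
      timeoutMs,
    );
  });
  return Promise.race([task(), timeout]).finally(() => clearTimeout(timer));
}
```

Swapping this into the shutdown loop keeps the same semantics while letting the process exit promptly once all hooks complete.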
Real-World Architecture: Production Server Setup
┌──────────────────────────────────────────────────────────────────────┐
│ PRODUCTION NODE.JS DEPLOYMENT │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ SYSTEMD / CONTAINER RUNTIME (PID 1) │ │
│ │ │ │
│ │ Sends: SIGTERM (graceful) → wait → SIGKILL (force) │ │
│ └─────────────────────┬──────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────▼──────────────────────────────────────────┐ │
│ │ MASTER PROCESS (cluster.isPrimary) │ │
│ │ │ │
│ │ Responsibilities: │ │
│ │ ├── Fork N worker processes (N = CPU cores) │ │
│ │ ├── Monitor worker health │ │
│ │ ├── Restart crashed workers (with backoff) │ │
│ │ ├── Handle signals (SIGTERM → graceful shutdown) │ │
│ │ ├── IPC message routing │ │
│ │ └── Metrics aggregation │ │
│ │ │ │
│ │ Does NOT handle HTTP requests │ │
│ └───┬────────────┬────────────┬────────────┬────────────────────┘ │
│ │ │ │ │ │
│ ┌───▼──┐ ┌───▼──┐ ┌───▼──┐ ┌───▼──┐ │
│ │ W #1 │ │ W #2 │ │ W #3 │ │ W #4 │ │
│ │ │ │ │ │ │ │ │ │
│ │ HTTP │ │ HTTP │ │ HTTP │ │ HTTP │ │
│ │Server│ │Server│ │Server│ │Server│ │
│ │ │ │ │ │ │ │ │ │
│ │ Each worker: │ │
│ │ ├── Own V8 instance (~40MB base) │ │
│ │ ├── Own event loop │ │
│ │ ├── Shared server port (via fd passing) │ │
│ │ ├── libuv thread pool (4 threads each) │ │
│ │ └── IPC channel to master │ │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
│ │
│ Total threads per worker: 1 (main) + 4 (libuv) + 2 (V8 GC) = ~7 │
│ Total threads for 4 workers: ~28 + master = ~35 threads │
│ Total memory: 4 × (40MB base + app heap) ≈ 200-800MB │
└──────────────────────────────────────────────────────────────────────┘
Interview Questions & Answers
Q1: Why does Node.js use a single-threaded event loop instead of multithreading?
A: Node.js chose a single-threaded event loop for I/O-bound workloads because:
- No synchronization overhead: No mutexes, semaphores, or lock contention. This eliminates entire classes of bugs (deadlocks, race conditions).
- Context switching cost: Thread context switches involve kernel transitions (~1-10μs). The event loop avoids this by running everything cooperatively on one thread.
- Memory efficiency: Each OS thread typically reserves 1-8MB for its stack. With the event loop model, 10,000 concurrent connections don't require 10,000 stacks.
- I/O is the bottleneck, not CPU: For a typical web server, the CPU spends >95% of its time waiting for network/disk. You don't need multiple threads to wait.
- Simpler programming model: No shared mutable state means no data races. Callbacks and async/await are simpler to reason about than locks.
The trade-off: CPU-intensive work blocks the event loop. Solutions include worker_threads for CPU parallelism, the cluster module for multi-process, and offloading to native addons.
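That trade-off is easy to demonstrate: any synchronous CPU work delays every timer and socket on the loop. A self-contained sketch (measureTimerDelay is a made-up name for illustration):

```typescript
// Schedule a 10ms timer, then hog the JS thread for ~100ms.
// The timer cannot fire until the busy loop yields, so the measured
// delay reflects the blocking work, not the requested 10ms.
async function measureTimerDelay(busyMs: number, timerMs: number): Promise<number> {
  const start = Date.now();
  const fired = new Promise<number>((resolve) => {
    setTimeout(() => resolve(Date.now() - start), timerMs);
  });
  while (Date.now() - start < busyMs) {
    // Synchronous spin — stands in for compression, crypto, etc.
  }
  return fired;
}

measureTimerDelay(100, 10).then((delay) => {
  console.log(`10ms timer actually fired after ${delay}ms`);
});
```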
Q2: What's the difference between cluster.fork() and child_process.fork()?
A: Both spawn a new process under the hood (via fork/exec or posix_spawn in libuv, not a bare fork(2)), but they serve different purposes:
child_process.fork(): Creates a new Node.js process with an IPC channel. The child runs a specified module. There's no shared server port — each process binds independently.
cluster.fork(): Also creates a new Node.js process, but with special handling:
- The master passes the listening socket's file descriptor to workers
- Distribution is either round-robin by the master (SCHED_RR) or left to the kernel waking workers on the shared socket (SCHED_NONE)
- Workers call server.listen(port), but the kernel routes connections from the master's socket
- Built-in load balancing (configurable via cluster.schedulingPolicy)
Key difference: cluster is specifically designed for sharing a server port across multiple processes. The master either distributes accepted connections round-robin (the default everywhere except Windows) or hands every worker the shared listening socket and lets the OS wake one of them (cluster.SCHED_NONE).
Under the hood, a cluster worker's listen() call doesn't actually call bind(2) — it sends a message to the master, which passes the file descriptor back via sendmsg(2) with SCM_RIGHTS.
Q3: How does Go's goroutine scheduling differ from OS thread scheduling?
A: Go implements an M:N scheduler (M goroutines on N OS threads):
OS thread scheduling:
- Managed by kernel (CFS on Linux)
- Preemptive: kernel interrupts via timer (typically every 4ms)
- Context switch: ~1-10μs (save/restore registers, TLB flush for processes)
- Stack: 1-8MB fixed at creation
- Creation: system call (~10-100μs)
Go goroutine scheduling (GMP model):
- G (Goroutine): the unit of work, starts with a 2KB stack (growable)
- M (Machine): OS thread, goroutines run on these
- P (Processor): logical processor with local run queue
Key differences:
- Stack size: Goroutines start at 2KB (vs 1-8MB reserved for OS threads). Stacks grow/shrink dynamically via stack copying.
- Context switch: ~100-200ns (userspace register swap, no kernel transition)
- Cooperative + preemptive hybrid: Goroutines yield at function calls (cooperative). Since Go 1.14, the runtime also uses async preemption via signals.
- Work stealing: Idle P steals from other P's local queue or the global queue.
- Network poller integration: When a goroutine does I/O, it's parked and the M picks up another G. No thread is wasted waiting.
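The userspace context-switch idea can be shown in miniature with generators: each yield is a cooperative scheduling point, and the runner switches tasks without any kernel involvement. This is an illustration of the concept only, not Go's actual implementation:

```typescript
// Cooperative userspace scheduling in miniature: generator-based
// "tasks" plus a round-robin runner. Every `yield` is a scheduling
// point — a context switch with no syscall and no kernel transition.
type Task = Generator<void, void, void>;

function* worker(name: string, steps: number, log: string[]): Task {
  for (let i = 0; i < steps; i++) {
    log.push(`${name}:${i}`);
    yield; // cooperative yield point — the userspace "context switch"
  }
}

function runRoundRobin(tasks: Task[]): void {
  const queue = [...tasks];
  while (queue.length > 0) {
    const task = queue.shift()!;
    // Resume the task until its next yield; requeue if not finished.
    if (!task.next().done) queue.push(task);
  }
}

const log: string[] = [];
runRoundRobin([worker("a", 2, log), worker("b", 2, log)]);
console.log(log.join(",")); // tasks interleave: a:0,b:0,a:1,b:1
```

Go's runtime layers work stealing, preemption, and a network poller on top of this basic mechanism, but the core win — switching tasks in userspace for nanoseconds instead of microseconds — is the same.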
Q4: What happens when fork() is called? Walk through the OS-level steps.
A: When a process calls fork():
- Kernel entry: The calling process traps into kernel mode via a system call.
- Process descriptor allocation: The kernel allocates a new task_struct (Linux) — a ~6KB structure describing the process.
- PID assignment: A new PID is allocated from the PID namespace.
- Copy-on-Write (COW) page tables: The kernel does NOT copy the parent's memory. Instead, it:
  - Copies the parent's page table entries
  - Marks all pages as read-only in BOTH parent and child
  - When either writes, a page fault triggers and the kernel copies just that page
- File descriptor table: The child gets a copy of the parent's FD table. Both parent and child now reference the same open file descriptions (file position and flags are shared).
- Signal disposition: Copied from parent. Pending signals are NOT inherited.
- Scheduling: The child is placed on the scheduler's run queue. On modern Linux, the child typically runs first (to avoid COW overhead if it immediately calls exec()).
- Return values: fork() returns the child's PID to the parent, and 0 to the child. This is how each process knows its role.
The COW optimization means fork() is extremely fast (~1ms) even for processes with GBs of memory — until actual writes cause page faults.
Q5: How would you design a process model for a high-throughput backend service?
A: The optimal model depends on the workload:
For I/O-bound services (95%+ of web backends):
- Use Node.js cluster mode with N workers (N = CPU cores)
- Each worker runs an event loop handling thousands of connections
- Master process handles health checks, graceful restarts
- Use SharedArrayBuffer for shared counters/metrics
- Total memory: ~50-200MB per worker
For CPU-bound services (image processing, ML inference):
- Use a thread pool model (Go, Java)
- Thread count: 2 × CPU cores (to overlap I/O waits)
- Use bounded work queues with backpressure
- Consider separate process pools for isolation
For mixed workloads:
- Main event loop for I/O (connection handling, routing)
- Worker thread pool for CPU-intensive tasks
- Backpressure between the event loop and thread pool
- Circuit breaker: if thread pool saturates, fast-fail new CPU tasks
Production considerations:
- Health check endpoints (readiness + liveness)
- Graceful shutdown (drain connections on SIGTERM)
- Memory limits per worker (restart if exceeded)
- Core dump configuration for crash analysis
- CPU affinity (pin workers to cores for cache locality)
- cgroup limits in containers (set --max-old-space-size in Node.js from the container memory limit, not host memory)
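The last point can be sketched concretely. Assuming a cgroup v2 container (the /sys/fs/cgroup/memory.max path) and an arbitrary 75% heap ratio, a launcher might compute the flag like this:

```typescript
// Sketch: size the V8 heap from the cgroup v2 memory limit rather
// than host RAM. The memory.max path assumes cgroup v2 with a
// namespaced mount; the 75% default ratio is an arbitrary choice.
import { readFileSync } from "node:fs";
import os from "node:os";

function containerMemoryBytes(): number {
  try {
    const raw = readFileSync("/sys/fs/cgroup/memory.max", "utf8").trim();
    if (raw !== "max") return Number(raw);
  } catch {
    // Not under cgroup v2 — fall back to host memory below.
  }
  return os.totalmem();
}

function maxOldSpaceMb(ratio = 0.75): number {
  // Leave headroom for native memory, Buffers, and thread stacks.
  return Math.floor((containerMemoryBytes() * ratio) / (1024 * 1024));
}

console.log(`launch with: node --max-old-space-size=${maxOldSpaceMb()} server.js`);
```

Without this, V8's default heap limit is derived from total system memory, so a container with a 512MB limit on a 64GB host gets OOM-killed long before V8 starts to panic.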
Real-World Problems & How to Solve Them
Problem 1: Node.js server freezes under CPU-heavy endpoints
Symptom: P99 latency spikes for all routes when one endpoint runs compression, image processing, or crypto.
Root cause: CPU-bound work runs on the main event-loop thread, preventing poll/check phases from handling other sockets.
Fix — Offload CPU-heavy tasks to worker threads:
import { Worker } from "node:worker_threads";
function runCpuTask(workerFile: string, payload: unknown): Promise<unknown> {
return new Promise((resolve, reject) => {
const worker = new Worker(workerFile, { workerData: payload });
worker.once("message", resolve);
worker.once("error", reject);
worker.once("exit", (code) => {
if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
});
});
}
app.post("/thumbnail", async (req, res, next) => {
try {
const result = await runCpuTask("./workers/thumbnail.js", req.body);
res.json(result);
} catch (error) {
next(error);
}
});
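Note that the sketch above pays Worker startup cost (a fresh V8 isolate, typically tens of milliseconds) on every request. In production you usually want a fixed-size pool that reuses threads. A minimal sketch (the WorkerPool name and spawn-factory shape are ours; per-worker error handling is omitted for brevity):

```typescript
import { Worker } from "node:worker_threads";

type Job = {
  payload: unknown;
  resolve: (value: unknown) => void;
  reject: (err: Error) => void;
};

class WorkerPool {
  private idle: Worker[] = [];
  private all: Worker[] = [];
  private queue: Job[] = [];

  // spawn is injected so the pool doesn't care where worker code lives.
  constructor(size: number, spawn: () => Worker) {
    for (let i = 0; i < size; i++) {
      const worker = spawn();
      this.all.push(worker);
      this.idle.push(worker);
    }
  }

  run(payload: unknown): Promise<unknown> {
    return new Promise((resolve, reject) => {
      this.queue.push({ payload, resolve, reject });
      this.drain();
    });
  }

  private drain(): void {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop()!;
      const job = this.queue.shift()!;
      worker.once("message", (result) => {
        this.idle.push(worker); // thread is reusable immediately
        job.resolve(result);
        this.drain();
      });
      worker.postMessage(job.payload);
    }
  }

  async destroy(): Promise<void> {
    await Promise.all(this.all.map((w) => w.terminate()));
  }
}
```

Excess jobs simply wait in the queue, which doubles as a natural backpressure point: you can reject new work when queue.length crosses a threshold.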
Problem 2: Slow crypto/fs operations during traffic bursts
Symptom: Password hashing and file operations become randomly slow even when CPU appears available.
Root cause: The libuv thread pool (default size 4) is saturated by blocking async tasks (PBKDF2, zlib, fs), causing queueing delays.
Fix — Bound concurrency and tune UV_THREADPOOL_SIZE for workload:
import os from "node:os";
import { pbkdf2 } from "node:crypto";
// Must be set before the first async crypto/fs/zlib call —
// libuv sizes the pool lazily, when it is first used.
process.env.UV_THREADPOOL_SIZE = String(Math.min(32, os.cpus().length * 2));
class Semaphore {
private permits: number;
private waiters: Array<() => void> = [];
constructor(permits: number) {
this.permits = permits;
}
async acquire(): Promise<void> {
if (this.permits > 0) {
this.permits--;
return;
}
await new Promise<void>((resolve) => this.waiters.push(resolve));
}
release(): void {
const waiter = this.waiters.shift();
if (waiter) waiter();
else this.permits++;
}
}
const cryptoSemaphore = new Semaphore(16);
async function hashPassword(password: string, salt: string): Promise<Buffer> {
await cryptoSemaphore.acquire();
try {
return await new Promise<Buffer>((resolve, reject) => {
pbkdf2(password, salt, 100_000, 64, "sha512", (err, key) => {
if (err) reject(err);
else resolve(key);
});
});
} finally {
cryptoSemaphore.release();
}
}
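A quick self-check that the semaphore actually caps concurrency — repeating the Semaphore so the snippet runs standalone, with a tracked high-water mark (the 3-permit limit and fake 10ms tasks are arbitrary):

```typescript
class Semaphore {
  private waiters: Array<() => void> = [];
  constructor(private permits: number) {}
  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }
  release(): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter();
    else this.permits++;
  }
}

// Run 10 fake tasks through a 3-permit semaphore and record the
// peak number simultaneously inside the critical section.
async function peakConcurrency(): Promise<number> {
  const sem = new Semaphore(3);
  let inFlight = 0;
  let peak = 0;
  await Promise.all(
    Array.from({ length: 10 }, async () => {
      await sem.acquire();
      try {
        inFlight++;
        peak = Math.max(peak, inFlight);
        await new Promise((r) => setTimeout(r, 10));
        inFlight--;
      } finally {
        sem.release();
      }
    }),
  );
  return peak;
}

peakConcurrency().then((peak) => console.log(`peak concurrency: ${peak}`)); // 3
```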
Problem 3: Memory usage doubles after forking workers
Symptom: Clustered process memory grows rapidly after deploy, despite no traffic increase.
Root cause: Copy-on-write benefits are lost when workers mutate large preloaded objects, forcing private page copies per process. (Caveat: classic CoW sharing applies to runtimes that fork without exec, like Python/Ruby pre-fork servers. Node's cluster.fork() starts a fresh process that re-executes your module, so each worker builds its own heap — there, the wins come from keeping data immutable and deduplicated, or moving it to SharedArrayBuffer or an external cache.)
Fix — Preload immutable data once per process and avoid mutation:
import cluster from "node:cluster";
import os from "node:os";
const immutableConfig = Object.freeze(loadLargeRoutingTable());
if (cluster.isPrimary) {
const workerCount = os.cpus().length;
for (let i = 0; i < workerCount; i++) {
cluster.fork({ ROUTING_TABLE_VERSION: immutableConfig.version });
}
} else {
// Read-only usage preserves CoW pages better than mutating shared blobs.
startHttpServer({ routingTable: immutableConfig });
}
Problem 4: Restart loops create unstable worker pools
Symptom: Workers crash and restart continuously, causing request drops and noisy alerts.
Root cause: Process manager restarts instantly without backoff or max restart limits, amplifying failures.
Fix — Add restart budget + exponential delay in supervisor:
type WorkerState = { restarts: number; lastStartMs: number };
const workerState = new Map<number, WorkerState>();
function nextRestartDelay(restarts: number): number {
return Math.min(30_000, 500 * Math.pow(2, restarts));
}
function shouldRestart(exitCode: number | null, restarts: number): boolean {
const maxRestarts = 5;
return exitCode !== 0 && restarts < maxRestarts;
}
function onWorkerExit(workerId: number, exitCode: number | null): void {
const current = workerState.get(workerId) ?? { restarts: 0, lastStartMs: Date.now() };
if (!shouldRestart(exitCode, current.restarts)) return;
const delay = nextRestartDelay(current.restarts);
setTimeout(() => {
spawnWorker();
}, delay);
workerState.set(workerId, { restarts: current.restarts + 1, lastStartMs: Date.now() + delay });
}
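For intuition, the backoff formula above yields this delay schedule — doubling from 500ms and hitting the 30s cap at the seventh restart:

```typescript
// Same formula as above; printing the first 8 delays shows the cap.
function nextRestartDelay(restarts: number): number {
  return Math.min(30_000, 500 * Math.pow(2, restarts));
}

const schedule = Array.from({ length: 8 }, (_, i) => nextRestartDelay(i));
console.log(schedule); // [500, 1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

A common refinement (not shown above) is to reset the restart counter once a worker has stayed up for some stable window, so a crash next week doesn't inherit last week's backoff.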
Problem 5: One worker is overloaded while others are idle
Symptom: CPU utilization is skewed; one worker has a deep queue while others are underused.
Root cause: Naive round-robin dispatch ignores in-flight load and queue depth.
Fix — Route requests by least in-flight load:
type WorkerRef = { id: number; inFlight: number; send(req: unknown): void };
class LeastLoadedDispatcher {
constructor(private workers: WorkerRef[]) {}
dispatch(req: unknown): void {
const worker = this.workers.reduce((best, current) =>
current.inFlight < best.inFlight ? current : best,
);
worker.inFlight++;
worker.send(req);
}
onComplete(workerId: number): void {
const worker = this.workers.find((w) => w.id === workerId);
if (worker) worker.inFlight = Math.max(0, worker.inFlight - 1);
}
}
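A usage sketch with stub workers (repeating the dispatcher so the snippet runs standalone; the stubs just record what they were sent):

```typescript
type WorkerRef = { id: number; inFlight: number; send(req: unknown): void };

class LeastLoadedDispatcher {
  constructor(private workers: WorkerRef[]) {}
  dispatch(req: unknown): void {
    const worker = this.workers.reduce((best, current) =>
      current.inFlight < best.inFlight ? current : best,
    );
    worker.inFlight++;
    worker.send(req);
  }
  onComplete(workerId: number): void {
    const worker = this.workers.find((w) => w.id === workerId);
    if (worker) worker.inFlight = Math.max(0, worker.inFlight - 1);
  }
}

// Two stub workers that just record what they receive.
const received: Record<number, unknown[]> = { 1: [], 2: [] };
const workers: WorkerRef[] = [1, 2].map((id) => ({
  id,
  inFlight: 0,
  send: (req) => received[id].push(req),
}));

const dispatcher = new LeastLoadedDispatcher(workers);
dispatcher.dispatch("a"); // both idle — worker 1 wins the tie
dispatcher.dispatch("b"); // worker 2 is now less loaded
dispatcher.onComplete(1); // worker 1 finishes "a"
dispatcher.dispatch("c"); // worker 1 again (0 vs 1 in flight)
console.log(received); // { 1: ["a", "c"], 2: ["b"] }
```

In a real cluster deployment, inFlight would be updated from worker IPC messages rather than local stubs, and stale counters (e.g. after a worker crash) need to be reset on respawn.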
Problem 6: Internal queues grow until OOM
Symptom: Memory climbs steadily during load tests and process crashes with out-of-memory.
Root cause: Producers outpace consumers and no backpressure/high-water mark logic limits queue growth.
Fix — Add bounded queues with high/low-water backpressure signals:
class BoundedQueue<T> {
private items: T[] = [];
constructor(private highWater = 10_000, private lowWater = 5_000) {}
push(item: T): boolean {
if (this.items.length >= this.highWater) return false;
this.items.push(item);
return true;
}
shift(): T | undefined {
return this.items.shift();
}
shouldResumeProducers(): boolean {
return this.items.length <= this.lowWater;
}
size(): number {
return this.items.length;
}
}
const queue = new BoundedQueue<string>();
const accepted = queue.push("job-123");
if (!accepted) {
throw new Error("Backpressure: queue is full, retry later");
}
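The separate low-water mark matters: it adds hysteresis, so producers don't flap on and off around a single threshold. A standalone demo with tiny watermarks (repeating the queue class so the snippet runs; high=3 and low=1 are chosen only to make the effect visible):

```typescript
class BoundedQueue<T> {
  private items: T[] = [];
  constructor(private highWater = 10_000, private lowWater = 5_000) {}
  push(item: T): boolean {
    if (this.items.length >= this.highWater) return false;
    this.items.push(item);
    return true;
  }
  shift(): T | undefined {
    return this.items.shift();
  }
  shouldResumeProducers(): boolean {
    return this.items.length <= this.lowWater;
  }
  size(): number {
    return this.items.length;
  }
}

const q = new BoundedQueue<number>(3, 1);
q.push(1); q.push(2); q.push(3);
const rejected = !q.push(4);                      // true — at high water
q.shift();                                        // drain to size 2
const resumeTooEarly = q.shouldResumeProducers(); // false — still above low
q.shift();                                        // drain to size 1
const resumeNow = q.shouldResumeProducers();      // true — at low water
```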
Key Takeaways
- Processes provide isolation, threads provide efficiency: Processes have separate address spaces (safe), threads share memory (fast). Choose based on whether you need isolation or low-overhead communication.
- The event loop is not "single-threaded": Node.js has one JS thread but uses kernel async I/O (epoll/kqueue) plus a thread pool for blocking operations. The single-thread model avoids locks, not parallelism.
- Copy-on-Write makes fork() cheap: The OS doesn't copy memory on fork — just page table entries. Actual copying happens lazily on write. This is why pre-fork servers work well.
- Work-stealing balances load automatically: Each processor has its own queue. Idle processors steal from busy ones, naturally balancing work without centralized scheduling overhead.
- Backpressure prevents cascade failures: When producers outpace consumers, you need explicit flow control (high/low water marks). Without it, unbounded queues consume memory until OOM.
- Green threads trade OS control for efficiency: Goroutines and virtual threads achieve million-scale concurrency by managing scheduling in userspace with tiny stacks, at the cost of runtime complexity.
- The GIL is a design choice, not a bug: Python's GIL simplifies the interpreter and C extension model. For CPU parallelism, use multiprocessing. For I/O concurrency, asyncio works fine.
- Signal handling is critical for production: SIGTERM starts graceful shutdown, SIGKILL force-kills. Your process must handle SIGTERM to drain connections and clean up before the orchestrator sends SIGKILL.
- Cluster mode shares ports, not memory: Node.js cluster workers share a server port via file descriptor passing, but each worker has its own V8 heap. Use IPC or SharedArrayBuffer for coordination.
- Match the concurrency model to the workload: I/O-bound → event loop, CPU-bound → thread pool, mixed → event loop + worker threads. Attempting CPU-heavy work on an event loop thread is the #1 Node.js performance anti-pattern.