Backend Process & Thread Management: How Servers Actually Handle Concurrency
Most backend developers have a mental model that goes something like: "request comes in, server handles it, response goes out." But what actually happens at the operating system level? How does Node.js handle 10,000 concurrent connections on a single thread? Why does Python's GIL exist? How do Go's goroutines differ from OS threads? And when you type cluster.fork(), what syscalls fire underneath?
This post tears apart process and thread management from the OS primitives up through the application frameworks. We'll build a multi-process server, a thread pool, a work-stealing scheduler, and an event loop — all from scratch in TypeScript — to understand exactly how modern backend runtimes manage concurrency.
The Operating System Foundation
Before we touch any application code, we need to understand what the OS provides:
┌─────────────────────────────────────────────────────────────────┐
│ USER SPACE │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Process 1 │ │ Process 2 │ │ Process 3 │ │ Process N │ │
│ │ │ │ │ │ │ │ │ │
│ │ ┌──┐ ┌──┐│ │ ┌──┐ ┌──┐│ │ ┌──┐ │ │ ┌──┐ ┌──┐│ │
│ │ │T1│ │T2││ │ │T1│ │T2││ │ │T1│ │ │ │T1│ │T2││ │
│ │ └──┘ └──┘│ │ └──┘ └──┘│ │ └──┘ │ │ └──┘ └──┘│ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
├──────────────────────────────────────────────────────────────────┤
│ KERNEL SPACE │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ SCHEDULER (CFS) │ │
│ │ │ │
│ │ Run Queue: [T1:P1] [T2:P1] [T1:P2] [T2:P2] [T1:P3] │ │
│ │ │ │
│ │ CPU 0: ──▶ T1:P1 ──▶ T2:P2 ──▶ T1:P3 ──▶ ... │ │
│ │ CPU 1: ──▶ T1:P2 ──▶ T2:P1 ──▶ T1:PN ──▶ ... │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Virtual Memory │ │ File Descriptors │ │
│ │ Management │ │ & I/O Subsystem │ │
│ └──────────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
Process vs Thread: The Real Difference
PROCESS:
┌──────────────────────────────────┐
│ PID: 1234 │
│ ┌────────────────────────────┐ │
│ │ Virtual Address Space │ │ ← Each process gets its own
│ │ ┌──────┐ ┌──────┐ │ │
│ │ │ Code │ │ Data │ │ │
│ │ └──────┘ └──────┘ │ │
│ │ ┌──────┐ ┌──────┐ │ │
│ │ │ Heap │ │Stack │ │ │
│ │ └──────┘ └──────┘ │ │
│ └────────────────────────────┘ │
│ File Descriptors: [0,1,2,3...] │ ← Own FD table
│ Signal Handlers: {...} │ ← Own signal disposition
│ Environment: {...} │ ← Own env vars
└──────────────────────────────────┘
THREADS (within a process):
┌──────────────────────────────────────────┐
│ PID: 1234 │
│ ┌────────────────────────────────────┐ │
│ │ SHARED: Code, Data, Heap, FDs │ │ ← All threads share
│ └────────────────────────────────────┘ │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Thread 1 │ │ Thread 2 │ │ Thread 3 │ │
│ │ ┌─────┐ │ │ ┌─────┐ │ │ ┌─────┐ ││
│ │ │Stack│ │ │ │Stack│ │ │ │Stack│ ││ ← Each thread: own stack
│ │ └─────┘ │ │ └─────┘ │ │ └─────┘ ││
│ │ TID:1001 │ │ TID:1002 │ │ TID:1003 ││ ← Own thread ID
│ │ Regs:{} │ │ Regs:{} │ │ Regs:{} ││ ← Own CPU registers
│ └─────────┘ └─────────┘ └─────────┘ │
└──────────────────────────────────────────┘
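The "shared heap" box is directly observable in Node. In this sketch, a worker thread writes into an Int32Array backed by a SharedArrayBuffer, and the main thread sees the write without any message passing — the same memory, not a copy. A forked process would need explicit IPC or shared memory instead.

```typescript
import { Worker } from "worker_threads";

// Threads share the process's address space: a SharedArrayBuffer handed to a
// worker via workerData is the *same* memory, not a structured-clone copy.
const shared = new Int32Array(new SharedArrayBuffer(4));

const worker = new Worker(
  `const { workerData } = require("worker_threads");
   Atomics.store(workerData, 0, 42);`,
  { eval: true, workerData: shared }
);

worker.on("exit", () => {
  console.log(Atomics.load(shared, 0)); // 42 — written by the other thread
});
```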
Building a Process Manager from Scratch
Let's build the core abstraction that manages child processes — similar to what Node.js's cluster module does internally:
import { EventEmitter } from "events";
// ─── Process Descriptor ───────────────────────────────────────
interface ProcessDescriptor {
pid: number;
state: "running" | "stopped" | "zombie" | "sleeping";
cpuTime: number; // milliseconds of CPU used
memoryRSS: number; // resident set size in bytes
startTime: number; // when process started
exitCode: number | null;
signal: string | null;
restarts: number;
}
// ─── IPC Channel Types ────────────────────────────────────────
type IPCMessage = {
type: string;
payload: unknown;
source: number; // sender PID
timestamp: number;
};
type IPCChannel = {
send(msg: IPCMessage): boolean;
onMessage(handler: (msg: IPCMessage) => void): void;
close(): void;
};
// ─── Process Manager ─────────────────────────────────────────
class ProcessManager extends EventEmitter {
private processes: Map<number, ProcessDescriptor> = new Map();
private ipcChannels: Map<number, IPCChannel> = new Map();
private maxProcesses: number;
private restartPolicy: "always" | "on-failure" | "never";
private maxRestarts: number;
private restartDelay: number;
constructor(options: {
maxProcesses?: number;
restartPolicy?: "always" | "on-failure" | "never";
maxRestarts?: number;
restartDelay?: number;
} = {}) {
super();
this.maxProcesses = options.maxProcesses ?? require("os").cpus().length;
this.restartPolicy = options.restartPolicy ?? "on-failure";
this.maxRestarts = options.maxRestarts ?? 5;
this.restartDelay = options.restartDelay ?? 1000;
}
// Fork a new child process
fork(modulePath: string, args: string[] = [], env: Record<string, string> = {}): number {
if (this.processes.size >= this.maxProcesses) {
throw new Error(
`Max processes (${this.maxProcesses}) reached. Cannot fork more.`
);
}
const { fork: cpFork } = require("child_process");
const child = cpFork(modulePath, args, {
env: { ...process.env, ...env },
stdio: ["pipe", "pipe", "pipe", "ipc"], // stdin, stdout, stderr, ipc
serialization: "advanced", // Use V8 serialization for structured clone
});
const descriptor: ProcessDescriptor = {
pid: child.pid!,
state: "running",
cpuTime: 0,
memoryRSS: 0,
startTime: Date.now(),
exitCode: null,
signal: null,
restarts: 0,
};
this.processes.set(child.pid!, descriptor);
this.setupIPC(child);
this.setupEventHandlers(child, modulePath, args, env);
this.emit("fork", { pid: child.pid, modulePath });
return child.pid!;
}
// Setup IPC channel for a child process
private setupIPC(child: any): void {
const channel: IPCChannel = {
send(msg: IPCMessage): boolean {
if (!child.connected) return false;
try {
child.send(msg);
return true;
} catch {
return false;
}
},
onMessage(handler: (msg: IPCMessage) => void): void {
child.on("message", handler);
},
close(): void {
if (child.connected) {
child.disconnect();
}
},
};
this.ipcChannels.set(child.pid!, channel);
}
// Setup lifecycle event handlers
private setupEventHandlers(
child: any,
modulePath: string,
args: string[],
env: Record<string, string>
): void {
child.on("exit", (code: number | null, signal: string | null) => {
const descriptor = this.processes.get(child.pid!);
if (!descriptor) return;
descriptor.state = "zombie";
descriptor.exitCode = code;
descriptor.signal = signal;
this.emit("exit", {
pid: child.pid,
code,
signal,
uptime: Date.now() - descriptor.startTime,
});
// Restart policy evaluation
this.evaluateRestart(child.pid!, code, signal, modulePath, args, env);
});
child.on("error", (err: Error) => {
this.emit("error", { pid: child.pid, error: err });
});
// Monitor memory and CPU periodically
const monitorInterval = setInterval(() => {
if (!child.connected) {
clearInterval(monitorInterval);
return;
}
try {
// ChildProcess exposes no portable CPU/memory API; a real monitor reads
// /proc/<pid>/stat or uses a library like pidusage. Fall back to zeros here
// rather than process.cpuUsage(), which measures the *parent*, not the child.
const usage = child.usage?.() ?? { user: 0, system: 0 };
const descriptor = this.processes.get(child.pid!);
if (descriptor) {
descriptor.cpuTime = (usage.user + usage.system) / 1000;
}
} catch {
clearInterval(monitorInterval);
}
}, 5000);
}
// Evaluate whether to restart a process
private evaluateRestart(
pid: number,
code: number | null,
signal: string | null,
modulePath: string,
args: string[],
env: Record<string, string>
): void {
const descriptor = this.processes.get(pid);
if (!descriptor) return;
const shouldRestart = (() => {
switch (this.restartPolicy) {
case "always":
return true;
case "on-failure":
return code !== 0;
case "never":
return false;
}
})();
if (!shouldRestart) {
this.processes.delete(pid);
this.ipcChannels.delete(pid);
return;
}
if (descriptor.restarts >= this.maxRestarts) {
this.emit("max-restarts", { pid, restarts: descriptor.restarts });
this.processes.delete(pid);
this.ipcChannels.delete(pid);
return;
}
// Exponential backoff for restarts
const delay = this.restartDelay * Math.pow(2, descriptor.restarts);
this.emit("restarting", {
pid,
attempt: descriptor.restarts + 1,
delay,
});
setTimeout(() => {
this.processes.delete(pid);
this.ipcChannels.delete(pid);
const newPid = this.fork(modulePath, args, env);
const newDescriptor = this.processes.get(newPid)!;
newDescriptor.restarts = descriptor.restarts + 1;
}, delay);
}
// Send message to a specific process
sendTo(pid: number, type: string, payload: unknown): boolean {
const channel = this.ipcChannels.get(pid);
if (!channel) return false;
return channel.send({
type,
payload,
source: process.pid,
timestamp: Date.now(),
});
}
// Broadcast message to all processes
broadcast(type: string, payload: unknown): Map<number, boolean> {
const results = new Map<number, boolean>();
for (const [pid] of this.processes) {
results.set(pid, this.sendTo(pid, type, payload));
}
return results;
}
// Graceful shutdown of a specific process
async shutdown(pid: number, timeout: number = 5000): Promise<boolean> {
const descriptor = this.processes.get(pid);
if (!descriptor || descriptor.state !== "running") return false;
// Send shutdown signal via IPC first
this.sendTo(pid, "shutdown", { deadline: Date.now() + timeout });
return new Promise<boolean>((resolve) => {
const timer = setTimeout(() => {
// Force kill if graceful shutdown times out
try {
process.kill(pid, "SIGKILL");
} catch {}
resolve(false);
}, timeout);
const onExit = (event: { pid: number }) => {
if (event.pid === pid) {
clearTimeout(timer);
this.removeListener("exit", onExit);
resolve(true);
}
};
this.on("exit", onExit);
// Send SIGTERM
try {
process.kill(pid, "SIGTERM");
} catch {
clearTimeout(timer);
resolve(false);
}
});
}
// Get status of all processes
getStatus(): ProcessDescriptor[] {
return Array.from(this.processes.values());
}
}
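The primitive our ProcessManager wraps can be seen directly. This sketch spawns a child Node process with an IPC channel in stdio slot 3 (the same `"ipc"` slot we passed to fork above) and round-trips a message; the inline `-e` script is an assumption for illustration:

```typescript
import { spawn } from "child_process";

// Spawning with "ipc" in stdio slot 3 gives the parent child.send()/on("message")
// and the child process.send() — messages are serialized over a pipe.
const child = spawn(
  process.execPath,
  ["-e", `process.on("message", (m) => process.send({ echo: m, pid: process.pid }));`],
  { stdio: ["ignore", "inherit", "inherit", "ipc"] }
);

child.on("message", (msg: any) => {
  console.log(msg.pid !== process.pid); // true — a separate process answered
  child.kill();
});
child.send("ping");
```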
The Event Loop: How Single-Threaded Servers Work
Node.js runs JavaScript on a single thread but handles thousands of concurrent connections. The secret is the event loop backed by libuv:
┌─────────────────────────────────────────────────────────┐
│ NODE.JS EVENT LOOP │
│ │
│ ┌─── Phase 1 ──────────────────────────────────────┐ │
│ │ TIMERS │ │
│ │ Execute setTimeout() and setInterval() callbacks │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 2 ──────────────────────────────────────┐ │
│ │ PENDING CALLBACKS │ │
│ │ I/O callbacks deferred to next loop iteration │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 3 ──────────────────────────────────────┐ │
│ │ IDLE / PREPARE │ │
│ │ Internal use only (libuv internals) │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 4 ──────────────────────────────────────┐ │
│ │ POLL │ │
│ │ Retrieve new I/O events │ │
│ │ Execute I/O callbacks (almost all except close) │ │
│ │ *** Will block here when no work pending *** │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 5 ──────────────────────────────────────┐ │
│ │ CHECK │ │
│ │ Execute setImmediate() callbacks │ │
│ └──────────────────────────┬───────────────────────┘ │
│ ▼ │
│ ┌─── Phase 6 ──────────────────────────────────────┐ │
│ │ CLOSE CALLBACKS │ │
│ │ socket.on('close', ...) etc. │ │
│ └──────────────────────────┬───────────────────────┘ │
│ │ │
│ ◄──── loop back ──────┘ │
│ │
│ Between EVERY phase: │
│ ┌──────────────────────────────────────────────────┐ │
│ │ MICROTASK QUEUE │ │
│ │ process.nextTick() → Promise callbacks → ... │ │
│ └──────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Let's build our own event loop to understand the mechanics:
// ─── Event Loop Implementation ────────────────────────────────
type TimerEntry = {
id: number;
callback: () => void;
executeAt: number;
interval: number | null; // null for setTimeout, ms for setInterval
};
type IOCallback = {
fd: number;
event: "readable" | "writable" | "error";
callback: () => void;
};
type MicrotaskEntry = {
callback: () => void;
priority: "nextTick" | "promise";
};
class EventLoop {
private timers: TimerEntry[] = [];
private pendingCallbacks: Array<() => void> = [];
private ioWatchers: Map<number, IOCallback[]> = new Map();
private checkCallbacks: Array<() => void> = [];
private closeCallbacks: Array<() => void> = [];
private microtaskQueue: MicrotaskEntry[] = [];
private nextTimerId = 1;
private running = false;
private iteration = 0;
// ─── Timer Registration ──────────────────────────────────
setTimeout(callback: () => void, delay: number): number {
const id = this.nextTimerId++;
this.timers.push({
id,
callback,
executeAt: Date.now() + delay,
interval: null,
});
this.timers.sort((a, b) => a.executeAt - b.executeAt);
return id;
}
setInterval(callback: () => void, interval: number): number {
const id = this.nextTimerId++;
this.timers.push({
id,
callback,
executeAt: Date.now() + interval,
interval,
});
this.timers.sort((a, b) => a.executeAt - b.executeAt);
return id;
}
clearTimer(id: number): void {
this.timers = this.timers.filter((t) => t.id !== id);
}
// ─── Microtask Registration ──────────────────────────────
nextTick(callback: () => void): void {
this.microtaskQueue.push({ callback, priority: "nextTick" });
}
queueMicrotask(callback: () => void): void {
this.microtaskQueue.push({ callback, priority: "promise" });
}
// ─── I/O Registration ───────────────────────────────────
watchFD(fd: number, event: "readable" | "writable" | "error", callback: () => void): void {
if (!this.ioWatchers.has(fd)) {
this.ioWatchers.set(fd, []);
}
this.ioWatchers.get(fd)!.push({ fd, event, callback });
}
// ─── setImmediate equivalent ─────────────────────────────
setImmediate(callback: () => void): void {
this.checkCallbacks.push(callback);
}
// ─── Close callback registration ────────────────────────
onClose(callback: () => void): void {
this.closeCallbacks.push(callback);
}
// ─── Drain microtask queue ──────────────────────────────
private drainMicrotasks(): void {
// nextTick has higher priority than promise callbacks
// Process all nextTick first, then all promise callbacks
// But new callbacks added during execution are also processed
while (this.microtaskQueue.length > 0) {
// Sort: nextTick before promise
this.microtaskQueue.sort((a, b) => {
if (a.priority === "nextTick" && b.priority === "promise") return -1;
if (a.priority === "promise" && b.priority === "nextTick") return 1;
return 0; // preserve insertion order within same priority
});
const entry = this.microtaskQueue.shift()!;
try {
entry.callback();
} catch (err) {
console.error("Microtask error:", err);
}
}
}
// ─── Phase 1: Timers ────────────────────────────────────
private processTimers(): void {
const now = Date.now();
const expired: TimerEntry[] = [];
const remaining: TimerEntry[] = [];
for (const timer of this.timers) {
if (timer.executeAt <= now) {
expired.push(timer);
} else {
remaining.push(timer);
}
}
this.timers = remaining;
for (const timer of expired) {
try {
timer.callback();
} catch (err) {
console.error("Timer callback error:", err);
}
// Re-schedule intervals
if (timer.interval !== null) {
this.timers.push({
...timer,
executeAt: now + timer.interval,
});
}
}
this.timers.sort((a, b) => a.executeAt - b.executeAt);
}
// ─── Phase 2: Pending Callbacks ─────────────────────────
private processPendingCallbacks(): void {
const callbacks = this.pendingCallbacks.splice(0);
for (const cb of callbacks) {
try {
cb();
} catch (err) {
console.error("Pending callback error:", err);
}
}
}
// ─── Phase 4: Poll ──────────────────────────────────────
private poll(): void {
// In a real implementation, this would call epoll_wait/kqueue/IOCP
// and block until I/O events arrive or a timer expires
// Simulate: check all watched FDs for readiness
for (const [fd, watchers] of this.ioWatchers) {
for (const watcher of watchers) {
// In reality, this would check epoll readiness
// For simulation, we'll process any registered callbacks
try {
watcher.callback();
} catch (err) {
this.pendingCallbacks.push(() => {
console.error(`IO error on fd ${fd}:`, err);
});
}
}
}
}
// ─── Phase 5: Check ─────────────────────────────────────
private processCheck(): void {
const callbacks = this.checkCallbacks.splice(0);
for (const cb of callbacks) {
try {
cb();
} catch (err) {
console.error("Check callback error:", err);
}
}
}
// ─── Phase 6: Close Callbacks ───────────────────────────
private processCloseCallbacks(): void {
const callbacks = this.closeCallbacks.splice(0);
for (const cb of callbacks) {
try {
cb();
} catch (err) {
console.error("Close callback error:", err);
}
}
}
// ─── Has pending work? ──────────────────────────────────
private hasPendingWork(): boolean {
return (
this.timers.length > 0 ||
this.pendingCallbacks.length > 0 ||
this.ioWatchers.size > 0 ||
this.checkCallbacks.length > 0 ||
this.closeCallbacks.length > 0 ||
this.microtaskQueue.length > 0
);
}
// ─── Main Loop ──────────────────────────────────────────
async run(): Promise<void> {
this.running = true;
while (this.running && this.hasPendingWork()) {
this.iteration++;
// Phase 1: Timers
this.processTimers();
this.drainMicrotasks();
// Phase 2: Pending callbacks
this.processPendingCallbacks();
this.drainMicrotasks();
// Phase 3: Idle/Prepare (internal, skip)
// Phase 4: Poll
this.poll();
this.drainMicrotasks();
// Phase 5: Check (setImmediate)
this.processCheck();
this.drainMicrotasks();
// Phase 6: Close callbacks
this.processCloseCallbacks();
this.drainMicrotasks();
// Yield to prevent starving the actual event loop
// (since we're running inside Node.js)
await new Promise((resolve) => global.setTimeout(resolve, 0));
}
this.running = false;
}
stop(): void {
this.running = false;
}
getStats(): {
iteration: number;
timers: number;
ioWatchers: number;
microtasks: number;
} {
return {
iteration: this.iteration,
timers: this.timers.length,
ioWatchers: this.ioWatchers.size,
microtasks: this.microtaskQueue.length,
};
}
}
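The phase ordering our loop models has a famous observable consequence in real Node: scheduled from inside a poll-phase (I/O) callback, setImmediate() deterministically beats a 0 ms timer, because the loop proceeds from poll to check before wrapping back to timers. (Scheduled from the top level of a script, the order is nondeterministic.)

```typescript
import { stat } from "fs";

const order: string[] = [];

// stat's completion callback runs in the poll phase; check (setImmediate)
// comes next, and the timers phase only runs on the following iteration.
stat(".", () => {
  setTimeout(() => order.push("timeout"), 0);
  setImmediate(() => order.push("immediate"));
  setTimeout(() => console.log(order), 20); // ["immediate", "timeout"]
});
```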
Thread Pool: How libuv Offloads Blocking Work
Node.js uses a libuv thread pool (default 4 threads, configurable via UV_THREADPOOL_SIZE up to 1024) for operations that can't be done asynchronously at the OS level — file system calls, dns.lookup(), zlib compression, and CPU-heavy crypto such as pbkdf2 and scrypt:
┌───────────────────────────────────────────────────────────┐
│ MAIN THREAD │
│ │
│ JavaScript ──▶ N-API ──▶ libuv │
│ │ │
│ ┌─────────┴─────────┐ │
│ │ Is it async at OS? │ │
│ └─────────┬─────────┘ │
│ YES │ │ NO │
│ ▼ ▼ │
│ ┌─────────┐ ┌────────────────────┐ │
│ │ epoll / │ │ THREAD POOL │ │
│ │ kqueue │ │ (UV_THREADPOOL_SIZE)│ │
│ │ │ │ │ │
│ │ Network │ │ ┌───┐ ┌───┐ ┌───┐ │ │
│ │ sockets │ │ │ W1│ │ W2│ │ W3│ │ │
│ │ pipes │ │ └───┘ └───┘ └───┘ │ │
│ │ signals │ │ ┌───┐ │ │
│ │ │ │ │ W4│ (default 4) │ │
│ └─────────┘ │ └───┘ │ │
│ │ │ │
│ │ FS ops, DNS lookup, │ │
│ │ compression, crypto │ │
│ └─────────────────────┘ │
└───────────────────────────────────────────────────────────┘
Let's build a thread pool from scratch. One caveat up front: plain JavaScript can't run two functions in parallel, so the "workers" below are cooperative async slots — what we're modeling is the queueing, priority, timeout, and draining logic; real parallelism would come from worker_threads:
// ─── Task Queue for Thread Pool ───────────────────────────────
type TaskPriority = "high" | "normal" | "low";
interface WorkerTask<T = unknown> {
id: string;
fn: () => T | Promise<T>;
priority: TaskPriority;
timeout: number;
enqueuedAt: number;
resolve: (value: T) => void;
reject: (error: Error) => void;
}
interface WorkerStats {
id: number;
state: "idle" | "busy" | "draining";
tasksCompleted: number;
totalBusyTime: number;
currentTaskId: string | null;
currentTaskStartedAt: number | null;
}
// ─── Thread Pool Implementation ──────────────────────────────
class ThreadPool {
private taskQueue: WorkerTask[] = [];
private workers: WorkerStats[] = [];
private activeWorkers: Map<number, { task: WorkerTask; abortController: AbortController }> = new Map();
private poolSize: number;
private maxQueueSize: number;
private draining = false;
constructor(options: {
poolSize?: number;
maxQueueSize?: number;
} = {}) {
this.poolSize = options.poolSize ?? 4;
this.maxQueueSize = options.maxQueueSize ?? 1024;
// Initialize worker descriptors
for (let i = 0; i < this.poolSize; i++) {
this.workers.push({
id: i,
state: "idle",
tasksCompleted: 0,
totalBusyTime: 0,
currentTaskId: null,
currentTaskStartedAt: null,
});
}
}
// Submit a task to the pool
submit<T>(
fn: () => T | Promise<T>,
options: {
priority?: TaskPriority;
timeout?: number;
} = {}
): Promise<T> {
if (this.draining) {
return Promise.reject(new Error("Thread pool is draining"));
}
if (this.taskQueue.length >= this.maxQueueSize) {
return Promise.reject(new Error("Task queue is full"));
}
return new Promise<T>((resolve, reject) => {
const task: WorkerTask<T> = {
id: `task_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`,
fn,
priority: options.priority ?? "normal",
timeout: options.timeout ?? 30000,
enqueuedAt: Date.now(),
resolve: resolve as (value: unknown) => void,
reject,
};
// Insert based on priority
this.insertByPriority(task);
// Try to schedule immediately
this.scheduleNext();
});
}
// Priority-based insertion
private insertByPriority(task: WorkerTask): void {
const priorityOrder: Record<TaskPriority, number> = {
high: 0,
normal: 1,
low: 2,
};
const taskPriority = priorityOrder[task.priority];
let insertIndex = this.taskQueue.length;
for (let i = 0; i < this.taskQueue.length; i++) {
if (priorityOrder[this.taskQueue[i].priority] > taskPriority) {
insertIndex = i;
break;
}
}
this.taskQueue.splice(insertIndex, 0, task);
}
// Schedule next task on an idle worker
private scheduleNext(): void {
if (this.taskQueue.length === 0) return;
const idleWorker = this.workers.find((w) => w.state === "idle");
if (!idleWorker) return;
const task = this.taskQueue.shift()!;
this.executeOnWorker(idleWorker, task);
}
// Execute task on a specific worker
private async executeOnWorker(worker: WorkerStats, task: WorkerTask): Promise<void> {
worker.state = "busy";
worker.currentTaskId = task.id;
worker.currentTaskStartedAt = Date.now();
const abortController = new AbortController();
this.activeWorkers.set(worker.id, { task, abortController });
// Setup timeout. Note: aborting rejects the caller's promise, but JavaScript
// can't preempt a running task — the function itself still runs to completion
const timeoutHandle = setTimeout(() => {
abortController.abort();
}, task.timeout);
try {
const result = await Promise.race([
// Actual task execution
(async () => {
return await task.fn();
})(),
// Abort signal
new Promise<never>((_, reject) => {
abortController.signal.addEventListener("abort", () => {
reject(new Error(`Task ${task.id} timed out after ${task.timeout}ms`));
});
}),
]);
task.resolve(result);
} catch (error) {
task.reject(error instanceof Error ? error : new Error(String(error)));
} finally {
clearTimeout(timeoutHandle);
const busyTime = Date.now() - (worker.currentTaskStartedAt ?? Date.now());
worker.totalBusyTime += busyTime;
worker.tasksCompleted++;
worker.state = this.draining ? "draining" : "idle";
worker.currentTaskId = null;
worker.currentTaskStartedAt = null;
this.activeWorkers.delete(worker.id);
// Schedule next task if not draining
if (!this.draining) {
this.scheduleNext();
}
}
}
// Drain the pool: finish current tasks, reject queued tasks
async drain(): Promise<void> {
this.draining = true;
// Reject all queued tasks
for (const task of this.taskQueue) {
task.reject(new Error("Thread pool is draining"));
}
this.taskQueue = [];
// Wait for active tasks to complete
while (this.activeWorkers.size > 0) {
await new Promise((resolve) => setTimeout(resolve, 100));
}
for (const worker of this.workers) {
worker.state = "draining";
}
}
// Get pool statistics
getStats(): {
poolSize: number;
busyWorkers: number;
idleWorkers: number;
queuedTasks: number;
workers: WorkerStats[];
} {
return {
poolSize: this.poolSize,
busyWorkers: this.workers.filter((w) => w.state === "busy").length,
idleWorkers: this.workers.filter((w) => w.state === "idle").length,
queuedTasks: this.taskQueue.length,
workers: [...this.workers],
};
}
}
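You can watch the real libuv pool at work. crypto.pbkdf2 is dispatched to the thread pool, so firing 8 CPU-bound hashes at the default 4 threads should complete them in roughly two waves — the completion timestamps cluster in pairs of ~4 (exact timings vary by machine):

```typescript
import { pbkdf2 } from "crypto";

// 8 CPU-bound hashes on a 4-thread pool: only 4 run at once, the rest queue.
const start = Date.now();
const completions: number[] = [];

for (let i = 0; i < 8; i++) {
  pbkdf2("secret", "salt", 100_000, 64, "sha512", (err) => {
    if (err) throw err;
    completions.push(Date.now() - start);
    if (completions.length === 8) console.log(completions);
  });
}
```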
Work-Stealing Scheduler
Go's runtime and many high-performance servers use work-stealing schedulers. Each thread has its own local queue, and idle threads "steal" work from busy threads:
┌──────────────────────────────────────────────────────────────┐
│ WORK-STEALING SCHEDULER │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Processor P0 │ │ Processor P1 │ │
│ │ │ │ │ │
│ │ Local Queue: │ │ Local Queue: │ │
│ │ ┌──┬──┬──┬──┐ │ │ ┌──┐ │ │
│ │ │G1│G2│G3│G4│ │ │ │G5│ ← only 1 │ │
│ │ └──┴──┴──┴──┘ │ │ └──┘ │ │
│ │ ▲ push/pop │ │ │ │ │
│ │ │ (LIFO) │ │ │ idle! │ │
│ └──┼───────────────┘ └───────┼───────────┘ │
│ │ │ │
│ │ ┌──────▼──────┐ │
│ │ │ STEAL from │ │
│ │◄────────────────────│ P0's queue │ │
│ │ steal G1 (FIFO) │ (bottom) │ │
│ │ └─────────────┘ │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ GLOBAL RUN QUEUE │ │
│ │ ┌──┬──┬──┬──┬──┐ │ │
│ │ │G6│G7│G8│G9│..│ ← overflow │ │
│ │ └──┴──┴──┴──┴──┘ from locals │ │
│ └─────────────────────────────────────┘ │
│ │
│ Steal Order: 1. Local queue (LIFO) │
│ 2. Global queue (FIFO) │
│ 3. Other processor's queue (FIFO from bottom) │
│ 4. Network poller │
└──────────────────────────────────────────────────────────────┘
// ─── Work-Stealing Scheduler ──────────────────────────────────
interface Goroutine {
id: number;
fn: () => Promise<void>;
state: "runnable" | "running" | "blocked" | "dead";
processor: number | null;
createdAt: number;
totalRunTime: number;
}
// Deque (double-ended queue) for local run queues
class Deque<T> {
private items: T[] = [];
pushBack(item: T): void {
this.items.push(item);
}
popBack(): T | undefined {
return this.items.pop();
}
popFront(): T | undefined {
return this.items.shift();
}
stealHalf(): T[] {
const half = Math.ceil(this.items.length / 2);
return this.items.splice(0, half);
}
get length(): number {
return this.items.length;
}
isEmpty(): boolean {
return this.items.length === 0;
}
}
interface Processor {
id: number;
localQueue: Deque<Goroutine>;
currentGoroutine: Goroutine | null;
state: "running" | "idle" | "syscall" | "stopped";
stealAttempts: number;
successfulSteals: number;
tasksExecuted: number;
}
class WorkStealingScheduler {
private processors: Processor[] = [];
private globalQueue: Goroutine[] = [];
private allGoroutines: Map<number, Goroutine> = new Map();
private nextGoroutineId = 1;
private numProcessors: number;
private maxLocalQueueSize = 256;
private running = false;
private scheduleCount = 0;
constructor(numProcessors: number = 4) {
this.numProcessors = numProcessors;
for (let i = 0; i < numProcessors; i++) {
this.processors.push({
id: i,
localQueue: new Deque<Goroutine>(),
currentGoroutine: null,
state: "idle",
stealAttempts: 0,
successfulSteals: 0,
tasksExecuted: 0,
});
}
}
// Spawn a new goroutine (submit work)
spawn(fn: () => Promise<void>): number {
const g: Goroutine = {
id: this.nextGoroutineId++,
fn,
state: "runnable",
processor: null,
createdAt: Date.now(),
totalRunTime: 0,
};
this.allGoroutines.set(g.id, g);
// Place on the least-loaded processor's local queue (Go itself uses the current P's run queue)
const currentP = this.findLeastLoadedProcessor();
if (currentP && currentP.localQueue.length < this.maxLocalQueueSize) {
currentP.localQueue.pushBack(g);
g.processor = currentP.id;
} else {
// Overflow to global queue
this.globalQueue.push(g);
}
return g.id;
}
private findLeastLoadedProcessor(): Processor | null {
let min = Infinity;
let result: Processor | null = null;
for (const p of this.processors) {
if (p.localQueue.length < min) {
min = p.localQueue.length;
result = p;
}
}
return result;
}
// Find next goroutine for a processor
private findWork(p: Processor): Goroutine | null {
this.scheduleCount++;
// Every 61st schedule, check global queue first to prevent starvation
if (this.scheduleCount % 61 === 0 && this.globalQueue.length > 0) {
const g = this.globalQueue.shift()!;
// Also grab a batch from global queue for local
const batchSize = Math.min(
this.globalQueue.length,
Math.floor(this.maxLocalQueueSize / 2)
);
for (let i = 0; i < batchSize; i++) {
p.localQueue.pushBack(this.globalQueue.shift()!);
}
return g;
}
// 1. Check local queue (LIFO — pop from back)
const local = p.localQueue.popBack();
if (local) return local;
// 2. Check global queue
if (this.globalQueue.length > 0) {
const g = this.globalQueue.shift()!;
// Grab batch
const batchSize = Math.min(
this.globalQueue.length,
Math.floor(this.maxLocalQueueSize / 2)
);
for (let i = 0; i < batchSize; i++) {
p.localQueue.pushBack(this.globalQueue.shift()!);
}
return g;
}
// 3. Steal from other processors
return this.stealFromOthers(p);
}
// Steal work from another processor's queue
private stealFromOthers(thief: Processor): Goroutine | null {
// Randomize start to avoid all processors stealing from the same target
const start = Math.floor(Math.random() * this.numProcessors);
for (let i = 0; i < this.numProcessors; i++) {
const targetIdx = (start + i) % this.numProcessors;
const target = this.processors[targetIdx];
if (target.id === thief.id) continue;
if (target.localQueue.isEmpty()) continue;
thief.stealAttempts++;
// Steal half from the front (FIFO — oldest tasks)
const stolen = target.localQueue.stealHalf();
if (stolen.length === 0) continue;
thief.successfulSteals++;
// Take first stolen item as work, put rest in local queue
const work = stolen[0];
for (let j = 1; j < stolen.length; j++) {
thief.localQueue.pushBack(stolen[j]);
}
return work;
}
return null;
}
// Run the scheduler
async run(): Promise<void> {
this.running = true;
// Run each processor as a concurrent "thread"
const processorLoops = this.processors.map((p) => this.processorLoop(p));
await Promise.all(processorLoops);
}
private async processorLoop(p: Processor): Promise<void> {
p.state = "running";
while (this.running) {
const g = this.findWork(p);
if (!g) {
p.state = "idle";
// Park briefly before looking for work again
await new Promise((resolve) => setTimeout(resolve, 1));
// Check if all work is done
if (this.isAllWorkDone()) {
break;
}
continue;
}
p.state = "running";
p.currentGoroutine = g;
g.state = "running";
g.processor = p.id;
const startTime = Date.now();
try {
await g.fn();
g.state = "dead";
} catch (err) {
console.error(`Goroutine ${g.id} panicked:`, err);
g.state = "dead";
}
g.totalRunTime += Date.now() - startTime;
p.currentGoroutine = null;
p.tasksExecuted++;
}
p.state = "stopped";
}
private isAllWorkDone(): boolean {
if (this.globalQueue.length > 0) return false;
for (const p of this.processors) {
if (!p.localQueue.isEmpty()) return false;
if (p.currentGoroutine !== null) return false;
}
return true;
}
stop(): void {
this.running = false;
}
getStats(): {
processors: Array<{
id: number;
state: string;
localQueueSize: number;
tasksExecuted: number;
stealAttempts: number;
successfulSteals: number;
}>;
globalQueueSize: number;
totalGoroutines: number;
completedGoroutines: number;
} {
return {
processors: this.processors.map((p) => ({
id: p.id,
state: p.state,
localQueueSize: p.localQueue.length,
tasksExecuted: p.tasksExecuted,
stealAttempts: p.stealAttempts,
successfulSteals: p.successfulSteals,
})),
globalQueueSize: this.globalQueue.length,
totalGoroutines: this.allGoroutines.size,
completedGoroutines: Array.from(this.allGoroutines.values()).filter(
(g) => g.state === "dead"
).length,
};
}
}
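The stealing interaction at the heart of the scheduler can be distilled to a two-"processor" sketch (an illustration, not Go's actual algorithm: cooperative async loops stand in for OS threads). P1 starts with all the work; P0's queue is empty, so it steals the front half of P1's queue and both drain in parallel:

```typescript
type Task = { id: number };

const queues: Task[][] = [[], []];
const executedBy: number[][] = [[], []];

for (let i = 0; i < 8; i++) queues[1].push({ id: i });

async function processor(id: number): Promise<void> {
  const victim = 1 - id;
  while (true) {
    let task = queues[id].pop(); // local queue: LIFO
    if (!task) {
      // Steal half of the victim's queue from the front (its oldest tasks)
      const stolen = queues[victim].splice(0, Math.ceil(queues[victim].length / 2));
      if (stolen.length === 0) break; // no work anywhere: stop
      queues[id].push(...stolen);
      task = queues[id].pop()!;
    }
    executedBy[id].push(task.id);
    await new Promise((r) => setImmediate(r)); // yield, like a preempted thread
  }
}

const done = Promise.all([processor(0), processor(1)]);
done.then(() => console.log(executedBy.map((q) => q.length))); // all 8 ran, split across both
```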
Node.js Cluster Mode: Multi-Process Architecture
The cluster module forks multiple processes that share the same server port:
┌──────────────────────────────────────────────────────────────────┐
│ CLUSTER ARCHITECTURE │
│ │
│ ┌──────────────────────────────────────┐ │
│ │ MASTER PROCESS │ │
│ │ │ │
│ │ ┌────────────────────────────────┐ │ │
│ │ │ Load Balancer (Round-Robin) │ │ │
│ │ │ OR OS-level (SO_REUSEPORT) │ │ │
│ │ └────────────┬───────────────────┘ │ │
│ │ │ │ │
│ │ Listening on port 3000 │ │
│ └───────────────┼──────────────────────┘ │
│ │ │
│ ┌──────────────┼──────────────┬──────────────┐ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │Worker│ │Worker│ │Worker│ │Worker│ │
│ │ #1 │ │ #2 │ │ #3 │ │ #4 │ │
│ │ │ │ │ │ │ │ │ │
│ │ pid: │ │ pid: │ │ pid: │ │ pid: │ │
│ │ 1001 │ │ 1002 │ │ 1003 │ │ 1004 │ │
│ │ │ │ │ │ │ │ │ │
│ │ Own │ │ Own │ │ Own │ │ Own │ │
│ │ V8 │ │ V8 │ │ V8 │ │ V8 │ │
│ │ heap │ │ heap │ │ heap │ │ heap │ │
│ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ │
│ │ │ │ │ │
│ └─────┬─────┴─────┬─────┘ │ │
│ │ │ │ │
│ ┌─────▼─────┐ ┌──▼──────────┐ ┌──▼──────┐ │
│ │ IPC Pipe │ │ Shared FD │ │ Signals │ │
│ │ (messages)│ │ (SO_REUSEPORT)│ │ SIGTERM │ │
│ └───────────┘ └─────────────┘ └─────────┘ │
└──────────────────────────────────────────────────────────────────┘
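The diagram maps onto a few lines of Node's real cluster API. Here is the minimal skeleton as a sketch (the port, worker count, and restart policy are illustrative), which the manager below simulates rather than calls:

```typescript
import cluster from "node:cluster";
import * as os from "node:os";
import * as http from "node:http";

// Minimal real cluster: the primary forks one worker per CPU; each
// worker calls listen() on the same port, and the primary distributes
// accepted connections to workers (round-robin by default on Linux).
function startCluster(port: number): void {
  if (cluster.isPrimary) {
    for (let i = 0; i < os.cpus().length; i++) {
      cluster.fork(); // child_process fork + IPC channel under the hood
    }
    // Replace workers that die with a non-zero exit code
    cluster.on("exit", (_worker, code) => {
      if (code !== 0) cluster.fork();
    });
  } else {
    http
      .createServer((_req, res) => res.end(`handled by pid ${process.pid}`))
      .listen(port);
  }
}
```

Everything the manager below adds (health checks, draining, rolling restarts) layers on top of exactly this skeleton.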
Let's build a cluster manager that models the production control plane. The fork itself is simulated here; in a real deployment you would swap in child_process.fork():
// ─── Cluster Manager ──────────────────────────────────────────
import { EventEmitter } from "events";
interface WorkerInfo {
id: number;
pid: number;
state: "online" | "listening" | "disconnected" | "dead";
connections: number;
requestsHandled: number;
memoryUsage: number;
startTime: number;
lastHealthCheck: number;
healthStatus: "healthy" | "unhealthy" | "unknown";
}
interface ClusterConfig {
workers: number;
schedulingPolicy: "round-robin" | "os-level";
healthCheckInterval: number;
healthCheckTimeout: number;
maxMemoryPerWorker: number; // bytes
gracefulShutdownTimeout: number;
rollingRestartDelay: number;
}
class ClusterManager extends EventEmitter {
private config: ClusterConfig;
private workers: Map<number, WorkerInfo> = new Map();
private roundRobinIndex = 0;
private shuttingDown = false;
private healthCheckTimer: ReturnType<typeof setInterval> | null = null;
constructor(config: Partial<ClusterConfig> = {}) {
super();
this.config = {
workers: config.workers ?? require("os").cpus().length,
schedulingPolicy: config.schedulingPolicy ?? "round-robin",
healthCheckInterval: config.healthCheckInterval ?? 10000,
healthCheckTimeout: config.healthCheckTimeout ?? 5000,
maxMemoryPerWorker: config.maxMemoryPerWorker ?? 512 * 1024 * 1024,
gracefulShutdownTimeout: config.gracefulShutdownTimeout ?? 30000,
rollingRestartDelay: config.rollingRestartDelay ?? 2000,
};
}
// Initialize cluster
async start(): Promise<void> {
console.log(`Starting cluster with ${this.config.workers} workers`);
console.log(`Scheduling policy: ${this.config.schedulingPolicy}`);
// Fork initial workers
for (let i = 0; i < this.config.workers; i++) {
await this.forkWorker(i);
}
// Start health checking
this.startHealthChecks();
this.emit("cluster:ready", {
workers: this.config.workers,
pids: Array.from(this.workers.values()).map((w) => w.pid),
});
}
// Fork a new worker
private async forkWorker(id: number): Promise<WorkerInfo> {
// In production, this would use child_process.fork()
// Simulating the worker info structure
const info: WorkerInfo = {
id,
pid: process.pid + id + 1, // simulated PID
state: "online",
connections: 0,
requestsHandled: 0,
memoryUsage: 0,
startTime: Date.now(),
lastHealthCheck: Date.now(),
healthStatus: "unknown",
};
this.workers.set(id, info);
// Simulate worker becoming ready
await new Promise((resolve) => setTimeout(resolve, 100));
info.state = "listening";
info.healthStatus = "healthy";
this.emit("worker:online", { id, pid: info.pid });
return info;
}
// Round-robin request distribution
getNextWorker(): WorkerInfo | null {
const healthyWorkers = Array.from(this.workers.values()).filter(
(w) => w.state === "listening" && w.healthStatus === "healthy"
);
if (healthyWorkers.length === 0) return null;
const worker = healthyWorkers[this.roundRobinIndex % healthyWorkers.length];
this.roundRobinIndex++;
return worker;
}
// Least-connections routing
getLeastConnectionsWorker(): WorkerInfo | null {
const healthyWorkers = Array.from(this.workers.values()).filter(
(w) => w.state === "listening" && w.healthStatus === "healthy"
);
if (healthyWorkers.length === 0) return null;
return healthyWorkers.reduce((min, worker) =>
worker.connections < min.connections ? worker : min
);
}
// Health check loop
private startHealthChecks(): void {
this.healthCheckTimer = setInterval(async () => {
for (const [id, worker] of this.workers) {
if (worker.state === "dead" || worker.state === "disconnected") continue;
try {
const healthy = await this.checkWorkerHealth(worker);
worker.lastHealthCheck = Date.now();
if (!healthy) {
worker.healthStatus = "unhealthy";
this.emit("worker:unhealthy", { id, pid: worker.pid });
// Replace unhealthy worker
if (!this.shuttingDown) {
await this.replaceWorker(id);
}
} else {
worker.healthStatus = "healthy";
}
// Check memory limits
if (worker.memoryUsage > this.config.maxMemoryPerWorker) {
this.emit("worker:memory-limit", {
id,
usage: worker.memoryUsage,
limit: this.config.maxMemoryPerWorker,
});
if (!this.shuttingDown) {
await this.replaceWorker(id);
}
}
} catch (err) {
worker.healthStatus = "unhealthy";
}
}
}, this.config.healthCheckInterval);
}
// Check individual worker health
private async checkWorkerHealth(worker: WorkerInfo): Promise<boolean> {
// In production: send IPC health check message and wait for response
// Check if worker responds within timeout
return new Promise<boolean>((resolve) => {
const timeout = setTimeout(() => resolve(false), this.config.healthCheckTimeout);
// Simulate health check via IPC
// In real implementation: worker.process.send({ type: 'health-check' })
setTimeout(() => {
clearTimeout(timeout);
resolve(worker.state === "listening");
}, 50);
});
}
// Replace a worker (graceful)
private async replaceWorker(id: number): Promise<void> {
const oldWorker = this.workers.get(id);
if (!oldWorker) return;
this.emit("worker:replacing", { id, pid: oldWorker.pid });
// 1. Fork new worker first (zero-downtime)
const newWorker = await this.forkWorker(id + 1000); // temp ID
// 2. Stop sending traffic to old worker
oldWorker.state = "disconnected";
// 3. Wait for existing connections to drain
await this.drainConnections(oldWorker);
// 4. Clean up old worker
oldWorker.state = "dead";
this.workers.delete(id);
// 5. Reassign new worker to proper ID
this.workers.delete(newWorker.id);
newWorker.id = id;
this.workers.set(id, newWorker);
this.emit("worker:replaced", { id, newPid: newWorker.pid });
}
// Drain connections from a worker
private async drainConnections(worker: WorkerInfo): Promise<void> {
const deadline = Date.now() + this.config.gracefulShutdownTimeout;
while (worker.connections > 0 && Date.now() < deadline) {
await new Promise((resolve) => setTimeout(resolve, 100));
}
if (worker.connections > 0) {
this.emit("worker:force-kill", {
id: worker.id,
remainingConnections: worker.connections,
});
}
}
// Rolling restart all workers
async rollingRestart(): Promise<void> {
const workerIds = Array.from(this.workers.keys());
for (const id of workerIds) {
if (this.shuttingDown) break;
this.emit("rolling-restart:worker", { id, total: workerIds.length });
await this.replaceWorker(id);
await new Promise((resolve) =>
setTimeout(resolve, this.config.rollingRestartDelay)
);
}
this.emit("rolling-restart:complete");
}
// Graceful shutdown of entire cluster
async shutdown(): Promise<void> {
this.shuttingDown = true;
if (this.healthCheckTimer) {
clearInterval(this.healthCheckTimer);
}
this.emit("cluster:shutting-down");
// Stop all workers gracefully
const shutdownPromises = Array.from(this.workers.entries()).map(
async ([id, worker]) => {
worker.state = "disconnected";
await this.drainConnections(worker);
worker.state = "dead";
this.emit("worker:stopped", { id, pid: worker.pid });
}
);
await Promise.all(shutdownPromises);
this.workers.clear();
this.emit("cluster:shutdown");
}
// Get cluster status
getStatus(): {
totalWorkers: number;
healthyWorkers: number;
totalConnections: number;
totalRequests: number;
workers: WorkerInfo[];
} {
const workerList = Array.from(this.workers.values());
return {
totalWorkers: workerList.length,
healthyWorkers: workerList.filter((w) => w.healthStatus === "healthy").length,
totalConnections: workerList.reduce((sum, w) => sum + w.connections, 0),
totalRequests: workerList.reduce((sum, w) => sum + w.requestsHandled, 0),
workers: workerList,
};
}
}
Worker Threads: Shared Memory Concurrency in Node.js
Worker threads (Node.js worker_threads) run in the same process but with separate V8 isolates, allowing true parallelism:
┌──────────────────────────────────────────────────────────────┐
│ SINGLE PROCESS │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ SHARED MEMORY (SharedArrayBuffer) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ 0x00 0x04 0x08 0x0C 0x10 ... │ │ │
│ │ │ [counter] [flag] [lock] [data] [...] │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────┬───────────┬───────────┬───────────────────┘ │
│ │ │ │ │
│ ┌────────────▼───┐ ┌───▼────────┐ ┌▼──────────────┐ │
│ │ Main Thread │ │ Worker #1 │ │ Worker #2 │ │
│ │ │ │ │ │ │ │
│ │ V8 Isolate A │ │ V8 Isolate B│ │ V8 Isolate C │ │
│ │ │ │ │ │ │ │
│ │ Own heap │ │ Own heap │ │ Own heap │ │
│ │ Own stack │ │ Own stack │ │ Own stack │ │
│ │ Own event loop │ │ Own evloop │ │ Own evloop │ │
│ │ │ │ │ │ │ │
│ │ Atomics.wait() │ │ Atomics │ │ Atomics │ │
│ │ Atomics.notify()│ │ operations │ │ operations │ │
│ └────────────────┘ └────────────┘ └───────────────┘ │
│ │
│ Communication: │
│ ├── SharedArrayBuffer (zero-copy, requires Atomics) │
│ ├── MessagePort (structured clone, ownership transfer) │
│ └── BroadcastChannel (pub/sub between workers) │
└──────────────────────────────────────────────────────────────┘
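Before building abstractions on top, it helps to see the raw worker_threads plus SharedArrayBuffer flow. This sketch uses an inline eval-mode worker (an assumption made so the example stays in one file):

```typescript
import { Worker } from "node:worker_threads";

// The worker receives the SharedArrayBuffer via workerData (zero-copy:
// both threads see the same memory), bumps slot 0 atomically, and
// reports the new value back over the MessagePort.
function runWorker(shared: Int32Array): Promise<number> {
  const workerSource = `
    const { parentPort, workerData } = require("node:worker_threads");
    const view = new Int32Array(workerData);
    Atomics.add(view, 0, 1);                 // visible to the main thread
    parentPort.postMessage(Atomics.load(view, 0));
  `;
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: shared.buffer, // SharedArrayBuffer is shared, not copied
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}
```

Note that a regular ArrayBuffer passed the same way would be cloned; only SharedArrayBuffer gives both isolates the same backing memory.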
// ─── Worker Thread Pool with Shared Memory ────────────────────
interface SharedMemoryLayout {
// Offset 0: Lock (Mutex)
LOCK_OFFSET: 0;
// Offset 4: Task counter
TASK_COUNTER_OFFSET: 4;
// Offset 8: Completed task counter
COMPLETED_COUNTER_OFFSET: 8;
// Offset 12: Active workers
ACTIVE_WORKERS_OFFSET: 12;
// Offset 16: Shutdown flag
SHUTDOWN_FLAG_OFFSET: 16;
// Offset 20 onward: ring buffer (head, tail, then task-ID slots)
RING_BUFFER_OFFSET: 20;
RING_BUFFER_SIZE: 256;
}
const LAYOUT: SharedMemoryLayout = {
LOCK_OFFSET: 0,
TASK_COUNTER_OFFSET: 4,
COMPLETED_COUNTER_OFFSET: 8,
ACTIVE_WORKERS_OFFSET: 12,
SHUTDOWN_FLAG_OFFSET: 16,
RING_BUFFER_OFFSET: 20,
RING_BUFFER_SIZE: 256,
};
// Mutex using Atomics
class AtomicMutex {
private buffer: Int32Array;
private offset: number;
constructor(sharedBuffer: SharedArrayBuffer, offset: number) {
this.buffer = new Int32Array(sharedBuffer);
this.offset = offset / 4; // Convert byte offset to Int32 index
}
lock(): void {
while (true) {
// Try to acquire: CAS from 0 (unlocked) to 1 (locked)
const old = Atomics.compareExchange(
this.buffer,
this.offset,
0, // expected: unlocked
1 // desired: locked
);
if (old === 0) {
// Successfully acquired lock
return;
}
// Lock is held, wait
Atomics.wait(this.buffer, this.offset, 1);
}
}
unlock(): void {
Atomics.store(this.buffer, this.offset, 0);
Atomics.notify(this.buffer, this.offset, 1);
}
tryLock(): boolean {
const old = Atomics.compareExchange(this.buffer, this.offset, 0, 1);
return old === 0;
}
}
// Atomic counter
class AtomicCounter {
private buffer: Int32Array;
private index: number;
constructor(sharedBuffer: SharedArrayBuffer, byteOffset: number) {
this.buffer = new Int32Array(sharedBuffer);
this.index = byteOffset / 4;
}
increment(): number {
return Atomics.add(this.buffer, this.index, 1) + 1;
}
decrement(): number {
return Atomics.sub(this.buffer, this.index, 1) - 1;
}
get(): number {
return Atomics.load(this.buffer, this.index);
}
set(value: number): void {
Atomics.store(this.buffer, this.index, value);
}
}
// Ring buffer for task distribution over shared memory.
// Note: safe without locks only for a single producer and a single
// consumer; the pool below serializes access with AtomicMutex.
class AtomicRingBuffer {
private buffer: Int32Array;
private startIndex: number;
private capacity: number;
private headIndex: number; // Int32 index holding the write counter
private tailIndex: number; // Int32 index holding the read counter
constructor(
sharedBuffer: SharedArrayBuffer,
byteOffset: number,
capacity: number
) {
this.buffer = new Int32Array(sharedBuffer);
this.startIndex = byteOffset / 4;
this.capacity = capacity;
// Head and tail are stored at the beginning of the region
this.headIndex = this.startIndex;
this.tailIndex = this.startIndex + 1;
// Data starts at startIndex + 2
}
push(value: number): boolean {
const head = Atomics.load(this.buffer, this.headIndex);
const tail = Atomics.load(this.buffer, this.tailIndex);
const nextHead = (head + 1) % this.capacity;
if (nextHead === tail) {
return false; // Buffer full
}
const dataIndex = this.startIndex + 2 + head;
Atomics.store(this.buffer, dataIndex, value);
Atomics.store(this.buffer, this.headIndex, nextHead);
return true;
}
pop(): number | null {
const head = Atomics.load(this.buffer, this.headIndex);
const tail = Atomics.load(this.buffer, this.tailIndex);
if (head === tail) {
return null; // Buffer empty
}
const dataIndex = this.startIndex + 2 + tail;
const value = Atomics.load(this.buffer, dataIndex);
const nextTail = (tail + 1) % this.capacity;
Atomics.store(this.buffer, this.tailIndex, nextTail);
return value;
}
size(): number {
const head = Atomics.load(this.buffer, this.headIndex);
const tail = Atomics.load(this.buffer, this.tailIndex);
return (head - tail + this.capacity) % this.capacity;
}
}
// ─── Coordinated Worker Thread Pool ──────────────────────────
class WorkerThreadPool {
private sharedBuffer: SharedArrayBuffer;
private mutex: AtomicMutex;
private taskCounter: AtomicCounter;
private completedCounter: AtomicCounter;
private activeWorkers: AtomicCounter;
private ringBuffer: AtomicRingBuffer;
private workerCount: number;
constructor(workerCount: number = 4) {
// Allocate shared memory (4KB should be enough)
this.sharedBuffer = new SharedArrayBuffer(4096);
this.mutex = new AtomicMutex(this.sharedBuffer, LAYOUT.LOCK_OFFSET);
this.taskCounter = new AtomicCounter(this.sharedBuffer, LAYOUT.TASK_COUNTER_OFFSET);
this.completedCounter = new AtomicCounter(this.sharedBuffer, LAYOUT.COMPLETED_COUNTER_OFFSET);
this.activeWorkers = new AtomicCounter(this.sharedBuffer, LAYOUT.ACTIVE_WORKERS_OFFSET);
this.ringBuffer = new AtomicRingBuffer(
this.sharedBuffer,
LAYOUT.RING_BUFFER_OFFSET,
LAYOUT.RING_BUFFER_SIZE
);
this.workerCount = workerCount;
}
// Submit a task
submitTask(taskId: number): boolean {
this.mutex.lock();
try {
const success = this.ringBuffer.push(taskId);
if (success) {
this.taskCounter.increment();
}
return success;
} finally {
this.mutex.unlock();
}
}
// Worker: pick up a task
pickupTask(): number | null {
this.mutex.lock();
try {
const taskId = this.ringBuffer.pop();
if (taskId !== null) {
this.activeWorkers.increment();
}
return taskId;
} finally {
this.mutex.unlock();
}
}
// Worker: mark task complete
completeTask(): void {
this.completedCounter.increment();
this.activeWorkers.decrement();
}
// Get shared buffer for passing to worker threads
getSharedBuffer(): SharedArrayBuffer {
return this.sharedBuffer;
}
getStats(): {
submitted: number;
completed: number;
active: number;
queued: number;
} {
return {
submitted: this.taskCounter.get(),
completed: this.completedCounter.get(),
active: this.activeWorkers.get(),
queued: this.ringBuffer.size(),
};
}
}
Concurrency Models Compared
┌─────────────────────────────────────────────────────────────────────┐
│ BACKEND CONCURRENCY MODELS COMPARED │
│ │
│ MODEL 1: Multi-Process (Pre-fork) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Apache httpd, PHP-FPM, unicorn │ │
│ │ │ │
│ │ Master ──fork()──▶ Worker (own memory) │ │
│ │ ──fork()──▶ Worker (own memory) │ │
│ │ ──fork()──▶ Worker (own memory) │ │
│ │ │ │
│ │ Pros: Isolation, stability │ │
│ │ Cons: Memory overhead, IPC cost │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 2: Multi-Thread (Thread-per-request) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Java (Tomcat), C# (ASP.NET) │ │
│ │ │ │
│ │ Process ──▶ Thread Pool │ │
│ │ ├── Thread 1 (shared memory) │ │
│ │ ├── Thread 2 (shared memory) │ │
│ │ └── Thread N (shared memory) │ │
│ │ │ │
│ │ Pros: Shared memory, lower overhead │ │
│ │ Cons: Race conditions, deadlocks │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 3: Event Loop (Single-thread + async I/O) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Node.js, nginx, Redis │ │
│ │ │ │
│ │ Main Thread ──▶ Event Loop │ │
│ │ ├── epoll/kqueue for I/O │ │
│ │ └── Thread pool for FS/DNS │ │
│ │ │ │
│ │ Pros: No context switching, no locks │ │
│ │ Cons: Can't use multiple cores (alone) │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 4: Green Threads / Coroutines │
│ ┌─────────────────────────────────────────────┐ │
│ │ Go (goroutines), Erlang, Java 21 (Loom) │ │
│ │ │ │
│ │ M goroutines ──mapped──▶ N OS threads │ │
│ │ Cooperative scheduling in userspace │ │
│ │ Work-stealing across processors │ │
│ │ │ │
│ │ Pros: Millions of concurrent tasks │ │
│ │ Cons: Runtime complexity, stack management │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ MODEL 5: Actor Model │
│ ┌─────────────────────────────────────────────┐ │
│ │ Erlang/OTP, Akka, Orleans │ │
│ │ │ │
│ │ Actor ◄──message──▶ Actor │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ Mailbox Mailbox │ │
│ │ (sequential) (sequential) │ │
│ │ │ │
│ │ Pros: No shared state, supervision trees │ │
│ │ Cons: Message overhead, ordering complexity │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Building an Actor System
// ─── Actor System Implementation ──────────────────────────────
type ActorMessage = {
type: string;
payload: unknown;
sender: string | null;
correlationId: string;
timestamp: number;
};
type ActorBehavior = (
context: ActorContext,
message: ActorMessage
) => Promise<void>;
interface ActorRef {
name: string;
// Optional correlationId lets a behavior route an "__ask_response__"
// reply back to the pending ask that carried the same id
send(type: string, payload: unknown, sender?: string, correlationId?: string): void;
ask<T>(type: string, payload: unknown, timeout?: number): Promise<T>;
}
class ActorContext {
constructor(
public readonly self: ActorRef,
public readonly system: ActorSystem,
public readonly parent: ActorRef | null
) {}
// Spawn a child actor
spawn(name: string, behavior: ActorBehavior): ActorRef {
return this.system.createActor(
`${this.self.name}/${name}`,
behavior,
this.self
);
}
// Send message to another actor
send(target: string, type: string, payload: unknown): void {
this.system.send(target, type, payload, this.self.name);
}
// Stop self
stop(): void {
this.system.stopActor(this.self.name);
}
}
class Actor {
public readonly name: string;
private behavior: ActorBehavior;
private mailbox: ActorMessage[] = [];
private processing = false;
private context: ActorContext;
private parent: ActorRef | null;
private children: Set<string> = new Set();
private pendingAsks: Map<
string,
{ resolve: (value: unknown) => void; reject: (error: Error) => void }
> = new Map();
constructor(
name: string,
behavior: ActorBehavior,
system: ActorSystem,
parent: ActorRef | null
) {
this.name = name;
this.behavior = behavior;
this.parent = parent;
const ref: ActorRef = {
name: this.name,
send: (type: string, payload: unknown, sender?: string, correlationId?: string) => {
this.enqueue({
type,
payload,
sender: sender ?? null,
// Reuse the caller's correlationId (ask replies) or mint a new one
correlationId:
correlationId ?? `${Date.now()}_${Math.random().toString(36).slice(2)}`,
timestamp: Date.now(),
});
},
ask: <T>(type: string, payload: unknown, timeout = 5000) => {
return this.ask<T>(type, payload, timeout);
},
};
this.context = new ActorContext(ref, system, parent);
}
getRef(): ActorRef {
return this.context.self;
}
addChild(name: string): void {
this.children.add(name);
}
removeChild(name: string): void {
this.children.delete(name);
}
getChildren(): Set<string> {
return this.children;
}
private enqueue(message: ActorMessage): void {
this.mailbox.push(message);
this.processNext();
}
private async processNext(): Promise<void> {
if (this.processing || this.mailbox.length === 0) return;
this.processing = true;
while (this.mailbox.length > 0) {
const message = this.mailbox.shift()!;
try {
// Check if this is a response to an ask
if (message.type === "__ask_response__") {
const pending = this.pendingAsks.get(message.correlationId);
if (pending) {
pending.resolve(message.payload);
this.pendingAsks.delete(message.correlationId);
continue;
}
}
await this.behavior(this.context, message);
} catch (error) {
// Supervision: notify parent of failure
if (this.parent) {
this.parent.send(
"__child_failure__",
{
child: this.name,
error: error instanceof Error ? error.message : String(error),
message,
},
this.name
);
}
}
}
this.processing = false;
}
private ask<T>(type: string, payload: unknown, timeout: number): Promise<T> {
return new Promise<T>((resolve, reject) => {
const correlationId = `ask_${Date.now()}_${Math.random().toString(36).slice(2)}`;
const timer = setTimeout(() => {
this.pendingAsks.delete(correlationId);
reject(new Error(`Ask timed out after ${timeout}ms`));
}, timeout);
this.pendingAsks.set(correlationId, {
resolve: (value: unknown) => {
clearTimeout(timer);
resolve(value as T);
},
reject: (error: Error) => {
clearTimeout(timer);
reject(error);
},
});
this.enqueue({
type,
payload,
sender: null,
correlationId,
timestamp: Date.now(),
});
});
}
}
// ─── Supervision Strategies ──────────────────────────────────
type SupervisionStrategy = "one-for-one" | "one-for-all" | "rest-for-one";
type SupervisionDirective = "resume" | "restart" | "stop" | "escalate";
class Supervisor {
private strategy: SupervisionStrategy;
private maxRestarts: number;
private withinMs: number;
private restartHistory: number[] = [];
constructor(options: {
strategy?: SupervisionStrategy;
maxRestarts?: number;
withinMs?: number;
} = {}) {
this.strategy = options.strategy ?? "one-for-one";
this.maxRestarts = options.maxRestarts ?? 3;
this.withinMs = options.withinMs ?? 60000;
}
handleFailure(
failedChild: string,
siblings: Set<string>,
error: string
): { directive: SupervisionDirective; targets: string[] } {
// Check restart frequency
const now = Date.now();
this.restartHistory.push(now);
this.restartHistory = this.restartHistory.filter(
(t) => now - t < this.withinMs
);
if (this.restartHistory.length > this.maxRestarts) {
return { directive: "escalate", targets: [failedChild] };
}
switch (this.strategy) {
case "one-for-one":
// Only restart the failed child
return { directive: "restart", targets: [failedChild] };
case "one-for-all":
// Restart all children
return {
directive: "restart",
targets: [failedChild, ...siblings],
};
case "rest-for-one":
// Restart failed child and all children started after it
// (simplified: restart failed + all siblings)
return {
directive: "restart",
targets: [failedChild, ...siblings],
};
}
}
}
// ─── Actor System ────────────────────────────────────────────
class ActorSystem {
private actors: Map<string, Actor> = new Map();
private supervisors: Map<string, Supervisor> = new Map();
private deadLetters: ActorMessage[] = [];
createActor(
name: string,
behavior: ActorBehavior,
parent?: ActorRef | null
): ActorRef {
if (this.actors.has(name)) {
throw new Error(`Actor "${name}" already exists`);
}
const actor = new Actor(name, behavior, this, parent ?? null);
this.actors.set(name, actor);
// Register as child of parent
if (parent) {
const parentActor = this.actors.get(parent.name);
if (parentActor) {
parentActor.addChild(name);
}
}
return actor.getRef();
}
send(
target: string,
type: string,
payload: unknown,
sender?: string
): void {
const actor = this.actors.get(target);
if (actor) {
actor.getRef().send(type, payload, sender);
} else {
// Dead letter
this.deadLetters.push({
type,
payload,
sender: sender ?? null,
correlationId: `dead_${Date.now()}`,
timestamp: Date.now(),
});
}
}
stopActor(name: string): void {
const actor = this.actors.get(name);
if (!actor) return;
// Stop all children first (depth-first teardown)
for (const childName of Array.from(actor.getChildren())) {
this.stopActor(childName);
}
this.actors.delete(name);
// Detach the name from any parent still tracking it as a child
for (const remaining of this.actors.values()) {
remaining.removeChild(name);
}
}
setSupervisor(actorName: string, supervisor: Supervisor): void {
this.supervisors.set(actorName, supervisor);
}
getStats(): {
totalActors: number;
deadLetters: number;
actorNames: string[];
} {
return {
totalActors: this.actors.size,
deadLetters: this.deadLetters.length,
actorNames: Array.from(this.actors.keys()),
};
}
}
The GIL Problem: Why Python Threads Don't Run in Parallel
THE GLOBAL INTERPRETER LOCK (GIL):
Python (CPython):
┌────────────────────────────────────────────────┐
│ Thread 1 ──▶ [HOLDS GIL] ── executing ──▶ │
│ Thread 2 ──▶ [WAITING] ────────────────▶ │
│ Thread 3 ──▶ [WAITING] ────────────────▶ │
│ │
│ Time: ═══█████═════════█████═════════█████═══ │
│ T1: ████▓ ████▓ │
│ T2: ████▓ ████▓ │
│ T3: ████▓ │
│ (only one thread runs Python at a time) │
└────────────────────────────────────────────────┘
Node.js (V8):
┌────────────────────────────────────────────────┐
│ Main Thread ──▶ All JS execution │
│ libuv Thread 1 ──▶ FS / DNS / crypto │
│ libuv Thread 2 ──▶ FS / DNS / crypto │
│ libuv Thread 3 ──▶ FS / DNS / crypto │
│ libuv Thread 4 ──▶ FS / DNS / crypto │
│ │
│ JS is single-threaded BY DESIGN │
│ (no GIL needed — there's only one thread) │
│ │
│ Worker Threads: separate V8 isolates │
│ (no shared JS objects, only SharedArrayBuffer) │
└────────────────────────────────────────────────┘
Go:
┌────────────────────────────────────────────────┐
│ Goroutine 1 ──▶ OS Thread 1 ──▶ CPU Core 0 │
│ Goroutine 2 ──▶ OS Thread 2 ──▶ CPU Core 1 │
│ Goroutine 3 ──▶ OS Thread 1 ──▶ CPU Core 0 │
│ Goroutine 4 ──▶ OS Thread 3 ──▶ CPU Core 2 │
│ │
│ No GIL. True parallelism. │
│ M:N scheduling (M goroutines : N OS threads) │
│ Channel-based communication (CSP) │
└────────────────────────────────────────────────┘
Java (21+ with Virtual Threads / Loom):
┌────────────────────────────────────────────────┐
│ Virtual Thread 1 ──▶ Carrier Thread 1 ──▶ CPU │
│ Virtual Thread 2 ──▶ Carrier Thread 2 ──▶ CPU │
│ Virtual Thread 3 ──▶ (parked / waiting) │
│ ... │
│ Virtual Thread 1M ──▶ Carrier Thread N │
│ │
│ Millions of virtual threads │
│ Mounted/unmounted on carrier (platform) threads │
│ Blocking calls automatically park the VT │
└────────────────────────────────────────────────┘
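Node's side of this picture is easy to verify: since all JS shares one thread, synchronous work starves the event loop, so even a 0 ms timer fires late. A small sketch (the 50 ms busy-wait duration is arbitrary):

```typescript
// Spin the CPU without yielding; no callbacks can run meanwhile.
function blockFor(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy-wait on the one JS thread */ }
}

// Schedule a 0 ms timer, then immediately hold the thread: the timer
// callback cannot fire until the synchronous work returns, so the
// measured lateness is roughly the length of the busy-wait.
function measureTimerLateness(blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const due = Date.now();
    setTimeout(() => resolve(Date.now() - due), 0);
    blockFor(blockMs);
  });
}
```

The analogous experiment with two Python threads shows them taking turns holding the GIL rather than overlapping on bytecode.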
Comparison Table: Process vs Thread vs Coroutine
| Characteristic | Process | OS Thread | Green Thread / Coroutine |
|---|---|---|---|
| Memory isolation | Full (own address space) | Shared (within process) | Shared (within runtime) |
| Creation cost | ~1-10ms, ~MBs | ~10-100μs, ~1-8MB stack | ~1-10μs, ~2-8KB stack |
| Context switch cost | ~1-10μs (TLB flush) | ~1-10μs (kernel) | ~100ns (userspace) |
| Max concurrent | ~1,000s | ~10,000s | ~1,000,000s |
| Communication | IPC (pipe, socket, shm) | Shared memory + locks | Channels, message passing |
| Failure isolation | Excellent (crash one, others live) | Poor (crash = process dead) | Runtime-dependent |
| CPU utilization | Multi-core native | Multi-core native | Multi-core with M:N |
| Scheduling | OS preemptive | OS preemptive | Cooperative / hybrid |
| Debugging | Easier (isolated) | Hard (race conditions) | Runtime-specific tools |
| Examples | Nginx workers, PHP-FPM | Java threads, pthreads | Go goroutines, Erlang procs |
Backpressure and Flow Control
When producers are faster than consumers, you need backpressure mechanisms:
// ─── Backpressure Controller ─────────────────────────────────
interface BackpressureConfig {
highWaterMark: number;
lowWaterMark: number;
strategy: "drop" | "block" | "buffer-and-signal";
maxBufferSize: number;
}
class BackpressureController<T> {
private buffer: T[] = [];
private config: BackpressureConfig;
private paused = false;
private waiters: Array<{
resolve: () => void;
reject: (error: Error) => void;
}> = [];
private draining = false;
private totalProcessed = 0;
private totalDropped = 0;
constructor(config: Partial<BackpressureConfig> = {}) {
this.config = {
highWaterMark: config.highWaterMark ?? 1000,
lowWaterMark: config.lowWaterMark ?? 250,
strategy: config.strategy ?? "buffer-and-signal",
maxBufferSize: config.maxBufferSize ?? 10000,
};
}
// Producer calls this to push items
async push(item: T): Promise<boolean> {
if (this.draining) {
return false;
}
// Check backpressure
if (this.buffer.length >= this.config.highWaterMark) {
switch (this.config.strategy) {
case "drop":
this.totalDropped++;
return false;
case "block":
// Block the producer until space available
if (this.buffer.length >= this.config.maxBufferSize) {
throw new Error("Buffer overflow: max buffer size reached");
}
await this.waitForDrain();
break;
case "buffer-and-signal":
if (this.buffer.length >= this.config.maxBufferSize) {
this.totalDropped++;
return false;
}
if (!this.paused) {
this.paused = true;
}
break;
}
}
this.buffer.push(item);
return true;
}
// Consumer calls this to pull items
pull(batchSize: number = 1): T[] {
const items = this.buffer.splice(0, batchSize);
this.totalProcessed += items.length;
// Resume producers once we drain below the low-water mark. Checking
// waiters as well wakes producers parked by the "block" strategy,
// which never sets the paused flag.
if (
this.buffer.length <= this.config.lowWaterMark &&
(this.paused || this.waiters.length > 0)
) {
this.paused = false;
this.resumeWaiters();
}
return items;
}
private waitForDrain(): Promise<void> {
return new Promise<void>((resolve, reject) => {
this.waiters.push({ resolve, reject });
});
}
private resumeWaiters(): void {
const waiters = this.waiters.splice(0);
for (const waiter of waiters) {
waiter.resolve();
}
}
isPaused(): boolean {
return this.paused;
}
drain(): void {
this.draining = true;
// Reject all waiting producers
for (const waiter of this.waiters) {
waiter.reject(new Error("Draining"));
}
this.waiters = [];
}
getStats(): {
bufferSize: number;
totalProcessed: number;
totalDropped: number;
isPaused: boolean;
waitingProducers: number;
} {
return {
bufferSize: this.buffer.length,
totalProcessed: this.totalProcessed,
totalDropped: this.totalDropped,
isPaused: this.paused,
waitingProducers: this.waiters.length,
};
}
}
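Node's streams bake this same watermark protocol into the platform: Writable.write() returns false once buffered bytes pass highWaterMark, and 'drain' fires when the buffer empties, which is the pause/resume cycle above without a custom controller. A sketch of a producer that honors it (writeAll is our helper name):

```typescript
import { Writable } from "node:stream";

// Write chunks one at a time, stopping whenever write() signals
// backpressure and resuming on 'drain'. Ignoring the false return
// value would still work, but buffers unboundedly in memory.
function writeAll(dest: Writable, chunks: string[]): Promise<void> {
  return new Promise((resolve) => {
    let i = 0;
    const writeNext = (): void => {
      while (i < chunks.length) {
        const ok = dest.write(chunks[i++]);
        if (!ok) {
          // Backpressure: stop producing until the buffer drains
          dest.once("drain", writeNext);
          return;
        }
      }
      dest.end(resolve);
    };
    writeNext();
  });
}
```

The controller above generalizes the same idea to any producer/consumer pair, streams or not.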
Signal Handling and Process Lifecycle
UNIX SIGNAL LIFECYCLE:
Process Start
│
▼
┌──────────────┐
│ RUNNING │
│ │◄──────── SIGCONT (resume)
└──────┬───────┘
│
┌────┼────┬────────────────────┐
│ │ │ │
▼ ▼ ▼ ▼
SIGTERM SIGINT SIGQUIT SIGSTOP/SIGTSTP
(soft) (Ctrl+C) (core dump) (suspend)
│ │ │ │
▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐
│ Signal Handler│ │ STOPPED │
│ (trappable) │ │ (T state) │
└──────┬───────┘ └──────────────┘
│
┌────┼────┐
│ │
▼ ▼
Handle Default
(cleanup) (terminate)
│ │
▼ ▼
┌──────────────┐
│ ZOMBIE (Z) │
│ (exit code │
│ stored) │
└──────┬───────┘
│
parent calls
waitpid()
│
▼
┌──────────────┐
│ DEAD │
│ (reaped) │
└──────────────┘
SIGKILL (9) and SIGSTOP: CANNOT be caught or ignored
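The trappable path in that diagram is a few lines in Node. A minimal SIGTERM/SIGINT trap as a sketch (the timeout value is arbitrary), which the lifecycle manager below generalizes with ordered hooks:

```typescript
import * as http from "node:http";

// server.close() stops accepting new connections; in-flight requests
// finish naturally, then the 'close' callback runs and we exit cleanly.
function installGracefulShutdown(server: http.Server, timeoutMs = 10_000): void {
  const onTerm = (): void => {
    server.close(() => process.exit(0));
    // Hard deadline in case connections never drain; unref() keeps the
    // timer itself from holding the event loop open.
    setTimeout(() => process.exit(1), timeoutMs).unref();
  };
  process.on("SIGTERM", onTerm); // what orchestrators send first
  process.on("SIGINT", onTerm);  // Ctrl+C during local development
}
```

If the process is still alive when the orchestrator's grace period ends, it gets SIGKILL, which no handler can intercept.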
// ─── Process Lifecycle Manager ────────────────────────────────
type SignalHandler = (signal: string) => Promise<void> | void;
interface ShutdownHook {
name: string;
priority: number; // lower = earlier
handler: () => Promise<void>;
timeout: number;
}
class ProcessLifecycle {
private state: "starting" | "running" | "stopping" | "stopped" = "starting";
private shutdownHooks: ShutdownHook[] = [];
private signalHandlers: Map<string, SignalHandler[]> = new Map();
private shutdownInProgress = false;
private startTime = Date.now();
constructor() {
this.registerDefaultSignalHandlers();
}
private registerDefaultSignalHandlers(): void {
// SIGTERM: Graceful shutdown (what orchestrators send)
this.onSignal("SIGTERM", async () => {
console.log("[lifecycle] Received SIGTERM, initiating graceful shutdown");
await this.shutdown("SIGTERM");
});
// SIGINT: User interrupt (Ctrl+C)
this.onSignal("SIGINT", async () => {
console.log("[lifecycle] Received SIGINT (Ctrl+C)");
await this.shutdown("SIGINT");
});
// SIGUSR1: Convention for custom actions (e.g., heap dump)
this.onSignal("SIGUSR1", () => {
console.log("[lifecycle] Received SIGUSR1");
// Could trigger heap snapshot, log rotation, etc.
});
// SIGUSR2: Convention for custom actions (e.g., reopen log files)
this.onSignal("SIGUSR2", () => {
console.log("[lifecycle] Received SIGUSR2");
});
// Register with Node.js process
for (const signal of ["SIGTERM", "SIGINT", "SIGUSR1", "SIGUSR2"]) {
process.on(signal, () => {
this.handleSignal(signal);
});
}
// Uncaught exceptions
process.on("uncaughtException", (error) => {
console.error("[lifecycle] Uncaught exception:", error);
this.shutdown("uncaughtException").finally(() => {
process.exit(1);
});
});
// Unhandled promise rejections
process.on("unhandledRejection", (reason) => {
console.error("[lifecycle] Unhandled rejection:", reason);
// In newer Node.js, this will crash by default
});
}
// Register a signal handler
onSignal(signal: string, handler: SignalHandler): void {
if (!this.signalHandlers.has(signal)) {
this.signalHandlers.set(signal, []);
}
this.signalHandlers.get(signal)!.push(handler);
}
// Dispatch signal to handlers
private async handleSignal(signal: string): Promise<void> {
const handlers = this.signalHandlers.get(signal) ?? [];
for (const handler of handlers) {
try {
await handler(signal);
} catch (err) {
console.error(`[lifecycle] Signal handler error (${signal}):`, err);
}
}
}
// Register a shutdown hook
addShutdownHook(hook: ShutdownHook): void {
this.shutdownHooks.push(hook);
this.shutdownHooks.sort((a, b) => a.priority - b.priority);
}
// Mark process as ready
markReady(): void {
this.state = "running";
console.log(
`[lifecycle] Process ready (startup took ${Date.now() - this.startTime}ms)`
);
}
// Execute graceful shutdown
async shutdown(reason: string): Promise<void> {
if (this.shutdownInProgress) {
console.log("[lifecycle] Shutdown already in progress, ignoring");
return;
}
this.shutdownInProgress = true;
this.state = "stopping";
console.log(`[lifecycle] Shutdown initiated. Reason: ${reason}`);
const shutdownStart = Date.now();
for (const hook of this.shutdownHooks) {
console.log(`[lifecycle] Running shutdown hook: ${hook.name}`);
try {
await Promise.race([
hook.handler(),
new Promise<never>((_, reject) =>
setTimeout(
() => reject(new Error(`Hook "${hook.name}" timed out`)),
hook.timeout
)
),
]);
console.log(`[lifecycle] ✓ ${hook.name} completed`);
} catch (err) {
console.error(`[lifecycle] ✗ ${hook.name} failed:`, err);
}
}
this.state = "stopped";
console.log(
`[lifecycle] Shutdown complete (took ${Date.now() - shutdownStart}ms)`
);
}
getState(): string {
return this.state;
}
getUptime(): number {
return Date.now() - this.startTime;
}
}
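The per-hook Promise.race in shutdown() has a subtle leak: when a hook finishes quickly, the losing setTimeout keeps ticking and holds the event loop open until it fires. A cleaned-up helper sketch (withTimeout is our name, not part of the class above):

```typescript
// Run a task, rejecting if it exceeds timeoutMs. Unlike a bare
// Promise.race, the timer is always cleared, so a fast task doesn't
// leave a pending timeout keeping the process alive.
function withTimeout<T>(
  task: () => Promise<T>,
  timeoutMs: number,
  name: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Hook "${name}" timed out`)),
      timeoutMs,
    );
  });
  return Promise.race([task(), timeout]).finally(() => clearTimeout(timer));
}
```

Swapping this into the shutdown loop keeps the same semantics while letting the process exit promptly once all hooks complete.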
Real-World Architecture: Production Server Setup
┌──────────────────────────────────────────────────────────────────────┐
│ PRODUCTION NODE.JS DEPLOYMENT │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ SYSTEMD / CONTAINER RUNTIME (PID 1) │ │
│ │ │ │
│ │ Sends: SIGTERM (graceful) → wait → SIGKILL (force) │ │
│ └─────────────────────┬──────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────▼──────────────────────────────────────────┐ │
│ │ MASTER PROCESS (cluster.isPrimary) │ │
│ │ │ │
│ │ Responsibilities: │ │
│ │ ├── Fork N worker processes (N = CPU cores) │ │
│ │ ├── Monitor worker health │ │
│ │ ├── Restart crashed workers (with backoff) │ │
│ │ ├── Handle signals (SIGTERM → graceful shutdown) │ │
│ │ ├── IPC message routing │ │
│ │ └── Metrics aggregation │ │
│ │ │ │
│ │ Does NOT handle HTTP requests │ │
│ └───┬────────────┬────────────┬────────────┬────────────────────┘ │
│ │ │ │ │ │
│ ┌───▼──┐ ┌───▼──┐ ┌───▼──┐ ┌───▼──┐ │
│ │ W #1 │ │ W #2 │ │ W #3 │ │ W #4 │ │
│ │ │ │ │ │ │ │ │ │
│ │ HTTP │ │ HTTP │ │ HTTP │ │ HTTP │ │
│ │Server│ │Server│ │Server│ │Server│ │
│ │ │ │ │ │ │ │ │ │
│ │ Each worker: │ │
│ │ ├── Own V8 instance (~40MB base) │ │
│ │ ├── Own event loop │ │
│ │ ├── Shared server port (via fd passing) │ │
│ │ ├── libuv thread pool (4 threads each) │ │
│ │ └── IPC channel to master │ │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
│ │
│ Total threads per worker: 1 (main) + 4 (libuv) + 2 (V8 GC) = ~7 │
│ Total threads for 4 workers: ~28 + master = ~35 threads │
│ Total memory: 4 × (40MB base + app heap) ≈ 200-800MB │
└──────────────────────────────────────────────────────────────────────┘
Interview Questions & Answers
Q1: Why does Node.js use a single-threaded event loop instead of multithreading?
A: Node.js chose a single-threaded event loop for I/O-bound workloads because:
- No synchronization overhead: No mutexes, semaphores, or lock contention. This eliminates entire classes of bugs (deadlocks, race conditions).
- Context switching cost: Thread context switches involve kernel transitions (~1-10μs). The event loop avoids this by running everything cooperatively on one thread.
- Memory efficiency: Each OS thread typically reserves 1-8MB for its stack. With the event loop model, 10,000 concurrent connections don't require 10,000 stacks.
- I/O is the bottleneck, not CPU: For a typical web server, the CPU spends >95% of its time waiting for network/disk. You don't need multiple threads to wait.
- Simpler programming model: No shared mutable state means no data races. Callbacks and async/await are simpler to reason about than locks.
The trade-off: CPU-intensive work blocks the event loop. Solutions include worker_threads for CPU parallelism, the cluster module for multi-process, and offloading to native addons.
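That trade-off is easy to demonstrate: any synchronous CPU work delays every timer and socket on the loop. A self-contained sketch (measureTimerDelay is a made-up name for illustration):

```typescript
// Schedule a 10ms timer, then hog the JS thread for ~100ms.
// The timer cannot fire until the busy loop yields, so the measured
// delay reflects the blocking work, not the requested 10ms.
async function measureTimerDelay(busyMs: number, timerMs: number): Promise<number> {
  const start = Date.now();
  const fired = new Promise<number>((resolve) => {
    setTimeout(() => resolve(Date.now() - start), timerMs);
  });
  while (Date.now() - start < busyMs) {
    // Synchronous spin — stands in for compression, crypto, etc.
  }
  return fired;
}

measureTimerDelay(100, 10).then((delay) => {
  console.log(`10ms timer actually fired after ${delay}ms`);
});
```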
Q2: What's the difference between cluster.fork() and child_process.fork()?
A: Both spawn a new process under the hood (via fork/exec or posix_spawn in libuv, not a bare fork(2)), but they serve different purposes:
child_process.fork(): Creates a new Node.js process with an IPC channel. The child runs a specified module. There's no shared server port — each process binds independently.
cluster.fork(): Also creates a new Node.js process, but with special handling:
- The master passes the listening socket's file descriptor to workers
- Distribution is either round-robin by the master (SCHED_RR) or left to the kernel waking workers on the shared socket (SCHED_NONE)
- Workers call server.listen(port), but the kernel routes connections from the master's socket
- Built-in load balancing (configurable via cluster.schedulingPolicy)
Key difference: cluster is specifically designed for sharing a server port across multiple processes. The master either distributes accepted connections round-robin (the default everywhere except Windows) or hands every worker the shared listening socket and lets the OS wake one of them (cluster.SCHED_NONE).
Under the hood, a cluster worker's listen() call doesn't actually call bind(2) — it sends a message to the master, which passes the file descriptor back via sendmsg(2) with SCM_RIGHTS.
Q3: How does Go's goroutine scheduling differ from OS thread scheduling?
A: Go implements an M:N scheduler (M goroutines on N OS threads):
OS thread scheduling:
- Managed by kernel (CFS on Linux)
- Preemptive: kernel interrupts via timer (typically every 4ms)
- Context switch: ~1-10μs (save/restore registers, TLB flush for processes)
- Stack: 1-8MB fixed at creation
- Creation: system call (~10-100μs)
Go goroutine scheduling (GMP model):
- G (Goroutine): the unit of work, starts with a 2KB stack (growable)
- M (Machine): OS thread, goroutines run on these
- P (Processor): logical processor with local run queue
Key differences:
- Stack size: Goroutines start at 2KB (vs 1-8MB reserved for OS threads). Stacks grow/shrink dynamically via stack copying.
- Context switch: ~100-200ns (userspace register swap, no kernel transition)
- Cooperative + preemptive hybrid: Goroutines yield at function calls (cooperative). Since Go 1.14, the runtime also uses async preemption via signals.
- Work stealing: Idle P steals from other P's local queue or the global queue.
- Network poller integration: When a goroutine does I/O, it's parked and the M picks up another G. No thread is wasted waiting.
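The userspace context-switch idea can be shown in miniature with generators: each yield is a cooperative scheduling point, and the runner switches tasks without any kernel involvement. This is an illustration of the concept only, not Go's actual implementation:

```typescript
// Cooperative userspace scheduling in miniature: generator-based
// "tasks" plus a round-robin runner. Every `yield` is a scheduling
// point — a context switch with no syscall and no kernel transition.
type Task = Generator<void, void, void>;

function* worker(name: string, steps: number, log: string[]): Task {
  for (let i = 0; i < steps; i++) {
    log.push(`${name}:${i}`);
    yield; // cooperative yield point — the userspace "context switch"
  }
}

function runRoundRobin(tasks: Task[]): void {
  const queue = [...tasks];
  while (queue.length > 0) {
    const task = queue.shift()!;
    // Resume the task until its next yield; requeue if not finished.
    if (!task.next().done) queue.push(task);
  }
}

const log: string[] = [];
runRoundRobin([worker("a", 2, log), worker("b", 2, log)]);
console.log(log.join(",")); // tasks interleave: a:0,b:0,a:1,b:1
```

Go's runtime layers work stealing, preemption, and a network poller on top of this basic mechanism, but the core win — switching tasks in userspace for nanoseconds instead of microseconds — is the same.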
Q4: What happens when fork() is called? Walk through the OS-level steps.
A: When a process calls fork():
- Kernel entry: The calling process traps into kernel mode via a system call.
- Process descriptor allocation: The kernel allocates a new task_struct (Linux) — a ~6KB structure describing the process.
- PID assignment: A new PID is allocated from the PID namespace.
- Copy-on-Write (COW) page tables: The kernel does NOT copy the parent's memory. Instead, it:
  - Copies the parent's page table entries
  - Marks all pages as read-only in BOTH parent and child
  - When either writes, a page fault triggers and the kernel copies just that page
- File descriptor table: The child gets a copy of the parent's FD table. Both parent and child now reference the same open file descriptions (file position and flags are shared).
- Signal disposition: Copied from parent. Pending signals are NOT inherited.
- Scheduling: The child is placed on the scheduler's run queue. On modern Linux, the child typically runs first (to avoid COW overhead if it immediately calls exec()).
- Return values: fork() returns the child's PID to the parent, and 0 to the child. This is how each process knows its role.
The COW optimization means fork() is extremely fast (~1ms) even for processes with GBs of memory — until actual writes cause page faults.
Q5: How would you design a process model for a high-throughput backend service?
A: The optimal model depends on the workload:
For I/O-bound services (95%+ of web backends):
- Use Node.js cluster mode with N workers (N = CPU cores)
- Each worker runs an event loop handling thousands of connections
- Master process handles health checks, graceful restarts
- Use SharedArrayBuffer for shared counters/metrics
- Total memory: ~50-200MB per worker
For CPU-bound services (image processing, ML inference):
- Use a thread pool model (Go, Java)
- Thread count: 2 × CPU cores (to overlap I/O waits)
- Use bounded work queues with backpressure
- Consider separate process pools for isolation
For mixed workloads:
- Main event loop for I/O (connection handling, routing)
- Worker thread pool for CPU-intensive tasks
- Backpressure between the event loop and thread pool
- Circuit breaker: if thread pool saturates, fast-fail new CPU tasks
Production considerations:
- Health check endpoints (readiness + liveness)
- Graceful shutdown (drain connections on SIGTERM)
- Memory limits per worker (restart if exceeded)
- Core dump configuration for crash analysis
- CPU affinity (pin workers to cores for cache locality)
- cgroup limits in containers (set --max-old-space-size in Node.js from the container memory limit, not host memory)
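The last point can be sketched concretely. Assuming a cgroup v2 container (the /sys/fs/cgroup/memory.max path) and an arbitrary 75% heap ratio, a launcher might compute the flag like this:

```typescript
// Sketch: size the V8 heap from the cgroup v2 memory limit rather
// than host RAM. The memory.max path assumes cgroup v2 with a
// namespaced mount; the 75% default ratio is an arbitrary choice.
import { readFileSync } from "node:fs";
import os from "node:os";

function containerMemoryBytes(): number {
  try {
    const raw = readFileSync("/sys/fs/cgroup/memory.max", "utf8").trim();
    if (raw !== "max") return Number(raw);
  } catch {
    // Not under cgroup v2 — fall back to host memory below.
  }
  return os.totalmem();
}

function maxOldSpaceMb(ratio = 0.75): number {
  // Leave headroom for native memory, Buffers, and thread stacks.
  return Math.floor((containerMemoryBytes() * ratio) / (1024 * 1024));
}

console.log(`launch with: node --max-old-space-size=${maxOldSpaceMb()} server.js`);
```

Without this, V8's default heap limit is derived from total system memory, so a container with a 512MB limit on a 64GB host gets OOM-killed long before V8 starts to panic.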
Real-World Problems & How to Solve Them
Problem 1: Node.js server freezes under CPU-heavy endpoints
Symptom: P99 latency spikes for all routes when one endpoint runs compression, image processing, or crypto.
Root cause: CPU-bound work runs on the main event-loop thread, preventing poll/check phases from handling other sockets.
Fix — Offload CPU-heavy tasks to worker threads:
import { Worker } from "node:worker_threads";
function runCpuTask(workerFile: string, payload: unknown): Promise<unknown> {
return new Promise((resolve, reject) => {
const worker = new Worker(workerFile, { workerData: payload });
worker.once("message", resolve);
worker.once("error", reject);
worker.once("exit", (code) => {
if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
});
});
}
app.post("/thumbnail", async (req, res, next) => {
try {
const result = await runCpuTask("./workers/thumbnail.js", req.body);
res.json(result);
} catch (error) {
next(error);
}
});
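Note that the sketch above pays Worker startup cost (a fresh V8 isolate, typically tens of milliseconds) on every request. In production you usually want a fixed-size pool that reuses threads. A minimal sketch (the WorkerPool name and spawn-factory shape are ours; per-worker error handling is omitted for brevity):

```typescript
import { Worker } from "node:worker_threads";

type Job = {
  payload: unknown;
  resolve: (value: unknown) => void;
  reject: (err: Error) => void;
};

class WorkerPool {
  private idle: Worker[] = [];
  private all: Worker[] = [];
  private queue: Job[] = [];

  // spawn is injected so the pool doesn't care where worker code lives.
  constructor(size: number, spawn: () => Worker) {
    for (let i = 0; i < size; i++) {
      const worker = spawn();
      this.all.push(worker);
      this.idle.push(worker);
    }
  }

  run(payload: unknown): Promise<unknown> {
    return new Promise((resolve, reject) => {
      this.queue.push({ payload, resolve, reject });
      this.drain();
    });
  }

  private drain(): void {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop()!;
      const job = this.queue.shift()!;
      worker.once("message", (result) => {
        this.idle.push(worker); // thread is reusable immediately
        job.resolve(result);
        this.drain();
      });
      worker.postMessage(job.payload);
    }
  }

  async destroy(): Promise<void> {
    await Promise.all(this.all.map((w) => w.terminate()));
  }
}
```

Excess jobs simply wait in the queue, which doubles as a natural backpressure point: you can reject new work when queue.length crosses a threshold.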
Problem 2: Slow crypto/fs operations during traffic bursts
Symptom: Password hashing and file operations become randomly slow even when CPU appears available.
Root cause: The libuv thread pool (default size 4) is saturated by blocking async tasks (PBKDF2, zlib, fs), causing queueing delays.
Fix — Bound concurrency and tune UV_THREADPOOL_SIZE for workload:
import os from "node:os";
import { pbkdf2 } from "node:crypto";
// Must be set before the first async crypto/fs/zlib call —
// libuv sizes the pool lazily, when it is first used.
process.env.UV_THREADPOOL_SIZE = String(Math.min(32, os.cpus().length * 2));
class Semaphore {
private permits: number;
private waiters: Array<() => void> = [];
constructor(permits: number) {
this.permits = permits;
}
async acquire(): Promise<void> {
if (this.permits > 0) {
this.permits--;
return;
}
await new Promise<void>((resolve) => this.waiters.push(resolve));
}
release(): void {
const waiter = this.waiters.shift();
if (waiter) waiter();
else this.permits++;
}
}
const cryptoSemaphore = new Semaphore(16);
async function hashPassword(password: string, salt: string): Promise<Buffer> {
await cryptoSemaphore.acquire();
try {
return await new Promise<Buffer>((resolve, reject) => {
pbkdf2(password, salt, 100_000, 64, "sha512", (err, key) => {
if (err) reject(err);
else resolve(key);
});
});
} finally {
cryptoSemaphore.release();
}
}
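A quick self-check that the semaphore actually caps concurrency — repeating the Semaphore so the snippet runs standalone, with a tracked high-water mark (the 3-permit limit and fake 10ms tasks are arbitrary):

```typescript
class Semaphore {
  private waiters: Array<() => void> = [];
  constructor(private permits: number) {}
  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }
  release(): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter();
    else this.permits++;
  }
}

// Run 10 fake tasks through a 3-permit semaphore and record the
// peak number simultaneously inside the critical section.
async function peakConcurrency(): Promise<number> {
  const sem = new Semaphore(3);
  let inFlight = 0;
  let peak = 0;
  await Promise.all(
    Array.from({ length: 10 }, async () => {
      await sem.acquire();
      try {
        inFlight++;
        peak = Math.max(peak, inFlight);
        await new Promise((r) => setTimeout(r, 10));
        inFlight--;
      } finally {
        sem.release();
      }
    }),
  );
  return peak;
}

peakConcurrency().then((peak) => console.log(`peak concurrency: ${peak}`)); // 3
```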
Problem 3: Memory usage doubles after forking workers
Symptom: Clustered process memory grows rapidly after deploy, despite no traffic increase.
Root cause: Copy-on-write benefits are lost when workers mutate large preloaded objects, forcing private page copies per process. (Caveat: classic CoW sharing applies to runtimes that fork without exec, like Python/Ruby pre-fork servers. Node's cluster.fork() starts a fresh process that re-executes your module, so each worker builds its own heap — there, the wins come from keeping data immutable and deduplicated, or moving it to SharedArrayBuffer or an external cache.)
Fix — Preload immutable data once per process and avoid mutation:
import cluster from "node:cluster";
import os from "node:os";
const immutableConfig = Object.freeze(loadLargeRoutingTable());
if (cluster.isPrimary) {
const workerCount = os.cpus().length;
for (let i = 0; i < workerCount; i++) {
cluster.fork({ ROUTING_TABLE_VERSION: immutableConfig.version });
}
} else {
// Read-only usage preserves CoW pages better than mutating shared blobs.
startHttpServer({ routingTable: immutableConfig });
}
Problem 4: Restart loops create unstable worker pools
Symptom: Workers crash and restart continuously, causing request drops and noisy alerts.
Root cause: Process manager restarts instantly without backoff or max restart limits, amplifying failures.
Fix — Add restart budget + exponential delay in supervisor:
type WorkerState = { restarts: number; lastStartMs: number };
const workerState = new Map<number, WorkerState>();
function nextRestartDelay(restarts: number): number {
return Math.min(30_000, 500 * Math.pow(2, restarts));
}
function shouldRestart(exitCode: number | null, restarts: number): boolean {
const maxRestarts = 5;
return exitCode !== 0 && restarts < maxRestarts;
}
function onWorkerExit(workerId: number, exitCode: number | null): void {
const current = workerState.get(workerId) ?? { restarts: 0, lastStartMs: Date.now() };
if (!shouldRestart(exitCode, current.restarts)) return;
const delay = nextRestartDelay(current.restarts);
setTimeout(() => {
spawnWorker();
}, delay);
workerState.set(workerId, { restarts: current.restarts + 1, lastStartMs: Date.now() + delay });
}
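For intuition, the backoff formula above yields this delay schedule — doubling from 500ms and hitting the 30s cap at the seventh restart:

```typescript
// Same formula as above; printing the first 8 delays shows the cap.
function nextRestartDelay(restarts: number): number {
  return Math.min(30_000, 500 * Math.pow(2, restarts));
}

const schedule = Array.from({ length: 8 }, (_, i) => nextRestartDelay(i));
console.log(schedule); // [500, 1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

A common refinement (not shown above) is to reset the restart counter once a worker has stayed up for some stable window, so a crash next week doesn't inherit last week's backoff.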
Problem 5: One worker is overloaded while others are idle
Symptom: CPU utilization is skewed; one worker has a deep queue while others are underused.
Root cause: Naive round-robin dispatch ignores in-flight load and queue depth.
Fix — Route requests by least in-flight load:
type WorkerRef = { id: number; inFlight: number; send(req: unknown): void };
class LeastLoadedDispatcher {
constructor(private workers: WorkerRef[]) {}
dispatch(req: unknown): void {
const worker = this.workers.reduce((best, current) =>
current.inFlight < best.inFlight ? current : best,
);
worker.inFlight++;
worker.send(req);
}
onComplete(workerId: number): void {
const worker = this.workers.find((w) => w.id === workerId);
if (worker) worker.inFlight = Math.max(0, worker.inFlight - 1);
}
}
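A usage sketch with stub workers (repeating the dispatcher so the snippet runs standalone; the stubs just record what they were sent):

```typescript
type WorkerRef = { id: number; inFlight: number; send(req: unknown): void };

class LeastLoadedDispatcher {
  constructor(private workers: WorkerRef[]) {}
  dispatch(req: unknown): void {
    const worker = this.workers.reduce((best, current) =>
      current.inFlight < best.inFlight ? current : best,
    );
    worker.inFlight++;
    worker.send(req);
  }
  onComplete(workerId: number): void {
    const worker = this.workers.find((w) => w.id === workerId);
    if (worker) worker.inFlight = Math.max(0, worker.inFlight - 1);
  }
}

// Two stub workers that just record what they receive.
const received: Record<number, unknown[]> = { 1: [], 2: [] };
const workers: WorkerRef[] = [1, 2].map((id) => ({
  id,
  inFlight: 0,
  send: (req) => received[id].push(req),
}));

const dispatcher = new LeastLoadedDispatcher(workers);
dispatcher.dispatch("a"); // both idle — worker 1 wins the tie
dispatcher.dispatch("b"); // worker 2 is now less loaded
dispatcher.onComplete(1); // worker 1 finishes "a"
dispatcher.dispatch("c"); // worker 1 again (0 vs 1 in flight)
console.log(received); // { 1: ["a", "c"], 2: ["b"] }
```

In a real cluster deployment, inFlight would be updated from worker IPC messages rather than local stubs, and stale counters (e.g. after a worker crash) need to be reset on respawn.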
Problem 6: Internal queues grow until OOM
Symptom: Memory climbs steadily during load tests and process crashes with out-of-memory.
Root cause: Producers outpace consumers and no backpressure/high-water mark logic limits queue growth.
Fix — Add bounded queues with high/low-water backpressure signals:
class BoundedQueue<T> {
private items: T[] = [];
constructor(private highWater = 10_000, private lowWater = 5_000) {}
push(item: T): boolean {
if (this.items.length >= this.highWater) return false;
this.items.push(item);
return true;
}
shift(): T | undefined {
return this.items.shift();
}
shouldResumeProducers(): boolean {
return this.items.length <= this.lowWater;
}
size(): number {
return this.items.length;
}
}
const queue = new BoundedQueue<string>();
const accepted = queue.push("job-123");
if (!accepted) {
throw new Error("Backpressure: queue is full, retry later");
}
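The separate low-water mark matters: it adds hysteresis, so producers don't flap on and off around a single threshold. A standalone demo with tiny watermarks (repeating the queue class so the snippet runs; high=3 and low=1 are chosen only to make the effect visible):

```typescript
class BoundedQueue<T> {
  private items: T[] = [];
  constructor(private highWater = 10_000, private lowWater = 5_000) {}
  push(item: T): boolean {
    if (this.items.length >= this.highWater) return false;
    this.items.push(item);
    return true;
  }
  shift(): T | undefined {
    return this.items.shift();
  }
  shouldResumeProducers(): boolean {
    return this.items.length <= this.lowWater;
  }
  size(): number {
    return this.items.length;
  }
}

const q = new BoundedQueue<number>(3, 1);
q.push(1); q.push(2); q.push(3);
const rejected = !q.push(4);                      // true — at high water
q.shift();                                        // drain to size 2
const resumeTooEarly = q.shouldResumeProducers(); // false — still above low
q.shift();                                        // drain to size 1
const resumeNow = q.shouldResumeProducers();      // true — at low water
```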
Key Takeaways
- Processes provide isolation, threads provide efficiency: Processes have separate address spaces (safe), threads share memory (fast). Choose based on whether you need isolation or low-overhead communication.
- The event loop is not "single-threaded": Node.js has one JS thread but uses kernel async I/O (epoll/kqueue) plus a thread pool for blocking operations. The single-thread model avoids locks, not parallelism.
- Copy-on-Write makes fork() cheap: The OS doesn't copy memory on fork — just page table entries. Actual copying happens lazily on write. This is why pre-fork servers work well.
- Work-stealing balances load automatically: Each processor has its own queue. Idle processors steal from busy ones, naturally balancing work without centralized scheduling overhead.
- Backpressure prevents cascade failures: When producers outpace consumers, you need explicit flow control (high/low water marks). Without it, unbounded queues consume memory until OOM.
- Green threads trade OS control for efficiency: Goroutines and virtual threads achieve million-scale concurrency by managing scheduling in userspace with tiny stacks, at the cost of runtime complexity.
- The GIL is a design choice, not a bug: Python's GIL simplifies the interpreter and C extension model. For CPU parallelism, use multiprocessing. For I/O concurrency, asyncio works fine.
- Signal handling is critical for production: SIGTERM starts graceful shutdown, SIGKILL force-kills. Your process must handle SIGTERM to drain connections and clean up before the orchestrator sends SIGKILL.
- Cluster mode shares ports, not memory: Node.js cluster workers share a server port via file descriptor passing, but each worker has its own V8 heap. Use IPC or SharedArrayBuffer for coordination.
- Match the concurrency model to the workload: I/O-bound → event loop, CPU-bound → thread pool, mixed → event loop + worker threads. Attempting CPU-heavy work on an event loop thread is the #1 Node.js performance anti-pattern.