Atomics, SharedArrayBuffer, and True Parallelism in JavaScript
JavaScript has always been single-threaded—until SharedArrayBuffer. With shared memory between workers, you can build truly parallel algorithms, but you also inherit all the complexity of concurrent programming: data races, memory ordering, and synchronization primitives. This article covers how to use these low-level tools correctly and when they genuinely change what's possible in JavaScript applications.
The Shared Memory Model
┌─────────────────────────────────────────────────────────────────────────────┐
│ SHAREDARRAYBUFFER ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ WITHOUT SharedArrayBuffer (postMessage): │
│ ──────────────────────────────────────── │
│ │
│ Main Thread Worker Thread │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ ArrayBuffer │ │ ArrayBuffer │ │
│ │ [1,2,3,4] │ │ [1,2,3,4] │ ← COPY (structured clone) │
│ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │
│ │ postMessage(arr) │ │
│ │ ─────────────────────▶│ │
│ │ (serialize, │ │
│ │ copy memory, │ │
│ │ deserialize) │ │
│ │
│ WITH SharedArrayBuffer: │
│ ───────────────────────── │
│ │
│ Main Thread Worker Thread │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ View │ │ View │ │
│ │ Int32Array │ │ Int32Array │ │
│ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │
│ └───────────┬───────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────┐ │
│ │ SharedArrayBuffer │ ← SAME MEMORY │
│ │ [1, 2, 3, 4] │ │
│ │ (shared between │ │
│ │ all threads) │ │
│ └─────────────────────────┘ │
│ │
│ Changes by one thread are visible to all threads! │
│ But this creates data race potential... │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Basic SharedArrayBuffer Usage
// Main thread
const sharedBuffer = new SharedArrayBuffer(1024); // 1KB shared memory
const view = new Int32Array(sharedBuffer);
// Initialize data
view[0] = 42;
view[1] = 100;
// Send to worker (buffer is NOT copied, just the reference)
worker.postMessage({ buffer: sharedBuffer });
// Worker can now read AND WRITE the same memory
// Changes are immediately visible to main thread
// worker.js
self.onmessage = function(e) {
const view = new Int32Array(e.data.buffer);
console.log(view[0]); // 42 - reading shared memory
view[0] = 99; // Main thread sees this change!
};
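The copy-versus-alias difference is easy to see without spinning up a worker. In this sketch, `structuredClone` stands in for what postMessage's structured clone does to a plain ArrayBuffer (assumes Node 17+ or a modern browser for `structuredClone`):

```javascript
// Plain ArrayBuffer: the structured clone algorithm copies the bytes,
// so the clone no longer tracks the original after a write.
const plain = new Int32Array(new ArrayBuffer(4));
plain[0] = 1;
const cloned = new Int32Array(structuredClone(plain.buffer));
plain[0] = 2;
console.log(cloned[0]); // 1 - independent copy

// SharedArrayBuffer: every view aliases the same underlying memory,
// which is what a worker receives via postMessage.
const sab = new SharedArrayBuffer(4);
const viewA = new Int32Array(sab);
const viewB = new Int32Array(sab);
viewA[0] = 2;
console.log(viewB[0]); // 2 - same bytes
```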
The Data Race Problem
// WITHOUT Atomics - DATA RACE!
// Main thread
const view = new Int32Array(sharedBuffer);
view[0] = 0;
// Spawn 4 workers, each increments view[0] by 1000000
// Expected result: 4000000
// Actual result: UNDEFINED (usually less than 4000000)
// Worker code:
for (let i = 0; i < 1000000; i++) {
view[0]++; // NOT ATOMIC!
}
// What happens internally for view[0]++:
// 1. Read current value from memory
// 2. Add 1 to the value
// 3. Write new value to memory
// Race condition timeline:
// Thread A: Read value (100)
// Thread B: Read value (100) ← Both see 100
// Thread A: Write 101
// Thread B: Write 101 ← Lost update! Should be 102
// This is why we need Atomics
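The lost-update interleaving above can be reproduced deterministically in a single thread by splitting the read-modify-write by hand. This is a sketch of the failure mode, not a real race:

```javascript
const sab = new SharedArrayBuffer(4);
const view = new Int32Array(sab);
view[0] = 100;

// Simulate two threads that both read BEFORE either writes:
const readA = view[0]; // "Thread A" reads 100
const readB = view[0]; // "Thread B" reads 100
view[0] = readA + 1;   // A writes 101
view[0] = readB + 1;   // B writes 101, so A's increment is lost
console.log(view[0]);  // 101, not 102
```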
Atomics: Safe Shared Memory Operations
// Atomics provide thread-safe operations
const view = new Int32Array(sharedBuffer);
// Atomic read and write
Atomics.store(view, 0, 42); // Atomic write
const value = Atomics.load(view, 0); // Atomic read
// Atomic arithmetic
Atomics.add(view, 0, 5); // view[0] += 5, atomically
Atomics.sub(view, 0, 3); // view[0] -= 3, atomically
// Atomic bitwise operations
Atomics.and(view, 0, 0xFF); // view[0] &= 0xFF
Atomics.or(view, 0, 0x100); // view[0] |= 0x100
Atomics.xor(view, 0, 0x55); // view[0] ^= 0x55
// Atomic exchange
const old = Atomics.exchange(view, 0, 99); // Set to 99, return old value
// Compare-and-swap (CAS) - foundation of lock-free algorithms
const oldValue = Atomics.compareExchange(
view,
0, // index
expected, // expected current value
newValue // value to set if current === expected
);
// Returns actual old value (compare with expected to know if swap happened)
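The return-value convention is easy to verify in isolation (single-threaded sketch):

```javascript
const view = new Int32Array(new SharedArrayBuffer(4));
Atomics.store(view, 0, 10);

// Expected value matches the current value: swap happens, old value returned
console.log(Atomics.compareExchange(view, 0, 10, 20)); // 10
console.log(Atomics.load(view, 0));                    // 20

// Expected value is stale: no swap, actual current value returned
console.log(Atomics.compareExchange(view, 0, 10, 30)); // 20
console.log(Atomics.load(view, 0));                    // still 20
```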
Compare-and-Exchange: The Building Block
// CAS is how you build lock-free data structures
function atomicIncrement(view, index) {
while (true) {
const current = Atomics.load(view, index);
const next = current + 1;
// Try to swap current → next
const actual = Atomics.compareExchange(view, index, current, next);
if (actual === current) {
// Swap succeeded! No other thread modified it
return next;
}
// Another thread modified it, retry with new value
}
}
// This is what Atomics.add does internally, but you can use CAS
// for more complex operations:
function atomicMultiply(view, index, multiplier) {
while (true) {
const current = Atomics.load(view, index);
const next = current * multiplier;
if (Atomics.compareExchange(view, index, current, next) === current) {
return next;
}
// Retry - another thread interfered
}
}
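From a single thread the CAS loop always succeeds on the first try, which makes a quick sanity check easy (the retry loop is reproduced here so the snippet is self-contained):

```javascript
// Same CAS retry loop as above
function atomicIncrement(view, index) {
  while (true) {
    const current = Atomics.load(view, index);
    // If no other thread changed the slot, this swap succeeds
    if (Atomics.compareExchange(view, index, current, current + 1) === current) {
      return current + 1;
    }
    // Otherwise loop and retry with the fresh value
  }
}

const view = new Int32Array(new SharedArrayBuffer(4));
Atomics.store(view, 0, 41);
console.log(atomicIncrement(view, 0)); // 42
console.log(Atomics.load(view, 0));    // 42
```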
Memory Ordering and Happens-Before
┌─────────────────────────────────────────────────────────────────────────────┐
│ MEMORY ORDERING GUARANTEES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ JavaScript uses Sequential Consistency for Atomics: │
│ ───────────────────────────────────────────────── │
│ All atomic operations appear to execute in SOME sequential order │
│ that is consistent with the program order of each thread. │
│ │
│ HAPPENS-BEFORE RELATIONSHIPS: │
│ │
│ 1. Within a thread, operations happen in program order │
│ │
│ 2. An Atomics.store() HAPPENS-BEFORE any Atomics.load() that reads │
│ the value stored │
│ │
│ 3. Atomics.wait() HAPPENS-BEFORE the matching Atomics.notify() │
│ │
│ Thread A Thread B │
│ ───────── ───────── │
│ │
│ data = 42; │
│ Atomics.store(flag, 0, 1); ─────────┐ │
│ │ HAPPENS-BEFORE │
│ ┌─────────┘ │
│ ▼ │
│ while(Atomics.load(flag,0) === 0); │
│ console.log(data); // GUARANTEED to see 42 │
│ │
│ WITHOUT Atomics, Thread B might see stale 'data' even after flag is set! │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Atomics.wait and Atomics.notify: Thread Synchronization
// Atomics.wait() - Block thread until condition is met
// Atomics.notify() - Wake up waiting threads
// IMPORTANT: In browsers, Atomics.wait() throws on the main thread,
// which is not allowed to block; call it from a worker.
// (Node.js permits blocking waits on the main thread as well.)
// Worker code (waiter):
const view = new Int32Array(sharedBuffer);
// Wait for view[0] to become non-zero
// Blocks the thread entirely (no event loop)
const result = Atomics.wait(view, 0, 0); // Wait while value === 0
// result can be:
// 'ok' - was woken by notify
// 'not-equal' - value wasn't 0 when we checked
// 'timed-out' - timeout expired
// With timeout:
Atomics.wait(view, 0, 0, 1000); // Wait max 1000ms
// Main thread (notifier):
const view = new Int32Array(sharedBuffer);
view[0] = 1; // Set the value
// Wake up one waiting thread
const wokenCount = Atomics.notify(view, 0, 1);
// Wake up all waiting threads
Atomics.notify(view, 0, Infinity);
// Atomics.notify returns the number of waiters actually woken
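Because Node.js allows the main thread to block, two of the three return values can be observed deterministically without a second thread (observing 'ok' requires another thread to call Atomics.notify):

```javascript
const view = new Int32Array(new SharedArrayBuffer(4)); // view[0] is 0

// The expected value (1) doesn't match the current value (0),
// so wait returns immediately without blocking:
console.log(Atomics.wait(view, 0, 1)); // 'not-equal'

// The expected value matches, nobody notifies us, and the 50ms
// timeout fires:
console.log(Atomics.wait(view, 0, 0, 50)); // 'timed-out'
```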
Building a Mutex with Atomics
// A simple mutex (mutual exclusion lock)
class Mutex {
private lockView: Int32Array;
private lockIndex = 0;
private UNLOCKED = 0;
private LOCKED = 1;
constructor(sharedBuffer: SharedArrayBuffer, byteOffset: number = 0) {
this.lockView = new Int32Array(sharedBuffer, byteOffset, 1);
// Initialize to unlocked state
Atomics.store(this.lockView, this.lockIndex, this.UNLOCKED);
}
// NOTE: lock() blocks via Atomics.wait, so in browsers it must be
// called from a worker, never the main thread
lock(): void {
while (true) {
// Try to acquire lock (CAS: UNLOCKED → LOCKED)
const oldValue = Atomics.compareExchange(
this.lockView,
this.lockIndex,
this.UNLOCKED,
this.LOCKED
);
if (oldValue === this.UNLOCKED) {
// Successfully acquired lock
return;
}
// Lock is held by another thread, wait
Atomics.wait(this.lockView, this.lockIndex, this.LOCKED);
// When woken, loop and try again
}
}
unlock(): void {
// Release lock
Atomics.store(this.lockView, this.lockIndex, this.UNLOCKED);
// Wake one waiting thread
Atomics.notify(this.lockView, this.lockIndex, 1);
}
// Try to acquire without blocking
tryLock(): boolean {
const oldValue = Atomics.compareExchange(
this.lockView,
this.lockIndex,
this.UNLOCKED,
this.LOCKED
);
return oldValue === this.UNLOCKED;
}
}
// Usage in worker:
const mutex = new Mutex(sharedBuffer, 0);
const data = new Int32Array(sharedBuffer, 4);
function criticalSection() {
mutex.lock();
try {
// Only one thread can be here at a time
const current = data[0];
// ... complex operation ...
data[0] = current + 1;
} finally {
mutex.unlock();
}
}
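tryLock is the easiest piece to verify without spinning up workers. This is a plain-JavaScript reduction of the class above, assuming the same layout (lock word at Int32 index 0):

```javascript
const lockView = new Int32Array(new SharedArrayBuffer(4));
const UNLOCKED = 0, LOCKED = 1;

function tryLock() {
  // Acquire only if currently unlocked
  return Atomics.compareExchange(lockView, 0, UNLOCKED, LOCKED) === UNLOCKED;
}
function unlock() {
  Atomics.store(lockView, 0, UNLOCKED);
  Atomics.notify(lockView, 0, 1); // no-op here: nobody is waiting
}

console.log(tryLock()); // true  - lock acquired
console.log(tryLock()); // false - already held, CAS fails
unlock();
console.log(tryLock()); // true  - acquirable again
```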
Building a Lock-Free Queue
// A simple lock-free single-producer, single-consumer queue
class SPSCQueue {
private buffer: Int32Array;
private capacity: number;
// Layout: [head, tail, data...]. Head and tail live in the shared
// buffer itself so both threads see them. One slot is kept empty to
// distinguish a full queue from an empty one, so `capacity` slots
// hold at most capacity - 1 items.
private HEAD_INDEX = 0;
private TAIL_INDEX = 1;
private DATA_START = 2;
constructor(sharedBuffer: SharedArrayBuffer, capacity: number) {
this.buffer = new Int32Array(sharedBuffer);
this.capacity = capacity;
// Initialize head and tail to 0
Atomics.store(this.buffer, this.HEAD_INDEX, 0);
Atomics.store(this.buffer, this.TAIL_INDEX, 0);
}
// Producer calls this
push(value: number): boolean {
const tail = Atomics.load(this.buffer, this.TAIL_INDEX);
const head = Atomics.load(this.buffer, this.HEAD_INDEX);
const nextTail = (tail + 1) % this.capacity;
if (nextTail === head) {
// Queue is full
return false;
}
// Write the value
this.buffer[this.DATA_START + tail] = value;
// Publish the new tail (release semantics)
Atomics.store(this.buffer, this.TAIL_INDEX, nextTail);
// Wake any waiting consumer
Atomics.notify(this.buffer, this.TAIL_INDEX, 1);
return true;
}
// Consumer calls this
pop(): number | null {
const head = Atomics.load(this.buffer, this.HEAD_INDEX);
const tail = Atomics.load(this.buffer, this.TAIL_INDEX);
if (head === tail) {
// Queue is empty
return null;
}
// Read the value
const value = this.buffer[this.DATA_START + head];
// Advance head
const nextHead = (head + 1) % this.capacity;
Atomics.store(this.buffer, this.HEAD_INDEX, nextHead);
return value;
}
// Consumer can wait for data
popBlocking(): number {
while (true) {
const value = this.pop();
if (value !== null) {
return value;
}
// Wait for producer to publish
const tail = Atomics.load(this.buffer, this.TAIL_INDEX);
Atomics.wait(this.buffer, this.TAIL_INDEX, tail);
}
}
}
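A compact plain-JavaScript version of the same ring buffer makes the full/empty behavior easy to check from one thread. Note the one-slot-empty convention: a queue with `capacity` slots holds at most capacity - 1 items.

```javascript
// Layout: [head, tail, data...]; one slot stays empty to tell full from empty
function makeQueue(capacity) {
  const buf = new Int32Array(new SharedArrayBuffer((capacity + 2) * 4));
  return {
    push(value) {
      const tail = Atomics.load(buf, 1);
      const next = (tail + 1) % capacity;
      if (next === Atomics.load(buf, 0)) return false; // full
      buf[2 + tail] = value;
      Atomics.store(buf, 1, next); // publish the write
      return true;
    },
    pop() {
      const head = Atomics.load(buf, 0);
      if (head === Atomics.load(buf, 1)) return null; // empty
      const value = buf[2 + head];
      Atomics.store(buf, 0, (head + 1) % capacity);
      return value;
    },
  };
}

const q = makeQueue(4); // holds at most 3 items
console.log(q.push(1), q.push(2), q.push(3)); // true true true
console.log(q.push(4));                       // false - full
console.log(q.pop(), q.pop(), q.pop());       // 1 2 3
console.log(q.pop());                         // null - empty
```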
Real-World Use Cases
Use Case 1: Parallel Image Processing
// Main thread
async function processImageParallel(imageData: ImageData): Promise<ImageData> {
const { width, height, data } = imageData;
// Create shared buffer for image data
const sharedBuffer = new SharedArrayBuffer(data.length);
const sharedView = new Uint8ClampedArray(sharedBuffer);
sharedView.set(data);
// Divide work among workers
const numWorkers = navigator.hardwareConcurrency || 4;
const rowsPerWorker = Math.ceil(height / numWorkers);
const workers = [];
const promises = [];
for (let i = 0; i < numWorkers; i++) {
const worker = new Worker('image-worker.js');
workers.push(worker);
const startRow = i * rowsPerWorker;
const endRow = Math.min((i + 1) * rowsPerWorker, height);
const promise = new Promise((resolve) => {
worker.onmessage = () => resolve(undefined);
worker.postMessage({
buffer: sharedBuffer,
width,
startRow,
endRow
});
});
promises.push(promise);
}
// Wait for all workers to complete
await Promise.all(promises);
// Create result (data is already in sharedView)
const result = new ImageData(width, height);
result.data.set(sharedView);
// Cleanup workers
workers.forEach(w => w.terminate());
return result;
}
// image-worker.js
self.onmessage = function(e) {
const { buffer, width, startRow, endRow } = e.data;
const view = new Uint8ClampedArray(buffer);
for (let y = startRow; y < endRow; y++) {
for (let x = 0; x < width; x++) {
const i = (y * width + x) * 4;
// Example: Grayscale conversion
const gray = view[i] * 0.299 + view[i+1] * 0.587 + view[i+2] * 0.114;
view[i] = gray;
view[i+1] = gray;
view[i+2] = gray;
// Alpha unchanged
}
}
self.postMessage('done');
};
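The per-pixel kernel is pure arithmetic, so it can be sanity-checked on a tiny buffer without any workers (one opaque red pixel; Uint8ClampedArray rounds the luma result on assignment):

```javascript
const view = new Uint8ClampedArray(new SharedArrayBuffer(4));
view.set([255, 0, 0, 255]); // one opaque red pixel (RGBA)

const i = 0;
const gray = view[i] * 0.299 + view[i + 1] * 0.587 + view[i + 2] * 0.114;
view[i] = gray;     // 76.245 rounds to 76 on assignment
view[i + 1] = gray;
view[i + 2] = gray;
// Alpha (view[i + 3]) unchanged

console.log(Array.from(view)); // [76, 76, 76, 255]
```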
Use Case 2: Parallel Data Processing in Node.js
// main.js
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
if (isMainThread) {
async function parallelSum(numbers: number[]): Promise<number> {
// Create shared buffer for numbers
const sharedBuffer = new SharedArrayBuffer(numbers.length * 4);
const view = new Int32Array(sharedBuffer);
numbers.forEach((n, i) => view[i] = n);
// Create buffer for partial sums
const numWorkers = 4;
const resultsBuffer = new SharedArrayBuffer(numWorkers * 4);
const results = new Int32Array(resultsBuffer);
const workers = [];
const chunkSize = Math.ceil(numbers.length / numWorkers);
for (let i = 0; i < numWorkers; i++) {
const start = i * chunkSize;
const end = Math.min(start + chunkSize, numbers.length);
const worker = new Worker(__filename, {
workerData: {
dataBuffer: sharedBuffer,
resultsBuffer,
workerIndex: i,
start,
end
}
});
workers.push(new Promise((resolve, reject) => {
worker.on('message', resolve);
worker.on('error', reject);
}));
}
await Promise.all(workers);
// Sum partial results
let total = 0;
for (let i = 0; i < numWorkers; i++) {
total += results[i];
}
return total;
}
// Test it
const numbers = Array.from({ length: 10000000 }, (_, i) => i);
parallelSum(numbers).then(console.log);
} else {
// Worker code
const { dataBuffer, resultsBuffer, workerIndex, start, end } = workerData;
const data = new Int32Array(dataBuffer);
const results = new Int32Array(resultsBuffer);
let sum = 0;
for (let i = start; i < end; i++) {
sum += data[i];
}
// Write result atomically
Atomics.store(results, workerIndex, sum);
parentPort.postMessage('done');
}
Use Case 3: Real-Time Audio Processing
// Audio worklet with shared buffer for zero-copy audio processing
// main.js
async function setupAudioProcessing() {
const audioContext = new AudioContext();
await audioContext.audioWorklet.addModule('processor.js');
// Shared buffer for audio parameters
const paramBuffer = new SharedArrayBuffer(16);
const params = new Float32Array(paramBuffer);
// Create processor node
const processorNode = new AudioWorkletNode(audioContext, 'shared-processor', {
processorOptions: { paramBuffer }
});
// Connect to output
const source = audioContext.createOscillator();
source.connect(processorNode);
processorNode.connect(audioContext.destination);
// Update parameters from main thread (no message passing!)
// NOTE: Atomics.store throws on a Float32Array view; Atomics only
// accept integer typed arrays. Plain aligned 32-bit writes are used
// here instead. For strict atomicity, reinterpret the float bits
// through an Int32Array view over the same buffer.
function setVolume(value: number) {
  params[0] = value; // plain write, picked up by the worklet thread
}
function setFrequency(value: number) {
  params[1] = value;
}
return { setVolume, setFrequency };
}
// processor.js (AudioWorklet)
class SharedProcessor extends AudioWorkletProcessor {
private params: Float32Array;
constructor(options) {
super();
this.params = new Float32Array(options.processorOptions.paramBuffer);
}
process(inputs, outputs, parameters) {
const output = outputs[0];
// Plain reads: Atomics.load rejects Float32Array views, so the
// float parameters are read directly from the shared view
const volume = this.params[0];
const frequency = this.params[1]; // available for pitch-based effects
for (let channel = 0; channel < output.length; channel++) {
const outputChannel = output[channel];
for (let i = 0; i < outputChannel.length; i++) {
// Apply volume from shared buffer (updated in real-time)
outputChannel[i] = inputs[0][channel][i] * volume;
}
}
return true;
}
}
registerProcessor('shared-processor', SharedProcessor);
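Because Atomics reject Float32Array views, a strictly atomic float parameter can be published by reinterpreting its bits as an Int32 over the same buffer. This is a sketch using a per-thread scratch DataView for the bit cast:

```javascript
const paramBuffer = new SharedArrayBuffer(16);
const asInts = new Int32Array(paramBuffer);       // Atomics-compatible view
const scratch = new DataView(new ArrayBuffer(4)); // local, not shared

function storeFloat(index, value) {
  scratch.setFloat32(0, value, true);
  Atomics.store(asInts, index, scratch.getInt32(0, true)); // atomic publish
}
function loadFloat(index) {
  scratch.setInt32(0, Atomics.load(asInts, index), true);
  return scratch.getFloat32(0, true);
}

storeFloat(0, 0.5);
console.log(loadFloat(0)); // 0.5 (exactly representable in float32)
```

Each thread keeps its own scratch DataView; only the Int32Array view over the SharedArrayBuffer is shared, so every read and write of the parameter slot goes through Atomics.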
When NOT to Use SharedArrayBuffer
// SharedArrayBuffer adds complexity. Use it only when:
// 1. postMessage copying overhead is a measurable bottleneck
// 2. You need true parallelism (not just concurrency)
// 3. You understand the synchronization requirements
// DON'T use for:
// ❌ Simple worker communication
// Use postMessage with Transferable objects instead
worker.postMessage(arrayBuffer, [arrayBuffer]); // Transfer, not copy
// ❌ Sharing small amounts of data
// The synchronization overhead exceeds the copying cost
// ❌ Data that's read-only
// Just copy it; no synchronization needed for reads
// ❌ When you're not sure about thread safety
// Bugs are subtle and hard to reproduce
// DO use for:
// ✅ Large datasets processed in parallel
// ✅ Real-time applications (audio, video)
// ✅ Compute-intensive algorithms (physics, ML inference)
// ✅ When profiling shows postMessage is a bottleneck
Security Considerations
// SharedArrayBuffer was disabled in browsers after Spectre/Meltdown
// It's re-enabled with Cross-Origin Isolation:
// Required HTTP headers:
// Cross-Origin-Opener-Policy: same-origin
// Cross-Origin-Embedder-Policy: require-corp
// In your server:
res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
// Check if available:
if (typeof SharedArrayBuffer !== 'undefined') {
// Can use SharedArrayBuffer
} else {
// Fall back to postMessage with transfers
}
// Cross-origin resources need:
// Cross-Origin-Resource-Policy: cross-origin
// Or be served from the same origin
Key Takeaways
- SharedArrayBuffer enables true parallelism: multiple threads can read and write the same memory, enabling parallel algorithms impossible with postMessage.
- Atomics prevent data races: without Atomics, concurrent reads/writes produce undefined results. Use Atomics for ALL shared memory access.
- CAS is the foundation of lock-free programming: Atomics.compareExchange enables building complex synchronization primitives without locks.
- Atomics.wait/notify provide blocking synchronization: like condition variables in other languages. In browsers they may only block in workers, never on the main thread.
- Sequential consistency is guaranteed: atomic operations have well-defined memory ordering. Non-atomic operations on shared memory do not.
- Cross-origin isolation is required in browsers: the COOP and COEP security headers must be set for SharedArrayBuffer to be available.
- Use sparingly and measure first: SharedArrayBuffer adds complexity. Only use it when postMessage overhead is a proven bottleneck.
- Real use cases exist: parallel image/video processing, real-time audio, and compute-intensive algorithms all benefit from true parallelism.
SharedArrayBuffer and Atomics bring systems programming to JavaScript. With great power comes great responsibility—you're now dealing with the same concurrency challenges that C++ and Rust programmers face. Use these tools when parallelism genuinely solves your problem, and always default to simpler message-passing when it's sufficient.