How Babel Transforms Your Code: AST Parsing, Plugin Visitors, and the Compilation Pipeline
The Pipeline
Every Babel transformation follows the same three-stage pipeline: Parse → Transform → Generate. Understanding this pipeline is the foundation for writing custom plugins and codemods, and for seeing how your JSX becomes function calls.
Source Code (string)
│
▼
┌──────────┐
│ PARSE │ @babel/parser (formerly babylon)
│ │ Source string → AST (Abstract Syntax Tree)
└────┬─────┘
│ AST (JSON-like tree structure)
▼
┌──────────┐
│TRANSFORM │ @babel/traverse + visitor plugins
│ │ Walk AST → match patterns → mutate nodes
└────┬─────┘
│ Modified AST
▼
┌──────────┐
│ GENERATE │ @babel/generator
│ │ AST → source string + source map
└──────────┘
│
▼
Output Code (string) + Source Map
Stage 1: Parsing — Source to AST
What Is An AST?
Source: const x = a + b;
AST (simplified):
Program
└── VariableDeclaration (kind: "const")
└── VariableDeclarator
├── id: Identifier (name: "x")
└── init: BinaryExpression (operator: "+")
├── left: Identifier (name: "a")
└── right: Identifier (name: "b")
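The simplified tree above can be written down as plain data. The following is a minimal sketch, assuming hand-rolled type names (`IdentifierNode`, `ProgramNode`, and so on are illustrative, not Babel's own typings) and omitting the position fields that real Babel nodes carry:

```typescript
// Node shapes mirror the simplified tree; real Babel nodes also carry
// fields such as `loc`, `start`, and `end`, which are left out here.
type IdentifierNode = { type: "Identifier"; name: string };
type BinaryExpressionNode = {
  type: "BinaryExpression";
  operator: "+" | "-" | "*" | "/";
  left: ExpressionNode;
  right: ExpressionNode;
};
type ExpressionNode = IdentifierNode | BinaryExpressionNode;
type VariableDeclaratorNode = {
  type: "VariableDeclarator";
  id: IdentifierNode;
  init: ExpressionNode | null;
};
type VariableDeclarationNode = {
  type: "VariableDeclaration";
  kind: "var" | "let" | "const";
  declarations: VariableDeclaratorNode[];
};
type ProgramNode = { type: "Program"; body: VariableDeclarationNode[] };

// Hand-built AST for: const x = a + b;
const sampleAst: ProgramNode = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "x" },
          init: {
            type: "BinaryExpression",
            operator: "+",
            left: { type: "Identifier", name: "a" },
            right: { type: "Identifier", name: "b" },
          },
        },
      ],
    },
  ],
};

// Collect identifier names by branching on the `type` discriminant.
function collectNames(node: ExpressionNode): string[] {
  return node.type === "Identifier"
    ? [node.name]
    : [...collectNames(node.left), ...collectNames(node.right)];
}
```

Because every node has a string `type` field, any tool that walks the tree can decide what to do with a node by looking at that one field.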
The Full AST for Real Code
// Source:
const greet = (name: string): string => `Hello, ${name}!`;
// AST (major nodes):
{
type: "Program",
body: [{
type: "VariableDeclaration",
kind: "const",
declarations: [{
type: "VariableDeclarator",
id: { type: "Identifier", name: "greet" },
init: {
type: "ArrowFunctionExpression",
params: [{
type: "Identifier",
name: "name",
typeAnnotation: {
type: "TSTypeAnnotation",
typeAnnotation: { type: "TSStringKeyword" }
}
}],
returnType: {
type: "TSTypeAnnotation",
typeAnnotation: { type: "TSStringKeyword" }
},
body: {
type: "TemplateLiteral",
quasis: [
{ type: "TemplateElement", value: { raw: "Hello, " } },
{ type: "TemplateElement", value: { raw: "!" } }
],
expressions: [
{ type: "Identifier", name: "name" }
]
}
}
}]
}]
}
How @babel/parser Works
import { parse } from '@babel/parser';
const ast = parse(sourceCode, {
sourceType: 'module', // 'module' | 'script' | 'unambiguous'
plugins: [
'typescript', // Enable TypeScript parsing
'jsx', // Enable JSX parsing
'decorators', // Stage 3 decorators
'importAttributes', // import x from './y' with { type: 'json' }
'optionalChaining', // a?.b?.c (enabled by default in current @babel/parser; listed for clarity)
'nullishCoalescingOperator', // a ?? b (likewise enabled by default)
],
});
/**
* Parser Pipeline (internal):
*
* 1. TOKENIZER (Lexer)
* Source string → Token stream
* "const x = 1 + 2;"
* → [CONST, IDENT("x"), EQ, NUM(1), PLUS, NUM(2), SEMI]
*
* 2. PARSER (Recursive Descent)
* Token stream → AST
* Implements ECMAScript grammar rules:
* Program → StatementList
* Statement → VariableDeclaration | ExpressionStatement | ...
* Expression → BinaryExpression | CallExpression | ...
* Handles operator precedence, associativity, ASI (semicolons)
*
* 3. VALIDATION
* Check for syntax errors that the grammar allows but spec forbids:
* - Duplicate parameter names in strict mode
* - await/yield in wrong context
* - Invalid assignment targets (1 = 2)
*/
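The tokenizer step described in the comment above can be sketched in a few lines. This is a toy lexer under stated assumptions: it knows only three keywords, integers, and a handful of punctuators, and the names `MiniToken` and `tokenizeMini` are made up for this example. The real @babel/parser tokenizer handles the full ECMAScript lexical grammar.

```typescript
type MiniToken =
  | { kind: "keyword"; value: string }
  | { kind: "ident"; value: string }
  | { kind: "num"; value: number }
  | { kind: "punct"; value: string };

const MINI_KEYWORDS = new Set(["const", "let", "var"]);

function tokenizeMini(source: string): MiniToken[] {
  const tokens: MiniToken[] = [];
  let i = 0;
  while (i < source.length) {
    const ch = source.charAt(i);
    if (/\s/.test(ch)) {
      i++; // skip whitespace
      continue;
    }
    if (/[A-Za-z_$]/.test(ch)) {
      // Identifier or keyword: consume letters, digits, _ and $
      let j = i + 1;
      while (j < source.length && /[A-Za-z0-9_$]/.test(source.charAt(j))) j++;
      const word = source.slice(i, j);
      tokens.push(
        MINI_KEYWORDS.has(word)
          ? { kind: "keyword", value: word }
          : { kind: "ident", value: word }
      );
      i = j;
      continue;
    }
    if (/[0-9]/.test(ch)) {
      // Integer literal only; no decimals or exponents in this sketch
      let j = i + 1;
      while (j < source.length && /[0-9]/.test(source.charAt(j))) j++;
      tokens.push({ kind: "num", value: Number(source.slice(i, j)) });
      i = j;
      continue;
    }
    if ("=+-*/;()".includes(ch)) {
      tokens.push({ kind: "punct", value: ch });
      i++;
      continue;
    }
    throw new SyntaxError(`Unexpected character "${ch}" at offset ${i}`);
  }
  return tokens;
}
```

Calling `tokenizeMini("const x = 1 + 2;")` yields seven tokens, matching the token stream listed in the comment: one keyword, one identifier, and alternating punctuators and numbers.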
Stage 2: Transform — The Visitor Pattern
How Visitors Work
The AST is a tree. Transformation = walking the tree and mutating nodes.
@babel/traverse walks the AST depth-first:
Program
├── enter Program
│ ├── enter VariableDeclaration
│ │ ├── enter VariableDeclarator
│ │ │ ├── enter Identifier (x)
│ │ │ ├── exit Identifier (x)
│ │ │ ├── enter BinaryExpression
│ │ │ │ ├── enter Identifier (a)
│ │ │ │ ├── exit Identifier (a)
│ │ │ │ ├── enter Identifier (b)
│ │ │ │ ├── exit Identifier (b)
│ │ │ ├── exit BinaryExpression
│ │ ├── exit VariableDeclarator
│ ├── exit VariableDeclaration
├── exit Program
A "visitor" is an object with methods named after node types.
When traverse encounters a node of that type, it calls your method.
import traverse from '@babel/traverse';
import * as t from '@babel/types'; // AST node builders + type checkers
// A visitor that replaces all `var` with `const`
// (illustrative only: this breaks code that reassigns or redeclares the variable)
traverse(ast, {
VariableDeclaration(path) {
if (path.node.kind === 'var') {
path.node.kind = 'const';
}
},
});
// A visitor with enter/exit hooks
traverse(ast, {
FunctionDeclaration: {
enter(path) {
console.log('Entering function:', path.node.id?.name);
},
exit(path) {
console.log('Exiting function:', path.node.id?.name);
},
},
});
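The enter/exit order in the traversal listing can be reproduced with a small hand-rolled walker. This is a sketch of depth-first traversal in general, not of @babel/traverse internals; `WalkNode`, `walkDepthFirst`, and the generic `children` array are assumptions made for the example (Babel nodes store children in named fields instead).

```typescript
type WalkNode = { type: string; label?: string; children: WalkNode[] };
type WalkHooks = {
  enter?: (node: WalkNode) => void;
  exit?: (node: WalkNode) => void;
};

// Depth-first: enter the node, walk all children, then exit the node.
function walkDepthFirst(node: WalkNode, hooks: WalkHooks): void {
  hooks.enter?.(node);
  for (const child of node.children) walkDepthFirst(child, hooks);
  hooks.exit?.(node);
}

const idLeaf = (label: string): WalkNode => ({
  type: "Identifier",
  label,
  children: [],
});

// Tree for: const x = a + b;
const walkTree: WalkNode = {
  type: "Program",
  children: [
    {
      type: "VariableDeclaration",
      children: [
        {
          type: "VariableDeclarator",
          children: [
            idLeaf("x"),
            { type: "BinaryExpression", children: [idLeaf("a"), idLeaf("b")] },
          ],
        },
      ],
    },
  ],
};

const describeNode = (n: WalkNode): string =>
  n.label ? `${n.type} (${n.label})` : n.type;

const walkLog: string[] = [];
walkDepthFirst(walkTree, {
  enter: (n) => { walkLog.push(`enter ${describeNode(n)}`); },
  exit: (n) => { walkLog.push(`exit ${describeNode(n)}`); },
});
```

After the walk, `walkLog` holds fourteen entries in the same order as the listing, starting with `enter Program` and ending with `exit Program`.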
The path Object — Your Handle on the AST
// path is NOT the AST node — it's a wrapper with rich APIs
interface NodePath<T = Node> {
node: T; // The actual AST node
parent: Node; // Parent AST node
parentPath: NodePath; // Parent path
scope: Scope; // Variable scope information
type: string; // Node type (e.g., "Identifier")
// --- Navigation ---
get(key: string): NodePath; // Get child path
getSibling(key: number): NodePath;
getStatementParent(): NodePath;
findParent(cb: (path: NodePath) => boolean): NodePath | null;
// --- Mutation ---
replaceWith(node: Node): void; // Replace this node with another
replaceWithMultiple(nodes: Node[]): void;
remove(): void; // Remove from tree
insertBefore(nodes: Node[]): void;
insertAfter(nodes: Node[]): void;
// --- Checks ---
isIdentifier(): boolean;
isMemberExpression(): boolean;
isReferencedIdentifier(): boolean; // Is this usage, not declaration?
// --- Scope (methods on path.scope, listed here for reference) ---
// path.scope.hasBinding(name: string): boolean
// path.scope.getBinding(name: string): Binding | undefined
// path.scope.generateUidIdentifier(name: string): Identifier (unique name)
}
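Why a wrapper is needed at all becomes clearer with a toy version. In the sketch below, `MiniPath` is a hypothetical stand-in for NodePath that remembers only the parent node and the key under which the child is stored; that is already enough to implement replacement, which a bare node cannot do because it has no reference to its parent.

```typescript
type TreeNode = { type: string; [key: string]: unknown };

// A stripped-down stand-in for NodePath: the node plus where it lives.
class MiniPath {
  node: TreeNode;
  readonly parent: TreeNode | null;
  readonly key: string | null;

  constructor(node: TreeNode, parent: TreeNode | null, key: string | null) {
    this.node = node;
    this.parent = parent;
    this.key = key;
  }

  // Child path for a property that holds a single node.
  get(key: string): MiniPath {
    const child = this.node[key];
    if (typeof child !== "object" || child === null) {
      throw new Error(`No child node at "${key}"`);
    }
    return new MiniPath(child as TreeNode, this.node, key);
  }

  // Replacement writes into the parent's slot, then updates the path itself.
  replaceWith(replacement: TreeNode): void {
    if (this.parent === null || this.key === null) {
      throw new Error("Cannot replace the root node");
    }
    this.parent[this.key] = replacement;
    this.node = replacement;
  }
}

// a + b  becomes  1 + b
const binary: TreeNode = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "a" },
  right: { type: "Identifier", name: "b" },
};
const rootPath = new MiniPath(binary, null, null);
rootPath.get("left").replaceWith({ type: "NumericLiteral", value: 1 });
```

The real NodePath tracks much more (list containers, scope, traversal state), but the parent-plus-key idea is the part that makes mutation methods possible.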
Writing a Custom Babel Plugin
Plugin Structure
// A Babel plugin is a function that returns an object with a `visitor` property
import { PluginObj, types as t } from '@babel/core';
export default function myPlugin(): PluginObj {
return {
name: 'my-custom-plugin',
visitor: {
// Your visitor methods here
},
};
}
// With options (an alternative signature; a module can have only one default export):
export default function myPlugin(
api: { types: typeof t },
options: { target?: string }
): PluginObj {
const { types: t } = api;
return {
name: 'my-custom-plugin',
visitor: { /* ... */ },
};
}
Example 1: Function Call Tracing Plugin
/**
* Plugin: Instrument every function call for tracing.
*
* BEFORE:
* function processUser(user) {
* validate(user);
* return save(user);
* }
*
* AFTER:
* function processUser(user) {
* console.trace("[TRACE] processUser called", { args: [user] });
* const _start = performance.now();
* try {
* validate(user);
* return save(user);
* } finally {
* console.trace("[TRACE] processUser returned", {
* duration: performance.now() - _start
* });
* }
* }
*/
import { PluginObj, types as t } from '@babel/core';
function functionTracingPlugin(): PluginObj {
return {
name: 'function-tracing',
visitor: {
// Handle: function foo() { ... }
FunctionDeclaration(path) {
const funcName = path.node.id?.name || '<anonymous>';
wrapFunctionBody(path, funcName, t);
},
// Handle: const foo = () => { ... }
ArrowFunctionExpression(path) {
// Get the variable name from parent
const parent = path.parentPath;
let funcName = '<arrow>';
if (parent.isVariableDeclarator() && t.isIdentifier(parent.node.id)) {
funcName = parent.node.id.name;
}
// Only wrap if body is a block statement
if (t.isBlockStatement(path.node.body)) {
wrapFunctionBody(path, funcName, t);
}
},
},
};
}
function wrapFunctionBody(
path: any,
funcName: string,
t: typeof import('@babel/types')
): void {
const body = path.get('body');
if (!body.isBlockStatement()) return;
const startVar = path.scope.generateUidIdentifier('start');
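// Log only plain identifier params. Destructured, default, and rest params
// are skipped rather than replaced by a placeholder name that would not
// exist at runtime.
const paramNames = path.node.params
.filter((p: any) => t.isIdentifier(p))
.map((p: any) => t.identifier(p.name)); // fresh node without type annotation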
// const _start = performance.now();
// (generateUidIdentifier('start') yields `_start`, or a numbered variant on collision)
const startTimer = t.variableDeclaration('const', [
t.variableDeclarator(
startVar,
t.callExpression(
t.memberExpression(t.identifier('performance'), t.identifier('now')),
[]
)
),
]);
// console.trace("[TRACE] funcName called", { args: [...params] })
const entryLog = t.expressionStatement(
t.callExpression(
t.memberExpression(t.identifier('console'), t.identifier('trace')),
[
t.stringLiteral(`[TRACE] ${funcName} called`),
t.objectExpression([
t.objectProperty(
t.identifier('args'),
t.arrayExpression(paramNames.map((p: any) => t.cloneNode(p)))
),
]),
]
)
);
// console.trace("[TRACE] funcName returned", { duration: ... })
const exitLog = t.expressionStatement(
t.callExpression(
t.memberExpression(t.identifier('console'), t.identifier('trace')),
[
t.stringLiteral(`[TRACE] ${funcName} returned`),
t.objectExpression([
t.objectProperty(
t.identifier('duration'),
t.binaryExpression(
'-',
t.callExpression(
t.memberExpression(
t.identifier('performance'),
t.identifier('now')
),
[]
),
t.cloneNode(startVar)
)
),
]),
]
)
);
// Wrap original body in try { ... } finally { exitLog }
const tryFinally = t.tryStatement(
t.blockStatement(body.node.body),
null,
t.blockStatement([exitLog])
);
body.node.body = [entryLog, startTimer, tryFinally];
}
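The reason the plugin wraps the body in try/finally, rather than appending a log statement at the end, is that `finally` runs on every exit path. The sketch below is a hand-written equivalent of the emitted shape, not plugin output: `parsePositive` is a hypothetical function, and an array stands in for console.trace so the behavior can be inspected.

```typescript
// Stand-in for console.trace so the sketch can record what was logged.
const traceEvents: string[] = [];

// Hand-written equivalent of the instrumented shape shown in AFTER.
function parsePositive(input: string): number {
  traceEvents.push("parsePositive called");
  try {
    const value = Number(input);
    if (!(value > 0)) throw new RangeError(`Not a positive number: ${input}`);
    return value;
  } finally {
    // Runs after `return` and after `throw` alike.
    traceEvents.push("parsePositive returned");
  }
}

const okResult = parsePositive("42");

let failure: unknown = null;
try {
  parsePositive("-1");
} catch (err) {
  failure = err;
}
```

Both calls produce a "called" and a "returned" entry, even though the second one throws. One consequence worth noting: the exit message says "returned" for thrown errors too, so a production version of the plugin might want a separate catch branch to tell the two apart.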
Example 2: Auto-Import Plugin (Codemod)
/**
* Plugin: Automatically adds missing React import when JSX is used.
*
* BEFORE:
* export function App() {
* return <div>Hello</div>;
* }
*
* AFTER:
* import React from 'react';
* export function App() {
* return <div>Hello</div>;
* }
*/
function autoImportReactPlugin(): PluginObj {
return {
name: 'auto-import-react',
visitor: {
Program: {
exit(path) {
let hasJSX = false;
let hasReactImport = false;
// Check if file contains JSX
path.traverse({
JSXElement() { hasJSX = true; },
JSXFragment() { hasJSX = true; },
});
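// Check if a `React` binding is already imported from 'react'.
// A named-only import such as `import { useState } from 'react'` does not count.
path.traverse({
ImportDeclaration(importPath) {
if (
importPath.node.source.value === 'react' &&
importPath.node.specifiers.some((s) => s.local.name === 'React')
) {
hasReactImport = true;
}
},
});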
if (hasJSX && !hasReactImport) {
const importDecl = t.importDeclaration(
[t.importDefaultSpecifier(t.identifier('React'))],
t.stringLiteral('react')
);
path.unshiftContainer('body', importDecl);
}
},
},
},
};
}
Example 3: Dead Code Elimination
/**
* Plugin: Remove if (false) { ... } and if (process.env.NODE_ENV === 'production')
* branches in development builds.
*/
function deadCodeEliminationPlugin(): PluginObj {
return {
name: 'dead-code-elimination',
visitor: {
IfStatement(path) {
const test = path.get('test');
// if (false) { ... } → remove entire statement
if (test.isBooleanLiteral({ value: false })) {
if (path.node.alternate) {
// Has else branch — replace if with else body
path.replaceWith(path.node.alternate);
} else {
path.remove();
}
return;
}
// if (true) { ... } → replace with consequent body
if (test.isBooleanLiteral({ value: true })) {
path.replaceWith(path.node.consequent);
return;
}
// if (process.env.NODE_ENV === 'production') in dev → evaluate
if (
test.isBinaryExpression({ operator: '===' }) &&
isMemberExpression(test.get('left'), 'process.env.NODE_ENV')
) {
const right = test.get('right');
if (right.isStringLiteral()) {
const envValue = process.env.NODE_ENV || 'development';
const matches = right.node.value === envValue;
if (matches) {
path.replaceWith(path.node.consequent);
} else if (path.node.alternate) {
path.replaceWith(path.node.alternate);
} else {
path.remove();
}
}
}
},
},
};
}
function isMemberExpression(path: any, chain: string): boolean {
const parts = chain.split('.');
let current = path;
for (let i = parts.length - 1; i > 0; i--) {
if (!current.isMemberExpression()) return false;
if (!current.get('property').isIdentifier({ name: parts[i] })) return false;
current = current.get('object');
}
return current.isIdentifier({ name: parts[0] });
}
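The right-to-left matching in the helper above can be tried out on plain objects, without Babel paths. This is a sketch under assumptions: `ChainNode` and `matchesChain` are names invented here, and the sketch adds a `computed` check that the helper above does not perform, so that `process.env[NODE_ENV]` (a variable lookup) is not mistaken for `process.env.NODE_ENV`.

```typescript
type ChainNode =
  | { type: "Identifier"; name: string }
  | {
      type: "MemberExpression";
      object: ChainNode;
      property: ChainNode;
      computed: boolean;
    };

// Walk the chain from the last segment back to the first.
function matchesChain(node: ChainNode, chain: string): boolean {
  const parts = chain.split(".");
  let current: ChainNode = node;
  for (let i = parts.length - 1; i > 0; i--) {
    if (current.type !== "MemberExpression" || current.computed) return false;
    const prop: ChainNode = current.property;
    if (prop.type !== "Identifier" || prop.name !== parts[i]) return false;
    current = current.object;
  }
  return current.type === "Identifier" && current.name === parts[0];
}

// Small builders to keep the examples readable.
const chainIdent = (name: string): ChainNode => ({ type: "Identifier", name });
const chainMember = (
  object: ChainNode,
  prop: string,
  computed = false
): ChainNode => ({
  type: "MemberExpression",
  object,
  property: chainIdent(prop),
  computed,
});

// process.env.NODE_ENV
const envNode = chainMember(chainMember(chainIdent("process"), "env"), "NODE_ENV");
// process.env[NODE_ENV]  (computed access through a variable)
const computedNode = chainMember(
  chainMember(chainIdent("process"), "env"),
  "NODE_ENV",
  true
);
```

With these nodes, `matchesChain(envNode, "process.env.NODE_ENV")` is true, while the computed variant and a shorter chain such as a bare `process` identifier are rejected.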
Stage 3: Code Generation
@babel/generator
import generate from '@babel/generator';
const output = generate(
modifiedAst,
{
retainLines: false, // Try to keep original line numbers
compact: false, // Minification mode
concise: false, // Reduce whitespace
sourceMaps: true, // Generate source map
sourceFileName: 'input.ts',
decoratorsBeforeExport: true,
},
originalSourceCode // Original source (for source map)
);
console.log(output.code); // Generated JavaScript string
console.log(output.map); // Source map object
How Generation Works
Generator walks the AST and emits characters:
VariableDeclaration (kind: "const")
→ emit "const "
VariableDeclarator
→ visit id → emit "x"
→ emit " = "
→ visit init → ...
BinaryExpression (operator: "+")
→ visit left → emit "a"
→ emit " + "
→ visit right → emit "b"
→ emit ";"
Result: "const x = a + b;"
Each AST node type has a "printer" function that knows
how to emit the correct syntax for that node.
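A printer of this kind can be sketched as one function with a case per node type. The example below is a toy, with assumed names (`GenNode`, `emitCode`) and a flattened declaration node; unlike @babel/generator it does not insert parentheses for operator precedence, handle comments, or track positions for source maps.

```typescript
type GenNode =
  | { type: "Identifier"; name: string }
  | { type: "NumericLiteral"; value: number }
  | { type: "BinaryExpression"; operator: string; left: GenNode; right: GenNode }
  | {
      type: "VariableDeclaration";
      kind: "var" | "let" | "const";
      id: GenNode;
      init: GenNode;
    };

// One "printer" branch per node type; children are printed recursively.
function emitCode(node: GenNode): string {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "NumericLiteral":
      return String(node.value);
    case "BinaryExpression":
      return `${emitCode(node.left)} ${node.operator} ${emitCode(node.right)}`;
    case "VariableDeclaration":
      return `${node.kind} ${emitCode(node.id)} = ${emitCode(node.init)};`;
  }
}

const generated = emitCode({
  type: "VariableDeclaration",
  kind: "const",
  id: { type: "Identifier", name: "x" },
  init: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Identifier", name: "a" },
    right: { type: "Identifier", name: "b" },
  },
});
```

Here `generated` is the string `const x = a + b;`, the same result as in the walkthrough above.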
Source maps are generated by tracking:
"For output column 6-7, this came from input line 1, column 6-7"
How @babel/preset-env Works
The Browserslist → Transforms Pipeline
Your babel config:
{
"presets": [
["@babel/preset-env", {
"targets": "> 0.25%, not dead",
"useBuiltIns": "usage",
"corejs": 3
}]
]
}
Step 1: Resolve browserslist targets
"> 0.25%, not dead" →
Chrome 80+, Firefox 78+, Safari 13+, Edge 80+, iOS 13+
Step 2: Check compat-table for each ES feature
┌──────────────────────┬──────┬───────┬────────┐
│ Feature │ Chr80│ FF78 │ Saf13 │
├──────────────────────┼──────┼───────┼────────┤
│ Arrow functions │ ✓ │ ✓ │ ✓ │
│ const/let │ ✓ │ ✓ │ ✓ │
│ Optional chaining │ ✓ │ ✓ │ ✗ (13) │
│ Nullish coalescing │ ✓ │ ✓ │ ✗ (13) │
│ Class fields │ ✓ │ FF79+ │ Saf15 │
│ Top-level await │ ✗ │ ✗ │ ✗ │
└──────────────────────┴──────┴───────┴────────┘
Step 3: Enable ONLY the transforms needed
In this case:
✗ Arrow functions (all targets support) → skipped
✗ const/let (all targets support) → skipped
✓ Optional chaining → transform (Safari 13) → enabled
✓ Nullish coalescing → transform (Safari 13) → enabled
✓ Class fields → transform (Firefox, Safari) → enabled
✗ Top-level await → Babel can parse it but cannot transform it → left for the bundler or runtime
Step 4: For "useBuiltIns: usage", scan code for used APIs
If your code uses: Array.from(), Promise.allSettled()
→ inject: import "core-js/modules/es.array.from"
→ inject: import "core-js/modules/es.promise.all-settled"
But NOT: import "core-js" (the entire library)
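Steps 2 and 3 boil down to a comparison between the targets and the first browser version that supports each feature. The sketch below illustrates that comparison only. The version numbers are approximate, hand-copied values for illustration; @babel/preset-env reads its real data from the @babel/compat-data package, and `firstSupported` and `transformsNeeded` are names invented for this example.

```typescript
type BrowserName = "chrome" | "firefox" | "safari";
type Targets = Record<BrowserName, number>;

// First version of each browser with native support.
// ASSUMPTION: approximate values for illustration, not authoritative data.
const firstSupported: Record<string, Targets> = {
  "arrow-functions": { chrome: 45, firefox: 22, safari: 10 },
  "optional-chaining": { chrome: 80, firefox: 74, safari: 13.1 },
  "nullish-coalescing": { chrome: 80, firefox: 72, safari: 13.1 },
};

// A feature needs a transform if ANY target is older than its first
// supporting version. Versions are compared as plain numbers here, which
// is a simplification; real tooling compares semver-style versions.
function transformsNeeded(targets: Targets): string[] {
  const browsers = Object.keys(targets) as BrowserName[];
  return Object.entries(firstSupported)
    .filter(([, minVersions]) =>
      browsers.some((browser) => targets[browser] < minVersions[browser])
    )
    .map(([feature]) => feature);
}

const olderTargets = transformsNeeded({ chrome: 80, firefox: 78, safari: 13 });
const newerTargets = transformsNeeded({ chrome: 100, firefox: 100, safari: 15 });
```

With Safari 13 in the targets, `olderTargets` contains the optional chaining and nullish coalescing transforms but not arrow functions; with the newer targets the list is empty, so nothing is transformed.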
What Transforms Actually Do
// BEFORE: Optional chaining
const name = user?.profile?.name;
// AFTER: Babel transform-optional-chaining
var _user$profile;
const name =
user === null || user === void 0
? void 0
: (_user$profile = user.profile) === null || _user$profile === void 0
? void 0
: _user$profile.name;
// BEFORE: Nullish coalescing
const value = input ?? 'default';
// AFTER: Babel transform-nullish-coalescing
const value = input !== null && input !== void 0 ? input : 'default';
// BEFORE: Class field
class Counter {
count = 0;
#secret = 42;
}
// AFTER: Babel transform-class-properties (simplified; real output goes through helper functions)
class Counter {
constructor() {
Object.defineProperty(this, "count", {
configurable: true,
enumerable: true,
writable: true,
value: 0,
});
// Private fields use a WeakMap for encapsulation
_secret.set(this, 42);
}
}
var _secret = new WeakMap();
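The explicit `!== null && !== void 0` checks in the nullish coalescing output are there because a shorter rewrite with `||` would change behavior. A small sketch of the difference, using two hypothetical helper functions written by hand to mirror each form:

```typescript
// Mirrors the lowered form of `input ?? fallback` shown above.
function nullishLowered<T>(input: T | null | undefined, fallback: T): T {
  return input !== null && input !== void 0 ? input : fallback;
}

// The tempting shortcut: `||` also replaces 0, "" and false.
function orShortcut<T>(input: T | null | undefined, fallback: T): T {
  return input || fallback;
}

const keptZero = nullishLowered(0, 5); // 0 is a real value, so it is kept
const lostZero = orShortcut(0, 5); // 0 is falsy, so the fallback wins
const fromNull = nullishLowered<number>(null, 5);
const fromUndefined = nullishLowered<number>(undefined, 5);
```

Only `null` and `undefined` trigger the fallback in the lowered form, which is exactly the semantics of `??`; the `||` version silently discards legitimate falsy values.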
SWC and esbuild: Why 100x Faster?
The Speed Gap
Benchmark: Transform 10,000 TypeScript files (real-world project)
- Remove type annotations
- Transform JSX
- Downlevel to ES2019
Babel: 45 seconds (single-threaded JS, plugin system overhead)
SWC: 0.4 seconds (multi-threaded Rust, no plugin bridge)
esbuild: 0.3 seconds (multi-threaded Go, no plugin bridge)
Why the 100x difference?
Architectural Reasons for the Speed Difference
BABEL:
┌──────────────────────────────────────────────┐
│ 1. JavaScript execution overhead │
│ - V8 JIT compiles Babel itself on startup │
│ - GC pauses from millions of AST nodes │
│ - Object property access is slower than │
│ struct field access (hidden classes help, │
│ but not enough) │
│ │
│ 2. Plugin system overhead │
│ - Each plugin creates a visitor │
│ - Traverse merges visitors and walks once │
│ - But: visitor dispatch = dynamic dispatch │
│ (function calls based on node type string)│
│ - Plugin communication goes through shared │
│ AST (no direct data flow) │
│ │
│ 3. Single-threaded │
│ - JS is single-threaded by nature │
│ - Worker threads add serialization overhead │
│ - AST can't be shared between workers │
│ │
│ 4. Separate parse → traverse → generate │
│ - Three full tree walks minimum │
│ - Memory: full AST in memory at once │
└──────────────────────────────────────────────┘
SWC / ESBUILD:
┌──────────────────────────────────────────────┐
│ 1. Native language (Rust / Go) │
│ - No VM warm-up or JIT compilation │
│ - Rust has no GC; Go's GC is concurrent │
│ with short pauses │
│ - Struct field access = single pointer add │
│ - Compact, predictable memory layout │
│ (cache-friendly) │
│ │
│ 2. No plugin system (mostly) │
│ - All transforms compiled into the binary │
│ - Static dispatch: compiler knows which │
│ function to call at compile time │
│ - No visitor merging overhead │
│ │
│ 3. Multi-threaded │
│ - Rust: rayon for parallel file processing │
│ - Go: goroutines for parallel file I/O │
│ - Each file is independent → trivially │
│ parallelizable │
│ │
│ 4. Few passes over the tree │
│ - esbuild fuses parsing and transform work │
│ into a small number of passes │
│ - No separate AST → traversal → codegen │
│ - Memory: minimal intermediate state │
└──────────────────────────────────────────────┘
When You Still Need Babel
SWC/esbuild CANNOT do:
✗ Custom Babel plugins (macros, codegen, instrumentation)
✗ babel-plugin-styled-components (display names, stable component IDs for SSR)
✗ babel-plugin-relay (GraphQL fragment compilation)
✗ babel-plugin-macros (compile-time evaluation)
✗ Fine-grained polyfill injection (useBuiltIns: "usage"); esbuild injects no polyfills, SWC offers a comparable env.mode option
SWC partial support:
✓ @swc/plugin-styled-components (Rust rewrite, not all features)
✓ @swc/plugin-emotion (Rust rewrite)
✓ Custom SWC plugins via WASM (limited, slower than native Rust)
The pragmatic approach:
Use SWC/esbuild for the HOT PATH (type stripping, JSX, downleveling)
Use Babel ONLY for custom plugins that have no SWC equivalent
This is exactly what @vitejs/plugin-react does:
esbuild for all transforms EXCEPT React Refresh injection
Babel only for the react-refresh/babel plugin
Writing a Codemod with Babel
What Is a Codemod?
A codemod is a script (often a Babel plugin or a jscodeshift transform) that rewrites existing code
from pattern A to pattern B across an entire codebase.
Common use cases:
- API migration: React.createClass → class extends Component
- Import rewrite: 'lodash' → 'lodash-es'
- Deprecation removal: remove usage of deprecated APIs
- Framework upgrade: Next.js 12 → 13 (pages → app router)
Tool: jscodeshift (Facebook's codemod runner, built on recast; parses with @babel/parser by default)
/**
* Codemod: Migrate from React.PropTypes to prop-types package
*
* BEFORE:
* import React from 'react';
* MyComponent.propTypes = {
* name: React.PropTypes.string.isRequired,
* };
*
* AFTER:
* import React from 'react';
* import PropTypes from 'prop-types';
* MyComponent.propTypes = {
* name: PropTypes.string.isRequired,
* };
*/
import { API, FileInfo } from 'jscodeshift';
export default function transformer(file: FileInfo, api: API) {
const j = api.jscodeshift;
const root = j(file.source);
// Find all React.PropTypes references
const reactPropTypes = root.find(j.MemberExpression, {
object: { type: 'Identifier', name: 'React' },
property: { type: 'Identifier', name: 'PropTypes' },
});
if (reactPropTypes.length === 0) return file.source; // No changes needed
// Replace React.PropTypes with PropTypes
reactPropTypes.forEach(path => {
j(path).replaceWith(j.identifier('PropTypes'));
});
// Add import for prop-types package (if not already present)
const hasImport = root.find(j.ImportDeclaration, {
source: { value: 'prop-types' },
}).length > 0;
if (!hasImport) {
const propTypesImport = j.importDeclaration(
[j.importDefaultSpecifier(j.identifier('PropTypes'))],
j.literal('prop-types')
);
// Insert after the React import
const reactImport = root.find(j.ImportDeclaration, {
source: { value: 'react' },
});
if (reactImport.length > 0) {
reactImport.at(0).insertAfter(propTypesImport);
} else {
root.get().node.program.body.unshift(propTypesImport);
}
}
return root.toSource({ quote: 'single' });
}
The Complete Transform Chain in a Real Build
Your JSX + TypeScript file goes through:
1. @babel/parser
Parses: TypeScript + JSX + latest ES features
Output: Full AST with type annotations
2. @babel/plugin-transform-typescript
Strips ALL type annotations (no type checking!)
TSTypeAnnotation, TSInterfaceDeclaration → removed
3. @babel/plugin-transform-react-jsx
JSX → React.createElement() or jsx() calls
<div className="x"> → jsx("div", { className: "x" })
4. @babel/preset-env (with browserslist)
Only transforms needed for target browsers
optional chaining, class fields, etc. → equivalents the targets support (not necessarily ES5)
5. Custom plugins (in order of declaration)
styled-components, relay, custom instrumentation, etc.
Note: Babel runs "plugins" entries before presets, and presets in reverse order
6. @babel/generator
Modified AST → JavaScript string + source map
Total: 1 parse + 1 merged traverse pass (by default) + 1 generate
Performance insight:
@babel/traverse MERGES all visitors from all plugins
into a SINGLE traversal pass. So 10 plugins ≠ 10 tree walks.
But: each node is checked against all 10 plugins' visitors,
which still adds overhead per plugin.
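Visitor merging can be sketched as grouping handlers by node type before a single walk. This illustrates the idea only and is not how @babel/traverse is implemented internally; `MergeNode`, `mergeVisitors`, and `traverseOnce` are assumed names, and the two "plugins" are made up for the example.

```typescript
type MergeNode = { type: string; callee?: string; children: MergeNode[] };
type NodeHandler = (node: MergeNode) => void;
type SimpleVisitor = Record<string, NodeHandler>;

// Group handlers by node type, preserving plugin order within each group.
function mergeVisitors(visitors: SimpleVisitor[]): Map<string, NodeHandler[]> {
  const merged = new Map<string, NodeHandler[]>();
  for (const visitor of visitors) {
    for (const [nodeType, handler] of Object.entries(visitor)) {
      const handlers = merged.get(nodeType) ?? [];
      handlers.push(handler);
      merged.set(nodeType, handlers);
    }
  }
  return merged;
}

// One walk over the tree; returns how many nodes were visited.
function traverseOnce(root: MergeNode, merged: Map<string, NodeHandler[]>): number {
  let visited = 1;
  for (const handler of merged.get(root.type) ?? []) handler(root);
  for (const child of root.children) visited += traverseOnce(child, merged);
  return visited;
}

// Plugin A mutates call nodes; plugin B records what it sees afterwards.
const seenCallees: string[] = [];
const pluginA: SimpleVisitor = {
  CallExpression(node) {
    node.callee = `traced_${node.callee}`;
  },
};
const pluginB: SimpleVisitor = {
  CallExpression(node) {
    seenCallees.push(node.callee ?? "?");
  },
};

const programNode: MergeNode = {
  type: "Program",
  children: [
    { type: "CallExpression", callee: "foo", children: [] },
    { type: "CallExpression", callee: "bar", children: [] },
  ],
};
const nodesVisited = traverseOnce(programNode, mergeVisitors([pluginA, pluginB]));
```

The tree has three nodes and each is visited once, regardless of the number of plugins. Plugin B sees `traced_foo` and `traced_bar`, not the original names, which is the ordering effect that makes plugin order significant.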
Interview Q&A
Q: What is the visitor pattern and why does Babel use it?
A: The visitor pattern separates the algorithm (what to do) from the data structure (the AST). Instead of embedding transform logic in each AST node class (which would require modifying the parser), Babel lets plugins declare "I'm interested in CallExpression nodes" as a visitor method. @babel/traverse walks the tree once and dispatches to the appropriate visitors. This enables an open plugin ecosystem: anyone can add new transforms without modifying the parser or other plugins.
Q: Why does Babel NOT do type checking when it processes TypeScript?
A: Because type checking requires analyzing the entire program's type relationships — following imports, resolving generic type parameters, checking structural compatibility. This is fundamentally a whole-program analysis. Babel operates file-by-file for performance and can't resolve cross-file type dependencies. Babel's TypeScript plugin simply strips type annotations (a syntactic operation), leaving type checking to tsc or an IDE language server.
Q: How do SWC and esbuild achieve 100x speed over Babel?
A: Three factors compound: (1) Native language: Rust and Go compile ahead of time to machine code and use compact struct layouts that are friendly to the CPU cache; Rust has no garbage collector, and Go's collector is concurrent with short pauses. (2) Multi-threading: each file transforms independently, so Rust's rayon / Go's goroutines parallelize trivially across all CPU cores. (3) No plugin overhead: all transforms are compiled into the binary with static dispatch, eliminating Babel's dynamic visitor merging and string-based node type matching. The trade-off is extensibility: Babel's plugin system is a large part of why it's slow, but also why it remains the most flexible option for custom transforms.
Q: When would you write a codemod instead of doing find-and-replace?
A: When the transformation is semantic, not textual. Find-and-replace can't distinguish between import React from 'react' (an import declaration) and const React = "react" (a string assignment) — both contain the text "React" and "react". A codemod operates on the AST, so it can specifically target ImportDeclaration nodes where the source is 'react', ignoring string literals, comments, and variable names that happen to match. This makes codemods safe at scale — you can run them across 10,000 files with confidence.
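The difference between a textual and a structural rename can be shown on a toy AST. In this sketch the node types, `renameIdentifier`, and `printRename` are all invented for the example; the source being rewritten is `log("log")`, where only the function name should change.

```typescript
type RenameNode =
  | { type: "Identifier"; name: string }
  | { type: "StringLiteral"; value: string }
  | { type: "CallExpression"; callee: RenameNode; args: RenameNode[] };

// Structural rename: only Identifier nodes are touched.
function renameIdentifier(node: RenameNode, from: string, to: string): RenameNode {
  switch (node.type) {
    case "Identifier":
      return node.name === from ? { type: "Identifier", name: to } : node;
    case "StringLiteral":
      return node; // string contents are data, not references
    case "CallExpression":
      return {
        type: "CallExpression",
        callee: renameIdentifier(node.callee, from, to),
        args: node.args.map((arg) => renameIdentifier(arg, from, to)),
      };
  }
}

function printRename(node: RenameNode): string {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "StringLiteral":
      return JSON.stringify(node.value);
    case "CallExpression":
      return `${printRename(node.callee)}(${node.args.map((a) => printRename(a)).join(", ")})`;
  }
}

// AST for: log("log")
const callAst: RenameNode = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "log" },
  args: [{ type: "StringLiteral", value: "log" }],
};

const viaText = 'log("log")'.replace(/log/g, "trace");
const viaAst = printRename(renameIdentifier(callAst, "log", "trace"));
```

The textual replacement rewrites the string argument as well and produces `trace("trace")`, while the AST-based rename produces `trace("log")`.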
Q: What happens when you chain multiple Babel plugins?
A: Plugins run in declaration order, but NOT as separate tree walks. @babel/traverse merges all plugin visitors into a single traversal. When the traverser reaches a CallExpression node, it calls every plugin's CallExpression visitor in order. However, if Plugin A mutates a node that Plugin B also targets, ordering matters — Plugin B sees the mutated version. This is why plugin order in your Babel config can cause subtle bugs: the same input can produce different output depending on plugin ordering.