Frontend Architecture, Part 8 of 11
The Strangler Fig Pattern: Migrating Legacy Systems Without a Rewrite
Introduction
Every engineering leader eventually faces the legacy system conversation. The old system is slow, painful to modify, running on outdated technology, and everyone agrees it needs to go. The tempting answer is a rewrite—start fresh, do it right this time, use modern tools.
Rewrites fail. Not always, but often enough that they've earned a reputation as career-ending projects. They take longer than estimated, the old system keeps evolving while you build the new one, features get lost in translation, and the business loses patience before you finish. Many rewrites are abandoned halfway, leaving you with two systems instead of one.
The strangler fig pattern offers a different approach: incremental migration. Instead of replacing the old system all at once, you gradually grow a new system around it, piece by piece, until the old system can be safely removed. The name, coined by Martin Fowler, comes from the strangler fig tree that slowly envelops its host. Because every step ships to production, the migration stays reversible and delivers value long before the old system is gone.
This guide covers how to apply the strangler fig pattern effectively—the strategies, the gotchas, and the practical techniques that make incremental migration work.
Why Rewrites Fail
The Rewrite Trap
THE REWRITE TIMELINE:
════════════════════════════════════════════════════════════════════
Year 1: "This system is a mess. Let's rewrite it properly."
┌─────────────────────────────────────────────────────────────────┐
│ │
│ OLD SYSTEM NEW SYSTEM │
│ ────────── ────────── │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ 10 years of │ │ Clean slate! │ │
│ │ accumulated │ │ Modern stack! │ │
│ │ features │ ────► │ Best practices! │ │
│ │ edge cases │ │ This time we'll │ │
│ │ bug fixes │ │ do it right! │ │
│ │ business logic │ │ │ │
│ └─────────────────┘ └─────────────────┘ │
│ │
│ "Should take about 18 months." │
│ │
└─────────────────────────────────────────────────────────────────┘
Year 2: Reality sets in.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ OLD SYSTEM NEW SYSTEM │
│ ────────── ────────── │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Still running │ │ 60% complete │ │
│ │ Still evolving │ │ Missing features│ │
│ │ New features │ ────► │ Team exhausted │ │
│ │ added by │ │ Scope creeping │ │
│ │ business demand │ │ Original devs │ │
│ │ │ │ leaving project │ │
│ └─────────────────┘ └─────────────────┘ │
│ │
│ "We need another year. Maybe 18 months." │
│ │
└─────────────────────────────────────────────────────────────────┘
Year 3+: The death spiral.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ TWO SYSTEMS, NEITHER COMPLETE │
│ ──────────────────────────────── │
│ │
│ • Old system: Still in production, now even more neglected │
│ • New system: Perpetually almost done │
│ • Team: Demoralized, key people have left │
│ • Business: Lost patience, trust eroded │
│ • Budget: Exhausted │
│ │
│ Outcomes: │
│ a) Abandon rewrite, keep old system (most common) │
│ b) Force-launch incomplete new system (disasters happen) │
│ c) Keep going forever (zombie project) │
│ │
└─────────────────────────────────────────────────────────────────┘
Why Rewrites Are Harder Than They Look
THE HIDDEN COMPLEXITY ICEBERG:
════════════════════════════════════════════════════════════════════
What you see (the spec):
─────────────────────────
┌─────────────────────┐
│ Core Features │
│ - User auth │
│ - Basic CRUD │
│ - Reports │
└─────────────────────┘
What actually exists (10 years of production):
──────────────────────────────────────────────
┌─────────────────────┐
│ Core Features │
~~~~│~~~~~~~~~~~~~~~~~~~~~│~~~~ ← Water line
┌────┴─────────────────────┴────┐
│ Edge cases for every │
│ imaginable input │
├──────────────────────────────┤
│ Workarounds for upstream │
│ system quirks │
├──────────────────────────────┤
│ Business rules no one │
│ remembers documenting │
├──────────────────────────────┤
│ That weird thing for │
│ the client in Germany │
├──────────────────────────────┤
│ Tax calculation edge │
│ cases discovered in audits │
├──────────────────────────────┤
│ Performance optimizations │
│ added after incidents │
├──────────────────────────────┤
│ Integration with 15 other │
│ systems, each with quirks │
├──────────────────────────────┤
│ Features no one uses but │
│ that one VP depends on │
└──────────────────────────────┘
Every "bug" in the old system might be:
• An actual bug
• A feature someone depends on
• A workaround for another system's bug
• A compliance requirement from 2015
• The only thing preventing data corruption
You won't know until you break it in production.
The Moving Target Problem
THE REWRITE RACE:
════════════════════════════════════════════════════════════════════
Month 0: Start rewrite
─────────────────────
Old System Features: █████████████████████░░░░░░░░░░ (existing)
New System Features: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ (none yet)
Month 12: Progress!
───────────────────
Old System Features: █████████████████████████░░░░░░ (business added more)
New System Features: ██████████████░░░░░░░░░░░░░░░░ (catching up)
The gap has narrowed, but it hasn't closed. The old system didn't freeze.
Month 24: Still going...
────────────────────────
Old System Features: ████████████████████████████░░░ (still growing)
New System Features: ████████████████████░░░░░░░░░░ (still behind)
You're running to stand still.
THE UNCOMFORTABLE TRUTH:
────────────────────────
The old system handles 100% of production traffic.
It's the source of truth for how the business actually works.
Every day it runs, it accumulates more institutional knowledge.
Every day you spend on the rewrite, the target it has to reach moves further away.
Unless you can freeze the old system (you usually can't),
you're chasing a moving target.
The Strangler Fig Pattern
The Metaphor
THE STRANGLER FIG TREE:
════════════════════════════════════════════════════════════════════
In nature, strangler fig seeds land on host trees and begin to grow.
Stage 1: SEED
─────────────
The fig sends roots down alongside the host tree.
The host tree is unaffected.
🌱 ← Tiny fig seedling
│
┌─────┴─────┐
│ HOST │
│ TREE │
│ │
└───────────┘
Stage 2: GROWTH
───────────────
The fig grows, sending more roots down.
It starts to handle some of its own needs.
🌿🌿🌿🌿 ← Growing fig
╲│╱│
┌────┴┴────┐
│ HOST │
│ TREE │
│ │
└───────────┘
Stage 3: ENCLOSURE
──────────────────
The fig surrounds the host tree.
Both still function.
🌳🌳🌳🌳🌳🌳
╲╲ │╱╱
┌───╲─┴─╱───┐
│╲ HOST ╱│
│ ╲ TREE ╱ │
│ ╲ ╱ │
└───╲───╱───┘
Stage 4: REPLACEMENT
────────────────────
The host tree eventually dies and decomposes.
The fig stands on its own.
🌳🌳🌳🌳🌳🌳
╲╲ │╱╱
┌───╲───╱───┐
│ ╲ ╱ │
│ ╲ ╱ │ ← Hollow where host was
│ V │
└───────────┘
THE SOFTWARE PARALLEL:
──────────────────────
• Seed: New system starts small, beside the old
• Growth: New system handles more and more requests
• Enclosure: New system surrounds the old, handling most traffic
• Replacement: Old system is removed, new system stands alone
At no point does everything stop working.
The Pattern in Software
STRANGLER FIG IN SOFTWARE SYSTEMS:
════════════════════════════════════════════════════════════════════
ARCHITECTURE OVERVIEW:
──────────────────────
┌─────────────────────────────────────────────────────────────────┐
│ CLIENTS │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ FACADE / PROXY │ │
│ │ (Routes requests to old or new system) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────┐ ┌─────────────────────────┐ │
│ │ OLD SYSTEM │ │ NEW SYSTEM │ │
│ │ │ │ │ │
│ │ ███ Feature A │ │ ░░░ Feature A (TODO) │ │
│ │ ███ Feature B │ │ ███ Feature B (DONE!) │ │
│ │ ███ Feature C │ │ ███ Feature C (DONE!) │ │
│ │ ███ Feature D │ │ ░░░ Feature D (TODO) │ │
│ │ ███ Feature E │ │ ░░░ Feature E (TODO) │ │
│ │ │ │ │ │
│ └─────────────────────────┘ └─────────────────────────┘ │
│ │
│ ███ = Active (handling traffic) │
│ ░░░ = Not yet implemented │
│ │
└─────────────────────────────────────────────────────────────────┘
THE KEY INSIGHT:
────────────────
The facade/proxy routes each request to the appropriate system.
• Feature B request → New system
• Feature A request → Old system
You migrate one feature at a time.
Each migration is small, testable, reversible.
PROGRESSION:
────────────
Phase 1: New system handles 0%
┌──────────────────────────────────────────────────────────────┐
│ Old: █████████████████████████████████████████████████ 100% │
│ New: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% │
└──────────────────────────────────────────────────────────────┘
Phase 2: New system handles 20%
┌──────────────────────────────────────────────────────────────┐
│ Old: ████████████████████████████████████████░░░░░░░░░ 80% │
│ New: ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 20% │
└──────────────────────────────────────────────────────────────┘
Phase 3: New system handles 60%
┌──────────────────────────────────────────────────────────────┐
│ Old: ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 40% │
│ New: ██████████████████████████████░░░░░░░░░░░░░░░░░░░ 60% │
└──────────────────────────────────────────────────────────────┘
Phase 4: New system handles 100%
┌──────────────────────────────────────────────────────────────┐
│ Old: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% │
│ New: █████████████████████████████████████████████████ 100% │
└──────────────────────────────────────────────────────────────┘
Phase 5: Old system decommissioned
Core Principles
STRANGLER FIG PRINCIPLES:
════════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ │
│ 1. INCREMENTAL MIGRATION │
│ ───────────────────────────────────────────────────────────── │
│ Move one piece at a time. │
│ Each piece is a small, manageable project. │
│ Complete and deploy each piece before starting the next. │
│ │
│ Bad: Migrate everything in one big bang. │
│ Good: Migrate user authentication. Then billing. Then orders. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 2. ALWAYS WORKING SOFTWARE │
│ ───────────────────────────────────────────────────────────── │
│ The system works at every step. │
│ There's no "we'll be down for the migration." │
│ Users don't notice (except things get better). │
│ │
│ Bad: "We're migrating this weekend, expect 48 hours downtime"│
│ Good: "Feature X is now on the new system" (users don't care) │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 3. REVERSIBILITY │
│ ───────────────────────────────────────────────────────────── │
│ Every migration can be rolled back. │
│ If the new implementation has issues, route back to old. │
│ Lower risk = more confidence = faster progress. │
│ │
│ Bad: "We've burned the boats, no going back." │
│ Good: "If this fails, flip the switch and traffic goes to old"│
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 4. PARALLEL RUNNING │
│ ───────────────────────────────────────────────────────────── │
│ Old and new systems run simultaneously. │
│ Can compare outputs, verify correctness. │
│ New system proves itself before taking over. │
│ │
│ Bad: Trust the new system because the tests pass. │
│ Good: Run both, compare results, gain confidence. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 5. VALUE THROUGHOUT │
│ ───────────────────────────────────────────────────────────── │
│ Each step delivers value. │
│ You can stop at any point with working software. │
│ No "we need 18 more months before anything works." │
│ │
│ Bad: "We'll have benefits after the migration is complete." │
│ Good: "This week's migration improved latency by 40%." │
│ │
└─────────────────────────────────────────────────────────────────┘
Implementation Strategies
The Facade Pattern
THE STRANGLER FACADE:
════════════════════════════════════════════════════════════════════
The facade sits in front of everything, routing requests.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Implementation Options: │
│ │
│ 1. REVERSE PROXY (Nginx, HAProxy, API Gateway) │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Client Request │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Nginx / Kong │ │
│ │ │ │
│ │ /api/users/* → new-system:8080 │ │
│ │ /api/orders/* → old-system:3000 │ │
│ │ /api/products/* → old-system:3000 │ │
│ │ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Pros: No code changes, infrastructure level │
│ Cons: Routing limited to request attributes (URL, headers) │
│ │
│ 2. APPLICATION-LEVEL ROUTER │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Client Request │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Facade Application │ │
│ │ │ │
│ │ if (featureFlag.newUsers) { │ │
│ │ proxy(newSystem, request); │ │
│ │ } else { │ │
│ │ proxy(oldSystem, request); │ │
│ │ } │ │
│ │ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Pros: Complex routing logic, feature flags │
│ Cons: Another service to maintain │
│ │
│ 3. LIBRARY/MODULE WITHIN OLD SYSTEM │
│ ───────────────────────────────────────────────────────────── │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ Old System │ │
│ │ ┌───────────────────────────────────┐ │ │
│ │ │ Strangler Module │ │ │
│ │ │ │ │ │
│ │ │ intercept(/users/*) { │ │ │
│ │ │ return callNewSystem(req); │ │ │
│ │ │ } │ │ │
│ │ └───────────────────────────────────┘ │ │
│ │ │ │
│ │ (Old code still here, not called) │ │
│ │ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Pros: No infrastructure changes │
│ Cons: Requires modifying old system │
│ │
└─────────────────────────────────────────────────────────────────┘
Routing Strategies
// ROUTING IMPLEMENTATION EXAMPLES:
// ═══════════════════════════════════════════════════════════════
// ─────────────────────────────────────────────────────────────────
// STRATEGY 1: URL-BASED ROUTING (Nginx)
// ─────────────────────────────────────────────────────────────────
# nginx.conf
upstream old_system {
    server old-app:3000;
}
upstream new_system {
    server new-app:8080;
}
server {
    listen 80;
    # Migrated endpoints go to new system
    location /api/v1/users {
        proxy_pass http://new_system;
    }
    location /api/v1/auth {
        proxy_pass http://new_system;
    }
    # Everything else goes to old system
    location / {
        proxy_pass http://old_system;
    }
}
// ─────────────────────────────────────────────────────────────────
// STRATEGY 2: FEATURE FLAG ROUTING
// ─────────────────────────────────────────────────────────────────
// facade/router.ts
import { featureFlags } from './feature-flags';
// matchesPattern, extractUserId, and proxyRequest are helpers assumed to be
// defined elsewhere in the facade.
interface RoutingRule {
  pattern: string;
  condition: (req: Request) => boolean; // true = send to new system
  newSystem: string;
  oldSystem: string;
}
const routingRules: RoutingRule[] = [
  {
    pattern: '/api/users/*',
    condition: () => featureFlags.isEnabled('new-user-service'),
    newSystem: 'http://new-user-service:8080',
    oldSystem: 'http://legacy-monolith:3000',
  },
  {
    pattern: '/api/orders/*',
    condition: (req) => {
      // Gradual rollout: new system for 10% of users
      const userId = extractUserId(req);
      return featureFlags.isEnabledForUser('new-order-service', userId);
    },
    newSystem: 'http://new-order-service:8080',
    oldSystem: 'http://legacy-monolith:3000',
  },
];
async function routeRequest(req: Request): Promise<Response> {
  // First matching rule wins
  for (const rule of routingRules) {
    if (matchesPattern(req.url, rule.pattern)) {
      const target = rule.condition(req) ? rule.newSystem : rule.oldSystem;
      return proxyRequest(req, target);
    }
  }
  // Default to old system
  return proxyRequest(req, 'http://legacy-monolith:3000');
}
// ─────────────────────────────────────────────────────────────────
// STRATEGY 3: HEADER-BASED ROUTING
// ─────────────────────────────────────────────────────────────────
# Route based on custom header (useful for testing)
# The map block belongs in the http context, next to the upstream definitions
map $http_x_use_new_system $api_backend {
    default old_system;
    "true"  new_system;
}
server {
    location /api/ {
        # $api_backend names one of the upstream groups defined earlier
        proxy_pass http://$api_backend;
    }
}
# Developers can test new system:
# curl -H "X-Use-New-System: true" https://api.example.com/api/users/123
// ─────────────────────────────────────────────────────────────────
// STRATEGY 4: PERCENTAGE-BASED ROLLOUT
// ─────────────────────────────────────────────────────────────────
// Gradually shift traffic from old to new
interface TrafficSplit {
endpoint: string;
newSystemPercentage: number; // 0-100
}
const trafficSplits: TrafficSplit[] = [
{ endpoint: '/api/users/*', newSystemPercentage: 100 }, // Fully migrated
{ endpoint: '/api/orders/*', newSystemPercentage: 25 }, // 25% to new
{ endpoint: '/api/products/*', newSystemPercentage: 0 }, // Not started
];
function routeByPercentage(req: Request): string {
const split = trafficSplits.find(s => matchesPattern(req.url, s.endpoint));
if (!split) return OLD_SYSTEM;
// Use consistent hashing so same user always goes to same system
const userId = extractUserId(req);
const hash = consistentHash(userId, 100);
return hash < split.newSystemPercentage ? NEW_SYSTEM : OLD_SYSTEM;
}
// ─────────────────────────────────────────────────────────────────
// STRATEGY 5: CANARY WITH AUTOMATIC ROLLBACK
// ─────────────────────────────────────────────────────────────────
interface CanaryConfig {
endpoint: string;
newSystemPercentage: number;
errorThreshold: number; // Rollback if error rate exceeds this
}
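class CanaryRouter {
  private errorCounts: Map<string, { old: number; new: number }> = new Map();
  private requestCounts: Map<string, { old: number; new: number }> = new Map();
  private disabled: Set<string> = new Set(); // Endpoints rolled back to old system
  async route(req: Request, config: CanaryConfig): Promise<Response> {
    const useNew = this.shouldUseNew(req, config);
    const target = useNew ? NEW_SYSTEM : OLD_SYSTEM;
    try {
      const response = await proxyRequest(req, target);
      this.recordRequest(config.endpoint, useNew, response.ok);
      return response;
    } catch (error) {
      this.recordRequest(config.endpoint, useNew, false);
      throw error;
    } finally {
      // Check after every new-system request, including ones that threw
      if (useNew && this.shouldRollback(config)) {
        console.error(`Auto-rollback triggered for ${config.endpoint}`);
        this.disableCanary(config.endpoint);
      }
    }
  }
  private shouldUseNew(req: Request, config: CanaryConfig): boolean {
    if (this.disabled.has(config.endpoint)) return false;
    return consistentHash(extractUserId(req), 100) < config.newSystemPercentage;
  }
  private recordRequest(endpoint: string, useNew: boolean, ok: boolean): void {
    const key = useNew ? 'new' : 'old';
    const counts = this.requestCounts.get(endpoint) ?? { old: 0, new: 0 };
    const errors = this.errorCounts.get(endpoint) ?? { old: 0, new: 0 };
    counts[key] += 1;
    if (!ok) errors[key] += 1;
    this.requestCounts.set(endpoint, counts);
    this.errorCounts.set(endpoint, errors);
  }
  private disableCanary(endpoint: string): void {
    this.disabled.add(endpoint);
  }
  private shouldRollback(config: CanaryConfig): boolean {
    const counts = this.requestCounts.get(config.endpoint);
    const errors = this.errorCounts.get(config.endpoint);
    if (!counts || !errors || counts.new < 100) return false; // Need sample size
    const errorRate = errors.new / counts.new;
    return errorRate > config.errorThreshold;
  }
}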
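The routing examples above call `matchesPattern` and `consistentHash` without defining them. Here is a minimal sketch of what those two helpers might look like. Both bodies are assumptions for illustration, not the original implementation. Note that the hash shown is plain deterministic bucketing (an FNV-1a-style hash over the string's character codes), which is all a percentage split needs: the same user ID always lands in the same bucket.

```typescript
// Hypothetical helpers assumed by the routing examples; names match the
// calls above, but the bodies are illustrative.

// Matches a request URL against a pattern such as '/api/users/*'.
// A trailing '/*' matches the prefix itself and anything beneath it.
function matchesPattern(url: string, pattern: string): boolean {
  // Drop scheme and host (if present), then the query string and fragment
  const path = url.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]*/i, '').split(/[?#]/)[0];
  if (pattern.endsWith('/*')) {
    const prefix = pattern.slice(0, -2);
    return path === prefix || path.startsWith(prefix + '/');
  }
  return path === pattern;
}

// Maps a key to a stable bucket in [0, buckets).
function consistentHash(key: string, buckets: number): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0) % buckets; // force unsigned before taking the modulus
}
```

With these in place, raising `newSystemPercentage` from 25 to 50 only moves users whose bucket falls between 25 and 49. Everyone already on the new system stays there.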
Data Migration Strategies
DATA MIGRATION IN STRANGLER FIG:
════════════════════════════════════════════════════════════════════
The hardest part: both systems need data.
STRATEGY 1: SHARED DATABASE (Temporary)
───────────────────────────────────────
┌─────────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Old System │ │ New System │ │
│ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │
│ └──────────────┬─────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ Shared Database │ │
│ │ (Legacy Schema) │ │
│ └─────────────────────┘ │
│ │
│ Pros: Simple, no data sync issues │
│ Cons: New system stuck with old schema, coupling │
│ │
│ Use when: Migrating application logic, not data model │
│ │
└─────────────────────────────────────────────────────────────────┘
STRATEGY 2: DATABASE PER SYSTEM + SYNC
──────────────────────────────────────
┌─────────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Old System │ │ New System │ │
│ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ sync ┌─────────────────┐ │
│ │ Old Database │◄────────────►│ New Database │ │
│ │ (Legacy) │ │ (Modern) │ │
│ └─────────────────┘ └─────────────────┘ │
│ │
│ Sync options: │
│ • Change Data Capture (CDC) - Debezium, etc. │
│ • Dual writes (write to both) │
│ • Event sourcing │
│ • Batch sync jobs │
│ │
│ Pros: New system has clean schema, decoupled │
│ Cons: Sync complexity, consistency challenges │
│ │
└─────────────────────────────────────────────────────────────────┘
STRATEGY 3: EVENT-DRIVEN SYNC
─────────────────────────────
┌─────────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Old System │──┐ ┌────│ New System │ │
│ └─────────────────┘ │ │ └─────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────┐ │
│ │ Event Bus │ │
│ │ (Kafka, etc.) │ │
│ └─────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Old Database │ │ New Database │ │
│ └─────────────────┘ └─────────────────┘ │
│ │
│ Both systems publish events on changes. │
│ Both systems consume events to stay in sync. │
│ │
│ Pros: Loose coupling, auditable, replayable │
│ Cons: Eventual consistency, complexity │
│ │
└─────────────────────────────────────────────────────────────────┘
// DATA SYNC IMPLEMENTATION:
// ═══════════════════════════════════════════════════════════════
// ─────────────────────────────────────────────────────────────────
// DUAL WRITE PATTERN (Simple but has issues)
// ─────────────────────────────────────────────────────────────────
// In the new system, write to both databases
async function createUser(userData: UserInput): Promise<User> {
// Start transaction in new database
const newDbTx = await newDb.transaction();
try {
// Write to new database (source of truth)
const user = await newDbTx.users.create(userData);
// Also write to old database (for old system)
await oldDb.users.create(transformToLegacyFormat(user));
await newDbTx.commit();
return user;
} catch (error) {
await newDbTx.rollback();
throw error;
}
}
// Problem: The two writes are not atomic. If newDbTx.commit() fails after the
// old DB write succeeds, the old DB keeps a user the new DB never stored.
// Solution: Use outbox pattern
// ─────────────────────────────────────────────────────────────────
// OUTBOX PATTERN (More robust)
// ─────────────────────────────────────────────────────────────────
async function createUser(userData: UserInput): Promise<User> {
// Single transaction
const tx = await newDb.transaction();
try {
// Write user
const user = await tx.users.create(userData);
// Write to outbox (same transaction)
await tx.outbox.create({
type: 'USER_CREATED',
payload: user,
processed: false,
});
await tx.commit();
return user;
} catch (error) {
await tx.rollback();
throw error;
}
}
// Separate process reads outbox and syncs to old system
async function processOutbox() {
const events = await newDb.outbox.findMany({
where: { processed: false },
orderBy: { createdAt: 'asc' },
});
for (const event of events) {
try {
await syncToOldSystem(event);
await newDb.outbox.update({
where: { id: event.id },
data: { processed: true },
});
} catch (error) {
// Log and stop; later events wait for the next run so ordering is preserved
console.error(`Failed to process outbox event ${event.id}`, error);
break;
}
}
}
// ─────────────────────────────────────────────────────────────────
// CHANGE DATA CAPTURE (CDC)
// ─────────────────────────────────────────────────────────────────
// Using Debezium to capture changes from old database
// and sync to new database
// debezium-config.json
{
"name": "legacy-db-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "legacy-db",
"database.port": "5432",
"database.user": "debezium",
"database.password": "secret",
"database.dbname": "legacy_app",
"table.include.list": "public.users,public.orders",
"topic.prefix": "legacy"
}
}
// Consumer in new system
async function handleCdcEvent(event: CdcEvent) {
switch (event.op) {
case 'r': // Snapshot read (rows that existed before streaming started)
case 'c': // Create
await newDb.users.create(transformFromLegacy(event.after));
break;
case 'u': // Update
await newDb.users.update({
where: { legacyId: event.after.id },
data: transformFromLegacy(event.after),
});
break;
case 'd': // Delete
await newDb.users.delete({
where: { legacyId: event.before.id },
});
break;
}
}
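One thing the sync examples above leave implicit: both the outbox processor and a CDC pipeline can deliver the same change more than once (for instance, if the sync succeeds but marking the outbox row as processed fails). The receiving side therefore needs to be idempotent. Below is a minimal in-memory sketch of that idea. The `SyncEvent` shape and the `IdempotentApplier` class are hypothetical, and in a real system the set of applied IDs would live in the target database, written in the same transaction as the change itself.

```typescript
// Hypothetical event shape; `id` is assumed to be unique per change.
interface SyncEvent {
  id: string;
  type: 'USER_CREATED' | 'USER_UPDATED' | 'USER_DELETED';
  payload: { userId: string; email?: string };
}

class IdempotentApplier {
  private applied = new Set<string>(); // IDs of events already processed
  private users = new Map<string, { email?: string }>(); // stand-in for the target table

  // Returns true if the event was applied, false if it was a duplicate.
  apply(event: SyncEvent): boolean {
    if (this.applied.has(event.id)) return false;
    switch (event.type) {
      case 'USER_CREATED':
      case 'USER_UPDATED':
        // Upsert, so create and update converge on the same row
        this.users.set(event.payload.userId, { email: event.payload.email });
        break;
      case 'USER_DELETED':
        this.users.delete(event.payload.userId);
        break;
    }
    this.applied.add(event.id);
    return true;
  }

  get(userId: string): { email?: string } | undefined {
    return this.users.get(userId);
  }
}
```

Replaying an event is then harmless: the second delivery is recognized by its ID and skipped.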
Step-by-Step Migration
Phase 1: Preparation
PHASE 1: PREPARATION (Weeks 1-4)
════════════════════════════════════════════════════════════════════
□ UNDERSTAND THE LEGACY SYSTEM
─────────────────────────────
• Document current architecture
• Map all integrations and dependencies
• Identify data flows
• Find the "seams" (natural boundaries)
• Talk to people who've been there longest
Deliverable: Architecture diagram, dependency map
□ IDENTIFY MIGRATION CANDIDATES
─────────────────────────────
Rank features by:
• Independence (fewer dependencies = easier)
• Business value (impact if improved)
• Risk (what if migration fails)
• Team familiarity
Start with: High independence, high value, low risk
┌─────────────────────────────────────────────────────────────┐
│ │
│ GOOD FIRST CANDIDATES: │
│ ───────────────────── │
│ • User authentication │
│ • Reporting/analytics │
│ • Notification service │
│ • Search functionality │
│ • File processing │
│ │
│ BAD FIRST CANDIDATES: │
│ ───────────────────── │
│ • Core transaction │
│ • Payment processing │
│ • Data migration │
│ • Tightly coupled features │
│ │
└─────────────────────────────────────────────────────────────┘
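If you want to make the ranking explicit, the four criteria above can be turned into a rough score. This is a sketch under loud assumptions: the 1-5 scales, the equal weights, and the `Candidate` shape are all invented for illustration, so tune them to your own context.

```typescript
// Hypothetical scoring model: each criterion is rated 1 (low) to 5 (high).
interface Candidate {
  name: string;
  independence: number;  // 5 = few dependencies
  businessValue: number; // 5 = large impact if improved
  risk: number;          // 5 = severe consequences if the migration fails
  familiarity: number;   // 5 = team knows this area well
}

// Higher is better. Risk counts against a candidate. Equal weights are an assumption.
function migrationScore(c: Candidate): number {
  return c.independence + c.businessValue + c.familiarity - c.risk;
}

// Returns a new array, best first, leaving the input untouched.
function rankCandidates(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort((a, b) => migrationScore(b) - migrationScore(a));
}
```

A notification service rated `{ independence: 5, businessValue: 3, risk: 1, familiarity: 4 }` scores 11, while payment processing rated `{ independence: 1, businessValue: 5, risk: 5, familiarity: 3 }` scores 4, which matches the intuition in the table.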
□ SET UP THE FACADE
─────────────────
• Deploy reverse proxy or API gateway
• Route all traffic through it (to old system)
• Verify no behavior changes
• Set up monitoring and logging
Deliverable: Facade routing 100% to old system
□ ESTABLISH METRICS
─────────────────
Define success criteria for migrations:
• Response time (p50, p95, p99)
• Error rate
• Throughput
• Business metrics (conversions, etc.)
Capture baselines from old system.
Deliverable: Dashboard with current metrics
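For the response-time baselines, here is a small sketch of how p50, p95, and p99 might be computed from raw latency samples. It uses the nearest-rank definition, which is one of several common percentile definitions. Your monitoring tool may interpolate instead, so compare like with like.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile of empty sample set');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Captures the three baselines named in the checklist.
function latencyBaseline(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

Record the old system's baseline before the first migration, then compute the same numbers for the new system during canary rollout.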
Phase 2: First Migration
PHASE 2: FIRST MIGRATION (Weeks 5-12)
════════════════════════════════════════════════════════════════════
□ BUILD THE NEW COMPONENT
────────────────────────
Build the first feature in the new system.
Mirror the API contract of the old system.
// Old system endpoint
GET /api/v1/users/:id
Response: { id, name, email, created_at }
// New system endpoint (same contract!)
GET /api/v1/users/:id
Response: { id, name, email, created_at }
Don't change the API yet. That's a separate project.
□ SHADOW TESTING
──────────────
Route traffic to BOTH systems, compare responses.
┌─────────────────────────────────────────────────────────────┐
│ │
│ Request ──► Facade │
│ │ │
│ ├──────► Old System ──► Response to user │
│ │ │
│ └──────► New System ──► Log only (compare) │
│ │
└─────────────────────────────────────────────────────────────┘
Compare:
• Are responses identical?
• Are there edge cases where they differ?
• Is new system faster/slower?
Fix discrepancies before proceeding.
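The shadow flow in the diagram can be sketched roughly as follows. `Backend`, `ShadowResponse`, and `handleWithShadow` are hypothetical names. The sketch waits for the shadow call to finish to keep the example simple, whereas a production facade would run it in the background so the user never waits on the new system. It is also only safe as written for read-only requests, since replaying writes against the new system would duplicate side effects.

```typescript
interface ShadowResponse {
  status: number;
  body: unknown;
}
type Backend = (path: string) => Promise<ShadowResponse>;

interface Mismatch {
  path: string;
  expected: ShadowResponse;
  actual: ShadowResponse | { error: string };
}

// Serves the old system's response and records any difference from the new one.
async function handleWithShadow(
  path: string,
  oldSystem: Backend,
  newSystem: Backend,
  mismatches: Mismatch[],
): Promise<ShadowResponse> {
  const expected = await oldSystem(path); // this is what the user receives

  let actual: ShadowResponse | { error: string };
  try {
    actual = await newSystem(path);
  } catch (error) {
    actual = { error: String(error) }; // a shadow failure must never reach the user
  }

  // Simple structural comparison; sensitive to JSON key order
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    mismatches.push({ path, expected, actual });
  }
  return expected;
}
```

Every entry in `mismatches` is either a bug in the new system or undocumented behavior in the old one, and both are worth finding before real traffic moves.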
□ CANARY DEPLOYMENT
─────────────────
Route small percentage of real traffic to new system.
Week 1: 1% of traffic
Week 2: 5% of traffic
Week 3: 25% of traffic
Week 4: 50% of traffic
Week 5: 100% of traffic
At each step:
• Monitor error rates
• Compare response times
• Check business metrics
• Be ready to rollback
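The rollout schedule and the checks at each step can be combined into a simple gate. The sketch below is illustrative: the step values come from the schedule above, while the `StepMetrics` and `Gate` shapes and the rule "any failed check sends all traffic back to the old system" are assumptions.

```typescript
// Percent of traffic on the new system at each step of the schedule.
const ROLLOUT_STEPS = [1, 5, 25, 50, 100];

interface StepMetrics {
  errorRate: number;    // fraction of failed requests on the new system (0 to 1)
  p95LatencyMs: number; // observed p95 on the new system
}

interface Gate {
  maxErrorRate: number;
  maxP95LatencyMs: number;
}

// Decides the next traffic percentage from the current one and observed metrics.
function nextPercentage(current: number, metrics: StepMetrics, gate: Gate): number {
  const healthy =
    metrics.errorRate <= gate.maxErrorRate &&
    metrics.p95LatencyMs <= gate.maxP95LatencyMs;
  if (!healthy) return 0; // roll back: all traffic returns to the old system

  const index = ROLLOUT_STEPS.indexOf(current);
  if (index === -1) return ROLLOUT_STEPS[0]; // not on the schedule yet: start at 1%
  return ROLLOUT_STEPS[Math.min(index + 1, ROLLOUT_STEPS.length - 1)];
}
```

Business metrics are deliberately left out of the gate here. They usually need a human to judge, so treat the function as a floor and not as a replacement for the weekly review.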
□ DECOMMISSION OLD COMPONENT
──────────────────────────
Once stable at 100%:
• Remove routing to old component
• Mark old code as deprecated
• (Don't delete yet—keep for reference)
□ RETROSPECTIVE
─────────────
What worked? What didn't?
How can we improve for next migration?
Deliverable: Lessons learned, updated process
Phase 3: Continued Migration
PHASE 3: CONTINUED MIGRATION (Months 3-12+)
════════════════════════════════════════════════════════════════════
Repeat for each component:
┌─────────────────────────────────────────────────────────────────┐
│ │
│ MIGRATION LOOP: │
│ ─────────────── │
│ │
│ ┌─────────┐ │
│ │ Plan │ Identify next component, define scope │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ │
│ │ Build │ Implement in new system │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ │
│ │ Test │ Shadow testing, verify correctness │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ │
│ │ Deploy │ Canary rollout, monitor │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ │
│ │Validate │ Verify metrics, business outcomes │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ │
│ │ Clean │ Remove old code, update docs │
│ └────┬────┘ │
│ │ │
│ └──────────────► Next component │
│ │
└─────────────────────────────────────────────────────────────────┘
VELOCITY EXPECTATIONS:
──────────────────────
• First migration: Slow (learning, setting up infrastructure)
• Migrations 2-5: Faster (patterns established)
• Later migrations: Can parallelize across teams
Timeline example:
• Component 1: 8 weeks (includes setup)
• Components 2-5: 4 weeks each
• Components 6+: 2-3 weeks each (parallelized)
PROGRESS TRACKING:
──────────────────
Migration Progress Dashboard
────────────────────────────
Component Status Traffic Split
────────────────────────────────────────────────
User Service ✓ Complete 100% new
Auth Service ✓ Complete 100% new
Product Catalog ● In Progress 45% new
Order Service ○ Planned 0% new
Payment Service ○ Planned 0% new
Reporting ○ Planned 0% new
Notifications ○ Backlog 0% new
Overall Progress: ███████░░░░░░░░░░░░░ 35%
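The dashboard does not say how "Overall Progress" is derived. One simple reading, which happens to reproduce the 35% shown, is an unweighted average of each component's traffic split. The sketch below assumes exactly that. Weighting by request volume or by effort would be equally defensible.

```typescript
interface ComponentStatus {
  name: string;
  newTrafficPercent: number; // 0-100, share of traffic served by the new system
}

// Unweighted average of per-component traffic splits, rounded to a whole percent.
function overallProgress(components: ComponentStatus[]): number {
  if (components.length === 0) return 0;
  const total = components.reduce((sum, c) => sum + c.newTrafficPercent, 0);
  return Math.round(total / components.length);
}

// Renders a text bar like the one in the dashboard.
function progressBar(percent: number, width: number = 20): string {
  const filled = Math.round((percent * width) / 100);
  return '█'.repeat(filled) + '░'.repeat(width - filled);
}
```

For the seven components listed (100, 100, 45, 0, 0, 0, 0), the average is 245 / 7 = 35.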
Phase 4: Decommissioning
PHASE 4: DECOMMISSIONING THE LEGACY SYSTEM
════════════════════════════════════════════════════════════════════
When all components are migrated:
□ VERIFY ZERO TRAFFIC TO OLD SYSTEM
─────────────────────────────────
Check logs and metrics.
Are there ANY requests still going to old system?
Hidden integrations? Cron jobs? Batch processes?
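If the facade logs which system served each request, finding the remaining callers is a matter of grouping. A minimal sketch follows, assuming a hypothetical `AccessLogEntry` shape. Real log fields will differ, and anything that bypasses the facade (cron jobs, batch processes, direct database access) will not appear here at all, so check the old system's own logs as well.

```typescript
// Hypothetical parsed access-log entry from the facade.
interface AccessLogEntry {
  upstream: 'old' | 'new'; // which system served the request
  caller: string;          // e.g. client ID, API key name, or source host
  path: string;
}

// Counts requests that still reached the old system, grouped by caller.
function lingeringCallers(entries: AccessLogEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    if (entry.upstream !== 'old') continue;
    counts.set(entry.caller, (counts.get(entry.caller) ?? 0) + 1);
  }
  return counts;
}
```

An empty map over a full business cycle (including month-end and any quarterly jobs) is a reasonable signal that it is safe to announce the shutdown date.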
□ ANNOUNCE DEPRECATION
────────────────────
Internal communication:
• Old system will be turned off on [date]
• Speak now if you have dependencies we don't know about
□ TURN OFF (BUT KEEP READY)
─────────────────────────
Week 1: Stop old system, but keep infrastructure
Week 2-4: Monitor for any issues
Month 2: If no issues, tear down infrastructure
□ ARCHIVE CODE AND DATA
─────────────────────
Don't delete—archive:
• Source code repository (read-only)
• Database backups
• Documentation
You may need it for legal/compliance reasons.
□ CELEBRATE
─────────
This is a big deal. Recognize the team.
┌─────────────────────────────────────────────────────────────┐
│ │
│ 🎉 LEGACY SYSTEM DECOMMISSIONED 🎉 │
│ │
│ Timeline: 14 months │
│ Components migrated: 12 │
│ Downtime: 0 │
│ Customer-visible incidents: 0 │
│ │
│ Improvements delivered: │
│ • 60% reduction in response time │
│ • 40% reduction in infrastructure cost │
│ • Modern tech stack enabling faster development │
│ • Team can now ship features 3x faster │
│ │
└─────────────────────────────────────────────────────────────┘
Common Challenges and Solutions
Challenge: Tightly Coupled Components
PROBLEM: COMPONENTS THAT CAN'T BE SEPARATED
════════════════════════════════════════════════════════════════════
"We can't migrate Users without also migrating Orders, which
requires Products, which needs Inventory..."
┌─────────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Users │◄────►│ Orders │◄────►│ Products│ │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Billing │◄────►│Inventory│◄────►│ Pricing │ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ │
│ Everything depends on everything! │
│ │
└─────────────────────────────────────────────────────────────────┘
SOLUTION 1: ANTI-CORRUPTION LAYER
─────────────────────────────────
Create an adapter that translates between old and new systems.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ New Users │ │ Old Orders │ │
│ │ Service │ │ (Legacy) │ │
│ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ ANTI-CORRUPTION LAYER │ │
│ │ │ │
│ │ • Translates new User format to legacy format │ │
│ │ • Translates legacy Order format to new format │ │
│ │ • Hides complexity from both systems │ │
│ │ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
// Anti-corruption layer example
class UserServiceAdapter {
  constructor(
    private newUserService: NewUserService,
    // Not used in this excerpt
    private legacyUserService: LegacyUserService,
  ) {}

  // Legacy Orders code calls this: the user now lives in the new service,
  // but Orders still expects the legacy shape
  async getUserForOrder(userId: string): Promise<LegacyUserFormat> {
    const newUser = await this.newUserService.getUser(userId);
    return this.transformToLegacyFormat(newUser);
  }

  // Other legacy code calls this; same lookup and translation
  async getLegacyUser(userId: string): Promise<LegacyUserFormat> {
    return this.getUserForOrder(userId);
  }

  private transformToLegacyFormat(user: NewUser): LegacyUserFormat {
    return {
      user_id: user.id,
      user_name: `${user.firstName} ${user.lastName}`,
      email_address: user.email,
      // ... legacy field mappings
    };
  }
}
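The adapter above only shows the user direction. The diagram's second bullet, translating legacy Order data into the new format, could look roughly like the sketch below. The article doesn't specify either schema, so every field name here (`order_id`, `order_total`, `created_at`, `totalCents`, and so on) is hypothetical, as is the assumption that legacy timestamps are UTC.

```typescript
// Hypothetical legacy shape: snake_case, totals as decimal strings,
// timestamps as "YYYY-MM-DD HH:mm:ss" (assumed to be UTC).
interface LegacyOrderFormat {
  order_id: string;
  user_id: string;
  order_total: string; // e.g. "19.99"
  created_at: string;  // e.g. "2024-03-01 12:30:00"
}

// Hypothetical shape used by the new system.
interface NewOrder {
  id: string;
  userId: string;
  totalCents: number;
  createdAt: Date;
}

// Translate one legacy order, failing loudly on data the new model can't represent.
function toNewOrder(legacy: LegacyOrderFormat): NewOrder {
  const total = Number(legacy.order_total);
  if (legacy.order_total.trim() === "" || !Number.isFinite(total)) {
    throw new Error(`Order ${legacy.order_id}: invalid total "${legacy.order_total}"`);
  }

  // Turn "2024-03-01 12:30:00" into ISO 8601 so Date parsing is unambiguous.
  const createdAt = new Date(legacy.created_at.replace(" ", "T") + "Z");
  if (Number.isNaN(createdAt.getTime())) {
    throw new Error(`Order ${legacy.order_id}: invalid date "${legacy.created_at}"`);
  }

  return {
    id: legacy.order_id,
    userId: legacy.user_id,
    totalCents: Math.round(total * 100), // avoid floating-point money downstream
    createdAt,
  };
}
```

Throwing on unparseable values, rather than silently defaulting, is a deliberate choice in this sketch: the translation layer is usually where years of legacy data quirks first become visible.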
SOLUTION 2: BRANCH BY ABSTRACTION
─────────────────────────────────
1. Create abstraction interface in old codebase
2. Implement interface with old code
3. Create new implementation
4. Switch implementations via feature flag
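// Step 1: Define abstraction
interface UserRepository {
  findById(id: string): Promise<User>;
  save(user: User): Promise<void>;
}

// Step 2: Wrap legacy code
// (LegacyDb and HttpClient stand in for whatever clients you already have)
class LegacyUserRepository implements UserRepository {
  constructor(private legacyDb: LegacyDb) {}

  async findById(id: string): Promise<User> {
    // Old implementation
    return this.legacyDb.query('SELECT * FROM users WHERE id = ?', [id]);
  }

  async save(user: User): Promise<void> {
    // Old write path moves here unchanged
  }
}

// Step 3: Create new implementation
class NewUserRepository implements UserRepository {
  constructor(private httpClient: HttpClient) {}

  async findById(id: string): Promise<User> {
    // New implementation (calls new service)
    return this.httpClient.get(`/new-user-service/users/${id}`);
  }

  async save(user: User): Promise<void> {
    // Assumed endpoint, mirroring the read path above
    await this.httpClient.put(`/new-user-service/users/${user.id}`, user);
  }
}

// Step 4: Switch via feature flag
function getUserRepository(): UserRepository {
  if (featureFlags.isEnabled('new-user-service')) {
    return new NewUserRepository(httpClient);
  }
  return new LegacyUserRepository(legacyDb);
}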
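Once both implementations sit behind the same interface, a third implementation can run them side by side before the flag is flipped. This is an extension of the four steps above, not part of them: a sketch in which the legacy repository stays authoritative while the new one is queried in the background and any mismatch is reported. `ComparingUserRepository`, `MismatchReporter`, `InMemoryUserRepository`, and the two-field `User` type are all made up for illustration.

```typescript
interface User {
  id: string;
  email: string;
}

interface UserRepository {
  findById(id: string): Promise<User>;
  save(user: User): Promise<void>;
}

// Hypothetical hook: wire this to logging, metrics, or alerting.
type MismatchReporter = (id: string, legacy: User, candidate: User | Error) => void;

class ComparingUserRepository implements UserRepository {
  private readonly legacy: UserRepository;
  private readonly candidate: UserRepository;
  private readonly report: MismatchReporter;

  constructor(legacy: UserRepository, candidate: UserRepository, report: MismatchReporter) {
    this.legacy = legacy;
    this.candidate = candidate;
    this.report = report;
  }

  async findById(id: string): Promise<User> {
    // Legacy is still the source of truth: its result (or error) reaches the caller.
    const legacyUser = await this.legacy.findById(id);

    // A failure in the new implementation is recorded, never surfaced.
    const candidate: User | Error = await this.candidate
      .findById(id)
      .catch((err: unknown) => (err instanceof Error ? err : new Error(String(err))));

    if (
      candidate instanceof Error ||
      candidate.id !== legacyUser.id ||
      candidate.email !== legacyUser.email
    ) {
      this.report(id, legacyUser, candidate);
    }
    return legacyUser;
  }

  async save(user: User): Promise<void> {
    // Writes go to legacy only; writing to both here would be an unmanaged dual write.
    await this.legacy.save(user);
  }
}

// Tiny in-memory implementation, only so the sketch can be exercised.
class InMemoryUserRepository implements UserRepository {
  private readonly users: Map<string, User>;

  constructor(users: Map<string, User>) {
    this.users = users;
  }

  async findById(id: string): Promise<User> {
    const user = this.users.get(id);
    if (!user) throw new Error(`User ${id} not found`);
    return user;
  }

  async save(user: User): Promise<void> {
    this.users.set(user.id, user);
  }
}
```

Because the comparison lives behind `UserRepository`, callers need no changes, and removing it later is a one-line change in the factory. The cost is a second read per request, so in practice it might only be enabled for a sample of traffic.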
Challenge: Data Consistency
PROBLEM: KEEPING DATA IN SYNC BETWEEN OLD AND NEW
════════════════════════════════════════════════════════════════════
During migration, data may be written to either system.
Both systems need consistent data.
SOLUTION: WRITE-THROUGH WITH SINGLE SOURCE OF TRUTH
───────────────────────────────────────────────────
Define which system owns each data type during transition.
Phase 1: Old system is source of truth
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Write ──► Old System ──► Sync ──► New System (read replica) │
│ │
│ New system reads from its own DB but data comes from old. │
│ │
└─────────────────────────────────────────────────────────────────┘
Phase 2: New system becomes source of truth
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Write ──► New System ──► Sync ──► Old System (for reads) │
│ │
│ Old system still works but data comes from new. │
│ │
└─────────────────────────────────────────────────────────────────┘
Phase 3: Old system decommissioned
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Write ──► New System │
│ │
│ No sync needed. Old system gone. │
│ │
└─────────────────────────────────────────────────────────────────┘
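The three phases above can be made explicit in code so that every write path asks the same question: which system owns this data right now? The sketch below is one way to express that. The phase names, the `WritePlan` shape, and the `phaseByDataType` table are all assumptions for illustration; the article only describes the phases conceptually.

```typescript
// Phase names are invented for this sketch; they mirror Phases 1-3 above.
type Phase = "old-is-source" | "new-is-source" | "old-decommissioned";
type SystemName = "old" | "new";

interface WritePlan {
  writeTo: SystemName;       // the single source of truth for this write
  syncTo: SystemName | null; // the system that receives a copy, if any
}

function planWrite(phase: Phase): WritePlan {
  switch (phase) {
    case "old-is-source":
      return { writeTo: "old", syncTo: "new" };
    case "new-is-source":
      return { writeTo: "new", syncTo: "old" };
    case "old-decommissioned":
      return { writeTo: "new", syncTo: null };
  }
}

// Hypothetical ownership table: each data type moves through the phases independently.
const phaseByDataType: Record<string, Phase> = {
  users: "new-is-source",
  orders: "old-is-source",
};

function planWriteFor(dataType: string): WritePlan {
  const phase = phaseByDataType[dataType];
  if (phase === undefined) {
    // Refuse to guess: an unowned data type is a migration planning bug.
    throw new Error(`No owner defined for data type "${dataType}"`);
  }
  return planWrite(phase);
}
```

Keeping the table in one place means moving a data type from Phase 1 to Phase 2 is a single, reviewable change. How the `syncTo` copy actually happens (CDC, an outbox, a batch job) is a separate decision this sketch leaves open.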
VERIFICATION: COMPARE AND ALERT
───────────────────────────────
// Periodically compare data between systems
async function verifyDataConsistency() {
  const sampleUserIds = await getRandomUserIds(1000);
  for (const userId of sampleUserIds) {
    const oldUser = await oldSystem.getUser(userId);
    const newUser = await newSystem.getUser(userId);
    const differences = findDifferences(oldUser, newUser);
    if (differences.length > 0) {
      alertDataInconsistency({
        userId,
        differences,
        oldUser,
        newUser,
      });
    }
  }
}

// Run hourly during migration
schedule('0 * * * *', verifyDataConsistency);
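The verification job above relies on a `findDifferences` helper that the article doesn't show. A minimal sketch follows, assuming both records have already been normalized to the same field names (the two systems will often use different shapes, so a mapping step may be needed first). The `ignoreFields` parameter is an addition of this sketch, meant for fields such as sync timestamps that are expected to differ.

```typescript
interface FieldDifference {
  field: string;
  oldValue: unknown;
  newValue: unknown;
}

// Compare two records field by field and list every field whose value differs.
function findDifferences(
  oldRecord: Record<string, unknown>,
  newRecord: Record<string, unknown>,
  ignoreFields: string[] = [],
): FieldDifference[] {
  const ignored = new Set(ignoreFields);
  // Union of keys, so a field missing on one side is reported too.
  const fields = Array.from(
    new Set(Object.keys(oldRecord).concat(Object.keys(newRecord))),
  );

  const differences: FieldDifference[] = [];
  for (const field of fields) {
    if (ignored.has(field)) continue;
    const oldValue = oldRecord[field];
    const newValue = newRecord[field];
    // Comparing serialized values handles nested data, but it is sensitive to
    // key order inside nested objects; good enough for a sketch.
    if (JSON.stringify(oldValue) !== JSON.stringify(newValue)) {
      differences.push({ field, oldValue, newValue });
    }
  }
  return differences;
}
```

Reporting field-level differences rather than a bare "records differ" makes alerts actionable: a mismatch confined to one field usually points at one specific mapping or sync bug.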
Challenge: Team Resistance
PROBLEM: "WE DON'T HAVE TIME FOR MIGRATION"
════════════════════════════════════════════════════════════════════
Product wants features. Engineering wants to migrate.
Migration doesn't ship features.
SOLUTION: INTEGRATE MIGRATION WITH FEATURE WORK
───────────────────────────────────────────────
Don't: "Spend 6 months on migration, then resume features."
Do: "Each feature touches the new system, expanding it."
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Product request: "Add user preferences feature" │
│ │
│ Without strangler: │
│ → Build preferences in old system │
│ → Migration delayed │
│ │
│ With strangler: │
│ → Build preferences in new system │
│ → Migrate user profiles as part of feature │
│ → Feature delivered + migration progressed │
│ │
└─────────────────────────────────────────────────────────────────┘
THE STRANGLER "TAX":
────────────────────
Frame migration as a tax on feature work:
"Every new feature will be built in the new system.
This adds ~20% overhead initially.
But it means we're always making progress on migration.
And new features are better/faster because they're in the new system."
20% overhead now vs. 2-year rewrite later.
MAKE MIGRATION INVISIBLE:
─────────────────────────
To product/business stakeholders, focus on outcomes:
"The login flow is now 3x faster." (Because we migrated it)
"We can now add payment methods in days, not weeks."
"The new search is way more accurate."
Don't say: "We migrated the auth service to a new microservice
using event-driven architecture..."
Anti-Patterns
What Not to Do
STRANGLER FIG ANTI-PATTERNS:
════════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ │
│ 1. THE BIG BANG DISGUISED AS STRANGLER │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Bad: "We'll migrate all 50 components in parallel, │
│ then flip the switch on launch day." │
│ │
│ This is still a big bang rewrite. │
│ You just added a proxy in front. │
│ │
│ Fix: Migrate and launch ONE component at a time. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 2. THE ETERNAL MIGRATION │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Bad: Migration started 3 years ago. │
│ We're 60% done. We'll finish "eventually." │
│ Both systems are now equally legacy. │
│ │
│ The "new" system is now 3 years old too. │
│ │
│ Fix: Set deadlines. Staff appropriately. Finish it. │
│ A 12-month migration is better than a 36-month one. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 3. THE DUAL MAINTENANCE HELL │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Bad: Adding features to BOTH old and new systems. │
│ "We need this feature now, can't wait for migration." │
│ Now we have divergent systems. │
│ │
│ Fix: New features ONLY in new system. │
│ If you need it in old system, migrate that component. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 4. THE SCOPE CREEP │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Bad: "While we're migrating users, let's also redesign the │
│ data model, add new features, change the API, │
│ and switch databases." │
│ │
│ Migration scope creep kills projects. │
│ │
│ Fix: Migration should be a MIGRATION. │
│ Same functionality, new implementation. │
│ Improvements come AFTER migration is done. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 5. THE INVISIBLE PROGRESS │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Bad: Leadership doesn't see value. │
│ "What did we get for 6 months of migration work?" │
│ Budget gets cut. Migration stalls. │
│ │
│ Fix: Make progress visible. │
│ Dashboard with traffic percentages. │
│ Regular updates on improvements delivered. │
│ Celebrate each component migration. │
│ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 6. THE ABANDONED OLD SYSTEM │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Bad: No one maintains the old system anymore. │
│ It's "temporary" so we don't fix bugs. │
│ But 80% of traffic still goes there. │
│ │
│ Fix: Old system stays production quality until decommissioned.│
│ It's still serving customers! │
│ │
└─────────────────────────────────────────────────────────────────┘
Quick Reference
┌─────────────────────────────────────────────────────────────────────┐
│ STRANGLER FIG QUICK REFERENCE │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ CORE PRINCIPLES │
│ ───────────────────────────────────────────────────────────────── │
│ 1. Incremental migration - One component at a time │
│ 2. Always working - System functions throughout │
│ 3. Reversible - Can roll back any migration │
│ 4. Parallel running - Old and new run simultaneously │
│ 5. Value throughout - Each step delivers improvement │
│ │
│ FACADE OPTIONS │
│ ───────────────────────────────────────────────────────────────── │
│ • Reverse proxy (Nginx, HAProxy) - URL-based routing │
│ • API Gateway (Kong, AWS API GW) - More features │
│ • Application proxy - Complex routing logic │
│ • In-app module - No infrastructure changes │
│ │
│ ROUTING STRATEGIES │
│ ───────────────────────────────────────────────────────────────── │
│ • URL-based: /users/* → new, /orders/* → old │
│ • Feature flag: if (flag.enabled) → new │
│ • Percentage: 10% → new, 90% → old │
│ • Header-based: X-Use-New-System: true │
│ • User-based: Specific users → new (for testing) │
│ │
│ DATA STRATEGIES │
│ ───────────────────────────────────────────────────────────────── │
│ • Shared database - Simple, but couples systems │
│ • Dual writes - Write to both, sync issues │
│ • Outbox pattern - Reliable dual writes │
│ • CDC (Debezium) - Capture and sync changes │
│ • Event sourcing - Events are source of truth │
│ │
│ MIGRATION ORDER CRITERIA │
│ ───────────────────────────────────────────────────────────────── │
│ Good first: | Bad first: │
│ ────────────────────────────────────────────────── │
│ • Independent | • Tightly coupled │
│ • Well-understood | • Core business logic │
│ • Low risk | • Payment/financial │
│ • High pain | • Complex data model │
│ • Stateless | • Stateful │
│ │
│ VALIDATION CHECKLIST │
│ ───────────────────────────────────────────────────────────────── │
│ Before 100% traffic to new system: │
│ □ Shadow testing passed (responses match) │
│ □ Canary at 1%, 5%, 25%, 50% successful │
│ □ Error rate equal or better than old │
│ □ Latency equal or better than old │
│ □ Business metrics unchanged │
│ □ Rollback tested and works │
│ │
│ RED FLAGS │
│ ───────────────────────────────────────────────────────────────── │
│ ✗ "We'll migrate everything at once" │
│ ✗ Migration taking > 18 months │
│ ✗ Building features in both systems │
│ ✗ Scope creeping beyond migration │
│ ✗ No visibility into progress │
│ ✗ Old system neglected while still in use │
│ │
│ TYPICAL TIMELINE │
│ ───────────────────────────────────────────────────────────────── │
│ Preparation: 2-4 weeks │
│ First component: 6-10 weeks │
│ Subsequent components: 2-4 weeks each │
│ Full migration: 6-18 months (depends on system size) │
│ Decommissioning: 2-4 weeks after last component │
│ │
└─────────────────────────────────────────────────────────────────────┘
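The routing strategies in the quick reference are listed separately, but a facade often combines several of them. The sketch below shows one possible ordering: URL prefix first, then header override, then an allow-list of users, then a percentage rollout. The ordering, the `RoutingConfig` shape, and the hash-based bucketing are assumptions of this sketch; only the `X-Use-New-System` header name comes from the list above. Header keys are assumed to be lower-cased, as Node's HTTP server does.

```typescript
type Target = "old" | "new";

interface RoutingRequest {
  path: string;
  headers: Record<string, string>; // keys assumed lower-cased
  userId?: string;
}

interface RoutingConfig {
  newSystemPrefixes: string[]; // URL-based: paths the new system can serve
  betaUserIds: string[];       // user-based: always routed to new (for testing)
  rolloutPercent: number;      // percentage: 0-100 of remaining traffic
}

// Map a key to a stable bucket in 0..99 using an FNV-1a-style hash over
// UTF-16 code units, so the same user always lands on the same side.
function bucketFor(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

function chooseTarget(req: RoutingRequest, config: RoutingConfig): Target {
  // Paths that haven't been migrated always go to the old system,
  // regardless of headers or user lists.
  const migrated = config.newSystemPrefixes.some((prefix) => req.path.startsWith(prefix));
  if (!migrated) return "old";

  if (req.headers["x-use-new-system"] === "true") return "new";
  if (req.userId !== undefined && config.betaUserIds.includes(req.userId)) return "new";

  // Fall back to the path when there is no user, so anonymous traffic is still split.
  const key = req.userId ?? req.path;
  return bucketFor(key) < config.rolloutPercent ? "new" : "old";
}
```

Bucketing on a stable key rather than a random number keeps each user on one side of the split across requests, which makes canary metrics easier to interpret and avoids users flipping between old and new behavior mid-session.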
Conclusion
The strangler fig pattern isn't glamorous. There's no big reveal, no dramatic switchover, no "we rebuilt everything in React." It's a long game of incremental progress, careful validation, and gradual improvement.
But it works.
Why rewrites fail and strangler succeeds:
- Rewrites are all-or-nothing. You don't get value until it's done, and "done" keeps moving further away. Strangler delivers value incrementally: each migrated component is an improvement.
- Rewrites fight the business. Features freeze or diverge, and stakeholders lose patience. Strangler works alongside feature development: new features go in the new system, moving the migration forward.
- Rewrites hide the complexity. You discover that gnarly edge case when you're already committed. Strangler exposes complexity one piece at a time, when you can still adjust.
- Rewrites are irreversible. Once you commit, you're committed. Strangler lets you pause, roll back, or change direction at any point.
Keys to success:
- Start small. Pick an independent, well-understood component. Get the pattern working before tackling hard stuff.
- Make progress visible. Dashboard with traffic splits. Regular updates on improvements. Celebrate milestones.
- Resist scope creep. Migration should be migration. Save the redesigns for after you've escaped the legacy system.
- Don't abandon the old system. It's still serving customers. Keep it healthy until it's actually turned off.
- Set a deadline. Open-ended migrations become eternal migrations. Staff appropriately and finish it.
The goal isn't a perfect new system—it's escaping an unmaintainable old one while keeping the business running. The strangler fig pattern makes that possible, one branch at a time.
When your legacy system is finally decommissioned, there won't be fanfare or celebration (though there should be). There will just be a better system, built incrementally over months, that's been proving itself in production the entire time.
That's the unsexy reality of successful migrations. And it beats the alternative every time.