Zero-Downtime Schema Migrations: Strategies for Production Databases
Database schema migrations are among the most dangerous routine operations in production systems. A bad migration can lock tables for hours, corrupt data, or take down your entire application. Zero-downtime migrations are the art of changing your database schema while traffic is flowing — without long locks, without downtime, and without losing data.
This article covers the mechanics of online schema changes: how to add columns without locking, how to rename tables while queries are running, and how to migrate billions of rows without your pager going off at 3 AM.
Why Schema Migrations Are Dangerous
┌─────────────────────────────────────────────────────────────────────────┐
│ SCHEMA MIGRATION FAILURE MODES │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. TABLE LOCKS │
│ ALTER TABLE in many databases acquires exclusive lock │
│ All queries to that table block until ALTER completes │
│ On a 100GB table, this can be hours │
│ │
│ 2. TRANSACTION LOG EXPLOSION │
│ Large migrations generate massive write-ahead logs │
│ Disk fills up, replication falls behind, failover breaks │
│ │
│ 3. BACKWARD INCOMPATIBILITY │
│ Old application code runs alongside new schema │
│ Column rename: old code inserts to old name, new code reads new │
│ Data ends up in wrong place or queries fail │
│ │
│ 4. DATA CORRUPTION │
│ Bad migration logic (wrong default, bad transform) │
│ No rollback possible after data is transformed │
│ │
│ 5. REPLICATION ISSUES │
│ Schema change propagates to replicas at different times │
│ Queries hit replicas with old schema, fail │
│ Replication lag causes inconsistent reads │
│ │
│ 6. GHOST MIGRATIONS │
│ Migration runs on staging, not production │
│ Or runs partially, leaves schema in inconsistent state │
│ │
└─────────────────────────────────────────────────────────────────────────┘
The Expand-Contract Pattern
The core strategy for zero-downtime migrations is expand-contract (also called parallel change):
┌─────────────────────────────────────────────────────────────────────────┐
│ EXPAND-CONTRACT MIGRATION PATTERN │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ GOAL: Rename column "name" to "full_name" │
│ │
│ WRONG (causes downtime): │
│ ───────────────────────── │
│ 1. Deploy: ALTER TABLE users RENAME COLUMN name TO full_name; │
│ 2. Deploy new code that uses full_name │
│ ❌ Between step 1 and 2: all queries fail │
│ │
│ EXPAND-CONTRACT (zero downtime): │
│ ──────────────────────────────── │
│ │
│ PHASE 1: EXPAND │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Add new column (no lock on Postgres): │ │
│ │ ALTER TABLE users ADD COLUMN full_name VARCHAR(255); │ │
│ │ │ │
│ │ Schema now: [id, name, email, full_name] │ │
│ │ Code reads: name │ │
│ │ Code writes: name │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ PHASE 2: DUAL WRITE │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Deploy code that writes to BOTH columns: │ │
│ │ INSERT INTO users (name, full_name, ...) VALUES (?, ?, ...)│ │
│ │ UPDATE users SET name = ?, full_name = ? WHERE id = ? │ │
│ │ │ │
│ │ Code reads: name (still) │ │
│ │ Code writes: name AND full_name │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ PHASE 3: BACKFILL │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Copy data from old column to new: │ │
│ │ UPDATE users SET full_name = name WHERE full_name IS NULL│ │
│ │ (Do in batches to avoid locks!) │ │
│ │ │ │
│ │ Now both columns have same data │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ PHASE 4: SWITCH READS │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Deploy code that reads from new column: │ │
│ │ SELECT full_name FROM users │ │
│ │ │ │
│ │ Code reads: full_name │ │
│ │ Code writes: name AND full_name (still dual-writing) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ PHASE 5: CONTRACT │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Stop writing to old column: │ │
│ │ Code now only writes to full_name │ │
│ │ │ │
│ │ Then, later: DROP COLUMN name │ │
│ │ (Or keep for rollback safety period) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ Each phase is a separate deploy. Each is independently safe. │
│ Any phase can be rolled back. Zero downtime throughout. │
│ │
└─────────────────────────────────────────────────────────────────────────┘
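In application code, Phases 2 and 4 amount to a write path that targets both columns and a read path behind a flag. A runnable sketch of that shape (the in-memory `UserStore` and its method names are illustrative stand-ins for the real table and data-access layer):

```typescript
type UserRow = { id: number; name: string; full_name?: string };

// In-memory stand-in for the users table, so the pattern runs without a DB.
class UserStore {
  private rows = new Map<number, UserRow>();

  // Phase 2: every write goes to BOTH the old and new column.
  dualWrite(id: number, value: string): void {
    this.rows.set(id, { id, name: value, full_name: value });
  }

  // Phase 4 is a deploy that flips which column reads come from.
  read(id: number, readFromNew: boolean): string | undefined {
    const row = this.rows.get(id);
    if (!row) return undefined;
    return readFromNew ? row.full_name : row.name;
  }
}
```

Because the flag only changes the read path, flipping it back is a safe rollback at any point while dual-writing continues.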
PostgreSQL Online Schema Changes
PostgreSQL is relatively good at online schema changes, but has gotchas:
Operations That Don't Lock
-- These are essentially instant (metadata-only), though each still
-- takes a brief ACCESS EXCLUSIVE lock to update the catalog:
-- Add column with no default
ALTER TABLE users ADD COLUMN new_field VARCHAR(255);
-- Add column with a non-volatile default (Postgres 11+ stores the
-- default as metadata instead of rewriting every row)
ALTER TABLE users ADD COLUMN created_at TIMESTAMP DEFAULT now();
-- Drop column (marks it dropped in the catalog, doesn't rewrite)
ALTER TABLE users DROP COLUMN old_field;
-- Create index concurrently (doesn't block writes)
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
-- Add constraint without validation
ALTER TABLE users ADD CONSTRAINT fk_org
FOREIGN KEY (org_id) REFERENCES orgs(id) NOT VALID;
Operations That DO Lock
-- These rewrite the table or hold an ACCESS EXCLUSIVE lock for a long time:
-- Add column with any default before Postgres 11, or with a
-- VOLATILE default (e.g. random()) on any version
ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';
-- Pre-11: LOCKS entire table, rewrites every row
-- Set NOT NULL on an existing column
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
-- Holds the lock while it scans the entire table to verify the constraint
-- Change column type
ALTER TABLE users ALTER COLUMN age TYPE bigint;
-- Rewrites entire table
-- Add unique constraint
ALTER TABLE users ADD CONSTRAINT uk_email UNIQUE (email);
-- Creates index (use CREATE INDEX CONCURRENTLY + ADD CONSTRAINT USING INDEX)
-- Validate foreign key
ALTER TABLE users VALIDATE CONSTRAINT fk_org;
-- Scans entire table
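Even the "instant" operations above can cause an outage if the brief ACCESS EXCLUSIVE lock queues behind one long-running transaction: everything behind the ALTER then blocks too. A common guard is to set a short `lock_timeout` so the ALTER gives up fast, then retry with backoff. A sketch with the executor injected so it runs without a live database (in practice `execute` would be `(sql) => pool.query(sql)`; the function name and retry policy are ours, `lock_timeout` is a real Postgres setting):

```typescript
// Run one DDL statement guarded by a short lock_timeout, retrying
// with capped exponential backoff if the lock can't be acquired.
async function runDdlWithRetry(
  execute: (sql: string) => Promise<unknown>,
  ddl: string,
  maxAttempts = 5,
  baseDelayMs = 1000,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Give up quickly instead of queueing behind a long transaction
      // and blocking all traffic to the table.
      await execute(`SET lock_timeout = '2s'`);
      await execute(ddl);
      return attempt; // attempts used, for logging
    } catch (err) {
      if (attempt === maxAttempts) throw err;
      const delay = Math.min(baseDelayMs * 2 ** attempt, 10_000);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error('unreachable');
}
```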
Safe Pattern: Adding NOT NULL Column
-- WRONG: Locks table for full rewrite
ALTER TABLE users ADD COLUMN role VARCHAR(20) NOT NULL DEFAULT 'user';
-- SAFE: Multi-step approach
-- Step 1: Add nullable column (instant)
ALTER TABLE users ADD COLUMN role VARCHAR(20);
-- Step 2: Set default for new rows (instant)
ALTER TABLE users ALTER COLUMN role SET DEFAULT 'user';
-- Step 3: Backfill in batches (no lock per batch)
-- Note: COMMIT inside a DO block requires Postgres 11+, and the block
-- must not itself run inside an explicit transaction
DO $$
DECLARE
batch_size INT := 10000;
affected INT;
BEGIN
LOOP
UPDATE users
SET role = 'user'
WHERE id IN (
SELECT id FROM users
WHERE role IS NULL
LIMIT batch_size
FOR UPDATE SKIP LOCKED -- Don't block on locked rows
);
GET DIAGNOSTICS affected = ROW_COUNT;
EXIT WHEN affected = 0;
COMMIT;
PERFORM pg_sleep(0.1); -- Brief pause to reduce load
END LOOP;
END $$;
-- Step 4: Add NOT NULL constraint (scans but doesn't rewrite; on
-- Postgres 12+ the scan is skipped if a validated
-- CHECK (role IS NOT NULL) constraint already exists)
ALTER TABLE users ALTER COLUMN role SET NOT NULL;
MySQL Online DDL and pt-online-schema-change
MySQL's online DDL has improved but still has limitations. For large tables, use pt-online-schema-change or gh-ost:
┌─────────────────────────────────────────────────────────────────────────┐
│ pt-online-schema-change ALGORITHM │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ GOAL: ALTER TABLE orders ADD COLUMN tracking_num VARCHAR(100); │
│ (Table has 500M rows, normal ALTER would take hours with lock) │
│ │
│ STEP 1: CREATE SHADOW TABLE │
│ ──────────────────────────── │
│ CREATE TABLE _orders_new LIKE orders; │
│ ALTER TABLE _orders_new ADD COLUMN tracking_num VARCHAR(100); │
│ │
│ STEP 2: CREATE TRIGGERS │
│ ─────────────────────── │
│ -- Capture all changes to original table │
│ CREATE TRIGGER pt_osc_ins AFTER INSERT ON orders │
│ FOR EACH ROW REPLACE INTO _orders_new (...) VALUES (...); │
│ │
│ CREATE TRIGGER pt_osc_upd AFTER UPDATE ON orders │
│ FOR EACH ROW REPLACE INTO _orders_new (...) VALUES (...); │
│ │
│ CREATE TRIGGER pt_osc_del AFTER DELETE ON orders │
│ FOR EACH ROW DELETE FROM _orders_new WHERE id = OLD.id; │
│ │
│ STEP 3: COPY DATA IN CHUNKS │
│ ─────────────────────────── │
│ -- Copy in small batches (e.g., 1000 rows at a time) │
│ INSERT INTO _orders_new SELECT * FROM orders │
│ WHERE id BETWEEN ? AND ?; │
│ │
│ -- Triggers ensure new writes are captured │
│ -- Copy takes hours but no locking │
│ │
│ STEP 4: ATOMIC SWAP │
│ ─────────────────────── │
│ -- Brief lock for atomic rename │
│ RENAME TABLE orders TO _orders_old, _orders_new TO orders; │
│ │
│ STEP 5: CLEANUP │
│ ─────────────────── │
│ DROP TABLE _orders_old; │
│ -- (MySQL drops one trigger per statement) │
│ DROP TRIGGER pt_osc_ins; DROP TRIGGER pt_osc_upd; DROP TRIGGER pt_osc_del; │
│ │
│ RESULT: Schema changed with only milliseconds of lock time │
│ │
└─────────────────────────────────────────────────────────────────────────┘
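The triggers use REPLACE INTO (an upsert) rather than plain INSERT because the two code paths race: a row can be modified before its chunk is copied, or copied and then modified. Upsert semantics on the trigger side, plus a copy that refuses to clobber rows the trigger already placed (pt-osc's copier uses INSERT LOW_PRIORITY IGNORE for this), make the outcome order-independent. A toy simulation, with Maps standing in for the two tables:

```typescript
type Order = { id: number; total: number };

// Maps stand in for `orders` and `_orders_new`.
const orders = new Map<number, Order>([
  [1, { id: 1, total: 10 }],
  [2, { id: 2, total: 20 }],
]);
const ordersNew = new Map<number, Order>();

// Trigger path: any write to the original is upserted into the shadow.
function onWrite(row: Order): void {
  orders.set(row.id, row);
  ordersNew.set(row.id, row); // REPLACE INTO semantics
}

// Chunk copier: copies existing rows, but must not overwrite a newer
// version already placed by the trigger (INSERT IGNORE semantics).
function copyChunk(ids: number[]): void {
  for (const id of ids) {
    if (!ordersNew.has(id)) {
      const row = orders.get(id);
      if (row) ordersNew.set(id, row);
    }
  }
}
```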
gh-ost: Triggerless Online Migration
# GitHub's gh-ost uses binary log parsing instead of triggers
# Safer for high-write tables (triggers add latency)
gh-ost \
--host=db.example.com \
--database=myapp \
--table=orders \
--alter="ADD COLUMN tracking_num VARCHAR(100)" \
--execute \
--chunk-size=1000 \
--max-load=Threads_running=50 \
--critical-load=Threads_running=100 \
--postpone-cut-over-flag-file=/tmp/gh-ost.postpone
┌─────────────────────────────────────────────────────────────────────────┐
│ gh-ost ALGORITHM │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ KEY DIFFERENCE: No triggers. Reads MySQL binary log instead. │
│ │
│ 1. Create ghost table with new schema │
│ 2. Connect to replica and stream binary log │
│ 3. Copy existing rows from original table (in chunks) │
│ 4. Apply binary log events to ghost table (keeps it in sync) │
│ 5. When copy complete and lag is zero: atomic table swap │
│ │
│ ADVANTAGES OVER pt-osc: │
│ • No triggers means no write latency impact │
│ • Can pause/resume migration │
│ • Can throttle based on replica lag │
│ • Can postpone cut-over until maintenance window │
│ • Can test on replica before running on primary │
│ │
│ BINARY LOG STREAMING: │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Original Table │ Ghost Table │ │
│ │ ────────────── │ ─────────── │ │
│ │ │ │ ▲ │ │
│ │ │ writes │ │ applied │ │
│ │ ▼ │ │ │ │
│ │ Binary Log ────────────────────────►│ │ │
│ │ (binlog stream) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
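The `--max-load` and `--critical-load` flags in the command above amount to a three-way decision on an observed status variable such as Threads_running: below max-load keep copying, between the thresholds pause the row copy, above critical-load abort. A simplified model of that decision (the function name and return values are ours, not gh-ost's):

```typescript
type LoadDecision = 'copy' | 'throttle' | 'abort';

// Simplified model of gh-ost's --max-load / --critical-load checks.
function loadDecision(
  observed: number,       // e.g. current Threads_running
  maxLoad: number,        // e.g. --max-load=Threads_running=50
  criticalLoad: number,   // e.g. --critical-load=Threads_running=100
): LoadDecision {
  if (observed >= criticalLoad) return 'abort';   // bail out of the migration
  if (observed >= maxLoad) return 'throttle';     // pause row copy, keep binlog apply
  return 'copy';
}
```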
Handling Foreign Keys
Foreign keys make migrations much harder:
// Migration strategy for foreign key constraints
interface ForeignKeyMigration {
steps: MigrationStep[];
}
const renameForeignKeyColumn: ForeignKeyMigration = {
steps: [
// 1. Add new column (nullable)
{
type: 'add_column',
sql: 'ALTER TABLE orders ADD COLUMN customer_uuid UUID',
},
// 2. Add FK constraint without validation
{
type: 'add_constraint',
sql: `ALTER TABLE orders ADD CONSTRAINT fk_orders_customer_uuid
FOREIGN KEY (customer_uuid) REFERENCES customers(uuid) NOT VALID`,
note: 'NOT VALID skips validation scan',
},
// 3. Deploy dual-write code
{
type: 'code_deploy',
description: 'Write to both customer_id and customer_uuid',
},
// 4. Backfill data
{
type: 'backfill',
sql: `UPDATE orders o
SET customer_uuid = c.uuid
FROM customers c
WHERE o.customer_id = c.id
AND o.customer_uuid IS NULL`,
batch_size: 10000,
},
// 5. Validate constraint (scans but doesn't lock)
{
type: 'validate_constraint',
sql: 'ALTER TABLE orders VALIDATE CONSTRAINT fk_orders_customer_uuid',
},
// 6. Switch reads to new column
{
type: 'code_deploy',
description: 'Read from customer_uuid',
},
// 7. Make new column NOT NULL
{
type: 'alter_column',
sql: 'ALTER TABLE orders ALTER COLUMN customer_uuid SET NOT NULL',
},
// 8. Drop old FK and column
{
type: 'drop_constraint',
sql: 'ALTER TABLE orders DROP CONSTRAINT fk_orders_customer_id',
},
{
type: 'drop_column',
sql: 'ALTER TABLE orders DROP COLUMN customer_id',
delay: '7 days', // Wait for rollback safety
},
],
};
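Note that the `code_deploy` steps in a plan like this cannot be executed by the migration runner: they are fence posts where an application deploy must ship before the remaining SQL is safe. A small helper can split the plan into SQL batches at those fence posts (step shapes as in the plan above; the helper itself is ours):

```typescript
type MigrationStep =
  | { type: 'code_deploy'; description: string }
  | { type: string; sql: string; note?: string };

// Split a plan into SQL batches, broken at each code deploy.
// Each inner array runs through the migration tool; between batches,
// the corresponding deploy must complete before continuing.
function splitAtDeploys(steps: MigrationStep[]): string[][] {
  const batches: string[][] = [[]];
  for (const step of steps) {
    if (step.type === 'code_deploy') {
      batches.push([]);
    } else if ('sql' in step) {
      batches[batches.length - 1].push(step.sql);
    }
  }
  return batches;
}
```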
Backfill Strategies
Chunked Backfill with Progress Tracking
import { Pool } from 'pg';
interface BackfillConfig {
tableName: string;
setClause: string;
whereClause: string;
batchSize: number;
delayMs: number;
primaryKey: string;
onProgress?: (processed: number, total: number) => void;
}
async function backfillInChunks(pool: Pool, config: BackfillConfig): Promise<void> {
const {
tableName,
setClause,
whereClause,
batchSize,
delayMs,
primaryKey,
onProgress
} = config;
// Get total count for progress reporting
const countResult = await pool.query(
`SELECT COUNT(*) as total FROM ${tableName} WHERE ${whereClause}`
);
const total = parseInt(countResult.rows[0].total, 10);
let processed = 0;
// Get min/max ID for range-based batching
const rangeResult = await pool.query(
`SELECT MIN(${primaryKey}) as min_id, MAX(${primaryKey}) as max_id
FROM ${tableName}
WHERE ${whereClause}`
);
if (!rangeResult.rows[0].min_id) {
console.log('No rows to backfill');
return;
}
// pg returns bigint aggregates as strings; normalize to numbers
let currentId = Number(rangeResult.rows[0].min_id);
const maxId = Number(rangeResult.rows[0].max_id);
console.log(`Backfilling ${total} rows from ID ${currentId} to ${maxId}`);
while (currentId <= maxId) {
const batchStart = currentId;
const batchEnd = currentId + batchSize;
// Update batch
const result = await pool.query(
`UPDATE ${tableName}
SET ${setClause}
WHERE ${primaryKey} >= $1
AND ${primaryKey} < $2
AND ${whereClause}`,
[batchStart, batchEnd]
);
processed += result.rowCount ?? 0;
currentId = batchEnd;
onProgress?.(processed, total);
// Throttle to avoid overwhelming the database
if (delayMs > 0) {
await new Promise(resolve => setTimeout(resolve, delayMs));
}
// Check replication lag (pg_last_xact_replay_timestamp() is only
// meaningful on a standby; on a primary it returns NULL and this
// check becomes a no-op)
const lagResult = await pool.query(
`SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int as lag_seconds`
);
const lagSeconds = lagResult.rows[0]?.lag_seconds ?? 0;
if (lagSeconds > 30) {
console.log(`Replication lag ${lagSeconds}s, pausing...`);
await new Promise(resolve => setTimeout(resolve, 10000));
}
}
console.log(`Backfill complete: ${processed} rows updated`);
}
// Usage
await backfillInChunks(pool, {
tableName: 'users',
setClause: 'full_name = name',
whereClause: 'full_name IS NULL',
batchSize: 5000,
delayMs: 100,
primaryKey: 'id',
onProgress: (processed, total) => {
console.log(`Progress: ${processed}/${total} (${((processed/total)*100).toFixed(1)}%)`);
},
});
Parallel Backfill for Large Tables
import { Pool } from 'pg';
import { Worker, isMainThread, parentPort, workerData } from 'worker_threads';
interface BackfillTask {
startId: number;
endId: number;
connectionString: string;
updateQuery: string;
}
// Main thread: coordinate workers
async function parallelBackfill(
connectionString: string,
tableName: string,
setClause: string,
whereClause: string,
workerCount: number = 4
): Promise<void> {
const pool = new Pool({ connectionString });
// Get ID range
const { rows } = await pool.query(
`SELECT MIN(id) as min_id, MAX(id) as max_id FROM ${tableName} WHERE ${whereClause}`
);
const minId = Number(rows[0].min_id);
const maxId = Number(rows[0].max_id);
// +1 so the inclusive maxId lands inside the last half-open range
const rangeSize = Math.ceil((maxId - minId + 1) / workerCount);
console.log(`Spawning ${workerCount} workers for IDs ${minId}-${maxId}`);
const workers: Promise<number>[] = [];
for (let i = 0; i < workerCount; i++) {
const startId = minId + (i * rangeSize);
const endId = Math.min(startId + rangeSize, maxId + 1);
const task: BackfillTask = {
startId,
endId,
connectionString,
updateQuery: `UPDATE ${tableName} SET ${setClause} WHERE id >= $1 AND id < $2 AND ${whereClause}`,
};
workers.push(runWorker(task));
}
const results = await Promise.all(workers);
const totalUpdated = results.reduce((sum, count) => sum + count, 0);
console.log(`All workers complete. Total updated: ${totalUpdated}`);
await pool.end();
}
function runWorker(task: BackfillTask): Promise<number> {
return new Promise((resolve, reject) => {
const worker = new Worker(__filename, { workerData: task });
worker.on('message', resolve);
worker.on('error', reject);
worker.on('exit', (code) => {
if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
});
});
}
// Worker thread: process assigned range
if (!isMainThread) {
const task = workerData as BackfillTask;
(async () => {
const pool = new Pool({ connectionString: task.connectionString });
let updated = 0;
let currentId = task.startId;
const batchSize = 1000;
while (currentId < task.endId) {
const result = await pool.query(task.updateQuery, [
currentId,
Math.min(currentId + batchSize, task.endId),
]);
updated += result.rowCount ?? 0;
currentId += batchSize;
// Brief pause to reduce contention
await new Promise(resolve => setTimeout(resolve, 50));
}
await pool.end();
parentPort?.postMessage(updated);
})();
}
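Partitioning the ID space is the easiest part of this code to get wrong: with half-open `[start, end)` ranges, the last range must end at `maxId + 1` or the final row is silently skipped. Extracted as a pure helper so it can be tested in isolation:

```typescript
// Split [minId, maxId] (inclusive) into up to `parts` contiguous
// half-open ranges [start, end) that cover every id exactly once.
function splitIdRange(
  minId: number,
  maxId: number,
  parts: number,
): Array<{ start: number; end: number }> {
  const span = maxId - minId + 1;        // +1 because maxId is inclusive
  const size = Math.ceil(span / parts);
  const ranges: Array<{ start: number; end: number }> = [];
  for (let i = 0; i < parts; i++) {
    const start = minId + i * size;
    if (start > maxId) break;            // fewer ids than requested parts
    ranges.push({ start, end: Math.min(start + size, maxId + 1) });
  }
  return ranges;
}
```

Each returned range maps directly onto one worker's `startId`/`endId` pair.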
Handling Rollbacks
┌─────────────────────────────────────────────────────────────────────────┐
│ MIGRATION ROLLBACK STRATEGIES │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ROLLBACK TYPES: │
│ │
│ 1. CODE ROLLBACK (easy) │
│ Schema is forward-compatible │
│ Old code can work with new schema │
│ Just redeploy old code version │
│ │
│ 2. SCHEMA ROLLBACK (hard) │
│ Reverse the schema change │
│ May need to recover dropped data │
│ Expand-contract makes this safer │
│ │
│ EXPAND-CONTRACT ROLLBACK SAFETY: │
│ ───────────────────────────────── │
│ │
│ Phase 1: Expand (add column) │
│ → Rollback: Drop column (no data loss, old code unaffected) │
│ │
│ Phase 2: Dual-write │
│ → Rollback: Deploy single-write code (new column has data, but unused)│
│ │
│ Phase 3: Backfill │
│ → Rollback: Nothing needed (backfill is idempotent) │
│ │
│ Phase 4: Switch reads │
│ → Rollback: Deploy old-read code (data still in old column) │
│ │
│ Phase 5: Stop dual-write │
│ → Rollback: Resume dual-writing (brief gap in new column data) │
│ │
│ Phase 6: Drop old column │
│ → POINT OF NO RETURN - cannot easily rollback │
│ → Keep old column for N days before dropping │
│ │
│ KEEP OLD ARTIFACTS: │
│ • Don't drop columns immediately — wait 7-14 days │
│ • Don't drop tables — rename to _deprecated_tablename │
│ • Don't delete data — soft delete or move to archive │
│ │
└─────────────────────────────────────────────────────────────────────────┘
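The "keep the old column for N days" rule above is easy to enforce mechanically: record when dual-writing stopped, and refuse the destructive DROP until the safety window has elapsed. A minimal check (the 7-day default is this article's suggestion, not a standard):

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// True only if the rollback-safety window has fully elapsed since
// dual-writing to the old column stopped.
function safeToDrop(
  dualWriteStoppedAt: Date,
  now: Date,
  safetyDays = 7,
): boolean {
  return now.getTime() - dualWriteStoppedAt.getTime() >= safetyDays * DAY_MS;
}
```

A migration runner can call this as a gate before emitting the final `DROP COLUMN` statement.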
Migration State Machine
enum MigrationPhase {
NOT_STARTED = 'not_started',
EXPAND = 'expand',
DUAL_WRITE = 'dual_write',
BACKFILL = 'backfill',
SWITCH_READS = 'switch_reads',
SINGLE_WRITE = 'single_write',
CONTRACTED = 'contracted',
ROLLED_BACK = 'rolled_back',
}
interface MigrationState {
name: string;
phase: MigrationPhase;
startedAt: Date;
completedPhases: MigrationPhase[];
error?: string;
}
class MigrationStateMachine {
private state: MigrationState;
constructor(name: string) {
this.state = {
name,
phase: MigrationPhase.NOT_STARTED,
startedAt: new Date(),
completedPhases: [],
};
}
async transition(to: MigrationPhase, action: () => Promise<void>): Promise<void> {
const validTransitions: Record<MigrationPhase, MigrationPhase[]> = {
[MigrationPhase.NOT_STARTED]: [MigrationPhase.EXPAND],
[MigrationPhase.EXPAND]: [MigrationPhase.DUAL_WRITE, MigrationPhase.ROLLED_BACK],
[MigrationPhase.DUAL_WRITE]: [MigrationPhase.BACKFILL, MigrationPhase.EXPAND],
[MigrationPhase.BACKFILL]: [MigrationPhase.SWITCH_READS, MigrationPhase.DUAL_WRITE],
[MigrationPhase.SWITCH_READS]: [MigrationPhase.SINGLE_WRITE, MigrationPhase.BACKFILL],
[MigrationPhase.SINGLE_WRITE]: [MigrationPhase.CONTRACTED, MigrationPhase.SWITCH_READS],
[MigrationPhase.CONTRACTED]: [], // Terminal state
[MigrationPhase.ROLLED_BACK]: [MigrationPhase.EXPAND], // Can retry
};
if (!validTransitions[this.state.phase].includes(to)) {
throw new Error(
`Invalid transition from ${this.state.phase} to ${to}. ` +
`Valid transitions: ${validTransitions[this.state.phase].join(', ')}`
);
}
console.log(`Migration ${this.state.name}: ${this.state.phase} → ${to}`);
try {
await action();
this.state.completedPhases.push(this.state.phase);
this.state.phase = to;
await this.persistState();
} catch (error) {
this.state.error = error instanceof Error ? error.message : 'Unknown error';
console.error(`Migration failed at ${this.state.phase}:`, error);
throw error;
}
}
async rollbackTo(phase: MigrationPhase, action: () => Promise<void>): Promise<void> {
if (!this.state.completedPhases.includes(phase)) {
throw new Error(`Cannot rollback to ${phase}: not in completed phases`);
}
console.log(`Rolling back ${this.state.name} from ${this.state.phase} to ${phase}`);
await action();
this.state.phase = phase;
this.state.completedPhases = this.state.completedPhases.filter(p =>
this.phaseOrder(p) <= this.phaseOrder(phase)
);
await this.persistState();
}
private phaseOrder(phase: MigrationPhase): number {
const order = [
MigrationPhase.NOT_STARTED,
MigrationPhase.EXPAND,
MigrationPhase.DUAL_WRITE,
MigrationPhase.BACKFILL,
MigrationPhase.SWITCH_READS,
MigrationPhase.SINGLE_WRITE,
MigrationPhase.CONTRACTED,
];
return order.indexOf(phase);
}
private async persistState(): Promise<void> {
// Store in a database or file for recovery
// (`db` is an application-level query handle, assumed to be in scope)
await db.query(
`INSERT INTO migration_states (name, phase, completed_phases, updated_at)
VALUES ($1, $2, $3, NOW())
ON CONFLICT (name) DO UPDATE SET
phase = EXCLUDED.phase,
completed_phases = EXCLUDED.completed_phases,
updated_at = NOW()`,
[this.state.name, this.state.phase, JSON.stringify(this.state.completedPhases)]
);
}
}
Testing Migrations
// Migration test framework
import { Pool } from 'pg';
interface MigrationTest {
name: string;
seedData: string;
migration: string;
assertions: (pool: Pool) => Promise<void>;
}
async function testMigration(test: MigrationTest): Promise<void> {
  const testDbName = `test_migration_${Date.now()}`;
  const pool = new Pool({ database: 'postgres' });
  let testPool: Pool | undefined;
  try {
    // Create isolated test database
    await pool.query(`CREATE DATABASE ${testDbName}`);
    testPool = new Pool({ database: testDbName });
    // Apply schema up to migration point
    await testPool.query(test.seedData);
    console.log(`Testing migration: ${test.name}`);
    console.log('Pre-migration data seeded');
    // Apply migration
    await testPool.query(test.migration);
    console.log('Migration applied');
    // Run assertions
    await test.assertions(testPool);
    console.log('All assertions passed');
  } finally {
    // Close test connections first: DROP DATABASE fails while
    // sessions are still connected to it
    await testPool?.end();
    await pool.query(`DROP DATABASE IF EXISTS ${testDbName}`);
    await pool.end();
  }
}
// Example test
const renameColumnTest: MigrationTest = {
name: 'rename_name_to_full_name',
seedData: `
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
email VARCHAR(255)
);
INSERT INTO users (name, email) VALUES
('Alice Smith', 'alice@example.com'),
('Bob Jones', 'bob@example.com');
`,
migration: `
-- Expand: add the new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
-- Backfill: copy old column into new
UPDATE users SET full_name = name WHERE full_name IS NULL;
-- Tighten: enforce NOT NULL once backfilled
ALTER TABLE users ALTER COLUMN full_name SET NOT NULL;
`,
assertions: async (pool: Pool) => {
// Verify column exists
const { rows: columns } = await pool.query(`
SELECT column_name, is_nullable
FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'full_name'
`);
if (columns.length === 0) {
throw new Error('full_name column not created');
}
if (columns[0].is_nullable !== 'NO') {
throw new Error('full_name should be NOT NULL');
}
// Verify data migrated
const { rows: users } = await pool.query(
'SELECT name, full_name FROM users ORDER BY id'
);
if (users[0].full_name !== 'Alice Smith') {
throw new Error('Data not migrated correctly');
}
// Verify old data preserved (for rollback)
if (users[0].name !== users[0].full_name) {
throw new Error('Old column data corrupted');
}
},
};
await testMigration(renameColumnTest);
Monitoring Migrations in Production
// Migration observability
interface MigrationMetrics {
name: string;
phase: string;
rowsProcessed: number;
rowsRemaining: number;
estimatedCompletionTime: Date | null;
errorCount: number;
replicationLagSeconds: number;
lockWaitCount: number;
}
async function monitorMigration(pool: Pool, migrationName: string): Promise<MigrationMetrics> {
// Get migration progress from tracking table
const { rows: [progress] } = await pool.query(
`SELECT * FROM migration_progress WHERE name = $1`,
[migrationName]
);
// Check replication lag
const { rows: [lag] } = await pool.query(
`SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int as lag`
);
// Check for lock waits
// Check for lock waits (parameterized rather than interpolated into SQL)
const { rows: [locks] } = await pool.query(
  `SELECT COUNT(*) as lock_count
   FROM pg_stat_activity
   WHERE wait_event_type = 'Lock'
     AND query LIKE '%' || $1 || '%'`,
  [migrationName]
);
// Estimate completion time (guard the division: no elapsed time or
// no progress yet means no meaningful estimate)
const elapsedSeconds =
  (Date.now() - new Date(progress.started_at).getTime()) / 1000;
const rowsPerSecond =
  elapsedSeconds > 0 ? progress.rows_processed / elapsedSeconds : 0;
return {
  name: migrationName,
  phase: progress.phase,
  rowsProcessed: progress.rows_processed,
  rowsRemaining: progress.rows_remaining,
  estimatedCompletionTime: rowsPerSecond > 0
    ? new Date(Date.now() + (progress.rows_remaining / rowsPerSecond) * 1000)
    : null,
errorCount: progress.error_count,
replicationLagSeconds: lag?.lag ?? 0,
lockWaitCount: parseInt(locks?.lock_count ?? '0', 10),
};
}
// Alert if migration is unhealthy
function checkMigrationHealth(metrics: MigrationMetrics): string[] {
const alerts: string[] = [];
if (metrics.replicationLagSeconds > 60) {
alerts.push(`High replication lag: ${metrics.replicationLagSeconds}s`);
}
if (metrics.lockWaitCount > 0) {
alerts.push(`Lock contention detected: ${metrics.lockWaitCount} waiting`);
}
if (metrics.errorCount > 100) {
alerts.push(`High error count: ${metrics.errorCount}`);
}
// Detect stalled migration
if (metrics.rowsProcessed === 0 && metrics.phase !== 'not_started') {
alerts.push('Migration appears stalled: no progress');
}
return alerts;
}
Conclusion
Zero-downtime schema migrations require:
- Expand-contract pattern — Never break running code, always add before removing
- Online DDL knowledge — Know which operations lock and which don't
- Chunked backfills — Never UPDATE millions of rows in one transaction
- Replication awareness — Monitor lag, throttle when needed
- State tracking — Know exactly which phase each migration is in
- Tested rollbacks — Every phase should be reversible
The tooling matters: pt-online-schema-change and gh-ost for MySQL, native online DDL for PostgreSQL. But the pattern matters more — even with perfect tooling, you need expand-contract discipline to keep code and schema in sync.
The safest migration is the one you don't have to do. Design schemas for evolution: use UUIDs instead of auto-increment, use JSON columns for frequently-changing attributes, and think about how you'll change things before you ship v1.