Designing a Feature Flag System That Doesn't Become a Graveyard
Beyond if (flag.enabled) — architectural patterns for flag lifecycle management, targeting rules, gradual rollouts, and how to build cleanup discipline into your team's engineering culture before flags become permanent tech debt.
The Graveyard Problem
Every codebase has them. Feature flags created three years ago. Nobody knows if they're safe to remove. The original author left. The flag name is cryptic. There's no documentation. The code has seventeen branches that reference it.
// Actual code from a real production system
if (flags.get('new_checkout_flow_v2_temp_rollback_FINAL_2')) {
  if (flags.get('checkout_experiment_holdout_group')) {
    if (!flags.get('disable_new_checkout_for_enterprise')) {
      // The actual feature, buried under three years of flag accumulation
    }
  }
}
Feature flags start as a best practice: ship safely, test in production, control rollouts. They end as technical debt that compounds faster than anyone admits.
┌─────────────────────────────────────────────────────────────────────────────┐
│ THE FLAG LIFECYCLE REALITY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ INTENDED: │
│ ───────── │
│ Create flag → Develop feature → Roll out → Remove flag │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ 1 day 2 weeks 1 week 1 day │
│ │
│ ACTUAL: │
│ ─────── │
│ Create flag → Develop feature → Roll out → "We'll clean it up later" │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ 1 day 2 weeks 1 week ∞ (never) │
│ │
│ Result after 3 years: │
│ • 847 feature flags in codebase │
│ • 200+ are "temporary" (created > 1 year ago) │
│ • 50+ reference flags that no longer exist in config │
│ • 12 circular flag dependencies │
│ • 0 documentation on what most flags do │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
This post is about building a feature flag system that prevents this outcome through architecture, not discipline alone.
Flag Taxonomy: Not All Flags Are Equal
The first mistake is treating all flags the same. Different flag types have different lifecycles and cleanup requirements.
┌─────────────────────────────────────────────────────────────────────────────┐
│ FEATURE FLAG TAXONOMY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ TYPE LIFESPAN CLEANUP EXAMPLE │
│ ───────────────────────────────────────────────────────────────────────── │
│ │
│ RELEASE FLAG Days-Weeks Mandatory new_checkout_ui │
│ └── Control feature rollout │
│ └── MUST be removed after 100% rollout │
│ └── Temporary by definition │
│ │
│ EXPERIMENT FLAG Weeks-Months Mandatory pricing_page_variant_b │
│ └── A/B testing, measured outcomes │
│ └── MUST be removed after experiment concludes │
│ └── Winner becomes default, loser code deleted │
│ │
│ OPS FLAG Permanent Never enable_read_replicas │
│ └── Operational controls, circuit breakers │
│ └── Intended to live forever │
│ └── Kill switches, degradation modes │
│ │
│ PERMISSION FLAG Permanent Conditional enable_enterprise_sso │
│ └── Entitlement gating, plan features │
│ └── Lives as long as business model requires │
│ └── May become default, then removable │
│ │
│ The problem: Most codebases treat all four types identically │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Encoding Type in the Flag Definition
// Bad: No lifecycle information
const flags = {
new_checkout: true,
pricing_experiment: true,
enable_cache: true,
};
// Good: Explicit type and lifecycle
interface FlagDefinition {
key: string;
type: 'release' | 'experiment' | 'ops' | 'permission';
description: string;
owner: string; // Team or individual
createdAt: Date;
expiresAt?: Date; // Required for release/experiment
jiraTicket?: string; // Cleanup tracking
cleanupDeadline?: Date; // Automated reminders
defaultValue: boolean;
rules: TargetingRule[];
}
const flags: FlagDefinition[] = [
{
key: 'new_checkout_ui',
type: 'release',
description: 'New streamlined checkout flow with fewer steps',
owner: 'checkout-team',
createdAt: new Date('2024-01-15'),
expiresAt: new Date('2024-02-28'), // Hard deadline
cleanupDeadline: new Date('2024-03-15'), // Grace period
jiraTicket: 'CHECKOUT-1234',
defaultValue: false,
rules: [
{ segment: 'internal', value: true },
{ segment: 'beta_users', value: true },
{ percentage: 25, value: true },
],
},
];
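Lifecycle fields only help if something enforces them. A minimal validator sketch (the `validateFlagDefinition` name and its field subset are illustrative, not from any particular SDK) that CI could run against every new definition:

```typescript
// Hypothetical registration-time guard: temporary flag types must
// carry the lifecycle fields the taxonomy requires.
type FlagType = 'release' | 'experiment' | 'ops' | 'permission';

interface FlagLifecycleFields {
  key: string;
  type: FlagType;
  owner: string;
  expiresAt?: Date;
  jiraTicket?: string;
}

function validateFlagDefinition(flag: FlagLifecycleFields): string[] {
  const errors: string[] = [];
  const temporary = flag.type === 'release' || flag.type === 'experiment';

  if (temporary && !flag.expiresAt) {
    errors.push(`${flag.key}: ${flag.type} flags must set expiresAt`);
  }
  if (temporary && !flag.jiraTicket) {
    errors.push(`${flag.key}: ${flag.type} flags must link a cleanup ticket`);
  }
  if (!flag.owner) {
    errors.push(`${flag.key}: owner is required`);
  }
  return errors;
}
```

Wired into CI, this turns "flags must have expiration dates" from a wiki rule into a failing build.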
Architecture: The Flag Service
A well-designed flag system has clear boundaries.
┌─────────────────────────────────────────────────────────────────────────────┐
│ FLAG SYSTEM ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────┐ │
│ │ Flag Dashboard │ │
│ │ (Create, Edit, View) │ │
│ └───────────┬─────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ FLAG SERVICE (API) │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ Definition │ │ Targeting │ │ Lifecycle │ │ Audit │ │ │
│ │ │ Management │ │ Engine │ │ Manager │ │ Logger │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Flag Store │ │ Edge Cache │ │
│ │ (Postgres) │ │ (Redis/CDN)│ │
│ └─────────────┘ └─────────────┘ │
│ │ │
│ ┌────────────────────┼────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Backend │ │ Frontend │ │ Mobile │ │
│ │ SDK │ │ SDK │ │ SDK │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
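The SDKs read through the edge cache rather than hitting the flag service on every evaluation. A sketch of that read path, with assumed `Cache` and `FlagApi` interfaces (a real service defines its own); note the fail-closed default when both layers are unavailable:

```typescript
// Read-through cache sketch for the SDK side of the diagram.
interface FlagSnapshot { enabled: boolean; defaultValue: boolean; }

interface Cache {
  get(key: string): Promise<FlagSnapshot | null>;
  set(key: string, value: FlagSnapshot, ttlSeconds: number): Promise<void>;
}

interface FlagApi {
  fetch(key: string): Promise<FlagSnapshot | null>;
}

class CachedFlagReader {
  constructor(private cache: Cache, private api: FlagApi) {}

  async read(key: string): Promise<FlagSnapshot> {
    const cached = await this.cache.get(`flag:${key}`);
    if (cached) return cached;

    try {
      const fresh = await this.api.fetch(key);
      if (fresh) {
        // Short TTL keeps dashboard edits visible within a minute.
        await this.cache.set(`flag:${key}`, fresh, 60);
        return fresh;
      }
    } catch {
      // Flag service down; fall through to the safe default.
    }
    return { enabled: false, defaultValue: false }; // fail closed
  }
}
```

The design choice worth copying is the last line: when the flag infrastructure is unreachable, evaluation degrades to a known-safe value instead of throwing in the request path.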
Flag Store Schema
-- Core flag definition
CREATE TABLE feature_flags (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
key VARCHAR(255) UNIQUE NOT NULL,
type VARCHAR(50) NOT NULL CHECK (type IN ('release', 'experiment', 'ops', 'permission')),
description TEXT NOT NULL,
owner VARCHAR(255) NOT NULL,
-- State
enabled BOOLEAN NOT NULL DEFAULT false,
default_value BOOLEAN NOT NULL DEFAULT false,
-- Lifecycle
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
expires_at TIMESTAMPTZ, -- NULL for permanent flags
cleanup_deadline TIMESTAMPTZ,
-- Tracking
jira_ticket VARCHAR(100),
cleanup_ticket VARCHAR(100), -- Created when flag nears expiry
removed_at TIMESTAMPTZ, -- Soft delete
-- Metadata
tags VARCHAR(255)[] DEFAULT '{}',
CONSTRAINT release_must_expire CHECK (
type != 'release' OR expires_at IS NOT NULL
),
CONSTRAINT experiment_must_expire CHECK (
type != 'experiment' OR expires_at IS NOT NULL
)
);
-- Targeting rules
CREATE TABLE flag_rules (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flag_id UUID NOT NULL REFERENCES feature_flags(id),
priority INT NOT NULL, -- Lower = evaluated first
-- Conditions
segment VARCHAR(255), -- Named segment (e.g., 'beta_users')
user_ids UUID[], -- Specific users
percentage INT CHECK (percentage BETWEEN 0 AND 100),
attribute_rules JSONB, -- Complex attribute matching
-- Result
value BOOLEAN NOT NULL,
UNIQUE (flag_id, priority)
);
-- Audit log
CREATE TABLE flag_audit_log (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flag_id UUID NOT NULL REFERENCES feature_flags(id),
action VARCHAR(50) NOT NULL,
actor VARCHAR(255) NOT NULL,
previous_state JSONB,
new_state JSONB,
reason TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Cleanup tracking
CREATE TABLE flag_cleanup_tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flag_id UUID NOT NULL REFERENCES feature_flags(id),
status VARCHAR(50) NOT NULL DEFAULT 'pending',
assigned_to VARCHAR(255),
ticket_id VARCHAR(100),
reminder_sent_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Targeting Engine: Beyond Simple Booleans
Rule Evaluation Order
┌─────────────────────────────────────────────────────────────────────────────┐
│ FLAG EVALUATION FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ evaluateFlag(flagKey, context) → boolean │
│ │ │
│ ▼ │
│ 1. Flag enabled globally? │
│ │ │
│ ├── No ──▶ Return defaultValue │
│ │ │
│ ▼ Yes │
│ 2. User in explicit allow/deny list? │
│ │ │
│ ├── Allow list ──▶ Return true │
│ ├── Deny list ──▶ Return false │
│ │ │
│ ▼ Not in list │
│ 3. User matches segment rules? (in priority order) │
│ │ │
│ ├── Matches segment X ──▶ Return segment X value │
│ │ │
│ ▼ No segment match │
│ 4. User matches attribute rules? │
│ │ │
│ ├── Matches rule ──▶ Return rule value │
│ │ │
│ ▼ No attribute match │
│ 5. Percentage rollout? │
│ │ │
│ ├── hash(userId + flagKey) % 100 < percentage ──▶ Return true │
│ │ │
│ ▼ Outside percentage │
│ 6. Return defaultValue │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Implementation
import { createHash } from 'crypto'; // used by hashToBucket below

interface EvaluationContext {
userId: string;
userAttributes: Record<string, unknown>;
sessionId?: string;
requestId?: string;
}
interface FlagEvaluationResult {
value: boolean;
reason: 'default' | 'disabled' | 'user_list' | 'segment' | 'attribute' | 'percentage';
ruleId?: string;
flagVersion: number;
}
class FlagEvaluator {
constructor(
private flagStore: FlagStore,
private segmentStore: SegmentStore,
) {}
async evaluate(flagKey: string, context: EvaluationContext): Promise<FlagEvaluationResult> {
const flag = await this.flagStore.get(flagKey);
if (!flag) {
return { value: false, reason: 'default', flagVersion: 0 };
}
// 1. Global kill switch
if (!flag.enabled) {
return { value: flag.defaultValue, reason: 'disabled', flagVersion: flag.version };
}
    // 2. Evaluate rules in priority order (Array.prototype.sort mutates, so sort a copy)
    const rules = [...flag.rules].sort((a, b) => a.priority - b.priority);
    for (const rule of rules) {
      const match = await this.evaluateRule(rule, context, flag.key);
      if (match.matches) {
        return {
          value: rule.value,
          reason: match.reason,
          ruleId: rule.id,
          flagVersion: flag.version,
        };
      }
    }
    // 3. Default
    return { value: flag.defaultValue, reason: 'default', flagVersion: flag.version };
  }
  private async evaluateRule(
    rule: FlagRule,
    context: EvaluationContext,
    flagKey: string // passed in because rules do not carry their flag's key
  ): Promise<{ matches: boolean; reason: FlagEvaluationResult['reason'] }> {
    // Explicit user list
    if (rule.userIds?.includes(context.userId)) {
      return { matches: true, reason: 'user_list' };
    }
    // Segment membership
    if (rule.segment) {
      const segment = await this.segmentStore.get(rule.segment);
      if (segment && this.userInSegment(context, segment)) {
        return { matches: true, reason: 'segment' };
      }
    }
    // Attribute rules
    if (rule.attributeRules) {
      if (this.evaluateAttributeRules(rule.attributeRules, context.userAttributes)) {
        return { matches: true, reason: 'attribute' };
      }
    }
    // Percentage rollout
    if (rule.percentage !== undefined) {
      const bucket = this.hashToBucket(context.userId, flagKey);
      if (bucket < rule.percentage) {
        return { matches: true, reason: 'percentage' };
      }
    }
    return { matches: false, reason: 'default' };
  }
// Deterministic bucketing — same user always gets same bucket
private hashToBucket(userId: string, flagKey: string): number {
const hash = createHash('sha256')
.update(`${userId}:${flagKey}`)
.digest('hex');
const hashInt = parseInt(hash.substring(0, 8), 16);
return hashInt % 100;
}
private userInSegment(context: EvaluationContext, segment: Segment): boolean {
// Segment can define rules based on attributes
return this.evaluateAttributeRules(segment.rules, context.userAttributes);
}
private evaluateAttributeRules(
rules: AttributeRule[],
attributes: Record<string, unknown>
): boolean {
return rules.every((rule) => {
const value = attributes[rule.attribute];
switch (rule.operator) {
case 'eq': return value === rule.value;
case 'neq': return value !== rule.value;
case 'gt': return (value as number) > (rule.value as number);
case 'gte': return (value as number) >= (rule.value as number);
case 'lt': return (value as number) < (rule.value as number);
case 'lte': return (value as number) <= (rule.value as number);
case 'in': return (rule.value as unknown[]).includes(value);
case 'contains': return String(value).includes(rule.value as string);
case 'regex': return new RegExp(rule.value as string).test(String(value));
default: return false;
}
});
}
}
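The percentage step of the evaluator hinges on `hashToBucket` being deterministic and roughly uniform. A standalone version of that helper plus a quick sanity check:

```typescript
import { createHash } from 'crypto';

// Same logic as the evaluator's private helper: hash userId + flagKey
// into a stable bucket 0-99.
function hashToBucket(userId: string, flagKey: string): number {
  const hash = createHash('sha256').update(`${userId}:${flagKey}`).digest('hex');
  return parseInt(hash.substring(0, 8), 16) % 100;
}

// Same inputs always land in the same bucket...
const first = hashToBucket('user-123', 'new_checkout_ui');
const second = hashToBucket('user-123', 'new_checkout_ui');

// ...and across many users the buckets spread evenly, so a 25% rule
// really does admit roughly a quarter of the population.
let inRollout = 0;
for (let i = 0; i < 10_000; i++) {
  if (hashToBucket(`user-${i}`, 'new_checkout_ui') < 25) inRollout++;
}
```

Including the flag key in the hash also means one user's buckets are independent across flags, so a user in the 1% for one experiment isn't automatically in the 1% for every experiment.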
Gradual Rollout Patterns
Pattern 1: Percentage Ramp
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERCENTAGE ROLLOUT SCHEDULE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Day 1: Internal only (segment: 'employees') │
│ │
│ Day 3: 1% of users (catch obvious issues) │
│ ├── Monitor: error rates, latency, user feedback │
│ └── Rollback criteria: >1% error rate increase │
│ │
│ Day 5: 5% of users │
│ ├── Monitor: business metrics (conversion, engagement) │
│ └── Rollback criteria: >2% conversion drop │
│ │
│ Day 7: 25% of users │
│ ├── Monitor: support tickets, social media │
│ └── Statistical significance for experiments │
│ │
│ Day 10: 50% of users │
│ └── Validate at scale │
│ │
│ Day 14: 100% of users │
│ └── Flag becomes candidate for removal │
│ │
│ Day 21: Flag removed from codebase │
│ └── Old code path deleted │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
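A schedule like this can live as data, so a scheduler advances the rollout instead of a person editing percentages by hand. A sketch using the checkout numbers from the table above; the `RampStep` shape and field names are illustrative:

```typescript
// Percentage ramp as data: a cron job or lifecycle manager can look up
// the stage in effect and apply it, subject to the rollback criteria.
interface RampStep {
  day: number;
  percentage: number;        // 0 means segment-only stage
  segment?: string;
  rollbackCriteria: string;  // human-readable guardrail
}

const checkoutRamp: RampStep[] = [
  { day: 1, percentage: 0, segment: 'employees', rollbackCriteria: 'any crash' },
  { day: 3, percentage: 1, rollbackCriteria: '>1% error rate increase' },
  { day: 5, percentage: 5, rollbackCriteria: '>2% conversion drop' },
  { day: 7, percentage: 25, rollbackCriteria: 'support ticket spike' },
  { day: 10, percentage: 50, rollbackCriteria: 'any regression at scale' },
  { day: 14, percentage: 100, rollbackCriteria: 'none; cleanup begins' },
];

// Given days since rollout start, return the step currently in effect.
function currentStep(ramp: RampStep[], daysElapsed: number): RampStep {
  let active = ramp[0];
  for (const step of ramp) {
    if (step.day <= daysElapsed) active = step;
  }
  return active;
}
```

Keeping the schedule declarative also makes it reviewable in the same PR that creates the flag.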
Pattern 2: Ring-Based Deployment
// Deployment rings — each ring gets the feature before the next
const rings = {
ring0: {
name: 'Canary',
segments: ['internal_testers'],
percentage: 0, // Segment-based, not percentage
minDuration: '24h', // Must stay in ring for 24h minimum
successCriteria: {
errorRate: { max: 0.001 },
latencyP99: { max: 500 },
},
},
ring1: {
name: 'Early Adopters',
segments: ['beta_users'],
percentage: 5,
minDuration: '48h',
successCriteria: {
errorRate: { max: 0.005 },
latencyP99: { max: 600 },
npsScore: { min: 30 },
},
},
ring2: {
name: 'Broad',
segments: [],
percentage: 25,
minDuration: '72h',
successCriteria: {
errorRate: { max: 0.01 },
conversionRate: { minDelta: -0.02 }, // No more than 2% drop
},
},
ring3: {
name: 'General Availability',
segments: [],
percentage: 100,
minDuration: '168h', // 1 week before cleanup eligible
successCriteria: {}, // Monitoring only
},
};
// Automated progression
class RolloutManager {
async checkProgression(flagKey: string): Promise<ProgressionDecision> {
const flag = await this.flagStore.get(flagKey);
const currentRing = this.getCurrentRing(flag);
const metrics = await this.metricsService.getForFlag(flagKey, currentRing.minDuration);
// Check if we've met duration requirement
if (!this.durationMet(flag, currentRing)) {
return { action: 'wait', reason: 'Duration not met', nextCheckIn: '1h' };
}
// Check success criteria
const criteriaCheck = this.checkCriteria(metrics, currentRing.successCriteria);
if (!criteriaCheck.passed) {
return {
action: 'alert',
reason: `Criteria failed: ${criteriaCheck.failedCriteria.join(', ')}`,
recommendation: metrics.severity === 'high' ? 'rollback' : 'investigate',
};
}
// Ready to progress
const nextRing = this.getNextRing(currentRing);
if (!nextRing) {
return {
action: 'complete',
reason: 'Rollout complete, flag ready for cleanup',
};
}
return {
action: 'progress',
reason: 'All criteria met',
nextRing: nextRing.name,
};
}
}
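The `checkCriteria` call above carries the real decision logic. A minimal version, assuming metrics arrive as a flat map of named values and criteria use the `max`/`min`/`minDelta` bounds from the ring definitions:

```typescript
// Bounds mirror the ring definitions: max for error/latency ceilings,
// min for score floors, minDelta for "no worse than X vs. control".
interface CriteriaBounds { max?: number; min?: number; minDelta?: number; }

function checkCriteria(
  metrics: Record<string, number>,
  criteria: Record<string, CriteriaBounds>,
): { passed: boolean; failedCriteria: string[] } {
  const failed: string[] = [];
  for (const [name, bounds] of Object.entries(criteria)) {
    const value = metrics[name];
    if (value === undefined) {
      // Missing data fails closed: no metrics, no progression.
      failed.push(`${name} (no data)`);
      continue;
    }
    if (bounds.max !== undefined && value > bounds.max) failed.push(name);
    if (bounds.min !== undefined && value < bounds.min) failed.push(name);
    // minDelta: metric is a delta vs. control and must not drop below it.
    if (bounds.minDelta !== undefined && value < bounds.minDelta) failed.push(name);
  }
  return { passed: failed.length === 0, failedCriteria: failed };
}
```

Failing closed on missing data matters: a broken metrics pipeline should pause a rollout, not silently wave it through.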
Pattern 3: Sticky Bucketing
Users should get a consistent experience. If they're in the 25% that sees the new feature, they should always see it (until 100% rollout).
// Bucketing service — ensures consistency
import { createHash } from 'crypto';

class BucketingService {
constructor(private cache: BucketCache) {}
async getBucket(userId: string, flagKey: string): Promise<number> {
// Check if user has existing bucket assignment
const cacheKey = `bucket:${flagKey}:${userId}`;
const existing = await this.cache.get(cacheKey);
if (existing !== null) {
return existing;
}
// Generate deterministic bucket (0-99)
const bucket = this.hashToBucket(userId, flagKey);
// Cache the assignment so behavior stays stable even if the hashing scheme changes later
await this.cache.set(cacheKey, bucket, { ttl: '30d' });
return bucket;
}
// When rolling back, we need to respect original buckets
// User in bucket 15 who saw feature at 25% rollout
// should NOT see feature if we roll back to 10%
async isInRollout(userId: string, flagKey: string, percentage: number): Promise<boolean> {
const bucket = await this.getBucket(userId, flagKey);
return bucket < percentage;
}
private hashToBucket(userId: string, flagKey: string): number {
// Cryptographic hash ensures uniform distribution
const hash = createHash('sha256')
.update(`${userId}:${flagKey}`)
.digest('hex');
return parseInt(hash.substring(0, 8), 16) % 100;
}
}
Lifecycle Management: Building Cleanup Into the System
Automatic Expiration Enforcement
// Lifecycle manager — runs on schedule (date helpers from date-fns)
import { addDays, differenceInDays } from 'date-fns';

class FlagLifecycleManager {
async runDailyCheck(): Promise<void> {
const now = new Date();
// 1. Find flags approaching expiration
const expiringFlags = await this.flagStore.findWhere({
expiresAt: { gte: now, lte: addDays(now, 14) },
cleanupTicket: null,
});
for (const flag of expiringFlags) {
await this.createCleanupReminder(flag);
}
// 2. Find expired flags still in code
const expiredFlags = await this.flagStore.findWhere({
expiresAt: { lt: now },
removedAt: null,
});
for (const flag of expiredFlags) {
await this.handleExpiredFlag(flag);
}
// 3. Find orphaned flags (in DB but not in code)
const orphanedFlags = await this.findOrphanedFlags();
for (const flag of orphanedFlags) {
await this.cleanupOrphanedFlag(flag);
}
}
private async createCleanupReminder(flag: FeatureFlag): Promise<void> {
// Create Jira ticket automatically
const ticket = await this.jira.createTicket({
project: 'TECH-DEBT',
type: 'Task',
summary: `Remove feature flag: ${flag.key}`,
      description: await this.generateCleanupDescription(flag),
assignee: flag.owner,
dueDate: flag.expiresAt,
labels: ['feature-flag-cleanup', 'auto-generated'],
});
// Update flag with cleanup ticket
await this.flagStore.update(flag.id, { cleanupTicket: ticket.key });
// Notify owner
await this.slack.sendMessage({
channel: `@${flag.owner}`,
text: `🚩 Feature flag \`${flag.key}\` expires on ${flag.expiresAt.toDateString()}. Cleanup ticket created: ${ticket.url}`,
});
}
private async handleExpiredFlag(flag: FeatureFlag): Promise<void> {
const daysPastExpiration = differenceInDays(new Date(), flag.expiresAt);
if (daysPastExpiration > 30) {
// Force removal — this flag is way overdue
await this.forceDisableFlag(flag);
await this.escalateToManager(flag);
} else if (daysPastExpiration > 14) {
// Escalate
await this.slack.sendMessage({
channel: '#engineering-leads',
text: `⚠️ Feature flag \`${flag.key}\` is ${daysPastExpiration} days past expiration. Owner: ${flag.owner}`,
});
} else if (daysPastExpiration > 7) {
// Daily reminder to owner
await this.sendDailyReminder(flag);
}
}
private async findOrphanedFlags(): Promise<FeatureFlag[]> {
// Query codebase for flag references
const codebaseFlags = await this.codeScanner.findFlagReferences();
const dbFlags = await this.flagStore.findAll({ removedAt: null });
return dbFlags.filter((dbFlag) => !codebaseFlags.has(dbFlag.key));
}
  private async generateCleanupDescription(flag: FeatureFlag): Promise<string> {
return `
## Feature Flag Cleanup
**Flag Key:** \`${flag.key}\`
**Type:** ${flag.type}
**Created:** ${flag.createdAt.toDateString()}
**Expires:** ${flag.expiresAt?.toDateString() ?? 'Never'}
**Description:** ${flag.description}
### Current State
- Enabled: ${flag.enabled}
- Current rollout: ${this.calculateCurrentRollout(flag)}%
### Cleanup Steps
1. Verify flag is at 100% rollout (or 0% if abandoned)
2. Remove all code references to \`${flag.key}\`
3. Delete the losing code path (if applicable)
4. Remove flag from database
5. Update this ticket when complete
### Code References
${await this.generateCodeReferences(flag)}
### Related Tickets
- Original implementation: ${flag.jiraTicket ?? 'Unknown'}
`;
}
}
Code Scanner: Find Flag References
// Static analysis to find flag usage (Babel parser + traverse)
import { glob } from 'glob';
import fs from 'fs/promises';
import { parse } from '@babel/parser';
import traverse, { NodePath } from '@babel/traverse';
import type { CallExpression } from '@babel/types';

class FlagCodeScanner {
async findFlagReferences(): Promise<Map<string, CodeReference[]>> {
const references = new Map<string, CodeReference[]>();
// Scan TypeScript/JavaScript files
const files = await glob('src/**/*.{ts,tsx,js,jsx}');
for (const file of files) {
const content = await fs.readFile(file, 'utf-8');
const ast = parse(content, {
sourceType: 'module',
plugins: ['typescript', 'jsx'],
});
traverse(ast, {
CallExpression: (path) => {
// Match patterns like: flags.get('flag_key'), useFlag('flag_key'), etc.
if (this.isFlagAccess(path)) {
const flagKey = this.extractFlagKey(path);
if (flagKey) {
            const existing = references.get(flagKey) ?? [];
            existing.push({
              flagKey, // carried along so downstream tools know which flag this reference belongs to
              file,
              line: path.node.loc?.start.line ?? 0,
              column: path.node.loc?.start.column ?? 0,
              snippet: this.getCodeSnippet(content, path.node.loc),
            });
            references.set(flagKey, existing);
}
}
},
});
}
return references;
}
private isFlagAccess(path: NodePath<CallExpression>): boolean {
const callee = path.node.callee;
// flags.get(), flags.isEnabled(), etc.
if (
callee.type === 'MemberExpression' &&
callee.object.type === 'Identifier' &&
['flags', 'featureFlags', 'features'].includes(callee.object.name)
) {
return true;
}
// useFlag(), useFeatureFlag(), etc.
if (
callee.type === 'Identifier' &&
['useFlag', 'useFeatureFlag', 'getFlag'].includes(callee.name)
) {
return true;
}
return false;
}
}
Flag Removal Automation
// Generate code removal suggestions
import fs from 'fs/promises';

class FlagRemovalAssistant {
async generateRemovalPlan(flagKey: string): Promise<RemovalPlan> {
const references = await this.scanner.findFlagReferences();
const flagRefs = references.get(flagKey) ?? [];
if (flagRefs.length === 0) {
return {
status: 'no_references',
message: 'Flag has no code references — safe to delete from database',
};
}
const plan: RemovalStep[] = [];
for (const ref of flagRefs) {
const analysis = await this.analyzeReference(ref);
plan.push({
file: ref.file,
line: ref.line,
action: analysis.action,
beforeCode: analysis.beforeCode,
afterCode: analysis.afterCode,
confidence: analysis.confidence,
});
}
return {
status: 'removal_plan_generated',
steps: plan,
estimatedChanges: plan.length,
filesAffected: new Set(plan.map((s) => s.file)).size,
canAutoRemove: plan.every((s) => s.confidence === 'high'),
};
}
private async analyzeReference(ref: CodeReference): Promise<ReferenceAnalysis> {
const content = await fs.readFile(ref.file, 'utf-8');
const lines = content.split('\n');
// Simple pattern: if (flags.get('key')) { ... }
const line = lines[ref.line - 1];
if (line.includes('if') && line.includes('flags.get')) {
// Find the matching block
const block = this.findConditionalBlock(lines, ref.line - 1);
// Determine if we should keep the truthy or falsy branch
const flagValue = await this.getFlagFinalValue(ref.flagKey);
return {
action: 'remove_conditional',
beforeCode: block.fullCode,
afterCode: flagValue ? block.truthyBranch : block.falsyBranch,
confidence: block.hasElse ? 'high' : 'medium',
};
}
// Complex pattern — needs manual review
return {
action: 'manual_review',
beforeCode: this.getCodeContext(lines, ref.line - 1),
afterCode: '// TODO: Manual cleanup required',
confidence: 'low',
};
}
}
Observability: Know Your Flags
Metrics to Track
// Flag evaluation metrics (prom-client)
import { Histogram, Counter, Gauge } from 'prom-client';

const flagMetrics = {
// Evaluation performance
evaluationDuration: new Histogram({
name: 'feature_flag_evaluation_duration_ms',
help: 'Time to evaluate a feature flag',
labelNames: ['flag_key', 'result', 'reason'],
buckets: [0.1, 0.5, 1, 5, 10, 50],
}),
// Usage tracking
evaluationCount: new Counter({
name: 'feature_flag_evaluations_total',
help: 'Total feature flag evaluations',
labelNames: ['flag_key', 'result', 'reason'],
}),
// Exposure tracking (for experiments)
exposureCount: new Counter({
name: 'feature_flag_exposures_total',
help: 'Users exposed to each flag variant',
labelNames: ['flag_key', 'variant'],
}),
// Error tracking
evaluationErrors: new Counter({
name: 'feature_flag_evaluation_errors_total',
help: 'Feature flag evaluation errors',
labelNames: ['flag_key', 'error_type'],
}),
// Staleness
flagAge: new Gauge({
name: 'feature_flag_age_days',
help: 'Days since flag was created',
labelNames: ['flag_key', 'type', 'owner'],
}),
// Cleanup debt
expiredFlags: new Gauge({
name: 'feature_flags_expired_total',
help: 'Number of expired flags still in codebase',
labelNames: ['owner'],
}),
};
// Evaluation with metrics
async function evaluateWithMetrics(
evaluator: FlagEvaluator,
flagKey: string,
context: EvaluationContext
): Promise<FlagEvaluationResult> {
const startTime = process.hrtime.bigint();
try {
const result = await evaluator.evaluate(flagKey, context);
const durationMs = Number(process.hrtime.bigint() - startTime) / 1_000_000;
flagMetrics.evaluationDuration.observe(
{ flag_key: flagKey, result: String(result.value), reason: result.reason },
durationMs
);
flagMetrics.evaluationCount.inc({
flag_key: flagKey,
result: String(result.value),
reason: result.reason,
});
// Track exposure for experiments
flagMetrics.exposureCount.inc({
flag_key: flagKey,
variant: result.value ? 'treatment' : 'control',
});
return result;
} catch (error) {
flagMetrics.evaluationErrors.inc({
flag_key: flagKey,
      error_type: error instanceof Error ? error.constructor.name : 'unknown',
});
throw error;
}
}
Dashboard Queries
-- Flags by age (cleanup candidates)
SELECT
key,
type,
owner,
DATE_PART('day', NOW() - created_at) as age_days,
CASE
WHEN expires_at < NOW() THEN 'EXPIRED'
WHEN expires_at < NOW() + INTERVAL '14 days' THEN 'EXPIRING_SOON'
ELSE 'ACTIVE'
END as status
FROM feature_flags
WHERE removed_at IS NULL
ORDER BY created_at ASC;
-- Flag usage in last 7 days (unused = safe to remove)
SELECT
f.key,
f.owner,
COALESCE(m.evaluation_count, 0) as evaluations_7d,
COALESCE(m.unique_users, 0) as unique_users_7d
FROM feature_flags f
LEFT JOIN (
SELECT
flag_key,
COUNT(*) as evaluation_count,
COUNT(DISTINCT user_id) as unique_users
FROM flag_evaluation_log
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY flag_key
) m ON f.key = m.flag_key
WHERE f.removed_at IS NULL
ORDER BY evaluations_7d ASC;
-- Cleanup debt by owner
SELECT
owner,
COUNT(*) FILTER (WHERE expires_at < NOW()) as expired_flags,
COUNT(*) FILTER (WHERE created_at < NOW() - INTERVAL '90 days' AND type = 'release') as stale_release_flags,
COUNT(*) as total_flags
FROM feature_flags
WHERE removed_at IS NULL
GROUP BY owner
ORDER BY expired_flags DESC;
SDK Design: Making Flags Easy to Use Correctly
Type-Safe Flag Access
// Generated types from flag definitions
// Regenerated on flag changes via CI
// flags.generated.ts
export interface FeatureFlags {
new_checkout_ui: boolean;
pricing_experiment: 'control' | 'variant_a' | 'variant_b';
max_upload_size_mb: number;
allowed_file_types: string[];
}
export type FlagKey = keyof FeatureFlags;
// SDK with type safety
class TypedFlagClient {
async get<K extends FlagKey>(
key: K,
context: EvaluationContext
): Promise<FeatureFlags[K]> {
const result = await this.evaluator.evaluate(key, context);
return result.value as FeatureFlags[K];
}
}
// Usage — fully typed
const client = new TypedFlagClient();
const isNewCheckout = await client.get('new_checkout_ui', ctx);
// ^? boolean
const pricingVariant = await client.get('pricing_experiment', ctx);
// ^? 'control' | 'variant_a' | 'variant_b'
// Type error: flag doesn't exist
const invalid = await client.get('nonexistent_flag', ctx);
// ^^^^^^^^^^^^^^^^^
// Error: Argument of type '"nonexistent_flag"' is not assignable
React Integration
// hooks/useFeatureFlag.ts
import { createContext, useContext, useSyncExternalStore } from 'react';
interface FlagContextValue {
flags: FlagClient;
context: EvaluationContext;
}
const FlagContext = createContext<FlagContextValue | null>(null);
export function FlagProvider({
children,
client,
userContext,
}: {
children: React.ReactNode;
client: FlagClient;
userContext: EvaluationContext;
}) {
return (
<FlagContext.Provider value={{ flags: client, context: userContext }}>
{children}
</FlagContext.Provider>
);
}
export function useFlag<K extends FlagKey>(key: K): FeatureFlags[K] {
const ctx = useContext(FlagContext);
if (!ctx) throw new Error('useFlag must be within FlagProvider');
// Subscribe to flag changes (for real-time updates)
return useSyncExternalStore(
(callback) => ctx.flags.subscribe(key, callback),
() => ctx.flags.getSync(key, ctx.context),
() => ctx.flags.getDefaultValue(key)
);
}
// Usage
function CheckoutButton() {
const isNewCheckout = useFlag('new_checkout_ui');
if (isNewCheckout) {
return <NewCheckoutButton />;
}
return <LegacyCheckoutButton />;
}
Flag Wrapper Component
// Declarative flag access
interface FeatureFlagProps<K extends FlagKey> {
flag: K;
children: React.ReactNode;
fallback?: React.ReactNode;
// For multivariate flags
value?: FeatureFlags[K];
}
function FeatureFlag<K extends FlagKey>({
flag,
children,
fallback = null,
value,
}: FeatureFlagProps<K>) {
const flagValue = useFlag(flag);
// Boolean flag
if (value === undefined) {
return flagValue ? <>{children}</> : <>{fallback}</>;
}
// Multivariate flag
return flagValue === value ? <>{children}</> : <>{fallback}</>;
}
// Usage
function PricingPage() {
return (
<div>
<FeatureFlag flag="pricing_experiment" value="control">
<OriginalPricing />
</FeatureFlag>
<FeatureFlag flag="pricing_experiment" value="variant_a">
<SimplifiedPricing />
</FeatureFlag>
<FeatureFlag flag="pricing_experiment" value="variant_b">
<PremiumFocusedPricing />
</FeatureFlag>
</div>
);
}
Building Cleanup Culture
Architecture alone won't save you. Culture matters.
1. Flag Budgets
┌─────────────────────────────────────────────────────────────────────────────┐
│ FLAG BUDGET POLICY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Each team has a flag budget: │
│ │
│ RELEASE FLAGS: Max 10 active at a time │
│ EXPERIMENT FLAGS: Max 5 active at a time │
│ OPS FLAGS: Unlimited (permanent by nature) │
│ PERMISSION FLAGS: Unlimited (tied to business model) │
│ │
│ Enforcement: │
│ • CI blocks new flag creation if budget exceeded │
│ • Exception process requires VP approval │
│ • Weekly flag review in team standup │
│ │
│ Budget recovery: │
│ • Remove 1 flag → create 1 new flag │
│ • Incentive: team with lowest flag count gets recognition │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
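The CI enforcement step can be a small pure function. A sketch, assuming the active-flag list is fetched from the flag service; budget numbers mirror the policy above, and the names are illustrative:

```typescript
// Budget check run in CI before a new flag definition is merged.
interface ActiveFlag { key: string; type: string; owner: string; }

const BUDGETS: Record<string, number> = { release: 10, experiment: 5 };

function checkFlagBudget(
  activeFlags: ActiveFlag[],
  newFlag: ActiveFlag,
): { allowed: boolean; reason: string } {
  const budget = BUDGETS[newFlag.type];
  if (budget === undefined) {
    // Ops and permission flags are unbudgeted per the policy.
    return { allowed: true, reason: `${newFlag.type} flags are not budgeted` };
  }
  const current = activeFlags.filter(
    (f) => f.owner === newFlag.owner && f.type === newFlag.type,
  ).length;
  if (current >= budget) {
    return {
      allowed: false,
      reason: `${newFlag.owner} already has ${current}/${budget} active ${newFlag.type} flags; remove one first`,
    };
  }
  return { allowed: true, reason: `${current + 1}/${budget} after this flag` };
}
```

The "remove one to create one" mechanic falls out naturally: the only way to unblock a new flag is to clean up an old one.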
2. Definition of Done Includes Cleanup
## Pull Request Checklist
- [ ] Code compiles and tests pass
- [ ] Documentation updated
- [ ] **If feature flag added:**
  - [ ] Flag has `expires_at` date (release/experiment types)
  - [ ] Flag has cleanup ticket linked
  - [ ] Flag owner field is set
  - [ ] Flag description explains what it controls
  - [ ] Rollout plan documented in ticket
3. Automated PR Comments
# GitHub Action: Comment on PRs that add flags
name: Feature Flag Audit
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  flag-audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Detect new flags
        id: detect
        run: |
          NEW_FLAGS=$(git diff origin/main --name-only | xargs grep -l "createFlag\|addFlag" || true)
          echo "new_flags=$NEW_FLAGS" >> $GITHUB_OUTPUT
      - name: Comment on PR
        if: steps.detect.outputs.new_flags != ''
        uses: actions/github-script@v6
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## 🚩 Feature Flag Detected
            This PR adds a new feature flag. Please ensure:
            - [ ] Flag has an expiration date
            - [ ] Cleanup ticket is created and linked
            - [ ] Flag is added to monitoring dashboard
            - [ ] Rollout plan is documented
            **Remember:** Feature flags are temporary. Plan for removal before merging.`
            })
4. Cleanup Sprints
┌─────────────────────────────────────────────────────────────────────────────┐
│ QUARTERLY FLAG CLEANUP SPRINT │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Schedule: First week of each quarter │
│ │
│ Day 1-2: Audit │
│ • Generate report of all flags > 90 days old │
│ • Identify flags at 100% or 0% (candidates for removal) │
│ • Assign cleanup owners │
│ │
│ Day 3-4: Cleanup │
│ • Remove flag code references │
│ • Delete losing code paths │
│ • Update tests │
│ • Submit PRs │
│ │
│ Day 5: Celebrate │
│ • Track flags removed (gamification) │
│ • Recognize top contributors │
│ • Share lessons learned │
│ │
│ Metric: Lines of code deleted (the only metric that matters) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
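The Day 1-2 audit can be generated rather than compiled by hand. A sketch of the candidate query in application code (the earlier SQL does the same server-side); field names are assumptions:

```typescript
// Produce the cleanup-candidate list the sprint works from:
// old temporary flags pinned at 100% (shipped) or 0% (abandoned).
interface AuditFlag {
  key: string;
  type: 'release' | 'experiment' | 'ops' | 'permission';
  owner: string;
  createdAt: Date;
  rolloutPercentage: number;
}

function cleanupCandidates(flags: AuditFlag[], now: Date): AuditFlag[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  return flags.filter((f) => {
    // Permanent flag types are never cleanup candidates.
    if (f.type === 'ops' || f.type === 'permission') return false;
    const ageDays = (now.getTime() - f.createdAt.getTime()) / msPerDay;
    return ageDays > 90 && (f.rolloutPercentage === 100 || f.rolloutPercentage === 0);
  });
}
```

Group the output by owner and you have the sprint's assignment sheet on day one.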
Summary: The Flag Lifecycle Done Right
┌─────────────────────────────────────────────────────────────────────────────┐
│ HEALTHY FLAG LIFECYCLE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CREATION │
│ ───────── │
│ ✓ Define flag type (release/experiment/ops/permission) │
│ ✓ Set expiration date (required for release/experiment) │
│ ✓ Create cleanup ticket upfront │
│ ✓ Document rollout plan │
│ ✓ Assign owner │
│ │
│ ROLLOUT │
│ ──────── │
│ ✓ Gradual percentage increase (1% → 5% → 25% → 50% → 100%) │
│ ✓ Monitor metrics at each stage │
│ ✓ Define rollback criteria │
│ ✓ Sticky bucketing for consistency │
│ │
│ STABILIZATION │
│ ───────────── │
│ ✓ Reach 100% rollout │
│ ✓ Bake for 1-2 weeks │
│ ✓ Verify no incidents related to flag │
│ │
│ CLEANUP │
│ ──────── │
│ ✓ Generate removal plan │
│ ✓ Remove code references │
│ ✓ Delete losing code path │
│ ✓ Remove flag from database │
│ ✓ Close cleanup ticket │
│ │
│ Total lifecycle: 4-8 weeks (not 4-8 years) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Feature flags are a powerful tool. They enable safe deployments, gradual rollouts, and experimentation. But without lifecycle management built into the system — not just documented in a wiki nobody reads — they become permanent fixtures in your codebase.
The graveyard isn't inevitable. It's a choice. Choose differently by making cleanup the default, expiration the requirement, and removal the celebration.
The best feature flag is one that no longer exists. Build systems that make deletion easy, expected, and celebrated.