Designing a Feature Flag System That Doesn't Become a Graveyard
Beyond if (flag.enabled) — architectural patterns for flag lifecycle management, targeting rules, gradual rollouts, and how to build cleanup discipline into your team's engineering culture before flags become permanent tech debt.
The Graveyard Problem
Every codebase has them. Feature flags created three years ago. Nobody knows if they're safe to remove. The original author left. The flag name is cryptic. There's no documentation. The code has seventeen branches that reference it.
// Actual code from a real production system
if (flags.get('new_checkout_flow_v2_temp_rollback_FINAL_2')) {
  if (flags.get('checkout_experiment_holdout_group')) {
    if (!flags.get('disable_new_checkout_for_enterprise')) {
      // The actual feature, buried under three years of flag accumulation
    }
  }
}
Feature flags start as a best practice: ship safely, test in production, control rollouts. They end as technical debt that compounds faster than anyone admits.
┌─────────────────────────────────────────────────────────────────────────────┐
│ THE FLAG LIFECYCLE REALITY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ INTENDED: │
│ ───────── │
│ Create flag → Develop feature → Roll out → Remove flag │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ 1 day 2 weeks 1 week 1 day │
│ │
│ ACTUAL: │
│ ─────── │
│ Create flag → Develop feature → Roll out → "We'll clean it up later" │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ 1 day 2 weeks 1 week ∞ (never) │
│ │
│ Result after 3 years: │
│ • 847 feature flags in codebase │
│ • 200+ are "temporary" (created > 1 year ago) │
│ • 50+ reference flags that no longer exist in config │
│ • 12 circular flag dependencies │
│ • 0 documentation on what most flags do │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
This post is about building a feature flag system that prevents this outcome through architecture, not discipline alone.
Flag Taxonomy: Not All Flags Are Equal
The first mistake is treating all flags the same. Different flag types have different lifecycles and cleanup requirements.
┌─────────────────────────────────────────────────────────────────────────────┐
│ FEATURE FLAG TAXONOMY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ TYPE LIFESPAN CLEANUP EXAMPLE │
│ ───────────────────────────────────────────────────────────────────────── │
│ │
│ RELEASE FLAG Days-Weeks Mandatory new_checkout_ui │
│ └── Control feature rollout │
│ └── MUST be removed after 100% rollout │
│ └── Temporary by definition │
│ │
│ EXPERIMENT FLAG Weeks-Months Mandatory pricing_page_variant_b │
│ └── A/B testing, measured outcomes │
│ └── MUST be removed after experiment concludes │
│ └── Winner becomes default, loser code deleted │
│ │
│ OPS FLAG Permanent Never enable_read_replicas │
│ └── Operational controls, circuit breakers │
│ └── Intended to live forever │
│ └── Kill switches, degradation modes │
│ │
│ PERMISSION FLAG Permanent Conditional enable_enterprise_sso │
│ └── Entitlement gating, plan features │
│ └── Lives as long as business model requires │
│ └── May become default, then removable │
│ │
│ The problem: Most codebases treat all four types identically │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Encoding Type in the Flag Definition
// Bad: No lifecycle information
const flags = {
new_checkout: true,
pricing_experiment: true,
enable_cache: true,
};
// Good: Explicit type and lifecycle
interface FlagDefinition {
key: string;
type: 'release' | 'experiment' | 'ops' | 'permission';
description: string;
owner: string; // Team or individual
createdAt: Date;
expiresAt?: Date; // Required for release/experiment
jiraTicket?: string; // Cleanup tracking
cleanupDeadline?: Date; // Automated reminders
defaultValue: boolean;
rules: TargetingRule[];
}
const flags: FlagDefinition[] = [
{
key: 'new_checkout_ui',
type: 'release',
description: 'New streamlined checkout flow with fewer steps',
owner: 'checkout-team',
createdAt: new Date('2024-01-15'),
expiresAt: new Date('2024-02-28'), // Hard deadline
cleanupDeadline: new Date('2024-03-15'), // Grace period
jiraTicket: 'CHECKOUT-1234',
defaultValue: false,
rules: [
{ segment: 'internal', value: true },
{ segment: 'beta_users', value: true },
{ percentage: 25, value: true },
],
},
];
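Lifecycle fields only help if something enforces them. A minimal validator sketch (the `validateFlagDefinition` name and its field subset are illustrative, not from any particular SDK) that CI could run against every new definition:

```typescript
// Hypothetical registration-time guard: temporary flag types must
// carry the lifecycle fields the taxonomy requires.
type FlagType = 'release' | 'experiment' | 'ops' | 'permission';

interface FlagLifecycleFields {
  key: string;
  type: FlagType;
  owner: string;
  expiresAt?: Date;
  jiraTicket?: string;
}

function validateFlagDefinition(flag: FlagLifecycleFields): string[] {
  const errors: string[] = [];
  const temporary = flag.type === 'release' || flag.type === 'experiment';

  if (temporary && !flag.expiresAt) {
    errors.push(`${flag.key}: ${flag.type} flags must set expiresAt`);
  }
  if (temporary && !flag.jiraTicket) {
    errors.push(`${flag.key}: ${flag.type} flags must link a cleanup ticket`);
  }
  if (!flag.owner) {
    errors.push(`${flag.key}: owner is required`);
  }
  return errors;
}
```

Wired into CI, this turns "flags must have expiration dates" from a wiki rule into a failing build.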
Architecture: The Flag Service
A well-designed flag system has clear boundaries.
┌─────────────────────────────────────────────────────────────────────────────┐
│ FLAG SYSTEM ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────┐ │
│ │ Flag Dashboard │ │
│ │ (Create, Edit, View) │ │
│ └───────────┬─────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ FLAG SERVICE (API) │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ Definition │ │ Targeting │ │ Lifecycle │ │ Audit │ │ │
│ │ │ Management │ │ Engine │ │ Manager │ │ Logger │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Flag Store │ │ Edge Cache │ │
│ │ (Postgres) │ │ (Redis/CDN)│ │
│ └─────────────┘ └─────────────┘ │
│ │ │
│ ┌────────────────────┼────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Backend │ │ Frontend │ │ Mobile │ │
│ │ SDK │ │ SDK │ │ SDK │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
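The SDKs read through the edge cache rather than hitting the flag service on every evaluation. A sketch of that read path, with assumed `Cache` and `FlagApi` interfaces (a real service defines its own); note the fail-closed default when both layers are unavailable:

```typescript
// Read-through cache sketch for the SDK side of the diagram.
interface FlagSnapshot { enabled: boolean; defaultValue: boolean; }

interface Cache {
  get(key: string): Promise<FlagSnapshot | null>;
  set(key: string, value: FlagSnapshot, ttlSeconds: number): Promise<void>;
}

interface FlagApi {
  fetch(key: string): Promise<FlagSnapshot | null>;
}

class CachedFlagReader {
  constructor(private cache: Cache, private api: FlagApi) {}

  async read(key: string): Promise<FlagSnapshot> {
    const cached = await this.cache.get(`flag:${key}`);
    if (cached) return cached;

    try {
      const fresh = await this.api.fetch(key);
      if (fresh) {
        // Short TTL keeps dashboard edits visible within a minute.
        await this.cache.set(`flag:${key}`, fresh, 60);
        return fresh;
      }
    } catch {
      // Flag service down; fall through to the safe default.
    }
    return { enabled: false, defaultValue: false }; // fail closed
  }
}
```

The design choice worth copying is the last line: when the flag infrastructure is unreachable, evaluation degrades to a known-safe value instead of throwing in the request path.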
Flag Store Schema
-- Core flag definition
CREATE TABLE feature_flags (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
key VARCHAR(255) UNIQUE NOT NULL,
type VARCHAR(50) NOT NULL CHECK (type IN ('release', 'experiment', 'ops', 'permission')),
description TEXT NOT NULL,
owner VARCHAR(255) NOT NULL,
-- State
enabled BOOLEAN NOT NULL DEFAULT false,
default_value BOOLEAN NOT NULL DEFAULT false,
-- Lifecycle
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
expires_at TIMESTAMPTZ, -- NULL for permanent flags
cleanup_deadline TIMESTAMPTZ,
-- Tracking
jira_ticket VARCHAR(100),
cleanup_ticket VARCHAR(100), -- Created when flag nears expiry
removed_at TIMESTAMPTZ, -- Soft delete
-- Metadata
tags VARCHAR(255)[] DEFAULT '{}',
CONSTRAINT release_must_expire CHECK (
type != 'release' OR expires_at IS NOT NULL
),
CONSTRAINT experiment_must_expire CHECK (
type != 'experiment' OR expires_at IS NOT NULL
)
);
-- Targeting rules
CREATE TABLE flag_rules (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flag_id UUID NOT NULL REFERENCES feature_flags(id),
priority INT NOT NULL, -- Lower = evaluated first
-- Conditions
segment VARCHAR(255), -- Named segment (e.g., 'beta_users')
user_ids UUID[], -- Specific users
percentage INT CHECK (percentage BETWEEN 0 AND 100),
attribute_rules JSONB, -- Complex attribute matching
-- Result
value BOOLEAN NOT NULL,
UNIQUE (flag_id, priority)
);
-- Audit log
CREATE TABLE flag_audit_log (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flag_id UUID NOT NULL REFERENCES feature_flags(id),
action VARCHAR(50) NOT NULL,
actor VARCHAR(255) NOT NULL,
previous_state JSONB,
new_state JSONB,
reason TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Cleanup tracking
CREATE TABLE flag_cleanup_tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flag_id UUID NOT NULL REFERENCES feature_flags(id),
status VARCHAR(50) NOT NULL DEFAULT 'pending',
assigned_to VARCHAR(255),
ticket_id VARCHAR(100),
reminder_sent_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Targeting Engine: Beyond Simple Booleans
Rule Evaluation Order
┌─────────────────────────────────────────────────────────────────────────────┐
│ FLAG EVALUATION FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ evaluateFlag(flagKey, context) → boolean │
│ │ │
│ ▼ │
│ 1. Flag enabled globally? │
│ │ │
│ ├── No ──▶ Return defaultValue │
│ │ │
│ ▼ Yes │
│ 2. User in explicit allow/deny list? │
│ │ │
│ ├── Allow list ──▶ Return true │
│ ├── Deny list ──▶ Return false │
│ │ │
│ ▼ Not in list │
│ 3. User matches segment rules? (in priority order) │
│ │ │
│ ├── Matches segment X ──▶ Return segment X value │
│ │ │
│ ▼ No segment match │
│ 4. User matches attribute rules? │
│ │ │
│ ├── Matches rule ──▶ Return rule value │
│ │ │
│ ▼ No attribute match │
│ 5. Percentage rollout? │
│ │ │
│ ├── hash(userId + flagKey) % 100 < percentage ──▶ Return true │
│ │ │
│ ▼ Outside percentage │
│ 6. Return defaultValue │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Implementation
import { createHash } from 'crypto'; // used by hashToBucket below

interface EvaluationContext {
userId: string;
userAttributes: Record<string, unknown>;
sessionId?: string;
requestId?: string;
}
interface FlagEvaluationResult {
value: boolean;
reason: 'default' | 'disabled' | 'user_list' | 'segment' | 'attribute' | 'percentage';
ruleId?: string;
flagVersion: number;
}
class FlagEvaluator {
constructor(
private flagStore: FlagStore,
private segmentStore: SegmentStore,
) {}
async evaluate(flagKey: string, context: EvaluationContext): Promise<FlagEvaluationResult> {
const flag = await this.flagStore.get(flagKey);
if (!flag) {
return { value: false, reason: 'default', flagVersion: 0 };
}
// 1. Global kill switch
if (!flag.enabled) {
return { value: flag.defaultValue, reason: 'disabled', flagVersion: flag.version };
}
    // 2. Evaluate rules in priority order (Array.prototype.sort mutates, so sort a copy)
    const rules = [...flag.rules].sort((a, b) => a.priority - b.priority);
    for (const rule of rules) {
      const match = await this.evaluateRule(rule, context, flag.key);
      if (match.matches) {
        return {
          value: rule.value,
          reason: match.reason,
          ruleId: rule.id,
          flagVersion: flag.version,
        };
      }
    }
    // 3. Default
    return { value: flag.defaultValue, reason: 'default', flagVersion: flag.version };
  }
  private async evaluateRule(
    rule: FlagRule,
    context: EvaluationContext,
    flagKey: string // passed in because rules do not carry their flag's key
  ): Promise<{ matches: boolean; reason: FlagEvaluationResult['reason'] }> {
    // Explicit user list
    if (rule.userIds?.includes(context.userId)) {
      return { matches: true, reason: 'user_list' };
    }
    // Segment membership
    if (rule.segment) {
      const segment = await this.segmentStore.get(rule.segment);
      if (segment && this.userInSegment(context, segment)) {
        return { matches: true, reason: 'segment' };
      }
    }
    // Attribute rules
    if (rule.attributeRules) {
      if (this.evaluateAttributeRules(rule.attributeRules, context.userAttributes)) {
        return { matches: true, reason: 'attribute' };
      }
    }
    // Percentage rollout
    if (rule.percentage !== undefined) {
      const bucket = this.hashToBucket(context.userId, flagKey);
      if (bucket < rule.percentage) {
        return { matches: true, reason: 'percentage' };
      }
    }
    return { matches: false, reason: 'default' };
  }
// Deterministic bucketing — same user always gets same bucket
private hashToBucket(userId: string, flagKey: string): number {
const hash = createHash('sha256')
.update(`${userId}:${flagKey}`)
.digest('hex');
const hashInt = parseInt(hash.substring(0, 8), 16);
return hashInt % 100;
}
private userInSegment(context: EvaluationContext, segment: Segment): boolean {
// Segment can define rules based on attributes
return this.evaluateAttributeRules(segment.rules, context.userAttributes);
}
private evaluateAttributeRules(
rules: AttributeRule[],
attributes: Record<string, unknown>
): boolean {
return rules.every((rule) => {
const value = attributes[rule.attribute];
switch (rule.operator) {
case 'eq': return value === rule.value;
case 'neq': return value !== rule.value;
case 'gt': return (value as number) > (rule.value as number);
case 'gte': return (value as number) >= (rule.value as number);
case 'lt': return (value as number) < (rule.value as number);
case 'lte': return (value as number) <= (rule.value as number);
case 'in': return (rule.value as unknown[]).includes(value);
case 'contains': return String(value).includes(rule.value as string);
case 'regex': return new RegExp(rule.value as string).test(String(value));
default: return false;
}
});
}
}
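The percentage step of the evaluator hinges on `hashToBucket` being deterministic and roughly uniform. A standalone version of that helper plus a quick sanity check:

```typescript
import { createHash } from 'crypto';

// Same logic as the evaluator's private helper: hash userId + flagKey
// into a stable bucket 0-99.
function hashToBucket(userId: string, flagKey: string): number {
  const hash = createHash('sha256').update(`${userId}:${flagKey}`).digest('hex');
  return parseInt(hash.substring(0, 8), 16) % 100;
}

// Same inputs always land in the same bucket...
const first = hashToBucket('user-123', 'new_checkout_ui');
const second = hashToBucket('user-123', 'new_checkout_ui');

// ...and across many users the buckets spread evenly, so a 25% rule
// really does admit roughly a quarter of the population.
let inRollout = 0;
for (let i = 0; i < 10_000; i++) {
  if (hashToBucket(`user-${i}`, 'new_checkout_ui') < 25) inRollout++;
}
```

Including the flag key in the hash also means one user's buckets are independent across flags, so a user in the 1% for one experiment isn't automatically in the 1% for every experiment.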
Gradual Rollout Patterns
Pattern 1: Percentage Ramp
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERCENTAGE ROLLOUT SCHEDULE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Day 1: Internal only (segment: 'employees') │
│ │
│ Day 3: 1% of users (catch obvious issues) │
│ ├── Monitor: error rates, latency, user feedback │
│ └── Rollback criteria: >1% error rate increase │
│ │
│ Day 5: 5% of users │
│ ├── Monitor: business metrics (conversion, engagement) │
│ └── Rollback criteria: >2% conversion drop │
│ │
│ Day 7: 25% of users │
│ ├── Monitor: support tickets, social media │
│ └── Statistical significance for experiments │
│ │
│ Day 10: 50% of users │
│ └── Validate at scale │
│ │
│ Day 14: 100% of users │
│ └── Flag becomes candidate for removal │
│ │
│ Day 21: Flag removed from codebase │
│ └── Old code path deleted │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
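A schedule like this can live as data, so a scheduler advances the rollout instead of a person editing percentages by hand. A sketch using the checkout numbers from the table above; the `RampStep` shape and field names are illustrative:

```typescript
// Percentage ramp as data: a cron job or lifecycle manager can look up
// the stage in effect and apply it, subject to the rollback criteria.
interface RampStep {
  day: number;
  percentage: number;        // 0 means segment-only stage
  segment?: string;
  rollbackCriteria: string;  // human-readable guardrail
}

const checkoutRamp: RampStep[] = [
  { day: 1, percentage: 0, segment: 'employees', rollbackCriteria: 'any crash' },
  { day: 3, percentage: 1, rollbackCriteria: '>1% error rate increase' },
  { day: 5, percentage: 5, rollbackCriteria: '>2% conversion drop' },
  { day: 7, percentage: 25, rollbackCriteria: 'support ticket spike' },
  { day: 10, percentage: 50, rollbackCriteria: 'any regression at scale' },
  { day: 14, percentage: 100, rollbackCriteria: 'none; cleanup begins' },
];

// Given days since rollout start, return the step currently in effect.
function currentStep(ramp: RampStep[], daysElapsed: number): RampStep {
  let active = ramp[0];
  for (const step of ramp) {
    if (step.day <= daysElapsed) active = step;
  }
  return active;
}
```

Keeping the schedule declarative also makes it reviewable in the same PR that creates the flag.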
Pattern 2: Ring-Based Deployment
// Deployment rings — each ring gets the feature before the next
const rings = {
ring0: {
name: 'Canary',
segments: ['internal_testers'],
percentage: 0, // Segment-based, not percentage
minDuration: '24h', // Must stay in ring for 24h minimum
successCriteria: {
errorRate: { max: 0.001 },
latencyP99: { max: 500 },
},
},
ring1: {
name: 'Early Adopters',
segments: ['beta_users'],
percentage: 5,
minDuration: '48h',
successCriteria: {
errorRate: { max: 0.005 },
latencyP99: { max: 600 },
npsScore: { min: 30 },
},
},
ring2: {
name: 'Broad',
segments: [],
percentage: 25,
minDuration: '72h',
successCriteria: {
errorRate: { max: 0.01 },
conversionRate: { minDelta: -0.02 }, // No more than 2% drop
},
},
ring3: {
name: 'General Availability',
segments: [],
percentage: 100,
minDuration: '168h', // 1 week before cleanup eligible
successCriteria: {}, // Monitoring only
},
};
// Automated progression
class RolloutManager {
async checkProgression(flagKey: string): Promise<ProgressionDecision> {
const flag = await this.flagStore.get(flagKey);
const currentRing = this.getCurrentRing(flag);
const metrics = await this.metricsService.getForFlag(flagKey, currentRing.minDuration);
// Check if we've met duration requirement
if (!this.durationMet(flag, currentRing)) {
return { action: 'wait', reason: 'Duration not met', nextCheckIn: '1h' };
}
// Check success criteria
const criteriaCheck = this.checkCriteria(metrics, currentRing.successCriteria);
if (!criteriaCheck.passed) {
return {
action: 'alert',
reason: `Criteria failed: ${criteriaCheck.failedCriteria.join(', ')}`,
recommendation: metrics.severity === 'high' ? 'rollback' : 'investigate',
};
}
// Ready to progress
const nextRing = this.getNextRing(currentRing);
if (!nextRing) {
return {
action: 'complete',
reason: 'Rollout complete, flag ready for cleanup',
};
}
return {
action: 'progress',
reason: 'All criteria met',
nextRing: nextRing.name,
};
}
}
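The `checkCriteria` call above carries the real decision logic. A minimal version, assuming metrics arrive as a flat map of named values and criteria use the `max`/`min`/`minDelta` bounds from the ring definitions:

```typescript
// Bounds mirror the ring definitions: max for error/latency ceilings,
// min for score floors, minDelta for "no worse than X vs. control".
interface CriteriaBounds { max?: number; min?: number; minDelta?: number; }

function checkCriteria(
  metrics: Record<string, number>,
  criteria: Record<string, CriteriaBounds>,
): { passed: boolean; failedCriteria: string[] } {
  const failed: string[] = [];
  for (const [name, bounds] of Object.entries(criteria)) {
    const value = metrics[name];
    if (value === undefined) {
      // Missing data fails closed: no metrics, no progression.
      failed.push(`${name} (no data)`);
      continue;
    }
    if (bounds.max !== undefined && value > bounds.max) failed.push(name);
    if (bounds.min !== undefined && value < bounds.min) failed.push(name);
    // minDelta: metric is a delta vs. control and must not drop below it.
    if (bounds.minDelta !== undefined && value < bounds.minDelta) failed.push(name);
  }
  return { passed: failed.length === 0, failedCriteria: failed };
}
```

Failing closed on missing data matters: a broken metrics pipeline should pause a rollout, not silently wave it through.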
Pattern 3: Sticky Bucketing
Users should get a consistent experience. If they're in the 25% that sees the new feature, they should always see it (until 100% rollout).
// Bucketing service — ensures consistency
import { createHash } from 'crypto';

class BucketingService {
constructor(private cache: BucketCache) {}
async getBucket(userId: string, flagKey: string): Promise<number> {
// Check if user has existing bucket assignment
const cacheKey = `bucket:${flagKey}:${userId}`;
const existing = await this.cache.get(cacheKey);
if (existing !== null) {
return existing;
}
// Generate deterministic bucket (0-99)
const bucket = this.hashToBucket(userId, flagKey);
// Cache the assignment so behavior stays stable even if the hashing scheme changes later
await this.cache.set(cacheKey, bucket, { ttl: '30d' });
return bucket;
}
// When rolling back, we need to respect original buckets
// User in bucket 15 who saw feature at 25% rollout
// should NOT see feature if we roll back to 10%
async isInRollout(userId: string, flagKey: string, percentage: number): Promise<boolean> {
const bucket = await this.getBucket(userId, flagKey);
return bucket < percentage;
}
private hashToBucket(userId: string, flagKey: string): number {
// Cryptographic hash ensures uniform distribution
const hash = createHash('sha256')
.update(`${userId}:${flagKey}`)
.digest('hex');
return parseInt(hash.substring(0, 8), 16) % 100;
}
}
Lifecycle Management: Building Cleanup Into the System
Automatic Expiration Enforcement
// Lifecycle manager — runs on schedule (date helpers from date-fns)
import { addDays, differenceInDays } from 'date-fns';

class FlagLifecycleManager {
async runDailyCheck(): Promise<void> {
const now = new Date();
// 1. Find flags approaching expiration
const expiringFlags = await this.flagStore.findWhere({
expiresAt: { gte: now, lte: addDays(now, 14) },
cleanupTicket: null,
});
for (const flag of expiringFlags) {
await this.createCleanupReminder(flag);
}
// 2. Find expired flags still in code
const expiredFlags = await this.flagStore.findWhere({
expiresAt: { lt: now },
removedAt: null,
});
for (const flag of expiredFlags) {
await this.handleExpiredFlag(flag);
}
// 3. Find orphaned flags (in DB but not in code)
const orphanedFlags = await this.findOrphanedFlags();
for (const flag of orphanedFlags) {
await this.cleanupOrphanedFlag(flag);
}
}
private async createCleanupReminder(flag: FeatureFlag): Promise<void> {
// Create Jira ticket automatically
const ticket = await this.jira.createTicket({
project: 'TECH-DEBT',
type: 'Task',
summary: `Remove feature flag: ${flag.key}`,
      description: await this.generateCleanupDescription(flag),
assignee: flag.owner,
dueDate: flag.expiresAt,
labels: ['feature-flag-cleanup', 'auto-generated'],
});
// Update flag with cleanup ticket
await this.flagStore.update(flag.id, { cleanupTicket: ticket.key });
// Notify owner
await this.slack.sendMessage({
channel: `@${flag.owner}`,
text: `🚩 Feature flag \`${flag.key}\` expires on ${flag.expiresAt.toDateString()}. Cleanup ticket created: ${ticket.url}`,
});
}
private async handleExpiredFlag(flag: FeatureFlag): Promise<void> {
const daysPastExpiration = differenceInDays(new Date(), flag.expiresAt);
if (daysPastExpiration > 30) {
// Force removal — this flag is way overdue
await this.forceDisableFlag(flag);
await this.escalateToManager(flag);
} else if (daysPastExpiration > 14) {
// Escalate
await this.slack.sendMessage({
channel: '#engineering-leads',
text: `⚠️ Feature flag \`${flag.key}\` is ${daysPastExpiration} days past expiration. Owner: ${flag.owner}`,
});
} else if (daysPastExpiration > 7) {
// Daily reminder to owner
await this.sendDailyReminder(flag);
}
}
private async findOrphanedFlags(): Promise<FeatureFlag[]> {
// Query codebase for flag references
const codebaseFlags = await this.codeScanner.findFlagReferences();
const dbFlags = await this.flagStore.findAll({ removedAt: null });
return dbFlags.filter((dbFlag) => !codebaseFlags.has(dbFlag.key));
}
  private async generateCleanupDescription(flag: FeatureFlag): Promise<string> {
return `
## Feature Flag Cleanup
**Flag Key:** \`${flag.key}\`
**Type:** ${flag.type}
**Created:** ${flag.createdAt.toDateString()}
**Expires:** ${flag.expiresAt?.toDateString() ?? 'Never'}
**Description:** ${flag.description}
### Current State
- Enabled: ${flag.enabled}
- Current rollout: ${this.calculateCurrentRollout(flag)}%
### Cleanup Steps
1. Verify flag is at 100% rollout (or 0% if abandoned)
2. Remove all code references to \`${flag.key}\`
3. Delete the losing code path (if applicable)
4. Remove flag from database
5. Update this ticket when complete
### Code References
${await this.generateCodeReferences(flag)}
### Related Tickets
- Original implementation: ${flag.jiraTicket ?? 'Unknown'}
`;
}
}
Code Scanner: Find Flag References
// Static analysis to find flag usage (Babel parser + traverse)
import { glob } from 'glob';
import fs from 'fs/promises';
import { parse } from '@babel/parser';
import traverse, { NodePath } from '@babel/traverse';
import type { CallExpression } from '@babel/types';

class FlagCodeScanner {
async findFlagReferences(): Promise<Map<string, CodeReference[]>> {
const references = new Map<string, CodeReference[]>();
// Scan TypeScript/JavaScript files
const files = await glob('src/**/*.{ts,tsx,js,jsx}');
for (const file of files) {
const content = await fs.readFile(file, 'utf-8');
const ast = parse(content, {
sourceType: 'module',
plugins: ['typescript', 'jsx'],
});
traverse(ast, {
CallExpression: (path) => {
// Match patterns like: flags.get('flag_key'), useFlag('flag_key'), etc.
if (this.isFlagAccess(path)) {
const flagKey = this.extractFlagKey(path);
if (flagKey) {
            const existing = references.get(flagKey) ?? [];
            existing.push({
              flagKey, // carried along so downstream tools know which flag this reference belongs to
              file,
              line: path.node.loc?.start.line ?? 0,
              column: path.node.loc?.start.column ?? 0,
              snippet: this.getCodeSnippet(content, path.node.loc),
            });
            references.set(flagKey, existing);
}
}
},
});
}
return references;
}
private isFlagAccess(path: NodePath<CallExpression>): boolean {
const callee = path.node.callee;
// flags.get(), flags.isEnabled(), etc.
if (
callee.type === 'MemberExpression' &&
callee.object.type === 'Identifier' &&
['flags', 'featureFlags', 'features'].includes(callee.object.name)
) {
return true;
}
// useFlag(), useFeatureFlag(), etc.
if (
callee.type === 'Identifier' &&
['useFlag', 'useFeatureFlag', 'getFlag'].includes(callee.name)
) {
return true;
}
return false;
}
}
Flag Removal Automation
// Generate code removal suggestions
import fs from 'fs/promises';

class FlagRemovalAssistant {
async generateRemovalPlan(flagKey: string): Promise<RemovalPlan> {
const references = await this.scanner.findFlagReferences();
const flagRefs = references.get(flagKey) ?? [];
if (flagRefs.length === 0) {
return {
status: 'no_references',
message: 'Flag has no code references — safe to delete from database',
};
}
const plan: RemovalStep[] = [];
for (const ref of flagRefs) {
const analysis = await this.analyzeReference(ref);
plan.push({
file: ref.file,
line: ref.line,
action: analysis.action,
beforeCode: analysis.beforeCode,
afterCode: analysis.afterCode,
confidence: analysis.confidence,
});
}
return {
status: 'removal_plan_generated',
steps: plan,
estimatedChanges: plan.length,
filesAffected: new Set(plan.map((s) => s.file)).size,
canAutoRemove: plan.every((s) => s.confidence === 'high'),
};
}
private async analyzeReference(ref: CodeReference): Promise<ReferenceAnalysis> {
const content = await fs.readFile(ref.file, 'utf-8');
const lines = content.split('\n');
// Simple pattern: if (flags.get('key')) { ... }
const line = lines[ref.line - 1];
if (line.includes('if') && line.includes('flags.get')) {
// Find the matching block
const block = this.findConditionalBlock(lines, ref.line - 1);
// Determine if we should keep the truthy or falsy branch
const flagValue = await this.getFlagFinalValue(ref.flagKey);
return {
action: 'remove_conditional',
beforeCode: block.fullCode,
afterCode: flagValue ? block.truthyBranch : block.falsyBranch,
confidence: block.hasElse ? 'high' : 'medium',
};
}
// Complex pattern — needs manual review
return {
action: 'manual_review',
beforeCode: this.getCodeContext(lines, ref.line - 1),
afterCode: '// TODO: Manual cleanup required',
confidence: 'low',
};
}
}
Observability: Know Your Flags
Metrics to Track
// Flag evaluation metrics (prom-client)
import { Histogram, Counter, Gauge } from 'prom-client';

const flagMetrics = {
// Evaluation performance
evaluationDuration: new Histogram({
name: 'feature_flag_evaluation_duration_ms',
help: 'Time to evaluate a feature flag',
labelNames: ['flag_key', 'result', 'reason'],
buckets: [0.1, 0.5, 1, 5, 10, 50],
}),
// Usage tracking
evaluationCount: new Counter({
name: 'feature_flag_evaluations_total',
help: 'Total feature flag evaluations',
labelNames: ['flag_key', 'result', 'reason'],
}),
// Exposure tracking (for experiments)
exposureCount: new Counter({
name: 'feature_flag_exposures_total',
help: 'Users exposed to each flag variant',
labelNames: ['flag_key', 'variant'],
}),
// Error tracking
evaluationErrors: new Counter({
name: 'feature_flag_evaluation_errors_total',
help: 'Feature flag evaluation errors',
labelNames: ['flag_key', 'error_type'],
}),
// Staleness
flagAge: new Gauge({
name: 'feature_flag_age_days',
help: 'Days since flag was created',
labelNames: ['flag_key', 'type', 'owner'],
}),
// Cleanup debt
expiredFlags: new Gauge({
name: 'feature_flags_expired_total',
help: 'Number of expired flags still in codebase',
labelNames: ['owner'],
}),
};
// Evaluation with metrics
async function evaluateWithMetrics(
evaluator: FlagEvaluator,
flagKey: string,
context: EvaluationContext
): Promise<FlagEvaluationResult> {
const startTime = process.hrtime.bigint();
try {
const result = await evaluator.evaluate(flagKey, context);
const durationMs = Number(process.hrtime.bigint() - startTime) / 1_000_000;
flagMetrics.evaluationDuration.observe(
{ flag_key: flagKey, result: String(result.value), reason: result.reason },
durationMs
);
flagMetrics.evaluationCount.inc({
flag_key: flagKey,
result: String(result.value),
reason: result.reason,
});
// Track exposure for experiments
flagMetrics.exposureCount.inc({
flag_key: flagKey,
variant: result.value ? 'treatment' : 'control',
});
return result;
} catch (error) {
flagMetrics.evaluationErrors.inc({
flag_key: flagKey,
      error_type: error instanceof Error ? error.constructor.name : 'unknown',
});
throw error;
}
}
Dashboard Queries
-- Flags by age (cleanup candidates)
SELECT
key,
type,
owner,
DATE_PART('day', NOW() - created_at) as age_days,
CASE
WHEN expires_at < NOW() THEN 'EXPIRED'
WHEN expires_at < NOW() + INTERVAL '14 days' THEN 'EXPIRING_SOON'
ELSE 'ACTIVE'
END as status
FROM feature_flags
WHERE removed_at IS NULL
ORDER BY created_at ASC;
-- Flag usage in last 7 days (unused = safe to remove)
SELECT
f.key,
f.owner,
COALESCE(m.evaluation_count, 0) as evaluations_7d,
COALESCE(m.unique_users, 0) as unique_users_7d
FROM feature_flags f
LEFT JOIN (
SELECT
flag_key,
COUNT(*) as evaluation_count,
COUNT(DISTINCT user_id) as unique_users
FROM flag_evaluation_log
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY flag_key
) m ON f.key = m.flag_key
WHERE f.removed_at IS NULL
ORDER BY evaluations_7d ASC;
-- Cleanup debt by owner
SELECT
owner,
COUNT(*) FILTER (WHERE expires_at < NOW()) as expired_flags,
COUNT(*) FILTER (WHERE created_at < NOW() - INTERVAL '90 days' AND type = 'release') as stale_release_flags,
COUNT(*) as total_flags
FROM feature_flags
WHERE removed_at IS NULL
GROUP BY owner
ORDER BY expired_flags DESC;
SDK Design: Making Flags Easy to Use Correctly
Type-Safe Flag Access
// Generated types from flag definitions
// Regenerated on flag changes via CI
// flags.generated.ts
export interface FeatureFlags {
new_checkout_ui: boolean;
pricing_experiment: 'control' | 'variant_a' | 'variant_b';
max_upload_size_mb: number;
allowed_file_types: string[];
}
export type FlagKey = keyof FeatureFlags;
// SDK with type safety
class TypedFlagClient {
async get<K extends FlagKey>(
key: K,
context: EvaluationContext
): Promise<FeatureFlags[K]> {
const result = await this.evaluator.evaluate(key, context);
return result.value as FeatureFlags[K];
}
}
// Usage — fully typed
const client = new TypedFlagClient();
const isNewCheckout = await client.get('new_checkout_ui', ctx);
// ^? boolean
const pricingVariant = await client.get('pricing_experiment', ctx);
// ^? 'control' | 'variant_a' | 'variant_b'
// Type error: flag doesn't exist
const invalid = await client.get('nonexistent_flag', ctx);
// ^^^^^^^^^^^^^^^^^
// Error: Argument of type '"nonexistent_flag"' is not assignable
React Integration
// hooks/useFeatureFlag.ts
import { createContext, useContext, useSyncExternalStore } from 'react';
interface FlagContextValue {
flags: FlagClient;
context: EvaluationContext;
}
const FlagContext = createContext<FlagContextValue | null>(null);
export function FlagProvider({
children,
client,
userContext,
}: {
children: React.ReactNode;
client: FlagClient;
userContext: EvaluationContext;
}) {
return (
<FlagContext.Provider value={{ flags: client, context: userContext }}>
{children}
</FlagContext.Provider>
);
}
export function useFlag<K extends FlagKey>(key: K): FeatureFlags[K] {
const ctx = useContext(FlagContext);
if (!ctx) throw new Error('useFlag must be within FlagProvider');
// Subscribe to flag changes (for real-time updates)
return useSyncExternalStore(
(callback) => ctx.flags.subscribe(key, callback),
() => ctx.flags.getSync(key, ctx.context),
() => ctx.flags.getDefaultValue(key)
);
}
// Usage
function CheckoutButton() {
const isNewCheckout = useFlag('new_checkout_ui');
if (isNewCheckout) {
return <NewCheckoutButton />;
}
return <LegacyCheckoutButton />;
}
Flag Wrapper Component
// Declarative flag access
interface FeatureFlagProps<K extends FlagKey> {
flag: K;
children: React.ReactNode;
fallback?: React.ReactNode;
// For multivariate flags
value?: FeatureFlags[K];
}
function FeatureFlag<K extends FlagKey>({
flag,
children,
fallback = null,
value,
}: FeatureFlagProps<K>) {
const flagValue = useFlag(flag);
// Boolean flag
if (value === undefined) {
return flagValue ? <>{children}</> : <>{fallback}</>;
}
// Multivariate flag
return flagValue === value ? <>{children}</> : <>{fallback}</>;
}
// Usage
function PricingPage() {
return (
<div>
<FeatureFlag flag="pricing_experiment" value="control">
<OriginalPricing />
</FeatureFlag>
<FeatureFlag flag="pricing_experiment" value="variant_a">
<SimplifiedPricing />
</FeatureFlag>
<FeatureFlag flag="pricing_experiment" value="variant_b">
<PremiumFocusedPricing />
</FeatureFlag>
</div>
);
}
Building Cleanup Culture
Architecture alone won't save you. Culture matters.
1. Flag Budgets
┌─────────────────────────────────────────────────────────────────────────────┐
│ FLAG BUDGET POLICY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Each team has a flag budget: │
│ │
│ RELEASE FLAGS: Max 10 active at a time │
│ EXPERIMENT FLAGS: Max 5 active at a time │
│ OPS FLAGS: Unlimited (permanent by nature) │
│ PERMISSION FLAGS: Unlimited (tied to business model) │
│ │
│ Enforcement: │
│ • CI blocks new flag creation if budget exceeded │
│ • Exception process requires VP approval │
│ • Weekly flag review in team standup │
│ │
│ Budget recovery: │
│ • Remove 1 flag → create 1 new flag │
│ • Incentive: team with lowest flag count gets recognition │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
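The CI enforcement step can be a small pure function. A sketch, assuming the active-flag list is fetched from the flag service; budget numbers mirror the policy above, and the names are illustrative:

```typescript
// Budget check run in CI before a new flag definition is merged.
interface ActiveFlag { key: string; type: string; owner: string; }

const BUDGETS: Record<string, number> = { release: 10, experiment: 5 };

function checkFlagBudget(
  activeFlags: ActiveFlag[],
  newFlag: ActiveFlag,
): { allowed: boolean; reason: string } {
  const budget = BUDGETS[newFlag.type];
  if (budget === undefined) {
    // Ops and permission flags are unbudgeted per the policy.
    return { allowed: true, reason: `${newFlag.type} flags are not budgeted` };
  }
  const current = activeFlags.filter(
    (f) => f.owner === newFlag.owner && f.type === newFlag.type,
  ).length;
  if (current >= budget) {
    return {
      allowed: false,
      reason: `${newFlag.owner} already has ${current}/${budget} active ${newFlag.type} flags; remove one first`,
    };
  }
  return { allowed: true, reason: `${current + 1}/${budget} after this flag` };
}
```

The "remove one to create one" mechanic falls out naturally: the only way to unblock a new flag is to clean up an old one.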
2. Definition of Done Includes Cleanup
## Pull Request Checklist
- [ ] Code compiles and tests pass
- [ ] Documentation updated
- [ ] **If feature flag added:**
  - [ ] Flag has `expires_at` date (release/experiment types)
  - [ ] Flag has cleanup ticket linked
  - [ ] Flag owner field is set
  - [ ] Flag description explains what it controls
  - [ ] Rollout plan documented in ticket
3. Automated PR Comments
# GitHub Action: Comment on PRs that add flags
name: Feature Flag Audit
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  flag-audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Detect new flags
        id: detect
        run: |
          NEW_FLAGS=$(git diff origin/main --name-only | xargs grep -l "createFlag\|addFlag" || true)
          echo "new_flags=$NEW_FLAGS" >> $GITHUB_OUTPUT
      - name: Comment on PR
        if: steps.detect.outputs.new_flags != ''
        uses: actions/github-script@v6
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## 🚩 Feature Flag Detected
            This PR adds a new feature flag. Please ensure:
            - [ ] Flag has an expiration date
            - [ ] Cleanup ticket is created and linked
            - [ ] Flag is added to monitoring dashboard
            - [ ] Rollout plan is documented
            **Remember:** Feature flags are temporary. Plan for removal before merging.`
            })
4. Cleanup Sprints
┌─────────────────────────────────────────────────────────────────────────────┐
│ QUARTERLY FLAG CLEANUP SPRINT │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Schedule: First week of each quarter │
│ │
│ Day 1-2: Audit │
│ • Generate report of all flags > 90 days old │
│ • Identify flags at 100% or 0% (candidates for removal) │
│ • Assign cleanup owners │
│ │
│ Day 3-4: Cleanup │
│ • Remove flag code references │
│ • Delete losing code paths │
│ • Update tests │
│ • Submit PRs │
│ │
│ Day 5: Celebrate │
│ • Track flags removed (gamification) │
│ • Recognize top contributors │
│ • Share lessons learned │
│ │
│ Metric: Lines of code deleted (the only metric that matters) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
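The Day 1-2 audit can be generated rather than compiled by hand. A sketch of the candidate query in application code (the earlier SQL does the same server-side); field names are assumptions:

```typescript
// Produce the cleanup-candidate list the sprint works from:
// old temporary flags pinned at 100% (shipped) or 0% (abandoned).
interface AuditFlag {
  key: string;
  type: 'release' | 'experiment' | 'ops' | 'permission';
  owner: string;
  createdAt: Date;
  rolloutPercentage: number;
}

function cleanupCandidates(flags: AuditFlag[], now: Date): AuditFlag[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  return flags.filter((f) => {
    // Permanent flag types are never cleanup candidates.
    if (f.type === 'ops' || f.type === 'permission') return false;
    const ageDays = (now.getTime() - f.createdAt.getTime()) / msPerDay;
    return ageDays > 90 && (f.rolloutPercentage === 100 || f.rolloutPercentage === 0);
  });
}
```

Group the output by owner and you have the sprint's assignment sheet on day one.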
Summary: The Flag Lifecycle Done Right
┌─────────────────────────────────────────────────────────────────────────────┐
│ HEALTHY FLAG LIFECYCLE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CREATION │
│ ───────── │
│ ✓ Define flag type (release/experiment/ops/permission) │
│ ✓ Set expiration date (required for release/experiment) │
│ ✓ Create cleanup ticket upfront │
│ ✓ Document rollout plan │
│ ✓ Assign owner │
│ │
│ ROLLOUT │
│ ──────── │
│ ✓ Gradual percentage increase (1% → 5% → 25% → 50% → 100%) │
│ ✓ Monitor metrics at each stage │
│ ✓ Define rollback criteria │
│ ✓ Sticky bucketing for consistency │
│ │
│ STABILIZATION │
│ ───────────── │
│ ✓ Reach 100% rollout │
│ ✓ Bake for 1-2 weeks │
│ ✓ Verify no incidents related to flag │
│ │
│ CLEANUP │
│ ──────── │
│ ✓ Generate removal plan │
│ ✓ Remove code references │
│ ✓ Delete losing code path │
│ ✓ Remove flag from database │
│ ✓ Close cleanup ticket │
│ │
│ Total lifecycle: 4-8 weeks (not 4-8 years) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Feature flags are a powerful tool. They enable safe deployments, gradual rollouts, and experimentation. But without lifecycle management built into the system — not just documented in a wiki nobody reads — they become permanent fixtures in your codebase.
The graveyard isn't inevitable. It's a choice. Choose differently by making cleanup the default, expiration the requirement, and removal the celebration.
The best feature flag is one that no longer exists. Build systems that make deletion easy, expected, and celebrated.