API Gateway & Service Mesh: Routing, Traffic Control & Sidecar Architecture

March 12, 2026 · 6 min read

The Problem: Microservice Communication Complexity

As services multiply, cross-cutting concerns (auth, rate limiting, retries, observability) get duplicated across every service. An API gateway handles north-south traffic (external → internal), while a service mesh handles east-west traffic (internal → internal).

Without Gateway/Mesh — Every service re-implements everything:
┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐
│ Service A │  │ Service B │  │ Service C │  │ Service D │
│ ┌───────┐ │  │ ┌───────┐ │  │ ┌───────┐ │  │ ┌───────┐ │
│ │Auth   │ │  │ │Auth   │ │  │ │Auth   │ │  │ │Auth   │ │
│ │Rate   │ │  │ │Rate   │ │  │ │Rate   │ │  │ │Rate   │ │
│ │Retry  │ │  │ │Retry  │ │  │ │Retry  │ │  │ │Retry  │ │
│ │TLS    │ │  │ │TLS    │ │  │ │TLS    │ │  │ │TLS    │ │
│ │Metrics│ │  │ │Metrics│ │  │ │Metrics│ │  │ │Metrics│ │
│ │Circuit│ │  │ │Circuit│ │  │ │Circuit│ │  │ │Circuit│ │
│ └───────┘ │  │ └───────┘ │  │ └───────┘ │  │ └───────┘ │
└───────────┘  └───────────┘  └───────────┘  └───────────┘
  4 services × 6 concerns = 24 implementations to maintain

With Gateway + Mesh:
┌──────────────────────────┐
│   API Gateway            │  ← North-South (external)
│   Auth, Rate limit, TLS  │
└────────────┬─────────────┘
             │
   ┌─────────┼───────────┐
   │   Service Mesh      │  ← East-West (internal)
   │   mTLS, Retry,      │
   │   Circuit break,    │
   │   Observability     │
   │                     │
   │  ┌─────┐ ┌─────┐    │
   │  │Svc A│ │Svc B│    │
   │  │(app)│ │(app)│    │
   │  └─────┘ └─────┘    │
   └─────────────────────┘
  Services focus on business logic only

Architecture Overview

                    External Clients
                         │
                    ┌────▼────┐
                    │  Load   │
                    │Balancer │
                    └────┬────┘
                         │
              ┌──────────▼──────────┐
              │     API GATEWAY     │
              │                     │
              │ ┌─────────────────┐ │
              │ │  Route Matcher  │ │
              │ │  Rate Limiter   │ │
              │ │  Auth/AuthZ     │ │
              │ │  Request Xform  │ │
              │ │  Response Cache │ │
              │ │  Circuit Breaker│ │
              │ └─────────────────┘ │
              └──┬──────┬───────┬──┘
                 │      │       │
        ┌────────▼┐  ┌──▼────┐  ┌▼────────┐
        │ Pod A   │  │ Pod B │  │ Pod C   │
        │┌──────┐ │  │┌────┐ │  │┌──────┐ │
        ││Envoy │ │  ││Envy│ │  ││Envoy │ │  ← Sidecar proxies
        ││Proxy │ │  ││Prxy│ │  ││Proxy │ │    (service mesh
        │├──────┤ │  │├────┤ │  │├──────┤ │     data plane)
        ││User  │ │  ││Cart│ │  ││Order │ │
        ││ Svc  │ │  ││ Svc│ │  ││ Svc  │ │
        │└──────┘ │  │└────┘ │  │└──────┘ │
        └─────────┘  └──────┘  └─────────┘
              │           │          │
              └─────┬─────┘──────────┘
                    │
           ┌────────▼────────┐
           │  Control Plane  │
           │  (Istio/Linkerd)│
           │  Config, certs, │
           │  service disc.  │
           └─────────────────┘

1. API Gateway Core Implementation

// ─── Gateway Configuration ─────────────────────────────
interface GatewayConfig {
  port: number;
  routes: RouteConfig[];
  globalRateLimit: RateLimitConfig;
  authProviders: AuthProvider[];
  cors: CorsConfig;
  timeout: number;
  maxRequestBody: number;
}

interface RouteConfig {
  path: string;              // /api/v1/users/:id
  methods: string[];         // GET, POST, etc.
  upstream: UpstreamConfig;  // Target service
  middleware: MiddlewareRef[];
  rateLimit?: RateLimitConfig;
  cache?: CacheConfig;
  timeout?: number;
  retries?: RetryConfig;
  circuitBreaker?: CircuitBreakerConfig;
  transform?: TransformConfig;
  auth?: { required: boolean; scopes?: string[] };
}

interface UpstreamConfig {
  service: string;           // Service name (for discovery)
  url?: string;              // Direct URL fallback
  loadBalancer:
    | 'round-robin'
    | 'least-connections'
    | 'weighted-round-robin'
    | 'random'
    | 'consistent-hash';
  healthCheck: {
    path: string;
    intervalMs: number;
    timeoutMs: number;
    unhealthyThreshold: number;
    healthyThreshold: number;
  };
}

// ─── Request Pipeline ──────────────────────────────────
type MiddlewareFn = (
  ctx: RequestContext,
  next: () => Promise<void>
) => Promise<void>;

interface RequestContext {
  request: IncomingRequest;
  response: GatewayResponse;
  route: RouteConfig;
  params: Record<string, string>;
  state: Map<string, unknown>;
  startTime: number;
  requestId: string;
  abortController: AbortController;
}

interface IncomingRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
  query: Record<string, string>;
  body: unknown;
  ip: string;
}

interface GatewayResponse {
  status: number;
  headers: Record<string, string>;
  body: unknown;
}

// ─── Gateway Implementation ────────────────────────────
class APIGateway {
  private router: RouteTree;
  private middlewareChain: MiddlewareFn[];
  private serviceRegistry: ServiceRegistry;

  constructor(
    private config: GatewayConfig,
    registry: ServiceRegistry
  ) {
    this.router = new RouteTree();
    this.serviceRegistry = registry;
    this.middlewareChain = [];

    // Register routes
    for (const route of config.routes) {
      this.router.add(route);
    }

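    // Build global middleware chain. Order matters: each middleware wraps
    // everything after it, and proxyMiddleware is terminal (it never calls
    // next), so anything that must observe the response has to come earlier.
    this.middlewareChain = [
      this.requestIdMiddleware(),
      this.metricsMiddleware(),
      this.corsMiddleware(),
      this.rateLimitMiddleware(config.globalRateLimit),
      this.authMiddleware(),
      this.requestTransformMiddleware(),
      this.responseTransformMiddleware(),
      this.circuitBreakerMiddleware(),
      this.retryMiddleware(),
      this.proxyMiddleware(),
    ];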
  }

  // ── Handle Incoming Request ──────────────────────
  async handleRequest(req: IncomingRequest): Promise<GatewayResponse> {
    const match = this.router.match(req.method, req.path);
    if (!match) {
      return { status: 404, headers: {}, body: { error: 'Not Found' } };
    }

    const ctx: RequestContext = {
      request: req,
      response: { status: 200, headers: {}, body: null },
      route: match.route,
      params: match.params,
      state: new Map(),
      startTime: performance.now(),
      requestId: crypto.randomUUID(),
      abortController: new AbortController(),
    };

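    // Execute middleware chain (Koa-style). An error that escapes the chain
    // becomes a 502 response instead of an unhandled rejection.
    try {
      await this.compose(this.middlewareChain)(ctx);
    } catch {
      ctx.response.status = 502;
      ctx.response.body = { error: 'Bad gateway' };
    }

    return ctx.response;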
  }

  // ── Middleware Composition ────────────────────────
  private compose(middlewares: MiddlewareFn[]): (ctx: RequestContext) => Promise<void> {
    return function (ctx: RequestContext) {
      let index = -1;

      function dispatch(i: number): Promise<void> {
        if (i <= index) {
          return Promise.reject(new Error('next() called multiple times'));
        }
        index = i;

        const fn = middlewares[i];
        if (!fn) return Promise.resolve();

        return fn(ctx, () => dispatch(i + 1));
      }

      return dispatch(0);
    };
  }

  // ── Individual Middleware Implementations ─────────

  private requestIdMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
      ctx.response.headers['x-request-id'] = ctx.requestId;
      ctx.request.headers['x-request-id'] = ctx.requestId;
      await next();
    };
  }

  private corsMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
      const { cors } = this.config;
      ctx.response.headers['access-control-allow-origin'] = cors?.origin || '*';
      ctx.response.headers['access-control-allow-methods'] = 'GET,POST,PUT,DELETE,OPTIONS';
      ctx.response.headers['access-control-allow-headers'] = 'Content-Type,Authorization';

      if (ctx.request.method === 'OPTIONS') {
        ctx.response.status = 204;
        return; // Don't call next — short-circuit
      }

      await next();
    };
  }

  private rateLimitMiddleware(config: RateLimitConfig): MiddlewareFn {
    const limiter = new SlidingWindowRateLimiter(config);
    return async (ctx, next) => {
      const key = this.getRateLimitKey(ctx);
      const result = limiter.check(key);

      ctx.response.headers['x-ratelimit-limit'] = String(result.limit);
      ctx.response.headers['x-ratelimit-remaining'] = String(result.remaining);
      ctx.response.headers['x-ratelimit-reset'] = String(result.resetAt);

      if (!result.allowed) {
        ctx.response.status = 429;
        ctx.response.headers['retry-after'] = String(
          Math.ceil((result.resetAt - Date.now()) / 1000)
        );
        ctx.response.body = { error: 'Rate limit exceeded' };
        return; // Don't call next
      }

      await next();
    };
  }

  private authMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
      if (!ctx.route.auth?.required) {
        await next();
        return;
      }

      const authHeader = ctx.request.headers['authorization'];
      if (!authHeader) {
        ctx.response.status = 401;
        ctx.response.body = { error: 'Missing authorization header' };
        return;
      }

      // Validate token (JWT, API key, etc.)
      try {
        const token = authHeader.replace(/^Bearer\s+/i, '').trim();
        const claims = await this.validateToken(token);
        ctx.state.set('user', claims);

        // Check scopes: any one of the route's scopes is sufficient
        if (ctx.route.auth.scopes) {
          const userScopes = claims.scopes || [];
          const hasScope = ctx.route.auth.scopes.some(s =>
            userScopes.includes(s)
          );
          if (!hasScope) {
            ctx.response.status = 403;
            ctx.response.body = { error: 'Insufficient permissions' };
            return;
          }
        }
      } catch {
        ctx.response.status = 401;
        ctx.response.body = { error: 'Invalid token' };
        return;
      }

      await next();
    };
  }

  private requestTransformMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
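      // Add forwarding headers, appending to any existing proxy chain
      const prior = ctx.request.headers['x-forwarded-for'];
      ctx.request.headers['x-forwarded-for'] = prior
        ? `${prior}, ${ctx.request.ip}`
        : ctx.request.ip;
      // Assumes TLS terminates at or before the gateway
      ctx.request.headers['x-forwarded-proto'] = 'https';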
      ctx.request.headers['x-gateway-request-id'] = ctx.requestId;

      // Strip sensitive headers before forwarding
      delete ctx.request.headers['cookie'];

      await next();
    };
  }

  private circuitBreakerMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
      const cbConfig = ctx.route.circuitBreaker;
      if (!cbConfig) {
        await next();
        return;
      }

      const breaker = this.getCircuitBreaker(ctx.route.upstream.service);

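      // canExecute() also moves an open breaker to half-open once the reset
      // timeout has elapsed, so probe requests can get through.
      if (!breaker.canExecute()) {
        ctx.response.status = 503;
        ctx.response.body = {
          error: 'Service temporarily unavailable',
          retryAfter: breaker.nextAttemptAt,
        };
        return;
      }

      try {
        await next();
      } catch (err) {
        breaker.recordFailure();
        throw err;
      }

      // Downstream middleware may report failure as a 5xx status rather
      // than by throwing, so inspect the response as well.
      if (ctx.response.status >= 500) {
        breaker.recordFailure();
      } else {
        breaker.recordSuccess();
      }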
    };
  }

  private retryMiddleware(): MiddlewareFn {
    // compose() rejects a second next() call from the same middleware, so
    // retries invoke a private proxy step directly instead of re-entering
    // the rest of the chain.
    const attemptUpstream = this.proxyMiddleware();
    const noop = async () => {};

    return async (ctx, next) => {
      const retryConfig = ctx.route.retries;
      if (!retryConfig) {
        await next();
        return;
      }

      let lastError: Error | undefined;
      for (let attempt = 0; attempt <= retryConfig.maxRetries; attempt++) {
        try {
          await attemptUpstream(ctx, noop);

          // Check if response indicates a retryable error
          if (retryConfig.retryOn?.includes(ctx.response.status)) {
            throw new Error(`Retryable status: ${ctx.response.status}`);
          }

          return; // Success
        } catch (err) {
          lastError = err as Error;
          if (attempt < retryConfig.maxRetries) {
            // Exponential backoff with up to 20% jitter
            const delay = retryConfig.baseDelayMs * Math.pow(2, attempt);
            await sleep(delay + Math.random() * delay * 0.2);
          }
        }
      }

      ctx.response.status = 502;
      ctx.response.body = { error: 'Upstream request failed', details: lastError?.message };
    };
  }

  private proxyMiddleware(): MiddlewareFn {
    // Terminal middleware: it produces the response and never calls next().
    return async (ctx) => {
      const upstream = ctx.route.upstream;
      const healthy = this.serviceRegistry.getHealthyInstances(upstream.service);

      if (healthy.length === 0) {
        ctx.response.status = 503;
        ctx.response.body = { error: 'No healthy upstream instances' };
        return;
      }

      // Load balance
      const instance = this.selectInstance(healthy, upstream.loadBalancer, ctx);

      // Forward request. Errors propagate so the retry and circuit-breaker
      // middleware further up the chain can react to them.
      const timeout = ctx.route.timeout ?? this.config.timeout;
      const upstreamResponse = await this.forwardRequest(ctx, instance, timeout);
      ctx.response.status = upstreamResponse.status;
      ctx.response.body = upstreamResponse.body;
      Object.assign(ctx.response.headers, upstreamResponse.headers);
    };
  }

  private responseTransformMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
      await next();

      // Strip internal headers
      delete ctx.response.headers['x-internal-trace'];

      // Add gateway headers
      ctx.response.headers['x-response-time'] =
        `${(performance.now() - ctx.startTime).toFixed(2)}ms`;
    };
  }

  private metricsMiddleware(): MiddlewareFn {
    return async (ctx, next) => {
      const start = performance.now();
      try {
        await next();
      } finally {
        const duration = performance.now() - start;
        // Record: method, path, status, duration
        this.recordMetric(
          ctx.request.method,
          ctx.route.path,
          ctx.response.status,
          duration
        );
      }
    };
  }

  // Helpers
  private getRateLimitKey(ctx: RequestContext): string {
    return ctx.request.ip; // Could also use user ID, API key, etc.
  }
  private async validateToken(token: string): Promise<any> { return {}; }
  private getCircuitBreaker(service: string): CircuitBreaker { return {} as any; }
  private selectInstance(instances: ServiceInstance[], strategy: string, ctx: RequestContext): ServiceInstance { return instances[0]; }
  private async forwardRequest(ctx: RequestContext, instance: ServiceInstance, timeout: number): Promise<any> { return {}; }
  private recordMetric(method: string, path: string, status: number, duration: number): void {}
}
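
The order of the middleware chain is easier to reason about with a small standalone run. The sketch below repeats the same Koa-style composition over a hypothetical `Ctx` type; the `outer`, `inner`, `terminal`, `unreachable`, and `twice` middleware are illustrative names, not part of the gateway. It shows two properties: code after `await next()` runs in reverse order on the way out, and a middleware that never calls `next()` cuts off everything listed after it.

```typescript
// Minimal Koa-style composition over a hypothetical context type.
type Ctx = { log: string[] };
type Middleware = (ctx: Ctx, next: () => Promise<void>) => Promise<void>;

function compose(middlewares: Middleware[]): (ctx: Ctx) => Promise<void> {
  return (ctx) => {
    let index = -1;
    const dispatch = (i: number): Promise<void> => {
      if (i <= index) {
        return Promise.reject(new Error('next() called multiple times'));
      }
      index = i;
      const fn = middlewares[i];
      if (!fn) return Promise.resolve();
      return fn(ctx, () => dispatch(i + 1));
    };
    return dispatch(0);
  };
}

const outer: Middleware = async (ctx, next) => {
  ctx.log.push('outer:before');
  await next();
  ctx.log.push('outer:after');
};
const inner: Middleware = async (ctx, next) => {
  ctx.log.push('inner:before');
  await next();
  ctx.log.push('inner:after');
};
// Terminal step: produces the "response" and does not call next().
const terminal: Middleware = async (ctx) => {
  ctx.log.push('terminal');
};
// Listed after the terminal step, so it is never reached.
const unreachable: Middleware = async (ctx) => {
  ctx.log.push('unreachable');
};
// Calling next() twice from one middleware is rejected by compose().
const twice: Middleware = async (_ctx, next) => {
  await next();
  await next();
};

async function runOrderDemo(): Promise<string[]> {
  const ctx: Ctx = { log: [] };
  await compose([outer, inner, terminal, unreachable])(ctx);
  return ctx.log;
}

async function runDoubleNextDemo(): Promise<string> {
  try {
    await compose([twice, terminal])({ log: [] });
    return 'no error';
  } catch (err) {
    return (err as Error).message;
  }
}
```

The second demo matters for retries: a middleware cannot simply loop over `next()`, because the guard in `dispatch` rejects the second call.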

interface RateLimitConfig { requestsPerWindow: number; windowMs: number; }
interface CorsConfig { origin: string; }
interface CacheConfig { ttlMs: number; }
interface RetryConfig { maxRetries: number; baseDelayMs: number; retryOn?: number[]; }
interface CircuitBreakerConfig { failureThreshold: number; resetTimeoutMs: number; }
interface TransformConfig {}
interface AuthProvider {}
interface MiddlewareRef { name: string; config?: unknown; }

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}
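
`rateLimitMiddleware` relies on a `SlidingWindowRateLimiter` that is not defined in this article. A minimal in-memory sketch follows, assuming a sliding-window log (one timestamp per accepted request) and the `check(key)` result shape the middleware reads. The `RateLimitResult` name and the injectable `now` clock are assumptions added for illustration and testability.

```typescript
interface RateLimitConfig { requestsPerWindow: number; windowMs: number; }

// Hypothetical result shape, inferred from how the middleware uses it.
interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAt: number; // epoch ms at which the oldest counted hit expires
}

class SlidingWindowRateLimiter {
  private hits = new Map<string, number[]>();

  constructor(
    private config: RateLimitConfig,
    private now: () => number = Date.now
  ) {}

  check(key: string): RateLimitResult {
    const now = this.now();
    const cutoff = now - this.config.windowMs;

    // Keep only timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter(t => t > cutoff);

    const allowed = recent.length < this.config.requestsPerWindow;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);

    // A slot frees up when the oldest counted hit ages out of the window.
    const oldest = recent[0] ?? now;
    return {
      allowed,
      limit: this.config.requestsPerWindow,
      remaining: Math.max(0, this.config.requestsPerWindow - recent.length),
      resetAt: oldest + this.config.windowMs,
    };
  }
}
```

This keeps state in a single process and never evicts idle keys, so it only suits a single gateway instance. With several gateway replicas the counters would need to live in a shared store.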

2. Route Matching Engine

// ─── Segment Trie Router (one node per path segment) ───
interface RouteMatch {
  route: RouteConfig;
  params: Record<string, string>;
}

class RouteTree {
  private root: RouteNode = { children: new Map(), paramChild: null, wildcardChild: null };

  add(route: RouteConfig): void {
    for (const method of route.methods) {
      const segments = route.path.split('/').filter(Boolean);
      let node = this.root;

      for (const segment of segments) {
        if (segment.startsWith(':')) {
          // Parameter segment: /users/:id
          if (!node.paramChild) {
            node.paramChild = {
              children: new Map(),
              paramChild: null,
              wildcardChild: null,
              paramName: segment.slice(1),
            };
          }
          node = node.paramChild;
        } else if (segment === '*') {
          // Wildcard: /assets/*
          if (!node.wildcardChild) {
            node.wildcardChild = {
              children: new Map(),
              paramChild: null,
              wildcardChild: null,
            };
          }
          node = node.wildcardChild;
          break;
        } else {
          // Static segment: /api/v1
          if (!node.children.has(segment)) {
            node.children.set(segment, {
              children: new Map(),
              paramChild: null,
              wildcardChild: null,
            });
          }
          node = node.children.get(segment)!;
        }
      }

      // Store route at terminal node
      if (!node.routes) node.routes = new Map();
      node.routes.set(method, route);
    }
  }

  match(method: string, path: string): RouteMatch | null {
    const segments = path.split('/').filter(Boolean);
    const params: Record<string, string> = {};

    const result = this.matchNode(this.root, segments, 0, params);
    if (!result) return null;

    const route = result.routes?.get(method) || result.routes?.get('*');
    if (!route) return null;

    return { route, params };
  }

  private matchNode(
    node: RouteNode,
    segments: string[],
    index: number,
    params: Record<string, string>
  ): RouteNode | null {
    // Base case: consumed all segments
    if (index === segments.length) {
      return node.routes ? node : null;
    }

    const segment = segments[index];

    // 1. Try static match first (highest priority)
    if (node.children.has(segment)) {
      const result = this.matchNode(
        node.children.get(segment)!,
        segments,
        index + 1,
        params
      );
      if (result) return result;
    }

    // 2. Try parameter match
    if (node.paramChild) {
      params[node.paramChild.paramName!] = segment;
      const result = this.matchNode(
        node.paramChild,
        segments,
        index + 1,
        params
      );
      if (result) return result;
      delete params[node.paramChild.paramName!];
    }

    // 3. Try wildcard match
    if (node.wildcardChild) {
      params['*'] = segments.slice(index).join('/');
      return node.wildcardChild;
    }

    return null;
  }
}

interface RouteNode {
  children: Map<string, RouteNode>;
  paramChild: RouteNode | null;
  wildcardChild: RouteNode | null;
  paramName?: string;
  routes?: Map<string, RouteConfig>;
}
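
The matching priority (static segment first, then `:param`, then `*`) can be checked with a compact standalone version of the same idea. This is a sketch, not the `RouteTree` class itself: `TrieNode`, `addRoute`, `matchRoute`, and the handler names are hypothetical, and HTTP methods are left out to keep it short.

```typescript
// Compact segment trie: static > :param > * priority, with backtracking.
interface TrieNode {
  statics: Map<string, TrieNode>;
  param?: { name: string; node: TrieNode };
  wildcard?: TrieNode;
  handler?: string; // hypothetical: name of the upstream service
}

const newNode = (): TrieNode => ({ statics: new Map() });
const splitPath = (path: string): string[] => path.split('/').filter(Boolean);

function addRoute(root: TrieNode, path: string, handler: string): void {
  let node = root;
  for (const seg of splitPath(path)) {
    if (seg === '*') {
      if (!node.wildcard) node.wildcard = newNode();
      node = node.wildcard;
      break; // a wildcard consumes the rest of the path
    } else if (seg.startsWith(':')) {
      if (!node.param) node.param = { name: seg.slice(1), node: newNode() };
      node = node.param.node;
    } else {
      let child = node.statics.get(seg);
      if (!child) {
        child = newNode();
        node.statics.set(seg, child);
      }
      node = child;
    }
  }
  node.handler = handler;
}

type Match = { handler: string; params: Record<string, string> };

function matchRoute(
  node: TrieNode,
  segs: string[],
  i = 0,
  params: Record<string, string> = {}
): Match | null {
  if (i === segs.length) {
    return node.handler ? { handler: node.handler, params } : null;
  }
  const seg = segs[i];

  // 1. Static child has the highest priority.
  const child = node.statics.get(seg);
  if (child) {
    const hit = matchRoute(child, segs, i + 1, params);
    if (hit) return hit;
  }
  // 2. Parameter child; a fresh params object makes backtracking trivial.
  if (node.param) {
    const next = { ...params, [node.param.name]: seg };
    const hit = matchRoute(node.param.node, segs, i + 1, next);
    if (hit) return hit;
  }
  // 3. Wildcard captures everything that is left.
  if (node.wildcard?.handler) {
    const rest = segs.slice(i).join('/');
    return { handler: node.wildcard.handler, params: { ...params, '*': rest } };
  }
  return null;
}
```

Registering `/api/v1/users/me` and `/api/v1/users/:id` side by side shows the priority rule: the literal `me` wins over the parameter, while any other value falls through to `:id`.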

3. Service Registry & Discovery

// ─── Service Registry ──────────────────────────────────
interface ServiceInstance {
  id: string;
  service: string;
  host: string;
  port: number;
  healthy: boolean;
  weight: number;
  metadata: Record<string, string>;
  registeredAt: number;
  lastHealthCheck: number;
  consecutiveFailures: number;
}

class ServiceRegistry {
  private services = new Map<string, ServiceInstance[]>();
  private healthCheckTimers = new Map<string, ReturnType<typeof setInterval>>();

  // ── Register Service Instance ────────────────────
  register(instance: ServiceInstance): void {
    if (!this.services.has(instance.service)) {
      this.services.set(instance.service, []);
    }

    const instances = this.services.get(instance.service)!;
    const existing = instances.findIndex(i => i.id === instance.id);

    if (existing >= 0) {
      instances[existing] = instance; // Update
    } else {
      instances.push(instance);
    }

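    // (Re)start health checks; stop any timer from a previous registration
    // so updates do not leak intervals or keep probing a stale object.
    this.stopHealthCheck(instance.id);
    this.startHealthCheck(instance);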
  }

  // ── Deregister ───────────────────────────────────
  deregister(instanceId: string): void {
    for (const instances of this.services.values()) {
      const idx = instances.findIndex(i => i.id === instanceId);
      if (idx >= 0) {
        instances.splice(idx, 1);
        this.stopHealthCheck(instanceId);
        break;
      }
    }
  }

  // ── Get Healthy Instances ────────────────────────
  getInstances(service: string): ServiceInstance[] {
    return this.services.get(service) || [];
  }

  getHealthyInstances(service: string): ServiceInstance[] {
    return this.getInstances(service).filter(i => i.healthy);
  }

  // ── Health Checking ──────────────────────────────
  private startHealthCheck(instance: ServiceInstance): void {
    const timer = setInterval(async () => {
      try {
        const response = await fetch(
          `http://${instance.host}:${instance.port}/health`,
          { signal: AbortSignal.timeout(5000) }
        );

        if (response.ok) {
          instance.healthy = true;
          instance.consecutiveFailures = 0;
        } else {
          this.handleHealthCheckFailure(instance);
        }
      } catch {
        this.handleHealthCheckFailure(instance);
      }

      instance.lastHealthCheck = Date.now();
    }, 10000);

    this.healthCheckTimers.set(instance.id, timer);
  }

  private handleHealthCheckFailure(instance: ServiceInstance): void {
    instance.consecutiveFailures++;
    if (instance.consecutiveFailures >= 3) {
      instance.healthy = false;
    }
  }

  private stopHealthCheck(instanceId: string): void {
    const timer = this.healthCheckTimers.get(instanceId);
    if (timer) {
      clearInterval(timer);
      this.healthCheckTimers.delete(instanceId);
    }
  }
}
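
`UpstreamConfig.healthCheck` declares both an `unhealthyThreshold` and a `healthyThreshold`, while the registry above hard-codes three failures and restores an instance on its first successful probe. One way to honor both thresholds is a small pure state-transition function. This is a sketch under that assumption; `HealthState`, `Thresholds`, and `applyProbe` are hypothetical names.

```typescript
interface HealthState {
  healthy: boolean;
  consecutiveFailures: number;
  consecutiveSuccesses: number;
}

interface Thresholds {
  unhealthyThreshold: number; // failures in a row before marking unhealthy
  healthyThreshold: number;   // successes in a row before marking healthy
}

// Returns the next state after one probe result; the input is not mutated.
function applyProbe(state: HealthState, ok: boolean, t: Thresholds): HealthState {
  if (ok) {
    const successes = state.consecutiveSuccesses + 1;
    return {
      // Stay healthy, or recover once enough successes have accumulated.
      healthy: state.healthy || successes >= t.healthyThreshold,
      consecutiveFailures: 0,
      consecutiveSuccesses: successes,
    };
  }
  const failures = state.consecutiveFailures + 1;
  return {
    // Stay unhealthy, or degrade once enough failures have accumulated.
    healthy: state.healthy && failures < t.unhealthyThreshold,
    consecutiveFailures: failures,
    consecutiveSuccesses: 0,
  };
}
```

Requiring several successes before recovery avoids flapping when an instance answers one probe and then fails again.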

// ─── Load Balancer ─────────────────────────────────────
class LoadBalancer {
  private rrIndex = 0;
  private connectionCounts = new Map<string, number>();

  selectInstance(
    instances: ServiceInstance[],
    strategy: string,
    requestKey?: string
  ): ServiceInstance {
    switch (strategy) {
      case 'round-robin':
        return this.roundRobin(instances);
      case 'least-connections':
        return this.leastConnections(instances);
      case 'weighted-round-robin':
        return this.weightedRoundRobin(instances);
      case 'consistent-hash':
        return this.consistentHash(instances, requestKey || '');
      default:
        return instances[Math.floor(Math.random() * instances.length)];
    }
  }

  private roundRobin(instances: ServiceInstance[]): ServiceInstance {
    const instance = instances[this.rrIndex % instances.length];
    this.rrIndex++;
    return instance;
  }

  private leastConnections(instances: ServiceInstance[]): ServiceInstance {
    return instances.reduce((min, inst) => {
      const count = this.connectionCounts.get(inst.id) || 0;
      const minCount = this.connectionCounts.get(min.id) || 0;
      return count < minCount ? inst : min;
    });
  }

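  // Note: despite the name, this picks an instance at random with
  // probability proportional to its weight; it keeps no rotation state.
  private weightedRoundRobin(instances: ServiceInstance[]): ServiceInstance {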
    const totalWeight = instances.reduce((sum, i) => sum + i.weight, 0);
    let random = Math.random() * totalWeight;

    for (const instance of instances) {
      random -= instance.weight;
      if (random <= 0) return instance;
    }

    return instances[0];
  }

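  // Simple hash-mod selection. A key maps to the same instance only while
  // the instance list is unchanged; adding or removing an instance remaps
  // most keys. A true consistent-hash ring limits that remapping.
  private consistentHash(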
    instances: ServiceInstance[],
    key: string
  ): ServiceInstance {
    const hash = this.hashKey(key);
    const index = hash % instances.length;
    return instances[index];
  }

  private hashKey(key: string): number {
    let hash = 0;
    for (let i = 0; i < key.length; i++) {
      hash = ((hash << 5) - hash + key.charCodeAt(i)) | 0;
    }
    return Math.abs(hash);
  }

  recordConnection(instanceId: string): void {
    this.connectionCounts.set(
      instanceId,
      (this.connectionCounts.get(instanceId) || 0) + 1
    );
  }

  releaseConnection(instanceId: string): void {
    const count = this.connectionCounts.get(instanceId) || 0;
    this.connectionCounts.set(instanceId, Math.max(0, count - 1));
  }
}
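
The `consistentHash` strategy above selects with `hash % instances.length`, which reshuffles most keys whenever the instance count changes. A hash ring with virtual nodes is the usual way to keep remapping small. The following is a minimal sketch, not a drop-in replacement: `HashRing`, `fnv1a`, and the default of 100 virtual nodes per instance are assumptions chosen for illustration.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // unsigned 32-bit
}

class HashRing {
  private ring: { point: number; nodeId: string }[] = [];

  constructor(nodeIds: string[], private virtualNodes = 100) {
    for (const id of nodeIds) this.add(id);
  }

  // Each node is placed on the ring at several pseudo-random points.
  add(nodeId: string): void {
    for (let v = 0; v < this.virtualNodes; v++) {
      this.ring.push({ point: fnv1a(`${nodeId}#${v}`), nodeId });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  remove(nodeId: string): void {
    this.ring = this.ring.filter(e => e.nodeId !== nodeId);
  }

  // Owner is the first ring point at or after the key's hash, wrapping around.
  lookup(key: string): string | undefined {
    if (this.ring.length === 0) return undefined;
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].nodeId;
  }
}
```

Removing an instance only moves the keys that instance owned; every other key keeps its owner, which is what makes this useful for sticky sessions and per-instance caches.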

4. Circuit Breaker

// ─── Circuit Breaker Implementation ────────────────────
type CircuitState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  state: CircuitState = 'closed';
  private failures = 0;
  private successes = 0;
  private lastFailure = 0;
  private lastStateChange = Date.now();
  nextAttemptAt = 0;

  private slidingWindow: { timestamp: number; success: boolean }[] = [];

  constructor(
    private config: {
      failureThreshold: number;   // Failures before opening
      successThreshold: number;   // Successes in half-open to close
      resetTimeoutMs: number;     // Time in open before half-open
      windowSizeMs: number;       // Sliding window for failure rate
      failureRateThreshold: number; // 0.0-1.0
    }
  ) {}

  // ── Check if request should be allowed ───────────
  canExecute(): boolean {
    switch (this.state) {
      case 'closed':
        return true;

      case 'open':
        if (Date.now() >= this.nextAttemptAt) {
          this.transitionTo('half-open');
          return true; // Allow probe request
        }
        return false;

      case 'half-open':
        return true; // Probes pass; this sketch does not cap their number

      default:
        return false;
    }
  }

  // ── Record Success ───────────────────────────────
  recordSuccess(): void {
    this.slidingWindow.push({ timestamp: Date.now(), success: true });
    this.trimWindow();

    switch (this.state) {
      case 'half-open':
        this.successes++;
        if (this.successes >= this.config.successThreshold) {
          this.transitionTo('closed');
        }
        break;
      case 'closed':
        // Reset failure count on success
        this.failures = 0;
        break;
    }
  }

  // ── Record Failure ───────────────────────────────
  recordFailure(): void {
    this.slidingWindow.push({ timestamp: Date.now(), success: false });
    this.trimWindow();
    this.lastFailure = Date.now();

    switch (this.state) {
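      case 'closed': {
        this.failures++;
        // Only trust the failure rate once the window holds enough samples;
        // otherwise a single failure in an empty window reads as 100%.
        // Assumption: failureThreshold doubles as the minimum sample count.
        const enoughSamples =
          this.slidingWindow.length >= this.config.failureThreshold;
        if (
          this.failures >= this.config.failureThreshold ||
          (enoughSamples &&
            this.getFailureRate() >= this.config.failureRateThreshold)
        ) {
          this.transitionTo('open');
        }
        break;
      }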
      case 'half-open':
        // Any failure in half-open → back to open
        this.transitionTo('open');
        break;
    }
  }

  // ── State Transitions ────────────────────────────
  private transitionTo(newState: CircuitState): void {
    const oldState = this.state;
    this.state = newState;
    this.lastStateChange = Date.now();

    switch (newState) {
      case 'open':
        this.nextAttemptAt = Date.now() + this.config.resetTimeoutMs;
        this.successes = 0;
        console.warn(`Circuit OPENED (was ${oldState}). Next attempt at ${new Date(this.nextAttemptAt).toISOString()}`);
        break;
      case 'half-open':
        this.successes = 0;
        this.failures = 0;
        console.info('Circuit HALF-OPEN. Allowing probe requests.');
        break;
      case 'closed':
        this.failures = 0;
        this.successes = 0;
        console.info('Circuit CLOSED. Normal operation resumed.');
        break;
    }
  }

  private getFailureRate(): number {
    if (this.slidingWindow.length === 0) return 0;
    const failures = this.slidingWindow.filter(e => !e.success).length;
    return failures / this.slidingWindow.length;
  }

  private trimWindow(): void {
    const cutoff = Date.now() - this.config.windowSizeMs;
    this.slidingWindow = this.slidingWindow.filter(e => e.timestamp > cutoff);
  }

  getStats(): CircuitBreakerStats {
    return {
      state: this.state,
      failures: this.failures,
      successes: this.successes,
      failureRate: this.getFailureRate(),
      lastFailure: this.lastFailure,
      lastStateChange: this.lastStateChange,
      nextAttemptAt: this.state === 'open' ? this.nextAttemptAt : undefined,
    };
  }
}

interface CircuitBreakerStats {
  state: CircuitState;
  failures: number;
  successes: number;
  failureRate: number;
  lastFailure: number;
  lastStateChange: number;
  nextAttemptAt?: number;
}

Circuit Breaker State Machine

  ┌────────────────┐
  │    CLOSED       │◄──────────────────────────────┐
  │  (normal flow)  │                               │
  │                 │          success threshold     │
  └───────┬────────┘          met in half-open      │
          │                                         │
          │ failure threshold                       │
          │ exceeded                                │
          ▼                                         │
  ┌────────────────┐     reset timeout    ┌─────────┴──────┐
  │     OPEN        │───────────────────►│   HALF-OPEN     │
  │ (reject all)    │                    │ (allow probes)  │
  │                 │◄───────────────────│                 │
  └────────────────┘    any failure      └────────────────┘
                        in half-open

  Typical Settings:
  - failureThreshold: 5 failures
  - resetTimeout: 30 seconds
  - successThreshold: 3 consecutive successes
  - windowSize: 60 seconds
  - failureRateThreshold: 50%
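
Callers drive the state machine through three methods: `canExecute()` before the call, then `recordSuccess()` or `recordFailure()` after it. A small wrapper keeps that protocol in one place. This is a sketch: `BreakerLike`, `CircuitOpenError`, and `withBreaker` are hypothetical names, and the interface is structural so that the `CircuitBreaker` class above would satisfy it.

```typescript
// Structural view of the breaker: only the methods the wrapper needs.
interface BreakerLike {
  canExecute(): boolean;
  recordSuccess(): void;
  recordFailure(): void;
}

class CircuitOpenError extends Error {
  constructor() {
    super('Circuit open');
    this.name = 'CircuitOpenError';
  }
}

// Runs `call` under the breaker: fail fast when open, record the outcome otherwise.
async function withBreaker<T>(
  breaker: BreakerLike,
  call: () => Promise<T>
): Promise<T> {
  if (!breaker.canExecute()) {
    throw new CircuitOpenError(); // the protected call is never started
  }
  try {
    const result = await call();
    breaker.recordSuccess();
    return result;
  } catch (err) {
    breaker.recordFailure();
    throw err; // callers still see the original error
  }
}
```

A caller would wrap each upstream request, for example `withBreaker(breaker, () => fetch(url))`, and map `CircuitOpenError` to a 503 response.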

5. Service Mesh Sidecar Proxy

// ─── Sidecar Proxy Configuration ───────────────────────
interface SidecarConfig {
  serviceName: string;
  servicePort: number;       // Local service port
  inboundPort: number;       // Sidecar inbound listener
  outboundPort: number;      // Sidecar outbound listener
  adminPort: number;         // Health/metrics/config
  mtls: MTLSConfig;
  retryPolicy: RetryPolicy;
  circuitBreaker: CircuitBreakerConfig;
  rateLimits: RateLimitRule[];
  tracing: TracingConfig;
}

interface MTLSConfig {
  enabled: boolean;
  certPath: string;
  keyPath: string;
  caPath: string;
  rotationIntervalMs: number;
}

interface RetryPolicy {
  maxRetries: number;
  perTryTimeout: number;
  retryOn: string[];   // 5xx, reset, connect-failure, etc.
  retryBudget: {
    percentOfRequests: number;  // Max 20% of requests can be retries
    minRetries: number;         // But at least 3/sec
  };
}

interface RateLimitRule {
  source: string;       // Source service name
  requestsPerSecond: number;
}

interface TracingConfig {
  samplingRate: number; // 0.0-1.0
  exportEndpoint: string;
}
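
`SidecarProxy` constructs a `RetryBudget` from the two `retryBudget` fields, but the class itself is not shown. A minimal sketch follows, assuming a sliding time window over request and retry timestamps. The `windowMs` parameter, its 10-second default, and the injectable `now` clock are assumptions added for illustration; only the constructor's first two arguments and the three method names come from the proxy code.

```typescript
class RetryBudget {
  private requests: number[] = [];
  private retries: number[] = [];

  constructor(
    private percentOfRequests: number,   // e.g. 20 allows retries up to 20% of requests
    private minRetriesPerSecond: number, // floor so low-traffic services can still retry
    private windowMs = 10000,            // assumed accounting window
    private now: () => number = Date.now
  ) {}

  recordRequest(): void {
    this.requests.push(this.now());
  }

  recordRetry(): void {
    this.retries.push(this.now());
  }

  // A retry is allowed while retries in the window stay under the larger of
  // the traffic-proportional allowance and the fixed per-second floor.
  canRetry(): boolean {
    this.trim();
    const fromTraffic = this.requests.length * (this.percentOfRequests / 100);
    const floor = this.minRetriesPerSecond * (this.windowMs / 1000);
    return this.retries.length < Math.max(fromTraffic, floor);
  }

  // Drop timestamps that have aged out of the window.
  private trim(): void {
    const cutoff = this.now() - this.windowMs;
    this.requests = this.requests.filter(t => t > cutoff);
    this.retries = this.retries.filter(t => t > cutoff);
  }
}
```

The budget is what prevents retry storms: when an upstream degrades and every caller starts retrying, the retry share is capped relative to first-attempt traffic instead of multiplying the load.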

// ─── Sidecar Proxy Implementation ──────────────────────
class SidecarProxy {
  private circuitBreakers = new Map<string, CircuitBreaker>();
  private retryBudget: RetryBudget;

  constructor(private config: SidecarConfig) {
    this.retryBudget = new RetryBudget(
      config.retryPolicy.retryBudget.percentOfRequests,
      config.retryPolicy.retryBudget.minRetries
    );
  }

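  // ── Inbound: Mesh Peer → Local Service ───────────
  async handleInbound(req: IncomingRequest): Promise<GatewayResponse> {
    // Never trust a caller-supplied identity header; it is only set below
    // from a verified client certificate.
    delete req.headers['x-source-service'];

    // 1. Verify mTLS identity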
    if (this.config.mtls.enabled) {
      const identity = await this.verifyClientCert(req);
      if (!identity) {
        return { status: 403, headers: {}, body: 'mTLS verification failed' };
      }
      req.headers['x-source-service'] = identity;
    }

    // 2. Rate limit by source service
    const source = req.headers['x-source-service'] || 'unknown';
    const rule = this.config.rateLimits.find(r => r.source === source);
    if (rule && !this.checkRateLimit(source, rule)) {
      return { status: 429, headers: {}, body: 'Rate limited' };
    }

    // 3. Inject tracing headers
    this.injectTracing(req);

    // 4. Forward to local service
    const response = await this.forwardToLocal(req);
    return response;
  }

  // ── Outbound: Local Service → Remote Service ─────
  async handleOutbound(
    req: IncomingRequest,
    targetService: string
  ): Promise<GatewayResponse> {
    // 1. Service discovery — resolve target
    const endpoints = await this.resolveService(targetService);
    if (endpoints.length === 0) {
      return { status: 503, headers: {}, body: 'No endpoints available' };
    }

    // 2. Circuit breaker check
    const breaker = this.getCircuitBreaker(targetService);
    if (!breaker.canExecute()) {
      return { status: 503, headers: {}, body: 'Circuit open' };
    }

    // 3. Load balance
    const endpoint = this.selectEndpoint(endpoints);

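    // 4. Tag the request with this service's identity. The mTLS handshake
    //    itself is assumed to happen in the transport used by forward().
    if (this.config.mtls.enabled) {
      req.headers['x-source-service'] = this.config.serviceName;
    }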

    // 5. Forward with retry
    return this.forwardWithRetry(req, endpoint, breaker);
  }

  // ── Retry with Budget ────────────────────────────
  private async forwardWithRetry(
    req: IncomingRequest,
    endpoint: ServiceEndpoint,
    breaker: CircuitBreaker
  ): Promise<GatewayResponse> {
    let lastError: Error | undefined;
    let lastResponse: GatewayResponse | undefined;

    // Count the original request once; retries are tracked separately
    this.retryBudget.recordRequest();

    for (let attempt = 0; attempt <= this.config.retryPolicy.maxRetries; attempt++) {
      if (attempt > 0) {
        // Check retry budget (prevent retry storms), then charge this retry to it
        if (!this.retryBudget.canRetry()) break;
        this.retryBudget.recordRetry();
      }

      try {
        const response = await this.forward(req, endpoint);

        if (response.status < 500) {
          breaker.recordSuccess();
          return response;
        }

        // Every 5xx counts against the breaker, including ones we go on to retry
        breaker.recordFailure();
        if (!this.isRetryableStatus(response.status)) return response;
        lastResponse = response;
        lastError = undefined;
      } catch (err) {
        lastError = err as Error;
        lastResponse = undefined;
        breaker.recordFailure();
      }
    }

    // Attempts exhausted or budget spent: surface the last upstream response if there was one
    if (lastResponse) return lastResponse;
    return {
      status: 502,
      headers: {},
      body: `All attempts failed: ${lastError?.message ?? 'unknown error'}`,
    };
  }

  private getCircuitBreaker(service: string): CircuitBreaker {
    if (!this.circuitBreakers.has(service)) {
      this.circuitBreakers.set(service, new CircuitBreaker({
        failureThreshold: 5,
        successThreshold: 3,
        resetTimeoutMs: 30000,
        windowSizeMs: 60000,
        failureRateThreshold: 0.5,
      }));
    }
    return this.circuitBreakers.get(service)!;
  }

  private isRetryableStatus(status: number): boolean {
    return status === 502 || status === 503 || status === 504;
  }

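  // Simplified helpers: placeholder stubs so the class type-checks.
  // As written, verifyClientCert rejects every caller and resolveService finds no endpoints.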
  private async verifyClientCert(req: IncomingRequest): Promise<string | null> { return null; }
  private checkRateLimit(source: string, rule: RateLimitRule): boolean { return true; }
  private injectTracing(req: IncomingRequest): void {}
  private async forwardToLocal(req: IncomingRequest): Promise<GatewayResponse> { return {} as any; }
  private async resolveService(name: string): Promise<ServiceEndpoint[]> { return []; }
  private selectEndpoint(endpoints: ServiceEndpoint[]): ServiceEndpoint { return endpoints[0]; }
  private async forward(req: IncomingRequest, endpoint: ServiceEndpoint): Promise<GatewayResponse> { return {} as any; }
}

interface ServiceEndpoint { host: string; port: number; }

// ─── Retry Budget ──────────────────────────────────────
// Prevents retry storms — limits total retry rate
class RetryBudget {
  private requests: number[] = [];
  private retries: number[] = [];
  private windowMs = 10000;

  constructor(
    private maxRetryPercentage: number,
    private minRetriesPerWindow: number
  ) {}

  canRetry(): boolean {
    this.trim();
    const totalRequests = this.requests.length;
    const totalRetries = this.retries.length;

    // Always allow minimum retries
    if (totalRetries < this.minRetriesPerWindow) return true;

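    // Cap at a fraction of total requests. maxRetryPercentage is treated as a
    // ratio here (0.2 = 20%); divide by 100 first if the config stores 20.
    return totalRetries < totalRequests * this.maxRetryPercentage;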
  }

  recordRequest(): void {
    this.requests.push(Date.now());
  }

  recordRetry(): void {
    this.retries.push(Date.now());
  }

  private trim(): void {
    const cutoff = Date.now() - this.windowMs;
    this.requests = this.requests.filter(t => t > cutoff);
    this.retries = this.retries.filter(t => t > cutoff);
  }
}

Sidecar Proxy Traffic Flow

  ┌────────────────────────────────────────────────────┐
  │                  Pod / Container                    │
  │                                                    │
  │  ┌──────────────────────────────────────────┐      │
  │  │           Sidecar Proxy (Envoy)          │      │
  │  │                                          │      │
  │  │  Inbound:15006     Outbound:15001        │      │
  │  │  ┌─────────┐       ┌──────────┐          │      │
  │  │  │ mTLS    │       │ Service  │          │      │
  │  │  │ AuthZ   │       │ Discovery│          │      │
  │  │  │RateLimit│       │ LB       │          │      │
  │  │  │ Tracing │       │ Retry    │          │      │
  │  │  └────┬────┘       │ Circuit  │          │      │
  │  │       │            │ Breaker  │          │      │
  │  │       │            └────┬─────┘          │      │
  │  └───────┼─────────────────┼────────────────┘      │
  │          │                 │                       │
  │          ▼                 ▲                       │
  │  ┌──────────────┐          │                       │
  │  │    App       │──────────┘                       │
  │  │  :8080       │  (app thinks it's calling        │
  │  │              │   localhost; sidecar intercepts) │
  │  └──────────────┘                                  │
  └────────────────────────────────────────────────────┘
       │                            │
       ▼ inbound                   ▼ outbound
   (from other                 (to other
    services)                   services)

6. Rate Limiter (Sliding Window)

// ─── Sliding Window Rate Limiter ───────────────────────
class SlidingWindowRateLimiter {
  private windows = new Map<string, WindowState>();

  constructor(private config: RateLimitConfig) {}

  check(key: string): RateLimitResult {
    const now = Date.now();
    const windowStart = Math.floor(now / this.config.windowMs) * this.config.windowMs;
    const prevWindowStart = windowStart - this.config.windowMs;

    let state = this.windows.get(key);
    if (!state) {
      state = { currentCount: 0, previousCount: 0, windowStart };
      this.windows.set(key, state);
    }

    // Roll window if needed
    if (state.windowStart < windowStart) {
      if (state.windowStart === prevWindowStart) {
        state.previousCount = state.currentCount;
      } else {
        state.previousCount = 0;
      }
      state.currentCount = 0;
      state.windowStart = windowStart;
    }

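    // Weighted count: the previous window contributes in proportion to how
    // much of it still overlaps the sliding window ending at `now`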
    const elapsed = now - windowStart;
    const weight = 1 - elapsed / this.config.windowMs;
    const count = state.currentCount + Math.floor(state.previousCount * weight);

    if (count >= this.config.requestsPerWindow) {
      return {
        allowed: false,
        limit: this.config.requestsPerWindow,
        remaining: 0,
        resetAt: windowStart + this.config.windowMs,
      };
    }

    state.currentCount++;

    return {
      allowed: true,
      limit: this.config.requestsPerWindow,
      remaining: this.config.requestsPerWindow - count - 1,
      resetAt: windowStart + this.config.windowMs,
    };
  }
}

interface WindowState {
  currentCount: number;
  previousCount: number;
  windowStart: number;
}

interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAt: number;
}
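
To make the weighting concrete, here is a small standalone sketch of the same estimate the limiter computes. The helper name `estimateWindowCount` and the sample numbers are illustrative assumptions, not part of the gateway code above.

```typescript
// Hypothetical helper: reproduces the limiter's weighted estimate in isolation.
function estimateWindowCount(
  previousCount: number,
  currentCount: number,
  elapsedMs: number, // time since the current fixed window started
  windowMs: number
): number {
  // Fraction of the previous window that still overlaps the sliding window
  const weight = 1 - elapsedMs / windowMs;
  return currentCount + Math.floor(previousCount * weight);
}

// 60 s window, 15 s into the current window: 75% of the previous window still counts.
const estimate = estimateWindowCount(80, 30, 15000, 60000);
console.log(estimate); // 30 + floor(80 * 0.75) = 90
```

With a limit of 100 requests per window, this request would be allowed and the limiter would report 9 remaining (100 - 90 - 1). The estimate assumes requests in the previous window were evenly spread, which is the trade-off this approach makes to avoid storing a timestamp per request.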

Comparison: API Gateway vs Service Mesh

Aspect	API Gateway	Service Mesh
Traffic direction	North-South (external → internal)	East-West (internal → internal)
Deployment	Centralized (1-few instances)	Decentralized (sidecar per pod)
Audience	External clients, partners	Internal services
Auth	API keys, OAuth, JWT	mTLS, SPIFFE identity
Routing	Path, header-based	Service name, version
Rate limiting	Per client/API key	Per service pair
Protocol	HTTP/REST, GraphQL, WebSocket	HTTP/2, gRPC (any L4/L7)
Visibility	Request logs, analytics	Distributed tracing
Examples	Kong, APISIX, Envoy Gateway	Istio, Linkerd, Consul Connect
Config model	Admin API, declarative	Control plane (CRDs)

Comparison: Gateway Implementations

Feature	Nginx	Envoy	Kong	AWS API GW	Traefik
Config	Static file	xDS API (dynamic)	DB + Admin API	Console/CloudFormation	Labels/tags
Extensibility	Lua, C modules	WASM, C++ filters	Lua plugins	Lambda authorizers	Go plugins
Protocol	HTTP/1.1, HTTP/2	HTTP/1.1, HTTP/2, gRPC	HTTP, gRPC	HTTP, WebSocket	HTTP, TCP, gRPC
Hot reload	Graceful reload (nginx -s reload)	Hot restart + dynamic xDS updates	Plugin hot reload	Managed	Hot reload
Service mesh	No (Nginx Plus)	Yes (Istio data plane)	Via Kong Mesh	No (App Mesh is a separate service)	Via Traefik Mesh (formerly Maesh)
Circuit breaker	No	Yes (circuit breaking + outlier detection)	Plugin	No	Yes
Best for	Static sites, reverse proxy	Service mesh, dynamic	API management	Serverless	K8s ingress

Interview Questions & Answers

Q1: How does a service mesh achieve zero-trust networking with mTLS?

A: Each sidecar proxy holds a unique identity certificate issued by the mesh's certificate authority (CA). When Service A calls Service B, their sidecars perform mutual TLS: A presents its certificate to B's sidecar, B presents its certificate to A's sidecar, and each verifies the other's chain of trust back to the shared CA. The application code handles none of this: it talks to localhost in plaintext, and the sidecar transparently encrypts and decrypts. The control plane (istiod in Istio, which absorbed the former Citadel component; the identity service in Linkerd) rotates certificates automatically, typically with lifetimes of about 24 hours. Authorization policies can reference these identities: "only Service A can call Service B's /admin endpoint." This provides authentication (who is calling), encryption (traffic cannot be read in transit), and authorization (who is allowed to call what), all without changing application code.
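
The authorization step can be sketched as a deny-by-default check against the identity taken from the verified client certificate. This is a minimal illustration, not how any particular mesh implements it: the `AuthzRule` shape, the `isAllowed` function, and the SPIFFE-style identity strings are assumptions made for the example.

```typescript
// Hypothetical policy shape; real meshes express this as control-plane configuration.
interface AuthzRule {
  allowedPrincipals: string[]; // identities taken from the verified client certificate
  pathPrefix: string;          // request paths this rule covers
}

// Deny by default: a request passes only if some rule matches both path and identity.
function isAllowed(principal: string, path: string, rules: AuthzRule[]): boolean {
  return rules.some(
    rule => path.startsWith(rule.pathPrefix) && rule.allowedPrincipals.includes(principal)
  );
}

// Example identities (assumed format: spiffe://<trust-domain>/ns/<namespace>/sa/<account>)
const serviceA = "spiffe://cluster.local/ns/default/sa/service-a";
const serviceC = "spiffe://cluster.local/ns/default/sa/service-c";

const adminRules: AuthzRule[] = [
  { allowedPrincipals: [serviceA], pathPrefix: "/admin" },
];

console.log(isAllowed(serviceA, "/admin/users", adminRules)); // true
console.log(isAllowed(serviceC, "/admin/users", adminRules)); // false
```

Because the principal comes from the certificate rather than from a request header, a caller cannot claim to be Service A without holding Service A's private key.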

Q2: Explain the difference between retry and retry budget.

A: A retry policy says "retry failed requests up to 3 times." A retry budget says "retries must not exceed 20% of total requests." Without a budget, if a downstream service starts failing, every client retries 3 times, so each request becomes up to 4 attempts and the load on an already struggling service can quadruple. This is a retry storm, and it cascades failures. A retry budget (e.g., Linkerd's retryBudget; Envoy offers a comparable retry_budget setting) tracks the ratio of retries to total requests, typically over a sliding window. Once retries exceed the budget (say 20%), additional retries are suppressed. A minimum retry allowance (e.g., 3/second) ensures transient errors can still be retried during low-traffic periods. The result keeps the benefit of retries for transient errors while preventing amplification during outages.
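
The amplification is easy to work through numerically. The sketch below assumes the worst case, a full outage in which every attempt fails, and counts one original attempt plus the retries each policy permits. The function names and the sample figures (1000 requests, 3 retries, a 20% budget) are illustrative assumptions.

```typescript
// Worst case during a full outage: every attempt fails, so every permitted retry is used.
function attemptsWithoutBudget(requests: number, maxRetries: number): number {
  return requests * (1 + maxRetries);
}

// With a budget, total retries are capped at a percentage of original requests,
// with a small floor so low-traffic services can still retry.
function attemptsWithBudget(
  requests: number,
  maxRetries: number,
  budgetPercent: number,
  minRetries: number
): number {
  const wanted = requests * maxRetries;
  const allowed = Math.max(minRetries, Math.floor((requests * budgetPercent) / 100));
  return requests + Math.min(wanted, allowed);
}

console.log(attemptsWithoutBudget(1000, 3));     // 4000 attempts reach the failing service
console.log(attemptsWithBudget(1000, 3, 20, 3)); // 1200 attempts
```

Under these assumptions the budget turns a 4x load spike into a 1.2x one, while a service handling only a handful of requests still gets its minimum of 3 retries.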

Q3: When should you put logic in the gateway vs the service mesh vs the application?

A: Split by the kind of concern:
Gateway: External-facing concerns such as API key validation, OAuth token verification, aggregate rate limiting, request/response transformation (e.g., REST → gRPC), public API versioning, and developer portal features.
Service mesh: Cross-cutting infrastructure concerns such as mTLS, per-service rate limiting, retries with budget, circuit breaking, distributed tracing propagation, canary deployments, and traffic shifting.
Application: Business logic, data validation, authorization of business operations (not just identity), business event publishing, and domain-specific error handling.
The key principle: keep business logic in the application, infrastructure concerns in the mesh, and external-boundary concerns in the gateway.

Q4: How do you handle API versioning at the gateway level?

A: There are several strategies:
(1) URL path versioning (/api/v1/users): Route /api/v1/* to Service-v1 and /api/v2/* to Service-v2. Clean and explicit, but clients must change URLs to move between versions.
(2) Header versioning (Accept: application/vnd.api.v2+json): The gateway inspects the header and routes accordingly. The URL stays clean.
(3) Query parameter (/users?version=2): Less common and harder to cache.
(4) Traffic shifting: Run v1 and v2 simultaneously and gradually route 10% → 50% → 100% of traffic to v2 using weighted routing.
(5) Request transformation: The gateway rewrites v1 requests into v2 format so only a single backend is maintained. Good for small differences.
Best practice: use URL path versioning for major versions, request transformation for minor versions, and traffic shifting for rollouts.
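
As a rough sketch of how path versioning, header versioning, and weighted traffic shifting could look in gateway code: the `VersionedRequest` shape, the function names, and the backend names `service-v1` and `service-v2` are hypothetical, and the fallback to version 1 is an assumption chosen for the example.

```typescript
// Hypothetical request shape, for illustration only.
interface VersionedRequest {
  path: string;
  headers: Record<string, string>;
}

// Resolve the major version: URL path first, then a vendor media type in Accept,
// otherwise fall back to v1.
function resolveVersion(req: VersionedRequest): number {
  const pathMatch = /^\/api\/v(\d+)\//.exec(req.path);
  if (pathMatch) return Number(pathMatch[1]);

  const accept = req.headers["accept"] ?? "";
  const headerMatch = /application\/vnd\.api\.v(\d+)\+json/.exec(accept);
  if (headerMatch) return Number(headerMatch[1]);

  return 1;
}

// Weighted routing for rollouts. `roll` is a number in [0, 1), e.g. Math.random(),
// passed in so the choice is deterministic under test.
function pickBackend(v2Weight: number, roll: number): string {
  return roll < v2Weight ? "service-v2" : "service-v1";
}

console.log(resolveVersion({ path: "/api/v2/users", headers: {} })); // 2
console.log(
  resolveVersion({ path: "/users", headers: { accept: "application/vnd.api.v2+json" } })
); // 2
console.log(pickBackend(0.1, 0.05)); // service-v2
console.log(pickBackend(0.1, 0.5));  // service-v1
```

Raising `v2Weight` from 0.1 to 0.5 to 1.0 corresponds to the 10% → 50% → 100% rollout; in practice the weight would come from gateway configuration rather than a code change.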

Key Takeaways

API gateways centralize north-south concerns — auth, rate limiting, TLS termination — preventing duplication across services
Service meshes handle east-west traffic transparently — sidecar proxies add mTLS, retries, and observability without code changes
Middleware composition (Koa-style onion model) enables clean separation of gateway concerns into reusable, testable functions
Circuit breakers prevent cascading failures — open circuit → reject requests → wait → probe → close circuit on recovery
Retry budgets prevent retry storms — limit retries to a percentage of total traffic to avoid amplifying load on failing services
Radix tree routing enables efficient path matching with O(path_length) performance regardless of route count
Service discovery + health checks are the foundation — without knowing which instances are healthy, no other feature works
Load balancing strategy matters: round-robin for uniform services, least-connections for variable latency, consistent hash for session affinity
mTLS provides zero-trust — every service connection is authenticated, encrypted, and authorized via sidecar-managed certificates
Separate gateway (external boundary) from mesh (internal traffic) — they serve different audiences with different requirements
