API Gateway & Service Mesh: Routing, Traffic Control & Sidecar Architecture
The Problem: Microservice Communication Complexity
As services multiply, cross-cutting concerns (auth, rate limiting, retries, observability) get duplicated across every service. An API gateway handles north-south traffic (external → internal), while a service mesh handles east-west traffic (internal → internal).
Without Gateway/Mesh — Every service re-implements everything:
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ Service A │ │ Service B │ │ Service C │ │ Service D │
│ ┌───────┐ │ │ ┌───────┐ │ │ ┌───────┐ │ │ ┌───────┐ │
│ │Auth │ │ │ │Auth │ │ │ │Auth │ │ │ │Auth │ │
│ │Rate │ │ │ │Rate │ │ │ │Rate │ │ │ │Rate │ │
│ │Retry │ │ │ │Retry │ │ │ │Retry │ │ │ │Retry │ │
│ │TLS │ │ │ │TLS │ │ │ │TLS │ │ │ │TLS │ │
│ │Metrics│ │ │ │Metrics│ │ │ │Metrics│ │ │ │Metrics│ │
│ │Circuit│ │ │ │Circuit│ │ │ │Circuit│ │ │ │Circuit│ │
│ └───────┘ │ │ └───────┘ │ │ └───────┘ │ │ └───────┘ │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
4 services × 6 concerns = 24 implementations to maintain
With Gateway + Mesh:
┌──────────────────────────┐
│       API Gateway        │ ← North-South (external)
│  Auth, Rate limit, TLS   │
└────────────┬─────────────┘
             │
  ┌──────────┼──────────┐
  │    Service Mesh     │ ← East-West (internal)
  │    mTLS, Retry,     │
  │   Circuit break,    │
  │    Observability    │
  │                     │
  │  ┌─────┐   ┌─────┐  │
  │  │Svc A│   │Svc B│  │
  │  │(app)│   │(app)│  │
  │  └─────┘   └─────┘  │
  └─────────────────────┘
Services focus on business logic only
Architecture Overview
         External Clients
                 │
            ┌────▼────┐
            │  Load   │
            │Balancer │
            └────┬────┘
                 │
      ┌──────────▼──────────┐
      │     API GATEWAY     │
      │                     │
      │ ┌─────────────────┐ │
      │ │  Route Matcher  │ │
      │ │  Rate Limiter   │ │
      │ │  Auth/AuthZ     │ │
      │ │  Request Xform  │ │
      │ │  Response Cache │ │
      │ │  Circuit Breaker│ │
      │ └─────────────────┘ │
      └──┬───────┬───────┬──┘
         │       │       │
┌────────▼┐ ┌────▼────┐ ┌▼────────┐
│  Pod A  │ │  Pod B  │ │  Pod C  │
│┌──────┐ │ │┌──────┐ │ │┌──────┐ │
││Envoy │ │ ││Envoy │ │ ││Envoy │ │  ← Sidecar proxies
││Proxy │ │ ││Proxy │ │ ││Proxy │ │    (service mesh
│├──────┤ │ │├──────┤ │ │├──────┤ │     data plane)
││User  │ │ ││Cart  │ │ ││Order │ │
││ Svc  │ │ ││ Svc  │ │ ││ Svc  │ │
│└──────┘ │ │└──────┘ │ │└──────┘ │
└────┬────┘ └────┬────┘ └────┬────┘
     │           │           │
     └───────────┼───────────┘
                 │
        ┌────────▼────────┐
        │  Control Plane  │
        │ (Istio/Linkerd) │
        │ Config, certs,  │
        │ service disc.   │
        └─────────────────┘
1. API Gateway Core Implementation
// ─── Gateway Configuration ─────────────────────────────
interface GatewayConfig {
port: number;
routes: RouteConfig[];
globalRateLimit: RateLimitConfig;
authProviders: AuthProvider[];
cors: CorsConfig;
timeout: number;
maxRequestBody: number;
}
interface RouteConfig {
path: string; // /api/v1/users/:id
methods: string[]; // GET, POST, etc.
upstream: UpstreamConfig; // Target service
middleware: MiddlewareRef[];
rateLimit?: RateLimitConfig;
cache?: CacheConfig;
timeout?: number;
retries?: RetryConfig;
circuitBreaker?: CircuitBreakerConfig;
transform?: TransformConfig;
auth?: { required: boolean; scopes?: string[] };
}
interface UpstreamConfig {
service: string; // Service name (for discovery)
url?: string; // Direct URL fallback
loadBalancer: 'round-robin' | 'weighted-round-robin' | 'least-connections' | 'random' | 'consistent-hash';
healthCheck: {
path: string;
intervalMs: number;
timeoutMs: number;
unhealthyThreshold: number;
healthyThreshold: number;
};
}
// ─── Request Pipeline ──────────────────────────────────
type MiddlewareFn = (
ctx: RequestContext,
next: () => Promise<void>
) => Promise<void>;
interface RequestContext {
request: IncomingRequest;
response: GatewayResponse;
route: RouteConfig;
params: Record<string, string>;
state: Map<string, unknown>;
startTime: number;
requestId: string;
abortController: AbortController;
}
interface IncomingRequest {
method: string;
path: string;
headers: Record<string, string>;
query: Record<string, string>;
body: unknown;
ip: string;
}
interface GatewayResponse {
status: number;
headers: Record<string, string>;
body: unknown;
}
// ─── Gateway Implementation ────────────────────────────
class APIGateway {
private router: RouteTree;
private middlewareChain: MiddlewareFn[];
private serviceRegistry: ServiceRegistry;
constructor(
private config: GatewayConfig,
registry: ServiceRegistry
) {
this.router = new RouteTree();
this.serviceRegistry = registry;
this.middlewareChain = [];
// Register routes
for (const route of config.routes) {
this.router.add(route);
}
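// Build global middleware chain. Middleware that post-processes the response
// (metrics, response transform) must be registered before proxyMiddleware:
// the proxy is terminal and never calls next(), so anything after it would not run.
this.middlewareChain = [
this.requestIdMiddleware(),
this.metricsMiddleware(),
this.responseTransformMiddleware(),
this.corsMiddleware(),
this.rateLimitMiddleware(config.globalRateLimit),
this.authMiddleware(),
this.requestTransformMiddleware(),
this.circuitBreakerMiddleware(),
this.retryMiddleware(),
this.proxyMiddleware(),
];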
}
// ── Handle Incoming Request ──────────────────────
async handleRequest(req: IncomingRequest): Promise<GatewayResponse> {
const match = this.router.match(req.method, req.path);
if (!match) {
return { status: 404, headers: {}, body: { error: 'Not Found' } };
}
const ctx: RequestContext = {
request: req,
response: { status: 200, headers: {}, body: null },
route: match.route,
params: match.params,
state: new Map(),
startTime: performance.now(),
requestId: crypto.randomUUID(),
abortController: new AbortController(),
};
// Execute middleware chain (Koa-style)
await this.compose(this.middlewareChain)(ctx);
return ctx.response;
}
// ── Middleware Composition ────────────────────────
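private compose(middlewares: MiddlewareFn[]): (ctx: RequestContext) => Promise<void> {
return function (ctx: RequestContext) {
function dispatch(i: number): Promise<void> {
const fn = middlewares[i];
if (!fn) return Promise.resolve();
// Unlike Koa's compose, sequential re-invocation of next() is allowed so that
// retryMiddleware can re-run the downstream chain; only overlapping calls are rejected.
let pending = false;
const next = async (): Promise<void> => {
if (pending) {
throw new Error('next() called again before the previous call settled');
}
pending = true;
try {
await dispatch(i + 1);
} finally {
pending = false;
}
};
return fn(ctx, next);
}
return dispatch(0);
};
}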
// ── Individual Middleware Implementations ─────────
private requestIdMiddleware(): MiddlewareFn {
return async (ctx, next) => {
ctx.response.headers['x-request-id'] = ctx.requestId;
ctx.request.headers['x-request-id'] = ctx.requestId;
await next();
};
}
private corsMiddleware(): MiddlewareFn {
return async (ctx, next) => {
const { cors } = this.config; // cors is declared on GatewayConfig, so no cast is needed
ctx.response.headers['access-control-allow-origin'] = cors.origin || '*';
ctx.response.headers['access-control-allow-methods'] = 'GET,POST,PUT,DELETE,OPTIONS';
ctx.response.headers['access-control-allow-headers'] = 'Content-Type,Authorization';
if (ctx.request.method === 'OPTIONS') {
ctx.response.status = 204;
return; // Don't call next — short-circuit
}
await next();
};
}
private rateLimitMiddleware(config: RateLimitConfig): MiddlewareFn {
const limiter = new SlidingWindowRateLimiter(config);
return async (ctx, next) => {
const key = this.getRateLimitKey(ctx);
const result = limiter.check(key);
ctx.response.headers['x-ratelimit-limit'] = String(result.limit);
ctx.response.headers['x-ratelimit-remaining'] = String(result.remaining);
ctx.response.headers['x-ratelimit-reset'] = String(result.resetAt);
if (!result.allowed) {
ctx.response.status = 429;
ctx.response.headers['retry-after'] = String(
Math.ceil((result.resetAt - Date.now()) / 1000)
);
ctx.response.body = { error: 'Rate limit exceeded' };
return; // Don't call next
}
await next();
};
}
private authMiddleware(): MiddlewareFn {
return async (ctx, next) => {
if (!ctx.route.auth?.required) {
await next();
return;
}
const authHeader = ctx.request.headers['authorization'];
if (!authHeader) {
ctx.response.status = 401;
ctx.response.body = { error: 'Missing authorization header' };
return;
}
// Validate token (JWT, API key, etc.)
try {
const token = authHeader.replace('Bearer ', '');
const claims = await this.validateToken(token);
ctx.state.set('user', claims);
// Check scopes
if (ctx.route.auth.scopes) {
const userScopes = claims.scopes || [];
const hasScope = ctx.route.auth.scopes.some(s =>
userScopes.includes(s)
);
if (!hasScope) {
ctx.response.status = 403;
ctx.response.body = { error: 'Insufficient permissions' };
return;
}
}
} catch {
ctx.response.status = 401;
ctx.response.body = { error: 'Invalid token' };
return;
}
await next();
};
}
private requestTransformMiddleware(): MiddlewareFn {
return async (ctx, next) => {
// Add service-specific headers
ctx.request.headers['x-forwarded-for'] = ctx.request.ip;
ctx.request.headers['x-forwarded-proto'] = 'https';
ctx.request.headers['x-gateway-request-id'] = ctx.requestId;
// Strip sensitive headers before forwarding
delete ctx.request.headers['cookie'];
await next();
};
}
private circuitBreakerMiddleware(): MiddlewareFn {
return async (ctx, next) => {
const cbConfig = ctx.route.circuitBreaker;
if (!cbConfig) {
await next();
return;
}
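const breaker = this.getCircuitBreaker(ctx.route.upstream.service);
// canExecute() also moves an open breaker to half-open once the reset timeout
// has elapsed; checking breaker.state directly would leave it open forever.
if (!breaker.canExecute()) {
ctx.response.status = 503;
ctx.response.body = {
error: 'Service temporarily unavailable',
retryAfter: breaker.nextAttemptAt,
};
return;
}
try {
await next();
// retryMiddleware reports exhausted retries as a 5xx response instead of
// throwing, so the status has to be inspected as well.
if (ctx.response.status >= 500) {
breaker.recordFailure();
} else {
breaker.recordSuccess();
}
} catch (err) {
breaker.recordFailure();
throw err;
}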
};
}
private retryMiddleware(): MiddlewareFn {
return async (ctx, next) => {
const retryConfig = ctx.route.retries;
if (!retryConfig) {
await next();
return;
}
let lastError: Error | undefined;
for (let attempt = 0; attempt <= retryConfig.maxRetries; attempt++) {
try {
await next();
// Check if response indicates a retryable error
if (retryConfig.retryOn?.includes(ctx.response.status)) {
throw new Error(`Retryable status: ${ctx.response.status}`);
}
return; // Success
} catch (err) {
lastError = err as Error;
if (attempt < retryConfig.maxRetries) {
const delay = retryConfig.baseDelayMs * Math.pow(2, attempt);
await sleep(delay + Math.random() * delay * 0.2);
}
}
}
ctx.response.status = 502;
ctx.response.body = { error: 'Upstream request failed', details: lastError?.message };
};
}
private proxyMiddleware(): MiddlewareFn {
return async (ctx, next) => {
const upstream = ctx.route.upstream;
const instances = this.serviceRegistry.getInstances(upstream.service);
const healthy = instances.filter(i => i.healthy);
if (healthy.length === 0) {
ctx.response.status = 503;
ctx.response.body = { error: 'No healthy upstream instances' };
return;
}
// Load balance
const instance = this.selectInstance(healthy, upstream.loadBalancer, ctx);
// Forward request. Errors propagate unchanged so the retry middleware can handle them.
const timeout = ctx.route.timeout || this.config.timeout;
const upstreamResponse = await this.forwardRequest(ctx, instance, timeout);
ctx.response.status = upstreamResponse.status;
ctx.response.body = upstreamResponse.body;
Object.assign(ctx.response.headers, upstreamResponse.headers);
};
}
private responseTransformMiddleware(): MiddlewareFn {
return async (ctx, next) => {
await next();
// Strip internal headers
delete ctx.response.headers['x-internal-trace'];
// Add gateway headers
ctx.response.headers['x-response-time'] =
`${(performance.now() - ctx.startTime).toFixed(2)}ms`;
};
}
private metricsMiddleware(): MiddlewareFn {
return async (ctx, next) => {
const start = performance.now();
try {
await next();
} finally {
const duration = performance.now() - start;
// Record: method, path, status, duration
this.recordMetric(
ctx.request.method,
ctx.route.path,
ctx.response.status,
duration
);
}
};
}
// Helpers
private getRateLimitKey(ctx: RequestContext): string {
return ctx.request.ip; // Could also use user ID, API key, etc.
}
private async validateToken(token: string): Promise<any> { return {}; }
private getCircuitBreaker(service: string): CircuitBreaker { return {} as any; }
private selectInstance(instances: ServiceInstance[], strategy: string, ctx: RequestContext): ServiceInstance { return instances[0]; }
private async forwardRequest(ctx: RequestContext, instance: ServiceInstance, timeout: number): Promise<any> { return {}; }
private recordMetric(method: string, path: string, status: number, duration: number): void {}
}
interface RateLimitConfig { requestsPerWindow: number; windowMs: number; }
interface CorsConfig { origin: string; }
interface CacheConfig { ttlMs: number; }
interface RetryConfig { maxRetries: number; baseDelayMs: number; retryOn?: number[]; }
interface CircuitBreakerConfig { failureThreshold: number; resetTimeoutMs: number; }
interface TransformConfig {}
interface AuthProvider {}
interface MiddlewareRef { name: string; config?: unknown; }
function sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
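To make the ordering of the middleware chain concrete, the following is a minimal sketch of onion-style composition on a toy chain. It is a synchronous simplification (the gateway's middleware is async), and `ToyMiddleware`, `composeToy`, `wrap`, and `terminal` are illustrative names that do not appear in the gateway code. It shows that code placed before `next()` runs on the way in, code after it runs on the way out, and a middleware that never calls `next()`, as the proxy does, ends the chain.

```typescript
// Toy, synchronous version of the onion model; real gateway middleware is async.
type ToyMiddleware = (trace: string[], next: () => void) => void;

function composeToy(middlewares: ToyMiddleware[]): (trace: string[]) => void {
  return (trace) => {
    const dispatch = (i: number): void => {
      const fn = middlewares[i];
      if (!fn) return; // end of chain
      fn(trace, () => dispatch(i + 1));
    };
    dispatch(0);
  };
}

// Wraps downstream work: records entry, delegates, then records exit.
const wrap = (name: string): ToyMiddleware => (trace, next) => {
  trace.push(`${name}:in`);
  next();
  trace.push(`${name}:out`);
};

// Terminal middleware: does its work and never calls next().
const terminal = (name: string): ToyMiddleware => (trace) => {
  trace.push(name);
};

function runToyChain(chain: ToyMiddleware[]): string[] {
  const trace: string[] = [];
  composeToy(chain)(trace);
  return trace;
}

// Wrappers registered before the terminal step see both directions.
const wrappedOrder = runToyChain([wrap('metrics'), wrap('auth'), terminal('proxy')]);

// A wrapper registered after the terminal step is never reached.
const unreachable = runToyChain([wrap('auth'), terminal('proxy'), wrap('metrics')]);
```

In the first chain the trace reads `metrics:in, auth:in, proxy, auth:out, metrics:out`; in the second, `metrics` never appears. Anything that must observe the final response therefore has to sit ahead of the terminal middleware.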
2. Route Matching Engine
// ─── Radix Tree Router ─────────────────────────────────
interface RouteMatch {
route: RouteConfig;
params: Record<string, string>;
}
class RouteTree {
private root: RouteNode = { children: new Map(), paramChild: null, wildcardChild: null };
add(route: RouteConfig): void {
for (const method of route.methods) {
const segments = route.path.split('/').filter(Boolean);
let node = this.root;
for (const segment of segments) {
if (segment.startsWith(':')) {
// Parameter segment: /users/:id
if (!node.paramChild) {
node.paramChild = {
children: new Map(),
paramChild: null,
wildcardChild: null,
paramName: segment.slice(1),
};
}
node = node.paramChild;
} else if (segment === '*') {
// Wildcard: /assets/*
if (!node.wildcardChild) {
node.wildcardChild = {
children: new Map(),
paramChild: null,
wildcardChild: null,
};
}
node = node.wildcardChild;
break;
} else {
// Static segment: /api/v1
if (!node.children.has(segment)) {
node.children.set(segment, {
children: new Map(),
paramChild: null,
wildcardChild: null,
});
}
node = node.children.get(segment)!;
}
}
// Store route at terminal node
if (!node.routes) node.routes = new Map();
node.routes.set(method, route);
}
}
match(method: string, path: string): RouteMatch | null {
const segments = path.split('/').filter(Boolean);
const params: Record<string, string> = {};
const result = this.matchNode(this.root, segments, 0, params);
if (!result) return null;
const route = result.routes?.get(method) || result.routes?.get('*');
if (!route) return null;
return { route, params };
}
private matchNode(
node: RouteNode,
segments: string[],
index: number,
params: Record<string, string>
): RouteNode | null {
// Base case: consumed all segments
if (index === segments.length) {
return node.routes ? node : null;
}
const segment = segments[index];
// 1. Try static match first (highest priority)
if (node.children.has(segment)) {
const result = this.matchNode(
node.children.get(segment)!,
segments,
index + 1,
params
);
if (result) return result;
}
// 2. Try parameter match
if (node.paramChild) {
params[node.paramChild.paramName!] = segment;
const result = this.matchNode(
node.paramChild,
segments,
index + 1,
params
);
if (result) return result;
delete params[node.paramChild.paramName!];
}
// 3. Try wildcard match
if (node.wildcardChild) {
params['*'] = segments.slice(index).join('/');
return node.wildcardChild;
}
return null;
}
}
interface RouteNode {
children: Map<string, RouteNode>;
paramChild: RouteNode | null;
wildcardChild: RouteNode | null;
paramName?: string;
routes?: Map<string, RouteConfig>;
}
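The precedence rules used by `matchNode` (static segment first, then parameter, then wildcard, with backtracking when a branch dead-ends) can be sketched on a stripped-down trie. This is an illustration only: `ToyNode`, `addRoute`, `matchRoute`, and `resolveToy` are hypothetical names, HTTP methods are ignored, and routes are represented by plain string labels.

```typescript
// Hypothetical, stripped-down trie; names and shapes are for illustration only.
interface ToyNode {
  statics: Map<string, ToyNode>;
  param?: { name: string; node: ToyNode };
  wildcard?: ToyNode;
  handler?: string; // label of the route that terminates at this node
}

const newNode = (): ToyNode => ({ statics: new Map() });

function addRoute(root: ToyNode, path: string, handler: string): void {
  let node = root;
  for (const seg of path.split('/').filter(Boolean)) {
    if (seg === '*') {
      if (!node.wildcard) node.wildcard = newNode();
      node = node.wildcard;
      break; // a wildcard consumes the rest of the path
    }
    if (seg.startsWith(':')) {
      if (!node.param) node.param = { name: seg.slice(1), node: newNode() };
      node = node.param.node;
      continue;
    }
    let child = node.statics.get(seg);
    if (!child) {
      child = newNode();
      node.statics.set(seg, child);
    }
    node = child;
  }
  node.handler = handler;
}

function matchRoute(node: ToyNode, segs: string[], i: number, params: Record<string, string>): string | null {
  if (i === segs.length) return node.handler ?? null;
  const seg = segs[i];
  // 1. A static child wins if it leads to a complete match.
  const staticChild = node.statics.get(seg);
  if (staticChild) {
    const hit = matchRoute(staticChild, segs, i + 1, params);
    if (hit) return hit;
  }
  // 2. Otherwise bind the segment to the parameter and keep going.
  if (node.param) {
    params[node.param.name] = seg;
    const hit = matchRoute(node.param.node, segs, i + 1, params);
    if (hit) return hit;
    delete params[node.param.name]; // backtrack
  }
  // 3. Finally let a wildcard swallow the remaining segments.
  if (node.wildcard && node.wildcard.handler) {
    params['*'] = segs.slice(i).join('/');
    return node.wildcard.handler;
  }
  return null;
}

const toyRoot = newNode();
addRoute(toyRoot, '/users/me', 'currentUser');
addRoute(toyRoot, '/users/:id', 'userById');
addRoute(toyRoot, '/users/:id/orders', 'ordersForUser');
addRoute(toyRoot, '/assets/*', 'staticAssets');

function resolveToy(path: string): { handler: string | null; params: Record<string, string> } {
  const params: Record<string, string> = {};
  const handler = matchRoute(toyRoot, path.split('/').filter(Boolean), 0, params);
  return { handler, params };
}
```

With these routes, `/users/me` resolves to the static route, `/users/42` binds `id`, and `/users/me/orders` first tries the static `me` branch, finds no `orders` child there, and backtracks to the parameter branch with `id = "me"`.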
3. Service Registry & Discovery
// ─── Service Registry ──────────────────────────────────
interface ServiceInstance {
id: string;
service: string;
host: string;
port: number;
healthy: boolean;
weight: number;
metadata: Record<string, string>;
registeredAt: number;
lastHealthCheck: number;
consecutiveFailures: number;
}
class ServiceRegistry {
private services = new Map<string, ServiceInstance[]>();
private healthCheckTimers = new Map<string, ReturnType<typeof setInterval>>();
// ── Register Service Instance ────────────────────
register(instance: ServiceInstance): void {
if (!this.services.has(instance.service)) {
this.services.set(instance.service, []);
}
const instances = this.services.get(instance.service)!;
const existing = instances.findIndex(i => i.id === instance.id);
if (existing >= 0) {
instances[existing] = instance; // Update
} else {
instances.push(instance);
}
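// (Re)start health checks. Stop any existing timer first so that re-registering
// an instance does not leak an interval that still points at the stale object.
this.stopHealthCheck(instance.id);
this.startHealthCheck(instance);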
}
// ── Deregister ───────────────────────────────────
deregister(instanceId: string): void {
for (const [service, instances] of this.services) {
const idx = instances.findIndex(i => i.id === instanceId);
if (idx >= 0) {
instances.splice(idx, 1);
this.stopHealthCheck(instanceId);
break;
}
}
}
// ── Get Healthy Instances ────────────────────────
getInstances(service: string): ServiceInstance[] {
return this.services.get(service) || [];
}
getHealthyInstances(service: string): ServiceInstance[] {
return this.getInstances(service).filter(i => i.healthy);
}
// ── Health Checking ──────────────────────────────
private startHealthCheck(instance: ServiceInstance): void {
const timer = setInterval(async () => {
try {
const response = await fetch(
`http://${instance.host}:${instance.port}/health`,
{ signal: AbortSignal.timeout(5000) }
);
if (response.ok) {
instance.healthy = true;
instance.consecutiveFailures = 0;
} else {
this.handleHealthCheckFailure(instance);
}
} catch {
this.handleHealthCheckFailure(instance);
}
instance.lastHealthCheck = Date.now();
}, 10000);
this.healthCheckTimers.set(instance.id, timer);
}
private handleHealthCheckFailure(instance: ServiceInstance): void {
instance.consecutiveFailures++;
if (instance.consecutiveFailures >= 3) {
instance.healthy = false;
}
}
private stopHealthCheck(instanceId: string): void {
const timer = this.healthCheckTimers.get(instanceId);
if (timer) {
clearInterval(timer);
this.healthCheckTimers.delete(instanceId);
}
}
}
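`UpstreamConfig.healthCheck` declares both `unhealthyThreshold` and `healthyThreshold`, while the registry above hard-codes three failures and marks an instance healthy again after a single successful probe. The sketch below shows one way both thresholds might be applied so that a flapping instance does not bounce in and out of rotation. `ProbeState`, `applyProbe`, and `replayProbes` are hypothetical names introduced for this example.

```typescript
interface ProbeState {
  healthy: boolean;
  consecutiveFailures: number;
  consecutiveSuccesses: number;
}

interface ProbeThresholds {
  unhealthyThreshold: number; // failed probes in a row before marking unhealthy
  healthyThreshold: number;   // successful probes in a row before marking healthy again
}

// Pure transition function: returns the next state after one probe result.
function applyProbe(state: ProbeState, probeOk: boolean, t: ProbeThresholds): ProbeState {
  if (probeOk) {
    const successes = state.consecutiveSuccesses + 1;
    return {
      healthy: state.healthy || successes >= t.healthyThreshold,
      consecutiveFailures: 0,
      consecutiveSuccesses: successes,
    };
  }
  const failures = state.consecutiveFailures + 1;
  return {
    healthy: state.healthy && failures < t.unhealthyThreshold,
    consecutiveFailures: failures,
    consecutiveSuccesses: 0,
  };
}

// Replays a sequence of probe results and reports the health flag after each one.
function replayProbes(results: boolean[], t: ProbeThresholds): boolean[] {
  let state: ProbeState = { healthy: true, consecutiveFailures: 0, consecutiveSuccesses: 0 };
  return results.map((ok) => {
    state = applyProbe(state, ok, t);
    return state.healthy;
  });
}

const healthTimeline = replayProbes(
  [false, false, false, true, true],
  { unhealthyThreshold: 3, healthyThreshold: 2 }
);
```

Here the instance stays in rotation through the first two failures, drops out on the third, and is only readmitted after the second consecutive success.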
// ─── Load Balancer ─────────────────────────────────────
class LoadBalancer {
private rrIndex = 0;
private connectionCounts = new Map<string, number>();
selectInstance(
instances: ServiceInstance[],
strategy: string,
requestKey?: string
): ServiceInstance {
switch (strategy) {
case 'round-robin':
return this.roundRobin(instances);
case 'least-connections':
return this.leastConnections(instances);
case 'weighted-round-robin':
return this.weightedRoundRobin(instances);
case 'consistent-hash':
return this.consistentHash(instances, requestKey || '');
default:
return instances[Math.floor(Math.random() * instances.length)];
}
}
private roundRobin(instances: ServiceInstance[]): ServiceInstance {
const instance = instances[this.rrIndex % instances.length];
this.rrIndex++;
return instance;
}
private leastConnections(instances: ServiceInstance[]): ServiceInstance {
return instances.reduce((min, inst) => {
const count = this.connectionCounts.get(inst.id) || 0;
const minCount = this.connectionCounts.get(min.id) || 0;
return count < minCount ? inst : min;
});
}
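// Note: despite the name, this is weighted random selection. It only
// approximates a weighted round-robin distribution over many requests.
private weightedRoundRobin(instances: ServiceInstance[]): ServiceInstance {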
const totalWeight = instances.reduce((sum, i) => sum + i.weight, 0);
let random = Math.random() * totalWeight;
for (const instance of instances) {
random -= instance.weight;
if (random <= 0) return instance;
}
return instances[0];
}
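// Note: this is plain modulo hashing. Adding or removing an instance remaps
// most keys; true consistent hashing needs a hash ring (usually with virtual nodes).
private consistentHash(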
instances: ServiceInstance[],
key: string
): ServiceInstance {
const hash = this.hashKey(key);
const index = hash % instances.length;
return instances[index];
}
private hashKey(key: string): number {
let hash = 0;
for (let i = 0; i < key.length; i++) {
hash = ((hash << 5) - hash + key.charCodeAt(i)) | 0;
}
return Math.abs(hash);
}
recordConnection(instanceId: string): void {
this.connectionCounts.set(
instanceId,
(this.connectionCounts.get(instanceId) || 0) + 1
);
}
releaseConnection(instanceId: string): void {
const count = this.connectionCounts.get(instanceId) || 0;
this.connectionCounts.set(instanceId, Math.max(0, count - 1));
}
}
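The `consistentHash` method above selects `hash % instances.length`, so most keys move to a different instance whenever the instance count changes. A hash ring with virtual nodes limits that churn to the keys owned by the instance that joined or left. The following is a minimal sketch under stated assumptions: `HashRing`, the FNV-1a hash, the default of 100 virtual nodes, and the sample addresses are all choices made for this example, not part of the load balancer above.

```typescript
// FNV-1a 32-bit hash; chosen for brevity, not for distribution quality.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

class HashRing {
  private ring: { point: number; nodeId: string }[] = [];

  constructor(nodeIds: string[], private virtualNodes = 100) {
    for (const id of nodeIds) this.add(id);
  }

  // Each node is placed on the ring many times to even out the arc sizes.
  add(nodeId: string): void {
    for (let v = 0; v < this.virtualNodes; v++) {
      this.ring.push({ point: fnv1a(`${nodeId}#${v}`), nodeId });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  remove(nodeId: string): void {
    this.ring = this.ring.filter((entry) => entry.nodeId !== nodeId);
  }

  // Walk clockwise: first virtual node at or after the key's hash, wrapping to the start.
  lookup(key: string): string {
    if (this.ring.length === 0) throw new Error('ring is empty');
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].nodeId;
  }
}

const ring = new HashRing(['10.0.0.1', '10.0.0.2', '10.0.0.3']);
const sampleKeys = Array.from({ length: 1000 }, (_, i) => `user-${i}`);
const ownersBefore = sampleKeys.map((k) => ring.lookup(k));
ring.remove('10.0.0.3');
const ownersAfter = sampleKeys.map((k) => ring.lookup(k));

// Keys that were not owned by the removed node keep their assignment.
const unexpectedlyMoved = sampleKeys.filter(
  (_, i) => ownersBefore[i] !== '10.0.0.3' && ownersBefore[i] !== ownersAfter[i]
);
```

Only keys previously owned by the removed node are reassigned, so `unexpectedlyMoved` is empty. With modulo hashing, going from three to two instances would typically reassign a large share of all keys.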
4. Circuit Breaker
// ─── Circuit Breaker Implementation ────────────────────
type CircuitState = 'closed' | 'open' | 'half-open';
class CircuitBreaker {
state: CircuitState = 'closed';
private failures = 0;
private successes = 0;
private lastFailure = 0;
private lastStateChange = Date.now();
nextAttemptAt = 0;
private slidingWindow: { timestamp: number; success: boolean }[] = [];
constructor(
private config: {
failureThreshold: number; // Failures before opening
successThreshold: number; // Successes in half-open to close
resetTimeoutMs: number; // Time in open before half-open
windowSizeMs: number; // Sliding window for failure rate
failureRateThreshold: number; // 0.0-1.0
}
) {}
// ── Check if request should be allowed ───────────
canExecute(): boolean {
switch (this.state) {
case 'closed':
return true;
case 'open':
if (Date.now() >= this.nextAttemptAt) {
this.transitionTo('half-open');
return true; // Allow probe request
}
return false;
case 'half-open':
return true; // Allow limited requests
default:
return false;
}
}
// ── Record Success ───────────────────────────────
recordSuccess(): void {
this.slidingWindow.push({ timestamp: Date.now(), success: true });
this.trimWindow();
switch (this.state) {
case 'half-open':
this.successes++;
if (this.successes >= this.config.successThreshold) {
this.transitionTo('closed');
}
break;
case 'closed':
// Reset failure count on success
this.failures = 0;
break;
}
}
// ── Record Failure ───────────────────────────────
recordFailure(): void {
this.slidingWindow.push({ timestamp: Date.now(), success: false });
this.trimWindow();
this.lastFailure = Date.now();
switch (this.state) {
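case 'closed': {
this.failures++;
const failureRate = this.getFailureRate();
// Only trust the failure rate once the window holds at least failureThreshold
// samples; otherwise a single failure in an empty window reads as a 100% rate.
const enoughSamples = this.slidingWindow.length >= this.config.failureThreshold;
if (
this.failures >= this.config.failureThreshold ||
(enoughSamples && failureRate >= this.config.failureRateThreshold)
) {
this.transitionTo('open');
}
break;
}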
case 'half-open':
// Any failure in half-open → back to open
this.transitionTo('open');
break;
}
}
// ── State Transitions ────────────────────────────
private transitionTo(newState: CircuitState): void {
const oldState = this.state;
this.state = newState;
this.lastStateChange = Date.now();
switch (newState) {
case 'open':
this.nextAttemptAt = Date.now() + this.config.resetTimeoutMs;
this.successes = 0;
console.warn(`Circuit OPENED (was ${oldState}). Next attempt at ${new Date(this.nextAttemptAt).toISOString()}`);
break;
case 'half-open':
this.successes = 0;
this.failures = 0;
console.info('Circuit HALF-OPEN. Allowing probe requests.');
break;
case 'closed':
this.failures = 0;
this.successes = 0;
console.info('Circuit CLOSED. Normal operation resumed.');
break;
}
}
private getFailureRate(): number {
if (this.slidingWindow.length === 0) return 0;
const failures = this.slidingWindow.filter(e => !e.success).length;
return failures / this.slidingWindow.length;
}
private trimWindow(): void {
const cutoff = Date.now() - this.config.windowSizeMs;
this.slidingWindow = this.slidingWindow.filter(e => e.timestamp > cutoff);
}
getStats(): CircuitBreakerStats {
return {
state: this.state,
failures: this.failures,
successes: this.successes,
failureRate: this.getFailureRate(),
lastFailure: this.lastFailure,
lastStateChange: this.lastStateChange,
nextAttemptAt: this.state === 'open' ? this.nextAttemptAt : undefined,
};
}
}
interface CircuitBreakerStats {
state: CircuitState;
failures: number;
successes: number;
failureRate: number;
lastFailure: number;
lastStateChange: number;
nextAttemptAt?: number;
}
Circuit Breaker State Machine
┌────────────────┐
│     CLOSED     │◄─────────────────────────────┐
│ (normal flow)  │                              │
│                │       success threshold      │
└───────┬────────┘       met in half-open       │
        │                                       │
        │ failure threshold                     │
        │ exceeded                              │
        ▼                                       │
┌────────────────┐   reset timeout    ┌─────────┴──────┐
│      OPEN      │───────────────────►│   HALF-OPEN    │
│  (reject all)  │                    │ (allow probes) │
│                │◄───────────────────│                │
└────────────────┘    any failure     └────────────────┘
                      in half-open
Typical Settings:
- failureThreshold: 5 failures
- resetTimeout: 30 seconds
- successThreshold: 3 consecutive successes
- windowSize: 60 seconds
- failureRateThreshold: 50%
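To see the state machine move, the sketch below replays the typical settings listed above (5 failures, 30-second reset timeout, 3 successes) against a count-based toy breaker. `ToyBreaker` and `BreakerState` are hypothetical names for this example; the clock is injected so the trace is deterministic, and the sliding-window failure rate is left out for brevity.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

// Count-based toy breaker with an injectable clock.
class ToyBreaker {
  state: BreakerState = 'closed';
  private failures = 0;
  private successes = 0;
  private openUntil = 0;

  constructor(
    private failureThreshold: number,
    private successThreshold: number,
    private resetTimeoutMs: number,
    private now: () => number
  ) {}

  canExecute(): boolean {
    // An open breaker moves to half-open once the reset timeout has elapsed.
    if (this.state === 'open' && this.now() >= this.openUntil) {
      this.state = 'half-open';
      this.successes = 0;
    }
    return this.state !== 'open';
  }

  record(success: boolean): void {
    if (this.state === 'half-open') {
      if (!success) return this.trip(); // any failure re-opens the circuit
      this.successes++;
      if (this.successes >= this.successThreshold) this.reset();
      return;
    }
    if (success) {
      this.failures = 0; // consecutive-failure counter
      return;
    }
    this.failures++;
    if (this.failures >= this.failureThreshold) this.trip();
  }

  private trip(): void {
    this.state = 'open';
    this.openUntil = this.now() + this.resetTimeoutMs;
  }

  private reset(): void {
    this.state = 'closed';
    this.failures = 0;
    this.successes = 0;
  }
}

let fakeClock = 0;
const toyBreaker = new ToyBreaker(5, 3, 30000, () => fakeClock);
const stateTrace: BreakerState[] = [];

for (let i = 0; i < 5; i++) toyBreaker.record(false); // five consecutive failures
stateTrace.push(toyBreaker.state);

fakeClock = 29999;
const allowedEarly = toyBreaker.canExecute(); // reset timeout not yet reached
fakeClock = 30000;
const allowedProbe = toyBreaker.canExecute(); // timeout elapsed, probe admitted
stateTrace.push(toyBreaker.state);

for (let i = 0; i < 3; i++) toyBreaker.record(true); // three successful probes
stateTrace.push(toyBreaker.state);
```

The trace goes `open`, `half-open`, `closed`: the request at 29,999 ms is rejected, the one at 30,000 ms is admitted as a probe, and the third successful probe closes the circuit.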
5. Service Mesh Sidecar Proxy
// ─── Sidecar Proxy Configuration ───────────────────────
interface SidecarConfig {
serviceName: string;
servicePort: number; // Local service port
inboundPort: number; // Sidecar inbound listener
outboundPort: number; // Sidecar outbound listener
adminPort: number; // Health/metrics/config
mtls: MTLSConfig;
retryPolicy: RetryPolicy;
circuitBreaker: CircuitBreakerConfig;
rateLimits: RateLimitRule[];
tracing: TracingConfig;
}
interface MTLSConfig {
enabled: boolean;
certPath: string;
keyPath: string;
caPath: string;
rotationIntervalMs: number;
}
interface RetryPolicy {
maxRetries: number;
perTryTimeout: number;
retryOn: string[]; // 5xx, reset, connect-failure, etc.
retryBudget: {
percentOfRequests: number; // Fraction of requests that may be retries (0.0-1.0, e.g. 0.2 = 20%)
minRetries: number; // Retries always allowed per budget window, regardless of the ratio
};
}
interface RateLimitRule {
source: string; // Source service name
requestsPerSecond: number;
}
interface TracingConfig {
samplingRate: number; // 0.0-1.0
exportEndpoint: string;
}
// ─── Sidecar Proxy Implementation ──────────────────────
class SidecarProxy {
private circuitBreakers = new Map<string, CircuitBreaker>();
private retryBudget: RetryBudget;
constructor(private config: SidecarConfig) {
this.retryBudget = new RetryBudget(
config.retryPolicy.retryBudget.percentOfRequests,
config.retryPolicy.retryBudget.minRetries
);
}
// ── Inbound: External → Local Service ────────────
async handleInbound(req: IncomingRequest): Promise<GatewayResponse> {
// 1. Verify mTLS identity
if (this.config.mtls.enabled) {
const identity = await this.verifyClientCert(req);
if (!identity) {
return { status: 403, headers: {}, body: 'mTLS verification failed' };
}
req.headers['x-source-service'] = identity;
}
// 2. Rate limit by source service
const source = req.headers['x-source-service'] || 'unknown';
const rule = this.config.rateLimits.find(r => r.source === source);
if (rule && !this.checkRateLimit(source, rule)) {
return { status: 429, headers: {}, body: 'Rate limited' };
}
// 3. Inject tracing headers
this.injectTracing(req);
// 4. Forward to local service
const response = await this.forwardToLocal(req);
return response;
}
// ── Outbound: Local Service → External ───────────
async handleOutbound(
req: IncomingRequest,
targetService: string
): Promise<GatewayResponse> {
// 1. Service discovery — resolve target
const endpoints = await this.resolveService(targetService);
if (endpoints.length === 0) {
return { status: 503, headers: {}, body: 'No endpoints available' };
}
// 2. Circuit breaker check
const breaker = this.getCircuitBreaker(targetService);
if (!breaker.canExecute()) {
return { status: 503, headers: {}, body: 'Circuit open' };
}
// 3. Load balance
const endpoint = this.selectEndpoint(endpoints);
// 4. Establish mTLS connection
if (this.config.mtls.enabled) {
req.headers['x-source-service'] = this.config.serviceName;
}
// 5. Forward with retry
return this.forwardWithRetry(req, endpoint, breaker);
}
// ── Retry with Budget ────────────────────────────
private async forwardWithRetry(
req: IncomingRequest,
endpoint: ServiceEndpoint,
breaker: CircuitBreaker
): Promise<GatewayResponse> {
let lastError: Error | undefined;
let lastResponse: GatewayResponse | undefined;
// Count the original request once so the budget compares retries against requests
this.retryBudget.recordRequest();
for (let attempt = 0; attempt <= this.config.retryPolicy.maxRetries; attempt++) {
if (attempt > 0) {
// Check retry budget (prevent retry storms), then charge this retry to it
if (!this.retryBudget.canRetry()) break;
this.retryBudget.recordRetry();
}
try {
const response = await this.forward(req, endpoint);
if (response.status < 500) {
breaker.recordSuccess();
return response;
}
breaker.recordFailure();
lastResponse = response;
lastError = undefined;
if (!this.isRetryableStatus(response.status)) {
return response;
}
} catch (err) {
lastError = err as Error;
lastResponse = undefined;
breaker.recordFailure();
}
}
// Retries exhausted or budget spent: surface the last upstream response if there
// was one, otherwise a synthetic 502 describing the transport error
return lastResponse ?? {
status: 502,
headers: {},
body: `All retries failed: ${lastError?.message ?? 'unknown error'}`,
};
}
private getCircuitBreaker(service: string): CircuitBreaker {
if (!this.circuitBreakers.has(service)) {
this.circuitBreakers.set(service, new CircuitBreaker({
failureThreshold: 5,
successThreshold: 3,
resetTimeoutMs: 30000,
windowSizeMs: 60000,
failureRateThreshold: 0.5,
}));
}
return this.circuitBreakers.get(service)!;
}
private isRetryableStatus(status: number): boolean {
return status === 502 || status === 503 || status === 504;
}
// Simplified helpers
private async verifyClientCert(req: IncomingRequest): Promise<string | null> { return null; }
private checkRateLimit(source: string, rule: RateLimitRule): boolean { return true; }
private injectTracing(req: IncomingRequest): void {}
private async forwardToLocal(req: IncomingRequest): Promise<GatewayResponse> { return {} as any; }
private async resolveService(name: string): Promise<ServiceEndpoint[]> { return []; }
private selectEndpoint(endpoints: ServiceEndpoint[]): ServiceEndpoint { return endpoints[0]; }
private async forward(req: IncomingRequest, endpoint: ServiceEndpoint): Promise<GatewayResponse> { return {} as any; }
}
interface ServiceEndpoint { host: string; port: number; }
// ─── Retry Budget ──────────────────────────────────────
// Prevents retry storms — limits total retry rate
class RetryBudget {
private requests: number[] = [];
private retries: number[] = [];
private windowMs = 10000;
constructor(
private maxRetryPercentage: number,
private minRetriesPerWindow: number
) {}
canRetry(): boolean {
this.trim();
const totalRequests = this.requests.length;
const totalRetries = this.retries.length;
// Always allow minimum retries
if (totalRetries < this.minRetriesPerWindow) return true;
// Cap at percentage of total requests
return totalRetries < totalRequests * this.maxRetryPercentage;
}
recordRequest(): void {
this.requests.push(Date.now());
}
recordRetry(): void {
this.retries.push(Date.now());
}
private trim(): void {
const cutoff = Date.now() - this.windowMs;
this.requests = this.requests.filter(t => t > cutoff);
this.retries = this.retries.filter(t => t > cutoff);
}
}
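The budget rule in `canRetry` is easiest to read with numbers. The sketch below restates it as a pure function and counts how many retries the budget admits at two request volumes within one window. `retryAllowed` and `maxRetriesAdmitted` are hypothetical helpers for this example; the ratio is expressed as a fraction (0.2 for 20%), which is what the multiplication in `canRetry` implies.

```typescript
// Pure version of the budget rule: retryRatio is a fraction (0.2 = 20%),
// minRetries is the floor that is always allowed inside one window.
function retryAllowed(
  requestsInWindow: number,
  retriesInWindow: number,
  retryRatio: number,
  minRetries: number
): boolean {
  if (retriesInWindow < minRetries) return true;
  return retriesInWindow < requestsInWindow * retryRatio;
}

// Counts how many retries the budget admits for a given request volume.
function maxRetriesAdmitted(requestsInWindow: number, retryRatio: number, minRetries: number): number {
  let retries = 0;
  while (retryAllowed(requestsInWindow, retries, retryRatio, minRetries)) retries++;
  return retries;
}

// Quiet service: 5 requests * 0.2 = 1, so the floor of 3 retries applies.
const quietService = maxRetriesAdmitted(5, 0.2, 3);

// Busy service: the cap is 20% of 1000 requests.
const busyService = maxRetriesAdmitted(1000, 0.2, 3);
```

The quiet service is admitted 3 retries and the busy one 200. The floor keeps low-traffic services able to retry at all, while the ratio stops a struggling upstream from receiving retry traffic in proportion to every failed call.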
Sidecar Proxy Traffic Flow
┌──────────────────────────────────────────────────────┐
│                   Pod / Container                    │
│                                                      │
│  ┌──────────────────────────────────────────┐        │
│  │          Sidecar Proxy (Envoy)           │        │
│  │                                          │        │
│  │   Inbound:15006         Outbound:15001   │        │
│  │   ┌───────────┐         ┌───────────┐    │        │
│  │   │ mTLS      │         │ Service   │    │        │
│  │   │ AuthZ     │         │ Discovery │    │        │
│  │   │ RateLimit │         │ LB        │    │        │
│  │   │ Tracing   │         │ Retry     │    │        │
│  │   └─────┬─────┘         │ Circuit   │    │        │
│  │         │               │ Breaker   │    │        │
│  │         │               └─────┬─────┘    │        │
│  └─────────┼─────────────────────┼──────────┘        │
│            │                     │                   │
│            ▼                     ▲                   │
│     ┌──────────────┐             │                   │
│     │     App      │─────────────┘                   │
│     │    :8080     │  (app thinks it's calling       │
│     │              │  localhost, sidecar intercepts) │
│     └──────────────┘                                 │
└──────────────────────────────────────────────────────┘
             │                     │
             ▼ inbound             ▼ outbound
          (from other           (to other
           services)             services)
6. Rate Limiter (Sliding Window)
// ─── Sliding Window Rate Limiter ───────────────────────
// Sliding-window-counter approximation: the previous fixed window's count is
// weighted by how much of that window still overlaps the sliding window.
// Relies on RateLimitConfig providing windowMs and requestsPerWindow.
class SlidingWindowRateLimiter {
  private windows = new Map<string, WindowState>();

  constructor(private config: RateLimitConfig) {}

  check(key: string): RateLimitResult {
    const now = Date.now();
    // Align to fixed window boundaries
    const windowStart = Math.floor(now / this.config.windowMs) * this.config.windowMs;
    const prevWindowStart = windowStart - this.config.windowMs;

    let state = this.windows.get(key);
    if (!state) {
      state = { currentCount: 0, previousCount: 0, windowStart };
      this.windows.set(key, state);
    }

    // Roll window if needed
    if (state.windowStart < windowStart) {
      if (state.windowStart === prevWindowStart) {
        // The stored window is the one immediately before the current one
        state.previousCount = state.currentCount;
      } else {
        // More than one full window has passed, so nothing carries over
        state.previousCount = 0;
      }
      state.currentCount = 0;
      state.windowStart = windowStart;
    }

    // Calculate weighted count
    const elapsed = now - windowStart;
    const weight = 1 - elapsed / this.config.windowMs;
    const count = state.currentCount + Math.floor(state.previousCount * weight);

    if (count >= this.config.requestsPerWindow) {
      return {
        allowed: false,
        limit: this.config.requestsPerWindow,
        remaining: 0,
        resetAt: windowStart + this.config.windowMs,
      };
    }

    // Admit the request and count it against the current window
    state.currentCount++;
    return {
      allowed: true,
      limit: this.config.requestsPerWindow,
      remaining: this.config.requestsPerWindow - count - 1,
      resetAt: windowStart + this.config.windowMs,
    };
  }
}

interface WindowState {
  currentCount: number;
  previousCount: number;
  windowStart: number; // start of the current fixed window (ms since epoch)
}

interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAt: number; // end of the current fixed window (ms since epoch)
}
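To make the weighting step concrete, here is a small standalone sketch of the same formula that check uses. The helper name weightedCount and the sample numbers are illustrative assumptions, not part of the limiter class itself.

```typescript
// Hypothetical helper mirroring the weighted-count step of the limiter.
function weightedCount(
  currentCount: number,
  previousCount: number,
  elapsedMs: number,
  windowMs: number,
): number {
  // Fraction of the previous window that still overlaps the sliding window.
  const weight = 1 - elapsedMs / windowMs;
  return currentCount + Math.floor(previousCount * weight);
}

// 60 s window, 15 s into the current one: 75% of the previous window still counts.
const estimate = weightedCount(20, 80, 15000, 60000);
console.log(estimate); // 20 + floor(80 * 0.75) = 80
```

With a limit of 100 requests per window, this request would be admitted and remaining would be reported as 19.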
Comparison: API Gateway vs Service Mesh
| Aspect | API Gateway | Service Mesh |
|---|---|---|
| Traffic direction | North-South (external → internal) | East-West (internal → internal) |
| Deployment | Centralized (one to a few instances) | Decentralized (sidecar per pod) |
| Audience | External clients, partners | Internal services |
| Auth | API keys, OAuth, JWT | mTLS, SPIFFE identity |
| Routing | Path, header-based | Service name, version |
| Rate limiting | Per client/API key | Per service pair |
| Protocol | HTTP/REST, GraphQL, WebSocket | HTTP/2, gRPC (any L4/L7) |
| Visibility | Request logs, analytics | Distributed tracing |
| Examples | Kong, APISIX, Envoy Gateway | Istio, Linkerd, Consul Connect |
| Config model | Admin API, declarative | Control plane (CRDs) |
Comparison: Gateway Implementations
| Feature | Nginx | Envoy | Kong | AWS API GW | Traefik |
|---|---|---|---|---|---|
| Config | Static file | xDS API (dynamic) | DB + Admin API | Console/CloudFormation | Labels/tags |
| Extensibility | Lua, C modules | WASM, C++ filters | Lua plugins | Lambda authorizers | Go plugins |
| Protocol | HTTP/1.1, HTTP/2 | HTTP/1.1, HTTP/2, gRPC | HTTP, gRPC | HTTP, WebSocket | HTTP, TCP, gRPC |
| Hot reload | Graceful restart | Full hot reload | Plugin hot reload | Managed | Hot reload |
| Service mesh | No (Nginx Plus) | Yes (Istio data plane) | Via Kong Mesh | No (separate App Mesh service) | Via Traefik Mesh (formerly Maesh) |
| Circuit breaker | No | Yes (outlier detection) | Plugin | No | Yes |
| Best for | Static sites, reverse proxy | Service mesh, dynamic | API management | Serverless | K8s ingress |
Interview Questions & Answers
Q1: How does a service mesh achieve zero-trust networking with mTLS?
A: Each workload's sidecar proxy holds a unique identity certificate issued by the mesh's certificate authority (CA). When Service A calls Service B, their sidecars perform mutual TLS: A presents its certificate to B's sidecar, B presents its certificate to A's sidecar, and each verifies the other's chain of trust back to the shared CA. The application code handles none of this: it speaks plaintext to its local sidecar, and the sidecar transparently encrypts and decrypts. The control plane's identity component (Citadel in older Istio releases, now part of istiod; the identity service in Linkerd) issues short-lived certificates and rotates them automatically (commonly with a 24-hour lifetime by default). Authorization policies can reference these identities: "only Service A can call Service B's /admin endpoint." This provides authentication (who is calling), encryption (traffic cannot be read in transit), and authorization (whether the caller is allowed), all without changing application code.
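The identity-based authorization step can be sketched as a small policy check. This is a simplified illustration, assuming the peer's SPIFFE ID has already been extracted from a verified client certificate; the AuthzRule shape, the isAllowed helper, and the service account names are hypothetical. Real meshes express this declaratively (for example as an Istio AuthorizationPolicy) rather than in application code.

```typescript
// Hypothetical policy shape, for illustration only.
interface AuthzRule {
  pathPrefix: string;
  allowedPrincipals: string[]; // SPIFFE IDs taken from verified peer certificates
}

function isAllowed(peerSpiffeId: string, path: string, rules: AuthzRule[]): boolean {
  // Deny by default: a request passes only if some rule matches both path and caller.
  // Simple prefix matching is used here to keep the sketch short.
  return rules.some(
    (rule) =>
      path.startsWith(rule.pathPrefix) &&
      rule.allowedPrincipals.some((principal) => principal === peerSpiffeId),
  );
}

const adminRules: AuthzRule[] = [
  {
    pathPrefix: "/admin",
    allowedPrincipals: ["spiffe://cluster.local/ns/default/sa/service-a"],
  },
];

console.log(isAllowed("spiffe://cluster.local/ns/default/sa/service-a", "/admin/users", adminRules)); // true
console.log(isAllowed("spiffe://cluster.local/ns/default/sa/service-c", "/admin/users", adminRules)); // false
```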
Q2: Explain the difference between retry and retry budget.
A: A retry policy says "retry failed requests up to 3 times." A retry budget says "retries must not exceed 20% of total requests." Without a budget, if a downstream service starts failing, every client sends its original request plus up to 3 retries, multiplying the load on an already struggling service by up to 4×. This is a retry storm, and it cascades failures. A retry budget (e.g., Envoy's retry_budget setting or Linkerd's retryBudget) tracks retries relative to regular request volume. Once retries exceed the budget (say 20%), additional retries are suppressed. A configured minimum (e.g., 3 retries per second) ensures transient errors can still be retried during low-traffic periods, when 20% of very little traffic would otherwise round down to zero. This keeps the benefit of retries for transient errors while preventing amplification during outages.
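A minimal sketch of the budget idea follows. The class name, parameter names, and defaults are illustrative assumptions and not the API of Envoy, Linkerd, or any other proxy. As a simplification, the minimum here is counted per window rather than per second, and timestamps are passed in explicitly so the behavior is easy to follow.

```typescript
// Minimal retry-budget sketch; all names and defaults are hypothetical.
class RetryBudget {
  private requests: number[] = []; // timestamps (ms) of original requests
  private retries: number[] = []; // timestamps (ms) of retries that were permitted

  constructor(
    private ratio = 0.2, // retries may add at most 20% extra load
    private minRetriesPerWindow = 3, // floor so low-traffic services can still retry
    private windowMs = 10000,
  ) {}

  recordRequest(now: number): void {
    this.requests.push(now);
  }

  // Returns true and consumes budget if a retry is permitted at time `now`.
  tryRetry(now: number): boolean {
    // Drop entries that have fallen out of the tracking window.
    const cutoff = now - this.windowMs;
    this.requests = this.requests.filter((t) => t > cutoff);
    this.retries = this.retries.filter((t) => t > cutoff);

    const allowed = Math.max(
      this.minRetriesPerWindow,
      Math.floor(this.requests.length * this.ratio),
    );
    if (this.retries.length >= allowed) return false;
    this.retries.push(now);
    return true;
  }
}

const budget = new RetryBudget();
for (let i = 0; i < 100; i++) budget.recordRequest(1000);

let permitted = 0;
for (let i = 0; i < 100; i++) {
  if (budget.tryRetry(1000)) permitted++;
}
console.log(permitted); // 20: only 20% of the 100 requests may be retried
```

Compare this with a plain "retry up to 3 times" policy, under which the same 100 failing requests could generate up to 300 retries.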
Q3: When should you put logic in the gateway vs the service mesh vs the application?
A: Gateway: External-facing concerns — API key validation, OAuth token verification, aggregate rate limiting, request/response transformation (e.g., REST → gRPC), public API versioning, and developer portal features. Service mesh: Cross-cutting infrastructure concerns — mTLS, per-service rate limiting, retries with budget, circuit breaking, distributed tracing propagation, canary deployments, and traffic shifting. Application: Business logic, data validation, authorization of business operations (not just identity), business event publishing, and domain-specific error handling. The key principle: keep business logic in the application, infrastructure concerns in the mesh, and external-boundary concerns in the gateway.
Q4: How do you handle API versioning at the gateway level?
A: Multiple strategies: (1) URL path versioning (/api/v1/users): Route /api/v1/* to Service-v1, /api/v2/* to Service-v2. Clean but URL changes. (2) Header versioning (Accept: application/vnd.api.v2+json): Gateway inspects header and routes accordingly. URL stays clean. (3) Query parameter (/users?version=2): Less common, harder to cache. (4) Traffic shifting: Run v1 and v2 simultaneously. Gradually route 10% → 50% → 100% to v2 using weighted routing. (5) Request transformation: Gateway rewrites v1 requests into v2 format, maintaining a single backend. Good for small differences. Best practice: URL path versioning for major versions, request transformation for minor versions, and traffic shifting for rollouts.
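Strategies (1) and (2) can be combined in a single resolution step at the gateway. The sketch below is illustrative only: the VersionedRequest shape, the UPSTREAMS table, and the users-v1 / users-v2 names are hypothetical, and header names are assumed to be lower-cased before this function runs.

```typescript
// Hypothetical request shape and upstream names, for illustration only.
interface VersionedRequest {
  path: string;
  headers: Record<string, string>; // assumed to use lower-cased header names
}

const UPSTREAMS: Record<string, string> = { "1": "users-v1", "2": "users-v2" };

function resolveUpstream(req: VersionedRequest, defaultVersion = "1"): string {
  // Strategy 1: URL path versioning, e.g. /api/v2/users
  const fromPath = /^\/api\/v(\d+)\//.exec(req.path);
  // Strategy 2: header versioning, e.g. Accept: application/vnd.api.v2+json
  const fromHeader = /application\/vnd\.api\.v(\d+)\+json/.exec(req.headers["accept"] ?? "");

  // The path wins over the header; unknown versions fall back to the default.
  const version = fromPath?.[1] ?? fromHeader?.[1] ?? defaultVersion;
  return UPSTREAMS[version] ?? UPSTREAMS[defaultVersion];
}

console.log(resolveUpstream({ path: "/api/v2/users", headers: {} })); // users-v2
console.log(resolveUpstream({ path: "/users", headers: { accept: "application/vnd.api.v2+json" } })); // users-v2
console.log(resolveUpstream({ path: "/users", headers: {} })); // users-v1
```

Falling back to a default version for unknown values is one possible design choice; a stricter gateway might instead reject the request with a 4xx response.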
Key Takeaways
- API gateways centralize north-south concerns — auth, rate limiting, TLS termination — preventing duplication across services
- Service meshes handle east-west traffic transparently — sidecar proxies add mTLS, retries, and observability without code changes
- Middleware composition (Koa-style onion model) enables clean separation of gateway concerns into reusable, testable functions
- Circuit breakers prevent cascading failures — open circuit → reject requests → wait → probe → close circuit on recovery
- Retry budgets prevent retry storms — limit retries to a percentage of total traffic to avoid amplifying load on failing services
- Radix tree routing enables efficient path matching with O(path_length) performance regardless of route count
- Service discovery + health checks are the foundation — without knowing which instances are healthy, no other feature works
- Load balancing strategy matters: round-robin for uniform services, least-connections for variable latency, consistent hash for session affinity
- mTLS provides zero-trust — every service connection is authenticated, encrypted, and authorized via sidecar-managed certificates
- Separate gateway (external boundary) from mesh (internal traffic) — they serve different audiences with different requirements