API Gateway System Design
Real-World Problem Context
Your company runs 40 microservices. The mobile app needs to authenticate, then call the product service, the pricing service, the recommendation service, and the inventory service — all for a single product page. Each service has its own host, its own auth mechanism, its own rate limits. The mobile client is making 4-5 HTTP calls per screen, eating battery and data. On top of that, your team just shipped a breaking change to the pricing API, and now every client is broken.
This is exactly the problem an API gateway solves. It sits between your clients and your services, acting as a single entry point that handles authentication, routing, rate limiting, protocol translation, and response aggregation — so your clients talk to one endpoint and your services stay decoupled.
Problem Statement
Without a gateway, every client must:
- Know the address of every service
- Handle authentication with each service independently
- Deal with different response formats and API versions
- Make multiple round-trips for a single screen
- Implement retry logic, circuit breaking, and timeout handling
This creates tight coupling between clients and services, makes cross-cutting concerns (auth, logging, rate limiting) inconsistent, and makes it nearly impossible to change your backend topology without breaking clients.
The core challenge: how do you provide a unified, stable API surface to clients while allowing backend services to evolve independently?
Potential Solutions
1. Simple Reverse Proxy (Nginx / Envoy)
Route requests based on URL path to different backend services:
# nginx.conf — basic API gateway
# Note: limit_req_zone must be declared in the http context,
# outside any server block.
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

upstream product_service {
    server product-svc:8080;
    server product-svc-2:8080;
}

upstream pricing_service {
    server pricing-svc:8080;
}

server {
    listen 443 ssl;
    server_name api.example.com;

    # Route by path prefix; apply the shared rate limit per location
    location /api/v1/products {
        limit_req zone=api burst=50 nodelay;
        proxy_pass http://product_service;
        proxy_set_header X-Request-ID $request_id;
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
    }

    location /api/v1/pricing {
        limit_req zone=api burst=50 nodelay;
        proxy_pass http://pricing_service;
        proxy_set_header X-Request-ID $request_id;
    }
}
2. Dedicated API Gateway (Kong, AWS API Gateway, Apigee)
Full-featured gateway with plugins for auth, rate limiting, transformation:
# Kong declarative config
_format_version: "3.0"
services:
  - name: product-service
    url: http://product-svc:8080
    routes:
      - name: products-route
        paths: ["/api/v1/products"]
        strip_path: false
    plugins:
      - name: jwt
        config:
          claims_to_verify: ["exp"]
      - name: rate-limiting
        config:
          minute: 60
          policy: redis
          redis_host: redis
      - name: request-transformer
        config:
          add:
            headers: ["X-Internal-Source:gateway"]
  - name: pricing-service
    url: http://pricing-svc:8080
    routes:
      - name: pricing-route
        paths: ["/api/v1/pricing"]
    plugins:
      - name: jwt
      - name: rate-limiting
        config:
          minute: 120
3. Custom Gateway with Request Aggregation
Build a gateway that composes multiple service calls into a single response:
from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import HTTPBearer
import asyncio

import httpx
import jwt  # PyJWT

app = FastAPI()
security = HTTPBearer()

PUBLIC_KEY = "..."  # RSA public key used to verify tokens

async def verify_token(credentials = Depends(security)):
    """Centralized auth — services don't need to validate tokens."""
    token = credentials.credentials
    try:
        # Validate JWT signature and expiration, extract claims
        claims = jwt.decode(token, PUBLIC_KEY, algorithms=["RS256"])
    except jwt.InvalidTokenError:
        raise HTTPException(status_code=401, detail="invalid or expired token")
    return claims

@app.get("/api/v1/product-page/{product_id}")
async def get_product_page(product_id: str, user = Depends(verify_token)):
    """
    Aggregate data from 3 services into a single response.
    Client makes 1 call instead of 3.
    """
    async with httpx.AsyncClient(timeout=5.0) as client:
        # Fan out to services in parallel
        product_task = client.get(f"http://product-svc:8080/products/{product_id}")
        pricing_task = client.get(f"http://pricing-svc:8080/pricing/{product_id}")
        inventory_task = client.get(f"http://inventory-svc:8080/stock/{product_id}")
        product_resp, pricing_resp, inventory_resp = await asyncio.gather(
            product_task, pricing_task, inventory_task,
            return_exceptions=True,
        )

    # Compose response — graceful degradation if a non-critical service fails
    if isinstance(product_resp, Exception):
        raise HTTPException(status_code=502, detail="product service unavailable")
    result = {"product": product_resp.json()}
    if not isinstance(pricing_resp, Exception):
        result["pricing"] = pricing_resp.json()
    else:
        result["pricing"] = {"error": "temporarily unavailable"}
    if not isinstance(inventory_resp, Exception):
        result["inventory"] = inventory_resp.json()
    else:
        result["inventory"] = {"error": "temporarily unavailable"}
    return result
4. Backend for Frontend (BFF) Pattern
Separate gateways per client type:
┌──────────┐ ┌───────────────┐
│ Mobile │────▶│ Mobile BFF │──┐
│ App │ │ (lightweight │ │
└──────────┘ │ responses) │ │
└───────────────┘ │ ┌─────────────┐
├──▶│ Product Svc │
┌──────────┐ ┌───────────────┐ │ ├─────────────┤
│ Web │────▶│ Web BFF │──┤ │ Pricing Svc │
│ Browser │ │ (rich │ │ ├─────────────┤
└──────────┘ │ responses) │ │ │ Inventory │
└───────────────┘ │ └─────────────┘
│
┌──────────┐ ┌───────────────┐ │
│ 3rd Party│────▶│ Public API │──┘
│ Devs │ │ Gateway │
└──────────┘ │ (versioned) │
└───────────────┘
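The heart of the BFF pattern is that each gateway shapes the same upstream data for its own client. A minimal sketch of that shaping step — the record, field names, and view functions are all illustrative, not a real service contract:

```python
# The same upstream product record, shaped differently per client.
# FULL_PRODUCT is a hypothetical product-service response.
FULL_PRODUCT = {
    "id": "p-123",
    "name": "Trail Shoe",
    "description": "A very long marketing description ...",
    "images": ["hero.jpg", "side.jpg", "sole.jpg"],
    "specs": {"weight_g": 280, "drop_mm": 6},
}

def mobile_bff_view(product: dict) -> dict:
    """Mobile BFF: trim the payload — small screens, metered data."""
    return {
        "id": product["id"],
        "name": product["name"],
        "image": product["images"][0],  # single hero image only
    }

def web_bff_view(product: dict) -> dict:
    """Web BFF: rich response — full description, all images, specs."""
    return dict(product)
```

The mobile view drops the description and extra images entirely, which is exactly the kind of per-client decision that would otherwise leak into either the client or the shared services.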
Trade-offs & Considerations
Approach          Pros                           Cons                          Best When
──────────────────────────────────────────────────────────────────────────────────────────
Reverse Proxy     Simple, fast, battle-tested;   No aggregation, limited       Few services,
(Nginx/Envoy)     low latency overhead           transformation                simple routing
Managed Gateway   Rich plugin ecosystem,         Vendor lock-in, cost at       Medium teams,
(Kong/AWS)        managed infra, dashboards      scale, latency overhead       standard patterns
Custom Gateway    Full control, aggregation,     Maintenance burden,           Complex aggregation,
(Code)            exact business logic           another service to deploy     unique requirements
BFF Pattern       Optimal per-client UX,         Multiple gateways to          Multiple client
                  independent deployment         maintain, code duplication    types (mobile/web)
Best Practices
- Keep the gateway thin — route, authenticate, rate-limit. Don't put business logic in the gateway. It should be a pass-through, not a monolith.
- Centralize cross-cutting concerns — authentication, logging, request tracing, CORS, and rate limiting belong in the gateway, not in every service.
- Set aggressive timeouts — the gateway should time out faster than the client. If a backend is slow, fail fast and return a partial response.
- Version your APIs at the gateway — route /api/v1/products to service-v1 and /api/v2/products to service-v2. Clients don't know about the internal routing.
- Implement circuit breakers — if a backend service is failing, stop sending traffic to it. Return cached/default responses instead of cascading the failure.
- Cache aggressively at the edge — GET responses with Cache-Control headers can be served from the gateway's cache, eliminating backend calls entirely.
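The circuit-breaker practice above can be sketched as a small state machine: closed (count failures), open (fail fast for a cooldown), half-open (one trial call decides). This is an illustrative minimum, not a production implementation — real gateways track per-backend state and expose it as metrics:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker sketch.

    Closed: calls pass through; failures are counted.
    Open:   calls fail fast until recovery_timeout elapses.
    Half-open: one trial call decides whether to close or re-open.
    """

    def __init__(self, failure_threshold=3, recovery_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, fallback=None):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.recovery_timeout:
                return fallback  # fail fast: don't touch the backend
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            return fallback
        self.failures = 0  # success closes the circuit again
        return result
```

Usage: `breaker.call(fetch_pricing, product_id, fallback={"error": "temporarily unavailable"})` — the fallback is exactly the cached/default response the bullet describes.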
Step-by-Step Approach
Step 1: Start with a reverse proxy (Nginx or Envoy)
├── Route by path prefix to backend services
├── Add TLS termination
└── Configure health checks
Step 2: Add authentication
├── JWT validation at the gateway
├── Forward user claims as headers to services
└── Services trust the gateway (internal network only)
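Step 2's "validate once, forward claims as headers" can be sketched with a simplified HMAC-signed token. Real deployments use standard JWTs (RS256, via a library such as PyJWT, as in the custom-gateway example earlier); this stdlib-only version just shows the gateway-side flow, and the header names are illustrative:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical shared secret; real setups use RS256 key pairs

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def sign_token(claims: dict) -> str:
    """Mint a compact payload.signature token (simplified JWT shape)."""
    payload = b64url(json.dumps(claims, sort_keys=True).encode())
    sig = b64url(hmac.new(SECRET, payload, hashlib.sha256).digest())
    return (payload + b"." + sig).decode()

def verify_and_forward(token: str) -> dict:
    """Gateway step: verify the signature once, then turn claims into
    headers that internal services can trust without re-validating."""
    payload, sig = token.encode().split(b".")
    expected = b64url(hmac.new(SECRET, payload, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise PermissionError("invalid token signature")
    padded = payload + b"=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return {"X-User-Id": claims["sub"], "X-User-Role": claims.get("role", "user")}
```

Because services only see these headers over the internal network, they never need the verification key — which is the point of centralizing auth at the gateway.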
Step 3: Add rate limiting and throttling
├── Per-client rate limits (API key based)
├── Global rate limits per endpoint
└── Use Redis for distributed rate limit counters
Step 4: Add observability
├── Inject X-Request-ID for distributed tracing
├── Log request/response metadata (not bodies)
└── Expose metrics: latency, error rate, throughput
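Step 4's request-ID injection and metadata-only logging amount to a few lines at the gateway's edge. A sketch — the header name `X-Request-ID` matches the nginx config earlier, and the log fields are illustrative:

```python
import json
import time
import uuid

def with_request_id(headers: dict) -> dict:
    """Reuse the client's X-Request-ID if present, otherwise mint one,
    so every hop in the call chain logs the same correlation id."""
    out = dict(headers)
    out.setdefault("X-Request-ID", str(uuid.uuid4()))
    return out

def log_request(method: str, path: str, status: int,
                started: float, headers: dict) -> str:
    """Emit one structured log line per request — metadata only,
    never request or response bodies."""
    record = {
        "request_id": headers["X-Request-ID"],
        "method": method,
        "path": path,
        "status": status,
        "latency_ms": round((time.time() - started) * 1000, 1),
    }
    return json.dumps(record)
```

Because the same `X-Request-ID` is forwarded to every backend, grepping one id across all service logs reconstructs the full fan-out of a single client request.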
Step 5: Add response aggregation (if needed)
├── Identify screens that call multiple services
├── Build composite endpoints in the gateway
└── Fan out in parallel, merge responses
Step 6: Add resilience
├── Circuit breakers per backend service
├── Retry with exponential backoff for idempotent GETs
└── Fallback responses for non-critical data
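Step 6's retry policy can be sketched as a small wrapper. The `fetch` callable and parameters are illustrative; the key constraint is the one the step names — only retry idempotent GETs, since retrying a write could duplicate its side effect:

```python
import random
import time

def get_with_retry(fetch, retries: int = 3, base_delay: float = 0.1,
                   sleep=time.sleep):
    """Retry an idempotent GET with exponential backoff and jitter.

    `fetch` is any zero-argument callable that raises on failure.
    `sleep` is injectable so tests don't actually wait.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the failure
            # delays grow 0.1s, 0.2s, 0.4s, ... with jitter to avoid
            # synchronized retry storms against a recovering backend
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

In practice this wrapper sits inside the circuit breaker, so a backend that keeps failing stops being retried at all once the breaker opens.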
Conclusion
An API gateway is the front door to your microservices architecture. Start simple with a reverse proxy handling routing and TLS, then layer on auth, rate limiting, and observability as your system grows. Avoid the trap of putting business logic in the gateway — it should remain a thin orchestration layer. For complex client needs, consider the BFF pattern with separate gateways per client type. The key decision is build vs. buy: managed gateways (Kong, AWS API Gateway) save time but cost money and flexibility; custom gateways give full control but add maintenance burden.