Building Rate Limiting, Throttling and Abuse Protection Into Your Next.js API Routes
Not just setTimeout — sliding window algorithms, Redis-backed rate limiting, IP fingerprinting, and integrating it cleanly without bloating your route handlers.
Why This Matters More Than You Think
Every public API endpoint is a potential attack vector. Without rate limiting:
- A single user can exhaust your database connections
- Bots can scrape your entire product catalog
- Credential stuffing attacks go unchecked
- Your Vercel bill becomes terrifying
- One angry user can take down your service
The tutorial version of rate limiting — "just count requests" — breaks down immediately:
// The naive approach that doesn't work
let requestCount = 0;
export async function POST(request: Request) {
requestCount++;
if (requestCount > 100) {
return new Response('Too many requests', { status: 429 });
}
// ... handle request
}
// Problems:
// 1. Counter resets on every deployment
// 2. Doesn't work across multiple serverless instances
// 3. No per-user tracking
// 4. No time window
// 5. No recovery mechanism
This post covers production-grade rate limiting that actually works in serverless environments.
The Algorithms
Fixed Window
The simplest algorithm: count requests in fixed time buckets.
┌─────────────────────────────────────────────────────────────────┐
│ FIXED WINDOW │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Window 1 (00:00-01:00) Window 2 (01:00-02:00) │
│ ┌───────────────────┐ ┌───────────────────┐ │
│ │ ████████████████░░│ │███░░░░░░░░░░░░░░░░│ │
│ │ 80/100 requests │ │30/100 requests │ │
│ └───────────────────┘ └───────────────────┘ │
│ │
│ Problem: Burst at window boundary │
│ │
│ Window 1 Window 2 │
│ ─────────────┬──────────────────── │
│ │ │
│ ░░░░░░░░░████│████░░░░░░░░░░░░░░ │
│ 100 req│100 req │
│ │ │
│ 200 requests in 2 seconds! (at boundary) │
│ │
└─────────────────────────────────────────────────────────────────┘
// Fixed window implementation
interface FixedWindowState {
count: number;
windowStart: number;
}
class FixedWindowRateLimiter {
private windows = new Map<string, FixedWindowState>();
constructor(
private maxRequests: number,
private windowMs: number
) {}
check(identifier: string): { allowed: boolean; remaining: number; resetAt: number } {
const now = Date.now();
const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
const windowEnd = windowStart + this.windowMs;
let state = this.windows.get(identifier);
// New window or expired window
if (!state || state.windowStart !== windowStart) {
state = { count: 0, windowStart };
this.windows.set(identifier, state);
}
const allowed = state.count < this.maxRequests;
if (allowed) {
state.count++;
}
return {
allowed,
remaining: Math.max(0, this.maxRequests - state.count),
resetAt: windowEnd,
};
}
}
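The boundary-burst problem from the diagram is easy to demonstrate. Here is a condensed, self-contained sketch of the same fixed-window counting logic, driven by explicit timestamps instead of Date.now() so the boundary is deterministic:

```typescript
// Condensed fixed-window counter, used only to demonstrate the boundary burst.
function makeFixedWindow(maxRequests: number, windowMs: number) {
  const counts = new Map<number, number>();
  return (now: number): boolean => {
    // Bucket requests into fixed windows of windowMs
    const windowStart = Math.floor(now / windowMs) * windowMs;
    const count = counts.get(windowStart) ?? 0;
    if (count >= maxRequests) return false;
    counts.set(windowStart, count + 1);
    return true;
  };
}

const check = makeFixedWindow(100, 60_000); // 100 requests per minute
let allowed = 0;
// 100 requests just before the window boundary, 100 just after: all pass.
for (let i = 0; i < 100; i++) if (check(59_999)) allowed++;
for (let i = 0; i < 100; i++) if (check(60_001)) allowed++;
console.log(allowed); // 200 requests accepted within 2 ms of wall time
```

This is exactly the weakness the sliding window variants below are designed to close.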
Sliding Window Log
Track timestamps of each request. More accurate but memory-intensive.
┌─────────────────────────────────────────────────────────────────┐
│ SLIDING WINDOW LOG │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Current time: 01:30 │
│ Window: 1 hour │
│ Max: 100 requests │
│ │
│ Stored timestamps: │
│ [00:35, 00:45, 00:50, 01:05, 01:10, 01:15, 01:20, 01:25] │
│ ↓ ↓ │
│ expired expired (remove these) │
│ │
│ Active requests: 6 │
│ Remaining: 94 │
│ │
│ Pros: Perfectly accurate │
│ Cons: O(n) memory per user, O(n) cleanup │
│ │
└─────────────────────────────────────────────────────────────────┘
// Sliding window log implementation
class SlidingWindowLog {
private logs = new Map<string, number[]>();
constructor(
private maxRequests: number,
private windowMs: number
) {}
check(identifier: string): { allowed: boolean; remaining: number } {
const now = Date.now();
const windowStart = now - this.windowMs;
// Get or create log
let timestamps = this.logs.get(identifier) || [];
// Remove expired entries
timestamps = timestamps.filter((ts) => ts > windowStart);
const allowed = timestamps.length < this.maxRequests;
if (allowed) {
timestamps.push(now);
}
this.logs.set(identifier, timestamps);
return {
allowed,
remaining: Math.max(0, this.maxRequests - timestamps.length),
};
}
}
Sliding Window Counter (Best for Production)
Hybrid approach: fixed window storage with sliding window accuracy.
┌─────────────────────────────────────────────────────────────────┐
│ SLIDING WINDOW COUNTER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Current time: 01:30 (30 mins into window 2) │
│ Window size: 1 hour │
│ Max: 100 requests │
│ │
│ Window 1 (00:00-01:00): 80 requests │
│ Window 2 (01:00-02:00): 30 requests (so far) │
│ │
│ Calculation: │
│ ├── Window 2 progress: 30/60 = 50% │
│ ├── Window 1 weight: 1 - 0.5 = 50% │
│ ├── Window 1 contribution: 80 * 0.5 = 40 │
│ ├── Window 2 contribution: 30 │
│ └── Estimated count: 40 + 30 = 70 │
│ │
│ Remaining: 100 - 70 = 30 requests │
│ │
│ Pros: O(1) memory per user, accurate estimate │
│ Cons: Slight estimation error (acceptable) │
│ │
└─────────────────────────────────────────────────────────────────┘
// Sliding window counter implementation
interface WindowData {
count: number;
timestamp: number;
}
class SlidingWindowCounter {
private windows = new Map<string, { current: WindowData; previous: WindowData }>();
constructor(
private maxRequests: number,
private windowMs: number
) {}
check(identifier: string): {
allowed: boolean;
remaining: number;
resetAt: number;
} {
const now = Date.now();
const currentWindowStart = Math.floor(now / this.windowMs) * this.windowMs;
const previousWindowStart = currentWindowStart - this.windowMs;
let data = this.windows.get(identifier);
if (!data) {
data = {
current: { count: 0, timestamp: currentWindowStart },
previous: { count: 0, timestamp: previousWindowStart },
};
this.windows.set(identifier, data);
}
// Rotate windows if needed
if (data.current.timestamp !== currentWindowStart) {
if (data.current.timestamp === previousWindowStart) {
// Current becomes previous
data.previous = data.current;
} else {
// Gap - reset previous
data.previous = { count: 0, timestamp: previousWindowStart };
}
data.current = { count: 0, timestamp: currentWindowStart };
}
// Calculate weighted count
const elapsedInCurrentWindow = now - currentWindowStart;
const windowProgress = elapsedInCurrentWindow / this.windowMs;
const previousWeight = 1 - windowProgress;
const estimatedCount =
data.previous.count * previousWeight + data.current.count;
const allowed = estimatedCount < this.maxRequests;
if (allowed) {
data.current.count++;
}
const newEstimatedCount = estimatedCount + (allowed ? 1 : 0);
return {
allowed,
remaining: Math.max(0, Math.floor(this.maxRequests - newEstimatedCount)),
resetAt: currentWindowStart + this.windowMs,
};
}
}
Token Bucket (For Burst Handling)
Allow bursts up to a limit, then enforce steady rate.
┌─────────────────────────────────────────────────────────────────┐
│ TOKEN BUCKET │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Bucket capacity: 10 tokens │
│ Refill rate: 1 token/second │
│ │
│ Time 0: [██████████] 10 tokens │
│ Burst: [░░░░░░░░░░] 0 tokens (10 requests instantly) │
│ +5 sec: [█████░░░░░] 5 tokens (refilled) │
│ +3 req: [██░░░░░░░░] 2 tokens │
│ │
│ Allows: │
│ - Instant burst up to bucket size │
│ - Sustained rate = refill rate │
│ - Smooth rate limiting over time │
│ │
└─────────────────────────────────────────────────────────────────┘
// Token bucket implementation
interface BucketState {
tokens: number;
lastRefill: number;
}
class TokenBucket {
private buckets = new Map<string, BucketState>();
constructor(
private bucketSize: number, // Max tokens
private refillRate: number, // Tokens per second
) {}
check(identifier: string, cost = 1): {
allowed: boolean;
remaining: number;
retryAfter?: number;
} {
const now = Date.now();
let bucket = this.buckets.get(identifier);
if (!bucket) {
bucket = { tokens: this.bucketSize, lastRefill: now };
this.buckets.set(identifier, bucket);
}
// Refill tokens based on time elapsed
const elapsed = (now - bucket.lastRefill) / 1000; // seconds
const refillAmount = elapsed * this.refillRate;
bucket.tokens = Math.min(this.bucketSize, bucket.tokens + refillAmount);
bucket.lastRefill = now;
const allowed = bucket.tokens >= cost;
if (allowed) {
bucket.tokens -= cost;
}
// Calculate when enough tokens will be available
const tokensNeeded = cost - bucket.tokens;
const retryAfter = allowed
? undefined
: Math.ceil((tokensNeeded / this.refillRate) * 1000);
return {
allowed,
remaining: Math.floor(bucket.tokens),
retryAfter,
};
}
}
Redis-Backed Rate Limiting
In-memory rate limiting doesn't work in serverless. You need distributed state.
Redis Setup
// lib/redis.ts
import { Redis } from '@upstash/redis';
// For Upstash (serverless-friendly)
export const redis = new Redis({
url: process.env.UPSTASH_REDIS_REST_URL!,
token: process.env.UPSTASH_REDIS_REST_TOKEN!,
});
// Or for ioredis (traditional Redis)
// import Redis from 'ioredis';
// export const redis = new Redis(process.env.REDIS_URL);
Sliding Window Counter with Redis
// lib/rate-limiter.ts
import { redis } from './redis';
interface RateLimitResult {
success: boolean;
limit: number;
remaining: number;
reset: number;
retryAfter?: number;
}
interface RateLimitConfig {
maxRequests: number;
windowMs: number;
}
export async function rateLimit(
identifier: string,
config: RateLimitConfig
): Promise<RateLimitResult> {
const { maxRequests, windowMs } = config;
const now = Date.now();
const windowStart = Math.floor(now / windowMs) * windowMs;
const previousWindowStart = windowStart - windowMs;
const currentKey = `ratelimit:${identifier}:${windowStart}`;
const previousKey = `ratelimit:${identifier}:${previousWindowStart}`;
// Lua script for atomic operation
const script = `
local currentKey = KEYS[1]
local previousKey = KEYS[2]
local maxRequests = tonumber(ARGV[1])
local windowMs = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local windowStart = tonumber(ARGV[4])
-- Get counts
local currentCount = tonumber(redis.call('GET', currentKey) or '0')
local previousCount = tonumber(redis.call('GET', previousKey) or '0')
-- Calculate weighted count
local elapsedMs = now - windowStart
local previousWeight = 1 - (elapsedMs / windowMs)
local estimatedCount = (previousCount * previousWeight) + currentCount
if estimatedCount >= maxRequests then
return {0, currentCount, previousCount, estimatedCount}
end
-- Increment and set expiry
local newCount = redis.call('INCR', currentKey)
redis.call('PEXPIRE', currentKey, windowMs * 2)
return {1, newCount, previousCount, estimatedCount + 1}
`;
const result = await redis.eval(
script,
[currentKey, previousKey],
[maxRequests, windowMs, now, windowStart]
) as [number, number, number, number];
const [success, currentCount, previousCount, estimatedCount] = result;
const remaining = Math.max(0, Math.floor(maxRequests - estimatedCount));
const reset = windowStart + windowMs;
return {
success: success === 1,
limit: maxRequests,
remaining,
reset,
retryAfter: success === 1 ? undefined : Math.ceil((reset - now) / 1000),
};
}
Token Bucket with Redis
// lib/token-bucket.ts
import { redis } from './redis';
interface TokenBucketConfig {
bucketSize: number;
refillRate: number; // tokens per second
}
export async function tokenBucket(
identifier: string,
config: TokenBucketConfig,
cost = 1
): Promise<{
success: boolean;
remaining: number;
retryAfter?: number;
}> {
const { bucketSize, refillRate } = config;
const key = `tokenbucket:${identifier}`;
const now = Date.now();
const script = `
local key = KEYS[1]
local bucketSize = tonumber(ARGV[1])
local refillRate = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])
local data = redis.call('HMGET', key, 'tokens', 'lastRefill')
local tokens = tonumber(data[1]) or bucketSize
local lastRefill = tonumber(data[2]) or now
-- Refill tokens
local elapsed = (now - lastRefill) / 1000
tokens = math.min(bucketSize, tokens + (elapsed * refillRate))
local allowed = 0
if tokens >= cost then
tokens = tokens - cost
allowed = 1
end
-- Save state
redis.call('HMSET', key, 'tokens', tokens, 'lastRefill', now)
redis.call('EXPIRE', key, 86400) -- 24 hour expiry
-- Calculate retry after
local retryAfter = 0
if allowed == 0 then
retryAfter = math.ceil((cost - tokens) / refillRate * 1000)
end
return {allowed, tokens, retryAfter}
`;
const result = await redis.eval(
script,
[key],
[bucketSize, refillRate, cost, now]
) as [number, number, number];
const [allowed, tokens, retryAfter] = result;
return {
success: allowed === 1,
remaining: Math.floor(tokens),
retryAfter: allowed === 1 ? undefined : retryAfter,
};
}
Identifying Users
IP addresses aren't enough. You need robust identification.
IP Extraction
// lib/ip.ts
import { headers } from 'next/headers';
export function getClientIP(): string {
const headersList = headers();
// Vercel (note: x-forwarded-for is client-spoofable unless your platform
// overwrites it at the edge, as Vercel and Cloudflare do)
const forwardedFor = headersList.get('x-forwarded-for');
if (forwardedFor) {
// Take the first IP (the client IP in the chain)
return forwardedFor.split(',')[0].trim();
}
// Cloudflare
const cfConnectingIP = headersList.get('cf-connecting-ip');
if (cfConnectingIP) {
return cfConnectingIP;
}
// AWS/Other
const realIP = headersList.get('x-real-ip');
if (realIP) {
return realIP;
}
// Fallback
return 'unknown';
}
Fingerprinting
// lib/fingerprint.ts
import { headers } from 'next/headers';
import { createHash } from 'crypto';
import { getClientIP } from './ip';
export function generateFingerprint(): string {
const headersList = headers();
const components = [
getClientIP(),
headersList.get('user-agent') || '',
headersList.get('accept-language') || '',
headersList.get('accept-encoding') || '',
// Add more stable headers
];
const fingerprint = components.join('|');
return createHash('sha256').update(fingerprint).digest('hex').slice(0, 16);
}
// More sophisticated fingerprinting
export function generateRobustFingerprint(): string {
const headersList = headers();
const components = {
ip: getClientIP(),
ua: headersList.get('user-agent') || '',
lang: headersList.get('accept-language')?.split(',')[0] || '',
// Normalize to handle minor variations
};
// Hash to create consistent identifier
return createHash('sha256')
.update(JSON.stringify(components))
.digest('hex')
.slice(0, 16);
}
Tiered Identification
// lib/identifier.ts
import { cookies } from 'next/headers';
import { getServerSession } from 'next-auth';
import { getClientIP } from './ip';
import { generateFingerprint } from './fingerprint';
export type IdentifierType = 'user' | 'session' | 'ip' | 'fingerprint';
interface Identifier {
type: IdentifierType;
value: string;
trustLevel: number; // 1-4, higher is more trusted
}
export async function getIdentifier(): Promise<Identifier> {
// 1. Authenticated user (most trusted)
const session = await getServerSession();
if (session?.user?.id) {
return {
type: 'user',
value: `user:${session.user.id}`,
trustLevel: 4,
};
}
// 2. Session cookie
const cookieStore = cookies();
const sessionId = cookieStore.get('session')?.value;
if (sessionId) {
return {
type: 'session',
value: `session:${sessionId}`,
trustLevel: 3,
};
}
// 3. IP address
const ip = getClientIP();
if (ip !== 'unknown') {
return {
type: 'ip',
value: `ip:${ip}`,
trustLevel: 2,
};
}
// 4. Fingerprint (least trusted)
return {
type: 'fingerprint',
value: `fp:${generateFingerprint()}`,
trustLevel: 1,
};
}
// Different limits based on trust level
export function getLimitsForIdentifier(identifier: Identifier): {
maxRequests: number;
windowMs: number;
} {
switch (identifier.trustLevel) {
case 4: // Authenticated user
return { maxRequests: 1000, windowMs: 60000 }; // 1000/min
case 3: // Session
return { maxRequests: 500, windowMs: 60000 }; // 500/min
case 2: // IP
return { maxRequests: 100, windowMs: 60000 }; // 100/min
default: // Fingerprint
return { maxRequests: 30, windowMs: 60000 }; // 30/min
}
}
Middleware Integration
Global Rate Limiting Middleware
// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
import { rateLimit } from './lib/rate-limiter';
// Routes to protect
const PROTECTED_PATHS = ['/api/'];
// Rate limit config by path pattern (first match wins, so list specific patterns first)
const RATE_LIMITS: Record<string, { maxRequests: number; windowMs: number }> = {
'/api/auth/': { maxRequests: 10, windowMs: 60000 }, // 10/min for auth
'/api/upload': { maxRequests: 5, windowMs: 60000 }, // 5/min for uploads
'/api/': { maxRequests: 100, windowMs: 60000 }, // 100/min default
};
function getRateLimitConfig(pathname: string) {
for (const [pattern, config] of Object.entries(RATE_LIMITS)) {
if (pathname.startsWith(pattern)) {
return config;
}
}
return { maxRequests: 100, windowMs: 60000 };
}
export async function middleware(request: NextRequest) {
const pathname = request.nextUrl.pathname;
// Only rate limit API routes
if (!PROTECTED_PATHS.some((path) => pathname.startsWith(path))) {
return NextResponse.next();
}
// Get identifier
const ip =
request.headers.get('x-forwarded-for')?.split(',')[0] ||
request.headers.get('x-real-ip') ||
'unknown';
const identifier = `${ip}:${pathname}`;
const config = getRateLimitConfig(pathname);
try {
const result = await rateLimit(identifier, config);
if (!result.success) {
return new NextResponse(
JSON.stringify({
error: 'Too many requests',
retryAfter: result.retryAfter,
}),
{
status: 429,
headers: {
'Content-Type': 'application/json',
'X-RateLimit-Limit': result.limit.toString(),
'X-RateLimit-Remaining': result.remaining.toString(),
'X-RateLimit-Reset': result.reset.toString(),
'Retry-After': result.retryAfter?.toString() || '',
},
}
);
}
// Add rate limit headers to successful responses
const response = NextResponse.next();
response.headers.set('X-RateLimit-Limit', result.limit.toString());
response.headers.set('X-RateLimit-Remaining', result.remaining.toString());
response.headers.set('X-RateLimit-Reset', result.reset.toString());
return response;
} catch (error) {
// If rate limiter fails, allow the request (fail open)
console.error('Rate limiter error:', error);
return NextResponse.next();
}
}
export const config = {
matcher: '/api/:path*',
};
Route-Level Rate Limiting
For more granular control, use a wrapper function:
// lib/with-rate-limit.ts
import { NextRequest, NextResponse } from 'next/server';
import { rateLimit } from './rate-limiter';
import { getIdentifier, getLimitsForIdentifier } from './identifier';
interface RateLimitOptions {
maxRequests?: number;
windowMs?: number;
keyPrefix?: string;
skip?: (request: NextRequest) => boolean | Promise<boolean>;
}
type RouteHandler = (
request: NextRequest,
context?: any
) => Promise<Response> | Response;
export function withRateLimit(
handler: RouteHandler,
options: RateLimitOptions = {}
): RouteHandler {
return async (request: NextRequest, context?: any) => {
// Check if should skip
if (options.skip && (await options.skip(request))) {
return handler(request, context);
}
// Get identifier and limits
const identifier = await getIdentifier();
const defaultLimits = getLimitsForIdentifier(identifier);
const config = {
maxRequests: options.maxRequests ?? defaultLimits.maxRequests,
windowMs: options.windowMs ?? defaultLimits.windowMs,
};
const key = options.keyPrefix
? `${options.keyPrefix}:${identifier.value}`
: identifier.value;
try {
const result = await rateLimit(key, config);
if (!result.success) {
return NextResponse.json(
{
error: 'Too many requests',
message: `Rate limit exceeded. Try again in ${result.retryAfter} seconds.`,
},
{
status: 429,
headers: {
'X-RateLimit-Limit': result.limit.toString(),
'X-RateLimit-Remaining': '0',
'X-RateLimit-Reset': result.reset.toString(),
'Retry-After': result.retryAfter?.toString() || '',
},
}
);
}
// Execute handler
const response = await handler(request, context);
// Clone response to add headers
const newResponse = new NextResponse(response.body, {
status: response.status,
statusText: response.statusText,
headers: response.headers,
});
newResponse.headers.set('X-RateLimit-Limit', result.limit.toString());
newResponse.headers.set(
'X-RateLimit-Remaining',
result.remaining.toString()
);
newResponse.headers.set('X-RateLimit-Reset', result.reset.toString());
return newResponse;
} catch (error) {
console.error('Rate limit error:', error);
// Fail open - allow request if rate limiter fails
return handler(request, context);
}
};
}
Usage in Route Handlers
// app/api/posts/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { withRateLimit } from '@/lib/with-rate-limit';
// Standard rate limits
export const GET = withRateLimit(async (request: NextRequest) => {
const posts = await getPosts();
return NextResponse.json(posts);
});
// Stricter limits for mutations
export const POST = withRateLimit(
async (request: NextRequest) => {
const body = await request.json();
const post = await createPost(body);
return NextResponse.json(post, { status: 201 });
},
{
maxRequests: 10,
windowMs: 60000,
keyPrefix: 'create-post',
}
);
// app/api/auth/login/route.ts
import { withRateLimit } from '@/lib/with-rate-limit';
// Very strict limits for auth endpoints
export const POST = withRateLimit(
async (request: NextRequest) => {
const { email, password } = await request.json();
const user = await authenticate(email, password);
if (!user) {
// Additional rate limiting on failed attempts
await trackFailedLogin(email);
return NextResponse.json({ error: 'Invalid credentials' }, { status: 401 });
}
return NextResponse.json({ user, token: generateToken(user) });
},
{
maxRequests: 5,
windowMs: 60000, // 5 attempts per minute
keyPrefix: 'login',
}
);
Abuse Protection Patterns
Progressive Penalties
// lib/progressive-rate-limit.ts
import { redis } from './redis';
import { rateLimit } from './rate-limiter';
interface PenaltyState {
violations: number;
penaltyMultiplier: number;
penaltyUntil: number;
}
export async function progressiveRateLimit(
identifier: string,
baseConfig: { maxRequests: number; windowMs: number }
): Promise<{
success: boolean;
remaining: number;
penaltyMultiplier: number;
}> {
const penaltyKey = `penalty:${identifier}`;
// Get penalty state
const penaltyData = await redis.get<PenaltyState>(penaltyKey);
const now = Date.now();
let multiplier = 1;
if (penaltyData && penaltyData.penaltyUntil > now) {
multiplier = penaltyData.penaltyMultiplier;
}
// Apply penalty to limits
const effectiveConfig = {
maxRequests: Math.floor(baseConfig.maxRequests / multiplier),
windowMs: baseConfig.windowMs,
};
const result = await rateLimit(identifier, effectiveConfig);
if (!result.success) {
// Increase penalty on violation
const newViolations = (penaltyData?.violations || 0) + 1;
const newMultiplier = Math.min(8, Math.pow(2, newViolations - 1));
await redis.set(
penaltyKey,
{
violations: newViolations,
penaltyMultiplier: newMultiplier,
penaltyUntil: now + 3600000 * newMultiplier, // Penalty duration scales
},
{ ex: 86400 } // 24 hour max
);
}
return {
success: result.success,
remaining: result.remaining,
penaltyMultiplier: multiplier,
};
}
Endpoint-Specific Protection
// lib/endpoint-protection.ts
interface EndpointConfig {
// Basic rate limiting
rateLimit: { maxRequests: number; windowMs: number };
// Cost-based (for expensive operations)
cost?: number;
// Require authentication
requireAuth?: boolean;
// Require verified email
requireVerified?: boolean;
// Block known bad actors
blockList?: string[];
// Geographic restrictions
allowedCountries?: string[];
}
const ENDPOINT_CONFIGS: Record<string, EndpointConfig> = {
'/api/signup': {
rateLimit: { maxRequests: 3, windowMs: 3600000 }, // 3 per hour
cost: 10,
},
'/api/password-reset': {
rateLimit: { maxRequests: 3, windowMs: 3600000 },
cost: 5,
},
'/api/export': {
rateLimit: { maxRequests: 5, windowMs: 86400000 }, // 5 per day
requireAuth: true,
requireVerified: true,
cost: 50,
},
'/api/search': {
rateLimit: { maxRequests: 30, windowMs: 60000 },
cost: 2,
},
};
export function getEndpointConfig(pathname: string): EndpointConfig {
return ENDPOINT_CONFIGS[pathname] || {
rateLimit: { maxRequests: 100, windowMs: 60000 },
cost: 1,
};
}
Captcha Integration
// lib/captcha-rate-limit.ts
import { rateLimit } from './rate-limiter';
interface CaptchaRateLimitResult {
success: boolean;
requiresCaptcha: boolean;
remaining: number;
}
export async function rateLimitWithCaptcha(
identifier: string,
captchaToken: string | null,
config: { maxRequests: number; captchaThreshold: number; windowMs: number }
): Promise<CaptchaRateLimitResult> {
const { maxRequests, captchaThreshold, windowMs } = config;
// Record the request and check the count in a single call
// (calling rateLimit twice would double-count each request)
const result = await rateLimit(identifier, { maxRequests, windowMs });
// Under the captcha threshold - allow without a captcha
if (result.remaining > maxRequests - captchaThreshold) {
return {
success: result.success,
requiresCaptcha: false,
remaining: result.remaining,
};
}
// Above the threshold - a valid captcha is required
if (!captchaToken) {
return {
success: false,
requiresCaptcha: true,
remaining: result.remaining,
};
}
const captchaValid = await verifyCaptcha(captchaToken);
if (!captchaValid) {
return {
success: false,
requiresCaptcha: true,
remaining: result.remaining,
};
}
// Captcha valid - honour the rate limit result
return {
success: result.success,
requiresCaptcha: false,
remaining: result.remaining,
};
}
async function verifyCaptcha(token: string): Promise<boolean> {
const response = await fetch(
'https://www.google.com/recaptcha/api/siteverify',
{
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: new URLSearchParams({
secret: process.env.RECAPTCHA_SECRET!,
response: token,
}).toString(),
}
);
const data = await response.json();
// Note: data.score only exists with reCAPTCHA v3; for v2, check data.success alone
return data.success && data.score > 0.5;
}
Honeypot Fields
// lib/honeypot.ts
import { NextRequest } from 'next/server';
import { redis } from './redis';
import { getClientIP } from './ip';
export async function checkHoneypot(
request: NextRequest,
honeypotField = 'website'
): Promise<{ isBot: boolean; shouldBlock: boolean }> {
// Note: this consumes the request body; pass request.clone() to this
// function if the route handler also needs to read the body.
const body = await request.json();
// Check if the honeypot field was filled
if (body[honeypotField]) {
const ip = getClientIP();
// Track bot behavior
const botKey = `bot:${ip}`;
const botCount = await redis.incr(botKey);
await redis.expire(botKey, 86400); // 24 hours
return {
isBot: true,
shouldBlock: botCount > 3, // Block after 3 bot detections
};
}
return { isBot: false, shouldBlock: false };
}
Monitoring and Alerting
Rate Limit Metrics
// lib/rate-limit-metrics.ts
import { redis } from './redis';
interface RateLimitMetrics {
totalRequests: number;
blockedRequests: number;
blockRate: number;
topBlockedIdentifiers: Array<{ identifier: string; count: number }>;
}
export async function trackRateLimitEvent(
identifier: string,
blocked: boolean,
endpoint: string
) {
const now = Date.now();
const hourBucket = Math.floor(now / 3600000);
const pipeline = redis.pipeline();
// Total requests
pipeline.incr(`metrics:requests:${hourBucket}`);
pipeline.expire(`metrics:requests:${hourBucket}`, 86400);
// Blocked requests
if (blocked) {
pipeline.incr(`metrics:blocked:${hourBucket}`);
pipeline.expire(`metrics:blocked:${hourBucket}`, 86400);
// Track blocked identifiers
pipeline.zincrby(`metrics:blocked-ids:${hourBucket}`, 1, identifier);
pipeline.expire(`metrics:blocked-ids:${hourBucket}`, 86400);
// Track blocked endpoints
pipeline.zincrby(`metrics:blocked-endpoints:${hourBucket}`, 1, endpoint);
pipeline.expire(`metrics:blocked-endpoints:${hourBucket}`, 86400);
}
await pipeline.exec();
}
export async function getRateLimitMetrics(
hoursBehind = 24
): Promise<RateLimitMetrics> {
const now = Date.now();
const buckets: number[] = [];
for (let i = 0; i < hoursBehind; i++) {
buckets.push(Math.floor((now - i * 3600000) / 3600000));
}
// Aggregate metrics
let totalRequests = 0;
let blockedRequests = 0;
for (const bucket of buckets) {
const [requests, blocked] = await Promise.all([
redis.get<number>(`metrics:requests:${bucket}`),
redis.get<number>(`metrics:blocked:${bucket}`),
]);
totalRequests += requests || 0;
blockedRequests += blocked || 0;
}
// Get top blocked identifiers from most recent bucket
const recentBucket = buckets[0];
const topBlocked = await redis.zrange<string[]>(
`metrics:blocked-ids:${recentBucket}`,
0,
9,
{ rev: true, withScores: true }
);
const topBlockedIdentifiers: Array<{ identifier: string; count: number }> = [];
for (let i = 0; i < topBlocked.length; i += 2) {
topBlockedIdentifiers.push({
identifier: topBlocked[i],
count: Number(topBlocked[i + 1]),
});
}
return {
totalRequests,
blockedRequests,
blockRate: totalRequests > 0 ? blockedRequests / totalRequests : 0,
topBlockedIdentifiers,
};
}
Alerting
// lib/rate-limit-alerts.ts
interface AlertConfig {
blockRateThreshold: number;
singleIPBlockThreshold: number;
webhookUrl?: string;
}
export async function checkAndAlert(config: AlertConfig) {
const metrics = await getRateLimitMetrics(1); // Last hour
const alerts: string[] = [];
// High block rate
if (metrics.blockRate > config.blockRateThreshold) {
alerts.push(
`High block rate: ${(metrics.blockRate * 100).toFixed(1)}% ` +
`(${metrics.blockedRequests}/${metrics.totalRequests})`
);
}
// Single IP abuse
for (const { identifier, count } of metrics.topBlockedIdentifiers) {
if (count > config.singleIPBlockThreshold) {
alerts.push(
`Potential abuse from ${identifier}: ${count} blocked requests`
);
}
}
if (alerts.length > 0 && config.webhookUrl) {
await sendAlert(config.webhookUrl, alerts);
}
return alerts;
}
async function sendAlert(webhookUrl: string, alerts: string[]) {
await fetch(webhookUrl, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
text: `🚨 Rate Limit Alerts:\n${alerts.map(a => `• ${a}`).join('\n')}`,
}),
});
}
Quick Reference
Algorithm Selection
USE CASE                              ALGORITHM
────────────────────────────────────────────────────────────────────
Simple API protection                 Sliding Window Counter
Allow bursts, enforce average rate    Token Bucket
Need perfect accuracy                 Sliding Window Log
Simple implementation needed          Fixed Window
Response Headers
X-RateLimit-Limit        Maximum requests allowed
X-RateLimit-Remaining    Requests remaining in window
X-RateLimit-Reset        Unix timestamp when limit resets
Retry-After              Seconds until client can retry (on 429)
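Clients can read these headers back to drive retry behavior. A minimal sketch of a parser (the header names mirror the X-RateLimit-* convention used throughout this post; adapt if your API emits different names):

```typescript
// Hypothetical helper: pull rate limit info out of a fetch Response's headers.
interface RateLimitInfo {
  limit: number | null;
  remaining: number | null;
  reset: number | null;      // Unix timestamp when the window resets
  retryAfter: number | null; // seconds to wait before retrying (on 429)
}

function readRateLimitHeaders(headers: Headers): RateLimitInfo {
  // Missing or empty headers map to null rather than NaN/0
  const num = (name: string): number | null => {
    const v = headers.get(name);
    return v === null || v === '' ? null : Number(v);
  };
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    reset: num('X-RateLimit-Reset'),
    retryAfter: num('Retry-After'),
  };
}

const info = readRateLimitHeaders(
  new Headers({ 'X-RateLimit-Limit': '100', 'X-RateLimit-Remaining': '42' })
);
console.log(info.limit, info.remaining); // 100 42
```

In a real client you would check `info.remaining` before bursting and sleep for `info.retryAfter` seconds after a 429.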
Status Codes
200 OK                     Request successful, rate limit headers included
429 Too Many Requests      Rate limit exceeded
503 Service Unavailable    Rate limiter failed (if fail-closed)
Sample Limits by Endpoint Type
const RECOMMENDED_LIMITS = {
// Authentication
'/api/auth/login': { max: 5, window: '1m' },
'/api/auth/signup': { max: 3, window: '1h' },
'/api/auth/password-reset': { max: 3, window: '1h' },
'/api/auth/verify-email': { max: 5, window: '1h' },
// User actions
'/api/posts': { max: 30, window: '1m' },
'/api/comments': { max: 20, window: '1m' },
'/api/uploads': { max: 10, window: '1m' },
// Expensive operations
'/api/export': { max: 5, window: '24h' },
'/api/search': { max: 30, window: '1m' },
'/api/ai/generate': { max: 10, window: '1h' },
// Public/read-only
'/api/public/': { max: 100, window: '1m' },
};
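The window strings above ('1m', '1h', '24h') are shorthand; the rateLimit() config in this post takes milliseconds, so you need a small converter. One possible sketch (the format it accepts is an assumption matching the strings above):

```typescript
// Parse shorthand windows like '30s', '1m', '24h' into milliseconds.
function parseWindow(window: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(window);
  if (!match) throw new Error(`Unparseable window: ${window}`);
  const value = Number(match[1]);
  const unit = { s: 1_000, m: 60_000, h: 3_600_000 }[match[2] as 's' | 'm' | 'h'];
  return value * unit;
}

console.log(parseWindow('1m'));  // 60000
console.log(parseWindow('24h')); // 86400000
```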
Testing Rate Limits
# Quick test with curl
for i in {1..20}; do
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/api/test
done
# With headers
curl -i http://localhost:3000/api/test
# Check rate limit headers
curl -s -I http://localhost:3000/api/test | grep -i ratelimit
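You can also unit-test the algorithm itself without a running server by driving a limiter with a fake clock. A minimal sketch using the sliding-window-log approach from earlier (condensed, not the class above):

```typescript
// Minimal sliding-window-log limiter with an injectable timestamp,
// so tests control time instead of Date.now().
function makeLimiter(maxRequests: number, windowMs: number) {
  let log: number[] = [];
  return (now: number): boolean => {
    // Drop timestamps that have slid out of the window
    log = log.filter((ts) => ts > now - windowMs);
    if (log.length >= maxRequests) return false;
    log.push(now);
    return true;
  };
}

const check = makeLimiter(5, 1_000); // 5 requests per second
const results: boolean[] = [];
for (let i = 0; i < 7; i++) results.push(check(100)); // burst at t=100ms
results.push(check(1_200)); // after the window slides past the burst
console.log(results); // first 5 allowed, next 2 blocked, last allowed again
```

The same fake-clock pattern works for the token bucket and sliding window counter, and keeps the tests deterministic.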
Closing Thoughts
Rate limiting isn't just about preventing abuse — it's about building a sustainable API. Without it, a single misbehaving client can degrade service for everyone.
The key decisions:
- Algorithm: Sliding window counter for most cases. Token bucket if you need burst allowance.
- Storage: Redis for serverless. You can't rate limit across instances without shared state.
- Identification: Tiered approach — user ID > session > IP > fingerprint. Different trust levels get different limits.
- Fail mode: Usually fail open (allow requests if rate limiter is down). Fail closed only for critical endpoints.
- Granularity: Global limits as baseline, endpoint-specific for sensitive operations, user-specific for paid tiers.
Start simple. A basic sliding window counter with Redis handles 90% of cases. Add complexity (progressive penalties, captcha, honeypots) only when you see specific abuse patterns.
The best rate limiter is one your users never notice — until they try to abuse your API.