API Gateway Patterns: JWT, Rate Limiting, and OAuth2 in Production

Tags: API Gateway, JWT, OAuth2, Security, Microservices

An API gateway is one of those things that seems optional until you have 15 microservices each implementing their own authentication, rate limiting, and logging. Then it becomes the most important piece of infrastructure you own. I have configured and maintained API gateways in production across multiple projects, and the patterns below are the ones I keep coming back to.

Why You Need a Gateway

Without a gateway, every service needs to validate JWTs, enforce rate limits, handle CORS, and log requests independently. That means every service is a potential security hole. One developer forgets to validate token expiry in their auth middleware, and suddenly expired tokens work against that one service.

The gateway centralizes these cross-cutting concerns. Services behind the gateway can trust that requests have been authenticated, rate-limited, and logged before they arrive. This lets service developers focus on business logic instead of reimplementing security boilerplate.

JWT Validation at the Edge

JWT validation belongs at the gateway, not in individual services. Here is a generic gateway configuration pattern for JWT validation:

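# Generic, vendor-neutral schema; adapt the key names to your gateway.
routes:
  - path: /api/v1/*
    auth:
      type: jwt
      # Reject tokens whose iss claim does not match this value.
      issuer: "https://auth.example.com"
      # Public signing keys are fetched from this endpoint.
      jwks_uri: "https://auth.example.com/.well-known/jwks.json"
      # Reject tokens that are missing any of these claims.
      required_claims:
        - sub
        - tenant_id
      token_source: "Authorization: Bearer"
      # Tolerance for clock drift when checking time-based claims.
      clock_skew: 30s
    plugins:
      # Do not forward the raw token to downstream services.
      - strip-auth-header: true
      # Copy verified claims into request headers.
      - inject-claims:
          headers:
            X-User-ID: "sub"
            X-Tenant-ID: "tenant_id"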

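The gateway validates the token, extracts claims, and forwards them as headers to downstream services. Services read X-User-ID and X-Tenant-ID from headers without ever touching the JWT directly. This only holds if the gateway overwrites any client-supplied X-User-ID or X-Tenant-ID headers and the services are reachable only through the gateway; otherwise anyone can forge an identity by setting a header. It also means you can rotate signing keys or change JWT providers without modifying any downstream service.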

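The clock_skew setting matters more than you think. I have seen production issues where server clocks drifted by 45 seconds and suddenly every JWT was rejected. Allow at least 30 seconds of skew, and keep clocks synchronized as well: the allowance only absorbs drift smaller than itself, so a 30-second setting would not have covered that 45-second drift.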
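To make the time checks concrete, here is a minimal Python sketch of what a gateway might do after the signature has already been verified. It is an illustration under assumptions, not any particular gateway's implementation: the names `check_claims`, `forward_headers`, and `TokenRejected` are hypothetical, and the constants simply mirror the config pattern above.

```python
import time
from typing import Dict, Mapping, Optional

CLOCK_SKEW_SECONDS = 30  # mirrors clock_skew: 30s
REQUIRED_CLAIMS = ("sub", "tenant_id")
CLAIM_HEADERS = {"X-User-ID": "sub", "X-Tenant-ID": "tenant_id"}


class TokenRejected(Exception):
    """Raised when signature-verified claims fail a presence or time check."""


def check_claims(claims: Mapping[str, object],
                 now: Optional[float] = None,
                 skew: float = CLOCK_SKEW_SECONDS) -> None:
    """Check required claims, exp, and nbf, tolerating `skew` seconds of drift."""
    now = time.time() if now is None else now
    for name in REQUIRED_CLAIMS:
        if name not in claims:
            raise TokenRejected(f"missing claim: {name}")
    exp = claims.get("exp")
    if exp is None:
        # Assumption: a token without exp is refused rather than treated as eternal.
        raise TokenRejected("missing claim: exp")
    if now > float(exp) + skew:
        raise TokenRejected("token expired")
    nbf = claims.get("nbf")
    if nbf is not None and now < float(nbf) - skew:
        raise TokenRejected("token not yet valid")


def forward_headers(incoming: Mapping[str, str],
                    claims: Mapping[str, object]) -> Dict[str, str]:
    """Build the downstream headers: strip the token, inject verified claims."""
    # Drop Authorization plus any identity headers the client tried to set itself.
    blocked = {"authorization"} | {h.lower() for h in CLAIM_HEADERS}
    out = {k: v for k, v in incoming.items() if k.lower() not in blocked}
    for header, claim in CLAIM_HEADERS.items():
        out[header] = str(claims[claim])
    return out
```

Passing `now` explicitly keeps the check testable; in a real request path it would default to the gateway's own clock, which is exactly why drift on that clock matters.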

Rate Limiting Strategies

Not all endpoints deserve the same rate limits. Here is how I tier them:

Authentication endpoints (login, token refresh): Strict limits. 10 requests per minute per IP. These are the first target for brute force attacks, and loose limits here are an open invitation.

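Public API endpoints: Per-user limits based on their plan tier. A free user might get 100 requests per minute, a paid user 1000. Use sliding window counters, not fixed windows, to prevent burst abuse at window boundaries: with a fixed window, a client can spend a full quota at the end of one window and another at the start of the next, briefly doubling the effective rate.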

Internal service-to-service endpoints: Higher limits or none at all, since these are behind network-level access controls. But still monitor them - a misbehaving service can overwhelm your internal APIs just as effectively as an external DDoS.

Health check and metrics endpoints: Exempt from rate limiting entirely. You do not want your monitoring system to be rate-limited during an incident.

The biggest mistake I have seen: not rate limiting the login endpoint. An attacker ran a credential stuffing attack at 500 requests per second against an unprotected login endpoint. The gateway had rate limiting on every other route. Nobody thought to add it to the one endpoint that needed it most.
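As a rough illustration of the sliding window counter mentioned above, the following Python sketch estimates the request rate by weighting the previous window's count by how much of it still overlaps the sliding window. It is a single-process, in-memory approximation; the class name, the `rate_limit_key` helper, and the key format are assumptions for the example, and a production gateway would typically keep these counters in a shared store.

```python
import time
from typing import Dict, List, Optional


class SlidingWindowCounter:
    """Approximate sliding window limiter built from two fixed-window counts."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        # key -> [window_index, current_count, previous_count]
        self._state: Dict[str, List[int]] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        index = int(now // self.window)
        state = self._state.get(key)
        if state is None:
            state = [index, 0, 0]
            self._state[key] = state
        elif state[0] != index:
            # Moved to a new window: the old count only carries over
            # if we advanced by exactly one window.
            previous = state[1] if index - state[0] == 1 else 0
            state[0], state[1], state[2] = index, 0, previous
        elapsed_fraction = (now % self.window) / self.window
        # Weight the previous window by the part of it still inside the slide.
        estimated = state[2] * (1.0 - elapsed_fraction) + state[1]
        if estimated >= self.limit:
            return False
        state[1] += 1
        return True


def rate_limit_key(user_id: Optional[str], client_ip: str) -> str:
    """Prefer the authenticated user; fall back to IP for anonymous traffic."""
    return f"user:{user_id}" if user_id else f"ip:{client_ip}"
```

With a limit of 10 per minute, a client that sends 10 requests at second 59 gets only one more through at second 61, where a fixed window would have handed out a fresh quota of 10.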

OAuth2 for Service-to-Service Auth

When services need to call each other, I use the OAuth2 client credentials flow. Each service has its own client ID and secret, requests a short-lived token, and presents it to the gateway. This gives you an audit trail of which service called which, and you can revoke a compromised service's credentials without rotating secrets for everything else.

The pattern: services request tokens with specific scopes. The gateway validates that the token's scopes match the endpoint's requirements. A notification service with scope notifications:send cannot call the billing API that requires billing:read.
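The scope check itself can be small. The sketch below assumes the access token carries a space-delimited `scope` claim (the format OAuth2 uses for scope strings) and that routes are matched by path prefix; the route table, paths, and function names are hypothetical and only illustrate the idea.

```python
from typing import Dict, Mapping, Set

# Hypothetical route table: path prefix -> scopes the caller must hold.
ROUTE_SCOPES: Dict[str, Set[str]] = {
    "/billing/": {"billing:read"},
    "/notifications/": {"notifications:send"},
}


def granted_scopes(claims: Mapping[str, object]) -> Set[str]:
    """Parse the space-delimited scope claim into a set."""
    return set(str(claims.get("scope", "")).split())


def is_authorized(path: str, claims: Mapping[str, object]) -> bool:
    """Allow the call only if the token holds every scope the route requires."""
    for prefix, required in ROUTE_SCOPES.items():
        if path.startswith(prefix):
            return required <= granted_scopes(claims)
    # Assumption: routes with no declared scopes are denied by default.
    return False
```

For example, a token with `scope: "notifications:send"` passes for `/notifications/email` and is refused for `/billing/invoices`. Denying unlisted routes by default is a design choice in this sketch: a forgotten entry then fails closed instead of open.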

Common Mistakes

  1. Not validating JWT expiry - I have seen gateways configured to validate the signature but not the exp claim. Without that check, a token works forever, including one that has been stolen.
  2. Caching JWKs indefinitely - Key rotation becomes impossible. Cache with a TTL of 5 to 15 minutes.
  3. Rate limiting by IP only - Shared office IPs and VPNs mean legitimate users get blocked. Use user ID when available, fall back to IP for unauthenticated endpoints.
  4. No circuit breaking - When a downstream service is down, the gateway should fail fast instead of holding connections open until timeout.
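For the JWKS caching point, a TTL cache is enough to keep key rotation workable. This Python sketch assumes a caller-supplied `fetch` function that returns a mapping of key ID to key (in practice, an HTTP call to the jwks_uri); the class name and the 10-minute default are illustrative choices within the 5 to 15 minute range suggested above.

```python
import time
from typing import Callable, Dict, Optional


class JwksCache:
    """Caches signing keys and refetches them once the TTL has elapsed."""

    def __init__(self,
                 fetch: Callable[[], Dict[str, str]],
                 ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical: returns {kid: key}
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._keys: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def get(self, kid: str) -> Optional[str]:
        """Return the key for `kid`, refreshing the key set if it is stale."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._keys = dict(self._fetch())
            self._fetched_at = now
        return self._keys.get(kid)
```

After a rotation, the old keys disappear from the cache within one TTL, so the worst-case delay before a retired key stops being accepted is bounded by the TTL you pick.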

The gateway is your first and last line of defense. Invest the time to configure it properly, and it will save you from incidents that would otherwise require waking up the entire on-call rotation.