Updated April 2026
# Rate Limiting
Rate limiting restricts how many requests a client can make within a time window. When the limit is exceeded, the server returns 429 Too Many Requests. Rate limiting mitigates brute-force attacks, API abuse, and denial-of-service traffic while protecting backend resources from overload.
## How it works
The server tracks request counts per client (by IP, API key, or user ID). When the count exceeds the configured limit, subsequent requests get a 429 response until the window resets.
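The counting logic above can be sketched as a minimal fixed-window limiter in Python. The class and method names here are illustrative, not from any particular library:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Track request counts per client key within fixed time windows."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # (key, window_start) -> request count

    def allow(self, client_key, now=None):
        """Return True if the request is allowed, False if it should get a 429."""
        now = time.time() if now is None else now
        # Bucket requests by the window they fall into
        window_start = int(now // self.window) * self.window
        bucket = (client_key, window_start)
        if self.counts[bucket] >= self.limit:
            return False  # over the limit: caller responds with 429
        self.counts[bucket] += 1
        return True

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow("1.2.3.4", now=100) for _ in range(4)]
print(results)  # → [True, True, True, False]
```

Fixed windows are the simplest scheme; production systems often use sliding windows or token buckets to avoid bursts at window boundaries.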
## Headers that accompany 429
| Header | Example value | Meaning |
|---|---|---|
| Retry-After | 60 | Seconds until the limit resets |
| X-RateLimit-Limit | 100 | Max requests per window |
| X-RateLimit-Remaining | 0 | Requests left in current window |
| X-RateLimit-Reset | 1712001600 | Unix timestamp when limit resets |
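A well-behaved client uses these headers to back off instead of hammering the server. A sketch of that decision logic (the `X-RateLimit-*` names are a widespread convention rather than an IETF standard, and this assumes `Retry-After` carries delta-seconds, though the header may also be an HTTP-date):

```python
import time

def backoff_seconds(status, headers, now=None):
    """Decide how long to wait before retrying, based on rate-limit headers."""
    if status != 429:
        return 0.0
    # Prefer Retry-After when present (assumes the delta-seconds form)
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    # Fall back to the X-RateLimit-Reset Unix timestamp
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        now = time.time() if now is None else now
        return max(0.0, float(reset) - now)
    return 1.0  # no hints from the server: default to a short wait

print(backoff_seconds(429, {"Retry-After": "60"}))  # → 60.0
print(backoff_seconds(429, {"X-RateLimit-Reset": "1712001600"},
                      now=1712001540.0))            # → 60.0
```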
## Nginx rate limiting
```nginx
# Define zone: 10 MB of shared memory, 10 requests/second per client IP
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        limit_req_status 429;
        # "always" is required for the header to be sent on error responses;
        # note it also attaches Retry-After to successful responses from
        # this location, and the value is a fixed hint, not the actual reset
        add_header Retry-After 60 always;
    }
}
```
## Cloudflare rate limiting
Dashboard → Security → Rate Limiting → Create Rule. Set the threshold, path, and mitigation action (Block, Challenge, or Simulate).
## Common mistakes
- Rate limiting by IP alone fails behind shared NAT (office networks all look like one IP)
- Not sending Retry-After means clients don't know when to retry, causing retry storms
- Applying the same limit to login endpoints and public APIs — auth endpoints need tighter limits
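To address the first pitfall, key the limiter on the most specific identity available rather than the raw IP. A sketch of that preference order (the `request` dict shape here is hypothetical):

```python
def rate_limit_key(request):
    """Choose a limiter key: API key > user ID > client IP (last resort)."""
    if request.get("api_key"):
        return "key:" + request["api_key"]
    if request.get("user_id"):
        return "user:" + str(request["user_id"])
    # Shared-NAT clients all collide on this fallback
    return "ip:" + request["remote_addr"]

print(rate_limit_key({"remote_addr": "203.0.113.7"}))                   # → ip:203.0.113.7
print(rate_limit_key({"remote_addr": "203.0.113.7", "api_key": "abc"})) # → key:abc
```

Authenticated traffic then gets per-account limits, and only anonymous traffic falls back to the coarse per-IP bucket.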