Rate limiting with Redis: practical examples for real APIs
Let’s start where most engineers actually start: a simple per‑IP cap on an HTTP API. This is the “hello world” of Redis rate limiting, and it’s still one of the best starting points because it shows the pattern you’ll reuse everywhere.
You have an endpoint like /v1/search. You want to allow 100 requests per minute per IP. In Redis, you store a counter keyed by IP and time window:
Key: rate:ip:203.0.113.5:2025-12-02T15:47
Value: 42
Your app logic looks like this (shown with the redis-py client; reject_request and handle_request stand in for your app’s own functions):
import time
import redis

r = redis.Redis()

def current_minute():
    # Matches the key format above, e.g. "2025-12-02T15:47"
    return time.strftime("%Y-%m-%dT%H:%M", time.gmtime())

def check_rate_limit(ip):
    key = f"rate:ip:{ip}:{current_minute()}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, 60)  # seconds; the first hit in a window sets the TTL
    if count > 100:
        reject_request(429, "Too Many Requests")
    else:
        handle_request()
That’s the baseline. Every practical Redis rate‑limiting pattern builds on this same idea: a fast key, a small piece of state, and a cheap check on every request.
Sliding window rate limiting with Redis for smoother traffic
Fixed windows are fine until someone hits the boundary. A client can send 100 requests at 12:00:59 and another 100 at 12:01:01 and still “comply,” even though your service just took 200 hits in 2 seconds.
A better pattern is the sliding window counter. Instead of bucketing by whole minutes, you track timestamps of recent requests and trim anything older than your window.
A common pattern uses a sorted set per client:
Key: rate:user:1234
Member: timestamp (as score and value)
On each request:
-- KEYS[1] = key, ARGV[1] = now, ARGV[2] = window, ARGV[3] = limit
-- Assumes now and window are in seconds, since EXPIRE takes seconds
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])

-- Remove entries older than the window
redis.call("ZREMRANGEBYSCORE", key, 0, now - window)

-- Count remaining
local count = redis.call("ZCARD", key)
if count >= limit then
  return {0, count}
end

-- Add the current request; if two requests can share a timestamp, use a
-- unique member (e.g. timestamp plus a request id) so neither is overwritten
redis.call("ZADD", key, now, now)
redis.call("EXPIRE", key, window)
return {1, count + 1}
This Lua script runs atomically in Redis. You get a much smoother rate limit: “no more than 100 requests in any rolling 60‑second window,” not just a per‑minute bucket. For APIs with bursty clients—mobile apps, browser tabs resuming after sleep—this is one of the most useful patterns you can ship today.
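If you want to sanity‑check the window logic before wiring up the Lua script, the same algorithm can be modeled in pure Python. This is a sketch, not production code: the plain list stands in for the Redis sorted set, and `sliding_window_allow` is an illustrative name, not a library function.

```python
def sliding_window_allow(entries, now, window, limit):
    """Model of the Lua script's logic: `entries` is a list of request
    timestamps standing in for the members of the Redis sorted set."""
    # Drop timestamps older than the window (the ZREMRANGEBYSCORE step)
    entries[:] = [t for t in entries if t > now - window]
    if len(entries) >= limit:
        return False, len(entries)   # over limit: reject, report count
    entries.append(now)              # record this request (the ZADD step)
    return True, len(entries)

# Example: limit of 3 requests per rolling 60-second window
log = []
print(sliding_window_allow(log, 100.0, 60, 3))  # (True, 1)
print(sliding_window_allow(log, 103.0, 60, 3))  # (True, 2)
```

Running a few requests through this model makes it easy to unit‑test your thresholds, then hand the same parameters to the Lua version in production.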
User‑tier quotas: rate limiting by plan with Redis
Real systems rarely treat all users the same. Free users get 60 requests per minute, paid users get 600, and internal services might get even more.
Here’s a practical setup for tiered plans:
- You store the user’s plan in your primary database: free, pro, or enterprise.
- On login or token validation, you cache the effective limit in Redis:
Key: user:quota:1234
Value: {"plan":"pro","rpm":600}
TTL: 300 seconds
- Your rate limiter reads that cached quota and uses it to set the limit for the sliding window or fixed window logic.
Now your rate limiting becomes data‑driven:
quota = json.loads(redis.get(f"user:quota:{user_id}") or '{"rpm":60}')
limit = quota["rpm"]
allowed, current = sliding_window_check(user_id, limit, 60)
This pattern matters in 2024–2025 because API monetization has become a serious business model again. Look at platforms like Stripe or Twilio: rate limits are part of product packaging, not just security hardening. Redis gives you the low‑latency read/write behavior you need to enforce those quotas without hammering your primary database.
Login abuse and bot throttling: real examples that actually stop attacks
Let’s talk about a more security‑flavored example: login and OTP throttling.
Imagine a /login endpoint. You want to:
- Limit attempts per IP to slow down credential stuffing.
- Limit attempts per account to stop brute‑forcing a single user.
- Apply stricter limits by country or ASN if you detect suspicious ranges.
A practical Redis setup might include:
rate:login:ip:203.0.113.5 -> counter, 10 attempts per 5 minutes
rate:login:user:user@example.com -> counter, 5 attempts per 15 minutes
rate:login:country:RU -> counter, 1000 attempts per 15 minutes
Each key uses the same fixed‑window pattern as the first example, but with different thresholds and expirations. If any of those keys exceeds the limit, you block the attempt or step up authentication (CAPTCHA, WebAuthn, or email verification).
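The three‑key check can be sketched in a few lines of Python. This is a hedged illustration: the dict stands in for Redis, the window math mimics fixed‑window key bucketing, and names like `check_login` and the `LIMITS` table are hypothetical, chosen to mirror the thresholds listed above.

```python
# Limits from the example keys above: (max attempts, window in seconds)
LIMITS = {
    "ip": (10, 300),         # 10 attempts per 5 minutes
    "user": (5, 900),        # 5 attempts per 15 minutes
    "country": (1000, 900),  # 1000 attempts per 15 minutes
}

def check_login(counters, now, ip, user, country):
    """Increment every matching fixed-window counter; return False if any
    key exceeds its limit (block, or step up authentication)."""
    allowed = True
    for kind, ident in (("ip", ip), ("user", user), ("country", country)):
        limit, window = LIMITS[kind]
        # Bucket by window number, like the minute buckets in the first example
        key = f"rate:login:{kind}:{ident}:{int(now // window)}"
        counters[key] = counters.get(key, 0) + 1  # INCR equivalent
        if counters[key] > limit:
            allowed = False
    return allowed
```

In production each `counters[key]` update would be an INCR plus EXPIRE in Redis; the per‑account limit (5 per 15 minutes) is the one that trips first in most brute‑force scenarios.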
Security teams like this because it’s explainable and tunable. You can export Redis metrics to your monitoring system and watch how many login attempts per IP you’re dropping, then tune thresholds without changing code.
For background on credential‑stuffing attacks and why this matters, the FBI and CISA have published advisories on automated attacks and account compromise patterns; they’re worth a read even if you’re not in finance or healthcare.
Protecting expensive endpoints: AI, search, and reports
Not all endpoints are equal. Some are cheap (GET /status), some are expensive (POST /generate-report, POST /chat/completions). In 2024–2025, with AI‑backed features everywhere, you really do not want a single customer to spin up hundreds of concurrent GPT calls and surprise you with a cloud bill.
Here’s a practical pattern for expensive operations:
- You define a separate Redis keyspace, like rate:expensive:user:{id}.
- You use a token bucket algorithm instead of a simple counter.
Token bucket in Redis looks like this:
- The bucket has a capacity (say, 20 tokens) and a refill rate (5 tokens per minute).
- Every request consumes a token.
- If the bucket is empty, you either block or queue the request.
You can implement this with a Lua script that:
- Reads the last refill timestamp and current tokens.
- Calculates how many tokens to add since last refill.
- Deducts one token if available.
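Those three steps amount to a small piece of arithmetic. Here is a pure‑Python sketch of one bucket, assuming the bucket starts full; the Lua version would keep `tokens` and `last` in a Redis hash, and the capacity and refill defaults simply match the numbers above.

```python
def take_token(state, now, capacity=20, refill_per_sec=5 / 60):
    """One step of a token bucket: refill based on elapsed time, then try
    to consume a token. `state` holds {"tokens": float, "last": timestamp}."""
    elapsed = now - state["last"]
    # Refill proportionally to elapsed time, capped at bucket capacity
    state["tokens"] = min(capacity, state["tokens"] + elapsed * refill_per_sec)
    state["last"] = now
    if state["tokens"] >= 1:
        state["tokens"] -= 1  # consume one token for this request
        return True
    return False              # bucket empty: block or queue the request

bucket = {"tokens": 20.0, "last": 0.0}
```

A burst of 20 requests drains the bucket immediately; after that, requests trickle through at the refill rate of 5 per minute.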
Because the script runs atomically, you avoid race conditions when many requests hit at once. This pattern is especially valuable because it directly controls cost, not just traffic.
API gateway integration: NGINX, Envoy, and cloud load balancers
You don’t have to write all of this by hand. Modern API gateways either support Redis directly or can be extended to use it.
Some common setups at the edge:
- NGINX with lua-nginx-module and Redis for per‑IP and per‑API‑key limits.
- Envoy using an external rate limit service backed by Redis, where the service applies sliding windows and token buckets.
- Cloud load balancers (AWS, GCP, Azure) calling a small rate‑limiting microservice that talks to Redis.
The pattern is the same:
- The gateway extracts an identity (IP, API key, user ID, org ID).
- It calls a small service or Lua script that checks Redis.
- If the check fails, the gateway returns HTTP 429 without touching your app servers.
This keeps the hot path extremely fast and moves rate‑limit logic to a layer that’s easy to observe and scale.
Multi‑region and multi‑tenant examples of rate limiting with Redis
Things get interesting when you’re global. If you have traffic in North America and Europe, you might run Redis clusters in both regions. Now you have to decide: is your limit global per user, or per region?
Some real options in this space:
- Per‑region limits: Each region has its own Redis, and users are effectively limited per region. This is easier, with lower latency, but a single user can use more global capacity.
- Global limits with periodic sync: You keep authoritative counters in one region and periodically sync aggregates from others. This is approximate, but often good enough if limits are generous.
- Global limits via Redis Cluster or Redis Enterprise: You use a managed or clustered Redis setup that spans regions, accepting a bit more latency for truly global caps.
For multi‑tenant SaaS, you often combine this with organization‑level keys:
rate:org:acme:minute:2025-12-02T15:47 -> 1200
rate:user:1234:minute:2025-12-02T15:47 -> 80
You enforce both org‑wide and per‑user caps. This gives you a practical guardrail against a single integration going haywire and consuming the entire tenant’s budget.
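Enforcing both caps is just two counters checked together. A minimal in‑memory sketch, where the dict stands in for Redis and the default limits mirror the example keys above (the function name is illustrative):

```python
def check_org_and_user(counters, minute, org, user,
                       org_limit=1200, user_limit=100):
    """A request must pass both the org-wide and the per-user cap.
    `counters` stands in for Redis INCR on the two minute-bucketed keys."""
    org_key = f"rate:org:{org}:minute:{minute}"
    user_key = f"rate:user:{user}:minute:{minute}"
    counters[org_key] = counters.get(org_key, 0) + 1
    counters[user_key] = counters.get(user_key, 0) + 1
    return counters[org_key] <= org_limit and counters[user_key] <= user_limit
```

Note that both counters are incremented even when the request is rejected; whether a blocked request should still consume quota is a policy choice worth making explicitly.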
Observability and tuning: using data to adjust Redis rate limits
Rate limiting without visibility is just guesswork. The better rate‑limiting setups always pair the limit itself with metrics and logs.
A realistic setup in 2024–2025 looks like this:
- Every time you block a request, you increment a Redis or Prometheus counter like rate_limit_blocked_total{key_type="ip"}.
- You export Redis key stats (hits, misses, memory usage) to your monitoring stack.
- You sample logs of 429 responses with details: user ID, endpoint, region, and limit type.
With that data, you can:
- Identify noisy endpoints that need tighter or looser limits.
- Spot abuse patterns before they become outages.
- Justify plan upgrades to customers based on hard numbers.
For general thinking about service reliability and overload protection, the SRE guides from Google and university courses on distributed systems (for example, material from MIT or Stanford CS programs) are good background reading, even though they don’t focus specifically on Redis.
Common pitfalls in Redis rate limiting (and how to avoid them)
A few patterns show up over and over when teams adopt Redis for rate limiting:
Key explosion
If you use per‑request timestamps in sorted sets without expirations, you’ll eventually fill memory with dead keys. Always set EXPIRE on keys, and if you’re using sorted sets, prune old entries aggressively.
Incorrect time synchronization
If you rely on client clocks for timestamps, your sliding windows break. Always use server‑side time (from your app servers or Redis’ TIME command) when building your rate limiters.
Single point of failure
If all rate‑limit decisions depend on one tiny Redis instance, that instance is now as critical as your database. Use Redis replication, clustering, or a managed service, and decide what your app does if Redis is temporarily unavailable. Many teams choose to fail open for low‑risk endpoints and fail closed for security‑sensitive ones like login.
Ignoring privacy and compliance
If you’re storing IP addresses or user identifiers in Redis keys, think about data retention and compliance. Public health and privacy resources from U.S. agencies (for example, guidance from HHS.gov on data protection) are a good reminder that short TTLs and careful key design are your friends.
FAQ: common questions about rate limiting with Redis
Q: What are some simple examples of rate limiting with Redis I can start with today?
A: Two easy starting points are per‑IP limits for public APIs and per‑user limits for authenticated endpoints. Use a fixed window counter with INCR and EXPIRE for each key, and return HTTP 429 when the counter exceeds your threshold. These are straightforward patterns that catch a lot of abuse with minimal code.
Q: Can you give an example of combining Redis rate limiting with authentication?
A: Yes. After validating a JWT or session cookie, you extract the user ID and plan, look up the plan’s rate limit in Redis, and then apply a sliding window check keyed by user ID and endpoint. If the user is over limit, you return 429 along with a hint header like X-RateLimit-Reset so clients know when to retry.
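Computing that reset hint for a fixed window is simple clock arithmetic. A sketch, assuming the common X-RateLimit-* header convention (exact header names vary between APIs, and `rate_limit_headers` is an illustrative name):

```python
def rate_limit_headers(now, window=60, limit=100, used=0):
    """Build rate-limit response headers for a fixed window. The reset value
    is the Unix time at which the current window rolls over."""
    window_start = int(now // window) * window
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - used)),
        "X-RateLimit-Reset": str(window_start + window),
    }
```

For example, 30 requests used at t=125 with a 60‑second window yields 70 remaining and a reset at t=180, the start of the next window.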
Q: How do I test real examples of Redis rate limiting in staging?
A: In staging, lower your limits dramatically (for example, from 1000 requests per minute to 10) and run load tests with tools like k6 or wrk. Watch how your Redis keys behave, how many 429s you generate, and whether your app’s latency changes under load.
Q: Is Redis accurate enough for billing‑grade quotas?
A: For most APIs, yes. The atomic INCR operations and Lua scripts provide consistent counters under heavy concurrency. If you need strict auditable billing, you might pair Redis with a write‑behind process that periodically flushes counters to a durable store like PostgreSQL, then reconcile from there.
Q: Can I rate limit background jobs with Redis, not just HTTP requests?
A: Definitely. You can rate limit job dispatch from a queue by having each worker check a Redis token bucket before pulling a job. This keeps you from overwhelming downstream systems like payment gateways or third‑party APIs.
If you remember nothing else, remember this: the strongest Redis rate‑limiting setups are small, focused patterns that you can compose—fixed windows for simple IP limits, sliding windows for smoother user quotas, and token buckets for expensive operations. Start simple, measure, then refine.