Day 14: Designing a Distributed Rate Limiting System (Token Bucket + Redis)
Hey guys! We've already discussed load balancing, but rate limiting is just as critical a building block in modern distributed systems. It protects APIs from abuse, ensures fairness across clients, and prevents overload of backend services.
Today, we'll design a Token Bucket rate limiter that works in a distributed setup using Redis, without requiring Lua scripts.
🔹 Why Rate Limiting Matters
Prevent abuse → block malicious users hammering APIs.
Fairness → prevent one client from hogging resources.
Protect backend → avoid cascading failures from overload.
Enable monetization → enforce quotas for free vs premium tiers.
🔹 Popular Algorithms
Fixed Window → simple, but unfair at boundaries.
Sliding Window → smoother, but heavier to compute.
Leaky Bucket → constant outflow, not burst-friendly.
Token Bucket → allows short bursts, but enforces an average limit ✅
We'll use Token Bucket for its balance of fairness and practicality.
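To make "unfair at boundaries" concrete, here is a minimal sketch of a fixed window counter. The function name `makeFixedWindow` and the limit of 5 requests per second are made up for illustration. A client that sends requests just before and just after a window boundary gets roughly twice the nominal rate through.

```javascript
// Illustrative fixed window counter (hypothetical helper, not part of the design below).
// Allows at most `limit` requests per window of `windowMs` milliseconds.
function makeFixedWindow(limit, windowMs) {
  const counts = new Map(); // window index -> requests seen in that window
  return function allow(nowMs) {
    const windowIndex = Math.floor(nowMs / windowMs);
    const seen = counts.get(windowIndex) || 0;
    if (seen >= limit) return false;
    counts.set(windowIndex, seen + 1);
    return true;
  };
}

const allow = makeFixedWindow(5, 1000); // nominally 5 requests per second
let accepted = 0;
// 5 requests at the very end of window 0 (t = 900..904 ms)
for (let i = 0; i < 5; i++) if (allow(900 + i)) accepted++;
// 5 more at the very start of window 1 (t = 1000..1004 ms)
for (let i = 0; i < 5; i++) if (allow(1000 + i)) accepted++;
console.log(accepted); // 10 requests accepted within about 105 ms
```

A token bucket with capacity 5 would have rejected the second batch, because only about 0.1 tokens refill in that time.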
🔹 Token Bucket Basics
A bucket can hold up to N tokens.
Tokens refill at a steady rate (e.g., 1 token/sec).
Each request consumes a token.
If tokens ≥ 1 → allow ✅
If empty → reject ❌
This ensures fair usage while still allowing occasional bursts.
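The rules above can be sketched as a small in-memory class. This is a single-process illustration only; the class name `TokenBucket`, the method `tryConsume`, and the injectable `now` clock are assumptions made for the example, and the clock is faked so the output is deterministic.

```javascript
// Minimal in-memory token bucket (illustrative sketch, single process only).
class TokenBucket {
  // capacity: max tokens (N); refillRate: tokens per second; now: clock returning ms
  constructor(capacity, refillRate, now = Date.now) {
    this.capacity = capacity;
    this.refillRate = refillRate;
    this.now = now;
    this.tokens = capacity; // start full
    this.lastRefill = now();
  }

  tryConsume() {
    // Lazily add the tokens earned since the last call, capped at capacity
    const ts = this.now();
    const elapsedSec = (ts - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = ts;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // allow
    }
    return false; // reject
  }
}

// Usage with a fake clock
let fakeNow = 0;
const bucket = new TokenBucket(3, 1, () => fakeNow);
const burst = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
console.log(burst); // [ true, true, true, false ]

fakeNow += 2000; // two seconds later, two tokens have been refilled
const later = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
console.log(later); // [ true, true, false ]
```

Note that there is no background timer: the refill is computed on demand from the elapsed time, which is the same trick the Redis version relies on.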
🔹 Making It Distributed with Redis
In single-node setups, the bucket can live in memory.
But in distributed systems (multiple API servers), we need a shared state.
Redis is perfect:
Centralized, super-fast in-memory store.
Atomic counters (INCRBY, DECRBY).
TTL support to auto-expire inactive buckets.
🔹 Redis Design
For each client (say user123), we store:
tokens → current number of tokens.
last_refill → last time we refilled the bucket.
Algorithm Steps
1. Fetch bucket state: tokens + last_refill.
2. Refill tokens: compute how many tokens should have been added since last_refill, add them, and cap at capacity.
3. Check tokens: if ≥ 1 → allow and subtract one token; if < 1 → reject.
4. Update Redis: new tokens, new last_refill.
🔹 Node.js Example
const Redis = require("ioredis");
const redis = new Redis(); // connects to localhost:6379 by default

// Token bucket check for one user.
// capacity = max tokens in the bucket, refillRate = tokens per second
async function isAllowed(userId, capacity = 10, refillRate = 1) {
  const key = `rate_limit:${userId}`;
  const now = Date.now();

  // Fetch state (an empty object means the bucket does not exist yet)
  const data = await redis.hgetall(key);
  let tokens = data.tokens ? parseFloat(data.tokens) : capacity;
  const lastRefill = data.last_refill ? parseInt(data.last_refill, 10) : now;

  // Refill tokens; clamp at 0 in case another server's clock is slightly ahead
  const elapsed = Math.max(0, now - lastRefill) / 1000; // in seconds
  const refill = elapsed * refillRate;
  tokens = Math.min(capacity, tokens + refill);

  let allowed = false;
  if (tokens >= 1) {
    allowed = true;
    tokens -= 1;
  }

  // Save back to Redis (HSET replaces the deprecated HMSET).
  // Note: this read-modify-write is not atomic across concurrent requests.
  await redis.hset(key, {
    tokens: tokens,
    last_refill: now,
  });

  // Optional: expire key if bucket is unused
  await redis.expire(key, 60);

  return allowed;
}

// Example usage
(async () => {
  for (let i = 0; i < 12; i++) {
    const ok = await isAllowed("user123");
    console.log(`Request ${i + 1}: ${ok ? "✅ Allowed" : "❌ Blocked"}`);
  }
  await redis.quit(); // close the connection so the script can exit
})();
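✅ This approach uses only basic Redis commands (HGETALL, HSET or the older HMSET, EXPIRE).
✅ No Lua needed.
⚠️ More network round-trips than a Lua script, and the read-modify-write sequence is not atomic: two concurrent requests for the same user can read the same token count and both be allowed.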
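One Lua-free way to close the gap between reading and writing the bucket is Redis optimistic locking with WATCH / MULTI / EXEC. The sketch below is an assumption-laden variant, not the example above rewritten: the name `isAllowedAtomic`, the options object, the retry count, and the fail-closed choice are all illustrative. It assumes an ioredis-style client whose `exec()` resolves to `null` when a watched key changed, and it assumes the client passed in is a connection dedicated to this call, because WATCH state is per connection.

```javascript
// Sketch: token bucket update guarded by WATCH/MULTI/EXEC (optimistic locking).
// `redis` is assumed to be an ioredis-style client on a connection that no other
// in-flight call is using. All option names here are hypothetical.
async function isAllowedAtomic(redis, userId, options = {}) {
  const { capacity = 10, refillRate = 1, maxRetries = 5, now = Date.now } = options;
  const key = `rate_limit:${userId}`;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // If anyone modifies `key` after this point, the EXEC below is aborted
    await redis.watch(key);

    const data = await redis.hgetall(key);
    const ts = now();
    let tokens = data.tokens !== undefined ? parseFloat(data.tokens) : capacity;
    const lastRefill = data.last_refill !== undefined ? parseInt(data.last_refill, 10) : ts;

    // Same refill arithmetic as the non-atomic version
    const elapsedSec = Math.max(0, ts - lastRefill) / 1000;
    tokens = Math.min(capacity, tokens + elapsedSec * refillRate);

    const allowed = tokens >= 1;
    if (allowed) tokens -= 1;

    // Queue the writes; they are applied together or not at all
    const result = await redis
      .multi()
      .hset(key, { tokens: tokens, last_refill: ts })
      .expire(key, 60)
      .exec();

    if (result !== null) return allowed; // committed
    // result === null: another request changed the bucket first, so retry
  }

  return false; // too much contention: fail closed (a design choice, not a rule)
}
```

With a real client this could be called as `await isAllowedAtomic(new Redis(), "user123")`. The trade-off is extra round-trips and retries under contention, which is the point where a single Lua script starts to look attractive.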
🔹 Architecture Diagram

🔹 Scaling Strategies
Sharded Redis Cluster → scale horizontally for high throughput.
Local + Redis hybrid → keep a small in-memory bucket per instance, sync with Redis periodically (reduces Redis calls).
Tiered Limits → different capacities & refill rates for free vs premium users.
Metrics + Monitoring → track allowed vs throttled requests.
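As a rough illustration of the last two points, tiered limits can be a simple lookup that feeds the limiter's `capacity` and `refillRate`, and metrics can start as two counters. The tier names, the numbers, and the helpers `limitsFor` and `recordDecision` are hypothetical placeholders, not recommendations.

```javascript
// Hypothetical tiers: the numbers are placeholders for illustration only
const TIERS = new Map([
  ["free", { capacity: 10, refillRate: 1 }],
  ["premium", { capacity: 100, refillRate: 20 }],
]);

// Unknown or missing tiers fall back to the most restrictive plan
function limitsFor(tier) {
  return TIERS.get(tier) ?? TIERS.get("free");
}

// Minimal in-process counters; a real setup would export these to a metrics system
const metrics = { allowed: 0, throttled: 0 };

function recordDecision(allowed) {
  if (allowed) metrics.allowed += 1;
  else metrics.throttled += 1;
  return allowed; // pass the decision through so calls can be chained
}
```

Assuming a `user` object with `id` and `tier` fields, a request handler could then do `const { capacity, refillRate } = limitsFor(user.tier);` followed by `recordDecision(await isAllowed(user.id, capacity, refillRate))`.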
🔹 Real-World Usage
Stripe enforces per-API key limits.
Cloudflare enforces rate limits at its edge.
Twitter/X API uses strict quotas per user/app.
✅ Key Takeaways
Rate limiting is non-negotiable in production systems.
Token Bucket = fair + burst-friendly.
Redis enables distributed, centralized state management.
You don't have to know Lua: basic Redis ops + Node.js are enough to get started.
Lua (or a MULTI/EXEC transaction) becomes worthwhile once updates must be atomic, for example when many concurrent requests hit the same bucket.
That's it for today. See you later!