Day 14: Designing a Distributed Rate Limiting System (Token Bucket + Redis)

Hey guys! We covered load balancing earlier in this series; today's topic, rate limiting, is another critical building block in modern distributed systems. It protects APIs from abuse, ensures fairness across clients, and prevents backend services from being overloaded.

Today, we’ll design a Token Bucket Rate Limiter that works in a distributed setup using Redis, but without requiring Lua scripts.

🔹 Why Rate Limiting Matters

Prevent abuse → block malicious users hammering APIs.
Fairness → prevent one client from hogging resources.
Protect backend → avoid cascading failures from overload.
Enable monetization → enforce quotas for free vs premium tiers.

🔹 Popular Algorithms

Fixed Window → simple, but unfair at boundaries.
Sliding Window → smoother, but heavier to compute.
Leaky Bucket → constant outflow, not burst-friendly.
Token Bucket → allows short bursts, but enforces average limit ✅

We’ll use Token Bucket for its balance of fairness + practicality.
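
To see why Fixed Window is called unfair at boundaries, here is a minimal in-memory sketch. The `makeFixedWindow` helper is hypothetical and only for illustration. With a limit of 10 requests per second, a client can still get 20 requests through in about 20 ms by straddling the edge between two windows.

```javascript
// Hypothetical in-memory fixed-window counter, for illustration only.
function makeFixedWindow(limit, windowMs) {
  const counts = new Map(); // window index -> requests seen in that window
  return function allow(nowMs) {
    const windowIndex = Math.floor(nowMs / windowMs);
    const used = counts.get(windowIndex) || 0;
    if (used >= limit) return false;
    counts.set(windowIndex, used + 1);
    return true;
  };
}

const allow = makeFixedWindow(10, 1000); // 10 requests per 1-second window
let accepted = 0;

// 10 requests at t = 990 ms (end of window 0)...
for (let i = 0; i < 10; i++) if (allow(990)) accepted++;
// ...and 10 more at t = 1010 ms (start of window 1).
for (let i = 0; i < 10; i++) if (allow(1010)) accepted++;

console.log(accepted); // 20
```

A token bucket with capacity 10 would have stopped this burst after the first 10 requests.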

🔹 Token Bucket Basics

A bucket can hold up to N tokens.
Tokens refill at a steady rate (e.g., 1 token/sec).
Each request consumes a token.
If tokens ≥ 1 → allow ✅
If empty → reject ❌

This ensures fair usage while still allowing occasional bursts.
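
Before bringing Redis in, the rules above can be sketched as a small single-process class. This is a rough illustration, not a production implementation: the `TokenBucket` name is made up for this example, and the clock is passed in as `nowMs` so the behaviour is easy to follow.

```javascript
// Minimal single-process token bucket (hypothetical class, for illustration).
class TokenBucket {
  constructor(capacity, refillRate, nowMs = 0) {
    this.capacity = capacity;     // N: maximum tokens the bucket can hold
    this.refillRate = refillRate; // tokens added per second
    this.tokens = capacity;       // a new bucket starts full
    this.lastRefill = nowMs;
  }

  tryConsume(nowMs) {
    // Add the tokens earned since the last call, capped at capacity.
    const elapsedSec = Math.max(0, nowMs - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1; // each request consumes one token
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(3, 1); // capacity 3, 1 token/sec
// A burst of 4 requests at t = 0: the first 3 pass, the 4th is rejected.
const burst = [0, 0, 0, 0].map((t) => bucket.tryConsume(t));
// Two seconds later, two tokens have been refilled, so a request passes again.
const later = bucket.tryConsume(2000);

console.log(burst, later); // [ true, true, true, false ] true
```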

🔹 Making It Distributed with Redis

In single-node setups, the bucket can live in memory.
But in distributed systems (multiple API servers), we need a shared state.

Redis is perfect:

Centralized, super-fast in-memory store.
Atomic counters (INCRBY, DECRBY).
TTL support to auto-expire inactive buckets.

🔹 Redis Design

For each client (say user123), we store:

tokens → current number of tokens.
last_refill → last time we refilled the bucket.
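
As a rough illustration of this layout, assume one Redis hash per client under a key like `rate_limit:user123`. HGETALL returns every hash value as a string (and an empty object for a missing key), so the state needs parsing before use. The `parseBucket` helper below is hypothetical, and the sample values are made up.

```javascript
// Hypothetical helper: turn a raw HGETALL reply into numbers.
// A missing key comes back as {} and is treated as a new, full bucket.
function parseBucket(hash, capacity, nowMs) {
  return {
    tokens: hash.tokens !== undefined ? parseFloat(hash.tokens) : capacity,
    lastRefill: hash.last_refill !== undefined ? parseInt(hash.last_refill, 10) : nowMs,
  };
}

// What Redis might hold for user123 (sample values):
//   key:    rate_limit:user123
//   fields: tokens = "7.5", last_refill = "1700000000000"
const stored = { tokens: "7.5", last_refill: "1700000000000" };

const existing = parseBucket(stored, 10, 1700000005000);
const fresh = parseBucket({}, 10, 1700000005000);

console.log(existing); // { tokens: 7.5, lastRefill: 1700000000000 }
console.log(fresh);    // { tokens: 10, lastRefill: 1700000005000 }
```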

Algorithm Steps

Fetch bucket state: tokens + last_refill.
Refill tokens:
- Compute how many tokens should have been added since last_refill.
- Add them, cap at capacity.
Check tokens:
- If ≥ 1 → allow and consume one token.
- If < 1 → reject (tokens can be fractional after a partial refill).
Update Redis: new tokens, new last_refill.

🔹 Node.js Example

const Redis = require("ioredis");
const redis = new Redis();

async function isAllowed(userId, capacity = 10, refillRate = 1) {
  // refillRate = tokens per second
  const key = `rate_limit:${userId}`;
  const now = Date.now();

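  // Fetch state; a missing key means a new, full bucket
  const data = await redis.hgetall(key);
  let tokens = data.tokens ? parseFloat(data.tokens) : capacity;
  let lastRefill = data.last_refill ? parseInt(data.last_refill, 10) : now;

  // Refill tokens for the time elapsed since the last update.
  // Clamp at 0 in case another server with a slightly faster clock wrote last_refill.
  const elapsed = Math.max(0, now - lastRefill) / 1000; // in seconds
  const refill = elapsed * refillRate;
  tokens = Math.min(capacity, tokens + refill);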

  let allowed = false;

  if (tokens >= 1) {
    allowed = true;
    tokens -= 1;
  }

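  // Save back to Redis (HSET with multiple fields replaces the deprecated HMSET)
  await redis.hset(key, {
    tokens: tokens,
    last_refill: now,
  });

  // Expire idle buckets. The TTL must be at least capacity / refillRate seconds
  // (the time to refill from empty), or an expired key would hand back a full
  // bucket too early.
  await redis.expire(key, Math.max(60, Math.ceil(capacity / refillRate)));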

  return allowed;
}

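// Example usage: on a fresh key with the defaults, expect roughly
// 10 allowed requests followed by 2 blocked ones
(async () => {
  for (let i = 0; i < 12; i++) {
    const ok = await isAllowed("user123");
    console.log(`Request ${i + 1}: ${ok ? "✅ Allowed" : "❌ Blocked"}`);
  }
  await redis.quit(); // close the connection so the script can exit
})();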
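
One thing to keep in mind about the read-then-write flow above: nothing stops two requests from reading the same state before either one writes. The sketch below reproduces that with a hypothetical in-memory stand-in for Redis (`makeFakeStore`) and a copy of the same logic (`isAllowedNaive`) that takes the store and the clock as parameters. Both names are made up for this illustration.

```javascript
// Hypothetical in-memory stand-in for the Redis hash commands used above.
function makeFakeStore() {
  const hashes = new Map();
  return {
    async hgetall(key) {
      return { ...(hashes.get(key) || {}) };
    },
    async hset(key, fields) {
      const hash = hashes.get(key) || {};
      for (const [field, value] of Object.entries(fields)) hash[field] = String(value);
      hashes.set(key, hash);
    },
  };
}

// Same read -> compute -> write flow as isAllowed, with store and clock passed in.
async function isAllowedNaive(store, key, capacity, refillRate, nowMs) {
  const data = await store.hgetall(key); // both callers can read here...
  let tokens = data.tokens ? parseFloat(data.tokens) : capacity;
  const lastRefill = data.last_refill ? parseInt(data.last_refill, 10) : nowMs;
  tokens = Math.min(capacity, tokens + ((nowMs - lastRefill) / 1000) * refillRate);

  const allowed = tokens >= 1;
  if (allowed) tokens -= 1;

  await store.hset(key, { tokens, last_refill: nowMs }); // ...before either writes
  return allowed;
}

async function raceDemo() {
  const store = makeFakeStore();
  const key = "rate_limit:user123";
  // Capacity 1: ideally only one of these two simultaneous requests passes.
  return Promise.all([
    isAllowedNaive(store, key, 1, 1, 0),
    isAllowedNaive(store, key, 1, 1, 0),
  ]);
}

raceDemo().then((results) => console.log(results)); // [ true, true ]
```

Both requests see a full bucket, so both are allowed even though the capacity is 1. With several API servers talking to the same Redis, the same interleaving can happen across machines.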

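✔ This approach uses only basic Redis commands (HGETALL, HSET or the older HMSET, EXPIRE).
✔ No Lua needed.
⚠ The read-modify-write sequence is not atomic: two concurrent requests for the same client can read the same token count and both be allowed.
⚠ It also needs more network round-trips than a Lua-based atomic update.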
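
If you want to stay Lua-free but still need the check-and-update to be safe under concurrency, one option is Redis optimistic locking with WATCH + MULTI/EXEC. The sketch below is an assumption-laden variant, not the approach used in the example above: `isAllowedWatch` and `maxRetries` are hypothetical names, and `redis` is assumed to be an ioredis-style client on a connection used only by this call, because WATCH state is tracked per connection.

```javascript
// Sketch: token bucket update guarded by WATCH (optimistic locking).
// Assumption: `redis` is an ioredis-style client on a dedicated connection.
async function isAllowedWatch(redis, userId, capacity = 10, refillRate = 1, maxRetries = 5) {
  const key = `rate_limit:${userId}`;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // From here on, EXEC aborts if another client modifies the key.
    await redis.watch(key);

    const now = Date.now();
    const data = await redis.hgetall(key);
    let tokens = data.tokens ? parseFloat(data.tokens) : capacity;
    const lastRefill = data.last_refill ? parseInt(data.last_refill, 10) : now;

    const elapsed = Math.max(0, now - lastRefill) / 1000; // in seconds
    tokens = Math.min(capacity, tokens + elapsed * refillRate);

    const allowed = tokens >= 1;
    if (allowed) tokens -= 1;

    const ttl = Math.max(60, Math.ceil(capacity / refillRate));
    const result = await redis
      .multi()
      .hset(key, { tokens, last_refill: now })
      .expire(key, ttl)
      .exec();

    // exec() resolves to null when the watched key changed: retry with fresh state.
    if (result !== null) return allowed;
  }

  // Too many conflicts: reject the request (failing closed is a policy choice).
  return false;
}
```

In practice this likely means taking a connection from a small pool (or `redis.duplicate()`) per check, which is part of why many production setups fall back to a Lua script for this step.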

🔹 Architecture Diagram

🔹 Scaling Strategies

Sharded Redis Cluster → scale horizontally for high throughput.
Local + Redis hybrid → keep a small in-memory bucket per instance, sync with Redis periodically (reduces Redis calls).
Tiered Limits → different capacities & refill rates for free vs premium users.
Metrics + Monitoring → track allowed vs throttled requests.
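
The tiered-limits and metrics ideas can be sketched in a few lines. Everything here is hypothetical: the tier names, the numbers, and the `limitsFor` / `recordDecision` helpers are placeholders for illustration, not recommended values.

```javascript
// Hypothetical tier table: names and numbers are placeholders.
const TIER_LIMITS = {
  free: { capacity: 10, refillRate: 1 },      // burst of 10, 1 request/sec sustained
  premium: { capacity: 100, refillRate: 20 }, // burst of 100, 20 requests/sec sustained
};

// Unknown tiers fall back to the most restrictive limits.
function limitsFor(tier) {
  const known = Object.prototype.hasOwnProperty.call(TIER_LIMITS, tier);
  return known ? TIER_LIMITS[tier] : TIER_LIMITS.free;
}

// Minimal in-process counters for allowed vs throttled requests.
const metrics = { allowed: 0, throttled: 0 };

function recordDecision(allowed) {
  if (allowed) metrics.allowed += 1;
  else metrics.throttled += 1;
}

console.log(limitsFor("premium")); // { capacity: 100, refillRate: 20 }
console.log(limitsFor("trial"));   // { capacity: 10, refillRate: 1 }
```

A request handler could then look up `limitsFor(user.tier)`, pass the resulting `capacity` and `refillRate` to the rate limiter, and call `recordDecision` with the result (`user.tier` being an assumed field on your user object).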

🔹 Real-World Usage

Stripe enforces per-API key limits.
Cloudflare applies edge-based token buckets.
Twitter/X API uses strict quotas per user/app.

✅ Key Takeaways

Rate limiting is non-negotiable in production systems.
Token Bucket = fair + burst-friendly.
Redis enables distributed, centralized state management.
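You don't have to know Lua to get started: basic Redis ops + Node.js are enough.
Lua (or a WATCH/MULTI transaction) becomes necessary once the limit must hold exactly under concurrent requests, because a plain read-then-write flow is not atomic.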

That’s it for today. See you later!!
