Monday, March 3, 2025

Rate Limiting Algorithms: A Deep Dive 2

Programming LanguageRate Limiting Algorithms: A Deep Dive 2


Introduction

Rate limiting is a technique used to control the number of requests a system processes within a specific time frame. It helps prevent abuse, ensures fair usage, protects against DDoS attacks, and maintains system stability.

This blog explores the most commonly used rate-limiting algorithms, their advantages and disadvantages, and how to implement them in Java.


1️⃣ Token Bucket Algorithm

How It Works

  • A bucket holds a fixed number of tokens.
  • Tokens are added to the bucket at a constant rate.
  • Each request consumes a token.
  • If the bucket is empty, excess requests are denied until new tokens are added.

Pros & Cons

✅ Allows short bursts of traffic while controlling overall request rate.

✅ Efficient for distributed systems.

❌ If the bucket drains quickly, requests may be blocked until tokens are refilled.

Java Implementation

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class TokenBucketRateLimiter {
    private final Semaphore tokens;
    private final int capacity;

    public TokenBucketRateLimiter(int capacity, int refillRatePerSecond) {
        this.capacity = capacity;
        this.tokens = new Semaphore(capacity);

        // Refill tokens periodically
        new Thread(() -> {
            while (true) {
                tokens.release(refillRatePerSecond);
                if (tokens.availablePermits() > capacity) {
                    tokens.drainPermits();
                    tokens.release(capacity);
                }
                try {
                    TimeUnit.SECONDS.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }).start();
    }

    public boolean allowRequest() {
        return tokens.tryAcquire();
    }
}
Enter fullscreen mode

Exit fullscreen mode


2️⃣ Leaky Bucket Algorithm

How It Works

  • Requests enter a queue (bucket).
  • Requests are processed at a fixed rate (like water leaking from a bucket).
  • If the queue overflows, extra requests are discarded.

Pros & Cons

✅ Ensures a steady flow of requests.
✅ Prevents sudden traffic spikes from overloading the system.
❌ Can introduce delays if the queue is full.

Java Implementation

import java.util.LinkedList;
import java.util.Queue;

class LeakyBucketRateLimiter {
    private final Queue<Long> queue;
    private final int capacity;
    private final long leakRateMillis;

    public LeakyBucketRateLimiter(int capacity, int leakRatePerSecond) {
        this.capacity = capacity;
        this.leakRateMillis = 1000L / leakRatePerSecond;
        this.queue = new LinkedList<>();
    }

    public synchronized boolean allowRequest() {
        long currentTime = System.currentTimeMillis();
        while (!queue.isEmpty() && queue.peek() <= currentTime - leakRateMillis) {
            queue.poll();
        }
        if (queue.size() < capacity) {
            queue.add(currentTime);
            return true;
        }
        return false;
    }
}
Enter fullscreen mode

Exit fullscreen mode


3️⃣ Fixed Window Counter Algorithm

How It Works

  • A counter tracks the number of requests per fixed time window (e.g., per minute).
  • If the count exceeds the allowed limit, further requests are rejected until the next window.

Pros & Cons

✅ Easy to implement.

✅ Works well for predictable traffic patterns.

❌ Can lead to spikes at window boundaries.

Java Implementation

import java.util.concurrent.atomic.AtomicInteger;

class FixedWindowRateLimiter {
    private final int limit;
    private final long windowSizeMillis;
    private long windowStart;
    private final AtomicInteger requestCount;

    public FixedWindowRateLimiter(int limit, int windowSizeSeconds) {
        this.limit = limit;
        this.windowSizeMillis = windowSizeSeconds * 1000L;
        this.windowStart = System.currentTimeMillis();
        this.requestCount = new AtomicInteger(0);
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowSizeMillis) {
            windowStart = now;
            requestCount.set(0);
        }
        return requestCount.incrementAndGet() <= limit;
    }
}
Enter fullscreen mode

Exit fullscreen mode


4️⃣ Sliding Window Counter Algorithm

How It Works

  • Uses smaller sub-windows within a fixed window.
  • More accurate and distributes requests evenly.

Pros & Cons

✅ More accurate than Fixed Window Counter.

✅ Reduces request bursts at window boundaries.

❌ More complex to implement.


5️⃣ Sliding Window Log Algorithm

How It Works

  • Stores timestamps of each request.
  • Removes timestamps outside the allowed time window.

Pros & Cons

✅ High accuracy in rate limiting.

❌ High memory usage due to storing request timestamps.


6️⃣ Adaptive Rate Limiting

How It Works

  • Uses machine learning or heuristics to adjust rate limits dynamically.
  • Can consider factors like server load, request patterns, and user behavior.

Pros & Cons

✅ Dynamically adjusts limits for better efficiency.

❌ Complex implementation.


Comparison Table

Algorithm Burst Handling Smoothness Memory Usage Complexity
Token Bucket ✅ Yes ✅ Yes 🔹 Low 🔹 Simple
Leaky Bucket ❌ No ✅ Yes 🔹 Low 🔹 Simple
Fixed Window Counter ❌ No ❌ No 🔹 Low 🔹 Simple
Sliding Window Counter ✅ Yes ✅ Yes 🔹 Medium 🔹 Medium
Sliding Window Log ✅ Yes ✅ Yes 🔴 High 🔴 Complex
Adaptive Rate Limiting ✅ Yes ✅ Yes 🔴 High 🔴 Complex

Interview Preparation

Commonly Asked Questions:

1️⃣ What is the difference between Token Bucket and Leaky Bucket?

2️⃣ Which algorithm prevents bursts better?

3️⃣ How would you implement rate limiting in a distributed system?

4️⃣ How do you ensure fairness in rate limiting?


Conclusion

Choosing the right rate-limiting algorithm depends on your system’s requirements. For APIs, Token Bucket is widely used due to its efficiency. If smooth processing is essential, Leaky Bucket is a good choice. For better accuracy, Sliding Window methods work well.

Want a deeper dive into API Gateway-based rate limiting? Let me know! 🚀

Check out our other content

Check out other tags:

Most Popular Articles