@dennismenken dennismenken commented Nov 18, 2025

Summary

This Pull Request fixes a bug in the Redis-based rate limiter (both single-node and cluster variants) where old entries in the sliding window were never removed. Under sustained traffic, this could cause individual rate-limit keys to remain permanently "full" and reject all subsequent requests with HTTP 429, even after the configured time window had elapsed.

Background

The Redis rate limiter uses a sorted set per key to implement a sliding-window algorithm:

  • Each request is stored as a timestamp in a Redis ZSET.
  • On each check, entries older than the configured window are supposed to be removed.
  • The current request count is then derived from the ZSET's cardinality.
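
The same algorithm can be sketched in plain Rust, with an in-memory queue of timestamps standing in for the ZSET (a hypothetical illustration, not the limiter's actual code; the ZSET commands each step corresponds to are noted in comments):

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Minimal in-memory sketch of the sliding-window check; the VecDeque of
/// timestamps (oldest first) plays the role of the Redis ZSET.
struct SlidingWindow {
    window: Duration,
    max_requests: usize,
    hits: VecDeque<Instant>,
}

impl SlidingWindow {
    fn new(window: Duration, max_requests: usize) -> Self {
        Self { window, max_requests, hits: VecDeque::new() }
    }

    /// Returns true if the request is allowed.
    fn check(&mut self, now: Instant) -> bool {
        // Drop entries older than the window (what ZREMRANGEBYSCORE does).
        let cutoff = now - self.window;
        while self.hits.front().map_or(false, |&t| t < cutoff) {
            self.hits.pop_front();
        }
        // Derive the current count from what remains (ZCARD).
        if self.hits.len() >= self.max_requests {
            return false;
        }
        // Record this request's timestamp (ZADD).
        self.hits.push_back(now);
        true
    }
}

fn main() {
    // Base the clock in the future so subtracting the window cannot
    // underflow Instant on a freshly booted machine.
    let t0 = Instant::now() + Duration::from_secs(3600);
    let mut rl = SlidingWindow::new(Duration::from_secs(60), 2);
    assert!(rl.check(t0));
    assert!(rl.check(t0 + Duration::from_secs(1)));
    assert!(!rl.check(t0 + Duration::from_secs(2))); // window full
    assert!(rl.check(t0 + Duration::from_secs(61))); // old entries pruned
    println!("sliding window behaves as described");
}
```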

In both RedisRateLimiter and RedisClusterRateLimiter, the cleanup logic was implemented as:

// Remove all elements older than our window
let _: () = conn
    .zrevrangebyscore(&redis_key, 0, window_start as i64)
    .await
    .map_err(|e| Error::Redis(format!("Failed to clean up Redis sorted set: {e}")))?;

ZREVRANGEBYSCORE only reads the matching members; it does not delete them. Because the result was ignored and there was no additional cleanup, old timestamps were never removed from the ZSET. When combined with the periodic EXPIRE on the key, a hot key could end up in a state where:

  • The sorted set keeps accumulating entries.
  • ZCARD eventually returns a value greater than or equal to max_requests.
  • The key's TTL is constantly extended via EXPIRE, so it never naturally expires.
  • The limiter reports remaining = 0 and rejects all new requests indefinitely for that key.

The only way out of this state was an explicit DEL on the key (e.g., via reset() or manual intervention in Redis).
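
To make the failure mode concrete, here is a hypothetical in-memory analogue (a BTreeMap standing in for the ZSET) contrasting the read-only cleanup with one that actually removes stale entries. At one request per second with a 60 s window and max_requests = 100, the buggy variant rejects every request from the 101st onward, even though only about 60 entries are ever inside the real window:

```rust
use std::collections::BTreeMap;

/// One rate-limit check against an in-memory "ZSET" (score in ms -> member).
/// `remove_stale` toggles between the buggy cleanup (a read, like
/// ZREVRANGEBYSCORE) and the fixed one (an actual removal).
fn check(
    zset: &mut BTreeMap<u64, u64>,
    now_ms: u64,
    window_ms: u64,
    max_requests: usize,
    remove_stale: bool,
) -> bool {
    let window_start = now_ms.saturating_sub(window_ms);
    if remove_stale {
        // Fixed: drop every entry with a score <= window_start.
        *zset = zset.split_off(&(window_start + 1));
    }
    // Buggy path removes nothing here, so the set only ever grows.
    if zset.len() >= max_requests {
        return false; // the key looks permanently "full"
    }
    zset.insert(now_ms, now_ms); // ZADD of the current timestamp
    true
}

fn main() {
    for remove_stale in [false, true] {
        let mut zset = BTreeMap::new();
        let rejected = (0..1_000u64)
            .filter(|t| !check(&mut zset, t * 1_000, 60_000, 100, remove_stale))
            .count();
        println!("remove_stale={remove_stale}: rejected {rejected} of 1000");
    }
}
```

With `remove_stale = false` the simulation rejects 900 of 1000 requests; with `remove_stale = true` it rejects none, since the window never holds more than ~60 entries.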

Changes

The fix is minimal and targeted:

  • In src/rate_limiter/redis_limiter.rs and src/rate_limiter/redis_cluster_limiter.rs, replace the cleanup call:

    // Remove all elements older than our window
    let _: () = conn
        .zrevrangebyscore(&redis_key, 0, window_start as i64)
        .await
        .map_err(|e| Error::Redis(format!("Failed to clean up Redis sorted set: {e}")))?;

    with:

    // Remove all elements older than our window
    let _: () = conn
        .zrembyscore(&redis_key, 0, window_start as i64)
        .await
        .map_err(|e| Error::Redis(format!("Failed to clean up Redis sorted set: {e}")))?;
  • No configuration fields, public types, or external interfaces are changed.
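
The corrected per-request sequence (remove stale entries, refresh the TTL, count, record) can be walked through with a small in-memory model. The struct and the TTL simulation below are hypothetical, but they illustrate why refreshing EXPIRE on every check is harmless once stale entries are actually removed:

```rust
use std::collections::BTreeSet;

/// In-memory model of one rate-limit key: the set holds timestamps in ms
/// (the ZSET scores) and `expires_at` simulates the TTL refreshed via EXPIRE.
struct Key {
    zset: BTreeSet<u64>,
    expires_at: u64,
}

fn check(key: &mut Key, now_ms: u64, window_ms: u64, max_requests: usize) -> bool {
    if now_ms >= key.expires_at {
        key.zset.clear(); // the key expired naturally
    }
    // ZREMRANGEBYSCORE 0 window_start: remove entries outside the window.
    let window_start = now_ms.saturating_sub(window_ms);
    key.zset = key.zset.split_off(&(window_start + 1));
    // EXPIRE: refresh the TTL; harmless now that stale entries are gone.
    key.expires_at = now_ms + window_ms;
    // ZCARD: count what is left inside the window.
    if key.zset.len() >= max_requests {
        return false;
    }
    key.zset.insert(now_ms); // ZADD the current timestamp
    true
}

fn main() {
    let mut key = Key { zset: BTreeSet::new(), expires_at: 0 };
    assert!(check(&mut key, 1_000, 60_000, 2));
    assert!(check(&mut key, 2_000, 60_000, 2));
    assert!(!check(&mut key, 3_000, 60_000, 2)); // window full
    assert!(check(&mut key, 62_500, 60_000, 2)); // recovers after the window
    println!("hot key recovers without a manual DEL");
}
```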

Impact

With this change:

  • The sliding window behaves as intended:
    • Entries older than the configured window are actually removed from the ZSET.
    • Keys that see bursts of traffic recover naturally once the window has passed.
  • The bug where a hot key could become permanently rate-limited is eliminated.
  • The behavior of the limiter is now consistent with the existing comments and the expected semantics of a sliding-window algorithm.

This change is backward compatible from a configuration and API perspective and only affects the internal cleanup semantics of the rate limiter.

Testing

  • cargo check passes locally.
  • Manual validation with Redis-backed rate limiting:
    • Generate requests up to and beyond the configured max_requests within the window and observe 429 as expected.
    • Wait for the window to elapse and confirm that:
      • the ZSET cardinality decreases as old entries are removed, and
      • new requests are accepted again without requiring manual Redis key deletion.

@gemini-code-assist
Contributor

Summary of Changes

Hello @dennismenken, I'm Gemini Code Assist! Here's a summary of this pull request to help you and other reviewers quickly get up to speed.

This pull request resolves a significant issue in the Redis-based rate limiter where the sliding window mechanism failed to properly clean up old entries. Previously, the incorrect Redis command was used for cleanup, causing rate-limited keys to remain perpetually blocked. The change replaces the command with the correct one, ensuring that the rate limiter functions as designed, allowing keys to recover after their configured time window has elapsed.

Highlights

  • Bug Fix: Addresses a critical bug in the Redis-based rate limiter where old entries in the sliding window were never removed, leading to individual rate-limit keys becoming permanently 'full' and rejecting all subsequent requests with HTTP 429.
  • Correction of Redis Command Usage: The cleanup logic incorrectly used ZREVRANGEBYSCORE, which only reads elements, instead of ZREMRANGEBYSCORE, which actually removes elements older than the configured window. This PR rectifies this by replacing the command.
  • Improved Rate Limiter Behavior: With this fix, the sliding window now behaves as intended, ensuring that keys that experience bursts of traffic recover naturally once the time window has passed, eliminating the permanent rate-limiting issue.
@MarcEspiard MarcEspiard left a comment

Please build and test your changes before submitting a PR.

The issue is real so I'll keep this open so you can finish addressing the issue.

Thank you

 // Remove all elements older than our window
 let _: () = conn
-    .zrevrangebyscore(&redis_key, 0, window_start as i64)
+    .zremrangebyscore(&redis_key, 0, window_start as i64)
The build doesn't compile. zremrangebyscore doesn't exist in redis-rs.
The equivalent redis-rs function is zrembyscore


@dennismenken force-pushed the fix-redis-rate-limiter-cleanup branch from 95eba0b to 99fcea7 on November 18, 2025 at 07:31
@dennismenken (Author) commented

Hey @MarcEspiard, good catch: the Redis command is ZREMRANGEBYSCORE, and the redis-rs crate exposes it as zrembyscore. I was blinded by my fatigue and built the containers from the wrong repo.

I have updated the code to use that method, ran cargo check and cargo build, and can confirm the build passes now, thanks for pointing it out.

Yesterday, I noticed that Sockudo threw a rate limit error behind Traefik when trusted_hops = 0 during the health check, so I thought it would be a good idea to fix it.
