-
Notifications
You must be signed in to change notification settings - Fork 85
Redis: Do Not Open Circuit on OOM #940
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
CI failures appear to be unrelated and a similar pattern can be seen on recent PRs. The Redis part of the matrix seems happy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR changes Redis OOM (Out Of Memory) error handling to prevent circuit breaking, allowing Redis to recover more effectively from memory pressure situations. The change recognizes that OOM errors are fast failures where circuit breaking provides no benefit and can actually hinder recovery by blocking read and dequeue operations.
- Adds
marks_semian_circuits?method override returningfalsetoOutOfMemoryErrorclasses in both redis and redis_client adapters - Updates test expectations to verify circuits remain closed after OOM errors exceed the error threshold
- Includes clear documentation explaining the rationale: OOM errors are fast failures and blocking reads/dequeues prevents recovery
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lib/semian/redis.rb | Adds marks_semian_circuits? method to Redis::OutOfMemoryError class to prevent circuit opening for Redis v4 |
| lib/semian/redis_client.rb | Adds marks_semian_circuits? method to RedisClient::OutOfMemoryError using class_eval to prevent circuit opening |
| test/adapters/redis_test.rb | Updates two OOM test cases to verify circuits do NOT open after exceeding error threshold, with clear explanatory comments |
Note: The implementation correctly handles both Redis v4 (via lib/semian/redis.rb) and RedisClient (via lib/semian/redis_client.rb). However, for completeness, lib/semian/redis/v5.rb (not modified in this PR) should also be updated to add the same marks_semian_circuits? override for Redis::OutOfMemoryError to ensure consistent behavior when Redis v5+ translates RedisClient::OutOfMemoryError to Redis::OutOfMemoryError.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d72b796 to
67c3620
Compare
|
Rebased on main after the break, the entire matrix is now happy 🥳 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
a8bdc74 to
a21ee66
Compare
a21ee66 to
b20d4f1
Compare
What are you trying to accomplish?
I am changing Semian behavior on Redis OOM errors to not open the circuit in order to better allow Redis to recover (and since there is not much benefit here form circuit breaking anyways).
What approach did you choose and why?
The Jobs Platform Team has identified that not opening the circuit is preferable here. The current behavior concern was anticipated early on in #172.
Per @Mangara :
| We've seen this multiple times, both from HedwigUI and from Hedwig itself, when it tries to unsafely dequeue to recover from OOM.
What should reviewers focus on?