feat: Add with_fallback for model-level failover #622
Closed
kieranklaassen wants to merge 5 commits into crmne:main
When a model is overloaded or unavailable after retries are exhausted, automatically switch to a fallback model. Triggers on transient errors only (429, 500, 502-503, 529). Restores original model if fallback also fails. Logs when fallback activates.
Closes crmne#621
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use ensure block to restore original model/provider/connection after fallback attempt, regardless of success or failure
- Add @in_fallback guard to prevent double-fallback during tool call recursion
- Move FALLBACK_ERRORS constant to top of class per codebase convention
- Initialize @fallback and @in_fallback in constructor for consistency
- Inline complete_with_fallback into complete
- Extract shared test setup, remove duplicate test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… legacy support
- Add Faraday::TimeoutError and Faraday::ConnectionFailed to FALLBACK_ERRORS
- Map HTTP 504 to ServiceUnavailableError for fallback coverage
- Log fallback error details when both primary and fallback fail
- Sanitize all dynamic values in fallback log lines
- Add fallback macro to Agent DSL for feature parity
- Add with_fallback to legacy acts_as integration
- Guard persist_new_message against destroying valid messages (tool calls, content_raw/structured output)
- Widen AR cleanup rescue to include Faraday transport errors
- Refactor fallback specs to eliminate instance_variable_get coupling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
87482e7 to 7b63394
Move fallback logic into RubyLLM::Fallback module, following the same pattern as Streaming for cross-cutting concerns. Chat#complete shrinks from 24 lines of nested rescue to a single delegation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7b63394 to fd5cf17
Contributor, Author:
Testing this out on production before ready for review
Contributor, Author:
So far working great on prod
Owner:
Hey @kieranklaassen please use the PR template and don't open PRs from Claude Code. It's there for a reason. Especially this part is important:
This maximises chances that the design of the PR you submit is in line with the rest of the design of RubyLLM and therefore the chances of getting it accepted. Also:
Closing to go back to the discussion in the issue. Will reopen when we agree on a design.
Closes #621
Summary
Adds `with_fallback` to `Chat`, `Agent`, and ActiveRecord integrations, enabling automatic model-level failover on transient errors.

- `Chat#with_fallback(model_id, provider:)`: On transient errors (rate limit, server error, overloaded, timeout, connection failed), automatically retries with a fallback model, then restores the original model
- `Fallback` module: Extracted cross-cutting concern (like `Streaming`), included by `Chat`
- `Agent` `fallback` macro: Class-level configuration with inheritance support
- ActiveRecord: `with_fallback` delegation in both the `chat_methods.rb` and legacy (`acts_as_legacy.rb`) paths
- ActiveRecord cleanup rescue uses `Fallback::ERRORS`
- `persist_new_message` guards against destroying messages with tool_calls or content_raw

Usage
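A minimal, self-contained sketch of the chaining style described above. `SketchChat`, its `TransientError`, the `backend` lambda, and the model ids are all hypothetical stand-ins so the example runs without the gem; only the `with_fallback(model_id, provider:)` signature and the "returns `self`" behaviour are taken from this description, and the real implementation in the PR may differ.

```ruby
# Stand-in for the proposed API. This is NOT RubyLLM itself.
class SketchChat
  # Hypothetical error standing in for rate-limit / overload failures.
  TransientError = Class.new(StandardError)

  attr_reader :model_id, :provider

  # backend: hypothetical callable taking (model_id, prompt) and returning a String.
  def initialize(model_id, provider: nil, &backend)
    @model_id = model_id
    @provider = provider
    @backend = backend
    @fallback = nil
  end

  # Stores the fallback and returns self so the call can be chained.
  def with_fallback(model_id, provider: nil)
    @fallback = { model_id: model_id, provider: provider }
    self
  end

  # Tries the primary model; on a transient error, tries the fallback once.
  # The primary model id is never overwritten in this sketch.
  def ask(prompt)
    @backend.call(@model_id, prompt)
  rescue TransientError
    raise unless @fallback
    @backend.call(@fallback[:model_id], prompt)
  end
end

# Fake backend: the placeholder primary model is always overloaded.
backend = lambda do |model_id, prompt|
  raise SketchChat::TransientError, "overloaded" if model_id == "primary-model"
  "#{model_id}: #{prompt}"
end

chat = SketchChat.new("primary-model", &backend)
                 .with_fallback("backup-model", provider: :other)
reply = chat.ask("hello")
puts reply # prints "backup-model: hello"
```

Because `with_fallback` returns `self`, it can sit in the same chain as other configuration calls, and code that never calls it keeps the old behaviour (the error is re-raised).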
Changes
| File | Change |
| --- | --- |
| `lib/ruby_llm/fallback.rb` | `Fallback` module with `ERRORS`, `with_fallback`, `with_fallback_protection`, `attempt_fallback`, log helpers, `sanitize_for_log` |
| `lib/ruby_llm/chat.rb` | `include Fallback`, `complete` wraps body in `with_fallback_protection` |
| `lib/ruby_llm/error.rb` | `ServiceUnavailableError` |
| `lib/ruby_llm/agent.rb` | `fallback` class macro, inheritance, `apply_fallback` |
| `lib/ruby_llm/active_record/chat_methods.rb` | `with_fallback` delegation, rescue uses `Fallback::ERRORS`, phantom cleanup guards |
| `lib/ruby_llm/active_record/acts_as_legacy.rb` | `with_fallback` delegation |
| `docs/_advanced/error-handling.md` | |
| `docs/_core_features/chat.md` | |
| `README.md` | |

Design Decisions
- Fallback logic lives in `RubyLLM::Fallback`, following the same pattern as `Streaming`. Included via `include` (not `module_function`) since it needs instance state.
- `with_fallback` returns `self` for chaining, consistent with `with_model`, `with_tool`, etc.
- `ensure` block: Always restores original model/provider/connection after fallback attempt, whether it succeeds or fails
- `@in_fallback` guard: Prevents recursive fallback during tool call loops
- Non-transient errors (`InvalidRequestError`, `AuthenticationError`, etc.) pass through immediately

Test Plan
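As a rough illustration of the kind of check that exercises the design decisions above (restore via `ensure`, the `@in_fallback` guard, transient-only rescue), here is a self-contained sketch. The method and constant names `with_fallback_protection`, `attempt_fallback`, `ERRORS`, and `@in_fallback` come from this description; `GuardedChat`, `TransientError`, the backend lambdas, the model ids, and every method body are assumptions made for illustration, not the code in the PR.

```ruby
# Hypothetical stand-in; bodies are illustrative, not the PR's implementation.
class GuardedChat
  TransientError = Class.new(StandardError) # stands in for overload/timeout errors
  ERRORS = [TransientError].freeze          # anything else passes straight through

  attr_reader :model_id, :calls

  def initialize(model_id, &backend)
    @model_id = model_id
    @backend = backend # hypothetical callable taking the model id
    @fallback = nil
    @in_fallback = false
    @calls = []        # records which model each attempt used
  end

  def with_fallback(model_id)
    @fallback = model_id
    self
  end

  def complete
    with_fallback_protection do
      @calls << @model_id
      @backend.call(@model_id)
    end
  end

  private

  # Re-raises when no fallback is configured or a fallback attempt is already
  # running, so a nested complete cannot trigger a second fallback.
  def with_fallback_protection(&block)
    block.call
  rescue *ERRORS
    raise if @fallback.nil? || @in_fallback
    attempt_fallback(&block)
  end

  # Swaps in the fallback model for one attempt; ensure restores the original
  # model and clears the guard whether the attempt succeeds or raises.
  def attempt_fallback
    original = @model_id
    @in_fallback = true
    @model_id = @fallback
    yield
  ensure
    @model_id = original
    @in_fallback = false
  end
end

# Case 1: primary fails, fallback succeeds.
flaky = lambda do |model_id|
  raise GuardedChat::TransientError, "overloaded" if model_id == "primary-model"
  "ok from #{model_id}"
end
chat = GuardedChat.new("primary-model", &flaky).with_fallback("backup-model")
result = chat.complete

# Case 2: both fail; the error propagates and the model is still restored.
always_down = ->(_model_id) { raise GuardedChat::TransientError, "overloaded" }
down = GuardedChat.new("primary-model", &always_down).with_fallback("backup-model")
both_failed = begin
  down.complete
  false
rescue GuardedChat::TransientError
  true
end
```

In both cases `model_id` reads `"primary-model"` afterwards, which is the property the `ensure` block is meant to guarantee.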
Post-Deploy Monitoring & Validation
No additional operational monitoring required: this is a library gem, and consumers opt in to `with_fallback` explicitly.