feat: Add with_fallback for model-level failover #622

Closed
kieranklaassen wants to merge 5 commits into crmne:main from kieranklaassen:feat/with-fallback
@kieranklaassen kieranklaassen commented Feb 19, 2026

Closes #621

Summary

Adds with_fallback to Chat, Agent, and ActiveRecord integrations, enabling automatic model-level failover on transient errors.

  • Chat#with_fallback(model_id, provider:) — On transient errors (rate limit, server error, overloaded, timeout, connection failed), automatically retries with a fallback model, then restores the original model
  • Fallback module — Extracted cross-cutting concern (like Streaming), included by Chat
  • Agent DSL fallback macro — Class-level configuration with inheritance support
  • ActiveRecord delegation — Both new (chat_methods.rb) and legacy (acts_as_legacy.rb) paths
  • Hardened error handling — Transport errors (Faraday::TimeoutError, Faraday::ConnectionFailed) trigger fallback; 504 mapped to ServiceUnavailableError; AR cleanup widened to cover Fallback::ERRORS
  • Log sanitization — Control characters stripped from dynamic values in fallback log messages
  • Phantom message safety: persist_new_message guards against destroying messages with tool_calls or content_raw
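
The log sanitization bullet can be illustrated with a minimal sketch. The PR names a sanitize_for_log helper but does not show its body, so the implementation below (and the LogSanitizer wrapper module) is an assumption, not the PR's code. It strips control characters so that a dynamic value such as a model id cannot inject extra lines into a log message.

```ruby
# Hypothetical sketch: the body of sanitize_for_log is assumed, and
# LogSanitizer is an illustrative wrapper that is not part of the PR.
module LogSanitizer
  module_function

  # Removes control characters (CR, LF, ESC, DEL, ...) from the string
  # form of a value before it is interpolated into a log line.
  def sanitize_for_log(value)
    value.to_s.gsub(/[[:cntrl:]]/, '')
  end
end

# A model id carrying an embedded newline that tries to forge a log entry.
suspicious_model_id = "gpt-4.1\nINFO forged log entry"
puts "Falling back from #{LogSanitizer.sanitize_for_log(suspicious_model_id)}"
```

Removing the characters outright, rather than escaping them, keeps every log message on a single line at the cost of losing the exact original bytes.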

Usage

```ruby
# Chat: the chain ends in ask, so the assigned value is the response
response = RubyLLM.chat(model: 'gpt-4.1')
  .with_fallback('claude-sonnet-4-5-20250514')
  .ask("Hello!")

# Agent DSL
class MyAgent < RubyLLM::Agent
  model 'gpt-4.1'
  fallback 'claude-sonnet-4-5-20250514'
end

# ActiveRecord
chat_record.with_fallback('claude-sonnet-4-5-20250514').ask("Hello!")
```

Changes

| File | Change |
| --- | --- |
| lib/ruby_llm/fallback.rb | New Fallback module with ERRORS, with_fallback, with_fallback_protection, attempt_fallback, log helpers, sanitize_for_log |
| lib/ruby_llm/chat.rb | include Fallback, complete wraps body in with_fallback_protection |
| lib/ruby_llm/error.rb | Map HTTP 504 → ServiceUnavailableError |
| lib/ruby_llm/agent.rb | fallback class macro, inheritance, apply_fallback |
| lib/ruby_llm/active_record/chat_methods.rb | with_fallback delegation, rescue uses Fallback::ERRORS, phantom cleanup guards |
| lib/ruby_llm/active_record/acts_as_legacy.rb | with_fallback delegation |
| docs/_advanced/error-handling.md | Fallback documentation |
| docs/_core_features/chat.md | Usage examples |
| README.md | Feature highlight |
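
The phantom cleanup guard listed for chat_methods.rb can be sketched roughly as follows. This is an illustration under assumptions: MessageRow is a hypothetical stand-in for the ActiveRecord message model, and phantom_message? is an invented helper name. Only the tool_calls and content_raw fields come from the PR text. The idea is that a placeholder row is destroyed after a failed request only when it carries no payload at all.

```ruby
# Hypothetical stand-in for an ActiveRecord message row. The tool_calls and
# content_raw field names follow the PR text; everything else is assumed.
MessageRow = Struct.new(:content, :tool_calls, :content_raw, keyword_init: true)

# True only for a placeholder row that never received any payload, so that
# cleanup after a failed request cannot delete real data.
def phantom_message?(message)
  content_blank = message.content.nil? || message.content.to_s.strip.empty?
  no_tool_calls = message.tool_calls.nil? || message.tool_calls.empty?
  no_raw_content = message.content_raw.nil?
  content_blank && no_tool_calls && no_raw_content
end

orphan    = MessageRow.new(content: nil)
tool_call = MessageRow.new(content: nil, tool_calls: [{ name: 'lookup' }])

puts phantom_message?(orphan)    # safe to destroy
puts phantom_message?(tool_call) # must be kept
```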

Design Decisions

  • Module extraction — Fallback is a cross-cutting concern extracted into RubyLLM::Fallback, following the same pattern as Streaming. Included via include (not module_function) since it needs instance state.
  • Builder pattern: with_fallback returns self for chaining, consistent with with_model, with_tool, etc.
  • ensure block — Always restores original model/provider/connection after fallback attempt, whether it succeeds or fails
  • @in_fallback guard — Prevents recursive fallback during tool call loops
  • Re-raises original error — When fallback also fails, the original error is raised (not the fallback error)
  • No retry on non-transient errors: InvalidRequestError, AuthenticationError, etc. pass through immediately
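
The decisions above can be combined in a small self-contained sketch. It is not the PR's Fallback module: SketchChat, TransientError, InvalidRequest, and the callable backend are hypothetical names used only for illustration, and treating any fallback failure as a reason to re-raise the original error is an assumption about the intended behaviour.

```ruby
class TransientError < StandardError; end # stands in for rate limit, overload, timeout
class InvalidRequest < StandardError; end # stands in for non-transient errors

class SketchChat
  FALLBACK_ERRORS = [TransientError].freeze

  attr_reader :model

  # backend is any callable that takes a model id and returns a reply or raises.
  def initialize(model, backend)
    @model = model
    @backend = backend
    @fallback = nil
    @in_fallback = false
  end

  # Builder style: returns self so calls can be chained.
  def with_fallback(model_id)
    @fallback = model_id
    self
  end

  def complete
    @backend.call(@model)
  rescue *FALLBACK_ERRORS => e
    # No fallback configured, or already inside one: do not recurse.
    raise if @fallback.nil? || @in_fallback

    attempt_fallback(e)
  end

  private

  def attempt_fallback(original_error)
    original_model = @model
    @in_fallback = true
    @model = @fallback
    complete
  rescue StandardError
    # Assumption: whatever the fallback raised, surface the original error.
    raise original_error
  ensure
    # Runs on success and on failure, so the primary model is always restored.
    @model = original_model
    @in_fallback = false
  end
end

calls = []
backend = lambda do |model|
  calls << model
  raise TransientError, 'overloaded' if model == 'primary-model'

  "reply from #{model}"
end

chat = SketchChat.new('primary-model', backend).with_fallback('backup-model')
puts chat.complete
puts chat.model
```

Because InvalidRequest is not listed in FALLBACK_ERRORS, it propagates from complete without the fallback model ever being called.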

Test Plan

  • 18 chat fallback specs (transient errors, both-fail, model restoration, streaming, log sanitization, all error classes)
  • 7 agent fallback specs (macro, inheritance, isolation, apply, no-fallback)
  • 4 AR phantom message cleanup specs (orphan, tool_calls, content_raw, populated)
  • All 92 targeted examples pass, 0 failures
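
The macro, inheritance, and isolation behaviour that the agent specs cover might look like the sketch below. It is a guess at the shape of a class-level macro, not the code in lib/ruby_llm/agent.rb: SketchAgent and its subclasses are hypothetical, and resolving the setting by walking up the superclass chain is one possible design among several.

```ruby
class SketchAgent
  class << self
    # With an argument, records the fallback for this class only.
    # Without one, returns this class's setting or the nearest ancestor's.
    def fallback(model_id = nil)
      return @fallback = model_id if model_id
      return @fallback if instance_variable_defined?(:@fallback)

      superclass.fallback if superclass.respond_to?(:fallback)
    end
  end
end

class SupportAgent < SketchAgent
  fallback 'claude-sonnet-4-5-20250514'
end

# Inherits the parent's fallback without declaring its own.
class BillingAgent < SupportAgent; end

# Overrides the fallback without touching the parent.
class EscalationAgent < SupportAgent
  fallback 'gpt-4.1'
end

puts BillingAgent.fallback
puts EscalationAgent.fallback
puts SupportAgent.fallback
```

Storing the value in a class-level instance variable rather than a class variable keeps sibling subclasses isolated from each other.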

Post-Deploy Monitoring & Validation

No additional operational monitoring is required: this is a library gem, and consumers opt in to with_fallback explicitly.


Compound Engineered Generated with Claude Code (Claude Opus 4.6) and Codex (Codex 5.3)

kieranklaassen and others added 3 commits February 18, 2026 21:16
When a model is overloaded or unavailable after retries are exhausted,
automatically switch to a fallback model. Triggers on transient errors
only (429, 500, 502-503, 529). Restores original model if fallback
also fails. Logs when fallback activates.

Closes crmne#621

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use ensure block to restore original model/provider/connection after
  fallback attempt, regardless of success or failure
- Add @in_fallback guard to prevent double-fallback during tool call
  recursion
- Move FALLBACK_ERRORS constant to top of class per codebase convention
- Initialize @fallback and @in_fallback in constructor for consistency
- Inline complete_with_fallback into complete
- Extract shared test setup, remove duplicate test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… legacy support

- Add Faraday::TimeoutError and Faraday::ConnectionFailed to FALLBACK_ERRORS
- Map HTTP 504 to ServiceUnavailableError for fallback coverage
- Log fallback error details when both primary and fallback fail
- Sanitize all dynamic values in fallback log lines
- Add fallback macro to Agent DSL for feature parity
- Add with_fallback to legacy acts_as integration
- Guard persist_new_message against destroying valid messages
  (tool calls, content_raw/structured output)
- Widen AR cleanup rescue to include Faraday transport errors
- Refactor fallback specs to eliminate instance_variable_get coupling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move fallback logic into RubyLLM::Fallback module, following the same
pattern as Streaming for cross-cutting concerns. Chat#complete shrinks
from 24 lines of nested rescue to a single delegation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kieranklaassen
Contributor Author

Testing this out on production before ready for review

@kieranklaassen
Contributor Author

So far working great on prod

@kieranklaassen kieranklaassen marked this pull request as ready for review February 20, 2026 20:24
@crmne
Owner

crmne commented Feb 27, 2026

Hey @kieranklaassen please use the PR template and don't open PRs from Claude Code. It's there for a reason.

Especially this part is important:

  • I opened an issue before writing code and received maintainer approval
  • Linked issue: #___

This maximises chances that the design of the PR you submit is in line with the rest of the design of RubyLLM and therefore the chances of getting it accepted.

Also:

  • I used AI tools to help write this code
  • I have reviewed and understand all generated code (required if above is checked)

Closing to go back to the discussion in the issue. Will reopen when we agree on a design.

@crmne crmne closed this Feb 27, 2026
Development

Successfully merging this pull request may close these issues.

Feature: with_fallback for model-level failover

2 participants