
Conversation

@Ahmath-Gadji (Collaborator) commented Jan 26, 2026

Fix OpenAI endpoint failures due to LLM availability checks

OpenAI endpoints fail when the underlying LLM or VLM configured in openrag is unavailable, because the check_llm_model_availability dependency blocks those requests.

Changes

  • v1/models endpoint: Removed the check_llm_model_availability dependency since this endpoint doesn't require model availability to function

  • v1/chat/completions and v1/completions endpoints: These endpoints still depend on LLM availability (for answer generation), but clients weren't getting timely feedback when the model was unreachable because the availability check had no timeout, so requests could hang indefinitely. Added a 30s timeout to the check so clients get immediate feedback instead (a rough sketch of the resulting endpoint wiring follows this list).

  • Additional fix: Corrected the OpenAI endpoint duplication warning
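
A rough sketch of the resulting endpoint wiring, assuming FastAPI-style dependencies; only the function and dependency names come from this PR, while the route paths, import path, and parameter shapes are illustrative:

```python
from fastapi import APIRouter, Depends

# check_llm_model_availability lives in openrag/routers/utils.py; the relative
# import below is an assumption about how openai.py pulls it in.
from .utils import check_llm_model_availability

router = APIRouter()


@router.get("/v1/models")
async def list_models():
    # No availability dependency here: listing models works even when the LLM is down.
    ...


@router.post("/v1/chat/completions")
async def openai_chat_completion(
    body: dict,
    _: None = Depends(check_llm_model_availability),  # now fails fast instead of hanging
):
    ...
```

openai_completion would carry the same trailing dependency as openai_chat_completion.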

Summary by CodeRabbit

  • Bug Fixes

    • Model availability checks now return clearer HTTP status codes and messages; missing models yield an explicit 404, and API errors return 500.
    • External API calls now use timeouts to avoid indefinite hangs.
  • Refactor

    • Router mounting logic consolidated to remove duplication and ensure consistent behavior.
    • Model validation simplified to a single flow with improved contextual logging for easier diagnostics.


@coderabbitai coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

OpenAI router mounting now occurs when either the OpenAI API or Chainlit UI is enabled. list_models no longer requires the LLM-availability dependency. The LLM availability check now validates a single configured LLM with 30s timeouts, contextual logging, and refined 404/500 error handling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Router Setup<br>openrag/api.py | Mounts the OpenAI router when the OpenAI API OR the Chainlit UI is enabled; removes a duplicate unconditional mount. |
| OpenAI Endpoints<br>openrag/routers/openai.py | Removed _: None = Depends(check_llm_model_availability) from list_models; moved that dependency to the end of the parameter lists for openai_chat_completion and openai_completion. |
| Model Availability Utils<br>openrag/routers/utils.py | Consolidated validation to a single LLM flow: added openai import, instantiate AsyncOpenAI client with 30s timeouts, call models.list(timeout=30), use bound contextual logging (base_url, model, model_type="LLM"), and refine HTTP 404/500 handling and exception propagation. |
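
For the openrag/api.py row, a minimal sketch of the consolidated mount; the flag names are hypothetical stand-ins for openrag's real settings, and only the idea of a single conditional include_router comes from the PR:

```python
from fastapi import FastAPI

from openrag.routers.openai import router as openai_router  # import path assumed

app = FastAPI()

# Hypothetical configuration flags; openrag's actual settings object will differ.
openai_api_enabled = True
chainlit_ui_enabled = False

# Mount the OpenAI-compatible router exactly once, whenever either consumer needs it;
# the previous unconditional second mount is what produced the duplicate-operation warning
# quoted later in this thread.
if openai_api_enabled or chainlit_ui_enabled:
    app.include_router(openai_router)
```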

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Client as Client
  participant Server as Server (FastAPI)
  participant OpenAI as OpenAI API

  Client->>Server: Request OpenAI endpoint
  Server->>Server: check_llm_model_availability (dependency)
  Server->>OpenAI: AsyncOpenAI.models.list(timeout=30)
  OpenAI-->>Server: models list or error
  Server-->>Client: Endpoint response or HTTP error
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

🐇 I hopped through routes and checked the sky,
One LLM, one timeout — I gave it a try,
I logged each step and caught the fall,
I nudged a mount and trimmed a sprawl,
Fresh burrow, neat and spry 🥕

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing OpenAI endpoints that were blocked due to the LLM/VLM availability check dependency. |


✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch fix/llm_dependency


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@openrag/routers/utils.py`:
- Around line 253-255: Fix the typo in the error message string returned by the
function in openrag/routers/utils.py: change "endpint" to "endpoint" in the
detail message that references available_models and param['base_url'] so the
raised HTTPException's detail reads "...available from this endpoint..." (look
for the code constructing the HTTPException with
status_code=status.HTTP_404_NOT_FOUND and detail using available_models and
param['base_url']).
- Around line 242-247: logger.bind(...) returns a new logger and the bound
instance is being discarded; change the code to capture the bound logger (e.g.,
bound_logger = logger.bind(endpoint=param["base_url"],
model_name=param["model"], model_type="LLM")) and replace subsequent logging
calls (e.g., the logger.info("Validating model") here and any later uses that
should include the bound context) with bound_logger.info (and bound_logger.* for
other levels) so the endpoint/model fields are actually included in the emitted
logs.
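
The behaviour behind this comment, as a self-contained loguru example with placeholder values:

```python
from loguru import logger

base_url = "http://localhost:8000/v1"  # placeholder config values
model = "example-llm"

# bind() does not mutate the module-level logger; it returns a new logger carrying
# the extra fields, so that returned instance is the one that must be used afterwards.
bound_logger = logger.bind(base_url=base_url, model=model, model_type="LLM")
bound_logger.info("Validating model")  # record includes base_url / model / model_type
logger.info("Validating model")        # record does NOT include the bound fields
```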

@Ahmath-Gadji force-pushed the fix/llm_dependency branch 2 times, most recently from 8e06c16 to 3048c55 on January 28, 2026 at 08:55
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@openrag/routers/utils.py`:
- Around line 240-246: Remove the unused request: Request parameter from the
check_llm_model_availability signature and update the function to consistently
use the safely-extracted variables (base_url, model, api_key) instead of
indexing llm_param directly; specifically replace
logger.bind(base_url=llm_param["base_url"], model=llm_param["model"], ...) with
logger.bind(base_url=base_url, model=model, model_type="LLM") and ensure any
other references use the extracted variables to avoid KeyError from dict
indexing.
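
A hedged sketch of the shape this suggests; llm_param and its keys come from the discussion above, while get_llm_param is a hypothetical stand-in for however openrag exposes the configured LLM parameters:

```python
from loguru import logger


def get_llm_param() -> dict:
    # Hypothetical accessor, included only to keep this sketch runnable.
    return {"base_url": "http://localhost:8000/v1", "model": "example-llm", "api_key": "sk-..."}


async def check_llm_model_availability() -> None:  # unused `request: Request` parameter dropped
    llm_param = get_llm_param()
    base_url = llm_param.get("base_url")
    model = llm_param.get("model")
    api_key = llm_param.get("api_key")

    # Bind with the safely-extracted values; .get() avoids the KeyError that direct
    # indexing (llm_param["base_url"]) could raise on a partial config.
    log = logger.bind(base_url=base_url, model=model, model_type="LLM")
    log.debug("Validating model")
    # ... the actual availability probe against base_url/api_key follows ...
```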

```
UserWarning: Duplicate Operation ID list_models_v1_models_get for function list_models at /app/openrag/routers/openai.py
  warnings.warn(message, stacklevel=1)
```
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@openrag/routers/utils.py`:
- Around line 250-252: The AsyncOpenAI client in this block (variable client) is
never closed; replace the plain instantiation with an async context manager
using AsyncOpenAI (e.g., use "async with AsyncOpenAI(... ) as client:") and move
the calls to client.models.list and the creation of available_models ({m.id for
m in openai_models.data}) inside that context so the connection pool is
automatically closed when the block exits; ensure any references to
openai_models and available_models remain scoped inside the async with block.
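
A sketch of the suggested context-manager form; recent openai-python clients support async with, the 30s timeouts mirror the values discussed in this PR, and base_url/api_key below are placeholders:

```python
import asyncio

from openai import AsyncOpenAI


async def probe_models(base_url: str, api_key: str) -> set[str]:
    # Leaving the async context manager closes the client's connection pool,
    # which is the resource leak this comment is about.
    async with AsyncOpenAI(base_url=base_url, api_key=api_key, timeout=30) as client:
        openai_models = await client.models.list(timeout=30)
        return {m.id for m in openai_models.data}


if __name__ == "__main__":
    available_models = asyncio.run(probe_models("http://localhost:8000/v1", "sk-placeholder"))
    print(available_models)
```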

@Ahmath-Gadji added the fix (Fix issue) label on Jan 28, 2026

```python
log = logger.bind(base_url=llm_param["base_url"], model=llm_param["model"], model_type="LLM")
try:
    log.info("Validating model")
```
Collaborator

This log will appear for most requests and brings no information; I suggest either removing it or decreasing the level to debug.

Collaborator Author

fixed

