Fix: OpenAI endpoints blocked due to LLM/VLM availability check #225
base: dev
Conversation
📝 Walkthrough
OpenAI router mounting now occurs when either the OpenAI API or Chainlit UI is enabled.
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant Client as Client
    participant Server as Server (FastAPI)
    participant OpenAI as OpenAI API
    Client->>Server: Request OpenAI endpoint
    Server->>Server: check_llm_model_availability (dependency)
    Server->>OpenAI: AsyncOpenAI.models.list(timeout=30)
    OpenAI-->>Server: models list or error
    Server-->>Client: Endpoint response or HTTP error
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@openrag/routers/utils.py`:
- Around lines 253-255: Fix the typo in the error message string: change "endpint" to "endpoint" in the HTTPException detail that references available_models and param['base_url'], so the raised detail reads "...available from this endpoint..." (look for the HTTPException constructed with status_code=status.HTTP_404_NOT_FOUND and a detail built from available_models and param['base_url']).
- Around lines 242-247: logger.bind(...) returns a new logger, and the bound instance is being discarded. Capture the bound logger (e.g., bound_logger = logger.bind(endpoint=param["base_url"], model_name=param["model"], model_type="LLM")) and replace subsequent logging calls (the logger.info("Validating model") here and any later uses that should include the bound context) with bound_logger.info (and bound_logger.* for other levels) so the endpoint/model fields are actually included in the emitted logs. A combined sketch of both fixes follows this list.
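A minimal sketch combining both suggestions; the helper name, the dict keys, and the exact detail wording are assumptions rather than the repository's actual code:

```python
from fastapi import HTTPException, status
from loguru import logger


def validate_model(param: dict, available_models: set[str]) -> None:
    # logger.bind() returns a new logger; capture it instead of discarding it.
    bound_logger = logger.bind(
        endpoint=param["base_url"], model_name=param["model"], model_type="LLM"
    )
    bound_logger.info("Validating model")

    if param["model"] not in available_models:
        # Typo fixed: "endpoint" instead of "endpint".
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=(
                f"Model '{param['model']}' is not available from this endpoint "
                f"({param['base_url']}). Available models: {sorted(available_models)}"
            ),
        )
```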
Force-pushed from 8e06c16 to 3048c55.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@openrag/routers/utils.py`:
- Around lines 240-246: Remove the unused request: Request parameter from the check_llm_model_availability signature, and update the function to consistently use the safely-extracted variables (base_url, model, api_key) instead of indexing llm_param directly. Specifically, replace logger.bind(base_url=llm_param["base_url"], model=llm_param["model"], ...) with logger.bind(base_url=base_url, model=model, model_type="LLM"), and make any other references use the extracted variables to avoid a KeyError from dict indexing; a sketch follows below.
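A sketch of the suggested signature and binding, assuming a hypothetical get_llm_param() accessor for the configured LLM settings (the real code reads its configuration differently):

```python
from fastapi import HTTPException, status
from loguru import logger


def get_llm_param() -> dict:
    # Placeholder: the project presumably loads this from its own configuration.
    return {"base_url": "http://localhost:8000/v1", "model": "my-model", "api_key": "sk-..."}


async def check_llm_model_availability() -> None:
    # The unused `request: Request` parameter is dropped from the signature.
    llm_param = get_llm_param()
    base_url = llm_param.get("base_url")
    model = llm_param.get("model")
    api_key = llm_param.get("api_key")  # used later when calling the endpoint

    # Bind the safely-extracted variables instead of indexing llm_param directly.
    log = logger.bind(base_url=base_url, model=model, model_type="LLM")
    if not base_url or not model:
        log.warning("LLM endpoint is not fully configured")
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail="LLM endpoint is not configured",
        )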
UserWarning: Duplicate Operation ID list_models_v1_models_get for function list_models at /app/openrag/routers/openai.py
warnings.warn(message, stacklevel=1)
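For context, FastAPI emits this warning when the same operation is registered twice, e.g. if the OpenAI router were included once for the OpenAI API and again for the Chainlit UI. A plausible shape of the fix described in the walkthrough, with hypothetical flag names and router variable:

```python
from fastapi import APIRouter, FastAPI

openai_router = APIRouter()  # stands in for the router defined in openrag/routers/openai.py


def mount_routers(app: FastAPI, openai_api_enabled: bool, chainlit_enabled: bool) -> None:
    # Mount the OpenAI-compatible router once, behind a single combined condition,
    # so FastAPI never registers list_models (and friends) twice.
    if openai_api_enabled or chainlit_enabled:
        app.include_router(openai_router)
```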
Force-pushed from 3048c55 to 14b2e12.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@openrag/routers/utils.py`:
- Around lines 250-252: The AsyncOpenAI client in this block (variable client) is never closed. Replace the plain instantiation with an async context manager (e.g., "async with AsyncOpenAI(...) as client:") and move the client.models.list call and the creation of available_models ({m.id for m in openai_models.data}) inside that context so the connection pool is closed automatically when the block exits; any references to openai_models and available_models should stay scoped inside the async with block. A sketch follows below.
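A sketch of the context-manager form the comment asks for, assuming base_url and api_key were extracted earlier; the 30-second timeout matches the value added in this PR:

```python
from openai import AsyncOpenAI


async def list_available_models(base_url: str, api_key: str) -> set[str]:
    # The async context manager closes the client's connection pool when the block exits.
    async with AsyncOpenAI(base_url=base_url, api_key=api_key) as client:
        openai_models = await client.models.list(timeout=30)
        available_models = {m.id for m in openai_models.data}
    return available_models
```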
openrag/routers/utils.py (Outdated)
```python
log = logger.bind(base_url=llm_param["base_url"], model=llm_param["model"], model_type="LLM")
try:
    log.info("Validating model")
```
This log will appear for most requests and brings no information; I suggest either removing it or decreasing the level to debug.
fixed
Fix OpenAI endpoint failures due to LLM availability checks
OpenAI endpoints fail when the underlying LLM or VLM in openrag is unavailable, because the check_llm_model_availability dependency blocks requests.
Changes
v1/models endpoint: Removed the check_llm_model_availability dependency, since this endpoint doesn't need model availability to function.
v1/chat/completions and v1/completions endpoints: These endpoints depend on LLM availability (for answer generation). However, clients weren't receiving timely feedback about availability issues because the health check lacked a timeout, causing requests to hang indefinitely. Added timeout handling (30s) so clients get immediate feedback (see the sketch after this list).
Additional fix: Corrected the OpenAI endpoint duplication warning
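A sketch of how the resulting endpoint wiring could look, assuming FastAPI router decorators in openrag/routers/openai.py and the check_llm_model_availability dependency from utils.py; the handler bodies and the import path are placeholders:

```python
from fastapi import APIRouter, Depends

from openrag.routers.utils import check_llm_model_availability  # assumed import path

router = APIRouter()


@router.get("/v1/models")
async def list_models():
    # No availability dependency: model listing works even when the LLM is down.
    ...


@router.post("/v1/chat/completions", dependencies=[Depends(check_llm_model_availability)])
async def chat_completions(payload: dict):
    # The dependency checks the upstream endpoint with a 30 s timeout, so an
    # unreachable LLM returns a prompt HTTP error instead of hanging the request.
    ...


@router.post("/v1/completions", dependencies=[Depends(check_llm_model_availability)])
async def completions(payload: dict):
    ...
```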
Summary by CodeRabbit: Bug Fixes, Refactor