akram
Contributor

@akram akram commented Oct 3, 2025

  • Add enable_model_discovery configuration flag to vLLM provider to control model listing behavior
  • Implement enable_model_discovery() method across all providers with default implementations in base classes
  • Skip HTTP requests to the /v1/models endpoint when enable_model_discovery=false, which avoids a crash on startup

What does this PR do?

Allows model listing to be disabled on startup. This is useful when models are declared in vLLM but are not reachable (not configured, or behind an authentication wall).
Contributes to fixing #3151.
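
For reviewers, the intended behavior can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `VLLMConfigSketch`, `VLLMAdapterSketch`, and the injected `fetch_models` coroutine are hypothetical names, and the default value of `True` for the flag is an assumption. Only the `enable_model_discovery` flag, the `enable_model_discovery()` method, and the "return None without calling /v1/models" behavior come from the description and the debug log in the test plan.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class VLLMConfigSketch:
    """Hypothetical stand-in for the vLLM provider config (not the real class)."""

    url: str
    # Assumption: discovery is on unless explicitly disabled.
    enable_model_discovery: bool = True


class VLLMAdapterSketch:
    """Hypothetical adapter that gates model listing on the config flag."""

    def __init__(self, config, fetch_models):
        self.config = config
        # fetch_models is an injected coroutine standing in for GET /v1/models.
        self._fetch_models = fetch_models

    def enable_model_discovery(self) -> bool:
        return self.config.enable_model_discovery

    async def list_models(self):
        if not self.enable_model_discovery():
            # Discovery disabled: return None and never touch the network.
            return None
        return await self._fetch_models()


async def demo():
    calls = []

    async def fake_fetch():
        # Record the call so we can check whether the endpoint was hit.
        calls.append("GET /v1/models")
        return ["model-a"]

    url = "http://localhost:8000/v1"
    off = VLLMAdapterSketch(VLLMConfigSketch(url, enable_model_discovery=False), fake_fetch)
    on = VLLMAdapterSketch(VLLMConfigSketch(url, enable_model_discovery=True), fake_fetch)
    return await off.list_models(), await on.list_models(), calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Injecting the fetch function keeps the sketch runnable without a vLLM server and makes it easy to assert that no request is issued when the flag is off.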

Test Plan

providers:
  inference:
  - provider_id: ${env.VLLM_URL:+vllm}
    provider_type: remote::vllm
    config:
      url: ${env.VLLM_URL:=}
      max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
      api_token: ${env.VLLM_API_TOKEN:=fake}
      tls_verify: ${env.VLLM_TLS_VERIFY:=true}
      refresh_models: ${env.VLLM_REFRESH_MODELS:=false}
      enable_model_discovery: ${env.ENABLE_MODEL_DISCOVERY:=false}

models:
- metadata:
    display_name: vllm
  model_id: vllm
  provider_id: vllm
  model_type: llm
      

Run:

LLAMA_STACK_LOGGING="all=debug" VLLM_URL=https://my-vllm-server:8443/v1  MILVUS_DB_PATH=./milvus.db INFERENCE_MODEL=vllm uv run --with llama-stack llama stack build --distro starter --image-type venv --run

You should see:

DEBUG    2025-10-06 17:06:43,560 llama_stack.providers.remote.inference.vllm.vllm:286 inference::vllm: VLLM list_models called, enable_model_discovery=False
DEBUG    2025-10-06 17:06:43,561 llama_stack.providers.remote.inference.vllm.vllm:288 inference::vllm: VLLM list_models returning None due to enable_model_discovery=False

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 3, 2025
@akram akram force-pushed the add_allow_listing_models branch 3 times, most recently from 3ad3f20 to d301389 Compare October 3, 2025 22:12
• Add allow_listing_models configuration flag to VLLM provider to control model listing behavior
• Implement allow_listing_models() method across all providers with default implementations in base classes
• Prevent HTTP requests to /v1/models endpoint when allow_listing_models=false for improved security and performance
• Fix unit tests to include allow_listing_models method in test classes and mock objects
@akram akram force-pushed the add_allow_listing_models branch from d301389 to e9214f9 Compare October 3, 2025 22:18
@akram
Contributor Author

akram commented Oct 4, 2025

@ashwinb can you PTAL?

@akram akram changed the title from "feat: Add allow_listing_models" to "feat: Add enable_model_discovery to enable/disable model discovery on startup" Oct 6, 2025
• Add comprehensive error handling in check_model_availability method
• Provide helpful error messages with actionable solutions for 404 errors
• Warn when API token is set but model discovery is disabled
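
A rough sketch of what the two checks in this commit message could look like follows. It is an illustration under stated assumptions, not the code in the commit: the helper names `warn_if_token_unused` and `explain_status`, the logger name, and the message wording are all hypothetical. Only the ideas (a warning when a token is set while discovery is disabled, and an actionable message for 404 errors) come from the bullets above.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("vllm_sketch")


def warn_if_token_unused(api_token, enable_model_discovery):
    """Warn when a token is configured but discovery is off.

    Returns True if a warning was emitted, so callers and tests can check it.
    """
    if api_token and not enable_model_discovery:
        logger.warning(
            "api_token is set but enable_model_discovery is false; "
            "the token will not be used to list models on startup"
        )
        return True
    return False


def explain_status(status_code, model, url):
    """Turn an HTTP status from a model lookup into an actionable message."""
    if status_code == 404:
        return (
            f"Model '{model}' was not found at {url} (HTTP 404). "
            "Check that the URL points at the server's API root and that "
            "the model name matches what the server serves."
        )
    return f"Unexpected HTTP {status_code} while checking model '{model}' at {url}."
```

Returning the message as a string rather than raising keeps the sketch easy to test; a real implementation would likely raise or log at the call site.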
Collaborator

@mattf mattf left a comment

the logic around model management is already complex with multiple knobs.

can you use allowed_models instead?
