
watsonx.ai inference provider is not working for me #3165

@jwm4

System Info

llama_stack==0.2.17
llama_stack_client==0.2.17

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

When I call chat completions or responses from the Llama Stack (LLS) client against a watsonx.ai model and provider, the client gets a 500 error and the server logs show a 'WatsonXInferenceAdapter' object has no attribute '_openai_client' error.
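
For reference, a minimal reproduction sketch. The base URL and the watsonx model ID below are placeholders; any model served by the watsonx.ai provider triggers the same error for me:

from llama_stack_client import LlamaStackClient

# Placeholder: local Llama Stack server started from my run.yaml.
client = LlamaStackClient(base_url="http://localhost:8321")

# Placeholder model ID for a model registered with the watsonx.ai provider.
LLAMA_STACK_MODEL_ID = "watsonx/meta-llama/llama-3-3-70b-instruct"

# This call returns a 500; the server log shows the
# 'WatsonXInferenceAdapter' object has no attribute '_openai_client' error.
client.chat.completions.create(
    model=LLAMA_STACK_MODEL_ID,
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)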

Error logs

INFO     2025-08-15 14:44:28,875 console_span_processor:65 telemetry:  18:44:28.848 [ERROR] Error executing endpoint route='/v1/openai/v1/responses'
         method='post': 'WatsonXInferenceAdapter' object has no attribute '_openai_client'

and also:

ERROR    2025-08-15 14:47:16,968 __main__:244 server: Error executing endpoint route='/v1/openai/v1/chat/completions' method='post':
         'WatsonXInferenceAdapter' object has no attribute '_openai_client'

Expected behavior

When I run

client.chat.completions.create(
    model=LLAMA_STACK_MODEL_ID,
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)

or

client.responses.create(
    model=LLAMA_STACK_MODEL_ID,
    input="What is the capital of France?"
)

with LLAMA_STACK_MODEL_ID referring to a watsonx model, I should get a valid response (since I have a watsonx.ai API key and have configured my run.yaml to use it).
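
As a sanity check, this sketch is how I confirm the watsonx model is registered with the server before calling it (attribute names assume the standard llama_stack_client Model object):

# List registered models and their providers on the same client.
for model in client.models.list():
    print(model.identifier, model.provider_id)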
