chore: turn OpenAIMixin into a pydantic.BaseModel #3671
Conversation
…aking _model_cache details
241e77c to dc97160 (Compare)
…c.BaseModel

- implement get_api_key instead of relying on LiteLLMOpenAIMixin.get_api_key
- remove use of LiteLLMOpenAIMixin
- add default initialize/shutdown methods to OpenAIMixin
- remove __init__s to allow proper pydantic construction
- remove dead code from vllm adapter and associated / duplicate unit tests
- update vllm adapter to use OpenAIMixin for model registration
- remove ModelRegistryHelper from fireworks & together adapters
- remove Inference from nvidia adapter
- complete type hints on embedding_model_metadata
- allow extra fields on OpenAIMixin, for model_store, __provider_id__, etc.
- new recordings for ollama
- enhance the list models error handling w/ new tests
- update cerebras (remove cerebras-cloud-sdk) and anthropic (custom model listing) inference adapters
- parametrized test_inference_client_caching
- remove cerebras, databricks, fireworks, together from blanket mypy exclude
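For orientation, a rough sketch of the mixin shape the change list above describes. The exact field names, method signatures, and the embedding metadata type here are assumptions for illustration, not the repository's final code:

```python
# Illustrative sketch only -- shows the pydantic-based structure described in the
# change list above, not the exact llama_stack implementation.
from abc import abstractmethod

from pydantic import BaseModel, ConfigDict


class OpenAIMixin(BaseModel):
    # extra="allow" so runtime-only attributes (model_store, __provider_id__, ...)
    # can still be attached to the instance after pydantic construction
    model_config = ConfigDict(extra="allow")

    # completed type hint for embedding metadata (assumed shape),
    # e.g. {"nomic-embed-text": {"embedding_dimension": 768}}
    embedding_model_metadata: dict[str, dict[str, int]] = {}

    @abstractmethod
    def get_api_key(self) -> str:
        """Each adapter supplies its own key instead of inheriting LiteLLMOpenAIMixin.get_api_key."""
        ...

    async def initialize(self) -> None:
        # default no-op lifecycle hooks, so adapters no longer need a custom __init__ / setup
        pass

    async def shutdown(self) -> None:
        pass
```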
dc97160 to fd06717 (Compare)
this provides a foundation for #3517
this lgtm as you answered the open questions I had in the PR description, but i'll wait for someone else who has spent more time on inference to do another pass. but it looks like a rebase is needed.
missing 2 files (vllm.py and openai_mixin.py)
adapter_type="anthropic",
provider_type="remote::anthropic",
pip_packages=["litellm"],
pip_packages=["litellm", "anthropic"],
Do we still need litellm here? I only see OpenAIMixin being used?
good catch. i'll remove all the others too.
done
from urllib.parse import urljoin

from llama_stack.apis.inference import ChatCompletionRequest
from llama_stack.providers.utils.inference.litellm_openai_mixin import (
can we remove litellm from RemoteProviderSpec in the registry?
done
)
api_token: SecretStr = Field(
default=SecretStr(None),
default=SecretStr(None),  # type: ignore[arg-type]
Why not set default=SecretStr("") and avoid the type ignore comment? (yes I know, that same discussion again)
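For reference, a minimal sketch of the two mypy-clean alternatives being discussed; the config class, field, and env var names below are placeholders, not the adapter's real ones:

```python
import os

from pydantic import BaseModel, Field, SecretStr


class ExampleConfig(BaseModel):
    # option 1: empty-string sentinel, as suggested above -- no ignore needed
    api_token: SecretStr = SecretStr("")

    # option 2: defer to the environment at construction time, still typed cleanly
    api_key: SecretStr = Field(default_factory=lambda: SecretStr(os.getenv("EXAMPLE_API_KEY", "")))


cfg = ExampleConfig()
assert isinstance(cfg.api_token, SecretStr) and isinstance(cfg.api_key, SecretStr)
```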
i hope to handle this later. i removed the blanket exclude so we can see the specific instances.
)
api_key: SecretStr = Field(
default=SecretStr(os.environ.get("CEREBRAS_API_KEY")),
default=SecretStr(os.environ.get("CEREBRAS_API_KEY")),  # type: ignore[arg-type]
default=SecretStr(os.environ.get("CEREBRAS_API_KEY")),  # type: ignore[arg-type]
default_factory=lambda: SecretStr(os.getenv("CEREBRAS_API_KEY", ""))
And mypy will be happy
i removed the blanket excludes. i want to avoid rolling more things into this PR.
'Pass Fireworks API Key in the header X-LlamaStack-Provider-Data as { "fireworks_api_key": <your api key>}'
)
return provider_data.fireworks_api_key
return self.config.api_key.get_secret_value() if self.config.api_key else None  # type: ignore[return-value]
why do you need the type ignore comment?
i removed the blanket excludes to expose specific places that need improvement
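If it helps that later cleanup, here is one possible method-level sketch of how the fireworks get_api_key could drop the ignore by never returning None; the get_request_provider_data call and the control flow are assumed from the snippet above, not copied from the final code:

```python
def get_api_key(self) -> str:
    # Sketch: prefer the per-request key, otherwise fall back to config,
    # and raise instead of returning None so the `-> str` annotation holds.
    provider_data = self.get_request_provider_data()
    if provider_data and provider_data.fireworks_api_key:
        return provider_data.fireworks_api_key
    if self.config.api_key is None:
        raise ValueError(
            "Pass Fireworks API Key in the header X-LlamaStack-Provider-Data "
            'as { "fireworks_api_key": <your api key> }'
        )
    return self.config.api_key.get_secret_value()
```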
response.usage = OpenAIEmbeddingUsage(prompt_tokens=-1, total_tokens=-1)

return response
return response  # type: ignore[no-any-return]
can you leave a comment about why we are ignoring?
i removed the blanket exclude
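As a side note on the question above, a minimal standalone illustration of why mypy raises no-any-return here and what an explained ignore (or a cast) could look like; nothing below is the adapter's actual code:

```python
from typing import Any, cast


def sdk_call() -> Any:
    # stand-in for a client call whose type stubs return Any
    return {"data": [], "usage": None}


def get_response_with_ignore() -> dict:
    response = sdk_call()
    # the value is typed Any, so returning it from a `-> dict` function trips
    # no-any-return; an in-line comment like this one is what the reviewer asked for
    return response  # type: ignore[no-any-return]


def get_response_with_cast() -> dict:
    # alternative: cast once so the return type is concrete and no ignore is needed
    return cast(dict, sdk_call())
```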
@leseb what do you mean?
Sorry, I typed too fast, I meant that I still haven't gone through those two files :)
HUGE! Thanks for this one!
A few questions:
- can we remove llama_stack/providers/utils/inference/litellm_openai_mixin.py now (along with test files)?
- can we remove litellm from pyproject after that too?
unfortunately we cannot. the watsonx adapter is being updated and will rely on LiteLLMOpenAIMixin because watsonx does not provide an openai-compat endpoint.
What does this PR do?
Test Plan
ci