OLS-1790: Document the supported vLLM version in OLS doc #96298

Closed
7 changes: 5 additions & 2 deletions modules/ols-large-language-model-requirements.adoc
@@ -41,14 +41,17 @@ To use {azure-official} with {ols-official}, you need access to link:https://azu

{rhelai} is OpenAI API-compatible and is configured in a similar manner to the OpenAI provider.

You can configure {rhelai} as the (Large Language Model) LLM provider.
You can configure {rhelai} as the LLM provider.

Because {rhel} is in a different environment than the {ols-long} deployment, the model deployment must allow access using a secure connection. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_enterprise_linux_ai/1.2/html-single/building_your_rhel_ai_environment/index#creating_secure_endpoint[Optional: Allowing access to a model from a secure endpoint].

{ols-long} version 1.0 and later supports vLLM Server version 0.8.4. When self-hosting an LLM with {rhelai}, you can use vLLM Server as the inference engine for your model deployment.

[id="rhoai_{context}"]
== {rhoai}

{rhoai} is OpenAI API-compatible and is configured in largely the same way as the OpenAI provider.

You need a Large Language Model (LLM) deployed on the single model-serving platform of {rhoai} using the Virtual Large Language Model (vLLM) runtime. If the model deployment is in a different {ocp-short-name} environment than the {ols-long} deployment, the model deployment must include a route to expose it outside the cluster. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2-latest/html/serving_models/serving-large-models_serving-large-models#about-the-single-model-serving-platform_serving-large-models[About the single-model serving platform].
You need an LLM deployed on the single model-serving platform of {rhoai} using the Virtual Large Language Model (vLLM) runtime. If the model deployment is in a different {ocp-short-name} environment than the {ols-long} deployment, the model deployment must include a route to expose it outside the cluster. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2-latest/html/serving_models/serving-large-models_serving-large-models#about-the-single-model-serving-platform_serving-large-models[About the single-model serving platform].

{ols-long} version 1.0 and later supports vLLM Server version 0.8.4. When self-hosting an LLM with {rhoai}, you can use vLLM Server as the inference engine for your model deployment.
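As an illustrative sketch only, a self-hosted vLLM deployment (whether served from {rhelai} or {rhoai}) is typically referenced from the `OLSConfig` custom resource. The provider name, type, URL, secret name, and model name below are placeholders, and the exact field names and values should be verified against the published {ols-long} configuration documentation.

[source,yaml]
----
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
    - name: my_vllm_provider                # placeholder provider name
      type: rhoai_vllm                      # assumed provider type for a vLLM deployment on Red Hat OpenShift AI
      url: https://granite.example.com/v1   # placeholder URL of the exposed, secured vLLM endpoint
      credentialsSecretRef:
        name: vllm-api-token                # placeholder secret that holds the model API token
      models:
      - name: granite-3-8b-instruct         # placeholder name of the model deployed on vLLM Server
----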