Skip to content

chore(ci): change CI to use ogx_open_client#6016

Open
aegeiger wants to merge 21 commits into
ogx-ai:mainfrom
aegeiger:chore/ci_using_ogx_open_client
Open

chore(ci): change CI to use ogx_open_client#6016
aegeiger wants to merge 21 commits into
ogx-ai:mainfrom
aegeiger:chore/ci_using_ogx_open_client

Conversation

@aegeiger

@aegeiger aegeiger commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Fixes multiple integration test failures when running with the OpenAPI-generated SDK (ogx_open_client) instead of the Stainless SDK (ogx_client). This PR switches the integration test suite to use the new SDK and resolves all compatibility issues discovered during the transition.

Changes by commit:

  1. fix(ci): Fix the OpenAPI SDK publish workflow to use OIDC trusted publishing instead of unconfigured secret-based auth.

  2. chore(client-sdks): Switch all integration tests and the library client from ogx_client to ogx_open_client. Add fallback import chains for backward compatibility. Update model class references in test_rerank.py and adapt test_openai_vector_store_file_contents_with_extra_query for the new SDK's direct query parameter convention (vs. the OpenAI SDK's extra_query).

  3. fix(client-sdks): Fix dict serialization in multipart form requests. The generated SDK was sending dicts as JSON strings, but the OGX server expects bracket-notation (key[subkey]=value). Affects expires_after and similar nested fields on file upload.

  4. fix(client-sdks): Add a ChoiceDeltaToolCall streaming delta variant in the OpenAPI spec. The spec previously used the same ChatCompletionMessageToolCall schema (all fields required, no index) for both complete responses and streaming deltas, but streaming deltas on the wire include index and omit id/type/function on continuation chunks. This formalizes the de-facto protocol without changing anything on the wire.

  5. fix(client-sdks): Fix enum deserialization in anyOf validators. The generated actual_instance_must_validate_anyof used isinstance() to check variant types, which failed for str,Enum types (e.g., OpenAIResponseInputToolChoiceMode) because a bare string like "auto" is not an isinstance of the enum class. Add a fallback that tries constructing the type from str/int values.

  6. fix(tests): Fix four bugs in vision inference integration tests: guard against None in streaming delta.content, fix multi_image_data fixture reference (was using the function instead of the resolved value), fix wrong streaming API call (chunk.event.delta.textchunk.choices[0].delta.content), and update stale GitHub image URL.

Test Plan

Integration tests were run against a live OpenAI backend with the ci-tests distribution:

# Ensure the generated SDK is installed with all template fixes
cp -r client-sdks/openapi/sdks/python/ogx_open_client/* \
  .venv/lib64/python3.12/site-packages/ogx_open_client/

# Run the full base test suite
uv run --no-sync ./scripts/integration-tests.sh \
  --stack-config server:ci-tests --setup gpt --suite base --inference-mode live

Results: 22 failures (down from 29 baseline), 0 regressions.

All 22 remaining failures are pre-existing and unrelated:

  • 16x ranker/hybrid search returning empty results (keyword indexing issue across backends)
  • 6x legacy /v1/completions endpoint 404 (not served by the ci-tests distribution with gpt setup — not a real CI configuration)

Tests fixed by this PR (7 resolved):

Test Root cause Fix
test_openai_vector_store_file_contents_with_extra_query[client_with_models-*] (x6 backends) Generated SDK rejects extra_query kwarg _get_file_content helper adapts params per client type
test_inference_store_tool_calls[client_with_models-True] Streaming tool call chunks deserialized as raw dicts ChoiceDeltaToolCall schema with relaxed fields
test_list_response_input_items[client_with_models] tool_choice: "auto" fails anyOf enum validation Try enum construction for str/int in validator
test_image_chat_completion_multiple_images[True] None delta.content on streaming chunks or "" guard
test_image_chat_completion_multiple_images[False] multi_image_data fixture function used as value Changed to multi_image_url
test_image_chat_completion_streaming .lower() called on None delta.content if delta_content: guard
test_openai_chat_completion_image_url Stale GitHub URL (meta-llama/ogx) Updated to ogx-ai/ogx

aegeiger added 6 commits June 4, 2026 10:27
Replace token-based authentication (PYPI_API_TOKEN, TEST_PYPI_API_TOKEN)
with OIDC trusted publishing via pypa/gh-action-pypi-publish, matching
the approach used by the main pypi.yml workflow.

Split the single publish-sdk job into three jobs:
- build-sdk: generates and builds the package
- publish-testpypi: publishes to TestPyPI (environment: testpypi)
- publish-pypi: publishes to production PyPI (no environment)

The previous workflow referenced secrets that were never configured and
used a pypi-production environment that did not exist in the repository.

Signed-off-by: E Geiger <egeiger@redhat.com>
…i-generated SDK

Replace ogx_client (stainless-generated) with ogx_open_client
(openapi-generated) as the SDK used by integration tests and the library
client. Add fallback import chains in library_client.py and api_recorder.py
for backward compatibility with the stainless package.

Update test_rerank.py for model class names in the new SDK.

Adapt test_openai_vector_store_file_contents_with_extra_query to pass
include_embeddings/include_metadata as direct parameters when using the
generated SDK, which exposes them as first-class query parameters rather
than requiring the OpenAI SDK extra_query convention.

Signed-off-by: E Geiger <egeiger@redhat.com>
There are two valid ways to serialize a dict in multipart requests:
as a JSON string, or as bracket-notation key-value pairs (e.g.
expires_after[anchor]=created_at&expires_after[seconds]=3600). The OGX
server only supports the bracket-notation form.

Fix the generated SDK REST client to use bracket-notation flattening
for dict values in multipart form data instead of json.dumps.

Signed-off-by: E Geiger <egeiger@redhat.com>
…zation

The OpenAPI spec used ChatCompletionMessageToolCall for both complete
responses and streaming deltas. However, on the wire streaming deltas
include an index field and omit id/type/function on continuation chunks.
The spec did not reflect this de-facto protocol, causing the generated
SDK to fail when deserializing streaming tool call chunks.

Formalize the existing streaming wire format by creating a separate
ChoiceDeltaToolCall schema: index is required for correlating chunks,
while id, type, and function are optional. Point OpenAIChoiceDelta
tool_calls to the new schema. This does not change anything on the wire,
it aligns the spec with what was already being sent.

Signed-off-by: E Geiger <egeiger@redhat.com>
…str/int values

The anyOf validator in generated SDK models used isinstance() to check if
a value matched a variant type. This failed for str,Enum types like
OpenAIResponseInputToolChoiceMode because a bare string like "auto" is not
an isinstance of the enum class, even though the enum can be constructed
from it.

Add a fallback in the non-primitive branch that attempts to construct the
target type from str/int values before rejecting. This correctly handles
enum construction (e.g. OpenAIResponseInputToolChoiceMode("auto")) while
leaving all other type checks unchanged.

Signed-off-by: E Geiger <egeiger@redhat.com>
Fix four issues in vision inference tests:

- Guard against None delta.content in streaming chunks by using
  "or empty string" fallback, preventing TypeError on concatenation
- Fix test_image_chat_completion_streaming to check delta.content
  before calling .lower(), preventing AttributeError on None
- Fix multi_image_data[2] reference to multi_image_url[2] in
  test_image_chat_completion_multiple_images, which was referencing
  the fixture function instead of the resolved fixture value
- Fix chunk.event.delta.text to chunk.choices[0].delta.content in
  the second streaming loop of test_image_chat_completion_multiple_images
- Update stale GitHub URL from meta-llama/ogx to ogx-ai/ogx in
  test_openai_chat_completion_image_url

Signed-off-by: E Geiger <egeiger@redhat.com>
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

✱ Stainless preview builds

This PR will update the llama-stack-client SDKs with the following commit message.

chore(ci): change CI to use ogx_open_client

Edit this comment to update it. It will appear in the SDK's changelogs.

llama-stack-client-openapi studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️

llama-stack-client-python studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️build ✅lint ✅test ✅

pip install https://pkg.stainless.com/s/llama-stack-client-python/d1f83fbc18628dc6794757523de03a64c043b339/ogx_client-1.1.0a3-py3-none-any.whl
llama-stack-client-node studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️build ✅lint ✅test ✅

npm install https://pkg.stainless.com/s/llama-stack-client-node/116d7eef1e90fcc3bf70ae6c5516cf8ef82553b9/dist.tar.gz
llama-stack-client-go studio · conflict

Your SDK build had at least one new note diagnostic, which is a regression from the base state.

New diagnostics (1 note)
💡 Schema/EnumHasOneMember: Confirm intentional use of `enum` with single member.

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-06-11 13:25:42 UTC

aegeiger and others added 10 commits June 4, 2026 14:15
The validation workflow was generating the SDK as ogx_client (OPEN=0)
but the codebase now imports from ogx_open_client. Update to OPEN=1 and
fix all references from ogx_client to ogx_open_client in the install
and verification steps.

Signed-off-by: E Geiger <egeiger@redhat.com>
…rary client

OGXAsLibraryClient constructs APIResponse/AsyncAPIResponse with
stainless-specific kwargs (raw, client, cast_to, options, stream,
stream_cls) and calls .parse() to deserialize responses. The
ogx_open_client ApiResponse had a different constructor and no parse
method, causing mypy errors and runtime failures in library mode.

Add stainless-compatible constructors and .parse() methods to the
api_response and async_api_response templates. For non-streaming, parse
deserializes JSON via from_dict/model_validate. For streaming, parse
constructs the stream class with the raw response. Also make
AsyncStream accept client=None for library client usage.

Add type: ignore on the ogx_client fallback import in library_client.py
since both packages may be installed with incompatible type signatures.

Signed-off-by: E Geiger <egeiger@redhat.com>
…utation

The OpenAI SDK treats extra_body as a dict whose keys are merged into
the request body at the top level. The generated SDK was storing
extra_body as a literal field name, so OGX-specific extensions like
embedding_model and provider_id never reached the server. This caused
vector store creation to miss embedding config, breaking keyword and
hybrid search.

Add extra_body merging in partial_api.mustache: pop extra_body from
kwargs and merge its contents before constructing the request model.

Also fix a bug in model_generic.mustache where __getattribute__
unconditionally copied dict values, preventing mutation of
additional_properties. Only create a new dict when OneOf/AnyOf
unwrapping is actually needed.

Signed-off-by: E Geiger <egeiger@redhat.com>
The model template emits __iter__/__getitem__/__len__ methods only for
the first array field in a model. Several list response schemas had
non-array fields (like object) before the data field, causing the
template to skip iteration methods. This made iterating a response
yield (field_name, value) tuples instead of data items.

Add reorder_data_field_first() in build_hierarchy.py to ensure the data
property comes first in any schema with an array data field.

Also update uv.lock to ogx-open-client 1.0.3.dev4.

Signed-off-by: E Geiger <egeiger@redhat.com>
Co-Authored-By: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…fully

The api_recorder monkey-patches OgxClient._prepare_request to inject
test IDs into request headers. The patched function calls the stored
original via _original_methods dict. When ogx_open_client and ogx_client
are both installed, a class identity mismatch can cause the patch to be
applied to one OgxClient class while the originals are stored from
another, leaving _original_methods without the expected keys.

Use dict.get() instead of direct key access so the patched function
handles missing originals gracefully. The originals are no-op methods
(return None), so skipping them is safe.

Signed-off-by: E Geiger <egeiger@redhat.com>
…ll_api patch

OGXAsLibraryClient runs the OGX server in-process without HTTP. The
stainless SDK routed requests through a self.request() override, but
ogx_open_client uses a different call chain: API methods call
api_client.call_api() which delegates to rest_client.request() and
httpx. The request() override was never invoked, causing all library
mode API calls to fail with network errors.

Patch api_client.call_api() on initialization to intercept requests
before they reach httpx. The patched method:
- Parses the URL to extract the path and query parameters
- Matches the route to a FastAPI endpoint handler via find_matching_route
- Converts the request body to proper function kwargs via _convert_body
- Executes the handler directly (in-process, async via event loop)
- Wraps the result in a RESTResponse for the SDK to deserialize

Handles non-streaming JSON responses, streaming SSE responses, FastAPI
Response objects, and multipart file uploads.

Signed-off-by: E Geiger <egeiger@redhat.com>
The rerank tests asserted isinstance(response, list) but ogx_open_client
returns a RerankResponse object (which is iterable and supports len/indexing)
rather than a bare list. Replace isinstance checks with len() >= 0 assertions
that verify the response is sequence-like without requiring a specific type.

Signed-off-by: E Geiger <egeiger@redhat.com>
…K compatibility

Langchain and langgraph integrations access client.api_key to configure
themselves. The stainless SDK OgxClient had this as a class attribute.
Add api_key as an optional constructor parameter to both OgxClient and
AsyncOgxClient templates, stored as an instance attribute for
compatibility.

Signed-off-by: E Geiger <egeiger@redhat.com>
@aegeiger aegeiger marked this pull request as ready for review June 4, 2026 14:56

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 6 additional findings in Devin Review.

Open in Devin Review

Comment thread src/ogx/core/library_client.py Outdated
from urllib.parse import urlparse

from fastapi.responses import StreamingResponse
from ogx_open_client.rest import RESTResponse

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Hard-coded ogx_open_client import breaks fallback to ogx_client

The top-level imports at src/ogx/core/library_client.py:30-52 implement a fallback mechanism: try ogx_open_client first, then fall back to ogx_client. However, the inline import at line 415 (from ogx_open_client.rest import RESTResponse) is hard-coded to only use ogx_open_client. When a user has only ogx_client installed (the ogx[client] extra), the top-level import succeeds via the fallback, but the first call to _async_in_process_call will raise an ImportError because ogx_open_client doesn't exist. This breaks the OGXAsLibraryClient for anyone using the old ogx_client package.

Prompt for agents
In src/ogx/core/library_client.py, the inline import at line 415 hard-codes `from ogx_open_client.rest import RESTResponse`, but the top-level imports (lines 30-52) have a fallback mechanism that tries ogx_open_client first, then falls back to ogx_client. The inline import needs to follow the same pattern. One approach: at the top of the file (near lines 30-52), also import RESTResponse through the same try/except fallback and store it, or use a helper that resolves the correct module at import time. Alternatively, replicate the try/except pattern inline: try importing from ogx_open_client.rest, and if that fails, import from ogx_client.rest.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +148 to +150
needs: build-sdk
runs-on: ubuntu-latest
permissions:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Missing environment in publish-pypi job breaks OIDC trusted publishing

The publish-pypi job is missing the environment configuration required for OIDC trusted publishing. The old workflow had environment: name: ${{ ... 'pypi-production' ... }} which resolved to pypi-production for PyPI publishes. The refactored publish-testpypi job correctly sets environment: testpypi (line 113), but publish-pypi (line 143) has no environment at all. PyPI's trusted publisher configuration requires the GitHub environment name to match; without it, the pypa/gh-action-pypi-publish OIDC token exchange will fail and production PyPI publishing will be broken.

Suggested change
needs: build-sdk
runs-on: ubuntu-latest
permissions:
needs: build-sdk
runs-on: ubuntu-latest
environment: pypi-production
permissions:
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually the trusted environment in pypi is "" so it will accept anything. As far as I can tell the workflow that's pushing ogx package is an empty environment so I wanted to create the option to scope things better in the future.

…patibility

The call_api patch in OGXAsLibraryClient.__init__ unconditionally
accessed self.api_client.call_api, but the stainless SDK OgxClient
does not have an api_client attribute. Guard with hasattr checks so
the fallback to the stainless SDK (which uses the request() override
instead) still works.

Also add try/except fallback for the ogx_open_client.rest import in
_async_in_process_call to avoid ModuleNotFoundError when only the
stainless SDK is installed.

Signed-off-by: E Geiger <egeiger@redhat.com>
@aegeiger aegeiger force-pushed the chore/ci_using_ogx_open_client branch from 57ebe0d to 64ccd26 Compare June 7, 2026 12:22
@aegeiger

aegeiger commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

I think that the failing vllm test is a fluke, but seems like I can't re-run, would appreciate a maintainer's help.
As kind-of evidence to support this claim, I had this workflow fail on my fork and a re-run had it passing https://github.com/aegeiger/ogx/actions/runs/27092454314/job/80030723690?pr=1
It also passed here for a previous commit and I don't think the last commit should have introduced this error.

@mergify

mergify Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

This pull request has merge conflicts that must be resolved before it can be merged. @aegeiger please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Jun 9, 2026
@mergify mergify Bot removed the needs-rebase label Jun 9, 2026
echo "Installing OpenAPI-generated SDK (ogx_client)..."
# Uninstall existing client (note: uv pip uninstall doesn't support -y flag)
uv pip uninstall ogx-client || true
echo "Installing OpenAPI-generated SDK (ogx_open_client)..."

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no action required since it was there before, the echo is a little misleading since we first uninstall then install

@cdoern

cdoern commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

I am planning on cutting 1.1.0 tomorrow ..... should we hold off on this until then?

@leseb leseb left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

holding until we cut the release tomorrow - cc @cdoern

@skamenan7 skamenan7 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR, couple of nits. Thanks.

try:
v = {{{dataType}}}(v)
return v
except Exception:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The except Exception: pass in the new enum coercion path swallows everything. Can you narrow this to except (ValueError, KeyError) to match the expected failure modes? Unexpected errors like TypeError from a broken constructor would be silently eaten.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method tries to match a value with several possible types when we deserialize something, e.g. response. When we don't have clear way of knowing what type should be used, we basically try everything and see what sticks. Now, let's assume we have two legitimate types, one is an IntEnum and one is StrEnum. IIUC (tested loaclly) if we try to construct the IntEnum with an str we get ValueError. But what we want in such a case is to add it to the error list and move on to the next possiblity.

This is just a long and winded way to say that I think in this specific case we do want a broad except, but if you disagree with this reasoning I can do except (ValueError, KeyError, TypeError)

Comment thread src/ogx/core/library_client.py Outdated
from urllib.parse import parse_qs

query_params = {k: v[0] if len(v) == 1 else v for k, v in parse_qs(query_string).items()}
request_body.update(query_params)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like request_body.update(query_params) would silently overwrite any body field that shares a name with a query param. If a query param and body field collide, the query param wins without warning. Could you add a check or at least log when an overwrite happens?

response = client_with_models.alpha.inference.rerank(model=rerank_model_id, query=query, items=items)
assert isinstance(response, list)
# TODO: Add type validation for response items once InferenceRerankResponseItem is exported from ogx client.
assert len(response) >= 0, f"Expected a list-like response, got {type(response)}"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assert len(response) >= 0 might not be ideal since len() can't return negative. The old isinstance check was doing more work. Would something like assert len(response) > 0 or a type check against the new response type work better?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very confusing way to test that the response is acting like a list. It would have made more sense to do something like assert hasattr(response, '__len__') and callable(hasattr(response, '__len__')). I didn't want to check anything about the length, just that there is one, because in stainless they returned a list[Something] and now we return something that acts like a list but its type is SomethingList.

…n library client

When the library client flattens query params and body fields into a
single dict for direct handler invocation, body params now take
precedence over query params on name collisions. A warning is logged
when a collision is detected. Also fix misleading echo in the SDK
install workflow step.

Signed-off-by: E Geiger <egeiger@redhat.com>
@cdoern cdoern enabled auto-merge June 11, 2026 13:22
@mergify

mergify Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

This pull request has merge conflicts that must be resolved before it can be merged. @aegeiger please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants