chore(ci): change CI to use ogx_open_client#6016
Conversation
Replace token-based authentication (PYPI_API_TOKEN, TEST_PYPI_API_TOKEN) with OIDC trusted publishing via pypa/gh-action-pypi-publish, matching the approach used by the main pypi.yml workflow. Split the single publish-sdk job into three jobs: - build-sdk: generates and builds the package - publish-testpypi: publishes to TestPyPI (environment: testpypi) - publish-pypi: publishes to production PyPI (no environment) The previous workflow referenced secrets that were never configured and used a pypi-production environment that did not exist in the repository. Signed-off-by: E Geiger <egeiger@redhat.com>
…i-generated SDK Replace ogx_client (stainless-generated) with ogx_open_client (openapi-generated) as the SDK used by integration tests and the library client. Add fallback import chains in library_client.py and api_recorder.py for backward compatibility with the stainless package. Update test_rerank.py for model class names in the new SDK. Adapt test_openai_vector_store_file_contents_with_extra_query to pass include_embeddings/include_metadata as direct parameters when using the generated SDK, which exposes them as first-class query parameters rather than requiring the OpenAI SDK extra_query convention. Signed-off-by: E Geiger <egeiger@redhat.com>
There are two valid ways to serialize a dict in multipart requests: as a JSON string, or as bracket-notation key-value pairs (e.g. expires_after[anchor]=created_at&expires_after[seconds]=3600). The OGX server only supports the bracket-notation form. Fix the generated SDK REST client to use bracket-notation flattening for dict values in multipart form data instead of json.dumps. Signed-off-by: E Geiger <egeiger@redhat.com>
…zation The OpenAPI spec used ChatCompletionMessageToolCall for both complete responses and streaming deltas. However, on the wire streaming deltas include an index field and omit id/type/function on continuation chunks. The spec did not reflect this de-facto protocol, causing the generated SDK to fail when deserializing streaming tool call chunks. Formalize the existing streaming wire format by creating a separate ChoiceDeltaToolCall schema: index is required for correlating chunks, while id, type, and function are optional. Point OpenAIChoiceDelta tool_calls to the new schema. This does not change anything on the wire, it aligns the spec with what was already being sent. Signed-off-by: E Geiger <egeiger@redhat.com>
…str/int values
The anyOf validator in generated SDK models used isinstance() to check if
a value matched a variant type. This failed for str,Enum types like
OpenAIResponseInputToolChoiceMode because a bare string like "auto" is not
an isinstance of the enum class, even though the enum can be constructed
from it.
Add a fallback in the non-primitive branch that attempts to construct the
target type from str/int values before rejecting. This correctly handles
enum construction (e.g. OpenAIResponseInputToolChoiceMode("auto")) while
leaving all other type checks unchanged.
Signed-off-by: E Geiger <egeiger@redhat.com>
Fix four issues in vision inference tests: - Guard against None delta.content in streaming chunks by using "or empty string" fallback, preventing TypeError on concatenation - Fix test_image_chat_completion_streaming to check delta.content before calling .lower(), preventing AttributeError on None - Fix multi_image_data[2] reference to multi_image_url[2] in test_image_chat_completion_multiple_images, which was referencing the fixture function instead of the resolved fixture value - Fix chunk.event.delta.text to chunk.choices[0].delta.content in the second streaming loop of test_image_chat_completion_multiple_images - Update stale GitHub URL from meta-llama/ogx to ogx-ai/ogx in test_openai_chat_completion_image_url Signed-off-by: E Geiger <egeiger@redhat.com>
✱ Stainless preview buildsThis PR will update the Edit this comment to update it. It will appear in the SDK's changelogs. ✅ llama-stack-client-openapi studio · code · diff
✅ llama-stack-client-python studio · code · diff
✅ llama-stack-client-node studio · code · diff
✅ llama-stack-client-go studio · conflict
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push. |
The validation workflow was generating the SDK as ogx_client (OPEN=0) but the codebase now imports from ogx_open_client. Update to OPEN=1 and fix all references from ogx_client to ogx_open_client in the install and verification steps. Signed-off-by: E Geiger <egeiger@redhat.com>
…rary client OGXAsLibraryClient constructs APIResponse/AsyncAPIResponse with stainless-specific kwargs (raw, client, cast_to, options, stream, stream_cls) and calls .parse() to deserialize responses. The ogx_open_client ApiResponse had a different constructor and no parse method, causing mypy errors and runtime failures in library mode. Add stainless-compatible constructors and .parse() methods to the api_response and async_api_response templates. For non-streaming, parse deserializes JSON via from_dict/model_validate. For streaming, parse constructs the stream class with the raw response. Also make AsyncStream accept client=None for library client usage. Add type: ignore on the ogx_client fallback import in library_client.py since both packages may be installed with incompatible type signatures. Signed-off-by: E Geiger <egeiger@redhat.com>
…utation The OpenAI SDK treats extra_body as a dict whose keys are merged into the request body at the top level. The generated SDK was storing extra_body as a literal field name, so OGX-specific extensions like embedding_model and provider_id never reached the server. This caused vector store creation to miss embedding config, breaking keyword and hybrid search. Add extra_body merging in partial_api.mustache: pop extra_body from kwargs and merge its contents before constructing the request model. Also fix a bug in model_generic.mustache where __getattribute__ unconditionally copied dict values, preventing mutation of additional_properties. Only create a new dict when OneOf/AnyOf unwrapping is actually needed. Signed-off-by: E Geiger <egeiger@redhat.com>
The model template emits __iter__/__getitem__/__len__ methods only for the first array field in a model. Several list response schemas had non-array fields (like object) before the data field, causing the template to skip iteration methods. This made iterating a response yield (field_name, value) tuples instead of data items. Add reorder_data_field_first() in build_hierarchy.py to ensure the data property comes first in any schema with an array data field. Also update uv.lock to ogx-open-client 1.0.3.dev4. Signed-off-by: E Geiger <egeiger@redhat.com>
Co-Authored-By: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…fully The api_recorder monkey-patches OgxClient._prepare_request to inject test IDs into request headers. The patched function calls the stored original via _original_methods dict. When ogx_open_client and ogx_client are both installed, a class identity mismatch can cause the patch to be applied to one OgxClient class while the originals are stored from another, leaving _original_methods without the expected keys. Use dict.get() instead of direct key access so the patched function handles missing originals gracefully. The originals are no-op methods (return None), so skipping them is safe. Signed-off-by: E Geiger <egeiger@redhat.com>
…ll_api patch OGXAsLibraryClient runs the OGX server in-process without HTTP. The stainless SDK routed requests through a self.request() override, but ogx_open_client uses a different call chain: API methods call api_client.call_api() which delegates to rest_client.request() and httpx. The request() override was never invoked, causing all library mode API calls to fail with network errors. Patch api_client.call_api() on initialization to intercept requests before they reach httpx. The patched method: - Parses the URL to extract the path and query parameters - Matches the route to a FastAPI endpoint handler via find_matching_route - Converts the request body to proper function kwargs via _convert_body - Executes the handler directly (in-process, async via event loop) - Wraps the result in a RESTResponse for the SDK to deserialize Handles non-streaming JSON responses, streaming SSE responses, FastAPI Response objects, and multipart file uploads. Signed-off-by: E Geiger <egeiger@redhat.com>
The rerank tests asserted isinstance(response, list) but ogx_open_client returns a RerankResponse object (which is iterable and supports len/indexing) rather than a bare list. Replace isinstance checks with len() >= 0 assertions that verify the response is sequence-like without requiring a specific type. Signed-off-by: E Geiger <egeiger@redhat.com>
…K compatibility Langchain and langgraph integrations access client.api_key to configure themselves. The stainless SDK OgxClient had this as a class attribute. Add api_key as an optional constructor parameter to both OgxClient and AsyncOgxClient templates, stored as an instance attribute for compatibility. Signed-off-by: E Geiger <egeiger@redhat.com>
| from urllib.parse import urlparse | ||
|
|
||
| from fastapi.responses import StreamingResponse | ||
| from ogx_open_client.rest import RESTResponse |
There was a problem hiding this comment.
🔴 Hard-coded ogx_open_client import breaks fallback to ogx_client
The top-level imports at src/ogx/core/library_client.py:30-52 implement a fallback mechanism: try ogx_open_client first, then fall back to ogx_client. However, the inline import at line 415 (from ogx_open_client.rest import RESTResponse) is hard-coded to only use ogx_open_client. When a user has only ogx_client installed (the ogx[client] extra), the top-level import succeeds via the fallback, but the first call to _async_in_process_call will raise an ImportError because ogx_open_client doesn't exist. This breaks the OGXAsLibraryClient for anyone using the old ogx_client package.
Prompt for agents
In src/ogx/core/library_client.py, the inline import at line 415 hard-codes `from ogx_open_client.rest import RESTResponse`, but the top-level imports (lines 30-52) have a fallback mechanism that tries ogx_open_client first, then falls back to ogx_client. The inline import needs to follow the same pattern. One approach: at the top of the file (near lines 30-52), also import RESTResponse through the same try/except fallback and store it, or use a helper that resolves the correct module at import time. Alternatively, replicate the try/except pattern inline: try importing from ogx_open_client.rest, and if that fails, import from ogx_client.rest.
Was this helpful? React with 👍 or 👎 to provide feedback.
| needs: build-sdk | ||
| runs-on: ubuntu-latest | ||
| permissions: |
There was a problem hiding this comment.
🔴 Missing environment in publish-pypi job breaks OIDC trusted publishing
The publish-pypi job is missing the environment configuration required for OIDC trusted publishing. The old workflow had environment: name: ${{ ... 'pypi-production' ... }} which resolved to pypi-production for PyPI publishes. The refactored publish-testpypi job correctly sets environment: testpypi (line 113), but publish-pypi (line 143) has no environment at all. PyPI's trusted publisher configuration requires the GitHub environment name to match; without it, the pypa/gh-action-pypi-publish OIDC token exchange will fail and production PyPI publishing will be broken.
| needs: build-sdk | |
| runs-on: ubuntu-latest | |
| permissions: | |
| needs: build-sdk | |
| runs-on: ubuntu-latest | |
| environment: pypi-production | |
| permissions: |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
actually the trusted environment in pypi is "" so it will accept anything. As far as I can tell the workflow that's pushing ogx package is an empty environment so I wanted to create the option to scope things better in the future.
…patibility The call_api patch in OGXAsLibraryClient.__init__ unconditionally accessed self.api_client.call_api, but the stainless SDK OgxClient does not have an api_client attribute. Guard with hasattr checks so the fallback to the stainless SDK (which uses the request() override instead) still works. Also add try/except fallback for the ogx_open_client.rest import in _async_in_process_call to avoid ModuleNotFoundError when only the stainless SDK is installed. Signed-off-by: E Geiger <egeiger@redhat.com>
57ebe0d to
64ccd26
Compare
|
I think that the failing vllm test is a fluke, but seems like I can't re-run, would appreciate a maintainer's help. |
|
This pull request has merge conflicts that must be resolved before it can be merged. @aegeiger please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork |
| echo "Installing OpenAPI-generated SDK (ogx_client)..." | ||
| # Uninstall existing client (note: uv pip uninstall doesn't support -y flag) | ||
| uv pip uninstall ogx-client || true | ||
| echo "Installing OpenAPI-generated SDK (ogx_open_client)..." |
There was a problem hiding this comment.
nit: no action required since it was there before, the echo is a little misleading since we first uninstall then install
|
I am planning on cutting 1.1.0 tomorrow ..... should we hold off on this until then? |
skamenan7
left a comment
There was a problem hiding this comment.
Great PR, couple of nits. Thanks.
| try: | ||
| v = {{{dataType}}}(v) | ||
| return v | ||
| except Exception: |
There was a problem hiding this comment.
The except Exception: pass in the new enum coercion path swallows everything. Can you narrow this to except (ValueError, KeyError) to match the expected failure modes? Unexpected errors like TypeError from a broken constructor would be silently eaten.
There was a problem hiding this comment.
This method tries to match a value with several possible types when we deserialize something, e.g. response. When we don't have clear way of knowing what type should be used, we basically try everything and see what sticks. Now, let's assume we have two legitimate types, one is an IntEnum and one is StrEnum. IIUC (tested loaclly) if we try to construct the IntEnum with an str we get ValueError. But what we want in such a case is to add it to the error list and move on to the next possiblity.
This is just a long and winded way to say that I think in this specific case we do want a broad except, but if you disagree with this reasoning I can do except (ValueError, KeyError, TypeError)
| from urllib.parse import parse_qs | ||
|
|
||
| query_params = {k: v[0] if len(v) == 1 else v for k, v in parse_qs(query_string).items()} | ||
| request_body.update(query_params) |
There was a problem hiding this comment.
Looks like request_body.update(query_params) would silently overwrite any body field that shares a name with a query param. If a query param and body field collide, the query param wins without warning. Could you add a check or at least log when an overwrite happens?
| response = client_with_models.alpha.inference.rerank(model=rerank_model_id, query=query, items=items) | ||
| assert isinstance(response, list) | ||
| # TODO: Add type validation for response items once InferenceRerankResponseItem is exported from ogx client. | ||
| assert len(response) >= 0, f"Expected a list-like response, got {type(response)}" |
There was a problem hiding this comment.
I think assert len(response) >= 0 might not be ideal since len() can't return negative. The old isinstance check was doing more work. Would something like assert len(response) > 0 or a type check against the new response type work better?
There was a problem hiding this comment.
This is a very confusing way to test that the response is acting like a list. It would have made more sense to do something like assert hasattr(response, '__len__') and callable(hasattr(response, '__len__')). I didn't want to check anything about the length, just that there is one, because in stainless they returned a list[Something] and now we return something that acts like a list but its type is SomethingList.
…n library client When the library client flattens query params and body fields into a single dict for direct handler invocation, body params now take precedence over query params on name collisions. A warning is logged when a collision is detected. Also fix misleading echo in the SDK install workflow step. Signed-off-by: E Geiger <egeiger@redhat.com>
|
This pull request has merge conflicts that must be resolved before it can be merged. @aegeiger please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork |
What does this PR do?
Fixes multiple integration test failures when running with the OpenAPI-generated SDK (
ogx_open_client) instead of the Stainless SDK (ogx_client). This PR switches the integration test suite to use the new SDK and resolves all compatibility issues discovered during the transition.Changes by commit:
fix(ci): Fix the OpenAPI SDK publish workflow to use OIDC trusted publishing instead of unconfigured secret-based auth.
chore(client-sdks): Switch all integration tests and the library client from
ogx_clienttoogx_open_client. Add fallback import chains for backward compatibility. Update model class references intest_rerank.pyand adapttest_openai_vector_store_file_contents_with_extra_queryfor the new SDK's direct query parameter convention (vs. the OpenAI SDK'sextra_query).fix(client-sdks): Fix dict serialization in multipart form requests. The generated SDK was sending dicts as JSON strings, but the OGX server expects bracket-notation (
key[subkey]=value). Affectsexpires_afterand similar nested fields on file upload.fix(client-sdks): Add a
ChoiceDeltaToolCallstreaming delta variant in the OpenAPI spec. The spec previously used the sameChatCompletionMessageToolCallschema (all fields required, noindex) for both complete responses and streaming deltas, but streaming deltas on the wire includeindexand omitid/type/functionon continuation chunks. This formalizes the de-facto protocol without changing anything on the wire.fix(client-sdks): Fix enum deserialization in anyOf validators. The generated
actual_instance_must_validate_anyofusedisinstance()to check variant types, which failed forstr,Enumtypes (e.g.,OpenAIResponseInputToolChoiceMode) because a bare string like"auto"is not anisinstanceof the enum class. Add a fallback that tries constructing the type fromstr/intvalues.fix(tests): Fix four bugs in vision inference integration tests: guard against
Nonein streamingdelta.content, fixmulti_image_datafixture reference (was using the function instead of the resolved value), fix wrong streaming API call (chunk.event.delta.text→chunk.choices[0].delta.content), and update stale GitHub image URL.Test Plan
Integration tests were run against a live OpenAI backend with the
ci-testsdistribution:Results: 22 failures (down from 29 baseline), 0 regressions.
All 22 remaining failures are pre-existing and unrelated:
/v1/completionsendpoint 404 (not served by theci-testsdistribution withgptsetup — not a real CI configuration)Tests fixed by this PR (7 resolved):
test_openai_vector_store_file_contents_with_extra_query[client_with_models-*](x6 backends)extra_querykwarg_get_file_contenthelper adapts params per client typetest_inference_store_tool_calls[client_with_models-True]ChoiceDeltaToolCallschema with relaxed fieldstest_list_response_input_items[client_with_models]tool_choice: "auto"fails anyOf enum validationtest_image_chat_completion_multiple_images[True]Nonedelta.content on streaming chunksor ""guardtest_image_chat_completion_multiple_images[False]multi_image_datafixture function used as valuemulti_image_urltest_image_chat_completion_streaming.lower()called onNonedelta.contentif delta_content:guardtest_openai_chat_completion_image_urlmeta-llama/ogx)ogx-ai/ogx