@arthurgousset commented May 17, 2025

Description of changes

Adds a minimal test case to reproduce the bug described in chroma-core#1770. I can't reproduce the bug with this script; it seems to be resolved in the client already, unless I'm missing something.

Test plan

Set up environment:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt -r requirements_dev.txt
pip install -e .

Run test:

# From root directory
$ AZURE_OPENAI_API_KEY="azure-openai-api-key" python azure_openai_repro.py

{'ids': [['cat_embedding']], 'embeddings': None, 'documents': [['Cat']], 'uris': None, 'included': ['metadatas', 'documents', 'distances'], 'data': None, 'metadatas': [[None]], 'distances': [[0.8539673089981079]]}

Set OPENAI_LOG=debug to get verbose debug logs and confirm that the embedding requests go to our private Azure OpenAI deployment.

# From root directory
$ OPENAI_LOG=debug AZURE_OPENAI_API_KEY="azure-openai-api-key" python azure_openai_repro.py

[2025-05-17 15:48:10 - openai._base_client:482 - DEBUG] Request options: {'method': 'post', 'url': '/deployments/text-embedding-3-small/embeddings', 'headers': {'api-key': '<redacted>'}, 'files': None, 'idempotency_key': 'stainless-python-retry-74021265-20cd-46a4-9a3b-9eda799efed3', 'post_parser': <function Embeddings.create.<locals>.parser at 0x10508e5c0>, 'json_data': {'input': ['Cat', 'Apple', 'San Francisco'], 'model': 'text-embedding-3-small', 'encoding_format': 'base64'}}
[2025-05-17 15:48:10 - openai._base_client:965 - DEBUG] Sending HTTP Request: POST https://chroma-repro-bug.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01
[2025-05-17 15:48:10 - httpx:1025 - INFO] HTTP Request: POST https://chroma-repro-bug.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01 "HTTP/1.1 200 OK"
[2025-05-17 15:48:10 - openai._base_client:1003 - DEBUG] HTTP Response: POST https://chroma-repro-bug.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01 "200 OK" Headers({'content-length': '24962', 'content-type': 'application/json', 'access-control-allow-origin': '*', 'apim-request-id': 'bdb2632d-73df-4a05-8a05-702544fb05f5', 'strict-transport-security': 'max-age=31536000; includeSubDomains; preload', 'x-content-type-options': 'nosniff', 'x-ms-region': 'East US', 'x-ratelimit-remaining-tokens': '149993', 'x-ratelimit-limit-tokens': '150000', 'x-request-id': '72d69d11-3ae2-4bc9-91e0-2576ddc7d84c', 'azureml-model-session': 'd011-20250512072110', 'x-envoy-upstream-service-time': '30', 'x-ms-client-request-id': 'bdb2632d-73df-4a05-8a05-702544fb05f5', 'x-ms-deployment-name': 'text-embedding-3-small', 'date': 'Sat, 17 May 2025 22:48:10 GMT'})
[2025-05-17 15:48:10 - openai._base_client:1011 - DEBUG] request_id: 72d69d11-3ae2-4bc9-91e0-2576ddc7d84c
[2025-05-17 15:48:11 - openai._base_client:482 - DEBUG] Request options: {'method': 'post', 'url': '/deployments/text-embedding-3-small/embeddings', 'headers': {'api-key': '<redacted>'}, 'files': None, 'idempotency_key': 'stainless-python-retry-797fdf84-a809-4785-842f-35b7745f6f97', 'post_parser': <function Embeddings.create.<locals>.parser at 0x10508e5c0>, 'json_data': {'input': ['Dog'], 'model': 'text-embedding-3-small', 'encoding_format': 'base64'}}
[2025-05-17 15:48:11 - openai._base_client:965 - DEBUG] Sending HTTP Request: POST https://chroma-repro-bug.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01
[2025-05-17 15:48:11 - httpx:1025 - INFO] HTTP Request: POST https://chroma-repro-bug.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01 "HTTP/1.1 200 OK"
[2025-05-17 15:48:11 - openai._base_client:1003 - DEBUG] HTTP Response: POST https://chroma-repro-bug.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01 "200 OK" Headers({'content-length': '8414', 'content-type': 'application/json', 'access-control-allow-origin': '*', 'apim-request-id': 'cf07bf26-f633-4f5b-bd52-2e80fb4f7251', 'strict-transport-security': 'max-age=31536000; includeSubDomains; preload', 'x-content-type-options': 'nosniff', 'x-ms-region': 'East US', 'x-ratelimit-remaining-tokens': '149992', 'x-ratelimit-limit-tokens': '150000', 'x-request-id': '2521d7bc-b1eb-4820-a57c-d443f8ee9d69', 'azureml-model-session': 'd011-20250512072110', 'x-envoy-upstream-service-time': '24', 'x-ms-client-request-id': 'cf07bf26-f633-4f5b-bd52-2e80fb4f7251', 'x-ms-deployment-name': 'text-embedding-3-small', 'date': 'Sat, 17 May 2025 22:48:10 GMT'})
[2025-05-17 15:48:11 - openai._base_client:1011 - DEBUG] request_id: 2521d7bc-b1eb-4820-a57c-d443f8ee9d69
{'ids': [['cat_embedding']], 'embeddings': None, 'documents': [['Cat']], 'uris': None, 'included': ['metadatas', 'documents', 'distances'], 'data': None, 'metadatas': [[None]], 'distances': [[0.8539673089981079]]}
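
To make the "confirm the request goes to Azure" check less manual, here is a minimal sketch that scans the debug output for the "Sending HTTP Request" lines shown above and extracts the host, deployment name, and API version. The helper names (`azure_targets`, `all_requests_hit_azure`) are hypothetical and not part of this PR or of the openai client. The sketch assumes the log line format and the `*.openai.azure.com` host pattern seen in the output above.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches the URL in an openai client "Sending HTTP Request" debug line
# (format assumed from the log output in this PR).
REQUEST_RE = re.compile(r"Sending HTTP Request: POST (\S+)")


def azure_targets(log_text):
    """Return a (host, deployment, api_version) tuple for each request found."""
    targets = []
    for url in REQUEST_RE.findall(log_text):
        parsed = urlparse(url)
        parts = parsed.path.strip("/").split("/")
        # Expected path shape: openai/deployments/<deployment>/embeddings
        if len(parts) >= 4 and parts[1] == "deployments":
            deployment = parts[2]
        else:
            deployment = None
        api_version = parse_qs(parsed.query).get("api-version", [None])[0]
        targets.append((parsed.hostname, deployment, api_version))
    return targets


def all_requests_hit_azure(log_text):
    """True only if at least one request was logged and all went to an Azure host."""
    targets = azure_targets(log_text)
    return bool(targets) and all(
        host is not None and host.endswith(".openai.azure.com")
        for host, _, _ in targets
    )
```

One way to use it is to save the debug output to a file and pass the file contents to `all_requests_hit_azure`. A request to a non-Azure host, or an empty log, makes the check return `False`.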
  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

I can't reproduce the bug with this script; it seems to be resolved in the client already, unless I'm missing something.