
Conversation

@parkerhancock commented Sep 24, 2025

Why this matters

Teams running the Gemini and Vertex adapters today cannot trust the billable token numbers they receive. LangSmith traces, quota safeguards, and customer invoices all derive cost from incorrect totals because we drop cache/tool usage, lose modality details, and compute negative or inflated deltas during streaming. Several production users (issues #975, #940, #1011, #1053, #879) have reported that this blocks them from rolling out Gemini 2.x and multimodal workloads; this PR resolves those regressions end to end.

What’s included

  • Normalize Gemini usage_metadata so totals always equal inputs + outputs, and preserve cache/tool/modality detail dictionaries across both sync and streaming responses.
  • Apply the same normalization to the Vertex adapter (mirroring #1010, the merged Anthropic fix "Add cache token support to ChatAnthropicVertex streaming responses") so input_token_details and modality enums flow through exactly as Google returns them.
  • Introduce a shared delta helper for streaming that prevents negative/duplicated counts and keeps tool-call prompts and reasoning tokens visible chunk by chunk; also harden tool-call argument coercion so Infinity/NaN payloads no longer break Cloud Build. A sketch of the normalization and delta logic follows this list.
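To make the first and third points concrete, here is a minimal sketch of the two ideas; the helper names (_normalize_usage, _usage_delta) and the exact field handling are illustrative assumptions, not the code in this PR:

    # Illustrative sketch only: names and field handling are assumptions.
    from typing import Any, Dict

    def _normalize_usage(raw: Dict[str, Any]) -> Dict[str, Any]:
        """Coerce Gemini usage_metadata into a consistent shape.

        Guarantees total_tokens == input_tokens + output_tokens and keeps
        cache detail instead of dropping it.
        """
        input_tokens = int(raw.get("prompt_token_count") or 0)
        output_tokens = int(raw.get("candidates_token_count") or 0)
        usage: Dict[str, Any] = {
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens,
        }
        cache_read = raw.get("cached_content_token_count")
        if cache_read:
            usage["input_token_details"] = {"cache_read": int(cache_read)}
        return usage

    def _usage_delta(prev: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
        """Per-chunk usage for streaming.

        Gemini reports cumulative counts on each chunk; subtracting the
        previous cumulative totals (clamped at zero) avoids the negative or
        duplicated counts seen when chunks are summed naively.
        """
        keys = ("input_tokens", "output_tokens", "total_tokens")
        return {k: max(current.get(k, 0) - prev.get(k, 0), 0) for k in keys}

    # Two cumulative chunks yield non-negative per-chunk deltas.
    first = _normalize_usage({"prompt_token_count": 12, "candidates_token_count": 3})
    second = _normalize_usage({"prompt_token_count": 12, "candidates_token_count": 9})
    print(_usage_delta(first, second))
    # {'input_tokens': 0, 'output_tokens': 6, 'total_tokens': 6}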

Issues addressed

Fixes #975, fixes #940, fixes #1011, fixes #1053, fixes #879

Testing

  • GOOGLE_API_KEY=fake uv run pytest libs/genai/tests/unit_tests/test_chat_models.py
  • GOOGLE_API_KEY=fake VERTEXAI_LOCATION=us-central1 uv run pytest libs/vertexai/tests/unit_tests/test_usage_metadata.py

@parkerhancock changed the title from "Fix Gemini usage metadata handling" to "fix: preserve Gemini usage metadata across genai and vertex" on Sep 24, 2025
@parkerhancock force-pushed the fix-gemini-usage branch 2 times, most recently from 986ba5b to fe3f8f3 on September 24, 2025 at 17:28
@mdrxy (Collaborator) commented Sep 24, 2025

Thank you - will investigate as soon as able

@parkerhancock (Author) commented Sep 25, 2025

Updated the branch to latest main, fixed Gemini tool-call argument serialization so Infinity/NaN responses no longer break parsing, and cleaned up the local lint (mypy) failure. The failing Cloud Build run was caused by proto tool-call args containing Infinity; the new _coerce_function_call_args path normalizes those values and we added a regression test to cover it. Lint passes with make lint locally; Google Cloud Build should rerun clean now.
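For reviewers, a minimal sketch of what that coercion could look like; the real _coerce_function_call_args in this PR may differ (mapping non-finite floats to None is an assumption here, not necessarily the sentinel the PR uses):

    import math
    from typing import Any

    def _coerce_function_call_args(value: Any) -> Any:
        """Recursively replace non-JSON-serializable float values.

        Gemini can return proto tool-call args containing Infinity/NaN, which
        breaks downstream JSON parsing; map them to None (assumed sentinel).
        """
        if isinstance(value, float) and not math.isfinite(value):
            return None
        if isinstance(value, dict):
            return {k: _coerce_function_call_args(v) for k, v in value.items()}
        if isinstance(value, (list, tuple)):
            return [_coerce_function_call_args(v) for v in value]
        return value

    print(_coerce_function_call_args({"x": float("inf"), "ys": [1.0, float("nan")]}))
    # {'x': None, 'ys': [1.0, None]}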

@parkerhancock changed the title from "fix: preserve Gemini usage metadata across genai and vertex" to "fix(genai,vertexai): restore accurate Gemini token usage reporting" on Sep 29, 2025
@parkerhancock changed the title from "fix(genai,vertexai): restore accurate Gemini token usage reporting" to "fix(genai): restore accurate Gemini token usage reporting" on Sep 29, 2025