fix(prometheus): default to status_code=500 for exceptions without status code #24264

Open
sourrris wants to merge 3 commits into BerriAI:main from sourrris:fix/issue-24224-prometheus-status-code-none
@sourrris

Summary

  • _extract_status_code() returned None when an exception lacked status_code/code attributes
  • str(None) became the literal "None" in Prometheus labels, so 4xx/5xx aggregations over litellm_proxy_total_requests_metric did not match litellm_proxy_failed_requests_metric
  • Now defaults to 500 (unclassified server error) when an exception is present but carries no extractable status code — covers both direct exception param and kwargs["exception"] paths
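A minimal sketch of the fallback described above, written as a standalone function rather than the actual PrometheusLogger method. The names extract_status_code and _code_from, the parameter handling, and the int() conversion are assumptions for illustration; only the behavior (500 when an exception is present but no code can be extracted, None when there is no exception) is taken from this PR.

```python
from typing import Any, Optional


def _code_from(exc: Any) -> Optional[int]:
    """Hypothetical helper: read status_code or code from an exception."""
    for attr in ("status_code", "code"):
        value = getattr(exc, attr, None)
        if value is None:
            continue
        try:
            return int(value)
        except (TypeError, ValueError):
            continue
    return None


def extract_status_code(
    kwargs: Optional[dict] = None,
    enum_values: Any = None,
    exception: Optional[Exception] = None,
) -> Optional[int]:
    status_code: Optional[int] = None

    # 1. Prefer a status code already recorded on enum_values.
    raw = getattr(enum_values, "status_code", None)
    if raw is not None:
        try:
            status_code = int(raw)
        except (TypeError, ValueError):
            status_code = None

    # 2. Then try the direct exception, then kwargs["exception"].
    kwargs_exception = (kwargs or {}).get("exception")
    for candidate in (exception, kwargs_exception):
        if not status_code and candidate is not None:
            extracted = _code_from(candidate)
            if extracted is not None:
                status_code = extracted

    # 3. Fallback: an exception exists but nothing was extracted.
    has_exception = exception is not None or kwargs_exception is not None
    if status_code is None and has_exception:
        return 500  # unclassified server error

    # No exception at all: None stays a valid "not an error" signal.
    return status_code
```

With this sketch, a bare ValueError passed either directly or through kwargs["exception"] yields 500, while a call with no exception still yields None.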

Test plan

  • Added 5 regression tests in tests/test_litellm/integrations/test_prometheus_status_code_none.py
  • Verified no regressions in existing prometheus tests (test_prometheus_invalid_key_filtering.py — 2 pre-existing async failures unrelated to this change)

Fixes #24224

🤖 Generated with Claude Code

sourrris and others added 2 commits March 21, 2026 08:32
…24224)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…atus code

_extract_status_code() returned None when an exception lacked status_code/code
attributes. str(None) became the literal 'None' in Prometheus labels, causing
litellm_proxy_total_requests_metric 4xx/5xx aggregations to not match
litellm_proxy_failed_requests_metric.

Fixes BerriAI#24224

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel

vercel bot commented Mar 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Mar 21, 2026 4:01am

@codspeed-hq
Contributor

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing sourrris:fix/issue-24224-prometheus-status-code-none (9d88a63) with main (d8e4fc4)

@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR fixes a bug where _extract_status_code() in PrometheusLogger returned None for exceptions lacking status_code/code attributes. As a result, str(None) produced the literal "None" as a Prometheus label value, which broke 4xx/5xx aggregation consistency between litellm_proxy_total_requests_metric and litellm_proxy_failed_requests_metric.
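To illustrate why a "None" label breaks status-class aggregation, here is a small sketch that buckets status_code label values by class. The label values and the count_by_class helper are made up for illustration and do not come from litellm or from real metrics.

```python
import re
from collections import Counter

# Hypothetical status_code label values, one per recorded request.
observed = ["200", "429", "None", "500", "None"]


def count_by_class(labels):
    """Bucket label values into 2xx/4xx/5xx classes; anything else is unclassified."""
    counts = Counter()
    for label in labels:
        if re.fullmatch(r"[1-5]\d\d", label):
            counts[label[0] + "xx"] += 1
        else:
            counts["unclassified"] += 1
    return counts


# Before the fix: "None" labels fall outside every status class.
before = count_by_class(observed)

# After the fix: those requests would be labelled "500" instead.
after = count_by_class(["500" if label == "None" else label for label in observed])
```

In this sketch the two "None" requests are missing from the 5xx bucket before the fix and counted in it afterwards, which is the kind of mismatch the PR describes.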

Key changes:

  • Added a targeted fallback at the end of _extract_status_code(): when an exception is present (via the direct exception param or kwargs["exception"]) but no status code could be extracted, the function now returns 500 instead of None. The is None check (rather than the falsy not status_code used in the earlier guards) correctly avoids overwriting a legitimately extracted 0 value.
  • Two cosmetic formatting fixes wrapping inline ternary stream label assignments in parentheses for readability.
  • Five new regression tests in tests/test_litellm/integrations/test_prometheus_status_code_none.py, all mock-only, covering the fixed paths and the preserved None-when-no-exception contract.

The fix is minimal, well-scoped, and correctly avoids touching the non-exception (enum_values-only) path where None remains a valid signal for a non-error context.
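The difference between the is None check and a falsy check, mentioned in the key changes, can be shown with two hypothetical guard functions. Neither is the code from prometheus.py; they only contrast the two conditions.

```python
from typing import Optional


def fallback_falsy(status_code: Optional[int], exception_present: bool) -> Optional[int]:
    # Falsy check: treats 0 the same as None, so an extracted 0 is overwritten.
    if not status_code and exception_present:
        return 500
    return status_code


def fallback_is_none(status_code: Optional[int], exception_present: bool) -> Optional[int]:
    # Identity check: only a missing value triggers the fallback.
    if status_code is None and exception_present:
        return 500
    return status_code
```

Both guards return 500 for (None, True) and leave (None, False) as None; they differ only for an extracted 0, which the falsy version replaces with 500 and the is None version preserves.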

Confidence Score: 5/5

  • This PR is safe to merge — the fix is minimal, targeted, and well-tested with no backwards-incompatible behavior changes.
  • The change is a single-function guard clause that only activates in the previously-broken path (exception present, no status code extractable). The existing enum_values-only and no-exception paths are untouched. Five focused regression tests cover all relevant code paths. No real network calls, no new dependencies, no architectural changes.
  • No files require special attention.

Important Files Changed

Filename | Overview
litellm/integrations/prometheus.py | Adds a fallback in _extract_status_code() to return 500 when an exception is present but carries no extractable status code, preventing "None" Prometheus labels. Also includes two purely cosmetic formatting fixes to ternary expressions for stream labels.
tests/test_litellm/integrations/test_prometheus_status_code_none.py | New regression test file with 5 mock-only unit tests covering: exception with status_code, exception with code, bare exception (→500), no exception (→None), and bare exception via kwargs. All tests are mock-only with no real network calls.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_extract_status_code called] --> B{enum_values has\nstatus_code?}
    B -- Yes --> C[status_code = int\nenum_values.status_code]
    B -- No --> D{exception\nprovided?}
    C --> D
    D -- Yes --> E{exception has\nstatus_code or code?}
    E -- Yes --> F[status_code = int\nexception attr]
    E -- No --> G{kwargs has\nexception?}
    F --> G
    D -- No --> G
    G -- Yes --> H{kwargs exception has\nstatus_code or code?}
    H -- Yes --> I[status_code = int\nkwargs exception attr]
    H -- No --> J{status_code is None AND\nexception present?}
    I --> J
    G -- No --> J
    J -- Yes --> K[return 500\nnew fallback]
    J -- No --> L[return status_code\nor None]
    style K fill:#f96,stroke:#c00,color:#fff

Last reviewed commit: "style(prometheus): a..."

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sourrris sourrris force-pushed the fix/issue-24224-prometheus-status-code-none branch from edd2dd7 to 9d88a63 Compare March 21, 2026 04:00
Development

Successfully merging this pull request may close these issues.

[Bug]: litellm_proxy_total_requests_metric Emits status_code=None for some of failed requests