Add histogram sum/count publishing to reduce metrics cardinality #296

Merged
xiaoxichen merged 1 commit into eBay:master from xiaoxichen:worktree-histogram-sum-count
Mar 10, 2026

@xiaoxichen
Contributor

This change allows histograms to publish only sum and count metrics instead of full bucket distributions, significantly reducing Prometheus cardinality from 30+ time series per metric down to just 2.
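
As a rough illustration of where the saving comes from, the sketch below counts the series exposed per label set. It assumes the classic Prometheus histogram layout: one `_bucket` series per finite upper bound, one more for the implicit `+Inf` bucket, plus `_sum` and `_count`. The function names and the bucket count of 28 are hypothetical and are not taken from this PR.

```cpp
#include <cstddef>

// Series exposed by one label set of a classic Prometheus histogram:
// one `_bucket` per finite upper bound, the implicit `+Inf` bucket,
// plus the `_sum` and `_count` series.
constexpr std::size_t full_histogram_series(std::size_t finite_bounds) {
    return finite_bounds + 1 /* +Inf bucket */ + 2 /* _sum and _count */;
}

// In sum/count mode only the two aggregate series are exported,
// regardless of how many buckets the histogram defines.
constexpr std::size_t sum_count_series() { return 2; }

// Hypothetical example: 28 finite bounds give 31 series in full mode.
static_assert(full_histogram_series(28) == 31, "28 bounds + Inf + sum + count");
static_assert(sum_count_series() == 2, "only sum and count");
```

Because every distinct label combination repeats this set of series, the reduction applies per label set, not just once per metric name.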

Changes:

  • Add publish_as_sum_count enum value to _publish_as
  • Add ReportSumCount interface and PrometheusReportSumCount implementation
  • Update HistogramDynamicInfo to support sum/count reporting mode via variant
  • Add is_sum_count_reporter() helper and update unregister() logic
  • Add remove_sum_count() to Reporter interface and PrometheusReporter
  • Add test case demonstrating sum/count histogram usage
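
The variant-based switch between reporting modes could look roughly like the C++17 sketch below. It is not the code from this PR: `HistogramInfo`, `FullHistogramReporter`, `SumCountReporter`, `PublishAs`, and `removal_call()` are hypothetical stand-ins, and `remove_histogram` is an assumed name for the pre-existing removal path. Only `is_sum_count_reporter()` and `remove_sum_count()` are names mentioned in the change list.

```cpp
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two reporting back ends.
struct FullHistogramReporter {};
struct SumCountReporter {
    double sum{0};
    std::uint64_t count{0};
};

// Hypothetical registration mode; the PR adds a similar enum value.
enum class PublishAs { histogram, sum_count };

// Holds exactly one reporter, chosen when the histogram is registered.
class HistogramInfo {
public:
    explicit HistogramInfo(PublishAs mode) {
        if (mode == PublishAs::sum_count) {
            m_reporter = SumCountReporter{};
        } else {
            m_reporter = FullHistogramReporter{};
        }
    }

    // True when this histogram exports only the sum and count series.
    bool is_sum_count_reporter() const {
        return std::holds_alternative<SumCountReporter>(m_reporter);
    }

    // Name of the removal call that unregister() would have to dispatch to.
    std::string removal_call() const {
        return is_sum_count_reporter() ? "remove_sum_count" : "remove_histogram";
    }

private:
    std::variant<FullHistogramReporter, SumCountReporter> m_reporter;
};
```

Storing the reporter in a `std::variant` keeps a single info object per histogram while letting `unregister()` pick the matching removal path from the active alternative.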

When a histogram is registered with publish_as_sum_count:

  • Reports only metric_sum and metric_count to Prometheus (2 time series)
  • Still collects full bucket data in memory for local JSON API access
  • Maintains backward compatibility with existing histogram behavior
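
The behavior listed above can be sketched as follows. This is a minimal, hypothetical model rather than the library's implementation: the class name, `observe()`, `published()`, and `buckets()` are illustrative, and the bucket bounds are assumed to be sorted in ascending order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical histogram that keeps buckets in memory but exports two series.
class SumCountHistogram {
public:
    // upper_bounds must be sorted ascending; one extra slot catches overflow.
    SumCountHistogram(std::string name, std::vector<double> upper_bounds)
        : m_name(std::move(name)),
          m_bounds(std::move(upper_bounds)),
          m_buckets(m_bounds.size() + 1, 0) {}

    // Record one observation: bucket, sum, and count are all updated locally.
    void observe(double v) {
        // First bound that is >= v, i.e. the "less than or equal" bucket.
        auto it = std::lower_bound(m_bounds.begin(), m_bounds.end(), v);
        ++m_buckets[static_cast<std::size_t>(it - m_bounds.begin())];
        m_sum += v;
        ++m_count;
    }

    // What would be exported to Prometheus: exactly two series.
    std::map<std::string, double> published() const {
        return {{m_name + "_sum", m_sum},
                {m_name + "_count", static_cast<double>(m_count)}};
    }

    // Full per-bucket counts, still available for a local (e.g. JSON) view.
    const std::vector<std::uint64_t>& buckets() const { return m_buckets; }

private:
    std::string m_name;
    std::vector<double> m_bounds;
    std::vector<std::uint64_t> m_buckets;
    double m_sum{0};
    std::uint64_t m_count{0};
};
```

With bounds `{10, 100}` and observations 5, 50, and 500, `published()` holds only the `_sum` and `_count` entries, while `buckets()` still records one observation in each of the three slots.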

Signed-off-by: Xiaoxi Chen <xiaoxchen@ebay.com>
@xiaoxichen xiaoxichen force-pushed the worktree-histogram-sum-count branch from 3299d21 to 34549a0 March 9, 2026 16:17
@xiaoxichen xiaoxichen requested review from Besroy and szmyd March 9, 2026 16:18
@xiaoxichen xiaoxichen changed the title from "Add histogram sum/count publishing to reduce Prometheus cardinality" to "Add histogram sum/count publishing to reduce metrics cardinality" Mar 9, 2026
@codecov-commenter

Codecov Report

❌ Patch coverage is 40.38462% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.49%. Comparing base (370c772) to head (34549a0).
⚠️ Report is 87 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #296       +/-   ##
===========================================
- Coverage   64.29%   50.49%   -13.81%     
===========================================
  Files          72       63        -9     
  Lines        4406     4135      -271     
  Branches      555     1803     +1248     
===========================================
- Hits         2833     2088      -745     
+ Misses       1327      867      -460     
- Partials      246     1180      +934     
| Components | Coverage Δ |
|---|---|
| AuthManager | 55.55% <ø> (-22.23%) ⬇️ |
| Cache | 31.52% <54.12%> (+1.58%) ⬆️ |
| FDS | 60.74% <21.73%> (-10.38%) ⬇️ |
| FileWatcher | 37.60% <20.00%> (-18.65%) ⬇️ |
| Flip | ∅ <ø> (∅) |
| gRPC | 56.59% <46.87%> (-20.46%) ⬇️ |
| Logging | 27.65% <22.72%> (-2.53%) ⬇️ |
| Metrics | 58.21% <51.51%> (-21.99%) ⬇️ |
| Options | 25.00% <ø> (-75.00%) ⬇️ |
| Setting | 29.62% <ø> (-27.17%) ⬇️ |
| StatusObject | 37.16% <ø> (-36.67%) ⬇️ |
| Utility | 60.98% <100.00%> (-21.74%) ⬇️ |
| Version | 36.00% <ø> (-59.84%) ⬇️ |

@xiaoxichen xiaoxichen requested review from yuwmao and zhihzhang March 10, 2026 00:44
Contributor

@Besroy Besroy left a comment

LGTM. Just curious whether this sum/count is guaranteed to be non-decreasing. If so, do you want to add a check in set_value (assert or log) for when diff < 0?

@xiaoxichen
Contributor Author

> LGTM. Just curious whether this sum/count is guaranteed to be non-decreasing. If so, do you want to add a check in set_value (assert or log) for when diff < 0?

I am not sure we ever put a negative value into the histogram, since no bucket is defined for negative values, but technically you are right.

We can change it to a gauge in the next commit.
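
For illustration only, the check suggested in the review might look like the sketch below. `MonotonicValue` and its members are hypothetical and do not reflect the actual `set_value` in this PR. It assumes the published value is tracked as an absolute number and that a lower new value should be rejected rather than exported.

```cpp
// Hypothetical wrapper around a value that is expected to be non-decreasing.
class MonotonicValue {
public:
    // Publish a new absolute value. Returns false, leaving the published
    // value untouched, when the new value is lower than the previous one.
    bool set_value(double v) {
        const double diff = v - m_last;
        if (diff < 0) {
            return false;  // an assert or a log line could go here instead
        }
        m_last = v;
        return true;
    }

    double value() const { return m_last; }

private:
    double m_last{0};
};
```

A gauge, as proposed in the reply, would sidestep the check entirely, since gauges are allowed to go down as well as up.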

@xiaoxichen xiaoxichen merged commit fe10141 into eBay:master Mar 10, 2026
5 checks passed