Commit 3b7d783

fix(om2): histograms and negative observed values
OM1.0 required that the Sum of a Histogram is not represented when there are negative observations in the histogram. This PR removes this requirement in OM2.0, for the following reasons:

- The requirement was never implemented by the Go and Java instrumentation libraries. Enforcing it now would be breaking.
- The requirement makes it impossible to implement the use case where the user wants to measure the Sum anyway.
- We already warned users in the documentation about the possibility of Sum decreasing and not being usable for rate() 10 years ago: #43.
- Native histograms will not take Sum into account when calculating counter resets during rate(), thus this problem won't come up.

Note: this PR does not make Sum mandatory; that is a different question.

Signed-off-by: György Krajcsovits <[email protected]>
1 parent cbe12c5 commit 3b7d783
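
The concern about rate() in the commit message can be made concrete with a small sketch. The Python below is illustrative only and is not taken from any instrumentation library: the function name `observe_all` and the bucket thresholds are hypothetical. It accumulates cumulative bucket counts and a running Sum, and shows that a negative observation makes the Sum go down while the bucket counts only ever go up.

```python
import math


def observe_all(thresholds, observations):
    """Return (cumulative bucket counts, running Sum after each observation).

    Hypothetical helper for illustration; not part of any real library.
    """
    buckets = {t: 0 for t in thresholds}
    total = 0.0
    sums = []
    for value in observations:
        # Cumulative buckets: an observation counts in every bucket
        # whose threshold is greater than or equal to the value.
        for t in thresholds:
            if value <= t:
                buckets[t] += 1
        total += value
        sums.append(total)
    return buckets, sums


buckets, sums = observe_all([-1.0, 0.0, 1.0, math.inf], [0.5, -2.0, 0.25])
print(sums)     # the Sum drops after the negative observation
print(buckets)  # bucket counts never decrease
```

Because the Sum drops from 0.5 to -1.5 after the second observation, a consumer that treats Sum as a counter would see something that looks like a counter reset, which is the situation the documentation warning refers to.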

1 file changed: +12 -4 lines changed

content/docs/specs/om/open_metrics_spec_2_0.md

Lines changed: 12 additions & 4 deletions
@@ -48,6 +48,10 @@ OpenMetrics is primarily a wire format, independent of any particular transport
 
 Implementers MUST expose metrics in the OpenMetrics text format in response to a simple HTTP GET request to a documented URL for a given process or device. This endpoint SHOULD be called "/metrics". Implementers MAY also expose OpenMetrics formatted metrics in other ways, such as by regularly pushing metric sets to an operator-configured endpoint over HTTP.
 
+## Changes from version 1.0
+
+In the data model, histograms are no longer required to omit the Sum if there are negative measured event values. #2627.
+
 ### Metrics and Time Series
 
 This standard expresses all system states as numerical values; counts, current values, enumerations, and boolean states being common examples. Contrary to metrics, singular events occur at a specific time. Metrics tend to aggregate data temporally. While this can lose information, the reduction in overhead is an engineering trade-off commonly chosen in many modern monitoring systems.
@@ -220,12 +224,16 @@ Histograms measure distributions of discrete events. Common examples are the lat
 
 A Histogram MetricPoint MUST contain at least one bucket, and SHOULD contain Sum, and Created values. Every bucket MUST have a threshold and a value.
 
-Histogram MetricPoints MUST have one bucket with an +Inf threshold. Buckets MUST be cumulative. As an example for a metric representing request latency in seconds its values for buckets with thresholds 1, 2, 3, and +Inf MUST follow value_1 <= value_2 <= value_3 <= value_+Inf. If ten requests took 1 second each, the values of the 1, 2, 3, and +Inf buckets MUST equal 10.
+Histogram MetricPoints MUST have one bucket with threshold equal to +Inf. Buckets MUST be cumulative.
+As an example: for a metric representing request latency in seconds that has the following bucket thresholds: 1, 2, 3, and +Inf,
+it MUST follow that value_1 <= value_2 <= value_3 <= value_+Inf. If ten requests took 1 second each, the values of the 1, 2, 3, and +Inf buckets MUST equal 10.
+Or in other words, the count of measured event values that are >1 and <=2 is equal to value_2 - value_1.
+
+The +Inf bucket counts all requests. Bucket thresholds within a MetricPoint MUST be unique. Negative threshold buckets MAY be used. Bucket thresholds MUST NOT equal NaN.
 
-The +Inf bucket counts all requests. If present, the Sum value MUST equal the Sum of all the measured event values. Bucket thresholds within a MetricPoint MUST be unique.
+Semantically, buckets values are counters so MUST NOT be NaN or negative. Bucket values MUST be integers.
 
-Semantically, Sum, and buckets values are counters so MUST NOT be NaN or negative.
-Negative threshold buckets MAY be used, but then the Histogram MetricPoint MUST NOT contain a sum value as it would no longer be a counter semantically. Bucket thresholds MUST NOT equal NaN. Count and bucket values MUST be integers.
+If present, the Sum value MUST equal the Sum of all the measured event values. The histogram MAY count negative event values, which means that the Sum may decrease.
 
 A Histogram MetricPoint SHOULD have a Timestamp value called Created. This can help ingestors discern between new metrics and long-running ones it did not see before.
 
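
The bucket rules in the added lines lend themselves to a mechanical check. The following Python is a minimal sketch under stated assumptions, not code from the spec or from any parser: the function name `check_histogram_buckets`, the `(threshold, value)` pair representation, and the error strings are all hypothetical. It checks for a +Inf bucket, unique non-NaN thresholds, non-negative integer values, and cumulative counts. It deliberately places no constraint on Sum, since after this change the Sum may be negative or may decrease.

```python
import math


def check_histogram_buckets(buckets):
    """Return a list of rule violations for a list of (threshold, value) pairs.

    Hypothetical validator for illustration; the rule wording is paraphrased.
    """
    if not buckets:
        return ["at least one bucket is required"]
    errors = []
    thresholds = [t for t, _ in buckets]
    if any(math.isnan(t) for t in thresholds):
        errors.append("bucket thresholds must not be NaN")
    if len(set(thresholds)) != len(thresholds):
        errors.append("bucket thresholds must be unique")
    if math.inf not in thresholds:
        errors.append("a +Inf bucket is required")
    for _, value in buckets:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            errors.append("bucket values must be non-negative integers")
            break
    if not errors:
        # Cumulative: values must not decrease as thresholds increase.
        ordered = sorted(buckets)
        for (_, lower), (_, upper) in zip(ordered, ordered[1:]):
            if lower > upper:
                errors.append("buckets must be cumulative")
                break
    return errors


# The spec's example: ten requests that took 1 second each.
ten_requests = [(1, 10), (2, 10), (3, 10), (math.inf, 10)]
print(check_histogram_buckets(ten_requests))  # prints []
```

With the same example data, the count of observations in the interval (1, 2] is value_2 - value_1 = 10 - 10 = 0, which matches the "in other words" sentence in the diff. Negative thresholds such as `[(-1.0, 1), (0.0, 1), (math.inf, 3)]` also pass the check.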