feat: implement query_metrics #3074
Conversation
This is great. It's probably worth throwing in a unit test showing the intended aggregation behavior of the telemetry.
I haven't spent much time on this part of the stack yet but I would love to add a page in the UI for telemetry with some basic info.
@franciscojavierarceo great idea! I will add some unit tests. I am also planning on adding a bunch of integration tests for telemetry using
Force-pushed 81878fc to 7343a4c
end_dt = datetime.datetime.fromtimestamp(end_time, datetime.UTC) if end_time else None

# Use SQLite trace store if available
if hasattr(self, "trace_store") and self.trace_store:
why is this conditional? can the meta-reference provider have no trace-store? in general, this feels odd. at the very least, you should error when there is no trace store and someone is querying for metrics instead of silently returning nothing.
true, I think this was to satisfy mypy. I will double check if trace_store can be nil; if it can't, just remove this, but if it can, raise an error.
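A minimal sketch of the stricter behavior being suggested, assuming a provider object that may or may not carry a `trace_store` with an async `query_metrics` method (the class name and the trace-store call here are illustrative, not the provider's actual internals):

```python
import datetime


class MetaReferenceTelemetry:
    """Illustrative stand-in for the provider; not the real class."""

    def __init__(self, trace_store=None):
        self.trace_store = trace_store

    async def query_metrics(self, metric_name: str, start_time: int, end_time: int | None = None):
        # Fail loudly instead of silently returning nothing when no trace
        # store is configured to back metric queries.
        if getattr(self, "trace_store", None) is None:
            raise ValueError("query_metrics requires a configured trace_store (e.g. the SQLite trace store)")

        start_dt = datetime.datetime.fromtimestamp(start_time, datetime.UTC)
        end_dt = datetime.datetime.fromtimestamp(end_time, datetime.UTC) if end_time else None
        # Hypothetical delegation to the trace store's query helper.
        return await self.trace_store.query_metrics(metric_name, start_dt, end_dt)
```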
@@ -0,0 +1,180 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
thank you for writing a meaningful and true integration test!
Force-pushed fe76d97 to 55de706
I think this is set. Please let me know if any of my SQL queries look a bit off; I tried my best to make them make sense. cc @ehhuang, not sure how we want to deal with merging the client side changes and the test failures here due to those.
Can you explain what the missing client-side change is which results in the failure? I haven't dug into the failures.
sure @ashwinb, all of those telemetry expected failures pass locally for me (see the image in the PR description) and will pass here once llamastack/llama-stack-client-python#260 merges.
Force-pushed c38b29c to 47eafc4
Force-pushed b64d10f to f3c0e58
in order for the integration tests to pass, llamastack/llama-stack-client-python#19 needs to merge first and be released. But the tests pass locally for me. The client changes are harmless to merge first since they add types and resources for an existing API route.
Force-pushed 668987b to 130bf7d
How did you make those changes @cdoern? We need to make them via Stainless and I believe you don't have access. We haven't yet made Stainless fully self-serve although we have been making gradual progress.
I did them manually! I have done quite a few of these client-server shuffles so I kind of knew my way around the codebase :) also these types were similar enough to
def setup_telemetry_metrics_data(openai_client, client_with_models, text_model_id):
    """Setup fixture that creates telemetry metrics data before tests run."""

    # Skip OpenAI tests if running in library mode
@ehhuang check me on this, but I noticed in other tests that use the `openai_client` helper, we need to skip when using the library client? Also, technically shouldn't `llama_stack_client.chat.completions` use the openai route? in our integration suite only the `openai_client` routes through `/v1/openai/v1`.
@cdoern that is correct. if you use the `compat_client` fixture it automatically does this skipping.
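A hypothetical sketch of the skip behavior described above; the real `compat_client` fixture's internals are not shown in this thread, so the `base_url` check is an assumption used only for illustration:

```python
import pytest


@pytest.fixture
def compat_client(client_with_models):
    # Purely illustrative: when the suite runs with the in-process library
    # client there is no HTTP server, so the OpenAI-compatible routes
    # (/v1/openai/v1/...) are not reachable and the test is skipped.
    if not hasattr(client_with_models, "base_url"):
        pytest.skip("OpenAI-compatible client tests require a running server")
    return client_with_models
```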
@cdoern can you use the `compat_client` fixture everywhere please and remove the `hasattr()` check below?
@ashwinb I was not able to use the compat_client for the following reasons:

- when using `compat_client` I do not have access to `.telemetry.query_metrics` as that is not OpenAI compatible
- when using `compat_client` the `setup_telemetry_metrics_data` fixture would error out half of the time saying `AttributeError: 'dict' object has no attribute 'prompt_tokens'`. This is because openai completions format their `.usage` in a different format than our API.

What I was able to do is get rid of manual testing for `.inference.chat_completion`, which is deprecated, so all this does now is use the `openai_client` to create completions and the `client_with_models` to query `.telemetry`. Let me know if this is ok for now. I regenerated all of the json
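A rough sketch of that arrangement; the fixture and argument names come from the diff above, while the body is illustrative rather than the test's actual code:

```python
import pytest


@pytest.fixture(scope="module")
def setup_telemetry_metrics_data(openai_client, client_with_models, text_model_id):
    """Drive a few completions through the OpenAI-compatible client so token
    metrics get emitted, then query them back via the llama-stack client."""
    for prompt in ["What is 2 + 2?", "Name a color.", "Say hello."]:
        openai_client.chat.completions.create(
            model=text_model_id,
            messages=[{"role": "user", "content": prompt}],
        )
    # query_metrics is not part of the OpenAI-compatible surface, so the
    # telemetry API is reached through the llama-stack client instead.
    return client_with_models.telemetry.query_metrics(metric_name="total_tokens", start_time=0)
```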
@cdoern this is okay for now. I think once we have the client-stubs generated, we can see how we should think about it. re: usage data being differently formatted, we should probably adopt the same format right?
yeah we cannot do them manually. They need to be done via Stainless only, otherwise the next Stainless update will wipe them. Can you disable all the tests which depend on client types? Then I will regenerate the next client-SDK update based on the new openapi spec and once that lands we can re-enable the tests. (on the side, I am working on making stainless automatic.)
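One minimal way the temporary disabling could be done (an assumption; the PR may have handled it differently) is a module-level pytest skip marker on the affected test file:

```python
import pytest

# Hypothetical: temporarily disable the telemetry tests that depend on
# client types which have not shipped in llama-stack-client yet.
pytestmark = pytest.mark.skip(
    reason="requires telemetry query_metrics types from the next llama-stack-client release"
)
```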
Force-pushed 94e8ca6 to 6a90628
query_metrics currently has no implementation, meaning once a metric is emitted there is no way in llama stack to query it from the store.

implement query_metrics for the meta_reference provider which follows a similar style to `query_traces`, using the trace_store to format an SQL query and execute it

in this case the parameters for the query are `metric.METRIC_NAME, start_time, and end_time`.

this required client side changes since the client had no `query_metrics` or any associated resources, so any tests here will fail but I will provide manual execution logs for the new tests I am adding

order the metrics by timestamp.

Additionally add `unit` to the `MetricDataPoint` class since this adds much more context to the metric being queried.

these metrics can also be aggregated via a `granularity` parameter. This was pre-defined as a string like: `1m, 1h, 1d` where metrics occurring in the same timespan specified are aggregated together.

Signed-off-by: Charlie Doern <[email protected]>
the integration tests check if all of the metrics we currently support actually are queryable after inference requests

this also tests things like aggregation, label filtering, etc

Signed-off-by: Charlie Doern <[email protected]>
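To illustrate the `granularity` bucketing described in the commit message above, here is a small self-contained sketch. It is not the provider's code: whether points in a window are summed or averaged is an assumption (summing is used here), and the data-point shape mirrors the test-plan output below.

```python
from collections import defaultdict

# Window sizes for the pre-defined granularity strings from the commit message.
_WINDOW_SECONDS = {"1m": 60, "1h": 3600, "1d": 86400}


def aggregate_metrics(data_points, granularity):
    """data_points: dicts shaped like {'timestamp': int, 'value': float, 'unit': str}."""
    window = _WINDOW_SECONDS[granularity]
    buckets = defaultdict(float)
    units = {}
    for point in data_points:
        # Align each point to the start of its window so points in the same
        # timespan land in the same bucket.
        bucket_start = point["timestamp"] - (point["timestamp"] % window)
        buckets[bucket_start] += point["value"]
        units[bucket_start] = point["unit"]
    # Ordered by timestamp, matching the ordering noted in the commit message.
    return [{"timestamp": ts, "value": buckets[ts], "unit": units[ts]} for ts in sorted(buckets)]


# Example: two points from the manual test output that fall in the same minute
# collapse into a single aggregated point.
print(aggregate_metrics(
    [{"timestamp": 1754001309, "value": 584.0, "unit": "tokens"},
     {"timestamp": 1754001311, "value": 138.0, "unit": "tokens"}],
    "1m",
))
```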
# What does this PR do?

query_metrics currently has no implementation, meaning once a metric is emitted there is no way in llama stack to query it from the store.

implement query_metrics for the meta_reference provider which follows a similar style to `query_traces`, using the trace_store to format an SQL query and execute it

in this case the parameters for the query are `metric.METRIC_NAME, start_time, and end_time` and any other matchers if they are provided.

this required client side changes since the client had no `query_metrics` or any associated resources, so any tests here will fail but I will provide manual execution logs for the new tests I am adding

order the metrics by timestamp.

Additionally add `unit` to the `MetricDataPoint` class since this adds much more context to the metric being queried.

depends on llamastack/llama-stack-client-python#260

## Test Plan

```
import time
import uuid


def create_http_client():
    from llama_stack_client import LlamaStackClient

    return LlamaStackClient(base_url="http://localhost:8321")


client = create_http_client()

response = client.telemetry.query_metrics(metric_name="total_tokens", start_time=0)
print(response)
```

```
╰─ python3.12 ~/telemetry.py
INFO:httpx:HTTP Request: POST http://localhost:8322/v1/telemetry/metrics/total_tokens "HTTP/1.1 200 OK"
[TelemetryQueryMetricsResponse(data=None, metric='total_tokens', labels=[], values=[{'timestamp': 1753999514, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1753999816, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1753999881, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1753999956, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1754000200, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1754000419, 'value': 36.0, 'unit': 'tokens'}, {'timestamp': 1754000714, 'value': 36.0, 'unit': 'tokens'}, {'timestamp': 1754000876, 'value': 36.0, 'unit': 'tokens'}, {'timestamp': 1754000908, 'value': 34.0, 'unit': 'tokens'}, {'timestamp': 1754001309, 'value': 584.0, 'unit': 'tokens'}, {'timestamp': 1754001311, 'value': 138.0, 'unit': 'tokens'}, {'timestamp': 1754001316, 'value': 349.0, 'unit': 'tokens'}, {'timestamp': 1754001318, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001320, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001341, 'value': 923.0, 'unit': 'tokens'}, {'timestamp': 1754001350, 'value': 354.0, 'unit': 'tokens'}, {'timestamp': 1754001462, 'value': 417.0, 'unit': 'tokens'}, {'timestamp': 1754001464, 'value': 158.0, 'unit': 'tokens'}, {'timestamp': 1754001475, 'value': 697.0, 'unit': 'tokens'}, {'timestamp': 1754001477, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001479, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001489, 'value': 298.0, 'unit': 'tokens'}, {'timestamp': 1754001541, 'value': 615.0, 'unit': 'tokens'}, {'timestamp': 1754001543, 'value': 119.0, 'unit': 'tokens'}, {'timestamp': 1754001548, 'value': 310.0, 'unit': 'tokens'}, {'timestamp': 1754001549, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001551, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001568, 'value': 714.0, 'unit': 'tokens'}, {'timestamp': 1754001800, 'value': 437.0, 'unit': 'tokens'}, {'timestamp': 1754001802, 'value': 200.0, 'unit': 'tokens'}, {'timestamp': 1754001806, 'value': 262.0, 'unit': 'tokens'}, {'timestamp': 1754001808, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001810, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001816, 'value': 82.0, 'unit': 'tokens'}, {'timestamp': 1754001923, 'value': 61.0, 'unit': 'tokens'}, {'timestamp': 1754001929,
'value': 391.0, 'unit': 'tokens'}, {'timestamp': 1754001939, 'value': 598.0, 'unit': 'tokens'}, {'timestamp': 1754001941, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001942, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754001952, 'value': 252.0, 'unit': 'tokens'}, {'timestamp': 1754002053, 'value': 251.0, 'unit': 'tokens'}, {'timestamp': 1754002059, 'value': 375.0, 'unit': 'tokens'}, {'timestamp': 1754002062, 'value': 244.0, 'unit': 'tokens'}, {'timestamp': 1754002064, 'value': 111.0, 'unit': 'tokens'}, {'timestamp': 1754002065, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754002083, 'value': 719.0, 'unit': 'tokens'}, {'timestamp': 1754002302, 'value': 279.0, 'unit': 'tokens'}, {'timestamp': 1754002306, 'value': 218.0, 'unit': 'tokens'}, {'timestamp': 1754002308, 'value': 198.0, 'unit': 'tokens'}, {'timestamp': 1754002309, 'value': 69.0, 'unit': 'tokens'}, {'timestamp': 1754002311, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754002324, 'value': 481.0, 'unit': 'tokens'}, {'timestamp': 1754003161, 'value': 579.0, 'unit': 'tokens'}, {'timestamp': 1754003161, 'value': 69.0, 'unit': 'tokens'}, {'timestamp': 1754003169, 'value': 499.0, 'unit': 'tokens'}, {'timestamp': 1754003171, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754003173, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754003185, 'value': 422.0, 'unit': 'tokens'}, {'timestamp': 1754003448, 'value': 579.0, 'unit': 'tokens'}, {'timestamp': 1754003453, 'value': 422.0, 'unit': 'tokens'}, {'timestamp': 1754003589, 'value': 579.0, 'unit': 'tokens'}, {'timestamp': 1754003609, 'value': 279.0, 'unit': 'tokens'}, {'timestamp': 1754003614, 'value': 481.0, 'unit': 'tokens'}, {'timestamp': 1754003706, 'value': 303.0, 'unit': 'tokens'}, {'timestamp': 1754003706, 'value': 51.0, 'unit': 'tokens'}, {'timestamp': 1754003713, 'value': 426.0, 'unit': 'tokens'}, {'timestamp': 1754003714, 'value': 70.0, 'unit': 'tokens'}, {'timestamp': 1754003715, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754003724, 'value': 225.0, 'unit': 'tokens'}, {'timestamp': 1754004226, 'value': 516.0, 'unit': 'tokens'}, {'timestamp': 1754004228, 'value': 127.0, 'unit': 'tokens'}, {'timestamp': 1754004232, 'value': 281.0, 'unit': 'tokens'}, {'timestamp': 1754004234, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754004236, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754004244, 'value': 206.0, 'unit': 'tokens'}, {'timestamp': 1754004683, 'value': 338.0, 'unit': 'tokens'}, {'timestamp': 1754004690, 'value': 481.0, 'unit': 'tokens'}, {'timestamp': 1754004692, 'value': 124.0, 'unit': 'tokens'}, {'timestamp': 1754004692, 'value': 65.0, 'unit': 'tokens'}, {'timestamp': 1754004694, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754004703, 'value': 211.0, 'unit': 'tokens'}, {'timestamp': 1754004743, 'value': 338.0, 'unit': 'tokens'}, {'timestamp': 1754004749, 'value': 211.0, 'unit': 'tokens'}, {'timestamp': 1754005566, 'value': 481.0, 'unit': 'tokens'}, {'timestamp': 1754006101, 'value': 159.0, 'unit': 'tokens'}, {'timestamp': 1754006105, 'value': 272.0, 'unit': 'tokens'}, {'timestamp': 1754006109, 'value': 308.0, 'unit': 'tokens'}, {'timestamp': 1754006110, 'value': 61.0, 'unit': 'tokens'}, {'timestamp': 1754006112, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754006130, 'value': 705.0, 'unit': 'tokens'}, {'timestamp': 1754051825, 'value': 454.0, 'unit': 'tokens'}, {'timestamp': 1754051827, 'value': 152.0, 'unit': 'tokens'}, {'timestamp': 1754051834, 'value': 481.0, 'unit': 'tokens'}, {'timestamp': 1754051835, 'value': 55.0, 'unit': 
'tokens'}, {'timestamp': 1754051837, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754051845, 'value': 102.0, 'unit': 'tokens'}, {'timestamp': 1754099929, 'value': 36.0, 'unit': 'tokens'}, {'timestamp': 1754510050, 'value': 598.0, 'unit': 'tokens'}, {'timestamp': 1754510052, 'value': 160.0, 'unit': 'tokens'}, {'timestamp': 1754510064, 'value': 725.0, 'unit': 'tokens'}, {'timestamp': 1754510065, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754510067, 'value': 133.0, 'unit': 'tokens'}, {'timestamp': 1754510083, 'value': 535.0, 'unit': 'tokens'}, {'timestamp': 1754596582, 'value': 36.0, 'unit': 'tokens'}])]
```

adding tests for each currently documented metric in llama stack using this new function. attached is also some manual testing

integration tests passing locally with replay mode and the linked client changes:

<img width="1907" height="529" alt="Screenshot 2025-08-08 at 2 49 14 PM" src="https://github.com/user-attachments/assets/d482ab06-dcff-4f0c-a1f1-f870670ee9bc" />

---------

Signed-off-by: Charlie Doern <[email protected]>
What does this PR do?
query_metrics currently has no implementation, meaning once a metric is emitted there is no way in llama stack to query it from the store.
implement query_metrics for the meta_reference provider which follows a similar style to `query_traces`, using the trace_store to format an SQL query and execute it

in this case the parameters for the query are `metric.METRIC_NAME, start_time, and end_time` and any other matchers if they are provided.

this required client side changes since the client had no `query_metrics` or any associated resources, so any tests here will fail but I will provide manual execution logs for the new tests I am adding

order the metrics by timestamp.

Additionally add `unit` to the `MetricDataPoint` class since this adds much more context to the metric being queried.

depends on llamastack/llama-stack-client-python#260
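For reference, a hedged usage sketch of the new route that extends the Test Plan snippet above with an explicit end_time; the exact client-side signature depends on the linked llama-stack-client release.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Same call as in the Test Plan above, but bounded with an explicit end_time;
# both bounds are UNIX timestamps in seconds.
response = client.telemetry.query_metrics(
    metric_name="total_tokens",
    start_time=1753999000,
    end_time=1754600000,
)
print(response)
```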
Test Plan
adding tests for each currently documented metric in llama stack using this new function. attached is also some manual testing
integration tests passing locally with replay mode and the linked client changes:
