-
Notifications
You must be signed in to change notification settings - Fork 1.5k
docs: Document all OPA metrics definitions #7929
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
docs: Document all OPA metrics definitions #7929
Conversation
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
82d1b81 to
d3352e7
Compare
a03a2b6 to
928e84b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I have left a few more comments here for you to have a think about. I noticed some metrics like counter_rego_builtin_regex_interquery_value_cache_hits seem to be missing.
While I appreciate the effort, it might be best to work on a smaller change set here. Perhaps we could look to focus on regex and http.send metrics since they are likely some of the more common ones used? Wdyt?
- Fix misleading 'aggregated' terminology - use 'instance-level' instead - Remove per-query metrics section from monitoring.md, add cross-references - Focus metrics documentation on commonly used regex and http.send built-ins - Add missing counter_rego_builtin_regex_interquery_value_cache_hits metric - Move admonition to after example in REST API documentation - Simplify and reduce scope of metrics documentation per reviewer guidance
Per reviewer feedback, removing blank line changes that were unintentionally included from merging PR open-policy-agent#7929.
- Fix misleading 'aggregated' terminology - use 'instance-level' instead - Remove per-query metrics section from monitoring.md, add cross-references - Focus metrics documentation on commonly used regex and http.send built-ins - Add missing counter_rego_builtin_regex_interquery_value_cache_hits metric - Move admonition to after example in REST API documentation - Simplify and reduce scope of metrics documentation per reviewer guidance Signed-off-by: Anivar A Aravind <[email protected]>
Per reviewer feedback, removing blank line changes that were unintentionally included from merging PR open-policy-agent#7929. Signed-off-by: Anivar A Aravind <[email protected]>
7304b49 to
dc63b7f
Compare
- Move metrics overview into Prometheus section for better flow - Add explicit /metrics path mention in Prometheus intro - Add links to Status API and Decision Logs documentation - Fix CLI tools to include proper documentation links - Clarify which metrics are enabled with instrument=true parameter - Remove inaccurate 'subset' terminology for Status API Addresses review comments from charlieegan3 on September 25, 2025 Signed-off-by: Anivar A Aravind <[email protected]>
15b9ec8 to
1221915
Compare
- Fix misleading 'aggregated' terminology - use 'instance-level' instead - Remove per-query metrics section from monitoring.md, add cross-references - Focus metrics documentation on commonly used regex and http.send built-ins - Add missing counter_rego_builtin_regex_interquery_value_cache_hits metric - Move admonition to after example in REST API documentation - Simplify and reduce scope of metrics documentation per reviewer guidance Signed-off-by: Anivar A Aravind <[email protected]>
Per reviewer feedback, removing blank line changes that were unintentionally included from merging PR open-policy-agent#7929. Signed-off-by: Anivar A Aravind <[email protected]>
- Move metrics overview into Prometheus section for better flow - Add explicit /metrics path mention in Prometheus intro - Add links to Status API and Decision Logs documentation - Fix CLI tools to include proper documentation links - Clarify which metrics are enabled with instrument=true parameter - Remove inaccurate 'subset' terminology for Status API Addresses review comments from charlieegan3 on September 25, 2025 Signed-off-by: Anivar A Aravind <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @anivar, I've left two comments about the built-in specific metrics I think that would be good to document and where I think they are best documented. I am not sure we're really going in the right direction with the rest of the PR, so if it's ok with you, I think can keep these notes on the suggested pages and get this in. We don't need to update monitoring and policy-performance this time around, for now let's just focus on the metrics for specific built-ins you've documented in here.
Generate and document all OPA metrics in a central registry. Add operational metrics sections to monitoring docs. Fixes: open-policy-agent#6730 Signed-off-by: Anivar A Aravind <[email protected]>
Based on PR feedback, this commit: - Clearly distinguishes between /metrics endpoint (system-wide) and ?metrics=true (per-query) - Removes duplicate metrics listings to avoid maintenance burden - Adds cross-references between monitoring and REST API docs - Simplifies the documentation structure without automation Addresses review feedback from @charlieegan3 Signed-off-by: Anivar A Aravind <[email protected]>
As requested by @charlieegan3: - Remove auto-generation tooling (cmd/metrics-docs/main.go) - Remove auto-generated metrics registry file - Remove Makefile target for metrics generation The reviewer indicated metrics don't change frequently enough to warrant automation, and prefers avoiding duplicate lists. Signed-off-by: Anivar A Aravind <[email protected]>
- Fix misleading 'aggregated' terminology - use 'instance-level' instead - Remove per-query metrics section from monitoring.md, add cross-references - Focus metrics documentation on commonly used regex and http.send built-ins - Add missing counter_rego_builtin_regex_interquery_value_cache_hits metric - Move admonition to after example in REST API documentation - Simplify and reduce scope of metrics documentation per reviewer guidance Signed-off-by: Anivar A Aravind <[email protected]>
Per reviewer feedback, removing blank line changes that were unintentionally included from merging PR open-policy-agent#7929. Signed-off-by: Anivar A Aravind <[email protected]>
- Move metrics overview into Prometheus section for better flow - Add explicit /metrics path mention in Prometheus intro - Add links to Status API and Decision Logs documentation - Fix CLI tools to include proper documentation links - Clarify which metrics are enabled with instrument=true parameter - Remove inaccurate 'subset' terminology for Status API Addresses review comments from charlieegan3 on September 25, 2025 Signed-off-by: Anivar A Aravind <[email protected]>
1221915 to
75b7706
Compare
|
@charlieegan3 Updated per your feedback:
Ready for review. |
Following final review guidance to document only builtin-specific metrics in their respective reference pages. Changes: - http.mdx: Document http.send timer and cache metrics - regex.mdx: Document regex cache hit metric - glob.mdx: Document glob cache hit metric - rest-api.md: Clarify available instrumentation metrics Removed broader metrics documentation from monitoring.md and policy-performance.md per reviewer request to keep changes focused. Fixes open-policy-agent#6730 Signed-off-by: Anivar A Aravind <[email protected]>
75b7706 to
39184be
Compare
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Fixes #6730
Users were struggling to understand OPA metrics, especially builtin function metrics like
http.send(). The existing docs didn't explain what these metrics capture or how they behave with caching.This PR creates a comprehensive metrics registry that documents all currently discovered OPA metrics. Each metric now has a clear description and units.
Building on PR #7851 (which added the http.send network request counter), this completes the documentation for all http.send metrics:
timer_rego_builtin_http_send_nscaptures total time across ALL http.send() callscounter_rego_builtin_http_send_interquery_cache_hitstracks cache hitscounter_rego_builtin_http_send_network_requestscounts actual network requests (added in feat: Add counter metric for http.send network requests #7851)Beyond fixing the immediate documentation gap, I've added a generator tool in
cmd/metrics-docs/to keep the registry maintainable. Runmake generate-metrics-docsto regenerate when new metrics are added to OPA. The generator works from a manually curated list to ensure accurate descriptions.Also enhanced the existing monitoring and policy-performance docs with operational metrics sections and fixed a broken link in the REST API docs.
Files changed:
cmd/metrics-docs/main.goandREADME.md- Generator tooldocs/docs/metrics-registry.md- The complete registry (generated)docs/docs/monitoring.md- Added operational metrics sectionsdocs/docs/policy-performance.md- Enhanced performance metricsdocs/docs/rest-api.md- Fixed broken referenceThis should help users interpret metrics without diving into source code.