
New metrics automation #118


Open
wants to merge 21 commits into main
Conversation

paulohtb6
Contributor

@paulohtb6 paulohtb6 commented Jul 21, 2025

New metrics automation

Introducing a new metrics automation tool that traverses the Redpanda source code to generate reliable metrics documentation.

Run it:

npx generate metrics-docs --tag <tag>

where <tag> is, for example, dev or v25.2.4-rc4.

Changes

Moved the old metrics command to metrics-docs-legacy

JSON output schema

{
  "redpanda_kafka_requests_total": {
    "unique_id": "a1b2c3d4e5f6...",
    "name": "redpanda_kafka_requests_total",
    "full_name": "redpanda_kafka_requests_total",
    "metric_type": "external",
    "type": "counter",
    "description": "Total number of Kafka requests.",
    "labels": ["request_type", "status"],
    "constructor": "make_counter",
    "files": [{"file": "src/v/kafka/server/handlers/handler.cc", "line": 45}],
    "group": "kafka"
  }
}
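
As a quick illustration of how this JSON might be consumed downstream, the sketch below loads the file and lists the external counters with their labels. The path autogenerated/metrics.json is an assumption for the example; adjust it to wherever the extractor writes its JSON.

import json

# Hypothetical output path; adjust to the extractor's actual JSON location.
with open("autogenerated/metrics.json", "r", encoding="utf-8") as f:
    metrics = json.load(f)

# List every external counter together with its labels.
for name, data in metrics.items():
    if data.get("metric_type") == "external" and data.get("type") == "counter":
        print(f"{name}: labels={data.get('labels', [])}")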

More info

See the provided READMEs for details.

@paulohtb6 paulohtb6 requested a review from JakeSCahill July 21, 2025 16:15

netlify bot commented Jul 21, 2025

Deploy Preview for docs-extensions-and-macros ready!

🔨 Latest commit: 9c382b1
🔍 Latest deploy log: https://app.netlify.com/projects/docs-extensions-and-macros/deploys/688295e0da3eb5000844ca2c
😎 Deploy Preview: https://deploy-preview-118--docs-extensions-and-macros.netlify.app

Contributor

coderabbitai bot commented Jul 21, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This change introduces a new, source-based workflow for extracting and generating Redpanda metrics documentation. It adds a Python-based metrics extractor tool using Tree-sitter to parse C++ source files, a Makefile to automate environment setup and extraction, and CLI integration via new commands in bin/doc-tools.js. The workflow supports generating metrics docs from a specific Redpanda tag/branch or local source, outputs both AsciiDoc and JSON files, and includes improved logging, dependency checks, and readiness polling. Supporting scripts, documentation, and tests are also added or updated. Legacy Docker-based extraction is retained with enhanced logging. Additional improvements include refined Docker Compose service dependencies and healthchecks, updated CLI scripts for better logging and readiness checks, and a new metrics comparison tool for AsciiDoc files.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant DocToolsCLI
    participant Makefile
    participant MetricsExtractor (Python)
    participant TreeSitter
    participant RedpandaRepo

    User->>DocToolsCLI: Run "generate metrics-docs" or "source-metrics-docs"
    DocToolsCLI->>Makefile: Invoke with tag/branch or local path
    Makefile->>RedpandaRepo: Clone or use local source
    Makefile->>TreeSitter: Build C++ grammar
    Makefile->>MetricsExtractor: Run metrics_extractor.py
    MetricsExtractor->>TreeSitter: Parse C++ files for metrics
    MetricsExtractor->>RedpandaRepo: Read C++ source files
    MetricsExtractor-->>Makefile: Output JSON and AsciiDoc files
    Makefile-->>DocToolsCLI: Return generated documentation paths
    DocToolsCLI-->>User: Print completion and output file locations

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~150 minutes

Possibly related PRs

Suggested reviewers

  • kbatuigas
  • Feediver1
@paulohtb6 paulohtb6 marked this pull request as ready for review July 22, 2025 22:45
@paulohtb6 paulohtb6 changed the title WIP: New metrics automation New metrics automation Jul 22, 2025
@JakeSCahill
Contributor

@coderabbitai review

Contributor

coderabbitai bot commented Jul 23, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@JakeSCahill I'll conduct a comprehensive review of this substantial PR that introduces a new source-based metrics extraction workflow. Let me analyze the key components systematically.

Based on my comprehensive analysis, here's my detailed review:

🔍 Overall Assessment

This is an ambitious and well-structured PR that introduces a modern, source-based approach to metrics documentation generation. The implementation demonstrates solid software engineering practices with proper separation of concerns, comprehensive tooling, and good documentation.

Strengths

1. Architecture & Design

  • Clean separation of concerns: Parser, bag container, extractor, and CLI layers are well-defined
  • Flexible output formats: Supports both JSON and AsciiDoc with internal/external classification
  • Comprehensive tooling: Makefile automation, CLI integration, and validation scripts
  • Tree-sitter integration: Modern AST-based parsing is more reliable than regex-based approaches

2. Code Quality

  • Robust error handling: Good use of try-catch blocks and logging
  • Input validation: Path validation, dependency checks, and sanitization functions
  • Documentation: Well-documented modules with clear docstrings
  • Testing: Includes test infrastructure and validation scripts

3. User Experience

  • Multiple workflows: Supports GitHub tags, local repos, and legacy Docker method
  • Progress feedback: Enhanced logging with timestamps and status updates
  • Dependency management: Automated dependency verification and setup

⚠️ Areas for Improvement

1. Security Considerations

# In metrics_parser.py - potential security risk
def get_file_contents(path):
    with open(path, 'r') as f:  # No encoding specified
        return f.read()

Recommendation: Specify encoding explicitly and add file size limits:

def get_file_contents(path, max_size=10*1024*1024):  # 10MB limit
    if os.path.getsize(path) > max_size:
        raise ValueError(f"File {path} too large")
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()

2. Resource Management

The Makefile performs git clones and builds without cleanup on failure. Consider adding:

.ONESHELL:
clean-on-error:
	@if [ -d "$(REDPANDA_REPO_DIR)" ]; then rm -rf "$(REDPANDA_REPO_DIR)"; fi

3. Error Handling Enhancement

In metrics_extractor.py, some error paths don't provide enough context:

# Current
if not os.path.exists(options.redpanda_source_path):
    print(f"Error: Path {options.redpanda_source_path} does not exist")
    sys.exit(1)

# Suggested improvement
if not os.path.exists(options.redpanda_source_path):
    print(f"Error: Redpanda source path '{options.redpanda_source_path}' does not exist.")
    print("Please ensure the path is correct and accessible.")
    print("For remote repos, use: git clone <repo_url> <local_path>")
    sys.exit(1)

4. Performance Optimization

The metrics extraction processes all C++ files without parallelization. For large codebases, consider:

from concurrent.futures import ProcessPoolExecutor
import multiprocessing

def extract_metrics_parallel(cpp_files, parser, language, filter_namespace=None):
    num_processes = min(multiprocessing.cpu_count(), len(cpp_files))
    with ProcessPoolExecutor(max_workers=num_processes) as executor:
        # Implementation details...

5. Configuration Management

Hard-coded values should be configurable:

# In metrics_extractor.py
DEFAULT_TREESITTER_VERSION = "v0.20.5"  # Should be configurable
MAX_FILE_SIZE = 10 * 1024 * 1024  # Should be in config

📋 Specific Code Issues

1. Tree-sitter Version Pinning

# In Makefile - version should be configurable
TREESITTER_VERSION ?= v0.20.5

2. Memory Usage

The MetricsBag class stores all metrics in memory. For very large codebases, consider implementing a streaming approach or pagination.

3. Regex Complexity

In metrics_parser.py, some regex patterns are complex and could benefit from compilation:

# Compile regex patterns at module level
LABEL_PATTERN = re.compile(r'["\']([^"\']+)["\']')

🧪 Testing Recommendations

  1. Add integration tests for the complete workflow
  2. Test error conditions (missing dependencies, malformed C++ files)
  3. Performance benchmarks for large codebases
  4. Cross-platform testing (Windows/macOS compatibility)

📚 Documentation Enhancements

The documentation is comprehensive, but consider adding:

  • Troubleshooting section for common issues
  • Performance tuning guide for large repositories
  • Contributing guidelines for extending the extractor

🎯 Final Verdict

This is a high-quality PR that significantly improves the metrics documentation workflow. The code demonstrates strong engineering practices, and the architecture is well-designed for maintainability and extensibility.

Recommendation: Approve with minor revisions focusing on the security and error handling improvements mentioned above.

🚀 Next Steps

  1. Address the security considerations around file handling
  2. Add resource cleanup for failed operations
  3. Consider performance optimizations for large codebases
  4. Enhance error messages with actionable guidance

The new source-based approach is a significant improvement over the Docker-based legacy method, providing better reliability, performance, and maintainability. Excellent work! 🎉


🏁 Script executed:

#!/bin/bash
# First, let's get an overview of the changes and file structure
echo "=== PR Overview ==="
echo "Files changed: $(git diff --name-only HEAD~1 | wc -l)"
echo "Lines added/removed:"
git diff --stat HEAD~1

echo -e "\n=== New Python modules structure ==="
find tools/metrics-extractor -name "*.py" | head -10

echo -e "\n=== Key configuration files ==="
ls -la tools/metrics-extractor/Makefile 2>/dev/null || echo "Makefile not found"
ls -la tools/metrics-extractor/requirements.txt 2>/dev/null || echo "requirements.txt not found"

echo -e "\n=== CLI integration ==="
grep -n "metrics-docs" bin/doc-tools.js | head -5

Length of output: 1497


🏁 Script executed:

#!/bin/bash
# Analyze the main Python modules for code quality and structure
echo "=== Analyzing metrics_extractor.py ==="
python -c "
import ast
with open('tools/metrics-extractor/metrics_extractor.py', 'r') as f:
    tree = ast.parse(f.read())
    
functions = [node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)]
print(f'Functions defined: {len(functions)}')
print('Function names:', functions[:10])

classes = [node.name for node in ast.walk(tree) if isinstance(node, ast.ClassDef)]
print(f'Classes defined: {len(classes)}')
if classes:
    print('Class names:', classes)
"

echo -e "\n=== Analyzing metrics_parser.py ==="
python -c "
import ast
with open('tools/metrics-extractor/metrics_parser.py', 'r') as f:
    tree = ast.parse(f.read())
    
functions = [node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)]
print(f'Functions defined: {len(functions)}')
print('Function names:', functions[:10])
"

echo -e "\n=== Checking error handling patterns ==="
grep -n "except\|try\|raise\|logger\|logging" tools/metrics-extractor/metrics_extractor.py | head -10

Length of output: 2430


🏁 Script executed:

#!/bin/bash
# Check the CLI integration and dependency management
echo "=== CLI Commands Added ==="
grep -A 5 -B 2 "metrics-docs" bin/doc-tools.js

echo -e "\n=== Dependency verification function ==="
grep -A 10 "verifyMetricsExtractorDependencies" bin/doc-tools.js

echo -e "\n=== Package.json scripts ==="
grep -A 5 -B 2 "metrics" package.json

echo -e "\n=== Requirements analysis ==="
cat tools/metrics-extractor/requirements.txt

Length of output: 3437


🏁 Script executed:

#!/bin/bash
# Analyze the Makefile structure and targets
echo "=== Makefile targets ==="
grep "^[a-zA-Z].*:" tools/metrics-extractor/Makefile | head -15

echo -e "\n=== Environment variables and configuration ==="
grep "^[A-Z_]*=" tools/metrics-extractor/Makefile | head -10

echo -e "\n=== Testing infrastructure ==="
find tools/metrics-extractor -name "*test*" -type f
ls -la tools/metrics-extractor/tests/ 2>/dev/null || echo "No tests directory found"

Length of output: 1171


🏁 Script executed:

#!/bin/bash
# Check for security and best practices issues
echo "=== Security analysis ==="
echo "Checking for subprocess calls without shell=False:"
grep -rn "subprocess\|os\.system\|shell=True" tools/metrics-extractor/ | head -10

echo -e "\n=== File permissions and executable checks ==="
find tools/metrics-extractor -name "*.py" -executable | head -5

echo -e "\n=== Hardcoded paths or secrets ==="
grep -rn "localhost\|127\.0\.0\.1\|password\|secret\|token" tools/metrics-extractor/ | head -5

echo -e "\n=== Input validation patterns ==="
grep -rn "validate\|sanitize\|clean" tools/metrics-extractor/ | head -10

Length of output: 1830


🏁 Script executed:

#!/bin/bash
# Analyze documentation and usability
echo "=== Documentation files ==="
find . -name "README*" -newer tools/metrics-extractor/README.adoc 2>/dev/null || find . -name "README*"

echo -e "\n=== Help text and usage ==="
grep -rn "help\|usage\|--help" bin/doc-tools.js | grep -i metrics

echo -e "\n=== Error messages and user feedback ==="
grep -rn "console\.log\|console\.error\|print" tools/metrics-extractor/ | head -10

Length of output: 1515


Based on my comprehensive analysis of this substantial PR, here's my detailed review:

🔍 Overall Assessment

This is an excellent, well-architected PR that introduces a modern source-based metrics extraction workflow. The implementation demonstrates strong software engineering practices with proper separation of concerns, comprehensive tooling, and thoughtful user experience design.

Major Strengths

1. Architecture Excellence

  • Clean modular design: MetricsBag (container) → metrics_parser (AST analysis) → metrics_extractor (orchestration)
  • Tree-sitter integration: Modern AST-based parsing is far superior to regex approaches
  • Flexible output: JSON + AsciiDoc with internal/external classification
  • Multi-workflow support: GitHub tags, local repos, and legacy Docker fallback

2. Robust Implementation

  • Proper dependency management: verifyMetricsExtractorDependencies() checks for make, Python 3.10+, Git, and C++ compiler
  • Good error handling: Comprehensive try-catch blocks and informative logging
  • Safe subprocess usage: No shell=True usage found - security conscious
  • User-friendly feedback: Progress indicators, timestamps, and clear success/error messages

3. Comprehensive Tooling

  • Makefile automation: 15+ targets covering build, test, clean, and development workflows
  • CLI integration: 3 new commands seamlessly integrated into existing doc-tools
  • Validation scripts: End-to-end validation and dependency checking

⚠️ Areas for Improvement

1. Enhanced Error Handling & File Operations

Current issue in metrics_parser.py:

def get_file_contents(path):
    with open(path, 'r') as f:  # No encoding, no size limits
        return f.read()

Recommended improvement:

def get_file_contents(path, max_size=10*1024*1024, encoding='utf-8'):
    """Read file contents with safety checks."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"File not found: {path}")
    
    file_size = os.path.getsize(path)
    if file_size > max_size:
        raise ValueError(f"File too large ({file_size} bytes): {path}")
    
    try:
        with open(path, 'r', encoding=encoding, errors='replace') as f:
            return f.read()
    except UnicodeDecodeError as e:
        raise ValueError(f"Cannot decode file {path}: {e}")

2. Performance Optimization for Large Codebases

The current implementation processes files sequentially. For large repositories like Redpanda, consider parallel processing:

from concurrent.futures import ThreadPoolExecutor
import multiprocessing

def extract_metrics_from_files_parallel(cpp_files, parser, language, filter_namespace=None):
    """Extract metrics using parallel processing."""
    max_workers = min(multiprocessing.cpu_count(), len(cpp_files), 8)  # Cap at 8
    
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(parse_cpp_file, file_path, parser, language, filter_namespace)
            for file_path in cpp_files
        ]
        
        results = MetricsBag()
        for future in concurrent.futures.as_completed(futures):
            try:
                file_results = future.result()
                results.merge(file_results)
            except Exception as e:
                logger.warning(f"Failed to process file: {e}")
        
        return results

3. Memory Management for Large Datasets

The MetricsBag class stores everything in memory. For very large codebases, consider streaming:

class MetricsBag:
    def __init__(self, max_memory_items=50000):
        self._metrics = {}
        self._max_memory_items = max_memory_items
        self._temp_file = None
    
    def add_metric(self, *args, **kwargs):
        if len(self._metrics) > self._max_memory_items:
            self._spill_to_disk()
        # ... existing logic
    
    def _spill_to_disk(self):
        """Spill excess metrics to temporary file."""
        # Implementation for memory management

4. Enhanced Input Validation

Current validation in metrics_extractor.py:

def validate_paths(options):
    path = options.redpanda_source_path
    if not os.path.exists(path):
        logger.error(f'Path does not exist: "{path}".')
        return False
    return True

Enhanced validation:

def validate_paths(options):
    """Comprehensive path validation with helpful error messages."""
    path = options.redpanda_source_path
    
    if not path:
        logger.error("Redpanda source path is required.")
        return False
    
    if not os.path.exists(path):
        logger.error(f'Path does not exist: "{path}"')
        logger.error("Suggestions:")
        logger.error("  • Check the path spelling")
        logger.error("  • For remote repos: git clone <repo_url> <local_path>")
        logger.error("  • Ensure you have read permissions")
        return False
    
    if not os.path.isdir(path):
        logger.error(f'Path is not a directory: "{path}"')
        return False
    
    # Check for key Redpanda files/directories
    expected_paths = ['src/', 'CMakeLists.txt', 'tools/']
    missing_paths = [p for p in expected_paths if not os.path.exists(os.path.join(path, p))]
    
    if missing_paths:
        logger.warning(f"This may not be a Redpanda repository. Missing: {missing_paths}")
    
    return True

5. Expanded Testing Coverage

Currently only one test file exists. Recommend adding:

# tests/test_metrics_bag.py
def test_metrics_bag_operations():
    """Test MetricsBag core functionality."""
    
# tests/test_parser_edge_cases.py  
def test_malformed_cpp_files():
    """Test parser robustness with malformed C++ files."""
    
# tests/test_cli_integration.py
def test_cli_commands():
    """Test CLI command integration."""
    
# tests/test_performance.py
def test_large_codebase_performance():
    """Performance benchmarks for large repositories."""

📊 Code Quality Metrics

Excellent:

  • No security anti-patterns (shell=True, hardcoded secrets)
  • Proper logging infrastructure
  • Good separation of concerns
  • Comprehensive CLI integration

Good:

  • Error handling coverage
  • User feedback and progress indicators
  • Documentation completeness

⚠️ Needs attention:

  • Test coverage (only 1 test file for 5 modules)
  • Memory usage optimization
  • File encoding handling

🧪 Testing Recommendations

  1. Integration tests for complete workflows
  2. Edge case testing for malformed C++ files
  3. Performance benchmarks with large codebases
  4. Cross-platform compatibility testing
  5. Error condition testing (missing dependencies, network failures)

📚 Documentation Enhancements

The documentation is comprehensive, but consider adding:

# Troubleshooting Guide
## Common Issues
- Tree-sitter compilation failures
- Memory issues with large repositories
- C++ parsing edge cases

# Performance Tuning
- Parallel processing configuration
- Memory management for large codebases
- Filtering strategies for faster processing

🎯 Final Recommendation

APPROVE - This is exemplary work that significantly improves the metrics documentation workflow. The architecture is sound, implementation is robust, and the user experience is well-designed.

🚀 Priority Action Items

  1. High Priority: Add file encoding safety to get_file_contents()
  2. Medium Priority: Expand test coverage with integration tests
  3. Medium Priority: Add parallel processing for large repositories
  4. Low Priority: Enhanced error messages with troubleshooting hints

🌟 Standout Features

  • Tree-sitter integration - Modern, reliable parsing approach
  • Dependency verification - Comprehensive pre-flight checks
  • Multi-output formats - JSON + categorized AsciiDoc
  • Makefile automation - Production-ready build system
  • Legacy compatibility - Maintains existing Docker workflow

This PR represents a significant architectural improvement that will make metrics documentation generation more reliable, maintainable, and performant. Excellent engineering work! 🎉

🐰✨
Impressed by the thoughtful architecture and comprehensive implementation!

@@ -429,7 +460,51 @@ function diffDirs(kind, oldTag, newTag) {

automation
.command('metrics-docs')
.description('Generate JSON and AsciiDoc documentation for Redpanda metrics')
.description('Generate JSON and AsciiDoc documentation for Redpanda metrics from source code')
.requiredOption('-r, --redpanda-repo <path>', 'Path to the Redpanda repository root directory')
Contributor

Why is this required? For the property-docs command, we require a tag and use that to clone and check out the repo.

Please give this command the same signature as property-docs. It should support the diff option.

automation
  .command('property-docs')
  .description('Generate JSON and AsciiDoc documentation for Redpanda configuration properties')
  .option('--tag <tag>', 'Git tag or branch to extract from', 'dev')
  .option('--diff <oldTag>', 'Also diff autogenerated properties from <oldTag> → <tag>')
  .action((options) => {
    verifyPropertyDependencies();

    const newTag = options.tag;
    const oldTag = options.diff;
    const cwd = path.resolve(__dirname, '../tools/property-extractor');
    const make = (tag) => {
      console.log(`⏳ Building property docs for ${tag}…`);
      const r = spawnSync('make', ['build', `TAG=${tag}`], { cwd, stdio: 'inherit' });
      if (r.error) {
        console.error(`❌ ${r.error.message}`);
        process.exit(1);
      }
      if (r.status !== 0) process.exit(r.status);
    };

    if (oldTag) {
      const oldDir = path.join('autogenerated', oldTag, 'properties');
      if (!fs.existsSync(oldDir)) make(oldTag);
    }

    make(newTag);

    if (oldTag) {
      diffDirs('properties', oldTag, newTag);
    }

    process.exit(0);
  });


const startTime = Date.now();
const result = spawnSync('python3', [
path.join(__dirname, '../tools/metrics-extractor/metrics_extractor.py'),
Contributor

Since you have a Makefile, we should execute make build instead like we do for property-docs.

.option('--internal-asciidoc <path>', 'Custom path for internal metrics AsciiDoc file', 'autogenerated/internal_metrics_reference.adoc')
.option('--external-asciidoc <path>', 'Custom path for external/public metrics AsciiDoc file', 'autogenerated/public_metrics_reference.adoc')
.action((options) => {
console.log(`🎯 Starting enhanced metrics extraction from source code`);
Contributor

Suggested change
console.log(`🎯 Starting enhanced metrics extraction from source code`);

.description('Generate JSON and AsciiDoc documentation for Redpanda metrics from source code')
.requiredOption('-r, --redpanda-repo <path>', 'Path to the Redpanda repository root directory')
.option('--json-output <path>', 'Custom path for JSON output file', 'autogenerated/metrics.json')
.option('--internal-asciidoc <path>', 'Custom path for internal metrics AsciiDoc file', 'autogenerated/internal_metrics_reference.adoc')
Contributor

I like these new options for specifying paths.

We don't typically use underscores in Asciidoc filenames. Please replace with hyphens.

@@ -713,6 +792,41 @@ automation
process.exit(0);
});

automation
.command('source-metrics-docs')
Contributor

Not sure why this is a separate command. This should be called metrics-docs and should replace the other one in this file.

Contributor

Please also keep support for specifying paths for the output


All commands generate three files in the `autogenerated/` directory:

* `internal_metrics_reference.adoc` - Internal metrics for engineering documentation
Contributor

Update these docs when you fix the filenames from using underscores to hyphens.

filtered = {}
seen_names = set() # Track metric names to detect duplicates

for name, data in metrics.items():
Contributor

Nice!


# Create more meaningful category names
category_mapping = {
'cluster': 'Cluster metrics',
Contributor

So nice

});

automation
.command('metrics-docs-legacy')
Contributor

I would just remove the legacy implementation.

Contributor

We're using semver versioning, so we can always install a previous version to try it again.

@@ -21,7 +21,9 @@
"get-console-version": "doc-tools get-console-version",
"build": "antora --to-dir docs --fetch local-antora-playbook.yml",
"serve": "wds --node-resolve --open preview/test/ --watch --root-dir docs",
"test": "jest"
"test": "jest",
Contributor

Please bump the version in this file. Since this is quite a large change, bump the minor version to 4.8.0 and then npm install to update the lock file.


SHELL := /bin/bash
TAG ?= dev
OUTPUT_DIR := autogenerated/$(TAG)/source-metrics
Contributor
@JakeSCahill JakeSCahill Jul 23, 2025

Since the CLI is going to be installed inside our content repos, we'll want the output saved to the root of the repo rather than relative to this package in node_modules so it's easy to find.

This is what we did for the property-docs automation:

REPO_ROOT     := $(shell git rev-parse --show-toplevel)
MODULE_ROOT   := $(shell cd "$(dir $(realpath $(lastword $(MAKEFILE_LIST))))"/../.. && pwd)
TOOL_ROOT     := $(MODULE_ROOT)/tools/property-extractor
TMP_ROOT      := $(TOOL_ROOT)/tmp
REDPANDA_SRC  := $(TMP_ROOT)/redpanda
TREESITTER_DIR:= $(TOOL_ROOT)/tree-sitter/tree-sitter-cpp
VENV          := $(TOOL_ROOT)/tmp/redpanda-property-extractor-venv
PYTHON        := $(VENV)/bin/python
OUTPUT_DIR    := $(REPO_ROOT)/autogenerated/$(TAG)/properties

@JakeSCahill
Contributor

Looking really nice @paulohtb6 thanks for working on this!

When it's ready, can you update our wiki: https://redpandadata.atlassian.net/wiki/spaces/DOC/pages/1185054748/Doc+Tools+CLI

@JakeSCahill
Contributor

Some nice suggestions and praise from 🐰 !

take a look at some of the suggestions: #118 (comment)

Contributor
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 12

♻️ Duplicate comments (5)
package.json (1)

24-27: Version bump still missing & lock-file not updated
Jake already requested raising the package version (→ 4.8.0) and regenerating the lock file. That hasn’t happened yet. Please bump the version and run npm install to commit the updated package-lock.json.

tools/metrics-extractor/README.adoc (1)

35-36: Address the filename convention feedback

The filenames still use underscores instead of hyphens as previously mentioned by JakeSCahill. Please update to use hyphens for consistency with project conventions.

bin/doc-tools.js (2)

466-467: Use hyphens instead of underscores in filenames

As previously mentioned by JakeSCahill, filenames should use hyphens instead of underscores for consistency.


483-488: Use make build instead of direct Python execution

As previously mentioned by JakeSCahill, this should execute make build instead of calling the Python script directly, similar to how property-docs works.

tools/metrics-extractor/Makefile (1)

6-6: Output should be saved to repo root, not relative to this package

As previously mentioned by JakeSCahill, the output should be saved to the repository root rather than relative to this package location. This makes it easier to find when installed in content repos.

Consider updating to use the same pattern as property-extractor:

REPO_ROOT     := $(shell git rev-parse --show-toplevel)
OUTPUT_DIR    := $(REPO_ROOT)/autogenerated/$(TAG)/source-metrics
🧹 Nitpick comments (13)
cli-utils/python-venv.sh (1)

10-10: Trim the trailing whitespace to avoid noisy diffs
The extra space after the closing quote ("..."␠) is harmless at runtime but unnecessarily clutters future git blame/diffs.

-echo "Recreating Python venv at $VENV_DIR..." 
+echo "Recreating Python venv at $VENV_DIR..."
tools/metrics-extractor/requirements.txt (1)

1-2: Pinning strategy & supply-chain hygiene

  1. tree_sitter==0.21.1 – hard-pinning implies you’ll miss upstream security/bug fixes. Prefer ~=0.21 unless reproducibility absolutely mandates a full pin.
  2. Consider adding --require-hashes support (hash-checked requirements) to mitigate dependency-confusion or typosquatting attacks.
cli-utils/install-test-dependencies.sh (1)

16-16: Minor whitespace nit
Same trailing-space issue as the venv script; clean-up keeps history tidy.

-    echo "Node.js version: $(node -v)" 
+    echo "Node.js version: $(node -v)"
package.json (1)

24-27: Script names OK but undocumented
The new metrics:extract and metrics:legacy targets look good; ensure README/CONTRIB docs mention them so users don’t rely on the raw npx incantation only.

tools/metrics-extractor/tests/test_extraction.py (1)

76-79: Consider making tree-sitter paths configurable.

The hardcoded paths for tree-sitter components ("tree-sitter", "tree-sitter-cpp.so") could be made configurable to improve test portability across different environments.

Consider using environment variables or configuration parameters:

-        parser, language = get_treesitter_cpp_parser_and_language("tree-sitter", "tree-sitter-cpp.so")
+        parser, language = get_treesitter_cpp_parser_and_language(
+            os.environ.get("TREESITTER_PATH", "tree-sitter"),
+            os.environ.get("TREESITTER_CPP_LIB", "tree-sitter-cpp.so")
+        )
tools/metrics-extractor/validate.py (1)

12-33: Use importlib.util.find_spec for cleaner dependency checking

Consider using importlib.util.find_spec instead of try/except imports to check module availability without actually importing them.

+import importlib.util
+
 def check_dependencies():
     """Check if all required dependencies are available"""
     print("🔧 Checking dependencies...")
     
-    try:
-        import tree_sitter
+    if importlib.util.find_spec('tree_sitter') is not None:
         print("  ✓ tree-sitter is available")
-    except ImportError:
+    else:
         print("  ❌ tree-sitter not found. Install with: pip install tree-sitter")
         return False
     
     # Check if we can import our modules
-    try:
-        from metrics_parser import build_treesitter_cpp_library, extract_metrics_from_files
-        from metrics_bag import MetricsBag
+    modules_to_check = ['metrics_parser', 'metrics_bag']
+    for module in modules_to_check:
+        if importlib.util.find_spec(module) is None:
+            print(f"  ❌ Module '{module}' not found")
+            return False
+    
+    try:
+        # Verify specific functions exist by importing them
+        from metrics_parser import build_treesitter_cpp_library, extract_metrics_from_files
+        from metrics_bag import MetricsBag
         print("  ✓ All custom modules are available")
     except ImportError as e:
         print(f"  ❌ Import error: {e}")
         return False
     
     return True
tools/metrics-extractor/metrics_extractor.py (3)

20-27: Enhance path validation for better user experience.

The current validation only checks if the path exists. Consider adding more comprehensive checks to ensure the path is a valid Redpanda repository.

 def validate_paths(options):
     path = options.redpanda_repo
 
     if not os.path.exists(path):
         logger.error(f'Path does not exist: "{path}".')
+        logger.error(f'Please ensure you have cloned the Redpanda repository or specify a valid path.')
         sys.exit(1)
+    
+    if not os.path.isdir(path):
+        logger.error(f'Path is not a directory: "{path}".')
+        sys.exit(1)
+    
+    # Check for expected Redpanda source structure
+    expected_markers = ['src', 'CMakeLists.txt']
+    if not any(os.path.exists(os.path.join(path, marker)) for marker in expected_markers):
+        logger.warning(f'Path may not be a valid Redpanda repository: "{path}".')
+        logger.warning('Expected to find "src" directory or "CMakeLists.txt" file.')

158-372: Consider refactoring this large function for better maintainability.

The function is over 200 lines and handles multiple responsibilities. The extensive hardcoded category mappings could be externalized.

Extract the category mapping logic into a separate configuration:

# Define at module level or in a separate config file
METRIC_CATEGORY_MAPPINGS = {
    'cluster': 'Cluster metrics',
    'kafka': 'Kafka metrics',
    'raft': 'Raft metrics',
    # ... rest of the mappings
}

# Keywords that map to application metrics
APPLICATION_METRIC_KEYWORDS = {
    'active', 'adjacent', 'anomalies', 'available', 'backlog',
    # ... rest of the keywords
}

def get_metric_category(metric_name):
    """Determine the category for a metric based on its name."""
    # Remove prefixes
    clean_name = metric_name
    for prefix in ['redpanda_', 'vectorized_']:
        if clean_name.startswith(prefix):
            clean_name = clean_name[len(prefix):]
            break
    
    # Get first part
    parts = clean_name.split('_')
    category = parts[0] if parts else 'other'
    
    # Map to category name
    if category in METRIC_CATEGORY_MAPPINGS:
        return METRIC_CATEGORY_MAPPINGS[category]
    elif category in APPLICATION_METRIC_KEYWORDS:
        return 'Application metrics'
    else:
        return 'Other metrics'

Also consider splitting the function into smaller functions:

  • separate_metrics_by_type()
  • group_metrics_by_category()
  • write_metrics_documentation()

455-455: Remove unnecessary f-string prefix.

The string doesn't contain any placeholders.

-    print(f"📊 Metrics by type:")
+    print("📊 Metrics by type:")
tools/metrics/compare_metrics.py (1)

308-317: Good error handling, but consider file size limits.

While the error handling is good, consider adding file size validation to prevent issues with very large files.

     try:
+        # Check file sizes first
+        for filepath in [args.file1, args.file2]:
+            file_size = os.path.getsize(filepath)
+            if file_size > 10 * 1024 * 1024:  # 10MB limit
+                print(f"Warning: {filepath} is large ({file_size / 1024 / 1024:.1f}MB). This may take a while.")
+        
         with open(args.file1, 'r', encoding='utf-8') as f:
             content1 = f.read()
tools/metrics-extractor/metrics_parser.py (3)

101-129: Consider compiling regex patterns for better performance.

The regex patterns are compiled on every function call. For better performance, especially when processing many files, compile them once at module level.

# At module level
LABEL_PATTERNS = [
    re.compile(r'\.aggregate\s*\(\s*([^)]+)\s*\)'),  # .aggregate(aggregate_labels)
    re.compile(r'auto\s+(\w*labels\w*)\s*='),        # auto aggregate_labels =
    re.compile(r'std::vector<[^>]*>\s*{([^}]+)}'),   # std::vector<sm::label>{sm::shard_label}
    re.compile(r'sm::([a-z_]*label[a-z_]*)'),        # sm::shard_label, sm::topic_label, etc.
    re.compile(r'"([^"]+)"\s*:\s*[^,}]+'),           # key-value pairs
]

def extract_labels_from_code(code_context):
    """Extract potential label names from code context around metrics"""
    labels = set()
    
    for pattern in LABEL_PATTERNS:
        matches = pattern.findall(code_context)
        # ... rest of the function

231-255: Simplify nested conditions where possible.

Some nested if statements can be combined for better readability.

-                        if inner_function and '::metrics_name' in inner_function.text.decode('utf-8'):
-                            # Found it. Extract the string literal from its arguments.
-                            if inner_args and inner_args.named_child_count > 0:
+                        if (inner_function and '::metrics_name' in inner_function.text.decode('utf-8') and
+                            inner_args and inner_args.named_child_count > 0):
                                 group_name_node = inner_args.named_children[0]
                                 if group_name_node.type == 'string_literal':

1115-1128: Consider parallel processing for better performance.

The current implementation processes files sequentially. For large codebases, parallel processing could significantly improve performance.

from concurrent.futures import ProcessPoolExecutor, as_completed
import multiprocessing

def extract_metrics_from_files(cpp_files, treesitter_parser, cpp_language, filter_namespace=None):
    """Extract metrics from multiple C++ files"""
    all_metrics = MetricsBag()
    
    # Use process pool for parallel parsing
    max_workers = min(multiprocessing.cpu_count(), len(cpp_files))
    
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        # Submit all files for processing
        future_to_file = {
            executor.submit(parse_cpp_file, file_path, treesitter_parser, cpp_language, filter_namespace): file_path
            for file_path in cpp_files
        }
        
        # Collect results as they complete
        for future in as_completed(future_to_file):
            file_path = future_to_file[future]
            try:
                file_metrics = future.result()
                all_metrics.merge(file_metrics)
            except Exception as e:
                logger.warning(f"Failed to process {file_path}: {e}")
    
    return all_metrics

Note: This would require making the parser objects pickleable or recreating them in each process.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 922f9fe and ab66342.

⛔ Files ignored due to path filters (1)
  • tools/metrics-extractor/tree-sitter-cpp.so is excluded by !**/*.so
📒 Files selected for processing (18)
  • README.adoc (1 hunks)
  • bin/doc-tools.js (6 hunks)
  • cli-utils/generate-cluster-docs.sh (3 hunks)
  • cli-utils/install-test-dependencies.sh (1 hunks)
  • cli-utils/python-venv.sh (1 hunks)
  • docker-compose/25.1/docker-compose.yml (4 hunks)
  • docker-compose/docker-compose.yml (1 hunks)
  • package.json (1 hunks)
  • tools/metrics-extractor/Makefile (1 hunks)
  • tools/metrics-extractor/README.adoc (1 hunks)
  • tools/metrics-extractor/metrics_bag.py (1 hunks)
  • tools/metrics-extractor/metrics_extractor.py (1 hunks)
  • tools/metrics-extractor/metrics_parser.py (1 hunks)
  • tools/metrics-extractor/requirements.txt (1 hunks)
  • tools/metrics-extractor/tests/test_extraction.py (1 hunks)
  • tools/metrics-extractor/validate.py (1 hunks)
  • tools/metrics/compare_metrics.py (1 hunks)
  • tools/metrics/metrics.py (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tools/metrics-extractor/validate.py (4)
bin/doc-tools.js (4)
  • os (4-4)
  • result (305-305)
  • result (482-488)
  • result (619-627)
tools/metrics-extractor/metrics_parser.py (2)
  • build_treesitter_cpp_library (61-64)
  • extract_metrics_from_files (1115-1127)
tools/metrics-extractor/metrics_bag.py (1)
  • MetricsBag (9-211)
tools/metrics-extractor/metrics_extractor.py (1)
  • main (374-459)
🪛 checkmake (0.2.2)
tools/metrics-extractor/Makefile

[warning] 19-19: Target body for "help" exceeds allowed length of 5 (13).

(maxbodylength)


[warning] 69-69: Target body for "treesitter" exceeds allowed length of 5 (8).

(maxbodylength)


[warning] 97-97: Target body for "clean" exceeds allowed length of 5 (7).

(maxbodylength)


[warning] 15-15: Missing required phony target "test"

(minphony)

🪛 Ruff (0.12.2)
tools/metrics-extractor/validate.py

7-7: json imported but unused

Remove unused import: json

(F401)


17-17: tree_sitter imported but unused; consider using importlib.util.find_spec to test for availability

(F401)


25-25: metrics_parser.build_treesitter_cpp_library imported but unused; consider using importlib.util.find_spec to test for availability

(F401)


25-25: metrics_parser.extract_metrics_from_files imported but unused; consider using importlib.util.find_spec to test for availability

(F401)


26-26: metrics_bag.MetricsBag imported but unused; consider using importlib.util.find_spec to test for availability

(F401)


49-49: f-string without any placeholders

Remove extraneous f prefix

(F541)

tools/metrics-extractor/metrics_extractor.py

6-6: re imported but unused

Remove unused import: re

(F401)


12-12: metrics_bag.MetricsBag imported but unused

Remove unused import: metrics_bag.MetricsBag

(F401)


236-239: Combine if branches using logical or operator

Combine if branches

(SIM114)


238-241: Combine if branches using logical or operator

Combine if branches

(SIM114)


240-243: Combine if branches using logical or operator

Combine if branches

(SIM114)


455-455: f-string without any placeholders

Remove extraneous f prefix

(F541)

tools/metrics/compare_metrics.py

12-12: typing.Tuple imported but unused

Remove unused import

(F401)


12-12: typing.Optional imported but unused

Remove unused import

(F401)


231-231: f-string without any placeholders

Remove extraneous f prefix

(F541)


234-234: f-string without any placeholders

Remove extraneous f prefix

(F541)


243-243: f-string without any placeholders

Remove extraneous f prefix

(F541)


251-251: f-string without any placeholders

Remove extraneous f prefix

(F541)


272-272: f-string without any placeholders

Remove extraneous f prefix

(F541)


280-280: f-string without any placeholders

Remove extraneous f prefix

(F541)


287-287: f-string without any placeholders

Remove extraneous f prefix

(F541)

tools/metrics-extractor/metrics_parser.py

1-1: os imported but unused

Remove unused import: os

(F401)


3-3: subprocess imported but unused

Remove unused import: subprocess

(F401)


231-233: Use a single if statement instead of nested if statements

(SIM102)


309-309: f-string without any placeholders

Remove extraneous f prefix

(F541)


426-430: Use a single if statement instead of nested if statements

(SIM102)


428-430: Use a single if statement instead of nested if statements

(SIM102)


489-490: Use a single if statement instead of nested if statements

(SIM102)


553-554: Use a single if statement instead of nested if statements

(SIM102)


619-621: Use a single if statement instead of nested if statements

(SIM102)


637-638: Use a single if statement instead of nested if statements

(SIM102)


681-682: Use a single if statement instead of nested if statements

(SIM102)


698-699: Use a single if statement instead of nested if statements

(SIM102)

tools/metrics-extractor/metrics_bag.py

3-3: uuid imported but unused

Remove unused import: uuid

(F401)


125-125: Loop control variable name not used within loop body

Rename unused name to _name

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Redirect rules - docs-extensions-and-macros
  • GitHub Check: Header rules - docs-extensions-and-macros
  • GitHub Check: Pages changed - docs-extensions-and-macros
🔇 Additional comments (33)
tools/metrics/metrics.py (4)

123-142: Well-implemented filtering function for documentation cleanup.

The filter_metrics_for_docs function effectively addresses the need to exclude histogram suffixes and handle duplicates for documentation while preserving complete data for JSON output. The logic is sound and the logging provides good visibility into filtering operations.
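
For readers who want a feel for what that filtering involves, here is a minimal sketch; the suffix list and function body are assumptions for illustration, not the code under review.

# Illustrative sketch only; filter_metrics_for_docs in the PR may differ.
HISTOGRAM_SUFFIXES = ("_bucket", "_sum", "_count")  # assumed histogram suffixes

def filter_for_docs(metrics):
    """Drop histogram components and duplicate names for the AsciiDoc output."""
    filtered = {}
    seen_names = set()
    for name, data in metrics.items():
        if name.endswith(HISTOGRAM_SUFFIXES):
            continue  # keep histogram pieces out of the rendered docs
        if name in seen_names:
            continue  # skip duplicate metric names
        seen_names.add(name)
        filtered[name] = data
    return filtered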


183-184: Improved error handling for directory creation.

Good change from logging an error and exiting to actually creating the directory when it doesn't exist. This makes the script more robust and user-friendly.


207-209: Proper separation of filtered vs unfiltered metrics.

The approach of filtering metrics specifically for documentation (AsciiDoc) while keeping the original unfiltered data for JSON output is well-designed. This ensures documentation clarity without losing data completeness.


223-224: Consistent application of filtering to both metric types.

Both public and internal metrics are properly filtered using the new function before AsciiDoc generation, ensuring consistent documentation output across both metric types.

README.adoc (1)

969-1008: Excellent documentation for the new CLI tools.

The new CLI Tools section provides clear and comprehensive documentation for the metrics automation features introduced in this PR. The documentation:

  • Clearly explains the three different workflows (source-based, local repo, legacy Docker)
  • Documents the output files users can expect
  • Provides practical usage examples with appropriate command syntax
  • Includes helpful context about when to use each approach

This aligns perfectly with the PR objectives for introducing reliable automated metrics documentation generation.

tools/metrics-extractor/tests/test_extraction.py (2)

1-61: Well-structured test with comprehensive sample C++ code.

The test provides good coverage of the metrics extraction functionality with a realistic sample that includes various metric constructor types. The C++ code structure accurately represents what would be found in the Redpanda codebase.


88-104: Robust test validation with good error reporting.

The test validation logic effectively checks both the presence and type correctness of extracted metrics. The visual feedback with checkmarks and error messages makes test results easy to interpret.

docker-compose/25.1/docker-compose.yml (3)

54-54: Improved healthcheck simplification.

Changing the healthcheck from rpk cluster info to rpk version is a good improvement. The rpk version command is faster, more reliable, and doesn't require cluster connectivity, making the healthcheck more robust during startup.


59-62: Enhanced service dependency management.

The explicit depends_on conditions with service_started, service_healthy, and service_completed_successfully ensure proper startup sequencing. This is particularly important for the metrics extraction workflow where stable cluster initialization is critical.

Also applies to: 93-98, 129-134


402-407: Modernized MinIO client configuration.

The update from mc config host add to mc alias set aligns with current MinIO client best practices. The explicit exit with status 0 provides cleaner completion signaling compared to the previous approach of tailing /dev/null.

cli-utils/generate-cluster-docs.sh (5)

5-8: Excellent logging enhancement for better user experience.

The log_step function with timestamps significantly improves the user experience by providing clear progress indicators and timing information. This is particularly valuable for long-running operations like cluster setup.


25-28: Good pre-flight check addition.

Adding the curl dependency check is a smart improvement since the script uses curl for metrics endpoint polling. This provides early failure detection with clear error messaging.


78-92: Much improved readiness checking with polling.

Replacing the fixed 300-second sleep with an intelligent polling loop is a significant improvement. The script now:

  • Actively checks endpoint readiness instead of blindly waiting
  • Provides faster execution when the endpoint becomes ready early
  • Includes proper timeout handling with clear error messaging
  • Gives users better feedback about the waiting process

103-103: Correct path update for new metrics extractor.

The path change from metrics/requirements.txt to metrics-extractor/requirements.txt properly aligns with the new metrics extraction tooling structure introduced in this PR.


113-116: Updated command reflects new metrics extraction workflow.

The command update from the old metrics.py script to the new metrics_extractor.py with explicit JSON and separate internal/external AsciiDoc outputs correctly implements the new source-based metrics documentation generation approach described in the PR objectives.

tools/metrics-extractor/README.adoc (1)

1-73: Well-documented tool with clear usage instructions

The README provides comprehensive documentation for the metrics extractor tool, including quick start guides, technical details, prerequisites, and development commands. The structure and content are excellent.

tools/metrics-extractor/validate.py (1)

58-169: Well-structured validation script

The validation functions and main orchestration logic are well-implemented. The script provides clear feedback at each step and handles errors appropriately.

bin/doc-tools.js (3)

267-286: Dependencies check looks good

The new verifyMetricsExtractorDependencies function properly checks for all required dependencies including make, Python 3.10+, Git, and a C++ compiler.


418-431: Good improvements to cluster docs execution

The enhanced logging, timing, and error reporting provide much better user feedback during the cluster documentation generation process.


505-551: Legacy command properly renamed with enhanced logging

Good job renaming the legacy command and adding enhanced logging for better user feedback. This appropriately separates the legacy Docker-based approach from the new source-based extraction.

tools/metrics-extractor/metrics_bag.py (1)

9-211: Well-designed metrics container class

The MetricsBag class is well-implemented with comprehensive functionality for storing, merging, and managing metrics. The deterministic ID generation, proper merge logic, and support for multiple output formats make this a solid foundation for the metrics extraction tooling.

tools/metrics-extractor/Makefile (1)

1-155: Well-structured Makefile with comprehensive automation

The Makefile provides excellent automation for the metrics extraction workflow with proper dependency management, error handling, and development convenience targets. The pinned tree-sitter version (v0.20.5) ensures consistent parsing behavior.

tools/metrics-extractor/metrics_extractor.py (5)

28-47: Well-structured file discovery implementation.

The function properly handles both individual files and directories, with appropriate C++ file extensions and recursive search support.
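For context, a stripped-down sketch of this kind of discovery logic; the extension set and function name are assumptions, not the script's actual code:

```python
from pathlib import Path

# Assumed extension set; the real script may differ.
CPP_EXTENSIONS = {".cc", ".cpp", ".cxx", ".h", ".hpp"}

def discover_cpp_files(path: str, recursive: bool = True) -> list[Path]:
    """Return C++ sources from a single file or a directory tree."""
    p = Path(path)
    if p.is_file():
        return [p] if p.suffix in CPP_EXTENSIONS else []
    pattern = "**/*" if recursive else "*"
    return sorted(f for f in p.glob(pattern) if f.suffix in CPP_EXTENSIONS)
```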


49-59: Correct Tree-sitter initialization.

The function properly initializes the C++ parser and handles library building when needed.
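For readers new to Tree-sitter, initialization with the pre-0.22 py-tree-sitter API looks roughly like this; both paths and the grammar checkout location are illustrative:

```python
from tree_sitter import Language, Parser

# Build the shared grammar library once from a local tree-sitter-cpp checkout.
Language.build_library("build/cpp.so", ["vendor/tree-sitter-cpp"])

CPP_LANGUAGE = Language("build/cpp.so", "cpp")
parser = Parser()
parser.set_language(CPP_LANGUAGE)

tree = parser.parse(b"int main() { return 0; }")
print(tree.root_node.sexp())  # s-expression view of the parsed AST
```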


61-107: Well-designed command-line interface.

The argument parser provides a comprehensive set of options with clear help messages and sensible defaults. Good backward compatibility handling for the deprecated --asciidoc option.
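A hedged sketch of such a CLI, using only option names quoted elsewhere in this review plus the deprecated --asciidoc alias; the real script's options may differ:

```python
import argparse

def build_arg_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Extract Redpanda metrics from C++ sources")
    p.add_argument("paths", nargs="+", help="Files or directories to scan")
    p.add_argument("--recursive", action="store_true", help="Recurse into directories")
    p.add_argument("--json-output", help="Write extracted metrics as JSON")
    p.add_argument("--internal-asciidoc", help="AsciiDoc output for internal metrics")
    p.add_argument("--external-asciidoc", help="AsciiDoc output for public metrics")
    p.add_argument("--asciidoc", help=argparse.SUPPRESS)  # deprecated alias kept for compatibility
    p.add_argument("--verbose", action="store_true")
    return p
```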


109-156: Excellent data normalization functions.

Both clean_description and clean_labels implement thoughtful normalization (see the sketch after this list):

  • Ensuring descriptions end with proper punctuation
  • Smart handling of braced vs. unbraced labels
  • Proper deduplication and sorting
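A minimal sketch of what such normalization can look like; this is illustrative only, and the real functions may differ in detail:

```python
import re

def clean_description(desc: str) -> str:
    """Collapse whitespace and make sure the description ends with punctuation."""
    desc = re.sub(r"\s+", " ", desc).strip()
    if desc and desc[-1] not in ".!?":
        desc += "."
    return desc

def clean_labels(labels: list[str]) -> list[str]:
    """Strip optional braces, drop duplicates, and sort label names."""
    cleaned = {label.strip().strip("{}").strip() for label in labels if label.strip()}
    return sorted(cleaned)

# clean_description("Total bytes read")        -> "Total bytes read."
# clean_labels(["{shard}", "shard", "status"]) -> ["shard", "status"]
```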

374-463: Well-orchestrated main function with good user feedback.

The main function properly handles:

  • Logging configuration based on verbosity
  • Path validation and parser initialization
  • Clear error messages for missing dependencies
  • Informative output with emoji indicators
  • Comprehensive metrics breakdown

The only minor issue is the unnecessary f-string on line 455.

tools/metrics/compare_metrics.py (1)

30-152: Well-implemented AsciiDoc parser with robust metric extraction.

The parser correctly handles:

  • Multiple heading levels (== and ===)
  • Various metric properties (type, labels, usage, etc.)
  • Error recovery with warning messages

One minor suggestion: Consider using instance variables instead of modifying locals() in _process_section for better code clarity.
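For context, heading-and-property parsing of this kind can be sketched with a couple of regular expressions; this is a simplified illustration, not the parser's actual code:

```python
import re

HEADING_RE = re.compile(r"^(={2,3})\s+(\S+)\s*$")            # == or === metric_name
PROPERTY_RE = re.compile(r"^\*(Type|Labels|Usage)\*:\s*(.*)$")

def parse_metrics(adoc_text: str) -> dict[str, dict]:
    """Map metric name -> {level, type, ...}; heavily simplified."""
    metrics: dict[str, dict] = {}
    current = None
    for line in adoc_text.splitlines():
        if m := HEADING_RE.match(line):
            current = m.group(2)
            metrics[current] = {"level": len(m.group(1))}
        elif current and (m := PROPERTY_RE.match(line)):
            metrics[current][m.group(1).lower()] = m.group(2).strip()
    return metrics
```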

tools/metrics-extractor/metrics_parser.py (5)

196-317: Robust variable resolution with appropriate fallbacks.

The function implements a comprehensive strategy for resolving group names:

  1. Local scope search
  2. Broader scope search
  3. Function scope search
  4. Forward declaration search
  5. File-wide search
  6. Emergency fallbacks

This multi-layered approach ensures metrics are rarely lost due to unresolved variables.
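Conceptually, the fallback chain reduces to trying progressively wider scopes before giving up. A stripped-down illustration; the scope representation and names are assumptions:

```python
def resolve_group_name(var_name: str, scopes: list[dict[str, str]]) -> str:
    """Try progressively wider scopes; never drop the metric outright."""
    for scope in scopes:  # e.g. local block, enclosing function, file-wide declarations
        if var_name in scope:
            return scope[var_name]
    return "unknown"  # emergency fallback so the metric is still emitted
```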


761-792: Well-implemented metric name construction.

The function properly (see the sketch after this list):

  • Sanitizes group names by replacing special characters
  • Applies correct prefixes based on metric type (vectorized_ for internal, redpanda_ for external)
  • Handles edge cases with unknown group names
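A minimal sketch of the naming convention described above; the function name and sanitization rule are illustrative:

```python
import re

def build_metric_name(group: str, short_name: str, internal: bool) -> str:
    """Sanitize the group and apply the internal/external prefix convention."""
    group = re.sub(r"[^A-Za-z0-9_]", "_", group or "unknown")
    prefix = "vectorized" if internal else "redpanda"
    return f"{prefix}_{group}_{short_name}"

# build_metric_name("storage", "read_bytes", internal=False) -> "redpanda_storage_read_bytes"
```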

794-905: Good handling of Seastar-specific metric patterns.

Both functions properly extract Seastar metrics with appropriate:

  • Pattern matching for different Seastar APIs
  • Recursive AST traversal
  • Consistent metric metadata extraction

907-1016: Excellent orchestration with critical safeguards.

The function demonstrates:

  • Proper sequencing of different parsing strategies (Seastar then Prometheus)
  • Critical null-check safeguard for group names with multiple fallback strategies
  • Comprehensive error handling
  • Good logging for debugging

The emergency fallback logic (lines 981-998) is particularly well-designed to ensure metrics are never lost due to unresolved group names.


1018-1113: Sophisticated description extraction with C++ concatenation support.

The function excellently handles (see the sketch after this list):

  • Multi-part descriptions from C++ automatic string concatenation
  • Proper filtering of comments and whitespace between strings
  • Position-based string literal tracking
  • Filtering of incomplete format strings
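A simplified sketch of joining adjacent string-literal nodes; the node handling is illustrative, and the real function does more filtering:

```python
def join_string_literals(nodes) -> str:
    """Concatenate adjacent C++ string-literal AST nodes into one description.

    The C++ compiler joins "Total number of " "requests." into a single
    literal, so the extractor has to do the same when reading the AST.
    """
    parts = []
    for node in nodes:
        if node.type == "string_literal":            # tree-sitter-cpp node type
            parts.append(node.text.decode("utf-8").strip('"'))
        # comments and whitespace between the literals are simply skipped
    return "".join(parts)
```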

Comment on lines +795 to +828
automation
.command('source-metrics-docs')
.description('Generate metrics documentation from Redpanda source code using tree-sitter')
.option('--tag <tag>', 'Git tag or branch to extract from', 'dev')
.option('--diff <oldTag>', 'Also diff autogenerated metrics from <oldTag> → <tag>')
.action((options) => {
verifyMetricsExtractorDependencies();

const newTag = options.tag;
const oldTag = options.diff;
const cwd = path.resolve(__dirname, '../tools/metrics-extractor');
const make = (tag) => {
console.log(`⏳ Building source-based metrics docs for ${tag}…`);
const r = spawnSync('make', ['build', `TAG=${tag}`], { cwd, stdio: 'inherit' });
if (r.error) {
console.error(`❌ ${r.error.message}`);
process.exit(1);
}
if (r.status !== 0) process.exit(r.status);
};

if (oldTag) {
const oldDir = path.join('autogenerated', oldTag, 'source-metrics');
if (!fs.existsSync(oldDir)) make(oldTag);
}

make(newTag);

if (oldTag) {
diffDirs('source-metrics', oldTag, newTag);
}

process.exit(0);
});

🛠️ Refactor suggestion

Well-implemented command following established patterns

This source-metrics-docs command properly addresses all previous review feedback by using make build and supporting the same options as property-docs.

However, having both metrics-docs and source-metrics-docs commands may confuse users. Consider removing the metrics-docs command (lines 461-503) and renaming this to just metrics-docs as suggested in the previous review.

🤖 Prompt for AI Agents
In bin/doc-tools.js between lines 461 and 503, remove the existing
'metrics-docs' command to avoid user confusion. Then rename the
'source-metrics-docs' command defined around lines 795 to 828 to 'metrics-docs'
to unify the command naming. Ensure all references and descriptions are updated
accordingly to reflect this change.

Comment on lines +54 to 57
test: ["CMD", "rpk", "version"]
interval: 10s
timeout: 15s
retries: 10

⚠️ Potential issue

Health-check no longer validates broker health
rpk version succeeds even when Redpanda isn’t running; the command only prints the CLI binary’s version and never contacts the broker. The previous rpk cluster info … actually hit the admin API.

This makes depends_on: condition: service_healthy meaningless and may cause race conditions for services relying on a ready broker.

-      test: ["CMD", "rpk", "version"]
+      # Probe the broker over the Admin API; keeps zero external creds.
+      test: ["CMD", "rpk", "cluster", "info", "-X", "brokers=localhost:9644"]
🤖 Prompt for AI Agents
In docker-compose/docker-compose.yml around lines 54 to 57, the health-check
uses `rpk version`, which only checks the CLI version and does not verify broker
health. Replace the health-check command with one that queries the broker's
admin API, such as `rpk cluster info`, to ensure the health-check accurately
reflects the broker's readiness and prevents race conditions for dependent
services.

Comment on lines +147 to +154
$(PYTHON) metrics_extractor.py \
--recursive \
--output $(OUTPUT_DIR)/metrics.json \
--internal-asciidoc $(OUTPUT_DIR)/internal_metrics_reference.adoc \
--external-asciidoc $(OUTPUT_DIR)/public_metrics_reference.adoc \
--filter-namespace redpanda \
--verbose \
$(REDPANDA_PATH)/src

⚠️ Potential issue

Fix incorrect argument name in extract-local target

The extract-local target uses --output but should use --json-output to match the expected argument name in metrics_extractor.py.

 	$(PYTHON) metrics_extractor.py \
 		--redpanda-repo $(REDPANDA_PATH) \
-		--output $(OUTPUT_DIR)/metrics.json \
+		--json-output $(OUTPUT_DIR)/metrics.json \
 		--internal-asciidoc $(OUTPUT_DIR)/internal_metrics_reference.adoc \
 		--external-asciidoc $(OUTPUT_DIR)/public_metrics_reference.adoc \
-		--filter-namespace redpanda \
 		--verbose
-		$(REDPANDA_PATH)/src

Also note that the command seems to have extra arguments that may not be supported:

  • --filter-namespace redpanda
  • $(REDPANDA_PATH)/src as a positional argument

Please verify these arguments are supported by the script.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In tools/metrics-extractor/Makefile around lines 147 to 154, the extract-local
target uses the argument --output which should be changed to --json-output to
match the expected argument in metrics_extractor.py. Additionally, verify if the
arguments --filter-namespace redpanda and the positional argument
$(REDPANDA_PATH)/src are supported by the script; if not, remove or correct them
accordingly to ensure compatibility with metrics_extractor.py.

@@ -0,0 +1,211 @@
import logging
import hashlib
import uuid

⚠️ Potential issue

Remove unused import

The uuid module is imported but never used in this file.

 import logging
 import hashlib
-import uuid
 from collections import defaultdict
🧰 Tools
🪛 Ruff (0.12.2)

3-3: uuid imported but unused

Remove unused import: uuid

(F401)

🤖 Prompt for AI Agents
In tools/metrics-extractor/metrics_bag.py at line 3, the uuid module is imported
but not used anywhere in the file. Remove the import statement for uuid to clean
up unused imports and improve code clarity.

@@ -0,0 +1,1127 @@
import os
import re
import subprocess

⚠️ Potential issue

Remove unused import.

subprocess is imported but not used in the code.

-import subprocess
🧰 Tools
🪛 Ruff (0.12.2)

3-3: subprocess imported but unused

Remove unused import: subprocess

(F401)

🤖 Prompt for AI Agents
In tools/metrics-extractor/metrics_parser.py at line 3, the subprocess module is
imported but never used. Remove the import statement for subprocess to clean up
the code and avoid unnecessary imports.

"""
import os
import sys
import json

⚠️ Potential issue

Remove unused import

The json module is imported but never used in this file.

-import json
🧰 Tools
🪛 Ruff (0.12.2)

7-7: json imported but unused

Remove unused import: json

(F401)

🤖 Prompt for AI Agents
In tools/metrics-extractor/validate.py at line 7, the json module is imported
but not used anywhere in the file. Remove the import statement for json to clean
up the code and avoid unnecessary imports.

Comment on lines +49 to +51
print(f" ❌ Example script failed:")
print(f" stdout: {result.stdout}")
print(f" stderr: {result.stderr}")

⚠️ Potential issue

Remove unnecessary f-string prefix

The f-string prefix is not needed as there are no placeholders in the string.

-            print(f"  ❌ Example script failed:")
+            print("  ❌ Example script failed:")
             print(f"     stdout: {result.stdout}")
             print(f"     stderr: {result.stderr}")
🧰 Tools
🪛 Ruff (0.12.2)

49-49: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
In tools/metrics-extractor/validate.py around lines 49 to 51, the print
statements use f-string prefixes without any placeholders. Remove the
unnecessary f-string prefix from these print statements to simplify the code.


import re
import argparse
from typing import Dict, List, Tuple, Optional

⚠️ Potential issue

Remove unused imports.

Tuple and Optional are imported but not used in the code.

-from typing import Dict, List, Tuple, Optional
+from typing import Dict, List
🧰 Tools
🪛 Ruff (0.12.2)

12-12: typing.Tuple imported but unused

Remove unused import

(F401)


12-12: typing.Optional imported but unused

Remove unused import

(F401)

🤖 Prompt for AI Agents
In tools/metrics/compare_metrics.py at line 12, the imports Tuple and Optional
are not used anywhere in the code. Remove Tuple and Optional from the import
statement to clean up unused imports, leaving only Dict and List.

Comment on lines +231 to +234
differences.append(f"Labels differ")

if metric1.usage != metric2.usage:
differences.append(f"Usage differs")

⚠️ Potential issue

Remove unnecessary f-string prefixes.

These strings don't contain any placeholders.

-            differences.append(f"Labels differ")
+            differences.append("Labels differ")
         
         if metric1.usage != metric2.usage:
-            differences.append(f"Usage differs")
+            differences.append("Usage differs")
🧰 Tools
🪛 Ruff (0.12.2)

231-231: f-string without any placeholders

Remove extraneous f prefix

(F541)


234-234: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
In tools/metrics/compare_metrics.py around lines 231 to 234, remove the
unnecessary f-string prefixes from the strings "Labels differ" and "Usage
differs" since they do not contain any placeholders. Replace the f-strings with
regular string literals to simplify the code.

Comment on lines +243 to +287
print(f"METRICS COMPARISON REPORT")
print(f"{'='*60}")
print(f"File 1 ({file1_name}): {results['total_file1']} metrics")
print(f"File 2 ({file2_name}): {results['total_file2']} metrics")
print(f"Common metrics: {results['common_count']}")

# Heading level differences
if results['heading_level_differences']:
print(f"\n📏 HEADING LEVEL DIFFERENCES:")
print(f" Count: {len(results['heading_level_differences'])}")
for item in results['heading_level_differences']:
print(f" - {item['name']}: {item['file1_level']} vs {item['file2_level']}")

# Metrics only in file 1 (should be removed)
if results['file1_unique']:
print(f"\n🗑️ METRICS TO REMOVE (only in {file1_name}):")
print(f" Count: {len(results['file1_unique'])}")
for metric in results['file1_unique']:
print(f" - {metric}")

# Metrics only in file 2 (missing from file 1)
if results['file2_unique']:
print(f"\n📝 METRICS MISSING FROM {file1_name}:")
print(f" Count: {len(results['file2_unique'])}")
for metric in results['file2_unique']:
print(f" - {metric}")

# Description improvements
if results['improved_descriptions']:
print(f"\n✨ POTENTIAL DESCRIPTION IMPROVEMENTS:")
print(f" Count: {len(results['improved_descriptions'])}")

for item in results['improved_descriptions']:
print(f"\n 📊 {item['name']}:")
print(f" Similarity: {item['similarity']:.2f}")

if item['likely_improvement']:
print(f" 🔍 LIKELY IMPROVEMENT (File 1 has longer description)")

print(f" File 1: {item['file1_desc'][:100]}{'...' if len(item['file1_desc']) > 100 else ''}")
print(f" File 2: {item['file2_desc'][:100]}{'...' if len(item['file2_desc']) > 100 else ''}")

# Other property differences
if results['different_properties']:
print(f"\n🔧 OTHER PROPERTY DIFFERENCES:")

⚠️ Potential issue

Remove unnecessary f-string prefixes throughout the function.

Multiple strings don't contain placeholders and don't need the f-prefix.

-    print(f"METRICS COMPARISON REPORT")
+    print("METRICS COMPARISON REPORT")
-    print(f"\n📏 HEADING LEVEL DIFFERENCES:")
+    print("\n📏 HEADING LEVEL DIFFERENCES:")
-    print(f"\n✨ POTENTIAL DESCRIPTION IMPROVEMENTS:")
+    print("\n✨ POTENTIAL DESCRIPTION IMPROVEMENTS:")
-    print(f"      🔍 LIKELY IMPROVEMENT (File 1 has longer description)")
+    print("      🔍 LIKELY IMPROVEMENT (File 1 has longer description)")
-    print(f"\n🔧 OTHER PROPERTY DIFFERENCES:")
+    print("\n🔧 OTHER PROPERTY DIFFERENCES:")
🧰 Tools
🪛 Ruff (0.12.2)

243-243: f-string without any placeholders

Remove extraneous f prefix

(F541)


251-251: f-string without any placeholders

Remove extraneous f prefix

(F541)


272-272: f-string without any placeholders

Remove extraneous f prefix

(F541)


280-280: f-string without any placeholders

Remove extraneous f prefix

(F541)


287-287: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
In tools/metrics/compare_metrics.py from lines 243 to 287, many print statements
use f-string prefixes unnecessarily where no variable interpolation occurs.
Remove the f-prefix from all string literals that do not contain placeholders to
simplify the code and improve readability.

@paulohtb6

paulohtb6 commented Jul 23, 2025

During testing I identified several false positives in the generated output. Most of the problems were caused by the parser failing to resolve group_names. Some are EXTREMELY hard to identify via this method, even using the AST.

One in particular I consider effectively impossible, because it spans multiple files.

I'm trying to reduce the number of false positives, and the latest commit introduces an aggressive check to resolve the group_names.

There are still several debug print statements in the code because of this, so the automation must not be merged yet.

@paulohtb6

Just noticed that some labels are also misclassified.

In these two metrics, the autogenerated labels look like this.

=== redpanda_trust_file_crc32c

CRC32C checksum calculated from the contents of the trust file. This value is calculated when a valid certificate is loaded and a trust store is present. Otherwise, the value is zero.

*Type*: gauge

*Labels*:

- `aggregates`
- `label`
- `shard`

---

=== redpanda_truststore_expires_at_timestamp_seconds

Expiry time of the shortest-lived CA in the truststore(seconds since epoch).

*Type*: gauge

*Labels*:

- `aggregates`
- `label`
- `label_instance`

---

In the source code, available at https://github.com/redpanda-data/redpanda/blob/dev/src/v/net/probes.cc, we can tell that the labels are actually area, detail, and shard.
