Skip to content

[SIT] EISW-204381: Add per output threshold#34352

Open
svigh wants to merge 13 commits intoopenvinotoolkit:masterfrom
svigh:EISW-204381_add_per_output_threshold
Open

[SIT] EISW-204381: Add per output threshold#34352
svigh wants to merge 13 commits intoopenvinotoolkit:masterfrom
svigh:EISW-204381_add_per_output_threshold

Conversation

@svigh
Copy link
Contributor

@svigh svigh commented Feb 26, 2026

Details:

  • Changed argument parsing to be able to pass different threshold value per layer
  • Moved mAP metric helpers to separate files
  • Change default value for NRMSE metric to 0.02 instead of

Tickets:

AI Assistance:

  • AI assistance used: yes
  • Generated the per layer value parsing of argument strings, tested manually on PTL RVP

@svigh svigh requested review from a team as code owners February 26, 2026 10:11
@github-actions github-actions bot added the category: NPU OpenVINO NPU plugin label Feb 26, 2026
@svigh svigh requested a review from Copilot February 26, 2026 13:52
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for per-output (per-layer) threshold specification across all metric comparison modes in the single-image-test tool. Previously, all metrics used global threshold values; now users can specify different thresholds for different output layers using the syntax "layer1:value1;layer2:value2" or continue using single global values.

Changes:

  • Extracted mAP metric helper functions to separate files (map_metric_helpers.hpp/cpp) for better code organization
  • Implemented per-layer threshold parsing infrastructure in new argument_parse_helpers.hpp/cpp files
  • Changed NRMSE default threshold from 1.0 to 0.02
  • Updated all metric test functions to support per-layer thresholds while maintaining backward compatibility with single global values

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
map_metric_helpers.hpp Declares structures (Detection, MatchResult) and functions for mAP calculation, extracted from main.cpp
map_metric_helpers.cpp Implements IoU calculation, detection parsing, matching, and average precision computation for mAP metric
argument_parse_helpers.hpp Declares all command-line flags (previously in main.cpp), provides utility classes (CaselessEq), and declares per-layer value parsing functions
argument_parse_helpers.cpp Defines all DEFINE_string/DEFINE_* flags with updated help text mentioning per-layer support, implements parsePerLayerValues and getValueForLayer functions, and moves parseCommandLine function from main.cpp
main.cpp Integrates per-layer threshold parsing into all metric test functions (RAW, CoSim, RRMSE, NRMSE, L2Norm, PSNR, Classification, SSD, YOLO, MeanIoU), removes flag definitions and helper code now in argument_parse_helpers, and updates PSNR to handle multi-output models differently

Comment on lines 1852 to 1896
// If there's only one output, use the global PSNR metric
if (outputs.size() == 1) {
auto refOutput = npu::utils::parseTensorsAsFP32(references);
auto actOutput = npu::utils::parseTensorsAsFP32(outputs);

auto result = utils::runPSNRMetric(actOutput, refOutput, dstHeight, dstWidth, scaleBorder, normalizedImage);
auto result = utils::runPSNRMetric(actOutput, refOutput, dstHeight, dstWidth, scaleBorder, normalizedImage);

if (FLAGS_psnr_reference - result > FLAGS_psnr_tolerance) {
std::cout << "Absolute difference between actual value " << result << " and reference value "
<< FLAGS_psnr_reference << " larger then tolerance " << FLAGS_psnr_tolerance << std::endl;
return false;
// Get per-layer or global values
std::string layerName = outputs.begin()->first;
double psnrReference = utils::getValueForLayer(referenceMap, layerName);
double psnrTolerance = utils::getValueForLayer(toleranceMap, layerName);

if (psnrReference - result > psnrTolerance) {
std::cout << "Absolute difference between actual value " << result << " and reference value "
<< psnrReference << " larger then tolerance " << psnrTolerance << std::endl;
return false;
}
} else {
// For multiple outputs, compute PSNR per layer
for (const auto& [layerName, outputTensor] : outputs) {
auto refIterator = references.find(layerName);
OPENVINO_ASSERT(refIterator != references.end());

TensorMap singleOutput = {{layerName, outputTensor}};
TensorMap singleReference = {{layerName, refIterator->second}};

auto refOutput = npu::utils::parseTensorsAsFP32(singleReference);
auto actOutput = npu::utils::parseTensorsAsFP32(singleOutput);

auto result = utils::runPSNRMetric(actOutput, refOutput, dstHeight, dstWidth, scaleBorder, normalizedImage);

// Get per-layer values
double psnrReference = utils::getValueForLayer(referenceMap, layerName);
double psnrTolerance = utils::getValueForLayer(toleranceMap, layerName);

std::cout << "Layer: " << layerName << " - ";
if (psnrReference - result > psnrTolerance) {
std::cout << "Absolute difference between actual value " << result << " and reference value "
<< psnrReference << " larger then tolerance " << psnrTolerance << std::endl;
allPassed = false;
} else {
std::cout << "PASSED (PSNR: " << result << ", Reference: " << psnrReference << ")" << std::endl;
}
}
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PSNR test behavior has changed in a potentially breaking way. Previously, for models with multiple outputs, all outputs were combined into a single PSNR calculation. Now, the code branches at line 1853: if there's exactly one output, it combines all tensors (which is just one), but if there are multiple outputs, it evaluates each layer separately at lines 1870-1895.

This is a functional change, not just adding per-layer threshold support. If this is intentional, it should be documented in the PR description. If not, the multi-output case should still combine all outputs like the original implementation did, while supporting per-layer thresholds for the combined result.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +220 to +221
} catch (...) {
// Not a single number, parse as key:value pairs
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The catch-all exception handler at line 220 silently treats any invalid single-value input (e.g., "abc") as a key:value format. If the user mistakenly provides an invalid single value, it will be silently ignored and the default will be used without any warning. Consider adding a warning message inside the catch block to inform users that the single-value parse failed and it will be treated as key:value format, or validate that at least one colon exists before proceeding to key:value parsing.

Suggested change
} catch (...) {
// Not a single number, parse as key:value pairs
} catch (const std::exception& e) {
// Not a single number. If there is no colon, we cannot interpret this as "layer:value".
if (str.find(':') == std::string::npos) {
std::cerr << "Warning: Failed to parse per-layer value '" << str
<< "' as a single number and no 'layer:value' pairs were found. "
<< "Using default value " << defaultValue << "." << std::endl;
return result;
}
// Otherwise, fall through and try to parse as key:value pairs.

Copilot uses AI. Check for mistakes.
@svigh svigh added this pull request to the merge queue Mar 16, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 16, 2026
@github-actions github-actions bot added the category: build OpenVINO cmake script / infra label Mar 16, 2026
@svigh svigh requested a review from a team as a code owner March 16, 2026 14:00
@svigh svigh enabled auto-merge March 16, 2026 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: build OpenVINO cmake script / infra category: NPU OpenVINO NPU plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants