Info -> warning: Override performance mode to THROUGHPUT for compilation #34666

Open
DariaMityagina wants to merge 2 commits into openvinotoolkit:master from DariaMityagina:icv/dm/batch-improvements-v3

@DariaMityagina
Contributor

Details:

  • Switching the performance mode to THROUGHPUT during compilation when PLUGIN batching is applicable is a non-obvious behavior change, so we should warn about it.

Tickets:

  • ticket-id

AI Assistance:

  • AI assistance used: no / yes
  • If yes, summarize how AI was used and what human validation was performed (build/tests/manual checks).

@DariaMityagina DariaMityagina self-assigned this Mar 12, 2026
@DariaMityagina DariaMityagina requested review from a team as code owners March 12, 2026 14:38
@github-actions github-actions bot added the category: NPU OpenVINO NPU plugin label Mar 12, 2026
@@ -689,7 +689,7 @@ std::shared_ptr<ov::ICompiledModel> Plugin::compile_model(const std::shared_ptr<
auto modelToCompile = successfullyDebatched ? batchedModel : model->clone();

if (successfullyDebatched && localConfig.get<PERFORMANCE_HINT>() == ov::hint::PerformanceMode::LATENCY) {
Contributor

@Maxim-Doronin Maxim-Doronin Mar 12, 2026

What do you think about:

const bool performanceHintSetByUser = properties.find(ov::hint::performance_mode.name()) != properties.end();

auto modifiedConfig = localConfig;

if (!performanceHintSetByUser) {
    _logger.info("Setting performance mode to THROUGHPUT for batched model compilation");
    std::stringstream strStream;
    strStream << ov::hint::PerformanceMode::THROUGHPUT;
    // Update the copy that is passed to compileWithConfig below, not localConfig.
    modifiedConfig.update({{ov::hint::performance_mode.name(), strStream.str()}});
    graph = compileWithConfig(std::move(modelToCompile), modifiedConfig);
} else if (localConfig.get<PERFORMANCE_HINT>() == ov::hint::PerformanceMode::LATENCY) {
    _logger.warning("PERFORMANCE_HINT is explicitly set to LATENCY mode, but batch dimension (N) is detected in the model. The NPU Plugin will reshape the model to batch size 1 and process each batch slice separately.");
    _logger.warning("For optimal performance with batched models, THROUGHPUT mode is highly recommended, as LATENCY mode prevents parallel batch processing.");
    _logger.warning("If batch detection appears incorrect, verify that the input and output layouts are configured properly.");
}
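
For reference, the branching in the suggestion above can be exercised in isolation. The following is only a minimal sketch under stated assumptions: `PerfMode`, `Decision`, `decidePerfMode` and the `"PERFORMANCE_HINT"` key are hypothetical stand-ins, not the NPU plugin's types or API, and the sketch assumes the suggested branches apply only when debatching succeeded.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for ov::hint::PerformanceMode (illustration only).
enum class PerfMode { LATENCY, THROUGHPUT };

// Outcome of the decision: which mode compilation would use and which
// log level (if any) the plugin would emit.
struct Decision {
    PerfMode compileMode;
    std::string logLevel;  // "info", "warning", or "" when nothing is logged
};

// Models the suggested branching:
//  - hint not set by the user -> override to THROUGHPUT and log at info level
//  - hint explicitly LATENCY   -> keep LATENCY and log a warning
//  - anything else             -> keep the effective mode silently
Decision decidePerfMode(const std::map<std::string, std::string>& userProperties,
                        PerfMode effectiveMode,
                        bool successfullyDebatched) {
    if (!successfullyDebatched) {
        // Assumption: without plugin batching nothing is overridden.
        return {effectiveMode, ""};
    }
    const bool setByUser =
        userProperties.find("PERFORMANCE_HINT") != userProperties.end();
    if (!setByUser) {
        return {PerfMode::THROUGHPUT, "info"};
    }
    if (effectiveMode == PerfMode::LATENCY) {
        return {PerfMode::LATENCY, "warning"};
    }
    return {effectiveMode, ""};
}
```

The point of the sketch is the distinction between a default LATENCY hint, which is silently upgraded, and a user-requested LATENCY hint, which is respected but flagged with a warning.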

Contributor Author

Done, thanks! Good idea!
b1b9bcb

I will run validation with this change.
