[TRT-EP] Add loadModelProto APIs #25409


Merged 2 commits into microsoft:main on Jul 17, 2025

Conversation

@kevinch-nv (Contributor) commented Jul 15, 2025

Description

This PR adds three new options for the TRT execution provider:

  • trt_load_user_initializer
  • trt_external_data_bytestream
  • trt_external_data_bytestream_size

The idea is to use these options to leverage new TRT 10.13 APIs, giving the user more control over how weights are loaded by the ONNX parser.

When trt_load_user_initializer is set to true, the EP owns the weights instead of serializing them into the ModelProto, which avoids the overhead of serializing large weights.

When trt_external_data_bytestream / trt_external_data_bytestream_size are provided, refitEngine() can read weights for the refitter directly from this bytestream.

This PR also fixes graph_proto_serializer to preserve information about external weights.
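
For illustration, here is a minimal sketch of how an application might enable these options through the TensorRT provider options V2 interface of the ONNX Runtime C/C++ API. The option keys match this PR; encoding the bytestream pointer and size as decimal strings, the placeholder weight buffer, and "model.onnx" are assumptions made for the sketch rather than part of this change:

    // Sketch only: opting into the new TRT EP options via provider options V2.
    // Keys follow this PR; the value encodings (pointer/size as decimal strings)
    // are an assumption, and the weight buffer / model path are placeholders.
    #include <onnxruntime_cxx_api.h>

    #include <string>
    #include <vector>

    int main() {
      Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "trt_user_weights_demo");
      Ort::SessionOptions session_options;

      const OrtApi& api = Ort::GetApi();
      OrtTensorRTProviderOptionsV2* trt_options = nullptr;
      Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&trt_options));

      // Hypothetical bytestream holding the model's external weight data,
      // loaded and owned by the application.
      std::vector<char> external_weights;  // fill from your own source

      const std::string stream_addr =
          std::to_string(reinterpret_cast<size_t>(external_weights.data()));
      const std::string stream_size = std::to_string(external_weights.size());

      const char* keys[] = {"trt_load_user_initializer",
                            "trt_external_data_bytestream",
                            "trt_external_data_bytestream_size"};
      const char* values[] = {"true", stream_addr.c_str(), stream_size.c_str()};
      Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(trt_options, keys, values, 3));

      session_options.AppendExecutionProvider_TensorRT_V2(*trt_options);
      api.ReleaseTensorRTProviderOptions(trt_options);

      Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);
      return 0;
    }

Presumably the application-owned bytestream must remain valid while the session builds or refits engines, since the EP reads weights from it directly rather than copying them into the ModelProto.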

@chilo-ms (Contributor)

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).

@jywu-msft jywu-msft requested review from Copilot and chilo-ms and removed request for Copilot July 15, 2025 21:30
@Copilot (Copilot AI) left a comment

Pull Request Overview

This PR adds support for three new TensorRT execution provider options to leverage TRT 10.13 APIs for improved weight handling and loading control. The changes enable users to manage initializer weights more efficiently by avoiding serialization overhead and providing direct bytestream access for weight refitting.

Key changes include:

  • Addition of trt_load_user_initializer option to keep weights in memory instead of serializing to ModelProto
  • Addition of trt_external_data_bytestream and trt_external_data_bytestream_size options for direct weight bytestream access during refitting
  • Enhanced graph proto serializer to preserve external weight information

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Summary of reviewed files:

  • tensorrt_provider_factory.cc: Maps the new provider options to the internal info structure
  • tensorrt_execution_provider_info.h: Adds new fields for the external data bytestream and user-initializer loading
  • tensorrt_execution_provider_info.cc: Implements parsing and serialization for the new provider options
  • tensorrt_execution_provider.h: Adds the TensorrtUserWeights struct and related member variables
  • tensorrt_execution_provider.cc: Implements the core logic for the new weight loading and refitting features
  • onnx_ctx_model_helper.h: Updates the constructor signature to accept external data bytestream parameters
  • onnx_ctx_model_helper.cc: Passes the external data bytestream to RefitEngine calls
  • graph_proto_serializer.cc: Preserves external data location information when excluding initializer data
  • tensorrt_provider_options.h: Adds the new provider option fields to the public interface

Comments suppressed due to low confidence (3)

onnxruntime/core/providers/tensorrt/tensorrt_execution_provider_info.cc:64

  • The constant name 'kGraphIncludeInitializer' is misleading as it actually controls whether to load user initializers, not whether to include them in the graph. Consider renaming to 'kLoadUserInitializer' to match the actual option name.
constexpr const char* kGraphIncludeInitializer = "trt_load_user_initializer";

onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:2919

  • [nitpick] The variable 'sizes' should be more descriptive, such as 'weight_sizes' or 'data_sizes', to clarify it represents the sizes of weight data.
    std::vector<int64_t> sizes;

onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:2917

  • [nitpick] The variable 'bytes' should be more descriptive, such as 'weight_data' or 'data_pointers', to clarify it represents pointers to weight data.
    std::vector<const char*> bytes;

@jywu-msft (Member)

There are some build warnings:

C4100: 'onnx_external_data_bytestream_size': unreferenced formal parameter

@kevinch-nv (Contributor, Author)

@chilo-ms I've addressed the build issues and review comments; can you help run CI again?

@chilo-ms (Contributor)

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).

@jywu-msft jywu-msft merged commit 2536acf into microsoft:main Jul 17, 2025
87 checks passed
@snnn (Member) commented Jul 25, 2025

Hi there! We haven't cut the release branch for this version yet, so I'm removing the release:1.23.0 label for now to keep things tidy. Thanks so much for your contribution! We'll make sure this gets included when the release is prepared. 🤖
