
PyTorch partial quantization doesn’t apply in TensorRT #1083

@niubiplus2

Description


Before Asking

  • I have read the README carefully.

  • I want to train my custom dataset, and I have read the tutorials for training your custom data carefully and organized my dataset correctly. (FYI: We recommend you apply the config files of xx_finetune.py.)

  • I have pulled the latest code of the main branch and run it again, and the problem still exists.

Search before asking

  • I have searched the YOLOv6 issues and found no similar questions.

Question

Problem description: I successfully quantized the model with partial quantization in PyTorch, but the exported ONNX model is identical to the FP32 model. When I then convert it to TensorRT with trtexec --int8, TensorRT performs its own global automatic quantization.

Question: does partial quantization only take effect inside PyTorch, or is there a way to make it take effect in ONNX/TensorRT?

Steps to reproduce:

PyTorch partial quantization

python partial_quant.py --weights model.pt --calib-weights model_calib.pt --sensitivity-file sens.txt --quant-boundary 55

ONNX export

python export_onnx.py --weights model_partial.pt
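If partial_quant.py uses NVIDIA's pytorch-quantization toolkit (as YOLOv6's quantization tools do), the usual reason the ONNX comes out identical to FP32 is that the TensorQuantizer fake-quant ops are dropped during tracing unless export mode is enabled. A hedged sketch of the export-side fix follows; the flag comes from the pytorch-quantization toolkit, while the checkpoint key and 640x640 input shape are assumptions about this setup.

```python
# Sketch, assuming the checkpoint was produced with NVIDIA's
# pytorch-quantization toolkit. use_fb_fake_quant makes torch.onnx.export
# emit explicit QuantizeLinear/DequantizeLinear nodes for every
# TensorQuantizer, so the per-layer quantization choice survives into ONNX.
import torch
from pytorch_quantization import quant_nn

quant_nn.TensorQuantizer.use_fb_fake_quant = True  # export Q/DQ, not fake-quant

# Checkpoint key "model" and the input shape are assumptions about the setup.
model = torch.load("model_partial.pt", map_location="cpu")["model"].float().eval()
dummy = torch.zeros(1, 3, 640, 640)
torch.onnx.export(
    model, dummy, "model_partial.onnx",
    opset_version=13,  # Q/DQ with per-channel scales needs opset >= 13
    input_names=["images"], output_names=["outputs"],
)
```

After re-exporting this way, the graph-inspection check should report a nonzero Q/DQ count.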

TensorRT

trtexec --onnx=model_partial.onnx --fp16 --int8 --saveEngine=model.trt
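For context on the trtexec flags, a hedged note based on TensorRT's documented behavior: when the ONNX contains explicit Q/DQ nodes, --int8 puts TensorRT into explicit-quantization mode and it honors the Q/DQ placement from PyTorch instead of calibrating every layer itself. With an ONNX that is identical to FP32, --int8 instead triggers TensorRT's implicit global quantization, which matches the behavior observed here.

```shell
# With Q/DQ nodes present in model_partial.onnx, --int8 means "honor the
# explicit quantization already in the graph" (no calibrator is run);
# --fp16 lets the deliberately unquantized layers run in FP16 instead of FP32.
trtexec --onnx=model_partial.onnx --fp16 --int8 --saveEngine=model.trt
```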

Additional

No response
