This repository contains the sample solution for LPCVC 2026 Track 3: AI Generated Images Detection. It also contains instructions for preparing the necessary files for submission.
Our approach is based on Qwen2-VL-2B-Instruct, following the model preparation tutorial from QPM.
Download the complete sample solution zip here. Several files are not included in the GitHub repository due to file size limits. The zip file is 1.8 GB; fully unzipped, it occupies 2.4 GB.
Download "Tutorial for Qwen2_VL_2b (IoT)" from Qualcomm Package Manager (QPM).
Starting with the README.md in example1, work sequentially through the example1 and example2 directories to generate the files.
- example1
  - PyTorch model optimization and export using AIMET
- example2
  - Preparation and conversion of the ONNX model to Qualcomm QNN
For more guidance, see our tutorial guidance section below.
Once all files have been generated, we will prepare the files needed for submission.
submission_files
├── ar*-ar*-cl*
│ └── weight_sharing_model_1_of_1.serialized.bin
├── embedding_weights*.raw
├── inputs.json
├── mask.raw
├── position_ids_cos.raw
├── position_ids_sin.raw
├── serialized_binaries
│ └── veg.serialized.bin
└── tokenizer.json
Note: Our evaluation also supports Qwen2.5-VL-based models. Instead of submitting mask.raw, include full_attention_mask.raw and window_attention_mask.raw in your submission files.
All files, excluding inputs.json, should have been generated by running example1 and example2 from the tutorial. We will provide additional instructions for generating inputs.json below.
The ar*-ar*-cl* folder may be given any name matching the pattern, which gives contestants more flexibility. In this sample solution, the folder is named ar128-ar1-cl2048 and contains a single file.
The embedding_weights*.raw file may likewise be given any name matching the pattern. In this sample solution, the file is named embedding_weights_151936x1536.raw.
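As a quick sanity check, the dimensions encoded in the file name can be compared against the file size. A minimal sketch is below; the float32 dtype (4 bytes per value) is an assumption, so adjust it if your export uses a different precision:

```python
import os
import re

def check_embedding_weights(path: str, bytes_per_value: int = 4) -> None:
    """Verify an embedding_weights_<vocab>x<dim>.raw file against its name.

    bytes_per_value=4 assumes float32 weights; use 2 for float16, etc.
    """
    name = os.path.basename(path)
    match = re.match(r"embedding_weights_(\d+)x(\d+)\.raw$", name)
    if not match:
        raise ValueError(f"{name} does not match embedding_weights_<vocab>x<dim>.raw")
    vocab, dim = int(match.group(1)), int(match.group(2))
    expected = vocab * dim * bytes_per_value
    actual = os.path.getsize(path)
    print(f"{name}: expected {expected} bytes, found {actual} bytes")

check_embedding_weights("embedding_weights_151936x1536.raw")
```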
Tutorial_for_Qwen2_VL_2b_IoT
├── example1
│ ├── Example1A
│ │ └── output_dir
│ │ └── veg_exports
│ │ ├── mask.raw
│ │ ├── position_ids_cos.raw
│ │ └── position_ids_sin.raw
│ └── Example1B
│ └── output_dir
│ ├── embedding_weights_151936x1536.raw
│ └── tokenizer
│ └── tokenizer.json
├── example2
│ ├── Example2A
│ │ └── host_linux
│ │ └── exports
│ │ └── serialized_binaries
│ │ └── veg.serialized.bin
│ └── Example2B
│ └── host_linux
│ └── assets
│ └── artifacts
│ └── ar128-ar1-cl2048
│ └── weight_sharing_model_1_of_1.serialized.bin
└── example3
└── qnn_model_execution.ipynb
└─> Extract several parameters into `inputs.json`
This file contains the parameters needed to run model inference. In the tutorial, these were hard-coded within the notebooks for Qwen2-VL-2B. To support models beyond Qwen2-VL-2B in our inference script, contestants must provide these additional parameters.
inputs.json should be a JSON file containing the following parameters extracted from qnn_model_execution.ipynb in example3:
{
"qwen_vl_processor": "Qwen/Qwen2-VL-2B-Instruct",
"llm_config": "Qwen/Qwen2-VL-2B-Instruct",
"data_preprocess_inp_h": 342,
"data_preprocess_inp_w": 512,
"run_veg_n_tokens": 216,
"run_veg_embedding_dim": 1536,
"genie_config": { "...": "omitted for length" }
}

| Parameter | Description | Location within qnn_model_execution.ipynb |
|---|---|---|
| qwen_vl_processor | Name of the Qwen VL processor | `qwen2_vl_processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")` |
| llm_config | Name of the LLM | `llm_config = AutoConfig.from_pretrained("Qwen/Qwen2-VL-2B-Instruct", trust_remote_code=True)` |
| data_preprocess_inp_h | Image height taken by the data_preprocess function | `inputs = data_preprocess(qwen2_vl_processor, image_file, 342, 512, prompt)` |
| data_preprocess_inp_w | Image width taken by the data_preprocess function | `inputs = data_preprocess(qwen2_vl_processor, image_file, 342, 512, prompt)` |
| run_veg_n_tokens | Second dimension of the output shape in the run_veg function | `output_data = output_data.reshape((1, 216, 1536))` |
| run_veg_embedding_dim | Third dimension of the output shape in the run_veg function | `output_data = output_data.reshape((1, 216, 1536))` |
| genie_config | The entire JSON found in the code cell under "Creating Genie Config JSON". Reminder: boolean values true and false must be lowercase. Do not worry about the paths; we will set them for you. | `genie_config = { ...... }` |
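For reference, a minimal sketch that writes inputs.json from the values above; the genie_config dict is left empty here and should be filled with the full JSON copied from the notebook cell:

```python
import json

# Values extracted from qnn_model_execution.ipynb in example3.
inputs = {
    "qwen_vl_processor": "Qwen/Qwen2-VL-2B-Instruct",
    "llm_config": "Qwen/Qwen2-VL-2B-Instruct",
    "data_preprocess_inp_h": 342,
    "data_preprocess_inp_w": 512,
    "run_veg_n_tokens": 216,
    "run_veg_embedding_dim": 1536,
    # Paste the full dict from the "Creating Genie Config JSON" cell here.
    "genie_config": {},
}

# json.dump serializes Python True/False as lowercase true/false,
# matching the reminder about boolean values above.
with open("inputs.json", "w") as f:
    json.dump(inputs, f, indent=2)
```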
Place all necessary submission files mentioned above into a folder named with your team name. The folder should contain:
- 2 subdirectories
- 2 .json files
- 4 .raw files
Zip the folder and also name the zip file with your team name. Your final submission should look like:
team_name.zip/
└── team_name/
├── ar*-ar*-cl*/
├── serialized_binaries/
├── embedding_weights*.raw
├── inputs.json
├── mask.raw
├── position_ids_cos.raw
├── position_ids_sin.raw
└── tokenizer.json
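Before zipping, you can sanity-check the folder against the layout above. A minimal sketch follows; the team_name path is a placeholder, and the counts correspond to the Qwen2-VL layout (a Qwen2.5-VL submission swaps mask.raw for the two attention-mask files):

```python
import glob
import os

def check_submission(folder: str) -> None:
    """Check a submission folder against the expected layout."""
    subdirs = [d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder, d))]
    jsons = glob.glob(os.path.join(folder, "*.json"))
    raws = glob.glob(os.path.join(folder, "*.raw"))

    assert len(subdirs) == 2, f"expected 2 subdirectories, found {len(subdirs)}"
    assert len(jsons) == 2, f"expected 2 .json files, found {len(jsons)}"
    assert len(raws) == 4, f"expected 4 .raw files, found {len(raws)}"
    assert "serialized_binaries" in subdirs, "missing serialized_binaries/"
    assert glob.glob(os.path.join(folder, "ar*-ar*-cl*")), "missing ar*-ar*-cl*/"
    assert glob.glob(os.path.join(folder, "embedding_weights*.raw")), "missing embedding_weights*.raw"
    print("Submission layout looks good.")

check_submission("team_name")
```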
If you don't have a Snapdragon 8 Gen 5 Android device, we provide 3 scripts that can help you evaluate the quantization error of your model. Below are the steps.
Copy llm_inout.py to /Tutorial_for_Qwen2_VL_2b_IoT/example1/Example1B
python llm_inout.py --save_path="./path_to_inout"

Set submission_num accordingly.
python inference_multi.py \
    --device_model="Snapdragon 8 Elite Gen 5 QRD" \
    --model_id="mm00000xx" \
    --load_path="./path_to_inout/inputs*.pt" \
    --out_path="./output/submission_num"

Set NUM_TOKEN and BATCH accordingly.
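The official metric is computed by compute_score_multi_aihub.py in the next step. Conceptually, the comparison resembles the sketch below, which matches reference outputs saved by llm_inout.py against the outputs pulled back by inference_multi.py; the file names, tensor layout, and choice of cosine similarity plus MSE here are all assumptions for illustration:

```python
import glob
import torch
import torch.nn.functional as F

# Hypothetical layout: float reference outputs from llm_inout.py and
# on-device quantized outputs from inference_multi.py.
for ref_path in sorted(glob.glob("./path_to_inout/outputs*.pt")):
    dev_path = ref_path.replace("path_to_inout", "output/submission_num")
    ref = torch.load(ref_path).float().flatten()
    dev = torch.load(dev_path).float().flatten()
    cos = F.cosine_similarity(ref, dev, dim=0).item()
    mse = torch.mean((ref - dev) ** 2).item()
    print(f"{ref_path}: cosine={cos:.4f}  mse={mse:.6f}")
```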
python compute_score_multi_aihub.py

If you have a Snapdragon 8 Gen 5 Android device, here are the steps to run the complete model inference on it. Place all your files directly within the contestant_uploads folder, or set --uploads_dir="./path_to_files".
Run the following command within the conda environment:
python inference_script.py

Files will be pulled back into the Host_Outputs folder.
In example1/README.md of the official tutorial, there is a typo in the aimetpro-release version tag.
Incorrect:
docker pull artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:1.34.torch-gpu-pt113
cd aimetpro-release-1.34_build-*.torch-gpu-pt113-release

Correct:
docker pull artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:1.34.0.torch-gpu-pt113
cd aimetpro-release-1.34.0_build-*.torch-gpu-pt113-release

Please use the corrected tags above when following example1.
After installing the AIMET environment, you must add the QNN SDK path to your environment variables.
Run the following commands, replacing YOUR_QNN_LIB_PATH with the actual path:
QNN_SDK_ROOT="YOUR_QNN_LIB_PATH"
export PYTHONPATH=$QNN_SDK_ROOT/lib/python:\
$QNN_SDK_ROOT/lib/python/qti/aisw/converters/common/linux-x86_64:$PYTHONPATH
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib:\
$QNN_SDK_ROOT/lib/x86_64-linux-clang:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/python/qti/aisw/converters/common/linux-x86_64:\
$LD_LIBRARY_PATH
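After exporting these variables, a quick check that the paths resolve can save a confusing import failure later. A minimal sketch; it only verifies the directories exist, not that every QNN module imports cleanly:

```python
import os

# Directories referenced by the PYTHONPATH/LD_LIBRARY_PATH exports above.
qnn_root = os.environ.get("QNN_SDK_ROOT", "")
for sub in ("lib/python",
            "lib/python/qti/aisw/converters/common/linux-x86_64",
            "lib/x86_64-linux-clang"):
    path = os.path.join(qnn_root, sub)
    status = "ok" if os.path.isdir(path) else "MISSING"
    print(f"{status}: {path}")
```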
Quantization calibration

- Text: llava_v1_5_mix665k_300.json from Hugging Face
- Images: COCO 2017 dataset
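A minimal sketch of iterating the calibration data is below. The LLaVA-style schema (entries with "image" and "conversations" fields) and the ./coco2017 image root are assumptions; adjust field names and paths to match the actual file:

```python
import json
from PIL import Image

# Assumed schema: LLaVA-mix entries with "image" and "conversations" fields.
with open("llava_v1_5_mix665k_300.json") as f:
    samples = json.load(f)

for sample in samples:
    # Image paths are assumed to be relative to the COCO 2017 root.
    image = Image.open(f"./coco2017/{sample['image']}").convert("RGB")
    prompt = sample["conversations"][0]["value"]
    # ...feed (image, prompt) through the model's preprocessing for calibration
```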
PPL metric evaluation

- Text: wiki103_test_long.json

Please refer to this Qualcomm support link for details on the wiki103_test_long.json format and conversion instructions.
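For reference, perplexity is the exponential of the average negative log-likelihood per token. A minimal sketch given per-token logits and target ids (the tensor shapes are the usual convention, not something mandated by the tutorial):

```python
import torch
import torch.nn.functional as F

def perplexity(logits: torch.Tensor, targets: torch.Tensor) -> float:
    """Perplexity = exp(mean negative log-likelihood per token).

    logits: (seq_len, vocab_size) raw model outputs
    targets: (seq_len,) ground-truth token ids
    """
    nll = F.cross_entropy(logits, targets, reduction="mean")
    return torch.exp(nll).item()
```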
In Step 1 of Example 1A, when checking and installing dependencies:
- The dependencies under “other lib” can be safely commented out.
If you use this sample solution or refer to the IEEE Low-Power Computer Vision Challenge, please cite the challenge as follows:
@misc{lpcvc,
author = {{IEEE Low Power Computer Vision Challenge Organizing Committee}},
title = {{IEEE Low Power Computer Vision Challenge}},
howpublished = {\url{https://lpcv.ai/}},
note = {Annual competition series on low power computer vision}
}