LPCVC-Track-3

This repository contains the sample solution for LPCVC 2026 Track 3: AI Generated Images Detection. It also contains instructions for preparing the necessary files for submission.

Our approach is based on Qwen2-VL-2B-Instruct, following the model preparation tutorial from QPM.

Sample Solution

Download the complete sample solution zip here. Several files are not included in the GitHub repository due to file size limits. The zip file is 1.8 GB; fully unzipped, it is 2.4 GB.


0. Download the Tutorial

Download "Tutorial for Qwen2_VL_2b (IoT)" from Qualcomm Package Manager (QPM).

1. Getting started

Starting with the README.md in example1, work sequentially through the example1 and example2 directories to generate the files.

  • example1
    • PyTorch Model optimization and export using AIMET
  • example2
    • Preparation and conversion of ONNX model to Qualcomm NN

For more guidance, see the Tutorial Guidance section below.

2. Preparing files for submission

Once all files have been generated, assemble the files needed for submission.

Files needed for submission:

submission_files
├── ar*-ar*-cl*
│   └── weight_sharing_model_1_of_1.serialized.bin
├── embedding_weights*.raw
├── inputs.json
├── mask.raw
├── position_ids_cos.raw
├── position_ids_sin.raw
├── serialized_binaries
│   └── veg.serialized.bin
└── tokenizer.json

Note: Our evaluation also supports Qwen2.5-VL-based models. Instead of submitting mask.raw, include full_attention_mask.raw and window_attention_mask.raw in your submission files.

All files, excluding inputs.json, should have been generated by running example1 and example2 from the tutorial. We will provide additional instructions for generating inputs.json below.

The ar*-ar*-cl* folder can be given any name matching the pattern, which gives contestants more flexibility. In this sample solution, the folder is named ar128-ar1-cl2048 and contains only one file.

The embedding_weights*.raw file can likewise be given any name matching the pattern. In this sample solution, the file is named embedding_weights_151936x1536.raw.
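Before zipping anything, it can help to verify that your names actually match these patterns. Below is a minimal sketch using only the Python standard library; the submission_files directory name is an assumption for illustration.

import fnmatch
from pathlib import Path

# Root of your prepared submission folder (name assumed for illustration).
root = Path("submission_files")

# Exactly one directory matching ar*-ar*-cl*, holding the weight-sharing binary.
ar_dirs = [p for p in root.iterdir()
           if p.is_dir() and fnmatch.fnmatch(p.name, "ar*-ar*-cl*")]
assert len(ar_dirs) == 1, f"expected one ar*-ar*-cl* folder, found {[p.name for p in ar_dirs]}"
assert (ar_dirs[0] / "weight_sharing_model_1_of_1.serialized.bin").is_file()

# Exactly one file matching embedding_weights*.raw.
emb = list(root.glob("embedding_weights*.raw"))
assert len(emb) == 1, f"expected one embedding_weights*.raw file, found {[p.name for p in emb]}"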

The needed files/folders can be found in the following locations:

Tutorial_for_Qwen2_VL_2b_IoT
├── example1
│   ├── Example1A
│   │   └── output_dir
│   │       └── veg_exports
│   │           ├── mask.raw
│   │           ├── position_ids_cos.raw
│   │           └── position_ids_sin.raw
│   └── Example1B
│       └── output_dir
│           ├── embedding_weights_151936x1536.raw
│           └── tokenizer
│               └── tokenizer.json
├── example2
│   ├── Example2A
│   │   └── host_linux
│   │       └── exports
│   │           └── serialized_binaries
│   │               └── veg.serialized.bin
│   └── Example2B
│       └── host_linux
│           └── assets
│               └── artifacts
│                   └── ar128-ar1-cl2048
│                       └── weight_sharing_model_1_of_1.serialized.bin
└── example3
    └── qnn_model_execution.ipynb
        └─> Extract several parameters into `inputs.json`
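If you prefer to script this step, here is a minimal sketch that copies the files from the tutorial tree above into a flat submission_files folder. The paths and the Qwen2-VL-specific file names are taken from the tree; adjust them if your model produces differently named artifacts.

import shutil
from pathlib import Path

tutorial = Path("Tutorial_for_Qwen2_VL_2b_IoT")
out = Path("submission_files")
out.mkdir(exist_ok=True)

veg_exports = tutorial / "example1/Example1A/output_dir/veg_exports"
example1b = tutorial / "example1/Example1B/output_dir"

# Loose files at the top level of the submission folder.
for src in [
    veg_exports / "mask.raw",
    veg_exports / "position_ids_cos.raw",
    veg_exports / "position_ids_sin.raw",
    example1b / "embedding_weights_151936x1536.raw",
    example1b / "tokenizer/tokenizer.json",
]:
    shutil.copy(src, out / src.name)

# The two subdirectories, copied as-is.
shutil.copytree(tutorial / "example2/Example2A/host_linux/exports/serialized_binaries",
                out / "serialized_binaries", dirs_exist_ok=True)
shutil.copytree(tutorial / "example2/Example2B/host_linux/assets/artifacts/ar128-ar1-cl2048",
                out / "ar128-ar1-cl2048", dirs_exist_ok=True)

inputs.json still needs to be created by hand; see the next section.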

Creating inputs.json

This file contains the parameters needed to run model inference. In the tutorial, these values were hard-coded in the notebooks for Qwen2-VL-2B. To support models beyond Qwen2-VL-2B in our inference script, contestants must provide them explicitly.

inputs.json should be a JSON file containing the following parameters extracted from qnn_model_execution.ipynb in example3:

inputs.json

{
  "qwen_vl_processor": "Qwen/Qwen2-VL-2B-Instruct",
  "llm_config": "Qwen/Qwen2-VL-2B-Instruct",
  "data_preprocess_inp_h": 342,
  "data_preprocess_inp_w": 512,
  "run_veg_n_tokens": 216,
  "run_veg_embedding_dim": 1536,
  "genie_config": { "...": "omitted for length" }
}

Table of parameters

Each entry lists the parameter, its meaning, and where to find it in qnn_model_execution.ipynb:

  • qwen_vl_processor: name of the Qwen VL processor.
    Found in: qwen2_vl_processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
  • llm_config: name of the LLM.
    Found in: llm_config = AutoConfig.from_pretrained("Qwen/Qwen2-VL-2B-Instruct", trust_remote_code=True)
  • data_preprocess_inp_h: image height taken by the data_preprocess function.
    Found in: inputs = data_preprocess(qwen2_vl_processor, image_file, 342, 512, prompt)
  • data_preprocess_inp_w: image width taken by the data_preprocess function.
    Found in: the same data_preprocess call as above
  • run_veg_n_tokens: second dimension of the output shape in the run_veg function.
    Found in: output_data = output_data.reshape((1, 216, 1536))
  • run_veg_embedding_dim: third dimension of the output shape in the run_veg function.
    Found in: the same reshape call as above
  • genie_config: the entire JSON found in the code cell under "Creating Genie Config JSON". Reminder: the boolean values true and false must be lowercase. Do not worry about the paths; we will set them for you.
    Found in: genie_config = { ...... }
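Once you have copied the values out of the notebook, writing the file with Python's json module avoids the lowercase-boolean pitfall, since json.dump serializes Python's True/False as true/false. The sketch below uses this sample solution's values; the empty genie_config is a placeholder for your full config. (As a hedged aside on where 216 comes from: assuming Qwen2-VL's 14-pixel patches with 2x2 token merging, a 342x512 input rounds to 336x504 pixels, i.e. (336/28) * (504/28) = 216 visual tokens.)

import json

# Values from this Qwen2-VL-2B sample solution; substitute your own model's values.
params = {
    "qwen_vl_processor": "Qwen/Qwen2-VL-2B-Instruct",
    "llm_config": "Qwen/Qwen2-VL-2B-Instruct",
    "data_preprocess_inp_h": 342,
    "data_preprocess_inp_w": 512,
    "run_veg_n_tokens": 216,
    "run_veg_embedding_dim": 1536,
    # Paste the full dict from the "Creating Genie Config JSON" cell here.
    "genie_config": {},
}

with open("inputs.json", "w") as f:
    json.dump(params, f, indent=2)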

3. Uploading your submission

Place all necessary submission files mentioned above into a folder named with your team name. The folder should contain:

  • 2 subdirectories
  • 2 .json files
  • 4 .raw files

Zip the folder and also name the zip file with your team name. Your final submission should look like:

team_name.zip/
└── team_name/
    ├── ar*-ar*-cl*/
    ├── serialized_binaries/
    ├── embedding_weights*.raw
    ├── inputs.json
    ├── mask.raw
    ├── position_ids_cos.raw
    ├── position_ids_sin.raw
    └── tokenizer.json
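A minimal packaging sketch is shown below; team_name is a placeholder. It checks the expected counts from the list above before creating the archive, so a missing file is caught early.

import shutil
from pathlib import Path

team = Path("team_name")  # folder prepared as described above

# Expect 2 subdirectories, 2 .json files, and 4 .raw files at the top level.
# (Qwen2.5-VL submissions have 5 .raw files, since two masks replace mask.raw.)
entries = list(team.iterdir())
assert sum(p.is_dir() for p in entries) == 2, "expected 2 subdirectories"
assert sum(p.suffix == ".json" for p in entries) == 2, "expected 2 .json files"
assert sum(p.suffix == ".raw" for p in entries) == 4, "expected 4 .raw files"

# Produces team_name.zip with the team_name/ folder at the archive root.
shutil.make_archive(team.name, "zip", root_dir=".", base_dir=team.name)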

4. Evaluating the quantization error (stepwise) for AI Hub inference

If you don't have a Snapdragon 8 Gen 5 Android device, we provide three scripts to help you evaluate the quantization error of your model. The steps are below.

1. Generate input and output on the server

Copy llm_inout.py to /Tutorial_for_Qwen2_VL_2b_IoT/example1/Example1B and run:

python llm_inout.py --save_path="./path_to_inout"

2. Submit to AI Hub

Set submission_num in the output path accordingly:

python inference_multi.py \
  --device_model="Snapdragon 8 Elite Gen 5 QRD" \
  --model_id="mm00000xx" \
  --load_path="./path_to_input/inputs*.pt" \
  --out_path="./output/submission_num"

3. Compute quantization error

Set NUM_TOKEN and BATCH accordingly:

python compute_score_multi_aihub.py
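compute_score_multi_aihub.py performs the actual scoring. For intuition only, here is a rough, hedged sketch of the kind of comparison involved: load a server-side reference tensor and the corresponding AI Hub output and measure how far quantization moved them. The file names are hypothetical; use the paths produced by llm_inout.py and inference_multi.py.

import torch

# Hypothetical file names for one sample; adjust to your actual output layout.
ref = torch.load("path_to_inout/output_0.pt").float().flatten()
dev = torch.load("output/submission_num/output_0.pt").float().flatten()

# Two common views of quantization error: relative L2 error and cosine similarity.
rel_l2 = (ref - dev).norm() / ref.norm()
cosine = torch.nn.functional.cosine_similarity(ref, dev, dim=0)
print(f"relative L2 error: {rel_l2:.4f}, cosine similarity: {cosine:.4f}")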

5. Running Inference

If you have a Snapdragon 8 Gen 5 Android device, here are the steps to run the complete model inference on it. Place all your files directly within the contestant_uploads folder, or set --uploads_dir="./path_to_files".

Run the following command within the conda environment:

python inference_script.py

Files will be pulled back into the Host_Outputs folder.


Tutorial Guidance

1. Tutorial Typo Fix (example1/README.md)

In example1/README.md of the official tutorial, there is a typo in the aimetpro-release version tag.

Incorrect:

docker pull artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:1.34.torch-gpu-pt113
cd aimetpro-release-1.34_build-*.torch-gpu-pt113-release

Correct:

docker pull artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:1.34.0.torch-gpu-pt113
cd aimetpro-release-1.34.0_build-*.torch-gpu-pt113-release

Please use the corrected tags above when following example1.


2. QNN Environment Setup

After installing the AIMET environment, you must add the QNN SDK path to your environment variables.

Run the following commands, replacing YOUR_QNN_LIB_PATH with the actual path:

QNN_SDK_ROOT="YOUR_QNN_LIB_PATH"

export PYTHONPATH=$QNN_SDK_ROOT/lib/python:\
$QNN_SDK_ROOT/lib/python/qti/aisw/converters/common/linux-x86_64:$PYTHONPATH

export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib:\
$QNN_SDK_ROOT/lib/x86_64-linux-clang:$LD_LIBRARY_PATH

export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/python/qti/aisw/converters/common/linux-x86_64:\
$LD_LIBRARY_PATH
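To confirm the variable points at a real SDK before running conversions, a quick sketch like the following checks that the directories referenced above exist. It assumes QNN_SDK_ROOT has been exported (add export QNN_SDK_ROOT if you used a plain assignment as above).

import os
from pathlib import Path

sdk = Path(os.environ.get("QNN_SDK_ROOT", ""))
for sub in ("lib/python",
            "lib/x86_64-linux-clang",
            "lib/python/qti/aisw/converters/common/linux-x86_64"):
    status = "OK" if (sdk / sub).is_dir() else "MISSING"
    print(f"{status}: {sdk / sub}")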

3. Dataset Sources (Example1A/config/veg_config.json)

Quantization calibration

  • Text: llava_v1_5_mix665k_300.json from Hugging Face
  • Images: COCO 2017 dataset

PPL metric evaluation

  • Text: wiki103_test_long.json

Please refer to this Qualcomm support link for details on the wiki103_test_long.json format and conversion instructions.


4. Dependency Installation Note (Example1A/veg.ipynb)

In Step 1 of Example 1A, when checking and installing dependencies:

  • The dependencies under “other lib” can be safely commented out.

Citation

If you use this sample solution or refer to the IEEE Low-Power Computer Vision Challenge, please cite the challenge as follows:

@misc{lpcvc,
  author       = {{IEEE Low Power Computer Vision Challenge Organizing Committee}},
  title        = {{IEEE Low Power Computer Vision Challenge}},
  howpublished = {\url{https://lpcv.ai/}},
  note         = {Annual competition series on low power computer vision}
}
