HDPNet: Hourglass Vision Transformer with Dual-Path Feature Pyramid for Camouflaged Object Detection
This repository contains the implementation of HDPNet, a state-of-the-art model for camouflaged object detection. The model combines an Hourglass Vision Transformer with a Dual-Path Feature Pyramid to detect camouflaged objects accurately.
- Hourglass Vision Transformer architecture
- Dual-Path Feature Pyramid for multi-scale feature extraction
- High accuracy on various camouflaged object detection datasets
- Easy-to-use inference script with visualization capabilities
- Create a conda environment:
conda create -n HDPNet python=3.8
conda activate HDPNet
- Install required packages:
pip install -r requirements.txt
Download the pretrained model weights from the following link:
After downloading:
- Create a weights directory in the project root:
mkdir weights
- Place the downloaded weights file in the weights directory:
mv /path/to/downloaded/weights.pth weights/
To run inference on test images:
- Place your test images in the testdata directory:
mkdir -p testdata
# Copy your images to testdata/
- Run the inference script:
python inference.py
This will:
- Process all images in the testdata directory
- Save predictions in results/predictions
- Create side-by-side visualizations in results/visualizations
- Generate a video visualization in results/visualization.mp4
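The output layout described above can be sketched as a small path-mapping helper. This is an illustration only: the function name `planned_outputs`, the set of accepted extensions, and the assumption that each output reuses the input's file stem with a `.png` suffix are hypothetical and are not taken from `inference.py`.

```python
from pathlib import Path

# Assumed set of image extensions; inference.py may accept others.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def planned_outputs(image_names, results_dir="results"):
    """Map each input image name to the paths where its outputs might be written.

    Assumption: outputs reuse the input file stem and are saved as PNG.
    """
    results_dir = Path(results_dir)
    plan = {}
    for name in image_names:
        path = Path(name)
        if path.suffix.lower() not in IMAGE_EXTS:
            continue  # skip files that do not look like images
        plan[name] = {
            "prediction": results_dir / "predictions" / f"{path.stem}.png",
            "visualization": results_dir / "visualizations" / f"{path.stem}.png",
        }
    return plan
```

Calling `planned_outputs(p.name for p in Path("testdata").iterdir())` before a run gives a quick preview of which files would be picked up under these assumptions.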
Here are some example results showing the original images and their predictions:
Left: Original Image, Right: Prediction
- Input: RGB images (any size, will be resized to 384x384)
- Output:
  - Binary mask predictions (0-255 grayscale)
  - Side-by-side visualizations
  - Video compilation of results
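To make the "0-255 grayscale" output format concrete, here is a minimal sketch of how a per-pixel score in [0, 1] could be scaled to an 8-bit value and, optionally, thresholded into a strictly two-valued mask. The function names and the default threshold of 128 are assumptions for illustration; this README does not state how `inference.py` performs the conversion.

```python
def to_grayscale(prob_map):
    """Scale scores in [0, 1] to integers in [0, 255], clamping out-of-range values."""
    return [
        [int(round(min(max(p, 0.0), 1.0) * 255)) for p in row]
        for row in prob_map
    ]


def binarize(gray_map, threshold=128):
    """Turn an 8-bit grayscale map into a two-valued mask (0 or 255).

    The threshold is a hypothetical default, not a value from the repository.
    """
    return [[255 if v >= threshold else 0 for v in row] for row in gray_map]
```

Keeping the grayscale values preserves the model's confidence per pixel, while the thresholded version is what many evaluation scripts expect when a hard mask is required.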
HDPNet/
├── weights/ # Model weights
├── testdata/ # Input test images
├── results/
│ ├── predictions/ # Binary mask predictions
│ ├── visualizations/ # Side-by-side visualizations
│ └── visualization.mp4 # Video compilation
├── inference.py # Inference script
└── requirements.txt # Dependencies
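Before running inference, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library; the helper name `missing_items` is hypothetical, and the `.pth` extension is assumed from the `weights.pth` example in the setup steps. The `results/` directory is not checked, on the assumption that the inference script creates it.

```python
from pathlib import Path


def missing_items(root="."):
    """Return a list of human-readable problems with the project layout."""
    root = Path(root)
    problems = []
    if not (root / "inference.py").is_file():
        problems.append("inference.py not found")
    weights_dir = root / "weights"
    # Assumes the weights file uses a .pth extension.
    if not weights_dir.is_dir() or not any(weights_dir.glob("*.pth")):
        problems.append("no .pth file in weights/")
    testdata = root / "testdata"
    if not testdata.is_dir() or not any(testdata.iterdir()):
        problems.append("testdata/ is missing or empty")
    return problems
```

An empty list means the three inputs the script relies on are present; otherwise each entry names what to fix first.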
The model achieves state-of-the-art performance on various camouflaged object detection datasets. For detailed quantitative results, please refer to the paper.
If you use this code in your research, please cite:
@inproceedings{he2025hdpnet,
title={HDPNet: Hourglass Vision Transformer with Dual-Path Feature Pyramid for Camouflaged Object Detection},
author={He, Jinpeng and Liu, Biyuan and Chen, Huaixin},
booktitle={2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
pages={8638--8647},
year={2025},
organization={IEEE}
}
This project is licensed under the MIT License - see the LICENSE file for details.