⭐ Key components of SynergyAmodal:
- A full completion diffusion model achieving zero-shot generalization and textual controllability.
🔥 News:
- 🥰 Check out our new Gradio demo by simply running `python app.py`.
- 😊 Our paper is accepted by ACM MM 2025.
- Create a new conda environment:

```shell
conda create -n synergyamodal python=3.10
conda activate synergyamodal
```
- Install PyTorch (or use your own installation if it is compatible with `xformers`):

```shell
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=11.8 -c pytorch -c nvidia
```
- Install `xformers` for memory-efficient attention:

```shell
conda install xformers -c xformers
```
- Install pip packages:

```shell
pip install -r requirements.txt
```
- Clone this repo:

```shell
git clone https://github.com/imlixinyang/SynergyAmodal.git
cd SynergyAmodal
```
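After finishing the steps above, a quick sanity check can confirm that the key packages are visible to your Python interpreter. This is a minimal sketch, not part of the repo; it only reports whether each package can be imported:

```python
# Minimal environment sanity check: report whether the packages installed
# above can be found. Purely informational; safe to run in any environment.
import importlib.util

packages = ("torch", "torchvision", "xformers")
results = {pkg: importlib.util.find_spec(pkg) is not None for pkg in packages}
for pkg, found in results.items():
    print(f"{pkg}: {'found' if found else 'MISSING'}")
```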
Download the pre-trained models by:

```shell
wget https://huggingface.co/cloudyfall/DeoccAnything/resolve/main/vae_ckpt_dir/epoch%3D5-step%3D100000.ckpt?download=true -O vae.ckpt
wget https://huggingface.co/cloudyfall/DeoccAnything/resolve/main/ldm_ckpt_dir/epoch%3D8-step%3D58000.ckpt?download=true -O ldm.ckpt
```
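The `.ckpt` files are ordinary PyTorch checkpoints, so one hedged way to verify the downloads (assuming `torch` is installed; the checkpoints' exact key layout is not guaranteed) is to load them on CPU and peek at the top-level keys:

```python
# Sketch: inspect the downloaded checkpoints without touching the GPU.
# Assumes vae.ckpt / ldm.ckpt sit in the current directory (downloaded above).
import os

found = []
for name in ("vae.ckpt", "ldm.ckpt"):
    if not os.path.exists(name):
        print(f"{name}: not found -- run the wget commands above first")
        continue
    import torch  # deferred import so the check still runs pre-installation
    ckpt = torch.load(name, map_location="cpu")
    print(name, "top-level keys:", sorted(ckpt)[:5])
    found.append(name)
```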
Download the pre-trained weights of ZIM from the official ZIM repository. We thank the ZIM team for this excellent work.
Download and extract our dataset with:

```shell
wget https://huggingface.co/datasets/cloudyfall/SynergyAmodal16K/resolve/main/dataset.tar.gz
tar -xzvf dataset.tar.gz
```
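For a quick look at what was extracted, a small sketch like the following can list a few files. The top-level directory name `dataset` is an assumption here; adjust it to whatever the tarball actually creates:

```python
# Sketch: list a few entries from the extracted dataset directory.
from pathlib import Path

root = Path("dataset")  # assumed extraction directory; adjust if different
entries = []
if root.is_dir():
    entries = sorted(p for p in root.rglob("*") if p.is_file())[:10]
    for p in entries:
        print(p)
else:
    print("dataset/ not found -- extract dataset.tar.gz first")
```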
The original images are from the official EntitySeg repository. Please follow its license, which allows non-commercial research use only.
```bibtex
@article{li2025synergyamodal,
  title={SynergyAmodal: Deocclude Anything with Text Control},
  author={Li, Xinyang and Yi, Chengjie and Lai, Jiawei and Lin, Mingbao and Qu, Yansong and Zhang, Shengchuan and Cao, Liujuan},
  journal={arXiv preprint arXiv:2504.19506},
  year={2025}
}
```
Our code and annotations are released under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International) license, for academic research use only.
If you have any questions, please contact me via imlixinyang@gmail.com.
