imlixinyang/SynergyAmodal

SynergyAmodal 😷⇒☺️ : Deocclude Anything with Text Control

⭐ Key components of SynergyAmodal:

  • A full completion diffusion model achieving zero-shot generalization and textual controllability.

🔥 News:

  • 🥰 Check out our new Gradio demo by running python app.py.

  • 😊 Our paper has been accepted to ACMMM 2025.

🔧 Installation

  • create a new conda environment
conda create -n synergyamodal python=3.10
conda activate synergyamodal
  • install pytorch (or use your own if it is compatible with xformers)
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=11.8 -c pytorch -c nvidia
  • install xformers for memory-efficient attention
conda install xformers -c xformers
  • clone this repo:
git clone https://github.com/imlixinyang/SynergyAmodal.git
cd SynergyAmodal
  • install pip packages (run inside the cloned repository, which provides requirements.txt)
pip install -r requirements.txt
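
After installation, it can help to confirm that the packages named in the steps above are visible in the active conda environment. The following is a minimal sketch, not part of this repository: the helper name missing_packages and the REQUIRED list are illustrative assumptions, and the check only tests whether each package can be located, not whether its version is compatible.

```python
import importlib.util

# Packages named in the installation steps above. This list is an assumption
# for illustration; extend it with entries from requirements.txt as needed.
REQUIRED = ["torch", "torchvision", "torchaudio", "xformers"]


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages can be located.")
```

An empty result only means the packages can be found; a CUDA or xformers version mismatch would still surface later at import or run time.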

Model

Download the pre-trained model by:

wget "https://huggingface.co/cloudyfall/DeoccAnything/resolve/main/vae_ckpt_dir/epoch%3D5-step%3D100000.ckpt?download=true" -O vae.ckpt
wget "https://huggingface.co/cloudyfall/DeoccAnything/resolve/main/ldm_ckpt_dir/epoch%3D8-step%3D58000.ckpt?download=true" -O ldm.ckpt
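
The %3D sequences in the URLs above are the percent-encoded form of "=" in the checkpoint file names. For environments without wget, the same downloads could be sketched in Python with only the standard library. The helper names checkpoint_url and download_checkpoints are hypothetical, and the sketch assumes the files are also served without the ?download=true query string.

```python
import os
from urllib.parse import quote
from urllib.request import urlretrieve

REPO = "cloudyfall/DeoccAnything"

# Local file name -> path inside the Hugging Face repository (taken from the
# wget commands above, with %3D decoded back to "=").
CHECKPOINTS = {
    "vae.ckpt": "vae_ckpt_dir/epoch=5-step=100000.ckpt",
    "ldm.ckpt": "ldm_ckpt_dir/epoch=8-step=58000.ckpt",
}


def checkpoint_url(path_in_repo):
    """Build the download URL; quote() keeps "/" and encodes "=" as %3D."""
    return f"https://huggingface.co/{REPO}/resolve/main/{quote(path_in_repo)}"


def download_checkpoints(dest_dir="."):
    """Download every checkpoint into dest_dir (hypothetical helper)."""
    os.makedirs(dest_dir, exist_ok=True)
    for local_name, path_in_repo in CHECKPOINTS.items():
        urlretrieve(checkpoint_url(path_in_repo), os.path.join(dest_dir, local_name))
```

Calling download_checkpoints() from the repository root would place vae.ckpt and ldm.ckpt next to app.py, matching the names used by the wget commands.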

Download the pre-trained weights of ZIM from the official ZIM repository. We thank the ZIM team for this excellent work.

Dataset

Download and extract our dataset with:

wget https://huggingface.co/datasets/cloudyfall/SynergyAmodal16K/resolve/main/dataset.tar.gz
tar -xzvf dataset.tar.gz
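
Where tar is unavailable, or the extraction should happen inside a Python script, the archive could be unpacked with the standard tarfile module. This is a minimal sketch under stated assumptions: extract_dataset is a hypothetical helper, the path check is a basic guard against entries that would land outside the target directory, and nothing here relies on the internal layout of dataset.tar.gz.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive, dest):
    """Extract a .tar.gz archive into dest and return its top-level entries."""
    dest = Path(dest).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Basic safety check: refuse members that resolve outside dest.
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```

For example, extract_dataset("dataset.tar.gz", ".") mirrors the tar command above and returns the names found at the top level of the target directory.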

The original images are from the official EntitySeg repository. Please follow its license, which allows only non-commercial research use.

Citation

@article{li2025synergyamodal,
  title={SynergyAmodal: Deocclude Anything with Text Control},
  author={Li, Xinyang and Yi, Chengjie and Lai, Jiawei and Lin, Mingbao and Qu, Yansong and Zhang, Shengchuan and Cao, Liujuan},
  journal={arXiv preprint arXiv:2504.19506},
  year={2025}
}

License

Our code and annotations are released under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International) License, for academic research use only.

If you have any questions, please contact me via imlixinyang@gmail.com.

About

Code, Dataset, and Models for "SynergyAmodal: Deocclude Anything with Text Control" (ACMMM 2025)
