[CVPR 2026] VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

Junwen Tan, Jinglin Liang, Hongyuan Chen, Shuangping Huang

South China University of Technology

Figure 1. Comparison between VDE and standard 50-step sampling across Flux, Qwen-Image, and Wan2.1. VDE achieves comparable visual quality with dramatically reduced runtime (up to 3.01× speedup).

💡 Introduction

Although Rectified Flow (RF) models have achieved remarkable performance in visual generation, their practical deployment is hindered by slow inference. Previous training-free acceleration methods typically follow a caching-and-reusing paradigm, which neglects the growing mismatch between static cached values and the evolving inputs.

We propose Velocity Decomposition and Estimation (VDE), a novel method that shifts the paradigm from caching-and-reusing to decomposing-and-estimating.

VDE decomposes the model's velocity output into components parallel and orthogonal to the input.
It exploits the temporal predictability of the components' coefficients and the consistency of the orthogonal direction.
VDE periodically anchors the model's state and precisely estimates subsequent outputs analytically in an inherently input-adaptive manner.
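
The decomposition described above can be illustrated with a small NumPy sketch. It is not the released implementation: the function names `decompose_velocity` and `estimate_velocity` are hypothetical, and holding the coefficients constant between anchors is a simplifying assumption standing in for whatever temporal predictor the paper actually uses.

```python
import numpy as np

def decompose_velocity(x, v, eps=1e-12):
    """Split velocity v into parts parallel and orthogonal to the input x.

    Returns (alpha, beta, u) with v ~= alpha * x + beta * u, where u is a
    unit-norm tensor orthogonal to x. Names are illustrative only.
    """
    x_flat = x.ravel().astype(np.float64)
    v_flat = v.ravel().astype(np.float64)
    # Coefficient of the component parallel to x (scalar projection).
    alpha = np.dot(v_flat, x_flat) / (np.dot(x_flat, x_flat) + eps)
    # Whatever is left after removing the parallel part is orthogonal to x.
    residual = v_flat - alpha * x_flat
    beta = np.linalg.norm(residual)
    u = residual / (beta + eps)
    return alpha, beta, u.reshape(x.shape)

def estimate_velocity(x_new, alpha, beta, u):
    """Estimate the velocity at a new input without calling the model.

    Assumption: alpha and beta are reused unchanged from the anchor step and
    the orthogonal direction u is treated as fixed. The parallel term still
    follows x_new, which is what makes the estimate input-adaptive.
    """
    return alpha * x_new + beta * u

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_anchor = rng.standard_normal((4, 8))   # stand-in for a latent
    v_anchor = rng.standard_normal((4, 8))   # stand-in for the model output
    alpha, beta, u = decompose_velocity(x_anchor, v_anchor)
    x_next = x_anchor + 0.02 * v_anchor      # one small Euler step
    v_est = estimate_velocity(x_next, alpha, beta, u)
    print("alpha:", alpha, "beta:", beta, "estimate shape:", v_est.shape)
```

At the anchor step itself the reconstruction `alpha * x + beta * u` recovers the model output exactly (up to floating-point error). At later steps the quality of the estimate depends on how well the coefficients and the orthogonal direction are predicted.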

VDE achieves 2.04× to 3.22× acceleration with minimal loss in visual quality. In image generation, it improves on the best cache-based baseline by 19.5% in SSIM and 30.3% in PSNR, and reduces LPIPS by 55.4%.

🔥 Latest News

[2026/06/07] 🗓️ VDE will be presented at CVPR 2026: Sun, Jun 7, 2026, 3:30 PM – 5:30 PM MDT, ExHall A 162.
[2026/05/31] 📄 VDE is available on CVF Open Access.
[2026/05/30] 🚀 The code for VDE is officially released! Supports image and video generation/editing.
[2026/05/22] 📄 VDE is available on arXiv.
[2026/02/21] 🎉 VDE is accepted by CVPR 2026!

🛠️ Supported Models

VDE is highly versatile and supports a wide range of state-of-the-art Rectified Flow models across modalities:

🎨 Image Generation

FLUX, Qwen-Image, Z-Image

🎥 Video Generation

Wan2.1, HunyuanVideo1.5, CogVideoX1.5

🧊 3D Generation

Trellis2

⚡ Performance & Demos

1. FLUX-dev (Text-to-Image)

Baseline Latency (T=50): 8.20 s

Method	Speedup ↑	Latency ↓	Steps ↓	SSIM ↑	PSNR ↑	LPIPS ↓	CLIP ↑	ImageReward ↑
VDE-fast	3.01×	2.72 s	16	0.8267	23.19	0.1997	0.3109	0.969
VDE-medium	2.70×	3.04 s	18	0.8499	24.02	0.1679	0.3102	0.973
VDE-slow	2.21×	3.70 s	22	0.8877	25.81	0.1243	0.3095	0.978
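
As a sanity check on how the Speedup column relates to the other columns, the sketch below recomputes it as baseline latency divided by method latency, using only the FLUX-dev numbers reported above. The step ratio is included under the assumption that "Steps" counts full model evaluations, which would make it a rough upper bound on the achievable speedup. Variable names are illustrative.

```python
# Values copied from the FLUX-dev table above.
baseline_latency_s = 8.20
baseline_steps = 50

reported = {
    # name: (reported speedup, latency in seconds, steps)
    "VDE-fast": (3.01, 2.72, 16),
    "VDE-medium": (2.70, 3.04, 18),
    "VDE-slow": (2.21, 3.70, 22),
}

recomputed = {}
for name, (speedup, latency_s, steps) in reported.items():
    # Wall-clock speedup relative to the 50-step baseline.
    latency_speedup = baseline_latency_s / latency_s
    # Idealized speedup if runtime scaled purely with step count (assumption).
    step_ratio = baseline_steps / steps
    recomputed[name] = (latency_speedup, step_ratio)
    print(f"{name}: reported {speedup:.2f}x, "
          f"from latency {latency_speedup:.3f}x, "
          f"step ratio {step_ratio:.3f}x")
```

The recomputed values agree with the table to within 0.01. The small differences are consistent with the latencies having been rounded to two decimals before publication.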

2. Qwen-Image (Text-to-Image)

Baseline Latency (T=50): 12.53 s

Method	Speedup ↑	Latency ↓	Steps ↓	SSIM ↑	PSNR ↑	LPIPS ↓	CLIP ↑	ImageReward ↑
VDE-fast	2.70×	4.64 s	18	0.8967	25.46	0.1096	0.3163	1.287
VDE-slow	2.04×	6.14 s	24	0.9362	28.58	0.0691	0.3159	1.295

3. Wan2.1 (Text-to-Video)

Baseline Latency (T=50, 81 frames, 832×480): 175.35 s

Method	Speedup ↑	Latency ↓	Steps ↓	SSIM ↑	PSNR ↑	LPIPS ↓	VBench (%) ↑
VDE-fast	2.50×	70.11 s	20	0.8658	24.69	0.0754	80.43
VDE-slow	2.08×	84.18 s	24	0.8902	25.92	0.0554	80.32

📋 To-Do List

Release core VDE algorithm and paper.
Support Text-to-Image (FLUX, Qwen, Z-Image, HiDream).
Support Text-to-Video (Wan2.1, HunyuanVideo, Open-Sora).
Release ComfyUI Custom Nodes.
Upstream PR to Hugging Face diffusers.

💐 Acknowledgement

This project builds upon several excellent open-source projects, including Diffusers, FLUX, Qwen-Image, Z-Image, Wan2.1, and HunyuanVideo. We sincerely thank the authors for their contributions to the community.

🔒 License

This project is licensed under the Apache License 2.0.

📖 Citation

If you find VDE useful for your research or applications, please consider giving us a star ⭐ and citing our paper:

@inproceedings{tan2026vde,
  title={VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation},
  author={Tan, Junwen and Liang, Jinglin and Chen, Hongyuan and Huang, Shuangping},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={37918--37928},
  year={2026}
}
