📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Updated Feb 27, 2026 - Python
Pre-built wheels that eliminate Flash Attention 3 installation headaches.
Demonstration for the Qwen-Image-Edit-2511 model with lazy-loaded LoRA adapters for advanced single- and multi-image editing. Supports 7+ specialized LoRAs including photo-to-anime, multi-angle camera control, pose transfer (Any-Pose), upscaling, style transfer, light migration, and manga tone. Features fast inference (4 steps default).
Qwen-Image-Edit-2509-LoRAs-Fast is a high-performance, user-friendly web application built with Gradio that leverages the advanced Qwen/Qwen-Image-Edit-2509 model from Hugging Face for seamless image editing tasks.
Demonstration for the Qwen/Qwen-Image-Edit-2509 model with 3D lighting control using the Multi-Angle-Lighting LoRA adapter. Allows precise control of the light source direction via an interactive 3D viewport or via azimuth and elevation sliders.
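A minimal sketch of how azimuth/elevation sliders can map to a 3D light direction vector. The spherical-coordinate convention below (azimuth rotating around the vertical y-axis, elevation tilting above the horizontal plane) is an assumption for illustration, not necessarily the convention the demo uses.

```python
import math

def light_direction(azimuth_deg: float, elevation_deg: float) -> tuple:
    # Hypothetical convention: azimuth rotates around the vertical (y) axis,
    # elevation tilts above the horizontal x-z plane; returns a unit vector.
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = math.cos(el) * math.sin(az)
    y = math.sin(el)
    z = math.cos(el) * math.cos(az)
    return (x, y, z)
```

With this convention, elevation 90° points the light straight down the vertical axis, and azimuth 90° at zero elevation points it along +x.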
Toy Flash Attention implementation in PyTorch.
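For reference, the core idea behind such a toy implementation is tiled attention with an online softmax: keys and values are processed block by block, with running row maxima and normalizers, so the full (seq_len × seq_len) score matrix is never materialized. A minimal single-head sketch (not the code of any repo listed here):

```python
import torch

def flash_attention(q, k, v, block_size=64):
    # q, k, v: (seq_len, d). Processes K/V in tiles, maintaining a running
    # row-wise max and softmax normalizer (the "online softmax" trick) so the
    # full attention score matrix is never stored.
    seq_len, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((seq_len, 1), float("-inf"))
    row_sum = torch.zeros(seq_len, 1)
    for start in range(0, seq_len, block_size):
        kb = k[start:start + block_size]
        vb = v[start:start + block_size]
        scores = (q @ kb.T) * scale                       # (seq_len, block)
        block_max = scores.max(dim=-1, keepdim=True).values
        new_max = torch.maximum(row_max, block_max)
        correction = torch.exp(row_max - new_max)          # rescale old state
        p = torch.exp(scores - new_max)                    # unnormalized probs
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ vb
        row_max = new_max
    return out / row_sum
```

The result matches `torch.softmax(q @ k.T * scale, dim=-1) @ v` up to floating-point tolerance; the real FlashAttention kernels additionally tile over queries and fuse everything into one GPU kernel.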
Demonstration for the Lightricks LTX-2 Distilled model, enhanced with specialized LoRA adapters for cinematic camera movements (dolly left/right/in/out, jib up/down, static). Generates animated videos from text prompts or input images, with optional prompt enhancement using Gemma-3-12b.
Demonstration for the Qwen/Qwen-Image-Edit-2511 model, specialized in object manipulation via lazy-loaded LoRA adapters. Supports adding or removing specific elements (e.g., logos, accessories, clothing) in single- or multi-image inputs while preserving lighting, realism, and background details. Features precise prompt control and fast inference.
Demonstration for the Qwen/Qwen-Image-Edit-2509 model, enhanced with lazy-loaded LoRA adapters for specialized image editing tasks like texture application, object fusion, material transfer, and light migration. Uses a fused Lightning LoRA for rapid inference (4 steps by default).