* LTX2 condition pipeline initial commit
* Fix pipeline import error
* Implement LTX-2-style general image conditioning
* Blend denoising output and clean latents in sample space instead of velocity space
* make style and make quality
* make fix-copies
* Rename LTX2VideoCondition image to frames
* Update LTX2ConditionPipeline example
* Remove support for image and video in __call__
* Put latent_idx_from_index logic inline
* Improve comment on using the conditioning mask in denoising loop
* Apply suggestions from code review
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* make fix-copies
* Migrate to Python 3.9+ style type annotations without explicit typing imports
* Forward kwargs from preprocess_video/postprocess_video to preprocess/postprocess, respectively
* Center crop LTX-2 conditions following original code
* Duplicate video and audio position ids if using CFG
* make style and make quality
* Remove unused index_type arg to preprocess_conditions
* Add # Copied from for _normalize_latents
* Fix _normalize_latents # Copied from statement
* Add LTX-2 condition pipeline docs
* Remove TODOs
* Support only unpacked latents (5D for video, 4D for audio)
* Remove # Copied from for prepare_audio_latents
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
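Several of the commits above concern how clean conditioning latents are merged back into the denoising loop ("Blend denoising output and clean latents in sample space instead of velocity space", "Improve comment on using the conditioning mask in denoising loop"). As a rough illustrative sketch only (the function name is hypothetical, and real LTX-2 latents are tensors rather than Python lists), sample-space blending with a conditioning mask could look like:

```python
def blend_in_sample_space(denoised, clean, cond_mask):
    """Hypothetical sketch: after a denoising step, keep the clean
    (encoded) latents wherever the conditioning mask is set, operating
    on predicted samples rather than on velocities."""
    return [c if m else d for d, c, m in zip(denoised, clean, cond_mask)]

# Toy example: positions 0 and 3 are conditioned (e.g. first/last frame).
out = blend_in_sample_space(
    denoised=[0.1, 0.2, 0.3, 0.4],
    clean=[1.0, 0.0, 0.0, 2.0],
    cond_mask=[True, False, False, True],
)
# Conditioned positions take the clean values; the rest keep the
# denoised prediction.
```

Blending in sample space (rather than velocity space) means the conditioned positions are restored to exactly the encoded condition at each step, instead of approximately through the velocity parameterization.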
File changed: docs/source/en/api/pipelines/ltx2.md (+179 lines: 179 additions, 0 deletions)
@@ -193,6 +193,179 @@ encode_video(
)
```

## Condition Pipeline Generation

You can use `LTX2ConditionPipeline` to specify image and/or video conditions at arbitrary latent indices. For example, we can specify both a first-frame and a last-frame condition to perform first-last-frame-to-video (FLF2V) generation:

```py
import torch

from diffusers import LTX2ConditionPipeline, LTX2LatentUpsamplePipeline
from diffusers.pipelines.ltx2.latent_upsampler import LTX2LatentUpsamplerModel
from diffusers.pipelines.ltx2.pipeline_ltx2_condition import LTX2VideoCondition
from diffusers.pipelines.ltx2.utils import DISTILLED_SIGMA_VALUES, STAGE_2_DISTILLED_SIGMA_VALUES
from diffusers.pipelines.ltx2.export_utils import encode_video
```

Because the conditioning is done via latent frames, the 8 data-space frames corresponding to the specified latent frame for an image condition will tend to be static.