
Commit f8ba5cd

[docs] Cache link (#12105)
cache
1 parent c9c8217 commit f8ba5cd

5 files changed: +6 −4 lines changed

docs/source/en/api/pipelines/flux.md

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ Original model checkpoints for Flux can be found [here](https://huggingface.co/b

 Flux can be quite expensive to run on consumer hardware devices. However, you can perform a suite of optimizations to run it faster and in a more memory-friendly manner. Check out [this section](https://huggingface.co/blog/sd3#memory-optimizations-for-sd3) for more details. Additionally, Flux can benefit from quantization for memory efficiency with a trade-off in inference latency. Refer to [this blog post](https://huggingface.co/blog/quanto-diffusers) to learn more. For an exhaustive list of resources, check out [this gist](https://gist.github.com/sayakpaul/b664605caf0aa3bf8585ab109dd5ac9c).

+[Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.
+
 </Tip>

 Flux comes in the following variants:

docs/source/en/api/pipelines/hidream.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@

 <Tip>

-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+[Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

 </Tip>


docs/source/en/api/pipelines/ltx_video.md

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ export_to_video(video, "output.mp4", fps=24)
 </hfoption>
 <hfoption id="inference speed">

-[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster.
+[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster. [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

 ```py
 import torch

docs/source/en/api/pipelines/qwenimage.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ Check out the model card [here](https://huggingface.co/Qwen/Qwen-Image) to learn

 <Tip>

-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+[Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

 </Tip>


docs/source/en/api/pipelines/wan.md

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ export_to_video(output, "output.mp4", fps=16)
 </hfoption>
 <hfoption id="T2V inference speed">

-[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster.
+[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster. [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

 ```py
 # pip install ftfy

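All five pages now point to the same [cache guide](../../optimization/cache). For context, a minimal sketch of what enabling a cache looks like with one of the techniques that guide covers (Pyramid Attention Broadcast) is below; the pipeline choice, prompt, and parameter values are illustrative assumptions, not part of this commit:

```py
import torch
from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

# Load a pipeline whose transformer supports caching (model choice is illustrative).
pipeline = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
)
pipeline.to("cuda")

# Reuse (broadcast) cached spatial attention outputs every 2 transformer blocks,
# but only within the given denoising-timestep window (values are illustrative).
config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(100, 800),
    current_timestep_callback=lambda: pipeline.current_timestep,
)
pipeline.transformer.enable_cache(config)

# Subsequent steps reuse stored intermediate outputs, trading a small
# quality hit for faster inference.
video = pipeline("A cat playing piano", num_frames=49).frames[0]
```

Other cache techniques in the guide, such as FasterCache, are enabled the same way by passing their config to `enable_cache`.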