Diffusers support #604
@@ -0,0 +1,95 @@

<div align="center">

> **Reviewer comment (Contributor):** move it to docs folder and create a new page for diffusers.
# **Diffusion Models on Qualcomm Cloud AI 100**

<div align="center">

### 🎨 **Experience the Future of AI Image Generation**

*Optimized for Qualcomm Cloud AI 100*

<img src="../../docs/image/girl_laughing.png" alt="Sample Output" width="400">

**Generated with**: `black-forest-labs/FLUX.1-schnell` • `"A girl laughing"` • 4 steps • 0.0 guidance scale • ⚡

</div>

[HuggingFace Diffusers](https://github.com/huggingface/diffusers)

</div>

---

## ✨ Overview

QEfficient Diffusers brings state-of-the-art diffusion models to Qualcomm Cloud AI 100 hardware for text-to-image generation. Built on top of the popular HuggingFace Diffusers library, the optimized pipeline provides seamless inference on Cloud AI 100 devices.
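For reference, the settings behind the banner image above map onto the upstream HuggingFace Diffusers API as follows. This is a plain-GPU sketch using `FluxPipeline`, not the QEfficient-compiled path, and weights download on first run:

```python
import torch
from diffusers import FluxPipeline

# FLUX.1-schnell is distilled for few-step sampling, so 4 steps and
# guidance_scale=0.0 (no classifier-free guidance) are the usual settings.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keep peak VRAM modest

image = pipe(
    "A girl laughing",
    num_inference_steps=4,
    guidance_scale=0.0,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("girl_laughing.png")
```

The QEfficient pipeline is intended to be a drop-in counterpart of this flow, with compilation for and execution on Cloud AI 100.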
## 🛠️ Installation

### Prerequisites

Ensure you have Python 3.8 or newer (3.10 recommended):

```bash
# Create a Python virtual environment (Python 3.10 recommended)
sudo apt install python3.10-venv
python3.10 -m venv qeff_env
source qeff_env/bin/activate
pip install -U pip
```
### Install QEfficient

```bash
# Install from GitHub (includes diffusers support)
pip install git+https://github.com/quic/efficient-transformers

# Or build from source
git clone https://github.com/quic/efficient-transformers.git
cd efficient-transformers
pip install build wheel
python -m build --wheel --outdir dist
pip install dist/qefficient-0.0.1.dev0-py3-none-any.whl
```

---
## 🎯 Supported Models

- ✅ [`black-forest-labs/FLUX.1-schnell`](https://huggingface.co/black-forest-labs/FLUX.1-schnell)

---

## 📚 Examples

Check out our comprehensive examples in the [`examples/diffusers/`](../../examples/diffusers/) directory.

---

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](../../CONTRIBUTING.md) for details.

---

## 🙏 Acknowledgments

- **HuggingFace Diffusers**: For the excellent foundation library
- **Stability AI**: For the amazing Stable Diffusion models

---

## 📞 Support

- 📖 **Documentation**: [https://quic.github.io/efficient-transformers/](https://quic.github.io/efficient-transformers/)
- 🐛 **Issues**: [GitHub Issues](https://github.com/quic/efficient-transformers/issues)
@@ -0,0 +1,6 @@

```python
# -----------------------------------------------------------------------------
#
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------
```
@@ -0,0 +1,6 @@

```python
# -----------------------------------------------------------------------------
#
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------
```
@@ -0,0 +1,73 @@

```python
# -----------------------------------------------------------------------------
#
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------

import torch
from diffusers.models.attention import JointTransformerBlock, _chunked_feed_forward


class QEffJointTransformerBlock(JointTransformerBlock):
    def forward(
        self, hidden_states: torch.FloatTensor, encoder_hidden_states: torch.FloatTensor, temb: torch.FloatTensor
    ):
        if self.use_dual_attention:
            norm_hidden_states, gate_msa, shift_mlp, scale_mlp, gate_mlp, norm_hidden_states2, gate_msa2 = self.norm1(
                hidden_states, emb=temb
            )
        else:
            norm_hidden_states, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.norm1(hidden_states, emb=temb)

        if self.context_pre_only:
            norm_encoder_hidden_states = self.norm1_context(encoder_hidden_states, temb)
        else:
            norm_encoder_hidden_states, c_gate_msa, c_shift_mlp, c_scale_mlp, c_gate_mlp = self.norm1_context(
                encoder_hidden_states, emb=temb
            )

        # Attention.
        attn_output, context_attn_output = self.attn(
            hidden_states=norm_hidden_states, encoder_hidden_states=norm_encoder_hidden_states
        )

        # Process attention outputs for the `hidden_states`.
        attn_output = gate_msa.unsqueeze(1) * attn_output
        hidden_states = hidden_states + attn_output

        if self.use_dual_attention:
            attn_output2 = self.attn2(hidden_states=norm_hidden_states2)
            attn_output2 = gate_msa2.unsqueeze(1) * attn_output2
            hidden_states = hidden_states + attn_output2

        norm_hidden_states = self.norm2(hidden_states)
        norm_hidden_states = norm_hidden_states * (1 + scale_mlp[:, None]) + shift_mlp[:, None]
        if self._chunk_size is not None:
            # "feed_forward_chunk_size" can be used to save memory
            ff_output = _chunked_feed_forward(self.ff, norm_hidden_states, self._chunk_dim, self._chunk_size)
        else:
            ff_output = self.ff(norm_hidden_states, block_size=4096)
```

> **Reviewer comment (Contributor):** `FeedForward` doesn't accept the `block_size` parameter, why are we passing the block size here?
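For context on the chunked branch above: `_chunked_feed_forward` trades one large feed-forward call for several smaller ones along a chunk dimension, so only a slice of tokens is live in the MLP at a time. A minimal numpy sketch of the idea (illustrative only, hypothetical names, not the diffusers implementation):

```python
import numpy as np

def feed_forward(x):
    # Stand-in for a per-token MLP: same shape in, same shape out.
    return np.tanh(x) * 2.0

def chunked_feed_forward(ff, x, chunk_dim, chunk_size):
    # Split along chunk_dim so at most chunk_size tokens pass through ff at once.
    n = x.shape[chunk_dim]
    chunks = [
        ff(chunk)
        for chunk in np.split(x, range(chunk_size, n, chunk_size), axis=chunk_dim)
    ]
    return np.concatenate(chunks, axis=chunk_dim)

x = np.random.default_rng(0).normal(size=(2, 10, 4))
full = feed_forward(x)
chunked = chunked_feed_forward(feed_forward, x, chunk_dim=1, chunk_size=3)
assert np.allclose(full, chunked)  # chunking changes peak memory, not the result
```

Because the feed-forward acts on each token independently, the chunked and unchunked paths produce identical outputs; only the peak activation memory differs.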
```python
        ff_output = gate_mlp.unsqueeze(1) * ff_output

        hidden_states = hidden_states + ff_output

        # Process attention outputs for the `encoder_hidden_states`.
        if self.context_pre_only:
            encoder_hidden_states = None
        else:
            context_attn_output = c_gate_msa.unsqueeze(1) * context_attn_output
            encoder_hidden_states = encoder_hidden_states + context_attn_output

            norm_encoder_hidden_states = self.norm2_context(encoder_hidden_states)
            norm_encoder_hidden_states = norm_encoder_hidden_states * (1 + c_scale_mlp[:, None]) + c_shift_mlp[:, None]
            if self._chunk_size is not None:
                # "feed_forward_chunk_size" can be used to save memory
                context_ff_output = _chunked_feed_forward(
                    self.ff_context, norm_encoder_hidden_states, self._chunk_dim, self._chunk_size
                )
            else:
                context_ff_output = self.ff_context(norm_encoder_hidden_states, block_size=333)
```

> **Reviewer comment (Contributor):** Same as above, `FeedForward` doesn't accept the `block_size` parameter, why are we passing the block size here?

```python
            encoder_hidden_states = encoder_hidden_states + c_gate_mlp.unsqueeze(1) * context_ff_output

        return encoder_hidden_states, hidden_states
```
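The `scale`/`shift`/`gate` pattern running through the forward pass above is adaptive layer-norm (AdaLN) modulation: the timestep embedding yields per-sample scale and shift applied to the normalized activations, plus a gate on the residual branch. A minimal numpy sketch with illustrative shapes (not the diffusers code):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq, dim = 2, 5, 8

hidden = rng.normal(size=(batch, seq, dim))
scale = rng.normal(size=(batch, dim))  # per-sample, derived from the timestep embedding
shift = rng.normal(size=(batch, dim))
gate = rng.normal(size=(batch, dim))

# Mirrors `norm_hidden_states * (1 + scale_mlp[:, None]) + shift_mlp[:, None]`.
normed = (hidden - hidden.mean(-1, keepdims=True)) / hidden.std(-1, keepdims=True)
modulated = normed * (1 + scale[:, None]) + shift[:, None]

# Residual update gated per sample, mirroring `gate_mlp.unsqueeze(1) * ff_output`.
ff_output = np.tanh(modulated)  # stand-in for the feed-forward
hidden = hidden + gate[:, None] * ff_output

assert hidden.shape == (batch, seq, dim)
```

Broadcasting `scale[:, None]` to shape `(batch, 1, dim)` lets one conditioning vector per sample modulate every token in the sequence, which is exactly what `unsqueeze(1)` achieves in the torch code.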