Conversation

@Fzilan Fzilan commented Oct 11, 2025

Add HiDream pipelines and the required modules, comparable with diffusers v0.34. There are currently OOM issues.

@Fzilan Fzilan requested a review from vigo999 as a code owner October 11, 2025 08:21

Summary of Changes

Hello @Fzilan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the mindone.diffusers library by integrating the innovative HiDream-I1 image generation model and its corresponding pipeline. The core of this addition is the HiDreamImageTransformer2DModel, which features a sophisticated Mixture-of-Experts architecture for enhanced performance and leverages multiple text encoders to process diverse textual inputs. The accompanying HiDreamImagePipeline provides a streamlined interface for users to generate high-quality images from text prompts, complete with robust input handling, detailed documentation, and thorough testing to ensure reliability.

Highlights

  • New HiDream-I1 Model Integration: Introduces the HiDreamImageTransformer2DModel, a novel 2D image transformer model, into the library, designed for advanced image generation tasks.
  • Mixture-of-Experts (MoE) Architecture: The HiDreamImageTransformer2DModel incorporates a Mixture-of-Experts (MoE) gating mechanism within its feed-forward networks, which can lead to more efficient and potentially higher-quality processing by selectively activating specialized sub-networks (a minimal gating sketch follows this list).
  • Multi-modal Pipeline with Multiple Text Encoders: A new HiDreamImagePipeline is added, capable of leveraging multiple text encoders (specifically CLIP, T5, and LlamaForCausalLM) to process rich and diverse textual inputs for enhanced text-to-image generation.
  • Comprehensive Documentation and Testing: Dedicated documentation files have been created for both the HiDreamImageTransformer2DModel and HiDreamImagePipeline, alongside new unit tests to ensure the correctness, numerical consistency, and reliability of these new components across various data types.
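
To make the MoE gating idea concrete, here is a minimal top-k gating sketch in MindSpore. It is illustrative only: the layer sizes, num_experts, and top_k values are assumptions rather than HiDream's actual configuration, and the dense per-expert loop trades efficiency for readability.

from mindspore import mint, nn

class TinyMoEFeedForward(nn.Cell):
    def __init__(self, dim, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Dense(dim, num_experts)  # one score per expert
        self.experts = nn.CellList([nn.Dense(dim, dim) for _ in range(num_experts)])

    def construct(self, x):  # x: (num_tokens, dim)
        gate = mint.nn.functional.softmax(self.router(x), dim=-1)
        weights, idx = mint.topk(gate, self.top_k)  # keep each token's top-k experts
        out = mint.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                # select tokens whose k-th choice is expert e, weighted by the gate score
                mask = (idx[:, k] == e).astype(x.dtype).unsqueeze(-1)
                out = out + mask * weights[:, k : k + 1] * expert(x)
        return out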

@Fzilan Fzilan changed the title from Diffusers0.34 dev to feat(diffusers/pipelines): add hidream on Oct 11, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds support for the HiDream-I1 model by introducing the HiDreamImageTransformer2DModel and HiDreamImagePipeline. The changes include the core model and pipeline implementations, along with corresponding documentation and tests. Overall, the implementation looks solid, but I've identified a critical bug in the model's initialization that would prevent it from running. Additionally, there are a few areas with leftover TODO/FIXME comments, and some minor typos in documentation and test files that should be addressed to improve code quality and maintainability.


self.gradient_checkpointing = False

self.patch_size = self.patch_size

critical

This line self.patch_size = self.patch_size is a self-assignment that will raise an AttributeError at runtime because self.patch_size has not been initialized on the instance yet. This is a critical bug that will prevent the model from being instantiated.

Based on the surrounding code, it seems the intention was to assign the value from the configuration. It should be self.patch_size = self.config.patch_size.

Suggested change
self.patch_size = self.patch_size
self.patch_size = self.config.patch_size
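
A standalone repro of the failure mode (assuming no __getattr__ fallback supplies the attribute, as the review above implies):

class Model:
    def __init__(self):
        # the right-hand side is evaluated first, before the attribute exists
        self.patch_size = self.patch_size

Model()  # AttributeError: 'Model' object has no attribute 'patch_size'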

Comment on lines 454 to 456
# FIXME: mindspore lacks tensor.scatter_reduce_
# expert_cache.scatter_reduce_(0, exp_token_idx.view(-1, 1).repeat(1, x.shape[-1]), expert_out, reduce="sum")
expert_cache.scatter_add_(0, exp_token_idx.view(-1, 1).repeat(1, x.shape[-1]), expert_out)

high

A FIXME comment indicates that mindspore lacks tensor.scatter_reduce_. The code uses scatter_add_ as a workaround. This could lead to incorrect behavior if the indices in exp_token_idx are not unique for each operation, potentially affecting model correctness. Please verify if scatter_add_ is a safe replacement here. If it is, the comment should be updated to reflect that. If not, a proper implementation of scatter_reduce with sum reduction should be considered.
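
For reference, with reduce="sum" and the default include_self=True, PyTorch's scatter_reduce_ accumulates into existing values exactly like scatter_add_, even when indices collide, so the workaround should match the commented-out reference numerically. A quick sanity check against the PyTorch reference semantics (shapes are arbitrary):

import torch

# duplicate destination indices on purpose: both ops must sum the colliding rows
idx = torch.tensor([[0], [2], [0]]).repeat(1, 4)
src = torch.ones(3, 4)

a = torch.zeros(4, 4)
a.scatter_reduce_(0, idx, src, reduce="sum")  # include_self=True by default

b = torch.zeros(4, 4)
b.scatter_add_(0, idx, src)

assert torch.equal(a, b)  # identical results for the "sum" reduction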

Comment on lines +1 to +13
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License. -->

medium

The license header in this new documentation file uses a mix of HTML comments (<!-- ... -->) and hash comments (#). This is inconsistent with other documentation files and looks like a copy-paste error. For better maintainability and consistency, please use only HTML comments for the entire license block.

Suggested change
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License. -->
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. -->

Comment on lines +98 to +100
# TODO check here, notes: hf use ms.float32 if npu else
# dtype = ms.float32 if (is_mps or is_npu) else ms.float64
dtype = ms.float32

medium

There's a leftover TODO comment and a hardcoded dtype. The comment suggests that the dtype should be conditionally set based on the hardware (e.g., NPU), but it's currently hardcoded to ms.float32. This should be resolved to ensure correct behavior on different hardware platforms.
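
A hedged sketch of one way the TODO could be resolved, mirroring the commented-out upstream logic; the Ascend check shown is an assumption about how this codebase would detect an NPU:

import mindspore as ms

# float32 on backends without reliable float64 support (e.g. Ascend/NPU),
# float64 elsewhere, matching the upstream diffusers behavior
is_npu = ms.get_context("device_target") == "Ascend"
dtype = ms.float32 if is_npu else ms.float64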

>>> import mindspore
>>> import numpy as np
>>> from transformers import AutoTokenizer
>>> form mindone.transformers import LlamaForCausalLM

medium

There is a typo in the example code within the docstring. form should be from. This will cause an error for users who copy and paste the example.

Suggested change
>>> form mindone.transformers import LlamaForCausalLM
>>> from mindone.transformers import LlamaForCausalLM
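
For context, a hedged sketch of how the corrected import fits into a full pipeline setup; the checkpoint IDs follow the upstream diffusers HiDream example, and the output indexing assumes mindone's default tuple return, both of which may differ in this PR:

import mindspore as ms
from transformers import AutoTokenizer
from mindone.transformers import LlamaForCausalLM
from mindone.diffusers import HiDreamImagePipeline

tokenizer_4 = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct", mindspore_dtype=ms.bfloat16
)

pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    mindspore_dtype=ms.bfloat16,
)
image = pipe("a cat holding a sign that says hello")[0][0]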

# expand the latents if we are doing classifier free guidance
latent_model_input = mint.cat([latents] * 2) if self.do_classifier_free_guidance else latents
# broadcast to batch dimension in a way that's compatible with ONNX/Core ML
timestep = t.broadcast_to((latent_model_input.shape[0],)) # .to(latents.dtype) ?

medium

The comment # .to(latents.dtype) ? suggests uncertainty about whether a dtype conversion is needed for the timestep tensor. This should be clarified. If the conversion is not needed, the comment should be removed to improve code clarity. If it is needed, the code should be added.

Suggested change
timestep = t.broadcast_to((latent_model_input.shape[0],)) # .to(latents.dtype) ?
timestep = t.broadcast_to((latent_model_input.shape[0],))
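
If the cast does turn out to be necessary, the resolved line would instead read (assuming MindSpore's Tensor.to(dtype), mirroring the upstream PyTorch code):

timestep = t.broadcast_to((latent_model_input.shape[0],)).to(latents.dtype)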

]


HIDREAM_IMAGE_TRANSFORER2D_CASES = [

medium

There is a typo in the variable name HIDREAM_IMAGE_TRANSFORER2D_CASES. It should be HIDREAM_IMAGE_TRANSFORMER2D_CASES to match the model name and for consistency. Please also update its usage on line 1569.

Suggested change
HIDREAM_IMAGE_TRANSFORER2D_CASES = [
HIDREAM_IMAGE_TRANSFORMER2D_CASES = [
