feat(diffusers/pipelines): add hidream #1360
base: master
Code Review

This pull request adds support for the HiDream-I1 model by introducing the `HiDreamImageTransformer2DModel` and `HiDreamImagePipeline`. The changes include the core model and pipeline implementations, along with corresponding documentation and tests. Overall, the implementation looks solid, but I've identified a critical bug in the model's initialization that would prevent it from running. Additionally, there are a few areas with leftover `TODO`/`FIXME` comments, and some minor typos in documentation and test files that should be addressed to improve code quality and maintainability.
```python
self.gradient_checkpointing = False

self.patch_size = self.patch_size
```
The line `self.patch_size = self.patch_size` is a self-assignment that will raise an `AttributeError` at runtime because `self.patch_size` has not been initialized on the instance yet. This is a critical bug that will prevent the model from being instantiated.

Based on the surrounding code, it seems the intention was to assign the value from the configuration. It should be `self.patch_size = self.config.patch_size`.

Suggested change:
```diff
- self.patch_size = self.patch_size
+ self.patch_size = self.config.patch_size
```
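The suggested fix relies on the diffusers-style config registration pattern, in which constructor arguments are recorded on `self.config` rather than set as instance attributes. A minimal sketch of that pattern (the class and attribute names below are illustrative stand-ins, not the actual `HiDreamImageTransformer2DModel`):

```python
from types import SimpleNamespace

class MiniTransformer:
    # Illustrative stand-in for a diffusers-style model: constructor
    # arguments are registered on self.config, so any instance copy
    # must be read back from the config, not from self.
    def __init__(self, patch_size=2, num_layers=4):
        self.config = SimpleNamespace(patch_size=patch_size, num_layers=num_layers)
        # self.patch_size = self.patch_size   # AttributeError: not set on self yet
        self.patch_size = self.config.patch_size  # the reviewer's suggested fix

model = MiniTransformer(patch_size=4)
```

With the fix applied, `model.patch_size` mirrors the registered config value instead of raising at construction time.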
```python
# FIXME: mindspore lacks tensor.scatter_reduce_
# expert_cache.scatter_reduce_(0, exp_token_idx.view(-1, 1).repeat(1, x.shape[-1]), expert_out, reduce="sum")
expert_cache.scatter_add_(0, exp_token_idx.view(-1, 1).repeat(1, x.shape[-1]), expert_out)
```
A `FIXME` comment indicates that `mindspore` lacks `tensor.scatter_reduce_`. The code uses `scatter_add_` as a workaround. This could lead to incorrect behavior if the indices in `exp_token_idx` are not unique for each operation, potentially affecting model correctness. Please verify whether `scatter_add_` is a safe replacement here. If it is, the comment should be updated to reflect that; if not, a proper implementation of `scatter_reduce` with sum reduction should be considered.
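To illustrate what is at stake, here is a minimal NumPy sketch of scatter-add semantics (not MindSpore code): when the same destination index appears more than once, the contributions are summed. For what it's worth, PyTorch's `scatter_reduce_(..., reduce="sum")` with its default `include_self=True` accumulates duplicates the same way, so for sum reduction the two calls would be expected to agree; the non-unique-index caveat mainly matters for other reduction modes.

```python
import numpy as np

def scatter_add(dest, index, src):
    # Sum rows of src into dest at the rows named by index;
    # duplicate indices accumulate (np.add.at is unbuffered).
    np.add.at(dest, index, src)
    return dest

expert_cache = np.zeros((4, 2))
exp_token_idx = np.array([0, 2, 0])          # index 0 appears twice
expert_out = np.array([[1., 1.], [2., 2.], [3., 3.]])
scatter_add(expert_cache, exp_token_idx, expert_out)
# row 0 holds src[0] + src[2] = [4., 4.]; row 2 holds [2., 2.]
```

In the MoE routing here, each token index is typically assigned to a given expert at most once per dispatch, which is why summing duplicates would be benign, but that is exactly the property the reviewer asks to verify.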
```
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License. -->
```
The license header in this new documentation file uses a mix of HTML comments (`<!-- ... -->`) and hash comments (`#`). This is inconsistent with other documentation files and looks like a copy-paste error. For better maintainability and consistency, please use only HTML comments for the entire license block.
Suggested change:
```diff
-<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License. -->
+<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. -->
```
```python
# TODO check here, notes: hf use ms.float32 if npu else
# dtype = ms.float32 if (is_mps or is_npu) else ms.float64
dtype = ms.float32
```
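The `TODO` concerns which float dtype to use when computing the sinusoidal timestep embedding: float64 gives more precise frequencies but is unsupported on some accelerators (MPS, NPU), hence the float32 fallback. A minimal NumPy sketch of the standard sinusoidal embedding, with the dtype made explicit (the function name and defaults here are illustrative, not the pipeline's actual helper):

```python
import numpy as np

def sinusoidal_embedding(timesteps, dim, dtype=np.float32):
    # Standard sinusoidal timestep embedding (sketch): log-spaced
    # frequencies, then sin/cos features. The dtype chosen here is
    # where the float32-vs-float64 precision trade-off lives.
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half, dtype=dtype) / half)
    args = timesteps.astype(dtype)[:, None] * freqs[None, :]
    return np.concatenate([np.sin(args), np.cos(args)], axis=-1)

emb = sinusoidal_embedding(np.array([0, 10]), 8)
# for t = 0: sin terms are 0, cos terms are 1
```

Either way, the resolved choice should be documented in place of the `TODO` so the deviation from the HF reference (if any) is intentional.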
```python
>>> import mindspore
>>> import numpy as np
>>> from transformers import AutoTokenizer
>>> from mindone.transformers import LlamaForCausalLM
```
```python
# expand the latents if we are doing classifier free guidance
latent_model_input = mint.cat([latents] * 2) if self.do_classifier_free_guidance else latents
# broadcast to batch dimension in a way that's compatible with ONNX/Core ML
timestep = t.broadcast_to((latent_model_input.shape[0],))  # .to(latents.dtype) ?
```
The comment `# .to(latents.dtype) ?` suggests uncertainty about whether a dtype conversion is needed for the `timestep` tensor. This should be clarified: if the conversion is not needed, the comment should be removed to improve code clarity; if it is needed, the code should be added.

Suggested change:
```diff
- timestep = t.broadcast_to((latent_model_input.shape[0],))  # .to(latents.dtype) ?
+ timestep = t.broadcast_to((latent_model_input.shape[0],))
```
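For context, the snippet above is the standard classifier-free guidance pattern: double the batch (unconditional + conditional), denoise once, then combine the two predictions. A minimal NumPy sketch of that duplicate-then-combine step (the helper name and `guidance_scale` default are hypothetical, not the pipeline's API):

```python
import numpy as np

def cfg_step(latents, predict, guidance_scale=5.0):
    # Classifier-free guidance sketch: run the denoiser once on a
    # doubled batch, then steer the unconditional prediction toward
    # the conditional one by guidance_scale.
    model_input = np.concatenate([latents, latents], axis=0)  # mint.cat([latents] * 2)
    noise_pred = predict(model_input)
    uncond, cond = np.split(noise_pred, 2, axis=0)
    return uncond + guidance_scale * (cond - uncond)

latents = np.ones((1, 4))
# toy denoiser: zeros for the unconditional half, ones for the conditional half
out = cfg_step(latents, lambda x: np.concatenate(
    [np.zeros((1, 4)), np.ones((1, 4))], axis=0))
```

The broadcast of `t` to `latent_model_input.shape[0]` exists so the scalar timestep matches this doubled batch dimension.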
```python
]

HIDREAM_IMAGE_TRANSFORMER2D_CASES = [
```
Adds HiDream pipelines and required modules, comparable with diffusers v0.34, but currently has OOM issues.