
[Issue]: Kandinsky 5: mat1 and mat2 must have the same dtype, but got Float and BFloat16 #4459

@liutyi


Issue Description

Is there something that should be configured before using Kandinsky 5?

Steps:

1. Clean reinstall of latest dev in Docker
2. Download Kandinsky-5.0-T2I
3. Generate image -> Error

Kandinsky 5: mat1 and mat2 must have the same dtype, but got Float and BFloat16

Using default settings.

Version Platform Description

Version: app=sd.next latest=2025-12-10T11:14:13Z hash=886f5770 branch=dev url=https://github.com/vladmandic/sdnext/tree/dev kanvas=main ui=main

Relevant log output

On branch dev
Your branch is up to date with 'origin/dev'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   extensions-builtin/sdnext-modernui

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   extensions-builtin/sdnext-modernui (new commits)

Activate python venv: /mnt/python/venv
Launch: /mnt/python/venv/bin/python3
11:48:19-288093 INFO     Starting SD.Next
11:48:19-290575 INFO     Logger: file="/app/sdnext.log" level=DEBUG
                         host="0b8869c383cf" size=262103 mode=append
11:48:19-291188 INFO     Python: version=3.12.3 platform=Linux
                         bin="/mnt/python/venv/bin/python3"
                         venv="/mnt/python/venv"
11:48:19-307142 INFO     Version: app=sd.next updated=2025-12-10
                         commit=886f57708 branch=dev
                         url=https://github.com/vladmandic/sdnext/tree/dev
                         kanvas=main ui=dev
11:48:19-677201 TRACE    Repository branches: active=dev available=['dev',
                         'master', 'upstream']
11:48:20-021630 INFO     Version: app=sd.next latest=2025-12-10T11:14:13Z
                         hash=886f5770 branch=dev
11:48:20-023960 INFO     Platform: arch=x86_64 cpu=x86_64 system=Linux
                         release=6.17.0-7-generic python=3.12.3 locale=('C',
                         'UTF-8') docker=True
11:48:20-024670 DEBUG    Packages: prefix=../mnt/python/venv
                         site=['../mnt/python/venv/lib/python3.12/site-packages'
                         ]
11:48:20-025363 INFO     Args: ['-f', '--use-ipex', '--uv', '--listen',
                         '--insecure', '--api-log', '--log', 'sdnext.log']
11:48:20-025830 DEBUG    Setting environment tuning
11:48:20-026224 DEBUG    Torch allocator:
                         "garbage_collection_threshold:0.80,max_split_size_mb:51
                         2"
11:48:20-027148 INFO     Verifying torch installation
11:48:20-027510 DEBUG    Torch overrides: cuda=False rocm=False ipex=True
                         directml=False openvino=False zluda=False
11:48:20-027978 INFO     IPEX: Intel OneAPI toolkit detected
11:48:21-321322 INFO     Torch detected: gpu="Intel(R) Graphics [0x7d51]"
                         vram=126026 units=128
11:48:21-353191 INFO     Install: verifying requirements
11:48:21-359297 DEBUG    Timestamp repository update time: Wed Dec 10 11:14:13
                         2025
11:48:21-360609 DEBUG    Timestamp previous setup time: Wed Dec 10 11:37:08 2025
11:48:21-361526 INFO     Extensions: disabled=[]
11:48:21-362411 INFO     Extensions: path="extensions-builtin"
                         enabled=['stable-diffusion-webui-rembg',
                         'sd-extension-chainner', 'sdnext-kanvas',
                         'sd-extension-system-info', 'sdnext-modernui']
11:48:21-364136 INFO     Extensions: path="/mnt/data/extensions" enabled=[]
11:48:21-365085 DEBUG    Timestamp latest extensions time: Wed Dec 10 11:37:08
                         2025
11:48:21-366012 DEBUG    Timestamp: version:1765365253 setup:1765366628
                         extension:1765366628
11:48:21-366961 INFO     Startup: quick launch
11:48:21-367331 INFO     Extensions: disabled=[]
11:48:21-367628 INFO     Extensions: path="extensions-builtin"
                         enabled=['stable-diffusion-webui-rembg',
                         'sd-extension-chainner', 'sdnext-kanvas',
                         'sd-extension-system-info', 'sdnext-modernui']
11:48:21-368112 INFO     Extensions: path="/mnt/data/extensions" enabled=[]
11:48:21-369560 DEBUG    Extension preload: {'extensions-builtin': 0.0,
                         '/mnt/data/extensions': 0.0}
11:48:21-371021 INFO     Installer time: total=2.27 torch=1.30 latest=0.73
                         base=0.15
11:48:21-372543 INFO     Command line args: ['-f', '--use-ipex', '--uv',
                         '--listen', '--insecure', '--api-log', '--log',
                         'sdnext.log'] f=True uv=True use_ipex=True
                         insecure=True listen=True log=sdnext.log args=[]
11:48:21-374065 DEBUG    Env flags: ['SD_VAE_DEBUG=true', 'SD_DOCKER=true',
                         'SD_DATADIR=/mnt/data', 'SD_MODELSDIR=/mnt/models']
11:48:21-375121 DEBUG    Linker flags: preload="libjemalloc.so.2"
                         path=":/mnt/python/venv/lib/"
11:48:21-376138 DEBUG    Starting module: <module 'webui' from '/app/webui.py'>
11:48:26-586371 DEBUG    System: cores=16 affinity=16 threads=16
11:48:26-587954 INFO     Torch: torch==2.9.1+xpu torchvision==0.24.1+xpu
11:48:26-588343 INFO     Packages: diffusers==0.36.0.dev0 transformers==4.57.3
                         accelerate==1.12.0 gradio==3.43.2 pydantic==1.10.21
                         numpy==2.1.2 cv2==4.12.0
11:48:26-898238 DEBUG    ONNX: version=1.23.2,
                         available=['AzureExecutionProvider',
                         'CPUExecutionProvider']
11:48:26-929977 DEBUG    State initialized: id=127234988517584
11:48:26-942010 INFO     Device detect: memory=123.0 default=balanced
                         optimization=highvram
11:48:26-942541 DEBUG    Triton: pass=False fn=<module>:has_triton time=0.00
11:48:26-944007 DEBUG    Read: file="/mnt/data/config.json" json=14 bytes=749
                         time=0.000 fn=<module>:load
11:48:26-944819 INFO     Engine: backend=Backend.DIFFUSERS compute=ipex
                         device=xpu attention="Scaled-Dot-Product" mode=no_grad
11:48:26-945656 DEBUG    Read: file="html/reference.json" json=145 bytes=68779
                         time=0.000 fn=_call_with_frames_removed:<module>
11:48:26-946200 DEBUG    Torch attention: type="sdpa" kernels=['Flash',
                         'Memory', 'Math'] overrides=[]
11:48:26-946707 DEBUG    Torch attention installed: flashattn=False
                         sageattention=False
11:48:26-947067 DEBUG    Torch attention status: flash=False flash3=False
                         aiter=False sage=False flex=True npu=False xla=False
                         xformers=False
11:48:27-570095 DEBUG    Triton: pass=False fn=<module>:set_cuda_params
                         time=0.00
11:48:27-570936 INFO     Torch parameters: backend=ipex device=xpu config=Auto
                         dtype=torch.bfloat16 context=no_grad nohalf=False
                         nohalfvae=False upcast=False deterministic=False
                         tunable=[False, False] fp16=pass bf16=pass triton=fail
                         optimization="Scaled-Dot-Product"
11:48:27-582484 DEBUG    Quantization: registered=SDNQ
11:48:27-582922 INFO     Device: device=Intel(R) Graphics [0x7d51] n=1 ipex=
                         driver=1.6.31294+20
11:48:27-605714 TRACE    Trace: VAE
11:48:27-652052 DEBUG    Entering start sequence
11:48:27-652486 INFO     Base path: data="/mnt/data"
11:48:27-652811 INFO     Base path: models="/mnt/models"
11:48:27-653416 DEBUG    Initializing
11:48:27-672773 INFO     Available VAEs: path="/mnt/models/VAE" items=0
11:48:27-673328 INFO     Available UNets: path="/mnt/models/UNET" items=0
11:48:27-673764 INFO     Available TEs: path="/mnt/models/Text-encoder" items=0
11:48:27-674819 INFO     Available Models:
                         safetensors="/mnt/models/Stable-diffusion":0
                         diffusers="/mnt/models/Diffusers":1 reference=145
                         items=1 time=0.00
11:48:27-677658 INFO     Available LoRAs: path="/mnt/models/Lora" items=0
                         folders=2 time=0.00
11:48:27-682070 INFO     Available Styles: path="/mnt/models/styles" items=288
                         time=0.00
11:48:27-706793 INFO     Available Detailer: path="/mnt/models/yolo" items=11
                         downloaded=0
11:48:27-707407 DEBUG    Extensions: disabled=[]
11:48:27-707735 INFO     Load extensions
11:48:27-853310 DEBUG    Extensions init time: total=0.15
11:48:28-204052 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2640
                         time=0.000 fn=__init__:__init__
11:48:28-204964 DEBUG    Read:
                         file="extensions-builtin/sd-extension-chainner/models.j
                         son" json=25 bytes=2803 time=0.000
                         fn=__init__:find_scalers
11:48:28-205902 DEBUG    Available chaiNNer: path="/mnt/models/chaiNNer"
                         defined=25 discovered=0 downloaded=0
11:48:28-207685 INFO     Available Upscalers: items=76 downloaded=0 user=0
                         time=0.35 types=['None', 'Resize', 'Latent',
                         'AsymmetricVAE', 'WanUpscale', 'DCC', 'VIPS',
                         'ChaiNNer', 'SeedVR', 'AuraSR', 'Diffusion', 'SwinIR',
                         'RealESRGAN', 'ESRGAN', 'SCUNet']
11:48:28-237869 INFO     Networks: type="video" engines=13 models=67 errors=0
                         time=0.03
11:48:28-239815 INFO     Huggingface: transfer=rust parallel=True direct=False
                         token="None" cache="/mnt/models/huggingface" init
11:48:28-243446 DEBUG    Huggingface: cache="/mnt/models/huggingface" size=27302
                         MB
11:48:28-243948 DEBUG    UI start sequence
11:48:28-244293 DEBUG    UI image support: kanvas=main
11:48:28-245896 INFO     UI locale: name="Auto"
11:48:28-246356 INFO     UI theme: type=Modern name="Default" available=35
11:48:28-246906 DEBUG    UI theme:
                         css="extensions-builtin/sdnext-modernui/themes/Default.
                         css" base="['base.css', 'timesheet.css']" user="None"
11:48:28-248633 DEBUG    UI initialize: tab=txt2img
11:48:28-266965 DEBUG    Read: file="html/reference.json" json=145 bytes=68779
                         time=0.000 fn=list_items:list_reference
11:48:28-270154 DEBUG    Networks: type="reference" items={'total': 145,
                         'ready': 1, 'hidden': 0, 'experimental': 0, 'base': 91,
                         'distilled': 18, 'quantized': 17, 'community': 15,
                         'cloud': 2}
11:48:28-273095 DEBUG    Networks: type="model" items=144 subfolders=8
                         tab=txt2img folders=['/mnt/models/Stable-diffusion',
                         'models/Reference', '/mnt/models/Stable-diffusion']
                         list=0.01 thumb=0.00 desc=0.00 info=0.00 workers=12
11:48:28-274299 DEBUG    Networks: type="lora" items=0 subfolders=1 tab=txt2img
                         folders=['/mnt/models/Lora'] list=0.00 thumb=0.00
                         desc=0.00 info=0.00 workers=12
11:48:28-279364 DEBUG    Networks: type="style" items=288 subfolders=3
                         tab=txt2img folders=['/mnt/models/styles', 'html']
                         list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=12
11:48:28-281195 DEBUG    Networks: type="wildcards" items=0 subfolders=1
                         tab=txt2img folders=['/mnt/models/wildcards'] list=0.00
                         thumb=0.00 desc=0.00 info=0.00 workers=12
11:48:28-282066 DEBUG    Networks: type="embedding" items=0 subfolders=1
                         tab=txt2img folders=['/mnt/models/embeddings']
                         list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=12
11:48:28-282910 DEBUG    Networks: type="vae" items=0 subfolders=1 tab=txt2img
                         folders=['/mnt/models/VAE'] list=0.00 thumb=0.00
                         desc=0.00 info=0.00 workers=12
11:48:28-283753 DEBUG    Networks: type="history" items=0 subfolders=1
                         tab=txt2img folders=[] list=0.00 thumb=0.00 desc=0.00
                         info=0.00 workers=12
11:48:28-398098 DEBUG    UI initialize: tab=img2img
11:48:28-506635 DEBUG    UI initialize: tab=control models="/mnt/models/control"
11:48:28-901720 DEBUG    UI initialize: tab=video
11:48:29-019155 DEBUG    UI initialize: tab=process
11:48:29-068342 DEBUG    UI initialize: tab=caption
11:48:29-151657 DEBUG    UI initialize: tab=models
11:48:29-224390 DEBUG    UI initialize: tab=gallery
11:48:29-267627 DEBUG    Read: file="/mnt/data/ui-config.json" json=0 bytes=2
                         time=0.000 fn=__init__:read_from_file
11:48:29-269381 DEBUG    UI initialize: tab=settings
11:48:29-726919 DEBUG    Settings: sections=23 settings=383/608 quicksettings=1
11:48:29-784302 DEBUG    UI initialize: tab=info
11:48:29-807155 DEBUG    UI initialize: tab=extensions
11:48:29-811677 INFO     Extension list is empty: refresh required
11:48:29-846537 DEBUG    Extension list: processed=3 installed=3 enabled=3
                         disabled=0 visible=3 hidden=0
11:48:30-357351 DEBUG    Root paths: ['/app', '/mnt/data', '/mnt/models']
11:48:30-442399 INFO     Local URL: http://localhost:7860/
11:48:30-443409 WARNING  Public URL: enabled without authentication
11:48:30-444075 WARNING  Public URL: enabled with insecure flag
11:48:30-444754 INFO     External URL: http://172.18.0.2:7860
11:48:30-605208 INFO     Public URL: http://xx.xx.xx.xx:7860
11:48:30-605965 INFO     API docs: http://localhost:7860/docs
11:48:30-606329 INFO     API redocs: http://localhost:7860/redocs
11:48:30-607421 DEBUG    API middleware: [<class
                         'starlette.middleware.base.BaseHTTPMiddleware'>, <class
                         'starlette.middleware.gzip.GZipMiddleware'>]
11:48:30-607963 DEBUG    API initialize
11:48:30-744379 DEBUG    Scripts setup: time=0.272 ['XYZ Grid:0.033', 'IP
                         Adapters:0.032']
11:48:30-745048 DEBUG    Model metadata: file="/mnt/data/metadata.json" no
                         changes
11:48:30-745421 INFO     Model: autoload=True
                         selected="Diffusers/kandinskylab/Kandinsky-5.0-T2I-Lite
                         -sft-Diffusers [25da1e82e6]"
11:48:30-746392 DEBUG    Model requested: fn=threading.py:run:<lambda>
11:48:30-747140 DEBUG    Search model:
                         name="Diffusers/kandinskylab/Kandinsky-5.0-T2I-Lite-sft
                         -Diffusers [25da1e82e6]"
                         matched="/mnt/models/Diffusers/models--kandinskylab--Ka
                         ndinsky-5.0-T2I-Lite-sft-Diffusers/snapshots/25da1e82e6
                         adefdbc83c02a71f762287c5e471a4" type=alias
11:48:30-747719 INFO     Load model:
                         select="Diffusers/kandinskylab/Kandinsky-5.0-T2I-Lite-s
                         ft-Diffusers [25da1e82e6]"
11:48:30-750510 INFO     Autodetect model: detect="Kandinsky 5.0" class=None
                         file="/mnt/models/Diffusers/models--kandinskylab--Kandi
                         nsky-5.0-T2I-Lite-sft-Diffusers/snapshots/25da1e82e6ade
                         fdbc83c02a71f762287c5e471a4"
11:48:30-751194 DEBUG    Cache clear
11:48:30-752471 DEBUG    Load model: type=Kandinsky50
                         repo="kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers
                         " config={'low_cpu_mem_usage': True, 'torch_dtype':
                         torch.bfloat16, 'load_connected_pipeline': True,
                         'safety_checker': None, 'requires_safety_checker':
                         False} offload=balanced dtype=torch.bfloat16
                         args={'torch_dtype': torch.bfloat16}
11:48:30-753448 DEBUG    Load model:
                         transformer="kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Di
                         ffusers" cls=Kandinsky5Transformer3DModel
                         subfolder=transformer quant="None" loader=default
                         args={'torch_dtype': torch.bfloat16}
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; diffusers/0.36.0.dev0; session_id/94ced552e9644c9788717e9c44118d5e; telemetry/off' https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/resolve/main/transformer/config.json
Request 99c7dbd3-216c-4331-99aa-7399abc5b323: HEAD https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/resolve/main/transformer/config.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; diffusers/0.36.0.dev0; session_id/94ced552e9644c9788717e9c44118d5e; telemetry/off' https://huggingface.co/api/resolve-cache/models/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/25da1e82e6adefdbc83c02a71f762287c5e471a4/transformer%2Fconfig.json
Request cb801469-c25e-40ec-adf9-f89f5b7f12c5: HEAD https://huggingface.co/api/resolve-cache/models/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/25da1e82e6adefdbc83c02a71f762287c5e471a4/transformer%2Fconfig.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; diffusers/0.36.0.dev0; file_type/model; framework/pytorch' https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/resolve/25da1e82e6adefdbc83c02a71f762287c5e471a4/transformer/diffusion_pytorch_model.safetensors.index.json
Request 7f8a55be-b7a1-4c62-8687-ac4e073867a2: HEAD https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/resolve/25da1e82e6adefdbc83c02a71f762287c5e471a4/transformer/diffusion_pytorch_model.safetensors.index.json (authenticated: False)
11:48:31-547668 DEBUG    Load model:
                         text_encoder="hunyuanvideo-community/HunyuanImage-2.1-D
                         iffusers" cls=Qwen2_5_VLForConditionalGeneration
                         quant="None" loader=default shared=True
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/text_encoder/config.json
Request 85649972-1ba0-4b8b-842b-3d141d965511: HEAD https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/text_encoder/config.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/api/resolve-cache/models/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/7e7b7a177de58591aeaffca0929f4765003d7ced/text_encoder%2Fconfig.json
Request eba7b2be-837d-4527-97dd-aa0091d30bef: HEAD https://huggingface.co/api/resolve-cache/models/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/7e7b7a177de58591aeaffca0929f4765003d7ced/text_encoder%2Fconfig.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/text_encoder/config.json
Request ee47dc2b-9055-43bc-bcc8-f44724da9902: HEAD https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/text_encoder/config.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/api/resolve-cache/models/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/7e7b7a177de58591aeaffca0929f4765003d7ced/text_encoder%2Fconfig.json
Request de325a42-4bd2-416d-8b10-479f104d5bd2: HEAD https://huggingface.co/api/resolve-cache/models/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/7e7b7a177de58591aeaffca0929f4765003d7ced/text_encoder%2Fconfig.json (authenticated: False)
Progress 13.60it/s ██████████████ 100% 4/4 00:00 00:00 Loading checkpoint shards
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/text_encoder/generation_config.json
Request c14548cf-66f7-47fd-b325-19d12d0b7c0b: HEAD https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/text_encoder/generation_config.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/api/resolve-cache/models/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/7e7b7a177de58591aeaffca0929f4765003d7ced/text_encoder%2Fgeneration_config.json
Request c4f2800e-378b-48d2-a743-636ef2d9b051: HEAD https://huggingface.co/api/resolve-cache/models/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/7e7b7a177de58591aeaffca0929f4765003d7ced/text_encoder%2Fgeneration_config.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3; transformers/4.57.3; session_id/9a638f87398540629504663bba1e8e3d; torch/2.9.1+xpu; telemetry/off' https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/custom_generate/generate.py
Request 74fb33f0-eec4-4f03-8eba-1d9b3dde10c2: HEAD https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers/resolve/main/custom_generate/generate.py (authenticated: False)
Send: curl -X GET -H 'Accept: */*' -H 'Accept-Encoding: gzip, deflate' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3' https://huggingface.co/api/models/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers
Request 5ee30564-54f6-46be-83b2-b8bf73a03aef: GET https://huggingface.co/api/models/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3' https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/resolve/main/model_index.json
Request 9b9b30c6-88b9-445f-9a1e-80c6679f90d7: HEAD https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/resolve/main/model_index.json (authenticated: False)
Send: curl -X HEAD -H 'Accept: */*' -H 'Accept-Encoding: identity' -H 'Connection: keep-alive' -H 'user-agent: unknown/None; hf_hub/0.36.0; python/3.12.3' https://huggingface.co/api/resolve-cache/models/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/25da1e82e6adefdbc83c02a71f762287c5e471a4/model_index.json
Request 8af52673-80a4-4c03-b277-a57fcce95b6b: HEAD https://huggingface.co/api/resolve-cache/models/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers/25da1e82e6adefdbc83c02a71f762287c5e471a4/model_index.json (authenticated: False)
Progress  6.44it/s ██████████████ 100% 1/1 00:00 00:00 Loading checkpoint shards
Progress  6.26it/s █████████ 100% 7/7 00:01 00:00 Loading pipeline components...
11:48:34-312733 DEBUG    GC: current={'gpu': 0.0, 'ram': 3.34, 'oom': 0}
                         prev={'gpu': 0.0, 'ram': 3.35} load={'gpu': 0, 'ram':
                         3} gc={'gpu': 0.0, 'py': 282}
                         fn=load_diffuser_force:load_kandinsky5 why=load
                         time=0.36
11:48:34-317306 DEBUG    Setting model: component=vae {'slicing': True,
                         'tiling': False}
11:48:34-319528 DEBUG    Setting model: attention="Scaled-Dot-Product"
11:48:34-320944 INFO     Offload: type=balanced op=init watermark=0.2-0.8
                         gpu=24.60-98.40:123.00 cpu=123.000 limit=0.00
                         always=['T5EncoderModel', 'UMT5EncoderModel']
                         never=['CLIPTextModel', 'CLIPTextModelWithProjection',
                         'AutoencoderKL'] pre=True streams=False
11:48:34-336683 DEBUG    Module: name=text_encoder
                         cls=Qwen2_5_VLForConditionalGeneration size=15.445
                         params=8292166656 quant=None
11:48:34-343565 DEBUG    Module: name=transformer
                         cls=Kandinsky5Transformer3DModel size=11.217
                         params=6022080576 quant=None
11:48:34-345834 DEBUG    Module: name=text_encoder_2 cls=CLIPTextModel
                         size=0.229 params=123060480 quant=None
11:48:34-348512 DEBUG    Module: name=vae cls=AutoencoderKL size=0.156
                         params=83819683 quant=None
11:48:34-349483 INFO     Model class=Kandinsky5T2IPipeline modules=4 size=27.048
11:48:34-353804 INFO     Load model: family=kandinsky5 time={'total': 3.61,
                         'load': 3.57} native=1024 memory={'ram': {'total':
                         123.07, 'rss': 2.14, 'used': 3.35, 'free': 116.59,
                         'avail': 119.73, 'buffers': 0.29, 'cached': 3.99},
                         'gpu': {'used': 0.0, 'total': 123.07, 'active': 0.0,
                         'peak': 0.0, 'retries': 0, 'oom': 0, 'swap': 0}, 'job':
                         ''}
11:48:34-357326 INFO     Startup time: total=19.84 checkpoint=3.61 launch=2.60
                         gradio=2.53 loader=2.29 installer=2.29 torch=1.85
                         libraries=1.28 ui-extensions=0.56 ui-defaults=0.44
                         upscalers=0.36 bnb=0.28 ui-control=0.28 diffusers=0.25
                         ui-networks=0.21 extensions=0.15 ui-models=0.12
                         ui-txt2img=0.10
11:48:34-362483 DEBUG    Save: file="/mnt/data/config.json" json=14 bytes=749
                         time=0.006
11:48:35-219706 DEBUG    UI: connected
11:48:35-221362 INFO     API user=None code=200 http/1.1 GET /sdapi/v1/version
                         10.9.8.241 0.0005
Traceback (most recent call last):
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_utils.py", line 480, in load_image
    b64 = base64.decodebytes(image.encode())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/base64.py", line 554, in decodebytes
    return binascii.a2b_base64(s)
           ^^^^^^^^^^^^^^^^^^^^^^
binascii.Error: Invalid base64-encoded string: number of data characters (25) cannot be 1 more than a multiple of 4

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/python/venv/lib/python3.12/site-packages/gradio/queueing.py", line 388, in call_prediction
    output = await route_utils.call_process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/gradio/route_utils.py", line 219, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/gradio/blocks.py", line 1437, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/gradio/blocks.py", line 1109, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/anyio/to_thread.py", line 61, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2525, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 986, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/gradio/utils.py", line 641, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "/app/modules/call_queue.py", line 13, in f
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/app/modules/ui_common.py", line 424, in update_token_counter
    ids = shared.sd_model.tokenizer(prompt)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/models/qwen2_vl/processing_qwen2_vl.py", line 144, in __call__
    image_inputs = self.image_processor(images=images, **output_kwargs["images_kwargs"])
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_processing_utils_fast.py", line 732, in __call__
    return self.preprocess(images, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 141, in preprocess
    return super().preprocess(images, videos, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_processing_utils_fast.py", line 757, in preprocess
    return self._preprocess_image_like_inputs(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl_fast.py", line 160, in _preprocess_image_like_inputs
    images = self._prepare_image_like_inputs(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_processing_utils_fast.py", line 633, in _prepare_image_like_inputs
    images = self._prepare_images_structure(images, expected_ndims=expected_ndims)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_processing_utils_fast.py", line 564, in _prepare_images_structure
    images = self.fetch_images(images)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_processing_base.py", line 532, in fetch_images
    return load_image(image_url_or_urls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/python/venv/lib/python3.12/site-packages/transformers/image_utils.py", line 483, in load_image
    raise ValueError(
ValueError: Incorrect image source. Must be a valid URL starting with `http://` or `https://`, a valid path to an image file, or a base64 encoded string. Got car drifting on the Red square. Failed with Invalid base64-encoded string: number of data characters (25) cannot be 1 more than a multiple of 4
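Note: the traceback above is a separate UI-side error, not the dtype failure. `update_token_counter` passes the prompt string to the Qwen2.5-VL processor, whose image branch treats the text as an image source and falls back to base64-decoding it. The `binascii` error is reproducible with the stdlib alone: the decoder discards the 5 spaces, leaving 25 base64-alphabet characters, and a length of 25 (25 % 4 == 1) is invalid:

```python
import base64
import binascii

prompt = "car drifting on the Red square"  # 25 non-space characters
try:
    base64.decodebytes(prompt.encode())
except binascii.Error as e:
    # number of data characters (25) cannot be 1 more than a multiple of 4
    print(e)
```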
11:49:31-904197 DEBUG    Select input: type=<class 'str'> source=None init=None
                         mask=None mode=Image  time=0.00
11:49:32-961175 DEBUG    Sampler: "Default" cls=FlowMatchEulerDiscreteScheduler
                         config={'num_train_timesteps': 1000, 'shift': 5.0,
                         'use_dynamic_shifting': False, 'base_shift': 0.5,
                         'max_shift': 1.15, 'base_image_seq_len': 256,
                         'max_image_seq_len': 4096, 'invert_sigmas': False,
                         'shift_terminal': None, 'use_karras_sigmas': False,
                         'use_exponential_sigmas': False, 'use_beta_sigmas':
                         False, 'time_shift_type': 'exponential',
                         'stochastic_sampling': False}
11:49:32-988940 INFO     Processing modifiers: apply
11:49:33-013696 INFO     Base: pipeline=Kandinsky5T2IPipeline task=TEXT_2_IMAGE
                         batch=1/1x1 set={'prompt': 1, 'negative_prompt': 1,
                         'guidance_scale': 6, 'generator': 'xpu:[1849669577]',
                         'num_inference_steps': 20, 'output_type': 'np',
                         'width': 1024, 'height': 1024, 'parser': 'fixed'}
11:49:33-041537 DEBUG    Encode: prompt="['car drifting on the Red square']"
                         hijack=True
11:49:46-001339 DEBUG    Encode: prompt="['']" hijack=True
Progress ?it/s                                              0% 0/20 00:08 ? Base
11:49:54-773356 ERROR    Processing: step=base args={'prompt': ['car drifting on
                         the Red square'], 'negative_prompt': [''],
                         'guidance_scale': 6, 'generator':
                         [<modules.intel.ipex.hijacks.torch_Generator object at
                         0x73b7fab70a10>], 'callback_on_step_end': <function
                         diffusers_callback at 0x73b7fd0fd300>,
                         'callback_on_step_end_tensor_inputs': ['latents'],
                         'num_inference_steps': 20, 'output_type': 'np',
                         'width': 1024, 'height': 1024} mat1 and mat2 must have
                         the same dtype, but got Float and BFloat16
11:49:54-775736 ERROR    Processing: RuntimeError
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│/app/modules/processing_diffusers.py:180 in process_base                      │
│                                                                              │
│  179 │   │   │   taskid = shared.state.begin('Inference')                    │
│❱ 180 │   │   │   output = shared.sd_model(**base_args)                       │
│  181 │   │   │   shared.state.end(taskid)                                    │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py:120  │
│                                                                              │
│  119 │   │   with ctx_factory():                                             │
│❱ 120 │   │   │   return func(*args, **kwargs)                                │
│  121                                                                         │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/diffusers/pipelines/kandinsky5/ │
│                                                                              │
│  730 │   │   │   │   # Predict noise residual                                │
│❱ 731 │   │   │   │   pred_velocity = self.transformer(                       │
│  732 │   │   │   │   │   hidden_states=latents.to(dtype),                    │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1775 │
│                                                                              │
│  1774 │   │   else:                                                          │
│❱ 1775 │   │   │   return self._call_impl(*args, **kwargs)                    │
│  1776                                                                        │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1786 │
│                                                                              │
│  1785 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks) │
│❱ 1786 │   │   │   return forward_call(*args, **kwargs)                       │
│  1787                                                                        │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/accelerate/hooks.py:175 in new_ │
│                                                                              │
│  174 │   │   else:                                                           │
│❱ 175 │   │   │   output = module._old_forward(*args, **kwargs)               │
│  176 │   │   return module._hf_hook.post_forward(module, output)             │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/diffusers/models/transformers/t │
│                                                                              │
│  628 │   │   text_embed = self.text_embeddings(text_embed)                   │
│❱ 629 │   │   time_embed = self.time_embeddings(time)                         │
│  630 │   │   time_embed = time_embed + self.pooled_text_embeddings(pooled_te │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1775 │
│                                                                              │
│  1774 │   │   else:                                                          │
│❱ 1775 │   │   │   return self._call_impl(*args, **kwargs)                    │
│  1776                                                                        │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1786 │
│                                                                              │
│  1785 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks) │
│❱ 1786 │   │   │   return forward_call(*args, **kwargs)                       │
│  1787                                                                        │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/amp/autocast_mode.py:44 i │
│                                                                              │
│   43 │   │   with autocast_instance:                                         │
│❱  44 │   │   │   return func(*args, **kwargs)                                │
│   45                                                                         │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/diffusers/models/transformers/t │
│                                                                              │
│  171 │   │   time_embed = torch.cat([torch.cos(args), torch.sin(args)], dim= │
│❱ 172 │   │   time_embed = self.out_layer(self.activation(self.in_layer(time_ │
│  173 │   │   return time_embed                                               │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1775 │
│                                                                              │
│  1774 │   │   else:                                                          │
│❱ 1775 │   │   │   return self._call_impl(*args, **kwargs)                    │
│  1776                                                                        │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1786 │
│                                                                              │
│  1785 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks) │
│❱ 1786 │   │   │   return forward_call(*args, **kwargs)                       │
│  1787                                                                        │
│                                                                              │
│/mnt/python/venv/lib/python3.12/site-packages/torch/nn/modules/linear.py:134  │
│                                                                              │
│  133 │   │   """
│❱ 134 │   │   return F.linear(input, self.weight, self.bias)                  │
│  135                                                                         │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
11:49:55-186148 DEBUG    Search model:
                         name="Diffusers/kandinskylab/Kandinsky-5.0-T2I-Lite-sft
                         -Diffusers [25da1e82e6]"
                         matched="/mnt/models/Diffusers/models--kandinskylab--Ka
                         ndinsky-5.0-T2I-Lite-sft-Diffusers/snapshots/25da1e82e6
                         adefdbc83c02a71f762287c5e471a4" type=alias
11:49:55-200050 DEBUG    Analyzed:
                         model="Diffusers/kandinskylab/Kandinsky-5.0-T2I-Lite-sf
                         t-Diffusers" type=kandinsky5
                         class=Kandinsky5T2IPipeline size=0 mtime="2025-12-10
                         11:00:51" modules=[name="transformer"
                         cls=Kandinsky5Transformer3DModel config=True
                         device=xpu:0 dtype=torch.bfloat16 params=6022080576
                         modules=1307, name="vae" cls=AutoencoderKL config=True
                         device=cpu dtype=torch.bfloat16 params=83819683
                         modules=241, name="text_encoder"
                         cls=Qwen2_5_VLForConditionalGeneration config=True
                         device=xpu:0 dtype=torch.bfloat16 params=8292166656
                         modules=763, name="tokenizer" cls=Qwen2VLProcessor
                         config=False, name="text_encoder_2" cls=CLIPTextModel
                         config=True device=xpu:0 dtype=torch.bfloat16
                         params=123060480 modules=152, name="tokenizer_2"
                         cls=CLIPTokenizer config=False, name="scheduler"
                         cls=FlowMatchEulerDiscreteScheduler config=True]
11:49:55-202693 INFO     Processing modifiers: unapply
11:49:55-204009 DEBUG    Process: batch=1/1 interrupted
11:49:55-205473 INFO     Processed: images=0 its=0.00 ops=['txt2img']
11:49:55-206321 DEBUG    Processed: timers={'total': 53.79, 'post': 22.2,
                         'onload': 18.28, 'te': 13.24}
11:49:55-207397 DEBUG    Processed: memory={'ram': {'total': 123.07, 'rss':
                         3.14, 'used': 33.51, 'free': 59.89, 'avail': 89.56,
                         'buffers': 0.29, 'cached': 58.3}, 'gpu': {'used':
                         27.76, 'total': 123.07, 'active': 26.96, 'peak': 26.96,
                         'retries': 0, 'oom': 0, 'swap': 0}, 'job': ''}

Backend: Diffusers

Compute: Intel IPEX

Interface: ModernUI

Branch: Dev

Model: Other

Acknowledgements

  • I have read the above and searched for existing issues
  • I confirm that this is classified correctly and it's not an extension issue

Metadata

Assignees

No one assigned

    Labels

    upstream: Fix is required in upstream library

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests