(#2583) remove low_cpu_mem_usage from qwen init, it does not do anything anymore; use dtype instead of torch_dtype (#2586)

Merged
bghira merged 1 commit into main from bugfix/2583 on Feb 9, 2026

Conversation


bghira (Owner) commented on Feb 8, 2026

This pull request improves the loading process for text encoder models in simpletuner/helpers/models/flux2/model.py. The changes optimize device placement during model initialization and add informative logging to help debug device allocation in distributed setups.

Device placement improvements:

  • Changed the loading path for both the Qwen3 and Mistral-3 text encoders so that each model is loaded onto CPU first and only then moved to the per-rank accelerator device, preventing GPU contention during initialization; see the sketch after this list. (_load_text_encoder_qwen3, _load_text_encoder_mistral) [1] [2]
  • Replaced torch_dtype with dtype in the loading parameters and removed low_cpu_mem_usage, which no longer has any effect, for consistency and clarity. (_load_text_encoder_qwen3, _load_text_encoder_mistral) [1] [2]
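
Below is a minimal sketch of the CPU-first loading pattern, assuming a Hugging Face transformers-style from_pretrained call and an Accelerate Accelerator; the checkpoint path, model class, and dtype are illustrative and not taken from the diff.

```python
# Sketch only: the path, AutoModel class, and bfloat16 dtype are assumptions,
# not copied from simpletuner/helpers/models/flux2/model.py.
import torch
from accelerate import Accelerator
from transformers import AutoModel

accelerator = Accelerator()


def load_text_encoder(pretrained_path: str) -> torch.nn.Module:
    # Load onto CPU first so that all ranks do not try to materialize the
    # weights on the same GPU during initialization.
    text_encoder = AutoModel.from_pretrained(
        pretrained_path,
        dtype=torch.bfloat16,  # `dtype` replaces the deprecated `torch_dtype`
    )
    # Only after loading, move the encoder to this rank's device.
    return text_encoder.to(accelerator.device)
```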

Logging enhancements:

  • Added detailed logging when moving the Qwen3 and Mistral-3 text encoders to their target devices, including the device and the process rank, to aid in debugging distributed training setups; a hedged example follows below. (_load_text_encoder_qwen3, _load_text_encoder_mistral) [1] [2]
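
A hedged example of the kind of diagnostic logging described above; the message format and logger name are assumptions, while process_index and num_processes are standard Accelerate attributes.

```python
import logging

from accelerate import Accelerator

logger = logging.getLogger(__name__)
accelerator = Accelerator()

# Record the target device together with the process rank so that a text
# encoder landing on the wrong GPU is easy to spot in multi-GPU logs.
logger.info(
    "Moving text encoder to device %s (process rank %d of %d)",
    accelerator.device,
    accelerator.process_index,
    accelerator.num_processes,
)
```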

bghira linked an issue on Feb 8, 2026 that may be closed by this pull request
bghira merged commit 46b77f9 into main on Feb 9, 2026
2 checks passed
bghira deleted the bugfix/2583 branch on February 9, 2026 at 02:25

Development

Successfully merging this pull request may close these issues.

[bug] multi GPU training is loading TEs on the same GPU causing OOM
