
model: add hunyuan dense #14878


Merged
10 commits merged into ggml-org:master on Aug 1, 2025

Conversation

@stevenkuang-tencent (Contributor) commented Jul 25, 2025

Update:

  • Support hunyuan_dense
  • Fix hunyuan_moe chat template

The github-actions bot added the python (python script changes) label on Jul 25, 2025
Signed-off-by: stevenkuang <[email protected]>
@stevenkuang-tencent changed the title from "model: add hunyuan v1 dense" to "model: add hunyuan dense" on Jul 25, 2025
@stevenkuang-tencent requested a review from CISC on July 26, 2025 at 17:19
@CISC (Collaborator) commented Jul 29, 2025

@stevenkuang-tencent gentle ping

@stevenkuang-tencent (Contributor, Author)
Politely asking: can this pull request be merged now? @CISC

@CISC (Collaborator) commented Aug 1, 2025

@stevenkuang-tencent Yes, but the chat template gives me pause; please follow up once the model is released if there are any problems.

@CISC merged commit 0f5ccd6 into ggml-org:master on Aug 1, 2025
50 checks passed
@jacekpoplawski (Contributor)
Is this for upcoming models or old ones? https://huggingface.co/tencent/Hunyuan-4B-Instruct is not accessible, and in vLLM I see https://huggingface.co/tencent/Hunyuan-7B-Instruct-0124 mentioned.

@stevenkuang-tencent (Contributor, Author) commented Aug 1, 2025

Is this for upcoming models or old ones? https://huggingface.co/tencent/Hunyuan-4B-Instruct is not accessible, and in vLLM I see https://huggingface.co/tencent/Hunyuan-7B-Instruct-0124 mentioned.

It is for upcoming models; they will be released soon.

@jacekpoplawski (Contributor)

It is for upcoming models; they will be released soon.

That's fantastic news, thanks!

Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request on Aug 2, 2025:
* support hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan_moe to hunyuan_v1_moe

Signed-off-by: stevenkuang <[email protected]>

* fix rope alpha assert and bos token

Signed-off-by: stevenkuang <[email protected]>

* add blank line

Signed-off-by: stevenkuang <[email protected]>

* Revert "update hunyuan_moe to hunyuan_v1_moe"

This reverts commit aa973ca.

* use hunyuan_dense instead of hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan_moe chat template

Signed-off-by: stevenkuang <[email protected]>

* remove leftover code

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan dense chat template

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan dense vocab and chat template

Signed-off-by: stevenkuang <[email protected]>

---------

Signed-off-by: stevenkuang <[email protected]>
@pwilkin (Contributor) commented Aug 5, 2025

Just wanted to chime in: I tested IQ4_NL quants and the output is completely incoherent.

@arch-btw (Contributor) commented Aug 5, 2025

Same issue here; I tried it with different flags, but it still doesn't work:

-cnv --jinja

-cnv --chat-template hunyuan-dense

-cnv --chat-template hunyuan-moe

Example output:

hello

,不对,这样处理的话,比如对于输入序列“1,
(roughly: ", no, that's wrong; if it's handled this way, then for example for the input sequence '1,")

@stevenkuang-tencent

@pwilkin (Contributor) commented Aug 5, 2025

My three attempts were:

  • Lots of Chinese text after "Hello"
  • Started "<think" and then completely froze (generation wasn't finished)
  • Kept repeating "answer:" in a new session

Something is completely broken.

@stevenkuang-tencent (Contributor, Author)

The chat template was updated before the model was open-sourced, and we are updating it here in sync.

@arch-btw (Contributor) commented Aug 5, 2025

@stevenkuang-tencent thank you

@pwilkin I put this together and it seems to work for now, although it's not an official solution:

Save it as hunyuan4b.jinja, then run with --jinja --chat-template-file hunyuan4b.jinja. The model defaults to /no_think, but putting /think before the prompt works.

hunyuan4b.jinja:

{%- if add_generation_prompt is not defined %}
    {%- set add_generation_prompt = false %}
{%- endif %}

{%- set ns = namespace(is_first=true, is_last_user=false) %}

{%- for message in messages if message['role'] == 'system' %}
    {%- if ns.is_first %}
        {%- set ns.is_first = false %}
        {{- bos_token -}}
        {{- message['content'] }}
    {%- else %}
        {{- '\n\n' + message['content'] }}
    {%- endif %}
{%- endfor %}

{%- for message in messages %}
    {%- if message['role'] == 'user' -%}
        {%- set ns.is_last_user = true -%}
        <|hy_User|>{{ message['content'] }}<|hy_Assistant|>
    {%- endif %}
    {%- if message['role'] == 'assistant' -%}
        {%- set ns.is_last_user = false -%}
        {{ message['content'] }}{{ eos_token }}
    {%- endif %}
{%- endfor %}

{%- if add_generation_prompt and not ns.is_last_user -%}
    <|hy_Assistant|>
{%- endif %}

{%- if enable_thinking is defined and not enable_thinking %}
    ...
{%- endif %}
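
To sanity-check what the template emits before wiring it into llama-cli, here is a minimal render sketch of mine, assuming the jinja2 package (llama.cpp itself uses its own minja engine for --jinja, so treat this only as an approximation) and placeholder BOS/EOS strings, since the real ones come from the model's tokenizer:

import jinja2  # pip install jinja2

env = jinja2.Environment(loader=jinja2.FileSystemLoader("."))
template = env.get_template("hunyuan4b.jinja")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "/think hello"},  # /think enables thinking
]

# <BOS>/<EOS> are illustrative placeholders, not the model's actual tokens
print(template.render(
    messages=messages,
    bos_token="<BOS>",
    eos_token="<EOS>",
    add_generation_prompt=True,
))

For the two messages above, this should print a single line: <BOS>You are a helpful assistant.<|hy_User|>/think hello<|hy_Assistant|>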

@pwilkin (Contributor) commented Aug 5, 2025

(quoting arch-btw's workaround comment and template above)

What's in the "..." part? The current contents?

@arch-btw (Contributor) commented Aug 5, 2025

I think so; when I remove it (with thinking enabled), it starts talking in Chinese again.

@pwilkin (Contributor) commented Aug 5, 2025

Nope, on Hunyuan 7B it's still garbage. I tried the fixed prompt from their tokenizer config, but it still doesn't work.

@pwilkin (Contributor) commented Aug 5, 2025

I guess it might have something to do with this:
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
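
If the EOG set is the problem, the tokenizer metadata in the GGUF can be inspected directly. A small sketch, assuming the gguf Python package is installed; "model.gguf" is a placeholder path, not a file from this PR:

from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("model.gguf")  # placeholder path
for name, field in reader.fields.items():
    if "token_id" in name:  # e.g. tokenizer.ggml.eos_token_id
        # scalar fields keep their value at parts[data[0]]
        print(name, field.parts[field.data[0]])

Comparing the printed eos/eot ids against the tokens the model actually emits at the end of a turn should show whether the warning is benign or the tokenizer config really is wrong.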

@arch-btw (Contributor) commented Aug 5, 2025

I think the 7B uses a different tokenizer.
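
Once both repos are reachable, that is easy to check; a sketch on my part, assuming transformers can load them (the repo ids are the ones linked earlier in this thread):

from transformers import AutoTokenizer  # pip install transformers

# Hunyuan-4B-Instruct was not accessible at the time of this thread,
# so these ids may need updating once the models are published.
for repo in ("tencent/Hunyuan-4B-Instruct", "tencent/Hunyuan-7B-Instruct-0124"):
    tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
    print(repo, "vocab size:", len(tok), "specials:", tok.special_tokens_map)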

@pwilkin (Contributor) commented Aug 5, 2025

Yes, but it's been incorrectly uploaded from what I've seen.
