
model: add hunyuan dense #14878


Merged
10 commits merged into ggml-org:master on Aug 1, 2025

Conversation

@stevenkuang-tencent (Contributor) commented Jul 25, 2025

Update:

  • Support hunyuan_dense
  • Fix hunyuan_moe chat template

The github-actions bot added the python (python script changes) label on Jul 25, 2025
Signed-off-by: stevenkuang <[email protected]>
@stevenkuang-tencent changed the title from "model: add hunyuan v1 dense" to "model: add hunyuan dense" on Jul 25, 2025
@stevenkuang-tencent requested a review from CISC on July 26, 2025 at 17:19
@CISC (Collaborator) commented Jul 29, 2025

@stevenkuang-tencent gentle ping

@stevenkuang-tencent (Contributor, Author)
Politely asking: can this pull request be merged now? @CISC

@CISC (Collaborator) commented Aug 1, 2025

@stevenkuang-tencent Yes, but the chat template gives me pause; please follow up once the model is released if there are any problems.

@CISC merged commit 0f5ccd6 into ggml-org:master on Aug 1, 2025
50 checks passed
@jacekpoplawski (Contributor)
Is this for upcoming models or old ones? https://huggingface.co/tencent/Hunyuan-4B-Instruct is not accessible, and in vLLM I see https://huggingface.co/tencent/Hunyuan-7B-Instruct-0124 mentioned.

@stevenkuang-tencent (Contributor, Author) commented Aug 1, 2025

Is this for upcoming models or old ones? https://huggingface.co/tencent/Hunyuan-4B-Instruct is not accessible, and in vLLM I see https://huggingface.co/tencent/Hunyuan-7B-Instruct-0124 mentioned.

It is for upcoming models; they will be released soon.

@jacekpoplawski (Contributor)

It is for upcoming models; they will be released soon.

That's fantastic news, thanks!

Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request on Aug 2, 2025:
* support hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan_moe to hunyuan_v1_moe

Signed-off-by: stevenkuang <[email protected]>

* fix rope alpha assert and bos token

Signed-off-by: stevenkuang <[email protected]>

* add blank line

Signed-off-by: stevenkuang <[email protected]>

* Revert "update hunyuan_moe to hunyuan_v1_moe"

This reverts commit aa973ca.

* use hunyuan_dense instead of hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan_moe chat template

Signed-off-by: stevenkuang <[email protected]>

* remove leftover code

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan dense chat template

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan dense vocab and chat template

Signed-off-by: stevenkuang <[email protected]>

---------

Signed-off-by: stevenkuang <[email protected]>
@pwilkin (Contributor) commented Aug 5, 2025

Just wanted to chime in: I tested IQ4_NL quants and the output is completely incoherent.

@arch-btw (Contributor) commented Aug 5, 2025

Same issue here; I tried it with different flags, but it still doesn't work:

-cnv --jinja

-cnv --chat-template hunyuan-dense

-cnv --chat-template hunyuan-moe

Example output:

hello

,不对,这样处理的话,比如对于输入序列“1,
(roughly: ", no, that's wrong; if it's handled this way, then for example for the input sequence '1,")

@stevenkuang-tencent

@pwilkin (Contributor) commented Aug 5, 2025

My three attempts were:

  • Lots of Chinese text after "Hello"
  • Started "<think" and then completely froze (generation wasn't finished)
  • Kept repeating "answer:" in a new session

Something is completely broken.

@stevenkuang-tencent (Contributor, Author)

The chat template was updated before the model was open-sourced, and we are updating it here in sync.

@arch-btw (Contributor) commented Aug 5, 2025

@stevenkuang-tencent thank you

@pwilkin I put this together and it seems to work for now, although it's not an official solution:

Save it as hunyuan4b.jinja, then run with --jinja --chat-template-file hunyuan4b.jinja. The model defaults to /no_think, but putting /think before the prompt works.

hunyuan4b.jinja:

{%- if add_generation_prompt is not defined %}
    {%- set add_generation_prompt = false %}
{%- endif %}

{%- set ns = namespace(is_first=true, is_last_user=false) %}

{%- for message in messages if message['role'] == 'system' %}
    {%- if ns.is_first %}
        {%- set ns.is_first = false %}
        {{- bos_token -}}
        {{- message['content'] }}
    {%- else %}
        {{- '\n\n' + message['content'] }}
    {%- endif %}
{%- endfor %}

{%- for message in messages %}
    {%- if message['role'] == 'user' -%}
        {%- set ns.is_last_user = true -%}
        <|hy_User|>{{ message['content'] }}<|hy_Assistant|>
    {%- endif %}
    {%- if message['role'] == 'assistant' -%}
        {%- set ns.is_last_user = false -%}
        {{ message['content'] }}{{ eos_token }}
    {%- endif %}
{%- endfor %}

{%- if add_generation_prompt and not ns.is_last_user -%}
    <|hy_Assistant|>
{%- endif %}

{%- if enable_thinking is defined and not enable_thinking %}
    ...
{%- endif %}
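
To sanity-check what the template emits before wiring it into llama-cli, here is a minimal render sketch of mine, assuming the jinja2 package (llama.cpp itself uses its own minja engine for --jinja, so treat this only as an approximation) and placeholder BOS/EOS strings, since the real ones come from the model's tokenizer:

import jinja2  # pip install jinja2

env = jinja2.Environment(loader=jinja2.FileSystemLoader("."))
template = env.get_template("hunyuan4b.jinja")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "/think hello"},  # /think enables thinking
]

# <BOS>/<EOS> are illustrative placeholders, not the model's actual tokens
print(template.render(
    messages=messages,
    bos_token="<BOS>",
    eos_token="<EOS>",
    add_generation_prompt=True,
))

For the two messages above, this should print a single line: <BOS>You are a helpful assistant.<|hy_User|>/think hello<|hy_Assistant|>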

@pwilkin (Contributor) commented Aug 5, 2025

(quoting arch-btw's workaround comment and template above)

What's in the "..." part? The current contents?

@arch-btw (Contributor) commented Aug 5, 2025

I think so; when I remove it (with thinking enabled), it starts talking in Chinese again.

@pwilkin (Contributor) commented Aug 5, 2025

Nope, on Hunyuan 7B it's still garbage. I tried the fixed prompt from their tokenizer config, but it still doesn't work.

@pwilkin (Contributor) commented Aug 5, 2025

I guess it might have something to do with this:
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
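
If the EOG set is the problem, the tokenizer metadata in the GGUF can be inspected directly. A small sketch, assuming the gguf Python package is installed; "model.gguf" is a placeholder path, not a file from this PR:

from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("model.gguf")  # placeholder path
for name, field in reader.fields.items():
    if "token_id" in name:  # e.g. tokenizer.ggml.eos_token_id
        # scalar fields keep their value at parts[data[0]]
        print(name, field.parts[field.data[0]])

Comparing the printed eos/eot ids against the tokens the model actually emits at the end of a turn should show whether the warning is benign or the tokenizer config really is wrong.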

@arch-btw (Contributor) commented Aug 5, 2025

I think the 7B uses a different tokenizer.
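
Once both repos are reachable, that is easy to check; a sketch on my part, assuming transformers can load them (the repo ids are the ones linked earlier in this thread):

from transformers import AutoTokenizer  # pip install transformers

# Hunyuan-4B-Instruct was not accessible at the time of this thread,
# so these ids may need updating once the models are published.
for repo in ("tencent/Hunyuan-4B-Instruct", "tencent/Hunyuan-7B-Instruct-0124"):
    tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
    print(repo, "vocab size:", len(tok), "specials:", tok.special_tokens_map)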

@pwilkin (Contributor) commented Aug 5, 2025

Yes, but it's been incorrectly uploaded from what I've seen.
