
fix: restore MiniCPM inference after Granite Four changes #14850


Merged: 1 commit merged into ggml-org:master from fix/minicpm-rope-inference on Jul 24, 2025

Conversation

@jk3456a (Contributor) commented Jul 24, 2025

This commit fixes MiniCPM model inference that was broken by the Granite Four PR (#13550). The issue had two parts:

  1. The LLM_KV_ATTENTION_LAYER_INDICES enum value had been removed, causing the ordering of subsequent enum values to shift and breaking model metadata parsing

  2. The MiniCPM architecture uses llm_build_granite, which was changed to read hparams.rope_finetuned instead of taking a use_rope parameter, but MiniCPM models were not setting this flag

Changes:

  • Restore LLM_KV_ATTENTION_LAYER_INDICES enum and string mapping
  • Set hparams.rope_finetuned = true for MiniCPM architecture

Fixes inference output from gibberish to correct model responses.
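
To make the second change concrete, here is a minimal, self-contained C++ sketch of the idea. It is not the actual llama.cpp diff: llm_arch, llm_hparams, and load_hparams below are simplified stand-ins for llama.cpp's internal types. The point is that the architecture-specific hyperparameter loading defaults rope_finetuned to true for MiniCPM, so the shared Granite graph builder keeps applying RoPE.

```cpp
// Illustrative only: a self-contained model of the fix, not the actual
// llama.cpp diff. Type and enum names are simplified stand-ins for
// llama.cpp's llama_hparams / LLM_ARCH_* machinery.
#include <cstdio>

enum llm_arch { LLM_ARCH_GRANITE, LLM_ARCH_MINICPM };

struct llm_hparams {
    // After the Granite Four PR, the graph builder consults this flag
    // instead of receiving a separate use_rope argument.
    bool rope_finetuned = false;
};

// Sketch of per-architecture hyperparameter defaults.
static void load_hparams(llm_arch arch, llm_hparams & hparams) {
    switch (arch) {
        case LLM_ARCH_MINICPM:
            // The fix: MiniCPM always uses RoPE, so enable the flag that
            // the shared Granite builder now keys off of.
            hparams.rope_finetuned = true;
            break;
        default:
            break;
    }
}

int main() {
    llm_hparams hparams;
    load_hparams(LLM_ARCH_MINICPM, hparams);
    std::printf("rope_finetuned = %s\n", hparams.rope_finetuned ? "true" : "false");
    return 0;
}
```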

Tested with a MiniCPM 0.5B model, which now produces proper inference:
Input: "你好" ("Hello")
Output: "你好,我是MiniCPM系列模型,由面壁智能和OpenBMB开源社区开发。详细信息请访问 https://github.com/OpenBMB/ [end of text]" (roughly: "Hello, I am a MiniCPM-series model, developed by ModelBest and the OpenBMB open-source community. For details, please visit https://github.com/OpenBMB/")


@ggerganov (Member) commented:

> Missing LLM_KV_ATTENTION_LAYER_INDICES enum value that was removed, causing enum ordering to shift and breaking model metadata parsing

Hm, I don't think the actual enum values matter. Which parsing is broken?

Commit message:
MiniCPM models use the llm_build_granite constructor which was changed
in the Granite Four PR to use hparams.rope_finetuned instead of a
use_rope parameter. MiniCPM models need rope enabled by default.

Fixes inference from gibberish to correct responses.

@jk3456a force-pushed the fix/minicpm-rope-inference branch from 19594e5 to 2cdb760 on July 24, 2025 08:33
@jk3456a (Contributor, Author) commented Jul 24, 2025

Sorry, you are right, the enum value doesn't matter.
The actual issue was that MiniCPM models use the llm_build_granite constructor, which was changed in the Granite Four PR to use hparams.rope_finetuned instead of the use_rope parameter. MiniCPM models need rope enabled by default, but weren't setting this flag.
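
For illustration, here is a small self-contained sketch of why the constructor change broke MiniCPM, assuming a simplified model of the builder (the function names and signatures below are hypothetical, not the real llama.cpp ones): the old builder received RoPE as an explicit argument, while the new one reads the flag from hparams, so an architecture that never sets the flag silently stops applying RoPE.

```cpp
// Illustrative sketch only, not the real llama.cpp builder. It models why
// MiniCPM broke: the shared Granite graph builder used to take an explicit
// use_rope argument, and after the Granite Four PR it reads the flag from
// hparams instead, so any architecture that forgets to set the flag
// silently loses RoPE.
#include <cstdio>

struct llm_hparams {
    bool rope_finetuned = false; // flag the new builder consults
};

// Old shape of the shared builder (simplified): RoPE was an explicit argument.
static void build_granite_old(bool use_rope) {
    std::printf("old builder: RoPE %s\n", use_rope ? "applied" : "skipped");
}

// New shape (simplified): RoPE is driven by the hyperparameter flag.
static void build_granite_new(const llm_hparams & hparams) {
    std::printf("new builder: RoPE %s\n", hparams.rope_finetuned ? "applied" : "skipped");
}

int main() {
    // Before the fix, MiniCPM left the flag at its default (false),
    // so the new builder skipped RoPE and inference produced gibberish.
    llm_hparams minicpm_before;
    build_granite_old(/*use_rope=*/true);   // pre-Granite-Four behaviour
    build_granite_new(minicpm_before);      // broken: RoPE skipped

    // After the fix, the architecture sets the flag explicitly.
    llm_hparams minicpm_after;
    minicpm_after.rope_finetuned = true;
    build_granite_new(minicpm_after);       // restored: RoPE applied
    return 0;
}
```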

@CISC merged commit 86f5623 into ggml-org:master on Jul 24, 2025. 47 checks passed.
@jk3456a deleted the fix/minicpm-rope-inference branch on July 24, 2025 at 14:16.
taronaeo pushed a commit to taronaeo/llama.cpp-s390x that referenced this pull request Jul 25, 2025
llama : fix MiniCPM inference after Granite Four changes (ggml-org#14850)

MiniCPM models use the llm_build_granite constructor which was changed
in the Granite Four PR to use hparams.rope_finetuned instead of a
use_rope parameter. MiniCPM models need rope enabled by default.

Fixes inference from gibberish to correct responses.
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 25, 2025
* origin/master:
docs : update HOWTO-add-model.md for ModelBase and new model classes (ggml-org#14874)
ggml : remove invalid portPos specifiers from dot files (ggml-org#14838)
context : restore preemptive sched reset when LLAMA_SET_ROWS=0 (ggml-org#14870)
mtmd : fix 32-bit narrowing issue in export-lora and mtmd clip (ggml-org#14503)
rpc : check for null buffers in get/set/copy tensor endpoints (ggml-org#14868)
sched : fix multiple evaluations of the same graph with pipeline parallelism (ggml-org#14855)
musa: upgrade musa sdk to rc4.2.0 (ggml-org#14498)
sync : ggml
cmake : fix usage issues (ggml/1257)
ggml-cpu : remove stdlib include from repack.cpp (ggml/1276)
context : perform output reorder lazily upon access after sync (ggml-org#14853)
chat : fix kimi-k2 chat template (ggml-org#14852)
sycl: fixed semantics of block offset calculation (ggml-org#14814)
llama : fix MiniCPM inference after Granite Four changes (ggml-org#14850)
docs: add libcurl-dev install hint for Linux distros (ggml-org#14801)
metal : fix fusion across different encoders (ggml-org#14849)
sycl: fix undefined variable in work group size check (ggml-org#14843)
convert : text-only support for GLM-4.1V-9B-Thinking (ggml-org#14823)
CUDA: fix overflow in FA, tune performance (ggml-org#14840)
CUDA: fix compilation with GGML_CUDA_F16 (ggml-org#14837)