Commit 86f5623

llama : fix MiniCPM inference after Granite Four changes (#14850)
MiniCPM models use the llm_build_granite constructor, which the Granite Four PR changed to read hparams.rope_finetuned instead of taking a use_rope parameter. MiniCPM needs rope enabled by default, so the flag is now set explicitly during hparams loading; without it, inference produced gibberish instead of correct responses.
1 parent 39cffdf commit 86f5623

File tree

1 file changed: +3 −0 lines changed


src/llama-model.cpp

Lines changed: 3 additions & 0 deletions
@@ -646,6 +646,9 @@ void llama_model::load_hparams(llama_model_loader & ml) {
         ml.get_key(LLM_KV_RESIDUAL_SCALE, hparams.f_residual_scale);
         ml.get_key(LLM_KV_LOGIT_SCALE,    hparams.f_logit_scale);

+        // MiniCPM uses rope by default, unlike Granite which uses it as a switch
+        hparams.rope_finetuned = true;
+
         switch (hparams.n_layer) {
             case 52: type = LLM_TYPE_1B; break;
             case 40: type = LLM_TYPE_2B; break;

Comments (0)