Commit 939fa72
Ralf Waldukat
fix: prevent KV cache corruption on SWA/ISWA models (e.g. Gemma-4)
SWA/ISWA KV caches maintain global position maps (g_iswa_pos_max/min) that
are only cleared by llama_memory_clear(), not by kv_cache_seq_rm(). When
generate() finds a prefix match (e.g. a shared BOS token), it calls
kv_cache_seq_rm(), which returns True for ISWA, so the full reset is
skipped. The stale position maps then leave the batch allocator in an
inconsistent state, and llama_decode returns -1 on subsequent prompts.
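
One way to picture the failure is a toy cache whose partial removal leaves a global position bound behind. This is only an illustration under assumptions: `ToyISWACache`, its fields, and the rule that rejects positions at or below the stale bound are hypothetical and do not reproduce the llama.cpp data structures or its batch allocator.

```python
# Toy model only: ToyISWACache and its fields are hypothetical stand-ins,
# not the llama.cpp implementation.
class ToyISWACache:
    def __init__(self):
        self.cells = {}    # position -> token
        self.pos_max = -1  # stands in for a global map such as g_iswa_pos_max

    def add(self, pos, token):
        # Assumed rule: a new position must lie beyond the recorded maximum.
        if pos <= self.pos_max:
            return -1      # analogous to llama_decode returning -1
        self.cells[pos] = token
        self.pos_max = pos
        return 0

    def seq_rm(self, start):
        # Partial removal: drops cells but leaves pos_max untouched.
        for pos in [p for p in self.cells if p >= start]:
            del self.cells[pos]
        return True

    def clear(self):
        # Full reset: the only operation that also resets pos_max.
        self.cells.clear()
        self.pos_max = -1


cache = ToyISWACache()
for pos, tok in enumerate([1, 10, 11, 12]):
    cache.add(pos, tok)

cache.seq_rm(1)            # keep only the shared first token
stale = cache.add(1, 20)   # rejected: pos_max is still 3

cache.clear()
fresh = [cache.add(pos, tok) for pos, tok in enumerate([1, 20, 21])]
```

In this toy, `stale` is -1 while every entry of `fresh` is 0, which mirrors why a full clear avoids the error that a partial removal leaves behind.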
Changes:
- Add _has_swa property via llama_model_n_swa() > 0
- reset() now calls llama_memory_clear() unconditionally
- generate() bypasses prefix-match optimization for SWA models,
  forcing full state reset (same path as recurrent models)

1 parent 1cb8b9f commit 939fa72
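
The three changes listed in the commit message can be sketched roughly as follows. The diff content itself is not reproduced here, so this is a guess at the control flow rather than the committed code: `FakeBackend`, `SketchLlama`, the `is_recurrent` flag, and the `prepare()` helper are hypothetical names, and the backend methods only stand in for `llama_model_n_swa()`, `llama_memory_clear()`, and `kv_cache_seq_rm()`.

```python
class FakeBackend:
    """Hypothetical stand-in for the native library; it only records calls."""

    def __init__(self, n_swa):
        self.n_swa = n_swa
        self.calls = []

    def model_n_swa(self):        # stands in for llama_model_n_swa()
        return self.n_swa

    def memory_clear(self):       # stands in for llama_memory_clear()
        self.calls.append("memory_clear")

    def seq_rm(self, start):      # stands in for kv_cache_seq_rm()
        self.calls.append(f"seq_rm({start})")
        return True


class SketchLlama:
    def __init__(self, backend, is_recurrent=False):
        self.backend = backend
        self.is_recurrent = is_recurrent
        self.tokens = []          # tokens assumed to be in the cache

    @property
    def _has_swa(self):
        # Change 1: a sliding-window size above zero marks an SWA model.
        return self.backend.model_n_swa() > 0

    def reset(self):
        # Change 2: clear the backend memory unconditionally.
        self.tokens = []
        self.backend.memory_clear()

    def prepare(self, prompt):
        """Hypothetical helper: return the tokens that still need evaluating."""
        # Change 3: SWA models skip prefix reuse, like recurrent models.
        if self.tokens and not (self.is_recurrent or self._has_swa):
            n = 0
            for cached, new in zip(self.tokens, prompt[:-1]):
                if cached != new:
                    break
                n += 1
            if n > 0 and self.backend.seq_rm(n):
                self.tokens = self.tokens[:n] + list(prompt[n:])
                return list(prompt[n:])
        self.reset()
        self.tokens = list(prompt)
        return list(prompt)


swa = SketchLlama(FakeBackend(n_swa=1024))
swa.prepare([1, 10, 11])
swa_todo = swa.prepare([1, 20, 21])      # full reset, whole prompt again

dense = SketchLlama(FakeBackend(n_swa=0))
dense.prepare([1, 10, 11])
dense_todo = dense.prepare([1, 20, 21])  # reuses the shared first token
```

With these stand-ins the SWA instance re-evaluates the whole second prompt after a second `memory_clear`, while the non-SWA instance keeps the shared first token and calls `seq_rm(1)`. The cost of the fix is that SWA models lose prefix reuse between prompts.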
1 file changed
Lines changed: 39 additions & 1 deletion
| Original file lines | Diff lines | Change |
|---|---|---|
| 553-558 | 553-566 | 8 lines added (new 556-563) |
| 580-585 | 588-601 | 8 lines added (new 591-598) |
| 638-643 | 654-663 | 4 lines added (new 657-660) |
| 889-899 | 909-937 | 1 line replaced (old 892, new 912); 18 lines added (new 917-934) |