Gemma4 tool calling and RoPE fixes by ncylich · Pull Request #594 · cactus-compute/cactus

ncylich · 2026-04-17T05:30:01Z

Summary

Fix Gemma4 partial RoPE to rotate interleaved halves (pair index i with i + head_dim/2) instead of a contiguous prefix, matching the reference implementation.
Fix tool-calling constraints in Gemma4: make set_tool_constraints / clear_tool_constraints / update_tool_constraints virtual on the base model and forward them through Gemma4MmModel so multimodal instances actually apply constraints.
Fix tool JSON parsing: advance the "function" search cursor past the parsed parameters block so multiple tools in a single payload are no longer dropped or re-parsed.
Fix special-tokens map loading to honor escape sequences via a shared extract_json_string helper (previously rfind("\"") could pull in trailing quoted content).
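
The last item can be illustrated with a minimal sketch of escape-aware string extraction. The real `extract_json_string` signature is not shown in this PR page, so the name `extract_json_string_sketch`, its parameters, and its return convention are assumptions made for illustration only. The point is that the string ends at the first unescaped quote, rather than at whatever `rfind` returns for the last quote on the line.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, NOT the PR's actual extract_json_string.
// Decodes the JSON string literal whose opening quote sits at `open_quote`.
// Returns false if there is no opening quote there or the string never closes.
bool extract_json_string_sketch(const std::string& json,
                                std::size_t open_quote,
                                std::string& out) {
    out.clear();
    if (open_quote >= json.size() || json[open_quote] != '"') return false;
    for (std::size_t i = open_quote + 1; i < json.size(); ++i) {
        const char c = json[i];
        if (c == '"') return true;                 // first unescaped quote closes the string
        if (c != '\\') { out.push_back(c); continue; }
        if (++i >= json.size()) break;             // dangling backslash at end of input
        switch (json[i]) {
            case 'n': out.push_back('\n'); break;
            case 't': out.push_back('\t'); break;
            case 'r': out.push_back('\r'); break;
            case 'b': out.push_back('\b'); break;
            case 'f': out.push_back('\f'); break;
            default:  out.push_back(json[i]); break; // covers \" \\ \/ ; \uXXXX is not decoded here
        }
    }
    return false;                                  // unterminated string
}
```

Given the input `"content": "a\"b", "lstrip": "no"`, starting at the quote that opens the value yields `a"b`, whereas searching backwards for the last quote would also capture the trailing `"lstrip": "no` text.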

Test plan

Build and run existing Gemma4 unit tests
Verify tool-calling end-to-end on a multimodal Gemma4 prompt with multiple tools
Sanity-check RoPE output against reference for a partial-rotation config
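
For the RoPE sanity check, a scalar reference along the following lines could serve as the comparison target. This is a sketch under stated assumptions, not the kernel used by cactus: the function name `partial_rope_reference` is hypothetical, and the frequency convention (`theta^(-2i/head_dim)` for pair `i`) is inferred from the `adjusted_theta` expression in the diff rather than confirmed by the source. It rotates only the first `rot_dim / 2` pairs and pairs element `i` with element `i + head_dim / 2`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference for one head vector at one position.
// Rotated pairs are (i, i + head_dim / 2) for i in [0, rot_dim / 2);
// every other element passes through unchanged.
std::vector<float> partial_rope_reference(const std::vector<float>& x,
                                          std::size_t rot_dim,
                                          std::size_t position,
                                          float theta) {
    const std::size_t head_dim = x.size();
    assert(head_dim % 2 == 0 && rot_dim % 2 == 0 && rot_dim <= head_dim);
    const std::size_t half_dim = head_dim / 2;
    const std::size_t half_rot = rot_dim / 2;

    std::vector<float> out(x);  // start from a copy so pass-through lanes are already set
    for (std::size_t i = 0; i < half_rot; ++i) {
        // Assumed frequency schedule: theta^(-2i / head_dim).
        const float inv_freq = std::pow(
            theta, -2.0f * static_cast<float>(i) / static_cast<float>(head_dim));
        const float angle = static_cast<float>(position) * inv_freq;
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const float a = x[i];
        const float b = x[i + half_dim];
        out[i] = a * c - b * s;             // x * cos + rotate_half(x) * sin, first half
        out[i + half_dim] = b * c + a * s;  // same formula, second half
    }
    return out;
}
```

With `head_dim = 8` and `rot_dim = 4`, indices 0, 1, 4 and 5 are rotated while 2, 3, 6 and 7 pass through; a contiguous-prefix implementation would instead touch indices 0 through 3, which is the mismatch this PR addresses.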

Copilot

Pull request overview

This PR fixes several Gemma4 correctness issues affecting rotary embeddings and tool calling, ensuring behavior matches reference implementations and that multimodal Gemma4 instances correctly apply tool constraints.

Changes:

Correct Gemma4 partial RoPE to rotate interleaved halves rather than a contiguous prefix.
Make tool-constraint APIs virtual on Model and forward them through Gemma4MmModel.
Fix tool JSON parsing cursor advancement so multiple tools in one payload are handled.
Improve special-tokens map parsing by honoring JSON escape sequences via extract_json_string.
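
The virtual-forwarding change can be pictured with a small stand-alone sketch. The types `ModelSketch` and `MultimodalSketch` below are hypothetical stand-ins, and the parameter list of `set_tool_constraints` is an assumption, since the real signatures are not shown here. What the sketch does show is why the methods need to be virtual: a caller holding a base reference should reach the wrapper's override, which in turn hands the constraints to the language model that actually decodes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the base model; the real class has many more members.
struct ModelSketch {
    virtual ~ModelSketch() = default;
    virtual void set_tool_constraints(const std::vector<std::string>& names) { tools_ = names; }
    virtual void clear_tool_constraints() { tools_.clear(); }
    const std::vector<std::string>& active_tools() const { return tools_; }
private:
    std::vector<std::string> tools_;
};

// Stand-in for a multimodal wrapper that owns the decoding language model.
struct MultimodalSketch : ModelSketch {
    explicit MultimodalSketch(std::unique_ptr<ModelSketch> lm)
        : language_model_(std::move(lm)) {
        assert(language_model_ != nullptr);
    }
    // Forward to the inner model instead of storing constraints on the wrapper.
    void set_tool_constraints(const std::vector<std::string>& names) override {
        language_model_->set_tool_constraints(names);
    }
    void clear_tool_constraints() override {
        language_model_->clear_tool_constraints();
    }
    const ModelSketch& language_model() const { return *language_model_; }
private:
    std::unique_ptr<ModelSketch> language_model_;
};
```

Without `virtual` on the base methods, a call through a `ModelSketch&` would update only the wrapper's own state and the inner model would decode unconstrained, which matches the symptom described in the summary.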

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cactus/models/gemma4/model_gemma4_mm.cpp | Forwards tool-constraint methods to the underlying language model so multimodal decoding is constrained. |
| cactus/models/gemma4/model_gemma4.h | Declares `Gemma4MmModel` overrides for tool-constraint APIs. |
| cactus/models/gemma4/model_gemma4.cpp | Updates partial RoPE slicing/rotation to use interleaved-halves semantics. |
| cactus/ffi/cactus_utils.h | Fixes tool JSON parsing to continue searching after the parsed `parameters` block. |
| cactus/engine/engine_tokenizer.cpp | Uses `extract_json_string` to correctly parse escaped special-token strings. |
| cactus/engine/engine.h | Adds `extract_json_string`, makes tool-constraint APIs virtual, and extends `ToolCallConstrainer` API surface. |

Comments suppressed due to low confidence (1)

cactus/ffi/cactus_utils.h:910

parse_tools_json still has unchecked find(...) + 1 patterns (e.g., size_t name_start = json.find('"', name_pos + 6) + 1;). If find returns npos, the + 1 wraps to 0 and the subsequent substr calls can parse unrelated parts of the payload. Since tools_json is passed in from the FFI, please add npos checks for each delimiter search (and ideally bound searches to the current tool object) so malformed inputs fail gracefully instead of producing corrupted tool specs.

        size_t next_search = pos + 1;
        
        size_t name_pos = json.find("\"name\"", pos);
        if (name_pos != std::string::npos) {
            size_t name_start = json.find('"', name_pos + 6) + 1;
            size_t name_end = json.find('"', name_start);
            tool.name = json.substr(name_start, name_end - name_start);
        }
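
One way to address the unchecked `find(...) + 1` pattern is to route every delimiter search through a helper that checks for `npos` and refuses to read past a caller-supplied limit. The sketch below is illustrative only: `find_quoted_value` and its parameters are hypothetical names, not code from cactus, and it deliberately ignores escape sequences and nested objects.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: finds `"key"` at or after `from`, then the quoted value
// that follows its colon, without ever reading at or beyond `limit`.
// On success, stores the value and sets `next` to one past the closing quote.
// On any missing delimiter it returns false and leaves `value` and `next` untouched.
bool find_quoted_value(const std::string& json,
                       const std::string& key,
                       std::size_t from,
                       std::size_t limit,
                       std::string& value,
                       std::size_t& next) {
    const std::string quoted_key = "\"" + key + "\"";

    const std::size_t key_pos = json.find(quoted_key, from);
    if (key_pos == std::string::npos || key_pos >= limit) return false;

    const std::size_t colon = json.find(':', key_pos + quoted_key.size());
    if (colon == std::string::npos || colon >= limit) return false;

    const std::size_t open = json.find('"', colon + 1);
    if (open == std::string::npos || open >= limit) return false;

    const std::size_t close = json.find('"', open + 1);
    if (close == std::string::npos || close >= limit) return false;

    value = json.substr(open + 1, close - open - 1);
    next = close + 1;  // lets the caller resume after the parsed value
    return true;
}
```

Passing the end offset of the current tool object as `limit` keeps a malformed entry from borrowing a `"name"` that belongs to the next tool, and the `next` output gives the caller a cursor that always moves forward.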

Copilot · 2026-04-17T05:34:51Z

    if (rot_dim < head_dim) {
+        size_t half_dim = head_dim / 2;
+        size_t half_rot = rot_dim / 2;
+        size_t pass_len = half_dim - half_rot;
        float adjusted_theta = std::pow(rope_freq, static_cast<float>(rot_dim) / static_cast<float>(head_dim));


apply_partial_rope can pass an odd or zero rot_dim into gb->rope(...) (since rot_dim is computed via float->size_t truncation and isn't clamped/aligned). The underlying RoPE kernel assumes an even head_dim and will leave the last element(s) uninitialized when head_dim is odd; for rot_dim==0 it will also build zero-length slices/concats unnecessarily. Consider clamping rot_dim to [0, head_dim], forcing it to be even (e.g., rot_dim &= ~1u), and early-returning when rot_dim==0 or rot_dim>=head_dim (similar to the partial-rope handling in model_qwen3p5.cpp).
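
The clamping suggested above might look like the following sketch. Everything here is hypothetical: the helper names, the `partial_rotary_factor` parameter, and the three-way `RopeMode` split are assumptions for illustration, since the PR page does not show how `rot_dim` is actually derived.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: derive a rotated width from a fractional factor,
// clamp it to [0, head_dim], and round down to an even value so that
// every rotated element has a partner.
std::size_t sanitize_rot_dim(std::size_t head_dim, float partial_rotary_factor) {
    const float clamped = std::min(1.0f, std::max(0.0f, partial_rotary_factor));
    std::size_t rot_dim =
        static_cast<std::size_t>(clamped * static_cast<float>(head_dim));
    rot_dim = std::min(rot_dim, head_dim);
    // A size_t-wide mask is used here so the high bits are preserved on 64-bit targets.
    rot_dim &= ~static_cast<std::size_t>(1);
    return rot_dim;
}

enum class RopeMode { Skip, Partial, Full };

// Decide up front which path to take, so the partial path never sees
// zero-length slices or a width that covers the whole head.
RopeMode choose_rope_mode(std::size_t rot_dim, std::size_t head_dim) {
    if (rot_dim == 0) return RopeMode::Skip;
    if (rot_dim >= head_dim) return RopeMode::Full;
    return RopeMode::Partial;
}
```

For example, a factor of 0.5 on a head of width 10 truncates to 5 and is then rounded down to 4, so the partial path always receives an even, non-zero width strictly smaller than `head_dim`.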

Copilot · 2026-04-17T05:34:52Z

    void compute_bias();
    void tokenize_grammar_elements();
    void add_tokens_for_string(const std::string& str, std::unordered_set<uint32_t>& token_set);
+    void add_tokens_containing(char needle, std::unordered_set<uint32_t>& token_set);


ToolCallConstrainer declares add_tokens_containing(...) here, but there is no corresponding definition in the codebase (searching the repository only finds this declaration). If this is intended to be used, it needs an implementation (or should be removed until it’s needed) to avoid leaving dead/incomplete API surface in engine.h.

Suggested change

void add_tokens_containing(char needle, std::unordered_set<uint32_t>& token_set);

Copilot AI review requested due to automatic review settings April 17, 2026 05:30

Copilot started reviewing on behalf of ncylich April 17, 2026 05:30

Copilot AI reviewed Apr 17, 2026

ncylich added 4 commits April 16, 2026 22:34

fixing tool calling parsing, tokenization, and tool forcing

aaca62f

Signed-off-by: Noah Cylich <noahcylich@gmail.com>

fix tool formatting

5adb546

Signed-off-by: Noah Cylich <noahcylich@gmail.com>

Revert "fix tool formatting"

22c88cd

This reverts commit 4ab77bb. Signed-off-by: Noah Cylich <noahcylich@gmail.com>

fixed RoPE implementation

565d40e

Signed-off-by: Noah Cylich <noahcylich@gmail.com>

ncylich force-pushed the gemma-fc-fix branch from dd18b09 to 565d40e April 17, 2026 05:35

HenryNdubuaku merged commit 34df072 into main Apr 17, 2026
3 of 6 checks passed
