Gemma4 tool calling and RoPE fixes #594

Merged: HenryNdubuaku merged 4 commits into main from gemma-fc-fix on Apr 17, 2026

ncylich (Collaborator) commented Apr 17, 2026

Summary

  • Fix Gemma4 partial RoPE to rotate interleaved halves (pair index i with i + head_dim/2) instead of a contiguous prefix, matching the reference implementation.
  • Fix tool-calling constraints in Gemma4: make set_tool_constraints / clear_tool_constraints / update_tool_constraints virtual on the base model and forward them through Gemma4MmModel so multimodal instances actually apply constraints.
  • Fix tool JSON parsing: advance the "function" search cursor past the parsed parameters block so multiple tools in a single payload are no longer dropped or re-parsed.
  • Fix special-tokens map loading to honor escape sequences via a shared extract_json_string helper (previously rfind("\"") could pull in trailing quoted content).
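
To illustrate the escape-handling point in the last bullet, here is a minimal sketch of scanning forward for the first unescaped quote instead of using rfind. The real extract_json_string signature is not shown in this thread, so the name extract_json_string_sketch, its parameters, and its return convention are all hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the PR's extract_json_string (real signature unknown).
// `open_quote` is the index of the opening '"'. On success, writes the unescaped
// contents to `out` and returns true; returns false if the literal is malformed.
bool extract_json_string_sketch(const std::string& json,
                                std::size_t open_quote,
                                std::string& out) {
    out.clear();
    if (open_quote >= json.size() || json[open_quote] != '"') return false;
    for (std::size_t i = open_quote + 1; i < json.size(); ++i) {
        const char c = json[i];
        if (c == '"') return true;        // first unescaped quote ends the literal
        if (c != '\\') { out += c; continue; }
        if (++i >= json.size()) break;    // dangling backslash: unterminated
        switch (json[i]) {
            case 'n': out += '\n'; break;
            case 't': out += '\t'; break;
            case 'r': out += '\r'; break;
            case 'b': out += '\b'; break;
            case 'f': out += '\f'; break;
            default:  out += json[i];     // covers \" \\ \/ ; \uXXXX is not decoded here
        }
    }
    return false;                         // ran off the end without a closing quote
}
```

With an input such as {"a": "x\"y", "b": "z"}, a backwards search for the last quote lands after "z", whereas the forward scan stops at the end of the first value.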

Test plan

  • Build and run existing Gemma4 unit tests
  • Verify tool-calling end-to-end on a multimodal Gemma4 prompt with multiple tools
  • Sanity-check RoPE output against reference for a partial-rotation config
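
For the RoPE sanity check, a scalar reference along the following lines could be compared against the graph output. This is a sketch under assumptions the PR does not confirm: frequencies are taken as base^(-2i/head_dim), only the first rot_dim/2 pairs are rotated, and partial_rope_ref is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reference sketch of partial RoPE on a single head vector `x` (length head_dim).
// Assumed, not taken from the PR: the frequency schedule and the choice of which
// pairs rotate. Index i is paired with i + head_dim/2, as the summary describes.
std::vector<float> partial_rope_ref(const std::vector<float>& x,
                                    std::size_t rot_dim,
                                    std::size_t position,
                                    float base) {
    const std::size_t head_dim = x.size();
    const std::size_t half_dim = head_dim / 2;
    const std::size_t half_rot = rot_dim / 2;
    std::vector<float> out = x;  // entries outside the rotated pairs pass through
    for (std::size_t i = 0; i < half_rot && i < half_dim; ++i) {
        const float freq = std::pow(base, -2.0f * static_cast<float>(i) /
                                              static_cast<float>(head_dim));
        const float angle = static_cast<float>(position) * freq;
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const float a = x[i];             // element in the first half
        const float b = x[i + half_dim];  // its partner in the second half
        out[i] = a * c - b * s;
        out[i + half_dim] = a * s + b * c;
    }
    return out;
}
```

Two cheap properties to check: position 0 must return the input unchanged, and each rotated pair must keep its Euclidean norm.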

Copilot AI review requested due to automatic review settings April 17, 2026 05:30
Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes several Gemma4 correctness issues affecting rotary embeddings and tool calling, ensuring behavior matches reference implementations and that multimodal Gemma4 instances correctly apply tool constraints.

Changes:

  • Correct Gemma4 partial RoPE to rotate interleaved halves rather than a contiguous prefix.
  • Make tool-constraint APIs virtual on Model and forward them through Gemma4MmModel.
  • Fix tool JSON parsing cursor advancement so multiple tools in one payload are handled.
  • Improve special-tokens map parsing by honoring JSON escape sequences via extract_json_string.
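
The virtual-forwarding change can be pictured with a stripped-down sketch. ModelSketch and MmModelSketch are hypothetical stand-ins; the real Model and Gemma4MmModel signatures are not shown in this thread, and the constraint payload here is reduced to a list of tool names for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the base model (hypothetical, not the real class).
struct ModelSketch {
    virtual ~ModelSketch() = default;
    virtual void set_tool_constraints(const std::vector<std::string>& tool_names) {
        tool_names_ = tool_names;
    }
    virtual void clear_tool_constraints() { tool_names_.clear(); }
    const std::vector<std::string>& active_tools() const { return tool_names_; }

private:
    std::vector<std::string> tool_names_;
};

// The multimodal wrapper decodes through an inner language model, so it
// forwards constraint calls instead of storing them on itself.
struct MmModelSketch : ModelSketch {
    explicit MmModelSketch(std::unique_ptr<ModelSketch> lm)
        : language_model_(std::move(lm)) {}

    void set_tool_constraints(const std::vector<std::string>& tool_names) override {
        language_model_->set_tool_constraints(tool_names);
    }
    void clear_tool_constraints() override {
        language_model_->clear_tool_constraints();
    }
    const ModelSketch& language_model() const { return *language_model_; }

private:
    std::unique_ptr<ModelSketch> language_model_;
};
```

Without virtual on the base methods, a call through a base-class reference would update only the wrapper's own state and the inner decoder would never see the constraints.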

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cactus/models/gemma4/model_gemma4_mm.cpp | Forwards tool-constraint methods to the underlying language model so multimodal decoding is constrained. |
| cactus/models/gemma4/model_gemma4.h | Declares Gemma4MmModel overrides for tool-constraint APIs. |
| cactus/models/gemma4/model_gemma4.cpp | Updates partial RoPE slicing/rotation to use interleaved-halves semantics. |
| cactus/ffi/cactus_utils.h | Fixes tool JSON parsing to continue searching after the parsed parameters block. |
| cactus/engine/engine_tokenizer.cpp | Uses extract_json_string to correctly parse escaped special-token strings. |
| cactus/engine/engine.h | Adds extract_json_string, makes tool-constraint APIs virtual, and extends ToolCallConstrainer API surface. |

Comments suppressed due to low confidence (1)

cactus/ffi/cactus_utils.h:910

  • parse_tools_json still has unchecked find(...) + 1 patterns (e.g., size_t name_start = json.find('"', name_pos + 6) + 1;). If find returns npos, the + 1 wraps to 0 and the subsequent substr calls can parse unrelated parts of the payload. Since tools_json is passed in from the FFI, please add npos checks for each delimiter search (and ideally bound searches to the current tool object) so malformed inputs fail gracefully instead of producing corrupted tool specs.
        size_t next_search = pos + 1;
        
        size_t name_pos = json.find("\"name\"", pos);
        if (name_pos != std::string::npos) {
            size_t name_start = json.find('"', name_pos + 6) + 1;
            size_t name_end = json.find('"', name_start);
            tool.name = json.substr(name_start, name_end - name_start);
        }
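
One way to address the unchecked find(...) + 1 pattern is to route every delimiter search through a helper that bails out on npos. The sketch below is an assumption about how that could look, not code from the PR; find_quoted_value is a hypothetical name, and it does not handle escaped quotes or bound the search to the current tool object.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: locate `key` at or after `from`, then copy the next
// quoted string that follows it into `out`. Returns false, leaving `out`
// empty, if the key or either quote is missing.
bool find_quoted_value(const std::string& json,
                       const std::string& key,
                       std::size_t from,
                       std::string& out) {
    out.clear();
    const std::size_t key_pos = json.find(key, from);
    if (key_pos == std::string::npos) return false;
    const std::size_t open = json.find('"', key_pos + key.size());
    if (open == std::string::npos) return false;
    const std::size_t close = json.find('"', open + 1);
    if (close == std::string::npos) return false;
    out = json.substr(open + 1, close - open - 1);
    return true;
}
```

A caller would then skip or reject the tool entry when the helper returns false, rather than building a spec from whatever substr happened to return.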

Comment on lines 229 to 233
if (rot_dim < head_dim) {
    size_t half_dim = head_dim / 2;
    size_t half_rot = rot_dim / 2;
    size_t pass_len = half_dim - half_rot;
    float adjusted_theta = std::pow(rope_freq, static_cast<float>(rot_dim) / static_cast<float>(head_dim));

Copilot AI Apr 17, 2026

apply_partial_rope can pass an odd or zero rot_dim into gb->rope(...), because rot_dim is computed via float-to-size_t truncation and is neither clamped nor aligned. The underlying RoPE kernel assumes an even head_dim and will leave the last element(s) uninitialized when head_dim is odd; for rot_dim == 0 it will also build zero-length slices/concats unnecessarily. Consider clamping rot_dim to [0, head_dim], forcing it to be even (e.g., rot_dim &= ~size_t{1}, since a ~1u mask is only as wide as unsigned int), and early-returning when rot_dim == 0 or rot_dim >= head_dim (similar to the partial-rope handling in model_qwen3p5.cpp).
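
The clamping suggested in this comment could be sketched as follows. Both helper names are hypothetical, and how the caller handles the two early-return cases (skip rotation versus take the full-RoPE path) is an assumption rather than something the PR specifies.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: clamp rot_dim to [0, head_dim] and round down to an
// even value. The mask is built from std::size_t so no high bits are cleared.
std::size_t sanitize_rot_dim(std::size_t rot_dim, std::size_t head_dim) {
    rot_dim = std::min(rot_dim, head_dim);
    return rot_dim & ~static_cast<std::size_t>(1);
}

// True only when a strict subset of the head is rotated. For rot_dim == 0 the
// caller would presumably skip rotation; for rot_dim >= head_dim it would use
// the ordinary full-RoPE path (both are assumptions of this sketch).
bool needs_partial_rope(std::size_t rot_dim, std::size_t head_dim) {
    return rot_dim > 0 && rot_dim < head_dim;
}
```

Calling sanitize_rot_dim before needs_partial_rope means a truncated value such as 1 collapses to 0 and takes the early-return path instead of reaching the kernel.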

Comment thread cactus/engine/engine.h
void compute_bias();
void tokenize_grammar_elements();
void add_tokens_for_string(const std::string& str, std::unordered_set<uint32_t>& token_set);
void add_tokens_containing(char needle, std::unordered_set<uint32_t>& token_set);

Copilot AI Apr 17, 2026

ToolCallConstrainer declares add_tokens_containing(...) here, but there is no corresponding definition in the codebase (searching the repository only finds this declaration). If this is intended to be used, it needs an implementation (or should be removed until it’s needed) to avoid leaving dead/incomplete API surface in engine.h.

Suggested change (remove this declaration):
void add_tokens_containing(char needle, std::unordered_set<uint32_t>& token_set);

ncylich added 4 commits April 16, 2026 22:34
Signed-off-by: Noah Cylich <noahcylich@gmail.com>
Signed-off-by: Noah Cylich <noahcylich@gmail.com>
This reverts commit 4ab77bb.

Signed-off-by: Noah Cylich <noahcylich@gmail.com>
Signed-off-by: Noah Cylich <noahcylich@gmail.com>
@HenryNdubuaku HenryNdubuaku merged commit 34df072 into main Apr 17, 2026
3 of 6 checks passed