Conversation
Pull request overview
Updates LFM2 tool-call parsing and token-constraint logic to better handle tool argument formatting and enforce required parameters during generation.
Changes:
- Skip adding tool specs with empty names when parsing tools JSON.
- Improve LFM2 tool-call argument parsing to support nested values (arrays/objects) and emit non-string JSON values (booleans/null).
- Extend LFM2 token constraints to track the current function and the argument keys seen so far, and block the closing `)` until required parameters are present.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| cactus/ffi/cactus_utils.h | Tightens tool JSON parsing and refines LFM2 tool-call argument extraction/JSON construction. |
| cactus/engine/engine_constraints.cpp | Adds LFM2 constraint state for required args and introduces prefix-token collection for `(`. |
| cactus/engine/engine.h | Adds new LFM2 tracking members and declares the new prefix-token helper. |
```cpp
while (!args_str.empty() && (args_str.back() == ')' || args_str.back() == ']')) {
    args_str.pop_back();
    args_str = trim_lfm2_slice(args_str, 0, args_str.size());
}
args_str = trim_lfm2_slice(args_str, 0, args_str.size());
```
The trailing-character stripping in `append_lfm2_call` will incorrectly remove a closing `]` that belongs to the last argument value (e.g., `foo(items=[1,2])` becomes `items=[1,2` after popping `)` then `]`), producing invalid JSON arguments. Instead of popping `]`/`)` blindly, slice the arguments using the position of the last `)` in `trimmed_entry` (or only strip an extra `]` if it appears after the closing `)`), so bracketed values remain intact.
```cpp
void ToolCallConstrainer::add_tokens_for_prefix_string(const std::string& prefix, std::unordered_set<uint32_t>& token_set) {
    if (!tokenizer_) return;
    const uint32_t vocab = tokenizer_->get_vocab_size();
    for (uint32_t t = 0; t < vocab; t++) {
        if (tokenizer_->decode({t}).rfind(prefix, 0) == 0) {
            token_set.insert(t);
        }
    }
}
```
`add_tokens_for_prefix_string` decodes every token in the vocabulary on each call (O(vocab) decode operations). This can be a noticeable latency hit when initializing/updating tool constraints. Consider caching decoded token strings once (similar to `needle_token_strings_`), or building this set once per tokenizer instance and reusing it across initializations.
```cpp
std::string lfm_args_buffer_;
std::unordered_set<std::string> lfm_seen_arg_keys_;
std::unordered_map<std::string, std::vector<std::string>> lfm_required_params_;
std::unordered_map<std::string, std::vector<std::string>> lfm_all_params_;
```
`lfm_all_params_` is populated in `ToolCallConstrainer::init()` but is never read anywhere in the codebase. If it's not needed for upcoming constraints logic, it should be removed to avoid carrying unused state; otherwise, consider using it (e.g., to bias/validate allowed parameter keys) so it has a clear purpose.
Suggested change:
```diff
-std::unordered_map<std::string, std::vector<std::string>> lfm_all_params_;
+[[maybe_unused]] std::unordered_map<std::string, std::vector<std::string>> lfm_all_params_;
```