[Bug] xgrammar GrammarMatcher crashes with invalid token id when using JSON Schema mode (Qwen3-0.6B) #807
Description
Thanks for the great work on web-llm!
When using response_format with type: "json_object" (both with and without a schema constraint), the xgrammar GrammarMatcher crashes with an invalid token id. The issue is reproducible with the Qwen3-0.6B-q4f16_1-MLC model.
The crash occurs in two scenarios:
- JSON Schema mode (type: "json_object" + schema): crashes frequently with complex input.
- JSON Mode (type: "json_object" without schema): also reproducible, especially with longer or more descriptive prompts.
Chinese input triggers the crash most often: prompts like "发光塑料青蛙玩具,适合6岁以下小朋友,安全无毒" ("Glow-in-the-dark plastic frog toy, suitable for children under 6, safe and non-toxic") have a high crash rate. English input can also trigger the same crash, at a lower rate. This suggests the issue is not strictly language-specific, but related to how xgrammar handles certain token sequences under grammar-guided decoding.
Error Message
[FATAL] xgrammar/cpp/matcher.cc:273: Check failed: (token_id >= 0 && token_id < tokenizer_info_.GetVocabSize()) is false: Invalid token id -1447643207 for GrammarMatcher
Aborted()
Error getting AI response: RuntimeError: Aborted(). Build with -sASSERTIONS for more info.
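For readers unfamiliar with the failing check, the condition in the log can be restated as a small range test. This is a sketch in JavaScript, not the C++ from matcher.cc: the function name `isValidTokenId` is hypothetical, and the vocabulary size is a placeholder because the real value is not known here. It also prints the same 32-bit pattern read as an unsigned integer, which may help when comparing against values seen elsewhere.

```javascript
// Hypothetical restatement of the check quoted in the log line:
//   token_id >= 0 && token_id < vocab_size
function isValidTokenId(tokenId, vocabSize) {
  return Number.isInteger(tokenId) && tokenId >= 0 && tokenId < vocabSize;
}

// Placeholder vocabulary size for illustration only; not taken from the model.
const assumedVocabSize = 150000;
// Value copied from the error message.
const reportedTokenId = -1447643207;

console.log(isValidTokenId(reportedTokenId, assumedVocabSize)); // false

// The same 32 bits interpreted as an unsigned integer.
const asUnsigned = reportedTokenId >>> 0;
console.log(asUnsigned); // 2847324089
```

Either reading is far outside any plausible vocabulary, so the value is unlikely to be a real token that is merely off by a small amount.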
Environment
- @mlc-ai/web-llm version: 0.2.80
- Model: Qwen3-0.6B-q4f16_1-MLC
- Browser: Chrome (latest)
- OS: macOS
Steps to Reproduce
- Load the Qwen3-0.6B-q4f16_1-MLC model using web-llm.
- Call chat.completions.create with the following configuration:
const schema = JSON.stringify({
  type: "object",
  properties: {
    title: { type: "string" },
    selling_points: { type: "array", items: { type: "string" } },
    price_range: { type: "string" }
  },
  required: ["title", "selling_points", "price_range"]
});
const response = await engine.chat.completions.create({
  messages: [
    { role: "system", content: "You are a helpful assistant. Output valid JSON only." },
    // User prompt, kept in Chinese because it triggers the crash most often. In English:
    // 'Generate a marketing strategy for the product named "Glow-in-the-dark plastic frog toy,
    // suitable for children under 6, safe and non-toxic"; output JSON containing the
    // title, selling_points, price_range fields.'
    { role: "user", content: '为名为"发光塑料青蛙玩具,适合6岁以下小朋友,安全无毒"的产品生成营销策略,输出 JSON 包含 title, selling_points, price_range 字段。' }
  ],
  stream: false,
  response_format: {
    type: "json_object",
    schema: schema,
  },
  extra_body: {
    enable_thinking: false,
  }
});
- The xgrammar matcher crashes with the error above. JSON Schema mode crashes more frequently, but JSON Mode is also affected.
- English input such as "Glow-in-the-dark plastic frog toy, suitable for children under 6, safe and non-toxic" can also trigger the same crash, though less frequently.
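Since the report describes the crash in terms of relative frequency, a small harness can turn "more frequently" into a number. The sketch below is an assumption-laden helper, not part of web-llm: `measureCrashRate` and `attempt` are hypothetical names, and `attempt` is expected to wrap one full request, including whatever engine setup is needed after a previous abort.

```javascript
// Runs `attempt` `trials` times and counts how many calls throw.
// `attempt` is a hypothetical async function wrapping one complete request.
async function measureCrashRate(attempt, trials) {
  let crashes = 0;
  const messages = [];
  for (let i = 0; i < trials; i++) {
    try {
      await attempt(i);
    } catch (err) {
      crashes += 1;
      // Keep the message so different failure modes can be told apart.
      messages.push(String(err && err.message ? err.message : err));
    }
  }
  const rate = trials > 0 ? crashes / trials : 0;
  return { trials, crashes, rate, messages };
}
```

Running it once with the Chinese prompt and once with the English prompt, with the same number of trials, would give two rates that can be compared directly.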
Expected Behavior
The model should return a valid JSON response, or gracefully return an error instead of crashing the WASM runtime.
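Until the matcher itself is fixed, an application can approximate the "graceful error" behavior at the call site. The following is a minimal sketch of that idea, not a fix for the underlying bug: `safeJsonCompletion` and `createCompletion` are hypothetical names, `createCompletion` is assumed to resolve to the raw JSON text of the reply, and the required keys are taken from the schema in the reproduction.

```javascript
// Keys the schema in the reproduction marks as required.
const REQUIRED_KEYS = ["title", "selling_points", "price_range"];

// `createCompletion` is a hypothetical async function that performs the
// request and resolves to the raw JSON text of the model's reply.
async function safeJsonCompletion(createCompletion) {
  let text;
  try {
    text = await createCompletion();
  } catch (err) {
    // Covers the "RuntimeError: Aborted()" case from this report.
    // Assumption: the engine likely has to be reloaded before the next call.
    return { ok: false, reason: "runtime_error", detail: String(err) };
  }

  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return { ok: false, reason: "invalid_json", detail: String(err) };
  }

  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return { ok: false, reason: "not_an_object", detail: typeof parsed };
  }

  const missing = REQUIRED_KEYS.filter(
    (key) => !Object.prototype.hasOwnProperty.call(parsed, key)
  );
  if (missing.length > 0) {
    return { ok: false, reason: "missing_keys", detail: missing.join(", ") };
  }
  return { ok: true, value: parsed };
}
```

Returning a tagged result instead of rethrowing keeps the caller's control flow the same for a runtime abort and for a malformed reply, which is roughly the behavior requested above.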
Actual Behavior
The xgrammar GrammarMatcher receives an invalid token id (a large negative value, -1447643207 in the log above, which looks like an integer overflow), the range check in matcher.cc fails, and the entire WASM runtime aborts with RuntimeError: Aborted().
Additional Context
- Both JSON Schema mode and JSON Mode are affected.
- Chinese input triggers the crash more frequently, but English input is also affected at a lower rate.
- It appears to be related to how xgrammar's grammar-guided decoding handles certain token sequences that conflict with the JSON grammar constraints.
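For context, grammar-guided decoding in general works by masking out tokens the grammar rejects and then choosing among the rest. The toy sketch below illustrates only that general technique with a greedy pick. It is not xgrammar's implementation and does not claim to identify the cause of this crash. All names are hypothetical. It does show one defensive pattern: validating the chosen id before handing it to anything that assumes a valid token.

```javascript
// Toy greedy pick under a grammar mask.
// `logits[i]` is the score of token i; `allowed[i]` says whether the
// grammar accepts token i as the next token.
function pickMaskedToken(logits, allowed) {
  let best = -1;
  for (let i = 0; i < logits.length; i++) {
    if (!allowed[i]) continue;
    if (best === -1 || logits[i] > logits[best]) best = i;
  }
  return best; // -1 means the mask rejected every token
}

// Wraps the pick with an explicit range check so that an out-of-range id
// surfaces as a catchable error instead of reaching later stages.
function nextTokenOrError(logits, allowed) {
  const tokenId = pickMaskedToken(logits, allowed);
  if (tokenId < 0 || tokenId >= logits.length) {
    throw new RangeError(
      `no valid token under the current grammar mask (got ${tokenId})`
    );
  }
  return tokenId;
}

console.log(nextTokenOrError([0.1, 0.9, 0.5], [true, false, true])); // 2
```

Token 1 has the highest score but is masked out, so the pick falls to token 2. With an all-false mask the wrapper throws a RangeError rather than returning -1.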