[Bug] xgrammar GrammarMatcher crashes with invalid token id when using JSON Schema mode (Qwen3-0.6B) #807

@wopelo

Thanks for the great work on web-llm!

Description

When using response_format with type: "json_object" (both with and without a schema constraint), the xgrammar GrammarMatcher crashes with an invalid token id. The issue is reproducible with the Qwen3-0.6B-q4f16_1-MLC model.

The crash occurs in two scenarios:

  • JSON Schema mode (type: "json_object" + schema): crashes frequently with complex input.
  • JSON Mode (type: "json_object" without schema): also reproducible, especially with longer or more descriptive prompts.

Chinese input triggers the crash most frequently: prompts like "发光塑料青蛙玩具,适合6岁以下小朋友,安全无毒" have a high crash rate. English input can also trigger the same crash, at a lower rate. This suggests the issue is not strictly language-specific, but rather related to how xgrammar handles certain token sequences under grammar-guided decoding.

Error Message

[FATAL] xgrammar/cpp/matcher.cc:273: Check failed: (token_id >= 0 && token_id < tokenizer_info_.GetVocabSize()) is false: Invalid token id -1447643207 for GrammarMatcher
Aborted()

Error getting AI response: RuntimeError: Aborted(). Build with -sASSERTIONS for more info.
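
For reference, the failed check is a plain range test on the token id. The sketch below mirrors that condition in JavaScript and shows that the reported value fails it for any vocabulary size. The vocabulary size used here is an assumed, illustrative number and not something taken from the log; the helper name is hypothetical as well.

```javascript
// Mirrors the condition printed in the error message:
//   token_id >= 0 && token_id < vocab_size
// `isValidTokenId` is a hypothetical helper, not part of web-llm or xgrammar.
function isValidTokenId(tokenId, vocabSize) {
  return Number.isInteger(tokenId) && tokenId >= 0 && tokenId < vocabSize;
}

// Assumed value for illustration only; read the real size from the
// model's tokenizer configuration.
const ASSUMED_VOCAB_SIZE = 151936;

// Token id copied from the error message.
const reportedTokenId = -1447643207;

console.log(isValidTokenId(reportedTokenId, ASSUMED_VOCAB_SIZE)); // false

// The same 32 bits reinterpreted as an unsigned integer.
console.log(reportedTokenId >>> 0); // 2847324089
```

The value is negative, so it fails the lower bound regardless of the model's actual vocabulary size. This only illustrates the check; it does not identify where the bad id comes from.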

Environment

  • @mlc-ai/web-llm version: 0.2.80
  • Model: Qwen3-0.6B-q4f16_1-MLC
  • Browser: Chrome (latest)
  • OS: macOS

Steps to Reproduce

  1. Load the Qwen3-0.6B-q4f16_1-MLC model using web-llm.
  2. Call chat.completions.create with the following configuration:
const schema = JSON.stringify({
  type: "object",
  properties: {
    title: { type: "string" },
    selling_points: { type: "array", items: { type: "string" } },
    price_range: { type: "string" }
  },
  required: ["title", "selling_points", "price_range"]
});

const response = await engine.chat.completions.create({
  messages: [
    { role: "system", content: "You are a helpful assistant. Output valid JSON only." },
    { role: "user", content: '为名为"发光塑料青蛙玩具,适合6岁以下小朋友,安全无毒"的产品生成营销策略,输出 JSON 包含 title, selling_points, price_range 字段。' }
  ],
  stream: false,
  response_format: {
    type: "json_object",
    schema: schema,
  },
  extra_body: {
    enable_thinking: false,
  }
});
  3. The xgrammar matcher crashes with the error above. JSON Schema mode crashes more frequently, but JSON Mode is also affected.
  4. English input such as "Glow-in-the-dark plastic frog toy, suitable for children under 6, safe and non-toxic" can also trigger the same crash, though less frequently.

Expected Behavior

The model should return a valid JSON response, or gracefully return an error instead of crashing the WASM runtime.
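
From the caller's side, "gracefully return an error" could look roughly like the sketch below. It is a workaround sketch, not a fix, and rests on two assumptions: that the abort surfaces as a catchable exception (the "Error getting AI response: RuntimeError: Aborted()" log line suggests it does), and that an aborted WASM runtime should not be reused. `getEngine` is a hypothetical, application-supplied function that returns an engine and builds a fresh one when called with `true`; `createJsonCompletion` is likewise a made-up name.

```javascript
// Sketch only. `getEngine(forceFresh)` is a hypothetical function supplied by
// the application; it is not a web-llm API.
async function createJsonCompletion(getEngine, request, maxAttempts = 2) {
  let lastError = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Assumption: after an abort the old engine is unusable, so every retry
    // asks for a freshly created engine.
    const engine = await getEngine(attempt > 1);
    try {
      const response = await engine.chat.completions.create(request);
      const text = response.choices[0].message.content;
      // JSON.parse throws on malformed output, which is handled like a crash.
      return { ok: true, value: JSON.parse(text), attempts: attempt };
    } catch (err) {
      lastError = err;
    }
  }
  // Report the failure as data instead of letting the exception propagate.
  return { ok: false, error: String(lastError), attempts: maxAttempts };
}
```

A caller would pass the same request object used in the reproduction steps and branch on `result.ok`. Recreating the engine means reloading the model, so this only hides the crash at a noticeable latency cost.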

Actual Behavior

The xgrammar GrammarMatcher receives an out-of-range token id (a large negative value, possibly the result of an integer overflow), fails its range check, and aborts. The entire WASM runtime then crashes with RuntimeError: Aborted().

Additional Context

  • Both JSON Schema mode and JSON Mode are affected.
  • Chinese input triggers the crash more frequently, but English input is also affected at a lower rate.
  • It appears to be related to how xgrammar's grammar-guided decoding handles certain token sequences that conflict with the JSON grammar constraints.
