cdoern (Collaborator) commented Nov 26, 2025

What does this PR do?

Add support for reasoning fields in OpenAI-compatible chat completion
messages to enable compatibility with vLLM reasoning parsers.

Changes:

  • Add reasoning_content and reasoning fields to OpenAIAssistantMessageParam
  • Add reasoning field to OpenAIChoiceDelta (reasoning_content already existed)
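
As a rough illustration of the shape of this change, here is a minimal sketch using stdlib dataclasses as a stand-in. The class and field names come from the list above; the other fields, the defaults, and the use of dataclasses are assumptions for illustration and not the actual definitions in the codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpenAIAssistantMessageParam:
    """Stand-in for the real model; only the fields relevant here are shown."""

    role: str = "assistant"
    content: Optional[str] = None
    # Added by this PR: both spellings are optional and default to None.
    reasoning_content: Optional[str] = None
    reasoning: Optional[str] = None


@dataclass
class OpenAIChoiceDelta:
    """Stand-in for the streaming delta; other fields are omitted."""

    content: Optional[str] = None
    reasoning_content: Optional[str] = None  # already existed before this PR
    reasoning: Optional[str] = None  # added by this PR


# A message that only sets the newer field leaves the older one as None.
msg = OpenAIAssistantMessageParam(content="100", reasoning="25 * 4 = 100")
```

Keeping both fields optional means messages from engines that emit neither field still construct without errors.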

Both field names are supported for maximum compatibility:

  • reasoning_content: Used by vLLM ≤ v0.8.4
  • reasoning: New field name in vLLM ≥ v0.9.x
    (based on release notes)

The vLLM documentation recommends migrating to the shorter reasoning field
name, while vLLM maintains backward compatibility with reasoning_content.
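
For callers that need to work with either field name, one possible approach is a small helper that prefers the newer key and falls back to the older one. This is a sketch only: `extract_reasoning` is a hypothetical name, and it assumes the message has already been parsed into a plain dict.

```python
from typing import Any, Mapping, Optional


def extract_reasoning(message: Mapping[str, Any]) -> Optional[str]:
    """Return the reasoning text from a message dict, or None if absent.

    Prefers the newer `reasoning` key and falls back to `reasoning_content`.
    """
    for key in ("reasoning", "reasoning_content"):
        value = message.get(key)
        if value:  # skips missing keys, None, and empty strings
            return value
    return None
```

Checking `reasoning` first follows the migration direction described above; swapping the tuple order would prefer the older name instead.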

These fields let reasoning models return their chain of thought
alongside the final answer, which helps with transparency and
debugging.

Test Plan

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
    --reasoning-parser deepseek_r1
  
llama stack run starter

curl http://localhost:8321/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "vllm/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
      "messages": [
        {"role": "user", "content": "What is 25 * 4?"}
      ]
    }'

{"id":"chatcmpl-9df9d2a5f849bbe0","choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"\n\nTo calculate \\(25 \\times 4\\), follow these easy steps:\n\n1. **Understand the Multiplication:**\n   \n   \\(25 \\times 4\\) means you are adding the number 25 four times.\n   \n   \\[\n   25 + 25 + 25 + 25 = 100\n   \\]\n\n2. **Break Down the Multiplication:**\n   \n   - Multiply 25 by 2:\n     \\[\n     25 \\times 2 = 50\n     \\]\n   - Then multiply the result by 2:\n     \\[\n     50 \\times 2 = 100\n     \\]\n\n3. **Final Answer:**\n   \n   \\[\n   \\boxed{100}\n   \\]","refusal":null,"role":"assistant","annotations":null,"audio":null,"function_call":null,"tool_calls":null,"reasoning":"To solve 25 multiplied by 4, I start by recognizing that 25 is a quarter of 100. Multiplying 25 by 4 is the same as finding a quarter of 100 multiplied by 4, which equals 100.\n\nNext, I can consider that 25 multiplied by 4 is also equal to 25 multiplied by 2, which is 50, and then multiplied by 2 again, resulting in 100.\n\nAlternatively, I can use the distributive property by breaking down 4 into 3 and 1, so 25 multiplied by 3 is 75, and 25 multiplied by 1 is 25. Adding these together gives 100.\n\nBoth methods lead to the same result, confirming that 25 multiplied by 4 equals 100.\n","reasoning_content":"To solve 25 multiplied by 4, I start by recognizing that 25 is a quarter of 100. Multiplying 25 by 4 is the same as finding a quarter of 100 multiplied by 4, which equals 100.\n\nNext, I can consider that 25 multiplied by 4 is also equal to 25 multiplied by 2, which is 50, and then multiplied by 2 again, resulting in 100.\n\nAlternatively, I can use the distributive property by breaking down 4 into 3 and 1, so 25 multiplied by 3 is 75, and 25 multiplied by 1 is 25. Adding these together gives 100.\n\nBoth methods lead to the same result, confirming that 25 multiplied by 4 equals 100.\n"},"stop_reason":null,"token_ids":null}],"created":1764187386,"model":"vllm/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B","object":"chat.completion","service_tier":null,"system_fingerprint":null,"usage":{"completion_tokens":356,"prompt_tokens":14,"total_tokens":370,"completion_tokens_details":null,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null,"metrics":[{"trace_id":"9ed1630440cb1e923916455d98663df3","span_id":"a27b4cb4208ed39f","timestamp":"2025-11-26T20:03:19.089063Z","attributes":{"model_id":"vllm/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B","provider_id":"vllm"},"type":"metric","metric":"prompt_tokens","value":14,"unit":"tokens"},{"trace_id":"9ed1630440cb1e923916455d98663df3","span_id":"a27b4cb4208ed39f","timestamp":"2025-11-26T20:03:19.089072Z","attributes":{"model_id":"vllm/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B","provider_id":"vllm"},"type":"metric","metric":"completion_tokens","value":356,"unit":"tokens"},{"trace_id":"9ed1630440cb1e923916455d98663df3","span_id":"a27b4cb4208ed39f","timestamp":"2025-11-26T20:03:19.089075Z","attributes":{"model_id":"vllm/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B","provider_id":"vllm"},"type":"metric","metric":"total_tokens","value":370,"unit":"tokens"}]}
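
The response above carries the same text in both reasoning and reasoning_content. A check along the following lines could automate that observation. It is a sketch only: `check_reasoning_fields` is a hypothetical helper, and `SAMPLE` is a heavily abbreviated stand-in for the real payload, not actual server output.

```python
import json

# Abbreviated stand-in for the response above; the reasoning text is shortened.
SAMPLE = json.dumps(
    {
        "choices": [
            {
                "index": 0,
                "finish_reason": "stop",
                "message": {
                    "role": "assistant",
                    "content": "100",
                    "reasoning": "25 * 4 = 100",
                    "reasoning_content": "25 * 4 = 100",
                },
            }
        ]
    }
)


def check_reasoning_fields(raw: str) -> str:
    """Parse a chat completion body and confirm both reasoning fields agree."""
    message = json.loads(raw)["choices"][0]["message"]
    reasoning = message.get("reasoning")
    legacy = message.get("reasoning_content")
    if not reasoning or not legacy:
        raise ValueError("expected both reasoning fields to be populated")
    if reasoning != legacy:
        raise ValueError("reasoning and reasoning_content differ")
    return reasoning


print(check_reasoning_fields(SAMPLE))
```

To run it against a live server, the output of the curl command in the test plan could be passed in place of `SAMPLE`.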

meta-cla bot added the CLA Signed label Nov 26, 2025
References:
- vLLM Reasoning Outputs: https://docs.vllm.ai/en/stable/features/reasoning_outputs/
- vLLM Issue #12468: vllm-project/vllm#12468

Signed-off-by: Charlie Doern <[email protected]>
cdoern (Collaborator, Author) commented Nov 26, 2025

This one can wait until CI is back; I want to make sure this doesn't break engines that don't support the field.
