[Bugfix] Validate custom logits processor xargs for online serving #27560
base: main
Conversation
Signed-off-by: Isotr0py <[email protected]>
Code Review
This pull request addresses a bug in the DeepSeek-OCR chat template fallback and allows a list of integers as vllm_xargs for ChatCompletionRequest to accommodate custom logits processors. The code changes modify vllm/entrypoints/openai/protocol.py to allow lists in vllm_xargs and vllm/transformers_utils/configs/deepseek_vl2.py to correctly set the model_type for DeepSeek OCR models. No style guide was provided, and the changes appear consistent with common Python practices.
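The widened `vllm_xargs` typing can be illustrated with a small, self-contained sketch (this is not the actual `protocol.py` code; the function name and rules below are illustrative):

```python
# Hypothetical sketch (not the actual vLLM code): validating that values in a
# request's vllm_xargs dict are scalars or lists of ints, mirroring the widened
# typing this PR introduces for args like "whitelist_token_ids".
def validate_vllm_xargs(xargs: dict) -> dict:
    allowed_scalars = (str, int, float)
    for key, value in xargs.items():
        if isinstance(value, allowed_scalars):
            continue
        if isinstance(value, list) and all(isinstance(i, int) for i in value):
            continue
        raise ValueError(
            f"vllm_xargs[{key!r}] must be str/int/float or a list of ints"
        )
    return xargs

# A list of token ids is now accepted:
ok = validate_vllm_xargs({"whitelist_token_ids": [128821, 128822]})
print(ok)
```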
```python
# update model_type for OCR model
if "DeepseekOCRForCausalLM" in (
    self.architectures or kwargs.get("architectures", [])
):
    self.model_type = "deepseek_ocr"
```
The condition `if "DeepseekOCRForCausalLM" in (self.architectures or kwargs.get("architectures", []))` could potentially be simplified to checking `"DeepseekOCRForCausalLM" in self.architectures + kwargs.get("architectures", [])` directly. This avoids the `or` operator and might improve readability.
However, it's crucial to ensure that this change doesn't alter the behavior of the code, especially if `self.architectures` or `kwargs.get("architectures", [])` could be `None` rather than a list. Adding a check to ensure that both are lists before concatenation would mitigate this risk.
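The distinction the reviewer raises can be seen in a tiny standalone example: `a or b` safely falls back when the left operand is `None`, while `a + b` raises a `TypeError`:

```python
# Tiny illustration of the reviewer's point: `or` skips a None left operand,
# while `+` (concatenation) raises TypeError when either side is None.
config_architectures = None            # e.g. the config has no architectures set
kw_architectures = ["DeepseekOCRForCausalLM"]

# `or` falls back to the right operand when the left is falsy (None or []):
merged = config_architectures or kw_architectures
print("DeepseekOCRForCausalLM" in merged)  # True

# Naive concatenation fails here:
try:
    bad = config_architectures + kw_architectures
except TypeError as e:
    print("concatenation failed:", type(e).__name__)
```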
Current:

```python
# update model_type for OCR model
if "DeepseekOCRForCausalLM" in (
    self.architectures or kwargs.get("architectures", [])
):
    self.model_type = "deepseek_ocr"
```

Suggested:

```python
# update model_type for OCR model
architectures = self.architectures if self.architectures else []
architectures += kwargs.get("architectures", []) if kwargs.get("architectures", []) else []
if "DeepseekOCRForCausalLM" in architectures:
    self.model_type = "deepseek_ocr"
```
We are going to patch a security vulnerability before merging this
Documentation preview: https://vllm--27560.org.readthedocs.build/en/27560/
cc @afeldman-nm I think we discussed this at one point
Thanks @Isotr0py! I would also like to review this
This pull request has merge conflicts that must be resolved before it can be merged.
Thanks for this @Isotr0py! It will be good to have argument validation.
docs/features/custom_logitsprocs.md (Outdated)
While request-level logits processors are explicitly *not* supported in the vLLM engine, vLLM *does* provide a convenient process to wrap an existing `Callable` request-level logits processor and create a batch-level logits processor that is compatible with vLLM. The `Callable` must conform to the type annotation above; if your request-level logits processor has a different interface, then in order to wrap it, you may need to modify it or implement an additional wrapper layer to comply with the interface specification above.

```diff
- You can wrap the request-level logits processor by subclassing `AdapterLogitsProcessor` as shown in the example below (in this example, `DummyPerReqLogitsProcessor` is a stand-in for your request-level logits processor which needs to be wrapped.) Override `AdapterLogitsProcessor.is_argmax_invariant(self)` to accurately reflect whether your request-level logits processor may impact which token has the highest-value logit. Override `AdapterLogitsProcessor.new_req_logits_processor(self,params)` to create a new request-level logits processor instance from a `SamplingParams` instance:
+ You can wrap the request-level logits processor by subclassing `AdapterLogitsProcessor` as shown in the example below (in this example, `DummyPerReqLogitsProcessor` is a stand-in for your request-level logits processor which needs to be wrapped.):
```
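The wrapping pattern the docs describe can be sketched standalone. A minimal stand-in base class is defined below so the shape of the two overrides is runnable on its own; the real `AdapterLogitsProcessor` is provided by vLLM, and its import path, constructor, and `SamplingParams` handling may differ:

```python
# Self-contained sketch of the adapter pattern described above. The stand-in
# base class and the dict-based "params"/"logits" below are illustrative only.
from typing import Callable, Optional


class AdapterLogitsProcessor:  # stand-in for vLLM's class
    def is_argmax_invariant(self) -> bool:
        raise NotImplementedError

    def new_req_logits_processor(self, params) -> Optional[Callable]:
        raise NotImplementedError


class DummyPerReqLogitsProcessor:
    """Stand-in request-level logits processor: keeps only one token's logit."""

    def __init__(self, target_token: int):
        self.target_token = target_token

    def __call__(self, token_ids, logits):
        # Keep only the target token's logit (illustrative behavior).
        return {self.target_token: logits.get(self.target_token, 0.0)}


class WrappedPerReqLogitsProcessor(AdapterLogitsProcessor):
    def is_argmax_invariant(self) -> bool:
        # Masking logits can change which token has the highest logit.
        return False

    def new_req_logits_processor(self, params) -> Optional[Callable]:
        # `params` stands in for vLLM's SamplingParams; here a custom
        # argument is read from a plain dict for illustration.
        target = (params or {}).get("target_token")
        return DummyPerReqLogitsProcessor(target) if target is not None else None


proc = WrappedPerReqLogitsProcessor()
req_lp = proc.new_req_logits_processor({"target_token": 7})
print(req_lp([1, 2], {7: 2.5, 3: 1.0}))
```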
It is great that you updated the documentation to reflect these changes.
Correct me if wrong, but it appears like this PR right now is only updating the documentation for the special case of adapting a request-level logits processor.
I think we will probably also want to update the documentation for
- The "Programming Model" section of the logits processors design docs: https://docs.vllm.ai/en/latest/design/logits_processors.html#logits-processor-programming-model
- Other sections of the custom logits processor docs, specifically "Creating a custom logits processor" (https://docs.vllm.ai/en/latest/features/custom_logitsprocs.html#creating-a-custom-logits-processor), "Passing Custom Argument to a Custom Logits Processor" (https://docs.vllm.ai/en/latest/features/custom_logitsprocs.html#passing-custom-argument-to-a-custom-logits-processor), and "Example custom logits processor implementation" (https://docs.vllm.ai/en/latest/features/custom_logitsprocs.html#example-custom-logits-processor-implementation)
Sorry for the delayed update! I was a bit occupied last week. 😅 This PR should be ready for a further review now!
💡 Codex Review
Here are some automated review suggestions for this pull request.
Thanks, LGTM
/gemini review (just in case)
Code Review
This pull request introduces a robust validation mechanism for custom logits processor arguments, especially for online serving, by adding a validate_params method to the LogitsProcessor interface. This is a great improvement to prevent invalid parameters from causing issues downstream. The changes to allow list values in vllm_xargs and the fix for the deepseek-ocr model type are also valuable. The documentation and tests have been updated accordingly. I have one suggestion to refactor some duplicated code for better maintainability.
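The `validate_params` hook the review describes can be sketched in isolation. The class and method shapes below are illustrative, not vLLM's exact `LogitsProcessor` interface:

```python
# Hedged sketch of the validation hook described above: a validate_params
# classmethod on a logits-processor that rejects malformed extra args at
# request time, instead of failing deep inside sampling. Names are illustrative.
class WhitelistLogitsProcessor:
    ARG_KEY = "whitelist_token_ids"

    @classmethod
    def validate_params(cls, sampling_params: dict) -> None:
        """Raise ValueError early if the request's extra args are invalid."""
        xargs = sampling_params.get("extra_args") or {}
        value = xargs.get(cls.ARG_KEY)
        if value is None:
            return  # the argument is optional
        if not (isinstance(value, list)
                and all(isinstance(t, int) and t >= 0 for t in value)):
            raise ValueError(
                f"{cls.ARG_KEY} must be a list of non-negative ints, got {value!r}"
            )


# A valid request passes silently; an invalid one fails fast:
WhitelistLogitsProcessor.validate_params(
    {"extra_args": {"whitelist_token_ids": [128821, 128822]}})
try:
    WhitelistLogitsProcessor.validate_params(
        {"extra_args": {"whitelist_token_ids": "128821"}})
except ValueError as e:
    print("rejected:", e)
```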
Purpose
- The `model_type` of the deepseek-ocr config resolves to `deepseek_vl_v2`, so the default deepseek-ocr chat template won't be used by default (see `vllm/vllm/transformers_utils/chat_templates/registry.py`, line 36 in a806c14).
- Allow list values in `vllm_xargs` for `ChatCompletionRequest`; otherwise `"whitelist_token_ids": [128821, 128822]` parameters for deepseek-ocr's custom logits processor are not allowed.
- Validate `vllm_xargs` to avoid unexpected behavior from custom logits processors.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
- `supported_models.md` and `examples` for a new model.