Conversation

@Isotr0py Isotr0py commented Oct 27, 2025

Purpose

  • The default model_type for deepseek-ocr is deepseek_vl_v2, so the dedicated deepseek-ocr chat template is never selected by default:
    "deepseek_ocr": CHAT_TEMPLATES_DIR / "template_deepseek_ocr.jinja",
  • Also allow list of int values in vllm_xargs for ChatCompletionRequest; otherwise the "whitelist_token_ids": [128821, 128822] parameter for deepseek-ocr's custom logits processor is rejected:

```
openai.BadRequestError: Error code: 400 - {'error': {'message': "[{'type': 'string_type', 'loc': ('body', 'vllm_xargs', 'whitelist_token_ids', 'str'), 'msg': 'Input should be a valid string', 'input': [128821, 128822]}, {'type': 'int_type', 'loc': ('body', 'vllm_xargs', 'whitelist_token_ids', 'int'), 'msg': 'Input should be a valid integer', 'input': [128821, 128822]}, {'type': 'float_type', 'loc': ('body', 'vllm_xargs', 'whitelist_token_ids', 'float'), 'msg': 'Input should be a valid number', 'input': [128821, 128822]}]", 'type': 'Bad Request', 'param': None, 'code': 400}}
```

  • Also validate the request's vllm_xargs to avoid unexpected behavior from custom logits processors.
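To illustrate the widened value type, here is a minimal, self-contained sketch (not the actual protocol.py change; `validate_xarg_value` is a hypothetical helper) of accepting scalars or a homogeneous list of ints for a vllm_xargs entry:

```python
from typing import Union

# Hypothetical stand-in for the widened vllm_xargs value type:
# scalars were already allowed; the PR additionally permits lists of ints.
XArgValue = Union[str, int, float, list[int]]

def validate_xarg_value(value: object) -> XArgValue:
    """Accept str/int/float scalars or a homogeneous list of ints."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, (str, int, float)) and not isinstance(value, bool):
        return value
    if isinstance(value, list) and all(
        isinstance(v, int) and not isinstance(v, bool) for v in value
    ):
        return value
    raise ValueError(f"unsupported vllm_xargs value: {value!r}")

# The request body that triggered the 400 above now passes validation:
validate_xarg_value([128821, 128822])
```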

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Isotr0py <[email protected]>
@mergify mergify bot added deepseek Related to DeepSeek models frontend labels Oct 27, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a bug in the DeepSeek-OCR chat template fallback and allows a list of integers as vllm_xargs for ChatCompletionRequest to accommodate custom logits processors. The code changes modify vllm/entrypoints/openai/protocol.py to allow lists in vllm_xargs and vllm/transformers_utils/configs/deepseek_vl2.py to correctly set the model_type for DeepSeek OCR models. No style guide was provided, and the changes appear consistent with common Python practices.

Comment on lines +222 to +226

```python
# update model_type for OCR model
if "DeepseekOCRForCausalLM" in (
    self.architectures or kwargs.get("architectures", [])
):
    self.model_type = "deepseek_ocr"
```

Severity: high

The condition if "DeepseekOCRForCausalLM" in (self.architectures or kwargs.get("architectures", [])) could potentially be simplified by directly checking if "DeepseekOCRForCausalLM" in self.architectures + kwargs.get("architectures", []). This avoids the need for the or operator and might improve readability.

However, it's crucial to ensure that this change doesn't alter the behavior of the code, especially if self.architectures or kwargs.get("architectures", []) could be None or not a list. Adding a check to ensure that these are lists before concatenation could mitigate this risk.

Suggested change

```diff
-# update model_type for OCR model
-if "DeepseekOCRForCausalLM" in (
-    self.architectures or kwargs.get("architectures", [])
-):
-    self.model_type = "deepseek_ocr"
+# update model_type for OCR model
+architectures = self.architectures if self.architectures else []
+architectures += kwargs.get("architectures", []) if kwargs.get("architectures", []) else []
+if "DeepseekOCRForCausalLM" in architectures:
+    self.model_type = "deepseek_ocr"
```
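The reviewer's caveat about `None` can be demonstrated with a small stand-alone sketch (`resolve_architectures` is a hypothetical helper, not code from the PR): the original `or` expression short-circuits safely when the attribute is `None`, whereas naive `+` concatenation would raise a TypeError.

```python
def resolve_architectures(self_architectures, kwargs):
    """Mimic the condition under review: `or` falls back to the kwargs
    list when the attribute is None or an empty list, so no TypeError
    can occur. Naive `self_architectures + kwargs.get(...)` would raise
    TypeError when the attribute is None."""
    return self_architectures or kwargs.get("architectures", [])

# `or` handles a missing attribute gracefully:
archs = resolve_architectures(None, {"architectures": ["DeepseekOCRForCausalLM"]})
```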

@Isotr0py Isotr0py enabled auto-merge (squash) October 27, 2025 08:36
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 27, 2025
Signed-off-by: Isotr0py <[email protected]>
@Isotr0py Isotr0py disabled auto-merge October 27, 2025 08:54
Signed-off-by: Isotr0py <[email protected]>

@DarkLight1337 DarkLight1337 left a comment


We are going to patch a security vulnerability before merging this


mergify bot commented Oct 27, 2025

Documentation preview: https://vllm--27560.org.readthedocs.build/en/27560/

@mergify mergify bot added documentation Improvements or additions to documentation v1 labels Oct 27, 2025
@Isotr0py Isotr0py changed the title [Bugfix] Fix DeepSeek-OCR chat template fallback and custom logits processor xargs for online serving [Bugfix] Validate custom logits processor xargs for online serving Oct 27, 2025
@Isotr0py Isotr0py marked this pull request as draft October 27, 2025 12:20

njhill commented Oct 27, 2025

cc @afeldman-nm I think we discussed this at one point


njhill commented Oct 27, 2025

Thanks @Isotr0py! I would also like to review this


mergify bot commented Oct 28, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Isotr0py.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 28, 2025

@afeldman-nm afeldman-nm left a comment


Thanks for this @Isotr0py ! It will be good to have argument validation

While request-level logits processors are explicitly *not* supported in the vLLM engine, vLLM *does* provide a convenient process to wrap an existing `Callable` request-level logits processor and create a batch-level logits processor that is compatible with vLLM. The `Callable` must conform to the type annotation above; if your request-level logits processor has a different interface, then in order to wrap it, you may need to modify it or implement an additional wrapper layer to comply with the interface specification above.

You can wrap the request-level logits processor by subclassing `AdapterLogitsProcessor` as shown in the example below (in this example, `DummyPerReqLogitsProcessor` is a stand-in for your request-level logits processor which needs to be wrapped). Override `AdapterLogitsProcessor.is_argmax_invariant(self)` to accurately reflect whether your request-level logits processor may impact which token has the highest-value logit. Override `AdapterLogitsProcessor.new_req_logits_processor(self, params)` to create a new request-level logits processor instance from a `SamplingParams` instance:
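The wrapping pattern the quoted docs describe can be sketched with stand-in classes. To be clear about assumptions: `AdapterLogitsProcessor` below is a simplified illustration, not vLLM's real class, and the per-request callable interface (`(output_token_ids, logits) -> logits`) is assumed for the sake of the example.

```python
from typing import Callable, Optional

# Assumed request-level interface: a callable applied per request.
RequestLogitsProcessor = Callable[[list[int], list[float]], list[float]]

class DummyPerReqLogitsProcessor:
    """Toy request-level processor: masks every logit except one allowed id."""
    def __init__(self, allowed_token_id: int):
        self.allowed_token_id = allowed_token_id

    def __call__(self, output_token_ids: list[int], logits: list[float]) -> list[float]:
        return [
            x if i == self.allowed_token_id else float("-inf")
            for i, x in enumerate(logits)
        ]

class AdapterLogitsProcessor:
    """Simplified stand-in for the adapter: wraps per-request callables
    into a processor applied across a batch."""
    def is_argmax_invariant(self) -> bool:
        raise NotImplementedError

    def new_req_logits_processor(self, params: dict) -> Optional[RequestLogitsProcessor]:
        raise NotImplementedError

    def apply_to_batch(self, batch):
        # batch: list of (params, output_token_ids, logits) per request
        out = []
        for params, out_ids, logits in batch:
            proc = self.new_req_logits_processor(params)
            out.append(proc(out_ids, logits) if proc else logits)
        return out

class WrappedDummy(AdapterLogitsProcessor):
    def is_argmax_invariant(self) -> bool:
        return False  # masking logits can change which token is argmax

    def new_req_logits_processor(self, params: dict):
        # Skip requests that don't ask for this processor.
        tid = params.get("allowed_token_id")
        return DummyPerReqLogitsProcessor(tid) if tid is not None else None
```

The key design point from the docs survives even in this toy form: the adapter decides per request whether to build a processor, so requests without the relevant argument pass through untouched.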

It is great that you updated the documentation to reflect these changes.

Correct me if I'm wrong, but it appears that this PR currently only updates the documentation for the special case of adapting a request-level logits processor.

I think we will probably also want to update the documentation for

  1. The "Programming Model" section of the logits processors design docs https://docs.vllm.ai/en/latest/design/logits_processors.html#logits-processor-programming-model

  2. Other sections of the custom logits processor design docs, specifically "Creating a custom logits processor" (https://docs.vllm.ai/en/latest/features/custom_logitsprocs.html#creating-a-custom-logits-processor), "Passing Custom Argument to a Custom Logits Processor" (https://docs.vllm.ai/en/latest/features/custom_logitsprocs.html#passing-custom-argument-to-a-custom-logits-processor), "Example custom logits processor implementation" (https://docs.vllm.ai/en/latest/features/custom_logitsprocs.html#example-custom-logits-processor-implementation)

@mergify mergify bot removed the needs-rebase label Oct 29, 2025
@Isotr0py Isotr0py marked this pull request as ready for review November 3, 2025 09:57

Isotr0py commented Nov 3, 2025

Sorry for the delayed update! I was a bit occupied last week. 😅 This PR should be ready for a further review now!


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.



@DarkLight1337 DarkLight1337 left a comment


Thanks, LGTM

@DarkLight1337

/gemini review

Just in case


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a robust validation mechanism for custom logits processor arguments, especially for online serving, by adding a validate_params method to the LogitsProcessor interface. This is a great improvement to prevent invalid parameters from causing issues downstream. The changes to allow list values in vllm_xargs and the fix for the deepseek-ocr model type are also valuable. The documentation and tests have been updated accordingly. I have one suggestion to refactor some duplicated code for better maintainability.
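The `validate_params` mechanism the review summary describes might look roughly like the following. This is a simplified stand-in, not the actual vLLM interface; the class and method names beyond `validate_params` itself (`WhitelistLogitsProcessor`, the `params` dict shape) are assumptions for illustration.

```python
class LogitsProcessor:
    """Simplified stand-in; the real interface lives in vLLM."""
    @classmethod
    def validate_params(cls, params: dict) -> None:
        """Raise ValueError for invalid per-request arguments. Called at
        request time, so bad input fails fast with an HTTP 400 instead of
        surfacing as an engine error later."""

class WhitelistLogitsProcessor(LogitsProcessor):
    @classmethod
    def validate_params(cls, params: dict) -> None:
        ids = params.get("whitelist_token_ids")
        if ids is None:
            return  # argument not supplied; nothing to validate
        if not (isinstance(ids, list) and all(isinstance(i, int) for i in ids)):
            raise ValueError("whitelist_token_ids must be a list of ints")
```

Under this shape, the frontend would invoke each configured processor's `validate_params` against the request's vllm_xargs before the request reaches the engine.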

Signed-off-by: Isotr0py <[email protected]>
@Isotr0py Isotr0py enabled auto-merge (squash) November 4, 2025 04:06