[Misc] Remove deprecated args in v0.10 #21349


Merged
1 change: 0 additions & 1 deletion examples/offline_inference/neuron_speculation.py
@@ -37,7 +37,6 @@ def initialize_llm():
max_num_seqs=4,
max_model_len=2048,
block_size=2048,
use_v2_block_manager=True,
device="neuron",
tensor_parallel_size=32,
)
1 change: 0 additions & 1 deletion tests/neuron/2_core/test_mistral.py
@@ -9,7 +9,6 @@ def test_mistral():
tensor_parallel_size=2,
max_num_seqs=4,
max_model_len=128,
use_v2_block_manager=True,
override_neuron_config={
"sequence_parallel_enabled": False,
"skip_warmup": True
2 changes: 0 additions & 2 deletions tests/neuron/2_core/test_multi_lora.py
@@ -14,7 +14,6 @@ def test_llama_single_lora():
tensor_parallel_size=2,
max_num_seqs=4,
max_model_len=512,
use_v2_block_manager=True,
override_neuron_config={
"sequence_parallel_enabled": False,
"skip_warmup": True,
@@ -57,7 +56,6 @@ def test_llama_multiple_lora():
tensor_parallel_size=2,
max_num_seqs=4,
max_model_len=512,
use_v2_block_manager=True,
override_neuron_config={
"sequence_parallel_enabled":
False,
21 changes: 0 additions & 21 deletions vllm/engine/arg_utils.py
@@ -313,7 +313,6 @@ class EngineArgs:
CacheConfig.prefix_caching_hash_algo
disable_sliding_window: bool = ModelConfig.disable_sliding_window
disable_cascade_attn: bool = ModelConfig.disable_cascade_attn
use_v2_block_manager: bool = True
swap_space: float = CacheConfig.swap_space
cpu_offload_gb: float = CacheConfig.cpu_offload_gb
gpu_memory_utilization: float = CacheConfig.gpu_memory_utilization
@@ -364,7 +363,6 @@ class EngineArgs:
max_prompt_adapter_token: int = \
PromptAdapterConfig.max_prompt_adapter_token

device: Device = DeviceConfig.device
yewentao256 (Collaborator) commented on Jul 22, 2025:

Are we sure this is no longer supported? Upstream callers still pass this parameter, and it now causes an error:

lm_eval   --model vllm   --model_args "pretrained=Qwen/Qwen3-30B-A3B-FP8,max_model_len=32768,enforce_eager=True"   --trust_remote_code   --tasks gsm8k   --num_fewshot 5   --batch_size auto
  File "/home/wentao/.wentao_env/lib/python3.12/site-packages/lm_eval/models/vllm_causallms.py", line 177, in __init__
    self.model = LLM(**self.model_args)
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wentao/vllm-source/vllm/entrypoints/llm.py", line 244, in __init__
    engine_args = EngineArgs(
                  ^^^^^^^^^^^
TypeError: EngineArgs.__init__() got an unexpected keyword argument 'device'
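The failure mode above is ordinary dataclass behavior: once a field is deleted, the generated `__init__` rejects it as a keyword. A minimal sketch with a hypothetical stand-in class (not vLLM's actual `EngineArgs`, which has many more fields):

```python
from dataclasses import dataclass


# Hypothetical stand-in for EngineArgs after the removal. With the
# `device` field gone, the dataclass-generated __init__ rejects it as
# an unexpected keyword argument, exactly as in the traceback above.
@dataclass
class EngineArgsSketch:
    max_model_len: int = 2048
    enforce_eager: bool = False


try:
    EngineArgsSketch(max_model_len=32768, device="cuda")
except TypeError as exc:
    # e.g. "__init__() got an unexpected keyword argument 'device'"
    print(exc)
```

Any caller that forwards user-supplied kwargs verbatim (as lm_eval does with `LLM(**self.model_args)`) hits this at construction time.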

hmellor (Member) commented:

Also, if we are removing this, should we also remove the Device config from config.py?

Contributor (author) commented:

@yewentao256 See #18301 (comment): the device parameter takes no effect even if it is set, so it is a useless parameter. If you want to pin specific devices, use the CUDA_VISIBLE_DEVICES environment variable; for CPU mode, vLLM needs to be recompiled for CPU.
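A short sketch of the suggested alternative, using a hypothetical `select_gpus` helper (not part of vLLM): the environment variable must be set before any CUDA-aware library initializes, since it is read once at initialization.

```python
import os


# Hypothetical helper, for illustration only: restrict which GPUs are
# visible by setting CUDA_VISIBLE_DEVICES before CUDA initializes,
# instead of passing a `device=` argument to the engine.
def select_gpus(indices):
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in indices)
    return os.environ["CUDA_VISIBLE_DEVICES"]


print(select_gpus([0, 1]))  # → 0,1
```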

Contributor (author) commented:

@hmellor Thanks for the reminder. I will take a closer look at how to safely remove DeviceConfig.device; that will need a separate PR.

yewentao256 (Collaborator) commented on Jul 23, 2025:

> @yewentao256 See #18301 (comment): the device parameter takes no effect even if it is set, so it is a useless parameter. If you want to pin specific devices, use the CUDA_VISIBLE_DEVICES environment variable; for CPU mode, vLLM needs to be recompiled for CPU.

Makes sense, thanks!

num_scheduler_steps: int = SchedulerConfig.num_scheduler_steps
multi_step_stream_outputs: bool = SchedulerConfig.multi_step_stream_outputs
ray_workers_use_nsight: bool = ParallelConfig.ray_workers_use_nsight
@@ -745,16 +743,6 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
"--max-prompt-adapter-token",
**prompt_adapter_kwargs["max_prompt_adapter_token"])

# Device arguments
device_kwargs = get_kwargs(DeviceConfig)
device_group = parser.add_argument_group(
title="DeviceConfig",
description=DeviceConfig.__doc__,
)
device_group.add_argument("--device",
**device_kwargs["device"],
deprecated=True)

# Speculative arguments
speculative_group = parser.add_argument_group(
title="SpeculativeConfig",
@@ -856,15 +844,6 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
**vllm_kwargs["additional_config"])

# Other arguments
parser.add_argument('--use-v2-block-manager',
action='store_true',
default=True,
deprecated=True,
help='[DEPRECATED] block manager v1 has been '
'removed and SelfAttnBlockSpaceManager (i.e. '
'block manager v2) is now the default. '
'Setting this flag to True or False'
' has no effect on vLLM behavior.')
parser.add_argument('--disable-log-stats',
action='store_true',
help='Disable logging statistics.')
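Since `--use-v2-block-manager` is removed outright rather than kept as a deprecated no-op, command lines that still pass it will no longer parse. A minimal sketch with a plain `argparse` parser (hypothetical, not vLLM's `FlexibleArgumentParser`):

```python
import argparse

# Sketch of the CLI surface after the removal: --use-v2-block-manager is
# no longer registered, so parse_known_args reports it as unrecognized
# instead of silently accepting it as a no-op.
parser = argparse.ArgumentParser()
parser.add_argument("--disable-log-stats", action="store_true",
                    help="Disable logging statistics.")

args, unknown = parser.parse_known_args(
    ["--disable-log-stats", "--use-v2-block-manager"])
print(unknown)  # → ['--use-v2-block-manager']
```

`parse_args` would exit with "unrecognized arguments" for the same input, which is the desired loud failure for callers still on the removed flag.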