[Bug]: Failed to load model from local s3 instance #23236

@wizche

Description

Your current environment

Docker image vllm-openai, version v0.10.1

πŸ› Describe the bug

Loading a model from a local S3 instance fails on version v0.10.1 with the following error:

INFO 08-19 23:09:31 [__init__.py:241] Automatically detected platform cuda.
(APIServer pid=1) INFO 08-19 23:09:34 [api_server.py:1805] vLLM API server version 0.10.1
(APIServer pid=1) INFO 08-19 23:09:34 [utils.py:326] non-default args: {'model_tag': 's3://ai-prod.models/Qwen2.5-14B-Instruct', 'host': '0.0.0.0', 'enable_auto_tool_choice': True, 'tool_call_parser': 'hermes', 'model': 's3://ai-prod.models/Qwen2.5-14B-Instruct', 'dtype': 'bfloat16', 'served_model_name': ['Qwen/Qwen2.5-14B-Instruct'], 'load_format': 'runai_streamer', 'enable_prefix_caching': False}
(APIServer pid=1) Traceback (most recent call last):
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/utils/hub.py", line 479, in cached_files
(APIServer pid=1) hf_hub_download(
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py", line 106, in _inner_fn
(APIServer pid=1) validate_repo_id(arg_value)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py", line 154, in validate_repo_id
(APIServer pid=1) raise HFValidationError(
(APIServer pid=1) huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 's3://ai-prod.models/Qwen2.5-14B-Instruct'. Use `repo_type` argument if needed.
(APIServer pid=1)
(APIServer pid=1) During handling of the above exception, another exception occurred:
(APIServer pid=1)
(APIServer pid=1) Traceback (most recent call last):
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/configuration_utils.py", line 708, in _get_config_dict
(APIServer pid=1) resolved_config_file = cached_file(
(APIServer pid=1) ^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/utils/hub.py", line 321, in cached_file
(APIServer pid=1) file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/utils/hub.py", line 532, in cached_files
(APIServer pid=1) _get_cache_file_to_return(path_or_repo_id, filename, cache_dir, revision, repo_type)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/utils/hub.py", line 144, in _get_cache_file_to_return
(APIServer pid=1) resolved_file = try_to_load_from_cache(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py", line 106, in _inner_fn
(APIServer pid=1) validate_repo_id(arg_value)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py", line 154, in validate_repo_id
(APIServer pid=1) raise HFValidationError(
(APIServer pid=1) huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 's3://ai-prod.models/Qwen2.5-14B-Instruct'. Use `repo_type` argument if needed.
(APIServer pid=1)
(APIServer pid=1) During handling of the above exception, another exception occurred:
(APIServer pid=1)
(APIServer pid=1) Traceback (most recent call last):
(APIServer pid=1) File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=1) sys.exit(main())
(APIServer pid=1) ^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 54, in main
(APIServer pid=1) args.dispatch_function(args)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 50, in cmd
(APIServer pid=1) uvloop.run(run_server(args))
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 109, in run
(APIServer pid=1) return __asyncio.run(
(APIServer pid=1) ^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=1) return runner.run(main)
(APIServer pid=1) ^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=1) return self._loop.run_until_complete(task)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 61, in wrapper
(APIServer pid=1) return await main
(APIServer pid=1) ^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1850, in run_server
(APIServer pid=1) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1870, in run_server_worker
(APIServer pid=1) async with build_async_engine_client(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1) return await anext(self.gen)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 178, in build_async_engine_client
(APIServer pid=1) async with build_async_engine_client_from_engine_args(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1) return await anext(self.gen)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 204, in build_async_engine_client_from_engine_args
(APIServer pid=1) vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1057, in create_engine_config
(APIServer pid=1) model_config = self.create_model_config()
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 904, in create_model_config
(APIServer pid=1) return ModelConfig(
(APIServer pid=1) ^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 123, in __init__
(APIServer pid=1) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 529, in __post_init__
(APIServer pid=1) self.model, self.tokenizer = maybe_override_with_speculators_target_model( # noqa: E501
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/config.py", line 338, in maybe_override_with_speculators_target_model
(APIServer pid=1) config_dict, _ = PretrainedConfig.get_config_dict(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/configuration_utils.py", line 649, in get_config_dict
(APIServer pid=1) config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/transformers/configuration_utils.py", line 731, in _get_config_dict
(APIServer pid=1) raise OSError(
(APIServer pid=1) OSError: Can't load the configuration of 's3://ai-prod.models/Qwen2.5-14B-Instruct'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 's3://ai-prod.models/Qwen2.5-14B-Instruct' is the correct path to a directory containing a config.json file

This works flawlessly with version v0.9.1 of the docker image (without any change to the startup parameters).
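For context, the traceback shows the s3:// URI being handed straight to huggingface_hub's repo-id validation, which only accepts plain Hub identifiers. The sketch below is a simplified, hypothetical re-implementation of that check (not the library's actual code) to illustrate why the URI is rejected before vLLM's Run:ai streamer path is ever reached:

```python
import re

# Simplified sketch (NOT huggingface_hub's real implementation) of the
# repo-id shape that validate_repo_id enforces: a Hub repo id must look
# like "repo_name" or "namespace/repo_name", so a URI such as
# "s3://ai-prod.models/Qwen2.5-14B-Instruct" fails the check and raises
# HFValidationError in the traceback above.
REPO_ID_RE = re.compile(r"^[\w.\-]+(/[\w.\-]+)?$")

def looks_like_hf_repo_id(model_ref: str) -> bool:
    """Return True if model_ref resembles a valid Hub repo id."""
    return bool(REPO_ID_RE.match(model_ref))

print(looks_like_hf_repo_id("Qwen/Qwen2.5-14B-Instruct"))                 # True
print(looks_like_hf_repo_id("s3://ai-prod.models/Qwen2.5-14B-Instruct"))  # False
```

This suggests the regression is that v0.10.1 now routes the raw model tag through transformers' `PretrainedConfig.get_config_dict` (via `maybe_override_with_speculators_target_model`) before the S3 scheme is handled, whereas v0.9.1 resolved the S3 path first.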

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
