Expose PyTorch profiler configuration to environment variables #21803

Csrayz · 2025-07-29T05:17:49Z

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing a test command.
The test results, such as pasting a comparison of the results before and after, or e2e results.
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

Expose the inference-related profiler configuration in the worker through environment variables.

[FEAT] Expose PyTorch profiler config via environment variables
[DOC] Update profiling.md
The profiler in the V0 worker is NOT changed; related to [RFC]: Deprecating vLLM V0 #18571
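
As a rough illustration of the idea, the sketch below reads boolean flags from the environment and turns them into keyword arguments that `torch.profiler.profile` accepts (`record_shapes`, `profile_memory`, `with_stack`, `with_flops`). It is not the code from this PR: the helper name `profiler_kwargs_from_env`, the accepted truthy strings, and the all-off defaults are assumptions, and only `VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY` is named in this thread; the other variable names are assumed to follow the same pattern.

```python
import os

# Strings treated as "enabled" (an assumption; the real parsing may differ).
_TRUTHY = {"1", "true", "yes", "on"}

# Assumed environment variable names mapped to torch.profiler.profile kwargs.
# Only VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY appears in this thread.
_ENV_TO_KWARG = {
    "VLLM_TORCH_PROFILER_RECORD_SHAPES": "record_shapes",
    "VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY": "profile_memory",
    "VLLM_TORCH_PROFILER_WITH_STACK": "with_stack",
    "VLLM_TORCH_PROFILER_WITH_FLOPS": "with_flops",
}


def profiler_kwargs_from_env(environ=None):
    """Hypothetical helper: build profiler kwargs from environment flags."""
    environ = os.environ if environ is None else environ
    return {
        kwarg: environ.get(name, "0").strip().lower() in _TRUTHY
        for name, kwarg in _ENV_TO_KWARG.items()
    }
```

A worker could then pass the result along, for example `torch.profiler.profile(activities=[...], **profiler_kwargs_from_env())`, so that each option is switched on only when its variable is set.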

Test Plan

Test Result

(Optional) Documentation Update

Update docs/contributing/profiling.md for the new profiler-related environment variables.

gemini-code-assist

Code Review

This pull request exposes several PyTorch profiler configurations through environment variables, which is a great enhancement for debugging and performance analysis. My review focuses on an inconsistency in the naming of one of the new environment variables. The variable for memory profiling is named VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY in the code, but the documentation and consistency with other flags suggest it should be VLLM_TORCH_PROFILER_PROFILE_MEMORY. I've provided suggestions to correct this across the codebase to ensure consistency and prevent user confusion.

vllm/envs.py

vllm/v1/worker/gpu_worker.py

vllm/v1/worker/xpu_worker.py

github-actions · 2025-07-29T05:20:18Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

jikunshang

LGTM! thanks for contributing!

docs/contributing/profiling.md

Csrayz · 2025-07-29T14:57:25Z

Why does accepting a commit suggestion cause the DCO workflow to fail?

DarkLight1337 · 2025-07-29T15:06:12Z

It's because GitHub doesn't sign off the commits

Csrayz · 2025-07-30T01:07:06Z

Is there any way to run those workflows again? It seems the failure was caused by network fluctuations.

DarkLight1337 · 2025-07-30T02:46:39Z

Force merging as the CI failures aren't related

noooop · 2025-07-30T07:41:43Z

@DarkLight1337

Perhaps this PR is the first one where buildkite/ci/entrypoints-test-api-server failed

DarkLight1337 · 2025-07-30T07:43:35Z

I saw these failures yesterday as well

noooop · 2025-07-30T07:46:59Z

Hmm

There were indeed earlier failures

https://buildkite.com/vllm/ci/builds/25184/steps/canvas?jid=019851ec-29c3-4f63-a655-d664b88e80a0

Csrayz requested review from hmellor, WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners July 29, 2025 05:17

mergify bot added the documentation and v1 labels Jul 29, 2025

gemini-code-assist bot reviewed Jul 29, 2025

vllm/envs.py (resolved)

vllm/envs.py (outdated, resolved)

vllm/v1/worker/gpu_worker.py (resolved)

vllm/v1/worker/xpu_worker.py (resolved)

Expose PyTorch profiler configuration to environment variables (ac65e79)

* [FEAT] Expose PyTorch profiler config via environment variables
* [DOC] update profiling.md
* NOT change profiler in v0 worker. related to vllm-project#18571

Signed-off-by: Csrayz <[email protected]>

Csrayz force-pushed the feat_profiler branch from ac4cff3 to ac65e79 Compare July 29, 2025 05:21

[FIX] inconsistency in the naming of one of the new environment variables `VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY` (78d3a44)

Signed-off-by: Csrayz <[email protected]>

Csrayz force-pushed the feat_profiler branch from ebb8325 to 78d3a44 Compare July 29, 2025 05:26

[FIX] fix lint (16a9d8e)

Signed-off-by: Csrayz <[email protected]>

jikunshang approved these changes Jul 29, 2025

jikunshang added the ready label Jul 29, 2025

hmellor reviewed Jul 29, 2025

docs/contributing/profiling.md (outdated, resolved)

Csrayz force-pushed the feat_profiler branch from 143678e to 16a9d8e Compare July 29, 2025 15:02

[DOC] Change the Profiler environment variable descriptions from paragraphs to an unordered list (ba69230)

Signed-off-by: Csrayz <[email protected]>

vllm-bot merged commit b917da4 into vllm-project:main Jul 30, 2025
62 of 65 checks passed
