@Mcirino1 commented Aug 13, 2025

Waiting on latency results

Please direct your PRs to the upstream vllm (https://github.com/vllm-project/vllm.git)

Accepting PRs into the ROCm fork (https://github.com/ROCm/vllm) will require a clear, previously communicated exception

Isotr0py and others added 30 commits February 16, 2025 14:28
Signed-off-by: isotr0py <[email protected]>
* Enabling ROCm CI on MI250 machines:
- correct build target
- correct queue

Signed-off-by: Alexei V. Ivanov <[email protected]>

---------

Signed-off-by: Alexei V. Ivanov <[email protected]>
* Optimization for quantized gemm skinny sizes

* lint fix

* Add support for bf16/fp16

* code cleanup

* code cleanup

* lint fix2

* cleanup

* Moved the logic into tuned gemm to preserve API compatibility

---------

Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
* Removing gfx940 and gfx941 targets. These have been deprecated in favor of gfx942 for MI300X

Signed-off-by: Gregory Shtrasberg <[email protected]>

* Remove from custom kernels as well

---------

Signed-off-by: Gregory Shtrasberg <[email protected]>
* Advance torch commit to be past pytorch/pytorch#144942 to fix tunable ops

* Make sure to use the submodule commit compatible with the main aiter commit
Signed-off-by: Sage Moore <[email protected]>
Signed-off-by: Sage Moore <[email protected]>
Signed-off-by: Sage Moore <[email protected]>
* Using aiter branch that can be built into a whl with PREBUILD_KERNELS=1

* Using fail fast on aiter build to see compilation errors in the log since it fails silently

* Check for build success without installing whl
* Using proposed fix from ROCm/aiter#115

* Build fix
* tuning adjustment for quantized skinny gemm.

* lint fix
gshtras and others added 16 commits June 23, 2025 12:40
* Updated README.md for June 24 Docker release

* Added additional throughput results

* Fixed some throughput results
* Minor changes to command line examples

* README changes and added throughput results

Still waiting on latency

* Added latency results

* Update README.md

* Update README.md
* Update test-pipeline.yaml

Disabling the "Tensorizer Test".

The test is seen to generate exceptions while still reporting as successful. That needs to be verified before re-enabling the test in the production environment.

Signed-off-by: Alexei V. Ivanov <[email protected]>

* Fixing pre-commit complaints.

Signed-off-by: Alexei V. Ivanov <[email protected]>

* .

Signed-off-by: Alexei V. Ivanov <[email protected]>

---------

Signed-off-by: Alexei V. Ivanov <[email protected]>
…symbol exposure (vllm-project#21647)"

This reverts commit 9ba1c88.

Signed-off-by: Gregory Shtrasberg <[email protected]>
cd vllm
git checkout b432b7a285aa0dcb9677380936ffa74931bb6d6f
git checkout 340ea86dfe5955d6f9a9e767d6abab5aacf2c978
docker build -f docker/Dockerfile.rocm -t <your_tag> --build-arg USE_CYTHON=1 .
Collaborator

The --build-arg USE_CYTHON=1 part needs to be removed; this has been the case for a few releases now

Author

Removed it for both build commands, but let me know if we still need it for the other one. Thanks!

Collaborator

The way it is now is correct, thank you
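
For reference, a sketch of the build command as it would read once the flag discussed in this thread is dropped. The Dockerfile path and the <your_tag> placeholder are taken from the snippet under review; nothing else is assumed, and the placeholder must be replaced with a real image tag before running.

```shell
# Same build command as in the reviewed snippet, minus --build-arg USE_CYTHON=1.
# <your_tag> is the original placeholder, not a literal value.
docker build -f docker/Dockerfile.rocm -t <your_tag> .
```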

@Mcirino1 marked this pull request as ready for review August 14, 2025 15:17
shajrawi previously approved these changes Aug 14, 2025
@gshtras dismissed shajrawi’s stale review September 9, 2025 16:42

The merge-base changed after approval.

@gshtras force-pushed the main branch 2 times, most recently from 1d2c43d to eb9d4de September 9, 2025 16:43