Updated README.md for August 12 RC2 throughput results only #631
base: main
Signed-off-by: isotr0py <[email protected]>
Signed-off-by: isotr0py <[email protected]>
Upstream merge 25 02 17
* Enabling ROCm CI on MI250 machines:
  - correct build target
  - correct queue
  Signed-off-by: Alexei V. Ivanov <[email protected]>
---------
Signed-off-by: Alexei V. Ivanov <[email protected]>
* Optimization for quantized gemm skinny sizes
* lint fix
* Add support for bf16/fp16
* code cleanup
* code cleanup
* lint fix2
* cleanup
* Moved the logic into tuned gemm to preserve API compatibility
---------
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
* Removing gfx940 and gfx941 targets. These have been deprecated in favor of gfx942 for MI300X
  Signed-off-by: Gregory Shtrasberg <[email protected]>
* Remove from custom kernels as well
---------
Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: Divakar Verma <[email protected]>
* Advance torch commit to be past pytorch/pytorch#144942 to fix tunable ops
* Make sure to use the submodule commit compatible with the main aiter commit
Signed-off-by: Sage Moore <[email protected]>
Signed-off-by: Sage Moore <[email protected]>
Signed-off-by: Sage Moore <[email protected]>
Upstream merge 25 02 24
* Using aiter branch that can be built into a whl with PREBUILD_KERNELS=1
* Using fail fast on aiter build to see compilation errors in the log since it fails silently
* Check for build success without installing whl
* Using proposed fix from ROCm/aiter#115
* Build fix
* tuning adjustment for quantized skinny gemm.
* lint fix
)" This reverts commit 8294773.
Upstream merge 2025 06 23
Upstream merge 2025 06 25
Upstream merge 2025 06 30
* Updated README.md for June 24 Docker release
* Added additional throughput results
* Fixed some throughput results
* Minor changes to command line examples
* README changes and added throughput results
  Still waiting on latency
* Added latency results
* Update README.md
* Update README.md
* Update test-pipeline.yaml
  Disabling the "Tensorizer Test". The test is seen to generate exceptions while still reporting as successful. That needs to be verified before re-enabling the test in the production environment.
  Signed-off-by: Alexei V. Ivanov <[email protected]>
* Fixing pre-commit complaints.
  Signed-off-by: Alexei V. Ivanov <[email protected]>
* .
  Signed-off-by: Alexei V. Ivanov <[email protected]>
---------
Signed-off-by: Alexei V. Ivanov <[email protected]>
…symbol exposure (vllm-project#21647)" This reverts commit 9ba1c88. Signed-off-by: Gregory Shtrasberg <[email protected]>
Upstream merge 2025 07 29
Waiting on latency results
docs/dev-docker/README.md
Outdated
cd vllm
git checkout b432b7a285aa0dcb9677380936ffa74931bb6d6f
git checkout 340ea86dfe5955d6f9a9e767d6abab5aacf2c978
docker build -f docker/Dockerfile.rocm -t <your_tag> --build-arg USE_CYTHON=1 .
The --build-arg USE_CYTHON=1 part needs to be removed; this has been the case since a few releases back.
Removed it for both build commands, but let me know if we still need it for the other one. Thanks!
The way it is now is correct, thank you
The merge-base changed after approval.
1d2c43d to eb9d4de
Waiting on latency results
Please direct your PRs to the upstream vllm (https://github.com/vllm-project/vllm.git).
Accepting PRs into the ROCm fork (https://github.com/ROCm/vllm) will require a clear, previously communicated exception.