
Conversation

@aleozlx (Collaborator) commented Oct 4, 2025

📌 Description

Add trtllm-gen BF16 MoE support via a new `trtllm_bf16_moe` interface.

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
  • I have installed the hooks with `pre-commit install`.
  • I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

If you are unsure about how to set up `pre-commit`, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (`unittest`, etc.).

`pytest -x -v tests/moe/test_trtllm_gen_fused_moe.py -k All_BF16`

9 passed, 999 skipped

====

`pytest tests/moe/test_trtllm_gen_fused_moe.py`

PENDING: some IMA (illegal memory access) failures are detected in the existing tests.

Reviewer Notes

In the new `trtllm_bf16_moe` interface, I used `*` in the argument list to mark which arguments must be passed by keyword only. This is a common practice for making calls less error prone when a function has a very long argument list: the parameters before `*` are the commonly used ones, while those after it are optional / performance-tuning knobs.
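
A minimal sketch of the convention (the parameter names below are illustrative placeholders, not the actual `trtllm_bf16_moe` signature):

```python
# Illustrative only: these parameter names are hypothetical, not the
# real trtllm_bf16_moe signature.
def trtllm_bf16_moe(
    hidden_states,          # commonly used args: positional or keyword
    gemm1_weights,
    gemm2_weights,
    *,                      # everything after this is keyword-only
    tile_tokens_dim=None,   # optional / perf-tuning knobs
    routing_method_type=0,
):
    return hidden_states    # placeholder body

x, w1, w2 = 1, 2, 3                             # dummy stand-ins for tensors
trtllm_bf16_moe(x, w1, w2, tile_tokens_dim=8)   # OK: knob passed by keyword
# trtllm_bf16_moe(x, w1, w2, 8)                 # TypeError: keyword-only
```

This way a caller can never silently swap two tuning knobs by position; misuse fails loudly with a `TypeError` instead of producing wrong results.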

@aleozlx (Collaborator, Author) commented Oct 4, 2025

IMA (illegal memory access) repro:

`pytest tests/moe/test_trtllm_gen_fused_moe.py::test_moe_quantization_classes[SwiGlu-Shuffled_MajorK-DSLite-NvFP4xNvFP4-1024-1024-1]`

@yzh119 (Collaborator) commented Oct 4, 2025

Hi @aleozlx, can we confirm whether this IMA is a kernel issue or an integration issue?
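
One standard way to narrow this down: force synchronous kernel launches so the failure surfaces at the offending call rather than at a later sync point. A minimal sketch, assuming the repro is driven from a Python script:

```python
import os

# Force synchronous kernel launches so the IMA is reported at the
# offending launch instead of a later, unrelated synchronization point.
# This must be set before CUDA is initialized, i.e. before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # noqa: E402

# ... run the failing MoE call here; torch now raises at the faulty
# launch, which helps attribute the fault to the kernel vs. host-side
# setup (workspace sizing, strides, pointer arithmetic).
```

Running the same repro under `compute-sanitizer --tool memcheck` would additionally report the faulting kernel name and access address.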

jiahanc added a commit that referenced this pull request Nov 8, 2025

## 📌 Description

- Refactor `trtllm_fused_moe_kernel_launcher.cu` to use a class structure for code cleanliness and readability
- Add BF16 MoE, initial PR (#1859) from @aleozlx and @nekorobov
- Add BF16 MoE autotune (a usage sketch follows this commit message)

## 🔍 Related Issues


## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).



## Summary by CodeRabbit

* **New Features**
  * BF16 Mixture-of-Experts (MoE) pathway added with autotuning and public API access.

* **Improvements**
  * Unified BF16/FP8/FP4/FP16 pathways with clearer dtype compatibility checks and corrected operator return semantics.
  * Routing selection now respects token size and input packing, and diagnostics produce more descriptive error messages.

* **Tests**
  * Expanded BF16 test coverage across routing modes, weight layouts, and token sizes.

* **Chores**
  * Updated artifact metadata and checksums.

---------

Signed-off-by: jiahanc <[email protected]>
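
As referenced in the commit description above, here is a hedged sketch of how the BF16 MoE autotune path is typically exercised. The `autotune` context manager lives in FlashInfer's autotuner module; its exact import path, and the shape of the `trtllm_bf16_moe` call itself, are assumptions rather than API details confirmed by this thread:

```python
# Assumed import; FlashInfer's autotuner module exposes an `autotune`
# context manager used by its own MoE tests (path assumed here).
from flashinfer.autotuner import autotune


def run_bf16_moe_once():
    # Placeholder for the actual trtllm_bf16_moe call with real tensors;
    # the full argument list is defined by this PR and elided here.
    pass


# Calls made inside the autotune context profile candidate kernel
# tactics and cache the best configuration per problem shape.
with autotune():
    run_bf16_moe_once()

# Subsequent calls with matching shapes reuse the cached tactic.
run_bf16_moe_once()
```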
