Commit e61e47a
[GPU]qwen3 moe fused compressed (#32536)
### Details:
- Add Qwen3 MoE model support for weight fusion compression
- MoE transformation pipeline:
FuseVectorizedMOE3GEMM->ConvertMOEToMOECompressed->FuseMOECompressed
- ov::intel_gpu::op::MOEFusedCompressed fuses softmax_topk/onehot into
the MoE computation to improve performance
- The prefill stage uses a GEMM kernel to compute each expert's output
one by one.
- The decode stage uses OCL kernels to compute the experts' outputs in
parallel.
- MoE exec graph:
<img width="194" height="432" alt="image"
src="https://github.com/user-attachments/assets/fb5fd9b9-3b56-43cc-a71c-27cd4b9cd0d2"
/>
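The routing step that MOEFusedCompressed folds into the MoE computation can be pictured with a small host-side sketch. This is not the plugin's kernel code: `softmax_topk` and `group_tokens_by_expert` below are hypothetical helper names, and renormalizing the kept weights is an assumption about the model's routing. The first function mimics softmax followed by top-k selection; the second shows the per-expert token grouping that would let a prefill-style path run one GEMM per expert.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// (expert index, routing weight) pairs for one token.
using Routes = std::vector<std::pair<std::size_t, float>>;

// Hypothetical helper, not an OpenVINO API: softmax over the router logits,
// keep the top_k experts, and renormalize the kept weights so they sum to 1
// (the renormalization is an assumption made for this sketch).
Routes softmax_topk(const std::vector<float>& logits, std::size_t top_k) {
    assert(!logits.empty());
    top_k = std::min(top_k, logits.size());

    // Numerically stable softmax: subtract the maximum logit first.
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float& p : probs) p /= sum;

    // Order expert indices so the top_k most probable experts come first.
    std::vector<std::size_t> order(logits.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(top_k),
                      order.end(),
                      [&](std::size_t a, std::size_t b) { return probs[a] > probs[b]; });

    float kept = 0.0f;
    for (std::size_t i = 0; i < top_k; ++i) kept += probs[order[i]];

    Routes routes;
    for (std::size_t i = 0; i < top_k; ++i)
        routes.emplace_back(order[i], probs[order[i]] / kept);
    return routes;
}

// Prefill-style grouping: for every expert, collect the tokens routed to it,
// so that each expert can process its tokens as one batch.
std::vector<std::vector<std::size_t>> group_tokens_by_expert(
        const std::vector<Routes>& routes_per_token, std::size_t num_experts) {
    std::vector<std::vector<std::size_t>> tokens_of(num_experts);
    for (std::size_t t = 0; t < routes_per_token.size(); ++t)
        for (const auto& route : routes_per_token[t])
            tokens_of[route.first].push_back(t);
    return tokens_of;
}
```

With many tokens, as in prefill, grouping gives each expert a batch large enough for a GEMM. With a single token, as in decode, each selected expert sees only one row, which is consistent with the commit's choice to compute the experts in parallel instead.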
### Tickets:
- *CVS-169299*
---------
Co-authored-by: chenhu-wang <[email protected]>
Co-authored-by: Xuejun,Zhai <[email protected]>
Co-authored-by: Zaixing,Wang <[email protected]>
Co-authored-by: Chen Peter <[email protected]>
1 parent cad9e8e commit e61e47a
File tree
27 files changed, +5572 -106 lines changed
- src
  - common/transformations/src/transformations/common_optimizations
  - plugins/intel_gpu
    - include/intel_gpu
      - op
      - plugin
      - primitives
    - src
      - graph
        - impls/ocl_v2
          - moe
        - include
        - registry
      - plugin
        - ops
        - transformations
          - op
    - tests/unit
      - test_cases
      - transformations
Lines changed: 2 additions & 3 deletions
Lines changed: 46 additions & 0 deletions
Lines changed: 1 addition & 0 deletions
Lines changed: 1 addition & 0 deletions
Lines changed: 72 additions & 0 deletions