Commit 52027ff
committed
Voxtral Realtime: enable bf16 for Metal backend with quantization
The Metal AOTI backend already handles bf16 correctly (fp32 attention
masks, fp32 RoPE upcast, dtype-agnostic KV caches and SDPA). Enable
--dtype bf16 as the default recipe for Metal CI and update all
documentation to recommend bf16 with fpa4w quantization.
Fix a Metal shader compilation bug in the streaming encoder where
bool.to(bf16) generates `bfloat tmp = 0.0;` — Metal Shading Language
doesn't support implicit float-to-bfloat literal conversion. Use
.float() instead and let mul_ handle type promotion.1 parent 6db7f4c commit 52027ff
File tree
5 files changed
+21
-13
lines changed- .ci/scripts
- examples/models/voxtral_realtime
5 files changed
+21
-13
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
262 | 262 | | |
263 | 263 | | |
264 | 264 | | |
| 265 | + | |
265 | 266 | | |
266 | 267 | | |
267 | 268 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
87 | 87 | | |
88 | 88 | | |
89 | 89 | | |
90 | | - | |
| 90 | + | |
91 | 91 | | |
92 | 92 | | |
93 | 93 | | |
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
135 | 135 | | |
136 | 136 | | |
| 137 | + | |
137 | 138 | | |
138 | 139 | | |
139 | 140 | | |
140 | 141 | | |
141 | 142 | | |
142 | | - | |
| 143 | + | |
143 | 144 | | |
144 | 145 | | |
145 | 146 | | |
146 | 147 | | |
147 | 148 | | |
| 149 | + | |
148 | 150 | | |
149 | 151 | | |
150 | 152 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
33 | | - | |
| 33 | + | |
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
77 | | - | |
78 | | - | |
79 | | - | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
80 | 81 | | |
81 | 82 | | |
82 | | - | |
83 | | - | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
84 | 86 | | |
85 | 87 | | |
86 | 88 | | |
| |||
274 | 276 | | |
275 | 277 | | |
276 | 278 | | |
277 | | - | |
| 279 | + | |
278 | 280 | | |
279 | 281 | | |
280 | 282 | | |
| |||
364 | 366 | | |
365 | 367 | | |
366 | 368 | | |
367 | | - | |
| 369 | + | |
368 | 370 | | |
369 | 371 | | |
370 | 372 | | |
| |||
422 | 424 | | |
423 | 425 | | |
424 | 426 | | |
425 | | - | |
| 427 | + | |
| 428 | + | |
426 | 429 | | |
427 | 430 | | |
428 | 431 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1049 | 1049 | | |
1050 | 1050 | | |
1051 | 1051 | | |
1052 | | - | |
| 1052 | + | |
| 1053 | + | |
| 1054 | + | |
1053 | 1055 | | |
1054 | 1056 | | |
1055 | 1057 | | |
| |||
0 commit comments