You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
docs(voxtral_realtime): document CUDA Windows workflow
Add CUDA-Windows instructions to the Voxtral Realtime README, including export prerequisites and an example command.
Document Windows build steps via CMake workflow presets and add PowerShell run examples with and without the .ptd data file.
Note recommended CUDA architectures for int4 kernels, and reformat voxtral_realtime CMake presets without changing behavior.
This builds ExecuTorch with Metal backend support. The runner binary is at
229
262
the same path as above. Metal exports can only run on macOS with Apple Silicon.
230
263
264
+
### CUDA-Windows
265
+
266
+
On Windows (PowerShell), use CMake workflow presets directly from the executorch root directory. Note that if you exported the model with 4-bit quantization, you may need to specify your GPU's compute capability (e.g., `80;86;89;90;120` for Ampere, Lovelace, Hopper, and Blackwell) to avoid "invalid device function" errors at runtime, as the `int4mm` kernels require SM 80 or newer.
267
+
268
+
```powershell
269
+
$env:CMAKE_CUDA_ARCHITECTURES="80;86;89;90;120"
270
+
cmake --workflow --preset llm-release-cuda
271
+
Push-Location examples/models/voxtral_realtime
272
+
cmake --workflow --preset voxtral-realtime-cuda
273
+
Pop-Location
274
+
```
275
+
231
276
## Run
232
277
233
278
The runner requires:
@@ -237,35 +282,37 @@ The runner requires:
237
282
- A 16kHz mono WAV audio file (or live audio via `--mic`)
238
283
- For CUDA: `aoti_cuda_blob.ptd` — delegate data file (pass via `--data_path`)
0 commit comments