Export voxtral to ExecuTorch #39511

@mergennachin

Description

Feature request

  • Add an ExecuTorch exportability test for voxtral in transformers. See if we need to extend the capabilities of integrations/executorch.py.
  • Add a test in optimum-executorch for performant lowering.
  • Provide a demo C++ app.

Looking at voxtral, the export looks fairly straightforward. In particular, for

model = VoxtralForConditionalGeneration.from_pretrained("mistralai/Voxtral-Mini-3B-2507")

it has three parts:

  1. A Whisper-based audio encoder (which should already be exportable)
  2. A language model based on LlamaForCausalLM, which is already exportable
  3. VoxtralMultiModalProjector, a fairly simple projection layer
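
One way to check the three parts independently is to pull each submodule off the loaded model and attempt a `torch.export.export` on it with example inputs. The sketch below is only an illustration: the attribute names `audio_tower`, `multi_modal_projector`, and `language_model` are assumptions about how the model exposes its parts (check modeling_voxtral.py), and `find_submodules` and `try_export` are hypothetical helpers, not part of transformers or ExecuTorch.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

# ASSUMPTION: attribute names for the three parts listed above.
# Verify against modeling_voxtral.py before relying on them.
ASSUMED_SUBMODULES = ("audio_tower", "multi_modal_projector", "language_model")


def find_submodules(model: Any, names: Sequence[str] = ASSUMED_SUBMODULES) -> Dict[str, Any]:
    """Return {name: submodule} for each expected attribute.

    Raises AttributeError listing every missing name, so a wrong assumption
    about the model layout is reported up front.
    """
    found: Dict[str, Any] = {}
    missing = []
    for name in names:
        sub = getattr(model, name, None)
        if sub is None:
            missing.append(name)
        else:
            found[name] = sub
    if missing:
        raise AttributeError(f"model has no submodule(s): {', '.join(missing)}")
    return found


def try_export(module: Any, example_args: Tuple[Any, ...],
               example_kwargs: Optional[Dict[str, Any]] = None) -> Tuple[bool, Any]:
    """Attempt torch.export.export on one submodule.

    Returns (True, ExportedProgram) on success or (False, exception) on
    failure, so each part can be reported separately.
    """
    import torch  # imported lazily so the helpers above work without torch

    try:
        with torch.no_grad():
            program = torch.export.export(module, example_args, kwargs=example_kwargs)
        return True, program
    except Exception as exc:  # export can fail in many ways; surface the error
        return False, exc
```

With the model loaded as above, `find_submodules(model)` would return the three parts, and `try_export` could then be called on each with example inputs of the right shape (for instance a small float tensor for the projector). The example shapes depend on the model config and are not shown here.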

Reference:
https://huggingface.co/docs/transformers/en/executorch
https://huggingface.co/mistralai/Voxtral-Mini-3B-2507
https://github.com/huggingface/transformers/blob/main/src/transformers/models/voxtral/modular_voxtral.py
https://github.com/huggingface/transformers/blob/main/src/transformers/models/voxtral/modeling_voxtral.py

Motivation

So that we can easily run voxtral in C++ directly. Voxtral was added recently in #39429

Your contribution

N/A
