HeartMuLa reimplementation #2442
Merged
This pull request adds support for the HeartCodec audio codec model, including its configuration, flow-matching inference, and integration into the codebase. It also introduces improvements for handling audio token fields and autoregressive models throughout the metadata and model helper modules. The most important changes are grouped below:
HeartCodec Model Integration
- Added the HeartCodec model, including its configuration (HeartCodecConfig), flow-matching logic (FlowMatching), and scalar codec (ScalarModel), along with the methods needed to detokenize audio from codebook tokens.

Support for Audio Token Fields
- Updated the metadata modules (huggingface.py, parquet.py) to extract and handle the audio_tokens and audio_tokens_path fields, converting them to lists if needed.

Autoregressive Model and Token Handling
- Added AUTOREGRESSIVE_NEXT_TOKEN to the PredictionTypes enum and updated the string conversion logic to recognize it.
- Added uses_audio_tokens and collate_audio_tokens, and updated the uses_noise_schedule logic to support models that do not use diffusion noise schedules.

Dependency Updates
- Added vector-quantize-pytorch as a required dependency in setup.py for the new codec implementation.
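
The prediction-type and audio-token changes described above could look roughly like the sketch below. It is illustrative only and is not the code merged in this pull request: PredictionTypes, AUTOREGRESSIVE_NEXT_TOKEN, and uses_noise_schedule are names taken from the description, while the other enum members, the enum string values, the from_str method, the to_token_list helper, and every signature are assumptions made for the example.

```python
from enum import Enum
from typing import Any, List, Optional


class PredictionTypes(Enum):
    # Only AUTOREGRESSIVE_NEXT_TOKEN is named in the PR description;
    # the other members and all string values are assumptions.
    EPSILON = "epsilon"
    FLOW_MATCHING = "flow_matching"
    AUTOREGRESSIVE_NEXT_TOKEN = "autoregressive_next_token"

    @staticmethod
    def from_str(label: str) -> "PredictionTypes":
        # Hypothetical string conversion: match on the enum value,
        # ignoring case and surrounding whitespace.
        normalized = label.strip().lower()
        for member in PredictionTypes:
            if member.value == normalized:
                return member
        raise ValueError(f"Unknown prediction type: {label!r}")


def uses_noise_schedule(prediction_type: PredictionTypes) -> bool:
    # Assumed rule: autoregressive next-token models predict discrete
    # tokens, so they do not need a diffusion noise schedule.
    return prediction_type is not PredictionTypes.AUTOREGRESSIVE_NEXT_TOKEN


def to_token_list(value: Any) -> Optional[List[Any]]:
    # Hypothetical helper: normalize an audio_tokens field to a plain list.
    if value is None:
        return None
    if hasattr(value, "tolist"):  # e.g. array-like objects read from Parquet
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return list(value)
    raise TypeError(f"Unsupported audio_tokens type: {type(value).__name__}")
```

In this sketch a training loop would check uses_noise_schedule before sampling timesteps, and a metadata reader would pass each audio_tokens value through to_token_list before collation.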