Support for adding new languages #11

@assefvisic

Hi Liquid team,

Amazing work on Liquid Audio — I’ve been exploring it and really like the interleaved text/audio design.

What would be the recommended approach to add support for a new language, for example Croatian (hr)?

I’d like to understand:

  • Which model components are language-dependent (text tokenizer, audio encoder, etc.)?
  • Is fine-tuning LFM2-Audio on new audio↔text pairs sufficient, or would this require changes to the tokenizer or encoder?
  • Any rough guidance on data requirements (hours of audio or token count) for a usable model in a new language?

If you have any internal recipes, tips, or references for multilingual extensions, I’d really appreciate it.

Thanks a lot for your time and for releasing such an interesting system! 🙏
