Maya Voice AI is an open-source project that demonstrates the Maya1 model, capable of generating realistic voice audio from text input with rich emotional and descriptive control. This repository provides a demo for text-to-speech synthesis using advanced language models and the SNAC codec, focusing on high-quality audio at 24kHz.

Dineshkumar-Ponnusamy/maya-voice-ai

Maya Voice AI

A demonstration of the Maya1 voice AI model, which generates realistic voice audio from text input with emotional and descriptive control.

Description

This project showcases the Maya1 model, an open-source voice AI that can synthesize speech with specified voice characteristics and emotions. The demo script generates audio from a text prompt using a voice description, producing high-quality voice output.

Features

  • Text-to-speech synthesis with voice descriptions
  • Emotional and stylistic voice control
  • High-quality audio output at 24kHz
  • Built on the Maya1 language model and the SNAC audio codec

Installation

  1. Ensure you have Python 3.8 or higher installed.

  2. Install the required dependencies:

    pip install torch transformers snac soundfile

  3. The models will be automatically downloaded when you run the script:

    • Maya1 model: maya-research/maya1
    • SNAC codec: hubertsiuzdak/snac_24khz
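
Before running the demo, it can help to confirm that the four packages from step 2 are importable. The following is a minimal sketch, not part of the repository; the helper name `missing_packages` is hypothetical, and it assumes each package is imported under the same name used in the `pip install` command.

```python
import importlib.util

# Import names assumed to match the pip package names listed in step 2.
REQUIRED = ["torch", "transformers", "snac", "soundfile"]


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages are installed. It does not download or validate the model weights.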

Usage

Run the demo script to generate voice audio:

python maya1_demo.py

The script will:

  • Load the Maya1 model and SNAC codec
  • Generate voice based on the predefined description and text
  • Save the output as output.wav

You can modify the description and text variables in maya1_demo.py to customize the voice generation.
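
As an illustration of how the two variables might be combined, here is a minimal sketch. The helper `build_prompt` is hypothetical, and the `<description="...">` prefix format is an assumption about how Maya1 prompts are laid out; check maya1_demo.py and the model card for the format the script actually uses.

```python
def build_prompt(description: str, text: str) -> str:
    """Combine a voice description and the text to speak into one prompt string.

    The '<description="...">' layout is an assumed format, not confirmed
    by this README.
    """
    # Collapse line breaks and repeated spaces so the description stays on one line.
    description = " ".join(description.split())
    text = text.strip()
    if not description or not text:
        raise ValueError("both description and text must be non-empty")
    return f'<description="{description}"> {text}'


if __name__ == "__main__":
    # Example values only; replace them with your own voice description and text.
    description = "Warm female voice in her 30s, calm and conversational"
    text = "Hello, and welcome to the Maya Voice AI demo."
    print(build_prompt(description, text))
```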

Requirements

  • Python 3.8+
  • PyTorch
  • Transformers library
  • SNAC audio codec
  • Soundfile for audio I/O

Output

The generated audio is saved as output.wav in the project directory, with a sample rate of 24kHz.
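
To check the result, the file's header can be inspected with Python's standard `wave` module. This is a minimal sketch with a hypothetical helper name, `wav_summary`, and it assumes output.wav is an uncompressed PCM WAV file. If the demo writes a floating-point WAV instead, `wave` will reject it and `soundfile.info` is the better tool.

```python
import os
import wave

EXPECTED_RATE = 24_000  # sample rate stated in this README


def wav_summary(path):
    """Read basic header fields from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


if __name__ == "__main__":
    path = "output.wav"
    if os.path.exists(path):
        info = wav_summary(path)
        print(info)
        if info["sample_rate"] != EXPECTED_RATE:
            print("Unexpected sample rate:", info["sample_rate"])
    else:
        print("output.wav not found; run maya1_demo.py first.")
```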

License

This project uses open-source models and libraries. Please refer to the individual model licenses for usage terms.
