extrawest/livekit_voice_assistant

🎙️ Voice-to-Voice Communication Assistant

A production-ready voice assistant built with LiveKit for real-time communication, with a Flutter client. It uses custom Speech-to-Text (STT) and Text-to-Speech (TTS) implementations backed by locally hosted APIs for enhanced privacy and performance.

✨ Features

  • 🗣️ Real-time Voice Communication: Seamless voice interaction with AI assistant
  • 🏠 Local API Architecture: Privacy-focused local processing
    • Custom STT via Speeches AI API
    • Custom TTS via Kokoro AI API
    • Groq LLM integration for intelligent responses
  • 🛠️ Extended Capabilities:
    • 🔍 Web search using Tavily API
    • 🌤️ Weather information retrieval
  • ⚡ Performance Optimized:
    • LiveKit real-time communication framework
    • Silero VAD for precise voice activity detection
    • Integrated noise cancellation

🔧 Prerequisites

  • Python 3.10 or higher
  • LiveKit server (local or cloud deployment)
  • Local STT API (Speeches AI)
  • Local TTS API (Kokoro AI)
  • Groq API key (for the LLM)
  • Flutter SDK (for mobile app)

🚀 Quick Start

1. Backend Setup

Clone and configure the voice assistant backend:

git clone https://github.com/extrawest/livekit_voice_assistant.git
cd livekit_voice_assistant

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env

2. Environment Configuration

Edit .env with your settings:

# LiveKit Configuration
LIVEKIT_URL=ws://localhost:7880
LIVEKIT_API_KEY=devkey
LIVEKIT_API_SECRET=secret

# STT Configuration
STT_API_URL=your_stt_url

# LLM Configuration
GROQ_API_KEY=your-groq-api-key

# TTS Configuration
TTS_API_URL=your_tts_url
TTS_API_KEY=your-tts-api-key

# Optional Features
WEATHER_API_KEY=your-weather-api-key
TAVILY_API_KEY=your-tavily-api-key
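
The settings above can be checked at startup so that a missing value fails early instead of mid-conversation. The following is a minimal sketch, not the project's own code: `load_settings` is a hypothetical helper, and treating every variable except the two under "Optional Features" as required is an assumption.

```python
import os

# Assumption: everything except the "Optional Features" keys is required.
REQUIRED = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "STT_API_URL",
    "GROQ_API_KEY",
    "TTS_API_URL",
    "TTS_API_KEY",
)
OPTIONAL = ("WEATHER_API_KEY", "TAVILY_API_KEY")


def load_settings(env=None):
    """Return a dict of settings, raising if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    # Optional keys are included only when set, so features can be toggled.
    settings.update({name: env[name] for name in OPTIONAL if env.get(name)})
    return settings
```

Calling `load_settings()` with no argument reads from the process environment, so it works with whatever mechanism loads `.env` into it.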

3. Start the Backend

python main.py dev

4. Flutter Frontend Setup

Set up the mobile application:

  1. Navigate to the flutter/application folder
  2. Create a .env file and set LIVEKIT_SANDBOX_ID= to your LiveKit sandbox ID

Alternatively, clone and run the LiveKit example frontend:

# Clone Flutter frontend
git clone https://github.com/livekit-examples/voice-assistant-flutter
cd voice-assistant-flutter

# Install Flutter dependencies
flutter pub get

# Run the application
flutter run

5. LiveKit Cloud Setup

  1. Create an account at LiveKit Cloud
  2. Register your server
  3. Update LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET in .env with your LiveKit credentials

🏗️ System Architecture

The application follows a modular architecture:

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Flutter App    │◄──►│   LiveKit        │◄──►│  Voice Assistant│
│  (Frontend)     │    │   (Real-time     │    │  (Backend)      │
└─────────────────┘    │   Communication) │    └─────────────────┘
                       └──────────────────┘
                                │
                    ┌───────────▼──────────┐
                    │   Local APIs         │
                    │ ┌─────────────────┐  │
                    │ │ STT (Speeches)  │  │
                    │ │ TTS (Kokoro)    │  │
                    │ │ LLM (Groq)      │  │
                    │ └─────────────────┘  │
                    └──────────────────────┘

Core Components

  • LiveKit Integration: Manages real-time audio streaming and room connections
  • Custom STT Module: Converts speech to text using the local Speeches AI API
  • Custom TTS Module: Generates natural speech from text via Kokoro AI API
  • Groq LLM Processing: Handles conversation logic and response generation
  • Agent Controller: Orchestrates conversation flow and tool integrations
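
The way these components hand off to one another can be sketched as a single conversational turn: audio in, transcript, reply, audio out. This is an illustrative sketch only; `VoicePipeline`, its field names, and the callable signatures are hypothetical and do not reflect the project's actual classes or the LiveKit agent API. Stub functions stand in for the STT, LLM, and TTS services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """One conversational turn: audio in -> text -> reply -> audio out."""

    transcribe: Callable[[bytes], str]   # STT stage (hypothetical signature)
    respond: Callable[[str], str]        # LLM stage (hypothetical signature)
    synthesize: Callable[[str], bytes]   # TTS stage (hypothetical signature)

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        if not text.strip():
            # Nothing was recognized, so there is nothing to answer.
            return b""
        reply = self.respond(text)
        return self.synthesize(reply)


# Stub stages stand in for the real services.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode("utf-8"),
)
```

Keeping each stage behind a plain callable means a stage can be swapped (for example, a different TTS backend) without touching the turn logic.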

🔍 Advanced Features

Voice Activity Detection

Utilizes Silero VAD for accurate speech detection, reducing false triggers and improving conversation flow.
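
To illustrate what a voice activity detector decides, here is a deliberately simple energy-threshold sketch. It is not Silero VAD, which is a neural model; it only shows the general idea of flagging frames as speech and ignoring short bursts to reduce false triggers. The function names, `threshold`, and `min_frames` are illustrative assumptions.

```python
import math
from typing import List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(frames, threshold=0.1, min_frames=2) -> List[bool]:
    """Flag frames that belong to a run of at least `min_frames` loud frames."""
    loud = [frame_rms(f) > threshold for f in frames]
    flags = [False] * len(loud)
    run_start = None
    # The trailing False is a sentinel that closes a run ending at the last frame.
    for i, is_loud in enumerate(loud + [False]):
        if is_loud and run_start is None:
            run_start = i
        elif not is_loud and run_start is not None:
            if i - run_start >= min_frames:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags
```

A single loud frame (a click or a cough) is ignored, while two or more consecutive loud frames are treated as speech.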

Noise Cancellation

Built-in audio processing for cleaner voice interactions in various environments.

Tool Integration

Extensible architecture supporting additional tools like Tavily web search and OpenWeather APIs.
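
One common way to make tools pluggable is a small registry that maps tool names to functions, so adding a tool does not require changing the dispatch code. The sketch below assumes this pattern for illustration; `tool`, `TOOLS`, `call_tool`, and both placeholder tools are hypothetical names, and the real project may register tools differently (for example, through LiveKit's own agent tooling).

```python
from typing import Callable, Dict

# Maps a tool name to the function that implements it.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under `name` so the agent can call it."""
    def decorator(func):
        if name in TOOLS:
            raise ValueError(f"Tool already registered: {name}")
        TOOLS[name] = func
        return func
    return decorator


@tool("get_weather")
def get_weather(city: str) -> str:
    # Placeholder: a real tool would call a weather API here.
    return f"Weather lookup requested for {city}"


@tool("web_search")
def web_search(query: str) -> str:
    # Placeholder: a real tool would call a search API here.
    return f"Search requested for {query}"


def call_tool(name: str, **kwargs) -> str:
    """Dispatch a tool call by name, as an LLM tool request might."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

For example, `call_tool("get_weather", city="Kyiv")` routes to the registered weather function, and an unregistered name raises `KeyError`.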

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.


Created by Oleksandr Samoilenko
Extrawest.com, 2025
