An open-source, real-time AI voice assistant for intelligent, human-like conversations, built to serve institutions, automate tasks, and adapt to your context.
Zentry AI Assistant is a modular and real-time voice AI system designed for telephony and institutional automation. It combines high-accuracy speech-to-text (STT), reasoning with lightweight LLMs, and natural speech synthesis (TTS) to create seamless human-like conversations in local languages.
Built around FreeSWITCH for call control, CTranslate2-optimized Whisper for efficient transcription, and Phi-3 Mini with RAG for factual reasoning, Zentry emphasizes speed, accuracy, and locality. The assistant is extendable with Meta MMS multilingual models for broader language coverage, enabling use in education, healthcare, and enterprise environments.
- STT (Malayalam + English) using Whisper-medium CTranslate2
- Reasoning & RAG with Phi-3 Mini (low-latency + factual)
- Multilingual extension via Meta MMS speech models
- FreeSWITCH SIP integration for call routing + telephony
- Dynamic response generation
- Lightweight + fully local (edge-device deployable)
- API-first for easy integration with external systems
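To make the "Reasoning & RAG" item concrete, here is a minimal sketch of the retrieve-then-prompt step. It is not the project's actual pipeline: the word-overlap scoring is a stand-in for whatever retriever Zentry uses (a real deployment would more likely use embeddings), and `tokenize`, `retrieve`, `build_prompt`, and the sample passages are hypothetical names and data chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens. Naive: not tuned for Malayalam or other Indic scripts."""
    return re.findall(r"\w+", text.lower())


def retrieve(query, documents, k=2):
    """Return up to k documents ranked by word overlap with the query."""
    query_counts = Counter(tokenize(query))
    scored = []
    for doc in documents:
        doc_counts = Counter(tokenize(doc))
        overlap = sum(min(query_counts[t], doc_counts[t]) for t in query_counts)
        scored.append((overlap, doc))
    # sorted() is stable, so ties keep their original document order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble a grounded prompt to hand to the LLM runtime (assumed to be Phi-3 Mini)."""
    context = "\n".join(f"- {p}" for p in passages) or "- (no relevant passage found)"
    return (
        "Answer using only the context below. If the answer is not there, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    # Hypothetical knowledge base for a college reception desk.
    docs = [
        "The admissions office is open from 9 am to 4 pm.",
        "The library is on the second floor.",
        "Hostel fees are paid at the accounts office.",
    ]
    question = "When is the admissions office open?"
    print(build_prompt(question, retrieve(question, docs, k=1)))
```

Keeping retrieval separate from prompt assembly means the scoring function can be swapped out later without touching the code that talks to the model.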
Component | Tech / Tool |
---|---|
Speech-to-Text | Whisper-medium (CTranslate2 runtime) |
NLP / Reasoning | Phi-3 Mini + RAG pipeline |
Multilingual STT | Meta MMS (Vineel's fine-tuned speech models) |
Voice I/O | FreeSWITCH + Linphone SIP |
Backend | Python (FastAPI / Flask optional) |
Orchestration | Docker, Supervisor |
Real-Time Engine | Asyncio + WebSockets |
- Python 3.9+
- FFmpeg
- CTranslate2 + Whisper model
- FreeSWITCH running with SIP endpoints
- `linphonec` or any SIP client
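A quick way to verify the prerequisites above before installing is a small preflight script. This is a sketch, not part of the repository: `check_prerequisites` is a hypothetical helper, and it only checks that the tools are visible locally on `PATH` or importable. It cannot tell whether FreeSWITCH is running or reachable.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(
    min_python=(3, 9),
    binaries=("ffmpeg", "linphonec"),
    modules=("ctranslate2",),
):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    results = {"python": sys.version_info >= min_python}
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        results[name] = shutil.which(name) is not None
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        results[name] = importlib.util.find_spec(name) is not None
    return results


if __name__ == "__main__":
    for requirement, ok in check_prerequisites().items():
        print(f"{requirement:12s} {'ok' if ok else 'MISSING'}")
```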
git clone https://github.com/Zentry-org/.github.git zentry-ai-assistant
cd zentry-ai-assistant
# Create a virtual environment
python3 -m venv venv && source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Download and optimize Whisper model
python setup_model.py
# Start the assistant
python run_assistant.py
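The "download and optimize" step is not spelled out here. One plausible form is a call to CTranslate2's `ct2-transformers-converter` tool, sketched below. The contents of `setup_model.py` are not shown in this README, so the model ID `openai/whisper-medium`, the output directory, and the `int8` quantization are assumptions, and `build_convert_command` is a hypothetical helper. The converter itself needs the `transformers` and `torch` packages installed.

```python
import shlex


def build_convert_command(
    model="openai/whisper-medium",
    output_dir="whisper-medium-ct2",
    quantization="int8",
):
    """Build the argv list for converting a Hugging Face Whisper model to CTranslate2."""
    return [
        "ct2-transformers-converter",
        "--model", model,
        "--output_dir", output_dir,
        # Copy tokenizer and preprocessor files next to the converted weights.
        "--copy_files", "tokenizer.json", "preprocessor_config.json",
        "--quantization", quantization,
    ]


if __name__ == "__main__":
    command = build_convert_command()
    # Print the command for review. To run it, import subprocess and call
    # subprocess.run(command, check=True).
    print(shlex.join(command))
```

Lower-precision quantization such as `int8` trades a little accuracy for a smaller, faster model, which fits the edge-device goal; `float16` is the usual choice on GPUs.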
[Caller] → [FreeSWITCH] → [Linphone] → [VoiceBot.py]
↓
[STT: Whisper (CTranslate2)]
↓
[LLM: Phi-3 Mini + Retrieval (RAG)]
↓
[TTS / Playback with future module]
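The flow above can be sketched as a chain of asyncio stages connected by queues, so that transcription of the next utterance can overlap with reasoning on the previous one. This is an illustrative skeleton, not the code in `VoiceBot.py`: `stage`, `run_call`, and the two `fake_*` functions are hypothetical stand-ins for the real STT and LLM calls.

```python
import asyncio

STOP = object()  # sentinel marking the end of a call


async def stage(inbox, outbox, work):
    """Apply a blocking `work` function to each queued item, off the event loop."""
    while True:
        item = await inbox.get()
        if item is STOP:
            await outbox.put(STOP)  # propagate shutdown downstream
            return
        # asyncio.to_thread (Python 3.9+) keeps slow model calls from
        # blocking the event loop.
        await outbox.put(await asyncio.to_thread(work, item))


async def run_call(audio_chunks, transcribe, answer):
    """Push audio through STT and LLM stages; return the replies in order."""
    audio_q, text_q, reply_q = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    workers = [
        asyncio.create_task(stage(audio_q, text_q, transcribe)),
        asyncio.create_task(stage(text_q, reply_q, answer)),
    ]
    for chunk in audio_chunks:
        await audio_q.put(chunk)
    await audio_q.put(STOP)

    replies = []
    while True:
        reply = await reply_q.get()
        if reply is STOP:
            break
        replies.append(reply)  # a TTS / playback stage would consume these
    await asyncio.gather(*workers)
    return replies


def fake_transcribe(chunk: bytes) -> str:
    """Placeholder for the Whisper call."""
    return chunk.decode("utf-8")


def fake_answer(text: str) -> str:
    """Placeholder for the Phi-3 Mini + RAG call."""
    return f"You said: {text}"


if __name__ == "__main__":
    chunks = [b"hello", b"library hours"]
    print(asyncio.run(run_call(chunks, fake_transcribe, fake_answer)))
```

Each stage handles one item at a time, so replies come out in the same order the audio went in. A TTS stage could be added as a third `stage` task without changing the others.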
- College reception desk voice assistant
- Automated helpline support
- Healthcare triage voicebot
- Local-language assistants (Malayalam, Tamil, Hindi, etc.)
- STT with Whisper (CTranslate2 optimized)
- FreeSWITCH SIP integration
- Meta MMS integration for multilingual STT
- TTS Module (Indic-TTS / Coqui extension)
- Phi-3 Mini RAG optimization for factual answers
- Conversation monitoring dashboard
- Deployable builds for Raspberry Pi / edge
We welcome your contributions!
# Fork the repo, make changes, and submit a PR
Check CONTRIBUTING.md before submitting.
MIT License. See LICENSE.
- Organization: Zentry
- Developers: Habel Shaji | Lino Tom
- Still have doubts? See the DevNotes on Notion
"The future of voice is local, inclusive, and intelligent. Let's build it together." - Team Zentry