AI Voice Interview Agent (Thrisha Karkera)
An interactive voice-based AI interview assistant that listens to interviewer questions and responds with intelligent answers in both text and speech.
This project combines speech recognition, predefined logic, and LLM fallback to simulate a structured technical interview.
👉 Deployed on Hugging Face: https://huggingface.co/spaces/Thrisha2005/voicebot
- 🎤 Voice Input — Ask questions using your microphone
- 🧠 Smart Question Matching — Detects intent using keyword-based matching
- 📚 Predefined Answers — High-quality, structured responses for common interview questions
- 🤖 LLM Fallback (Groq + LLaMA) — Handles unexpected questions dynamically
- 🔊 Voice Output (TTS) — Converts answers into speech using gTTS
- 💬 Text Output — Displays conversation clearly
- 🧾 Conversation Memory — Maintains chat history for context
-
Python
-
Gradio — UI interface
-
Groq API
- Whisper (
whisper-large-v3) for speech-to-text - LLaMA (
llama-3.3-70b-versatile) for fallback responses
- Whisper (
-
gTTS — Text-to-speech
-
pydub — Audio processing
- 🎤 User records a question
- 🔄 Audio is converted to
.wavformat - 🧠 Whisper (via Groq) transcribes speech → text
- 🔍 System checks for a predefined answer using smart keyword matching
- 🤖 If no match → fallback to LLaMA model
- 🔊 Response is converted to speech (gTTS)
- 💬 Output is shown as both text + audio
The assistant prioritizes scripted, high-quality answers for common interview questions like:
- "Tell me about yourself"
- "What are your strengths?"
- "Why should we hire you?"
- "What are your weaknesses?"
- "Where do you see yourself in 5 years?"
If a question doesn't match → it uses LLM fallback to generate a response.
├── app.py # Main application (Gradio UI + logic)
├── requirements.txt # Dependencies
└── README.md # Documentation
git clone https://github.com/Thrisha200578/voicebot.git cd voicebot
pip install -r requirements.txtexport GROQ_API_KEY=your_api_key_herepython app.py| Variable | Description |
|---|---|
| GROQ_API_KEY | API key for Groq |
- Works best with clear voice input
- Scripted answers rely on keyword matching (not full NLP intent detection)
- Internet required for Groq API calls
- gTTS may introduce slight latency
- ✅ Replace keyword matching with semantic search / embeddings
- ✅ Add multilingual support
- ✅ Improve voice quality with advanced TTS models
- ✅ Add real-time streaming responses
- ✅ Expand interview question coverage
(Add screenshots from your Hugging Face app here)
Contributions are welcome! Feel free to fork the repo and improve the system.
MIT License
- Groq for ultra-fast inference
- Open-source speech & audio libraries
- Gradio for rapid UI development
Thrisha Karkera