A secure, production-ready Voice-Based Authentication System with a FastAPI backend.
2026 Edition - Uses SpeechBrain ECAPA-TDNN for state-of-the-art speaker verification.
Frontend Repository: This is the Backend API only. The React frontend is maintained in a separate repository.
- Voice Enrollment: Register your unique voice signature
- Voice Login: Authenticate using natural speech
- Anti-Replay Protection: Challenge-response system prevents recording attacks
- Basic Anti-Spoofing: Detects synthetic/replayed audio
- Audit Logging: Track all authentication attempts
- Encrypted Storage: Voice embeddings encrypted at rest (Fernet, AES-128-CBC)
- Multilingual: Supports English, Hindi, Marathi, and Spanish phrases
- REST API: OpenAPI/Swagger documented endpoints
┌─────────────────────────────────────────────────────────────┐
│                    FRONTEND (React SPA)                     │
│           (Separate Repository - See link above)            │
│  ┌────────────┐   ┌──────────────┐   ┌──────────────────┐   │
│  │   Pages    │───│  Components  │───│  UI Components   │   │
│  │  (Routes)  │   │  (Business)  │   │   (shadcn/ui)    │   │
│  └────────────┘   └──────────────┘   └──────────────────┘   │
│        │                 │                    │             │
│        └─────────────────┼────────────────────┘             │
│                          │                                  │
│                ┌─────────▼─────────┐                        │
│                │   Custom Hooks    │                        │
│                │   (State Logic)   │                        │
│                └─────────┬─────────┘                        │
│                          │                                  │
│                ┌─────────▼─────────┐                        │
│                │    API Service    │                        │
│                │     (api.ts)      │                        │
│                └─────────┬─────────┘                        │
└──────────────────────────┼──────────────────────────────────┘
                           │ HTTP/REST
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                BACKEND API (This Repository)                │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ FastAPI Server (main.py)                                │ │
│ │ • REST API endpoints                                    │ │
│ │ • CORS configuration for frontend                       │ │
│ │ • Request validation (Pydantic)                         │ │
│ └─────────────────────────────────────────────────────────┘ │
│                              │                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ API Routes (api/routes.py)                              │ │
│ │ • POST /api/enroll            - Voice enrollment        │ │
│ │ • POST /api/challenge         - Get login challenge     │ │
│ │ • POST /api/verify            - Verify voice response   │ │
│ │ • GET  /api/users/{id}/status - User status             │ │
│ └─────────────────────────────────────────────────────────┘ │
│                              │                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Processing Layer (utils.py)                             │ │
│ │ • Audio preprocessing (resample, normalize, trim)       │ │
│ │ • ECAPA-TDNN embedding extraction (192-dim vectors)     │ │
│ │ • Cosine similarity matching                            │ │
│ │ • Challenge-response generation                         │ │
│ │ • Basic anti-spoofing checks                            │ │
│ └─────────────────────────────────────────────────────────┘ │
│                              │                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Storage Layer (storage.py)                              │ │
│ │ • SQLite database (users, enrollments, challenges)      │ │
│ │ • Fernet encryption for embeddings at rest              │ │
│ │ • Account lockout protection                            │ │
│ │ • Audit logging                                         │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
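The challenge flow behind POST /api/challenge and POST /api/verify can be sketched as below. This is a minimal in-memory illustration under stated assumptions, not the repository's code: `ChallengeStore`, `PHRASES`, and the method names are hypothetical, the phrases are placeholders, and the real service persists challenges in SQLite.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrase pool; the real phrases are not listed in this README.
PHRASES = ["blue river seven", "open the green door", "nine silver clouds"]


@dataclass
class Challenge:
    user_id: str
    phrase: str
    expires_at: float
    used: bool = False


class ChallengeStore:
    """In-memory stand-in for a challenges table (illustrative only)."""

    def __init__(self, expiry_seconds: float = 60.0):
        self.expiry_seconds = expiry_seconds
        self._challenges = {}

    def issue(self, user_id: str, now: Optional[float] = None):
        """Create a random challenge and return (challenge_id, phrase)."""
        now = time.time() if now is None else now
        challenge_id = secrets.token_urlsafe(16)
        phrase = secrets.choice(PHRASES)
        self._challenges[challenge_id] = Challenge(
            user_id, phrase, now + self.expiry_seconds
        )
        return challenge_id, phrase

    def consume(self, challenge_id: str, user_id: str,
                now: Optional[float] = None) -> bool:
        """Return True only for an unexpired, unused challenge owned by user_id."""
        now = time.time() if now is None else now
        challenge = self._challenges.get(challenge_id)
        if challenge is None or challenge.user_id != user_id:
            return False
        if challenge.used or now > challenge.expires_at:
            return False
        challenge.used = True  # one-time use: a replayed request is rejected
        return True
```

Passing `now` explicitly keeps the sketch easy to test; a verify handler would call `consume()` before comparing any audio, so that an expired or reused challenge is rejected early.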
- Python 3.10+
- ~500MB disk space (for ML model)
# 1. Clone the repository
git clone https://github.com/anujpundir999/voice-biometric-authentication-backend.git
cd voice-biometric-authentication-backend
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# OR: venv\Scripts\activate # Windows
# 3. Install dependencies
pip install -r requirements.txt
# 4. Download ML model (first run)
python utils.py # Downloads ECAPA-TDNN model (~90MB)
# 5. Initialize database
python storage.py # Creates SQLite database
# 6. Start the API server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# API should be running at:
# http://localhost:8000
# Swagger documentation:
# http://localhost:8000/docs
# Health check:
curl http://localhost:8000/health
voice-biometric-authentication-backend/
├── main.py                      # FastAPI application entry point
├── app.py                       # Application configuration
├── utils.py                     # Core ML & audio utilities
├── storage.py                   # Database & encryption layer
├── requirements.txt             # Python dependencies
├── pyproject.toml               # Project metadata
│
├── api/                         # API layer
│   ├── __init__.py
│   ├── routes.py                # API endpoint definitions
│   ├── schemas.py               # Pydantic request/response models
│   └── openapi.yaml             # OpenAPI specification
│
├── models/                      # ML models (auto-downloaded)
│   └── ecapa_tdnn/              # SpeechBrain ECAPA-TDNN
│
├── data/                        # Runtime data (gitignored)
│   ├── voiceauth.db             # SQLite database (auto-created)
│   └── temp/                    # Temporary audio files
│
├── tests/                       # Test suite
│   ├── test_embedding.py
│   ├── test_preprocessing.py
│   ├── test_storage.py
│   └── test_threshold.py
│
└── docs/                        # Additional documentation
    ├── AGENT_DOCUMENTATION.md
    ├── FRONTEND_INTEGRATION_GUIDE.md
    └── QUICK_REFERENCE.md
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /api/enroll | Enroll voice for new user |
| POST | /api/challenge | Get challenge phrase for login |
| POST | /api/verify | Verify voice against challenge |
| GET | /api/users/{user_id}/status | Get user enrollment status |
| DELETE | /api/users/{user_id} | Delete user and enrollments |
# 1. Enroll a new user
curl -X POST http://localhost:8000/api/enroll \
-F "user_id=john_doe" \
-F "audio=@voice_sample.wav"
# 2. Get challenge for login
curl -X POST http://localhost:8000/api/challenge \
-H "Content-Type: application/json" \
-d '{"user_id": "john_doe"}'
# 3. Verify with challenge response
curl -X POST http://localhost:8000/api/verify \
-F "user_id=john_doe" \
-F "challenge_id=<challenge_id>" \
-F "audio=@challenge_response.wav"
Create a .env file (optional):
# Server
HOST=0.0.0.0
PORT=8000
DEBUG=false
# Security
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
# Audio processing
SIMILARITY_THRESHOLD=0.75
CHALLENGE_EXPIRY_SECONDS=60
| Setting | Default | Description |
|---|---|---|
| SIMILARITY_THRESHOLD | 0.75 | Cosine similarity threshold for a match |
| SAMPLE_RATE | 16000 | Audio sample rate (Hz) |
| MIN_AUDIO_DURATION | 1.5s | Minimum recording length |
| MAX_AUDIO_DURATION | 10s | Maximum recording length |
| CHALLENGE_EXPIRY_SECONDS | 60 | Challenge validity window (seconds) |
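As an illustration of how these settings could be read with their documented defaults, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from the repository. The sketch also assumes the durations are given as plain numbers of seconds and that the variables are already in the process environment (it does not parse the .env file itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the settings table in this README.
    similarity_threshold: float = 0.75
    sample_rate: int = 16000
    min_audio_duration: float = 1.5   # seconds (assumed plain number)
    max_audio_duration: float = 10.0  # seconds (assumed plain number)
    challenge_expiry_seconds: int = 60


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    d = Settings()
    return Settings(
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", d.similarity_threshold)),
        sample_rate=int(env.get("SAMPLE_RATE", d.sample_rate)),
        min_audio_duration=float(env.get("MIN_AUDIO_DURATION", d.min_audio_duration)),
        max_audio_duration=float(env.get("MAX_AUDIO_DURATION", d.max_audio_duration)),
        challenge_expiry_seconds=int(
            env.get("CHALLENGE_EXPIRY_SECONDS", d.challenge_expiry_seconds)
        ),
    )
```

Accepting any mapping instead of reading `os.environ` directly makes the loader easy to exercise in tests, for example `load_settings({"SIMILARITY_THRESHOLD": "0.8"})`.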
# Run all tests
pytest tests/ -v
# Run specific test
pytest tests/test_embedding.py -v
# Test with coverage
pytest tests/ --cov=. --cov-report=html
| Metric | Value |
|---|---|
| Embedding Dimension | 192 |
| Embedding Time (CPU) | ~200ms |
| Matching Time | <1ms |
| EER (VoxCeleb) | ~1-2% |
| Model Size | ~90MB |
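Matching is cheap because it reduces to one cosine similarity between two 192-dimensional vectors. The sketch below shows that comparison in pure Python. It is illustrative only: the function names are hypothetical, and the repository's utils.py may well use NumPy or PyTorch instead.

```python
import math
from typing import Sequence

SIMILARITY_THRESHOLD = 0.75  # default from the configuration table


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def is_match(enrolled: Sequence[float], candidate: Sequence[float],
             threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Accept the candidate when similarity reaches the threshold."""
    return cosine_similarity(enrolled, candidate) >= threshold
```

Raising the threshold trades fewer false accepts for more false rejects, so the 0.75 default is a starting point to tune against your own recordings.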
- Challenge-Response: Random phrase each login, expires in 60s
- One-Time Challenges: Each challenge can only be used once
- Encrypted Storage: Embeddings encrypted with Fernet (AES-128-CBC)
- Account Lockout: 15-minute lockout after 5 failed attempts
- Audit Logging: All attempts logged with timestamps
- CORS Protection: Configurable allowed origins
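The lockout rule above (5 failed attempts, then a 15-minute lockout) could be tracked as in the following sketch. `LockoutTracker` and its methods are hypothetical names, state is kept in memory rather than in SQLite, and clearing the failure count after a successful login is an assumption rather than documented behaviour.

```python
import time
from typing import Optional

MAX_FAILED_ATTEMPTS = 5
LOCKOUT_SECONDS = 15 * 60


class LockoutTracker:
    """Counts failed attempts per user and locks the account temporarily."""

    def __init__(self, max_attempts: int = MAX_FAILED_ATTEMPTS,
                 lockout_seconds: float = LOCKOUT_SECONDS):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self._failures = {}      # user_id -> consecutive failure count
        self._locked_until = {}  # user_id -> unlock timestamp

    def is_locked(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        until = self._locked_until.get(user_id)
        if until is None:
            return False
        if now >= until:
            # Lockout window has elapsed: clear state and allow new attempts.
            del self._locked_until[user_id]
            self._failures.pop(user_id, None)
            return False
        return True

    def record_failure(self, user_id: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        count = self._failures.get(user_id, 0) + 1
        self._failures[user_id] = count
        if count >= self.max_attempts:
            self._locked_until[user_id] = now + self.lockout_seconds

    def record_success(self, user_id: str) -> None:
        # Assumption: a successful login resets the failure counter.
        self._failures.pop(user_id, None)
        self._locked_until.pop(user_id, None)
```

A verify handler would check `is_locked()` before doing any audio processing, so a locked account costs no model inference.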
- Basic anti-spoofing (not production-grade)
- No voice activity detection (VAD)
- Single-device enrollment
- No continuous authentication
- No liveness detection
# Manual download from HuggingFace
# https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb
# Place files in models/ecapa_tdnn/
# Convert to WAV (16kHz, mono)
ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
# Update CORS origins in main.py
origins = [
"http://localhost:3000", # React dev server
"http://localhost:5173", # Vite dev server
"https://your-frontend.com", # Production
]
- Ensure a quiet recording environment
- Speak clearly with consistent volume
- Use 3-5 second recordings
- Enroll multiple samples (up to 5)
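Before uploading a recording, a client-side check along these lines can catch files that ignore the tips above. This is a hedged sketch using only the standard-library `wave` module. `check_wav` is a hypothetical helper, and the limits are taken from the defaults in the configuration table. The backend resamples and trims audio itself, so this is a convenience check, not a description of what the server enforces.

```python
import wave


def check_wav(path, expected_rate=16000, min_seconds=1.5, max_seconds=10.0):
    """Return a list of problems found in a WAV file; empty means it looks fine."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != 1:
        problems.append(f"{channels} channels, expected mono")
    if not min_seconds <= duration <= max_seconds:
        problems.append(
            f"duration is {duration:.2f}s, expected {min_seconds}-{max_seconds}s"
        )
    return problems
```

For example, `check_wav("voice_sample.wav")` returns an empty list for a 3-second, 16 kHz mono recording.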
- SpeechBrain - Speech processing toolkit
- ECAPA-TDNN Paper - Model architecture
- FastAPI - API framework
- ASVspoof Challenge - Anti-spoofing research
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - See LICENSE file
Built with ❤️ for secure voice authentication
Quick Commands:
# Start development server
uvicorn main:app --reload --port 8000
# Run tests
pytest tests/ -v
# Reset database (development only)
python -c "from storage import reset_database; reset_database()"
# Check API docs
open http://localhost:8000/docs # macOS; use xdg-open on Linux