VoiceAuth MVP 🎤🔐

A secure, production-ready Voice-Based Authentication System with a FastAPI backend.

2026 Edition - Uses SpeechBrain ECAPA-TDNN for state-of-the-art speaker verification.

🔗 Related Repository

Frontend Repository: This is the Backend API only. The React frontend is maintained in a separate repository.

👉 VoiceAuth Frontend (React SPA)

🌟 Features

🎤 Voice Enrollment: Register your unique voice signature
🔐 Voice Login: Authenticate using natural speech
🛡️ Anti-Replay Protection: Challenge-response system prevents recording attacks
🔍 Basic Anti-Spoofing: Detects synthetic/replayed audio
📊 Audit Logging: Track all authentication attempts
🔒 Encrypted Storage: Voice embeddings encrypted at rest (Fernet: AES-128-CBC with HMAC-SHA256)
🌍 Multilingual: Supports English, Hindi, Marathi, Spanish phrases
🌐 REST API: OpenAPI/Swagger documented endpoints
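
The anti-replay feature above relies on a challenge that is random, short-lived, and usable only once. A minimal sketch of that idea follows. It is not the code in `utils.py` or `storage.py`: the phrase pool, the in-memory dictionary, and the function names `issue_challenge` and `consume_challenge` are all hypothetical, and the real backend stores challenges in SQLite.

```python
import secrets
import time

CHALLENGE_EXPIRY_SECONDS = 60  # documented default

# Hypothetical phrase pool; the real phrases are not listed in this README.
PHRASES = [
    "blue river seven",
    "open the green door",
    "silver moon rising",
]

# In-memory store for illustration only: challenge_id -> (user_id, phrase, issued_at)
_challenges = {}


def issue_challenge(user_id, now=None):
    """Create a random challenge for a user and remember when it was issued."""
    now = time.time() if now is None else now
    challenge_id = secrets.token_urlsafe(16)
    phrase = secrets.choice(PHRASES)
    _challenges[challenge_id] = (user_id, phrase, now)
    return challenge_id, phrase


def consume_challenge(challenge_id, user_id, now=None):
    """Return the expected phrase if the challenge is valid, else None.

    The entry is removed on the first lookup, so any second attempt with the
    same challenge_id fails, whether or not the first one succeeded.
    """
    now = time.time() if now is None else now
    entry = _challenges.pop(challenge_id, None)
    if entry is None:
        return None
    owner, phrase, issued_at = entry
    if owner != user_id or now - issued_at > CHALLENGE_EXPIRY_SECONDS:
        return None
    return phrase
```

Removing the entry before checking it means a replayed or mistargeted request burns the challenge, which is a deliberately strict choice for this sketch.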

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────┐
│                   FRONTEND (React SPA)                       │
│           (Separate Repository - See link above)             │
│  ┌────────────┐  ┌──────────────┐  ┌──────────────────┐    │
│  │   Pages    │──│  Components  │──│  UI Components   │    │
│  │ (Routes)   │  │  (Business)  │  │  (shadcn/ui)     │    │
│  └────────────┘  └──────────────┘  └──────────────────┘    │
│         │                │                     │             │
│         └────────────────┴─────────────────────┘             │
│                          │                                   │
│                 ┌────────▼──────────┐                       │
│                 │   Custom Hooks    │                       │
│                 │  (State Logic)    │                       │
│                 └────────┬──────────┘                       │
│                          │                                   │
│                 ┌────────▼──────────┐                       │
│                 │  API Service      │                       │
│                 │  (api.ts)         │                       │
│                 └────────┬──────────┘                       │
└──────────────────────────┼──────────────────────────────────┘
                           │ HTTP/REST
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                  BACKEND API (This Repository)               │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  FastAPI Server (main.py)                                ││
│  │  • REST API endpoints                                    ││
│  │  • CORS configuration for frontend                       ││
│  │  • Request validation (Pydantic)                         ││
│  └─────────────────────────────────────────────────────────┘│
│                          │                                   │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  API Routes (api/routes.py)                              ││
│  │  • POST /api/enroll - Voice enrollment                   ││
│  │  • POST /api/challenge - Get login challenge             ││
│  │  • POST /api/verify - Verify voice response              ││
│  │  • GET  /api/users/{id}/status - User status             ││
│  └─────────────────────────────────────────────────────────┘│
│                          │                                   │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  Processing Layer (utils.py)                             ││
│  │  • Audio preprocessing (resample, normalize, trim)       ││
│  │  • ECAPA-TDNN embedding extraction (192-dim vectors)     ││
│  │  • Cosine similarity matching                            ││
│  │  • Challenge-response generation                         ││
│  │  • Basic anti-spoofing checks                            ││
│  └─────────────────────────────────────────────────────────┘│
│                          │                                   │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  Storage Layer (storage.py)                              ││
│  │  • SQLite database (users, enrollments, challenges)      ││
│  │  • Fernet encryption for embeddings at rest              ││
│  │  • Account lockout protection                            ││
│  │  • Audit logging                                         ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
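
The matching step in the processing layer compares an enrolled embedding with a fresh one using cosine similarity. The sketch below shows that comparison in plain Python so it runs without extra packages. The function names are hypothetical, and the real implementation presumably operates on 192-dimensional tensors or arrays rather than lists.

```python
import math

SIMILARITY_THRESHOLD = 0.75  # documented default


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as a non-match.
        return 0.0
    return dot / (norm_a * norm_b)


def is_match(enrolled, candidate, threshold=SIMILARITY_THRESHOLD):
    """True when the similarity reaches the threshold (inclusive, by assumption)."""
    return cosine_similarity(enrolled, candidate) >= threshold
```

Because cosine similarity ignores vector length, scaling an embedding does not change the score; only its direction matters.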

🚀 Quick Start

Prerequisites

Python 3.10+
~500MB disk space (for ML model)

Installation

# 1. Clone the repository
git clone https://github.com/anujpundir999/voice-biometric-authentication-backend.git
cd voice-biometric-authentication-backend

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# OR: venv\Scripts\activate  # Windows

# 3. Install dependencies
pip install -r requirements.txt

# 4. Download ML model (first run)
python utils.py  # Downloads ECAPA-TDNN model (~90MB)

# 5. Initialize database
python storage.py  # Creates SQLite database

# 6. Start the API server
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Verify Installation

# API should be running at:
# http://localhost:8000

# Swagger documentation:
# http://localhost:8000/docs

# Health check:
curl http://localhost:8000/health

📁 Project Structure

voice-biometric-authentication-backend/ 
├── main.py              # FastAPI application entry point
├── app.py               # Application configuration
├── utils.py             # Core ML & audio utilities
├── storage.py           # Database & encryption layer
├── requirements.txt     # Python dependencies
├── pyproject.toml       # Project metadata
│
├── api/                 # API layer
│   ├── __init__.py
│   ├── routes.py        # API endpoint definitions
│   ├── schemas.py       # Pydantic request/response models
│   └── openapi.yaml     # OpenAPI specification
│
├── models/              # ML models (auto-downloaded)
│   └── ecapa_tdnn/      # SpeechBrain ECAPA-TDNN
│
├── data/                # Runtime data (gitignored)
│   ├── voiceauth.db     # SQLite database (auto-created)
│   └── temp/            # Temporary audio files
│
├── tests/               # Test suite
│   ├── test_embedding.py
│   ├── test_preprocessing.py
│   ├── test_storage.py
│   └── test_threshold.py
│
└── docs/                # Additional documentation
    ├── AGENT_DOCUMENTATION.md
    ├── FRONTEND_INTEGRATION_GUIDE.md
    └── QUICK_REFERENCE.md

🔌 API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/health` | Health check |
| `POST` | `/api/enroll` | Enroll voice for new user |
| `POST` | `/api/challenge` | Get challenge phrase for login |
| `POST` | `/api/verify` | Verify voice against challenge |
| `GET` | `/api/users/{user_id}/status` | Get user enrollment status |
| `DELETE` | `/api/users/{user_id}` | Delete user and enrollments |

Example: Enrollment Flow

# 1. Enroll a new user
curl -X POST http://localhost:8000/api/enroll \
  -F "user_id=john_doe" \
  -F "audio=@voice_sample.wav"

# 2. Get challenge for login
curl -X POST http://localhost:8000/api/challenge \
  -H "Content-Type: application/json" \
  -d '{"user_id": "john_doe"}'

# 3. Verify with challenge response
curl -X POST http://localhost:8000/api/verify \
  -F "user_id=john_doe" \
  -F "challenge_id=<challenge_id>" \
  -F "audio=@challenge_response.wav"
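
For callers that prefer Python over curl, step 2 of the flow above could be sketched with the standard library alone. The helper names are hypothetical, and the shape of the JSON response is not documented in this README, so the sketch returns it undecoded beyond JSON parsing.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local development server


def build_challenge_request(user_id, base_url=BASE_URL):
    """Build the POST /api/challenge request without sending it."""
    body = json.dumps({"user_id": user_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/challenge",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_challenge(user_id, base_url=BASE_URL, timeout=10.0):
    """Send the request and return the parsed JSON body.

    Requires the API server to be running; the response fields are
    whatever the backend returns and are not assumed here.
    """
    request = build_challenge_request(user_id, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `request_challenge("john_doe")` against a running server should yield the challenge data needed for the verify step. The enroll and verify endpoints take multipart form uploads, which are easier to send with curl or a dedicated HTTP client.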

🔧 Configuration

Environment Variables

Create a .env file (optional):

# Server
HOST=0.0.0.0
PORT=8000
DEBUG=false

# Security
CORS_ORIGINS=http://localhost:3000,http://localhost:5173

# Audio processing
SIMILARITY_THRESHOLD=0.75
CHALLENGE_EXPIRY_SECONDS=60
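
One way these variables might be read at startup is sketched below. The `Settings` class and `load_settings` function are hypothetical names, not part of this repository, and the sketch assumes the values are already present in the process environment (a `.env` file is not loaded automatically by Python itself).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    debug: bool
    similarity_threshold: float
    challenge_expiry_seconds: int


def load_settings(env=None):
    """Read settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        # Accept a few common truthy spellings; anything else is False.
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.75")),
        challenge_expiry_seconds=int(env.get("CHALLENGE_EXPIRY_SECONDS", "60")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test.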

Key Settings in `utils.py`

| Setting | Default | Description |
| --- | --- | --- |
| `SIMILARITY_THRESHOLD` | 0.75 | Cosine similarity threshold for match |
| `SAMPLE_RATE` | 16000 | Audio sample rate (Hz) |
| `MIN_AUDIO_DURATION` | 1.5s | Minimum recording length |
| `MAX_AUDIO_DURATION` | 10s | Maximum recording length |
| `CHALLENGE_EXPIRY_SECONDS` | 60 | Challenge validity window |
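
To illustrate how the duration limits relate to the sample rate, the sketch below converts a sample count into seconds and checks it against the two bounds. The function names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
SAMPLE_RATE = 16000        # Hz, documented default
MIN_AUDIO_DURATION = 1.5   # seconds
MAX_AUDIO_DURATION = 10.0  # seconds


def audio_duration_seconds(num_samples, sample_rate=SAMPLE_RATE):
    """Length of a mono recording in seconds."""
    return num_samples / sample_rate


def check_duration(num_samples, sample_rate=SAMPLE_RATE):
    """Return (ok, message) for a recording of the given length."""
    duration = audio_duration_seconds(num_samples, sample_rate)
    if duration < MIN_AUDIO_DURATION:
        return False, f"too short: {duration:.2f}s (minimum {MIN_AUDIO_DURATION}s)"
    if duration > MAX_AUDIO_DURATION:
        return False, f"too long: {duration:.2f}s (maximum {MAX_AUDIO_DURATION}s)"
    return True, "ok"
```

At 16 kHz, the 1.5 s minimum corresponds to 24,000 samples and the 10 s maximum to 160,000 samples.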

🧪 Testing

# Run all tests
pytest tests/ -v

# Run specific test
pytest tests/test_embedding.py -v

# Test with coverage
pytest tests/ --cov=. --cov-report=html

📊 Performance

| Metric | Value |
| --- | --- |
| Embedding Dimension | 192 |
| Embedding Time (CPU) | ~200ms |
| Matching Time | <1ms |
| EER (VoxCeleb) | ~1-2% |
| Model Size | ~90MB |

🔒 Security Features

Challenge-Response: Random phrase each login, expires in 60s
One-Time Challenges: Each challenge can only be used once
Encrypted Storage: Embeddings encrypted with Fernet (AES-128-CBC)
Account Lockout: 15-minute lockout after 5 failed attempts
Audit Logging: All attempts logged with timestamps
CORS Protection: Configurable allowed origins
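
The account lockout rule above (15 minutes after 5 failed attempts) can be sketched as a small counter. This is an illustration under stated assumptions, not the logic in `storage.py`: the in-memory dictionary and the names `is_locked` and `record_attempt` are hypothetical, and the sketch assumes the failure count resets on a successful login.

```python
import time

MAX_FAILED_ATTEMPTS = 5
LOCKOUT_SECONDS = 15 * 60

# In-memory store for illustration only: user_id -> (failed_count, locked_until)
_failures = {}


def is_locked(user_id, now=None):
    """True while the user's lockout window is still running."""
    now = time.time() if now is None else now
    _, locked_until = _failures.get(user_id, (0, 0.0))
    return now < locked_until


def record_attempt(user_id, success, now=None):
    """Record one authentication attempt; return True if the account is locked."""
    now = time.time() if now is None else now
    count, locked_until = _failures.get(user_id, (0, 0.0))
    if now < locked_until:
        return True  # still locked; the attempt is ignored
    if success:
        _failures.pop(user_id, None)  # a good login clears the counter
        return False
    count += 1
    if count >= MAX_FAILED_ATTEMPTS:
        _failures[user_id] = (0, now + LOCKOUT_SECONDS)
        return True
    _failures[user_id] = (count, 0.0)
    return False
```

A caller would check `is_locked` before running verification at all, so a locked account never reaches the embedding comparison.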

🚧 Known Limitations

Basic anti-spoofing (not production-grade)
No voice activity detection (VAD)
Single-device enrollment
No continuous authentication
No liveness detection

🛠️ Troubleshooting

Model download fails

# Manual download from HuggingFace
# https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb
# Place files in models/ecapa_tdnn/

Audio format not supported

# Convert to WAV (16kHz, mono)
ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
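
After converting, it can help to confirm that the file really is 16 kHz mono before uploading it. The sketch below uses Python's standard `wave` module for that check. The function names are hypothetical, and `make_silent_wav` exists only to produce a test fixture.

```python
import io
import wave


def describe_wav(source):
    """Return basic properties of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration": wav.getnframes() / rate,
        }


def is_16k_mono(source):
    """True when the file matches the format the ffmpeg command above produces."""
    info = describe_wav(source)
    return info["sample_rate"] == 16000 and info["channels"] == 1


def make_silent_wav(seconds, sample_rate=16000, channels=1):
    """Build an in-memory 16-bit WAV of silence, useful as a test fixture."""
    buffer = io.BytesIO()
    frames = int(seconds * sample_rate)
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * frames * channels)
    buffer.seek(0)
    return buffer
```

For a file on disk, `is_16k_mono("output.wav")` performs the same check.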

CORS errors from frontend

# Update CORS origins in main.py
origins = [
    "http://localhost:3000",      # React dev server
    "http://localhost:5173",      # Vite dev server
    "https://your-frontend.com",  # Production
]
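
# Alternatively, the list could be derived from the CORS_ORIGINS environment
# variable instead of being hard-coded. The helper below is a hypothetical
# sketch, not code from main.py; it splits the comma-separated value and drops
# trailing slashes, since browsers send the Origin header without one.

```python
import os

DEFAULT_ORIGINS = ["http://localhost:3000", "http://localhost:5173"]


def parse_cors_origins(raw):
    """Turn 'a, b,c' into ['a', 'b', 'c']; fall back to the dev defaults."""
    if not raw:
        return list(DEFAULT_ORIGINS)
    cleaned = [item.strip().rstrip("/") for item in raw.split(",")]
    return [origin for origin in cleaned if origin]


origins = parse_cors_origins(os.environ.get("CORS_ORIGINS"))
```

# The resulting list can be passed as allow_origins to FastAPI's CORSMiddleware.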

Low similarity scores

Ensure quiet recording environment
Speak clearly with consistent volume
Use 3-5 second recordings
Enroll multiple samples (up to 5)

📚 References

SpeechBrain - Speech processing toolkit
ECAPA-TDNN Paper - Model architecture
FastAPI - API framework
ASVspoof Challenge - Anti-spoofing research

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit changes (git commit -m 'Add amazing feature')
Push to branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

MIT License - See LICENSE file

👥 Authors

Built with ❤️ for secure voice authentication

Quick Commands:

# Start development server
uvicorn main:app --reload --port 8000

# Run tests
pytest tests/ -v

# Reset database (development only)
python -c "from storage import reset_database; reset_database()"

# Check API docs
open http://localhost:8000/docs  # macOS; on Linux use xdg-open
