A comprehensive real-time face recognition system with RTSP stream support, WebSocket communication, and secure API endpoints. Built with FastAPI, PyTorch, and FaceNet for high-accuracy facial recognition in live video streams.
- Real-time Face Recognition: Process live video streams from RTSP cameras
- Multi-face Detection: Detect and identify multiple faces simultaneously
- Face Registration: Register new faces with multiple reference images
- Live Streaming: View processed video streams with bounding boxes
- Callback Integration: Send detection events to external systems
- WebSocket Support: Real-time detection via WebSocket connections
- JWT Authentication: Secure API access with token-based authentication
- User Management: Admin-controlled user creation and management
- Session Management: Cookie-based authentication for web interface
- CORS Support: Configurable cross-origin resource sharing
- GPU Acceleration: CUDA support for faster processing
- Database Storage: SQLite with SQLAlchemy ORM
- Configuration Management: Dynamic parameter adjustment
- Thread Pool Processing: Concurrent RTSP stream handling
- Memory Optimization: Efficient frame processing and cleanup
┌─────────────────────────────────────────────────────────────┐
│ RTSP Face Recognition System                                │
├─────────────────────────────────────────────────────────────┤
│ Frontend Layer                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │    Login    │  │   Config    │  │  RTSP Mgmt  │          │
│  │    Page     │  │    Panel    │  │    Panel    │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│ API Layer (FastAPI)                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │    Auth     │  │  Face Ops   │  │    RTSP     │          │
│  │  Endpoints  │  │  Endpoints  │  │  Endpoints  │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│ Processing Layer                                            │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │    Face     │  │    RTSP     │  │  WebSocket  │          │
│  │  Detection  │  │ Processing  │  │   Handler   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│ AI/ML Layer                                                 │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │    MTCNN    │  │  Inception  │                           │
│  │  Detection  │  │  ResNet V1  │                           │
│  └─────────────┘  └─────────────┘                           │
├─────────────────────────────────────────────────────────────┤
│ Data Layer                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │  SQLite DB  │  │ File System │  │   Memory    │          │
│  │   (Users,   │  │ (Templates, │  │    Cache    │          │
│  │ Faces, etc) │  │   Static)   │  │             │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
└─────────────────────────────────────────────────────────────┘
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ RTSP Camera │───▶│    Frame    │───▶│    Face     │
│   Stream    │    │   Capture   │    │  Detection  │
└─────────────┘    └─────────────┘    └─────────────┘
                                             │
                                             ▼
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  External   │◀───│  Callback   │◀───│    Face     │
│   System    │    │   Handler   │    │ Recognition │
└─────────────┘    └─────────────┘    └─────────────┘
                                             │
                                             ▼
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ Web Client  │◀───│ WebSocket/  │◀───│  Bounding   │
│   Display   │    │ HTTP Stream │    │  Box Draw   │
└─────────────┘    └─────────────┘    └─────────────┘
-- Users table for authentication
CREATE TABLE users (
id INTEGER PRIMARY KEY,
username STRING UNIQUE,
hashed_password STRING
);
-- Configuration parameters
CREATE TABLE config (
key STRING PRIMARY KEY,
value STRING
);
-- Registered face embeddings
CREATE TABLE registered_faces (
id INTEGER PRIMARY KEY,
person_id STRING UNIQUE,
embedding BLOB, -- Pickled numpy array
updated_at DATETIME
);
-- RTSP stream configurations
CREATE TABLE rtsp_streams (
id INTEGER PRIMARY KEY,
name STRING,
rtsp_url STRING,
is_active BOOLEAN,
created_at DATETIME,
last_detection DATETIME
);

- Python 3.8+
- CUDA-compatible GPU (optional, for acceleration)
- FFmpeg (for RTSP stream processing)
The project includes a comprehensive requirements.txt file with all necessary dependencies:
# Install all dependencies
pip install -r requirements.txt
# Key libraries included:
# - torch, torchvision (PyTorch ML framework)
# - facenet-pytorch (Face recognition models)
# - opencv-python (Computer vision)
# - fastapi, uvicorn (Web framework)
# - sqlalchemy (Database ORM)
# - pyjwt, passlib[bcrypt] (Authentication)
# - jinja2 (Template engine)
# - httpx (HTTP client for callbacks)

Note: The Docker setup automatically handles all dependency installation and model file requirements.
- Clone the repository
git clone https://github.com/tovfikur/rtsp_face_recognition.git
cd rtsp_face_recognition
- Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Environment configuration
# Create .env file (optional)
SECRET_KEY=your-super-secure-secret-key-here
- Initialize database
# Database tables are created automatically on first run
python main.py

# Development mode
python main.py
# Production mode with Uvicorn
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

- Username: admin
- Password: admin
- Main Dashboard: http://localhost:8000/
- API Documentation: http://localhost:8000/docs
- RTSP Management: http://localhost:8000/rtsp/manage
- WebSocket Test: http://localhost:8000/ws/test
Obtain JWT access token
curl -X POST "http://localhost:8000/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "username=admin&password=admin"

Register a new person with face images
curl -X POST "http://localhost:8000/register" \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "person_id=john_doe" \
-F "images=@photo1.jpg" \
-F "images=@photo2.jpg"

Authenticate person from uploaded image
curl -X POST "http://localhost:8000/authenticate" \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "file=@test_image.jpg"

Compare two face images
curl -X POST "http://localhost:8000/compare" \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "reference=@person1.jpg" \
-F "check=@person2.jpg"

Add new RTSP stream
curl -X POST "http://localhost:8000/rtsp/streams" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Main Entrance",
"rtsp_url": "rtsp://admin:password@192.168.1.100:554/stream1"
}'

Start monitoring RTSP stream
curl -X POST "http://localhost:8000/rtsp/streams/1/start" \
-H "Authorization: Bearer YOUR_TOKEN"

View live processed video stream
http://localhost:8000/rtsp/streams/1/video
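
For clients that want to consume this stream outside a browser, the sketch below shows one common way to split an MJPEG byte stream into individual JPEG frames by scanning for the JPEG start and end markers. It assumes the endpoint serves MJPEG, as the performance notes in this README suggest; the function name iter_jpeg_frames is hypothetical, and the marker scan is a simplification that ignores JPEGs with embedded thumbnails.

```python
from typing import Iterable, Iterator

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def iter_jpeg_frames(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield complete JPEG frames found in a sequence of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            start = buffer.find(SOI)
            end = buffer.find(EOI, start + 2) if start != -1 else -1
            if end == -1:
                break  # no complete frame yet; wait for more data
            yield buffer[start:end + 2]
            buffer = buffer[end + 2:]


# Simulated multipart stream: two frames split across three chunks.
chunks = [
    b"--frame\r\n" + SOI + b"abc",
    b"def" + EOI + b"\r\n--frame\r\n" + SOI,
    b"xyz" + EOI,
]
frames = list(iter_jpeg_frames(chunks))
print(len(frames))  # 2
```

With a live server, the chunks could come from reading an HTTP response in fixed-size blocks (for example with urllib.request.urlopen and repeated read(4096) calls).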
Get current system configuration
curl -X GET "http://localhost:8000/config" \
-H "Authorization: Bearer YOUR_TOKEN"

Update system configuration
curl -X PUT "http://localhost:8000/config" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"tolerance": 0.8,
"detection_threshold": 0.95
}'

Connect to WebSocket endpoint for live face detection:
const ws = new WebSocket('ws://localhost:8000/ws/detection?token=YOUR_JWT_TOKEN');
// Send base64-encoded image frame
ws.send(base64ImageData);
// Receive detection results
ws.onmessage = function(event) {
const results = JSON.parse(event.data);
console.log('Detected persons:', results);
};

- Client sends base64-encoded image frame
- Server processes frame through MTCNN → InceptionResnetV1
- Face matching against registered embeddings
- Results returned as JSON array of person IDs
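
The matching step in the list above can be sketched as a nearest-neighbour search over the registered embeddings. This is a minimal illustration, not the project's actual code: the function name match_embedding is hypothetical, Euclidean distance is an assumed metric, and toy 3-dimensional vectors stand in for the 512-dimensional embeddings that InceptionResnetV1 produces. The 0.9 default mirrors the TOLERANCE parameter, where a lower value is stricter.

```python
import math
from typing import Dict, Optional, Sequence

TOLERANCE = 0.9  # default TOLERANCE value; lower = stricter


def match_embedding(
    probe: Sequence[float],
    registered: Dict[str, Sequence[float]],
    tolerance: float = TOLERANCE,
) -> Optional[str]:
    """Return the closest person_id, or None if nobody is within tolerance."""
    best_id, best_dist = None, float("inf")
    for person_id, reference in registered.items():
        dist = math.dist(probe, reference)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    return best_id if best_dist <= tolerance else None


# Toy vectors; real embeddings would come from the recognition model.
registered = {"john_doe": [0.1, 0.2, 0.3], "jane_roe": [0.9, 0.8, 0.7]}
print(match_embedding([0.12, 0.21, 0.29], registered))  # john_doe
print(match_embedding([5.0, 5.0, 5.0], registered))     # None
```

Returning None for an unmatched face keeps unknown people out of the result list instead of forcing a nearest match.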
| Parameter | Default | Description |
|---|---|---|
| TOLERANCE | 0.9 | Face matching threshold (lower = stricter) |
| DETECTION_THRESHOLD | 0.95 | Face detection confidence threshold |
| CALLBACK_URL | "" | External system notification endpoint |
| CALLBACK_TOKEN | "" | Authentication token for callbacks |
Configure external system notifications:
curl -X PUT "http://localhost:8000/callback-config" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"callback_url": "https://your-system.com/api/face-detected",
"callback_token": "your-api-key"
}'

The system sends POST requests with this payload:
{
"employee_id": 123,
"check_in": "2025-01-20 14:30:00",
"check_out": "2025-01-20 22:30:00"
}

The project includes a multi-stage Docker build that supports both development and production environments.
The Dockerfile uses a two-stage build process:
- Stage 1 - Builder: Compiles dependencies and builds wheels
- Stage 2 - Runtime: Creates slim production image
Key features:
- Multi-stage build for optimized image size
- Non-root user security implementation
- Volume mounting for development
- Required model files (shape_predictor_68_face_landmarks.dat, nn4.small2.v1.t7)
- FFmpeg integration for RTSP stream processing
- Flexible entrypoint supporting dev/prod modes
Before building, ensure these OpenFace model files are in your project root:
# Download required model files
wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
bunzip2 shape_predictor_68_face_landmarks.dat.bz2
wget https://storage.cmusatyalab.org/openface-models/nn4.small2.v1.t7

# Clone and build
git clone https://github.com/tovfikur/rtsp_face_recognition.git
cd rtsp_face_recognition
# Start development environment
docker-compose up --build

The compose file includes:
- Development mode: DEV=1 environment variable
- Live reload: Volume mounting for code changes
- Log persistence: Dedicated volume for application logs
- Auto-restart: Container restart policy
version: "3.8"
services:
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DEV=1 # Enables development mode with hot reload
    volumes:
      - .:/app # Live code mounting for development
      - logs:/tmp/logs # Persistent log storage
    restart: unless-stopped
volumes:
  logs: # Named volume for log persistence

For production, modify the environment:
version: "3.8"
services:
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DEV=0 # Production mode
      - SECRET_KEY=your-super-secure-production-key
    volumes:
      - ./db:/app/db # Persist database
      - logs:/tmp/logs
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 1G
volumes:
  logs:

- Non-root execution: Application runs as appuser
- Sudo access: Available for system-level operations
- Secure defaults: Proper file permissions and user isolation
- Optimized libraries: Pre-compiled OpenCV and ML dependencies
- Efficient caching: Multi-stage build reduces final image size
- Resource management: Configurable memory limits
- Live reload: Code changes reflect immediately
- Volume mounting: Full project directory accessible
- Log access: Persistent logging across container restarts
# Development mode (with live reload)
docker-compose up
# Production mode
DEV=0 docker-compose up -d
# View logs
docker-compose logs -f app
# Access container shell
docker-compose exec app bash
# Stop services
docker-compose down

| Variable | Default | Description |
|---|---|---|
| DEV | 1 | Development mode (0 for production) |
| SECRET_KEY | auto | JWT signing key |
| PYTHONDONTWRITEBYTECODE | 1 | Prevent .pyc files |
| PYTHONUNBUFFERED | 1 | Real-time logging |
The entrypoint.sh script automatically chooses the appropriate server mode:
- Development: Uvicorn with auto-reload
- Production: Gunicorn with multiple workers
Auto-reload speeds up iteration during development, while multiple Gunicorn workers serve concurrent requests in production, so the same image covers both environments.
The system automatically detects and uses CUDA-compatible GPUs:
import torch

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

For production deployments:
# Adjust frame processing settings
cap.set(cv2.CAP_PROP_BUFFERSIZE, 1) # Minimize buffering
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640) # Optimize resolution
cap.set(cv2.CAP_PROP_FPS, 10) # Control frame rate

- CPU: Use ThreadPoolExecutor with max_workers=5
- Memory: Explicit garbage collection after frame processing
- Network: MJPEG streaming with quality optimization
- Database: Connection pooling with SQLAlchemy
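
The CPU and memory points above can be sketched with the standard library. This is an illustration only: process_stream, fake_reader, and run_streams are hypothetical names, the dummy frames stand in for a real RTSP capture loop, and garbage collection is triggered once per batch of frames rather than wherever the project actually calls it.

```python
import gc
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

MAX_WORKERS = 5  # value quoted in the list above


def process_stream(stream_id: int, read_frames: Callable[[int], List[bytes]]) -> int:
    """Hypothetical per-stream worker: consume frames, then free memory."""
    processed = 0
    for _frame in read_frames(stream_id):
        # Real code would run detection and recognition on the frame here.
        processed += 1
    gc.collect()  # explicit collection once this batch of frames is done
    return processed


def fake_reader(stream_id: int) -> List[bytes]:
    """Stand-in for an RTSP capture loop; returns stream_id dummy frames."""
    return [b"frame"] * stream_id


def run_streams(stream_ids: List[int]) -> Dict[int, int]:
    """Process every stream concurrently and report frames handled per stream."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = {sid: pool.submit(process_stream, sid, fake_reader) for sid in stream_ids}
        return {sid: fut.result() for sid, fut in futures.items()}


print(run_streams([1, 2, 3]))  # {1: 1, 2: 2, 3: 3}
```

Capping the pool at five workers bounds how many streams are decoded at once; additional streams wait in the executor's queue.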
- Change default credentials immediately
- Set strong SECRET_KEY environment variable
- Enable HTTPS in production
- Configure CORS for specific origins
- Implement rate limiting for API endpoints
- Regular security updates for dependencies
Client Request → JWT Token Validation → User Verification → Resource Access
- Expiry: 30 minutes default
- Storage: HTTP-only cookies + Authorization headers
- Refresh: Manual re-authentication required
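
The token lifecycle above (signed token, 30-minute expiry, re-authentication once it lapses) can be illustrated with a standard-library sketch of an HS256-signed JWT. The project lists pyjwt as a dependency, so its real code presumably uses that library instead; create_token, verify_token, the SECRET_KEY placeholder, and the sub/exp claim layout are assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET_KEY = "change-me"          # placeholder; load from the environment in practice
ACCESS_TOKEN_EXPIRE_MINUTES = 30  # default expiry listed above


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET_KEY.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(username: str, now: Optional[float] = None) -> str:
    """Build header.payload.signature with an exp claim 30 minutes ahead."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": username, "exp": int(now) + ACCESS_TOKEN_EXPIRE_MINUTES * 60}
    body = _b64url(json.dumps(claims).encode())
    return f"{header}.{body}.{_b64url(_sign(f'{header}.{body}'))}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the username if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        header, body, signature = token.split(".")
        if not hmac.compare_digest(_sign(f"{header}.{body}"), _b64url_decode(signature)):
            return None
        claims = json.loads(_b64url_decode(body))
    except (ValueError, TypeError):
        return None
    return claims["sub"] if claims.get("exp", 0) > now else None


token = create_token("admin", now=1000)
print(verify_token(token, now=1000 + 29 * 60))  # admin
print(verify_token(token, now=1000 + 31 * 60))  # None (expired)
```

Because there is no refresh flow, a client that receives None (or a 401 from the API) has to request a new token from the login endpoint.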
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("FaceRecognitionAPI")

- Frame processing time monitoring
- Detection accuracy tracking
- System resource utilization
- RTSP connection stability
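
Frame processing time, the first metric in the list above, could be tracked with a small rolling-window timer. This is a sketch under assumptions: the FrameTimer class and its window size are hypothetical, and the sleep call stands in for detection and recognition work on one frame.

```python
import time
from collections import deque
from contextlib import contextmanager


class FrameTimer:
    """Keeps a rolling window of per-frame processing times in seconds."""

    def __init__(self, window: int = 100) -> None:
        self.samples = deque(maxlen=window)  # oldest samples drop off automatically

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples.append(time.perf_counter() - start)

    def average_ms(self) -> float:
        """Mean processing time over the window, in milliseconds."""
        if not self.samples:
            return 0.0
        return 1000.0 * sum(self.samples) / len(self.samples)


timer = FrameTimer(window=3)
for _ in range(5):
    with timer.measure():
        time.sleep(0.001)  # stand-in for processing one frame
print(f"frames kept: {len(timer.samples)}, avg: {timer.average_ms():.2f} ms")
```

The average could be written to the FaceRecognitionAPI logger at a fixed interval to spot streams that fall behind.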
Monitor these endpoints:
- /docs: API documentation availability
- /config: System configuration status
- /rtsp/streams: Active stream monitoring
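
A simple poller for these endpoints might look like the sketch below. The names http_status and check_health are hypothetical, and the fetch function is injectable so the example runs without a live server. Endpoints that require a token will likely answer 401 to an unauthenticated probe, which still indicates that the server is reachable.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable

ENDPOINTS = ("/docs", "/config", "/rtsp/streams")


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, e.g. 401 without a token
    except OSError:
        return 0  # connection refused, DNS failure, timeout


def check_health(
    base_url: str,
    fetch: Callable[[str], int] = http_status,
    endpoints: Iterable[str] = ENDPOINTS,
) -> Dict[str, int]:
    """Map each monitored path to the status code returned by fetch."""
    return {path: fetch(base_url.rstrip("/") + path) for path in endpoints}


def fake_fetch(url: str) -> int:
    """Offline stand-in: pretend /config needs a token and the rest are fine."""
    return 401 if url.endswith("/config") else 200


report = check_health("http://localhost:8000", fetch=fake_fetch)
print(report)  # {'/docs': 200, '/config': 401, '/rtsp/streams': 200}
```

Dropping the fetch argument makes check_health issue real requests against the running service.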
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open Pull Request
- Follow PEP 8 style guidelines
- Add type hints for new functions
- Include docstrings for public methods
- Write unit tests for new features
Copyright © 2025 Kendroo. All rights reserved.
Developed by: Iftekar Hossan (tovfikur)
Q: RTSP stream not connecting
- Verify camera URL and credentials
- Check network connectivity
- Ensure FFmpeg is installed
Q: Face detection not working
- Verify image quality and lighting
- Check GPU/CUDA installation
- Adjust detection threshold
Q: WebSocket connection fails
- Validate JWT token format
- Check browser WebSocket support
- Verify CORS configuration
- GitHub: tovfikur
- Project: rtsp_face_recognition
- v1.0.0: Initial release with core features
  - Face detection and recognition
  - RTSP stream processing
  - WebSocket real-time communication
  - JWT authentication
  - Web-based management interface
Built with ❤️ using FastAPI, PyTorch, and modern web technologies.