A multimodal machine learning system that detects depression by analyzing text, speech, and facial expressions from clinical interviews. Built on the E-DAIC (Extended DAIC-WOZ) dataset with a Flask web application for real-time screening.
Depression is a major mental health disorder often underdiagnosed due to reliance on subjective self-reporting. This project builds an automated screening tool that combines three communication channels:
| Modality | Features Extracted | Method |
|---|---|---|
| Text | Sentiment (VADER), TF-IDF, linguistic markers | NLP |
| Audio | MFCCs, eGeMAPS (pitch, energy, speaking rate) | OpenSMILE |
| Visual | Facial Action Units, head pose, gaze | OpenFace / face-api.js |
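For illustration, here is a minimal sketch of the text branch (VADER sentiment scores plus a small TF-IDF vocabulary). Function names and parameter values are assumptions for this sketch, not the exact code in src/text_features.py:

```python
# Sketch of the text branch: VADER sentiment scores + a small TF-IDF vocabulary.
# Names and parameters are illustrative; see src/text_features.py for the real code.
import nltk
import numpy as np
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("vader_lexicon", quiet=True)

def extract_text_features(transcripts):
    """Build one row per transcript: 4 VADER scores followed by TF-IDF terms."""
    sia = SentimentIntensityAnalyzer()
    scores = [sia.polarity_scores(t) for t in transcripts]
    vader = np.array([[s["neg"], s["neu"], s["pos"], s["compound"]] for s in scores])
    tfidf = TfidfVectorizer(max_features=200, stop_words="english")
    tfidf_matrix = tfidf.fit_transform(transcripts).toarray()
    return np.hstack([vader, tfidf_matrix])
```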
The system uses late fusion to combine predictions from unimodal L1-regularized Logistic Regression models into a final depression risk score.
```
┌────────────────────────────────────────┐
│         Data Acquisition Layer         │
│  Text           Audio         Visual   │
└────┬──────────────┬──────────────┬─────┘
     │              │              │
┌────▼─────┐   ┌────▼─────┐   ┌────▼─────┐
│   Text   │   │  Audio   │   │  Visual  │   Preprocessing
│ Preproc. │   │ Preproc. │   │ Preproc. │
└────┬─────┘   └────┬─────┘   └────┬─────┘
     │              │              │
┌────▼─────┐   ┌────▼─────┐   ┌────▼─────┐
│   Text   │   │  Audio   │   │  Visual  │   Feature Extraction
│ Features │   │ Features │   │ Features │
└────┬─────┘   └────┬─────┘   └────┬─────┘
     │              │              │
     └──────────────┼──────────────┘
                    │
          ┌─────────▼─────────┐
          │    Late Fusion    │   Multimodal Fusion
          │  (Weighted Avg)   │
          └─────────┬─────────┘
                    │
          ┌─────────▼─────────┐
          │    Depression     │   Classification
          │     Detection     │
          │  Output + Score   │
          └───────────────────┘
```
```
depression_project/
├── app.py                    # Flask web application
├── main.py                   # ML pipeline (train + evaluate)
├── requirements.txt          # Python dependencies
├── src/
│   ├── load_labels.py        # Load PHQ-8 labels from E-DAIC
│   ├── text_features.py      # Text feature extraction (VADER + TF-IDF)
│   ├── audio_features.py     # Audio feature extraction (MFCC + eGeMAPS)
│   ├── visual_features.py    # Visual feature extraction (AUs + pose)
│   ├── fusion.py             # Model training + late fusion
│   └── evaluate.py           # Metrics, plots, confusion matrices
├── models/                   # Trained .pkl model files
├── data/features/            # Extracted feature CSVs
├── results/                  # Evaluation outputs (plots, CSV)
├── templates/index.html      # Web app frontend (SPA)
├── static/
│   ├── css/style.css         # Dark theme + glassmorphism
│   └── js/app.js             # Frontend logic + face-api.js
└── notebooks/                # Exploratory analysis (optional)
```
- Python 3.8+
- E-DAIC dataset (for training pipeline)
```bash
git clone https://github.com/TheSpectre542005/Depression-Detection-Multimodal-.git
cd Depression-Detection-Multimodal-
pip install -r requirements.txt
```

Before training, check that the E-DAIC data is properly accessible:

```bash
python diagnose_data.py
```

This will report any missing files or data quality issues.

```bash
python main.py
```

This trains the unimodal models with cost-sensitive thresholding (false negatives are weighted more heavily), performs late fusion with dynamic weights, and saves results to results/.
New Features:
- Cost-sensitive learning: false negatives (missed depression cases) are weighted 5x more than false positives (see the sketch after this list)
- Dynamic fusion weights: Based on validation AUC performance
- Clinical metrics: Sensitivity, specificity, PPV, NPV
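A minimal sketch of how the first two points could be wired up with scikit-learn. The 5:1 class weighting and the AUC-proportional fusion weights follow the list above, but the function names and details are illustrative rather than the exact logic in main.py and src/fusion.py:

```python
# Sketch: cost-sensitive training (false negatives weighted 5x) and
# AUC-based fusion weights. Illustrative only; the project's actual logic
# lives in main.py and src/fusion.py.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def train_cost_sensitive(X_train, y_train):
    # Up-weighting the positive (depressed) class makes the model pay
    # roughly 5x more for a missed case than for a false alarm.
    clf = LogisticRegression(penalty="l1", solver="liblinear",
                             class_weight={0: 1, 1: 5}, max_iter=1000)
    return clf.fit(X_train, y_train)

def dynamic_fusion_weights(models, X_val_by_modality, y_val):
    # Weight each modality in proportion to its validation AUC.
    aucs = {name: roc_auc_score(y_val, m.predict_proba(X_val_by_modality[name])[:, 1])
            for name, m in models.items()}
    total = sum(aucs.values())
    return {name: auc / total for name, auc in aucs.items()}
```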
Check if models meet clinical thresholds:
```bash
python validate_models.py
```

Minimum acceptable thresholds: AUC > 0.70, sensitivity > 0.70.
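A rough sketch of the kind of check this performs; the thresholds come from above, while the function name and signature are assumptions:

```python
# Sketch of a clinical-threshold check: require AUC > 0.70 and sensitivity > 0.70.
from sklearn.metrics import roc_auc_score, recall_score

def meets_clinical_thresholds(model, X_test, y_test, min_auc=0.70, min_sens=0.70):
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    sensitivity = recall_score(y_test, model.predict(X_test))  # recall on the positive class
    return auc > min_auc and sensitivity > min_sens
```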
```bash
python app.py
# Open http://localhost:5000
```

The Sentira web app provides a complete screening experience:
- Landing Page — Project overview and disclaimer
- PHQ-8 Survey — Standard 8-question clinical questionnaire (keyboard shortcuts: 0-3)
- Virtual Assistant Interview — AI chatbot asks 8 clinical-style questions while webcam captures facial expressions via face-api.js
- Results Dashboard — Risk gauge, PHQ-8 breakdown, text analysis with sentiment, and facial expression chart
The final risk score uses 3-way late fusion:
| Modality | Weight | Source |
|---|---|---|
| PHQ-8 Score | 35% | Clinical questionnaire |
| Text Analysis | 35% | Trained ML model (L1 LogReg) |
| Facial Analysis | 30% | face-api.js expression detection |
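The combination itself is a weighted average of the three normalized scores. A minimal sketch using the weights from the table; the variable names and normalization are illustrative:

```python
# Sketch: 3-way late fusion of normalized risk scores (each mapped into [0, 1]).
# Weights follow the table above; names are illustrative.
def fuse_risk_scores(phq8_score, text_prob, facial_score):
    """phq8_score: raw PHQ-8 total (0-24); text_prob and facial_score in [0, 1]."""
    phq8_norm = phq8_score / 24.0          # normalize the questionnaire total
    risk = 0.35 * phq8_norm + 0.35 * text_prob + 0.30 * facial_score
    return round(100 * risk, 1)            # percentage risk for the dashboard gauge

# Example: PHQ-8 total of 12, text model probability 0.6, facial score 0.4
print(fuse_risk_scores(12, 0.6, 0.4))      # -> 50.5
```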
| Model | Accuracy | F1 | AUC-ROC |
|---|---|---|---|
| Text Only | 0.758 | 0.429 | 0.591 |
| Audio Only | 0.394 | 0.474 | 0.548 |
| Visual Only | 0.455 | 0.526 | 0.657 |
| Late Fusion | 0.758 | 0.429 | 0.591 |
| Early Fusion | 0.576 | 0.462 | 0.635 |
- SMOTE applied inside cross-validation folds to prevent data leakage (see the sketch after this list)
- L1-regularized Logistic Regression for built-in feature selection
- PCA dimensionality reduction (248 audio → 20, 214 visual → 20)
- Smart fusion weights excluding modalities with AUC ≤ 0.52
- Constrained thresholds (0.25–0.65) to prevent degenerate predictions
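A minimal sketch of how the first three points fit together using imbalanced-learn's pipeline, which re-fits SMOTE on the training portion of each fold only. Component order and parameter values are illustrative:

```python
# Sketch: SMOTE inside CV folds + PCA + L1-regularized logistic regression.
# imblearn's Pipeline applies the sampler only when fitting each training fold.
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, StratifiedKFold

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("smote", SMOTE(random_state=42)),          # resampling happens per training fold
    ("pca", PCA(n_components=20)),              # e.g. 248 audio dims -> 20 components
    ("clf", LogisticRegression(penalty="l1", solver="liblinear")),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
# With X (feature matrix) and y (binary labels) loaded:
# aucs = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
```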
This project uses the E-DAIC (Extended DAIC-WOZ) dataset:
- 275 clinical interview recordings
- Audio, video, and text transcripts per participant
- PHQ-8 depression severity labels
- Binary classification: PHQ-8 ≥ 10 → Depressed
The dataset is not included in this repository due to licensing restrictions.
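For orientation, a small sketch of how the PHQ-8 labels can be binarized at the ≥ 10 cutoff. The file and column names are assumptions, not taken from the actual E-DAIC release; the project's loader is src/load_labels.py:

```python
# Sketch: load PHQ-8 labels and binarize at the clinical cutoff (>= 10).
# File/column names are assumed for illustration only.
import pandas as pd

def load_binary_labels(csv_path="labels.csv"):
    df = pd.read_csv(csv_path)                  # expected columns: Participant_ID, PHQ8_Score
    df["depressed"] = (df["PHQ8_Score"] >= 10).astype(int)
    return df[["Participant_ID", "depressed"]]
```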
| Category | Tools |
|---|---|
| Language | Python 3.8+ |
| ML | scikit-learn, imbalanced-learn |
| NLP | NLTK, VADER Sentiment |
| Audio | OpenSMILE (pre-extracted) |
| Visual | OpenFace (training), face-api.js (web) |
| Web | Flask, HTML/CSS/JavaScript |
| Design | Glassmorphism, CSS animations |
This is a screening tool for research and educational purposes only. It does NOT provide medical diagnosis. If you or someone you know is struggling with depression, please contact a licensed mental health professional or call a crisis helpline.
This project is for academic use. See LICENSE for details.