EEG-Based Motor Imagery Classification Using Machine Learning
A complete pipeline for classifying motor imagery EEG signals from the BCI Competition IV Dataset 2a (Graz Data Set A). This project implements and compares 7 machine learning models across 3 evaluation protocols, with an additional Weighted Majority Vote Ensemble (WMVE) method.
Developed as a Machine Learning mini-project at the Faculty of Sciences and Techniques of Mohammedia (FSTM), Hassan II University of Casablanca, under the supervision of Pr. Nabil Azouagh.
- Project Overview
- Dataset
- Pipeline Architecture
- Evaluation Protocols
- Models Implemented
- Results Summary
- Project Structure
- Requirements
- Installation and Setup
- Usage
- Technical Notes
- Authors
- References
- License
Brain-Computer Interfaces (BCI) enable direct communication between the human brain and external devices by interpreting neural signals. This project explores the classification of motor imagery intentions from EEG signals, addressing the fundamental question: can we reliably distinguish what a human brain imagines, and do all brains behave the same way?
The classification targets 4 motor imagery classes (chance level = 25%):
- Left hand (class 1)
- Right hand (class 2)
- Feet (class 3)
- Tongue (class 4)
Key findings:
- Linear SVM achieved the best mean accuracy of 51.35% on Protocol 1 (per-subject), more than double the chance level.
- Classical ML models outperformed deep learning (CNN) given the limited data (288 trials per subject).
- Inter-subject variability is the dominant challenge: subject A08 consistently reached 60-67%, while A02 and A05 remained near chance level.
- The progressive degradation from Protocol 1 (51.35%) to Protocol 1.5 (38.81%) to Protocol 2 (31.05%) quantifies the cost of cross-subject generalization.
BCI Competition IV Dataset 2a (Graz Data Set A)
- 9 subjects (A01 to A09)
- 2 sessions per subject: Training (T) and Evaluation (E), recorded on different days
- 288 trials per session (72 per class), 6 runs of 48 trials
- 22 EEG channels (10-20 international system), sampled at 250 Hz
- 3 EOG channels (excluded during preprocessing)
- File format: GDF (.gdf) with evaluation labels in MATLAB (.mat) files
- Cue-based paradigm: fixation cross (t=0s), beep (t=2s), visual cue (t=3s-6s), rest
The dataset is not included in this repository due to its size. It is available on our shared Google Drive (see Resources) or from the official source (see References).
The processing pipeline is shared across all models and protocols:
Raw EEG (.gdf)
|
v
[1] Loading and Channel Selection (22 EEG channels via MNE-Python)
|
v
[2] Bandpass Filtering (Butterworth IIR, 8-30 Hz, order 4)
|
v
[3] Epoching (4s windows from cue onset, 1000 samples at 250 Hz)
|
v
[4] Artifact Rejection (event 1023)
|
v
[5] Feature Extraction
|-- Bandpower: 4 bands x 22 channels = 88 features (Welch PSD)
|-- CSP: Common Spatial Patterns (6 or 10 components)
|-- Concatenation: 94 features (P1) or 98 features (P1.5, P2)
|
v
[6] Normalization (StandardScaler, fitted on training data only)
|
v
[7] Dimensionality Reduction (PCA at 95% cumulative variance)
|
v
[8] Classification (7 models with GridSearchCV, 5-fold Stratified CV)
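Steps [2] and [3] can be sketched with SciPy on synthetic data. The channel count, sampling rate, filter design, and window length follow the pipeline above; the signal and the cue positions are simulated for illustration (in the project, cue onsets come from the GDF event table via MNE):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 250          # sampling rate (Hz)
N_CHANNELS = 22   # EEG channels after dropping the 3 EOG channels

# [2] 4th-order Butterworth bandpass, 8-30 Hz (mu + beta range)
sos = butter(4, [8, 30], btype="bandpass", fs=FS, output="sos")

# Simulated continuous recording: 60 s of 22-channel noise
rng = np.random.default_rng(0)
raw = rng.standard_normal((N_CHANNELS, 60 * FS))
filtered = sosfiltfilt(sos, raw, axis=1)   # zero-phase filtering

# [3] Epoching: one 4 s window (1000 samples) per cue onset
cue_onsets = [2 * FS, 10 * FS, 18 * FS]    # hypothetical cue sample indices
epochs = np.stack([filtered[:, t:t + 4 * FS] for t in cue_onsets])
print(epochs.shape)  # (3, 22, 1000): trials x channels x samples
```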
Frequency bands used for Bandpower extraction:
- Delta: 0.5-4 Hz
- Theta: 4-8 Hz
- Mu/Alpha: 8-13 Hz (sensorimotor rhythm)
- Beta: 13-30 Hz (motor activity)
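A minimal version of the Welch bandpower extraction over these four bands. Summing the PSD bins per band (here with 1 Hz resolution) is one common convention; the project notebooks may integrate differently:

```python
import numpy as np
from scipy.signal import welch

FS = 250
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "mu": (8, 13), "beta": (13, 30)}

def bandpower_features(epoch, fs=FS):
    """epoch: (n_channels, n_samples) -> (n_channels * n_bands,) feature vector."""
    freqs, psd = welch(epoch, fs=fs, nperseg=fs, axis=-1)  # 1 Hz bins
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        # sum PSD bins falling in the band (proportional to bandpower)
        feats.append(psd[:, mask].sum(axis=-1))
    return np.concatenate(feats)

rng = np.random.default_rng(0)
epoch = rng.standard_normal((22, 1000))   # one trial: 22 channels x 4 s
features = bandpower_features(epoch)
print(features.shape)  # (88,) = 4 bands x 22 channels
```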
CSP implementation:
- Protocol 1: Manual binary CSP (6 components, left hand vs right hand)
- Protocols 1.5 and 2: MNE multiclass CSP with Ledoit-Wolf regularization (10 components, one-vs-rest)
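For Protocol 1, the manual binary CSP amounts to solving a generalized eigenvalue problem on the two class-averaged spatial covariance matrices. A compact NumPy/SciPy sketch on simulated trials (an illustration of the technique, not the project's exact implementation):

```python
import numpy as np
from scipy.linalg import eigh

def binary_csp(X_a, X_b, n_components=6):
    """X_a, X_b: (n_trials, n_channels, n_samples) for the two classes.
    Returns spatial filters W of shape (n_components, n_channels)."""
    def mean_cov(X):
        # average trace-normalized spatial covariance over trials
        covs = [x @ x.T / np.trace(x @ x.T) for x in X]
        return np.mean(covs, axis=0)

    C_a, C_b = mean_cov(X_a), mean_cov(X_b)
    # Generalized eigenvalue problem: C_a w = lambda (C_a + C_b) w
    vals, vecs = eigh(C_a, C_a + C_b)
    # Keep filters from both ends of the spectrum (most discriminative)
    order = np.argsort(vals)
    pick = np.r_[order[:n_components // 2], order[-n_components // 2:]]
    return vecs[:, pick].T

rng = np.random.default_rng(0)
X_left = rng.standard_normal((40, 22, 1000))   # simulated left-hand trials
X_right = rng.standard_normal((40, 22, 1000))  # simulated right-hand trials
W = binary_csp(X_left, X_right)
print(W.shape)  # (6, 22)
# Log-variance of the CSP-filtered signals is the usual feature
feats = np.log(np.var(W @ X_left[0], axis=1))
print(feats.shape)  # (6,)
```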
Three evaluation protocols of increasing difficulty were implemented:
Protocol 1 (within-subject):
- Train on session T (288 trials), evaluate on session E (288 trials), per subject
- Produces 9 individual accuracies
- Preprocessing, feature extraction, normalization, and PCA fitted per subject
- PCA components: 14 to 21 depending on the subject
Protocol 1.5 (cross-subject, pooled sessions):
- All 9 training sessions (T) pooled as the training set (2592 trials)
- All 9 evaluation sessions E pooled (2592 trials) as test set
- Single global model, PCA retains 13 components
Protocol 2 (global 80/20 split):
- All 18 sessions (9 T + 9 E) merged (5184 trials)
- Stratified random split: 80% training (4147), 20% test (1037)
- PCA retains 12 components
WMVE (Weighted Majority Vote Ensemble):
- Protocol 1: Top 5 models per subject, weighted by CV accuracy
- Protocol 1.5: Top 3 models globally (RF + SVM RBF + CNN)
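The weighted vote itself is simple: each model casts its predicted class with a weight equal to its CV accuracy, and the class with the largest total weight wins. A sketch with hypothetical predictions and weights:

```python
import numpy as np

def wmve(predictions, weights, n_classes=4):
    """predictions: (n_models, n_trials) hard labels in {0..n_classes-1}.
    weights: (n_models,), e.g. each model's CV accuracy."""
    n_models, n_trials = predictions.shape
    scores = np.zeros((n_trials, n_classes))
    for m in range(n_models):
        # each model adds its weight to the class it predicts, per trial
        scores[np.arange(n_trials), predictions[m]] += weights[m]
    return scores.argmax(axis=1)

# Three hypothetical models voting on four trials
preds = np.array([[0, 1, 2, 3],
                  [0, 1, 1, 3],
                  [1, 1, 2, 2]])
weights = np.array([0.60, 0.58, 0.55])   # illustrative CV accuracies
print(wmve(preds, weights))  # [0 1 2 3]
```

Note that ties and disagreements are resolved by total weight, not by count: two weak models can be outvoted by one strong one if their combined weight is lower.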
Each model represents a distinct algorithmic family:
| Model | Family | Hyperparameter Grid |
|---|---|---|
| Linear SVM | Kernel methods (linear) | C: {0.01, 0.1, 1, 10}, class_weight: {None, balanced} |
| RBF SVM | Kernel methods (non-linear) | C: {1, 10, 100}, gamma: {0.001, 0.01, 0.1, scale, auto}, class_weight |
| Random Forest | Tree-based ensemble | n_estimators: {100, 200, 300}, max_depth: {10, 20, None}, criterion: {gini, entropy} |
| AdaBoost | Boosting ensemble | n_estimators: {50, 100, 200, 300}, learning_rate: {0.01, 0.1, 0.5, 1.0}, max_depth: {1, 2, 3} |
| Naive Bayes | Probabilistic | var_smoothing: {1e-9, 1e-8, 1e-7, 1e-6, 1e-5} |
| KNN | Instance-based | n_neighbors: {3, 5, 7, 9, 11, 15, 17, 19, 21}, weights, metric |
| CNN (1D) | Deep learning | filters: {8, 16}, kernel_size: {3, 5}, dense_units: {16, 32}, batch_size: {16, 32} |
CNN architecture: Input -> Reshape -> Conv1D -> ReLU -> MaxPooling1D -> Flatten -> Dense -> ReLU -> Dense(4) -> Softmax (Adam optimizer, 50 epochs, EarlyStopping patience=10)
All models are optimized using GridSearchCV with 5-fold StratifiedKFold on training data only. Models are saved as .pkl (joblib) or .keras (CNN) files.
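The tuning loop is standard scikit-learn. A minimal sketch for the Linear SVM grid, with synthetic features standing in for one subject's preprocessed data (the filename in the last line is hypothetical):

```python
import joblib
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.standard_normal((288, 14))   # one subject's training features
y_train = rng.integers(1, 5, size=288)     # classes 1-4

param_grid = {"C": [0.01, 0.1, 1, 10],
              "class_weight": [None, "balanced"]}
grid = GridSearchCV(SVC(kernel="linear"),
                    param_grid,
                    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
                    scoring="accuracy", n_jobs=-1)
grid.fit(X_train, y_train)   # fits on training data only
print(grid.best_params_, round(grid.best_score_, 3))

# Persist the refit best estimator, as the notebooks do with .pkl files
joblib.dump(grid.best_estimator_, "svm_linear_A01.pkl")
```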
Protocol 1 (mean over the 9 subjects):

| Rank | Model | CV (%) | EVAL (%) |
|---|---|---|---|
| 1 | Linear SVM | 60.48 | 51.35 |
| 2 | SVM RBF | 58.21 | 48.42 |
| 3 | Random Forest | 56.52 | 48.34 |
| 4 | Naive Bayes | 52.93 | 47.53 |
| 5 | AdaBoost | 54.21 | 46.60 |
| 6 | KNN | 50.51 | 44.45 |
| 7 | CNN | 55.44 | 44.68 |
| -- | WMVE Top 5 | -- | 51.54 |
Protocol 1.5:

| Model | CV (%) | TEST (%) |
|---|---|---|
| Random Forest | 42.63 | 38.81 |
| SVM RBF | 39.39 | 38.54 |
| CNN | 40.05 | 37.96 |
| WMVE Top 3 | -- | 40.16 |
Protocol 2:

| Model | CV (%) | TEST (%) |
|---|---|---|
| CNN | 39.69 | 40.31 |
| Linear SVM | 32.53 | 31.05 |
Inter-subject variability (Protocol 1):
- Best: A08 = 65.62%
- Worst: A02 = 34.38%, A05 = 32.29%
- Chance level: 25.00%
Code_Project/
|
|-- DataSet/ # Raw EEG data (Google Drive only)
| |-- All_Data/
| |-- Train_Data/ # A01T.gdf ... A09T.gdf
| |-- Evaluation_Data/ # A01E.gdf/mat ... A09E.gdf/mat
|
|-- Explore_DataSet/ # Data exploration notebooks
| |-- Explore.ipynb
| |-- Read_Data_DataSet.ipynb
|
|-- Le pipeline commun - Pre-traitement/ # Preprocessing pipeline
| |-- Traitement_Donnees_Preparation/
| | |-- Traitement_Donnees_Preparation.ipynb # Protocol 1
| | |-- Traitement_Donnees_Preparation_Protocole_1_5.ipynb # Protocol 1.5
| | |-- Traitement_Donnees_Preparation_Protocole_2.ipynb # Protocol 2
| |-- Data_Processed/ # Protocol 1 (per subject A01-A09)
| |-- Data_Processed_Protocole_1_5/ # Protocol 1.5 (global T vs E)
| |-- Data_Processed_Protocole_2/ # Protocol 2 (global 80/20)
|
|-- Models/ # Training notebooks and saved models
| |-- SVM - Lineaire - 2/ # Linear SVM, Protocol 1
| |-- SVM - RBF - 2/ # RBF SVM, Protocol 1
| |-- Random Forest/ # Random Forest, Protocol 1
| |-- Naive Bayes/ # Naive Bayes, Protocol 1
| |-- AdaBoost/ # AdaBoost, Protocol 1
| |-- KNN/ # KNN, Protocol 1
| |-- CNN/ # CNN, Protocol 1
| |-- SVM - Lineaire - 2 - Protocole 1_5/ # Linear SVM, Protocol 1.5
| |-- SVM - RBF - 2 - Protocole 1_5/ # RBF SVM, Protocol 1.5
| |-- Random Forest - Protocole 1_5/ # Random Forest, Protocol 1.5
| |-- Naive Bayes - Protocole 1_5/ # Naive Bayes, Protocol 1.5
| |-- AdaBoost - Protocole 1_5/ # AdaBoost, Protocol 1.5
| |-- KNN - Protocole 1_5/ # KNN, Protocol 1.5
| |-- CNN - Protocole 1_5/ # CNN, Protocol 1.5
| |-- SVM - Lineaire - 2 - Protocole 2/ # Linear SVM, Protocol 2
| |-- CNN - Protocole 2/ # CNN, Protocol 2
| |-- WMVE - Protocole 1/ # WMVE Top 5, per subject
| |-- WMVE - Protocole 1_5/ # WMVE Top 3, global
|
|-- Test Models/ # Prediction and testing notebooks
| |-- SVM - Lineaire - 2/ # SVM test on A08
| |-- WMVE - Protocole 1/ # WMVE test on A02, A08, A09
| |-- WMVE - Protocole 1_5/ # WMVE Top 3 test
|
|-- .gitignore
|-- README.md
- Python 3.9+
- Jupyter Notebook (Anaconda recommended)
numpy>=1.21.0
scipy>=1.7.0
scikit-learn>=1.0.0
mne>=1.0.0
tensorflow>=2.10.0
scikeras>=0.9.0
matplotlib>=3.5.0
seaborn>=0.11.0
joblib>=1.1.0
pandas>=1.3.0
```bash
# Clone the repository
git clone https://github.com/YouIsm1/Classification-of-Human-Brain-Signals.git
cd Classification-of-Human-Brain-Signals

# Create a conda environment (recommended)
conda create -n bci python=3.9
conda activate bci

# Install dependencies
pip install numpy scipy scikit-learn mne tensorflow scikeras matplotlib seaborn joblib pandas
```

Setup steps:

1. Clone this repository (commands above).
2. Download the dataset from Google Drive and place it in the `DataSet/All_Data/` directory:
   - Training data: `DataSet/All_Data/Train_Data/A01T.gdf` ... `A09T.gdf`
   - Evaluation data: `DataSet/All_Data/Evaluation_Data/A01E.gdf` ... `A09E.gdf`, with the corresponding `.mat` label files
3. Install the required Python packages (see Requirements).
4. Open Jupyter Notebook and navigate to the project directory.
Run the appropriate preprocessing notebook depending on the protocol:
- Protocol 1: `Le pipeline commun - Pre-traitement/Traitement_Donnees_Preparation/Traitement_Donnees_Preparation.ipynb`
- Protocol 1.5: `Traitement_Donnees_Preparation_Protocole_1_5.ipynb`
- Protocol 2: `Traitement_Donnees_Preparation_Protocole_2.ipynb`
This produces the preprocessed .npy files in the corresponding Data_Processed/ directories.
Navigate to the desired model folder under Models/ and run the training notebook. Each notebook follows a consistent structure:
- Imports and path configuration
- Data loading from `Data_Processed/`
- Grid search with 5-fold stratified CV
- Evaluation on test data
- Results export (CSV, accuracy plots, confusion matrices)
- Model saving (.pkl or .keras)
After training all individual models, run the WMVE notebooks:
- `Models/WMVE - Protocole 1/WMVE_Protocole_1_Top_5.ipynb`
- `Models/WMVE - Protocole 1_5/WMVE_Protocole_1_5 - Ver 2 - Top_3.ipynb`
Use the notebooks in Test Models/ to run predictions on specific subjects.
- Data leakage prevention: StandardScaler, CSP, and PCA are fitted exclusively on training data and applied (transform only) to evaluation/test data.
- Memory management for CNN: `tf.keras.backend.clear_session()`, `gc.collect()`, and `del best_model` are called between subjects to free the GPU/RAM accumulated by TensorFlow computation graphs.
- Intermediate saving: results are saved after each subject, so long training runs can recover from crashes.
- Absolute paths: Jupyter Notebook launches from `C:\Users\user\`, so all notebooks use absolute paths to the project directory.
- Google Drive sync: the `.gitignore` file excludes `DataSet/`, `*.npy`, `*.pkl`, `*.mat`, `*.gdf`, `*.keras`, `Data_Processed/`, `.ipynb_checkpoints/`, and `desktop.ini` to avoid corrupting Git when the project folder is synced with Google Drive.
- Training hardware: all training was performed on personal laptop CPUs without dedicated GPU access. CNN training took approximately 10 hours for Protocol 1 (9 subjects) and 5.6 hours for Protocol 2.
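The leakage rule in the first note can be sketched with scikit-learn: fit StandardScaler and PCA on the training session only, then apply the frozen transforms to the evaluation session (synthetic arrays stand in for the real feature matrices):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.standard_normal((288, 94))   # session T features (Protocol 1)
X_eval = rng.standard_normal((288, 94))    # session E features

scaler = StandardScaler().fit(X_train)     # statistics from training data only
pca = PCA(n_components=0.95).fit(scaler.transform(X_train))  # 95% variance

# Evaluation data is only ever transformed, never fitted on
X_eval_ready = pca.transform(scaler.transform(X_eval))
print(X_eval_ready.shape)
```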
- ELWAFI Youssef - GitHub
- BAKAR Oussama
- LAAFAR Abdellah
- OUARRAK Aymen
Supervised by Pr. Nabil AZOUAGH
FSTM - Faculty of Sciences and Techniques of Mohammedia, Hassan II University of Casablanca
BSc in Data Science and Decision-Making Informatics (SDID), S6
Academic year: 2025-2026
- Source Code (GitHub)
- Dataset and Models (Google Drive)
[R1] Project source code (GitHub): ELWAFI Youssef (YouIsm1). Classification-of-Human-Brain-Signals. Public repository containing all the Jupyter notebooks (preprocessing, training of the 7 models under the 3 protocols, WMVE, tests), the configuration files, and the project documentation. Available at: https://github.com/YouIsm1/Classification-of-Human-Brain-Signals
[R2] Project data and models (Google Drive): Shared folder containing the large files excluded from GitHub: the raw BCI Competition IV 2a dataset (.gdf and .mat files), the preprocessed data (.npy files), the saved models (.pkl and .keras files), and PDF versions of all notebooks. Available at: https://drive.google.com/drive/folders/14GamxUlmzX3gRGbCMqZbqpGqCpEEkbUy
[D1] Data sets 2a: "4-class motor imagery". Provided by the Institute for Knowledge Discovery (Laboratory of Brain-Computer Interfaces), Graz University of Technology (Clemens Brunner, Robert Leeb, Gernot Muller-Putz, Alois Schlogl, Gert Pfurtscheller). EEG, cued motor imagery (left hand, right hand, feet, tongue). 22 EEG channels (0.5-100 Hz; notch filtered), 3 EOG channels, 250 Hz sampling rate, 4 classes, 9 subjects. Available at: http://www.bbci.de/competition/iv/
[D2] Description of dataset 2a. Available at: https://www.bbci.de/competition/iv/desc_2a.pdf
[D3] True labels of the competition evaluation sets - Data set 2a. Available at: http://www.bbci.de/competition/iv/results/
[1] Brunner, C., Leeb, R., Muller-Putz, G., Schlogl, A., & Pfurtscheller, G. (2008). BCI Competition 2008 - Graz data set A. Institute for Knowledge Discovery, Graz University of Technology. Available at: https://www.bbci.de/competition/iv/desc_2a.pdf
[2] Pfurtscheller, G., & Neuper, C. (2001). Motor imagery and direct brain-computer communication. Proceedings of the IEEE, 89(7), 1123-1134. DOI: https://doi.org/10.1109/5.939829
[3] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825-2830. Available at: https://jmlr.org/papers/v12/pedregosa11a.html
[4] Gramfort, A., Luessi, M., Larson, E., Engemann, D. A., Strohmeier, D., Brodbeck, C., ... & Hamalainen, M. S. (2013). MEG and EEG data analysis with MNE-Python. Frontiers in Neuroscience, 7, 267. DOI: https://doi.org/10.3389/fnins.2013.00267
[5] Lotte, F., Bougrain, L., Cichocki, A., Clerc, M., Congedo, M., Rakotomamonjy, A., & Yger, F. (2018). A review of classification algorithms for EEG-based brain-computer interfaces: a 10 year update. Journal of Neural Engineering, 15(3), 031005. DOI: https://doi.org/10.1088/1741-2552/aab2f2
[6] Ramoser, H., Muller-Gerking, J., & Pfurtscheller, G. (2000). Optimal spatial filtering of single trial EEG during imagined hand movement. IEEE Transactions on Rehabilitation Engineering, 8(4), 441-446. DOI: https://doi.org/10.1109/86.895946
[7] Chollet, F. et al. (2015). Keras: Deep Learning for humans. Available at: https://keras.io/ (TensorFlow documentation: https://www.tensorflow.org/api_docs)
[8] Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., ... & Zheng, X. (2016). TensorFlow: A System for Large-Scale Machine Learning. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 265-283. Available at: https://www.tensorflow.org/
[9] Garcia Badaracco, A. (2023). SciKeras: Scikit-Learn API wrapper for Keras. Available at: https://github.com/adriangb/scikeras
[10] Virtanen, P., Gommers, R., Oliphant, T. E., et al. (2020). SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17, 261-272. DOI: https://doi.org/10.1038/s41592-019-0686-2
[11] Harris, C. R., Millman, K. J., van der Walt, S. J., et al. (2020). Array programming with NumPy. Nature, 585, 357-362. DOI: https://doi.org/10.1038/s41586-020-2649-2
[12] Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), 90-95. DOI: https://doi.org/10.1109/MCSE.2007.55
[13] Waskom, M. (2021). Seaborn: Statistical Data Visualization. Available at: https://seaborn.pydata.org/
[14] Joblib: running Python functions as pipeline jobs. Available at: https://joblib.readthedocs.io/
This project is developed for academic purposes as part of the Machine Learning course at FSTM. The BCI Competition IV Dataset 2a is provided by Graz University of Technology for research and educational use.