Komi Ayi

Statistician · Data Scientist · Research & Innovation

Master's in applied statistics with a specialization in causal modeling for health research. I focus on rigorous methodology and decision-oriented data analysis, with applied work spanning public health, behavioral data, and predictive modeling.

Full portfolio →

Currently open to data science and statistics roles.


📖 Research

Master's thesis (UQAM, 2025) — supervised by Prof. Karim Oualkacha:
Analyse de médiation causale pour des médiateurs non causalement liés (Causal mediation analysis for mediators that are not causally linked).
Read on Archipel UQAM → · Companion R Shiny app: rbcm →

Featured projects

Python · Scikit-Learn · Streamlit

Multinomial logistic regression model predicting match outcomes for the 2025–2026 French Ligue 1, deployed as an interactive Streamlit dashboard. Processes offensive efficiency and defensive resilience metrics to output win/draw/loss probabilities.

→ Try the live app
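
As a rough illustration of how a multinomial logistic model turns match features into win/draw/loss probabilities, here is a minimal sketch in plain Python. The coefficients, the feature names, and the `outcome_probabilities` helper are hypothetical and chosen only for demonstration. They are not taken from the project, where the coefficients would be estimated from match data.

```python
import math

# Hypothetical coefficients for illustration only; a fitted model would
# estimate these from historical match data.
# Each outcome has an intercept and weights for two made-up features:
# [offensive_efficiency_diff, defensive_resilience_diff] (home minus away).
COEFFICIENTS = {
    "home_win": (0.3, [1.2, 0.8]),
    "draw": (0.0, [0.0, 0.0]),
    "away_win": (-0.1, [-1.1, -0.9]),
}


def outcome_probabilities(features):
    """Map a feature vector to outcome probabilities with a softmax."""
    scores = {
        outcome: intercept + sum(w * x for w, x in zip(weights, features))
        for outcome, (intercept, weights) in COEFFICIENTS.items()
    }
    # Subtract the largest score before exponentiating for numerical stability.
    max_score = max(scores.values())
    exp_scores = {k: math.exp(v - max_score) for k, v in scores.items()}
    total = sum(exp_scores.values())
    return {k: v / total for k, v in exp_scores.items()}


# Example: the home side is ahead on both (hypothetical) metrics.
probs = outcome_probabilities([0.5, 0.2])
for outcome, p in probs.items():
    print(f"{outcome}: {p:.3f}")
```

The three probabilities always sum to 1, which is what makes the output directly usable in a dashboard.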

R · Causal inference · Biostatistics

Research project on the effect of childhood trauma on cortisol stress reactivity, mediated by DNA methylation. Implements causal mediation methods that correct for the bias arising when correlated mediators share an unmeasured common cause. A companion R Shiny tool, rbcm, makes these methods available interactively.

Python · Random Forest · XGBoost

End-to-end churn modeling pipeline including feature engineering, dynamic threshold optimization for F1 maximization, and feature-importance interpretation of the best-performing model.
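
Decision-threshold optimization for F1 can be sketched as a simple sweep over candidate cut-offs. The example below is a minimal, dependency-free illustration: the helper names and the toy labels and scores are assumptions made for this sketch, not the pipeline's actual code or data. In practice the threshold would be tuned on a validation set rather than on the data used for the final evaluation.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score when predicting the positive class for prob >= threshold."""
    tp = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_f1_threshold(y_true, y_prob, candidates=None):
    """Return (threshold, f1) for the candidate with the highest F1.

    Ties are resolved in favour of the lowest threshold.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    scored = ((t, f1_at_threshold(y_true, y_prob, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Toy data (hypothetical): 1 = churned, scores are predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.20, 0.35, 0.40, 0.45, 0.60, 0.30, 0.80]

threshold, f1 = best_f1_threshold(y_true, y_prob)
print(f"best threshold: {threshold:.2f}, F1: {f1:.3f}")
print(f"F1 at default 0.50: {f1_at_threshold(y_true, y_prob, 0.5):.3f}")
```

On imbalanced churn data the default 0.5 cut-off often misses positives, so a lower threshold can trade some precision for recall and raise F1.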

More projects (regression trees & ensemble methods, MixLaw R package…) available in the portfolio.


Tech stack

Languages: Python · R · SQL
Python: Pandas · NumPy · Scikit-Learn · SciPy · Matplotlib · Streamlit
R: dplyr · tidyr · caret · ggplot2 · rpart · randomForest
Tools: Tableau · SQLiteStudio · Git

Certifications

Google Advanced Data Analytics (Professional Certificate) · Introducing DAX · Create a dashboard with Tableau


Get in touch

LinkedIn · Email · Portfolio

Pinned

  1. dna_mediation

    Causal mediation analysis under unmeasured confounding between correlated mediators. Application: childhood trauma → DNA methylation → cortisol stress reactivity. (R, Master's research)

    R 1

  2. MixLaw

    R package implementing an S3 class for probability distribution mixtures, with methods for sampling, density visualization, and basic descriptive statistics. Developed for MAT8186 (UQAM).

    R 1

  3. churn-waze-prediction

    Predicting Waze user churn with three tree-based models (Decision Tree, Random Forest, XGBoost), with class-imbalance handling and decision-threshold optimization. Final XGBoost recall = 0.64.

    Jupyter Notebook 1

  4. Ligue-1

    A multinomial logistic regression model that estimates win/draw/loss probabilities for French Ligue 1 matches (2025–2026 season), deployed as an interactive Streamlit application.

    Python 1

  5. regression-trees-ensemble-methods

    Comparative study of regression trees, Bagging, and Random Forests focusing on MSE and stability. Friedman simulation (J=1000) + real-world application to Tehran apartment construction costs. Gradu…

    R 1

  6. rbcm

    An R Shiny application for causal mediation analysis when mediators are correlated but not causally linked.

    R 1