A Machine Learning–based diabetes prediction system built using Python and Logistic Regression. The model analyzes clinical health parameters to classify whether a patient is likely to have diabetes.
- Exploratory Data Analysis (EDA) – Dataset inspection, target distribution, and correlation heatmap visualization.
- Data Preprocessing – Feature scaling using StandardScaler.
- Machine Learning Model – Logistic Regression classifier for binary outcome prediction.
- Model Evaluation – Accuracy score, confusion matrix, and classification report.
- Sample Prediction System – Predicts diabetes risk for a new patient input.
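The EDA step listed above can be sketched with pandas alone. This is a hedged illustration, not the contents of `main.py`: the data is synthetic so the snippet runs without `diabetes.csv`, and the column names (including the `Outcome` target) are assumptions about the dataset's layout.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for diabetes.csv; column names are assumed, not confirmed.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Pregnancies": rng.integers(0, 10, n),
    "Glucose": rng.normal(120, 30, n),
    "BloodPressure": rng.normal(70, 12, n),
    "SkinThickness": rng.normal(20, 8, n),
    "Insulin": rng.normal(80, 40, n),
    "BMI": rng.normal(32, 6, n),
    "DiabetesPedigreeFunction": rng.uniform(0.1, 1.5, n),
    "Age": rng.integers(21, 70, n),
})
# Hypothetical label loosely tied to glucose so the correlations are non-trivial.
df["Outcome"] = (df["Glucose"] + rng.normal(0, 20, n) > 130).astype(int)

# Dataset inspection: shape and per-column summary statistics.
print(df.shape)
print(df.describe().T[["mean", "std", "min", "max"]])

# Target distribution: number of rows per class.
target_counts = df["Outcome"].value_counts().sort_index()
print(target_counts)

# Correlation matrix; this is the table a heatmap would visualize.
corr = df.corr()
print(corr["Outcome"].sort_values(ascending=False))

# To draw the heatmap with Seaborn:
#   import seaborn as sns
#   sns.heatmap(corr, annot=True, cmap="coolwarm")
```

With the real dataset, the synthetic `DataFrame` would be replaced by `pd.read_csv("diabetes.csv")`.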
GlycoRisk-Engine/
├── main.py
├── diabetes.csv
└── requirements.txt
- Python
- Pandas
- NumPy
- Scikit-learn
- Matplotlib
- Seaborn
- Clone the repository:
git clone https://github.com/Agent-A345/GlycoRisk-Engine.git
- Install the dependencies:
pip install -r requirements.txt
- Run the application:
python main.py
The objective is to predict whether a patient is diabetic (1) or non-diabetic (0) using clinical features such as:
- Pregnancies
- Glucose
- Blood Pressure
- Skin Thickness
- Insulin
- BMI
- Diabetes Pedigree Function
- Age
The system applies feature scaling and Logistic Regression to perform binary classification.
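A minimal sketch of scaling followed by Logistic Regression, ending with a prediction for one new patient. It assumes scikit-learn's `Pipeline` is an acceptable way to chain the two steps; whether `main.py` does it this way is not confirmed. The training data and the patient values are synthetic and purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features in the order listed above (illustrative values only).
rng = np.random.default_rng(42)
n = 400
X = np.column_stack([
    rng.integers(0, 10, n),        # Pregnancies
    rng.normal(120, 30, n),        # Glucose
    rng.normal(70, 12, n),         # Blood Pressure
    rng.normal(20, 8, n),          # Skin Thickness
    rng.normal(80, 40, n),         # Insulin
    rng.normal(32, 6, n),          # BMI
    rng.uniform(0.1, 1.5, n),      # Diabetes Pedigree Function
    rng.integers(21, 70, n),       # Age
])
# Hypothetical label driven mostly by glucose.
y = (X[:, 1] + rng.normal(0, 20, n) > 130).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The pipeline fits the scaler on the training split only, then reuses the
# same scaling for the test split and for any new input.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# One hypothetical new patient, same feature order as the training data.
new_patient = np.array([[2, 150.0, 72.0, 25.0, 90.0, 33.5, 0.6, 45]])
prediction = int(model.predict(new_patient)[0])
probability = float(model.predict_proba(new_patient)[0, 1])
print(f"Predicted class: {prediction}, P(diabetic) = {probability:.2f}")
```

Wrapping both steps in one pipeline keeps the scaler's mean and variance from being estimated on test data, which would otherwise leak information into the evaluation.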
The model is evaluated using:
- Accuracy Score
- Confusion Matrix
- Precision, Recall, F1-score
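The metrics above can be computed with `sklearn.metrics`. The sketch below uses ten hand-written labels rather than real model output, so each number can be checked by hand; the values say nothing about this project's actual performance.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hand-written labels for illustration: 6 non-diabetic (0), 4 diabetic (1).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)       # 7 of 10 correct -> 0.7
cm = confusion_matrix(y_true, y_pred)      # rows = true class, cols = predicted
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3 / 5
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 3 / 4
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(f"Accuracy: {acc:.2f}")
print(cm)  # [[4 2]
           #  [1 3]]
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1:.2f}")
print(classification_report(y_true, y_pred))
```

In the confusion matrix, the top-left cell counts true negatives and the bottom-right cell counts true positives; the off-diagonal cells are the false positives (top right) and false negatives (bottom left).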
- Model serialization using joblib
- Hyperparameter tuning
- Class imbalance handling
- Web-based deployment
- Explainable AI integration (SHAP / LIME)
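As a rough sketch of how three of the items above (joblib serialization, hyperparameter tuning, and class imbalance handling) might fit together, the snippet below tunes the regularization strength `C` with `GridSearchCV`, uses `class_weight="balanced"`, and round-trips the fitted pipeline through joblib. The data, the parameter grid, the F1 scoring choice, and the `model.joblib` file name are all assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced data: roughly a quarter of the rows are positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = (X[:, 1] + 0.5 * rng.normal(size=300) > 0.8).astype(int)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    # class_weight="balanced" reweights classes inversely to their frequency.
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# Hypothetical grid over the inverse regularization strength C.
search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5, scoring="f1")
search.fit(X, y)
print("Best C:", search.best_params_["clf__C"])

# Persist the whole pipeline (scaler + classifier) and load it back.
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(search.best_estimator_, path)
restored = joblib.load(path)

# The restored pipeline should reproduce the original predictions.
same = np.array_equal(restored.predict(X), search.best_estimator_.predict(X))
print("Round-trip predictions match:", same)
```

Saving the full pipeline rather than the classifier alone means the scaler's fitted statistics travel with the model, so new inputs are scaled consistently at prediction time.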
This project is licensed under the MIT License.