Heart Disease Early Detection ML Pipeline

Overview

This project implements a machine learning pipeline for the early detection of heart disease, built on Apache Spark and its MLlib library. It uses several machine learning algorithms, including RandomForest, Logistic Regression, and XGBoost. The pipeline reads a CSV dataset, applies transformations, trains models, evaluates their performance, and presents the results through data visualization.
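The flow described above (load the CSV, transform, train, evaluate) could be sketched in PySpark roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the column names are assumed from the usual layout of the Kaggle heart disease CSV, the two feature-preparation stages (`VectorAssembler` and `StandardScaler`) are illustrative choices, and Logistic Regression stands in for any of the algorithms named above. Note that `StandardScaler` is strictly an MLlib estimator that yields a fitted transformer.

```python
# Assumed column names for the heart disease CSV; verify them against the file.
FEATURE_COLS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
                "thalach", "exang", "oldpeak", "slope", "ca", "thal"]
LABEL_COL = "target"  # assumed name of the 0/1 label column


def build_pipeline():
    """Chain two feature-preparation stages and one classifier in a Pipeline."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import StandardScaler, VectorAssembler

    # Stage 1: pack the numeric columns into a single vector column.
    assembler = VectorAssembler(inputCols=FEATURE_COLS, outputCol="raw_features")
    # Stage 2: scale each feature to unit standard deviation.
    scaler = StandardScaler(inputCol="raw_features", outputCol="features")
    # Stage 3: the estimator that learns the classification model.
    classifier = LogisticRegression(featuresCol="features", labelCol=LABEL_COL)
    return Pipeline(stages=[assembler, scaler, classifier])


def train_and_evaluate(csv_path, seed=42):
    """Fit the pipeline on 80% of the rows and return the area under the ROC
    curve measured on the remaining 20%."""
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("heart-disease-sketch").getOrCreate()
    data = spark.read.csv(csv_path, header=True, inferSchema=True)
    train, test = data.randomSplit([0.8, 0.2], seed=seed)

    model = build_pipeline().fit(train)
    predictions = model.transform(test)

    evaluator = BinaryClassificationEvaluator(labelCol=LABEL_COL,
                                              metricName="areaUnderROC")
    return evaluator.evaluate(predictions)
```

Calling `train_and_evaluate("heart.csv")` would need a working Spark installation and a CSV with the assumed columns. Swapping the classifier for `RandomForestClassifier` would only change the third stage.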

Project Highlights

Machine Learning Algorithms:
- RandomForest
- Logistic Regression
- XGBoost

Processing Chain:
- Transformers: two transformers are used in the processing chain.
- Estimator: models are trained with the algorithms listed above.
- Pipeline: the stages are chained in a pipeline for end-to-end data processing.
- Evaluation: key metrics are computed to assess model performance.

Activity:
- Sourced a suitable CSV dataset for the project: https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset
- Illustrated the processing chain, with a focus on transformers, estimators, pipelines, and evaluation.
- Programmed a notebook and tested it to confirm that it works.
- Presented the results through data visualization.
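The evaluation step above does not name its metrics. As an illustration only, the sketch below computes four common binary classification metrics (accuracy, precision, recall, and F1) from true and predicted labels in plain Python. The function name and the choice of metrics are assumptions, not taken from the project's notebook.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For a screening task such as early disease detection, recall is often watched closely because a missed positive case is usually costlier than a false alarm.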

Results

The pipeline showed promising results for the early detection of heart disease.
Accuracy and efficiency were achieved through the choice of algorithms.

Usage

Prerequisites

- Python 3.11.5
- Apache Spark
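A quick way to check these prerequisites from Python might look like the sketch below. It assumes Spark is used through the `pyspark` package, which the list above does not state explicitly, and the helper name `check_prerequisites` is hypothetical.

```python
import importlib.util
import sys


def check_prerequisites(min_python=(3, 11)):
    """Report whether the interpreter is recent enough and PySpark is importable."""
    # Compare only the (major, minor) part of the running interpreter's version.
    python_ok = sys.version_info[:2] >= tuple(min_python)
    # find_spec returns None when a top-level package cannot be located.
    pyspark_found = importlib.util.find_spec("pyspark") is not None
    return {"python_ok": python_ok, "pyspark_found": pyspark_found}
```

Running `check_prerequisites()` before opening the notebook gives an early signal if either requirement is missing.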
