Manishrajmss13/Heart-disease-prediction-using-SVC

Heart Disease Prediction using a Support Vector Classifier (SVC)

Project Goal

The objective of this project is to build a machine learning model to predict the presence of heart disease in a patient based on a set of medical attributes. This notebook covers the entire process, from initial data exploration to training and evaluating a Support Vector Classifier (SVC).


Dataset Used

Source:
This project uses a consolidated version of the popular Heart Disease dataset.

Content:

  • The dataset contains 1025 patient records and 14 columns: 13 medical attributes plus the target label.
  • An initial analysis showed that the data is of high quality, with no missing values and all features already in numerical format.

Attribute Information:

Feature     Description
age         Age of the patient in years
sex         Gender (1 = male, 0 = female)
cp          Chest pain type (0–3)
trestbps    Resting blood pressure (mm Hg)
chol        Serum cholesterol (mg/dl)
fbs         Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
restecg     Resting electrocardiographic results (0, 1, 2)
thalach     Maximum heart rate achieved
exang       Exercise-induced angina (1 = yes; 0 = no)
oldpeak     ST depression induced by exercise relative to rest
slope       Slope of the peak exercise ST segment
ca          Number of major vessels (0–3) colored by fluoroscopy
thal        0 = normal; 1 = fixed defect; 2 = reversible defect
target      Heart disease presence (0 = no, 1 = yes)

Approach: Step-by-Step Breakdown

1. Data Exploration

  • Loaded the dataset and performed a quick exploratory analysis using .info() and .describe() to:
    • Confirm data types
    • Check for missing values (there were none)
    • Understand the statistical distribution of each feature
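
The exploration step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the DataFrame below is a small synthetic stand-in with arbitrary values, and only a few of the column names from the attribute table are used. In the real notebook the data would be loaded from a file (for example with pd.read_csv; the file name is not given in this README).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset (assumption: values are arbitrary
# and only illustrate the calls; they are NOT real patient records).
rng = np.random.default_rng(0)
n = 50
df = pd.DataFrame({
    "age": rng.integers(30, 80, size=n),
    "trestbps": rng.integers(90, 200, size=n),
    "chol": rng.integers(120, 400, size=n),
    "target": rng.integers(0, 2, size=n),
})

# Column names, dtypes, and non-null counts
df.info()

# Count, mean, std, min, quartiles, and max for each numeric column
summary = df.describe()
print(summary)

# Explicit missing-value count per column
missing_per_column = df.isnull().sum()
print(missing_per_column)
```

On the real data, a missing-value count of zero in every column is what confirms the "no missing values" observation.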

2. Data Preparation

  • Feature Scaling:
    Because SVC relies on distances between samples and the features are on different scales (e.g., age vs. chol), StandardScaler was applied to standardize each feature (mean = 0, std = 1). This keeps any single feature from dominating the model simply because of its units.
  • Data Splitting:
    Split the dataset into:
    • Training set: 80%
    • Testing set: 20%
      This allows the model to be evaluated on data it did not see during training.
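
A minimal sketch of the preparation step, using synthetic data in place of the real features. Several details here are assumptions not stated in this README: the random_state value, the stratified split, and fitting the scaler on the training split only (a common way to keep test-set statistics out of training; the notebook may instead have scaled the full dataset before splitting).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: two features on very different scales
# (assumption: arbitrary values, not the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(loc=[50.0, 250.0], scale=[10.0, 50.0], size=(200, 2))
y = rng.integers(0, 2, size=200)

# 80% training / 20% testing, as described above.
# random_state and stratify are assumptions added for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Learn mean and std from the training split, then apply to both splits
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

After scaling, each training feature has mean close to 0 and standard deviation close to 1; the test split is transformed with the same parameters, so its statistics will only be approximately standardized.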

3. Model Training & Evaluation

  • Algorithm: Support Vector Classifier (SVC) for binary classification.
  • Training: Model trained on the scaled training data.
  • Evaluation:
    • Accuracy on test data: ~85%
    • Generated a confusion matrix to analyze true positives, true negatives, false positives, and false negatives.
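
The training and evaluation step might look like the sketch below. It runs on a synthetic binary dataset from make_classification, so its accuracy says nothing about the ~85% reported above. The kernel and other hyperparameters are not given in this README; scikit-learn's defaults (RBF kernel) are assumed here.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary dataset with 13 features (assumption: stand-in data only)
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale using statistics from the training split
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Default SVC (RBF kernel); the notebook's actual settings are not stated
model = SVC()
model.fit(X_train_scaled, y_train)

y_pred = model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes, in label order [0, 1]
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print(f"accuracy: {accuracy:.3f}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

In a medical screening setting, false negatives (patients with disease predicted as healthy) are usually the most costly cell of the matrix, which is why the confusion matrix is worth reading alongside accuracy.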

Notes

  • This project demonstrates how SVC can be used for binary classification tasks on real-world medical datasets.
  • Feature scaling is crucial for good SVC performance, and a held-out test split is needed to measure that performance reliably.
  • The dataset is clean and numerical, making it ideal for practicing SVC and hyperparameter tuning.
