MDVR9980/advanced-house-regression

๐Ÿก Advanced House Regression

This project builds regression models to predict house prices using machine learning techniques. The dataset is based on the popular Kaggle competition House Prices - Advanced Regression Techniques and is sourced from Hands-On ML by Aurélien Géron.

🚀 Overview

The goal is to predict the final sale price of each house from a rich set of features. We explore various regression models and preprocessing steps to minimize prediction error.

๐Ÿ” Features

  • Data cleaning and preprocessing (handling missing values, encoding categorical features, feature engineering)
  • Feature scaling using StandardScaler
  • Model training and evaluation using LinearRegression
  • Batch prediction from CSV files and export of predicted results
  • Interactive UI with Streamlit for real-time predictions
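
The preprocessing, scaling, and training steps listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the repository's actual code: the column names (`area`, `rooms`, `neighborhood`, `price`) and the median imputation strategy are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for housing.csv; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area": rng.uniform(50, 250, n),
    "rooms": rng.integers(1, 7, n).astype(float),
    "neighborhood": rng.choice(["A", "B", "C"], n),
})
df["price"] = 1500 * df["area"] + 10000 * df["rooms"] + rng.normal(0, 5000, n)
df.loc[::25, "rooms"] = np.nan  # simulate missing values

# Handle missing values: fill numeric gaps with the column median.
# (Computed on the full frame for brevity; a stricter setup would use
# the training split only.)
df["rooms"] = df["rooms"].fillna(df["rooms"].median())

# Encode the categorical feature as one-hot columns.
X = pd.get_dummies(df.drop(columns="price"), columns=["neighborhood"], dtype=float)
y = df["price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then reuse it for the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LinearRegression()
model.fit(X_train_scaled, y_train)

rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test_scaled)))
print(f"Test RMSE: {rmse:,.0f}")
```

Fitting `StandardScaler` on the training split and only calling `transform` on held-out data keeps test-set statistics from leaking into the model.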

๐Ÿ“ Project Structure

advanced-house-regression/
โ”œโ”€โ”€ housing.csv                  # Dataset file
โ”œโ”€โ”€ data_preprocessing.py       # Preprocesses and saves the training data
โ”œโ”€โ”€ train_model.py              # Trains the model and saves the scaler/model
โ”œโ”€โ”€ predict.py                  # Script to make a single prediction
โ”œโ”€โ”€ batch_predict.py            # Predicts house prices from new_data.csv and exports predictions.csv
โ”œโ”€โ”€ app.py                      # Streamlit app for web-based prediction
โ”œโ”€โ”€ new_data.csv                # Sample new data for batch predictions
โ”œโ”€โ”€ predictions.csv             # Output CSV with predictions
โ”œโ”€โ”€ scaler.pkl                  # Saved StandardScaler object
โ”œโ”€โ”€ linear_regression_model.pkl # Trained model file
โ”œโ”€โ”€ prepared_data.pkl           # Scaled and split dataset
โ”œโ”€โ”€ requirements.txt            # Python dependencies
โ””โ”€โ”€ README.md                   # Project documentation

📅 Setup & Usage

⚡ Installation

pip install -r requirements.txt

🎮 Run Preprocessing and Training

python3 data_preprocessing.py
python3 train_model.py
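
As a rough idea of what saving the scaler and model might look like, here is a sketch using `joblib` with the file names from the project structure. The toy data and the temporary output directory are assumptions for the example; the actual `train_model.py` may differ.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Toy training data (hypothetical); the real features come from housing.csv.
X = np.array([[50.0, 1.0], [80.0, 2.0], [120.0, 3.0], [200.0, 5.0]])
y = np.array([100_000.0, 160_000.0, 240_000.0, 400_000.0])

scaler = StandardScaler().fit(X)
model = LinearRegression().fit(scaler.transform(X), y)

# Write to a temporary directory so the example does not overwrite real artifacts.
out_dir = Path(tempfile.mkdtemp())
joblib.dump(scaler, out_dir / "scaler.pkl")
joblib.dump(model, out_dir / "linear_regression_model.pkl")

# Reload both objects, as a prediction script would.
loaded_scaler = joblib.load(out_dir / "scaler.pkl")
loaded_model = joblib.load(out_dir / "linear_regression_model.pkl")

sample = np.array([[100.0, 3.0]])
original = model.predict(scaler.transform(sample))
restored = loaded_model.predict(loaded_scaler.transform(sample))
print(np.allclose(original, restored))
```

Persisting the scaler alongside the model matters because new inputs must be scaled with the same statistics that were used during training.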

💰 Predict Single Input (CLI)

python3 predict.py

📤 Batch Predict from CSV

python3 batch_predict.py
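
The batch flow (read `new_data.csv`, scale, predict, write `predictions.csv`) could look roughly like the sketch below. The feature columns, the `predicted_price` column name, and the in-script toy training step are assumptions; the real script presumably loads the saved scaler and model instead.

```python
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

work_dir = Path(tempfile.mkdtemp())  # temporary directory for the example files

# Hypothetical feature columns and toy training data.
feature_cols = ["area", "rooms"]
train = pd.DataFrame({
    "area": [50.0, 80.0, 120.0, 200.0],
    "rooms": [1.0, 2.0, 3.0, 5.0],
    "price": [100_000.0, 160_000.0, 240_000.0, 400_000.0],
})
scaler = StandardScaler().fit(train[feature_cols])
model = LinearRegression().fit(scaler.transform(train[feature_cols]), train["price"])

# Create a small new_data.csv to predict on.
pd.DataFrame({"area": [100.0, 150.0], "rooms": [3.0, 4.0]}).to_csv(
    work_dir / "new_data.csv", index=False
)

# Batch prediction: select the features in training order, reuse the fitted scaler.
new_data = pd.read_csv(work_dir / "new_data.csv")
scaled = scaler.transform(new_data[feature_cols])

result = new_data.copy()
result["predicted_price"] = model.predict(scaled)
result.to_csv(work_dir / "predictions.csv", index=False)
print(f"Predictions saved to {work_dir / 'predictions.csv'}")
```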

๐ŸŒ Run Streamlit App

streamlit run app.py

📄 Dataset

Dataset used: housing.csv


📃 Requirements

pandas
scikit-learn
joblib
streamlit

📈 Example Output

Predicted House Price: $415,721
Predictions saved to predictions.csv
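
The dollar format in the output above can be produced with Python's thousands-separator format spec. A minimal sketch, assuming the prediction is a plain float (the value here is illustrative, not real model output):

```python
# Illustrative value only; in practice this would come from model.predict(...).
predicted_price = 415721.3

# ",.0f" adds thousands separators and rounds to whole dollars.
message = f"Predicted House Price: ${predicted_price:,.0f}"
print(message)
```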

About

๐Ÿก End-to-end House Price Prediction system using Scikit-Learn. Features a robust regression pipeline, batch CSV processing, and an interactive Streamlit UI for real-time inference.
