The goal of this project is to build a movie recommendation system using two core approaches:
- Content-Based Filtering – Recommending movies similar to those a user already likes, based on movie attributes.
 - Collaborative Filtering – Recommending movies based on the preferences of users with similar tastes.
 
The project also explores the advantages and limitations of each method, with a view to combining them in a future hybrid recommender system.
- Feature Extraction
  - Extract movie release years from titles.
  - Split the `genres` column into individual genres.
  - Apply One-Hot Encoding to create binary genre columns (0 = not in genre, 1 = in genre).
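
The feature extraction steps can be sketched roughly as follows. This is a minimal illustration rather than the notebook's exact code: the three sample rows, the assumption that each title ends in a `(YYYY)` suffix, and the names `genre_matrix` and `movies_with_genres` are all hypothetical.

```python
import pandas as pd

# Tiny stand-in for movies.csv; the rows are illustrative only.
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
    "genres": [
        "Adventure|Animation|Children|Comedy|Fantasy",
        "Adventure|Children|Fantasy",
        "Action|Crime|Thriller",
    ],
})

# Assumption: the release year is a trailing "(YYYY)" in the title.
year_text = movies["title"].str.extract(r"\((\d{4})\)\s*$", expand=False)
movies["year"] = pd.to_numeric(year_text)

# Remove the year suffix so the title column holds only the name.
movies["title"] = movies["title"].str.replace(r"\s*\(\d{4}\)\s*$", "", regex=True)

# Split the pipe-separated genres and one-hot encode them in one step:
# each genre becomes a column holding 1 (in genre) or 0 (not in genre).
genre_matrix = movies["genres"].str.get_dummies(sep="|")

movies_with_genres = pd.concat(
    [movies[["movieId", "title", "year"]], genre_matrix], axis=1
)
print(movies_with_genres)
```

`Series.str.get_dummies` handles the split and the encoding together, which avoids an explicit loop over genres for a dataset of this shape.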
 
- User Profile Building
  - Introduce an example user with predefined ratings.
  - Identify the genres of the rated movies and compute a weighted score for each genre from the user’s ratings.
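
One way to compute such a weighted genre profile is a dot product between the user's ratings and the one-hot genre rows of the movies they rated. The sketch below assumes that approach; the movie IDs, genres, ratings, and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical one-hot genre matrix, indexed by movieId.
genre_matrix = pd.DataFrame(
    {"Action": [1, 0, 1], "Comedy": [0, 1, 1], "Drama": [0, 1, 0]},
    index=pd.Index([10, 20, 30], name="movieId"),
)

# Hypothetical example user: movieId -> rating.
user_ratings = pd.Series({10: 5.0, 20: 2.0}, name="rating")

# Keep only the genre rows of the movies this user has rated.
rated_genres = genre_matrix.loc[user_ratings.index]

# Each genre's weight is the sum of the ratings of the rated movies
# that belong to that genre (genres x movies) . (movies).
user_profile = rated_genres.T.dot(user_ratings)
print(user_profile)
```

With these sample values, Action is weighted by the single 5.0 rating, while Comedy and Drama each receive the 2.0 rating.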
 
- Recommendation Generation
  - Compare every movie’s genre vector against the user’s weighted genre profile.
  - Rank movies by similarity score and recommend the top results.
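
A minimal sketch of the scoring step, assuming the similarity score is a normalized dot product between each movie's genre vector and the user profile (other measures, such as cosine similarity, would fit the same description). Excluding already-rated movies is also an assumption, and all data and names are illustrative.

```python
import pandas as pd

# Hypothetical one-hot genre matrix for the whole catalogue.
genre_matrix = pd.DataFrame(
    {"Action": [1, 0, 1, 0], "Comedy": [0, 1, 1, 0], "Drama": [0, 1, 0, 1]},
    index=pd.Index([10, 20, 30, 40], name="movieId"),
)

# Hypothetical weighted genre profile for the example user.
user_profile = pd.Series({"Action": 5.0, "Comedy": 2.0, "Drama": 2.0})
already_rated = [10, 20]

# Weight each genre column by the profile, sum per movie, and divide by
# the total profile weight so that scores fall between 0 and 1.
scores = genre_matrix.mul(user_profile, axis=1).sum(axis=1) / user_profile.sum()

# Assumption: movies the user has already rated are not recommended again.
recommendations = (
    scores.drop(index=already_rated).sort_values(ascending=False).head(2)
)
print(recommendations)
```

Dividing by the profile total keeps scores comparable across users whose profiles have different overall magnitudes.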
 
- Advantages & Limitations (Content-Based)
  - Strengths: Personalization, reduced cold start (recommendations need only the user’s own ratings and movie attributes, not other users’ data), transparency in recommendations.
  - Weaknesses: Over-specialization, limited diversity, dependency on feature quality.
 
 
- User-Based Collaborative Filtering
  - Find other users whose historical rating patterns are similar to the target user’s.
  - Recommend movies that similar users have rated highly and the target user hasn’t seen.
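
The two steps above could look roughly like this. The sketch assumes Pearson correlation over co-rated movies as the similarity measure and a similarity-weighted average as the predicted rating; the text does not name a specific measure, so both are assumptions, and the ratings and names are invented for illustration.

```python
import pandas as pd

# Hypothetical ratings in the same long format as ratings.csv.
ratings = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "movieId": [10, 20, 30, 10, 20, 30, 40, 10, 20, 30, 50],
    "rating":  [5.0, 3.0, 1.0, 4.0, 3.0, 2.0, 5.0, 1.0, 3.0, 5.0, 4.0],
})
target = 1

# User x movie matrix; NaN marks movies a user has not rated.
matrix = ratings.pivot(index="userId", columns="movieId", values="rating")

# Pearson correlation between users, computed on co-rated movies only.
similarity = matrix.T.corr(min_periods=2)[target].drop(target)

# Assumption: only positively correlated users count as neighbours.
neighbours = similarity[similarity > 0]

# Movies the target user has not rated yet.
unseen = matrix.columns[matrix.loc[target].isna()]
neighbour_ratings = matrix.loc[neighbours.index, unseen]

# Similarity-weighted average of the neighbours' ratings per movie.
weighted_sum = neighbour_ratings.mul(neighbours, axis=0).sum(axis=0)
weight_total = neighbour_ratings.notna().mul(neighbours, axis=0).sum(axis=0)

# Movies that no neighbour has rated end up as NaN and are dropped.
predicted = (weighted_sum / weight_total).dropna().sort_values(ascending=False)
print(predicted)
```

In this toy data, user 2 rates the shared movies in the same order as the target user and becomes the only neighbour, while user 3 rates them in the opposite order and is ignored.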
 
- Advantages & Limitations (Collaborative)
  - Strengths: Can suggest unexpected items outside the user’s known preferences, adaptable to changing tastes.
  - Weaknesses: Cold start problem, data sparsity, scalability issues, potential privacy concerns.
 
 
- Python
 - pandas – Data manipulation and preprocessing
 - NumPy – Numerical operations for profile and similarity calculations
 - Matplotlib – Visualization of results (inline in Jupyter)
 
- movies.csv – Contains `movieId`, `title`, and `genres` for each movie.
- ratings.csv – Contains `userId`, `movieId`, `rating`, and `timestamp`.
  - The timestamp is dropped in this implementation because the recommendation logic does not currently use it.
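
As a rough illustration of loading the two files and dropping the timestamp, the sketch below reads in-memory stand-ins so that it runs without the dataset. The sample rows are invented; with the real files, `pd.read_csv("movies.csv")` and `pd.read_csv("ratings.csv")` would take the place of the `io.StringIO` objects.

```python
import io

import pandas as pd

# In-memory stand-ins for movies.csv and ratings.csv (illustrative rows).
movies_csv = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
)
ratings_csv = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,1,4.0,964982703\n"
    "1,2,3.5,964981247\n"
)

movies = pd.read_csv(movies_csv)

# The timestamp column is not used by the recommendation logic, so drop it.
ratings = pd.read_csv(ratings_csv).drop(columns="timestamp")

print(movies.columns.tolist())
print(ratings.columns.tolist())
```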
 
- Clone the repository: `git clone <repo-url>`, then `cd <repo-folder>`
- Install dependencies: `pip install pandas numpy matplotlib`
- Run the notebook: `jupyter notebook "recommendation system project.ipynb"`

- Content-Based Filtering successfully recommends movies based on genre similarity to user preferences.
 - Collaborative Filtering provides recommendations based on similar users’ ratings.
 - The notebook contains step-by-step implementation and comparison of both methods.
 
Typical outputs in the notebook include:
- One-Hot Encoded genre matrix.
 - Weighted genre profile for a sample user.
 - Ranked list of recommended movies with highest similarity scores.
 - Collaborative filtering recommendation lists.
 
- Demonstrates a clear understanding of both content-based and collaborative filtering principles.
 - Shows end-to-end preprocessing, feature engineering, and recommendation scoring without relying on pre-built recommender libraries.
 - Highlights the trade-offs between the two methods and opens the path for future hybrid recommender systems.
 
Mehran Asgari 📧 [email protected] 🌐 https://github.com/imehranasgari
This project is licensed under the Apache 2.0 License – see the LICENSE file for details.
💡 Some interactive outputs (e.g., plots, widgets) may not display correctly on GitHub. If so, please view this notebook via nbviewer.org for full rendering.