
SimSearch

Self-Supervised Image Representation Learning & Retrieval

SimSearch is a deep learning project focused on self-supervised learning for image representation and similarity-based retrieval. The goal is to learn meaningful feature embeddings without explicit labels, enabling clustering and efficient search across visual data.


Project Overview

Traditional supervised learning relies heavily on labeled datasets. In contrast, SimSearch leverages self-supervised learning to extract patterns and structure directly from raw images.

The model learns to:

  • Understand visual similarity
  • Separate different object categories
  • Form meaningful clusters in embedding space
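Once such embeddings exist, similarity-based retrieval reduces to nearest-neighbor search in the embedding space. A minimal NumPy sketch of that idea (the function name and toy vectors below are illustrative, not taken from the repository):

```python
import numpy as np

def retrieve_similar(query_emb, gallery_embs, k=3):
    """Return indices of the k gallery embeddings most similar to the
    query, ranked by cosine similarity (highest first)."""
    # L2-normalize so that a dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q                   # cosine similarity to each gallery item
    return np.argsort(-sims)[:k]   # indices sorted by descending similarity

# Toy example: four gallery embeddings in a 3-dimensional space
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
query = np.array([1.0, 0.05, 0.0])
print(retrieve_similar(query, gallery, k=2))  # two closest gallery items
```

In practice the gallery would hold the model's embeddings for the whole image collection, and approximate nearest-neighbor indexes replace the exhaustive dot product once the collection grows large.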

Dataset

The dataset consists of five categories:

  • 👜 Bags
  • 🚗 Cars
  • 🐶 Dogs
  • 📱 Phones
  • 👟 Shoes

Even without labels during training, the model gradually learns to distinguish between these categories.


Methodology

  • Self-supervised learning approach (contrastive / representation learning)
  • Feature embedding generation
  • Dimensionality reduction for visualization
  • Clustering in latent space
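The README does not pin down the exact training objective, but a common instantiation of the contrastive approach listed above is the NT-Xent loss popularized by SimCLR: embeddings of two augmented views of the same image are pulled together while all other pairs in the batch are pushed apart. A minimal PyTorch sketch, with illustrative names and dimensions:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss. z1[i] and z2[i] are embeddings of two
    augmented views of the same image; every other pair is a negative."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2n, d), unit norm
    sim = z @ z.t() / temperature                       # pairwise cosine sims
    sim.fill_diagonal_(float("-inf"))                   # exclude self-pairs
    # the positive for row i is row (i + n) mod 2n
    targets = torch.arange(2 * n, device=z.device).roll(n)
    return F.cross_entropy(sim, targets)

# Toy usage: a batch of 8 images, 32-dim projection-head outputs per view
torch.manual_seed(0)
z1, z2 = torch.randn(8, 32), torch.randn(8, 32)
loss = nt_xent_loss(z1, z2)
print(loss.item())
```

As training aligns the two views of each image, the positive similarities dominate the softmax and the loss decreases, which is what drives the clustering behavior described below.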

Results & Visualization

After training, the model separates the data points in embedding space, forming clusters of semantically similar images.
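The visualizations below can be reproduced in outline with the listed tech stack: project the embeddings to 2D, cluster them, and color the scatter plot by cluster. This sketch substitutes synthetic vectors for the learned embeddings, and the cluster count and output file name are illustrative:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Synthetic stand-in for learned embeddings: three separated blobs in 64-dim space
rng = np.random.default_rng(0)
embeddings = np.concatenate(
    [rng.normal(c, 0.3, size=(50, 64)) for c in (0, 3, 6)]
)

coords = PCA(n_components=2).fit_transform(embeddings)  # 2D projection
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], c=labels, s=10)
plt.title("Embeddings projected to 2D, colored by cluster")
plt.savefig("embeddings_2d.png")
```

Swapping PCA for t-SNE or UMAP, and 2 components for 3, yields the 3D views; the interactive plot was presumably produced with a tool such as Plotly, which is not in the stated stack.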

2D Embedding Visualization


3D Embedding Visualization


3D Interactive Embedding Visualization


Tech Stack

  • PyTorch
  • NumPy, Pandas, Matplotlib
  • Scikit-learn

Many Thanks

Abhinandan
