
Self-Supervised Contrastive Learning is Approximately Supervised Contrastive Learning

Project Page | Paper

In our work we make progress towards addressing the following question:

How does self-supervised CL learn representations similar to supervised learning, despite lacking explicit supervision?

We acknowledge the following works for their open-source contributions:

Abstract

Despite its empirical success, the theoretical foundations of self-supervised contrastive learning (CL) are not yet fully established. In this work, we address this gap by showing that standard CL objectives implicitly approximate a supervised variant we call the negatives-only supervised contrastive loss (NSCL), which excludes same-class contrasts. We prove that the gap between the CL and NSCL losses vanishes as the number of semantic classes increases, under a bound that is both label-agnostic and architecture-independent.
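The relationship between the two losses can be sketched concretely. The following minimal NumPy illustration (not the repository's implementation) assumes a batch of 2N L2-normalized embeddings stacked as two augmented views per image; NSCL differs from standard InfoNCE only in that same-class negatives are masked out of the denominator.

```python
import numpy as np

def info_nce_and_nscl(z, labels, tau=0.5):
    """Illustrative CL (InfoNCE) and NSCL losses.

    z: (2N, d) L2-normalized embeddings, two views stacked as
       [x_1..x_N, x'_1..x'_N] (assumed layout); labels: (2N,) class labels.
    """
    n2 = z.shape[0]
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)         # never contrast a view with itself
    pos = np.roll(np.arange(n2), n2 // 2)  # index of the other view of each image

    # standard CL: every other sample in the batch is a negative
    cl = -np.mean(sim[np.arange(n2), pos] - np.log(np.exp(sim).sum(axis=1)))

    # NSCL: additionally drop same-class negatives (keep the positive pair)
    mask = labels[:, None] == labels[None, :]
    mask[np.arange(n2), pos] = False       # positive stays in the denominator
    nscl_sim = np.where(mask, -np.inf, sim)
    nscl = -np.mean(nscl_sim[np.arange(n2), pos]
                    - np.log(np.exp(nscl_sim).sum(axis=1)))
    return cl, nscl
```

Because NSCL only removes non-negative terms from the denominator, it is never larger than the CL loss, and the two coincide when every image in the batch belongs to a distinct class.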

We characterize the geometric structure of the global minimizers of the NSCL loss: the learned representations exhibit augmentation collapse, within-class collapse, and class centers that form a simplex equiangular tight frame. We further introduce a new bound on the few-shot error of linear-probing. This bound depends on two measures of feature variability—within-class dispersion and variation along the line between class centers. We show that directional variation dominates the bound and that the within-class dispersion's effect diminishes as the number of labeled samples increases. These properties enable CL and NSCL-trained representations to support accurate few-shot label recovery using simple linear probes.
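The two variability measures in the bound can be read off from features directly. The sketch below uses illustrative definitions (not necessarily the paper's exact normalizations): within-class dispersion as the mean squared distance to the class center, and directional variation as the variance of class-centered features projected onto the line connecting each pair of class centers.

```python
import numpy as np

def feature_variability(z, y):
    """Illustrative variability measures for features z (n, d), labels y (n,)."""
    classes = np.unique(y)
    mu = np.stack([z[y == c].mean(axis=0) for c in classes])

    # within-class dispersion: mean squared distance to the class center
    disp = np.mean([np.mean(np.sum((z[y == c] - mu[i]) ** 2, axis=1))
                    for i, c in enumerate(classes)])

    # directional variation: variance of class-centered features projected
    # onto the line between each pair of class centers
    dirs = []
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            u = mu[j] - mu[i]
            u = u / np.linalg.norm(u)
            for c in (classes[i], classes[j]):
                proj = (z[y == c] - z[y == c].mean(axis=0)) @ u
                dirs.append(np.mean(proj ** 2))
    return float(disp), float(np.mean(dirs))
```

Under the predicted geometry, within-class collapse drives both quantities to zero: features sitting exactly on their class centers give zero dispersion and zero directional variation.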

Finally, we empirically validate our theoretical findings: the gap between CL and NSCL losses decays at a rate of $\mathcal{O}(\frac{1}{\text{classes}})$; the two losses are highly correlated; minimizing the CL loss implicitly brings the NSCL loss close to the value achieved by direct minimization; and the proposed few-shot error bound provides a tight estimate of probing performance in practice.

Installation

To get started, follow these steps:

git clone https://github.com/DLFundamentals/understanding-ssl.git
cd understanding-ssl

The packages that we use are straightforward to install. Please run the following commands:

conda env create -f requirements.yml
conda activate contrastive

Pretraining SSL models

Training on Single GPU

Run the following command to train SimCLR on a single GPU.

python scripts/train.py --config <path-to-yaml-config>

Distributed Training on Multiple GPUs

Run the following command to train SimCLR on multiple GPUs.

NOTE: In our experiments, we used 2 GPUs for training. You can adjust the number of GPUs based on your hardware setup.

torchrun --nproc_per_node=N_GPUs --standalone scripts/multigpu_train_simclr.py --config <path-to-yaml-config>

Replace N_GPUs with the number of GPUs you want to use and <path-to-yaml-config> with the path to your configuration file.

Please refer to docs/pretraining for more details.

Linear Probing

To evaluate pretrained encoders via linear probing, you can run:

python scripts/linear_probe.py --config <path-to-config-file> --ckpt_path <path-to-ckpt-dir> --output_path <path-to-save-logs> --N <n_samples>

For example,

python scripts/linear_probe.py --config configs/simclr_DCL_cifar10_b1024.yaml --ckpt_path experiments/simclr/cifar10_dcl/checkpoints/ --output_path logs/cifar10/ --N 500

Evaluation

To validate our Theorem 1, you can run:

python scripts/losses_eval.py --config <path-to-config-file> --ckpt_path <path-to-ckpt-dir> --output_path <path-to-save-logs>

For example,

python scripts/losses_eval.py --config configs/simclr_DCL_cifar10_b1024.yaml --ckpt_path experiments/simclr/cifar10_dcl/checkpoints/ --output_path logs/cifar10/simclr/exp1/

This will write a losses.csv file to your output_path directory. You can then analyze the losses as a function of epochs and verify our proposed bound.
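One way to inspect the logged losses is with pandas. The column names below ("epoch", "cl_loss", "nscl_loss") are an assumption for illustration only; check the actual header of your losses.csv.

```python
import pandas as pd

def loss_gap(csv_path):
    """Per-epoch |CL - NSCL| gap from a losses.csv log.

    Column names ("epoch", "cl_loss", "nscl_loss") are assumed for
    illustration; adjust them to match the actual file header.
    """
    df = pd.read_csv(csv_path)
    df["gap"] = (df["cl_loss"] - df["nscl_loss"]).abs()
    return df[["epoch", "gap"]]
```

Plotting the resulting gap against the epoch column shows how closely the two losses track each other over training.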

Please refer to the scripts in docs/evaluation to reproduce the additional experiments shown in our paper.

License

This project is licensed under the Apache-2.0 license.

📚 Citation

If you find our work useful in your research or applications, please cite us using the following BibTeX:

@inproceedings{luthra2025selfsupervised,
  title={Self-Supervised Contrastive Learning is Approximately Supervised Contrastive Learning},
  author={Luthra, Achleshwar and Yang, Tianbao and Galanti, Tomer},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://openreview.net/forum?id=mf4V1SK0np}
}

About

Official code for "Self-Supervised Contrastive Learning is Approximately Supervised Contrastive Learning", NeurIPS 2025
