This repository contains scripts and notebooks to reproduce the experiments in
"From t-SNE to UMAP with contrastive learning", ICLR 2023 (openreview, arxiv),
by Sebastian Damrich, Niklas Böhm, Fred A Hamprecht, and Dmitry Kobak.
@inproceedings{damrich2023from,
title={From $t$-{SNE} to {UMAP} with contrastive learning},
author={Damrich, Sebastian and B{\"o}hm, Jan Niklas and Hamprecht, Fred A and Kobak, Dmitry},
booktitle={International Conference on Learning Representations},
year={2023},
}
This repository depends on several others, in particular contrastive-ne, which implement the actual logic and provide utilities.
Create and activate the conda environment
conda env create -f environment.yml
conda activate cl_tsne_umap
Install openTSNE, vis_utils, umap, ncvis and cne
git clone https://github.com/sdamrich/openTSNE
cd openTSNE
python setup.py install
cd ..
git clone https://github.com/sdamrich/vis_utils
cd vis_utils
python setup.py install
cd ..
git clone https://github.com/sdamrich/UMAPs-true-loss
cd UMAPs-true-loss
python setup.py install
cd ..
git clone https://github.com/sdamrich/ncvis
cd ncvis
make libs
make wrapper
cd ..
git clone -b iclr2023 https://github.com/berenslab/contrastive-ne
cd contrastive-ne
pip install --no-deps .
cd ..
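After the installation steps, a quick check that the packages can be found may save time before starting long runs. The following is a minimal sketch and not part of the repository: the helper `find_missing` is hypothetical, and the module names are assumptions taken from the package list above, so adjust them if a package installs under a different name.

```python
import importlib.util

# Import names are assumptions taken from the package list above;
# adjust them if a package installs under a different module name.
ASSUMED_MODULES = ["openTSNE", "vis_utils", "umap", "ncvis", "cne"]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(ASSUMED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

Run it inside the activated cl_tsne_umap environment. It only locates the modules and does not import them, so it does not detect a broken build.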
To reproduce the Neg-t-SNE embeddings from Fig. 1 a)-e), run
python scripts/compute_embds_cne.py
and check out the results in notebooks/negtsne.ipynb.
To reproduce the UMAP embeddings from Fig. S1 a)-c), run
python scripts/compute_embds_umap.py
and check out the results in notebooks/umap_vs_negtsne.ipynb.
To compute the metrics for the Neg-t-SNE embedding spectra (Fig. S4), run
python scripts/compute_metrics.py
and check out the results in notebooks/metrics.ipynb.
To reproduce the run time by batch size analysis from Fig. S6, run
python scripts/run_time_by_batch_size.py
and check out the results in notebooks/speed_up.ipynb.
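The four reproduction steps above can be chained in a small driver. The following is a minimal sketch and not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, the run order is an assumption, and the relative paths assume the same working directory as the commands above.

```python
import subprocess
import sys

# Script paths as listed above; running them in this order is an assumption.
SCRIPTS = [
    "scripts/compute_embds_cne.py",
    "scripts/compute_embds_umap.py",
    "scripts/compute_metrics.py",
    "scripts/run_time_by_batch_size.py",
]


def build_commands(scripts=SCRIPTS):
    """Return one command (argument list) per script, using the current interpreter."""
    return [[sys.executable, script] for script in scripts]


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, run it and stop on failure."""
    commands = build_commands()
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Raises CalledProcessError if the script exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)  # set dry_run=False to actually run the scripts
```

The dry run is the default so that the commands can be reviewed before starting the computations.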
To reproduce the SimCLR experiments with m=16 and random seed r=0, run
python cne_scripts_notebooks/scripts/cifar10_acc.py -m 16 -r 0
The results are printed to the terminal and can also be inspected in notebooks/eval_cifar.ipynb.
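To repeat the SimCLR experiment for several settings, the command lines can be generated programmatically. The following is a minimal sketch and not part of the repository: it only uses the `-m` (number of negative samples) and `-r` (random seed) arguments shown above, the helpers `cifar_command` and `sweep` are hypothetical, and the grid of values is chosen purely for illustration.

```python
import itertools
import sys

SCRIPT = "cne_scripts_notebooks/scripts/cifar10_acc.py"


def cifar_command(m, r):
    """Build the command for one run with m negative samples and random seed r."""
    return [sys.executable, SCRIPT, "-m", str(m), "-r", str(r)]


def sweep(ms, seeds):
    """Return one command per (m, seed) combination."""
    return [cifar_command(m, r) for m, r in itertools.product(ms, seeds)]


if __name__ == "__main__":
    # Hypothetical grid, chosen only for illustration.
    for cmd in sweep(ms=[16], seeds=[0, 1, 2]):
        print(" ".join(cmd))
```

The sketch only prints the commands, so they can be pasted into a shell or handed to a job scheduler.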
For other experiments, adapt the parameters at the top of compute_embds_cne.py
and compute_embds_umap.py, or at the top of the main function in cifar10_acc.py.
The number of negative samples and the random seed for cifar10_acc.py can be
passed as command line arguments, as above. Downloaded datasets and neighbor
embedding results are saved in cne_scripts_notebooks/data, and figures are
saved in cne_scripts_notebooks/figures.
All neighbor embedding results, alongside their parameters, can be
inspected in the Jupyter notebooks in cne_scripts_notebooks/notebooks.
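To see what a run has produced, the output directories named above can be listed. The following is a minimal sketch and not part of the repository: the helper `list_outputs` is hypothetical, it assumes the current directory contains cne_scripts_notebooks, and it makes no assumption about file names or formats.

```python
from pathlib import Path


def list_outputs(root="cne_scripts_notebooks"):
    """Map each output subdirectory to the sorted relative paths of its files."""
    outputs = {}
    for sub in ("data", "figures"):
        folder = Path(root) / sub
        if folder.is_dir():
            outputs[sub] = sorted(
                p.relative_to(folder).as_posix()
                for p in folder.rglob("*")
                if p.is_file()
            )
        else:
            outputs[sub] = []  # directory not created yet
    return outputs


if __name__ == "__main__":
    for sub, names in list_outputs().items():
        print(f"{sub}: {len(names)} file(s)")
```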
This list details which figures can be inspected using which notebooks:
- Fig 1: negtsne.ipynb, tsne.ipynb, ncvis.ipynb
- Fig 2: umap_vs_negtsne.ipynb
- Fig 3: parametric.ipynb
- Fig S1: umap_vs_negtsne.ipynb
- Fig S2: umap_vs_negtsne.ipynb
- Fig S3: trimap.ipynb
- Fig S4: metrics.ipynb
- Fig S5: toy_experiment.ipynb
- Fig S6: speed_up.ipynb
- Fig S7: attr_rep_plot_UMAP_neg.ipynb
- Fig S8: umap_vs_negtsne_vary_n_noise.ipynb
- Fig S9: tsne_vs_ncvis.ipynb
- Fig S10: tsne_vs_ncvis.ipynb
- Fig S11: negtsne.ipynb, tsne.ipynb, ncvis.ipynb
- Fig S12: imba_mnist_negtsne.ipynb, imba_mnist_tsne_ncvis_umap.ipynb
- Fig S13: human_negtsne.ipynb, human_tsne_ncvis_umap.ipynb
- Fig S14: zebrafish_negtsne.ipynb, zebrafish_tsne_ncvis_umap.ipynb
- Fig S15: c_elegans_negtsne.ipynb, c_elegans_tsne_ncvis_umap.ipynb
- Fig S16: k49_negtsne.ipynb, k49_tsne_ncvis_umap.ipynb
- Fig S17: k49_negtsne.ipynb
- Fig S18: ncvis.ipynb, tsne.ipynb
- Fig S19: infonctsne.ipynb, tsne.ipynb
- Tab 1: eval_cifar.ipynb



