Pretraining ImageNet-1k Replication: Table 2 Hyperparameters #41

Hi Randall,

I'm a grad student at Duke working on replicating your LeJEPA results, specifically the pre-training configurations for the few-shot classification experiments in Table 2.

Would you be willing to share the exact pre-training config used for the ViT-L run that feeds into those few-shot experiments? Specifically, I'm interested in:

- Batch size
- Exact learning rate (LR) and weight decay (WD)
- Whether SWA was applied
- Linear probe fitting details for the k-shot evaluations specifically

I really enjoyed the work. The SIGReg derivation is elegant, and removing the heuristics from SSL is a wonderful simplification.

Thanks for your time, and happy to share any findings that come out of this.

Best,
Alex
