Hi Randall,
I'm a grad student at Duke working on replicating your LeJEPA results, specifically the few-shot classification experiments in Table 2.
Would you be willing to share the exact pre-training config used for the ViT-L run that preceded those experiments? In particular, I'm interested in:
- Batch size
- Exact LR/WD
- Whether SWA was applied
- Linear probe fitting details for the k-shot setting
I really enjoyed the work. The SIGReg derivation is elegant, and the removal of heuristics for SSL is a wonderful simplification.
Thanks for your time, and happy to share any findings that come out of this.
Best,
Alex