Pretraining ImageNet-1k Replication: Table 2 Hyperparameters #41

@alexoh2bd

Hi Randall,

I'm a grad student at Duke working on replicating your LeJEPA results, specifically the pre-training configurations for the few-shot classification experiments in Table 2.

Would you be willing to share the exact pre-training config used for the ViT-L run that preceded the few-shot classification experiments? Specifically, I'm interested in:

  • Batch size
  • Exact LR/WD
  • Whether SWA was applied
  • Linear probe fitting details for k-shot specifically

I really enjoyed the work. The SIGReg derivation is elegant, and the removal of heuristics for SSL is a wonderful simplification.

Thanks for your time, and happy to share any findings that come out of this.

Best,
Alex
