
Feat/eval on dataset#392

Draft
qhua360 wants to merge 4 commits into galilai-group:main from qhua360:feat/eval-on-dataset

Conversation


@qhua360 qhua360 commented Feb 24, 2026

Description

Add a callback that runs evals on arbitrary datasets, along with an adapter function that maps existing callbacks to accepted eval functions.

Working example: https://github.com/galilai-group/clipa/blob/eval-on-dataset-rewrite/clip.py
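The adapter described above can be pictured roughly as follows. `callback_to_evaluator` is the name used in the commits, but the signature, the `.evaluate(...)` method, and the `Evaluator` alias here are illustrative assumptions, not the PR's actual code:

```python
from typing import Any, Callable, Dict

# Hypothetical sketch: callback_to_evaluator wraps an existing callback
# (e.g. CLIPZeroShot, OnlineKNN, OnlineProbe) into a plain evaluator
# function. The .evaluate(model, dataloader) interface is assumed here.
Evaluator = Callable[[Any, Any], Dict[str, float]]

def callback_to_evaluator(callback: Any) -> Evaluator:
    """Adapt an object exposing .evaluate(model, dataloader) into a
    function returning a metrics dict (assumed interface)."""
    def evaluator(model: Any, dataloader: Any) -> Dict[str, float]:
        return callback.evaluate(model, dataloader)
    return evaluator
```

The point of the adapter is that the eval-on-dataset callback only ever sees plain functions, so any existing callback can be reused without inheriting from a common base class.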

Checklist

  • I have read the Contributing document.
  • The documentation is up-to-date with the changes I made (check build artifacts).
  • All tests passed, and additional code has been covered with new tests.
  • I have added the PR to the RELEASES.rst file.

qhua360 and others added 4 commits February 20, 2026 16:59
Add reusable EvalOnDataset callback that runs evaluation functions on
arbitrary datasets every N epochs with DDP support, and a
callback_to_evaluator adapter that wraps existing Lightning callbacks
(CLIPZeroShot, OnlineKNN, OnlineProbe) into evaluator functions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
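The every-N-epochs gating described in this commit can be sketched in plain Python; the real `EvalOnDataset` is a Lightning callback with DDP support, which is omitted here, and the method name `maybe_evaluate` is a made-up stand-in for the Lightning hook:

```python
from typing import Any, Callable, Dict, List

# Hypothetical sketch of EvalOnDataset's epoch gating, inferred from the
# commit message ("runs evaluation functions ... every N epochs").
class EvalOnDataset:
    def __init__(self, evaluators: List[Callable[[Any], Dict[str, float]]],
                 every_n_epochs: int = 1):
        self.evaluators = evaluators
        self.every_n_epochs = every_n_epochs

    def maybe_evaluate(self, model: Any, epoch: int) -> Dict[str, float]:
        """Run all evaluators only when the (1-based) epoch count hits
        the configured interval; otherwise log nothing."""
        if (epoch + 1) % self.every_n_epochs != 0:
            return {}
        metrics: Dict[str, float] = {}
        for evaluate in self.evaluators:
            metrics.update(evaluate(model))
        return metrics
```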
Replace single name/data/evaluators params with a list of
EvalDatasetEntry dataclasses so one callback handles all eval
runs sequentially with a single DDP barrier, matching the
original ZeroShotEvalCallback behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
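Based on this commit, each entry plausibly bundles the three replaced parameters into one dataclass; the field names and types below are guesses from the mentioned name/data/evaluators params, not the PR's definition:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical shape of EvalDatasetEntry, inferred from the commit's
# "single name/data/evaluators params" wording.
@dataclass
class EvalDatasetEntry:
    name: str        # prefix used when logging this dataset's metrics
    data: Any        # dataset or dataloader to evaluate on
    evaluators: List[Callable[..., Dict[str, float]]] = field(default_factory=list)
```

Grouping the entries this way lets one callback iterate over all of them sequentially and hit a single DDP barrier at the end, as the commit describes.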
Let Lightning handle rank coordination and logger dispatch via
log_dict(sync_dist=True) instead of manually checking is_global_zero
and calling trainer.logger.log_metrics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
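`log_dict(sync_dist=True)` is a real Lightning API: each logged value is reduced across DDP ranks (mean by default) before being dispatched to the logger, which is why the manual `is_global_zero` check becomes unnecessary. A plain-Python illustration of that reduction (not Lightning's implementation):

```python
from statistics import mean
from typing import Dict, List

def sync_dist_mean(per_rank_metrics: List[Dict[str, float]]) -> Dict[str, float]:
    """Illustrative stand-in for the mean reduction that
    log_dict(..., sync_dist=True) applies across DDP ranks."""
    keys = per_rank_metrics[0].keys()
    return {k: mean(m[k] for m in per_rank_metrics) for k in keys}

# e.g. two ranks each holding a partial accuracy over their shard
ranks = [{"val/acc": 0.8}, {"val/acc": 0.6}]
```

Note that a mean over ranks is only exact when every rank sees the same number of samples; with uneven shards the per-sample average can differ slightly.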
