Conversation

@minettekaum (Contributor)

Description

Added KID (Kernel Inception Distance) metric

Related Issue

Fixes #(issue number)

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

The KID metric implementation has been tested with the following test suite:

  1. Main KID Test (test_kid):

    • Tests KID metric functionality on both CPU and CUDA devices
    • Uses the LAION256 dataset with 4 batches of images
    • Validates that KID returns ~0 when comparing identical images (abs < 0.25)
    • Ensures proper handling of the subset_size parameter (it must be smaller than the number of samples; see the sketch after this list)
    • Checks for NaN values to ensure numerical stability
    • Test cases: test_kid[LAION256-cpu] and test_kid[LAION256-cuda] - both PASSED
  2. Call Type Tests (test_check_call_type):

    • Validates KID works correctly in both "single" and "pairwise" evaluation modes
    • Ensures proper call type conversion (gt_y for single, pairwise_gt_y for pairwise)
    • Test cases: test_check_call_type[single-kid] and test_check_call_type[pairwise-kid] - both PASSED
  3. All Torch Metrics Tests:

    • Ran full test suite: uv run pytest tests/evaluation/test_torch_metrics.py -v
    • Result: 41 passed, 7 skipped (CUDA tests on macOS), 0 failed
    • Confirmed no regressions in existing metrics
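
For reference, a minimal sketch of the torchmetrics behavior these tests exercise (the tensor shapes and subset_size below are illustrative, not the suite's actual values):

```python
# Minimal sketch of the torchmetrics KID behavior exercised above.
# Shapes and subset_size are illustrative, not the suite's actual values.
import torch
from torchmetrics.image.kid import KernelInceptionDistance

# subset_size must be smaller than the number of accumulated samples,
# otherwise compute() raises an error.
kid = KernelInceptionDistance(subset_size=16)

# By default KID expects uint8 images of shape (N, 3, H, W).
real = torch.randint(0, 255, (32, 3, 299, 299), dtype=torch.uint8)
fake = torch.randint(0, 255, (32, 3, 299, 299), dtype=torch.uint8)

kid.update(real, real=True)
kid.update(fake, real=False)

# compute() returns a (mean, std) tuple -- the special case discussed
# in the review thread below.
kid_mean, kid_std = kid.compute()
assert not torch.isnan(kid_mean)  # the numerical-stability check above
```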

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

@minettekaum changed the title from "Feat/kid metric" to "Kid metric added" on Nov 11, 2025
@begumcig (Member) left a comment:


Thank you so much for implementing KID, Minette, it already looks amazing! 💜 I just left one small comment: I think we could handle the special case in compute() a bit more cleanly (maybe by adding a kid_compute to the TorchMetrics enum?), so that the compute() function stays as generic as possible. Other than that, it’s already looking great and pretty much ready to go! 💜💜

result = self.metric.compute()

# Handle KID which returns a tuple (mean, std)
if self.metric_name == "kid" and isinstance(result, tuple) and len(result) == 2:
Member (inline review comment):

I don't really have a suggestion for this yet but I think we should not do conditional statements in the compute function based on the metric name, as this can get messy quite easily, how do you feel?

@minettekaum (Contributor, Author) replied:

Very good point, and I agree. Having conditionals in compute() based on the metric name can get messy fast. So I cleaned it up a bit:

  • Took out the if self.metric_name == "kid" check from compute().
  • Added a kid_compute function to handle KID’s tuple return.
  • Updated the TorchMetrics enum so KID has a fourth element for kid_compute (the other metrics keep three).
  • Now compute() just uses self.compute_fn from the enum if it’s there, otherwise falls back to the default self.metric.compute().

So now the compute() method is generic and clean, no metric-specific conditionals. Each metric can have its own compute logic in the enum, just like how we handle the update functions.
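
A rough sketch of this direction (the enum layout, the kid_compute signature, and the wrapper class below are assumptions for illustration, not the actual diff):

```python
# Rough sketch of the enum-based dispatch described above; the real
# enum members, update functions, and signatures in the PR may differ.
from enum import Enum


def kid_compute(metric):
    """KID's compute() returns a (mean, std) tuple; unpack it here so
    the generic compute() below stays metric-agnostic."""
    mean, std = metric.compute()
    return {"mean": mean, "std": std}


class TorchMetrics(Enum):
    # (name, call_type, update_fn) for most metrics; KID carries a
    # fourth element with its own compute function.
    FID = ("fid", "pairwise", "default_update")
    KID = ("kid", "pairwise", "default_update", kid_compute)

    @property
    def compute_fn(self):
        # Fourth tuple element if present, else None (default path).
        return self.value[3] if len(self.value) > 3 else None


class MetricWrapper:
    def __init__(self, metric_enum, metric):
        self.metric_enum = metric_enum
        self.metric = metric  # the underlying torchmetrics object

    def compute(self):
        # Generic: no metric-name conditionals. If the enum supplies a
        # compute function, use it; otherwise fall back to the default.
        if self.metric_enum.compute_fn is not None:
            return self.metric_enum.compute_fn(self.metric)
        return self.metric.compute()
```

The key point is that compute() never inspects metric names; any metric-specific behavior lives entirely in the enum.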

What do you think, does this look like a good direction?

@sharpenb (Member) left a comment:


This looks good! Left a single comment.

@github-actions bot commented Nov 25, 2025

This PR has been inactive for 10 days and is now marked as stale.

@github-actions bot added the stale label on Nov 25, 2025
@begumcig (Member) left a comment:
Looks super good, Minette! I made a small comment about the blank line, but feel free to merge! Amazing job, you are a queen 👸

@github-actions bot removed the stale label on Nov 28, 2025
@github-actions bot commented Dec 8, 2025

This PR has been inactive for 10 days and is now marked as stale.

@github-actions bot added the stale label on Dec 8, 2025
@minettekaum merged commit 7d11666 into main on Dec 10, 2025
6 checks passed