feat(CrossValidationReport): Add threshold averaging for roc plot #1750

foster999 · 2025-05-23T11:00:23Z

Implements the threshold averaging method for ROC curve averaging (see #1702)

Includes simple test for averaging logic
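
For readers unfamiliar with the method, here is a minimal pure-Python sketch of threshold averaging. It assumes binary labels in {0, 1} and that every split contains both classes. The function names (`roc_point_at_threshold`, `threshold_average_roc`) and the `n_thresholds` parameter are illustrative only and are not taken from this PR's implementation.

```python
from statistics import mean


def roc_point_at_threshold(y_true, y_score, threshold):
    """Return (FPR, TPR) when samples with score >= threshold are predicted positive."""
    tp = fp = n_pos = n_neg = 0
    for label, score in zip(y_true, y_score):
        if label == 1:
            n_pos += 1
            tp += score >= threshold
        else:
            n_neg += 1
            fp += score >= threshold
    return fp / n_neg, tp / n_pos


def threshold_average_roc(splits, n_thresholds=10):
    """Average ROC curves over splits at shared thresholds.

    `splits` is a list of (y_true, y_score) pairs, one per cross-validation split.
    """
    # Pool the scores of all splits and sample thresholds from the sorted pool.
    pooled = sorted((s for _, y_score in splits for s in y_score), reverse=True)
    step = max(1, len(pooled) // n_thresholds)
    # The infinite thresholds pin the curve to (0, 0) and (1, 1).
    thresholds = [float("inf"), *pooled[::step], float("-inf")]

    curve = []
    for threshold in thresholds:
        # One ROC point per split at this threshold, then average FPR and TPR.
        points = [roc_point_at_threshold(y, s, threshold) for y, s in splits]
        curve.append((mean(p[0] for p in points), mean(p[1] for p in points)))
    return thresholds, curve
```

Because both coordinates are averaged at a fixed threshold, each point on the averaged curve corresponds to a single decision threshold.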

Still todo:

Handle caching so that averaged and non-averaged plots can be generated from the same report. The cache currently ignores the new argument and returns the previously cached plot. Should we cache averages separately?
Include the constituent ROC curves on the average plot, or present a measure of variance/confidence
Handle plot kwargs for the average ROC line (split_index is None)
Implement averaging for the precision-recall (PR) curve, or raise an error saying that averaging is undefined? I have already added the parameter to the PR display, because _MetricsAccessor uses the same interface to call both the PR and ROC displays.
Update docstrings with a description of the method and a reference
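
On the caching question, one option is to make the averaging argument part of the cache key, so that averaged and non-averaged displays are stored side by side. A minimal sketch, assuming a hypothetical `DisplayCache` class (this is not skore's actual cache, and the key fields are made up for illustration):

```python
from typing import Any, Callable, Optional


class DisplayCache:
    """Hypothetical cache keyed on every argument that changes the display."""

    def __init__(self) -> None:
        self._cache: dict[tuple, Any] = {}

    def get_or_compute(
        self,
        display_name: str,
        data_source: str,
        average: Optional[str],
        compute: Callable[[], Any],
    ) -> Any:
        # Including `average` in the key keeps the two variants separate.
        key = (display_name, data_source, average)
        if key not in self._cache:
            self._cache[key] = compute()
        return self._cache[key]
```

With this layout, requesting the averaged plot after the per-split plot triggers a fresh computation instead of returning the cached per-split display.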

foster999 · 2025-05-23T11:53:21Z

@glemaitre do you have any thoughts on the questions I've included above?

Regarding showing how much variation there is, I had something like this in mind:

I generally have a large number of ROC curves in the average, so wouldn't want to show a legend for each one

thomass-dev · 2025-05-26T10:30:54Z

[automated comment] Please update your PR with main, so that the pytest workflow status will be reported.

foster999 · 2025-05-28T11:58:48Z

Updated to use new format for passing data to displays. I think approval might be needed to get the workflows running?

github-actions · 2025-05-28T14:25:46Z

Coverage Report for skore/

File	Stmts	Miss	Cover	Missing
venv/lib/python3.12/site-packages/skore
__init__.py	23	0	100%
_config.py	28	0	100%
exceptions.py	4	4	0%	4, 15, 19, 23
venv/lib/python3.12/site-packages/skore/project
__init__.py	2	0	100%
metadata.py	67	0	100%
project.py	43	0	100%
reports.py	11	0	100%
widget.py	138	5	96%	375–377, 447–448
venv/lib/python3.12/site-packages/skore/sklearn
__init__.py	6	0	100%
_base.py	169	14	91%	45, 58, 126, 129, 182, 185–186, 188–191, 224, 227–228
find_ml_task.py	61	0	100%
types.py	22	0	100%
venv/lib/python3.12/site-packages/skore/sklearn/_comparison
__init__.py	5	0	100%
metrics_accessor.py	203	3	98%	170, 334, 1288
report.py	95	0	100%
utils.py	55	0	100%
venv/lib/python3.12/site-packages/skore/sklearn/_cross_validation
__init__.py	5	0	100%
metrics_accessor.py	207	1	99%	327
report.py	118	0	100%
venv/lib/python3.12/site-packages/skore/sklearn/_estimator
__init__.py	7	0	100%
feature_importance_accessor.py	143	2	98%	216–217
metrics_accessor.py	371	9	97%	158, 187, 189, 196, 287, 356, 360, 375, 410
report.py	155	0	100%
venv/lib/python3.12/site-packages/skore/sklearn/_plot
__init__.py	2	0	100%
base.py	5	0	100%
style.py	28	0	100%
utils.py	136	5	96%	51, 75–77, 81
venv/lib/python3.12/site-packages/skore/sklearn/_plot/metrics
__init__.py	5	0	100%
confusion_matrix.py	69	4	94%	90, 98, 120, 228
precision_recall_curve.py	230	1	99%	716
prediction_error.py	160	0	100%
roc_curve.py	295	37	87%	381, 431–433, 435–440, 442, 446, 451–452, 454, 460–462, 464–465, 470, 475, 477, 483–484, 490, 492–493, 495–496, 498, 607, 708, 874, 914, 1052, 1083
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split
__init__.py	0	0	100%
train_test_split.py	49	0	100%
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split/warning
__init__.py	8	0	100%
high_class_imbalance_too_few_examples_warning.py	17	1	94%	80
high_class_imbalance_warning.py	18	0	100%
random_state_unset_warning.py	10	0	100%
shuffle_true_warning.py	10	1	90%	46
stratify_is_set_warning.py	10	0	100%
time_based_column_warning.py	21	1	95%	73
train_test_split_warning.py	4	0	100%
venv/lib/python3.12/site-packages/skore/utils
__init__.py	6	2	66%	8, 13
_accessor.py	52	2	96%	67, 108
_environment.py	27	0	100%
_fixes.py	8	0	100%
_index.py	5	0	100%
_logger.py	22	4	81%	15–17, 19
_measure_time.py	10	0	100%
_parallel.py	38	3	92%	23, 33, 124
_patch.py	13	5	61%	21, 23–24, 35, 37
_progress_bar.py	45	0	100%
_show_versions.py	33	2	93%	65–66
_testing.py	37	0	100%
TOTAL	3311	106	96%

Tests	Skipped	Failures	Errors	Time
816	5 💤	0 ❌	0 🔥	1m 1s ⏱️

github-actions · 2025-05-28T14:31:05Z

Documentation preview @ 06b5895

auguste-probabl · 2025-05-30T08:44:51Z

It looks like you're still working on this PR; if so, can you set it to draft?

foster999 · 2025-05-30T08:51:55Z

Thanks for taking another look, just changed to draft now

I could do with advice on a couple of points please:

foster999 · 2025-05-30T08:52:26Z

skore/src/skore/sklearn/_cross_validation/metrics_accessor.py

+                display = display_class._compute_data_for_display(
+                    y_true=y_true,
+                    y_pred=y_pred,
+                    average=average,


Mypy still complains about this line, even after casting:
Unexpected keyword argument "average" for "_compute_data_for_display" of "PrecisionRecallCurveDisplay"

Without the cast, it reports the same error twice. Any suggestions as to why?

Hm, the cast looks correct to me... maybe reveal_type could help?

Side-note: This is a sign that _get_display should stop existing, although that should be the subject of another PR.

Thanks, though the revealed type is what I would expect after casting:
Revealed type is "type[skore.sklearn._plot.metrics.roc_curve.RocCurveDisplay]"
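
The thread does not establish the cause of the error. As a possible alternative to casting, an explicit `issubclass` check lets the type checker narrow the class before the ROC-only keyword is passed. The sketch below uses stand-in classes with the same names as the displays in this PR; their bodies, the `get_display` helper, and the `"threshold"` value are assumptions for illustration, not skore's code.

```python
from typing import Any, Optional, Union


class PrecisionRecallCurveDisplay:
    """Stand-in: its classmethod does not accept `average`."""

    @classmethod
    def _compute_data_for_display(cls, y_true: Any, y_pred: Any) -> tuple:
        return ("pr", None)


class RocCurveDisplay:
    """Stand-in: its classmethod accepts `average`."""

    @classmethod
    def _compute_data_for_display(
        cls, y_true: Any, y_pred: Any, average: Optional[str] = None
    ) -> tuple:
        return ("roc", average)


DisplayClass = Union[type[RocCurveDisplay], type[PrecisionRecallCurveDisplay]]


def get_display(
    display_class: DisplayClass,
    y_true: Any,
    y_pred: Any,
    average: Optional[str] = None,
) -> tuple:
    # Narrow to the ROC display before passing the ROC-only keyword.
    if issubclass(display_class, RocCurveDisplay):
        return display_class._compute_data_for_display(
            y_true=y_true, y_pred=y_pred, average=average
        )
    return display_class._compute_data_for_display(y_true=y_true, y_pred=y_pred)
```

Whether this resolves the error in `_get_display` depends on how `display_class` is annotated there, which is not shown in this thread.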

foster999 · 2025-05-30T08:55:36Z

skore/src/skore/sklearn/_plot/metrics/roc_curve.py

+        average_roc_curve = self.roc_curve.query(query)
+        average_roc_auc = self.roc_auc.query(query)["roc_auc"].item()
+
+        line_kwargs_validated = _validate_style_kwargs({}, {})


I'm not sure how best to accept style kwargs for the average line. The other lines select their kwargs using split_idx, but that is None for the average line.

Yeah, it might be time to make roc_curve_kwargs more specific in this case. Something like

    Kwargs = dict[str, Any]
    oneOrMore[T] = Union[T, list[T]]

    class RocCurveKwargs(TypedDict):
        splits: oneOrMore[Kwargs]
        average: Optional[Kwargs]
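
The suggestion above is shorthand rather than valid Python. A runnable version could look like the sketch below, where the generic alias is declared with a `TypeVar`. The `select_style` helper is hypothetical and only illustrates how `split_idx is None` could map to the `average` entry.

```python
from typing import Any, Optional, TypedDict, TypeVar, Union

Kwargs = dict[str, Any]
T = TypeVar("T")
OneOrMore = Union[T, list[T]]


class RocCurveKwargs(TypedDict):
    splits: OneOrMore[Kwargs]
    average: Optional[Kwargs]


def select_style(kwargs: RocCurveKwargs, split_idx: Optional[int]) -> Kwargs:
    """Pick the style for one line; split_idx=None means the average line."""
    if split_idx is None:
        return kwargs["average"] or {}
    splits = kwargs["splits"]
    # A single dict applies to every split; a list is indexed per split.
    if isinstance(splits, list):
        return splits[split_idx]
    return splits
```

For example, `select_style({"splits": {"alpha": 0.3}, "average": {"linewidth": 2}}, None)` returns the average-line style, while any integer index returns the shared split style.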

github-actions bot assigned foster999 May 23, 2025

foster999 force-pushed the 1702-average-roc branch 3 times, most recently from eb25f59 to 9c669e5, May 23, 2025 11:30

feat(CrossValidationReport): Add threshold averaging for roc plot

4e4af76

foster999 force-pushed the 1702-average-roc branch from b5a971a to 4e4af76, May 28, 2025 11:52

Merge branch 'main' into 1702-average-roc

2745dae

foster999 added 4 commits May 28, 2025 13:12

Update docstrings with average parameter

ed457c7

Merge branch '1702-average-roc' of github.com:foster999/skore into 1702-average-roc

3b8e8eb

Add reference for threshold averaging

efc6608

Add average parameter to metrics cache key

06b5895

foster999 marked this pull request as ready for review May 28, 2025 12:38

auguste-probabl reviewed May 28, 2025

skore/src/skore/sklearn/_plot/metrics/precision_recall_curve.py (outdated, resolved)

auguste-probabl reviewed May 28, 2025

skore/src/skore/sklearn/_plot/metrics/roc_curve.py (outdated, resolved)

auguste-probabl reviewed May 28, 2025

skore/src/skore/sklearn/_plot/metrics/roc_curve.py (resolved)

foster999 and others added 5 commits May 28, 2025 15:56

Remove unused precision recall curve average parameter

30e027c

Update error message

5eb22fb

Co-authored-by: Auguste Baum <[email protected]>

Fix control flow for RocCurveDisplay

dab17ae

Merge branch '1702-average-roc' of github.com:foster999/skore into 1702-average-roc

ff902d3

Merge branch 'main' into 1702-average-roc

07d31c6

auguste-probabl reviewed May 30, 2025

skore/src/skore/sklearn/_cross_validation/metrics_accessor.py (outdated, resolved)

foster999 marked this pull request as draft May 30, 2025 08:47

foster999 and others added 2 commits May 30, 2025 09:47

Update docs for averaging

10be285

Co-authored-by: Auguste Baum <[email protected]>

Merge branch 'main' into 1702-average-roc

058d22d
