Metrics dashboard api #6
base: master
Changes from all commits
794a07d
3b93629
9359f13
ddde2ff
0b86e6d
ace56ba
afa5890
92ea6fd
366c3e0
c3172e2
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| # Proposal for metrics/dashboard API | ||
|
|
||
| Goals: | ||
| * Unified treatment of many different kinds of metrics in dashboard | ||
| * A clean separation between dashboard (for visualization) and different kinds of backends (for metric calculation) | ||
|
|
||
| Status quo: | ||
| * Dashboard currently keeps track of the metric values in the cache dictionary `fairlearn.widget._fairlearn_widget.FairlearnWidget._response` | ||
| * The dictionary is updated in `fairlearn.widget._fairlearn_dashboard.FairlearnDashboard._on_request` | ||
| * There's a PR out to fill the whole data structure in `fairlearn.metrics.create_dashboard_dictionary` | ||
|
|
||
| Issues with the status quo: | ||
| * Code duplication / redundancy / brittleness due to copy-paste errors | ||
| * Currently only one kind of metric (`<metric>_summary`) is supported, and it is hard-wired into the dashboard dictionary | ||
|
|
||
| ## Proposal | ||
|
|
||
| ### Part I: More general dashboard dictionary | ||
|
|
||
| ```python | ||
| { | ||
| "prediction_type": "binary_classification" or "probabilistic_binary_classification" or "regression", | ||
| "array_bindings": { # all 1D arrays, including features and prediction vectors, are here | ||
| "<array_key>" : { # the keys can be arbitrary strings; not sure we need to force any convention, but see examples below | ||
| "name": string, # the name of a feature would be the feature name, of a prediction vector would be the model name | ||
| "values": number[], | ||
| "value_names": string[], # an optional field to encode categorical data | ||
|
Review comment: Presumably we also specify that extra keys (e.g. inserted by AzureML) are to be preserved. |
||
| }, | ||
| "sensitive_feature gender" : { # an example feature | ||
| "name": "gender", | ||
| "values": [0, 1, 0, 0, 2], | ||
| "value_names": ["female", "male", "non-binary"], | ||
|
Review comment: Should there be a 'type' field in here, so things like 'prediction' and 'sensitive_feature' don't have to go into the key? |
||
| }, | ||
| "y_pred model0" : { # an example prediction vector | ||
| "name": "model0", | ||
| "values": [0, 0, 1, 1, 0], | ||
| }, | ||
| "y_true": { | ||
| "name": "y_true", | ||
| "values": [0, 1, 1, 1, 0], | ||
| }, | ||
| "sample_weight": { | ||
|
Review comment: This is user-provided, not the one set within ExponentiatedGradient, right?
Reply: If so, perhaps it's worth documenting this with a short comment.
Reply: Will do. This is just an example of an array that we may want to pass to the metrics, since many metrics work with this kind of argument. |
||
| "name": "sample_weight", | ||
| "values": [0.1, 0.3, 1, 0.9], | ||
| } | ||
| ... | ||
| }, | ||
| "cache" : [ | ||
| { | ||
| "function": string, # python function name; we could either limit to fairlearn.metrics | ||
|
Review comment: Use fully qualified names for sure. |
||
| # or use fully qualified names | ||
| "arguments": { | ||
| "<array_argument>": "<array_key>" or null, # array-valued arguments are matched with array bindings | ||
| "<numeric_argument>": number or null, # we should also support numeric arguments, strings, booleans | ||
| "<string_argument>": string or null, # null corresponds to None | ||
| "<boolean_argument>": boolean or null, | ||
| }, | ||
| "return_value": number or string or boolean or null or dict, | ||
| # dict could be encoded as { "keys": any[], "values": any[] } | ||
| }, | ||
| { # an example | ||
| "function": "fbeta_score_group_summary", | ||
| "arguments": { | ||
| "y_true": "y_true", | ||
| "y_pred": "y_pred model0", | ||
| "sensitive_features": "sensitive_feature gender", | ||
| "sample_weight": "sample_weight", | ||
| "beta": 0.3, | ||
| }, | ||
| "return_value": { | ||
| "overall": 0.11, | ||
| "by_group": { | ||
| "keys": [0, 1, 2], | ||
|
Review comment: Are the 'keys' necessary, if we required all categoricals to be integer-encoded? |
||
| "values": [0.15, 0.04, 0.03], | ||
| } | ||
| }, | ||
| }, | ||
| ... | ||
| ] | ||
| } | ||
| ``` | ||
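To make the role of `array_bindings` concrete, here is a minimal sketch of how a backend might turn the `arguments` of a cache entry into actual call arguments. The helper names `resolve_arguments` and `decode_values` are hypothetical and are not part of fairlearn. The sketch also assumes that any string argument matching a key in `array_bindings` is an array reference, which means a plain string argument that happens to equal an array key would be ambiguous; the proposal does not say how to resolve that case.

```python
from typing import Any, Dict, List

# Example dashboard dictionary following the proposed schema (values taken
# from the examples above; the cache is left empty here).
dashboard: Dict[str, Any] = {
    "prediction_type": "binary_classification",
    "array_bindings": {
        "sensitive_feature gender": {
            "name": "gender",
            "values": [0, 1, 0, 0, 2],
            "value_names": ["female", "male", "non-binary"],
        },
        "y_pred model0": {"name": "model0", "values": [0, 0, 1, 1, 0]},
        "y_true": {"name": "y_true", "values": [0, 1, 1, 1, 0]},
    },
    "cache": [],
}


def resolve_arguments(
    arguments: Dict[str, Any], array_bindings: Dict[str, Any]
) -> Dict[str, Any]:
    """Replace array keys with the bound arrays; pass everything else through.

    Assumption: a string value that matches an array key is an array reference.
    """
    resolved: Dict[str, Any] = {}
    for arg_name, arg_value in arguments.items():
        if isinstance(arg_value, str) and arg_value in array_bindings:
            resolved[arg_name] = array_bindings[arg_value]["values"]
        else:
            # Numbers, booleans, unmatched strings, and None (JSON null).
            resolved[arg_name] = arg_value
    return resolved


def decode_values(binding: Dict[str, Any]) -> List[Any]:
    """Map integer codes to category names when "value_names" is present."""
    names = binding.get("value_names")
    if names is None:
        return list(binding["values"])
    return [names[code] for code in binding["values"]]


arguments = {
    "y_true": "y_true",
    "y_pred": "y_pred model0",
    "sensitive_features": "sensitive_feature gender",
    "sample_weight": None,
    "beta": 0.3,
}
resolved = resolve_arguments(arguments, dashboard["array_bindings"])
genders = decode_values(dashboard["array_bindings"]["sensitive_feature gender"])
```

A 'type' field on each binding, as suggested in the review comments, would be one way to remove the string ambiguity, because array references could then be distinguished without relying on key lookup alone.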
|
|
||
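The `{ "keys": any[], "values": any[] }` encoding suggested for dict-valued return values can be sketched as a pair of helpers. One likely motivation, stated here as an assumption rather than something the proposal spells out, is that JSON object keys are always strings, so integer group keys would not survive a round trip through a plain JSON object. The names `encode_dict` and `decode_dict` are hypothetical.

```python
import json
from typing import Any, Dict


def encode_dict(d: Dict[Any, Any]) -> Dict[str, list]:
    """Encode a dict as parallel "keys" and "values" lists."""
    keys = list(d.keys())
    return {"keys": keys, "values": [d[k] for k in keys]}


def decode_dict(encoded: Dict[str, list]) -> Dict[Any, Any]:
    """Rebuild the original dict from the parallel lists."""
    return dict(zip(encoded["keys"], encoded["values"]))


by_group = {0: 0.15, 1: 0.04, 2: 0.03}
encoded = encode_dict(by_group)

# Round trip through JSON text: integer keys are preserved because they are
# stored as list elements rather than as object keys.
round_tripped = decode_dict(json.loads(json.dumps(encoded)))

# For comparison, serializing the dict directly turns the keys into strings.
naive = json.loads(json.dumps(by_group))
```

If all categoricals were required to be integer-encoded as consecutive codes starting at 0, as raised in the review comments, the `keys` list could be dropped and the position in `values` would identify the group.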
| ### Part II: How to remove duplication | ||
|
|
||
| `<TODO>` | ||
Review comment: Any reason why we're omitting multiclass classification?
Reply: Because we don't have any support for it yet, but we can definitely add other prediction types in the future.