Use worst case method for MKI / KI (#34)
```diff
-df.sort_values(by="sale_price", kind="mergesort", inplace=True)
+df.sort_values(
+    by=["sale_price", "estimate"],
+    ascending=[True, False],
```
Uses ascending=False for estimate, in accordance with our external guidance.
After a lot of deliberation, we decided the best way forward is to assume the "worst-case scenario" for the MKI/KI metrics by sorting the data first by the ascending actual value (sale price) and then by the descending predicted value (modeled result). Not saying this has to be your solution, but we wanted to share our thinking in case it's helpful.
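The worst-case ordering described above can be sketched as follows. This is a toy illustration with made-up data, not the package's actual code; only the column names follow the diff:

```python
import pandas as pd

# Toy frame with tied sale prices.
df = pd.DataFrame({
    "sale_price": [100, 100, 200],
    "estimate": [90, 110, 250],
})

# Worst-case ordering: ascending actuals, then descending estimates,
# so tied sale prices are broken by the largest estimate first.
df = df.sort_values(
    by=["sale_price", "estimate"],
    ascending=[True, False],
    kind="mergesort",
)
print(df["estimate"].tolist())  # [110, 90, 250]
```

With the tie at sale_price == 100, the larger estimate (110) now deterministically sorts first, which is what makes the metric reproducible.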
jeancochrane left a comment:
Great work here! Overall I think this looks right, just a few small nitpicks below related to the test definition.
```python
@pt.mark.parametrize("metric", ["mki", "ki"])
def test_quintos_metric_matches_across_estimates(metric):
    """
    For the quintos dataset, MKI/KI should be identical based
    on the ordering of estimates.
    """
    sample = ap.quintos_sample()
```
[Nitpick, optional] For consistency with the other tests, I think it would make sense to switch to using a fixture here. Note how the unchanged tests above include the quintos_data fixture via a function parameter -- technically that's a fixture definition that inherits from the quintos_data fixture, but the principle is the same as including the fixture in a test:
(assesspy/assesspy/tests/test_metrics.py, lines 11 to 15 in a0d0359)
Here's the definition of the quintos_data fixture, which pytest loads automatically from conftest.py on startup so it can pass the fixture into any fixture or function that includes it in its function parameters:
(assesspy/assesspy/tests/conftest.py, lines 24 to 27 in a0d0359)
If we follow my recommendation above to save the new data in a new sample file, we would need to define a new quintos_data_with_tiebreaks fixture in conftest.py and then include it here:
```diff
 @pt.mark.parametrize("metric", ["mki", "ki"])
-def test_quintos_metric_matches_across_estimates(metric):
+def test_quintos_metric_matches_across_estimates(metric, quintos_data_with_tiebreaks):
     """
     For the quintos dataset, MKI/KI should be identical based
     on the ordering of estimates.
     """
-    sample = ap.quintos_sample()
+    sample = quintos_data_with_tiebreaks
```
The change should be pretty similar if we stick with quintos_data -- we just wouldn't need the extra fixture definition in conftest.py in that case, since the quintos_data fixture already exists.
Co-authored-by: Jean Cochrane <jeancochrane@users.noreply.github.com>
```python
def test_mki_tiebreaks_consistent(
    self, metric, quintos_data_with_tiebreaks
):
    sale_price, estimate, estimate_alt_sort_1, estimate_alt_sort_2 = (
```
We could index into the result instead, but unpacking feels easier to interpret later.
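A toy comparison of the two styles (made-up data; the names mirror the columns in the test):

```python
# Hypothetical result tuple, mirroring the four columns in the test.
columns = ([100, 200], [110, 190], [190, 110], [110, 190])

# Indexing works, but positions are opaque at the point of use:
first_col = columns[0]

# Unpacking documents every element in a single line:
sale_price, estimate, estimate_alt_sort_1, estimate_alt_sort_2 = columns
assert sale_price == first_col
```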
jeancochrane left a comment:
The new version of the test/fixture looks great! Some suggestions below, mostly just tweaks to documentation to make the purpose of this change clearer.
```python
df.sort_values(
    by=["sale_price", "estimate"],
    ascending=[True, False],
    kind="mergesort",
```
[Question, optional] I wonder if this kwarg is still necessary? Per the pandas docs, kind is only used when sorting on a single column, but now we're sorting on two columns:
> Choice of sorting algorithm. See also numpy.sort() for more information. mergesort and stable are the only stable algorithms. For DataFrames, this option is only applied when sorting on a single column or label.
I'm agnostic as to whether we should leave the kwarg in or take it out. It doesn't seem to make anything worse to leave it in, and it could provide a layer of defensiveness, preventing us from accidentally reintroducing an unstable sort if we ever decide to switch back to sorting on a single column. But I'd be interested to see if the tests still pass when we take it out.
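For context, here's what mergesort buys on a single-column sort -- a toy frame, not the package's data. A stable sort guarantees that rows tied on the key keep their original order, so the result is deterministic:

```python
import pandas as pd

df = pd.DataFrame({
    "sale_price": [100, 100, 100],
    "row_id": ["a", "b", "c"],
})

# mergesort is stable: tied sale prices keep their input order. An
# unstable kind (e.g. quicksort) makes no such guarantee on ties.
stable = df.sort_values(by="sale_price", kind="mergesort")
print(stable["row_id"].tolist())  # ['a', 'b', 'c']
```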
I assumed that it wouldn't affect anything. The only reason I left it in was that, if we ever wanted to do something with the dataset externally, the ordering would stay the same -- maybe we'd want to look at class once sorted by MKI. That's not really a good example, but I could imagine something along those lines.
I expect the tests to pass even without it.
Co-authored-by: Jean Cochrane <jeancochrane@users.noreply.github.com>
This uses a worst-case scenario method for the MKI / KI calculations in order to ensure reproducibility.