-
Notifications
You must be signed in to change notification settings - Fork 25
Fix issue in isQSO_randomforest #844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks, @akrolewski. This change would not be backward-compatible, which we usually try to avoid. So, we have a choice to make:
Pinging @sbailey for an opinion. |
|
I think the existing code is fine and the problem is that @akrolewski is passing in instead of (I'm not actually sure that is the correct use of In the first case, We could make this more robust by doing initial type checking on the @akrolewski please review what you intended to be passing as |
|
Ahh, thanks for catching that Stephen! Yes, I confirm that if I pass boolean input to primary, everything works as expected. I agree that there is no need to make any changes to the code. I think it would be helpful to note in the documentation of isQSO_randomforest that primary should be a boolean array. I ran into this issue because I needed to apply target selection to magnified versions of the LSS catalogs to measure magnification bias, but the PRIMARY column is not tracked into the LSS catalogs--which is why I passed a vector of all ones, and I didn't realize that I should be passing a vector of all booleans. |
| # ADM default mask bits from the Legacy Surveys not set. | ||
| preSelection &= imaging_mask(maskbits) | ||
|
|
||
| preSelection = preSelection.astype('bool') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following discussion in the PR, I suggest that we drop this preSelection type casting line and instead around line 1876 update it to be
if primary is None:
primary = np.ones_like(gflux, dtype=bool)
else:
assert primary.dtype == np.dtype(bool)or something similar to check that primary is an array of booleans.
And also update the docstring to be complete with describing the inputs and their types. e.g. it is also unclear to me if this function is supposed to support scalar inputs for testing individual targets, or whether it only works with input arrays and scalars are explicitly not supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbailey: This sounds sensible to me. @akrolewski: I suggest closing this PR, opening a separate issue linking to @sbailey's synopsis, here, and then assigning me to fix that issue.
I noticed that some objects in the ELGnotqso clustering catalogs were being classified as quasars in isQSO_randomforest. Specifically, I ran the following code snippet. The last two lines should have shape zero (because there should be no quasars in ELGnotqso) but they do not! It seems that preSelection was type int, but should have been type bool; this meant that preSelection was not being applied so objects with W1 > 22.3, W2 > 22.3, etc were being mistakenly classified as quasars.