Add bound-only mask accessor compatible with current particle selection#93
Add bound-only mask accessor compatible with current particle selection#93diegod-01 wants to merge 9 commits into
Conversation
|
An initial look revealed a case that breaks: import unyt as u
from swiftsimio import cosmo_quantity
from swiftgalaxy import SWIFTGalaxy, SOAP, MaskCollection
from swiftgalaxy.demo_data import generated_examples
soap = SOAP(generated_examples.soap, soap_index=0, extra_mask=None)
sg = SWIFTGalaxy(generated_examples.virtual_snapshot, soap)
mask = MaskCollection(
gas=sg.gas.spherical_coordinates.r
< cosmo_quantity(
5, u.kpc, comoving=True, scale_factor=sg.metadata.scale_factor, scale_exponent=1
)
)
sg.mask_particles(mask)
bound_mask = sg.get_bound_only_mask()
bound_mask.dark_matter.mask # ok
bound_mask.gas.mask # IndexErrorIn general it looks like there are some issues around keeping the loaded data and the masks in the right state (which masks are applied when). Another breaking case: import unyt as u
from swiftsimio import cosmo_quantity
from swiftgalaxy import SWIFTGalaxy, SOAP, MaskCollection
from swiftgalaxy.demo_data import generated_examples
soap = SOAP(generated_examples.soap, soap_index=0, extra_mask=None)
sg = SWIFTGalaxy(generated_examples.virtual_snapshot, soap)
mask = MaskCollection(
gas=sg.gas.spherical_coordinates.r
< cosmo_quantity(
5, u.kpc, comoving=True, scale_factor=sg.metadata.scale_factor, scale_exponent=1
)
)
sg.mask_particles(mask)
bound_mask = sg.get_bound_only_mask(relative_to_current=False)
assert sg.gas.group_nr_bound.size == sg.gas.coordinates.size // 3Here the |
|
Had to refactor the code quite a lot. I tried using the combine_with method of the LazyMask class which required removing the references to the internal datasets _particle_dataset._group_nr_bound or _particle_dataset._particle_ids; good luck not breaking everything for anyone brave to try... I ended up choosing this solution as it does not require a copy of the reader instance or refactoring the entire codebase. Should also pass the lint/ruff/flake8/numpydoc/sphinx-lint/mypy tests as far as I know. Can be ugly-looking, but I couldn't think of anything better. Let me know! |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #93 +/- ##
===========================================
- Coverage 100.00% 99.95% -0.05%
===========================================
Files 7 7
Lines 2319 2341 +22
Branches 261 269 +8
===========================================
+ Hits 2319 2340 +21
- Partials 0 1 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
kyleaoman
left a comment
There was a problem hiding this comment.
Made specific suggestions to resolve the test coverage. Once you push an addition to the docs I'm happy to merge.
| dtype=int, | ||
| ), | ||
| ][self._mask_index] | ||
| if load_masked: |
There was a problem hiding this comment.
| if load_masked: | |
| else: | |
| raise ValueError("Unknown particle type.") | |
| if load_masked: |
The coverage analysis is complaining that there's an if/elif chain here where the case where none of the items is triggered is never encountered. This could technically be resolved by making the last elif into an else but that leads to weird behaviour if something not in ("gas", "dark_matter", "stars", "black_holes") is passed, so this seems safer. Means that something in the test suite has to trigger this error.
There was a problem hiding this comment.
Suggested such a test below, surprisingly difficult to trigger!
| assert got_mask.dtype == bool | ||
| assert got_mask.dtype == bool | ||
| assert got_mask.all() | ||
|
|
There was a problem hiding this comment.
| def test_invalid_particle_type(self, sg): | |
| """Check that an unknown particle type in masking is refused.""" | |
| sg.metadata.num_part[2] = 1 # claim that there's a "boundary" particle | |
| with pytest.raises(ValueError, match="Unknown particle type."): | |
| sg.halo_catalogue._generate_bound_only_mask(sg).boundary.mask | |
(Not sure I've got the line spacing right, might need to run ruff format.)
This PR adds a new SWIFTGalaxy method to retrieve a bound-only mask without applying it immediately.
What is included:
• Adds get_bound_only_mask(relative_to_current=True).
• Preserves lazy-loading and lazy-mask behavior.
• By default, returns a mask aligned with the currently selected particles (after spatial mask and any existing extra masks), so it is ready to apply directly.
• Supports relative_to_current=False to return the mask in post-spatial-mask index space.
• Raises a clear error when no halo catalogue is attached (probably superfluous).
– Tests –
Added masking tests for:
• error behavior without a halo catalogue,
• lazy behavior (no eager data loading),
• compatibility with currently masked particle arrays,
• default relative_to_current behavior.
Local checks passed for masking and halo mask-compatibility test subsets.