Skip to content

Conversation

stephenworsley
Copy link
Contributor

@stephenworsley stephenworsley commented Sep 29, 2025

🚀 Pull Request

Closes #3808.

@CLAassistant
Copy link

CLAassistant commented Sep 29, 2025

CLA assistant check
All committers have signed the CLA.

@codecov
Copy link

codecov bot commented Oct 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.25%. Comparing base (fa6d61d) to head (bb60e9e).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6730      +/-   ##
==========================================
- Coverage   90.29%   90.25%   -0.05%     
==========================================
  Files          91       91              
  Lines       24475    24631     +156     
  Branches     4571     4609      +38     
==========================================
+ Hits        22100    22230     +130     
- Misses       1607     1624      +17     
- Partials      768      777       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@stephenworsley stephenworsley marked this pull request as ready for review October 16, 2025 21:05
Copy link
Member

@pp-mo pp-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good !
Made some very minor suggestions for clarity.

Comment on lines +667 to +670
df = [False] * len(max_outchunks)
for dim in dims:
df[dim] = True
df = tuple(df)
Copy link
Member

@pp-mo pp-mo Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be done more tidily

Suggested change
df = [False] * len(max_outchunks)
for dim in dims:
df[dim] = True
df = tuple(df)
df = tuple(i in dims for i in range(len(shape)))

-----
.. note:
If the output chunks would larger than the maximum chunksize set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny typo

Suggested change
If the output chunks would larger than the maximum chunksize set
If the output chunks would be larger than the maximum chunksize set

Comment on lines +156 to +158
# Note that one chunk is irregularly rechunked and the other isn't.
expected_chunk = (2, 2, 1, 2, 2)
assert result.chunks[1] == expected_chunk
Copy link
Member

@pp-mo pp-mo Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be clearer if you checked chunks in both the rechunked dims

Suggested change
# Note that one chunk is irregularly rechunked and the other isn't.
expected_chunk = (2, 2, 1, 2, 2)
assert result.chunks[1] == expected_chunk
# Note that one chunk is irregularly rechunked and the other isn't.
assert result.chunks[0] == (1, 1, 1, 1, 1)
assert result.chunks[1] == (2, 2, 1, 2, 2) # split from the original chunks of (5, 4)

Comment on lines +143 to +147

result = map_complete_blocks(
cube, self.func, dims=(2, 3), out_sizes=(30, 40), dtype=lazy_array.dtype
)
assert is_lazy_data(result)
Copy link
Member

@pp-mo pp-mo Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this bit is contributing anything much, as we don't check any properties of the result, except that it is lazy.
I think we can remove this, and only check the main result (i.e. the rechunked one).

Suggested change
result = map_complete_blocks(
cube, self.func, dims=(2, 3), out_sizes=(30, 40), dtype=lazy_array.dtype
)
assert is_lazy_data(result)
#(nothing)

# Reduce the optimum dask chunksize.
with dask.config.set({"array.chunk-size": "32KiB"}):
result = map_complete_blocks(
cube, self.func, dims=(2, 3), out_sizes=(30, 40), dtype=lazy_array.dtype
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a more convincing example if the "fixed" dims weren't at the end of the shape.

I think you can easily transpose the example so the initial content of
da.ones((5, 9, 10, 10), chunks=(2, 5, 10, 5))
==> da.ones((5, 10, 9, 10), chunks=(2, 10, 5, 5))
And here in the map call we use dims=(1, 3).

(N.B. I did try this, and it does actually seem to work!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

AreaWeighted regrid requires a lot of memory

3 participants