ghiggi
Owner

@ghiggi ghiggi commented Aug 23, 2025

Prework

What kind of change does this PR introduce? (check at least one)

  • Bugfix
  • Feature
  • Documentation
  • Tutorial
  • Code style update
  • Refactor
  • Build-related changes
  • Other, please describe:

Does this PR introduce a breaking change? (check one)

  • Yes
  • No

If yes, please describe the impact and communicate accordingly:

The PR fulfills these requirements:

  • It's submitted to a branch named as follows:
    • Fix a bug: bugfix-<some_key>-<word>
    • Improve the doc: doc-<some_key>-<word>
    • Improve a tutorial: tutorial-<some_key>-<word>
    • Add a new feature: feature-<some_key>-<word>
    • Refactor some code: refactor-<some_key>-<word>
    • Optimize some code: optimize-<some_key>-<word>
  • When resolving a specific issue, it's referenced in the PR's title (e.g. fix #xxx[,#xxx], where "xxx" is the issue number)
  • Don't forget to link PR to issue if you are solving one.
  • All tests are passing.
  • New/updated tests are included

Summary

This PR adds the gpm.open_files function, which reads a list of GPM files from the specified file paths.
This PR addresses #80.

@ghiggi
Owner Author

ghiggi commented Aug 23, 2025

Hi @kmuehlbauer! I had this in mind the entire week, so I spent 2 hours this morning implementing it.

Can you check whether it works well for your use case and report possible improvements?
In particular, try it with the parallel=True argument to see whether you run into any problems.
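
One way to run that check is a small harness that calls the reader once serially and once in parallel, and records the elapsed time and any exception for each run. This is only a sketch: `compare_serial_and_parallel` and `open_fn` are hypothetical names, and `open_fn` stands in for gpm.open_files under the assumption that it accepts a list of file paths plus a `parallel` keyword. If the reader is lazy, the timings only cover opening the files, not loading the data.

```python
import time


def compare_serial_and_parallel(open_fn, filepaths):
    """Call open_fn with parallel=False and parallel=True and record the outcome.

    open_fn is a stand-in for the reader under test (assumed signature:
    open_fn(filepaths, parallel=...)). Returns a dict keyed by the parallel flag.
    """
    results = {}
    for parallel in (False, True):
        start = time.perf_counter()
        try:
            open_fn(filepaths, parallel=parallel)
            error = None
        except Exception as exc:  # keep going so both modes are reported
            error = repr(exc)
        results[parallel] = {
            "seconds": time.perf_counter() - start,
            "error": error,
        }
    return results
```

With real data, this could be called as compare_serial_and_parallel(gpm.open_files, filepaths), and the returned dictionary pasted into a reply here.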

To avoid netCDF locking, I typically initialize a dask client first, as follows:

import os

# Disable HDF5 file locking before any HDF5/netCDF file is opened
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

from dask.distributed import Client, LocalCluster

cluster = LocalCluster(
    n_workers=20,
    threads_per_worker=1,  # important: set to 1 to avoid netCDF locking!
    processes=True,
)
client = Client(cluster)
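
The hard-coded n_workers=20 will not suit every machine, so here is a minimal sketch of the same setup with the worker count derived from the available CPUs. The helper names `cluster_kwargs` and `start_client`, and the `reserve` parameter, are hypothetical and not part of gpm or dask. The dask import is deferred so that the environment variable is set before the cluster is created.

```python
import os


def cluster_kwargs(n_workers=None, reserve=1):
    """Build LocalCluster keyword arguments for single-threaded worker processes.

    If n_workers is not given, use the CPU count minus `reserve` (at least 1).
    """
    if n_workers is None:
        n_workers = max(1, (os.cpu_count() or 1) - reserve)
    return {
        "n_workers": n_workers,
        "threads_per_worker": 1,  # one thread per worker to avoid netCDF locking
        "processes": True,
    }


def start_client(n_workers=None):
    """Disable HDF5 file locking, then start a local dask cluster and client."""
    os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"
    # Imported here so the environment variable is already set
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(**cluster_kwargs(n_workers))
    return Client(cluster)
```

Calling start_client() with no arguments leaves one CPU free for the main process; passing n_workers=20 reproduces the configuration above.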

FYI: The PR tests fail because of a minor problem related to an update of the polars library, but this affects a specific functionality of the software that should not concern you. I will fix it as soon as I have time.

codecov bot commented Aug 25, 2025

Codecov Report

❌ Patch coverage is 89.15663% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.29%. Comparing base (c5d3d25) to head (61a3bbb).

Files with missing lines    Patch %    Lines
gpm/dataset/coords.py       63.63%     4 Missing ⚠️
gpm/dataset/dataset.py      86.95%     3 Missing ⚠️
gpm/dataset/granule.py      81.81%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
+ Coverage   91.18%   91.29%   +0.10%     
==========================================
  Files         135      135              
  Lines       17214    17270      +56     
==========================================
+ Hits        15696    15766      +70     
+ Misses       1518     1504      -14     

@coveralls

Coverage Status

coverage: 91.291% (+0.1%) from 91.182%
when pulling 61a3bbb on add-open_mfdataset
into c5d3d25 on main.
