Skip to content

[Python] Lazy reading and writing.#2878

Draft
evetion wants to merge 2 commits intomainfrom
feat/lazy-loading
Draft

[Python] Lazy reading and writing.#2878
evetion wants to merge 2 commits intomainfrom
feat/lazy-loading

Conversation

@evetion
Copy link
Member

@evetion evetion commented Feb 6, 2026

This refactors the read/write (and _load/_save) methods considerably and introduces:

  • Model._save(/, external=True, internal=True) to control writing all geopackage tables (internal) and/or all arrow/netcdf files (external).
  • TableModel._save(filepath) to write individual paths

Together with the existing _write_toml, this allows for more granular control of writing for the HydroMT plugin.

  • Model.read(/, lazy=true) only reads the model config, and nothing else. It means no tables, or empty ones in the case of Node/Link, which can be written to disk as well. It's much like the Model(startime, endtime, crs) constructor result in that sense.
  • Model.load(external=True, internal=True) to load all tables from disk (overwriting/resetting them) if any changes were done.
  • TableModel.load() for individual tables, same caveats as above.

This also refactors ChildModel and introduces a ParentModel (both now using composition instead of inheritance) so an empty lazy table can find the toml path of its (grandparent) Model. The ChildModel changes are (roughly) the same as #2876. If the ParentModel is too much, we could also allow loading with a context manager (with Model.context(): Model.node.table.load()?

Still needs a bit of cleanup.

@hboisgon

@evetion evetion requested a review from visr February 6, 2026 11:22
@visr
Copy link
Member

visr commented Feb 6, 2026

This seems fine to me API wise. I guess we'd recommend the eager way as we do now, and document the lazy API for more advanced use cases?

It looks like #2876 is ready, can you check and merge?

@evetion
Copy link
Member Author

evetion commented Feb 6, 2026

This seems fine to me API wise. I guess we'd recommend the eager way as we do now, and document the lazy API for more advanced use cases?

Yeah, the _save stuff is purely internal (which I find acceptable to use in a hydromt plugin, with the added test here). But since read is public anyway, the lazy kwarg becomes public, and we need to have a public load as well(?). It is a bit complex, and prone to user confusion (some tables empty), so the default should always be eager IMHO.

@visr
Copy link
Member

visr commented Feb 6, 2026

Shouldn't we strive for hydromt to only use public API though? Do you see issues with adding the new _save methods/kwargs to write?

@evetion
Copy link
Member Author

evetion commented Feb 6, 2026

We can easily make new write methods to make it public yes. But maybe we should put it into an advanced documentation section? It becomes easy to write incomplete and thus invalid models on read.

The lazy load is actually even more difficult, and sorta requires the refactor of the multinodetable. Now you can create or load tables that have a nodeid clash with a table that's not loaded.

That's why I'm ok with the hydromt plugin handling that state in a consistant state internally.

@visr
Copy link
Member

visr commented Feb 6, 2026

Yeah it is fine if it is advanced docs with warnings. Are you planning on doing the multinodetable refactor soon? We could revisit this after that.

@evetion
Copy link
Member Author

evetion commented Feb 6, 2026

Yeah it is fine if it is advanced docs with warnings. Are you planning on doing the multinodetable refactor soon? We could revisit this after that.

We should merge #2876, rebase this PR and merge it so we can start the work on HydroMT. Then the refactor should come before the next release, when is that? ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants