Add xDeepONet family to experimental models #1576
Conversation
Introduces `physicsnemo.experimental.models.xdeeponet`, a config-driven, unified implementation of eight DeepONet-based operator-learning architectures for both 2D and 3D spatial domains:

- `deeponet`, `u_deeponet`, `fourier_deeponet`, `conv_deeponet`, `hybrid_deeponet` (single-branch variants)
- `mionet`, `fourier_mionet` (two-branch multi-input variants)
- `tno` (Temporal Neural Operator; branch2 = previous solution)

Features:

- Composable spatial branches (Fourier, UNet, Conv in any combination)
- Three decoder types: `mlp`, `conv`, `temporal_projection`
- Automatic spatial padding to multiples of 8
- Automatic trunk coordinate extraction (time or grid)
- Optional adaptive pooling (`internal_resolution`) for resolution-agnostic training and inference

Uses `physicsnemo.models.unet.UNet` as the UNet sub-module; a small internal adapter tiles a short time axis to reuse the library's 3D UNet for 2D spatial branches. Imports spectral, convolutional, and MLP layers from `physicsnemo.nn` and `physicsnemo.models.mlp`.

Includes 29 unit tests covering all variants (2D/3D), decoder types, temporal projection, `target_times` override, gradient flow, and adaptive pooling.

Related discussion with code owners:

- Placed under `experimental/` per PhysicsNeMo convention for new models.
- Custom UNet dropped in favour of the library UNet.
- Tests under `test/experimental/models/` for CI coverage.

Signed-off-by: wdyab <wdyab@nvidia.com>
Made-with: Cursor
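The branch-trunk decomposition that all eight variants share can be sketched in a few lines. This is a toy illustration only, with hypothetical names; the real API lives in `physicsnemo.experimental.models.xdeeponet` and is config-driven rather than argument-driven.

```python
import torch
import torch.nn as nn

# Illustrative only: names below are hypothetical stand-ins,
# not the PR's actual classes or configuration schema.
_VALID_VARIANTS = (
    "deeponet", "u_deeponet", "fourier_deeponet", "conv_deeponet",
    "hybrid_deeponet", "mionet", "fourier_mionet", "tno",
)

class ToyDeepONet(nn.Module):
    """Minimal branch-trunk DeepONet: output_ij = <branch(u_i), trunk(y_j)>."""

    def __init__(self, variant: str, in_dim: int, coord_dim: int, width: int = 64):
        super().__init__()
        if variant not in _VALID_VARIANTS:
            raise ValueError(f"Unknown variant: {variant!r}")
        self.branch = nn.Sequential(
            nn.Linear(in_dim, width), nn.GELU(), nn.Linear(width, width)
        )
        self.trunk = nn.Sequential(
            nn.Linear(coord_dim, width), nn.GELU(), nn.Linear(width, width)
        )

    def forward(self, u: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # u: (B, in_dim) sensor values; y: (N, coord_dim) query coordinates
        b = self.branch(u)   # (B, width)
        t = self.trunk(y)    # (N, width)
        return b @ t.T       # (B, N) operator output at the query points

model = ToyDeepONet("deeponet", in_dim=100, coord_dim=2)
out = model(torch.randn(4, 100), torch.randn(50, 2))
```

The single-branch variants differ in how `branch` is built (MLP, UNet, Fourier, Conv); the multi-input variants add a second branch whose output is combined element-wise with the first.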
Fix six issues flagged by the Greptile review:

- Make `DeepONetWrapper` / `DeepONet3DWrapper` inherit from `physicsnemo.core.module.Module` (MOD-001). Core `DeepONet` / `DeepONet3D` also pass proper `MetaData` dataclasses.
- Raise `ValueError` at `__init__` when `mionet` / `fourier_mionet` / `tno` are constructed without `branch2_config` (prevents silent degradation to a single-branch model).
- Add an optional `output_window` constructor parameter so the `temporal_projection` decoder registers `temporal_head` at `__init__`, producing a deterministic `state_dict` that round-trips cleanly. `set_output_window` is retained for backwards compatibility.
- Raise `ValueError` from `MLPBranch` when `num_layers < 2`.
- Convert public docstrings to r-prefixed raw strings with Parameters / Forward / Outputs sections and LaTeX shape notation per MOD-003.
- Add `jaxtyping.Float` annotations and `torch.compiler.is_compiling()`-guarded shape validation to all public forward methods (MOD-005, MOD-006).

Signed-off-by: wdyab <wdyab@nvidia.com>
Made-with: Cursor
Resolves CHANGELOG.md conflict under "### Added" by keeping the xDeepONet entry alongside the updated GLOBE bullet from main. Signed-off-by: wdyab <wdyab@nvidia.com> Made-with: Cursor
Fix the new P1 issue flagged in the second Greptile review and close two secondary gaps the summary called out:

- `DeepONet.forward` / `DeepONet3D.forward`: raise `RuntimeError` when `decoder_type='temporal_projection'` is used but `temporal_head` is still `None` (i.e. the user neither passed `output_window` at construction nor called `set_output_window` before `forward`). Previously the silent ``if temporal_head is not None`` skip returned (B, H, W, width) instead of (B, H, W, K).
- Deduplicate the `VALID_VARIANTS` list: pulled to a module-level `_VALID_VARIANTS` tuple; both `DeepONet` and `DeepONet3D` still expose it as the `VALID_VARIANTS` class attribute for a stable public API.
- Extend the parametrized test lists to cover `fourier_deeponet`, `hybrid_deeponet`, and `fourier_mionet`, and add a dedicated `TestFourierBranchPaths` class with `num_fourier_layers > 0` so the spectral-conv code path in `SpatialBranch` / `SpatialBranch3D` is actually exercised in CI.
- Add a `TestTemporalProjectionGuard::test_forward_without_output_window_raises` regression test for the new `RuntimeError`.

Signed-off-by: wdyab <wdyab@nvidia.com>
Made-with: Cursor
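The fail-loud guard described in the first bullet can be sketched as follows. Names are illustrative stand-ins for the PR's decoder wiring, not its actual classes: the point is that an unset head raises immediately rather than silently returning width-sized features instead of K projected time points.

```python
import torch
import torch.nn as nn

class TemporalProjectionDecoder(nn.Module):
    """Sketch: a temporal-projection head that must be registered either at
    construction (output_window) or via set_output_window() before forward."""

    def __init__(self, width: int = 16, output_window=None):
        super().__init__()
        self.width = width
        self.temporal_head = None
        if output_window is not None:
            # Registering at __init__ keeps state_dict deterministic.
            self.set_output_window(output_window)

    def set_output_window(self, k: int) -> None:
        self.temporal_head = nn.Linear(self.width, k)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        if self.temporal_head is None:
            raise RuntimeError(
                "decoder_type='temporal_projection' requires output_window at "
                "construction or a prior call to set_output_window()"
            )
        return self.temporal_head(feats)  # (..., width) -> (..., K)

decoder = TemporalProjectionDecoder(width=16, output_window=4)
out = decoder(torch.randn(2, 16))   # projects width=16 features to K=4 times
```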
Two new P1 issues flagged on 85076f6:

- Case-sensitive `decoder_type` check: `__init__` lowered ``decoder_type`` into ``self.decoder_type`` but then branched on the raw argument (``if decoder_type == "temporal_projection":``) and forwarded the raw value to ``_build_decoder``. A user passing ``decoder_type="MLP"`` or ``"Temporal_Projection"`` ended up with ``Unknown decoder_type: MLP`` bubbling out of ``_build_decoder``. Both branches of the check now use ``self.decoder_type``; same fix in ``DeepONet3D.__init__``.
- MLP branch + `decoder_type='temporal_projection'` silently returned (B, T, width) instead of (B, K) because the MLP-branch path in ``forward`` never consulted ``self._temporal_projection``. The incompatibility is static, so reject it at `__init__` with a descriptive ``ValueError`` rather than at forward. Same guard in ``DeepONet3D.__init__``.

Regression tests: ``TestDecoderTypeNormalization`` (mixed-case ``"MLP"`` / ``"Temporal_Projection"`` accepted) and ``TestMLPBranchTemporalProjectionGuard`` (2D and 3D both reject the invalid combination).

Signed-off-by: wdyab <wdyab@nvidia.com>
Made-with: Cursor
Proactive audit on top of Greptile's round-4 findings. All plausible silent-degradation combinations at the config boundary now fail loudly at `__init__` instead of producing wrong shapes or cryptic PyTorch errors at forward time. Construction-time guards added to both `DeepONet` and `DeepONet3D`:

- Unknown `decoder_type` is rejected up front against a new module-level ``_VALID_DECODER_TYPES`` set (previously deferred to ``_build_decoder`` and only surfaced on the non-temporal branch).
- `MLPBranch` branch1 paired with `decoder_type='conv'` is rejected (would otherwise crash inside ``Conv2d`` with a generic "Expected 3D or 4D input" message). Unified with the existing `temporal_projection` guard into a single check.
- `MLPBranch` branch1 paired with a non-`MLPBranch` branch2 is rejected (the element-wise product assumed matching ranks; previously it broadcast nonsensically or raised a cryptic dim mismatch at forward).

Regression tests:

- ``TestMLPBranchConvDecoderGuard`` -- 2D/3D
- ``TestMixedBranchTypeGuard`` -- 2D/3D
- ``TestInvalidDecoderTypeGuard`` -- 2D/3D

Full suite: 47 passed.

Signed-off-by: wdyab <wdyab@nvidia.com>
Made-with: Cursor
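Taken together, the two commits above amount to one normalize-then-validate step at construction time. The helper below is a sketch of that pattern, with names mirroring the commit messages but not the PR's actual function signatures: the decoder type is lowered exactly once, checked against a module-level set, and every statically invalid branch/decoder combination raises a descriptive `ValueError` up front.

```python
# Illustrative sketch of the construction-time guards; not the PR's code.
_VALID_DECODER_TYPES = {"mlp", "conv", "temporal_projection"}

def validate_decoder_config(decoder_type: str, branch1_is_mlp: bool,
                            branch2_is_mlp=None) -> str:
    """Normalize decoder_type once and fail loudly on invalid combinations."""
    decoder_type = decoder_type.lower()  # accept "MLP", "Temporal_Projection", ...
    if decoder_type not in _VALID_DECODER_TYPES:
        raise ValueError(
            f"Unknown decoder_type: {decoder_type!r}; "
            f"expected one of {sorted(_VALID_DECODER_TYPES)}"
        )
    # An MLP branch emits flat features: conv and temporal_projection
    # decoders both assume spatial feature maps, so reject statically.
    if branch1_is_mlp and decoder_type in ("conv", "temporal_projection"):
        raise ValueError(
            f"MLPBranch branch1 is incompatible with decoder_type={decoder_type!r}"
        )
    # Element-wise branch fusion assumes matching ranks.
    if branch2_is_mlp is not None and branch1_is_mlp != branch2_is_mlp:
        raise ValueError(
            "branch1 and branch2 must both be MLP branches or both be spatial branches"
        )
    return decoder_type
```

All downstream code then branches on the returned, normalized value only, which is what the round-4 fix restored.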
``_build_conv_encoder`` called ``get_activation`` directly without
the ``sin`` special-case handling used in every other activation
site in ``branches.py``. Passing
``{"encoder": {"type": "conv", "activation_fn": "sin"}}`` therefore
raised ``KeyError: Activation function sin not found``.
``torch.sin`` is a bare callable and cannot be placed inside an
``nn.Sequential`` (which requires ``nn.Module`` instances), so the
fix introduces a small ``_SinActivation`` wrapper module alongside
``_build_conv_encoder``. The helper is module-level and is called
from both ``DeepONet`` and ``DeepONet3D``; only one fix site exists
despite the function being invoked from both classes.
Regression test ``TestConvEncoderSinActivation`` constructs a
multi-layer conv encoder with ``activation_fn="sin"`` and runs a
forward pass to confirm that neither the ``KeyError`` nor an
``nn.Sequential`` ``TypeError`` resurfaces.
Signed-off-by: wdyab <wdyab@nvidia.com>
Made-with: Cursor
All Greptile feedback addressed across five rounds. Main fixes: wrappers inherit from `physicsnemo.core.module.Module`.
Is it standard practice to have documentation shipped in some markdown files? It is not a critical issue for this PR, but that's a significant shift from docs in docstrings + rst files... I am worried this will lead to more fragmentation of the API docs, which would make information more difficult to discover.
IMO it's ok to ship some informative docs right here with the code like this. It's especially helpful in the days of agents and such.
That said it's no substitute for API docs like we ship with rst on the physicsnemo docs, either, and those should be the source of truth.
tl;dr it's ok to drop hints here but not fully expect docs here, what do you think of that strategy @CharlelieLrt? We have some stuff like that with mesh, and I put some .md in the datapipes too. It's not about "how to use it" and more about "this is why it's designed this way in the code"
Docstring fragment under review:

> :math:`K` points instead of the times extracted from ``x``,
> enabling autoregressive temporal bundling where :math:`K \neq T`.
>
> Outputs
Examples section missing in nearly all docstrings (at least those that are user-facing)
Most of these tests are kinda worthless. It should be:

- test for the constructor (both default parameters and custom ones) -> check all public attribute values
- for all public methods, including the forward:
  - test instantiation + method non-regression vs. golden files
  - test loading from checkpoint (.mdlus) + method non-regression vs. the same golden file
  - test gradient flow
  - test compile

Model and output should be as small and minimal as possible. No need to test internal submodules, layers and such, unless those are meant to be upstreamed to physicsnemo/nn.
Summary

Introduces `physicsnemo.experimental.models.xdeeponet`, a config-driven, unified implementation of eight DeepONet-based operator-learning architectures for both 2D and 3D spatial domains:

- `deeponet`, `u_deeponet`, `fourier_deeponet`, `conv_deeponet`, `hybrid_deeponet`: single-branch variants
- `mionet`, `fourier_mionet`: two-branch multi-input variants
- `tno`: Temporal Neural Operator (branch2 = previous solution)

This is the first of several PRs restructuring the Neural Operator Factory per discussion with code owners. Subsequent PRs will upstream xFNO (experimental) and refactor the reservoir-simulation NOF example to consume these library models.

Closes Issue #1575

Key features

- Composable spatial branches (Fourier, UNet, Conv in any combination)
- Three decoder types: `mlp`, `conv`, `temporal_projection`
- Automatic spatial padding to multiples of 8
- Automatic trunk coordinate extraction (time or grid)
- Optional adaptive pooling (`internal_resolution`) for resolution-agnostic training and inference

Design decisions (per @coreyjadams guidance)

- Placed under `experimental/` per the convention for new models.
- Tests under `test/experimental/models/` for CI coverage.

Checklist

- Unit tests added (`test/experimental/models/test_xdeeponet.py`)
- `CHANGELOG.md` is up to date

Test plan

- Unit tests covering all variants (2D/3D, decoder types, temporal projection, `target_times` override, gradient flow, adaptive pooling)
- `ruff check`, `ruff format`, `interrogate`, `markdownlint`, `license` pre-commit hooks pass on `physicsnemo/experimental/` and `test/experimental/`

Related