
Conversation

@SebastianM-C
Member

Since we always dispatch on the optimizer, it makes much more sense
to have this as the first argument (and first type parameter).
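For concreteness, a minimal sketch of the motivation (the field set here is hypothetical; the real OptimizationCache carries many more type parameters and fields):

```julia
# Sketch only: the real OptimizationCache has many more parameters.
struct OptimizationCache{O, F, RC}
    opt::O           # the optimizer, now the first field and first type parameter
    f::F             # the OptimizationFunction / objective
    reinit_cache::RC # remaining state
end

# With the optimizer first, a solver package can dispatch without naming
# (or skipping past) every unrelated type parameter in front of it:
# __solve(cache::OptimizationCache{<:SomeOptimizer}) = ...
```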

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes was updated
  • The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
  • Any new documentation only uses public API


@ChrisRackauckas
Member

Make this a 4.0 and let's just do it. We are basically the only ones that depend on OptimizationBase directly, so it's okay that this didn't get lumped into v3.

@SebastianM-C
Member Author

I'm not sure how to handle the CI part. The Core CI for OptimizationBase uses OptimizationOptimJL, so if we have a breaking change, we can't test it in CI without updating both at the same time.

Looking at runtests.jl, I see that we have

```julia
if GROUP == "All" || GROUP == "Core"
    dev_subpkg("OptimizationOptimJL")
    dev_subpkg("OptimizationOptimisers")
    dev_subpkg("OptimizationMOI")
elseif GROUP == "GPU" || GROUP == "OptimizationPolyalgorithms"
    dev_subpkg("OptimizationOptimJL")
    dev_subpkg("OptimizationOptimisers")
elseif GROUP == "OptimizationNLPModels"
    dev_subpkg("OptimizationOptimJL")
    dev_subpkg("OptimizationMOI")
end
```

so there are certain groups of packages that are intended to go together, but if we look at the CI logs, the dev commands don't seem to have any effect. I think that's because we go through Pkg.test before we even get to the dev_subpkg calls. I think going with [sources] instead of dev_subpkg might work better; should I try that? Also, if I'm messing with the CI setup anyway, would it be reasonable to run just v1.11 & LTS for now to reduce the noise?
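For reference, a minimal sketch of what a [sources] stanza could look like in the test environment's Project.toml (the relative paths are assumptions based on the repo's lib/ layout; [sources] requires Julia 1.11+ Pkg):

```toml
# Pkg applies these path overrides when the environment is resolved,
# i.e. before Pkg.test runs, unlike a runtime dev_subpkg call.
[sources]
OptimizationOptimJL = {path = "../lib/OptimizationOptimJL"}
OptimizationOptimisers = {path = "../lib/OptimizationOptimisers"}
OptimizationMOI = {path = "../lib/OptimizationMOI"}
```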

Btw, if we do a v4, can we simplify supports_opt_cache_interface too? Right now we have 2 separate functions with the same name, one in OptimizationBase and one in SciMLBase, and we only use the SciMLBase one in __solve, but almost all optimizers extend both because (on master & the latest SciMLBase) both packages define that function.

@ChrisRackauckas
Member

We just need to merge, release OptimizationBase v4, then downstream test via sources, then merge and release everything else. But the fact that v1.12 came out and the autodiff packages red-X our CI just absolutely murders our ability to check this stuff automatically, so 😅 it's a bit rough right now.

Let me make sure v2 backports are in a good spot first then we do this.

Also, maybe we want an IIP dispatch on this sooner or later?

@ChrisRackauckas
Member

Btw, if we do a v4, can we simplify supports_opt_cache_interface too? Right now we have 2 separate functions with the same name, one in OptimizationBase and one in SciMLBase, and we only use the SciMLBase one in __solve, but almost all optimizers extend both because (on master & the latest SciMLBase) both packages define that function.

Yes, let's make something in SciMLBase like allowsinit or allows_init_iterator or something. DifferentialEquations, NonlinearSolve, etc. should just use the same trait.

@SebastianM-C
Member Author

SebastianM-C commented Oct 20, 2025

Yes, let's make something in SciMLBase like allowsinit or allows_init_iterator or something. DifferentialEquations, NonlinearSolve, etc. should just use the same trait.

SciML/SciMLBase.jl#1157 adds has_init
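For context, the trait pattern being added looks roughly like this (a sketch, not the exact SciMLBase definition; MyOpt and solve_entry are hypothetical, while init/solve!/__solve stand in for the usual SciMLBase entry points):

```julia
struct MyOpt end

# Default: an optimizer does not support the init/solve! cache interface.
has_init(opt) = false
# A solver package opts in for its own algorithm type:
has_init(opt::MyOpt) = true

# A single trait lets the front end pick the code path:
function solve_entry(prob, opt; kwargs...)
    if has_init(opt)
        cache = init(prob, opt; kwargs...)  # iterator/cache interface
        solve!(cache)
    else
        __solve(prob, opt; kwargs...)       # one-shot solve
    end
end
```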

Regarding CI, should I change the CI to use 1.11 instead of 1.12?

Also, maybe we want an IIP dispatch on this sooner or later?

@ChrisRackauckas what do you mean by the IIP dispatch?

@ChrisRackauckas
Member

Dispatching on whether its functions are in place

@SebastianM-C
Member Author

So all the AD-generated functions would have both variants, and the loss itself is OOP only?

@ChrisRackauckas
Member

They are supposed to, they just don't right now.
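For concreteness, OptimizationFunction's first type parameter is the iip flag, so the dispatch under discussion would look roughly like this (instantiate_gradient is a hypothetical name):

```julia
using SciMLBase: OptimizationFunction

# In-place: AD-generated grad!(G, θ, p) writes into a preallocated G.
function instantiate_gradient(f::OptimizationFunction{true}, adtype)
    # build and return grad!(G, θ, p)
end

# Out-of-place: AD-generated grad(θ, p) allocates and returns the gradient.
function instantiate_gradient(f::OptimizationFunction{false}, adtype)
    # build and return grad(θ, p)
end
```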

@SebastianM-C force-pushed the cleanup branch 3 times, most recently from 1b5d318 to 60285ce on October 23, 2025 at 11:54
SebastianM-C and others added 2 commits October 23, 2025 17:04
Since we always dispatch on the optimizer, it makes much more sense
to have this as the first argument (and first type parameter).
@SebastianM-C
Member Author

I think has_init is wrongly defined; it should be on Any and not AbstractSciMLAlgorithm, since we also sometimes use types from other packages for that.

@SebastianM-C
Member Author

SciML/SciMLBase.jl#1161

@SebastianM-C force-pushed the cleanup branch 5 times, most recently from 003c646 to 2b5dfe8 on October 23, 2025 at 17:34
SebastianM-C and others added 4 commits October 23, 2025 20:37
Updates all optimizer packages to use SciMLBase.has_init (requires
SciMLBase 2.122).

Co-Authored-By: Claude <[email protected]>
Co-authored-by: Claude <[email protected]>
We can't use the julia-actions/julia-runtest@v1 action because it resolves the environment before we get the chance to dev the required packages for OptimizationAuglag & OptimizationMultistartOptimization.
SebastianM-C and others added 11 commits October 23, 2025 21:16
update location for IncompatibleOptimizerError

fix bug in error
Since the package directly depends on the other optimizers, we need to dev them at the same time.
expose linear_solver & kkt_system
pass barrier options via MadNLP structs
pass quasi_newton_options via MadNLP structs

Co-authored-by: Claude <[email protected]>
Co-authored-by: Claude <[email protected]>
@SebastianM-C changed the title from "reorder fields in OptimizationCache for cleaner dispatches" to "Clean up and update to OptimizationBase@v4" on Oct 24, 2025
SebastianM-C and others added 2 commits October 24, 2025 04:01
The Enzyme extension's Hessian-vector product (HVP) implementation was
incorrectly using `Enzyme.make_zero(x)` which zeroed out the tangent
direction vector `v`, causing the forward-mode AD to have no direction
to differentiate in. This resulted in HVP always returning zeros.

Fixed by using the correct forward-over-reverse AD approach with the
existing `inner_grad` function, which properly computes H*v by taking
the gradient ∇f(θ) in reverse mode and differentiating it in forward
mode along direction v.

Fixes both in-place (OptimizationFunction{true}) and out-of-place
(OptimizationFunction{false}) versions.
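As a generic illustration of the forward-over-reverse idea (not the actual Enzyme extension code; this sketch uses ForwardDiff over Zygote):

```julia
using ForwardDiff, Zygote

# H*v = d/dt ∇f(θ + t·v) at t = 0: reverse mode supplies the gradient,
# forward mode differentiates it along the direction v.
hvp(f, θ, v) = ForwardDiff.derivative(t -> Zygote.gradient(f, θ .+ t .* v)[1], 0.0)

f(θ) = sum(abs2, θ) + θ[1] * θ[2]   # Hessian is [2 1; 1 2]
hvp(f, [1.0, 2.0], [1.0, 0.0])      # ≈ [2.0, 1.0], the first Hessian column
```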

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@SebastianM-C
Member Author

SebastianM-C commented Oct 24, 2025

Changes Summary

Core Interface Refactoring (2 commits)

  1. Reorder fields in OptimizationCache (f228263)
    - Moved optimizer to first field/type parameter for cleaner dispatches
  2. Bump OptimizationBase compat (f39ccb2)
    - Updated compatibility after cache field reordering

API Modernization (1 commit)

  1. Replace supports_opt_cache_interface with has_init (2786ea5)
    - Migrated all optimizers to use SciMLBase.has_init (requires SciMLBase 2.122+)

Test Infrastructure & Julia 1.11 Support (4 commits)

  1. Make OptimizationBase tests runnable on Julia v1.11 (b24295b)
    - Updated test Project.toml and fixed compatibility issues
  2. Fix deprecations (7b7a143)
  3. Make CI run for LTS (279ae96)
  4. Fix test env setup (d9c1eb8)

Package-Specific Test Fixes (9 commits)

  1. OptimizationPRIMA: Fix cache construction (e1645a8)
  2. OptimizationSciPy: Fix tests (b732ced)
  3. OptimizationPolyalgorithms: Fix test env (ed1db98)
  4. OptimizationNLPModels: Fix test env (7a48505)
  5. OptimizationMadNLP:
    - Use [email protected] API (d33c663)
    - Cleanup interface (9fd21df)
    - Fix tests (70552a4)
  6. OptimizationManopt: Avoid errors and warnings in tests (9bbb4ec)
  7. IncompatibleOptimizerError location fixes (2 commits: b2a7eb2, becf556)

Regarding the fixes for OptimizationManopt: we were getting errors in the tests because the implementation was trying to call functions that were not generated by AD, so I changed SciMLBase.requireshessian to true. I understood from #1009 that this is not necessarily required, but if we don't change the implementation to use the approximate formulations, we will get an error. Let me know if this needs to be reverted.

Documentation & Environment (3 commits)

  1. Fix docs env (7b8dfd0)
    - Updated OptimizationBase compat: "2, 3" → "4"
    - Updated SciMLBase compat: "2.30.0" → "2.122.1"
    - Added complete [sources] section with all 15+ lib subpackages
  2. Skip test_stale_deps on 1.10 (6839663 & 8d4ac3a)
    - Workaround for Julia 1.10 lacking [sources] support

Bug Fixes (1 commit)

  1. Fix Enzyme HVP returning zeros (b0860b8)
    - Fixed Hessian-vector product returning [0, 0] instead of correct values
    - Root cause: Enzyme.make_zero(x) zeroing out tangent direction
    - Fixed both in-place and out-of-place implementations

SebastianM-C and others added 6 commits October 24, 2025 04:32
`lag_h!` requires `cons_h!` for the σ=0 case. Now generates `cons_h!`
whenever `lag_h == true` to prevent a `MethodError`.

Co-authored-by: Claude <[email protected]>
When computing the Lagrangian Hessian with lag_h!, the case σ=0 requires
special handling since it reduces to just the weighted sum of constraint
Hessians (Σᵢ λᵢ∇²cᵢ) without the objective contribution.
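In symbols (restating the commit message, assuming the standard NLP Lagrangian convention):

$$
\nabla^2_\theta L(\theta, \sigma, \lambda) = \sigma\,\nabla^2 f(\theta) + \sum_i \lambda_i\,\nabla^2 c_i(\theta),
\qquad
\left.\nabla^2_\theta L\right|_{\sigma = 0} = \sum_i \lambda_i\,\nabla^2 c_i(\theta).
$$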

Previously, this case would fail when cons_h was not explicitly requested
but lag_h was, because the constraint Hessian preparations were not created.

This commit:
- Always creates constraint Hessian preparations when either cons_h or lag_h is true
- Adds cons_h_weighted!(H, θ, λ) function to compute the weighted sum directly into H (see the sketch below)
- Updates lag_h! to use cons_h_weighted! when σ=0

This fixes the edge case in OptimizationMadNLP where the solver could hit
σ=0 during iterations, particularly with exact Hessian and sparse KKT systems.

Applies to both OptimizationDIExt and OptimizationZygoteExt.
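A hypothetical sketch of the helper named above (in the real extensions the per-constraint Hessian preparations are captured in a closure rather than recomputed; constraint_hessian is a stand-in for evaluating the prepared Hessian of the i-th constraint):

```julia
function cons_h_weighted!(H, θ, λ)
    fill!(H, zero(eltype(H)))
    for (i, λi) in enumerate(λ)
        # accumulate λᵢ · ∇²cᵢ(θ) directly into H
        H .+= λi .* constraint_hessian(i, θ)
    end
    return H
end
```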

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Adds regression test to ensure lag_h! correctly handles the σ=0 case
where the Lagrangian Hessian reduces to just the weighted sum of
constraint Hessians. Tests both single and multiple constraint scenarios
with different AD backends (ForwardDiff, ReverseDiff, Zygote).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@ChrisRackauckas merged commit f880035 into SciML:master on Oct 24, 2025
72 of 81 checks passed