Skip to content

Invalidation & TTFX reduction progress: March-April 2026 summary #3338

@ChrisRackauckas-Claude

Description

@ChrisRackauckas-Claude

Overview

This issue summarizes the invalidation and TTFX (time-to-first-execution) improvement work across the SciML ecosystem from February to April 2026. It documents what was measured, what was fixed, what remains, and provides reproducible benchmarks.

All measurements use Julia 1.11.9 on Linux x86_64, using SnoopCompile/SnoopCompileCore for invalidation analysis and @elapsed for TTFX.


OrdinaryDiffEqCore Invalidations: Three Eras

Since OrdinaryDiffEqBDF can't be loaded against old registry versions (NonlinearSolve compat breakage), the cleanest comparison uses OrdinaryDiffEqCore — which contains the core dependency chain (Static, Polyester, FillArrays, DataStructures, etc.) without the NonlinearSolve complications.

Era 1: Feb 2026 (OrdinaryDiffEqCore v3.10, FillArrays + Static + Polyester)

Package Invalidations Root Cause
Static 137 convert(::Type{T<:Number}, ::StaticInt/Bool/Float64) — 95 children. any(::Tuple{StaticBool...}) — 31 children. Type piracy on broad Base types.
DataStructures 58 iterate on SwissDict (33 children), issubset on SortedSet (12)
FillArrays 30 print_matrix_row for FillArrays union types supersedes Base method — 25 children
SciMLBase 7 Moshi.Data variants
CloseOpenIntervals 5 Brought in by Static chain
Total 237 17 trees

Era 2: Apr 2026 registered (OrdinaryDiffEqCore v3.27, Static + Polyester, no FillArrays)

Package Invalidations Root Cause
Static 137 Same as Era 1 (identical Static.jl version)
DataStructures 59 Same as before (+1 from version change)
SciMLBase 7 Moshi.Data variants
CloseOpenIntervals 5 Same as before
Total 208 14 trees

FillArrays was removed between these eras, dropping 30 invalidations (237 → 208, but DataStructures went up by 1).

Era 3: v7 branch (no Polyester, no Static, no FillArrays)

Package Invalidations Root Cause
DataStructures 58 Same as always
SciMLBase 7 Moshi.Data variants (will go away with Moshi removal from SciMLBase)
Total 65 5 trees

Summary: 237 → 208 → 65 invalidated methods (73% reduction from Feb to v7 branch)


Full OrdinaryDiffEqBDF Stack: Invalidation Breakdown (Era 2, registered)

Loading the full BDF solver stack (which adds NonlinearSolve, ForwardDiff, StaticArrays, etc.) produces 2,442 total invalidated methods across 60 trees:

Package Invalidations % Root Cause
Static.jl 732 30% ifelse(::Static.False/True, x, y) — type piracy on ifelse(::Any, ::Any, ::Any) invalidates max, all ForwardDiff Jacobian computations, NonlinearSolve trust region. 261 children each for the two ifelse trees. Also convert(::Type{T<:Number}, ::StaticInt) (95 children).
StaticArrays 480 20% convert(::Type{Array{T,N}}, ::SizedArray) supersedes Base's convert(::Type{T}, ::AbstractArray), invalidating all compiled Vector{Float64} conversions
DataStructures 225 9% SwissDict iterators, SortedSet issubset, SparseIntSet zip
NonlinearSolve 166 7% Extension loading
NonlinearSolveFirstOrder 163 7% Extension loading
DiffEqBase 161 7% Various method overwrites
NonlinearSolveBaseSparseMatrixColoringsExt 150 6% is_extension_loaded(::Val{:SparseMatrixColorings}) cascading
DifferentiationInterfaceSparseMatrixColoringsExt 104 4% Extension loading
OrderedCollections 60 2% Part of DataStructures
CloseOpenIntervals 57 2% From Static chain
ForwardDiff 48 2%
Others 96 4% Various extensions and small sources

The dependency chains that were eliminated

Polyester.jl → Static.jl chain (~11 packages removed by #3336):

Polyester → PolyesterWeave → ThreadingUtilities
         → Static → StaticArrayInterface → StrideArraysCore
                                          → CloseOpenIntervals
                                          → LayoutPointers
                  → CPUSummary
                  → BitTwiddlingConvenienceFunctions
                  → SIMDTypes

FillArrays (removed between Era 1 and Era 2):


TTFX Benchmarks (Julia 1.11.9, 3 runs each, pre-compiled)

OrdinaryDiffEqBDF: Lorenz problem

Metric Old (~Mar 3, v1.20) Registered (v1.24) v7 (no Polyester)
using OrdinaryDiffEqBDF 4.76s 5.36s 5.26s
solve(prob, FBDF()) 0.30s 0.30s 0.30s
solve(prob, QNDF()) 4.26s 3.20s 3.14s
  • QNDF TTFX improved 26% (4.26s → 3.14s) over the month
  • FBDF TTFX unchanged at 0.30s — precompile workloads doing their job
  • Load time increased ~0.5s from expanded precompile workloads (net positive: trades load for TTFX)
  • Polyester removal had negligible direct TTFX impact (benefit is reduced invalidation for downstream code)

What Was Fixed

Dependency removal

  1. FillArrays — removed from OrdinaryDiffEqBDF dep tree (between Feb and Apr 2026)
  2. Static.jl direct dep — removed from DiffEqBase and OrdinaryDiffEqCore (Remove Static.jl and StaticArrayInterface; use FastBroadcast Serial/Threaded #3299, Remove StaticArrayInterface, bump FastBroadcast/RAT compat #3309, v7 branch)
  3. Polyester.jl — removed from OrdinaryDiffEqCore and all 13 subpackages (Remove Polyester.jl dependency to eliminate Static.jl invalidations #3336). @threaded PolyesterThreads() falls back to Threads.@threads :static (equivalent for 2-4 iteration counts). Eliminated entire 11-package transitive chain.

Precompile workloads

  1. DiffEqBase (Add precompilation workload for common operations DiffEqBase.jl#1246, Higher order calculations in ImplicitEulerExtrapolation #1254, New exponential integrators #1291):
    • calculate_residuals: 0.14s → 0.0002s (~700x)
    • ODE_DEFAULT_NORM: 0.017s → 0.00002s (~850x)
    • ODE_DEFAULT_UNSTABLE_CHECK: 8.7ms → 0.003ms (2900x)
    • SubArray+Dual broadcast: 3.05s → 0.80s (73% reduction)
  2. OrdinaryDiffEqSDIRK (Add precompilation workloads for OrdinaryDiffEqSDIRK #3099): TRBDF2 first-solve after KenCarp4: ~3.0s → ~0.12s (25x)

Invalidation fixes

  1. ForwardDiff extension (Gate promote_f on forwarddiffs_model trait to prevent ForwardDiff extension invalidation DiffEqBase.jl#1281): Gated promote_f to prevent cascade
  2. FunctionWrappersWrappers v1.0 (Fix FunctionWrappersWrappers v1.0 compat in DiffEqBase ForwardDiff ext #3332, Cherry-pick FunctionWrappersWrappers v1.0 fix to v7 #3337): Fixed constructor API for v7 branch

SciMLBase improvements

  1. Moshi.jl removal (Replace Moshi with plain Julia structs for Clocks SciMLBase.jl#1295): Moshi's @data macro was 21% of precompile time. Replaced with plain Julia: 3731ms → 0.1ms. SciMLBase precompilation: 17.7s → 13.6s (23% reduction)

What Remains / Future Work

Remaining invalidation sources in full BDF stack (~1,650 estimated after Polyester removal)

Package Est. Invalidations Actionable?
StaticArrays ~480 Hard — real dependency (SVector, MVector used everywhere). convert(::Type{Array{T,N}}, ::SizedArray) is the root. Upstream fix needed.
DataStructures ~225 Moderate — only BinaryHeap/FasterForward used. Could vendor those or use lighter dep.
NonlinearSolve stack ~330 Extension loading — inherent to mechanism
DiffEqBase ~161 Some may be addressable
SparseMatrixColorings exts ~254 Extension loading — inherent
ForwardDiff ~48 Likely inherent
Moshi/SciMLBase ~7 Will be eliminated when Moshi removal propagates

Open questions

  • Can DataStructures be replaced with a lighter dependency (just BinaryHeap)?
  • Can StaticArrays convert methods be made narrower upstream?
  • Is there a way to reduce extension-loading invalidation cascades (NonlinearSolve, SparseMatrixColorings)?

Reproducing These Measurements

Invalidation analysis script
using SnoopCompileCore

invalidations = @snoop_invalidations begin
    using OrdinaryDiffEqBDF  # or OrdinaryDiffEqCore for smaller scope
end

using SnoopCompile

trees = SnoopCompile.invalidation_trees(invalidations)
sort!(trees, by=t->SnoopCompile.countchildren(t), rev=true)

pkg_counts = Dict{String,Int}()
for tree in trees
    pkg = string(Base.moduleroot(tree.method.module))
    pkg_counts[pkg] = get(pkg_counts, pkg, 0) + SnoopCompile.countchildren(tree)
end
for (pkg, count) in sort(collect(pkg_counts), by=x->x[2], rev=true)
    println("  $pkg: $count")
end
println("Total: ", sum(SnoopCompile.countchildren(t) for t in trees; init=0))
TTFX benchmark script
t_load = @elapsed using OrdinaryDiffEqBDF

prob = ODEProblem(
    (du,u,p,t) -> (du[1]=10(u[2]-u[1]); du[2]=u[1]*(28-u[3])-u[2]; du[3]=u[1]*u[2]-8/3*u[3]),
    [1.0, 0.0, 0.0], (0.0, 1.0))

t_fbdf = @elapsed solve(prob, FBDF())
t_qndf = @elapsed solve(prob, QNDF())

println("load=$(round(t_load,digits=2))s FBDF=$(round(t_fbdf,digits=2))s QNDF=$(round(t_qndf,digits=2))s")

Related PRs and Issues

OrdinaryDiffEq.jl

DiffEqBase.jl

SciMLBase.jl

DifferentialEquations.jl

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions