uw.function.evaluate() never hits the lambdify cache (fresh sympy.Dummy per call → +1 cache entry every call) #194

@lmoresi

Symptom

Repeated identical uw.function.evaluate(expr, pts) calls never hit the lambdify cache. underworld3.function.pure_sympy_evaluator._lambdify_cache grows by one entry on every call and sympy.lambdify recompiles every time — the cache provides zero benefit on the hot evaluate path and grows unbounded across a long run.

call 1: cache size = 1
call 2: cache size = 2
call 3: cache size = 3
...                       # +1 every call, never a hit

Minimal reproducer (tiny mesh, no scale dependence)

import numpy as np, sympy, underworld3 as uw
from underworld3.function.pure_sympy_evaluator import _lambdify_cache, clear_lambdify_cache

m = uw.meshing.UnstructuredSimplexBox(minCoords=(0.,0.), maxCoords=(1.,1.), cellSize=0.5)
x = m.X[0]; expr = sympy.erf(5*x - 2)/2
pts = np.array([[0.2,0.3],[0.5,0.5],[0.8,0.1]])

clear_lambdify_cache()
for i in range(1, 8):
    uw.function.evaluate(expr, pts, rbf=True)
    print(f"call {i}: cache size = {len(_lambdify_cache)}")
# -> 1,2,3,4,5,6,7  (expected for a working cache: 1,1,1,1,1,1,1)

Root cause

get_cached_lambdified (src/underworld3/function/pure_sympy_evaluator.py) keys the cache on:

cache_key = (_expr_hash(expr), symbols_tuple, modules_tuple)   # _expr_hash = md5(sympy.srepr(expr))

On the evaluate() pure-sympy path the mesh coordinate is substituted with a freshly created sympy.Dummy every call. Dummy carries a globally-unique, monotonically increasing dummy_index, and sympy.srepr embeds it. Spying on the expression handed to get_cached_lambdified across 3 identical calls:

call 0: Mul(Rational(1,2), erf(Add(Mul(Integer(5), Dummy('_coord_0', dummy_index=4670761)), Integer(-2))))
call 1: Mul(Rational(1,2), erf(Add(Mul(Integer(5), Dummy('_coord_0', dummy_index=4670765)), Integer(-2))))
call 2: Mul(Rational(1,2), erf(Add(Mul(Integer(5), Dummy('_coord_0', dummy_index=4670769)), Integer(-2))))

symbols_tuple is stable (('__coord_0','__coord_1')); only the in-expression Dummy.dummy_index churns. Different srepr → different md5 → different cache key → guaranteed miss + new entry every call. The cache mechanism itself is correct; the key is computed from a representation that is unstable across identical calls.
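The index churn can be reproduced with sympy alone, without a mesh. The following is a minimal sketch, not the underworld3 code path: `expr_hash` and `build_expr` are hypothetical stand-ins for `_expr_hash` and the per-call coordinate substitution, written from the description above.

```python
import hashlib
import sympy

def expr_hash(expr):
    # Stand-in for the key described above: md5 over sympy.srepr(expr).
    return hashlib.md5(sympy.srepr(expr).encode()).hexdigest()

def build_expr():
    # Stand-in for the per-call substitution: a fresh Dummy every time.
    c0 = sympy.Dummy("_coord_0")
    return sympy.erf(5 * c0 - 2) / 2

e1, e2 = build_expr(), build_expr()

# The two srepr strings differ only in the embedded dummy_index.
print(sympy.srepr(e1))
print(sympy.srepr(e2))

# Mathematically identical expressions, but the hashes never match.
print(expr_hash(e1) == expr_hash(e2))
```

The final line prints `False`, which is the guaranteed-miss behaviour described above.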

Fix direction (proposed, not yet implemented)

Make the cache key invariant to Dummy index churn — e.g. in _expr_hash, canonicalise dummies by name before srepr (the cache key already carries symbols_tuple separately, so name-stable placeholders are safe and the change is localised to pure_sympy_evaluator.py). Alternative (riskier): stop minting fresh Dummys per call upstream in the coordinate-substitution / symbol-disambiguation path.
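One possible shape for the hash canonicalisation is sketched below. It is an illustration under assumptions, not the implemented fix: `stable_expr_hash` is a hypothetical name, the placeholder naming scheme is invented here, and Dummy assumptions (for example `real=True`) are dropped for brevity. Same-named Dummies are numbered in creation order so that two distinct Dummies are not merged into one placeholder.

```python
import hashlib
from collections import defaultdict
import sympy

def stable_expr_hash(expr):
    """md5 of srepr(expr) after renaming Dummies to index-free placeholders."""
    per_name_count = defaultdict(int)
    mapping = {}
    # Sort by dummy_index so same-named Dummies get ordinals in creation order.
    for d in sorted(expr.atoms(sympy.Dummy), key=lambda d: d.dummy_index):
        k = per_name_count[d.name]
        per_name_count[d.name] += 1
        # Hypothetical placeholder name; assumptions on d are not carried over.
        mapping[d] = sympy.Symbol(f"__canon_{d.name}_{k}")
    canonical = expr.xreplace(mapping)
    return hashlib.md5(sympy.srepr(canonical).encode()).hexdigest()

def build_expr(scale=5):
    # Fresh Dummy per call, mimicking the evaluate() path described above.
    c0 = sympy.Dummy("_coord_0")
    return sympy.erf(scale * c0 - 2) / 2

# Identical expressions now hash identically despite fresh Dummies.
print(stable_expr_hash(build_expr()) == stable_expr_hash(build_expr()))
# Genuinely different expressions still get different hashes.
print(stable_expr_hash(build_expr(5)) == stable_expr_hash(build_expr(3)))
```

The ordinal scheme relies on the Dummies being created in the same relative order on every call; whether that holds in the real substitution path would need checking.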

Targeted hash canonicalisation looks contained and low-risk. It should be validated against tests/test_0720_lambdify_optimization_paths.py, plus a cache-hit assertion and a before/after check that the size of _lambdify_cache stays flat under repeated identical calls.

Relationship to #171

#171 ("Volume integral evaluation time grows linearly") has a different root cause: PetscDSSetObjective accumulation on a shared DS in Integral.evaluate() (petsc_maths.pyx). It was closed as un-actionable for lack of a small reproducer for that path. This issue is a separate code path (function.evaluate → pure_sympy_evaluator), a different mechanism (cache-key instability), and does have a trivial mesh-independent reproducer. They share only the "linear growth across repeated calls" symptom family.

Impact

  • uw.function.evaluate() recompiles via sympy.lambdify on every call for pure-sympy expressions (mesh coordinates, analytic source terms, BCs). No caching benefit on a hot path.
  • _lambdify_cache grows without bound across a long run (one entry per evaluate call).
  • Any workflow that evaluates analytic expressions repeatedly (diagnostics, time-dependent forcing, BC updates) pays full lambdify cost every step.

Discovered while

Replacing the flaky wall-clock assertions in tests/test_0720_lambdify_optimization_paths.py::TestPerformanceExpectations with a real cache-hit assertion (PR #193). The old assert time2 <= time1 * 2 tolerance was loose enough to mask the recompile; a proper cache-contract test exposed it.

Underworld development team with AI support from Claude Code
