uw.function.evaluate() never hits the lambdify cache (fresh sympy.Dummy per call → +1 cache entry every call) #194

## Symptom

Repeated **identical** `uw.function.evaluate(expr, pts)` calls never hit the lambdify cache. `underworld3.function.pure_sympy_evaluator._lambdify_cache` grows by **one entry on every call** and `sympy.lambdify` recompiles every time — the cache provides zero benefit on the hot evaluate path and grows unbounded across a long run.

```
call 1: cache size = 1
call 2: cache size = 2
call 3: cache size = 3
...                       # +1 every call, never a hit
```

## Minimal reproducer (tiny mesh, no scale dependence)

```python
import numpy as np
import sympy
import underworld3 as uw
from underworld3.function.pure_sympy_evaluator import _lambdify_cache, clear_lambdify_cache

# Tiny 2D mesh; the effect does not depend on mesh size.
m = uw.meshing.UnstructuredSimplexBox(minCoords=(0.,0.), maxCoords=(1.,1.), cellSize=0.5)
x = m.X[0]
expr = sympy.erf(5*x - 2)/2   # pure-sympy expression in the mesh coordinate
pts = np.array([[0.2,0.3],[0.5,0.5],[0.8,0.1]])

clear_lambdify_cache()
for i in range(1, 8):
    uw.function.evaluate(expr, pts, rbf=True)   # identical call on every iteration
    print(f"call {i}: cache size = {len(_lambdify_cache)}")
# -> 1,2,3,4,5,6,7  (expected for a working cache: 1,1,1,1,1,1,1)
```

## Root cause

`get_cached_lambdified` (`src/underworld3/function/pure_sympy_evaluator.py`) keys the cache on:

```python
cache_key = (_expr_hash(expr), symbols_tuple, modules_tuple)   # _expr_hash = md5(sympy.srepr(expr))
```

On the `evaluate()` pure-sympy path the mesh coordinate is substituted with a **freshly created `sympy.Dummy`** every call. `Dummy` carries a globally-unique, monotonically increasing `dummy_index`, and `sympy.srepr` embeds it. Spying on the expression handed to `get_cached_lambdified` across 3 identical calls:

```
call 0: Mul(Rational(1,2), erf(Add(Mul(Integer(5), Dummy('_coord_0', dummy_index=4670761)), Integer(-2))))
call 1: Mul(Rational(1,2), erf(Add(Mul(Integer(5), Dummy('_coord_0', dummy_index=4670765)), Integer(-2))))
call 2: Mul(Rational(1,2), erf(Add(Mul(Integer(5), Dummy('_coord_0', dummy_index=4670769)), Integer(-2))))
```

`symbols_tuple` is stable (`('__coord_0','__coord_1')`); only the in-expression `Dummy.dummy_index` churns. Different srepr → different md5 → different cache key → guaranteed miss + new entry every call. The cache *mechanism* itself is correct; the key is computed from a representation that is unstable across identical calls.
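The instability can be shown with sympy alone, without a mesh. This is a minimal sketch: `expr_hash` and `build_expr` are illustrative stand-ins (not the underworld3 functions) that imitate the md5-over-`srepr` key and the per-call `Dummy` substitution described above.

```python
import hashlib
import sympy


def expr_hash(expr):
    # Stand-in for the key described above: md5 over sympy.srepr(expr).
    return hashlib.md5(sympy.srepr(expr).encode()).hexdigest()


def build_expr():
    # A fresh Dummy on every call, imitating the per-call coordinate substitution.
    x = sympy.Dummy("_coord_0")
    return sympy.erf(5 * x - 2) / 2


e1, e2 = build_expr(), build_expr()

# srepr embeds the dummy_index, so the two strings differ only in that number.
print(sympy.srepr(e1))
print(sympy.srepr(e2))

hashes_differ = expr_hash(e1) != expr_hash(e2)
same_printed_form = str(e1) == str(e2)
print(hashes_differ, same_printed_form)
```

The two expressions print identically under `str`, yet hash differently because `srepr` includes `dummy_index`.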

## Fix direction (proposed, not yet implemented)

Make the cache key invariant to `Dummy` index churn — e.g. in `_expr_hash`, canonicalise dummies by **name** before `srepr` (the cache key already carries `symbols_tuple` separately, so name-stable placeholders are safe and the change is localised to `pure_sympy_evaluator.py`). Alternative (riskier): stop minting fresh `Dummy`s per call upstream in the coordinate-substitution / symbol-disambiguation path.

Targeted hash canonicalisation looks contained and low-risk. It should be validated against `tests/test_0720_lambdify_optimization_paths.py` plus a cache-hit assertion, and with a before/after check that the `_lambdify_cache` size stays flat under repeated identical calls.
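A rough sketch of the name-based canonicalisation and the flat-size check, under stated assumptions. `stable_expr_hash`, `cached_lambdify` and `_toy_cache` are hypothetical names, not code from `pure_sympy_evaluator.py`, and the toy cache only imitates the `(hash, symbols)` part of the real key. The sketch also assumes that dummy names are unique within an expression; two distinct dummies sharing a name would collapse to one placeholder.

```python
import hashlib
import math
import sympy


def stable_expr_hash(expr):
    # Hypothetical variant of _expr_hash: replace each Dummy with a plain
    # Symbol of the same name (keeping its assumptions) so that dummy_index
    # never reaches srepr.
    mapping = {
        d: sympy.Symbol(f"__dummy_{d.name}", **d.assumptions0)
        for d in expr.atoms(sympy.Dummy)
    }
    canonical = expr.xreplace(mapping)
    return hashlib.md5(sympy.srepr(canonical).encode()).hexdigest()


_toy_cache = {}  # stand-in for _lambdify_cache


def cached_lambdify(symbols, expr):
    # Key on the canonicalised expression plus the symbol names.
    key = (stable_expr_hash(expr), tuple(str(s) for s in symbols))
    if key not in _toy_cache:
        _toy_cache[key] = sympy.lambdify(symbols, expr, modules="math")
    return _toy_cache[key]


sizes = []
for _ in range(5):
    x = sympy.Dummy("_coord_0")  # fresh Dummy per call, as on the evaluate path
    f = cached_lambdify((x,), sympy.erf(5 * x - 2) / 2)
    sizes.append(len(_toy_cache))

value = f(0.5)                 # cached function still evaluates positionally
expected = math.erf(0.5) / 2   # 5*0.5 - 2 == 0.5
print(sizes)
```

Because the compiled function takes its arguments positionally, reusing the function built from the first call's `Dummy` is safe for later calls that mint a new one.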

## Relationship to #171

#171 ("Volume integral evaluation time grows linearly") has a **different** root cause (`PetscDSSetObjective` accumulation on a shared DS in `Integral.evaluate()`, `petsc_maths.pyx`) and was closed as un-actionable for lack of a small reproducer for *that* path. This issue is a separate code path (`function.evaluate` → `pure_sympy_evaluator`), a different mechanism (cache-key instability), and **does** have a trivial mesh-independent reproducer. They share only the "linear growth across repeated calls" symptom family.

## Impact

- `uw.function.evaluate()` recompiles via `sympy.lambdify` on every call for pure-sympy expressions (mesh coordinates, analytic source terms, BCs). No caching benefit on a hot path.
- `_lambdify_cache` grows without bound across a long run (one entry per evaluate call).
- Any workflow that evaluates analytic expressions repeatedly (diagnostics, time-dependent forcing, BC updates) pays full lambdify cost every step.

## Discovered while

Replacing the flaky wall-clock assertions in `tests/test_0720_lambdify_optimization_paths.py::TestPerformanceExpectations` with a real cache-hit assertion (PR #193). The old `assert time2 <= time1 * 2` tolerance was loose enough to mask the recompile; a proper cache-contract test exposed it.
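To illustrate why a hit-count assertion catches what a timing tolerance can miss, here is a self-contained toy. `CountingCache` and `run` are hypothetical helpers written for this sketch; they are not the test in PR #193 and not the underworld3 cache.

```python
import sympy


class CountingCache:
    """Toy cache that records hits and misses."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key, build):
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = build()
        return self.store[key]


def run(n_calls, make_symbol):
    # Rebuild the same expression n_calls times and look it up by srepr.
    cache = CountingCache()
    for _ in range(n_calls):
        x = make_symbol()
        expr = sympy.erf(5 * x - 2) / 2
        cache.get(sympy.srepr(expr),
                  lambda: sympy.lambdify(x, expr, modules="math"))
    return cache


stable = run(4, lambda: sympy.Symbol("x"))    # same symbol every call
churning = run(4, lambda: sympy.Dummy("x"))   # fresh Dummy every call

print(stable.hits, stable.misses)      # 3 1
print(churning.hits, churning.misses)  # 0 4
```

A contract of the form `hits == n_calls - 1` fails immediately for the churning case, whereas a loose bound such as `time2 <= time1 * 2` can still pass when both calls recompile.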

Underworld development team with AI support from [Claude Code](https://claude.com/claude-code)
