Add enzyme to benchmark tests #1039
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@ Coverage Diff @@
##           main    #1039      +/-   ##
==========================================
- Coverage   82.34%   82.24%   -0.11%
==========================================
  Files          38       38
  Lines        3949     3949
==========================================
- Hits         3252     3248       -4
- Misses        697      701       +4
```
Pull Request Test Coverage Report for Build 17931429932

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
```julia
:reversediff => ADTypes.AutoReverseDiff(; compile=false),
:reversediff_compiled => ADTypes.AutoReverseDiff(; compile=true),
:mooncake => ADTypes.AutoMooncake(; config=nothing),
:enzyme => ADTypes.AutoEnzyme(; config=nothing),
```
This doesn't work :(
Doesn't work how? Probably we need to use the ADTypes in the same way as in the AD tests (currently it's a constructor error).
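The constructor error is consistent with `ADTypes.AutoEnzyme` not accepting a `config` keyword; that keyword belongs to `AutoMooncake`, while `AutoEnzyme` takes `mode` and `function_annotation`. A minimal sketch of a construction matching that signature (the specific mode and annotation choices here are assumptions, not necessarily what the AD tests use):

```julia
using ADTypes
using Enzyme

# AutoMooncake takes `config`; AutoEnzyme instead takes `mode` and
# `function_annotation`. Passing `config` to AutoEnzyme throws a MethodError.
adtype = ADTypes.AutoEnzyme(;
    mode=Enzyme.set_runtime_activity(Enzyme.Reverse),  # reverse mode with runtime activity
    function_annotation=Enzyme.Const,                  # annotate the function as non-differentiated
)
```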
```julia
    ("Smorgasbord", smorgasbord_instance, :typed, :enzyme, true),
    ("Loop univariate 1k", loop_univariate1k, :typed, :mooncake, true),
    ("Loop univariate 1k", loop_univariate1k, :typed, :enzyme, true),
    ("Multivariate 1k", multivariate1k, :typed, :mooncake, true),
    ("Multivariate 1k", multivariate1k, :typed, :enzyme, true),
    ("Loop univariate 10k", loop_univariate10k, :typed, :mooncake, true),
    ("Loop univariate 10k", loop_univariate10k, :typed, :enzyme, true),
    ("Multivariate 10k", multivariate10k, :typed, :mooncake, true),
    ("Multivariate 10k", multivariate10k, :typed, :enzyme, true),
    ("Dynamic", Models.dynamic(), :typed, :mooncake, true),
    ("Dynamic", Models.dynamic(), :typed, :enzyme, true),
    ("Submodel", Models.parent(randn(rng)), :typed, :mooncake, true),
    ("Submodel", Models.parent(randn(rng)), :typed, :enzyme, true),
    ("LDA", lda_instance, :typed, :reversediff, true),
    ("LDA", lda_instance, :typed, :enzyme, true),
]
```
In the interests of not making this take too long, could we restrict the Enzyme ones to Smorgasbord, Dynamic, Submodel, and LDA?
It might also depend on whether you want to test both forward- and reverse-mode.
The point of these simple benchmarks is to provide a ballpark indicator of whether a PR did something very bad for performance. Since these benchmarks capture general performance issues, such as allocation and type stability, we only need to run them on a few AD backends to avoid excessive CI time.
My thought was to put some Enzyme integration tests here to make it easier for @penelopeysm and myself to identify whether a PPL change caused an error that prevents differentiation (I believe a majority of recent issues stem from DynamicPPL changes). This seemed like a happy middle ground for not running all tests; alternatively, @yebai, if you prefer, we can also add Enzyme to the other AD tests (which frankly would also be good for users).
@wsmoses DynamicPPL has already been tested against Enzyme in an isolated CI; see, e.g., here. Speaking from experience, Enzyme is still fragile and breaks on innocent Julia code. This is consistent with a few other places where we test against Enzyme, such as Bijectors and AdvancedVI. To address these issues, the best approach is to have comprehensive unit tests inside Enzyme against Julia syntax, so one has an empirical guarantee of Julia syntax coverage. Then, for cases that Enzyme doesn't plan to support, one can provide clear docs so users can work around them. For these reasons, I'd suggest against adding Enzyme to the DynamicPPL benchmarks before Enzyme gets comprehensive unit tests on Julia syntax.
I see, though the fact that it's red and seemingly ignored at the moment seems like a bad sign. In particular, looking at the history, #1005 does seem to be the change that made it start failing, which makes sense given all the accumulator-derived minimizations @penelopeysm and I have been looking into as of late (hopefully they're now all resolved as of this morning).

And that's quite strange, @yebai. Do you have an example offhand where you can point to a regression on those repos from Enzyme? In particular, we run the Bijectors tests you added on every commit of Enzyme, and they haven't ever failed since you added them. Of course they're fixed to whatever Bijectors version you chose (unless updated), which is why I think it's more likely something in that code (or a dependency) changed.

That said, as I mentioned in the other thread, happy to have whatever additional integration tests you all think are useful!
I think, as a happy middle ground, let's add it to Smorgasbord (that model is meant, at least in theory, to exercise as many DynamicPPL features as we can squeeze in, so if there are severe regressions with Enzyme we would pick them up ... if somebody looks). I'm going to close this PR and reopen one myself so that the workflow can run.
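Under that middle ground, the benchmark table would drop most of the Enzyme rows from the original diff and keep only the Smorgasbord one. A sketch of the resulting entry, reusing the tuple shape and names already in the diff (whether any other Enzyme rows survive in the follow-up PR is an assumption here):

```julia
# Keep Enzyme only on the Smorgasbord model, which exercises the widest
# range of DynamicPPL features, so severe Enzyme regressions still show up
# without inflating CI time across every benchmark model.
("Smorgasbord", smorgasbord_instance, :typed, :enzyme, true),
```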