Skip to content

Conversation

wsmoses
Copy link
Contributor

@wsmoses wsmoses commented Sep 14, 2025

No description provided.

Copy link

codecov bot commented Sep 14, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.24%. Comparing base (7249158) to head (043c3a3).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1039      +/-   ##
==========================================
- Coverage   82.34%   82.24%   -0.11%     
==========================================
  Files          38       38              
  Lines        3949     3949              
==========================================
- Hits         3252     3248       -4     
- Misses        697      701       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coveralls
Copy link

coveralls commented Sep 14, 2025

Pull Request Test Coverage Report for Build 17931429932

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 27 unchanged lines in 6 files lost coverage.
  • Overall coverage decreased (-2.3%) to 80.28%

Files with Coverage Reduction New Missed Lines %
src/model.jl 1 86.73%
src/debug_utils.jl 3 89.27%
src/extract_priors.jl 3 60.53%
src/pointwise_logdensities.jl 3 93.06%
src/values_as_in_model.jl 4 72.41%
src/threadsafe.jl 13 62.16%
Totals Coverage Status
Change from base Build 17494778045: -2.3%
Covered Lines: 3159
Relevant Lines: 3935

💛 - Coveralls

:reversediff => ADTypes.AutoReverseDiff(; compile=false),
:reversediff_compiled => ADTypes.AutoReverseDiff(; compile=true),
:mooncake => ADTypes.AutoMooncake(; config=nothing),
:enzyme => ADTypes.AutoEnzyme(; config=nothing),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't work how? probably we need to use adtypes in the same way as in adtests (currently its a constructor error)

Comment on lines 46 to 61
("Smorgasbord", smorgasbord_instance, :typed, :enzyme, true),
("Loop univariate 1k", loop_univariate1k, :typed, :mooncake, true),
("Loop univariate 1k", loop_univariate1k, :typed, :enzyme, true),
("Multivariate 1k", multivariate1k, :typed, :mooncake, true),
("Multivariate 1k", multivariate1k, :typed, :enzyme, true),
("Loop univariate 10k", loop_univariate10k, :typed, :mooncake, true),
("Loop univariate 10k", loop_univariate10k, :typed, :enzyme, true),
("Multivariate 10k", multivariate10k, :typed, :mooncake, true),
("Multivariate 10k", multivariate10k, :typed, :enzyme, true),
("Dynamic", Models.dynamic(), :typed, :mooncake, true),
("Dynamic", Models.dynamic(), :typed, :enzyme, true),
("Submodel", Models.parent(randn(rng)), :typed, :mooncake, true),
("Submodel", Models.parent(randn(rng)), :typed, :enzyme, true),
("LDA", lda_instance, :typed, :reversediff, true),
("LDA", lda_instance, :typed, :enzyme, true),
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interests of not making this take too long, could we restrict the Enzyme ones to Smorgasboard, Dynamic, Submodel and LDA?

Might also depend on whether you want to test both forward- and reverse-mode.

@yebai
Copy link
Member

yebai commented Sep 16, 2025

ADTests already provides benchmarking for Enzyme: https://turinglang.org/ADTests/

The point of these simple benchmarks is to provide a ballpark indicator of whether PRs did something very bad for performance. Since these benchmarks capture general performance issues, such as allocation and type stability, we only need to run them on a few AD backends to avoid excessive CI time.

@wsmoses
Copy link
Contributor Author

wsmoses commented Sep 18, 2025

My thought was to put some Enzyme integration tests here to make it easier for @penelopeysm & myself to identify if a PPL change caused an error which prevents differentiation (I believe a majority of recent issues stem from DynamicPPL changes).

This seemed like a happy middle for not running all tests, but alternatively @yebai if you prefer we can also add Enzyme to the other AD tests (which frankly would also be good for users).

@yebai
Copy link
Member

yebai commented Sep 18, 2025

@wsmoses DynamicPPL has already been tested against Enzyme in an isolated CI. See, eg, here.

Speaking of experience, Enzyme is still fragile and breaks for innocent Julia code. This is consistent with a few other places where we test against Enzyme, such as Bijectors and AdvancedVI. To address these issues, the best approach is to have comprehensive unit tests inside Enzyme against Julia syntax, so one has the empirical guarantee of Julia syntax coverage. Then, for cases that Enzyme doesn't plan to support, one can provide clear docs so users can work around them.

For these reasons, I'd suggest against adding Enzyme to the DynamicPPL benchmarking before Enzyme gets the comprehensive unit test on Julia syntax.

@wsmoses
Copy link
Contributor Author

wsmoses commented Sep 18, 2025

I see, though the fact it's red and seemingly ignored atm seems like a bad sign. Though in particular looking at the history #1005 does seem to be the change which made it start failing, which makes sense given all the accumulator-derived minimizations @penelopeysm and I have been looking into as of late (hopefully they're now all resolved as of this morning).

and that's quite strange @yebai do you have an example where you can point to a regression on those repos from Enzyme offhand? In particular we run the Bijectors tests you added on every commit of Enzyme and they haven't ever failed since you've added them. Of course they're fixed to whatever bijectors version you chose (unless updated), which is why I think it's more likely something in that code (or a depenency) changed.

That said, like I mentioned in the other thread -- happy to have whatever additional integration tests you guys like/think are useful!

@penelopeysm
Copy link
Member

I think as a happy middle ground let's add it to Smorgasbord (that model's meant, at least in theory, to pick up as many DynamicPPL features as we can squeeze in, so if there are severe regressions with Enzyme we would pick it up ... if somebody looks). I'm going to close this PR and reopen one myself so that the workflow can run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants