
Conversation

@nsiccha commented May 13, 2025

Fixes #248 for me, I believe; I haven't actually run or added any tests yet.

Also, maybe this shouldn't be just merged into main? How does this go, @sethaxen?

codecov bot commented May 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.19%. Comparing base (20dd77e) to head (6971c21).

❗ There is a different number of reports uploaded between BASE (20dd77e) and HEAD (6971c21).

HEAD has 148 fewer uploads than BASE: 154 reports for BASE (20dd77e) vs. 6 for HEAD (6971c21).
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #249      +/-   ##
==========================================
- Coverage   82.08%   76.19%   -5.89%     
==========================================
  Files          13       13              
  Lines         586      584       -2     
==========================================
- Hits          481      445      -36     
- Misses        105      139      +34     


@nsiccha (Author) commented May 13, 2025

I'm guessing this makes a bunch of tests fail which rely on the previous behavior?

@sethaxen (Member) left a comment


Thanks for the PR! I made some notes in #248 (comment). I think a full fix should also deepcopy the optimizer in multipathfinder.

Basically, replace

    iter_optimizers = fill(optimizer, nruns)

with

    iter_optimizers = (deepcopy(optimizer) for _ in 1:nruns)
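To illustrate the difference (a quick sketch, not code from the PR): fill reuses one object, while the generator yields an independent copy per run:

    using Optim  # provides the LBFGS optimizer type

    optimizer = Optim.LBFGS()

    # `fill` reuses the SAME object for every entry:
    opts = fill(optimizer, 3)
    @assert opts[1] === opts[2]  # aliased: mutable line-search state is shared

    # the generator yields an independent deep copy per run:
    opts2 = collect(deepcopy(optimizer) for _ in 1:3)
    @assert opts2[1] !== opts2[2]  # no shared mutable state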

Can you also add a brief test that fails on main but would pass with this PR?

Comment on lines +27 to +28

    const DEFAULT_LINE_SEARCH_CONSTRUCTOR = LineSearches.HagerZhang
    const DEFAULT_LINE_SEARCH_INIT_CONSTRUCTOR = LineSearches.InitialHagerZhang

@sethaxen (Member) commented

Let's keep the constant names and just make them the constructors instead of the objects.

Suggested change:

    - const DEFAULT_LINE_SEARCH_CONSTRUCTOR = LineSearches.HagerZhang
    - const DEFAULT_LINE_SEARCH_INIT_CONSTRUCTOR = LineSearches.InitialHagerZhang
    + const DEFAULT_LINE_SEARCH = LineSearches.HagerZhang
    + const DEFAULT_LINE_SEARCH_INIT = LineSearches.InitialHagerZhang

Comment on lines +34 to +35

    linesearch=DEFAULT_LINE_SEARCH_CONSTRUCTOR(),
    alphaguess=DEFAULT_LINE_SEARCH_INIT_CONSTRUCTOR(),

@sethaxen (Member) suggested change:

    - linesearch=DEFAULT_LINE_SEARCH_CONSTRUCTOR(),
    - alphaguess=DEFAULT_LINE_SEARCH_INIT_CONSTRUCTOR(),
    + linesearch=DEFAULT_LINE_SEARCH(),
    + alphaguess=DEFAULT_LINE_SEARCH_INIT(),

@sethaxen (Member) commented

> Also, maybe this shouldn't be just merged into main? How does this go, @sethaxen?

Pathfinder follows a continuous deployment model, so yes, we'll merge this directly into main and immediately register a release. It's a non-breaking bug-fix PR, so just add a patch version bump.

@sethaxen (Member) commented

> I'm guessing this makes a bunch of tests fail which rely on the previous behavior?

Seems to be only 2 tests:

    @test result.optimizer ===
        Pathfinder.default_optimizer(Pathfinder.DEFAULT_HISTORY_LENGTH)

    @test result.optimizer ===
        Pathfinder.default_optimizer(Pathfinder.DEFAULT_HISTORY_LENGTH)

Not certain whether an equality check would work here, but if not, just checking that the optimizer is an LBFGS with the same m, linesearch, and linesearch-init types would be fine.
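I.e., something like this sketch (assuming Optim.jl's LBFGS field names m, linesearch!, and alphaguess!, and the `result` from the surrounding test):

    using Optim, LineSearches, Test
    import Pathfinder

    opt = result.optimizer
    @test opt isa Optim.LBFGS
    @test opt.m == Pathfinder.DEFAULT_HISTORY_LENGTH
    @test opt.linesearch! isa LineSearches.HagerZhang         # same linesearch type
    @test opt.alphaguess! isa LineSearches.InitialHagerZhang  # same init type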

@sethaxen (Member) commented

The docs build failure is unrelated; I'll fix it in a separate PR.

@nsiccha (Author) commented May 13, 2025

Ah, okay! Will do. The new test would probably have to check that constructing a default_optimizer before and after a pathfinder run results in identical-in-value but different-in-memory objects, if that makes sense. Will probably add this later today 👍
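Roughly like this, perhaps (a sketch only; it assumes Optim.jl's LBFGS field names m and linesearch!, and ℓ is a placeholder log-density problem):

    using Pathfinder, Test

    opt_pre = Pathfinder.default_optimizer(Pathfinder.DEFAULT_HISTORY_LENGTH)
    result = pathfinder(ℓ)  # ℓ: placeholder log-density problem
    opt_post = Pathfinder.default_optimizer(Pathfinder.DEFAULT_HISTORY_LENGTH)

    @test opt_pre.m == opt_post.m                                # identical in value
    @test typeof(opt_pre.linesearch!) === typeof(opt_post.linesearch!)
    @test opt_pre.linesearch! !== opt_post.linesearch!           # distinct in memory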

@sethaxen (Member) commented

> The new test would probably have to check that constructing a default_optimizer before and after a pathfinder run results in identical-in-value but different-in-memory objects, if that makes sense.

I think that makes sense for testing that this particular approach we're using now works, but it would be even better to have a test that is independent of our default_optimizer and instead directly tests our invariants (things that we should be able to guarantee are true). Here our invariants would be:

  1. if you didn't pass a stateful optimizer/log-density function and if your RNG is thread-safe, then pathfinder should be thread-safe.
  2. multipathfinder with a thread-safe RNG and a multithreading executor should be thread-safe.

The only way I can think of to test thread-safety is via reproducibility.

I'm thinking 2 tests:

  1. Call pathfinder multiple times in a multi-threaded loop with identically seeded thread-safe RNGs for a nontrivial model (maybe the banana), without specifying the optimizer. Verify that the results (e.g. trace, trace gradient, log-density, and draws) are numerically identical (with ==); see the sketch after this list.
  2. Call multipathfinder with a user-constructed LBFGS (so the runs share state), a thread-safe RNG, and a PreferParallel executor. Reseed identically and re-run. The results should be identical.
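A rough sketch of test 1 (ℓ_banana is a placeholder for the banana log-density; the RNG type and counts are illustrative):

    using Pathfinder, Random, Test
    using Base.Threads: @threads

    nrepeats = 8
    results = Vector{Any}(undef, nrepeats)
    @threads for i in 1:nrepeats
        rng = Random.Xoshiro(42)                # identically seeded per task
        results[i] = pathfinder(ℓ_banana; rng)  # no optimizer passed: default used
    end
    for r in results[2:end]
        @test r.draws == results[1].draws       # numerically identical draws
    end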

We do have reproducibility tests here:

    Random.seed!(rng, seed)
    result2 = multipathfinder(
        ℓ, ndraws; nruns, ndraws_elbo, ndraws_per_run, rng, executor
    )
    @test result2.fit_distribution == result.fit_distribution
    @test result2.draws == result.draws
    @test result2.draw_component_ids == result.draw_component_ids
    Random.seed!(rng, seed)
    result3 = multipathfinder(
        ℓ, ndraws; nruns, ndraws_elbo, ndraws_per_run, rng, executor
    )
    for (c1, c2) in
        zip(result.fit_distribution.components, result3.fit_distribution.components)
        @test c1 ≈ c2 atol = 1e-6
    end

My guess is that these are currently passing because the log-density is so trivial that very little time is spent in the linesearch, so the runs don't interfere with each other often.

> Will probably add this later today 👍

I really appreciate it! Let me know if you'd like help with any of this.

@nsiccha (Author) commented May 13, 2025

> My guess is that these are currently passing because the log-density is so trivial that very little time is spent in the linesearch, so the runs don't interfere with each other often.

Right, in my example in the GitHub issue there was also no problem with the parallel standard-normal run.

@sethaxen (Member) commented

@nsiccha anything I can do to help out with this PR?

@nsiccha (Author) commented Jul 8, 2025

@sethaxen, right, so sorry, I've been a lot busier than expected. I'll try to do it now :)

@nsiccha (Author) commented Jul 8, 2025

Ah, I've been trying to construct a test that fails for multipathfinder, but was unable to. The reason is that you already catch the multipathfinder + stateful-optimizer case; see

    # also support optimizers that store state
    zip(_init, Iterators.map(deepcopy, iter_optimizers))

That is probably why few, if any, other people have run into this issue...

I'll still finish the changes, but will only add a new test for a parallel pathfinder call without an explicitly passed optimizer. Is that alright, @sethaxen?

@sethaxen (Member) commented

> Ah, I've been trying to construct a test that fails for multipathfinder, but was unable to. The reason is that you already catch the multipathfinder + stateful-optimizer case; see

Ah, yes, it seems we do catch that case! Okay good, this bug wasn't as severe as it initially looked.

> I'll still finish the changes, but will only add a new test for a parallel pathfinder call without an explicitly passed optimizer. Is that alright, @sethaxen?

Yes, that sounds like the right approach. Thanks!

@nsiccha (Author) commented Jul 14, 2025

> this bug wasn't as severe as it initially looked.

Yeah, indeed. In retrospect, if it had affected everyone, I guess it would have been discovered earlier.

BTW, the reason I was even using the parallel (single) pathfinder approach was that I wanted to initialize several chains for the same posterior in parallel, but I did not want all the eventual initialization points to come from a single approximation, which AFAICT inevitably happens in high dimensions with importance resampling. I wanted to do something slightly more clever, but for that I would have needed to match each draw to the approximation that generated it, and AFAICT that wasn't possible with multipathfinder. I guess in the end I could have used the fit_distributions from the PathfinderResult; I'm unsure why I didn't. Or actually, IIRC the multipathfinder method was for some reason much slower than my parallel pathfinder implementation?

I'm unsure, I eventually just stuck with the simple parallel pathfinder approach, which worked well for me (except for the race condition).

@sethaxen (Member) commented

> BTW, the reason I was even using the parallel (single) pathfinder approach was that I wanted to initialize several chains for the same posterior in parallel, but I did not want all the eventual initialization points to come from a single approximation, which AFAICT inevitably happens in high dimensions with importance resampling. I wanted to do something slightly more clever, but for that I would have needed to match each draw to the approximation that generated it, and AFAICT that wasn't possible with multipathfinder.

The multi-pathfinder result object stores the individual single-path results, each of which stores the draws that run generated, so you can always access those or, as you said, use the fit distribution for each path. You can also disable importance resampling with importance=false; it still resamples with replacement but does not use importance weights to do so. The draw_component_ids field then stores, for each draw in draws, the index of the single-path run that generated that specific draw.
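For example (a sketch; ℓ and the keyword values are placeholders, and it assumes draws stores one draw per column):

    using Pathfinder

    result = multipathfinder(ℓ, 100; nruns=8, importance=false)
    ids = result.draw_component_ids       # run index that generated each draw
    # all draws produced by the 3rd single-path run:
    draws_from_run_3 = result.draws[:, ids .== 3]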

@nsiccha (Author) commented Jul 24, 2025

Makes sense! I was in the end also affected by julia-vscode/julia-vscode#3853, but wasn't aware of it at the time.

I also wrapped the (single) pathfinder calls in another loop, retrying pathfinder until NUTS initialization worked for the returned (initialization) draw, which mainly means until the gradient evaluation did not error.
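Roughly like this sketch (pathfinder_with_valid_init and max_tries are made-up names; it assumes ℓ supports LogDensityProblems.logdensity_and_gradient):

    using Pathfinder, LogDensityProblems

    # Hypothetical helper: re-run single-path pathfinder until the returned
    # draw has a finite log-density and gradient, so NUTS can initialize there.
    function pathfinder_with_valid_init(ℓ; max_tries=10, kwargs...)
        for _ in 1:max_tries
            result = pathfinder(ℓ; ndraws=1, kwargs...)
            x = result.draws[:, 1]
            lp, grad = LogDensityProblems.logdensity_and_gradient(ℓ, x)
            isfinite(lp) && all(isfinite, grad) && return result
        end
        error("no valid initialization draw after $max_tries tries")
    end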

Maybe this is actually something that Pathfinder should (optionally) check: that the log density (gradient) can be evaluated for the returned draws?

@sethaxen (Member) commented

> Maybe this is actually something that Pathfinder should (optionally) check: that the log density (gradient) can be evaluated for the returned draws?

Oh, that's interesting. Can you open an issue for this feature?


Development

Successfully merging this pull request may close these issues.

Non reproducible behaviour of pathfinder if run in parallel (even if every task gets passed its own RNG)
