A benchmark of random-fu (replacing mwc-random)#323
idontgetoutmuch wants to merge 2 commits into tweag:master
Conversation
Annoyingly, I have had to comment out some of the existing benchmarks. I tried, but if I uncomment the other tests then even though …
So with … but with …. @Shimuuar, I don't know why this would be. The implementation in …
Can you, instead of editing in place, copy the file …?
Well, this is mysterious. When I run both benchmarks in one program I get …. I wonder what the benchmark is actually measuring.
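For what it's worth, the computation being timed is just `replicateM` followed by a sum. A base-only stand-in (the constant action below is a made-up placeholder for `normal 0.0 1.0`, purely to show the shape that `nfIO` forces — the real benchmark uses the monad-bayes samplers):

```haskell
module Main where

import Control.Monad (replicateM)

-- Placeholder "sampler": a deterministic IO action standing in for a
-- draw from normal 0.0 1.0 (invented for illustration only).
sampleAction :: IO Double
sampleAction = pure 0.5

main :: IO ()
main = do
  xs <- replicateM 1000 sampleAction  -- 1000 "samples", as in the benchmark
  print (sum xs)                      -- forcing the sum is what nfIO measures
```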
```haskell
normalBenchmarks = [ bench "Normal single sample monad bayes" $ nfIO $ do
                       sampleIOfixed (do xs <- replicateM 1000 $ normal 0.0 1.0
                                         return $ sum xs)
                   , bench "Normal single sample monad bayes fu" $ nfIO $ do
                       FU.sampleIOfixed (do xs <- replicateM 1000 $ normal 0.0 1.0
                                            return $ sum xs)
```
You are calling polymorphic code (the inner do blocks) at particular types. This means that, in addition to the actual computation, a type class dictionary lookup is also performed, which can have a performance impact. Maybe this can be improved by adding a SPECIALISE pragma to the MonadDistribution instance definitions of both sampler modules. See https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/pragmas.html#specialize-pragma.
I imagine a change like this:

```haskell
uniform a b = SamplerT (ReaderT $ uniformRM (a, b))
{-# SPECIALISE uniform :: Double -> Double -> SamplerIO Double #-}
```

This is for the MWC sampler, and a similar pragma can be added to the random-fu sampler. Hopefully, both will be faster and more comparable then.
```haskell
instance StatefulGen g m => MonadDistribution (SamplerT g m) where
  random = SamplerT (ReaderT $ RF.runRVar $ RF.stdUniform)

  uniform a b = SamplerT (ReaderT $ RF.runRVar $ RF.doubleUniform a b)
```
Suggested change:

```haskell
uniform a b = SamplerT (ReaderT $ RF.runRVar $ RF.doubleUniform a b)
{-# SPECIALISE uniform :: Double -> Double -> SamplerIO Double #-}
```
@turion I didn't try your suggestions yet and went back to basics. I have … and get …. So now I need to add something like a …
gives this: …. So it seems adding the type class adds almost no overhead and the results remain consistent (with random-fu being faster than mwc for normal samples).
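This is what one would expect from a single-module experiment, since GHC can resolve the dictionary at compile time within one file. A self-contained sketch of that situation (the class and function names here are invented for illustration, and plain prints stand in for criterion timings):

```haskell
module Main where

import Data.List (foldl')

-- Invented single-method class, standing in for MonadDistribution.
class Step a where
  step :: a -> a

instance Step Double where
  step = (+ 1.0)

-- The same computation, written without the class.
stepDirect :: Double -> Double
stepDirect = (+ 1.0)

runClass, runDirect :: Int -> Double
runClass  n = foldl' (\acc _ -> step acc)       0.0 [1 .. n]
runDirect n = foldl' (\acc _ -> stepDirect acc) 0.0 [1 .. n]

main :: IO ()
main = do
  -- In one module, GHC resolves the Step Double dictionary statically,
  -- so both versions compile to essentially the same code.
  print (runClass 1000)
  print (runDirect 1000)
```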
If you add a type class in the same module, the type class dictionary lookup will be optimized away by GHC. For it to have a performance impact, you have to put it into a separate module, as is done in the library. The compilation unit for GHC is a single module, which means that modules/files are optimized independently. Consequently, you will not get realistic benchmarks if you put the type class code in the same file as the benchmark. A benchmark has to be written like any user code in order to be realistic, i.e. it should use the library. Again, if you add …
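To make the point concrete, here is a minimal sketch of the two-module layout being described (file and module names are hypothetical, not the actual monad-bayes modules, and the class is cut down to one method):

```haskell
-- Sampler.hs — the "library" module. GHC optimizes this file on its own,
-- so the class dictionary defined here is opaque to other modules.
module Sampler where

class MonadDistribution m where
  uniform :: Double -> Double -> m Double

-- Bench.hs — the benchmark module. Because it only imports the class,
-- calls at a concrete type go through the dictionary unless the instance
-- was SPECIALISEd, which is exactly what a realistic benchmark exercises.
module Bench where

import Sampler
```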
I did not check my statements against Core, so one should maybe look at the optimized Core first to make sure that this is what's happening.
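One way to do that is to dump the Core GHC actually generates. The dump flags below are standard GHC options; the module itself is just a hypothetical stand-in for the benchmark code:

```haskell
-- Compile with:  ghc -O2 -ddump-simpl -dsuppress-all ClassDemo.hs
-- then search the dumped Core for dictionary arguments (names beginning
-- with $f or $d) at the call sites: if they are still there, the class
-- method was not specialised; if they are gone, GHC specialised/inlined it.
module Main where

-- A small class-polymorphic function, analogous to calling the
-- MonadDistribution methods at a concrete type in the benchmark.
total :: Num a => Int -> a -> a
total n x = sum (replicate n x)

main :: IO ()
main = print (total 1000 (1.0 :: Double))
```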
I am sure you are right, but the discrepancy seems to be caused by using …
With …
I tried, but it made no difference.
gives …