- 
                Notifications
    You must be signed in to change notification settings 
- Fork 432
          Decouple rand and eltype
          #1905
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
  
    Decouple rand and eltype
  
  #1905
              
            Conversation
747a191    to
    0ea5502      
    Compare
  
    | Codecov Report❌ Patch coverage is  
 Additional details and impacted files@@            Coverage Diff             @@
##           master    #1905      +/-   ##
==========================================
- Coverage   86.35%   85.42%   -0.94%     
==========================================
  Files         146      146              
  Lines        8782     8865      +83     
==========================================
- Hits         7584     7573      -11     
- Misses       1198     1292      +94     ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
 | 
| Given that, according to the docstring Distributions.jl/src/common.jl Lines 96 to 102 in a1010e4 
 the sole purpose of eltypeis to return the type ofrand, which I agree isn't really feasible, should we also deprecate the methods here as part of this PR? | 
| #1907 is related. | 
5644af6    to
    3e66d3e      
    Compare
  
    | My personal opinion: This is orthogonal to the fact that Distributions should probably support  In an ideal situation, I think we would define  | 
| 
 I became convinced that in general this is not possible. It basically means trying to manually figure out type inference, a task that even the Julia compiler can fail at. Of course, for some number types and distributions it can be done (as also the compiler will succeed in many instances). But in general it's far from trivial. As in the initial stages of Distributions, one could be restrictive and limit samples to  | 
4db3858    to
    d824cf8      
    Compare
  
    d824cf8    to
    2e7752e      
    Compare
  
    | A few distributions still give me  julia> dist = Semicircle(50.0f0)
Semicircle{Float32}(r=50.0f0)
julia> rand(dist, 5)
5-element Array{Float64, 1}:
  36.80671953487414
 -18.355635129900335
 -11.701855436648922
 -21.444118928985656
  -5.80120463505302
julia> dist = JohnsonSU(0.0f0, 1.0f0, 0.0f0, 1.0f0)
JohnsonSU{Float32}(ξ=0.0f0, λ=1.0f0, γ=0.0f0, δ=1.0f0)
julia> rand(dist, 5)
5-element Array{Float64, 1}:
  0.5311696707298299
  1.632313034117999
  0.04951771555318912
  0.4721610259428258
 -3.052321854866766
julia> dist = Chisq(5.0f0)
Chisq{Float32}(ν=5.0f0)
julia> rand(dist, 5)
5-element Array{Float32, 1}:
 15.465032
 1.888659
 7.013455
 4.258529
 3.9611576Can this be fixed? | 
| The reason is that for quite a few of the older, probably less used and definitely less frequently updated distributions the  | 
| This is not the first time I have went on a rant about how Distributions in this package are effectively unordered containers with infinite size, but I have stumbled upon new ammunition: This section of the Julia stdlib documentation claims: 
 This led me to JuliaLang/julia#31787 and JuliaLang/julia#27756 I believe that it is actually deeper Random.jl precedent that  Additional ramblingOn a similar note, I do not think that it is controversial that  I also believe that it is reasonable that  julia> d = Normal(0f0, 1f0)
Normal{Float32}(μ=0.0f0, σ=1.0f0)
julia> typeof(quantile(d, .5)) == typeof(rand(d))
falseFor many distribution types, perhaps the real conversation we should be having is if we should force  For distributions parameterized by mean/std, similar logic definitely holds. It is not a stretch to extend this for other location/spread parameters. Shape parameters are ambiguous, but if someone types  For distributions like Binomial, I would argue that the type of  Yet further rambling
 | 
| 
 | 
| Sorry, late to the party, but with iterators we have a "HasEltype" attribute to address exactly this issue, how is this design related? | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a review of censored, Cholesky-, and matrix-variate distributions.
| if uplo === 'L' | ||
| for j in 1:d | ||
| A[j, j] = 1 | ||
| for i in (j+1):d | ||
| A[i, j] = 0 | ||
| end | ||
| end | ||
| else | ||
| for j in 1:d | ||
| for i in 1:(j - 1) | ||
| A[i, j] = 0 | ||
| end | ||
| A[j, j] = 1 | ||
| end | ||
| end | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the loop is a little more efficient, but since this is a rare case, perhaps it's better for decreased code complexity to just copy the identity matrix?
| if uplo === 'L' | |
| for j in 1:d | |
| A[j, j] = 1 | |
| for i in (j+1):d | |
| A[i, j] = 0 | |
| end | |
| end | |
| else | |
| for j in 1:d | |
| for i in 1:(j - 1) | |
| A[i, j] = 0 | |
| end | |
| A[j, j] = 1 | |
| end | |
| end | |
| copyto!(A, I) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume I wrote the code in that way to avoid updating the inactive triangle. But I didn't benchmark it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think LinearAlgebra makes any promises about the contents of the inactive triangle, and I think it would be crazy for a user calling rand!(rng, ::LKJCholesky, ::Cholesky) to have any expectations about the factors matrix other than that the active triangle follows the distribution. So I don't see a problem with overwriting the inactive triangle to simplify the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any correctness problems with overwriting the entire matrix. I just assume that the motivation for the current version was to reduce the number of setindex! calls. I'll actually doubt that it has a significant performance impact but I'll benchmark it when I'm back at my computer, just to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a huge deal; I'll leave it up to you.
| for j in 1:d | ||
| for i in 1:(j - 1) | ||
| R[i, j] = 0 | ||
| end | ||
| R[j, j] = 1 | ||
| for i in (j + 1):d | ||
| R[i, j] = 0 | ||
| end | ||
| end | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above:
| for j in 1:d | |
| for i in 1:(j - 1) | |
| R[i, j] = 0 | |
| end | |
| R[j, j] = 1 | |
| for i in (j + 1):d | |
| R[i, j] = 0 | |
| end | |
| end | |
| copyto!(R, I) | 
Co-authored-by: Seth Axen <[email protected]>
| 
 Sometimes quantiles of integer-valued distributions are taken to interpolate between the integers flanking the 50% value of the cumulative. E.g. the median of an unbiased Bernoulli distribution would be 0.5 in this convention. Just saying. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a review of the mixture models and multivariate distributions.
| randn!(rng, x) | ||
| for i in eachindex(x) | ||
| k = rand(rng, psampler) | ||
| x[i] = muladd(x[i], stds[k], means[k]) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code seems to assume that stds and means have 1-based indexing. Is this enforced anywhere?
| x::AbstractVector{<:Real}) | ||
| function rand(rng::AbstractRNG, d::Union{Dirichlet,DirichletCanon}) | ||
| x = map(αi -> rand(rng, Gamma(αi)), d.alpha) | ||
| return lmul!(inv(sum(x)), x) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use normalize! so we also get its special-casing to avoid over/underflow?
| return lmul!(inv(sum(x)), x) | |
| return normalize!(x, 1) | 
Also, if e.g. d.alpha is a StaticArrays.StaticArray, this would fail, right, since x would also be immutable? So maybe it make sense to allocate x first, map! into it, and then have this line?
| end | ||
| function rand(rng::AbstractRNG, d::Dirichlet{<:Real,<:FillArrays.AbstractFill{<:Real}}) | ||
| x = rand(rng, Gamma(FillArrays.getindex_value(d.alpha)), length(d)) | ||
| return lmul!(inv(sum(x)), x) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return lmul!(inv(sum(x)), x) | |
| return normalize!(x, 1) | 
| # use exponential generation method with inversion, where for gaps in the ranks, we | ||
| # use the fact that the sum Y of k IID variables xₘ ~ Exp(1) is Y ~ Gamma(k, 1). | ||
| # Lurie, D., and H. O. Hartley. "Machine-generation of order statistics for Monte | ||
| # Carlo computations." The American Statistician 26.1 (1972): 26-27. | ||
| # this is slow if length(d.ranks) is close to n and quantile for d.dist is expensive, | ||
| # but this branch is probably taken when length(d.ranks) is small or much smaller than n. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're just calling the mutating method, I don't think we need to copy such a large comment.
| # use exponential generation method with inversion, where for gaps in the ranks, we | |
| # use the fact that the sum Y of k IID variables xₘ ~ Exp(1) is Y ~ Gamma(k, 1). | |
| # Lurie, D., and H. O. Hartley. "Machine-generation of order statistics for Monte | |
| # Carlo computations." The American Statistician 26.1 (1972): 26-27. | |
| # this is slow if length(d.ranks) is close to n and quantile for d.dist is expensive, | |
| # but this branch is probably taken when length(d.ranks) is small or much smaller than n. | 
| end | ||
| function rand(rng::AbstractRNG, d::MvLogNormal, n::Int) | ||
| xs = rand(rng, d.normal, n) | ||
| map!(exp, xs, xs) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the docs for map!:
Behavior can be unexpected when any mutated argument shares memory with any other argument.
Maybe we could broadcast here instead?
| map!(exp, xs, xs) | |
| xs .= exp.(xs) | 
| _rand!(rng, d.normal, x) | ||
| function rand(rng::AbstractRNG, d::MvLogNormal) | ||
| x = rand(rng, d.normal) | ||
| map!(exp, x, x) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| map!(exp, x, x) | |
| x .= exp.(x) | 
| @eval begin | ||
| Base.@propagate_inbounds function rand!(rng::AbstractRNG, d::MvLogNormal, x::AbstractArray{<:Real,$N}) | ||
| rand!(rng, d.normal, x) | ||
| map!(exp, x, x) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| map!(exp, x, x) | |
| x .= exp.(x) | 
| y = similar(x, (1, cols)) | ||
| unwhiten!(d.Σ, randn!(rng, x)) | ||
| rand!(rng, chisqd, y) | ||
| x .= x ./ sqrt.(y ./ d.df) .+ d.μ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not broadcast the rand call to avoid allocating an intermediate container?
| y = similar(x, (1, cols)) | |
| unwhiten!(d.Σ, randn!(rng, x)) | |
| rand!(rng, chisqd, y) | |
| x .= x ./ sqrt.(y ./ d.df) .+ d.μ | |
| unwhiten!(d.Σ, randn!(rng, x)) | |
| rand!(rng, chisqd, y) | |
| x .= x ./ sqrt.(rand.(rng, chisqd) ./ d.df) .+ d.μ | 
One of my take-aways from issues such as #1252, #1902, #894, #1783, #1041, and #1071 is that
eltypeis not only quite inconsistently implemented and unclearly documented currently but more generally it might be a bad design decision to use it for pre-allocating containers of samples: Setting it to a fixed type (historicallyFloat64for continuous andIntfor discrete distributions) is too limiting, but trying to infer it from parameters is challenging and doomed to fail for more challenging scenarios that are not as clear as e.g.Normal.This PR tries to decouple
randandeltype, to make it easier to possibly eventually deprecate and removeeltype.Fixes #1041.
Fixes #1071.
Fixes #1082.
Fixes #1252.
Fixes #1783.
Fixes #1884.
Fixes #1902.
Fixes #1907.