
perf: add specializations for mapreduce min/max #48


Open: wants to merge 5 commits into base: main
3 changes: 3 additions & 0 deletions src/array_of_similar_arrays.jl
@@ -262,6 +262,9 @@ Base.@propagate_inbounds function _deepview_impl_aosa(A::ArrayOfSimilarArrays, i
end


Base.mapreduce(::typeof(maximum), ::typeof(max), A::ArrayOfSimilarArrays; kw...) = maximum(flatview(A); kw...)
Base.mapreduce(::typeof(minimum), ::typeof(min), A::ArrayOfSimilarArrays; kw...) = minimum(flatview(A); kw...)
Comment on lines +265 to +266
Collaborator

Does it make sense to forward the kwargs to minimum here? The kwargs of mapreduce and minimum are not compatible, right?

Contributor

Yeah, technically they don't completely overlap. However, the problem is that we need init, because otherwise there's no way to make reducing over empty arrays work.
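
For context, a quick illustration of the empty-collection issue with plain Julia arrays (assuming -Inf is an acceptable neutral element here):

# Reducing an empty collection throws unless an explicit init is given:
maximum(Float64[])                 # throws ArgumentError
maximum(Float64[]; init = -Inf)    # returns -Inf

# The same applies to the nested case:
mapreduce(maximum, max, Vector{Float64}[]; init = -Inf)   # returns -Inf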

Author

Would you in this case rather curry minimum into a minimum(..., init=0) and put that into mapreduce? And would that hopefully still work because the typeof(minimum) would still trigger on the curried function?

I don't know how this would look in Julia, but in Python it would be something like the following pseudo-code:

from functools import partial

minimum_with_init = partial(minimum, init=0)

mapreduce(minimum_with_init, min, A)

I'm not sure how multiple dispatch works with curried functions, and not sure if this is a common solution/paradigm in Julia.
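
From a quick check, a closure in Julia gets its own type, so presumably a ::typeof(minimum) specialization would not match the curried version anyway (sketch, illustrative only):

minimum_with_init = x -> minimum(x; init = 0.0)

typeof(minimum_with_init) === typeof(minimum)   # false: the closure has its own type

# A method like
#     Base.mapreduce(::typeof(minimum), ::typeof(min), A; kw...)
# therefore only matches the literal `minimum`; the curried wrapper would
# fall through to the generic mapreduce path:
mapreduce(minimum_with_init, min, [[1.0, 2.0], [3.0]])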

Collaborator

I think it might be best to declare and handle the kwargs explicitly.

As for the minimum with init, I doubt many users would write their code like that ...

Contributor

> I think it might be best to declare and handle the kwargs explicitly.

There's no good default for init=, so if the user doesn't provide one, we don't pass one on either.
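
Something along these lines (a sketch of the idea, using nothing as a "not provided" sentinel, which assumes nothing is never a meaningful init for these reductions):

# Accept `init` explicitly and only forward it to the flat reduction when the
# caller actually supplied one; otherwise keep the usual empty-collection error.
function Base.mapreduce(::typeof(maximum), ::typeof(max), A::ArrayOfSimilarArrays;
                        init = nothing)
    init === nothing ? maximum(flatview(A)) : maximum(flatview(A); init = init)
end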

Author

done in ffe6030

Collaborator

On generic arrays the default implementation of mapreduce is

mapreduce(f, op, A::AbstractArrayOrBroadcasted; dims=:, init=_InitialValue()) =
_mapreduce_dim(f, op, init, A, dims)

but on GPU arrays it is

Base.mapreduce(f, op, A::AnyGPUArray, As::AbstractArrayOrBroadcasted...; dims=:, init=nothing)

So the handling of the init default differs, and _InitialValue is not public. It would be nice if there was an official "default init" construct, but there isn't, so we have to get a bit creative.

Also, we can't just pass a dims kwarg that is not : through to the aggregation function on the flattened collection; that would produce incorrect results in many cases.
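
For example (shapes here are just for illustration): dims on the nested array refers to its outer axes, while dims on flatview refers to the flattened array's axes, so the two reductions don't even have the same shape:

A_flat = rand(2, 3, 4)
A = nestedview(A_flat, 2)             # 4-element vector of 2×3 matrices

mapreduce(maximum, max, A; dims = 1)  # reduces over the outer axis -> 1-element vector
maximum(flatview(A); dims = 1)        # reduces over the first flat axis -> 1×3×4 array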

I suggest we do something like this (untested!):

using Base: add_sum, mul_prod

for (f, op) in [
    (:sum, :add_sum), (:prod, :mul_prod), (:maximum, :max), (:minimum, :min)
]
    for AoA in [:AbstractArrayOfSimilarArrays, :AbstractVectorOfArrays]
        @eval begin
            Base.mapreduce(::typeof($f), ::typeof($op), A::$AoA; dims = :, init = nothing) =
                _aoa_associative_mapreduce($f, $op, A, dims, init)
        end
    end
end

_aoa_associative_mapreduce(f, op, A, ::Colon, ::Nothing) = f(flatview(A))
_aoa_associative_mapreduce(f, op, A, ::Colon, init) = f(flatview(A); init = init)

function _aoa_associative_mapreduce(f, op, A, dims, init)
    # Wrap `f` so the call dispatches to the generic mapreduce implementation
    # (and not back into the specializations above) when `dims != :`:
    wrapped_f = x -> f(x)
    if init === nothing
        return mapreduce(wrapped_f, op, A; dims = dims)
    else
        return mapreduce(wrapped_f, op, A; dims = dims, init = init)
    end
end
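
If that works, dispatch should end up looking roughly like this (names from the sketch above, so take it with a grain of salt):

V = VectorOfArrays([rand(3), rand(5)])

mapreduce(maximum, max, V)              # specialized path -> maximum(flatview(V))
mapreduce(maximum, max, V; init = 0.0)  # specialized path, init forwarded
mapreduce(sum, max, V)                  # different f/op pair -> generic mapreduce as before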

Contributor

@oschulz I think we're gonna give up on making this PR now 😂

Collaborator

Well, we have to do it somewhat cleanly, esp. the dims handling.

Contributor

there's no dims to handle if we're just specializing VectorOfX, because a vector is 1D

let's do that first I guess
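
Roughly this, then (a sketch, with the explicit init handling discussed above):

for (f, op) in [(:maximum, :max), (:minimum, :min)]
    @eval function Base.mapreduce(::typeof($f), ::typeof($op), V::VectorOfArrays;
                                  init = nothing)
        # 1D, so no dims to worry about; forward init only if it was given.
        init === nothing ? $f(flatview(V)) : $f(flatview(V); init = init)
    end
end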



const VectorOfSimilarArrays{
T, M, L,
4 changes: 4 additions & 0 deletions src/vector_of_arrays.jl
@@ -337,6 +337,10 @@ function Base.append!(A::VectorOfArrays{T,N}, B::AbstractVector{<:AbstractArray{
end


Base.mapreduce(::typeof(maximum), ::typeof(max), V::VectorOfArrays) = maximum(flatview(V))
Base.mapreduce(::typeof(minimum), ::typeof(min), V::VectorOfArrays) = minimum(flatview(V))


Base.vcat(V::VectorOfArrays) = V

function Base.vcat(Vs::(VectorOfArrays{U,N} where U)...) where {N}
11 changes: 11 additions & 0 deletions test/array_of_similar_arrays.jl
@@ -365,6 +365,17 @@ using StatsBase: cov2cor
@test @inferred(ArraysOfArrays._innerlength(VSV)) == N
end

@testset "mapreduce maximum/minimum shortcut" begin
r1 = rand(1,4); r2 = rand(1,4); r3 = rand(1,4); r4 = rand(1,4)
ASA = ArrayOfSimilarArrays([r1,r2,r3,r4])

@test mapreduce(maximum, max, ASA) == maximum(flatview(ASA))
@test (@allocated mapreduce(maximum, max, ASA)) == (@allocated maximum(flatview(ASA)))

@test mapreduce(minimum, min, ASA) == minimum(flatview(ASA))
@test (@allocated mapreduce(minimum, min, ASA)) == (@allocated minimum(flatview(ASA)))
end

@testset "map and broadcast" begin
A_flat = rand(2,3,4,5,6)
A = nestedview(A_flat, 2)
2 changes: 1 addition & 1 deletion test/runtests.jl
@@ -2,7 +2,7 @@

import Test

Test.@testset "Package ArraysOfArrays" begin
Test.@testset verbose=true "Package ArraysOfArrays" begin
include("test_aqua.jl")
include("functions.jl")
include("array_of_similar_arrays.jl")
37 changes: 37 additions & 0 deletions test/vector_of_arrays.jl
@@ -76,6 +76,43 @@ using ArraysOfArrays: full_consistency_checks, append_elemptr!, element_ptr
end


@testset "mapreduce maximum/minimum shortcut" begin
A1 = ref_AoA3(Float32, 3); # A2 = ref_AoA3(Float32, 0)
A3 = ref_AoA3(Float32, 4); A4 = ref_AoA3(Float64, 2)

B1 = VectorOfArrays(A1); # B2 = VectorOfArrays(A2);
B3 = VectorOfArrays(A3); B4 = VectorOfArrays(A4);

@testset "maximum - correctness" begin
@test mapreduce(maximum, max, B1) == mapreduce(maximum, max, Array(B1))
# `init` kwarg is not supported for this specialization right now
# @test mapreduce(maximum, max, B2; init=Float32(0.)) == mapreduce(maximum, max, Array(B2); init=Float32(0.))
@test mapreduce(maximum, max, B3) == mapreduce(maximum, max, Array(B3))
@test mapreduce(maximum, max, B4) == mapreduce(maximum, max, Array(B4))
end

@testset "maximum - performance" begin
B1_naive = Array(B1)
mapreduce(maximum, max, B1_naive)
@test (@allocated mapreduce(maximum, max, B1)) <= (@allocated mapreduce(maximum, max, B1_naive))
end

@testset "minimum - correctness" begin
@test mapreduce(minimum, min, B1) == mapreduce(minimum, min, Array(B1))
# `init` kwarg is not supported for this specialization right now
# @test mapreduce(minimum, min, B2; init=Float32(0.)) == mapreduce(minimum, min, Array(B2); init=Float32(0.))
@test mapreduce(minimum, min, B3) == mapreduce(minimum, min, Array(B3))
@test mapreduce(minimum, min, B4) == mapreduce(minimum, min, Array(B4))
end

@testset "minimum - performance" begin
B1_naive = Array(B1)
mapreduce(minimum, min, B1_naive)
@test (@allocated mapreduce(minimum, min, B1)) <= (@allocated mapreduce(minimum, min, B1_naive))
end
end


@testset "append! and vcat" begin
A1 = ref_AoA3(Float32, 3); A2 = ref_AoA3(Float32, 0)
A3 = ref_AoA3(Float32, 4); A4 = ref_AoA3(Float64, 2)