Concatenation without splatting

607 views
Skip to first unread message

Cedric St-Jean

unread,
Nov 22, 2015, 8:04:26 AM11/22/15
to julia-users
I have a big vector of vectors. Is there any way to vcat/hcat them without splatting?

arr_of_arr = Vector[[1],[2,3],[4,5]]
vcat
(arr_of_arr...)

I'm asking because splatting big arrays is a performance issue (and IIRC it blows the stack at some point).

Mauro

unread,
Nov 22, 2015, 3:12:55 PM11/22/15
to julia...@googlegroups.com
In ODE.jl, I've used

vcat_nosplat(y) = eltype(y[1])[el[1] for el in y] # Does vcat(y...) without the splatting

I think the eltype might not be needed. There may be better ways though.

Tomas Lycken

unread,
Nov 23, 2015, 5:07:18 AM11/23/15
to julia-users

That doesn’t quite seem to do what you want, Mauro:

julia> arr_of_arr = Vector{Int}[[1],[2,3],[4,5]]
3-element Array{Array{Int64,1},1}:
 [1]
 [2,3]
 [4,5]

julia> vcat_nosplat(y) = eltype(y[1])[el[1] for el in y]
vcat_nosplat (generic function with 1 method)

julia> vcat_nosplat(arr_of_arr)
3-element Array{Int64,1}:
 1
 2
 4

It’s quite trivial to achieve the desired result with only a few lines of code, though:

julia> function vcat_nosplat2(y)
       result = Array(eltype(y[1]), 0)
       sizehint!(result, sum(map(length, y))) #skip if iterating is more expensive than reallcoation

       for a in y
           for x in a
               push!(result, x)
           end
       end

       result
       end
vcat_nosplat2 (generic function with 1 method)

julia> vcat_nosplat2(arr_of_arr)
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

// T

Dan

unread,
Nov 23, 2015, 5:51:25 AM11/23/15
to julia-users
Using `append!` instead of `push!` and letting efficient dynamic reallocation of vector do the resizing:

function vcat_nosplat2a(y)

    result
= Array(eltype(y[1]), 0)

   
for a in y
        append
!(result, a)
   
end
    result
end

(@benchmark shows way less allocations and 2-3x time)
BTW the splat version is just as quick, but perhaps allocation on stack is problematic (anybody check the limit?)

Tomas Lycken

unread,
Nov 23, 2015, 7:56:33 AM11/23/15
to julia-users

Using append! is definitely an improvement, thanks!

The reason I included the sizehint! call was that this resizes the vector once, and without requiring to copy any elements, and I assumed those saved copies would quickly win over having to iterate the collection an extra time to get the total number of elements. I’m glad to be proven wrong :)

function vcat_prealloc(y)
    result = Array(eltype(y[1]), sum(map(length, y)))

    for a in y
        append!(result, a)
    end
end

function vcat_dynamic(y)
    result = Array(eltype(y[1]), 0)
    for a in y
        append!(result, a)
    end
end

arr_of_arrs = Vector{Int}[rand(1:10, rand(1:10)) for _ in 1:10_000]

julia> @benchmark vcat_dynamic(arr_of_arrs)
================ Benchmark Results ========================
     Time per evaluation: 441.11 μs [405.40 μs, 476.82 μs]
Proportion of time in GC: 0.00% [0.00%, 0.00%]
        Memory allocated: 1.00 mb
   Number of allocations: 16 allocations
       Number of samples: 100
   Number of evaluations: 100
 Time spent benchmarking: 0.07 s

julia> @benchmark vcat_prealloc(arr_of_arrs)
================ Benchmark Results ========================
     Time per evaluation: 716.85 μs [652.07 μs, 781.62 μs]
Proportion of time in GC: 0.00% [0.00%, 0.00%]
        Memory allocated: 933.78 kb
   Number of allocations: 7 allocations
       Number of samples: 100
   Number of evaluations: 100
 Time spent benchmarking: 0.10 s

// T

Mauro

unread,
Nov 23, 2015, 10:56:15 AM11/23/15
to julia...@googlegroups.com
> That doesn’t quite seem to do what you want, Mauro:

Yes, you're right, sorry for the noise!

Glen O

unread,
Nov 27, 2015, 6:39:07 AM11/27/15
to julia-users
Any chance that foldl(vcat,arr_of_arr) will do the job?

Cedric St-Jean

unread,
Nov 27, 2015, 7:39:34 AM11/27/15
to julia-users
foldl would work, but it's going to create a ton of temporary arrays.

None of the proposed efficient solutions work with hcat... I suppose if splatting is a problem I should allocate and fill in the array myself.

Tomas Lycken

unread,
Nov 30, 2015, 1:32:42 AM11/30/15
to julia-users

For hcat, what shape would you want the output to have?

With your example input, the result of vcat is straightforward:

v = [[1],[2,3],[4,5]]
vcat(v...)
# -> [1, 2, 3, 4, 5]

But for hcat, what output would you want? hcat(v...) throws a DimensionMismatch error stating that vectors must have the same length.
//T

Cedric St-Jean

unread,
Nov 30, 2015, 9:42:35 AM11/30/15
to julia-users
Oh, right, the input vectors are all the same size. It's for a machine learning training set.

I've yet to hit the `hcat(v...)` limit in this particular case. I was just curious to see what the alternatives were, since it seems to be common wisdom that splatting arbitrarily-sized vectors is bad. Thank you for the answers everyone.

Cédric

Tomas Lycken

unread,
Nov 30, 2015, 1:43:03 PM11/30/15
to julia-users

If they’re the same lengths, you could do vcat-then-reshape to get the same result as for hcat, but without splatting :)

reshape(fast_vcat(arr_of_arrs), length(arr_of_arrs[1]), length(arr_of_arrs))

// T

Reply all
Reply to author
Forward
0 new messages