Is bad performance implied for constructing types with more than 7 fields?


Jutho

Jan 18, 2015, 5:07:51 PM
to juli...@googlegroups.com
I was trying to time the speed of adding CartesianIndex objects, compared to just Int addition using the following code:

function timesum(N)
    ind=1
    ind1=CartesianIndex((1,))
    ind2=CartesianIndex((1,1))
    ind3=CartesianIndex((1,1,1))
    ind4=CartesianIndex((1,1,1,1))
    ind5=CartesianIndex((1,1,1,1,1))
    ind6=CartesianIndex((1,1,1,1,1,1))
    ind7=CartesianIndex((1,1,1,1,1,1,1))
    ind8=CartesianIndex((1,1,1,1,1,1,1,1))
    @time s=mymul(ind,N,1)
    @time s1=mymul(ind1,N,1)
    @time s2=mymul(ind2,N,1)
    @time s3=mymul(ind3,N,1)
    @time s4=mymul(ind4,N,1)
    @time s5=mymul(ind5,N,1)
    @time s6=mymul(ind6,N,1)
    @time s7=mymul(ind7,N,1)
    @time s8=mymul(ind8,N,1)
    return s, s1, s2, s3, s4, s5, s6, s7, s8
end

function mymul(ind,N,b)
    s=ind
    j=1
    while j<N
        s+=ind
        j+=b
    end
    return s
end

The step b is always 1. It is passed as an argument only because, with a normal for j=1:N loop, the Int addition seems to be optimized away entirely, which prevents me from timing plain Int addition properly. Running timesum(1_000_000_000) gives the following results:

elapsed time: 0.323047918 seconds (0 bytes allocated)
elapsed time: 0.322218919 seconds (0 bytes allocated)
elapsed time: 0.424590085 seconds (0 bytes allocated)
elapsed time: 0.628663348 seconds (0 bytes allocated)
elapsed time: 0.614400146 seconds (0 bytes allocated)
elapsed time: 0.721610207 seconds (0 bytes allocated)
elapsed time: 0.809785727 seconds (0 bytes allocated)
elapsed time: 0.934339029 seconds (0 bytes allocated)
elapsed time: 6.077033222 seconds (0 bytes allocated)

The first timing is Int addition, followed by CartesianIndex{1} up to CartesianIndex{8}. The 8-dimensional case seems worrisome, even though there is still no allocation going on. Since I thought this might be caused by + for CartesianIndex being defined as a stagedfunction, I also implemented the following direct function call:


myplus(a::Base.IteratorsMD.CartesianIndex_8,b::Base.IteratorsMD.CartesianIndex_8)=Base.IteratorsMD.CartesianIndex_8(a[1]+b[1],a[2]+b[2],a[3]+b[3],a[4]+b[4],a[5]+b[5],a[6]+b[6],a[7]+b[7],a[8]+b[8])


However, with mymul calling myplus, the same timings appear. So I assume this is caused by the constructor call itself, which in Julia v0.4 gives rise to a call(::Type{Base.IteratorsMD.CartesianIndex_8},arg1,...,arg8), which is a function with 9 arguments, thus exceeding MAX_TUPLETYPE_LEN=8. Does this mean that it is impossible to efficiently construct a type (immutable) with more than 7 fields? Does this also imply that functions with more than 8 arguments will always have bad performance? As you can see, I really have no knowledge about how these things are implemented in Julia, or even how they work in general for a programming language/compiler/processor/... But I would like to understand this better.
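
One way to test whether the constructor, rather than CartesianIndex itself, is responsible would be to repeat the timing with a hand-written 8-field immutable. The following is only a sketch under assumptions: it uses the struct keyword of recent Julia releases (in v0.4 this would be immutable), and Index8 with its field names is hypothetical, not something from Base. Timings on a current release may well differ from the ones reported above.

```julia
# Hypothetical 8-field immutable type; not part of Base or of the code above.
struct Index8
    i1::Int
    i2::Int
    i3::Int
    i4::Int
    i5::Int
    i6::Int
    i7::Int
    i8::Int
end

# Field-wise addition that calls the 8-argument constructor directly.
Base.:+(a::Index8, b::Index8) = Index8(a.i1 + b.i1, a.i2 + b.i2,
                                       a.i3 + b.i3, a.i4 + b.i4,
                                       a.i5 + b.i5, a.i6 + b.i6,
                                       a.i7 + b.i7, a.i8 + b.i8)

# Same loop as in timesum: b is the loop step, passed in as an argument.
function mymul(ind, N, b)
    s = ind
    j = 1
    while j < N
        s += ind
        j += b
    end
    return s
end

x = Index8(1, 1, 1, 1, 1, 1, 1, 1)
mymul(x, 10, 1)                # run once so that compilation is not timed
@time mymul(x, 1_000_000, 1)   # compare with the CartesianIndex timings
```

If this shows the same jump between a 7-field and an 8-field type, the slowdown is tied to the constructor call and not to the stagedfunction definition of +.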

Jutho

Jan 22, 2015, 1:29:21 PM
to juli...@googlegroups.com
Nobody has any thoughts or comments? It cannot be that Julia, the new superfast language (which I really love, no irony here), breaks down on its promises if you decide to use a type / immutable with more than 7 fields. Probably I am just missing something?

Isaiah Norton

Jan 22, 2015, 1:36:40 PM
to juli...@googlegroups.com
I would guess this is due to the fact that tuple type specialization currently has some size limits:

https://github.com/JuliaLang/julia/blob/f1b00f8024e1f0e551afe7a5925fdebb1fb2df92/base/inference.jl#L4-L5

This issue is one of the points of Jeff's type system revamp:

https://github.com/JuliaLang/julia/issues/8974
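
Whether inference still produces a concrete result type for the larger cases can be checked directly. A minimal sketch, assuming a recent Julia release where Base.return_types and isconcretetype are available; the size limits linked above belong to the 2015 development version, so the behavior may not reproduce:

```julia
# For dimensions 1 through 8, ask inference which type it expects
# from adding two CartesianIndex values of that dimension.
for d in 1:8
    ind = CartesianIndex(ntuple(_ -> 1, d))
    T = typeof(ind)
    rts = Base.return_types(+, (T, T))
    println(d, " => ", rts, "  concrete: ", all(isconcretetype, rts))
end
```

A non-concrete entry for one dimension would point to a specialization or inference limit rather than to the arithmetic itself.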


Jutho

Jan 23, 2015, 3:06:38 AM
to juli...@googlegroups.com

This issue is one of the points of Jeff's type system revamp:

https://github.com/JuliaLang/julia/issues/8974


Cool. Looking forward to this. 