Type asserts help a bit but not that much


Mauro

Jan 12, 2015, 5:25:58 AM
to juli...@googlegroups.com
As many have done before (e.g. see recent posts on julia-users), I was looking
into the performance of higher-order functions. There are NumericFuns
and FastAnonymous, which help with that. Anyway, the least magical and
most straightforward approach would be if type asserts could
solve this:

function higherorder(f::Function, T::Type, x)
    out = f(x)::T
end

But asserts don't help much, as my investigation suggests:
https://gist.github.com/mauro3/ebc4d0b0ed53943775dd

Summarising that investigation:

1. When getting things out of a vec = Vector{Any}, using ... =
vec::Float64 instead of just ... = vec brings the memory
allocations down to what a Vector{Float64} allocates (80 bytes in the
test case). However, speed is still about 5x slower. (The same goes for
Dict{_,Any}.)

2. Doing the same for Functions (or ObjectIdDicts), similar to the
example above, brings allocation down by 50%, but speed improves by at
most a factor of 2x. Calling the function directly is 4x faster with no
extra allocations (80 bytes).
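
To make the two points concrete, here is a minimal sketch of the kind of loop being compared (hypothetical names, not the actual code from the gist):

```julia
# Hypothetical sketch of the Vector{Any} case from point 1.
# Asserting the element type on extraction cuts allocations, but each
# element still has to be reached through a pointer dereference.
function sum_asserted(vec::Vector{Any})
    s = 0.0
    for i = 1:length(vec)
        s += vec[i]::Float64   # type-assert each extracted element
    end
    s
end

# The same loop over a concretely typed vector, for comparison:
function sum_typed(vec::Vector{Float64})
    s = 0.0
    for i = 1:length(vec)
        s += vec[i]
    end
    s
end
```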

Questions:

- Why do the type assertions not help? I would have thought that this
would give the compiler all the info it needs to generate fast code.

- Why is there a difference between points 1 & 2 above (i.e. getting
values out of a Vector{Any} vs. calling a Function)?

- Would that be something to implement? Just adding a type-annotation
would be a lot simpler (to use and explain) than using a whole package
to get fast higher order functions.

Tim Holy

Jan 12, 2015, 6:13:04 AM
to juli...@googlegroups.com
If I understand correctly, type-asserts are well named: they run the "bare"
code (which includes boxing the result) and then assert that the type matches
what you say it is. The main advantage of using a type-assert is that later
code benefits from knowing the type, so type instability does not "cascade"
through the rest of the function.
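
For instance (a made-up illustration, not from the original post): without the assert, the unknown element type propagates into everything computed from it; with the assert, only the extraction itself pays the boxed-value cost:

```julia
# Without an assert, `x` has unknown type, so `x*x + 1.0` is also
# compiled generically -- the instability cascades.
function cascade(v::Vector{Any})
    x = v[1]
    x*x + 1.0
end

# With the assert, later code knows `x` is a Float64, even though the
# extraction itself still goes through a boxed value first.
function contained(v::Vector{Any})
    x = v[1]::Float64
    x*x + 1.0
end
```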

The alternative, prospectively assigning the type, basically requires
implementing functions with declared output types (issue #1090), but with an
additional twist: you want to be able to declare that `getindex(A, i)`, even
when `A` is a `Vector{Any}`, has return type Float64 *for that particular
`i`* (and maybe not for any other `i`). At the level of machine code, this is
quite a different function from either `getindex(::Vector{Float64}, ::Int)`
(which takes advantage of the packed memory layout of Float64) or
`getindex(::Vector{Any}, ::Int)` (which uses one pointer per element, and
dereferences it to find the value). So it would be an entirely new beast in julia.

Even if this gets implemented, there's still a lot of use for the runtime
failure of the current type-assert; for example, if you're _wrong_ about the
type of `x = @unboxed Float64 getindex(A, i)`, that would be a bit like using
@inbounds with an index that is not, in fact, in-bounds. (I just made up that
macro syntax; I have no idea what kind of syntax would be best for the kind of
output-type declaration you're interested in here.)

--Tim

Tim Holy

Jan 12, 2015, 6:45:21 AM
to juli...@googlegroups.com
Correction, after reading your post a little more carefully: the reduction
in memory allocation proves that under some circumstances there is no boxing.
So the slower speed comes from the fact that a `Vector{Any}` is actually a
vector of pointers, and you have to dereference each one to retrieve the
value. That destroys all kinds of optimizations (SIMD, possibly good cache
behavior, etc.).

I'm guessing the unboxing happens only when the function can be inlined.
Functions passed as arguments cannot currently be inlined.
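
A hedged sketch of that distinction (hypothetical names): `sin` written directly into the loop body is known at compile time, whereas the same call through an untyped Function argument goes through a generic call on every iteration:

```julia
# Direct call: `sin` is known at compile time and can be inlined.
function direct_sum(a)
    s = 0.0
    for i = 1:length(a)
        s += sin(a[i])
    end
    s
end

# Indirect call: `f` is just a Function, so each call dispatches at
# run time and cannot be inlined (as of Julia 0.3/0.4).
function indirect_sum(f, a)
    s = 0.0
    for i = 1:length(a)
        s += f(a[i])
    end
    s
end
```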

--Tim


Mauro

Jan 12, 2015, 8:38:05 AM
to juli...@googlegroups.com
Hi Tim,

Thanks for the clarification. My misconception was that a type assertion
could come for free. Somehow that seems to be the case for the
Vector{Any} example, but not for the Function example.

Anyway, playing around with this problem a bit more, I came up with this
solution, which is similar to FastAnonymous but uses `call` overloading
instead of constructors:


immutable Fun{Name} end

fn = Fun{:fn}()
call(::Fun{:fn}, x) = sin(x)
# can have multimethods
call(::Fun{:fn}, x::Int16) = "haha"

# can be used like a normal function:
fn(5) # -> -0.958
fn(int16(5)) # -> "haha"


Here the gist with performance tests:
https://gist.github.com/mauro3/30bef47144c2ec4beb8d

Naive question:

Would there be any disadvantage of using Fun instead of Function
everywhere?

Erik Schnetter

Jan 12, 2015, 8:56:36 AM
to juli...@googlegroups.com
On Jan 12, 2015, at 7:34 , Mauro <maur...@runbox.com> wrote:
>
> Naive question:
>
> Would there be any disadvantage of using Fun instead of Function
> everywhere?

Obligatory pun: This would be Fun!

-erik

--
Erik Schnetter <schn...@gmail.com>
http://www.perimeterinstitute.ca/personal/eschnetter/

My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu/.


Tim Holy

Jan 12, 2015, 9:05:41 AM
to juli...@googlegroups.com
I need to rewrite FastAnonymous for julia 0.4, because the combination of call
overloading and stagedfunctions is quite powerful. Below I've amplified your
example, showing that you can basically solve the
slow-passing-of-function-arguments problem. This doesn't really handle the
"anonymous data" portion of FastAnonymous, but with a little work that too
can be arranged.

Best,
--Tim

function mysin!(b, a)
    for i = 1:length(a)
        b[i] = sin(a[i])
    end
    b
end

function mymap!(f, b, a)
    for i = 1:length(a)
        b[i] = f(a[i])
    end
    b
end

immutable Fun{Name} end

stagedfunction call{fn}(::Fun{fn}, x)
    :($fn(x))
end

a = rand(10^7);
b = similar(a);

mysin!(b, a);
fsin = Fun{:sin}()
mymap!(fsin, b, a);

julia> @time 1
elapsed time: 0.001543315 seconds (13808 bytes allocated)
1

julia> @time mysin!(b, a);
elapsed time: 0.153308325 seconds (80 bytes allocated)

julia> @time mymap!(fsin, b, a);
elapsed time: 0.151995279 seconds (80 bytes allocated)

Andreas Noack

Jan 12, 2015, 9:13:47 AM
to juli...@googlegroups.com
...or we could just define the arithmetic functions as types and overload call on the type

immutable + end

call(::Type{+}, x::Int, y::Int) = box(Int,add_int(unbox(Int,x),unbox(Int,y)))

Unfortunately, the code doesn't get optimized well in map, even though Int+Int appears to be fully optimized according to code_native.

Mauro

Jan 12, 2015, 9:57:23 AM
to juli...@googlegroups.com
@erik: if this takes off, the type should probably be named `FUN` or
maybe even `FUN!`

@tim: That is a sweet stagedfunction trick!

@andreas: Equally sweet trick! BTW, using your + in map! shows no
allocations. Comparison to a normal `-` looks pretty good:

julia> @time map!(-, c, b, a);
elapsed time: 0.016618344 seconds (6391904 bytes allocated)

julia> @time map!(+, c, b, a);
elapsed time: 0.00044628 seconds (80 bytes allocated)

However, it doesn't work so well with a self-written map function, as in
Tim's example. See https://gist.github.com/mauro3/84d43de3f5bcc9b935ba

Question:
So, why not use this everywhere? Or, put differently, why does this
trick work when ordinary functions passed as arguments don't?

John Myles White

Jan 12, 2015, 12:15:06 PM
to juli...@googlegroups.com
Tim, this is great. This makes me a lot more hopeful that we could prototype how functions would behave if they had a richer type system.

-- John