What are the implications (pro/con) of the impending change to vectorize functions with map()?


datnamer

Feb 10, 2016, 3:21:31 PM
to julia-users
Will this eliminate intermediaries in vectorized code? Multithreading? What are the benefits?

Milan Bouchet-Valat

Feb 10, 2016, 3:27:37 PM
to julia...@googlegroups.com
On Wednesday, February 10, 2016 at 12:21 -0800, datnamer wrote:
> Will this eliminate intermediaries in vectorized code?
> Multithreading? What are the benefits?
Could you explain what you mean by "the impending change to vectorize
functions with map()"?


Regards

Stefan Karpinski

Feb 10, 2016, 3:28:15 PM
to Julia Users
It's not entirely clear to me what you mean.

datnamer

Feb 10, 2016, 3:37:42 PM
to julia-users
https://github.com/JuliaLang/julia/issues/8450

In that issue, Jeff states: "And telling library writers to put @vectorize on all appropriate functions is silly; you should be able to just write a function, and if somebody wants to compute it for every element they use map."

What do we gain by moving from a magic @vectorize to map? Implicit parallelism, or other benefits?

Jeff Bezanson

Feb 10, 2016, 5:36:35 PM
to julia...@googlegroups.com
1. It's simply the right way to factor the functionality. If I write a
function, it doesn't make sense for me to guess that people will want
to map it over arrays, and therefore put `@vectorize` on it. Anybody
can `map` any function they want, whether the author of that function
knows about it or not.
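
A minimal sketch of this point (the function and data here are made up for illustration):

```julia
# Any plain scalar function the author writes...
celsius_to_fahrenheit(c) = 9c / 5 + 32   # hypothetical example function

# ...can be mapped over a collection by any caller, with no
# cooperation from (or foresight by) the function's author:
temps = [0.0, 20.0, 100.0]
map(celsius_to_fahrenheit, temps)        # [32.0, 68.0, 212.0]
```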

2. Generality: `@vectorize` generates a particular implementation of
how to do a `map`. To support new array types, new methods would need
to be added to every function that had `@vectorize` applied to it,
which is unworkable. With `map`, containers implement just one
function and you get the benefit everywhere. This does indeed speak to
parallelism, since `map` over a distributed array will automatically
be parallel.
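
A small sketch of the generality argument: each container type supplies its own `map` method once, and every scalar function (here a hypothetical `square`) works with all of them for free:

```julia
square(x) = x^2            # hypothetical scalar function, no @vectorize needed

map(square, [1, 2, 3])     # Vector in, Vector out: [1, 4, 9]
map(square, (1, 2, 3))     # Tuple in, Tuple out: (1, 4, 9)
```

With `@vectorize`-style code generation, supporting a new container would instead require touching every vectorized function.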

3. Code size. `@vectorize` generates many methods up front, whether
they will be needed or not. With `map`, all you need are scalar
functions and (in theory) one implementation of `map`, and we can
generate specializations just for the functions that are actually
mapped.

4. Some degree of loop fusion (eliminating temporary arrays). Using
`map` makes it less likely that temporary arrays will be allocated,
since writing `map(x->2x+1, x)` naturally does only one pass over `x`,
unlike the vectorized `2x+1`. This is not as good as having the
compiler do this optimization, but it helps a little.
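
A sketch of the single-pass point, using the `2x+1` example from above:

```julia
x = collect(1.0:5.0)

# Vectorized arithmetic allocates an intermediate array for 2x
# before adding 1 to it:
y1 = 2x .+ 1

# map makes a single pass over x and allocates only the result:
y2 = map(xi -> 2xi + 1, x)

y1 == y2                   # same values, fewer temporaries
```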

So what stands in the way of all this awesomeness? The fact that the
syntax `map(f, x)` is uglier than `f(x)`. Hence issue #8450.
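
The syntax concern in issue #8450 was ultimately addressed (in Julia 0.5 and later) by the dot-call syntax, which gives `map`-like semantics a function-call-like spelling:

```julia
# f.(x) is sugar for broadcast(f, x): any scalar function can be
# applied elementwise without the author opting in:
sqrt.([1.0, 4.0, 9.0])     # [1.0, 2.0, 3.0]
```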

datnamer

Feb 10, 2016, 6:28:16 PM
to julia-users
Thanks Jeff, that makes total sense to me :)

Also a bit off topic, but I see that Elemental.jl gives DArrays lots of distributed linear algebra capabilities. How could similar methods be brought to a single-node, on-disk / out-of-core structure? Could it be chunked up for Elemental, or would it require new shared-memory linear algebra routines?