Pre-allocation and for-loops

Jérémy Béjanin

May 21, 2015, 7:27:26 PM

to julia...@googlegroups.com

In the performance tips section of the manual, it is recommended to pre-allocate, where convenient, the variables used in for-loops in order to avoid unnecessary allocation and garbage collection.

The example given uses a function xinc(x) that returns an array. Running this function in a for-loop results in a lot of memory allocation and takes significantly more time than the updating version of that same function, xinc!. This is clear to me.
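For reference, the two versions being compared can be sketched roughly as follows. This is a reconstruction of the manual's example written in current Julia 1.x syntax (an assumption; the 2015 manual used older syntax), and the loop length n is a parameter added here for convenience.

```julia
# Allocating version: builds a fresh 3-element array on every call.
xinc(x) = [x, x + 1, x + 2]

function loopinc(n)
    y = 0
    for i in 1:n
        ret = xinc(i)          # new array allocated each iteration
        y += ret[2]
    end
    return y
end

# Updating version: writes into a buffer supplied by the caller.
function xinc!(ret::AbstractVector{T}, x::T) where {T}
    ret[1] = x
    ret[2] = x + 1
    ret[3] = x + 2
    return nothing
end

function loopinc_prealloc(n)
    ret = Vector{Int}(undef, 3)   # allocated once, outside the loop
    y = 0
    for i in 1:n
        xinc!(ret, i)             # reuses the same buffer
        y += ret[2]
    end
    return y
end
```

Calling each function once to compile it and then running @time loopinc(10^7) and @time loopinc_prealloc(10^7) should show the difference in reported allocations; exact figures depend on the Julia version.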

However, if I change xinc(x) to return a float, replacing the line return [x, x+1, x+2] with return x, I notice that the memory use is only 96 bytes, which I assume means that no allocation is happening other than the initial one. In that case there is no advantage to using the xinc! function. I see that the manual suggests that the improvement will only occur with "complex" types. Why is that the case?

My other question concerns the necessity of updating functions: why is it not possible to pre-allocate the variable and use the "normal" function, returning its output in the pre-allocated variable? The space in memory is already there, so why is it not possible to use it? I believe that this is what MATLAB does.

Thanks for your help,

Jeremy

Tim Holy

May 21, 2015, 9:14:26 PM

to julia...@googlegroups.com

On Thursday, May 21, 2015 04:04:46 PM Jérémy Béjanin wrote:
> However, if I change the line in xinc(x) to return a float: return [x, x+1,
> x+2] to return x, I notice that the memory use is only 96 bytes, which I
> assume means that no allocation is happening other than the initial one.
> There is no advantage in that case to using the xinc! function. I see that
> the manual suggests that the improvement will only occur with "complex"
> types. Why is that the case?

That example is there only for illustration (that's why it says "trivial
example"); given what the algorithm does, of course returning x alone would be
sufficient information. A more relevant example might be matrix multiplication,
where you want to return the whole resultant matrix. Basically, the advice
there only applies if, in order to return the result, you need to allocate
memory. For objects that don't need a container of some kind, usually no
memory allocation is required.
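To illustrate the matrix multiplication case, here is a minimal sketch assuming a current Julia 1.x installation, where the in-place routine is LinearAlgebra.mul! (the name differed in 2015-era releases). The matrix sizes are arbitrary.

```julia
using LinearAlgebra

A = rand(100, 100)
B = rand(100, 100)

# Ordinary multiplication: a new 100x100 result matrix is allocated.
C1 = A * B

# Pre-allocate the output once, then have mul! write the product into it.
C2 = similar(C1)
mul!(C2, A, B)
```

Inside a loop that multiplies many matrices of the same size, C2 can be reused on every iteration, which is where the pre-allocation advice pays off.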

The 96 bytes are not really "the initial one," because julia can return single
scalars with no allocation whatsoever. The little bit of memory you see
reported has more to do with interaction with the REPL than anything else.
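One way to keep REPL effects out of the measurement is to take it inside a function, after a warm-up call. The sketch below assumes current Julia 1.x; xinc_scalar, loop_scalar, and measure are hypothetical names introduced for illustration.

```julia
# Hypothetical scalar variant: returns a plain number, no container.
xinc_scalar(x) = x + 1

function loop_scalar(n)
    y = 0
    for i in 1:n
        y += xinc_scalar(i)
    end
    return y
end

function measure(n)
    loop_scalar(n)                    # warm-up call so compilation is not counted
    return @allocated loop_scalar(n)  # bytes allocated by the second call
end
```

The value returned by measure(10^7) is the number of bytes allocated by the loop itself, which should be at or near zero for a scalar return like this one.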

It's also worth pointing out that in julia 0.4, the long-term cost of
allocating memory is a lot lower. In my initial implementations, I usually
don't worry much about allocating memory: I write my algorithms "the easy way"
and wait until I find I need to speed things up before I worry about such
optimizations. (On julia 0.3 it's more of an issue, however, so much so that I
often planned ahead.)

> My other question concerns the necessity of updating functions: why is it not
> possible to pre-allocate the variable and use the "normal" function,
> returning its output in the preallocated variable? The space in memory is
> already there, why is it not possible to use it? I believe that this is
> what MATLAB does.

Reusing memory is on the TODO list, but not the done list. If you're
interested in such topics, issues like
https://github.com/JuliaLang/julia/pull/10084
might be places to get started.

--Tim
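As a small illustration of the second question, the sketch below (current Julia 1.x syntax; f, buf, and old are hypothetical names) shows that plain assignment rebinds the variable name to whatever array the function returns, rather than filling the array that was pre-allocated.

```julia
f(x) = [x, x + 1, x + 2]    # ordinary, allocating function

buf = zeros(Int, 3)         # the "pre-allocated" variable
old = buf                   # second reference to the same array

buf = f(1)                  # rebinds the name buf; old is left untouched
rebinding_reused = buf === old    # false: buf now refers to a new array

buf = old
buf .= f(2)                 # copies the result into the existing array, but
                            # f still allocated a temporary for its return value
inplace_reused = buf === old      # true: same array, new contents
```

Even the broadcast assignment only saves the destination array; avoiding the temporary altogether requires a function such as xinc! that writes into its argument.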
