optimization levels in julia?

1,444 views
Skip to first unread message

Meik Hellmund

unread,
Dec 6, 2015, 10:11:05 PM12/6/15
to julia-users
Hi,

It seems that julia does not "optimize away" constant expressions in functions:

julia> f(x)=x+2-2
f (generic function with 1 method)

julia> f(1.e-18)
0.0

This is of course correct in Float64 arithmetics. 
But I wonder: In languages like Fortran and C the result of such code  depends on the "optimization flags" used when compiling. 
With optimization, the compiler would reduce the function to f(x)=x.
Is there something comparable in Julia?

Another test. Compare
    f(x)=x+sin(.34567)

to

    const sn=sin(.34567)
    g(x)=x+sn


julia> y=0; @time(for i in 1:10^9; y=f(y); end)
 22.489526 seconds (2.00 G allocations: 29.802 GB, 3.50% gc time)

julia> y=0; @time(for i in 1:10^9; y=g(y); end)
 16.268512 seconds (1000.00 M allocations: 14.901 GB, 2.61% gc time)


It looks like Julia does not optimize even the simplest constant expressions so that they are  evaluated  only once. 
And why, by all means, does it allocate so much memory for that?
 Is there something I overlook?

 Best wishes, 
Meik

ele...@gmail.com

unread,
Dec 6, 2015, 10:42:34 PM12/6/15
to julia-users


On Monday, December 7, 2015 at 1:11:05 PM UTC+10, Meik Hellmund wrote:
Hi,

It seems that julia does not "optimize away" constant expressions in functions:

julia> f(x)=x+2-2
f (generic function with 1 method)

julia> f(1.e-18)
0.0

This is of course correct in Float64 arithmetics. 
But I wonder: In languages like Fortran and C the result of such code  depends on the "optimization flags" used when compiling. 
With optimization, the compiler would reduce the function to f(x)=x.
Is there something comparable in Julia?

Another test. Compare
    f(x)=x+sin(.34567)

to

    const sn=sin(.34567)
    g(x)=x+sn


julia> y=0; @time(for i in 1:10^9; y=f(y); end)
 22.489526 seconds (2.00 G allocations: 29.802 GB, 3.50% gc time)

julia> y=0; @time(for i in 1:10^9; y=g(y); end)
 16.268512 seconds (1000.00 M allocations: 14.901 GB, 2.61% gc time)

Your `y` is a global variable, that prevents optimisations, putting the above in functions gives me no material difference between the time or allocations.  See http://julia.readthedocs.org/en/latest/manual/performance-tips/

ele...@gmail.com

unread,
Dec 6, 2015, 10:50:38 PM12/6/15
to julia-users
Also if, inside the function, you keep the type of `y` stable by y=0.0 Julia will optimise away the second loop entirely, and do no allocations for the first loop.

Meik Hellmund

unread,
Dec 7, 2015, 4:59:08 AM12/7/15
to julia-users

Thank you for pointing out the problem with a global y and its type. 
I hope my new code example below is better. I want to point at 
the relative performance difference of the functions f and g.  

I am experimenting with Julia for  a couple of weeks and I like it very much. 
In my experience, every language has a different  borderline between 
"The programmer should think about optimal performance" and 
"Don't overoptimize, the compiler is smarter than you". 

So I just want to understand if Julia does 
constant expression evaluation at compile time or not

I think julia 0.4.1 does not. My code:
#-------------------------------------
function stupidtest(f)
    for x in 1 : 10^8
       f(x)
    end
end

f(x) =  x + log(2.) * sin(.34567)^2

const z = log(2.) * sin(.34567)^2
g(x) = x + z

@time( stupidtest(f) )
@time( stupidtest(g) )
#------- EOF------------------------------

$julia --optimize --inline=yes --check-bounds=no --math-mode=fast mycode.jl 
  2.869774 seconds (200.00 M allocations: 2.980 GB, 3.39% gc time)
  1.855530 seconds (200.00 M allocations: 2.980 GB, 4.32% gc time)


It also seems that I (with a C++/Fortran background) do not really understand 
some aspects of the language. Why does a loop with a function call
need so much memory allocations?

Any explanations are appreciated!
 
Cheers, Meik


Milan Bouchet-Valat

unread,
Dec 7, 2015, 5:17:37 AM12/7/15
to julia...@googlegroups.com
I think passing functions as an argument isn't as fast as it could be
at the moment. Try defining two completely different test functions:

julia> f(x) = x + log(2.) * sin(.34567)^2
f (generic function with 1 method)

julia> const z = log(2.) * sin(.34567)^2
0.07957594310953507

julia> g(x) = x + z
g (generic function with 1 method)

julia> function stupidtestf()
for x in 1 : 10^8
f(x)
end
end
stupidtestf (generic function with 1 method)

julia> function stupidtestg()
for x in 1 : 10^8
g(x)
end
end
stupidtestg (generic function with 1 method)

julia> @time( stupidtestf() )
1.623907 seconds (2.13 k allocations:
109.855 KB)

julia> @time stupidtestf()
1.631431 seconds (4 allocations: 160 bytes)

julia> @time stupidtestg()
0.003252 seconds (1.81 k allocations: 92.401 KB)

julia> @time stupidtestg()
0.000002 seconds (4 allocations: 160 bytes)

Here the difference is well visible.


Also, benchmarking should be done after starting Julia and calling the
function once to avoid including compilation time and allocations.


Regards

Kristoffer Carlsson

unread,
Dec 7, 2015, 5:50:08 AM12/7/15
to julia-users
You have already been linked the performance tip link once: http://julia.readthedocs.org/en/latest/manual/performance-tips/

It even explicitly says:

"Unexpected memory allocation is almost always a sign of some problem with your code, usually a problem with type-stability. Consequently, in addition to the allocation itself, it’s very likely that the code generated for your function is far from optimal. Take such indications seriously and follow the advice below."

When you start with something new, it is generally appreciated if you try take some time to read the documentation. Explaining the same thing to everyone is an O(N) effort while writing a good documentation explaining it is O(1).

ele...@gmail.com

unread,
Dec 7, 2015, 6:00:19 AM12/7/15
to julia-users
You are making the assumption that `sin(.34567)` is a constant, but in Julia technically sin is dependent on global state, specifically the rounding and denormalisation modes http://docs.julialang.org/en/release-0.4/stdlib/numbers/?highlight=rounding#Base.set_rounding.  It these change during the loop, then the answer may not be the same, and like all global state Julia can't infer that.  Whereas you have demonstrated that you can tell Julia its a constant with the same value as calculated at initialisation.

Yichao Yu

unread,
Dec 7, 2015, 6:48:43 AM12/7/15
to Julia Users
On Mon, Dec 7, 2015 at 6:00 AM, <ele...@gmail.com> wrote:
> You are making the assumption that `sin(.34567)` is a constant, but in Julia
> technically sin is dependent on global state, specifically the rounding and
> denormalisation modes
> http://docs.julialang.org/en/release-0.4/stdlib/numbers/?highlight=rounding#Base.set_rounding.
> It these change during the loop, then the answer may not be the same, and
> like all global state Julia can't infer that. Whereas you have demonstrated
> that you can tell Julia its a constant with the same value as calculated at
> initialisation.
>

Ref https://github.com/JuliaLang/julia/issues/9942

Erik Schnetter

unread,
Dec 7, 2015, 12:11:38 PM12/7/15
to julia...@googlegroups.com
Use the `@fastmath` macro. This dynamically enables floating-point optimizations in the expression (or function) to which you apply it. Compare `@inbounds`, `@simd`.

-erik

Meik Hellmund

unread,
Dec 8, 2015, 6:23:15 AM12/8/15
to julia-users
Many thanks to all of you answered. I learned a lot.                            
There are two issues I stumbled:                                                
                                                                                
(I) The vast memory allocations of my naive test function:                      
                                                                                
      function stupidtest(f)                                                    
         for x=1. : 1.e8                                                        
            f(x)                                                                
         end                                                                    
      end                                                                       
                                                                                
      @time(stupidtest(sqrt))                                                   
      1.701242 seconds (200.00 M allocations: 2.980 GB, 5.49% gc time)          
                                                                                
    As Milan pointed out, passing functions as an argument and then calling     
    them massively leads to unexpected memory allocations.                      
    The performance tips doc suggests checking for type problems:                
                                                                                                                                                                
     @code_warntype(stupidtest(sqrt))                                            
       ...                                                                      
         (f::F)(x::Int64)::Any            <==         the  '::Any' part is red            
       ...                                                                      
                                                                                                                                                                
    My reading of this is this: the compiler can't interfere                    
    the return type of the function argument and the user can't                 
    (currently?) declare it. Right?                                             
                                                                                
                                                                                 
(II) My other question: constant folding in expressions like                    
           f(x)=x+3.*sin(.3)                                                    
     Yichao Yu pointed to issue #9942, whose opening                            
     post is about constant folding in loops like                               
             for i=1:n;  x+=sin(.34); end                                                                                                                                                    
     So I will watch this issue - and next time I stumble upon something,       
     I will search through the issues. Seems like a great place to dig          
     deeper into the state of the language. There are  proposals 
    (#414,  #13555) about pure function annotation which are relevant here. 
                                   

Best wishes, Meik
 

Stefan Karpinski

unread,
Dec 8, 2015, 12:04:56 PM12/8/15
to Julia Users
The first issue is by far the more serious of the two. Fortunately, Jeff is actively working on this issue (along with several others) so that should be addressed soon on 0.5-dev. The second issue is annoying but relatively easy to code around by manually hoisting computations out of loops when the argument is constant over the loop.

Steven G. Johnson

unread,
Dec 8, 2015, 1:24:53 PM12/8/15
to julia-users
On Monday, December 7, 2015 at 4:59:08 AM UTC-5, Meik Hellmund wrote:
It also seems that I (with a C++/Fortran background) do not really understand 
some aspects of the language. Why does a loop with a function call
need so much memory allocations?

If you do @code_warntype stupidtest(f), you will see that the compiler can't figure out the return-type of f(x), and so it allocates a "box" that can hold any value to store the return value.  Since this allocation happens on every call to f(x), the total memory usage is high.

There are plans to fix this, but in the meantime if you need to pass *small* functions like this as arguments to be called in *inner loops*, then it is better to define a functor type (a type with an overloaded "call" method); for technical reasons, functor calls can be inlined, whereas function-argument calls currently cannot.   (For large/expensive functions, the extra overhead of their calls doesn't matter.)

On a separate note, no, constant expressions like log(2.0) are currently not evaluated at compile-time.  See https://github.com/JuliaLang/julia/issues/14324
Reply all
Reply to author
Forward
0 new messages