Comparing timing with Pypy


Diego Javier Zea

Jan 7, 2013, 10:44:34 AM
to julia...@googlegroups.com
I found this post:
http://morepypy.blogspot.com.ar/2011/08/pypy-is-faster-than-c-again-string.html
and I was curious to try it in Julia and see whether it's faster or not.
Here is the result:

julia> function f()
           for i in 1:10000000
               "$i $i"
           end
       end

julia> benchmark(f,"comp_pypy","julia",4)
1x12 DataFrame:
BenchmarkCategory BenchmarkName Iterations TotalWall AverageWall MaxWall MinWall             Timestamp            JuliaVersion    JuliaHash CodeHash      OS
[1,]          "comp_pypy"       "julia"          4   33.5648      8.3912 8.40188 8.38071 "2013-01-07 12:23:20" "0.0.0+106484954.r5387" "53872cd2c9"       NA "Linux"


[ I used John's https://github.com/johnmyleswhite/Benchmark.jl ]

dzea@deepthought:~$ time pypy ej.pypy

real    0m0.406s
user    0m0.392s
sys     0m0.012s


PyPy is much faster here.
Why do you think PyPy is faster on this example?

Stefan Karpinski

Jan 7, 2013, 5:07:38 PM
to Julia Users

tl;dr: That PyPy performance post is very misleading. PyPy only has better printf performance than C in the particular case where you print the same expression twice. Specifically, their JIT figures out that it can decode the value only once instead of twice, which C cannot do because libc's printf works entirely at run time, so PyPy is about twice as fast, since decoding is almost all of the work. If you change the test to anything where you're printing different expressions, PyPy is no faster than C. It seems to me that this post was either intentionally misleading or they didn't understand why their compiler was doing so suspiciously well on this benchmark (any time you're handily beating C is cause for suspicion). Moreover, as soon as PyPy has to print to an output stream instead of just creating a string in memory, the performance tanks, becoming 7x slower than C.
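
To make the "decode once instead of twice" point concrete, here is a minimal sketch in current Julia syntax (not the 2013 syntax used elsewhere in this thread). The names format_twice and format_once are hypothetical, and the code only mimics by hand what the PyPy JIT is described as doing automatically; it is not PyPy's actual implementation.

```julia
# Hypothetical helpers, for illustration only.

# Convert the integer to its decimal string twice, as a run-time printf must.
format_twice(i::Integer) = string(string(i), " ", string(i))

# Convert once and reuse the result. This is only valid because both
# printed expressions are the same value.
function format_once(i::Integer)
    s = string(i)              # the integer-to-decimal conversion, done once
    return string(s, " ", s)
end

format_twice(42) == format_once(42)   # both build "42 42"
```

As soon as the two expressions differ (say i and i+1), the reuse in format_once no longer applies, which matches the observation that the advantage disappears for different expressions.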

Currently Julia's printfd benchmark time is around 55 ms, about 2x slower than C. I'm sure we can get closer with more work on I/O, but there are bigger fish to fry. Your benchmark code isn't actually doing a printf test but rather a string interpolation, which expands at compile time to strcat(string(i)," ",string(i)). Since I don't know what your PyPy benchmark code is doing, it's hard to say how PyPy could be 20x faster. I get the following relative timings in Julia:

julia> @elapsed begin
           for i = 1:100000
               "$i $(i+1)"
           end
       end
0.30051112174987793

julia> @elapsed begin
           for i = 1:100000
               @sprintf("%d %d\n",i,i+1)
           end
       end
0.18344807624816895

julia> @elapsed begin
           open("/dev/null","w") do io
               for i = 1:100000
                   @printf(io,"%d %d\n",i,i+1)
               end
           end
       end
0.10406017303466797

Since we're 2x slower than C for printf, this implies that sprintf is 3.5x slower than C, which is not great, and interpolation is almost 6x slower than C, which is kind of terrible. If PyPy is really 20x faster than our interpolation, though, that still implies that it's 3.33x faster than C, which seems pretty implausible.
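
The arithmetic behind these ratios can be checked with a short sketch. The variable names are made up for illustration; the timings are the ones reported above, the 2x factor is the printf-versus-C figure stated earlier, and the 20x figure is presumably the ratio of the two measurements in the first post.

```julia
# Timings reported above, in seconds for 100000 iterations.
t_interp  = 0.30051112174987793   # "$i $(i+1)"
t_sprintf = 0.18344807624816895   # @sprintf
t_printf  = 0.10406017303466797   # @printf to /dev/null

printf_vs_c = 2.0                 # stated above: printf is about 2x slower than C

# Scale each timing relative to printf, then relative to C.
sprintf_vs_c = printf_vs_c * t_sprintf / t_printf   # about 3.5
interp_vs_c  = printf_vs_c * t_interp / t_printf    # about 5.8, i.e. "almost 6x"

# Ratio of the first post's measurements: 8.3912 s (Julia) vs 0.406 s (PyPy).
pypy_vs_julia = 8.3912 / 0.406                      # about 20.7

# If PyPy were 20x faster than Julia's interpolation, its speed relative to C:
pypy_vs_c = 20 / interp_vs_c                        # about 3.5 (3.33 if 5.8 is rounded up to 6)
```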




Stefan Karpinski

Jan 7, 2013, 5:41:08 PM
to Julia Users
It's possible that gc is really killing us when running for that many iterations.
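
One way to test this guess, sketched in current Julia (assuming version 1.5 or later, where @timed returns a named tuple with time, bytes and gctime fields), is to measure how much of the run is spent in garbage collection. The function name interp_loop is hypothetical, and the loop sums the string lengths only so that the strings are actually used.

```julia
# Hypothetical reproduction of the loop from the first post.
function interp_loop(n)
    total = 0
    for i in 1:n
        total += length("$i $i")   # builds a new string each iteration
    end
    return total
end

interp_loop(10)                          # warm up so compilation is not timed
stats = @timed interp_loop(10_000_000)   # same iteration count as the first post

println("total time: ", stats.time, " s")
println("GC time:    ", stats.gctime, " s")
println("allocated:  ", stats.bytes, " bytes")
```

If the GC time turns out to be a large share of the total, that would support the guess; if it is small, the cost lies elsewhere, for example in the integer-to-string conversion itself.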