compilation warnings + some benchmark


Dorival Pedroso

Jun 26, 2012, 9:36:49 PM
to copperhe...@googlegroups.com
Hi,

I'm just wondering what I'm doing wrong with the following code. Apparently I can't get OpenMP or TBB to speed my code up on my machine (Debian Linux x86_64 -- Intel Xeon X5482 3.2GHz) -- time measurements taken after the 2nd/3rd run, once compilation is cached, of course.

Unfortunately I don't have a CUDA card with double precision capabilities here...

One other small thing: I'm getting the warning "module.cpp:45:17: warning: variable 'compresult' set but not used". Thanks again for the nice work, and best wishes with SciPy 2012. Cheers, Dorival

Output:
Allocation: dt = 3.47743415833
Python:     dt = 0.76766204834
TBB:        dt = 2.15556788445
OpenMP:     dt = 1.36044192314
Total:      dt = 7.76110601425

Code:
import sys
from   copperhead import *
from   numpy      import linspace
from   time       import time

@cu
def xpy(x, y):
    def x_p_y(x, y): return x + y
    return map(x_p_y, x, y)

if __name__=='__main__':

    n = int(sys.argv[1]) if len(sys.argv)>1 else 100000001

    t0  = time()
    x   = linspace( 0.,  1., n)
    y   = linspace(10., 20., n)
    t1  = time()
    dta = t1-t0
    print 'Allocation: dt =', dta

    t0   = time()
    z    = x + y
    t1   = time()
    dtb  = t1-t0
    print 'Python:     dt =', dtb

    t0  = time()
    with places.tbb: z = xpy(x, y)
    t1  = time()
    dtc = t1-t0
    print 'TBB:        dt =', dtc

    t0  = time()
    with places.openmp: z = xpy(x, y)
    t1  = time()
    dtd = t1-t0
    print 'OpenMP:     dt =', dtd
    print 'Total:      dt =', dta+dtb+dtc+dtd

Warning:
recompiling for non-existent cache dir (__pycache__/test_speed.py/xpy/8a92a9192ee662f9c571aa746fce21a1).
__pycache__/test_speed.py/xpy/8a92a9192ee662f9c571aa746fce21a1/module.cpp: In function ‘copperhead::sp_cuarray tbb_tag_xpyFnTupleSeqFloat64SeqFloat64SeqFloat64::_xpy(copperhead::sp_cuarray, copperhead::sp_cuarray)’:
__pycache__/test_speed.py/xpy/8a92a9192ee662f9c571aa746fce21a1/module.cpp:45:17: warning: variable ‘compresult’ set but not used [-Wunused-but-set-variable]
Attachment: test_speed.py

Bryan Catanzaro

Jun 26, 2012, 11:31:22 PM
to copperhe...@googlegroups.com
Hi Dorival -
I haven't been using GCC 4.7, so I haven't seen those warnings before,
but I'm not surprised to see them. I'll work on eliminating them.

As to the benchmark question - there is one thing you can do to
improve things - use cuarrays as the input to your function instead of
numpy arrays. Change the allocation part to look like this:
t0 = time()
x = linspace( 0., 1., n)
y = linspace(10., 20., n)
cuarray_x = cuarray(x)
cuarray_y = cuarray(y)
t1 = time()
dta = t1-t0
print 'Allocation: dt =', dta

And then use cuarray_x and cuarray_y. If you use numpy arrays, the
cost of copying the data into the copperhead data structure is going
to be paid every time you call the function.
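The pattern generalizes beyond Copperhead: do expensive format conversions once, outside the hot path. A toy pure-Python illustration of the same idea (no Copperhead needed; `array.array` stands in for `cuarray`, and `work` is a made-up stand-in for a place call):

```python
from array import array

def work(a):
    # stand-in for a parallel place call; just reduces the buffer
    return sum(a)

data = [float(i) for i in range(1000)]

# Paying the conversion on every call, analogous to passing numpy
# arrays straight into a @cu function:
totals_slow = [work(array('d', data)) for _ in range(3)]

# Converting once up front, analogous to wrapping inputs in cuarray()
# before the timed region:
buf = array('d', data)
totals_fast = [work(buf) for _ in range(3)]

assert totals_slow == totals_fast  # same answers, fewer copies
```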

Using numpy arrays:
(copperhead-new)Elendil:samples catanzar$ python vadd.py 10000000
Allocation: dt = 0.483824968338
Python: dt = 0.0831031799316
TBB: dt = 0.451075077057
OpenMP: dt = 0.193276882172
Total: dt = 1.2112801075

Using cuarrays (on my Core 2 Duo laptop):
(copperhead-new)Elendil:samples catanzar$ python vadd.py 10000000
Allocation: dt = 0.63053393364
Python: dt = 0.111096858978
TBB: dt = 0.20470905304
OpenMP: dt = 0.133543968201
Total: dt = 1.07988381386

Using cuarrays helps - now the OpenMP code is almost as fast as the
native numpy code. However, I wouldn't expect this code to run much
faster than numpy, even when parallelized - the parallelization incurs
some overhead, and the code is basically bandwidth bound anyway. To
see more of a difference, you could try something more compute
intensive (like sort), or create a more complicated program, which
Copperhead would fuse together to reduce memory traffic.
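As a hedged sketch of what a "more complicated, fusible" program might look like: composing several elementwise operations inside one function gives a fusion-capable compiler a single loop to emit instead of three. The names `axpby` and `term` are illustrative, not from the thread, and this runs as plain Python rather than under @cu:

```python
def axpby(a, x, b, y):
    # Two scalings and an add expressed as one map: a candidate for
    # fusion, since a compiler can traverse x and y in a single pass
    # instead of materializing a*x and b*y as intermediates.
    def term(xi, yi):
        return a * xi + b * yi
    return list(map(term, x, y))

print(axpby(2.0, [1.0, 2.0], 3.0, [10.0, 20.0]))  # -> [32.0, 64.0]
```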

For example, calling sort, I see the following:
Python: dt = 2.00371193886
TBB: dt = 2.5909011364
OpenMP: dt = 1.27429485321

- bryan

Dorival Pedroso

Jun 26, 2012, 11:43:07 PM
to copperhe...@googlegroups.com
Sure! It works now -- I was also thinking that we'd need a more compute-intensive function to see greater differences... Thanks! Dorival

Bryan Catanzaro

Jun 27, 2012, 2:45:52 PM
to copperhe...@googlegroups.com
Hi Dorival -
I've pushed an update to github.com/copperhead/copperhead that
attempts to remove some warnings. If you have time to try it out and
let me know whether the warnings are gone, that'd be helpful. In
order to force recompilation, you may need to delete the __pycache__
directory where your copperhead programs live.
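For reference, a minimal way to clear the cache and force recompilation (assuming the default cache location described above; run it from the directory containing your Copperhead scripts):

```shell
# Delete Copperhead's on-disk compilation cache so the next run
# recompiles every @cu function from scratch.
rm -rf __pycache__
```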

Thanks,
bryan


Dorival Pedroso

Jun 28, 2012, 5:18:40 AM
to copperhe...@googlegroups.com
Hi Bryan, I've downloaded the new source tree from scratch, typed python setup.py build (then install with sudo), and deleted the __pycache__ directory in my working dir; but now I'm getting 'No module named np_interop'... My siteconf.py has:
NP_INC_DIR = "/usr/lib/pymodules/python2.7/numpy/core/include"
and the numpy files are in the right place...
Cheers,
Dorival

Bryan Catanzaro

Jun 28, 2012, 11:21:12 AM
to copperhe...@googlegroups.com
Sorry about that, I forgot to commit a file on my end. It should be there now.

- bryan

Dorival Pedroso

Jun 28, 2012, 9:11:56 PM
to copperhe...@googlegroups.com
Thanks! It's all good here now (no warnings). Cheers, Dorival