None of the below worked, and apologies if it's too silly or documented somewhere, but: how to do a dot product of two vectors?
(Been doing some Copperhead tests for a class project.)
There's also a secondary question about continued development of Copperhead. Friend of mine who attended Bryan's talk at SC'11 says the project is very much alive, but Google code commits are from a year ago. Is there another place one should go looking?
Thanks! Sajith.
from copperhead import *
from itertools import imap
from operator import mul
import numpy
import timeit
import sys
@cu
def dot_product(x, y):
    def elem_wise(xi, yi):
        return xi * yi
    return sum(map(elem_wise, x, y))
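A quick way to sanity-check the map-then-sum structure is to run the same function body as ordinary Python on the CPU, without the @cu decorator. This is only a reference sketch, not Copperhead code: the name dot_product_reference and the sample vectors are made up for illustration, and it says nothing about how the Copperhead runtime behaves.

```python
def dot_product_reference(x, y):
    """Plain-Python reference for the dot product above (no GPU involved)."""
    def elem_wise(xi, yi):
        return xi * yi
    # map() pairs up the elements of x and y; sum() reduces the products.
    return sum(map(elem_wise, x, y))

# Illustrative inputs only.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
result = dot_product_reference(x, y)
print(result)
```

Comparing this value against what the @cu version returns for the same inputs is a cheap correctness check before doing any timing.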
To see what functions you can call from within a Copperhead program, take a look at prelude.py, which has some rudimentary documentation. However, not all functions mentioned there work yet, especially with the code in the main public repository.
The project is very much alive, although I'm currently the only person working on it. During the past year I made a lot of progress (some of which you can see in the public clones on Google Code, especially this one: http://code.google.com/r/bryancatanzaro-copperhead/source/browse), and wrote and defended my dissertation on Copperhead (which you can see here if you're interested: http://www.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-45.html). I then joined Nvidia Research, where I am continuing work on the project, with a focus on making it practically useful, instead of being just an interesting research vehicle. I have completely rewritten the compiler in the past few months, and hope to push a new, more stable version publicly in the next few weeks. Although my new compiler doesn't support all the features the old compiler attempted to support, I am documenting it and testing it, so that I can invite more people like yourself to use it.
Thanks for the interest, and I'll let you know when I push my new compiler publicly.
Take care, bryan
On Sun, Dec 4, 2011 at 7:38 PM, Sajith T S <saj...@gmail.com> wrote:
> None of the below worked, and apologies if it's too silly or
> documented somewhere, but: how to do a dot product of two vectors?
>
> (Been doing some Copperhead tests for a class project.)
>
> There's also a secondary question about continued development of
> Copperhead. Friend of mine who attended Bryan's talk at SC'11 says
> the project is very much alive, but Google code commits are from a
> year ago. Is there another place one should go looking?
>
> Thanks!
> Sajith.
>
> from copperhead import *
> from itertools import imap
> from operator import mul
> import numpy
> import timeit
> import sys
However, neither quite worked for me. This is what I get:
Traceback (most recent call last):
  File "./dot.py", line 39, in <module>
    t1 = do_run(dogpu, "GPU")
  File "./dot.py", line 27, in do_run
    n = t.timeit(count)
  File "/usr/lib64/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "/usr/lib64/python2.7/timeit.py", line 100, in inner
    _func()
  File "./dot.py", line 19, in dogpu
    gpu = dot_product(x, y)
  File "/home/sasasidh/software/lib64/python2.7/site-packages/copperhead-0.1a1-py2.7.egg/copperhead/runtime/cufunction.py", line 56, in __call__
    return P.execute(self, args, kwargs)
  File "/home/sasasidh/software/lib64/python2.7/site-packages/copperhead-0.1a1-py2.7.egg/copperhead/runtime/driver.py", line 60, in execute
    return execute(cufn, *args, **kwargs)
  File "/home/sasasidh/software/lib64/python2.7/site-packages/copperhead-0.1a1-py2.7.egg/copperhead/runtime/driver.py", line 86, in execute
    return_value = compiled_fn(*cu_inputs)
  File "<string>", line 10, in dot_product
  File "/home/sasasidh/software/lib64/python2.7/site-packages/copperhead-0.1a1-py2.7.egg/copperhead/runtime/cubox.py", line 31, in __call__
    return self.fn(*args_cache)
  File "/home/sasasidh/software/lib64/python2.7/site-packages/copperhead-0.1a1-py2.7.egg/copperhead/thrust/reduce.py", line 63, in sum
    result = module.entryPoint(array)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned long long from this Python object of type PooledDeviceAllocation
(The code I'm trying is attached, if you can risk looking at some poor greenhorn lines of Python.)
I'm using Copperhead from Google code main repository. Perhaps I should switch to another clone?
Lately I've had a chance to look at several GPU programming DSLs (Copperhead papers including your dissertation, but admittedly I haven't spent a lot of time with it -- along with Accelerate, Nikola etc), and Copperhead is certainly among the most promising ones. Good luck with the new direction!
I've found Thrust to be very interesting and useful, and I'm looking forward to having Copperhead too with the official CUDA distribution one day. Particularly so since (no offense!) I've found the whole thing a pain to set up, but it's only expected of new code. :)
Out of curiosity, do you think there will ever be an OpenCL backend?
Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> Hi Sajith -
> Thanks for the question. Here are a couple ways to do a dot product that
> should work:
>
> @cu
> def dot_product(x, y):
>     def elem_wise(xi, yi):
>         return xi * yi
>     return sum(map(elem_wise, x, y))
> TypeError: No registered converter was able to produce a C++ rvalue of type unsigned long long from this Python object of type PooledDeviceAllocation
Oh, in fact this is the same error I've been getting from all sample programs except simple_tests.py. Does it suggest that something is wrong with my Copperhead install?
Thanks, Sajith.
-- "the lyf so short, the craft so long to lerne." -- Chaucer.
Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> Yes, I think so. What version of CodePy, PyCUDA and CUDA are you running?
> I'm guessing you're on 64-bit Linux?
>
> - bryan
>
> On Sun, Dec 4, 2011 at 9:52 PM, Sajith T S <saj...@gmail.com> wrote:
> > > TypeError: No registered converter was able to produce a C++ rvalue of
> > > type unsigned long long from this Python object of type
> > > PooledDeviceAllocation
> >
> > Oh, in fact this is the same error I've been getting from all sample
> > programs except simple_tests.py. Does it suggest that something is
> > wrong with my Copperhead install?
> >
> > Thanks,
> > Sajith.
I've seen this bug before - it arises from changes in the way PyCUDA and Boost export the functions PyCUDA provides, which Copperhead programs expect to use. In the past, I've solved it by:
1. Not using PyCUDA's shipped Boost library, and instead using the system Boost library when building PyCUDA.
2. Sometimes I have had to use an older version of Boost. 1.41 has worked for me. I'm not sure if this is absolutely necessary, or if just building PyCUDA with the system Boost library is good enough.
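For anyone repeating the first workaround: PyCUDA's build reads its settings from a siteconf.py file (normally written by configure.py), and that is where the shipped Boost can be switched off. The fragment below is a sketch from memory of a siteconf.py of that era, not something taken from this thread. The key names should be checked against the file your own configure.py generates, and every path and library name is a placeholder for your system.

```python
# siteconf.py (sketch): build PyCUDA against the system Boost instead of the bundled copy.
# Key names are assumed from PyCUDA releases of that period; verify them locally.
USE_SHIPPED_BOOST = False                 # do not compile the Boost bundled with PyCUDA
BOOST_INC_DIR = ["/usr/include"]          # placeholder: where the system Boost headers live
BOOST_LIB_DIR = ["/usr/lib64"]            # placeholder: where the Boost libraries live
BOOST_PYTHON_LIBNAME = ["boost_python"]   # placeholder: library name varies by distribution
CUDA_ROOT = "/usr/local/cuda"             # placeholder: CUDA toolkit location
```

PyCUDA then has to be rebuilt and reinstalled for the change to take effect.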
For what it's worth, the new version of the Copperhead runtime and compiler do not use PyCUDA (although they still use Codepy, another of Andreas Klöckner's projects). In other words, this particular issue is solved in the new version of Copperhead I expect to release shortly. I realize Copperhead is too difficult to install, and I'm working to make this process easier.
Also, I notice from your trace that you're interested in timing Copperhead program execution. A couple things:
1. The first time you run the function, Copperhead has to invoke nvcc, which takes O(10) seconds. Subsequent runs will use a cached binary.
2. If you care about the overhead of moving data back and forth between the CPU and GPU, you should use CuArray objects.
The following code will work, but more slowly:
    a = dot_product(np.array(...), np.array(...))
Copperhead can't control GPU memory placement for numpy arrays, so this code will result in extraneous memory transfers.
Instead, do this:
    x = CuArray(np.array(...))
    y = CuArray(np.array(...))
    a = dot_product(x, y)
This will ensure that data is only moved when necessary.
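The two timing points above suggest measuring the first call separately, so that the one-off nvcc compile is not averaged into the steady-state figure. The sketch below shows that measurement pattern with a stand-in function. fake_gpu_dot and its 0.05-second sleep are hypothetical placeholders for a @cu function and its first-call compile, not Copperhead behaviour, and the code is written for Python 3 (time.perf_counter) rather than the Python 2.7 used in the thread.

```python
import time
import timeit

_compiled = False

def fake_gpu_dot(x, y):
    """Hypothetical stand-in for a @cu function: slow on the first call only."""
    global _compiled
    if not _compiled:
        time.sleep(0.05)  # pretend this is the one-off compile step
        _compiled = True
    return sum(xi * yi for xi, yi in zip(x, y))

# Build the inputs once, outside the timed region (the analogue of
# constructing the CuArray objects before timing).
x = list(range(1000))
y = list(range(1000))

# Time the very first call on its own: it includes the simulated compile.
start = time.perf_counter()
fake_gpu_dot(x, y)
first_call = time.perf_counter() - start

# Later calls reuse the "cached binary", so average over several runs.
runs = 20
steady = timeit.timeit(lambda: fake_gpu_dot(x, y), number=runs) / runs

print("first call: %.4f s, steady state: %.6f s per call" % (first_call, steady))
```

Reporting the two numbers separately keeps the compile cost visible without letting it distort the per-call figure.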
- bryan
> On Sun, Dec 4, 2011 at 11:43 PM, Sajith T S <saj...@gmail.com> wrote:
> > Yes, it's 64-bit Linux. This is what "uname -a" says:
> > CUDA 4.0, Codepy 2011.1, PyCUDA 2011.1.3, cgen 2011.1.
> Ah, yes -- disabling shipped Boost library, and using system Boost (I
> used 1.42) and then rebuilding and re-installing PyCUDA did the trick.
> Thanks!
>
> Thank you for the additional pointers also -- they are very helpful.
>
> Sajith.
Thank you for your patience. I guess I should try testing it to the extreme. You know, the way people are supposed to conduct themselves in mailing lists. So I've got the next set of questions!
First, what would it take to make something like this work?
@cu
def vector_sum(x):
    sum(map((lambda xi: xi if xi > 0 else xi * -1), x))
It dumps a bunch of traceback on me, ending with:
"ValueError: visiting unknown node: <_ast.Expr object at 0x2999950>".
I can send the whole thing if you're interested.
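One plausible reading of that error (an assumption, since the full traceback isn't shown here): the body of vector_sum is a bare expression with no return, and Python's ast module represents a bare expression statement as an Expr node, which matches the node type named in the message. The standard-library sketch below only inspects the syntax tree. It does not involve Copperhead, and whether adding return is enough for the old compiler to accept the lambda is not confirmed anywhere in this thread.

```python
import ast

without_return = """
def vector_sum(x):
    sum(map((lambda xi: xi if xi > 0 else xi * -1), x))
"""

with_return = """
def vector_sum(x):
    return sum(map((lambda xi: xi if xi > 0 else xi * -1), x))
"""

def first_statement_type(source):
    """Return the AST node type name of the first statement in the function body."""
    func = ast.parse(source).body[0]
    return type(func.body[0]).__name__

print(first_statement_type(without_return))
print(first_statement_type(with_return))
```

The first source parses to an expression statement and the second to a return statement, which is the only difference between the two functions.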
Second, have you tried to make Black & Scholes kernel (the one shipped with Nvidia SDK) work with Copperhead? It doesn't look like a line by line translation to Copperhead would work, in the absence of abs(), exp(), sqrt() etc. Do you have suggestions on how to approach this?
Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> Glad to hear that worked!
>
> On Mon, Dec 5, 2011 at 11:02 AM, Sajith T S <saj...@gmail.com> wrote:
> > Ah, yes -- disabling shipped Boost library, and using system Boost (I
> > used 1.42) and then rebuilding and re-installing PyCUDA did the trick.
> > Thanks!
> >
> > Thank you for the additional pointers also -- they are very helpful.
> >
> > Sajith.
Sorry, I hit send before I meant to. If all that's required to get Black-Scholes working with Copperhead is adding abs, sqrt (I think exp is already there), then that's easy. Just add a Python implementation in prelude.py, and make sure the C++ include files that Copperhead includes have definitions.
- bryan
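As a rough idea of what the Python side of such definitions might look like, here is a plain-Python sketch of abs and sqrt helpers and a European call price built from them. Everything here is an assumption for illustration: the function names are made up, prelude.py's actual decorators and type annotations are not shown, the cumulative normal uses the common five-term polynomial approximation (which, as far as I recall, is also the form used in the SDK sample), and the matching C++ definitions Bryan mentions are not covered.

```python
import math

def cu_abs(x):
    """Hypothetical prelude-style abs: plain Python reference implementation."""
    return x if x >= 0 else -x

def cu_sqrt(x):
    """Hypothetical prelude-style sqrt: defers to the math module."""
    return math.sqrt(x)

def cnd(d):
    """Cumulative normal distribution via a five-term polynomial approximation."""
    a1, a2, a3, a4, a5 = (0.31938153, -0.356563782, 1.781477937,
                          -1.821255978, 1.330274429)
    k = 1.0 / (1.0 + 0.2316419 * cu_abs(d))
    poly = k * (a1 + k * (a2 + k * (a3 + k * (a4 + k * a5))))
    # tail is the upper-tail probability for |d|.
    tail = math.exp(-0.5 * d * d) / cu_sqrt(2.0 * math.pi) * poly
    return 1.0 - tail if d > 0 else tail

def call_price(s, x, t, r, v):
    """Black-Scholes European call: spot s, strike x, years t, rate r, volatility v."""
    sqrt_t = cu_sqrt(t)
    d1 = (math.log(s / x) + (r + 0.5 * v * v) * t) / (v * sqrt_t)
    d2 = d1 - v * sqrt_t
    return s * cnd(d1) - x * math.exp(-r * t) * cnd(d2)

# Element-wise over made-up option data, in the same map style as the thread.
stock = [100.0, 90.0]
strike = [100.0, 100.0]
years = [1.0, 0.5]
prices = list(map(lambda s, x, t: call_price(s, x, t, 0.05, 0.2),
                  stock, strike, years))
print(prices)
```

Keeping the per-option computation in one named function mirrors the nested-def style used for elem_wise earlier, so the only data-parallel step is the final map.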
On Tue, Dec 6, 2011 at 12:58 PM, Sajith T S <saj...@gmail.com> wrote:
> Thank you for your patience. I guess I should try testing it to the
> extreme. You know, the way people are supposed to conduct themselves in
> mailing lists. So I've got the next set of questions!
>
> First, what would it take to make something like this work?
>
> @cu
> def vector_sum(x):
>     sum(map((lambda xi: xi if xi > 0 else xi * -1), x))
>
> It dumps a bunch of traceback on me, ending with:
>
> "ValueError: visiting unknown node: <_ast.Expr object at 0x2999950>".
>
> I can send the whole thing if you're interested.
>
> Second, have you tried to make Black & Scholes kernel (the one shipped
> with Nvidia SDK) work with Copperhead? It doesn't look like a line by
> line translation to Copperhead would work, in the absence of abs(),
> exp(), sqrt() etc. Do you have suggestions on how to approach this?
>
> Regards,
> Sajith.
That was the first thing I tried, but that didn't work; doing map(abs, x) outside Copperhead did. I've attached the code I've been trying to run and the traceback, in case you might want to see that.
(I realize that numpy.arange() does not generate negative numbers; but I wasn't exactly interested in that...)
I haven't switched to the new bryancatanzaro-copperhead clone repo yet; maybe I should try doing that?
(For whatever it's worth, a friend of mine and I have been doing a timing comparison between Accelerate and Copperhead for a class project. Copperhead seems to be doing really well in our tests; however, it's surely too soon to draw conclusions, since neither of us is experienced in writing well-performing Haskell, Python, or GPU programs. Still, I thought you might be interested.)
Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> Hi Sajith -
> Thanks for the questions, and don't worry, you're not testing my patience.
> Keep the questions coming!
>
> I would write your function like this:
> @cu
> def vector_sum(x):
>     def elwise(xi):
>         if (xi > 0):
>             return xi
>         else:
>             return -xi
>     return sum(map(elwise, x))
>
> I haven't tried making the Black-Scholes kernel work with Copperhead, so
> I'm not sure.
>
> - bryan
> On Tue, Dec 6, 2011 at 12:58 PM, Sajith T S <saj...@gmail.com> wrote:
> > Hi Bryan,
> > Thank you for your patience. I guess I should try testing it to the > > extreme. You know, the way people are supposed conduct themselves in > > mailing lists. So I've got the next set of questions!
> > First, what would it take to make something like this work?
> > @cu > > def vector_sum(x): > > sum(map((lambda xi: xi if xi > 0 else xi * -1), x))
> > It dumps a bunch of traceback on me, ending with:
> > "ValueError: visiting unknown node: <_ast.Expr object at 0x2999950>".
> > I can send the whole thing if you're interested.
> > Second, have you tried to make Black & Scholes kernel (the one shipped > > with Nvidia SDK) work with Copperhead? It doesn't look like a line by > > line translation to Copperhead would work, in the absence of abs(), > > exp(), sqrt() etc. Do you have suggestions on how to approach this?
> > Regards, > > Sajith.
> > Bryan Catanzaro <bryan.catanz...@gmail.com> wrote: > > > Glad to hear that worked!
> > > On Mon, Dec 5, 2011 at 11:02 AM, Sajith T S <saj...@gmail.com> wrote:
> > > > Ah, yes -- disabling the shipped Boost library, using the system Boost (I used 1.42), and then rebuilding and re-installing PyCUDA did the trick. Thanks!
> > > > Thank you for the additional pointers also -- they are very helpful.
> > > > Sajith.
> > > > Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> > > > > I've seen this bug before - it arises from changes in the way PyCUDA and Boost export the functions PyCUDA provides, which Copperhead programs expect to use. In the past, I've solved it by:
> > > > > 1. Not using PyCUDA's shipped Boost library, and instead using the system Boost library when building PyCUDA.
> > > > > 2. Sometimes I have had to use an older version of Boost. 1.41 has worked for me. I'm not sure if this is absolutely necessary, or if just building PyCUDA with the system Boost library is good enough.
> > > > > For what it's worth, the new versions of the Copperhead runtime and compiler do not use PyCUDA (although they still use CodePy, another of Andreas Klöckner's projects). In other words, this particular issue is solved in the new version of Copperhead I expect to release shortly. I realize Copperhead is too difficult to install, and I'm working to make this process easier.
> > > > > Also, I notice from your trace that you're interested in timing Copperhead program execution. A couple of things:
> > > > > 0. The development clone of Copperhead has significantly reduced runtime overhead compared to the version you're using. You can grab it from here:
> > > > > http://code.google.com/r/bryancatanzaro-copperhead/source/checkout
> > > > > 1. The first time you run the function, Copperhead has to invoke nvcc, which takes O(10) seconds. Subsequent runs will use a cached binary.
> > > > > 2. If you care about the overhead of moving data back and forth between the CPU and GPU, you should use CuArray objects.
> > > > > The following code will work, but more slowly:
> > > > > a = dot_product(np.array(...), np.array(...))
> > > > > Copperhead can't control GPU memory placement for numpy arrays, so this code will result in extraneous memory transfers.
> > > > > Instead, do this:
> > > > > x = CuArray(np.array(...))
> > > > > y = CuArray(np.array(...))
> > > > > a = dot_product(x, y)
> > > > > This will ensure that data is only moved when necessary.
> > > > > - bryan
> > > > > On Sun, Dec 4, 2011 at 11:43 PM, Sajith T S <saj...@gmail.com> wrote:
> > > > > > Yes, it's 64-bit Linux. This is what "uname -a" says:
> > > > > > Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> > > > > > > Yes, I think so. What version of CodePy, PyCUDA and CUDA are you running?
> > > > > > > I'm guessing you're on 64-bit Linux?
> > > > > > > - bryan
> > > > > > > On Sun, Dec 4, 2011 at 9:52 PM, Sajith T S <saj...@gmail.com> wrote:
> > > > > > > > Sajith T S <saj...@gmail.com> wrote:
> > > > > > > > > TypeError: No registered converter was able to produce a C++ rvalue of type unsigned long long from this Python object of type PooledDeviceAllocation
> > > > > > > > Oh, in fact this is the same error I've been getting from all sample programs except simple_tests.py. Does it suggest that something is wrong with my Copperhead install?
> > > > > > > > Thanks,
> > > > > > > > Sajith.
-- "the lyf so short, the craft so long to lerne." -- Chaucer.
I've pushed some changes to the bryancatanzaro-copperhead clone repo that add exp, abs, and sqrt functionality. The following code runs correctly with your tester:
@cu
def vector_sum(x):
    def el_wise(xi):
        return abs(xi)
    return sum(map(el_wise, x))
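For checking the result on the CPU, the same computation (sum of absolute values) can be written in plain Python. This is only a sketch for comparison: vector_sum_reference is a hypothetical name, not something Copperhead provides, and the input deliberately includes negative values so that abs actually matters.

```python
def vector_sum_reference(x):
    # CPU reference: sum of the absolute values of the elements of x.
    return sum(abs(xi) for xi in x)


# Hypothetical test input covering negative, zero, and positive values.
x = [float(i) for i in range(-5, 5)]
print(vector_sum_reference(x))
```

Comparing this value against what the compiled function returns for the same input is a quick sanity check before timing anything.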
- bryan
On Tue, Dec 6, 2011 at 3:31 PM, Sajith T S <saj...@gmail.com> wrote:
> That was the first thing I tried, but that didn't work; doing map(abs, x) outside Copperhead did. I've attached the code I've been trying to run and the traceback, in case you might want to see that.
> (I realize that numpy.arange() does not generate negative numbers; but I wasn't exactly interested in that...)
> I haven't switched to the new bryancatanzaro-copperhead clone repo yet; maybe I should try doing that?
> (For whatever it's worth, friend of mine and I have been doing a timing comparison between Accelerate and Copperhead for a class project. Copperhead seems to be doing really well in our tests; however surely it's too soon to draw conclusions since both of us are not experienced in writing well performing Haskell and/or Python and/or GPU programs. Still, thought you might be interested.)
> Thanks,
> Sajith.
Bryan Catanzaro <bryan.catanz...@gmail.com> wrote:
> I've pushed some changes to the bryancatanzaro-copperhead clone repo that add exp, abs, and sqrt functionality. The following code runs correctly with your tester:
> @cu
> def vector_sum(x):
>     def el_wise(xi):
>         return abs(xi)
>     return sum(map(el_wise, x))
> - bryan
> On Tue, Dec 6, 2011 at 3:31 PM, Sajith T S <saj...@gmail.com> wrote:
> > Hi Bryan,
> > That was the first thing I tried, but that didn't work; doing map(abs, x) outside Copperhead did. I've attached the code I've been trying to run and the traceback, in case you might want to see that.
> > (I realize that numpy.arange() does not generate negative numbers; but I wasn't exactly interested in that...)
> > I haven't switched to the new bryancatanzaro-copperhead clone repo yet; maybe I should try doing that?
> > (For whatever it's worth, friend of mine and I have been doing a timing comparison between Accelerate and Copperhead for a class project. Copperhead seems to be doing really well in our tests; however surely it's too soon to draw conclusions since both of us are not experienced in writing well performing Haskell and/or Python and/or GPU programs. Still, thought you might be interested.)
> > Thanks,
> > Sajith.