No module named cuda_ndarray


andrew cooke

Jul 2, 2010, 3:10:05 PM
to theano-users

I'm probably doing something pretty dumb, as I am new to Theano, but I
have just installed from mercurial on OpenSuse 11.2 and, when I try to
run the code described at http://deeplearning.net/software/theano/tutorial/using_gpu.html,
I see the output:

qiy6 Theano: THEANO_FLAGS=device=gpu0 python gpu-test.py
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: No module named cuda_ndarray
WARNING (theano.sandbox.cuda): Cuda is disabled, cuda-based code will thus not be working properly
Looping 1000 times took 4.01663899422 seconds
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753
1.62323285]
Used the cpu
qiy6 Theano: echo $CUDA_ROOT
/usr/local/cuda
qiy6 Theano: nvcc
nvcc fatal : No input files specified; use option --help for more information

I get similar output (No module...) when I run nosetests, at the
start.

Cuda itself works OK (I have been using it for a while; I am looking
at Theano as a way of making my GPU-related development more
productive).

From what I can understand, cuda_ndarray was a separate project that
was merged into Theano, so I should not need to install it? However:

qiy6 Theano: find . -name "cuda_ndarray*"
./theano/sandbox/cuda/cuda_ndarray.cu
./theano/sandbox/cuda/cuda_ndarray.cuh

Finally, running nosetests gives
----------------------------------------------------------------------
Ran 879 tests in 110.248s

FAILED (SKIP=17, errors=4, failures=4)

which may also be worrying? I have no idea if that is expected or
not. This is with the code checked out about an hour ago....

Anyway, is the above sufficient to explain what I may be doing wrong?

Thanks,
Andrew

James Bergstra

Jul 2, 2010, 4:18:47 PM
to theano...@googlegroups.com
Hi Andrew,

That looks like a normal number of test failures. We're working to get to zero, but a few not-too-serious tests still fail.

The cuda_ndarray.cuda_ndarray module is imported relative to Theano's cache directory, which is normally $HOME/.theano/<arch>.

It looks like the compilation of this file worked, but importing it did not. Have a look in sandbox/cuda/__init__.py near line 87 if you want to try to figure it out. I'm afraid I can't offer much more help than that over email based on the info you've provided.
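
One way to check whether a compiled cuda_ndarray module actually landed in the cache is to search the cache directory for it. The following is only a minimal sketch: it assumes the cache root is $HOME/.theano as mentioned above, and it walks every subdirectory rather than guessing the <arch> name or the exact layout. The helper name find_cuda_ndarray is made up for illustration and is not part of Theano.

```python
import os


def find_cuda_ndarray(cache_root):
    """Return every file under cache_root whose name starts with 'cuda_ndarray'."""
    matches = []
    # os.walk silently yields nothing if cache_root does not exist.
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            if name.startswith("cuda_ndarray"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Assumption: the cache lives under $HOME/.theano, as described above.
    root = os.path.join(os.path.expanduser("~"), ".theano")
    for path in find_cuda_ndarray(root):
        print(path)
```

If nothing is printed, the build probably never produced a loadable module, which would be consistent with the "No module named cuda_ndarray" message.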

James

andrew cooke

Jul 2, 2010, 4:47:11 PM
to theano-users

Hi,

Thanks for the reply - helped me know where to look and gave me
confidence it was something to track down...

OK, so if I do:

qiy6 tmp: nvcc -I /usr/include/python2.6 -I /usr/lib64/python2.6/site-packages/numpy/core/include/ /home/andrew/pkg/Theano/theano/sandbox/cuda/cuda_ndarray.cu

I see (note error at the end):

In file included from /usr/include/python2.6/Python.h:8,
                 from /home/andrew/pkg/Theano/theano/sandbox/cuda/cuda_ndarray.cu:1:
/usr/include/python2.6/pyconfig.h:1055:1: warning: "_POSIX_C_SOURCE" redefined
In file included from /usr/local/cuda/bin/../include/host_config.h:74,
                 from /usr/local/cuda/bin/../include/cuda_runtime.h:45,
                 from <command-line>:0:
/usr/include/features.h:158:1: warning: this is the location of the previous definition
In file included from /usr/include/python2.6/Python.h:8,
                 from /home/andrew/pkg/Theano/theano/sandbox/cuda/cuda_ndarray.cu:1:
/usr/include/python2.6/pyconfig.h:1067:1: warning: "_XOPEN_SOURCE" redefined
In file included from /usr/local/cuda/bin/../include/host_config.h:74,
                 from /usr/local/cuda/bin/../include/cuda_runtime.h:45,
                 from <command-line>:0:
/usr/include/features.h:160:1: warning: this is the location of the previous definition
In file included from /usr/include/python2.6/Python.h:8,
                 from /home/andrew/pkg/Theano/theano/sandbox/cuda/cuda_ndarray.cu:1:
/usr/include/python2.6/pyconfig.h:1055:1: warning: "_POSIX_C_SOURCE" redefined
In file included from /usr/local/cuda/bin/../include/host_config.h:74,
                 from /usr/local/cuda/bin/../include/cuda_runtime.h:45,
                 from <command-line>:0:
/usr/include/features.h:158:1: warning: this is the location of the previous definition
In file included from /usr/include/python2.6/Python.h:8,
                 from /home/andrew/pkg/Theano/theano/sandbox/cuda/cuda_ndarray.cu:1:
/usr/include/python2.6/pyconfig.h:1067:1: warning: "_XOPEN_SOURCE" redefined
In file included from /usr/local/cuda/bin/../include/host_config.h:74,
                 from /usr/local/cuda/bin/../include/cuda_runtime.h:45,
                 from <command-line>:0:
/usr/include/features.h:160:1: warning: this is the location of the previous definition
/usr/include/c++/4.4/x86_64-suse-linux/bits/c++locale.h: In function ‘int std::__convert_from_v(__locale_struct* const&, char*, int, const char*, ...)’:
/usr/include/c++/4.4/x86_64-suse-linux/bits/c++locale.h:86: error: ‘__builtin_stdarg_start’ was not declared in this scope



I *think* this may be the source of the problem? Any ideas on how to
fix it? (C++ isn't my idea of fun).

Thanks,
Andrew

andrew cooke

Jul 2, 2010, 4:53:28 PM
to theano-users

Whoa. This may be the gcc 4.3/4.4 issue. Trying 4.3...

Dumitru Erhan

Jul 2, 2010, 4:56:50 PM
to theano...@googlegroups.com
Do you have gcc 4.4?

If yes, straight from our docs:

"There is a compatibility issue affecting some Ubuntu 9.10 users, and probably anyone using CUDA 2.3 with gcc-4.4. Symptom: errors about “__sync_fetch_and_add” being undefined.

Solution 1: make gcc-4.3 the default gcc (http://pascalg.wordpress.com/2010/01/14/cuda-on-ubuntu-9-10linux-mint-helena/).

Solution 2: make another gcc (e.g. gcc-4.3) the default just for nvcc. Do this by making a directory (e.g. $HOME/.theano/nvcc-bindir) and installing two symlinks in it: one called gcc pointing to gcc-4.3 (or lower) and one called g++ pointing to g++-4.3 (or lower). Then add compiler_bindir = /path/to/nvcc-bindir to the [nvcc] section of your .theanorc (libdoc_config)."
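
Solution 2 in the quoted passage can be sketched in a few lines of Python 3. This is a rough illustration, not an official Theano tool: the helper name make_nvcc_bindir is invented here, and the compiler paths in the usage comment (/usr/bin/gcc-4.3 and /usr/bin/g++-4.3) are assumptions that will differ between distributions.

```python
import os


def make_nvcc_bindir(bindir, gcc_path, gxx_path):
    """Create bindir holding 'gcc' and 'g++' symlinks to an older compiler.

    Returns the text to add to .theanorc, as described in the quoted docs.
    """
    os.makedirs(bindir, exist_ok=True)
    for link_name, target in (("gcc", gcc_path), ("g++", gxx_path)):
        link = os.path.join(bindir, link_name)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link left by an earlier run
        os.symlink(target, link)
    return "[nvcc]\ncompiler_bindir = %s\n" % bindir


# Example usage (hypothetical compiler locations, adjust to your system):
#   bindir = os.path.join(os.path.expanduser("~"), ".theano", "nvcc-bindir")
#   print(make_nvcc_bindir(bindir, "/usr/bin/gcc-4.3", "/usr/bin/g++-4.3"))
```

The returned snippet still has to be pasted into .theanorc by hand; the sketch deliberately does not edit that file.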


The symptom is different, but the solution will probably work for you,
Dumitru
--
http://dumitru.ca, +1-330-DOOMIE-3

andrew cooke

Jul 2, 2010, 5:08:27 PM
to theano-users

Well, that seems to have worked, except for the results: the last line
below, which seems to be based on inspecting your AST, still says the
CPU was used. (The times aren't very reliable. I'm currently thrashing
the disks on this machine with another test, so everything is erratic.)

qiy6 Theano: python gpu-test.py
Using gpu device 0: GeForce 8400 GS
Looping 1000 times took 7.32045483589 seconds
Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815
2.29967753
1.62323285]
Used the cpu <<<<<<<<<<<<<<<<?????

Andrew



James Bergstra

Jul 2, 2010, 5:26:52 PM
to theano...@googlegroups.com
Congrats - it looks like the cuda_ndarray built ok.

Using the GPU currently requires float32 (single-precision) arrays, though. Try setting THEANO_FLAGS=floatX=float32 so that the math is GPU-compatible. Are you doing that?
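
Since THEANO_FLAGS is a comma-separated list of key=value pairs, the device flag used earlier in the thread and the floatX flag can be combined, as in THEANO_FLAGS=device=gpu0,floatX=float32 python gpu-test.py. A small sketch of building that environment from Python follows; the helper name theano_env is made up for illustration and is not a Theano function.

```python
import os


def theano_env(base_env=None, **flags):
    """Return a copy of the environment with THEANO_FLAGS built from flags."""
    env = dict(os.environ if base_env is None else base_env)
    # THEANO_FLAGS is a comma-separated list of key=value pairs.
    env["THEANO_FLAGS"] = ",".join(
        "%s=%s" % (key, value) for key, value in sorted(flags.items())
    )
    return env


env = theano_env(device="gpu0", floatX="float32")
print(env["THEANO_FLAGS"])  # device=gpu0,floatX=float32

# To launch the tutorial script with these flags (gpu-test.py is the
# script name used earlier in this thread):
#   import subprocess, sys
#   subprocess.call([sys.executable, "gpu-test.py"], env=env)
```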

James

Dumitru Erhan

Jul 2, 2010, 5:29:12 PM
to theano...@googlegroups.com


Our docs are a bit broken: the example he's running (http://deeplearning.net/software/theano/tutorial/using_gpu.html#putting-it-all-together) fails to specify floatX=float32. I'll push a fix.

Dumitru



--
http://dumitru.ca, +1-330-DOOMIE-3

andrew cooke

Jul 2, 2010, 5:37:56 PM
to theano-users

Awesome. That worked; thanks. Sorry - I did see the floatX mentioned
elsewhere, but it didn't click.

The tests are now giving memory errors, but I see that's a known issue
(I need to try subdirs in turn). I am using a very poor card on this
machine for testing - but we have some nice Teslas at work :o)
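
Running the test subdirectories in turn could look roughly like the sketch below. It assumes nosetests is on the PATH and accepts a directory argument; the helper names list_subdirs and run_in_turn are invented for illustration, and the root directory in the usage comment is only a guess at a sensible starting point.

```python
import os
import subprocess


def list_subdirs(root):
    """Return the immediate subdirectories of root, sorted by name."""
    return sorted(
        os.path.join(root, name)
        for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )


def run_in_turn(root, runner=("nosetests",)):
    """Run the test runner once per subdirectory; return {path: exit code}."""
    results = {}
    for path in list_subdirs(root):
        # Each run is a separate process, so whatever one run allocates
        # is released before the next one starts.
        results[path] = subprocess.call(list(runner) + [path])
    return results


# Example usage (hypothetical root directory):
#   for path, code in run_in_turn("theano").items():
#       print(path, "ok" if code == 0 else "exit code %d" % code)
```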

Thanks again for the help,
Andrew
