pygpu.gpuarray.GpuArrayException


Tecx Nitrom

Jan 27, 2017, 6:21:01 AM
to theano-users

I am trying to configure Theano to use the GPU, but I get a GpuArrayException:


Using Theano backend.
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 164, in <module>
    use(config.device)
  File "/usr/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 151, in use
    init_dev(device)
  File "/usr/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 100, in init_dev
    pygpu.blas.gemm(0, tmp, tmp, 0, tmp, overwrite_c=True)
  File "pygpu/blas.pyx", line 129, in pygpu.blas.gemm (pygpu/blas.c:3354)
  File "pygpu/blas.pyx", line 44, in pygpu.blas.pygpu_blas_rgemm (pygpu/blas.c:2011)
pygpu.gpuarray.GpuArrayException: (b'Unsupported operation', 5)
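
The traceback shows the failure in the small GEMM call that Theano's init_dev issues while initializing the device. One way to tell a libgpuarray or driver problem from a Theano problem might be to repeat that step outside Theano. The sketch below is only an illustration: the helper name probe_gemm is hypothetical, and the use of pygpu.init and pygpu.empty to create the temporary array is an assumption. Only the gemm call itself is copied from the traceback.

```python
def probe_gemm(device="cuda0"):
    """Try the same tiny GEMM that appears in the traceback.

    Returns (ok, message). `probe_gemm` is a hypothetical helper and is not
    part of Theano or pygpu.
    """
    try:
        import pygpu
        import pygpu.blas
    except Exception as exc:
        # pygpu (or the libgpuarray shared library it wraps) could not be loaded.
        return False, "pygpu is not importable: %s" % exc

    try:
        # Assumption: pygpu.init() takes the same device string as Theano's
        # `device` flag, for example "cuda0" or "opencl0:1".
        ctx = pygpu.init(device)
        tmp = pygpu.empty((2, 2), dtype="float32", context=ctx)
        # This is the call reported at line 100 of theano/gpuarray/__init__.py.
        pygpu.blas.gemm(0, tmp, tmp, 0, tmp, overwrite_c=True)
    except Exception as exc:  # for example pygpu.gpuarray.GpuArrayException
        return False, "%s: %s" % (type(exc).__name__, exc)
    return True, "gemm succeeded on %s" % device


if __name__ == "__main__":
    print(probe_gemm("cuda0"))
```

If the probe fails with the same "Unsupported operation" message, the problem is probably below Theano, in pygpu/libgpuarray or the BLAS backend for that device.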

Frédéric Bastien

Jan 27, 2017, 8:45:36 AM
to theano-users
Hi,

To be sure: are you using the development version of Theano, not Theano 0.8.2? And did you also install the development version of libgpuarray?

What are your Theano flags?

Fred
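
For readers who are unsure which versions they have installed, the following is a minimal sketch of one way to collect them. The helper name module_versions is hypothetical. It assumes that each module exposes a __version__ attribute and reports "unknown" when it does not. Note that importing theano also applies whatever Theano flags are set, so the import itself may trigger device initialization.

```python
import importlib


def module_versions(names=("theano", "pygpu")):
    """Map each module name to its __version__, "unknown", or None if missing.

    `module_versions` is a hypothetical helper written for this thread.
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # The module is missing or failed to import.
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in module_versions().items():
        print(name, version if version is not None else "not importable")
```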

--

---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tecx Nitrom

Jan 27, 2017, 9:08:46 AM
to theano...@googlegroups.com
I do not know the Theano version, but it is the latest pull from the Theano git repo.


Frédéric Bastien

Jan 27, 2017, 9:16:31 AM
to theano-users
Try to answer all the questions in my email. Otherwise it will delay the help we can offer.

What are your Theano flags?

Tecx Nitrom

Jan 27, 2017, 9:31:09 AM
to theano...@googlegroups.com
Sorry, my bad. Here is the full configuration:

floatX (('float64', 'float32', 'float16'))
    Doc:  Default floating-point precision for python casts.

Note: float16 support is experimental, use at your own risk.
    Value:  float32

warn_float64 (('ignore', 'warn', 'raise', 'pdb'))
    Doc:  Do an action when a tensor variable with float64 dtype is created. They can't be run on the GPU with the current(old) gpu back-end and are slow with gamer GPUs.
    Value:  ignore

cast_policy (('custom', 'numpy+floatX'))
    Doc:  Rules for implicit type casting
    Value:  custom

int_division (('int', 'raise', 'floatX'))
    Doc:  What to do when one computes x / y, where both x and y are of integer types
    Value:  int

device (cpu, gpu*, opencl*, cuda*)
    Doc:  Default device for computations. If cuda* or opencl*, change thedefault to try to move computation to the GPU. Do not use upper caseletters, only lower case even if NVIDIA uses capital letters.
    Value:  opencl0:1

init_gpu_device (, gpu*, opencl*, cuda*)
    Doc:  Initialize the gpu device to use, works only if device=cpu. Unlike 'device', setting this option will NOT move computations, nor shared variables, to the specified GPU. It can be used to run GPU-specific tests on a particular GPU.
    Value: 

force_device (<function BoolParam.<locals>.booltype at 0x7f808bfb1840>)
    Doc:  Raise an error if we can't use the specified device
    Value:  False

print_global_stats (<function BoolParam.<locals>.booltype at 0x7f808bfb19d8>)
    Doc:  Print some global statistics (time spent) at the end
    Value:  False

<theano.configdefaults.ContextsParam object at 0x7f808bfb09b0>
    Doc: 
    Context map for multi-gpu operation. Format is a
    semicolon-separated list of names and device names in the
    'name->dev_name' format. An example that would map name 'test' to
    device 'cuda0' and name 'test2' to device 'opencl0:0' follows:
    "test->cuda0;test2->opencl0:0".

    Invalid context names are 'cpu', 'cuda*' and 'opencl*'
   
    Value: 

print_active_device (<function BoolParam.<locals>.booltype at 0x7f808bfb1c80>)
    Doc:  Print active device at when the GPU device is initialized.
    Value:  True

enable_initial_driver_test (<function BoolParam.<locals>.booltype at 0x7f808bfb1e18>)
    Doc:  Tests the nvidia driver when a GPU device is initialized.
    Value:  True

cuda.root (<class 'str'>)
    Doc:  directory with bin/, lib/, include/ for cuda utilities.
       This directory is included via -L and -rpath when linking
       dynamically compiled modules.  If AUTO and nvcc is in the
       path, it will use one of nvcc parent directory.  Otherwise
       /usr/local/cuda will be used.  Leave empty to prevent extra
       linker directives.  Default: environment variable "CUDA_ROOT"
       or else "AUTO".
      
    Value: 

<theano.configparser.ConfigParam object at 0x7f808bfb0b38>
    Doc:  Extra compiler flags for nvcc
    Value: 

nvcc.compiler_bindir (<class 'str'>)
    Doc:  If defined, nvcc compiler driver will seek g++ and gcc in this directory
    Value: 

nvcc.fastmath (<function BoolParam.<locals>.booltype at 0x7f808bfb7268>)
    Doc: 
    Value:  False

gpuarray.sync (<function BoolParam.<locals>.booltype at 0x7f808bfb7400>)
    Doc:  If True, every op will make sure its work is done before
                returning.  Setting this to True will slow down execution,
                but give much more accurate results in profiling.
    Value:  False

gpuarray.preallocate (<class 'float'>)
    Doc:  If negative it disables the allocation cache. If
             between 0 and 1 it enables the allocation cache and
             preallocates that fraction of the total GPU memory.  If 1
             or greater it will preallocate that amount of memory (in
             megabytes).
    Value:  0.0

gpuarray.sched (('default', 'multi', 'single'))
    Doc:  The sched parameter passed for context creation to pygpu.
                With CUDA, using "multi" is equivalent to using the parameter
                cudaDeviceScheduleYield. This is useful to lower the
                CPU overhead when waiting for GPU. One user found that it
                speeds up his other processes that was doing data augmentation.
            
    Value:  default

gpuarray.single_stream (<function BoolParam.<locals>.booltype at 0x7f808bfb76a8>)
    Doc: 
             If your computations are mostly lots of small elements,
             using single-stream will avoid the synchronization
             overhead and usually be faster.  For larger elements it
             does not make a difference yet.  In the future when true
             multi-stream is enabled in libgpuarray, this may change.
             If you want to make sure to have optimal performance,
             check both options.
            
    Value:  True

<theano.configparser.ConfigParam object at 0x7f808bfb0e48>
    Doc:  This flag is deprecated; use dnn.conv.algo_fwd.
    Value:  True

<theano.configparser.ConfigParam object at 0x7f808bfb0eb8>
    Doc:  This flag is deprecated; use `dnn.conv.algo_bwd_filter` and `dnn.conv.algo_bwd_data` instead.
    Value:  True

<theano.configparser.ConfigParam object at 0x7f808bfb9048>
    Doc:  This flag is deprecated; use dnn.conv.algo_bwd_data and dnn.conv.algo_bwd_filter.
    Value:  True

dnn.conv.algo_fwd (('small', 'none', 'large', 'fft', 'fft_tiling', 'winograd', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'))
    Doc:  Default implementation to use for cuDNN forward convolution.
    Value:  small

dnn.conv.algo_bwd_data (('none', 'deterministic', 'fft', 'fft_tiling', 'winograd', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'))
    Doc:  Default implementation to use for cuDNN backward convolution to get the gradients of the convolution with regard to the inputs.
    Value:  none

dnn.conv.algo_bwd_filter (('none', 'deterministic', 'fft', 'small', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'))
    Doc:  Default implementation to use for cuDNN backward convolution to get the gradients of the convolution with regard to the filters.
    Value:  none

dnn.conv.precision (('as_input_f32', 'as_input', 'float16', 'float32', 'float64'))
    Doc:  Default data precision to use for the computation in cuDNN convolutions (defaults to the same dtype as the inputs of the convolutions, or float32 if inputs are float16).
    Value:  as_input_f32

dnn.include_path (<class 'str'>)
    Doc:  Location of the cudnn header (defaults to the cuda root)
    Value: 

dnn.library_path (<class 'str'>)
    Doc:  Location of the cudnn header (defaults to the cuda root)
    Value: 

dnn.enabled (('auto', 'True', 'False'))
    Doc:  'auto', use cuDNN if available, but silently fall back to not using it if not present. If True and cuDNN can not be used, raise an error. If False, disable cudnn
    Value:  auto

assert_no_cpu_op (('ignore', 'warn', 'raise', 'pdb'))
    Doc:  Raise an error/warning if there is a CPU op in the computational graph.
    Value:  ignore

mode (('Mode', 'DebugMode', 'FAST_RUN', 'NanGuardMode', 'FAST_COMPILE', 'DEBUG_MODE'))
    Doc:  Default compilation mode
    Value:  Mode

cxx (<class 'str'>)
    Doc:  The C++ compiler to use. Currently only g++ is supported, but supporting additional compilers should not be too difficult. If it is empty, no C++ code is compiled.
    Value:  /usr/bin/g++

linker (('cvm', 'c|py', 'py', 'c', 'c|py_nogc', 'vm', 'vm_nogc', 'cvm_nogc'))
    Doc:  Default linker used if the theano flags mode is Mode
    Value:  cvm

allow_gc (<function BoolParam.<locals>.booltype at 0x7f808bfbbae8>)
    Doc:  Do we default to delete intermediate results during Theano function calls? Doing so lowers the memory requirement, but asks that we reallocate memory at the next function call. This is implemented for the default linker, but may not work for all linkers.
    Value:  True

optimizer (('fast_run', 'merge', 'fast_compile', 'None'))
    Doc:  Default optimizer. If not None, will use this optimizer with the Mode
    Value:  fast_run

optimizer_verbose (<function BoolParam.<locals>.booltype at 0x7f808bfbbd08>)
    Doc:  If True, we print all optimization being applied
    Value:  False

on_opt_error (('warn', 'raise', 'pdb', 'ignore'))
    Doc:  What to do when an optimization crashes: warn and skip it, raise the exception, or fall into the pdb debugger.
    Value:  warn

<theano.configparser.ConfigParam object at 0x7f808bfb9da0>
    Doc:  This config option was removed in 0.5: do not use it!
    Value:  True

nocleanup (<function BoolParam.<locals>.booltype at 0x7f808bf4c048>)
    Doc:  Suppress the deletion of code files that did not compile cleanly
    Value:  False

on_unused_input (('raise', 'warn', 'ignore'))
    Doc:  What to do if a variable in the 'inputs' list of  theano.function() is not used in the graph.
    Value:  raise

tensor.cmp_sloppy (<class 'int'>)
    Doc:  Relax tensor._allclose (0) not at all, (1) a bit, (2) more
    Value:  0

tensor.local_elemwise_fusion (<function BoolParam.<locals>.booltype at 0x7f808bf4c378>)
    Doc:  Enable or not in fast_run mode(fast_run optimization) the elemwise fusion optimization
    Value:  True

gpu.local_elemwise_fusion (<function BoolParam.<locals>.booltype at 0x7f808bf4c510>)
    Doc:  Enable or not in fast_run mode(fast_run optimization) the gpu elemwise fusion optimization
    Value:  True

lib.amdlibm (<function BoolParam.<locals>.booltype at 0x7f808bf4c6a8>)
    Doc:  Use amd's amdlibm numerical library
    Value:  False

gpuelemwise.sync (<function BoolParam.<locals>.booltype at 0x7f808bf4c840>)
    Doc:  when true, wait that the gpu fct finished and check it error code.
    Value:  True

traceback.limit (<class 'int'>)
    Doc:  The number of stack to trace. -1 mean all.
    Value:  8

traceback.compile_limit (<class 'int'>)
    Doc:  The number of stack to trace to keep during compilation. -1 mean all. If greater then 0, will also make us save Theano internal stack trace.
    Value:  0

experimental.unpickle_gpu_on_cpu (<function BoolParam.<locals>.booltype at 0x7f808bf4cae8>)
    Doc:  Allow unpickling of pickled CudaNdarrays as numpy.ndarrays.This is useful, if you want to open a CudaNdarray without having cuda installed.If you have cuda installed, this will force unpickling tobe done on the cpu to numpy.ndarray.Please be aware that this may get you access to the data,however, trying to unpicke gpu functions will not succeed.This flag is experimental and may be removed any time, whengpu<>cpu transparency is solved.
    Value:  False

numpy.seterr_all (('ignore', 'warn', 'raise', 'call', 'print', 'log', 'None'))
    Doc:  ("Sets numpy's behaviour for floating-point errors, ", "see numpy.seterr. 'None' means not to change numpy's default, which can be different for different numpy releases. This flag sets the default behaviour for all kinds of floating-point errors, its effect can be overriden for specific errors by the following flags: seterr_divide, seterr_over, seterr_under and seterr_invalid.")
    Value:  ignore

numpy.seterr_divide (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log'))
    Doc:  Sets numpy's behavior for division by zero, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.
    Value:  None

numpy.seterr_over (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log'))
    Doc:  Sets numpy's behavior for floating-point overflow, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.
    Value:  None

numpy.seterr_under (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log'))
    Doc:  Sets numpy's behavior for floating-point underflow, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.
    Value:  None

numpy.seterr_invalid (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log'))
    Doc:  Sets numpy's behavior for invalid floating-point operation, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.
    Value:  None

warn.ignore_bug_before (('0.7', 'None', 'all', '0.3', '0.4', '0.4.1', '0.5', '0.6', '0.7', '0.8', '0.8.1', '0.8.2', '0.9'))
    Doc:  If 'None', we warn about all Theano bugs found by default. If 'all', we don't warn about Theano bugs found by default. If a version, we print only the warnings relative to Theano bugs found after that version. Warning for specific bugs can be configured with specific [warn] flags.
    Value:  0.7

warn.argmax_pushdown_bug (<function BoolParam.<locals>.booltype at 0x7f808bfc00d0>)
    Doc:  Warn if in past version of Theano we generated a bug with the theano.tensor.nnet.nnet.local_argmax_pushdown optimization. Was fixed 27 may 2010
    Value:  False

warn.gpusum_01_011_0111_bug (<function BoolParam.<locals>.booltype at 0x7f808bfc0268>)
    Doc:  Warn if we are in a case where old version of Theano had a silent bug with GpuSum pattern 01,011 and 0111 when the first dimensions was bigger then 4096. Was fixed 31 may 2010
    Value:  False

warn.sum_sum_bug (<function BoolParam.<locals>.booltype at 0x7f808bfc0400>)
    Doc:  Warn if we are in a case where Theano version between version 9923a40c7b7a and the 2 august 2010 (fixed date), generated an error in that case. This happens when there are 2 consecutive sums in the graph, bad code was generated. Was fixed 2 August 2010
    Value:  False

warn.sum_div_dimshuffle_bug (<function BoolParam.<locals>.booltype at 0x7f808bfc0598>)
    Doc:  Warn if previous versions of Theano (between rev. 3bd9b789f5e8, 2010-06-16, and cfc6322e5ad4, 2010-08-03) would have given incorrect result. This bug was triggered by sum of division of dimshuffled tensors.
    Value:  False

warn.subtensor_merge_bug (<function BoolParam.<locals>.booltype at 0x7f808bfc0730>)
    Doc:  Warn if previous versions of Theano (before 0.5rc2) could have given incorrect results when indexing into a subtensor with negative stride (for instance, for instance, x[a:b:-1][c]).
    Value:  False

warn.gpu_set_subtensor1 (<function BoolParam.<locals>.booltype at 0x7f808bfc08c8>)
    Doc:  Warn if previous versions of Theano (before 0.6) could have given incorrect results when moving to the gpu set_subtensor(x[int vector], new_value)
    Value:  False

warn.vm_gc_bug (<function BoolParam.<locals>.booltype at 0x7f808bfc0a60>)
    Doc:  There was a bug that existed in the default Theano configuration, only in the development version between July 5th 2012 and July 30th 2012. This was not in a released version. If your code was affected by this bug, a warning will be printed during the code execution if you use the `linker=vm,vm.lazy=True,warn.vm_gc_bug=True` Theano flags. This warning is disabled by default as the bug was not released.
    Value:  False

warn.signal_conv2d_interface (<function BoolParam.<locals>.booltype at 0x7f808bfc0bf8>)
    Doc:  Warn we use the new signal.conv2d() when its interface changed mid June 2014
    Value:  False

warn.reduce_join (<function BoolParam.<locals>.booltype at 0x7f808bfc0d90>)
    Doc:  Your current code is fine, but Theano versions prior to 0.7 (or this development version) might have given an incorrect result. To disable this warning, set the Theano flag warn.reduce_join to False. The problem was an optimization, that modified the pattern "Reduce{scalar.op}(Join(axis=0, a, b), axis=0)", did not check the reduction axis. So if the reduction axis was not 0, you got a wrong answer.
    Value:  False

warn.inc_set_subtensor1 (<function BoolParam.<locals>.booltype at 0x7f808bfc0f28>)
    Doc:  Warn if previous versions of Theano (before 0.7) could have given incorrect results for inc_subtensor and set_subtensor when using some patterns of advanced indexing (indexing with one vector or matrix of ints).
    Value:  False

warn.round (<function BoolParam.<locals>.booltype at 0x7f808bf52158>)
    Doc:  Round changed its default from Seed to use for randomized unit tests. Special value 'random' means using a seed of None.
    Value:  True

compute_test_value (('off', 'ignore', 'warn', 'raise', 'pdb'))
    Doc:  If 'True', Theano will run each op at graph build time, using Constants, SharedVariables and the tag 'test_value' as inputs to the function. This helps the user track down problems in the graph before it gets optimized.
    Value:  off

print_test_value (<function BoolParam.<locals>.booltype at 0x7f808bf52378>)
    Doc:  If 'True', the __eval__ of a Theano variable will return its test_value when this is available. This has the practical conseguence that, e.g., in debugging `my_var` will print the same as `my_var.tag.test_value` when a test value is defined.
    Value:  False

compute_test_value_opt (('off', 'ignore', 'warn', 'raise', 'pdb'))
    Doc:  For debugging Theano optimization only. Same as compute_test_value, but is used during Theano optimization
    Value:  off

unpickle_function (<function BoolParam.<locals>.booltype at 0x7f808bf52598>)
    Doc:  Replace unpickled Theano functions with None. This is useful to unpickle old graphs that pickled them when it shouldn't
    Value:  True

reoptimize_unpickled_function (<function BoolParam.<locals>.booltype at 0x7f808bf52730>)
    Doc:  Re-optimize the graph when a theano function is unpickled from the disk.
    Value:  False

exception_verbosity (('low', 'high'))
    Doc:  If 'low', the text of exceptions will generally refer to apply nodes with short names such as Elemwise{add_no_inplace}. If 'high', some exceptions will also refer to apply nodes with long descriptions  like:
    A. Elemwise{add_no_inplace}
            B. log_likelihood_v_given_h
            C. log_likelihood_h
    Value:  low

openmp (<function BoolParam.<locals>.booltype at 0x7f808bf52950>)
    Doc:  Allow (or not) parallel computation on the CPU with OpenMP. This is the default value used when creating an Op that supports OpenMP parallelization. It is preferable to define it via the Theano configuration file ~/.theanorc or with the environment variable THEANO_FLAGS. Parallelization is only done for some operations that implement it, and even for operations that implement parallelism, each operation is free to respect this flag or not. You can control the number of threads used with the environment variable OMP_NUM_THREADS. If it is set to 1, we disable openmp in Theano by default.
    Value:  False

openmp_elemwise_minsize (<class 'int'>)
    Doc:  If OpenMP is enabled, this is the minimum size of vectors for which the openmp parallelization is enabled in element wise ops.
    Value:  200000

check_input (<function BoolParam.<locals>.booltype at 0x7f808bf52b70>)
    Doc:  Specify if types should check their input in their C code. It can be used to speed up compilation, reduce overhead (particularly for scalars) and reduce the number of generated C files.
    Value:  True

cache_optimizations (<function BoolParam.<locals>.booltype at 0x7f808bf52d08>)
    Doc:  WARNING: work in progress, does not work yet. Specify if the optimization cache should be used. This cache will any optimized graph and its optimization. Actually slow downs a lot the first optimization, and could possibly still contains some bugs. Use at your own risks.
    Value:  False

unittests.rseed (<class 'str'>)
    Doc:  Seed to use for randomized unit tests. Special value 'random' means using a seed of None.
    Value:  666

NanGuardMode.nan_is_error (<function BoolParam.<locals>.booltype at 0x7f808bf54048>)
    Doc:  Default value for nan_is_error
    Value:  True

NanGuardMode.inf_is_error (<function BoolParam.<locals>.booltype at 0x7f808bf541e0>)
    Doc:  Default value for inf_is_error
    Value:  True

NanGuardMode.big_is_error (<function BoolParam.<locals>.booltype at 0x7f808bf54378>)
    Doc:  Default value for big_is_error
    Value:  True

NanGuardMode.action (('raise', 'warn', 'pdb'))
    Doc:  What NanGuardMode does when it finds a problem
    Value:  raise

optimizer_excluding (<class 'str'>)
    Doc:  When using the default mode, we will remove optimizer with these tags. Separate tags with ':'.
    Value: 

optimizer_including (<class 'str'>)
    Doc:  When using the default mode, we will add optimizer with these tags. Separate tags with ':'.
    Value: 

optimizer_requiring (<class 'str'>)
    Doc:  When using the default mode, we will require optimizer with these tags. Separate tags with ':'.
    Value: 

DebugMode.patience (<class 'int'>)
    Doc:  Optimize graph this many times to detect inconsistency
    Value:  10

DebugMode.check_c (<function BoolParam.<locals>.booltype at 0x7f808bf548c8>)
    Doc:  Run C implementations where possible
    Value:  True

DebugMode.check_py (<function BoolParam.<locals>.booltype at 0x7f808bf54a60>)
    Doc:  Run Python implementations where possible
    Value:  True

DebugMode.check_finite (<function BoolParam.<locals>.booltype at 0x7f808bf54bf8>)
    Doc:  True -> complain about NaN/Inf results
    Value:  True

DebugMode.check_strides (<class 'int'>)
    Doc:  Check that Python- and C-produced ndarrays have same strides. On difference: (0) - ignore, (1) warn, or (2) raise error
    Value:  0

DebugMode.warn_input_not_reused (<function BoolParam.<locals>.booltype at 0x7f808bf54ea0>)
    Doc:  Generate a warning when destroy_map or view_map says that an op works inplace, but the op did not reuse the input for its output.
    Value:  True

DebugMode.check_preallocated_output (<class 'str'>)
    Doc:  Test thunks with pre-allocated memory as output storage. This is a list of strings separated by ":". Valid values are: "initial" (initial storage in storage map, happens with Scan),"previous" (previously-returned memory), "c_contiguous", "f_contiguous", "strided" (positive and negative strides), "wrong_size" (larger and smaller dimensions), and "ALL" (all of the above).
    Value: 

DebugMode.check_preallocated_output_ndim (<class 'int'>)
    Doc:  When testing with "strided" preallocated output memory, test all combinations of strides over that number of (inner-most) dimensions. You may want to reduce that number to reduce memory or time usage, but it is advised to keep a minimum of 2.
    Value:  4

profiling.time_thunks (<function BoolParam.<locals>.booltype at 0x7f808bf582f0>)
    Doc:  Time individual thunks when profiling
    Value:  True

profiling.n_apply (<class 'int'>)
    Doc:  Number of Apply instances to print by default
    Value:  20

profiling.n_ops (<class 'int'>)
    Doc:  Number of Ops to print by default
    Value:  20

profiling.output_line_width (<class 'int'>)
    Doc:  Max line width for the profiling output
    Value:  512

profiling.min_memory_size (<class 'int'>)
    Doc:  For the memory profile, do not print Apply nodes if the size
             of their outputs (in bytes) is lower than this threshold
    Value:  1024

profiling.min_peak_memory (<function BoolParam.<locals>.booltype at 0x7f808bf588c8>)
    Doc:  The min peak memory usage of the order
    Value:  False

profiling.destination (<class 'str'>)
    Doc: 
             File destination of the profiling output
            
    Value:  stderr

profiling.debugprint (<function BoolParam.<locals>.booltype at 0x7f808bf58ae8>)
    Doc: 
             Do a debugprint of the profiled functions
            
    Value:  False

profiling.ignore_first_call (<function BoolParam.<locals>.booltype at 0x7f808bf58c80>)
    Doc: 
             Do we ignore the first call of a Theano function.
            
    Value:  False

optdb.position_cutoff (<class 'float'>)
    Doc:  Where to stop eariler during optimization. It represent the position of the optimizer where to stop.
    Value:  inf

optdb.max_use_ratio (<class 'float'>)
    Doc:  A ratio that prevent infinite loop in EquilibriumOptimizer.
    Value:  8.0

gcc.cxxflags (<class 'str'>)
    Doc:  Extra compiler flags for gcc
    Value: 

cmodule.warn_no_version (<function BoolParam.<locals>.booltype at 0x7f808bf5c048>)
    Doc:  If True, will print a warning when compiling one or more Op with C code that can't be cached because there is no c_code_cache_version() function associated to at least one of those Ops.
    Value:  False

cmodule.remove_gxx_opt (<function BoolParam.<locals>.booltype at 0x7f808bf5c1e0>)
    Doc:  If True, will remove the -O* parameter passed to g++.This is useful to debug in gdb modules compiled by Theano.The parameter -g is passed by default to g++
    Value:  False

cmodule.compilation_warning (<function BoolParam.<locals>.booltype at 0x7f808bf5c378>)
    Doc:  If True, will print compilation warnings.
    Value:  False

cmodule.preload_cache (<function BoolParam.<locals>.booltype at 0x7f808bf5c510>)
    Doc:  If set to True, will preload the C module cache at import time
    Value:  False

cmodule.age_thresh_use (<class 'int'>)
    Doc:  In seconds. The time after which Theano won't reuse a compile c module.
    Value:  2073600

blas.ldflags (<class 'str'>)
    Doc:  lib[s] to include for [Fortran] level-3 blas implementation
    Value: 

metaopt.verbose (<function BoolParam.<locals>.booltype at 0x7f808bf5c8c8>)
    Doc:  Enable verbose output for meta optimizers
    Value:  False

profile (<function BoolParam.<locals>.booltype at 0x7f808bf5ca60>)
    Doc:  If VM should collect profile information
    Value:  False

profile_optimizer (<function BoolParam.<locals>.booltype at 0x7f808bf5cbf8>)
    Doc:  If VM should collect optimizer profile information
    Value:  False

profile_memory (<function BoolParam.<locals>.booltype at 0x7f808bf5cd90>)
    Doc:  If VM should collect memory profile information and print it
    Value:  False

<theano.configparser.ConfigParam object at 0x7f808bf5e2e8>
    Doc:  Useful only for the vm linkers. When lazy is None, auto detect if lazy evaluation is needed and use the apropriate version. If lazy is True/False, force the version used between Loop/LoopGC and Stack.
    Value:  None

warn.identify_1pexp_bug (<function BoolParam.<locals>.booltype at 0x7f808bf5f048>)
    Doc:  Warn if Theano versions prior to 7987b51 (2011-12-18) could have yielded a wrong result due to a bug in the is_1pexp function
    Value:  False

on_shape_error (('warn', 'raise'))
    Doc:  warn: print a warning and use the default value. raise: raise an error
    Value:  warn

tensor.insert_inplace_optimizer_validate_nb (<class 'int'>)
    Doc:  -1: auto, if graph have less then 500 nodes 1, else 10
    Value:  -1

experimental.local_alloc_elemwise (<function BoolParam.<locals>.booltype at 0x7f808bf5f378>)
    Doc:  DEPRECATED: If True, enable the experimental optimization local_alloc_elemwise. Generates error if not True. Use optimizer_excluding=local_alloc_elemwise to dsiable.
    Value:  True

experimental.local_alloc_elemwise_assert (<function BoolParam.<locals>.booltype at 0x7f808bf5f400>)
    Doc:  When the local_alloc_elemwise is applied, add an assert to highlight shape errors.
    Value:  True

scan.allow_gc (<function BoolParam.<locals>.booltype at 0x7f808bf5f620>)
    Doc:  Allow/disallow gc inside of Scan (default: False)
    Value:  False

scan.allow_output_prealloc (<function BoolParam.<locals>.booltype at 0x7f808bf5f7b8>)
    Doc:  Allow/disallow memory preallocation for outputs inside of scan (default: True)
    Value:  True

scan.debug (<function BoolParam.<locals>.booltype at 0x7f808bf5f950>)
    Doc:  If True, enable extra verbose output related to scan
    Value:  False

pycuda.init (<function BoolParam.<locals>.booltype at 0x7f808bf5fae8>)
    Doc:  If True, always initialize PyCUDA when Theano want to
                initilize the GPU.  Currently, we must always initialize
                PyCUDA before Theano do it.  Setting this flag to True,
                ensure that, but always import PyCUDA.  It can be done
                manually by importing theano.misc.pycuda_init before theano
                initialize the GPU device.
                 
    Value:  False

cublas.lib (<class 'str'>)
    Doc:  Name of the cuda blas library for the linker.
    Value:  cublas

lib.cnmem (<class 'float'>)
    Doc:  Do we enable CNMeM or not (a faster CUDA memory allocator).

             The parameter represent the start size (in MB or % of
             total GPU memory) of the memory pool.

             0: not enabled.
             0 < N <= 1: % of the total GPU memory (clipped to .985 for driver memory)
             > 0: use that number of MB of memory.

            
    Value:  0.0

compile.wait (<class 'int'>)
    Doc:  Time to wait before retrying to aquire the compile lock.
    Value:  5

compile.timeout (<class 'int'>)
    Doc:  In seconds, time that a process will wait before deciding to
override an existing lock. An override only happens when the existing
lock is held by the same owner *and* has not been 'refreshed' by this
owner for more than this period. Refreshes are done every half timeout
period for running processes.
    Value:  120

compiledir_format (<class 'str'>)
    Doc:  Format string for platform-dependent compiled module subdirectory
(relative to base_compiledir). Available keys: device, gxx_version,
hostname, numpy_version, platform, processor, python_bitwidth,
python_int_bitwidth, python_version, short_platform, theano_version.
Defaults to 'compiledir_%(short_platform)s-%(processor)s-%(python_version)s-%(python_bitwidth)s'.
    Value:  compiledir_%(short_platform)s-%(processor)s-%(python_version)s-%(python_bitwidth)s

<theano.configparser.ConfigParam object at 0x7f808bf5eda0>
    Doc:  platform-independent root directory for compiled modules
    Value:  /home/alishan/.theano

<theano.configparser.ConfigParam object at 0x7f808bf5ef60>
    Doc:  platform-dependent cache directory for compiled modules
    Value:  /home/alishan/.theano/compiledir_Linux-4.4--MANJARO-x86_64-with-glibc2.3.4--3.6.0-64
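
The listing above is the complete configuration dump. For a question like this one, the entries of most interest are probably device and floatX. The sketch below is a rough illustration of pulling selected values out of such a dump once it has been saved as text. The function name extract_flag_values is hypothetical, and the parsing assumes the layout seen above: an unindented "name (...)" header followed later by an indented "Value:" line.

```python
import re


def extract_flag_values(dump_text, wanted):
    """Return {flag_name: value} for the flags in `wanted`.

    Hypothetical helper: it relies only on the layout of the dump above.
    """
    values = {}
    current = None
    for line in dump_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line is either a "name (...)" header or a wrapped
            # piece of a Doc string; only a header changes the current flag.
            match = re.match(r"([\w.]+) \(", line)
            if match:
                current = match.group(1)
        elif current is not None and line.strip().startswith("Value:"):
            values[current] = line.split("Value:", 1)[1].strip()
            current = None
    return {name: value for name, value in values.items() if name in wanted}


sample = (
    "floatX (('float64', 'float32', 'float16'))\n"
    "    Doc:  Default floating-point precision for python casts.\n"
    "\n"
    "Note: float16 support is experimental, use at your own risk.\n"
    "    Value:  float32\n"
    "\n"
    "device (cpu, gpu*, opencl*, cuda*)\n"
    "    Doc:  Default device for computations.\n"
    "    Value:  opencl0:1\n"
)
print(extract_flag_values(sample, {"device", "floatX"}))
```

Applied to the dump in this message, the two entries would be floatX = float32 and device = opencl0:1, so this configuration targets an OpenCL device rather than a CUDA one.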




PNM

unread,
Jan 28, 2017, 8:16:50 AM1/28/17
to theano-users
Hello,
I hit the same issue with a git checkout of Theano in a local environment. In my case it appears to be specific to Python 3, since Python 2 does not show the problem. Details below.
Thanks,
Pablo

----------------------------
Name: Theano
Version: 0.9.0b1
Summary: Optimizing compiler for evaluating mathematical expressions on CPUs and GPUs.
Home-page: http://deeplearning.net/software/theano/
Author: LISA laboratory, University of Montreal
Author-email: thean...@googlegroups.com
License: BSD
Location: /opt/libs_2017_01_26/python3_env/lib/python3.5/site-packages
Requires: numpy, scipy, six
----------------------------
GIT version:
Theano 70129ffb66320140275be7f75152e792c4647510
libgpuarray d838f6a43bcc56fb280a8b40fdbc03276b3eb7fd
----------------------------
In Python 3.5.3 or 3.6.0:

import os
os.environ['THEANO_FLAGS'] = """
    device=cuda0,
    floatX=float32,
"""
import theano

Using cuDNN version 5105 on context None

ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/opt/libs_2017_01_26/python3_env/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 164, in <module>
    use(config.device)
  File "/opt/libs_2017_01_26/python3_env/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 151, in use
    init_dev(device)
  File "/opt/libs_2017_01_26/python3_env/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 100, in init_dev
    pygpu.blas.gemm(0, tmp, tmp, 0, tmp, overwrite_c=True)
  File "pygpu/blas.pyx", line 129, in pygpu.blas.gemm (pygpu/blas.c:3354)
  File "pygpu/blas.pyx", line 44, in pygpu.blas.pygpu_blas_rgemm (pygpu/blas.c:2011)
pygpu.gpuarray.GpuArrayException: (b'Unsupported operation', 5)

----------------------------
In Python 2.7.13:

import os
os.environ['THEANO_FLAGS'] = """
    device=cuda0,
    floatX=float32,
"""
import theano

Using cuDNN version 5105 on context None
Mapped name None to device cuda0: GeForce GTX 960M (0000:01:00.0)

PNM

unread,
Jan 29, 2017, 2:59:36 AM1/29/17
to theano-users
Hello,
For me, the Python 3 issue is already fixed as of git version 3343d912717ee85c5c2e0572cfae94581b35e32b.
Thanks,
Pablo

PNM

unread,
Feb 3, 2017, 2:12:39 AM2/3/17
to theano-users
Hello,

Sorry, I thought this GPU issue with the new backend was solved, but I realized I had mistakenly run my test on the CPU.
I am using the latest git version (8b9f73365e4932f1c005a0a37b907d28985fbc5f) in a local environment. In my case it appears to be specific to Python 3, since Python 2 does not show the problem. Details below.


Thanks,
Pablo

----------------------------
Name: Theano
Version: 0.9.0b1
Summary: Optimizing compiler for evaluating mathematical expressions on CPUs and GPUs.
Home-page: http://deeplearning.net/software/theano/
Author: LISA laboratory, University of Montreal
Author-email: thean...@googlegroups.com
License: BSD
Location: /opt/libs/python3_env/lib/python3.5/site-packages
Requires: numpy, scipy, six
----------------------------
GIT version:
Theano 8b9f73365e4932f1c005a0a37b907d28985fbc5f
libgpuarray 61e7ff8e4a76d78afc97d337337898560c03f247

----------------------------
In Python 3.5.3 or 3.6.0:

import os
os.environ['THEANO_FLAGS'] = """
    device=cuda0,
    floatX=float32,
"""
import theano

Using cuDNN version 5105 on context None

ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/opt/libs/python3_env/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 164, in <module>
    use(config.device)
  File "/opt/libs/python3_env/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 151, in use
    init_dev(device)
  File "/opt/libs/python3_env/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 100, in init_dev
    pygpu.blas.gemm(0, tmp, tmp, 0, tmp, overwrite_c=True)
  File "pygpu/blas.pyx", line 129, in pygpu.blas.gemm (pygpu/blas.c:3354)
  File "pygpu/blas.pyx", line 44, in pygpu.blas.pygpu_blas_rgemm (pygpu/blas.c:2011)
pygpu.gpuarray.GpuArrayException: (b'Unsupported operation', 5)

----------------------------
In Python 2.7.13:

import os
os.environ['THEANO_FLAGS'] = """
    device=cuda0,
    floatX=float32,
"""
import theano

Using cuDNN version 5105 on context None
Mapped name None to device cuda0: GeForce GTX 960M (0000:01:00.0)
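
The call that fails in the traceback above can also be tried without importing Theano, which may help tell a pygpu/libgpuarray problem apart from a Theano one. What follows is only a minimal sketch, assuming pygpu is importable in the interpreter that fails. `probe_blas` and its `backend` parameter are hypothetical names introduced here (the parameter exists only so the function can be exercised with a stand-in module). The `gemm` call mirrors the one in the traceback, and allocating the 2x2 buffer with `pygpu.empty` is an assumption about how that buffer is set up.

```python
def probe_blas(device, backend=None):
    """Try a tiny float32 GEMM on `device`; return (ok, message).

    `backend` defaults to the real pygpu package. Passing another object
    with the same attributes is only meant for testing this helper.
    """
    if backend is None:
        import pygpu
        import pygpu.blas  # make sure the BLAS submodule is loaded
        backend = pygpu

    ctx = backend.init(device)
    # 2x2 scratch buffer, used as A, B and C at the same time.
    tmp = backend.empty((2, 2), dtype="float32", context=ctx)
    try:
        # Same call as in the traceback: C = 0 * A.B + 0 * C, written in place.
        backend.blas.gemm(0, tmp, tmp, 0, tmp, overwrite_c=True)
    except backend.gpuarray.GpuArrayException as exc:
        return False, str(exc)
    return True, ctx.devname


# Example (needs a working pygpu install):
#   print(probe_blas("cuda0"))
```

Running it with the same device string under both Python 2 and Python 3 should show whether the failure follows the interpreter's pygpu build rather than Theano itself.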


LM

unread,
Feb 19, 2017, 8:04:04 PM2/19/17
to theano-users
Fred,

I have exactly the same issue as Tecx: I get the same error message when testing Theano.

I am using Theano 0.9.0b1 and libgpuarray 0.6.0 on an iMac (Retina 5K, 27-inch, Late 2015) with an AMD Radeon R9 M390 (2048 MB).

I wrote the .theanorc:

[global]
device = opencl0:1
floatX = float32


I chose device = opencl0:1 because I checked the device in a terminal with:

python -c "import pygpu; print(pygpu.init('opencl0:1').devname)"
and got:

AMD Radeon R9 M390 Compute Engine

'opencl0:0' returns CPU.

Please advise.
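
The one-off check above can be extended to several device strings at once. This is a minimal sketch, assuming pygpu is installed. `list_devices` and its `init` parameter are hypothetical names introduced here, and the candidate strings are just the ones from this thread. The broad `except` is a deliberate simplification, since which exception pygpu raises for a missing device is not shown in the thread.

```python
def list_devices(candidates, init=None):
    """Map each device string to its device name, or to an error note.

    `init` defaults to pygpu.init; it is a parameter only so the helper
    can be tried with a stand-in function.
    """
    if init is None:
        import pygpu
        init = pygpu.init

    found = {}
    for name in candidates:
        try:
            found[name] = init(name).devname
        except Exception as exc:  # assumption: any failure means "not usable"
            found[name] = "unavailable: {}".format(exc)
    return found


# Example (needs a working pygpu install):
#   for dev, desc in list_devices(["opencl0:0", "opencl0:1"]).items():
#       print(dev, "->", desc)
```

On the machine described above, this would be expected to list the CPU for 'opencl0:0' and the Radeon for 'opencl0:1', matching the manual check.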

Frédéric Bastien

unread,
Feb 20, 2017, 12:02:16 PM2/20/17
to theano-users
You need to install clBLAS or CLBlast to have a BLAS for OpenCL.

Without one of them, many operations are missing.

Note that Theano itself is still missing a number of operations on OpenCL, and we don't have time to work on that in the short term. So if your goal is to use OpenCL with Theano, you will probably need to help us convert some CUDA-only code to our code that supports both CUDA and OpenCL.

Frédéric
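
As a quick way to see whether either BLAS library is visible on the system, the standard library's `ctypes.util.find_library` can be used. This is a rough sketch only: `find_opencl_blas` is a hypothetical helper, the base names "clBLAS" and "clblast" are assumptions about how the shared libraries are named, and libgpuarray may search different locations when it loads them at runtime, so a miss here is a hint rather than proof.

```python
from ctypes.util import find_library


def find_opencl_blas(names=("clBLAS", "clblast")):
    """Return {library base name: resolved name/path, or None if not found}."""
    return {name: find_library(name) for name in names}


# Example:
#   for lib, where in find_opencl_blas().items():
#       print(lib, "->", where or "not found")
```

If both entries come back as not found, installing one of the two libraries is the first thing to try before retrying the `opencl` device.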

Sina Samangooei

unread,
Feb 25, 2017, 4:29:10 AM2/25/17
to theano-users
Hey Frédéric,

Could you elaborate on this? I'm interested in contributing what is missing, but I'd like to know where to start. Is there an ongoing discussion or a list of things that need to be converted?

Cheers
- Sina