new install complains about Invalid Device


hora...@gmail.com

Jun 13, 2017, 9:15:17 PM
to Discuss
Hey folks, this is a new install of TensorFlow 1.1.0 under a conda environment. I ran a very simple test script and got the error below. Can someone please explain?
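
The script is just a minimal session test, roughly along these lines (a sketch, not the exact tutorial file; the call that fails is tf.Session()):

    import tensorflow as tf

    # A trivial graph: multiply two constant matrices.
    matrix1 = tf.constant([[3., 3.]])
    matrix2 = tf.constant([[2.], [2.]])
    product = tf.matmul(matrix1, matrix2)

    # Creating the session is the line that fails on my machine.
    sess = tf.Session()
    print(sess.run(product))
    sess.close()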

My config:
    Ubuntu 14.04
    CUDA 8.0
    Anaconda2-4.4.0-Linux-x86_64
    Python 2.7.13
    NVIDIA GTX 660 Ti 3GB

Error msg:

(tensorflowGPU) rspace@homantin:~/Dropbox/GIT/ZhouTutorials/tensorflowTUT$ python tensorflow6_session.py 
2017-06-13 18:09:08.506229: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 18:09:08.506273: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 18:09:08.506281: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 18:09:08.506287: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 18:09:08.506295: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-13 18:09:08.570576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 660 Ti
major: 3 minor: 0 memoryClockRate (GHz) 1.0455
pciBusID 0000:04:00.0
Total memory: 2.95GiB
Free memory: 2.91GiB
2017-06-13 18:09:08.570665: W tensorflow/stream_executor/cuda/cuda_driver.cc:485] creating context when one is currently active; existing: 0x25fb360
2017-06-13 18:09:08.629965: E tensorflow/core/common_runtime/direct_session.cc:137] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE
Traceback (most recent call last):
  File "tensorflow6_session.py", line 18, in <module>
    sess = tf.Session()
  File "/home/rspace/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1193, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/rspace/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 554, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/home/rspace/anaconda2/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/rspace/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Toby Boyd

Jun 13, 2017, 9:34:54 PM
to hora...@gmail.com, Discuss
There could be a variety of problems. Since the error comes directly from CUDA, I would start with this thread. Their problem turned out to be multi-GPU under Docker, but there may be some good ideas in there that relate to your situation.


hora...@gmail.com

Jun 13, 2017, 9:49:24 PM
to Discuss, hora...@gmail.com
Forgot to mention one slightly unusual detail.

This is a brand-new Dell T5810 Linux box. It came with an NVIDIA NVS 310 video card. I added the GTX 660 Ti myself.

Could this cause a problem?

Toby Boyd

Jun 13, 2017, 11:35:56 PM
to hora...@gmail.com, Discuss
The person on that thread had a similar issue, so this is not unusual. Run nvidia-smi just to make sure you see both devices. Then, when running TensorFlow, set CUDA_VISIBLE_DEVICES=0 (or try 1) before you run the script. That is just a guess; it was part of the solution I found when googling your problem.
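
Something like this, either CUDA_VISIBLE_DEVICES=0 python tensorflow6_session.py on the command line, or from Python before TensorFlow is imported (the 0 is just a guess, swap in 1 if the wrong card gets picked):

    import os

    # Hide all but one GPU from CUDA before TensorFlow is imported.
    # '0' is a guess; if the wrong card gets picked, try '1' instead.
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'

    import tensorflow as tf

    # If this succeeds, the one visible device is usable.
    sess = tf.Session()
    print(sess.run(tf.constant('session created OK')))
    sess.close()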

Toby


Mark Inkley

Jun 14, 2017, 11:49:48 PM
to Toby Boyd, hora...@gmail.com, Discuss
Horace

You might also try deviceQuery, which comes with the NVIDIA CUDA samples (I only have 7.5). If it passes, your CUDA install is good. I have not upgraded to 8 yet, but I assume the same sample is available under 8.



 ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery/deviceQuery 
/home/mink/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1060 6GB"
  CUDA Driver Version / Runtime Version          8.0 / 7.5
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 6070 MBytes (6365052928 bytes)
MapSMtoCores for SM 6.1 is undefined.  Default to use 128 Cores/SM
MapSMtoCores for SM 6.1 is undefined.  Default to use 128 Cores/SM
  (10) Multiprocessors, (128) CUDA Cores/MP:     1280 CUDA Cores
  GPU Max Clock rate:                            1709 MHz (1.71 GHz)
  Memory Clock rate:                             4004 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = GeForce GTX 1060 6GB
Result = PASS
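
If deviceQuery passes but TensorFlow still complains, you can also ask the TensorFlow runtime itself what it sees (a quick sketch using the semi-internal device_lib module, not part of the CUDA samples):

    from tensorflow.python.client import device_lib

    # Lists every device the TensorFlow runtime can initialize
    # (e.g. /cpu:0 plus one entry per usable GPU).
    for device in device_lib.list_local_devices():
        print(device.name)
        print(device.physical_device_desc)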




On 15 Jun 2017, at 1:31 PM, Mark Inkley <min...@gmail.com> wrote:

To test that your CUDA install sees your GPU devices, run the deviceQuery command that comes with the NVIDIA CUDA Samples.
On 15 Jun 2017, at 12:57 PM, Mark Inkley <min...@gmail.com> wrote:

You need to check your CUDA & NVIDIA install as per the CUDA install manuals.

Toby is correct: if the NVIDIA tools can't see your cards, your install is broken.

My experience with installing CUDA and NVIDIA drivers using apt/yum install commands was never …

Toby Boyd

Jun 15, 2017, 1:06:45 AM
to Mark Inkley, horace T, Discuss
I think Horace and I fixed this offline and did not ping the thread. Setting CUDA_VISIBLE_DEVICES fixed it; Horace had another video card that was getting in the way. Super common in my setup at work, as well as for other Googlers.
