Unknown compute capability warning on cudnn 8.0


Xander Dunn

Jan 29, 2021, 8:55:19 AM
to Swift for TensorFlow

I’m trying out one of GCP’s A100 GPUs for the first time. When using the XLA backend, I see this warning printed several times during the first few training iterations of my model:

2021-01-29 05:31:15.645174: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:97] Unknown compute capability (8, 0) .Defaulting to telling LLVM that we're compiling for sm_75

I’m on Ubuntu 18.04 with Swift for TensorFlow 0.12, CUDA 11.0 Update 1, and cuDNN 8.0.5. The filename of the Swift for TensorFlow toolchain I’m using indicates that it is meant for cuDNN 8: swift-tensorflow-RELEASE-0.12-cuda11.0-cudnn8-ubuntu18.04.tar.gz. I believe cuDNN 8 is the compute capability (8, 0) that the warning is referring to.

Has anyone else seen this warning? I don’t think this is expected?

Thanks,
Xander

Brad Larson

Jan 29, 2021, 9:17:56 AM
to Swift for TensorFlow, xan...@xander.ai
In this case, I believe `compute capability (8, 0)` refers to CUDA compute capabilities, not the cuDNN version. The Swift for TensorFlow CUDA 11.0 toolchains are built with the following compute capabilities:

sm_35,sm_50,sm_60,sm_70,sm_75,compute_80

Those correspond to generations of NVIDIA GPUs per this nice list. compute_80 should be the compute capability level for the A100, so it's strange that this is being reported as an unknown compute capability. It's possible that the version of TensorFlow we used for the 0.12 release (2.3.0, if I remember correctly) didn't yet support that compute capability within XLA. The warning seems to say that it fell back to sm_75, the CUDA capability one step below it in our capability list.
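To make the fallback described above concrete, here is a minimal Python sketch. It is an illustration of the idea only, not XLA's actual code: the helper names `parse_capability` and `pick_sm` are hypothetical, and the "highest known architecture at or below the device's capability" rule is an assumption based on the reading of the warning given in this reply. The list of known architectures is taken from the sm_* entries quoted above.

```python
import re

# sm_* architectures quoted in this thread for the CUDA 11.0 toolchains.
KNOWN_SM = [35, 50, 60, 70, 75]

# Matches the capability tuple in the XLA warning line from the original post.
WARNING_RE = re.compile(r"Unknown compute capability \((\d+), (\d+)\)")


def parse_capability(log_line):
    """Return (major, minor) from the warning line, or None if it does not match."""
    match = WARNING_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def pick_sm(capability, known=KNOWN_SM):
    """Hypothetical fallback: highest known sm_XX not above the device capability."""
    major, minor = capability
    wanted = major * 10 + minor  # (8, 0) -> 80; assumes a single-digit minor version
    candidates = [sm for sm in known if sm <= wanted]
    if not candidates:
        raise ValueError(f"no known architecture at or below sm_{wanted}")
    return f"sm_{max(candidates)}"


line = ("Unknown compute capability (8, 0) .Defaulting to telling LLVM "
        "that we're compiling for sm_75")
capability = parse_capability(line)
print(capability, pick_sm(capability))
```

Under these assumptions, a device reporting (8, 0) resolves to sm_75 as long as 80 is missing from the known list, and would resolve to sm_80 once a newer build adds it.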

Try one of our CUDA 11.0, cuDNN 8.0 nightly Ubuntu 18.04 toolchains and see if the same warning is repeated with that newer toolchain. The nightlies now use TensorFlow 2.4.0, which I know has support for the A100 GPUs and their compute capabilities within it.

Xander Dunn

Jan 29, 2021, 9:35:26 AM
to Swift for TensorFlow, bradl...@google.com, Xander Dunn

Thanks Brad, that’s helpful.

My project currently doesn’t build on the nightlies because of a penguin dependency and this issue. I’ll try it with the nightlies as soon as I can build.

Brennan Saeta

Jan 29, 2021, 3:47:29 PM
to Xander Dunn, Swift for TensorFlow, bradl...@google.com
Hi Xander! I posted a reply on https://github.com/saeta/penguin/issues/142. tl;dr: I tried a while back and just again today and unfortunately I couldn't reproduce the problem you are seeing. Could you provide more information to help me reproduce it? Thanks! -Brennan


Xander Dunn

Jan 29, 2021, 4:25:22 PM
to Swift for TensorFlow, Brennan Saeta, bradl...@google.com, Xander Dunn
Thanks for taking a look Brennan. It looks like we're using different toolchains. The build problem is specific to the stock nightly toolchains. I added some additional info to the GitHub issue.

Brennan Saeta

Jan 29, 2021, 5:29:02 PM
to Xander Dunn, Swift for TensorFlow, bradl...@google.com
Thank you for the extra information! https://github.com/saeta/penguin/pull/143 for the short-term workaround. :-) -Brennan

Xander Dunn

Jan 29, 2021, 11:22:03 PM
to Swift for TensorFlow, Brennan Saeta, bradl...@google.com, Xander Dunn
Brad, I can confirm that I do not see the compute capability warnings when running on an A100 GPU with the latest stock nightly toolchain. TensorFlow 2.4 appears to have done the trick. Thanks!