Why my __CUDA_ARCH__ is less than 200 on my sm_35 architecture GPU?

Henggang Cui

Oct 8, 2015, 3:02:18 PM

to Caffe Users

Hi,

My GPU is a Tesla K20C, and when I call cudaGetDeviceProperties(), "major" is 3 and "minor" is 5. So it's the sm_35 architecture (am I correct?).

But in "device_alternate.hpp":

// CUDA: thread number configuration.
// Use 1024 threads per block, which requires cuda sm_2x or above,
// or fall back to attempt compatibility (best of luck to you).
#if __CUDA_ARCH__ >= 200
    const int CAFFE_CUDA_NUM_THREADS = 1024;
#else
    const int CAFFE_CUDA_NUM_THREADS = 512;
#endif

My "__CUDA_ARCH__" is less than 200, so my "CAFFE_CUDA_NUM_THREADS" becomes 512. What is wrong here? Why does it think my architecture is below sm_2x?

Thanks,

Cui

Felix Abecassis

Oct 10, 2015, 6:54:41 PM

to Caffe Users

Hello,

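It's a known bug: __CUDA_ARCH__ is only defined while device code is being compiled, so it should not be used in host code. In the host pass the undefined macro evaluates to 0 inside the #if, which is why you end up with 512.
https://github.com/BVLC/caffe/issues/418
But AFAIK, this does not impact performance.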
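
The effect can be reproduced without a GPU. The following is a minimal sketch, assuming a host-only translation unit built with an ordinary C++ compiler, which behaves like the host compilation pass in that __CUDA_ARCH__ is not defined. The names kNumThreads and kArchMacroDefined are hypothetical and are not taken from Caffe.

```cpp
// Host-only translation unit: no device compilation pass is involved,
// so __CUDA_ARCH__ is not defined here.

// An undefined identifier inside #if is replaced by 0, so this test
// becomes "0 >= 200" and the #else branch is chosen.
#if __CUDA_ARCH__ >= 200
constexpr int kNumThreads = 1024;
#else
constexpr int kNumThreads = 512;
#endif

// Record whether the macro was defined at all in this pass.
#ifdef __CUDA_ARCH__
constexpr bool kArchMacroDefined = true;
#else
constexpr bool kArchMacroDefined = false;
#endif
```

In such a build kNumThreads is 512 no matter which GPU is installed, because the choice is made by the preprocessor during host compilation and not by querying the device.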

Henggang Cui

Oct 10, 2015, 7:08:55 PM

to Caffe Users

I see. So will I get better performance if I use 1024 threads per block?

Thanks,

Cui

Felix Abecassis

Oct 10, 2015, 7:14:42 PM

to Caffe Users

I don't think so. But feel free to try :)
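
As a rough illustration of what changes when the block size is halved, here is a small sketch that assumes the number of blocks is computed by ceiling division over the element count, a common launch pattern. The helper names blocks_for and threads_launched are hypothetical. The sketch only shows that both settings cover the same elements; it says nothing about actual kernel speed, which would have to be measured.

```cpp
// Hypothetical helper: number of blocks needed so that every one of
// n elements gets its own thread (ceiling division).
constexpr int blocks_for(int n, int threads_per_block) {
  return (n + threads_per_block - 1) / threads_per_block;
}

// Total number of threads launched for n elements at a given block size.
constexpr int threads_launched(int n, int threads_per_block) {
  return blocks_for(n, threads_per_block) * threads_per_block;
}
```

For one million elements this gives 1954 blocks of 512 threads or 977 blocks of 1024 threads, and in both cases 1,000,448 threads in total, so the smaller block size is made up for by launching twice as many blocks.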

Jonathan L Long

Nov 5, 2015, 12:09:39 AM

to Felix Abecassis, Caffe Users

Felix is right; see https://github.com/BVLC/caffe/pull/3285.

JLL

