Caffe is not really using GPU P2P and cannot exploit the benefit of NVLINK


Marcelo Amaral

Jan 9, 2017, 1:46:29 PM
to Caffe Users
Hi, 

I have run the Caffe cifar10 and ImageNet examples.

For cifar10 (the basic example), the profile shows 0.5% PtoP communication between GPUs.

For other types of networks, I ran: caffe train --solver=models/{bvlc_alexnet, bvlc_googlenet, bvlc_reference_caffenet}/solver.prototxt -gpu 0,1
There was no PtoP when profiling with nvprof.

Does anyone know how Caffe uses multiple GPUs?
How can I exploit the benefit of P2P communication with multiple GPUs?

Thanks!
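
For readers who want to check the PtoP figure on their own runs: nvprof lists copies between two GPUs as "[CUDA memcpy PtoP]" in its summary. Below is a minimal sketch in Python, not part of the original post, that pulls those rows out of a saved summary. It assumes the summary was written to a text file (for example with nvprof's --log-file option) and that the time share is the first token ending in "%" on the row. The helper name ptop_rows and the file name profile.txt are made up for illustration.

```python
def ptop_rows(summary_text):
    """Return (share_percent, raw_line) for every PtoP memcpy row.

    Assumption: each summary row carries its time share as the first
    token ending in '%'. share_percent is None if no such token exists.
    """
    rows = []
    for line in summary_text.splitlines():
        if "[CUDA memcpy PtoP]" not in line:
            continue  # not a peer-to-peer copy row
        share = next((tok for tok in line.split() if tok.endswith("%")), None)
        percent = float(share.rstrip("%")) if share else None
        rows.append((percent, line.strip()))
    return rows


def load_and_report(path):
    """Print the PtoP rows found in a saved nvprof summary (hypothetical path)."""
    with open(path) as handle:
        rows = ptop_rows(handle.read())
    if not rows:
        print("no [CUDA memcpy PtoP] rows found")
    for percent, line in rows:
        print(percent, "|", line)
```

Calling load_and_report("profile.txt") on a saved summary prints either the matching rows or a note that none were found. An empty result only means no copies were recorded under that label; it does not by itself show how the GPUs exchanged data.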

Dennis Mungai

Jan 9, 2017, 2:07:07 PM
to Marcelo Amaral, Caffe Users
Hello there, 

You will need to build NVIDIA's Caffe branch from GitHub, which integrates the NCCL ("Nickel") library.

That will solve your problem. 

Regards,

Dennis. 

--
You received this message because you are subscribed to the Google Groups "Caffe Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to caffe-users+unsubscribe@googlegroups.com.
To post to this group, send email to caffe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/caffe-users/0e5a1c61-95e1-4e87-b697-024a80782182%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dennis Mungai

Jan 9, 2017, 2:11:55 PM
to Marcelo Amaral, Caffe Users
Hello Marcelo,

Here is a gist that describes the build process for the NVIDIA Caffe branch with the NCCL library on a shared HPC system running Ubuntu 16.04 LTS:



Adapt as necessary. 

Regards,

Dennis. 

Marcelo Amaral

Jan 10, 2017, 10:11:27 AM
to Dennis Mungai, Caffe Users
Hi Dennis.

Actually, I was already using Caffe with NCCL.
After your suggestion, I also tried without NCCL, and then Caffe started to use PtoP between GPUs.
Either there is a bug in NCCL or I am missing some configuration.

Thank you very much for your attention!

Regards,
Marcelo
-- 
====================================================================
Marcelo Carneiro do Amaral
BSC - Barcelona Super Computer Center
UPC - Universitat Politêcnica Catalunya
====================================================================


kishen suraj P

Jan 15, 2017, 3:33:56 AM
to Caffe Users, dmn...@gmail.com
What happens when there is no P2P DMA access?
Will multi-GPU training still work?

Marcelo Amaral

Jan 16, 2017, 3:30:08 AM
to Caffe Users
Hi Kishen,

Yes, it works with GPUs in different sockets.
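
Whether two GPUs share an NVLink connection or are only reachable over PCIe or the inter-socket link can be read from the matrix printed by "nvidia-smi topo -m". The following is a minimal sketch in Python, not from the original thread. It assumes the usual layout of that matrix: a header row starting with GPU0, one row per GPU, "X" on the diagonal, and labels such as NV1 or NV2 for NVLink connections. The function names are hypothetical.

```python
import re
import subprocess

# Strip ANSI escape codes in case the header row is styled.
ANSI = re.compile(r"\x1b\[[0-9;]*m")


def parse_topo(text):
    """Map (row GPU, column GPU) -> link label from `nvidia-smi topo -m` text."""
    header, links = None, {}
    for line in ANSI.sub("", text).splitlines():
        cells = line.split()
        if not cells or not cells[0].startswith("GPU"):
            continue  # blank lines, non-GPU rows, legend
        if header is None:
            # First GPU-prefixed line is the header; keep only GPU columns.
            header = [c for c in cells if c.startswith("GPU")]
            continue
        for col, link in zip(header, cells[1:]):
            if link != "X":  # skip the diagonal (a GPU with itself)
                links[(cells[0], col)] = link
    return links


def nvlink_pairs(links):
    """Return the GPU pairs whose label starts with 'NV' (NVLink)."""
    return sorted(pair for pair, link in links.items() if link.startswith("NV"))


def read_topo():
    """Run nvidia-smi locally; requires the NVIDIA driver to be installed."""
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a machine with the driver installed, nvlink_pairs(parse_topo(read_topo())) lists the NVLink-connected pairs. An empty list means no pair is reported as NVLink-connected; it says nothing about whether Caffe or NCCL actually uses the link during training.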

sariv...@gmail.com

Mar 18, 2017, 3:20:11 AM
to Caffe Users, dmn...@gmail.com
Hi Marcelo,

I observed the same behavior with BVLC Caffe RC5, which uses NCCL. Did you happen to find out the reason for this?

Thanks,
Saritha



david chiu

Apr 13, 2017, 11:37:50 AM
to Caffe Users
Hi Marcelo,

I also use GPUs with NVLink, and I installed nv-caffe + NCCL.
When I run ImageNet training with GoogLeNet, it seems to get no benefit from NVLink.
Does it need any configuration when building NCCL or Caffe?

Thanks
David Chiu
