Replicating XLA Executables on different GPUs

Sean Moriarity

Dec 15, 2020, 9:27:24 PM
to XLA development
Hello everyone,

We were recently testing some of XLA's parallelism features and ran into some issues running replicated executables across different devices (in our case a GeForce RTX 2060 and an RTX 2070).

Digging into the source, we noticed that XLA validates that an executable is compatible with a device simply by comparing the device's name against the name of the device the executable was compiled for. We were able to adjust this check and get everything running just fine, but we have some follow-on questions.

I understand that an executable compiled for one device would run best on that device, but why does it need to be exactly the same device? My understanding was that so long as the architectures were the same, the generated executable should run without issue. What are the implications of making this check a bit less restrictive?
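
To make the question concrete, here is a minimal sketch of the two kinds of check being compared. It is not XLA's actual code: `DeviceInfo`, `strict_compatible`, and `relaxed_compatible` are hypothetical names used only for illustration, and the relaxed rule (same compute capability) is just one possible way to loosen the check.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical description of a GPU; not an XLA type."""
    name: str
    compute_capability: Tuple[int, int]  # (major, minor)


def strict_compatible(compiled_for: DeviceInfo, target: DeviceInfo) -> bool:
    # Name-based check: only the exact same device model is accepted.
    return compiled_for.name == target.name


def relaxed_compatible(compiled_for: DeviceInfo, target: DeviceInfo) -> bool:
    # Assumed relaxation: accept any device with the same compute capability.
    return compiled_for.compute_capability == target.compute_capability


# Both cards are Turing parts with compute capability 7.5.
rtx_2060 = DeviceInfo("GeForce RTX 2060", (7, 5))
rtx_2070 = DeviceInfo("GeForce RTX 2070", (7, 5))

strict_result = strict_compatible(rtx_2060, rtx_2070)
relaxed_result = relaxed_compatible(rtx_2060, rtx_2070)
print(strict_result, relaxed_result)
```

With the name-based rule the 2060/2070 pair is rejected; with the architecture-based rule it is accepted, which is roughly the behavior we got after adjusting the check.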

Sanjoy Das

Dec 15, 2020, 10:14:48 PM
to Sean Moriarity, Tim Shen, XLA development
XLA GPU incidentally depends on the amount of memory on the GPU -- when we auto-tune convolutions, we pick algorithms whose workspace size fits on the GPU. This means that even though the Titan V and the V100 both have compute capability 7.0, an XLA executable compiled for the latter might not run on the former.

(Aside: this approach assumes that the amount of GPU memory available during compilation is <= the amount of memory available at runtime, which may not be true. It also makes it harder to cross-compile.)
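
A rough sketch of why the memory dependence matters, under stated assumptions: `ConvAlgorithm`, `pick_algorithm`, the candidate list, and all sizes and timings below are made up for illustration and are not XLA's auto-tuner or real measurements. The idea is only that the fastest algorithm whose workspace fits at compile time gets baked into the executable.

```python
from typing import NamedTuple, Sequence

GIB = 1024 ** 3


class ConvAlgorithm(NamedTuple):
    """Hypothetical convolution algorithm candidate."""
    name: str
    workspace_bytes: int  # scratch memory the algorithm needs
    runtime_ms: float     # illustrative timing, not a measurement


def pick_algorithm(candidates: Sequence[ConvAlgorithm],
                   available_bytes: int) -> ConvAlgorithm:
    # Keep only candidates whose workspace fits, then take the fastest.
    fitting = [a for a in candidates if a.workspace_bytes <= available_bytes]
    if not fitting:
        raise ValueError("no algorithm fits in the available memory")
    return min(fitting, key=lambda a: a.runtime_ms)


candidates = [
    ConvAlgorithm("fast_large_workspace", 14 * GIB, 1.0),
    ConvAlgorithm("slow_small_workspace", 1 * GIB, 3.0),
]

# Illustrative memory sizes: a larger-memory GPU at compile time,
# a smaller-memory GPU with the same compute capability at run time.
compile_time_memory = 16 * GIB
run_time_memory = 12 * GIB

chosen_at_compile = pick_algorithm(candidates, compile_time_memory)
chosen_for_small = pick_algorithm(candidates, run_time_memory)
fits_at_run_time = chosen_at_compile.workspace_bytes <= run_time_memory
print(chosen_at_compile.name, chosen_for_small.name, fits_at_run_time)
```

In this toy setup the choice made on the larger device needs more workspace than the smaller device has, so the compiled executable would fail there even though tuning directly on the smaller device would have picked a workable algorithm.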

Other than that I don't know of any hard dependencies.

CC +Tim Shen in case I'm forgetting something.

-- Sanjoy
 
