How best to execute computations?


Joel

Apr 1, 2022, 10:51:34 AM
to XLA development
Hi XLA team,

I've seen a couple of approaches to running XLA computations, and I'd like to find out what's recommended. I can either use the functionality in

xla/client/client_library.h
xla/client/client.h
xla/client/local_client.h

or the functionality in xla/pjrt/. What's the difference between these? Is one recommended over the other? Are there other approaches I've missed?

I'd also like to make sure that I'm actually using XLA, as when I run it I see
"""
2022-04-01 15:48:43.635974: I tensorflow/compiler/xla/service/service.cc:171] XLA service 0x1d4a830 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-04-01 15:48:43.636093: I tensorflow/compiler/xla/service/service.cc:179]   StreamExecutor device (0): Host, Default Version
"""
which suggests that I might not actually be using XLA.

Regards,
Joel

Peter Hawkins

Apr 1, 2022, 11:55:15 AM
to Joel, XLA development
Hi...


On Fri, Apr 1, 2022 at 10:51 AM 'Joel' via XLA development <xla...@googlegroups.com> wrote:
Hi XLA team,

I've seen a couple of approaches to running XLA computations, and I'd like to find out what's recommended. I can either use the functionality in

xla/client/client_library.h
xla/client/client.h
xla/client/local_client.h


Of those, local_client.h is probably the right API to use; I would not use the non-local client APIs any more. client_library.h contains the factories for building clients, so you'd need that too.
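
To make that concrete, here is a rough sketch of what the local-client path looks like end to end: build a computation with XlaBuilder, get a LocalClient from the client_library.h factory, compile it, run it, and read the result back as a Literal. Treat it as illustrative only; exact signatures (Compile returning a single executable vs. a vector of them, StatusOr accessor spellings, etc.) have moved around between versions, so check the headers in your checkout.
"""
// Hedged sketch of the local-client flow; signatures vary across XLA versions.
#include "tensorflow/compiler/xla/client/client_library.h"
#include "tensorflow/compiler/xla/client/local_client.h"
#include "tensorflow/compiler/xla/client/xla_builder.h"
#include "tensorflow/compiler/xla/literal.h"

xla::Literal AddConstants() {
  // A tiny computation: f() = 1.0f + 2.0f.
  xla::XlaBuilder builder("add_constants");
  xla::Add(xla::ConstantR0<float>(&builder, 1.0f),
           xla::ConstantR0<float>(&builder, 2.0f));
  xla::XlaComputation computation = builder.Build().value();

  // client_library.h is the factory: this returns a process-wide LocalClient
  // for the default platform (Host unless configured otherwise).
  xla::LocalClient* client = xla::ClientLibrary::LocalClientOrDie();

  // Compile and execute with no arguments. With the local client you manage
  // details such as streams and allocators via ExecutableRunOptions yourself.
  auto executables = client->Compile(computation, /*argument_layouts=*/{},
                                     xla::ExecutableBuildOptions())
                         .value();
  xla::ScopedShapedBuffer result =
      executables[0]->Run(/*arguments=*/{}, xla::ExecutableRunOptions())
          .value();

  // Copy the device result back to a host Literal.
  return client->ShapedBufferToLiteral(result).value();
}
"""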
  
or the functionality in xla/pjrt/. What's the difference between these? Is one recommended over the other? Are there other approaches I've missed?

PJRT is an opinionated runtime layer built on top of local_client.h, primarily for the use of JAX. It hides things like Stream mechanics and presents an interface that works on asynchronous array values (futures). You are welcome to use it (JAX does), but note that we don't consider it a stable API, and the JAX team may change it at any time. If you want something different, you'd need to talk to us.

Contrast that with local_client.h, which requires you to manage a lot of details yourself (e.g., which stream to use). On CPU, PJRT is also higher-performance than the local client APIs, since it avoids the StreamExecutor layer, which isn't a good API for devices that aren't accelerators connected over a relatively slow PCIe link. On GPU, the two should perform similarly.
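
For comparison, here is a similarly rough sketch of the same computation through PJRT on CPU. The names used here (GetTfrtCpuClient from xla/pjrt/tfrt_cpu_pjrt_client.h, PjRtBuffer::ToLiteralSync) reflect roughly current trees, but since PJRT is not a stable API, verify them in xla/pjrt/ before building on them.
"""
// Hedged sketch of the PJRT path on CPU; factory and helper names may differ
// in your checkout.
#include <memory>

#include "tensorflow/compiler/xla/client/xla_builder.h"
#include "tensorflow/compiler/xla/literal.h"
#include "tensorflow/compiler/xla/pjrt/pjrt_client.h"
#include "tensorflow/compiler/xla/pjrt/tfrt_cpu_pjrt_client.h"

std::shared_ptr<xla::Literal> AddConstantsPjrt() {
  // Same tiny computation as before: f() = 1.0f + 2.0f.
  xla::XlaBuilder builder("add_constants");
  xla::Add(xla::ConstantR0<float>(&builder, 1.0f),
           xla::ConstantR0<float>(&builder, 2.0f));
  xla::XlaComputation computation = builder.Build().value();

  // PJRT hides streams and StreamExecutor entirely: get a client, compile,
  // execute. No ExecutableRunOptions or stream management is exposed.
  std::unique_ptr<xla::PjRtClient> client =
      xla::GetTfrtCpuClient(/*asynchronous=*/true).value();
  auto executable =
      client->Compile(computation, xla::CompileOptions()).value();

  // One (empty) argument list for the single device we execute on; results
  // come back as per-device vectors of asynchronous PjRtBuffers.
  auto results =
      executable->Execute(/*argument_handles=*/{{}}, xla::ExecuteOptions())
          .value();

  // ToLiteralSync blocks until the asynchronous result is ready and copies
  // it to the host.
  return results[0][0]->ToLiteralSync().value();
}
"""
Note that nothing in this version touches streams or ShapedBuffers; the buffers behave like asynchronous values, and only the final readback forces a wait, which is the future-style interface described above.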

Peter
 

I'd also like to make sure that I'm actually using XLA, as when I run it I see
"""
2022-04-01 15:48:43.635974: I tensorflow/compiler/xla/service/service.cc:171] XLA service 0x1d4a830 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-04-01 15:48:43.636093: I tensorflow/compiler/xla/service/service.cc:179]   StreamExecutor device (0): Host, Default Version
"""
which suggests that I might not actually be using XLA.

That warning is intended for TensorFlow users, mostly so they don't get misled into thinking XLA is being used just because of the log message. If you are using XLA APIs yourself, then you are definitely using XLA.

Peter
 

Regards,
Joel


Joel

Apr 1, 2022, 12:35:49 PM
to XLA development
Thank you, Peter, for the quick response.


> I would not use the non-local client APIs any more

Are the non-local clients the ones in pjrt/?

> On GPU, the two should perform similarly.

Oh, I didn't realise it was possible to run on GPU via the client/ functionality. That was actually why I was looking at moving to pjrt/, so that's very useful to know.