tf.while and XlaWhile Op

dav...@graphcore.ai

Jul 3, 2017, 11:09:51 AM

to XLA development

What are the plans for supporting the tf.while_loop construct? It seems to generate an unsupported op called Enter. Looking further, I can see that this is probably the expected behavior.

I can also see that there is an op specifically added to the core called 'XlaWhile', which maps neatly to the HLO while instruction.

Will there be automatic conversion from the various ops generated by tf.while_loop to an XlaWhile operator?

Cheers,

David

Peter Hawkins

Jul 5, 2017, 8:38:49 AM

to dav...@graphcore.ai, XLA development

Hi...

Yes, as it happens we do have such a conversion:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/tf2xla/functionalize_control_flow.h

The conversion is not, however, completely hooked up yet. In particular, the mark_for_compilation_pass currently does not consider the loop operators (Enter/Exit/Switch/Merge/NextIteration/LoopCond) to be candidates for compilation. If you manually override the behavior of the compilation marking pass, then you can compile loops.

Note that that code is not sufficient to handle gradients of loops --- I am actively working on loop gradients and dynamic RNNs right now.

You can also write Python code that generates the XlaWhile operator directly, although that isn't a supported API. Many details like state mutations inside loops will not work seamlessly if you call the while loop operator directly, but for pure computations it works fine.

Peter

--
You received this message because you are subscribed to the Google Groups "XLA development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xla-dev+u...@googlegroups.com.
To post to this group, send email to xla...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/xla-dev/e494bc35-a92f-4d1e-b6ba-73119e9ff703%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

dav...@graphcore.ai

Jul 5, 2017, 9:23:25 AM

to XLA development

Thanks, I had not looked hard enough to spot functionalize_control_flow.h.

dav...@graphcore.ai

Oct 3, 2017, 8:11:23 AM

to XLA development

Hi,

I've been asked to look at 'https://github.com/deepmind/dnc' and also dynamic RNNs. I was wondering where tf.while_loop support stands?

If I register the various loop control ops with my device, then the mark_for_compilation_pass aborts because it doesn't like those ops. If I then remove the abort then the ops get through to the functionalize control loops code and they are turned into an XLA while correctly. Nevertheless, I have only a small example, and there may be loops which cannot be converted to an XLA while which I have not found.

I also find that the auto-generated gradient ops contain links between ops which are not part of the same 'frame', and are rejected by the loop structure generator.

cheers

Peter Hawkins

Oct 3, 2017, 8:33:14 AM

to dav...@graphcore.ai, XLA development

Hi...

On Tue, Oct 3, 2017 at 8:11 AM <dav...@graphcore.ai> wrote:

Hi,

I've been asked to look at 'https://github.com/deepmind/dnc' and also dynamic RNNs. I was wondering where tf.while_loop support stands?

In general, a lot of the internals needed to compile loops are there, but as you observe, the auto-clustering (mark_for_compilation_pass.cc) currently blacklists loops.

The auto-clustering code does not consider loops compilable for the very simple reason that the auto-clustering code depends on a cycle-detection algorithm to avoid creating deadlocks, so the easiest starting point was to simply ban loops. I think this is fixable but I don't have many cycles to put into it right now.
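
To make the deadlock concern concrete, here is a minimal sketch of the kind of check a clustering pass needs. It is not the algorithm used in mark_for_compilation_pass.cc; the function names and the toy graphs are hypothetical. The idea: fusing A and B into one cluster is unsafe if some other path A -> ... -> B runs through an op left outside the cluster, because the cluster would then both feed that op and wait on it.

```python
from collections import defaultdict


def build(pairs):
    """Build an adjacency list from (src, dst) edge pairs."""
    edges = defaultdict(list)
    for src, dst in pairs:
        edges[src].append(dst)
    return edges


def has_indirect_path(edges, src, dst):
    """Depth-first search for a path src -> dst that ignores the direct edge."""
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        for nxt in edges[node]:
            if node == src and nxt == dst:
                continue  # skip the direct edge we intend to contract
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def merge_creates_cycle(edges, a, b):
    """True if contracting edge a -> b would leave a cycle in the graph."""
    return has_indirect_path(edges, a, b)


# A feeds B directly, and also via C, which stays outside the cluster.
diamond = build([("A", "B"), ("A", "C"), ("C", "B")])
# A simple chain: merging A and B is harmless.
chain = build([("A", "B"), ("B", "C")])

print(merge_creates_cycle(diamond, "A", "B"))
print(merge_creates_cycle(chain, "A", "B"))
```

A TensorFlow loop already contains a back edge in the graph, so a check like this cannot tell a legitimate loop apart from a cycle introduced by clustering, which is one way to read why banning loops was the easiest starting point.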

I think in the short term we are likely to add a more explicit "compile this part of the graph as a unit" scoping mechanism that bypasses autoclustering. One reason is performance predictability. Another reason is that if you want to compile variable reads and writes, then autoclustering changes when side-effects (such as variable reads and writes) happen. It is antisocial to change the order of effects under the user as the result of a clustering algorithm they don't know about and can't control.

I'm sure we will get back to autoclustering but it's looking slightly lower on the priority list at the moment. This is still in flux, though. Your input is welcome.

Currently the only way in the checked-in tree to use the loop translation is via the ahead-of-time compilation path, which just compiles the entire graph you give it and doesn't try to be smart about what is compiled.

If I register the various loop control ops with my device, then the mark_for_compilation_pass aborts because it doesn't like those ops. If I then remove the abort then the ops get through to the functionalize control loops code and they are turned into an XLA while correctly. Nevertheless, I have only a small example, and there may be loops which cannot be converted to an XLA while which I have not found.

Yes. In general we have a somewhat awkward design problem --- XLA has a functional "While" loop, but TensorFlow has a dataflow graph loop built out of 6 operators (Enter/Exit/Switch/Merge/NextIteration/LoopCond). Changing TensorFlow's while loop to be like XLA's while loop is difficult and interacts with many things, so in the short term we have a pass that converts TensorFlow while loops to XLA while loops.
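
For readers unfamiliar with the functional form, the following is a rough sketch of its semantics in plain Python. It is neither XLA nor TensorFlow code, and functional_while is a hypothetical name. The whole loop is a single operation over a loop-carried state: a condition function decides whether to continue and a body function returns the next state. In the dataflow form, the same behavior is spread over separate graph nodes: Enter brings values into the loop frame, Merge, Switch and LoopCond choose between continuing and leaving, NextIteration feeds values back, and Exit returns results.

```python
def functional_while(cond, body, init):
    """Apply body to the loop-carried state until cond(state) is false."""
    state = init
    while cond(state):
        state = body(state)
    return state


# Sum the integers 0..4. The state is (counter, running_total).
final = functional_while(
    cond=lambda s: s[0] < 5,
    body=lambda s: (s[0] + 1, s[1] + s[0]),
    init=(0, 0),
)
print(final)  # (5, 10)
```

Because cond and body see only the state passed to them, anything the loop reads or writes has to be threaded through that state, which is part of why a purely syntactic rewrite from the dataflow form does not cover every case.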

The rewrite is fairly syntactic, so it won't work in all cases. That said, for forward loops it seems to be working fairly reliably. Gradients of loops are still something of a work in progress, but simple cases work.

I also find that the auto-generated gradient ops contain links between ops which are not part of the same 'frame', and are rejected by the loop structure generator.

If you have a test case, I would like to look into this.

One way that this might happen is if autoclustering isn't putting all the parts of the loop (Enter/Exit/Switch/Merge/NextIteration/LoopCond and everything in between) in the same cluster. Currently autoclustering knows nothing about loop frames so it might easily split a loop up.
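
As a debugging aid, one could check for this directly. The sketch below is hypothetical: it assumes you have already extracted, for each op, the name of its loop frame and the cluster it was assigned to (the dictionaries and names here are made up for illustration). It reports any frame whose ops ended up in more than one cluster.

```python
from collections import defaultdict


def frames_split_across_clusters(op_frame, op_cluster):
    """Return {frame: clusters} for frames whose ops span more than one cluster.

    Ops missing from op_cluster count as belonging to the cluster None,
    meaning they were left uncompiled.
    """
    clusters_by_frame = defaultdict(set)
    for op, frame in op_frame.items():
        clusters_by_frame[frame].add(op_cluster.get(op))
    return {f: c for f, c in clusters_by_frame.items() if len(c) > 1}


# Hypothetical assignment: the Exit op of while_1 was left out of cluster_0.
op_frame = {
    "Enter": "while_1",
    "Merge": "while_1",
    "Switch": "while_1",
    "Exit": "while_1",
    "grad/Enter": "while_grad",
    "grad/Exit": "while_grad",
}
op_cluster = {
    "Enter": "cluster_0",
    "Merge": "cluster_0",
    "Switch": "cluster_0",
    "grad/Enter": "cluster_1",
    "grad/Exit": "cluster_1",
}

split = frames_split_across_clusters(op_frame, op_cluster)
print(sorted(split))  # ['while_1']
```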

Peter

dav...@graphcore.ai

Oct 3, 2017, 8:46:22 AM

to XLA development

Thanks for the quick reply.

My forward-pass test case was nice and trivial, but the full forward+back case was just trying to run the 'dnc' code from DeepMind. I will try to create a pared-down example though.

(are you not in CA right now, or are you working at 5.30am?)
