Quasi-dynamically unrolling RNNs?

Mohammed AlQuraishi

Nov 12, 2015, 6:08:44 PM
to Discuss
I haven't had a chance to delve into TF too deeply, but my impression is that it currently cannot dynamically unroll an RNN based on the number of time steps in a sample (or, alternatively, the length of the longest sequence in a batch). Instead it can only unroll for a fixed number of steps, so to do full BPTT one would have to unroll for as many time steps as the longest sequence in the data set, even if no sequence in a given batch is that long. Is my understanding correct (based on the provided RNN examples)?

If that is the case, is it possible, using either multiple graphs or a single graph with conditionals, to have "branches" set at discrete numbers of time steps (50, 100, 150, etc.), so that, depending on the length of the longest sequence in a given batch, a different branch of the graph, or an altogether different graph, is used? Is this a good idea in practice, given that the motivation would be to save processing time by avoiding unnecessarily long unrolls that only encounter padded 0s? There will presumably be some drawbacks to using larger graphs or multiple graphs, including shuttling data between graphs in the latter case.
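
To make the idea concrete, here is a rough sketch in plain Python of the selection step only: pick the smallest unroll length that fits the longest sequence in a batch, then pad to it. The bucket boundaries come from the 50/100/150 example above; the names `BUCKETS`, `pick_unroll_length`, and `pad_batch` are hypothetical and are not part of TF. Building one graph (or branch) per bucket is left out.

```python
from bisect import bisect_left

# Hypothetical unroll lengths, taken from the 50/100/150 example in the post.
BUCKETS = [50, 100, 150]


def pick_unroll_length(batch, buckets=BUCKETS):
    """Return the smallest bucket length that fits the longest sequence in the batch."""
    longest = max(len(seq) for seq in batch)
    i = bisect_left(buckets, longest)  # first bucket with length >= longest
    if i == len(buckets):
        raise ValueError(
            f"sequence of length {longest} exceeds largest bucket {buckets[-1]}"
        )
    return buckets[i]


def pad_batch(batch, buckets=BUCKETS, pad_value=0):
    """Pad every sequence to the chosen bucket length and build a 0/1 mask.

    The mask marks real time steps with 1 and padded steps with 0, so the
    padded positions can be excluded from the loss.
    """
    steps = pick_unroll_length(batch, buckets)
    padded = [list(seq) + [pad_value] * (steps - len(seq)) for seq in batch]
    mask = [[1] * len(seq) + [0] * (steps - len(seq)) for seq in batch]
    return steps, padded, mask
```

With these boundaries, a batch whose longest sequence has 60 steps would be unrolled for 100 steps rather than for the longest sequence in the whole data set.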

Rafał Józefowicz

Nov 12, 2015, 6:19:27 PM
to Discuss
You are correct. We don't fully support dynamic unrolling at the moment.
Take a look at the "bucketing and padding" section of the seq2seq tutorial [link], which describes how to support your use case of having multiple graphs for different numbers of unrolled steps. It is often a good idea if you have to do full unrolling of the RNNs.
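
As a rough illustration of the data side of bucketing (not the tutorial's actual code), the sketch below groups sequences by the smallest bucket that fits them and then draws each batch from a single bucket, so that every batch matches one unroll length. The function names, the bucket sizes, and the choice to sample buckets in proportion to how many sequences they hold are all assumptions made for this example.

```python
import random
from collections import defaultdict


def bucket_dataset(sequences, buckets):
    """Group sequences by the smallest bucket length that fits each one."""
    grouped = defaultdict(list)
    for seq in sequences:
        for size in sorted(buckets):
            if len(seq) <= size:
                grouped[size].append(seq)
                break
        else:
            # No bucket was large enough for this sequence.
            raise ValueError(f"sequence of length {len(seq)} fits no bucket")
    return grouped


def sample_batch(grouped, batch_size, rng=random, pad_value=0):
    """Pick one bucket, then draw a zero-padded batch from that bucket only.

    Assumption: buckets are chosen with probability proportional to the
    number of sequences they hold, and sequences are drawn with replacement.
    """
    sizes = list(grouped)
    weights = [len(grouped[s]) for s in sizes]
    size = rng.choices(sizes, weights=weights, k=1)[0]
    batch = rng.choices(grouped[size], k=batch_size)
    return size, [list(s) + [pad_value] * (size - len(s)) for s in batch]
```

Each returned batch would then be fed to whichever graph was built for that bucket's number of unrolled steps.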

Evan Pu

Nov 12, 2015, 7:58:33 PM
to Discuss
Hey man, thanks. I was wondering what padded zeros are. This post explained it well, haha. Cheers.

Mohammed AlQuraishi

Nov 13, 2015, 8:03:46 AM
to Discuss
Ok thanks for the reply. This is very much along the lines I was thinking, and looking at seq2seq.py will be very helpful.