I haven't had a chance to delve into TF too deeply, but my impression is that it currently cannot dynamically unroll an RNN based on the number of time steps in a sample (or, alternatively, on the length of the longest sequence in a batch). Instead it can only unroll for a fixed number of steps, so to do full BPTT one would have to unroll for as many time steps as the longest sequence in the data set, even if no sequence in a given batch is that long. Is my understanding correct (going by the provided RNN examples)?
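To make the cost I am worried about concrete, here is a rough sketch in plain Python (no TF calls). The sequence lengths and the helper `unrolled_steps` are made up for illustration; it only counts how many time steps get processed when every batch is unrolled to the data set maximum versus to the longest sequence in that batch.

```python
def unrolled_steps(batches, fixed_len=None):
    """Count the time steps processed across all batches.

    If fixed_len is given, every sequence is unrolled for fixed_len steps.
    Otherwise each batch is unrolled to its own longest sequence.
    """
    total = 0
    for batch in batches:
        steps = fixed_len if fixed_len is not None else max(batch)
        total += steps * len(batch)
    return total


# Hypothetical sequence lengths, already grouped into batches of three.
batches = [[12, 30, 45], [8, 20, 25], [140, 90, 150]]

dataset_max = max(max(batch) for batch in batches)
fixed = unrolled_steps(batches, fixed_len=dataset_max)  # unroll to data set max
per_batch = unrolled_steps(batches)                     # unroll to batch max
real = sum(sum(batch) for batch in batches)             # non-padded steps only

print(dataset_max, fixed, per_batch, real)
```

With these made-up numbers, the fixed unroll processes roughly twice as many steps as the per-batch unroll, and most of the extra steps are padding.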
If that is the case, is it possible, using either multiple graphs or a single graph with conditionals, to have "branches" unrolled for different discrete numbers of time steps (50, 100, 150, etc.), so that the length of the longest sequence in a given batch determines which branch of the graph, or which separate graph, is used? Is this a good idea in practice? The motivation would be to save processing time by avoiding unnecessarily long unrolls that only encounter padded 0s. There would presumably be some drawbacks to using larger graphs or multiple graphs, including, in the latter case, shuttling data between graphs.
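The selection logic I have in mind would look something like the sketch below. It is plain Python and does not touch TF at all: `BUCKETS`, `pick_bucket`, and `pad_batch` are hypothetical names, and how a graph (or branch) would actually be built for each unroll length is exactly the part I am unsure about.

```python
import bisect

# Discrete unroll lengths, as in the question; must be sorted ascending.
BUCKETS = [50, 100, 150]


def pick_bucket(lengths, buckets=BUCKETS):
    """Return the smallest unroll length that fits the longest sequence."""
    longest = max(lengths)
    i = bisect.bisect_left(buckets, longest)
    if i == len(buckets):
        raise ValueError(
            "sequence of length %d exceeds largest bucket %d"
            % (longest, buckets[-1])
        )
    return buckets[i]


def pad_batch(batch, bucket, pad=0):
    """Right-pad every sequence in the batch with `pad` up to `bucket` steps."""
    return [seq + [pad] * (bucket - len(seq)) for seq in batch]


# Hypothetical batch of token-id sequences with lengths 12, 30 and 45.
batch = [[1] * 12, [1] * 30, [1] * 45]
bucket = pick_bucket([len(seq) for seq in batch])
padded = pad_batch(batch, bucket)
# `bucket` would then select which pre-built graph or branch to run.
```

Here the batch would go to the 50-step branch instead of being unrolled for 150 steps. Batches whose longest sequence falls just above a boundary (say 51) still pay for padding up to the next bucket, so the spacing of the buckets is a trade-off against the number of graphs or branches to maintain.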