How to get the intermediate outputs of a stacked LSTM when using tf.nn.dynamic_rnn in TensorFlow


abhishek kumar

May 14, 2017, 3:26:43 PM
to Discuss
I am using LSTMCells stacked with tf.contrib.rnn.MultiRNNCell and then passed to tf.nn.dynamic_rnn. The output and state that tf.nn.dynamic_rnn returns are the final output
and final state. As per my understanding, the final output is the output of the last layer across the various time-steps. What I need are all the intermediate outputs of the stacked LSTM. Please help me.

My guess is to write a custom _dynamic_rnn, as defined in https://github.com/tensorflow/tensorflow/blob/r1.1/tensorflow/python/ops/rnn.py,
so that it returns all the outputs. Please suggest.


My code flow is like this:

...
...
    num_neurons = 10
    num_layers = 3
    max_length = 8
    frame_size = 5

    cell = tf.contrib.rnn.LSTMCell(num_neurons, state_is_tuple= True)
    cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers)

    sequence = tf.placeholder(tf.float32, [None, max_length, frame_size])

    output, state = tf.nn.dynamic_rnn(
        cell,
        sequence,
        dtype=tf.float32,
        sequence_length=length(sequence),
    )
...
...
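Note that length(sequence) above is a helper I use to derive the real sequence lengths from the zero-padded input; it is not shown in the snippet. A minimal sketch of it (assuming frames of all zeros mark padding) would be:

def length(sequence):
    # A frame counts as "used" if any of its features is non-zero.
    used = tf.sign(tf.reduce_max(tf.abs(sequence), axis=2))
    # Number of used frames per batch element.
    return tf.cast(tf.reduce_sum(used, axis=1), tf.int32)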

Thanks

Thomas Quintana

May 14, 2017, 9:51:49 PM
to abhishek kumar, Discuss
As per the documentation, tf.nn.dynamic_rnn returns the output for every time step:


outputs: The RNN output `Tensor`.

  If time_major == False (default), this will be a `Tensor` shaped:
    `[batch_size, max_time, cell.output_size]`.

  If time_major == True, this will be a `Tensor` shaped:
    `[max_time, batch_size, cell.output_size]`.
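So with the default batch-major layout, the output at any single time step is just a slice. A small sketch (time step index chosen arbitrarily):

# output has shape [batch_size, max_time, cell.output_size] by default,
# so the whole batch's output at time step 3 is:
step_outputs = output[:, 3, :]  # shape [batch_size, cell.output_size]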


Gangeshwar Krishnamurthy

May 15, 2017, 12:04:53 AM
to Discuss
Hi,
As per my understanding, the output returned by dynamic_rnn is a Tensor of the intermediate state values (i.e., h1, h2, ..., hT), and the state returned by it is the final state (i.e., hT).
To demonstrate this, try the code below.

import tensorflow as tf
import numpy as np

tf.reset_default_graph()

# Create input data
X = np.random.randn(2, 10, 8)

# The second example is of length 6
X[1,6:] = 0
X_lengths = [10, 6]

cell = tf.nn.rnn_cell.LSTMCell(num_units=64, state_is_tuple=True)

outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)

result = tf.contrib.learn.run_n(
    {"outputs": outputs, "last_states": last_states},
    n=1,
    feed_dict=None)

assert result[0]["outputs"].shape == (2, 10, 64)
print(result[0]["outputs"][-1])
print(result[0]["last_states"])

The values printed should be equal, which means that the state is the last state of the RNN and outputs is the state of the RNN at every timestep.

The above code is adapted from https://github.com/dennybritz/tf-rnn/blob/master/dynamic_rnn.ipynb

Thanks,
Gangeshwar

Gangeshwar Krishnamurthy

May 15, 2017, 12:31:32 AM
to Discuss
I made a mistake in the last two lines of the code.
Here is the correction; try this code:
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

# Create input data
X = np.random.randn(2, 10, 8)

# This time both examples use the full length of 10
#X[1,6:] = 0
X_lengths = [10, 10]

cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)

outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)

result = tf.contrib.learn.run_n(
    {"outputs": outputs, "last_states": last_states},
    n=1,
    feed_dict=None)

assert result[0]["outputs"].shape == (2, 10, 64)
print(result[0]["outputs"][-1][-1]), "\n"
print(result[0]["last_states"].h[-1])

assert (result[0]["last_states"].h[-1] == result[0]["outputs"][-1][-1]).all()

abhishek kumar

May 19, 2017, 10:39:01 PM
to Discuss
Hi Gangeshwar and Thomas,

First of all, I am very sorry for the late reply. I think I was unable to explain the question well. Gangeshwar, the example you gave is quite helpful, so I will explain my problem again using it. I have updated your code to use MultiRNNCell, i.e., the LSTMCells are now stacked.

import tensorflow as tf
import numpy as np

tf.reset_default_graph()

# Create input data
X = np.random.randn(2, 10, 8)

X_lengths = [10, 10]
num_layers = 3

cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)

cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers)
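# Note: reusing one cell object for all layers worked in TF 1.0/1.1;
# later TF 1.x versions require a separate cell instance per layer.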

outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)

result = tf.contrib.learn.run_n(
    {"outputs": outputs, "last_states": last_states},
    n=1,
    feed_dict=None)

print('\n')
print "results shape", len(result), "\n"

print "result[0]['outputs'].shape", result[0]["outputs"].shape, "\n"

print "For 1st layer result[0]['last_states'][0].c.shape", result[0]["last_states"][0].c.shape, "\n"
print "For 1st layer result[0]['last_states'][0].h.shape", result[0]["last_states"][0].h.shape, "\n"

print "For 2nd layer result[0]['last_states'][1].c.shape", result[0]["last_states"][1].c.shape, "\n"
print "For 2nd layer result[0]['last_states'][1].h.shape", result[0]["last_states"][1].h.shape, "\n"

print "For 3rd layer result[0]['last_states'][2].c.shape", result[0]["last_states"][2].c.shape, "\n"
print "For 3rd layer result[0]['last_states'][2].h.shape", result[0]["last_states"][2].h.shape, "\n"

print "**************** Actual Values *********************\n"
print "For 1st layer, across 2 batches, in last time-step, result[0]['last_states'][0].c \n", result[0]["last_states"][0].c, "\n"
print "For 1st layer, across 2 batches, in last time-step, result[0]['last_states'][0].h \n", result[0]["last_states"][0].h, "\n"

print "For 2nd layer, across 2 batches, in last time-step, result[0]['last_states'][1].c \n", result[0]["last_states"][1].c, "\n"
print "For 2nd layer, across 2 batches, in last time-step, result[0]['last_states'][1].h \n", result[0]["last_states"][1].h, "\n"

print "For 3rd layer, across 2 batches, in last time-step, result[0]['last_states'][2].c \n", result[0]["last_states"][2].c, "\n"
print "For 3rd layer, across 2 batches, in last time-step, result[0]['last_states'][2].h \n", result[0]["last_states"][2].h, "\n"
#print('\n')

print "For 3rd layer, Total output for 1st batch across 10 time-steps via Outputs \n", (result[0]["outputs"][0]), "\n"
print "For 3rd layer, Total output for 2nd batch across 10 time-steps via Outputs \n", (result[0]["outputs"][1]), "\n"

assert result[0]["outputs"].shape == (2, 10, 64)
print "Last layer's, last batch, Output \n", result[0]["outputs"][-1][-1], "\n"
print "Last layer's, last batch, Output through state \n", result[0]["last_states"][-1].h[-1]

assert (result[0]["last_states"][-1].h[-1] == result[0]["outputs"][-1][-1]).all()
print """\n Verified that output returned by dynamic_rnn is a Tensor of intermediate
       state values(ie, h1, h2, ... hT) and the state returned by it is the
       final state (i.e, hT)"""

----
My problem is to find the outputs of the intermediate layers (in the above code, the 1st and 2nd layers), i.e., for each intermediate layer, the output at every time-step across all batches.

What I can get is the output of the last layer at the various time-steps across all batches, via the output returned by dynamic_rnn. I can also extract the output of the intermediate layers via last_states, but only for the last time-step: as you will see in the example code above, result[0]["last_states"][0].h gives me the output of the 1st layer, across 2 batches, at the last time-step. What I need is the output of the 1st layer, across 2 batches, at every time-step, not just the last one. And as you can see when I print result[0]["outputs"][0] and result[0]["outputs"][1], I get the output of the last layer (the 3rd here) for both batches across all time-steps (10 here).
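In other words, what I want is one tensor per layer (a sketch of the desired result, not working code):

# Desired: one (batch, time, units) tensor per layer
# layer_outputs[0] -> 1st layer's h at every time-step, shape (2, 10, 64)
# layer_outputs[1] -> 2nd layer's h at every time-step, shape (2, 10, 64)
# layer_outputs[2] -> 3rd layer's h; this one I already get via outputs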

In case you want to check the output of running this code, please find the file attached. Please let me know if I am not clear anywhere.
hellolstm_output.txt

Eugene Brevdo

May 20, 2017, 11:21:18 AM
to abhishek kumar, Discuss
Write your own MultiRNNCell that creates a tuple of all the intermediate outputs and emits that tuple as the final output. The output of dynamic_rnn will then be a tuple of the outputs of all the layers across time.


abhishek kumar

May 21, 2017, 12:36:14 AM
to Discuss
Hi Eugene,

Thanks for the reply. I tried what you said, i.e., I wrote a custom MultiRNNCell where I only changed its __call__ function, as below:

def __call__(self, inputs, state, scope=None):
    """Run this multi-layer cell on inputs, starting from state."""
    with vs.variable_scope(scope or "multi_rnn_cell"):
      cur_state_pos = 0
      cur_inp = inputs
      new_states = []
      new_outputs = []
      for i, cell in enumerate(self._cells):
        with vs.variable_scope("cell_%d" % i):
          if self._state_is_tuple:
            if not nest.is_sequence(state):
              raise ValueError(
                  "Expected state to be a tuple of length %d, but received: %s"
                  % (len(self.state_size), state))
            cur_state = state[i]
          else:
            cur_state = array_ops.slice(
                state, [0, cur_state_pos], [-1, cell.state_size])
            cur_state_pos += cell.state_size
          cur_inp, new_state = cell(cur_inp, cur_state)
          new_states.append(new_state)
          new_outputs.append(cur_inp)
    new_states = (tuple(new_states) if self._state_is_tuple else
                  array_ops.concat(new_states, 1))
    new_outputs = (tuple(new_outputs) if self._state_is_tuple else
                  array_ops.concat(new_outputs, 1))

    # Originally: return cur_inp, new_states
    return new_outputs, new_states

I changed no other code apart from this, but I am still not getting the desired output. In my example, the shape of the output (of dynamic_rnn) still remains (2, 10, 64), where
2 is the number of batches,
10 the number of time-steps, and
64 the number of units,
and I am not able to get the outputs corresponding to each intermediate layer, i.e., the 1st and 2nd layers.
Did I make the correct changes, or do I need to change other code too, such as dynamic_rnn?

Thanks
Abhishek

Eugene Brevdo

May 21, 2017, 1:53:16 AM
to abhishek kumar, Discuss
You need to update the output_size property as well.
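Something like this, to match the tuple-emitting __call__ (a sketch, assuming state_is_tuple=True):

@property
def output_size(self):
    # Report one output size per layer so dynamic_rnn allocates a
    # matching tuple of per-layer output tensors.
    return tuple(cell.output_size for cell in self._cells)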


abhishek kumar

May 21, 2017, 3:44:23 AM
to Discuss
Hi Eugene,

Thanks a ton for your help. I figured out my mistake right after writing my last comment :)
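For anyone who lands on this thread later, this is roughly how it fits together (a sketch under the TF 1.1-era APIs; AllLayerOutputMultiRNNCell is just an illustrative name for a custom cell with the modified __call__ and output_size above, and X / X_lengths come from the earlier example):

# Hypothetical subclass combining the __call__ above with the
# output_size property Eugene suggested.
cells = [tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
         for _ in range(3)]
cell = AllLayerOutputMultiRNNCell(cells, state_is_tuple=True)

layer_outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)

# layer_outputs is now a 3-tuple, one tensor per layer, each of shape
# (batch, time, units) = (2, 10, 64). layer_outputs[0] is the 1st
# layer's output at every time-step across all batches.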