Hi,
I'm trying to use TimeDistributed and LSTM in a way that apparently isn't supported; have any of you run into the same problem? And if so, is there a way around this?
Here's the problem:
>>> from keras.layers import Highway, Input, TimeDistributed
>>> input1 = Input(shape=(3, 5))
>>> input2 = Input(shape=(1, 5))
>>> highway_layer = Highway(activation='relu', name='highway')
>>> distributed_highway_layer = TimeDistributed(highway_layer, name='distributed_highway')
>>> highway_input1 = distributed_highway_layer(input1)
>>> highway_input2 = distributed_highway_layer(input2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 494, in __call__
self.assert_input_compatibility(x)
File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 434, in assert_input_compatibility
str(x_shape))
Exception: Input 0 is incompatible with layer distributed_highway: expected shape=(None, 3, 5), found shape=(None, 1, 5)
The TimeDistributed layer gets built when it's applied to the first input, and at that point it records that input's shape as its expected input spec. When the same layer is then applied to a second input with a different number of timesteps, the compatibility check fails and it crashes.
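You can see the recorded spec by poking at the wrapper after the first call (if I'm reading keras/engine/topology.py right, the built layer keeps a list of InputSpec objects):
>>> distributed_highway_layer.input_spec[0].shape
(None, 3, 5)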
Where do you run into this problem? I'm trying to re-implement a reading comprehension model that uses highway layers on top of word embeddings for both a question and a passage of text. So my question tensor has shape (batch_size, num_question_words, embedding_dim), and my passage tensor has shape (batch_size, num_passage_words, embedding_dim). I want a highway layer applied to the embeddings, and I want it to be the _same_ highway layer for both the question and the passage. The code above seems like the natural way to implement this (assuming "input1" and "input2" are actually previous-layer outputs holding my word embeddings for the question and the passage). However, it doesn't work.
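For concreteness, here's a sketch of the wiring I'm after (vocab_size, embedding_dim, num_question_words, and num_passage_words are placeholders):
>>> from keras.layers import Embedding, Highway, Input, TimeDistributed
>>> question_input = Input(shape=(num_question_words,), dtype='int32')
>>> passage_input = Input(shape=(num_passage_words,), dtype='int32')
>>> embedding = Embedding(vocab_size, embedding_dim)
>>> question_embedding = embedding(question_input)  # (batch_size, num_question_words, embedding_dim)
>>> passage_embedding = embedding(passage_input)  # (batch_size, num_passage_words, embedding_dim)
>>> distributed_highway = TimeDistributed(Highway(activation='relu'))
>>> question_highway = distributed_highway(question_embedding)  # fine; the wrapper gets built here
>>> passage_highway = distributed_highway(passage_embedding)  # crashes, as above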
I can think of one workaround: instantiate two separate TimeDistributed objects, both wrapping the same underlying Highway layer. Because TimeDistributed itself doesn't have any parameters, this actually works, but it's a bit ugly.
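Concretely, something like this (a sketch; the sharing comes entirely from reusing highway_layer, since the TimeDistributed wrappers hold no weights of their own):
>>> highway_layer = Highway(activation='relu', name='highway')
>>> distributed_highway1 = TimeDistributed(highway_layer, name='distributed_highway1')
>>> distributed_highway2 = TimeDistributed(highway_layer, name='distributed_highway2')
>>> highway_input1 = distributed_highway1(input1)
>>> highway_input2 = distributed_highway2(input2)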
And it wouldn't be so bad if this only applied to TimeDistributed, but you get the same problem with any recurrent layer:
>>> from keras.layers import LSTM, Input
>>> input1 = Input(shape=(3, 5))
>>> input2 = Input(shape=(1, 5))
>>> lstm = LSTM(10)
>>> lstm(input1)
>>> lstm(input2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 494, in __call__
self.assert_input_compatibility(x)
File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 434, in assert_input_compatibility
str(x_shape))
Exception: Input 0 is incompatible with layer lstm_1: expected shape=(None, 3, 5), found shape=(None, 1, 5)
And for the recurrent-layer case I don't know of any workaround at all. Any ideas?
Matt