What is the purpose of RepeatVector


reinis

Jun 1, 2016, 10:09:30 AM
to Keras-users
Hello,

The documentation clearly states what RepeatVector does.

What I don't understand is why I would want to repeat the same input n times and feed it to the next layer.

e.g.

self.add(self.encoder)                 # LSTM encoder
self.add(Dropout(dropout))
self.add(RepeatVector(output_length))  # repeat the encoded vector output_length times
self.add(self.decoder)                 # LSTM decoder

So I encode the input, apply dropout, then repeat the very same encoded input 'output_length' times and feed it to the decoder. What is the purpose of repeating?

thx
reinis

boll...@gmx.de

Jun 1, 2016, 2:23:36 PM
to Keras-users
Is that example from the seq2seq repo? If so, the output of the encoder is a single vector, while the decoder expects sequential input, i.e. output_length timesteps of vectors. RepeatVector makes sure that this same vector (coming from the encoder) is used as the input at each timestep of the decoder.
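
To make this concrete, here is a minimal sketch of the shape change (not the seq2seq repo code; the sizes 8, 16 and 5 below are just made-up examples):

from keras.models import Sequential
from keras.layers import Dense, RepeatVector

model = Sequential()
model.add(Dense(16, input_dim=8))   # one vector of size 16 per sample: (None, 16)
model.add(RepeatVector(5))          # the same vector copied 5 times: (None, 5, 16)
print(model.output_shape)           # (None, 5, 16): now suitable as input to an RNN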

- Marcel

reinis

Jun 1, 2016, 3:23:21 PM
to Keras-users, boll...@gmx.de
Hi Marcel,

It is from the actual SimpleSeq2seq implementation in Keras.

Alas, your answer is still puzzling me. Based on this code:

self.encoder = LSTM(hidden_dim, **kwargs)
self.decoder = LSTM(hidden_dim if depth[1]>1 else output_dim, return_sequences=True, **kwargs)

I don't see how the decoder expects sequential input. I assume that I could very well supply just one vector to the decoder and it would still work with no issues. I am providing time series by formatting my training data as time series.

So I would like to make my question more precise: what is the LOGICAL purpose of providing a sequence of the same vector to the decoder?

thx for your help
reinis

boll...@gmx.de

Jun 1, 2016, 4:02:22 PM
to Keras-users, boll...@gmx.de
It expects sequential (time-series) input due to the simple fact that it's an LSTM. Of course you could supply it with a sequence containing just a single vector, but then your output would have length 1 as well. In other words, to get output with n timesteps, you need to provide input with n timesteps.

Note that this is NOT already accounted for by the fact that your training data has n timesteps, because the encoder LSTM squashes these timesteps to a single vector (due to implied return_sequences=False). To get n timesteps out of the decoder again, we need to provide it input as a time-series, and a very simple way to do that is by repeating the output of the previous layer n times.
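
To illustrate with a rough sketch (invented sizes, not the actual SimpleSeq2seq code: 10 input timesteps of 8 features, hidden_dim=16, output_length=5):

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector

model = Sequential()
# Encoder: return_sequences defaults to False, so the 10 input timesteps
# are squashed into a single vector of size 16 per sample.
model.add(LSTM(16, input_shape=(10, 8)))    # (None, 16)
# Repeat that single vector so the decoder sees 5 timesteps.
model.add(RepeatVector(5))                  # (None, 5, 16)
# Decoder: return_sequences=True, so it emits one vector per timestep.
model.add(LSTM(16, return_sequences=True))  # (None, 5, 16)
model.summary()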

- Marcel

reinis

Jun 1, 2016, 4:16:50 PM
to Keras-users, boll...@gmx.de
Excellent, I almost got it. One final question though.

The encoder encodes my time-series data and squashes it into one vector representation. This squashed vector then represents the whole sequence I supplied as input. Now, aren't we losing information if I provide the decoder with only one input vector (the squashed one) for a whole time series? Or is this just a trick to inform the decoder about the length of the input that was originally there (before squashing)? Because without this trick, I would expect the decoder to learn the length from the target sequence supplied as the label for the given time series.

I know this assumption is possibly very wrong and stems only from my poor knowledge of nets.

thx
reinis

boll...@gmx.de

Jun 1, 2016, 4:34:17 PM
to Keras-users, boll...@gmx.de
I guess you can call it a trick, but note that it really informs the decoder about the output length, which doesn't have to be the same as the input length! I'm not sure where you think we could lose information -- as you said, the whole input sequence is encoded in the squashed vector, but how well this can be done depends on many factors, among them the dimensionality of this vector (hidden_dim), the properties of the data, etc.

I also don't quite understand your last sentence "Because without this trick...". Without RepeatVector(), you simply could not train with target sequences longer than 1 element, because the shape of the training data would not match the expected output shape of the model.
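
As a concrete sketch of that shape-matching point (again with invented sizes: 10 input timesteps mapped to 5 output timesteps), the targets just have to match the model's output shape, and RepeatVector is what fixes that output length:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector

model = Sequential()
model.add(LSTM(16, input_shape=(10, 8)))    # encoder squashes 10 steps into (None, 16)
model.add(RepeatVector(5))                  # (None, 5, 16): output length fixed to 5
model.add(LSTM(16, return_sequences=True))  # (None, 5, 16)
model.compile(loss='mse', optimizer='rmsprop')

x = np.random.random((32, 10, 8))   # 32 samples, 10 input timesteps, 8 features
y = np.random.random((32, 5, 16))   # targets: 5 timesteps, matching the output shape
model.fit(x, y)                     # shapes line up; without RepeatVector the model
                                    # would output (None, 16) and this fit would fail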

- Marcel

reinis

Jun 1, 2016, 5:35:15 PM
to Keras-users, boll...@gmx.de
Finally I got it.

I totally ignored the fact that the code clearly says how many repetitions have to be performed:

self.add(RepeatVector(output_length))

The key here is the parameter output_length! I totally ignored it and assumed that the repetition is somehow based on the shape of the input, which it is not! The repetition is based on the output length, which totally makes sense.

Thanks again Marcel, great help.
reinis