Sequence to Sequence autoencoder

madhum...@gmail.com

May 26, 2016, 8:22:58 AM
to Keras-users
Hello,

In the tutorial about autoencoder implementation in Keras, particularly the sequence-to-sequence autoencoder, it is suggested that we first encode the entire sequence into a single vector using an LSTM, and then repeat that vector 'n' times, where 'n' is the number of timesteps, before decoding. However, I fail to understand why this repetition of the vector is required before decoding.

Instead, why can't vector representations for sequences be learned as in a standard autoencoder, just replacing the Dense layers with LSTM layers and padding the sequences to a maxlen, where each entry in the sequence is of size input_dim? Since it is an encoder-decoder model, the input sequence would be the same as the output sequence.

Thanks in advance,
Kind Regards,
Madhumita

ashis...@gmail.com

Jul 27, 2016, 6:51:45 AM
to Keras-users, madhum...@gmail.com
Hi, 

I guess it is to model each sequence. If a sentence has three words, say sentence = <w1, w2, w3>, then you need to represent each word (w_i) with a vector and then repeat n times (where n is your sequence length). Hope this helps.

Madhumita

Aug 28, 2016, 4:17:46 PM
to ashis...@gmail.com, Keras-users

Hi,

Thanks a lot for your reply! However, repeating the vector 'n' times is the point of confusion for me. For example,

Say each of my words w1, w2, w3 can be represented with a vector of 300 dimensions. From what I understand, once each word is encoded, it is fed to the decoder (one timestep at a time). So when I repeat the vector of these words, how many times do I repeat it? If I want the output representation of the word to be of 600 dimensions, do I repeat it two times? What if input and output are the same (as in an autoencoder)?

Kind Regards,
Madhumita

dhrushil

Jan 27, 2017, 3:34:45 AM
to Keras-users, madhum...@gmail.com

I'm confused about this too; is there a resolution?

jpeg729

Feb 6, 2017, 6:33:52 PM
to Keras-users, madhum...@gmail.com
It seemed fairly obvious to me.

The decoder is an RNN, and we need it to produce 'n' outputs, 'n' being the length of the input sequence that it is supposed to be reproducing. Now, because it is an RNN, in order to produce 'n' outputs you have to give it 'n' inputs, which is why we repeat the vector 'n' times.

Granted, you don't actually have to give the decoder the same vector at each timestep, but it is easy to code it that way and it probably simplifies the decoder's work. You could instead feed the decoder the encoded vector at the first timestep followed by blank inputs thereafter, but then it would have to work hard at memorising the encoded vector before it could decode it.
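
A model along those lines would look something like this in Keras (untested sketch; timesteps, input_dim and latent_dim are placeholder sizes I picked, and the commented-out fit line assumes a hypothetical x_train array of shape (num_sequences, timesteps, input_dim)):

from keras.layers import Input, LSTM, RepeatVector
from keras.models import Model

timesteps = 10    # 'n', the length of each input sequence
input_dim = 300   # size of each element of the sequence (e.g. a word vector)
latent_dim = 128  # size of the fixed-length encoding

inputs = Input(shape=(timesteps, input_dim))
# Encoder: read the whole sequence and keep only the final output,
# giving one fixed-length vector per sequence.
encoded = LSTM(latent_dim)(inputs)

# Hand that same vector to the decoder at every timestep, so the decoder
# RNN receives 'timesteps' inputs and can produce 'timesteps' outputs.
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)

autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)   # the encoder half on its own

# Train it to reproduce its own input:
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(x_train, x_train, batch_size=32)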

jpeg729

Feb 6, 2017, 6:48:52 PM
to Keras-users, madhum...@gmail.com
To reply to the second question: the beauty of an RNN model is that it can deal with arbitrary-length sequences. You feed it the elements of the input sequence one after another until you are done. For each element it produces an output, and you discard all but the last output; that is your encoded vector representation. You could consider concatenating all of the outputs to make your encoded vector, but that would be a vector whose length depends on the input sequence length, which would defeat the purpose of an autoencoder, which is supposed to produce a fixed-length encoded representation of the input.
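
As a quick illustration, with Keras you can even leave the number of timesteps unspecified, and the encoding keeps the same fixed size however long the input sequence is (untested sketch, sizes made up):

from keras.layers import Input, LSTM
from keras.models import Model
import numpy as np

# Sequences of 300-d vectors, of arbitrary length (None); 64-d encoding.
inp = Input(shape=(None, 300))
enc = LSTM(64)(inp)   # return_sequences=False by default: keep only the last output
encoder = Model(inp, enc)

print(encoder.predict(np.zeros((1, 5, 300))).shape)   # (1, 64)
print(encoder.predict(np.zeros((1, 12, 300))).shape)  # (1, 64)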

Step two is reproducing the sequence from the vector representation. To do so you feed this vector into the decoder as many times as necessary for the decoder to reproduce the complete sequence element by element.

If you used padded sequences, then the encoder and decoder would have to learn to ignore the padding as well as learn the encoding. This adds a source of potential confusion; it would be inefficient and conceptually messy.

