Final question #6

Michele Zuccardi Merli

Jun 17, 2021, 5:03:53 PM

to NCCU DS4CS

Hi Professor,

How are we supposed to train the embedding model in question 6 of the final? Since there are no labels, the only way I can see to obtain word embeddings is to call the .predict method on the model directly, without fitting it to the training data, which means we would be using random, unoptimized weights. Or should we use an already trained embedding model? In that case, however, we cannot use the Keras embedding layer.

Thanks

Mike Hsiao

Jun 18, 2021, 2:45:36 AM

to NCCU DS4CS

Hi, Michele,

It is an excellent question. Usually, an embedding is learned as part of a supervised learning problem: we convert text-based data to vectors (i.e., the embedding) and then solve the supervised task on top of them, just like the example provided in Google's "Word Embeddings" introduction.

https://www.tensorflow.org/text/guide/word_embeddings

The introduction page says:

"When you create an Embedding layer, the weights for the embedding are randomly initialized (just like any other layer). During training, they are gradually adjusted via backpropagation. Once trained, the learned word embeddings will roughly encode similarities between words (as they were learned for the specific problem your model is trained on)."

You can also see an example here.

https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding

Without learning, the embedding layer can still output something, but its outputs are simply random numbers.
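To see this concretely, here is a minimal sketch, assuming TensorFlow 2.x is installed. The vocabulary size, vector size, and token ids are made-up values for illustration only. It passes token ids through a freshly created Embedding layer with no training at all:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only for illustration.
vocab_size, embed_dim = 10, 4

# A brand-new layer: its weights are randomly initialized and never fitted.
embedding = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)

token_ids = np.array([[1, 2, 2]])        # one "sentence" of three token ids
vectors = embedding(token_ids).numpy()   # shape: (1, 3, 4)

print(vectors.shape)
# The same id always maps to the same row of the weight matrix,
# but the values themselves carry no meaning before training.
print(np.allclose(vectors[0, 1], vectors[0, 2]))
```

The layer is only a lookup table at this point, so the vectors say nothing about word similarity.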

However, as mentioned in class, we can take an approach similar to an auto-encoder:

1) Train the embedding layer to reproduce exactly what we feed into it.

2) Or let the embedding layer learn a 'cloze' task.

3) Or, if possible, perform unsupervised clustering on the dataset first and use the clusters as labels. Then create another model with an embedding layer and train it on those labels.

4) ...
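As one possible reading of the 'cloze' idea in (2), the sketch below hides each word and asks the model to predict it from its neighbours, so the text supplies its own labels. This is only an assumption about how such a task could be set up, not a required design: the toy corpus, the window size, the embedding size of 8, and the small averaging model are all made up for illustration (TensorFlow 2.x assumed).

```python
import numpy as np
import tensorflow as tf

# Toy corpus, already tokenized; purely illustrative.
sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased a dog".split(),
]

# Word -> id vocabulary (id 0 is reserved for padding).
vocab = {w: i + 1 for i, w in enumerate(sorted({w for s in sentences for w in s}))}
vocab_size = len(vocab) + 1
window = 1  # number of neighbours on each side used as context

# Cloze pairs: context = neighbours of a word, target = the hidden word.
contexts, targets = [], []
for s in sentences:
    ids = [vocab[w] for w in s]
    for i, target in enumerate(ids):
        ctx = ids[max(0, i - window):i] + ids[i + 1:i + 1 + window]
        ctx += [0] * (2 * window - len(ctx))  # pad to a fixed length
        contexts.append(ctx)
        targets.append(target)

contexts = np.array(contexts)
targets = np.array(targets)

# Embedding -> average of context vectors -> softmax over the vocabulary.
# Padding vectors are included in the average, which is acceptable for a sketch.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(contexts, targets, epochs=50, verbose=0)

# The trained embedding matrix: one row per vocabulary id.
word_vectors = model.layers[0].get_weights()[0]
print(word_vectors.shape)
```

Idea (1) could be set up in a similar way by making the target the input word itself instead of a hidden neighbour.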

I believe (3) is the easiest solution.
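A rough sketch of option (3) might look like the following. It assumes TensorFlow 2.x and scikit-learn are available; the four example documents, the choice of K-means on word counts, the number of clusters, and the embedding size are all hypothetical choices, and any reasonable clustering method could stand in.

```python
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

# Unlabeled toy documents; purely illustrative.
docs = [
    "cheap flights and hotel deals",
    "book hotel and flights online",
    "neural network training loss",
    "training a deep neural network",
]
tokenized = [d.split() for d in docs]
vocab = {w: i + 1 for i, w in enumerate(sorted({w for t in tokenized for w in t}))}
vocab_size = len(vocab) + 1  # id 0 is reserved for padding
max_len = max(len(t) for t in tokenized)

# Two views of the data: padded id sequences for the network,
# and bag-of-words counts for the clustering step.
seqs = np.zeros((len(docs), max_len), dtype="int32")
counts = np.zeros((len(docs), vocab_size), dtype="float32")
for row, toks in enumerate(tokenized):
    for col, w in enumerate(toks):
        seqs[row, col] = vocab[w]
        counts[row, vocab[w]] += 1

# Step 1: unsupervised clustering produces pseudo-labels.
n_clusters = 2
pseudo_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(counts)

# Step 2: train a model with an embedding layer to predict those pseudo-labels.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(n_clusters, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(seqs, pseudo_labels, epochs=30, verbose=0)

# Embeddings shaped by the pseudo-label task: one row per vocabulary id.
word_vectors = model.layers[0].get_weights()[0]
print(pseudo_labels, word_vectors.shape)
```

The quality of the embeddings then depends on how meaningful the clusters are, so it is worth inspecting the clusters before trusting the labels.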

There are various creative ways to deal with this kind of issue, and I believe real-world problems and data often look like this.

As long as your design is reasonable, the TA and I will accept your answer. You are free to design your own NN and data pre-processing.

Don't worry about it.

Again, this is an excellent question to discuss.

Thanks,

Hsiao

On Friday, June 18, 2021 at 5:03:53 AM [UTC+8], michele.zu...@gmail.com wrote:
