Hi,
I have seen similar posts to this one, but none really give me an idea of exactly what is happening.
I am using a callback to print the latest training loss at each epoch (roughly as sketched below).
My corpus has a vocabulary of 62,000 words across 102,000 individual texts (tokenized into a list of lists, each inner list being one text/sentence).
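Roughly, my setup looks like this. This is a minimal, self-contained sketch: the toy `sentences` list stands in for my real corpus, and the hyperparameters (including `sg=1`) are illustrative, not my exact settings:

```python
from gensim.models import Word2Vec
from gensim.models.callbacks import CallbackAny2Vec

# Toy stand-in for my real corpus (102,000 tokenized texts, 62,000-word vocabulary).
sentences = [
    ["the", "dog", "barks", "at", "the", "cat"],
    ["the", "cat", "chases", "the", "mouse"],
    ["the", "mouse", "eats", "the", "cheese"],
] * 100

class LossLogger(CallbackAny2Vec):
    """Print the value of get_latest_training_loss() at the end of every epoch."""
    def __init__(self):
        self.epoch = 0

    def on_epoch_end(self, model):
        loss = model.get_latest_training_loss()
        print(f"Loss after epoch {self.epoch}: {loss}")
        self.epoch += 1

model = Word2Vec(
    sentences=sentences,
    vector_size=100,          # illustrative hyperparameters
    min_count=1,
    sg=1,                     # assuming skip-gram; negative sampling is the gensim default
    compute_loss=True,        # required for get_latest_training_loss()
    epochs=13,                # epochs 0-12 in the printout below
    callbacks=[LossLogger()],
)
```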
Training proceeds without issue, and the model actually makes sense after training: semantically, I see associations that are correct.
I can also validate that if I instantiate an empty w2v model and only build the vocabulary, the wv.most_similar results for a word I know are nonsensical, which is a good sanity check - so training really does work on this corpus (sketch of that check further down). Below, however, are the increasing losses I see as training progresses (epochs 0-12 shown as an example):
Loss after epoch 0: 3143296.25
Loss after epoch 1: 5722781.0
Loss after epoch 2: 8161942.0
Loss after epoch 3: 10323672.0
Loss after epoch 4: 12479485.0
Loss after epoch 5: 14592014.0
Loss after epoch 6: 16683512.0
Loss after epoch 7: 18451656.0
Loss after epoch 8: 20224652.0
Loss after epoch 9: 21987936.0
Loss after epoch 10: 23745272.0
Loss after epoch 11: 25497322.0
Loss after epoch 12: 27236514.0
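For reference, the sanity check I mentioned above looks roughly like this, continuing from the sketch earlier (same imports, `sentences`, and `model`; "dog" stands in for a word whose neighbours I can judge):

```python
# Untrained baseline: same corpus, vocabulary built, but zero training passes.
untrained = Word2Vec(vector_size=100, min_count=1)
untrained.build_vocab(sentences)
print(untrained.wv.most_similar("dog"))   # neighbours look random/nonsensical

# Trained model from the sketch above: neighbours are semantically sensible.
print(model.wv.most_similar("dog"))
```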
I am very confused by this loss pattern. If the model is finding correct associations, shouldn't its loss per example be decreasing as it learns the correct word associations? Or is it because it also gets many other word pairs wrong when negative sampling is employed?
Thank you in advance for any help!