Doc2Vec loss function

Tedo Vrbanec

Mar 3, 2023, 1:12:10 PM
to Gensim
When training a Doc2Vec model, we can pass the parameter compute_loss=True. How can we get these numbers, the values of the loss function after each epoch?

Tedo Vrbanec

Mar 3, 2023, 1:15:58 PM
to Gensim
I would like the loss to be printed and saved to a file.
For Word2Vec this works: we can pass callbacks=[callback_w2v('tmp/w2v_')] to .train(), with the class:
from gensim.models.callbacks import CallbackAny2Vec

class callback_w2v(CallbackAny2Vec):
    '''Callback to print and save the loss after each epoch.'''

    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        self.loss_previous_step = 0  # Word2Vec's loss tally is cumulative across epochs

    def on_epoch_end(self, model):
        loss = model.get_latest_training_loss()
        if self.epoch == 0:
            print('Loss after epoch {}: {}'.format(self.epoch, loss))
        else:
            print('Loss and loss difference after epoch {}: loss: {} difference: {}'.format(
                self.epoch, loss, loss - self.loss_previous_step))
        # Append epoch, cumulative loss, and per-epoch difference as a TSV row.
        with open(self.savedir + 'loss.txt', 'a') as f:
            f.write(str(self.epoch) + '\t' + str(loss) + '\t' + str(loss - self.loss_previous_step) + '\n')
        self.epoch += 1
        self.loss_previous_step = loss
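
(For completeness, a minimal sketch of wiring that callback into Word2Vec training; the toy sentences corpus and the 100-dimension setting are placeholders, and the 'tmp/' directory must already exist:)

from gensim.models import Word2Vec

# Placeholder corpus: any iterable of tokenized sentences.
sentences = [['first', 'toy', 'sentence'], ['second', 'toy', 'sentence']]

model = Word2Vec(
    sentences=sentences,
    vector_size=100,
    min_count=1,                           # keep the toy tokens
    compute_loss=True,                     # enable the running loss tally
    callbacks=[callback_w2v('tmp/w2v_')],  # print & append the loss each epoch
)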

The same approach with Doc2Vec gives me only zero values of the loss function and, of course, of the difference as well.

Tedo Vrbanec

Mar 4, 2023, 10:27:39 AM
to Gensim
So, loss values in Word2Vec can be extracted. Perplexity is a kind of measure used with Doc2Vec, but I did not succeed in using it. Can someone help with it?
Overall, I just need these training metrics for all the deep-learning models supported by Gensim: Word2Vec (solved), Doc2Vec, and FastText (still to solve).
Why is that not included in the trained model automatically when using the parameter compute_loss=True?

Gordon Mohr

Mar 6, 2023, 1:45:17 PM
to Gensim
Loss-reporting for the *2Vec models other than `Word2Vec` has never been contributed/implemented for Gensim. (And, the loss-reporting in `Word2Vec` is somewhat buggy & nonintuitive.) Improvements & extensions are wanted, and an open issue summarizing needs (with links to other open issues) can be viewed at:


But, it's been waiting quite a while for someone with the right interest & skills to tackle it. 

- Gordon

Tedo Vrbanec

Mar 19, 2023, 3:20:00 PM
to Gensim
The situation is worse than I imagined... If we do not have control over what we are doing, how appropriate is Gensim for use, at least in science? For example, one reviewer asks me for loss-function results... which I cannot present.

Gordon Mohr

Mar 19, 2023, 11:18:31 PM
to Gensim
While working `Doc2Vec` loss-reporting could be a useful addition, & would be a welcome contribution from a capable programmer, most applications of `Doc2Vec` have no need for a loss-readout.

It's not necessary for interpreting the results, or for evaluating the performance of the model's outputs for specific purposes. None of the original work introducing the algorithm reports the loss, or any use of the value for evaluation.

Some have the mistaken belief that training loss is some indicator of the general quality of a model – perhaps that's the case with your reviewer? 

But training-loss is only a narrow measure of how well the model has adapted to the training data. A model with lower loss on its training corpus might actually be worse at intended downstream tasks than one with a higher loss – notably in cases of overfitting. So for data science purposes, other fuller measures of success at the ultimate tasks are usually more informative than the internal loss of an unsupervised modeling step. 
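
(As one concrete example of such a fuller measure: the Gensim `Doc2Vec` tutorial's sanity check re-infers a vector for each training document & checks that each document is its own nearest neighbor. A rough sketch, where `model` & `train_corpus` are assumed to already exist, with documents tagged by their integer position:)

import collections

# Re-infer each training document's vector & record where the document
# ranks among its own nearest neighbors (0 = most similar to itself).
ranks = []
for doc_id in range(len(train_corpus)):
    inferred_vector = model.infer_vector(train_corpus[doc_id].words)
    sims = model.dv.most_similar([inferred_vector], topn=len(model.dv))
    ranks.append([docid for docid, sim in sims].index(doc_id))

print(collections.Counter(ranks))  # ideally, rank 0 dominates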

Running loss during training could plausibly help determine dynamically, rather than via a prechosen number of training epochs, when further SGD-based optimization can't make a model any better at its internal prediction target. When the running loss stops improving you might as well end training – from that point on, each epoch only slightly jitters the model to be better on some cases but equally worse on others. Still, ML practitioners sometimes don't even want to train to that loss-minimization point – instead using "early stopping" – because of the risk that focusing on loss alone leads to overfitting and less generalization.
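
(To make that concrete where loss-reporting does work, i.e. `Word2Vec`: one rough sketch of dynamic stopping is to call `train()` one epoch at a time & stop when the per-call loss stops improving. The `corpus` variable, epoch cap, & 0.1% threshold are illustrative assumptions; also note that each separate `train()` call runs its own learning-rate decay from `alpha` down to `min_alpha`, so careful use would manage those per call:)

from gensim.models import Word2Vec

model = Word2Vec(vector_size=100, min_count=5)
model.build_vocab(corpus)  # `corpus`: a restartable iterable of token lists (assumed)

previous_loss = None
for epoch in range(50):  # hard cap, in case the loss never plateaus
    # Each train() call restarts the loss tally for that call.
    model.train(corpus, total_examples=model.corpus_count,
                epochs=1, compute_loss=True)
    loss = model.get_latest_training_loss()
    if previous_loss is not None and abs(previous_loss - loss) < 0.001 * previous_loss:
        print('Loss plateaued; stopping after epoch', epoch)
        break
    previous_loss = loss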

However, it doesn't seem like dynamic stopping is what you're seeking. If your paper is about some sort of predictor, maybe what your reviewer really wants is the loss of your overall technique, *not* the internal loss of one contributing unsupervised Doc2Vec step?

If I'm overlooking some other tangible benefits of a loss readout, please let me know.

And if your need is acute, adding the kind of loss-reporting you need is likely a small project, for any Python data science coding talent you have on your project or could contract.

- Gordon

Benedict Holland

Mar 19, 2023, 11:31:33 PM
to gen...@googlegroups.com
Loss is basically a value in your model that gets maximized or minimized. That reviewer almost certainly comes from a modeling land where things like minimization, F-scores, and p-values have a specific meaning. It helps explain how much variation a model captures. AFAIK, and it's been a very long time, that isn't the purpose of classifier models. The purpose isn't model variance but how well a model predicts out-of-sample classifications.

It's like asking why the model found a specific attribute statistically significant and a vital indicator. Does it matter? Maybe. Probably not. At least, not nearly as important as whether the model gets a classification wrong.

Thanks,
Ben

Tedo Vrbanec

Mar 20, 2023, 10:37:16 AM
to Gensim
For me, dynamic stopping is what I am looking for. As for the reviewer, I am not sure. :)