What loss function is used in LSTM model?


smole...@gmail.com

Jun 14, 2017, 7:21:16 AM6/14/17
to fairseq Users
While running the train command, I can see that the network is reporting two quantities: train loss and train ppl.

What is ppl? Is that perplexity? Isn't train perplexity the same as train loss?

Jonas Gehring

Jun 14, 2017, 7:59:56 AM6/14/17
to fairseq Users
Hi there,

Yes, ppl stands for perplexity. During training we minimize cross-entropy, so the loss reported in the log is cross-entropy / log(2) (i.e., cross-entropy in bits), and perplexity is 2^loss. See also the logging code at https://github.com/facebookresearch/fairseq/blob/b08530a/fairseq/torchnet/hooks.lua#L296 for reference.
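The relationship above can be sketched in a few lines of Python (an illustrative helper, not fairseq code — the function name and input are assumptions for the example):

```python
import math

def loss_and_ppl(cross_entropy_nats):
    """Convert a natural-log cross-entropy into the base-2 loss and
    perplexity values as described above (illustrative only)."""
    loss = cross_entropy_nats / math.log(2)  # cross-entropy in bits
    ppl = 2 ** loss                          # perplexity = 2^loss
    return loss, ppl

# A cross-entropy of ln(10) nats corresponds to a perplexity of 10:
loss, ppl = loss_and_ppl(math.log(10))
print(loss, ppl)  # loss = log2(10) ≈ 3.32 bits, ppl = 10
```

So loss and perplexity carry the same information, just on different scales: perplexity is the exponentiated loss.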

Cheers!
Jonas



