Comparing Scores Across Models

Andrew Nystrom

Jun 26, 2014, 10:58:51 AM

to berkeleyl...@googlegroups.com

Is it sound to compare scores coming from different models? For example, say I have labelled data with classes C1 and C2 (this is a fictitious case; there are better ways to solve this problem, but it serves as an example). I train one model per class: I take all the text of C1 and train a model on it, and do the same for C2. If LogProb(doc|class) is the output of getLogProb, would it then make sense to say that a new document is a mixture of both classes according to the distribution [LogProb(doc|C1), LogProb(doc|C2)]? I think the core of the question is just whether the log probs are normalized.

Adam Pauls

Jun 26, 2014, 11:06:26 AM

to berkeleyl...@googlegroups.com

Yes, the probabilities are normalized. So other than perhaps learning what the mixture weights P(C1) and P(C2) are, comparing probabilities from two models is a reasonable thing to do.
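
For readers who want to see the comparison spelled out, here is a minimal sketch in plain Python. It is not part of the library and was not posted in the thread. It assumes you already have one log probability per class for the document (for instance, the values returned by getLogProb on each model), that those values are natural logs (convert first if your toolkit reports a different base), and that the priors P(C1) and P(C2) are uniform unless you supply them. The function name class_posterior and the example scores are hypothetical. Note that the raw log probabilities are not themselves a distribution over classes; they have to be combined with the priors and renormalized.

```python
import math


def class_posterior(log_probs, log_priors=None):
    """Turn per-class log P(doc|class) into P(class|doc) via Bayes' rule.

    log_probs:  list of natural-log likelihoods, one per class model.
    log_priors: list of natural-log class priors; uniform if omitted
                (an assumption, since the thread leaves the weights open).
    """
    if log_priors is None:
        log_priors = [math.log(1.0 / len(log_probs))] * len(log_probs)

    # log P(doc, class) = log P(doc|class) + log P(class)
    joint = [lp + pr for lp, pr in zip(log_probs, log_priors)]

    # Log-sum-exp: subtract the max before exponentiating so that very
    # negative document scores do not underflow to zero.
    m = max(joint)
    log_norm = m + math.log(sum(math.exp(j - m) for j in joint))

    return [math.exp(j - log_norm) for j in joint]


# Hypothetical scores for one document under the C1 and C2 models.
posterior = class_posterior([-120.0, -123.0])
print(posterior)
```

Because document log probabilities grow with document length, even a small per-word difference between the two models tends to push the posterior close to 0 or 1 on long documents.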

Andrew Nystrom

Jun 27, 2014, 11:12:37 AM

to berkeleyl...@googlegroups.com

Thanks, Adam! I appreciate your quick responses!
