Calculating the scores

peter....@nrc-cnrc.gc.ca

Apr 6, 2007, 10:06:33 AM
to SemanticRelations
Hi,

Here is more information about how the scores were calculated:

Lewis, D.D. Evaluating text categorization.
Proceedings of the Speech and Natural Language Workshop,
Asilomar, 312-318, 1991.

http://acl.ldc.upenn.edu/H/H91/H91-1061.pdf

We use the Macroaveraged F score. The F score is calculated separately
for each of the seven relations, and then the seven F scores are
averaged. Macroaveraging gives equal weight to each of the seven
relations.
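
Written out, the macroaverage looks like the following. This is only a sketch: it assumes the balanced F score (equal weight on precision and recall), which the message above does not state explicitly, and the symbols P_i, R_i and F_i for the precision, recall and F score of relation i are introduced here for illustration.

```latex
F_i = \frac{2 P_i R_i}{P_i + R_i}, \qquad
F_{\text{macro}} = \frac{1}{7} \sum_{i=1}^{7} F_i
```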

Precision is undefined when a system classifies everything as "false":
no cases are labeled "true", so the formula reduces to 0/0. This is
discussed in the above paper under "Arithmetic Anomalies". We use the
convention of treating 0/0 as 1.0. This convention can be justified by
observing that, in most precision-recall plots, precision tends
towards 1.0 as recall drops.

See also:

http://en.wikipedia.org/wiki/Information_retrieval#Performance_measures

To translate the equations from information retrieval to Task #4:

{relevant documents} => {cases that are manually labeled "true"}
{retrieved documents} => {cases that the system labels "true"}
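
As a rough illustration of the mapping above, here is a minimal Python sketch, not the official scorer. The function names `precision_recall_f` and `macro_f` are hypothetical. It assumes the balanced F score, applies the 0/0 = 1.0 convention to recall as well as precision, and scores F as 0.0 when precision and recall are both zero; the last two choices are assumptions not spelled out in the message.

```python
def precision_recall_f(gold, predicted):
    """Score one relation.

    gold      -- list of booleans, True where the case is manually labeled "true"
    predicted -- list of booleans, True where the system labels the case "true"
    """
    # Cases that are both relevant and retrieved.
    true_positives = sum(1 for g, p in zip(gold, predicted) if g and p)
    retrieved = sum(1 for p in predicted if p)  # system says "true"
    relevant = sum(1 for g in gold if g)        # manually labeled "true"

    # Convention from the message: treat 0/0 as 1.0.
    precision = true_positives / retrieved if retrieved else 1.0
    # Assumption: the same convention is applied to recall.
    recall = true_positives / relevant if relevant else 1.0

    # Assumption: balanced F; scored as 0.0 when precision = recall = 0.
    if precision + recall == 0:
        f_score = 0.0
    else:
        f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def macro_f(per_relation):
    """Average the per-relation F scores, giving each relation equal weight.

    per_relation -- list of (gold, predicted) pairs, one pair per relation
    """
    scores = [precision_recall_f(gold, pred)[2] for gold, pred in per_relation]
    return sum(scores) / len(scores)
```

For example, a system that labels every case "false" on a relation gets precision 1.0 under the convention, but its recall is 0, so its F score for that relation is still 0.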

Best wishes,
Peter.
