evaluation script for kba 2012


ash...@udel.edu

May 3, 2013, 12:42:09 PM
to trec...@googlegroups.com
Hi,

I am using 2012 corpus for some work and noticed that the KBAScore.py script does not check for
duplicate docids.
If a document id occurs multiple times for a topic in my run, I can increase the TP count or the FP count, depending on the document.
Also, did anyone notice the same document occurring multiple times in the corpus with the same streamid?

Thanks,
Ashwani

John R. Frank

May 3, 2013, 2:26:06 PM
to trec...@googlegroups.com
> I am using 2012 corpus for some work and noticed that the KBAScore.py
> script does not check for duplicate docids.

IIRC, the validation script that runs when uploading official run
submissions rejects submissions containing duplicate stream_ids, so the
scoring script did not need to handle this. We might add this enhancement
in the next rev of the KBAscore tool.
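
Until then, a run can be de-duplicated before it is passed to the scorer.
Here is a minimal sketch, not the KBAscore code: it assumes each run row
has already been parsed into a (topic, stream_id, confidence) tuple, and
the function name dedupe_run and the keep-the-highest-confidence rule are
my own choices for illustration.

```python
def dedupe_run(rows):
    """Keep one row per (topic, stream_id) pair.

    `rows` is an iterable of (topic, stream_id, confidence) tuples.
    This tuple layout is an assumption for illustration, not the
    official run file format. When a pair repeats, the row with the
    highest confidence wins; output order follows first appearance.
    """
    best = {}
    for topic, stream_id, confidence in rows:
        key = (topic, stream_id)
        # Replace the stored row only if this one is more confident.
        if key not in best or confidence > best[key][2]:
            best[key] = (topic, stream_id, confidence)
    return list(best.values())


# Hypothetical rows: the first two repeat the same (topic, stream_id).
run = [
    ("topic_A", "doc-1", 500),
    ("topic_A", "doc-1", 900),
    ("topic_B", "doc-1", 300),
    ("topic_A", "doc-2", 100),
]
clean = dedupe_run(run)
```

With the rows above, clean has three entries, so each (topic, stream_id)
pair can contribute at most one TP or FP to the counts.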


See here for info on duplicate doc_ids and stream_ids:

https://groups.google.com/d/msg/streamcorpus/Bsd1XF-aLpY/UqXg1irNQMUJ



jrf
