Hello,
I have tried to evaluate the TREC TS2014 gold standard updates using both the tseval.py and the tseval2014_mod.py scripts. I cannot understand why both scripts report 0 for every metric except the E[Verbosity] and E[Confidence-biased Latency] columns.
I believe I am outputting the data in the correct format, and since I am evaluating the gold standard updates themselves, I expected to see results higher than 0. Could the column delimiter be the cause: should the columns be separated by spaces or by tabs? I have tried both, without success. I have attached the file I am trying to evaluate; it contains all the updates from updates_sample.extended.tsv that have been matched with a nugget. Do you have any insight into why the evaluation scripts are not working?
Thank you,
Cristina