Available input rankings for passage/entity ranking

Laura Dietz

Jul 20, 2019, 10:16:15 AM
to trec...@googlegroups.com

Dear all,

If you want to participate in TREC CAR but don't have a search index available, you can use rankings we produced with Lucene (and some add-on code). The rankings are in TREC RUN format, compatible with trec_eval and our validation/population code. Feel free to use them to build candidate sets, features, etc.
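
For anyone who has not worked with run files before, here is a minimal sketch of reading one and cutting it down to a candidate set. It assumes the usual six whitespace-separated columns of a TREC run line (query id, the literal Q0, document id, rank, score, run tag). The function names and the sample ids in the usage note are made up for illustration and are not part of the provided code.

```python
from collections import defaultdict


def parse_run(lines):
    """Parse TREC run lines of the form 'query_id Q0 doc_id rank score run_tag'.

    Returns {query_id: [(doc_id, rank, score), ...]} in file order.
    """
    run = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        qid, _q0, docid, rank, score, _tag = parts
        run[qid].append((docid, int(rank), float(score)))
    return dict(run)


def candidate_set(run, k):
    """Keep the k best-ranked document ids per query (lowest rank first)."""
    return {
        qid: [docid for docid, _rank, _score in sorted(entries, key=lambda e: e[1])[:k]]
        for qid, entries in run.items()
    }
```

With a real file you would pass an open file handle, e.g. `parse_run(open(path))`, where `path` points to one of the run files unpacked from an archive.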


For each benchmark (Y1train, Y1test, Y2test, Y3train, Y3test), you will find both a page-level and a section-level archive with rankings here:

http://trec-car.cs.unh.edu/inputruns/

The code and instructions for reproducing these runs are available here:

https://www.cs.unh.edu/~dietz/appendix/ent-rank/reproduce-input-runs.html



A brief explanation of the semantics of the provided filenames:

- bm25-none means just BM25, no expansion

- bm25-rm means BM25 + RM3 expansion (you can combine bm25-rm and bm25-none to tune how much the expansion part should matter)

- ql-none means just Query Likelihood, no expansion

etc.
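
One possible way to combine bm25-none and bm25-rm, as suggested above, is a weighted sum of the two scores per document. The sketch below is only an assumption about how you might do it, not how the runs were produced: it min-max normalizes each ranking per query so the two score ranges are comparable, and the weight `lam` is a hypothetical tuning parameter (0 = no expansion, 1 = expansion only).

```python
def minmax(scores):
    """Rescale {doc_id: score} to [0, 1]; a constant ranking maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def interpolate(base, expanded, lam):
    """Fuse two rankings for ONE query.

    base, expanded: {doc_id: score}, e.g. from bm25-none and bm25-rm.
    lam: weight of the expanded ranking, between 0 and 1.
    Documents missing from one ranking contribute 0 for that ranking.
    Returns [(doc_id, fused_score), ...], best first, ties broken by doc_id.
    """
    b, e = minmax(base), minmax(expanded)
    fused = {
        doc: (1 - lam) * b.get(doc, 0.0) + lam * e.get(doc, 0.0)
        for doc in set(b) | set(e)
    }
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

You would then pick `lam` on a training benchmark (e.g. by evaluating the fused run with trec_eval for a few values) before applying it to a test benchmark.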


You will find rankings for passages, entities (based on the Wiki page), and entities (based on metadata).


We also vary the way a page/heading is turned into a search query. sectionpath concatenates the heading with all parent headings and the page title. leaf uses just the heading, interior uses just the parent headings, and title uses just the page title. For page-level rankings we include both a ranking using just the title and a ranking using ALL headings on the page.
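
To make the four section-level variants concrete, here is a small sketch of how the query strings could be assembled. It is an illustration of the description above, not the code that generated the runs: the function name, the ordering (title first, then headings from outermost to innermost), and joining with spaces are all assumptions.

```python
def build_queries(page_title, heading_path):
    """Build the four query variants for one section.

    heading_path: headings from the top-level section down to the target
    heading, e.g. ["Ecology and behavior", "Diet"]. Must be non-empty.
    """
    *parents, leaf = heading_path
    return {
        # page title + all parent headings + the heading itself
        "sectionpath": " ".join([page_title, *parents, leaf]),
        # just the target heading
        "leaf": leaf,
        # just the parent headings (empty for a top-level section)
        "interior": " ".join(parents),
        # just the page title
        "title": page_title,
    }
```

For a hypothetical page "Green sea turtle" with the section path Ecology and behavior / Diet, the leaf query would be "Diet" and the sectionpath query "Green sea turtle Ecology and behavior Diet".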

Each archive should contain a file eval.mkd, with trec_eval output of each method by itself.


You can cite this dataset as "Run files provided by TREC CAR organizers available at http://trec-car.cs.unh.edu/inputruns/"


Let me know if you find this resource helpful!


Laura
