Rankings for internal sections: Tree qrels and new trec-car-tools


Laura Dietz

Jul 26, 2018, 8:59:08 PM
to trec...@googlegroups.com
Dear TREC CAR participants,

Last year, only some sections from every page were assessed. This year,
we aim to have the complete outlines assessed. This requires a few
changes to the setup, described below.

This year we ask you to also submit rankings for internal sections. This
is a departure from last year, when rankings were produced only for
leaf headings (analogous to *.hierarchical.qrels).

Example:

Title

  H1

    H1.1

  H2

    H2.1

Previously ("Hierarchical"), rankings were to be provided only for
Title/H1/H1.1 and Title/H2/H2.1.

Now ("Tree") rankings need to be provided for:

Title/H1
Title/H1/H1.1
Title/H2
Title/H2/H2.1
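In other words, the tree queries are exactly the prefixes (with at least one heading) of the old leaf heading paths. A minimal sketch of that expansion in plain Python -- the helper name is made up, and paths are written with '/' separators as in the example above:

```python
# Expand leaf section paths (the old "hierarchical" queries) into the
# full set of "tree" queries: every internal prefix plus the leaf itself.

def tree_paths(leaf_paths):
    """Return all distinct prefixes (internal + leaf sections) of the
    given leaf heading paths, preserving first-seen order."""
    seen = []
    for path in leaf_paths:
        parts = path.split("/")
        # Emit Title/H1, Title/H1/H1.1, ...; the bare page title is
        # skipped here, since it belongs to the article-level ranking.
        for depth in range(2, len(parts) + 1):
            prefix = "/".join(parts[:depth])
            if prefix not in seen:
                seen.append(prefix)
    return seen

leaf = ["Title/H1/H1.1", "Title/H2/H2.1"]
print(tree_paths(leaf))
# → ['Title/H1', 'Title/H1/H1.1', 'Title/H2', 'Title/H2/H2.1']
```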


To help with the change, I did three things (let me know if these do
not help).

1) You will note that the trec car validation script will complain if
you do not provide rankings for internal sections.

2) I released a new set of qrels files for the new behavior (so-called
"tree qrels", in files named *.tree.qrels) for all released benchmarks:
test200, benchmarkY1train, benchmarkY1test, and train.

3) I changed the behavior of trec-car-tools so that the following
functions return the section paths of internal sections as well:

java: Data#flatSectionPaths

python: Page#flat_headings_list

If you prefer to iterate over the page skeleton yourself, you will need
to make changes to accommodate this new behavior.
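If you do walk the skeleton yourself, the required change amounts to yielding a path at every section node, not only at the leaves. A self-contained sketch with a hypothetical minimal Section class (the real trec-car-tools classes carry more fields, but the traversal logic is the same):

```python
# Hypothetical stand-in for a page's section tree; only the fields
# needed for the traversal are modeled here.
class Section:
    def __init__(self, heading, children=()):
        self.heading = heading
        self.children = list(children)

def all_section_paths(sections, prefix=()):
    """Yield every section path -- internal sections included -- as a
    tuple of headings, in depth-first order."""
    for sec in sections:
        path = prefix + (sec.heading,)
        yield path                      # internal nodes are yielded too
        yield from all_section_paths(sec.children, path)

page = [Section("H1", [Section("H1.1")]),
        Section("H2", [Section("H2.1")])]
for p in all_section_paths(page):
    print("/".join(p))
# → H1
#   H1/H1.1
#   H2
#   H2/H2.1
```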


If you have any questions, or any of this is still too confusing, please
send an email.


Best,
Laura




Rodrigo Nogueira

Jul 28, 2018, 11:48:14 AM
to TREC Car
Hello TREC CAR organizers,

Could you please clarify some questions regarding the dataset and the submission?

1. Inside "Y1-tree-qrels.v2.1.tar.xz", 
"Y1-tree-qrels/benchmarkY1-train/fold-0-base.train.cbor.tree.qrels" refers to the tree qrels of the first year (Y1) whereas
"Y1-tree-qrels/train/fold-0-base.train.cbor.tree.qrels" refers to the tree qrels of Y2? If yes, it is confusing to have "Y1" in the folder name.

2. Could you explain why someone would use "unprocessedAllButBenchmark.v2.1.tar.xz" if the correct paragraphs are in "paragraphCorpus.v2.0.tar.xz"? Maybe I'm missing preprocessing or extra information that I should include while retrieving the paragraphs?

3. If the tree qrels will be used in the evaluation, why are there hierarchical, top-level, and article qrels? From reading the website, I found it quite confusing to decide which one to use to train my models.

4. The information required to understand the dataset is spread across http://trec-car.cs.unh.edu/, http://trec-car.cs.unh.edu/datareleases/, and this forum. Having it all on a single page would be nice.

5. For newcomers to the TREC competitions, it would be nice to have a section explaining the metrics used for evaluation. For example, last year I was unsure whether it would be better to provide a long list of ranked paragraphs or only the paragraphs my algorithm considers correct with high confidence.

6. Since the evaluation is changing from hierarchical to tree qrels, could you postpone the deadline until the end of August? It takes quite some time (1-2 weeks) to train the models :)


Thanks a lot for your time,
Rodrigo Nogueira

Laura Dietz

Jul 28, 2018, 1:46:45 PM
to trec...@googlegroups.com

On 07/28/2018 11:48 AM, Rodrigo Nogueira wrote:
> Hello TREC CAR organizers,
>
> Could you please clarify some questions regarding the dataset and the
> submission?

Sure!


We only provide training qrels for the Y1 benchmarks -- this includes
train, benchmarkY1train, benchmarkY1test, and test200.

Note that v2.x versus v1.x does not refer to year 1 or year 2 -- it just
refers to the version of our conversion pipeline, so you know that these
are compatible with one another.

When I say "Y1 benchmark", I am referring to the set of
topics/queries/outlines in the respective benchmarks.

You could have derived tree.qrels yourself -- but I figured it may help
to have training data to match the task.
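The derivation is mechanical under one assumption that you should verify against the released *.tree.qrels files: a paragraph relevant to a leaf section path is also counted as relevant to each internal prefix of that path. A rough sketch, assuming the standard qrels layout ("query iter doc rel") and '/'-separated section paths in the query IDs (the exact ID encoding in the released files may differ):

```python
def hierarchical_to_tree(qrel_lines):
    """Expand hierarchical (leaf-path) qrels into tree qrels.
    Assumption (check against the released files): relevance at a leaf
    section also applies at every ancestor section of that leaf.
    Prefixes keep the page id plus at least one heading."""
    seen, out = set(), []
    for line in qrel_lines:
        query, it, doc, rel = line.split()
        parts = query.split("/")
        for depth in range(2, len(parts) + 1):
            entry = ("/".join(parts[:depth]), it, doc, rel)
            if entry not in seen:
                seen.add(entry)
                out.append(" ".join(entry))
    return out

for line in hierarchical_to_tree(["enwiki:Page/H1/H1.1 0 p1 1"]):
    print(line)
# → enwiki:Page/H1 0 p1 1
#   enwiki:Page/H1/H1.1 0 p1 1
```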

>
> 2. Could you explain why someone would use
> "unprocessedAllButBenchmark.v2.1.tar.xz" if the correct paragraphs are
> in "paragraphCorpus.v2.0.tar.xz"? Maybe I'm missing preprocessing or
> extra information that I should include while retrieving the paragraphs?

These two datasets are complementary:

You are supposed to retrieve passages from paragraphCorpus.
You are supposed to retrieve entities that are (1) linked in
paragraphCorpus, (2) have an entry in allButBenchmark, or (3) are
linked in the allButBenchmark collection.

allButBenchmark is offered to teams who would like to build a
knowledge graph. Note that allButBenchmark will be missing paragraphs
that are relevant under the automatic tree/hierarchical qrels.


>
> 3. If the tree qrels will be used in the evaluation, why are there
> hierarchical, top-level, and article qrels? I found it quite
> confusing to decide which one to use to train my models.

Hierarchical, toplevel, and article qrels are provided for backwards
compatibility. I will try to state this clearly on the website.


> 6. Since the evaluation is changing from hierarchical to tree qrels,
> could you postpone the deadline until the end of August? It takes
> quite some time (1-2 weeks) to train the models :)

The deadline is not my choice; it is set by NIST. The deadline is more
than two weeks away.



You had a number of complaints about the way information is
disseminated. The intention is that the website is all you need.
If this is not the case, it would help to have concrete feedback on
which information is missing or unclear.


If you have any general questions about information retrieval, please ask.


Best,

Laura

