INTRODUCTION
Content-oriented XML retrieval has received increasing interest,
fuelled by the widespread use of the eXtensible Markup Language (XML)
as a standard document format. The continuous growth in XML data
sources is matched by increasing efforts in the development of XML
retrieval systems, which aim to exploit the structural information
available in documents in order to implement a more focused retrieval
strategy and return document components, so-called XML elements,
instead of complete documents, in response to a user query.
Implementing this more focused retrieval paradigm means that an XML
retrieval system must not only find relevant information in the XML
documents, but also determine the appropriate level of granularity to
return to the user. In addition, the relevance of a retrieved
component depends on meeting both content and structural conditions.
Evaluating the effectiveness of XML retrieval systems therefore
requires a test collection in which the relevance assessments are
provided according to a relevance criterion that takes these
structural aspects into account. In 2002, the Initiative for the Evaluation
of XML Retrieval (INEX) started to address these issues. The aim of
the INEX initiative is to establish an infrastructure and provide
means, in the form of a large XML test collection and appropriate
scoring methods, for the evaluation of content-oriented XML retrieval
systems.
Retrieval effectiveness is typically evaluated using test collections
assembled specifically for particular retrieval tasks. Such a test
collection has been built over many rounds of INEX, held annually
since 2002.
In INEX 2007, participating organizations will be able to compare the
retrieval effectiveness of their XML retrieval systems and will
contribute to the construction of a new XML test collection based on
the Wikipedia. The test collection will also provide participants
with a means for future comparative and quantitative experiments.
TASKS AND TRACKS
In addition to the main ad hoc retrieval task, INEX 2007 will
run the following specific tasks:
1. Document Mining
2. Multimedia
3. Entity Ranking
It will continue with the following tracks that started in previous
years:
1. Heterogeneous collection track
2. Interactive track
Additional tracks are planned:
1. Book Searching
2. Document interlinking "Link the Wiki"
RELEVANCE ASSESSMENTS
Relevance assessments will be provided by the participating groups
using INEX's on-line assessment system. Each participating
organization will judge around 3 topics. Please note that assessment
takes about two person-days per topic! Participating groups will gain
access to the completed INEX test collection only after they have
completed their assessment task. Upon completion of the relevance
assessments, participants new to INEX will gain access to the previous
years' test collections.
WORKSHOP AND PROCEEDINGS
Participants will be able to present their approaches and final
results at the INEX 2007 workshop, to be held in December in Dagstuhl.
Revised papers will be published in the INEX post-workshop final
proceedings. As for INEX 2004, 2005, and 2006, we expect the INEX
final proceedings to be published in Springer's Lecture Notes in
Computer Science (LNCS) series.
ORGANIZERS
Project Leaders: Andrew Trotman, Mounia Lalmas, Norbert Fuhr
Contact person: Saadia Malik, Zoltan Szlavik
Wikipedia document collection: Ludovic Denoyer
Document exploration: Ralf Schenkel, Martin Theobald
Topic format specification: Birger Larsen, Andrew Trotman
Task description: Jaap Kamps, Charlie Clarke
Online relevance assessment tool: Benjamin Piwowarski
Effectiveness measures: Gabriella Kazai, Benjamin Piwowarski, Jaap Kamps,
Jovan Pehcevski, Stephen Robertson (Adviser), Paul Ogilvie (Statistical analysis)
Document mining track: Ludovic Denoyer, Patrick Gallinari
Multimedia track: Thijs Westerveld
Entity search track: Arjen de Vries, Nick Craswell, Mounia Lalmas
Link the Wiki track: Shlomo Geva, Andrew Trotman
Book search track: Gabriella Kazai, Antoine Doucet