We are happy to announce PIR-FIRE, a shared task on Personalised Information Retrieval @ FIRE (Forum for Information Retrieval Evaluation) 2024. The goal of PIR-FIRE is to bring together researchers with a common interest in developing and evaluating novel methods for Personalised Information Retrieval (PIR). PIR-FIRE aims to facilitate the comparative evaluation of PIR by giving research teams the means to formally define and evaluate their own personalisation algorithms and user profiling approaches, together with information sources about real user preferences.
Dataset
PIR-FIRE uses data from StackExchange, a popular community Question Answering (cQA) platform. The dataset is composed of questions and answers collected from fifty StackExchange communities, and it is curated to tackle cQA as a retrieval task.
A key feature of the provided dataset is that it contains user-related information that can be leveraged for personalising and adapting the search process to the current user, for instance by creating and exploiting personal user profiles. The user-related information includes the text and the number of views of the documents each user has written and, in many cases, the tags associated with these documents, the user's registration date on the website, the badges they have obtained, their reputation score, and sometimes their self-written biography.
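As an illustration of how the user-related information listed above could be held in memory, here is a minimal Python sketch. All class and field names (`AuthoredDocument`, `UserProfile`, `about_me`, and so on) are assumptions made for this example; they do not reflect the actual schema of the released dataset.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AuthoredDocument:
    """A question or answer written by the user (hypothetical structure)."""
    text: str
    views: int
    tags: List[str] = field(default_factory=list)  # tags are not always available


@dataclass
class UserProfile:
    """User-related information; optional fields may be missing for some users."""
    user_id: str
    documents: List[AuthoredDocument] = field(default_factory=list)
    registration_date: Optional[str] = None
    badges: List[str] = field(default_factory=list)
    reputation: Optional[int] = None
    about_me: Optional[str] = None  # the self-written biography, when present

    def tag_counts(self) -> Counter:
        """Count how often each tag appears across the user's documents."""
        counts: Counter = Counter()
        for doc in self.documents:
            counts.update(doc.tags)
        return counts


user = UserProfile(
    user_id="u1",
    documents=[
        AuthoredDocument("How do I merge two frames?", views=10, tags=["python", "pandas"]),
        AuthoredDocument("Use a generator here.", views=5, tags=["python"]),
    ],
)
top_tag = user.tag_counts().most_common(1)[0]
```

A tag-frequency view like `tag_counts()` is only one possible profile representation; the text of the user's past documents could equally be used.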
Task descriptions
PIR-FIRE consists of two tasks.
Task 1: Personalized Information Retrieval (PIR)
This task aims to investigate personalisation in cQA based on user profiles, following the standard IR pipeline, where the questions are considered as the queries, and the collection of documents to be retrieved consists of all the answers available in the StackExchange dataset. Personalisation can be tackled using any standard or novel technique to create a user profile and inject it into the retrieval model.
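One simple way to inject a user profile into a retrieval model is to interpolate a query-answer score with a user-answer score. The sketch below is an assumption-laden illustration, not one of the organisers' baselines: `lexical_score` is a plain term-overlap stand-in for a real ranker such as BM25, the tag-based `tag_overlap` and the weight `lam` are hypothetical choices, and the dictionary keys `id`, `text`, and `tags` are invented for the example.

```python
def lexical_score(query, doc):
    """Fraction of query terms found in the document (a stand-in for BM25)."""
    q_terms = query.lower().split()
    d_terms = set(doc.lower().split())
    if not q_terms:
        return 0.0
    return sum(t in d_terms for t in q_terms) / len(q_terms)


def tag_overlap(user_tags, doc_tags):
    """Jaccard similarity between the user's tags and the answer's tags."""
    union = user_tags | doc_tags
    return len(user_tags & doc_tags) / len(union) if union else 0.0


def personalised_rank(query, answers, user_tags, lam=0.7):
    """Rank answers by lam * lexical score + (1 - lam) * profile score."""
    scored = []
    for ans in answers:
        score = (lam * lexical_score(query, ans["text"])
                 + (1 - lam) * tag_overlap(user_tags, ans["tags"]))
        scored.append((ans["id"], score))
    # Highest combined score first; ties keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


answers = [
    {"id": "a1", "text": "use a list comprehension in python", "tags": {"python"}},
    {"id": "a2", "text": "use a list comprehension in haskell", "tags": {"haskell"}},
]
ranked = personalised_rank("list comprehension", answers, user_tags={"haskell"})
```

Both answers match the query equally well here, so the user's tags decide the order and `a2` is ranked first. With an empty tag set the profile term vanishes and the ranking falls back to the lexical score alone.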
Task 2: Personalized IR Leveraging Large Language Models (LLM-PIR)
This task investigates the degree to which LLMs can be used in Personalized IR. Specifically, given a question, a set of answers, and user-related information, participants can leverage LLMs for personalised relevance estimation. A basic prompt-based approach is provided as a baseline, but participants can freely investigate further approaches (e.g. how to modify and integrate user information in prompts, or fine-tuning).
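To make the idea of integrating user information into a prompt concrete, the following Python sketch assembles a prompt from a question, candidate answers, and a few profile fields. It is not the baseline prompt provided by the organisers; the function name `build_prompt`, the wording of the instructions, and the choice of profile fields are all assumptions. The call to an actual LLM is deliberately left out.

```python
def build_prompt(question, answers, user_tags, user_bio=None):
    """Assemble a plain-text prompt asking an LLM to rank answers for one user."""
    lines = [
        "You are ranking answers to a question for a specific user.",
        "User's frequent tags: " + (", ".join(user_tags) if user_tags else "unknown"),
    ]
    if user_bio:  # the biography is only sometimes available
        lines.append("User's biography: " + user_bio)
    lines.append("Question: " + question)
    lines.append("Candidate answers:")
    for i, answer in enumerate(answers, start=1):
        lines.append(f"[{i}] {answer}")
    lines.append("Return the answer numbers ordered from most to least relevant for this user.")
    return "\n".join(lines)


prompt = build_prompt(
    question="How do I reverse a list?",
    answers=["Use reversed(xs).", "Use xs[::-1]."],
    user_tags=["python", "performance"],
)
```

Keeping prompt construction in a single function makes it easy to compare variants, for example with and without the biography, or with past documents in place of tags.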
Baselines and evaluation
Several baselines are provided for the two tasks, including BM25, neural approaches based on BERT-like models, re-ranking approaches using cross-encoders like Mono-T5, personalised baselines that employ a mix of tags and historical documents related to the users, and LLM-based baselines that personalise the results using models like Phi and GPT.
Submitted runs will be evaluated according to Precision, Recall, Mean Average Precision, Mean Reciprocal Rank, and normalised Discounted Cumulative Gain.
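For reference, the listed measures can be computed for a single query as in the sketch below. It uses the standard textbook definitions with binary relevance for Precision, Recall, Average Precision, and Reciprocal Rank, and a linear gain with a log2 discount for nDCG. The cut-offs and the exact gain function used in the official evaluation are not stated here, so those choices are assumptions.

```python
import math


def precision_at_k(ranking, relevant, k):
    """Share of the top-k results that are relevant."""
    return sum(d in relevant for d in ranking[:k]) / k


def recall_at_k(ranking, relevant, k):
    """Share of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    return sum(d in relevant for d in ranking[:k]) / len(relevant)


def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant document."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranking, relevant):
    """Inverse rank of the first relevant document, or 0 if none is retrieved."""
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranking, gains, k):
    """DCG of the top-k results divided by the DCG of an ideal ordering."""
    dcg = sum(gains.get(d, 0) / math.log2(rank + 1)
              for rank, d in enumerate(ranking[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


ranking = ["d3", "d1", "d2"]
relevant = {"d1", "d2"}
gains = {"d1": 1, "d2": 1}
ap = average_precision(ranking, relevant)
rr = reciprocal_rank(ranking, relevant)
```

Mean Average Precision and Mean Reciprocal Rank are the averages of `average_precision` and `reciprocal_rank` over all queries in a run.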
Important dates
- Release of the training set: Released
- Release of the test set: 10 July 2024
- Run submission deadline: 1 August 2024
- Announcement of results: 15 August 2024
- Working notes due: 15 September 2024
- Camera-ready copies of notes and overview paper: 15 October 2024
Website and social
For more information:
Visit the PIR-FIRE website: https://pirfire.github.io/pirfire/2024/.
Join the mailing group: https://groups.google.com/u/2/g/pir-fire-2024
Updates on X: @PirFireIKR3
FIRE 2024 Website: http://fire.irsi.res.in/fire/2024/home.
Registrations: Google Form
Task organisers
- Gabriella Pasi, IKR3 Lab, University of Milano-Bicocca
- Marco Viviani, IKR3 Lab, University of Milano-Bicocca
- Alessandro Raganato, IKR3 Lab, University of Milano-Bicocca
- Sandip Modha, IKR3 Lab, University of Milano-Bicocca
- Georgios Peikos, IKR3 Lab, University of Milano-Bicocca
- Gian Carlo Milanese, IKR3 Lab, University of Milano-Bicocca
- Pranav Kasela, IKR3 Lab, University of Milano-Bicocca
- Marco Braga, IKR3 Lab, University of Milano-Bicocca
- Effrosyni Sokli, IKR3 Lab, University of Milano-Bicocca