Profiling Fake News Spreaders - Deadline (flexible)

Paolo Rosso

Jun 5, 2020, 3:23:28 PM
to pan
Dear participants in the Profiling Fake News Spreaders task at PAN,

I'm receiving requests to extend the deadline at least until the weekend.

As some colleagues always remind us, science comes before deadlines:
if you have been working on the problem and are still facing technical
problems, or if you need a few more days to make your system work (also
on TIRA), it's not going to be a tragedy, but please try to fix
everything next week.

If you have any problem, just contact us: Kico or Anastasia will try to
help you.

The more participants, the better: we can all learn from the different
models that have been employed.

Come on, you're nearly there: don't give up when you're close to making it ;-)

Paolo & the rest of the gang
Paolo Rosso
Universitat Politècnica de València, Spain

Paolo Rosso

Jun 22, 2020, 12:38:59 PM
to pan
[Apologies for multiple postings]

At the FIRE evaluation forum we addressed plagiarism detection in
SOurce COde (SOCO) in 2014, the same problem from a Cross-Language
perspective (CL-SOCO) in 2015, and Personality Recognition in SOCO
(PR-SOCO) in 2016. This year we will address the problem of Authorship
Identification of SOurce COde (AI-SOCO).

To be organized at FIRE 2020
10 - 13 December
Virtual Conference

Task Description:

General authorship identification is essential for detecting the
deceptive misuse of others' content and for exposing the owners of
anonymous hurtful content. In both cases this is done by revealing the
author of that content. Authorship Identification of SOurce COde
(AI-SOCO) focuses on uncovering the author of a piece of code. This
helps address cheating in academic, work and open source environments.
It can also help in identifying the authors of malware worldwide.

The dataset is composed of source codes collected from the open
submissions in the Codeforces online judge. Codeforces hosts
competitive programming contests, each consisting of multiple problems
to be solved by the participants. A participant solves a problem by
writing a solution in any of the programming languages available on
the website and then submitting it through the website. The solution's
result can be correct (accepted) or incorrect (wrong answer, time limit
exceeded, etc.).

In our dataset, we selected 1000 users and collected 100 source codes
from each one, so the total number of source codes is 100,000. All
collected source codes are correct and written in the C++ programming
language. For each user, all collected source codes are from unique
problems.

Given the pre-defined set of source codes and their writers, task
participants should build systems that can identify the writer of any
new, previously unseen source code from the previously defined list of
writers.
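
To make the task setup concrete, here is a minimal sketch of a closed-set
authorship classifier. It is only an illustration, not the organizers'
baseline or any participant's system: it assumes character trigram counts
as features and a nearest-profile rule with cosine similarity, and the
function names and the toy C++ snippets are all made up for this example.

```python
from collections import Counter
import math


def char_ngrams(code: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in one source file."""
    return Counter(code[i:i + n] for i in range(len(code) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(samples):
    """Merge the n-gram counts of all training files per author.

    `samples` is an iterable of (author_id, source_code) pairs.
    """
    profiles = {}
    for author, code in samples:
        profiles.setdefault(author, Counter()).update(char_ngrams(code))
    return profiles


def predict(profiles, code):
    """Return the known author whose profile is most similar to `code`."""
    query = char_ngrams(code)
    return max(profiles, key=lambda author: cosine(profiles[author], query))


# Toy data, invented for illustration (not taken from the AI-SOCO dataset).
train = [
    ("user_a", "int main(){int n;cin>>n;cout<<n*2;return 0;}"),
    ("user_a", "int main(){int a,b;cin>>a>>b;cout<<a+b;return 0;}"),
    ("user_b", "int main()\n{\n    long long x;\n    scanf(\"%lld\", &x);\n"
               "    printf(\"%lld\\n\", x);\n}"),
    ("user_b", "int main()\n{\n    int k;\n    scanf(\"%d\", &k);\n"
               "    printf(\"%d\\n\", k + 1);\n}"),
]
profiles = build_profiles(train)
unseen = "int main(){int m;cin>>m;cout<<m-1;return 0;}"
print(predict(profiles, unseen))
```

On this toy data the unseen snippet is attributed to user_a, whose compact
cin/cout style it shares. With 1000 authors and 100 files each, a real
system would likely need stronger features and a held-out development
split to measure accuracy.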

Full task description can be found at:

Important Dates:

8th June – Open track websites
8th June – Training and development data release
31st July – Test data release
7th September – Run submission deadline
20th September – Results declared
31st October – Working notes and overview papers due (tentative)
10th-13th December – FIRE 2020

Organizers:

Ali Fadel, Jordan University of Science and Technology, Jordan
Husam Musleh, Jordan University of Science and Technology, Jordan
Ibraheem Tuffaha, Jordan University of Science and Technology, Jordan
Mahmoud Al-Ayyoub, Jordan University of Science and Technology, Jordan
Yaser Jararweh, Duquesne University, USA
Elhadj Benkhelifa, Staffordshire University, UK
Paolo Rosso, Universitat Politècnica de València, Spain

For regular updates subscribe to our mailing list:

The organizers of the PAN AI-SOCO shared task @ FIRE 2020