PAN 2015: Plagiarism Detection (Text Alignment Corpus Construction)


Martin Potthast

Feb 3, 2015, 7:24:16 AM
to pan-workshop-series
Dear participant,

we have something new planned for this year's text alignment task.
Instead of inviting implementations of new text alignment approaches,
we invite you to submit a text alignment corpus of your own design.

Since 2009, we have constructed and released a new text alignment
evaluation corpus almost every year. We have invented new ways to
generate text reuse and plagiarism, and new ways to obfuscate it
automatically, all of that at large scale.

Now, we believe, it is time to further diversify the corpus creation
efforts. Many of you have your own ideas of what instances of text
reuse and plagiarism an evaluation corpus should consist of. For
example, text reuse can be found in many different genres of writing;
and we have still explored only a few languages, let alone
cross-language text reuse. This is the opportunity to execute on your
ideas and let the whole community around this task benefit from your
efforts.

Thanks to TIRA and our software submission initiative, we are now in a
position to create an exciting challenge around corpus construction:
every corpus that will be submitted this year will be fed into the
text alignment prototypes that have been submitted in previous years.

This way, we can, to some extent, assess the validity of a corpus as
well as its difficulty. To further assess corpus validity, each
submitted corpus will be made available to all other participants so
they can analyze the instances of text reuse and plagiarism in a
peer-review manner in order to answer the question: how realistic are
the problem instances?

To the best of our knowledge, this is the first time that corpus
creation has been done in this way in a shared task, and we hope you
will pick up the challenge and contribute in order to create a
community-driven text reuse and plagiarism corpus for PAN 2015 and
beyond.

Please take a quick look at the task web page to learn details:
http://www.uni-weimar.de/medien/webis/research/events/pan-15/pan15-web/plagiarism-detection.html

If you have any questions, please don't hesitate to ask.

Best,
Martin

--
Martin Potthast
Bauhaus-Universität Weimar
www.webis.de --- www.netspeak.org

vani k

Feb 3, 2015, 10:56:22 AM
to pan-works...@googlegroups.com
Hi Sir,

So this year only corpus submissions are invited, and no new text alignment approaches/implementations?

Thanks & Regards,
 Vani K



Martin Potthast

Feb 4, 2015, 4:03:47 AM
to pan-workshop-series
Dear Vani,

we would prefer that you work on corpus construction this year (we need
this input to sustain the task in the future), but there is of course
nothing to stop you from also developing a new text alignment
solution.

Best,
Martin