to pan-workshop-series, p...@webis.de
Dear participants,
With respect to the Twitter dataset, we will provide a tool on the web page that eases the task of downloading its contents. Detailed information is given in the readme file. Basically, you should run the downloader with the path to your Twitter dataset in order to populate its contents. After that, you will be able to work with it in the same way as with the other datasets.
With respect to the testing phase, you will not have to do anything like that, because we will use a pre-downloaded version of the Twitter corpus and handle the confidential data through software submissions. This complies with Twitter's policies, since participants will not gain access to the tweets: the software will run in a sandbox that prevents the test corpora from leaking.
For those of you who are new to the task: predictions are evaluated at the author level. We will calculate the accuracy of predicting age, gender, and age + gender jointly, independently for each language. The final accuracy will be obtained by averaging the age + gender prediction accuracies over both languages.
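
For illustration only, the scoring described above could be sketched roughly as follows. This is not the official evaluation script: the function names, the author ids, and the age-class labels are hypothetical, and the sketch assumes that an author without a prediction counts as wrong for all three measures.

```python
from typing import Dict, Tuple

# (age_class, gender) per author; the label strings used below are made up.
Label = Tuple[str, str]


def accuracies(truth: Dict[str, Label], predicted: Dict[str, Label]) -> Dict[str, float]:
    """Author-level accuracy for age, gender, and age + gender in one language."""
    if not truth:
        raise ValueError("truth must contain at least one author")
    age_ok = gender_ok = joint_ok = 0
    for author, (age, gender) in truth.items():
        guess = predicted.get(author)
        if guess is None:
            continue  # assumption: a missing prediction counts as wrong
        age_hit = guess[0] == age
        gender_hit = guess[1] == gender
        age_ok += age_hit
        gender_ok += gender_hit
        joint_ok += age_hit and gender_hit  # both must be right for the joint score
    n = len(truth)
    return {"age": age_ok / n, "gender": gender_ok / n, "joint": joint_ok / n}


def final_accuracy(per_language: Dict[str, Dict[str, float]]) -> float:
    """Average of the joint (age + gender) accuracies over all languages."""
    joint_scores = [scores["joint"] for scores in per_language.values()]
    return sum(joint_scores) / len(joint_scores)


# Toy data with hypothetical author ids and labels.
truth_en = {"a1": ("18-24", "female"), "a2": ("25-34", "male"),
            "a3": ("35-49", "female"), "a4": ("18-24", "male")}
pred_en = {"a1": ("18-24", "female"), "a2": ("25-34", "female"),
           "a3": ("18-24", "female"), "a4": ("18-24", "male")}
truth_es = {"b1": ("25-34", "male"), "b2": ("50-64", "female")}
pred_es = {"b1": ("25-34", "male"), "b2": ("50-64", "female")}

results = {"en": accuracies(truth_en, pred_en),
           "es": accuracies(truth_es, pred_es)}
print(results, final_accuracy(results))
```

Note that the joint accuracy is never higher than the age or gender accuracy alone, since an author only counts as correct when both attributes are predicted correctly.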