Can we use external data?

parag agrawal

Nov 1, 2018, 6:41:14 AM

to RumourEval

g.go...@sheffield.ac.uk

Nov 5, 2018, 4:58:58 AM

to RumourEval

You can use this Wikipedia dump:

https://archive.org/details/enwiki-20160901/

This is important for the English Twitter sample, as that was selected to be time-critical, in order to test whether we can automatically predict what the veracity *will turn out to be* in the case of currently evolving stories.

The Reddit data is much less likely to be time critical, but still, please use that same resource.

Genevieve

Martin Fajčík

Dec 19, 2018, 4:53:59 AM

to RumourEval

Hello Genevieve,
could you please be more specific about which types of external data are forbidden?

Quoting the task description, the following is written under subtask B:
"Critically, no external resources may be used that contain information from after the rumour's resolution. To control this, we will specify precise versions of external information that participants may use. This is important to make sure we introduce time sensitivity into the task of veracity prediction."

My questions are:
1. This applies to subtask A as well, right?
2. Where is the boundary between external data and non-external?
For example, word embeddings are already trained on external data (and they are allowed). My point is: can we use pretrained language representation models (ULMFiT, OpenAI GPT, ...)? Can we augment the data, including with a paraphrasing system (again trained on "external data")? Can we use NER/POS/dependency extraction systems, or pretrain a system on a similar task?

Thank you for your answer.

On Monday, 5 November 2018 at 10:58:58 UTC+1, g.go...@sheffield.ac.uk wrote:

g.go...@sheffield.ac.uk

Dec 20, 2018, 9:01:23 AM

to RumourEval

Yes, you can use pre-trained models etc. that just provide information about language.

Just don't use information sources that help to resolve the rumour with the benefit of hindsight.

Genevieve
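
For anyone wanting to sanity-check a candidate resource against the "no hindsight" rule quoted above, here is a minimal sketch, not an official check. It assumes the resource has a known snapshot date and that each rumour has a known resolution date. The function name `is_usable`, the rumour IDs, and the resolution dates are all hypothetical. The dump date is only inferred from the name `enwiki-20160901`.

```python
from datetime import date

# Assumption: snapshot date inferred from the dump name enwiki-20160901.
WIKIPEDIA_DUMP_DATE = date(2016, 9, 1)


def is_usable(snapshot_date: date, resolution_date: date) -> bool:
    """Return True if the resource was frozen no later than the rumour's
    resolution, so it cannot contain information from after that point."""
    return snapshot_date <= resolution_date


# Hypothetical rumour IDs and resolution dates, for illustration only.
resolutions = {
    "rumour_a": date(2017, 3, 15),
    "rumour_b": date(2016, 1, 10),
}

# Map each rumour to whether the snapshot is free of hindsight for it.
usable = {
    rumour_id: is_usable(WIKIPEDIA_DUMP_DATE, resolved_on)
    for rumour_id, resolved_on in resolutions.items()
}
print(usable)
```

This only illustrates the date comparison behind the rule. The organisers' guidance in this thread on which resources are permitted takes precedence over any such check.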
