Classification of the source tweet


a.t.cig...@gmail.com

Nov 16, 2018, 7:21:03 AM
to RumourEval
Hi all,

Regarding Task A:

My understanding is that our goal is to classify the replies (with SDQC labels) against the source tweet.
For example, suppose we have:

Source: "the sky is blue"
-Reply1: "yes, it is blue"
-Reply2: "no, it is violet"
-Reply3: "do you really think so?"

To my understanding:
Reply1 will be "support" against the Source tweet
Reply2 will be "deny" against the Source tweet
Reply3 will be "query" against the Source tweet


-- I see from the training data we also have to label the source tweets, but against what?

-- With respect to the replies, should stance be labelled against the source tweet or against the fact itself?


Alessandra T Cignarella*

g.go...@sheffield.ac.uk

Nov 19, 2018, 4:32:05 AM
to RumourEval
Example:

Source: "I don't believe that Prince is really going to play in Toronto"
  Rumour: Prince is going to play in Toronto
  Stance: Deny

Replies:
  "He totally is"
  Stance: Support

So the stance of the reply is with respect to the rumour, not with respect to the source tweet or its stance.

Makes sense?
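
The labelling in this example can be sketched in Python as follows. This is only an illustration: the class names `Post` and `Thread` and the helper `labelled_posts` are hypothetical and do not reflect the actual data format. The point it shows is that both the source and the reply carry a stance towards the same rumour.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Post:
    text: str
    stance: str  # SDQC label: "support", "deny", "query" or "comment"


@dataclass
class Thread:
    # Hypothetical field: the rumour is written out here only for illustration.
    rumour: str
    source: Post
    replies: List[Post] = field(default_factory=list)


thread = Thread(
    rumour="Prince is going to play in Toronto",
    source=Post("I don't believe that Prince is really going to play in Toronto", "deny"),
    replies=[Post("He totally is", "support")],
)


def labelled_posts(t: Thread) -> List[Tuple[str, str]]:
    """Return (text, stance) pairs for the source and every reply.

    Every stance is relative to t.rumour, not to the post's parent.
    """
    return [(p.text, p.stance) for p in [t.source, *t.replies]]
```

Here the reply is labelled "support" even though it disagrees with the source tweet, because it supports the rumour.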

ghan

Nov 19, 2018, 6:52:17 AM
to RumourEval
Hi Gorrell,

By "with respect to the rumour", you mean the source_tweet in each thread, no?


Source: "I don't believe that Prince is really going to play in Toronto"
Rumour: Prince is going to play in Toronto
Stance: Deny

Since the source is not available, how can we predict the stance of the source_tweets?
In your example the stance is Deny because we have the source, but nothing of the kind is provided in the dataset.

Put more simply: the stance of a reply is towards the source tweet, as you mentioned, but what should the stance of the source tweet itself be towards?


g.go...@sheffield.ac.uk

Nov 21, 2018, 8:55:17 AM
to RumourEval
Sorry for the delayed reply, I was trying to figure out what you meant.
How do you mean, the source is not available? You should have the source tweet in a directory called source-tweet for every thread. Also in the Reddit data I called it source-tweet in case that made it a bit easier for people with existing infrastructure, though obviously in the case of the Reddit data it's not a tweet but the original post.
Genevieve
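
A minimal sketch of reading the source post for one thread, assuming the source-tweet directory holds one or more JSON files with a `.json` extension. The function name `load_source_posts`, the file extension, and the JSON encoding are assumptions for illustration; only the directory name source-tweet comes from the message above.

```python
import json
from pathlib import Path
from typing import Any, List


def load_source_posts(thread_dir: str) -> List[Any]:
    """Parse every *.json file found in <thread_dir>/source-tweet.

    Assumes (hypothetically) that each file contains a single JSON document.
    """
    source_dir = Path(thread_dir) / "source-tweet"
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(source_dir.glob("*.json"))
    ]
```

The same call would apply to a Reddit thread, since the original post is stored under the same directory name there.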

孔庆超

Jan 7, 2019, 9:23:18 PM
to RumourEval
However, the "Rumour" is not explicitly provided.

What the Twitter dataset does provide is only a topic name, such as "charliehebdo", "ebola-essien", etc.

The Reddit dataset does not even have a topic name.

Is this correct?

On Monday, November 19, 2018 at 5:32:05 PM UTC+8, Genevieve Gorrell wrote:

Genevieve Gorrell

Jan 17, 2019, 4:43:58 AM
to RumourEval
Indeed.
For the Twitter data, the threads are selected on the basis of a rumour that was identified post-hoc. So whilst tweets can be short and vague, the rumour really exists "out there" in the sense that it came to exist within the culture. The exact formulation of it can be quite a subtle philosophical question in some cases.
The Reddit data were selected on the basis of the original post itself, though most posts do refer to rumours that also exist "out there". But in the Reddit data it might be helpful for you to know that the annotators were instructed to focus on the first factual statement in the original post.
The test tweets will be grouped like the training data, though many of the rumours just have one or two threads this time. And, as previously, the Reddit data won't be grouped, because each Reddit thread is almost always the only thread about that rumour.
Genevieve