test data/task

Ted Pedersen

Feb 13, 2011, 11:03:50 PM
to DiSCo2011 shared task
Greetings all,

I've been reading through the shared task description with great
interest, and it sounds like it should be really interesting.

I was wondering about the test data - are the test phrases going to be
marked in a complete context (like a sentence), or will they be shown
in isolation (as the training and validation data are)?

The reason I ask is that it seems like the interpretation of what is
literal or not could really change depending on that context....

He owns blue chip stocks.
I found a blue chip of paint on the floor.

If some context is to be provided, what form will that take?

Sorry if this question is already answered - I might have missed
something... Please just point me back to that spot and I'll catch
on...

Cordially,
Ted

DiSCo2011@ACL

Feb 14, 2011, 2:08:36 PM
to DiSCo2011 shared task
Hi Ted,

very good point! We were discussing this internally.

For this shared task, we will not release the sentences. Think of the
scenario: What compositionality rating would you like to put in a
corpus-induced dictionary? Ideally, you would average the
compositionality over all occurrences, and this is what we ask you to
do in this task. We want to start off simple here.
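
As a toy illustration (the numbers here are made up, just to show what the single target score stands for):

# Made-up per-occurrence judgments for one phrase across its sample
# sentences; the target score is simply their average.
ratings = [85, 90, 20, 75, 80]
score = sum(ratings) / len(ratings)
print(score)  # 70.0 -- the one number to put in the dictionary entry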

Admittedly, the sample of 5 sentences per phrase is small, so there
might be discrepancies due to sampling error. However, we hope that at
least the coarse scoring is correct in most cases, and manual
inspection confirmed that the scores make sense to us.

After the task is over, we will release the complete data, including
averaged judgments per sentence context. It will be an interesting,
different competition to tell blue chip stocks (highly frequent) from
blue chips of paint (rare). We saw quite a few cases where average
ratings were very high for some sentences and very low for others.

best,

Chris

Ted Pedersen

Feb 25, 2011, 4:45:54 PM
to disco2011...@googlegroups.com, DiSCo2011@ACL
Thanks for this clarification.

Another question arises, and that has to do with the way the word
pairs have been presented....it seems like certain function words may
have been omitted. Sometimes this causes no trouble...

reinvent wheel

is pretty clearly "reinvent the wheel"

whereas

name come

has me absolutely flummoxed, particularly since it is shown as being
"medium literal". I can't work out what the original phrase must be,
and I don't think it's just "name come". Other ones that I'm
particularly puzzled by include

run number
name imply
interest lie

All show as medium literal, and I can't really even construct a
sentence using them that doesn't end up sounding really weird...

My name come from a far off land where grammar isn't taught so much...
I run number seven off the field
His name imply ... ?
My interest lie at the bottom of the sea?

What sort of filtering has been done on the pairs we are seeing in the
data (as compared to what appeared in the original context)?

Thanks!
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse

Organizer DISCo Workshop 2011

Mar 4, 2011, 11:32:32 AM
to tped...@d.umn.edu, disco2011...@googlegroups.com
Hi Ted, 

sorry for taking so long to answer your questions. 

The selection process was the following: we extracted pairs using surface patterns (e.g. V DET N ; V N), selected by hand what looked reasonable, and checked the sample sentences to see whether the pair was really in the relation we wanted most of the time.

Pairs have been reduced to their base form (as available from the WaCky format), and function words between them have been omitted. So, the pairs you mentioned could come from a sentence like this: "Your name implies that your interests lie in running a number of experiments." - note the inflection and the function words.
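
As an illustration, the reduction works roughly like this (a toy sketch; the function-word list and the column layout here are placeholders, not the actual resources we used):

# (token, lemma) pairs as they might appear in the WaCky vertical format
sentence = [
    ("Your", "your"), ("name", "name"), ("implies", "imply"),
    ("that", "that"), ("your", "your"), ("interests", "interest"),
    ("lie", "lie"), ("in", "in"), ("running", "run"),
    ("a", "a"), ("number", "number"), ("of", "of"),
    ("experiments", "experiment"), (".", "."),
]

FUNCTION_WORDS = {"your", "that", "in", "a", "of", "."}

content = [lemma for _, lemma in sentence if lemma not in FUNCTION_WORDS]
print(content)
# ['name', 'imply', 'interest', 'lie', 'run', 'number', 'experiment']
# -> adjacent pairs give "name imply", "interest lie", "run number"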

"name come" looks like a selection error of ours: It should be "name comes from", obviously, and we excluded these cases (except when we failed to do so). Thank you very much for finding this! We will double-check so that none of those are found in the test set!

Judges often give medium ratings (or some give high and some give low, which averages to medium) when they feel that it is not entirely literal (where would interests lie down?) but there is still nothing idiomatic about the phrase. 

I hope this answers your questions, 

best,

Chris

Ted Pedersen

Mar 6, 2011, 11:54:10 AM
to disco2011...@googlegroups.com, Organizer DISCo Workshop 2011
Hi Chris,

Thanks for these clarifications, this is really helpful, and I think
all my questions are answered at this point!

Cordially,
Ted

Ted Pedersen

Mar 15, 2011, 12:00:20 PM
to disco2011...@googlegroups.com, Organizer DISCo Workshop 2011
Greetings all,

Just one rather specific question - how are function words defined? Is
there a list of function words you've used, or is it a set of
particular POS categories? Knowing that would be helpful in enabling
some sort of "look-up" from the WaCky corpus (to find the contexts
where the training examples originated...)

Thanks!
Ted


Organizer DISCo Workshop 2011

Mar 16, 2011, 12:02:25 PM
to disco2011...@googlegroups.com
Hi Ted, 

It will not be trivial to implement your corpus lookup for the general case, since a lot of manual scanning and selection went into the data to ensure that a) the parts of the phrase are related through the target relation in the particular sentence and b) the parts of the phrase are predominantly found in the target relation.

However, these are the selection patterns for the English tasks:

Adjective modifiers: JJ* NN*
Verb-Subject: NN* VV*
Verb-Object: VV* NN* and VV* DT NN*

For German tasks:
Adjective modifiers: ADJ* NN
Verb-Subject and Verb-object: Extracted VV* - NN* pairs within a window of five words and scanned manually. 
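
In Python, the English patterns amount to roughly the following (a simplified sketch, not our actual extraction code, and without the manual filtering described below):

def extract_pairs(tagged):
    """tagged: list of (lemma, pos) tuples for one sentence."""
    pairs = []
    for i in range(len(tagged) - 1):
        (l1, p1), (l2, p2) = tagged[i], tagged[i + 1]
        if p1.startswith("JJ") and p2.startswith("NN"):
            pairs.append(("ADJ_NN", l1, l2))       # JJ* NN*
        if p1.startswith("NN") and p2.startswith("VV"):
            pairs.append(("V_SUBJ", l1, l2))       # NN* VV* (noun verb, as in the data)
        if p1.startswith("VV") and p2.startswith("NN"):
            pairs.append(("V_OBJ", l1, l2))        # VV* NN*
        if p1.startswith("VV") and p2 == "DT" and i + 2 < len(tagged):
            l3, p3 = tagged[i + 2]
            if p3.startswith("NN"):
                pairs.append(("V_OBJ", l1, l3))    # VV* DT NN*
    return pairs

print(extract_pairs([("reinvent", "VV"), ("the", "DT"), ("wheel", "NN")]))
# [('V_OBJ', 'reinvent', 'wheel')]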

Again, these patterns overproduce and there was substantial manual work involved to trim the set of phrases as well as the set of sample sentences. 

best, 

Chris

Ted Pedersen

Mar 16, 2011, 5:35:20 PM
to disco2011...@googlegroups.com, Organizer DISCo Workshop 2011
Hi Chris,

Thanks once again!

This is just a bit of thinking aloud - I'm a little puzzled as to what
to do with the WaCky corpus, since it does seem like it will be hard
to locate occurrences of the phrases included in the test data. I've
been trying to do that with the training data and it's a bit of a
puzzle. What I have been attempting is to take that third column in
the part-of-speech-tagged version and use that as my "text" (since
that's the base form that we are getting in the phrases). Then I was
hoping to strip out the intermediate function words and hopefully be
left with phrases like "reinvent wheel" that could be lifted fairly
directly from the corpus. Now, what I would do after I actually
accomplished this is another question; mostly this is a "get to know
your data" exercise. Hmmm.

So...it's good to know that I probably shouldn't expect the above to
work especially well, and that's actually quite helpful (and will save
quite a bit of time...)

Hmmm. :)

Thanks,
Ted

On Wed, Mar 16, 2011 at 11:02 AM, Organizer DISCo Workshop 2011

Connie Jess

Mar 17, 2011, 10:07:29 AM
to disco2011...@googlegroups.com
Thanks for the answers to these questions!

The patterns you used bring to mind a couple questions that I have
recently encountered.

At least in the English WaCky corpus, POS tags appear to follow the
usual Penn Treebank standard, but there are some exceptions.
Specifically, verb tags may begin with either VB or VV. The tags VB,
VBG, VBP, VBZ, and VBN all occur in the corpus, along with VV, VVG,
VVP, VVZ and VVN (and others).

(1) Is there any documentation about the POS tags found in the WaCky corpora?

Secondly, the dependency-parsed version of the English WaCky corpus
appears to use relations with names similar to those listed in the
CoNLL 2008 shared task, but there are some discrepancies. I have been
unable to find any source describing the definitions of the relations
used by MaltParser (which was used to generate the dependencies).

(2) Do you know of any resource that defines these dependency relations?

Thanks!
Connie


Connie Jess

Mar 17, 2011, 10:34:59 AM
to disco2011...@googlegroups.com
Oh - I'm sorry - I asked the first question too quickly. I have since
realized that tags beginning with "VB" are used only for the verb "be",
and tags beginning with "VH" only for the verb "have".

Connie

Chris Biemann

Mar 17, 2011, 1:00:40 PM
to disco2011...@googlegroups.com
Hi Connie,

regarding your second question, I do not know of any resource that
defines these relations.

thanks,

Chris

Connie Jess

Mar 17, 2011, 3:38:31 PM
to disco2011...@googlegroups.com
Hi Chris,
Thanks anyway.
Connie

Siva Reddy

Apr 4, 2011, 12:06:17 PM
to disco2011...@googlegroups.com, Organizer DISCo Workshop 2011
Hi Chris and Organizers,

> For German tasks:
> Adjective modifiers: ADJ* NN
> Verb-Subject and Verb-object: Extracted VV* - NN* pairs within a window of five words and scanned manually.

I am afraid I am too late to ask this question. I have heard that
German has relatively free word order compared to English, and I
wonder how to distinguish between verb-subj and verb-obj patterns
automatically. Can you give me some hints? I am using the deWaC
corpus. Sorry if my question is misguided.

Regards,
Siva


--
Siva Reddy         http://www.sivareddy.in

Organizer DISCo Workshop 2011

Apr 4, 2011, 12:18:24 PM
to Siva Reddy, disco2011...@googlegroups.com
Hi Siva, 

I hope it is not so late that you won't be able to participate in the German task!
Nothing is wrong with this question: there is no good way to distinguish verb-subj and verb-obj in German automatically, and we do not ask you to do it.

For the selection of examples, we concentrated on pairs of verbs and nouns that typically occur in either subj or obj relation. This decision was made by native speaker intuition and verified on randomly selected samples. 

In case you need to identify verb-subj vs. verb-obj for the test data: it is relatively safe to say that all occurrences of the noun and the verb in a window of up to 5 words are either unrelated or in the relation as given in the test data. 
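
In code, that heuristic would look something like this (a rough sketch, not something we have verified):

def cooccur_within_window(lemmas, verb, noun, window=5):
    """True if the verb and noun lemmas occur within `window` words."""
    v_pos = [i for i, l in enumerate(lemmas) if l == verb]
    n_pos = [i for i, l in enumerate(lemmas) if l == noun]
    return any(abs(i - j) <= window for i in v_pos for j in n_pos)

# Treat each such co-occurrence as an instance of the relation given in
# the test data (or as unrelated noise to be discarded).
print(cooccur_within_window(
    ["she", "reinvent", "the", "old", "wheel"], "reinvent", "wheel"))  # True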

I hope this helps, 

best wishes

Chris

Siva Reddy

Apr 5, 2011, 7:12:12 AM
to Organizer DISCo Workshop 2011, disco2011...@googlegroups.com
> In case you need to identify verb-subj vs. verb-obj for the test data: it is relatively safe to say that all occurrences of the noun and the verb in a window of up to 5 words are either unrelated or in the relation as given in the test data.
>
> I hope this helps,

This surely helps. I am proceeding with this suggestion. Thanks very much.

Regards,
Siva