Training and test data for Task 4

Deniz Yuret

Feb 28, 2007, 8:17:19 AM

to semantic...@googlegroups.com

The data for task 4 is now available on the SemEval 2007 website. One
way to get there (there may be easier ways):

1. Go to the task 4 data page:
http://nlp.cs.swarthmore.edu/semeval/tasks/task04/data.shtml
2. Click on the link that says "available Feb 26, 2007"
3. That should take you to a login page. If you don't have an account
yet, register for one, then log in.
4. That takes you to the "Semeval Task List & Files" page for your
team. Click on "Add Task/System".
5. Select task 04 in the resulting page and add it to the tasks for your team.
6. When you go back to the "Semeval Task List & Files" page, the
download buttons for Task 04 should be there.

deniz

Sophia Katrenko

Feb 28, 2007, 12:02:16 PM

to semantic...@googlegroups.com

Dear organizers,

We'd like to ask you about the following:

1. Which parser have you used to extract the noun phrases (if any)?
2. Which version of WordNet has been used to annotate nominals?

Thank you in advance,
Sophia Katrenko

--
Sophia Katrenko

Human Computer Studies Laboratory
Informatics Institute
Faculty of Science
Universiteit van Amsterdam

Kruislaan 419, Matrix I
1098 VA Amsterdam
The Netherlands

Tel.: +31 20 888 4686
http://staff.science.uva.nl/~katrenko/

peter....@nrc-cnrc.gc.ca

Feb 28, 2007, 12:39:20 PM

to SemanticRelations

Hi,

> 1. Which parser have you used to extract the noun phrases (if any)?

1. No parser was used. The noun phrase boundaries were chosen
manually. For each relation, one person selected the sentences and two
other people chose the WordNet labels and the true/false labels. The
noun phrase boundaries were chosen by the person who selected the
sentences. In a few cases, the boundaries were adjusted by the people
who chose the labels. We spent a lot of time discussing the sentences
and the labels, but we didn't spend much time discussing the
boundaries for the noun phrases, so there is likely some inconsistency
in the boundaries. The WordNet labels often indicate a sub-phrase of
the full phrase indicated by the markup in the sentence (because the
full phrase was not in WordNet). The sub-phrases in the WordNet labels
may be more consistent than the boundaries that are indicated by the
markup.

> 2. Which version of WordNet has been used to annotate nominals?

We used WordNet 3.0.

Note that the early release of WordNet 3.0 had a bug in it. The sense
numbers in the early release are wrong and do not match the numbers in
the current release. Unfortunately, both releases are called WordNet
3.0. However, we use sense keys rather than sense numbers, and the
sense keys are the same in both the early release and the current
release.

Here is how to tell whether you have the early release or the current
release. Go to the directory where the sense index file is located and
execute the following command:

cut -f4 -d' ' index.sense | sort | uniq | wc -l

If the result is 235, you have the current release of WordNet 3.0. If
the result is 1, you have the early (buggy) release.
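
For anyone without the Unix tools at hand, the same check can be
sketched in Python. This is only a rough equivalent of the command
above: it counts the distinct values in the fourth space-separated
field of each line. The function names are made up for illustration,
and the sketch assumes every line has at least four fields (shorter
lines are skipped, which differs slightly from what cut does).

```python
def count_distinct_fourth_field(lines):
    """Count distinct values of the 4th space-separated field,
    roughly what `cut -f4 -d' ' | sort | uniq | wc -l` reports."""
    values = set()
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        # Assumption: well-formed lines have at least four fields;
        # anything shorter is skipped.
        if len(fields) >= 4:
            values.add(fields[3])
    return len(values)


def classify_release(distinct_count):
    """Map the count to the two cases described in the message above."""
    if distinct_count == 235:
        return "current"
    if distinct_count == 1:
        return "early (buggy)"
    return "unknown"


def check_release(path="index.sense"):
    """Read the sense index at `path` and report which release it looks like."""
    with open(path, encoding="utf-8") as f:
        return classify_release(count_distinct_fourth_field(f))


# Usage, run from the directory that holds the sense index file:
#   print(check_release("index.sense"))
```

Any count other than 235 or 1 is reported as "unknown", since the
message above only describes those two cases.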

Best wishes,
Peter.
