Attribute Extraction evaluation script is now available

Satoshi Sekine

Nov 19, 2008, 4:33:49 PM
to Web People Search Task
We are distributing a preliminary evaluation script for the Attribute
Extraction task.

Go to http://nlp.uned.es/weps/
and look at the left side under "11/19/2008".

The script is written in Perl. Run it as follows for each name, and
use a web browser to view the HTML output:


perl eval.pl golden-file.txt your-file.txt > out.html


We believe it still needs some improvement for minor cases (such as
non-ASCII character handling). Please let me know if you find bugs or
have comments.

Thanks,
Satoshi

cdep...@inf.uc3m.es

Nov 20, 2008, 5:05:57 PM
to Web People Search Task
Dear Satoshi,

I believe that precision and recall have been interchanged in the
output file. Am I wrong?

I obtain the following results:

MATCH MISS1 MISS2 Precision Recall F-measure
2 1 66.667 100.000 80.000

when using the following input as gold:
0 Birth place Madrid
0 Affiliation UC3M
0 Other name Cesar

and this as the result:
0 Affiliation UC3M
0 Other name Cesar
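
A quick check of the expected scores for this example, assuming the usual definitions (precision = matches / system answers, recall = matches / gold answers). This is a minimal sketch and not the logic of eval.pl; the function name `score` and the tuple representation of each line are illustrative assumptions.

```python
def score(gold, system):
    """Return (precision, recall, f_measure) in percent for two collections of attribute tuples."""
    gold, system = set(gold), set(system)
    matches = len(gold & system)
    # Precision is measured against the system output, recall against the gold file.
    precision = 100.0 * matches / len(system) if system else 0.0
    recall = 100.0 * matches / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# The gold and result entries quoted in this message.
gold = [("0", "Birth place", "Madrid"), ("0", "Affiliation", "UC3M"), ("0", "Other name", "Cesar")]
system = [("0", "Affiliation", "UC3M"), ("0", "Other name", "Cesar")]

precision, recall, f_measure = score(gold, system)
print(f"{precision:.3f} {recall:.3f} {f_measure:.3f}")  # 100.000 66.667 80.000
```

Under these definitions precision should be 100.000 and recall 66.667, the reverse of the reported output; the F-measure is unaffected by the swap.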

A second bug: the attribute "Birth place" is spelled "Birthplace" in
the training set, and therefore this attribute is ignored by the
evaluation program when using the training data.

Best regards,
César.

Satoshi Sekine

Nov 20, 2008, 6:15:36 PM
to web-people-...@googlegroups.com
Dear Cesar,


Thank you very much for the bug report. We are sorry about it.
I made a quick fix. Could you please try it with the attached script?
(I'm traveling now and can't easily check the result, sorry.)


Thanks,
Satoshi

--
Satoshi Sekine
sek...@cs.nyu.edu
Attachment: eval2.pl

cdep...@inf.uc3m.es

Nov 22, 2008, 5:01:20 PM
to Web People Search Task
Thanks Satoshi,

The issue with precision and recall is solved now. I had to
convert the line endings in the script to get it to work on Linux,
though.
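
For anyone hitting the same problem: the description above suggests the script has Windows (CRLF) line endings, though that is an assumption. A minimal sketch for converting a file to Unix (LF) endings; the helper name `to_unix_newlines` is illustrative.

```python
from pathlib import Path

def to_unix_newlines(path):
    """Rewrite the file at `path` in place with LF line endings; return True if it changed."""
    p = Path(path)
    data = p.read_bytes()
    converted = data.replace(b"\r\n", b"\n")
    if converted != data:
        p.write_bytes(converted)
        return True
    return False

# Example (file name taken from the attachment in this thread):
# to_unix_newlines("eval2.pl")
```

Working on bytes avoids guessing the file's text encoding, which matters if the script contains non-ASCII characters.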

I see that you changed all the attribute names to be lowercase and
without spaces. Does that mean we should use those for the
submission?

For your information, running the following command I found some
other inconsistencies with Fax, Other name and Education in the
training data.

> cat *.txt | cut -f 2 | sort -iu

Affiliation
Award
Birthplace
Date of birth
Degree
Education
Email
Fax
FAX
Location
Major
Mentor
Nationality
Occupation
Other name
Other Name
Phone
Relatives
School
Web site
Work
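
Inconsistencies like these can be found automatically by grouping names that differ only in case or spacing. A minimal sketch, assuming normalization means lowercasing and removing whitespace (this matches the description of the revised script but is not taken from it); the names `normalize` and `find_variants` are illustrative.

```python
from collections import defaultdict

def normalize(name):
    """Lowercase and remove all whitespace, e.g. 'Other Name' -> 'othername'."""
    return "".join(name.split()).lower()

def find_variants(names):
    """Map each normalized key to its spellings, keeping only keys with more than one."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}

# A few names from the list above, plus "Birth place" from the earlier gold example.
names = ["Affiliation", "Birthplace", "Birth place", "Fax", "FAX",
         "Other name", "Other Name", "Phone"]
variants = find_variants(names)
for key, spellings in sorted(variants.items()):
    print(key, spellings)
```

Names with a single spelling, such as Affiliation and Phone, are not reported.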

Best regards,
César.

Satoshi Sekine

Nov 22, 2008, 7:43:29 PM
to web-people-...@googlegroups.com
Hi,


I'm sorry for the confusion.
You can use the original naming.

The new program changes the attribute names in both the gold and test
files, so you don't need to worry about it.

I will check the attribute naming in the test data very carefully.


Thanks,
Satoshi

--
Satoshi Sekine
sek...@cs.nyu.edu
