The evaluation period has started

Javier Artiles

Dec 1, 2008, 12:13:53 AM

to weps-or...@lsi.uned.es, web-people-...@googlegroups.com, web-people-search-tas...@googlegroups.com

Dear all,

The evaluation period has started! The test data is available at the following address:
http://nlp.uned.es/weps/weps2/WePS2_test_data.zip

Please send your system's output in a zip file to the organizers' address, weps-or...@lsi.uned.es
Remember that the deadline for this submission is the 8th of December. The evaluation results for each
team will be sent back on the 17th of December.

In the test data you will find metadata files describing the documents and the web pages for each name.

* Metadata

Each XML file contains the metadata of the top 150 search results for a name.
It includes the URL, title, rank number (starting at 1) and MIME type of each document.
Not all documents in the metadata files are part of the WePS-2 corpus. The "inWepsCorpus" attribute
on each "doc" element indicates whether the referenced document is included. Documents that are not
included will not be evaluated in either the attribute extraction or the clustering task.
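
As a rough illustration of the filtering described above, here is a minimal Python sketch that keeps only the documents flagged as part of the corpus. Only the "doc" element and the "inWepsCorpus" attribute come from the description; the root element name, the other attribute names and the yes/no values in the sample are assumptions, so please check them against the real metadata files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Only "doc" and "inWepsCorpus" are taken from the
# announcement; the root tag, "rank"/"url" attribute names and the
# yes/no values are assumptions made for this sketch.
SAMPLE = """<corpus>
  <doc rank="1" url="http://example.org/a" inWepsCorpus="yes"/>
  <doc rank="2" url="http://example.org/b" inWepsCorpus="no"/>
  <doc rank="3" url="http://example.org/c" inWepsCorpus="yes"/>
</corpus>"""

def ranks_in_corpus(xml_text, true_values=("yes", "true", "1")):
    """Return the ranks of all "doc" elements flagged as part of the corpus."""
    root = ET.fromstring(xml_text)
    ranks = []
    for doc in root.iter("doc"):
        # Normalise the flag so that "Yes", "TRUE" etc. are also accepted.
        flag = doc.get("inWepsCorpus", "").strip().lower()
        if flag in true_values:
            ranks.append(int(doc.get("rank")))
    return ranks

print(ranks_in_corpus(SAMPLE))
```

Documents filtered out this way can simply be left out of a system's output, since they are not evaluated.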

* Web Pages

The web pages directory contains all the documents downloaded from the search results for each person name.
Documents are named according to their position in the ranking (001.html, 002.html). In many cases the list
of files skips numbers from the original ranking, because not all documents were downloaded and
included in the corpus. Only HTML and plain text documents were downloaded, and documents that do not contain
the query string (the person name) were ignored too. In some cases the document could not be downloaded or the
server was unavailable.
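
The naming scheme above can be handled with a few lines of Python. This is only a sketch: it assumes every page file is a three-digit rank followed by ".html", as in the two examples, and the helper names are made up for illustration.

```python
import os
import re
import tempfile

# Assumption: every page is stored as a three-digit rank plus ".html",
# as in the examples 001.html and 002.html.
PAGE_NAME = re.compile(r"^(\d{3})\.html$")

def page_filename(rank):
    """Return the expected file name for a rank, e.g. 7 -> '007.html'."""
    return f"{rank:03d}.html"

def downloaded_ranks(pages_dir):
    """Return the sorted ranks that have a page file in pages_dir."""
    ranks = []
    for name in os.listdir(pages_dir):
        match = PAGE_NAME.match(name)
        if match:
            ranks.append(int(match.group(1)))
    return sorted(ranks)

# Demo on a throwaway directory standing in for one person's page folder.
with tempfile.TemporaryDirectory() as demo_dir:
    for rank in (1, 2, 5):
        open(os.path.join(demo_dir, page_filename(rank)), "w").close()
    present = downloaded_ranks(demo_dir)
    # Ranks skipped in the file list correspond to documents not in the corpus.
    missing = [r for r in range(1, max(present) + 1) if r not in present]
    print(present, missing)
```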

* Clustering task

- The format of your system output must be the same as that found in the
training data. For instance:

<clustering>
    <entity id="0">
        <doc rank="0"/>
        <doc rank="1"/>
        <doc rank="3"/>
        <doc rank="4"/>
    </entity>
    <entity id="1">
        <doc rank="5"/>
    ....

- One XML file is expected for each clustering problem (person name).
Please check that your XML is well-formed so that it can be parsed. Name the files using
the person name in uppercase, as in "AMANDA_LENTZ.xml".
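
One possible way to produce such a file and check that it parses is sketched below in Python. The element and attribute names are copied from the example above; the function names, and the idea of replacing spaces in the name with underscores, are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def clustering_xml(clusters):
    """Build a <clustering> document from a list of lists of document ranks."""
    root = ET.Element("clustering")
    for entity_id, ranks in enumerate(clusters):
        entity = ET.SubElement(root, "entity", id=str(entity_id))
        for rank in ranks:
            ET.SubElement(entity, "doc", rank=str(rank))
    return ET.tostring(root, encoding="unicode")

def output_filename(person_name):
    """Uppercase the name and join its parts with underscores (assumed rule)."""
    return "_".join(person_name.upper().split()) + ".xml"

# Same clusters as in the example above.
xml_text = clustering_xml([[0, 1, 3, 4], [5]])

# Parsing the result again is a cheap well-formedness check:
# ET.fromstring raises ParseError if the XML is broken.
parsed = ET.fromstring(xml_text)
print(output_filename("Amanda Lentz"), len(parsed.findall("entity")))
```

Building the file with an XML library instead of string concatenation avoids unclosed tags and unescaped characters.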

* Attribute Extraction Task

- Create your system output as directed in Section 6 of the task
guideline (or the same format as the training data). Information about
each name should be in a separate file, which has a name like
"Alexander_Macomb.txt".

- All the files should be in a single directory. The name of the
directory should be the task name and your site name (e.g. AE_NYU;
change NYU to your site name). Please make a zip/tar/tgz file (e.g.
AE_NYU.tgz) which contains the directory. Then send it by e-mail to the
organizer to submit.

- Note that the training data includes an "Education" attribute, which was
later changed to "school", "major" and "degree". We apologize that we
did not have time to apply this change to the training data.

- Information about "pages to ignore" (Section 3 in the guideline) will
be distributed after the evaluation. Please create your output for all
pages. In other words, you don't have to detect which pages to ignore
yourself. We will not evaluate the pages to ignore, even if you
produce output for them.

- After submission, we will check our gold data against the system outputs.
However, we expect that checking it against all the outputs would be too
time-consuming. We are likely to check only against output
that was produced by at least two systems.
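
For the packaging step described above (one directory named after the task and site, wrapped in an archive), a minimal Python sketch could look like this. The function name is hypothetical, and the sketch assumes the AE_<site> directory already holds one file per name.

```python
import os
import tarfile
import tempfile

def package_submission(parent_dir, site_name):
    """Create AE_<site>.tgz next to the AE_<site> directory; return its path."""
    dir_name = f"AE_{site_name}"
    archive_path = os.path.join(parent_dir, dir_name + ".tgz")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the directory itself as the top level of the archive.
        tar.add(os.path.join(parent_dir, dir_name), arcname=dir_name)
    return archive_path

# Demo with a throwaway directory and an empty placeholder output file.
with tempfile.TemporaryDirectory() as work_dir:
    os.mkdir(os.path.join(work_dir, "AE_NYU"))
    with open(os.path.join(work_dir, "AE_NYU", "Alexander_Macomb.txt"), "w") as f:
        f.write("")  # real content follows Section 6 of the task guideline
    archive = package_submission(work_dir, "NYU")
    with tarfile.open(archive) as tar:
        members = sorted(tar.getnames())
    print(members)
```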

Javier Artiles (on behalf of the WePS organizers)

juie.jiang

Dec 4, 2008, 5:22:09 AM

to Web People Search Task

What does "Each team can submit up to five runs" mean? Can we
submit the results of five runs at one time, or does it mean something else? Thanks.

On Dec 1, 1:13 pm, "Javier Artiles" <jav...@gmail.com> wrote:
> Dear all,
>
> The evaluation period has started! The test data is available at the
> following address: http://nlp.uned.es/weps/weps2/WePS2_test_data.zip
>
> Please send your system's output in a zip file to the organizers' address,
> weps-organiz...@lsi.uned.es

Paul Kalmar

Dec 4, 2008, 8:04:24 PM

to web-people-...@googlegroups.com

What time on the 8th is the deadline, 23:59 PST? GMT?

Thanks,

Paul

Javier Artiles

Dec 5, 2008, 4:02:07 PM

to web-people-...@googlegroups.com

The deadline for the submission is midnight EST (Eastern Standard Time).
The same time zone was used to publish the test data.
You can keep track of the remaining time on this page:

http://www.timeanddate.com/worldclock/city.html?n=179

Javier.

--
------------------------------------------------------------------------------
Javier Artiles Picón
Departamento de Lenguajes y Sistemas Informáticos
ETSI Informática, UNED

Phone: +34 91 398 8106
Fax: +34 91 398 65 35
Home page: nlp.uned.es/~javier
LinkedIn page: www.linkedin.com/in/javierartiles
------------------------------------------------------------------------------

Javier Artiles

Dec 5, 2008, 4:08:37 PM

to web-people-...@googlegroups.com

Dear Juie Jiang,

It means that for each of the tasks (clustering and attribute extraction) you can submit up
to 5 different system outputs. This should help you compare different approaches or
variations of your system.

Javier.

Thierry Poibeau

Dec 6, 2008, 7:56:25 PM

to web-people-...@googlegroups.com

Dear Javier,

The website (nlp.uned.es) seems to have been unreachable since yesterday (at
least from France).

Thanks,
Thierry

韩先培

Dec 7, 2008, 12:57:05 AM

to web-people-search-task

Dear Javier,

It cannot be reached from China either.

Also, will a confirmation message be sent to us after we submit our results?

Thanks

Xianpei Han

2008-12-07

Chinese Information Processing Group(CIP)
National Laboratory of Pattern Recognition(NLPR)
Institute of Automation, Chinese Academy of Sciences
ADDR: Automation Building Room 1010,Zhongguancun East
Road 95, HaiDian District, Beijing, China. 100190.
TEL: +86-010-82614468

From: Thierry Poibeau
Sent: 2008-12-07 08:56:39
To: web-people-search-task
Cc:
Subject: [WePS-Semeval07] Re: The evaluation period has started

priya venkateshan

Dec 7, 2008, 2:30:21 AM

to web-people-...@googlegroups.com

It's not reachable from India either.

Julio Gonzalo

Dec 7, 2008, 5:39:50 AM

to web-people-...@googlegroups.com

We're sorry, the server is (unexpectedly) down and it will be difficult to solve the problem before Tuesday. We'll do our best to fix it sooner. In any case you can still submit your results to weps-or...@lsi.uned.es (and yes, Javier will send confirmation messages).

If necessary we will postpone the deadline by a couple of days. We'll get back to you soon.

Thanks for your understanding,

Julio

priya venkateshan wrote:

Javier Artiles

Dec 7, 2008, 2:16:47 PM

to web-people-...@googlegroups.com

I've copied the most important information to another server until the WePS site is online again.
Please let me know if you need anything else.

Test data
http://www.lsi.uned.es/weps/weps-2.zip

Training data
http://www.lsi.uned.es/weps/weps-1.zip
http://www.lsi.uned.es/weps/weps-2_AE_training.zip

Task definitions
http://www.lsi.uned.es/weps/WePS2_Attribute_Extraction.pdf
http://www.lsi.uned.es/weps/WePS2_Clustering.pdf

Call for participation and evaluation schedule
http://www.lsi.uned.es/weps/WePS2_Call_for_participation.pdf

You can access the WePS task bibliography from CiteULike
http://www.citeulike.org/user/weps-task

Javier

Satoshi Sekine

Dec 16, 2008, 1:29:17 PM

to web-people-...@googlegroups.com

Dear WePS participants,

Thank you for participating in WePS 2.

We must apologize that the notification of the WePS 2 Attribute
Extraction task results will be delayed. Adjudicating the human-made
answers against the system outputs is taking more time than expected.
We hope to send the results by Christmas. We will
inform you once we know when they will be ready.

This is only for the AE task. The clustering results should be ready
tomorrow as scheduled.

Sorry for the inconvenience and thank you for your patience.

--
Satoshi Sekine
sek...@cs.nyu.edu

Satoshi Sekine

Dec 24, 2008, 11:35:52 AM

to web-people-...@googlegroups.com

Dear WePS participants,

We have to apologize again that the data for the WePS2 AE task is not ready
yet. We've finished 20 names so far, so I expect to distribute the
results soon after the new year. Again, I'm sorry for the delay and thank
you for your patience.

Best Regards,
Satoshi Sekine

P.S. Merry Christmas and a happy new year!

--
Satoshi Sekine
sek...@cs.nyu.edu
