Submission instructions

Diego Molla

Oct 31, 2010, 2:45:33 AM
to LT Programming Competition
I've just sent this message to all team contacts. If your team has not
received it, please tell me asap:

This is a gentle reminder that tomorrow, Monday 1 November, the test
data will be distributed to all participants. The submission system
will work as follows.

Tomorrow, Monday 1 November, the contact person of each team will
receive the test dataset and three numeric tokens. Each token allows one
submission. Any resubmission will overwrite the existing submission made
with the same token, so feel free to submit as early as you wish to
check if there are any errors during the submission process. This also
means that you are allowed to submit up to three distinct results. The
best result of all three will be selected to qualify for a prize.

The submission script will return any possible warnings generated by
the evaluation script (basically it's the same evaluation script that
was available with the training and development data set, but it will
use the final test set instead). The evaluation results will be made
public just after the closing date.

Remember that to qualify for any prize you will need to submit a
poster describing your work. Make the poster size A0 and send it to me
by email as a PDF file. The poster will be displayed at the ALTA
workshop in Melbourne, 9-10 December
(http://www.alta.asn.au/events/alta2010/index.html).

It is in your interest to submit the poster as early as you can. Once
we have examined the poster and are satisfied that the methods you used
are genuine (e.g. you didn't simply use a third-party system), we will
notify you whether you qualify for a prize. Feel free to send an early
draft of the poster at any time if you want to know the final result
early, and send us the final version of the poster by the due date
listed below.

A reminder of the key dates:

Release of test data (without annotations): 1 Nov 2010
Deadline for submission of results over test data: 3 Nov 2010
Release of preliminary results: 4 Nov 2010
Deadline for submission of system description poster: 26 Nov 2010

Best regards,

Diego

Giang Binh Tran

Oct 31, 2010, 7:58:44 AM
to lt-programmin...@googlegroups.com
Hi all,

I just have a small comment about this part of the evaluation script we received, langid-evaluate.prl:

while (<GOLD>)
{
    chomp;
    if (my($docid,$docclass) = &process_line($_))
    {
        $gold{$docid}{$docclass} = 1;
        $docclass_list{$docclass} = 1;
        $goldtotal++;
    }
}

close GOLD;

I think chomp is not robust enough here, so I use $line =~ s/\s+$//; instead.
I already got errors: F-score, precision and recall all came out as 0.000, just because of a special character that chomp does not catch when I ran the script on Mac OS.
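
To make the failure mode concrete, here is a small sketch of why a leftover carriage return can drive every score to zero. It is written in Python as an analogue of the Perl behaviour, not as a copy of langid-evaluate.prl: the tab-separated "docid, class" line format and the two function names are assumptions made purely for illustration.

```python
# Illustrative analogue only. The "docid<TAB>class" line format is an
# assumption; it is not taken from langid-evaluate.prl.

def parse_newline_only(line):
    # Strip only trailing "\n", roughly what chomp does with the default $/.
    docid, docclass = line.rstrip("\n").split("\t")
    return docid, docclass


def parse_all_whitespace(line):
    # Strip all trailing whitespace, like s/\s+$// in Perl.
    docid, docclass = line.rstrip().split("\t")
    return docid, docclass


gold_line = "doc1\ten\n"      # UNIX line ending
system_line = "doc1\ten\r\n"  # DOS line ending

# With newline-only stripping the class label keeps its "\r", so the
# gold label and the system label never compare equal.
print(parse_newline_only(gold_line) == parse_newline_only(system_line))

# With full whitespace stripping both lines yield the same pair.
print(parse_all_whitespace(gold_line) == parse_all_whitespace(system_line))
```

If the labels are used as hash keys, as in the loop above, a class that still carries a trailing carriage return is simply a different key from the gold class, which would explain scores of 0.000.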


best,

Timothy Baldwin

Nov 1, 2010, 8:16:03 AM
to giangt...@gmail.com, lt-programmin...@googlegroups.com
Hi,


> I just have a small comment about this part of the evaluation script we received, langid-evaluate.prl:
>
> while (<GOLD>)
> {
> chomp;
> if (my($docid,$docclass) = &process_line($_))
> {
> $gold{$docid}{$docclass} = 1;
> $docclass_list{$docclass} = 1;
> $goldtotal++;
> }
> }
>
> close GOLD;
>
> I think chomp is not robust enough here, so I use $line =~ s/\s+$//; instead.
> I already got errors: F-score, precision and recall all came out as 0.000, just because of a special character that chomp does not catch when I ran the script on Mac OS.

You are right that it's brittle and doesn't handle DOS-style line breaks, but
it should be possible to pre-convert your files to UNIX line breaks before
evaluating with the script, e.g. using dos2unix (or fromdos). Alternatively,
create a simple Perl filter to convert your files to UNIX, e.g. with something
like:

perl -i -pe 's/\r\n/\n/g' FILE

over your output file. This is perhaps a better solution than modifying the
evaluation script at this late stage.
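
For teams whose systems are not written in Perl, the same pre-conversion could be sketched in Python roughly as follows. The function names are hypothetical, and the handling of a bare carriage return (old Mac-style line breaks) is an extra assumption that goes beyond what the one-liner above does.

```python
def to_unix_newlines(data: bytes) -> bytes:
    # Replace CRLF first, so a DOS line break does not turn into two
    # line breaks when the remaining bare CRs are converted.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path: str) -> None:
    # Rewrite the file in place, working in binary mode so that Python
    # does no newline translation of its own.
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(to_unix_newlines(data))
```

Calling convert_file on your output file before running the evaluation script should leave files that already use UNIX line breaks unchanged.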


Tim
