Trial set XML ids 48-50 misaligned with both substitutions and gold-rankings

Sigrid Klerke

Mar 7, 2012, 1:12:52 PM

to semeval2012_lexi...@googlegroups.com

Hi,

We ran into an error caused by a misalignment between the "instance id" tag in contexts.xml and the line numbers in both the substitutions file and the substitutions.gold-rankings file, as illustrated in the example below. Ignoring the id tag solved our problem.

Best regards,

Sigrid

<context>Abstract : Analysis of contact between two chromosomal races of house mice in northern Italy show that natural selection will produce alleles that <head>bar</head> interracial matings if the resulting offspring are unfit hybrids .</context>

</instance>

</lexelt>

Sentence 48 rankings: {bar} {pub} {saloon}

Sujay Kumar Jauhar

Mar 7, 2012, 2:10:51 PM

to semeval2012_lexi...@googlegroups.com

Hi Sigrid (and everyone on the mailing list),

The problem you mention is inherent in the original SemEval Lexical Substitution dataset, which we have borrowed for our task. The 'contexts.xml' file is taken straight from the trial dataset without any modification, and it already contained this misalignment. For our part, we have corrected the 'substitutions' and 'substitutions.gold-rankings' files so that they are aligned with 'contexts.xml' by position: the instance with id 50, for example, corresponds to line 48 in the other two files. In effect, if you ignore the id numbers in the XML file, the instances map one-to-one, in order, onto the lines of the 'substitutions' and 'substitutions.gold-rankings' files.
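For anyone who would rather keep the original file, the positional alignment described above can be sketched roughly as follows. This is only an illustration, not the organizers' tooling: the function name `align_by_position`, the root element `corpus`, the `item` attribute, and the sample sentences and rankings are all assumptions made up for the example. The only structure taken from the thread is that `instance` elements carry an id and contain a `context` element with a `head` child.

```python
import xml.etree.ElementTree as ET


def align_by_position(xml_text, ranking_lines):
    """Pair each <instance> with a ranking line by document order, ignoring ids."""
    root = ET.fromstring(xml_text)
    instances = list(root.iter("instance"))  # document order
    if len(instances) != len(ranking_lines):
        raise ValueError(
            f"{len(instances)} instances but {len(ranking_lines)} ranking lines"
        )
    aligned = []
    for position, (inst, line) in enumerate(zip(instances, ranking_lines), start=1):
        context = inst.find("context")
        # itertext() also collects the text inside <head>...</head>
        text = "".join(context.itertext()) if context is not None else ""
        # The id attribute is kept for reference only; it is not used to align.
        aligned.append((position, inst.get("id"), text, line.strip()))
    return aligned


# Hypothetical miniature input: the second instance has a skipped id ("3").
sample_xml = """<corpus>
<lexelt item="bar.n">
<instance id="1"><context>A <head>bar</head> one .</context></instance>
<instance id="3"><context>A <head>bar</head> two .</context></instance>
</lexelt>
</corpus>"""
sample_rankings = [
    "Sentence 1 rankings: {bar} {pub}",
    "Sentence 2 rankings: {pub} {bar}",
]

aligned = align_by_position(sample_xml, sample_rankings)
for position, xml_id, text, line in aligned:
    print(position, xml_id, text, "|", line)
```

The length check is there so that a truncated or mismatched file fails loudly instead of silently pairing the wrong lines.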

However, I am attaching a corrected version of the XML file in which I have replaced the instance ids with the correct numbers. The same mistake occurs again later in the file, around id 135. It is not, however, present in the test set.
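If you want to see where the ids in your own copy drift, a small check along these lines may help. It is a sketch under one assumption: that the correct id of an instance is simply its 1-based position in the file, which is consistent with the id 50 / line 48 case mentioned above but is not verified against the full data. The function name `find_id_mismatches` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_id_mismatches(xml_text):
    """Return (position, id) pairs where an instance id differs from its position."""
    root = ET.fromstring(xml_text)
    mismatches = []
    for position, inst in enumerate(root.iter("instance"), start=1):
        # Assumption: the expected id equals the 1-based position in the file.
        if inst.get("id") != str(position):
            mismatches.append((position, inst.get("id")))
    return mismatches


# Hypothetical miniature input in which id 3 is skipped.
sample_ids_xml = """<corpus>
<lexelt item="bar.n">
<instance id="1"><context>one <head>bar</head></context></instance>
<instance id="2"><context>two <head>bar</head></context></instance>
<instance id="4"><context>three <head>bar</head></context></instance>
<instance id="5"><context>four <head>bar</head></context></instance>
</lexelt>
</corpus>"""

mismatches = find_id_mismatches(sample_ids_xml)
print(mismatches)
```

Every instance after a skipped id is reported, so the first entry in the list marks where the drift begins.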

We apologize for the inconvenience.

Regards,

Sujay.

contexts.xml
