Encoding error in trial data

Maarten van Gompel

Oct 14, 2009, 4:26:26 AM
to cross-li...@googlegroups.com
Hi Els, Veronique, and participants,

I found an encoding error in the trial data. The documentation
explicitly says all data files will be delivered in UTF-8 format, but at
least bank.data is ISO-8859-1 (or ISO-8859-15). It might help if the XML headers of
the data files could explicitly mention their encoding, to prevent any
confusion.
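
As a rough illustration of how such a file can be spotted, here is a minimal Python sketch that tests whether a byte string decodes cleanly as UTF-8. The helper name is_valid_utf8 and the sample word are made up for the example and are not taken from the trial data. An XML header that states its encoding explicitly would read <?xml version="1.0" encoding="UTF-8"?>.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: in ISO-8859-1 the letter 'é' is the single byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
latin1_sample = "café".encode("iso-8859-1")
utf8_sample = "café".encode("utf-8")

print(is_valid_utf8(utf8_sample))
print(is_valid_utf8(latin1_sample))
```

A check like this only shows that a file is not UTF-8. It cannot tell ISO-8859-1 apart from ISO-8859-15, since both accept every byte value.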

Regards,

--

Maarten van Gompel (Proycon), ILK, Universiteit Tilburg

pro...@anaproy.nl
pro...@unilang.org

--------------------------------------------------------------------------
Personal Homepage: http://proycon.anaproy.nl
My Language Technology Site: http://proylt.anaproy.nl
UniLang Language Community: http://www.unilang.org
--------------------------------------------------------------------------
JABBER: maar...@luon.net, AIM: proycon, YAHOO: proycon
MSN: pro...@anaproy.nl
--------------------------------------------------------------------------

Els

Oct 14, 2009, 5:59:58 AM
to SemEval2010_Cross-Lingual Word Sense Disambiguation
Hi Maarten,

The bank.data file was indeed encoded in ANSI format.
I have checked the files and made sure they are all now encoded in UTF-8.
You can download an updated version of the trial data from:
http://lt3.hogent.be/semeval/Trial/
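
For anyone who would rather fix an old local copy than download again, a re-encoding step could look roughly like the sketch below. It assumes the old file is ISO-8859-1, which may not be exact: "ANSI" on Windows often means Windows-1252, so the source_encoding argument may need adjusting. The function name reencode_to_utf8 is hypothetical.

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as UTF-8.

    source_encoding is an assumption; pass "cp1252" if the file was
    saved as Windows "ANSI" and contains characters such as curly quotes.
    """
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical sample: 'é' is one byte (0xE9) in ISO-8859-1
# and two bytes (0xC3 0xA9) in UTF-8.
legacy_bytes = b"caf\xe9"
fixed_bytes = reencode_to_utf8(legacy_bytes)
print(fixed_bytes.decode("utf-8"))
```

For a file on disk, the same function can be applied to the bytes read with open(path, "rb") and the result written back with open(path, "wb").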

Best regards,
Els.
