cdec update: new devset format with MERT, new website

Chris Dyer

Nov 11, 2012, 9:55:22 PM
to cdec-...@googlegroups.com
Hello cdec users!

This is a quick announcement about a non-backward-compatible change
that has been committed to cdec. Going forward, the cdec parameter
optimization tools (MERT, PRO, etc.) will no longer accept devsets
specified as parallel source/reference files. Instead, each source
segment and its reference translations must be provided on a single
line: the source sentence first, followed by any number of references,
with fields separated by a triple pipe (|||). Such files can be
constructed from parallel files using the corpus/paste-files.pl
command included in cdec.
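
To make the single-line format concrete, here is a minimal Python
sketch of the kind of merge described above. It is not the
paste-files.pl script itself, only an illustration: the function name
paste_devset and the example sentences are hypothetical, and the
space-padded separator " ||| " is an assumption (the announcement only
specifies the triple pipe).

```python
def paste_devset(source_lines, *reference_line_lists, sep=" ||| "):
    """Merge parallel source/reference lines into single-line devset entries.

    Hypothetical helper for illustration; not part of cdec.
    The space-padded separator is an assumption.
    """
    columns = [list(source_lines)] + [list(refs) for refs in reference_line_lists]

    # Parallel files must have exactly one line per segment in every file.
    if len({len(column) for column in columns}) != 1:
        raise ValueError("source and reference inputs differ in line count")

    # Strip trailing newlines/whitespace so each entry stays on one line.
    return [sep.join(field.strip() for field in row) for row in zip(*columns)]


# Made-up example data: two source segments with two references each.
source = ["el gato negro\n", "hola mundo\n"]
refs_a = ["the black cat\n", "hello world\n"]
refs_b = ["the cat is black\n", "hi world\n"]

for entry in paste_devset(source, refs_a, refs_b):
    print(entry)
```

Each printed entry holds one source segment followed by its
references, which is the shape of line the tuning tools now expect.
For real data, the script shipped with cdec remains the tool to use.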

I prefer to avoid non-backward-compatible changes, but getting rid of
parallel files simplifies both the code and the processing pipelines,
which should make them easier to maintain. My apologies for any
inconvenience.

Finally, cdec has a new website at the same location:
http://www.cdec-decoder.org -- check it out!

Best,
Chris