cdec update: MERT & PRO documentation and interface change

Chris Dyer

Nov 11, 2011, 2:27:47 PM
to cdec-...@googlegroups.com
If you use MERT and/or PRO with cdec, or would like to, this message
pertains to you.

Over the past couple of weeks, I've made a few small changes to cdec
that should make it easier to run MERT and/or PRO training in more
places and with better outcomes. First, there is finally some "getting
started" documentation available:
http://www.cdec-decoder.org/index.php?title=How_to_run_MERT_or_PRO
It does assume you know how to configure cdec with a translation and
language model (hopefully one day there will be documentation on that,
too). There is also better (and more correct) documentation in the
command line tools themselves.

Second, and this is of particular importance for existing users: you
MUST specify --qsub on the command line if you want dist-vest.pl or
dist-pro.pl to submit jobs using qsub. Otherwise, the scripts will run them
locally. If cdec doesn't know how qsub is configured in your
environment, it will give you an error.

Third, if you've been using PRO, I've recently changed the semantics
of the regularization parameters from something like a "variance" to
something like a "penalty", i.e., the higher you make them, the more
you regularize. If you have tuned these values in the past, you will
need to retune them.
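
To illustrate why old values do not carry over, here is a minimal sketch of the two conventions for a generic L2 regularizer. It is not cdec's actual objective: the function names and the exact formulas are assumptions chosen only to show the direction of the change.

```python
# Hypothetical illustration only; not taken from the cdec source.

def penalty_variance_style(weights, variance):
    """L2 term under a 'variance'-style parameter: a larger value means weaker regularization."""
    return sum(w * w for w in weights) / (2.0 * variance)


def penalty_strength_style(weights, strength):
    """L2 term under a 'penalty'-style parameter: a larger value means stronger regularization."""
    return strength * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]
for value in (0.1, 1.0, 10.0):
    # Same numeric setting, opposite effect on the size of the regularization term.
    print(value, penalty_variance_style(weights, value), penalty_strength_style(weights, value))
```

Under the first convention, raising the parameter shrinks the regularization term; under the second, raising it grows the term. A value tuned under one convention therefore pushes in the opposite direction under the other.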

If you haven't updated recently, you are strongly encouraged to do so.
Cdec is faster than ever, uses less memory than ever, and has more
scoring functions (feature functions) than ever!

-Chris

Joern Wuebker

Dec 20, 2011, 8:43:09 AM
to cdec users
Hi Chris and cdec-Fans!

I would like to try running MERT with cdec, but the above link seems
to be broken. Is the documentation still available somewhere else?
My thanks to Hieu for the recompiled LM to make the example work.
Further, I am wondering whether cdec provides wrapper scripts for
distributed decoding on an SGE cluster?

Thanks,
Joern

Chris Dyer

Dec 20, 2011, 5:40:43 PM
to cdec-...@googlegroups.com
Hello Joern!
MERT will run with cdec. I'll get the domain re-registered. Hang on. :)
Chris

Chris Dyer

Dec 20, 2011, 11:37:21 PM
to cdec-...@googlegroups.com
Joern & others,
The cdec documentation has been restored.

Please see the instructions for running MERT here:
http://www.cdec-decoder.org/index.php?title=How_to_run_MERT_or_PRO

If you have qsub set up, or large, shared-memory machines, it should
be relatively easy to get your environment to run the decoder and
optimizer in parallel. Let me know if you have any questions.

Chris

Joern Wuebker

Dec 21, 2011, 10:07:48 AM
to cdec users
Great, thanks!
One last question for this year: Does cdec support a binary grammar
format? Or do you use other tricks to deal with large scfg tables?

Merry Xmas!

Chris Dyer

Dec 21, 2011, 11:32:28 AM
to cdec-...@googlegroups.com
I've found that per-sentence grammars are the best way to deal with
large models. Extract all the rules necessary to translate just one
sentence and put them in a file. You can mark up the input as follows:

<seg grammar="/path/to/grammar.sent0.gz" id="0"> sentence 0 input </seg>

This is a little different than how other decoders do it, and it means
you can't do online translation of arbitrary inputs, but it is much
faster and less memory intensive than other options.
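
As a rough sketch of how the input could be marked up in bulk, the Python below wraps each sentence in a <seg> element like the one shown above. The helper names and the grammar.sentN.gz naming pattern (generalized from the single example path) are assumptions for illustration, and it does no escaping of special characters in the sentence text.

```python
# Hypothetical helpers; only the <seg grammar="..." id="..."> format comes from the message above.

def wrap_segment(sentence, grammar_path, seg_id):
    """Return one input line pointing the decoder at a per-sentence grammar file."""
    return f'<seg grammar="{grammar_path}" id="{seg_id}"> {sentence.strip()} </seg>'


def wrap_corpus(sentences, grammar_dir):
    """Wrap every sentence, assuming grammars are stored as grammar.sent<N>.gz in grammar_dir."""
    lines = []
    for seg_id, sentence in enumerate(sentences):
        grammar_path = f"{grammar_dir}/grammar.sent{seg_id}.gz"
        lines.append(wrap_segment(sentence, grammar_path, seg_id))
    return lines


if __name__ == "__main__":
    for line in wrap_corpus(["sentence 0 input", "sentence 1 input"], "/path/to"):
        print(line)
```

The grammar files themselves still have to be extracted separately, one per sentence, before decoding.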

-Chris
