I am proud to announce that Jane 2.0 has finally been released.
In addition to the hierarchical phrase-based paradigm, it now fully
supports standard phrase-based extraction, decoding, and forced
alignment phrase training.
Further additions include, but are not limited to:
* extraction
- introduced phrase-based extraction
- observation histogram pruning for rule tables
* tools
- PhraseFeatureAdder: adds phrase-level feature scores to existing
phrase tables
* features
- flexible count features
- unaligned word count features
- insertion and deletion models
- binary feature for reordering hierarchical rules
- different scoring methods for word lexicon models
(length-normalized, noisy-or, Moses-style)
- regularized IBM Model 1
- precomputation of source-to-target and target-to-source phrase-level
triplet lexicon and discriminative word lexicon scores
* translation
- faster hierarchical decoding (about 10-20%)
- lexicalized discriminative reordering model for hierarchical
translation (tools for training the model are not included in Jane)
- now capable of phrase-based decoding (scss and fastScss decoders)
- forced alignment phrase training (phrase-based) with leave-one-out
and cross-validation
* dependency
- dependency language model is now integrated into the decoder
Download and enjoy!
Best wishes,
Joern