lattices with EPSILON

22 views
Skip to first unread message

Hieu Hoang

unread,
Oct 4, 2013, 8:52:30 AM10/4/13
to cdec-...@googlegroups.com, joshua_t...@googlegroups.com, moses-support
I'm just looking at the lattices decoding, as implemented in moses.

for confusion networks, it's fair to have EPSILON words (that represent blank words). However, I don't see the point of them in lattices.

Anyone have an opinion? How is it implemented in cdec & joshua?

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

Chris Dyer

unread,
Oct 4, 2013, 12:49:31 PM10/4/13
to Nicola Bertoldi, Hieu Hoang, moses-support, <cdec-users@googlegroups.com>, <joshua_technical@googlegroups.com>
It's useful to have epsilons since it simplifies the creation of
lattices in some cases. Yes, you can convert them to a deterministic
equivalent, but that involves implementing FSA determinatization (or
using a tool like https://pypi.python.org/pypi/pyfst), which may not
be convenient.

Btw, I've also noticed that memory usage with lattices/CNs explodes
with non-binarized phrase tables (maybe also with binarized PTs?).
This is independent of the size of the phrase table and only seems to
be a function of the lattice structure. I'm not sure what's going on
(the code has changed substantially since I last looked at it). But,
you should always match paths in the lattice with paths in the phrase
table trie- maybe moses is now trying to extract all possible paths in
the lattice up to max-phrase-size or something?

On Fri, Oct 4, 2013 at 11:22 AM, Nicola Bertoldi <bert...@fbk.eu> wrote:
> I don't see any reason why a lattice should contain an EPSILON edge.
>
> In a confusion network, EPSILON are needed to allow the translation of input of different lengths.
> The sausage structure of the CN imposes the same amount of source words,
> and the EPSILONs overcome this constraint.
>
> This is not the case for lattice, because you can have any number of edges/words in a complete source path.
>
>
> cheers,
> Nicola
> _______________________________________________
> Moses-support mailing list
> Moses-...@mit.edu<mailto:Moses-...@mit.edu>
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-...@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

Hieu Hoang

unread,
Oct 4, 2013, 4:02:42 PM10/4/13
to cdec-...@googlegroups.com, Nicola Bertoldi, moses-support
@nicola - i didn't see a reason either but some lattices from a speech recognizer contains them so was just curious. I think chris has a point - they may be easier to create.

I think they may also more efficient to decode. In a non-deterministic lattice, you might have the 2 edges with the same symbol coming out of 1 node. Each would have to be decoded separately.

However, its a pain to decode epsilons and there might be weird edge cases, eg. consecutive, beginning and end epsilons, entirely epsiloms.

@chris - cheers for the explanation. i might use victor's code and see how it goes.

Do you have an example (large) lattice that blows up memory that you can share?

Yes - i've changed the code to extract all possible paths. In fact, i extract all paths from beginning to end of sentence, without limit. 2 reasons for this
   1. I also divorced extracting the path creation from the phrase-table lookup. In the general case there's multiple phrase-tables so it's difficult to keep track of the tries. Also, the intertwinning of the binary pt loookup with lattices made it difficult to read.
   2. I want to give each feature function the opprtunity to score with full knowledge of the path.

This may have to be altered if the memory explosion is too drastic




--
You received this message because you are subscribed to the Google Groups "cdec users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cdec-users+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hieu Hoang

unread,
Oct 7, 2013, 6:54:51 AM10/7/13
to Ondrej Bojar, Liane Kirsten GUILLOU, luli...@gmail.com, Alexandra Birch, cdec-...@googlegroups.com, moses-support
@ondrej - Yes, Yulia's lattices look like confusion networks in disguised so there will be a large number of paths through the lattice.

the memory explosion is due to my code creating an object for every path. It was mainly for the reason mention previously above, ie:

   I want to give each feature function the opprtunity to score with full knowledge of the path.

However, the old binary phrase-table doesn't require these objects to do the lookups. Therefore, to enable Yulia and anyone else to decode large lattices, my code will not run when
   1. decoding lattice/confusion networks, AND
   2. using the old binary phrase table.

@Liang - thanks for the suggestions. I'm not sure how our lattice were created. Lexi knows

thanks for all who responded, was very useful.



On 4 October 2013 22:20, Ondrej Bojar <bo...@ufal.mff.cuni.cz> wrote:
Hi,

while you can always run rmepsilon from openfst or other toolkit, epsilon edges will be probably particularly useful if one would use different semirings for different components of the score vector. With generic toolkits, all the components of the score vector are processed in a single manner. Depending on whether Moses features do the "plus" of their respective scores on their own, each feature can use its own semiring.

The probably (in some sense) maximal explosion in the number of paths is achieved when the lattice has the form of a confusion network (no epsilons). You get the full cartesian product of choices of the first token, the second token etc.

Cheers, Ondrej.
--
Ondrej Bojar
http://www.cuni.cz/~obo

_______________________________________________
Moses-support mailing list
Moses-...@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



--
Reply all
Reply to author
Forward
0 new messages