cube-growing hypothesis recombination

wenduan

Jun 11, 2011, 1:01:57 PM

to jane Users

Hi!

A few questions came up while I was trying to understand cube-growing in
Jane.

Does cube-growing in Jane do hypothesis recombination for sub-spans in
the top-down +LM pass? Does it do any threshold pruning for each
sub-span, as in cube-pruning?

Does it ever sort the hypotheses generated for a sub-span in the top-
down +LM pass? If it does, will it affect the (k-1)th best that has
already been produced for a span?

I'm assuming that in cube-growing, once a hypothesis has been enumerated
for a node (e.g., it is the first-best for that node according to the
heuristic bound), then even if a better hypothesis comes along afterwards
(after we enumerate more hypotheses), we will still use the first
generated first-best. Am I understanding this correctly?

thanks in advance.
wenduan

wenduan

Jun 14, 2011, 5:33:25 AM

to jane Users

Can anyone comment on this?

wenduan

David Vilar

Jun 14, 2011, 6:14:08 AM

to jane-...@googlegroups.com

Hi!

On 06/14/2011 11:33 AM, wenduan wrote:
> Can anyone comment on this?

Sorry for the delay, but there were holidays in Germany :)

> On Jun 11, 6:01 pm, wenduan<xuwend...@gmail.com> wrote:
>> Does cube-growing in Jane do hypothesis recombination for sub-spans in
>> the top-down +LM pass? Does it do any threshold pruning for each
>> sub-span, as in cube-pruning?

Yes for both questions. The corresponding parameters are
CubeGrow.recombination and CubeGrow.CGBufferThreshold. Note that
threshold pruning is deactivated by default.
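
As a rough illustration of what these two operations do to a buffer of
candidate hypotheses, here is a minimal Python sketch. It is not Jane's
actual code: the names Hyp, recombine and threshold_prune are hypothetical,
and it assumes that scores are costs (lower is better) and that two
hypotheses can be recombined when they share the same LM state.

```python
from typing import NamedTuple, Optional


class Hyp(NamedTuple):
    cost: float      # assumption: lower cost is better
    lm_state: tuple  # boundary words the LM still needs (recombination key)
    text: str


def recombine(hyps):
    """Keep only the cheapest hypothesis for each LM state."""
    best = {}
    for h in hyps:
        if h.lm_state not in best or h.cost < best[h.lm_state].cost:
            best[h.lm_state] = h
    return list(best.values())


def threshold_prune(hyps, threshold: Optional[float]):
    """Drop hypotheses more than `threshold` worse than the best one.

    threshold=None models pruning being switched off.
    """
    if threshold is None or not hyps:
        return list(hyps)
    best_cost = min(h.cost for h in hyps)
    return [h for h in hyps if h.cost <= best_cost + threshold]


buffer = [
    Hyp(1.0, ("the", "house"), "the small house"),
    Hyp(1.4, ("the", "house"), "the little house"),  # same LM state as above
    Hyp(2.0, ("a", "house"), "a small house"),
    Hyp(9.0, ("a", "home"), "a tiny home"),          # far worse than the best
]

after_recombination = recombine(buffer)
after_pruning = threshold_prune(after_recombination, threshold=5.0)
unpruned = threshold_prune(after_recombination, threshold=None)
```

With pruning switched off (threshold=None here), every recombined
hypothesis stays in the buffer.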

>> Does it ever sort the hypotheses generated for a sub-span in the top-
>> down +LM pass? If it does, will it affect the (k-1)th best that has
>> already been produced for a span?

Hypotheses are not re-sorted once they have been selected as "final".
This means that the n-best lists generated at each node are not
necessarily sorted by score. Re-sorting would also invalidate the
hypotheses in the nodes above the current one, which already refer to
the hypotheses generated so far.
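
A toy Python sketch of why re-sorting is problematic. It assumes (this is
an assumption about the general technique, not a description of Jane's
internals) that a parent hypothesis stores the index of the child entry it
was built from, and that costs are lower-is-better.

```python
# Append-only n-best list of a child node: (cost, text), in the order
# in which entries were finalized.
child_nbest = []

child_nbest.append((2.0, "small house"))  # finalized first, index 0
parent_backpointer = 0                     # parent was built from child entry 0
parent_text = "the " + child_nbest[parent_backpointer][1]

# A cheaper hypothesis is discovered later and appended at index 1.
child_nbest.append((1.5, "little house"))

# The list is now out of score order ...
is_sorted = child_nbest == sorted(child_nbest)

# ... but re-sorting it would silently change what index 0 refers to.
resorted = sorted(child_nbest)
text_after_resort = "the " + resorted[parent_backpointer][1]
```

After the re-sort, the parent's back-pointer selects a different child
hypothesis than the one its score was computed from.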

>> I'm assuming that in cube-growing, once a hypothesis has been enumerated
>> for a node (e.g., it is the first-best for that node according to the
>> heuristic bound), then even if a better hypothesis comes along afterwards
>> (after we enumerate more hypotheses), we will still use the first
>> generated first-best. Am I understanding this correctly?

Yes, this is correct. You can use the parameter
CubeGrow.minCGBufferSize to enforce a minimum size of the intermediate
buffer before hypotheses are popped from it, which counteracts this
effect, but in preliminary experiments it did not seem to bring any
improvement.
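
The effect of a minimum buffer size can be sketched in Python as follows.
This is a heavy simplification and not Jane's implementation: the function
lazy_kbest and its parameter min_buffer_size are hypothetical, candidates
are just costs (lower is better), and they are assumed to arrive in
heuristic order rather than true-cost order.

```python
import heapq


def lazy_kbest(candidates, k, min_buffer_size=1):
    """Emit up to k costs from a lazily filled buffer.

    A cost is popped only when the buffer holds at least `min_buffer_size`
    entries (or no more candidates exist). Popped entries are final.
    """
    stream = iter(candidates)
    buffer, final = [], []
    exhausted = False
    while len(final) < k:
        # Refill the buffer until it is large enough to pop from.
        while not exhausted and len(buffer) < min_buffer_size:
            try:
                heapq.heappush(buffer, next(stream))
            except StopIteration:
                exhausted = True
        if not buffer:
            break
        final.append(heapq.heappop(buffer))  # never revisited afterwards
    return final


arrivals = [2.0, 1.5, 3.0, 1.0]  # heuristic order, not sorted by true cost

eager = lazy_kbest(arrivals, k=3, min_buffer_size=1)
buffered = lazy_kbest(arrivals, k=3, min_buffer_size=2)
full = lazy_kbest(arrivals, k=3, min_buffer_size=4)
```

In this toy run, popping immediately keeps the first generated hypothesis
as first-best even though cheaper ones arrive later. A larger buffer gives
a better first-best, and only a buffer that holds all candidates returns
a fully sorted list.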

Best regards,

David

--
David Vilar Torres
DFKI GmbH, Alt-Moabit 91c, 10559 Berlin
Tel. (+49) 30 238 95 1845

