Different Constituent Trees

Bob Houston

June 16, 2011, 6:09:20 PM

To: link-g...@googlegroups.com, Bob Houston

We have a system that uses an old version of the parser (4.1b) and I am interested in migrating to the current version. However, I am seeing a difference in the constituent trees being generated by the old and new versions, which is having a negative impact on the system I wish to upgrade. The following example illustrates the difference.

Example sentence: The captain greeted the passengers and the flight attendants.
Version 4.7.4 constituent tree: (S (NP The captain) (VP greeted (NP the passengers and the flight attendants)) .)
Version 4.1b constituent tree: (S (NP The captain) (VP greeted (NP (NP the passengers) and (NP the flight attendants))) .)
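
For anyone who wants to check the difference mechanically, here is a minimal sketch that reads the two bracketed strings above and lists the NP constituents present in the 4.1b output but absent from the 4.7.4 output. The helper names (parse_tree, words, phrases) are made up for this illustration and are not part of the link-grammar API; the sketch only assumes the trees are plain parenthesized strings like the ones quoted.

```python
def parse_tree(text):
    """Parse a bracketed constituent string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip the opening "("
        label = tokens[pos]           # constituent label, e.g. S, NP, VP
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])   # a plain word or punctuation mark
                pos += 1
        pos += 1                      # skip the closing ")"
        return (label, children)

    return read_node()


def words(node):
    """Return the words covered by a node, in sentence order."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1] for w in words(child)]


def phrases(node, label):
    """Collect the text of every constituent with the given label (pre-order)."""
    if isinstance(node, str):
        return []
    found = [" ".join(words(node))] if node[0] == label else []
    for child in node[1]:
        found.extend(phrases(child, label))
    return found


new_tree = "(S (NP The captain) (VP greeted (NP the passengers and the flight attendants)) .)"
old_tree = "(S (NP The captain) (VP greeted (NP (NP the passengers) and (NP the flight attendants))) .)"

new_nps = phrases(parse_tree(new_tree), "NP")
old_nps = phrases(parse_tree(old_tree), "NP")

# NPs that 4.1b produced but 4.7.4 does not
missing = [np for np in old_nps if np not in new_nps]
print(missing)
```

Run on the example sentence, this reports the two conjunct noun phrases ("the passengers" and "the flight attendants") as the ones missing from the 4.7.4 tree.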

Our system requires the additional noun phrases that are included in the 4.1b constituent tree. Is there an option, or other config parameter, that will cause the new version to generate a constituent tree like version 4.1b?

TIA,
Bob

Stuti Ajmani

June 17, 2011, 12:25:32 AM

To: link-g...@googlegroups.com

Any idea on finding the probability of a constituent tree?

I am planning to use the CFG approach for this, i.e. probability of a parse = product of the probabilities of all its rules.

Any idea about its implementation?

--
You received this message because you are subscribed to the Google Groups "link-grammar" group.
To post to this group, send email to link-g...@googlegroups.com.
To unsubscribe from this group, send email to link-grammar...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/link-grammar?hl=en.

Linas Vepstas

June 17, 2011, 11:57:31 AM

To: link-g...@googlegroups.com, Bob Houston

Sigh. This is a bug, introduced at the time that "fat links" were deprecated.
In other words, I completely forgot about constituent trees when removing
the fat links, and clearly, this will require more fixing.

The last version in the 4.6.x series should work fine for you; the move away
from fat links didn't happen until 4.7.0 (let me know if it's still
broken in 4.6.x).

While I'd like to promise a quick fix, I'm afraid I can't; most likely it will
be several months before I can get a chance to do something about this.
Sorry.

--linas.

p.s. what is your application? It's nice to know who cares about this stuff ...

Linas Vepstas

June 17, 2011, 12:05:02 PM

To: link-g...@googlegroups.com

On 16 June 2011 23:25, Stuti Ajmani <stuti...@iiitd.ac.in> wrote:
> Any idea on finding the probability of a constituent tree?
> I am planning to use the CFG approach for the same, i.e. probability of a
> parse = product of probabilities of all its rules.

For link-grammar, I think you'd want to say: "the probability of a
parse = product of the probabilities of the links". The ranking database
just stores log-2 of the probabilities, and log of product = sum of logs.
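
To make the log arithmetic concrete, here is a small sketch. The link probabilities are made-up numbers, not values from the ranking database, and parse_log2_score is a hypothetical helper, not a link-grammar function.

```python
import math


def parse_log2_score(link_probs):
    """Sum of log2(p) over the links of one parse.

    This equals log2 of the product of the link probabilities,
    so no explicit multiplication is needed.
    """
    return sum(math.log2(p) for p in link_probs)


# Hypothetical probabilities for the links of a single parse
link_probs = [0.5, 0.25, 0.125]

score = parse_log2_score(link_probs)   # -1 + -2 + -3
product = math.prod(link_probs)        # 0.015625, i.e. 2 ** -6

print(score, product)
```

Working with sums of logs also avoids floating-point underflow when a sentence has many links, since the raw product shrinks toward zero quickly.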

BTW, are you using the database on a big-endian machine (i.e. a
non-Intel box, e.g. PowerPC or SPARC)? That might explain why it
won't work for you ... I've run out of ideas as to the source of
the problem.
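
A quick way to answer the endianness question, sketched in Python using only the standard library. Whether the database file really depends on byte order is just the guess made above; the second half of the sketch only shows how a 4-byte integer written on a little-endian machine gets misread when interpreted as big-endian.

```python
import struct
import sys

# "little" on Intel/AMD boxes, "big" on e.g. classic PowerPC or SPARC
byte_order = sys.byteorder
print("This machine is", byte_order + "-endian")

# The integer 1 written as 4 little-endian bytes ...
raw = struct.pack("<I", 1)

# ... read back correctly, and read back with the wrong byte order
read_ok = struct.unpack("<I", raw)[0]
read_swapped = struct.unpack(">I", raw)[0]

print(read_ok, read_swapped)
```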

> Any idea about its implementation?

The constituent tree is obtained by applying some fairly simple rules
after the link-parse is done. The source code is in constituents.c.

--linas
