Different Constituent Trees

Bob Houston

June 16, 2011, 6:09:20 PM
To: link-g...@googlegroups.com, Bob Houston
We have a system that uses an old version of the parser (4.1b) and I am interested in migrating to the current version. However, I am seeing a difference in the constituent trees being generated by the old and new versions, which is having a negative impact on the system I wish to upgrade. The following example illustrates the difference.

Example sentence: The captain greeted the passengers and the flight attendants.
Version 4.7.4 constituent tree: (S (NP The captain) (VP greeted (NP the passengers and the flight attendants)) .)
Version 4.1b constituent tree: (S (NP The captain) (VP greeted (NP (NP the passengers) and (NP the flight attendants))) .)

Our system requires the additional noun phrases that are included in the 4.1b constituent tree. Is there an option, or other config parameter, that will cause the new version to generate a constituent tree like version 4.1b?
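
To make the difference concrete, here is a minimal Python sketch that parses the two bracketed strings above and lists the noun phrases each one contains. The helper names (parse_tree, words, noun_phrases) are hypothetical and not part of link-grammar; the sketch assumes the tree is available as a plain bracketed string like the ones shown.

```python
def parse_tree(text):
    """Parse a bracketed constituent string into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # a plain word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def words(tree):
    """Flatten a node back into its list of words."""
    out = []
    for child in tree[1]:
        out.extend(words(child) if isinstance(child, tuple) else [child])
    return out


def noun_phrases(tree):
    """Return the text of every NP node, outermost first."""
    found = []
    if tree[0] == "NP":
        found.append(" ".join(words(tree)))
    for child in tree[1]:
        if isinstance(child, tuple):
            found.extend(noun_phrases(child))
    return found


OLD = "(S (NP The captain) (VP greeted (NP (NP the passengers) and (NP the flight attendants))) .)"
NEW = "(S (NP The captain) (VP greeted (NP the passengers and the flight attendants)) .)"

old_nps = noun_phrases(parse_tree(OLD))
new_nps = noun_phrases(parse_tree(NEW))

# Noun phrases present in the 4.1b tree but absent from the 4.7.4 tree
missing = [np for np in old_nps if np not in new_nps]
print(missing)
```

Run on the two trees above, the difference is exactly the two conjoined noun phrases, "the passengers" and "the flight attendants".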

TIA,
Bob

Stuti Ajmani

June 17, 2011, 12:25:32 AM
To: link-g...@googlegroups.com
Any idea on finding the probability of a constituent tree?
I am planning to use the CFG approach for this, i.e., the probability of a parse = the product of the probabilities of all its rules.
Any idea about its implementation?
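
As a rough illustration of the CFG approach described here, the following Python sketch multiplies the probabilities of the rules used in a tree. The rule probabilities and the tiny tree are invented for the example, the function names are hypothetical, and lexical rules (such as Det -> the) are left out to keep it short.

```python
import math

# Hypothetical rule probabilities, invented for illustration only.
RULE_PROB = {
    ("S", ("NP", "VP")): 0.9,
    ("NP", ("Det", "N")): 0.5,
    ("VP", ("V", "NP")): 0.4,
}

# A tree is (label, children); children are subtrees or plain words.
TREE = ("S", [
    ("NP", [("Det", ["the"]), ("N", ["captain"])]),
    ("VP", [
        ("V", ["greeted"]),
        ("NP", [("Det", ["the"]), ("N", ["passengers"])]),
    ]),
])


def rules(tree):
    """Yield (lhs, rhs) for every node that has subtree children."""
    label, children = tree
    kids = [c for c in children if isinstance(c, tuple)]
    if kids:
        yield (label, tuple(k[0] for k in kids))
        for k in kids:
            yield from rules(k)


def parse_probability(tree):
    """Probability of a parse = product of the probabilities of its rules."""
    return math.prod(RULE_PROB[r] for r in rules(tree))


print(parse_probability(TREE))
```

Here the rules used are S -> NP VP, NP -> Det N (twice), and VP -> V NP, so the product is 0.9 * 0.5 * 0.4 * 0.5.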


--
You received this message because you are subscribed to the Google Groups "link-grammar" group.
To post to this group, send email to link-g...@googlegroups.com.
To unsubscribe from this group, send email to link-grammar...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/link-grammar?hl=en.


Linas Vepstas

June 17, 2011, 11:57:31 AM
To: link-g...@googlegroups.com, Bob Houston

Sigh. This is a bug, introduced at the time that "fat links" were deprecated.
In other words, I completely forgot about constituent trees when removing
the fat links, and clearly, this will require more fixing.

The last version in the 4.6.x series should work fine for you; the move to
avoid fat links didn't happen until 4.7.0 (let me know if it's still
broken in 4.6.x).

While I'd like to promise a quick fix, I'm afraid I can't; most likely it will
be several months before I can get a chance to do something about this.
Sorry.

--linas.

p.s. what is your application? It's nice to know who cares about this stuff ...

Linas Vepstas

June 17, 2011, 12:05:02 PM
To: link-g...@googlegroups.com
On 16 June 2011 23:25, Stuti Ajmani <stuti...@iiitd.ac.in> wrote:
> Any idea on finding the probability of a constituent tree?
> I am planning to use the CFG approach for the same, i.e., the probability of a
> parse = product of probabilities of all its rules.

For link-grammar, I think you'd want to say: "the probability of a
parse = product of the probabilities of the links". The ranking database
just stores log-2 of the probabilities, and log of product = sum of logs.
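
The arithmetic can be sketched in a few lines of Python. The link labels and probabilities below are made up for illustration, and parse_log2_score is a hypothetical helper, not a link-grammar function; the point is only that summing log-2 values is equivalent to multiplying the probabilities.

```python
import math

# Hypothetical (link label, probability) pairs for one parse; values are made up.
links = [("Ds", 0.5), ("Ss", 0.25), ("Os", 0.125), ("Dmc", 0.5)]


def parse_log2_score(link_probs):
    """Sum of log-2 link probabilities, i.e. log-2 of their product."""
    return sum(math.log2(p) for _, p in link_probs)


score = parse_log2_score(links)
product = math.prod(p for _, p in links)

# Converting the summed score back gives the same value as the direct product.
assert math.isclose(2.0 ** score, product)
print(score, product)
```

Working with the sum of logs also avoids underflow when a parse has many links, and a higher (less negative) score means a more probable parse.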

BTW, are you using the database on a big-endian machine (i.e.
a non-Intel box, e.g. PowerPC or SPARC)? That might explain why
it won't work for you ... I've run out of ideas as to the source of
the problem.
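
A quick way to check the byte order of a machine, and to see why it could matter, is sketched below in Python. This is a generic illustration only: the on-disk format of the ranking database is not described in this thread, so the 32-bit float used here is an assumption made purely for the example.

```python
import struct
import sys

# "little" on Intel/x86 machines, "big" on big-endian machines
print(sys.byteorder)

# Pack a value as a 32-bit float in little-endian byte order
# (assumed format, for illustration only).
raw = struct.pack("<f", -7.0)

as_little = struct.unpack("<f", raw)[0]   # read with the matching byte order
as_big = struct.unpack(">f", raw)[0]      # read with the wrong byte order

print(as_little, as_big)
```

Reading the same four bytes with the wrong byte order yields a completely different number, which is the kind of silent corruption a byte-order mismatch would produce.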

> Any idea about its implementation?

The constituent tree is obtained by applying some fairly simple rules
after the link-parse is done. The source code is in constituents.c

--linas
