can not find translation


xiaodong

Jun 9, 2011, 10:37:34 PM
to jane Users
Hi,
when I run the decoder, I find that sometimes no translation hypothesis
can be found, e.g. for the second sentence:
[9/6/2011 8:54:04 Jane] Read sentence 1: 这 两 个 赛跑者 几乎 是 同时 到达 终点线 。
[9/6/2011 8:54:04 Jane.CubeGrow] Best translation found with costs
[s2t 24.4871 t2s 29.1941 ibm1s2t 19.082 ibm1t2s 23.5744 phrasePenalty
13 wordPenalty 10 s2tRatio 8 t2sRatio 8 isHierarchical 5 isPaste 5
glueRule 6 LM 27.2763 Total -5.65659]
[9/6/2011 8:54:04 Jane] Translation of segment 1: it the a UNKNOWN_赛跑者
it s is plant UNKNOWN_终点线 the
[9/6/2011 8:54:04 Jane] Read sentence 2: 那 两 个 老人 把 这个 年轻人 当做 他们 的 亲生 儿子 。
[warning 9/6/2011 8:54:04 Jane.CubeGrow] No parse found
[9/6/2011 8:54:04 Jane] Read sentence 3: 双方 宣布 停战 以 避免 再 有 伤亡 。
[9/6/2011 8:54:04 Jane.CubeGrow] Best translation found with costs
[s2t 14.6322 t2s 17.6176 ibm1s2t 11.0718 ibm1t2s 13.9163 phrasePenalty
11 wordPenalty 9 s2tRatio 5 t2sRatio 5 isHierarchical 3 isPaste 3
glueRule 9 LM 17.8082 Total -3.65283

How can I avoid this problem, so that I still get a translation even if
the whole sentence cannot be parsed?

David Vilar

Jun 10, 2011, 4:31:54 AM
to jane-...@googlegroups.com
Hi!

This is due to the phrase table not having coverage for the whole
corpus. This can happen for example if you have some rule

X # a b c # v w x

and you don't have a rule for, say, b alone. If at translation time you
then encounter b in another context, e.g.

... d e b d ...

Jane is not able to parse the source sentence. Note that this differs from
other decoders, which would probably handle b as an unknown word in this
case. Jane does not consider b to be unknown, because it has been
encountered before.
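
To make the coverage gap concrete, here is a minimal Python sketch. It is
not how Jane works internally: it only looks at fully lexical source sides
(no nonterminals, no glue rules), and the function name
uncovered_known_words and the toy rule set are made up for illustration.
It reports words that appear somewhere in the phrase table, so they are not
"unknown", but that no phrase can cover in the given sentence.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Phrase = Tuple[str, ...]


def uncovered_known_words(
    sentence: Sequence[str], source_phrases: Iterable[Phrase]
) -> List[Tuple[int, str]]:
    """Return (position, word) pairs for words that occur in the phrase
    table vocabulary but are not covered by any matching source phrase.

    Hypothetical helper: a simplification that ignores hierarchical rules.
    """
    phrases: Set[Phrase] = set(source_phrases)
    vocab = {word for phrase in phrases for word in phrase}
    max_len = max((len(phrase) for phrase in phrases), default=0)

    covered = [False] * len(sentence)
    for start in range(len(sentence)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(sentence):
                break
            # Mark every position of a span that matches a source phrase.
            if tuple(sentence[start:end]) in phrases:
                for i in range(start, end):
                    covered[i] = True

    return [
        (i, word)
        for i, word in enumerate(sentence)
        if word in vocab and not covered[i]
    ]


# Toy phrase table: "b" only occurs inside the longer phrase "a b c".
toy_phrases = [("a", "b", "c"), ("d",), ("e",)]
print(uncovered_known_words("d e b d".split(), toy_phrases))
# prints [(2, 'b')]
```

In the toy example, b is in the vocabulary because of the rule for a b c,
but nothing can cover it in "d e b d", which is the situation that leads to
"No parse found".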

In Jane we decided to deal with this in the phrase table, controlled
mainly by standard.nonAlignHeuristic and standard.swHeuristic. Please
check whether you have set them in your extraction options. A description
of what they do can be found in Section 2.1 of

Daniel Stein, Stephan Peitz, David Vilar and Hermann Ney. "A Cocktail of
Deep Syntactic Features for Hierarchical Machine Translation". In
Conference of the Association for Machine Translation in the Americas
2010. Denver, Colorado, USA, October 2010.

This is also one reason why a phrase table extracted with Jane can be
bigger than one extracted with another tool.

Hope this helps,

David


--
David Vilar Torres
DFKI GmbH, Alt-Moabit 91c, 10559 Berlin
Tel. (+49) 30 238 95 1845

