Using placeholders instead of HTML tags

Aniket Bezalwar

May 15, 2018, 1:13:23 PM
to Google Cloud Translation API
Hi,

I am trying to translate HTML text using the Translation API. The HTML tags carry many attributes that do not need to be translated, so I planned to replace the tags with placeholders, maintain a placeholder-to-tag mapping on my end, and swap the placeholders back for the actual tags after receiving the translation output. This works in most cases, but in some cases the translated text is not correct.

For instance,

Actual HTML text - 

<nobr o="7aM"><span o="7af" class="ft44">portfelami wierzytelności o łącznej wartości nominalnej <span id="~7bX_" l="4" v="17.1" bbox="1;54;328;541;318">17,1</span> mld zł, na dzień <span id="~7bt_" l="2" v="30" bbox="1;54;328;541;318">30</span> czerwca <span id="~7c2_" l="4" v="2017" bbox="1;54;328;541;318">2017</span> roku wartość nominalna</span></nobr>

Text with placeholders  - 

<1><2>portfelami wierzytelności o łącznej wartości nominalnej <3>17,1<3> mld zł, na dzień <4>30<4> czerwca <5>2017<5> roku wartość nominalna<2><1>
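
The substitution step shown above could be sketched roughly as follows in Python. This is only an illustration: the function name `to_placeholders` and the tag regex are assumptions made for this example, not part of the Translation API, and the sketch assumes well-nested tags with no void elements (such as `<br>`) and no `>` inside attribute values.

```python
import re

# Matches an opening or closing tag; group 1 is "/" for closing tags.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*>")


def to_placeholders(html):
    """Replace each tag pair with a numbered placeholder like <1> ... <1>.

    Returns the placeholder text and a mapping
    {number: (opening_tag, closing_tag)} kept on the caller's side.
    """
    mapping = {}
    stack = []    # numbers of tags that are currently open
    counter = 0

    def repl(match):
        nonlocal counter
        if not match.group(1):          # opening tag: assign the next number
            counter += 1
            mapping[counter] = (match.group(0), None)
            stack.append(counter)
            return f"<{counter}>"
        if not stack:                   # stray closing tag: leave untouched
            return match.group(0)
        number = stack.pop()            # closing tag reuses its opener's number
        mapping[number] = (mapping[number][0], match.group(0))
        return f"<{number}>"

    return TAG_RE.sub(repl, html), mapping


html = ('<nobr o="7aM"><span o="7af" class="ft44">portfelami '
        '<span id="~7bX_" l="4">17,1</span> mld zł</span></nobr>')
text, mapping = to_placeholders(html)
print(text)  # <1><2>portfelami <3>17,1<3> mld zł<2><1>
```

Because the opening and closing placeholder share one number, the mapping stores both tags per number, and the first occurrence of a number in the translated text is treated as the opening tag.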

Translated text (source language: Polish, target: English) - 
<1> <2> receivables portfolios with a total nominal value of <3> 17.1 <3> billion PLN, as at <<4> 30 <4> June <5> 2017 <5> nominal value <2> <1>

Look at the "<<4>" before "30" (highlighted in yellow in the original post): the output contains an additional angle bracket that was not in the input. The approach works well in many cases; for instance, if I remove the <1>..<1> placeholders at both ends of the string, I get the correct string back. In problematic cases like this one, I have trouble reconstructing the actual tags from the placeholders.
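
One possible client-side workaround, sketched in Python, is to normalize the translated text before restoring the tags. This is not a fix for the underlying behavior and not an API feature: the name `restore_tags` is hypothetical, the mapping format `{number: (opening_tag, closing_tag)}` is an assumption, and the sketch assumes the source text never contains a literal "<" directly before a placeholder.

```python
import re


def restore_tags(translated, mapping):
    """Put the original tags back in place of numbered placeholders.

    mapping is {number: (opening_tag, closing_tag)}. The first time a
    number appears it is treated as the opening tag, the next time as
    the closing tag.
    """
    # Collapse a run of "<" in front of a placeholder: "<<4>" -> "<4>".
    cleaned = re.sub(r"<+(?=\d+>)", "<", translated)
    seen = set()

    def repl(match):
        number = int(match.group(1))
        if number not in mapping:       # unknown placeholder: leave as-is
            return match.group(0)
        opening, closing = mapping[number]
        if number in seen:
            return closing
        seen.add(number)
        return opening

    return re.sub(r"<(\d+)>", repl, cleaned)


mapping = {1: ("<nobr>", "</nobr>"), 4: ('<span id="b">', "</span>")}
print(restore_tags("<1> as at <<4> 30 <4> June <1>", mapping))
# <nobr> as at <span id="b"> 30 </span> June </nobr>
```

The extra spaces that the translation adds around placeholders are left untouched here; whether to trim them depends on how the resulting HTML is rendered.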

Could you please advise? Is there a different placeholder I should use?

Note: I have tried many placeholders on HTML text in different languages, and I found that <1>..<1> works best (though not perfectly). I came up with the placeholder idea after finding that, when I sent the actual HTML tags, the translation output did not keep the tags intact in many cases, especially when the tags were nested.

Thanks!

Kenworth (Google Cloud Platform)

May 15, 2018, 5:44:44 PM
to Google Cloud Translation API

Hi bizz, a thread like this is off-topic for Google Groups. Since it may be a defect on the platform, I strongly encourage you to submit a defect report as described in this article so that it gets proper attention and weight. We monitor that issue tracker closely.


We look forward to this issue report.

