Added and dropped characters around notranslate segments

Lonnie Warpup

Apr 21, 2016, 11:03:33 AM
to Google Translate API Developer Forum
When I pass phrases containing notranslate segments to the Translate API, the results contain extra spacing and are missing the trailing parenthesis. Take the following example.

Hello (<span class="notranslate">%{my_var}</span>), welcome!

API call:

https://www.googleapis.com/language/translate/v2?key=MYKEY&q=Hello%20(%3Cspan%20class=%22notranslate%22%3E%25%7Bmy_var%7D%3C/span%3E),%20welcome!&source=en&target=ja 
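For anyone reproducing this, here is a minimal sketch of building the same request URL in Python so that every character of the phrase is encoded consistently. The helper name `build_translate_url` is my own and not part of any Google client library; the endpoint and the `key`, `q`, `source` and `target` parameters are the ones from the call above. It only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Endpoint taken from the API call quoted in this post.
BASE_URL = "https://www.googleapis.com/language/translate/v2"


def build_translate_url(text, source, target, key="MYKEY"):
    """Return a GET URL for the v2 endpoint with all parameters URL-encoded.

    Hypothetical helper for illustration; "MYKEY" is a placeholder API key.
    """
    params = {"key": key, "q": text, "source": source, "target": target}
    # urlencode escapes <, >, ", %, { and } and writes spaces as "+".
    return BASE_URL + "?" + urlencode(params)


phrase = 'Hello (<span class="notranslate">%{my_var}</span>), welcome!'
url = build_translate_url(phrase, "en", "ja")
print(url)
```

Encoding the whole `q` value this way rules out the request encoding as the cause of the dropped characters.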

Response:

{ "data": { "translations": [ { "translatedText": "こんにちは( \u003cspan class=\"notranslate\"\u003e%{my_var}\u003c/span\u003e大歓迎!" } ] } }

Resulting phrase:

こんにちは( <span class="notranslate">%{my_var}</span>大歓迎!

Notice the missing trailing parenthesis after the notranslate span. Also notice the added space (\u0020) after the first parenthesis, which I do not think is part of the actual translation, because I believe the space in the Japanese character set is the \u3000 character instead.
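The two observations above can be checked mechanically. This is a rough sketch that decodes the response quoted earlier and reports which whitespace code points it contains and whether the parentheses balance. The `describe_whitespace` helper is a name I made up for illustration; the JSON is copied from the response above.

```python
import json
import unicodedata

# Response body copied from above; the raw string keeps the JSON escapes intact.
raw = r'''{ "data": { "translations": [ { "translatedText": "こんにちは( \u003cspan class=\"notranslate\"\u003e%{my_var}\u003c/span\u003e大歓迎!" } ] } }'''

text = json.loads(raw)["data"]["translations"][0]["translatedText"]


def describe_whitespace(s):
    """Return (index, code point, Unicode name) for each whitespace character."""
    return [
        (i, f"U+{ord(c):04X}", unicodedata.name(c))
        for i, c in enumerate(s)
        if c.isspace()
    ]


for entry in describe_whitespace(text):
    print(entry)

# A positive number means more "(" than ")" in the translated text.
print("unclosed parentheses:", text.count("(") - text.count(")"))
```

Note that the report also includes the ordinary space inside the `<span class=...>` tag itself, which is expected; the one of interest is the space directly after the opening parenthesis.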

When I translate without notranslate markup using https://translate.google.com I get much better results:

Hello (%{my_var}), welcome!

I get:

こんにちは(%{MY_VAR})、大歓迎!

Am I doing something incorrect or can someone explain the reasoning for the differences?

Lonnie Warpup

Apr 21, 2016, 12:01:54 PM
to Google Translate API Developer Forum
I just noticed that the API is also removing the comma that follows the trailing parenthesis. My production phrases didn't have a comma after the parenthesis, so I hadn't seen the omission until I created this simple phrase for testing.

Zeehad (Cloud Platform Support)

Apr 22, 2016, 4:55:52 PM
to Google Translate API Developer Forum
The following format seems to work better for your example:

Hello<span class="notranslate"> (%{my_var}),</span>welcome!

The translated text:

"translatedText": "こんにちは<span class=\"notranslate\">(%{my_var}),</span>大歓迎!"

If you see spaces stripped or inserted after using this HTML tag, the behavior should at least be consistent within a given language. You can post-process the translated text to make it fit what your application requires.
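As one possible shape for that post-processing, here is a minimal sketch in Python. It assumes the application wants the notranslate wrapper removed and any ASCII space between an opening parenthesis and the span dropped; both choices and the `clean_translation` name are assumptions for illustration, not documented API behavior. It cannot restore characters that the API has already dropped.

```python
import re

# Matches the wrapper used in this thread and captures its contents.
SPAN_RE = re.compile(r'<span class="notranslate">(.*?)</span>', re.DOTALL)

# ASCII spaces sitting between "(" or full-width "(" and the opening tag.
SPACE_BEFORE_SPAN_RE = re.compile(r'(?<=[((])[ ]+(?=<span class="notranslate">)')


def clean_translation(text):
    """Remove stray spaces before a notranslate span, then unwrap the span."""
    text = SPACE_BEFORE_SPAN_RE.sub("", text)
    return SPAN_RE.sub(r"\1", text)


print(clean_translation('こんにちは<span class="notranslate">(%{my_var}),</span>大歓迎!'))
```

The space removal is deliberately limited to the position right after a parenthesis, so target languages that separate words with spaces are left alone.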

Also, if the platform does not behave the way the documentation leads you to expect, please post to the Public Issue Tracker with steps that reliably reproduce the behavior and as much detail as possible.

I hope that helps. Cheers!

Lonnie Warpup

Jun 8, 2016, 5:57:20 PM
to Google Translate API Developer Forum
Hello again.

I followed your instructions and posted to the Public Issue Tracker (Issue 55) back in April. It appeared that the bug was acknowledged, but everything has since stalled. There have been no replies to my inquiries there, nor any updates on what I would think is a pretty severe bug (a loss of characters as well as incorrect characters being added). Is there somewhere else I need to report this, or can you help get it escalated?

Thank you.