zero-width space Unicode characters in translated response

George Byrkit

Apr 7, 2017, 11:27:29 AM
to Google Cloud Translation API
When I translate certain English text to Italian using the Translate API, the translated result contains two consecutive zero-width spaces (Unicode U+200B) before the Italian word 'rilascio'. They are preceded by a normal space character. This occurs on both the old and the current V2 Translate API URLs (with or without NLM support).

Is there any particular reason that these characters are inserted? What other non-displayable characters might be present that would turn into a '?' when converted to Windows-1252?

I take it that it is up to me to strip out such characters for my purpose, or is this a flaw with the data stream of the translated text?
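
For readers who need the client-side workaround described above, here is a minimal Python sketch, not an official recommendation from the Translation API. The helper names `strip_zero_width_spaces` and `find_unencodable` are hypothetical. It removes only U+200B and then lists any remaining characters that Windows-1252 (`cp1252` in Python) cannot represent, which are the ones that would otherwise become '?'.

```python
ZERO_WIDTH_SPACE = "\u200b"


def strip_zero_width_spaces(text: str) -> str:
    """Remove every U+200B (zero-width space) from the text."""
    return text.replace(ZERO_WIDTH_SPACE, "")


def find_unencodable(text: str, encoding: str = "cp1252") -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each character the encoding cannot represent."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, f"U+{ord(char):04X}"))
    return problems


# Fragment of a translated string with two zero-width spaces before 'rilascio'.
sample = "precedenti il \u200b\u200brilascio"
before = find_unencodable(sample)          # characters that would become '?'
cleaned = strip_zero_width_spaces(sample)
after = find_unencodable(cleaned)          # empty if nothing else is a problem
```

Only U+200B is stripped here on purpose: other invisible characters, such as the zero-width joiner and non-joiner, carry meaning in some scripts, so it seems safer to report them than to delete them blindly.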

George Byrkit

Apr 7, 2017, 2:51:08 PM
to Google Cloud Translation API
I meant NMT, not NLM.  Sorry.

George (Cloud Platform Support)

Apr 7, 2017, 3:00:21 PM
to Google Cloud Translation API
Hello George, 

Using the API explorer, it is not possible to reproduce the issue here. All translations from English into Italian resulting in a text containing “rilascio” do not show any extra characters. 

It might prove helpful if you provide us with the exact English text you translated, or similar edited text, if confidential. 

Did you use the API explorer as well? How does your app access the translation API service? 

Any related information that you deem relevant may help us reproduce the issue. 

George Byrkit

Apr 7, 2017, 3:25:33 PM
to Google Cloud Translation API
Thanks for your reply, George, and for suggesting the Translate API explorer.

So I used the API explorer and submitted the following (inside the quotes) to be translated to 'it' (Italian):
"Customers purchasing new licenses in the 90 days preceding the release of a new version will receive the updated version at no charge."
This generates the embedded character \u200b, twice in a row.

Request:

Response:
{
  "data": {
    "translations": [
      {
        "translatedText": "I clienti che acquistano nuove licenze nei 90 giorni precedenti il \u200b\u200brilascio di una nuova versione riceveranno la versione aggiornata senza alcun costo."
      }
    ]
  }
}
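
To confirm what the `\u200b` escapes in the response above turn into once the JSON is parsed, a short Python sketch can load the body and count the characters. The response text is copied from this post; the variable names are illustrative only.

```python
import json

# Raw string so the \u200b escapes reach the JSON parser unchanged.
raw_response = r'''
{
  "data": {
    "translations": [
      {
        "translatedText": "I clienti che acquistano nuove licenze nei 90 giorni precedenti il \u200b\u200brilascio di una nuova versione riceveranno la versione aggiornata senza alcun costo."
      }
    ]
  }
}
'''

translated = json.loads(raw_response)["data"]["translations"][0]["translatedText"]

# After parsing, each \u200b escape is a single zero-width space character.
zwsp_count = translated.count("\u200b")
first = translated.index("\u200b")

print(zwsp_count)
print(repr(translated[first - 1:first + 10]))
```

Printing with `repr` makes the otherwise invisible characters visible as escapes, which is handy when comparing API output against what a terminal or editor displays.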

George (Cloud Platform Support)

Apr 10, 2017, 1:07:13 PM
to Google Cloud Translation API
This time the error has been reproduced. You are encouraged to create an issue tracker entry at http://issuetracker.google.com/, as this issue deserves the direct attention of the Translate API developers. Posting there may speed up work toward a resolution. 