Does Google Charge for HTML Tags in Input for Translation?


Aniket Bezalwar

Apr 26, 2018, 3:49:24 PM
to Google Cloud Translation API
Hi,

I am currently writing an HTML parser that generates a translation input of length 'x' and sends it to the Translation API. I was wondering whether I can keep inline elements like <span> in the input string.
For example, can I send an input string such as "Company X earned a profit of <span> $100000 </span> in year <span>2017</span>"?
I want to keep these span elements because their attributes contain offsets for source linking (which should not be changed), and we want them to surround the same words in the output.
In this case, does Google do some preprocessing to ignore the tags, consider only the text "Company X earned a profit of $100000 in year 2017", and then put the tags back in the output at the correct positions based on context? If not, how does Google Translation handle tags?

Also, would Google charge for these inline tags in the input?

Thanks for your help!

Aniket Bezalwar

Apr 26, 2018, 4:52:42 PM
to Google Cloud Translation API
Or, as another approach, I can strip the tags before sending the string for translation. But is there any mapping metadata I can get along with the translated output that indicates the word mapping, so that I can add those tags back to the output?

Jordan (Cloud Platform Support)

Apr 27, 2018, 3:12:33 PM
to Google Cloud Translation API
You can find details about pricing in the documentation. You are indeed charged per character sent to the API for processing, including whitespace characters. Empty queries are charged for one character. 
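To get a rough sense of what the inline tags add to the character count, a minimal Python sketch follows. It assumes that every character in the string sent is counted, markup included, and that `len()` approximates how the API counts characters; both are assumptions to verify against the pricing documentation. The helper names `billable_characters` and `strip_tags` are made up for illustration.

```python
import re


def billable_characters(query: str) -> int:
    """Estimate the characters charged for one query.

    Assumption: every character sent is counted, including tags and
    whitespace. An empty query is counted as one character.
    """
    return max(len(query), 1)


def strip_tags(html: str) -> str:
    """Remove anything that looks like a tag (naive regex, not a real HTML parser)."""
    return re.sub(r"<[^>]+>", "", html)


with_tags = (
    "Company X earned a profit of <span> $100000 </span> "
    "in year <span>2017</span>"
)
without_tags = strip_tags(with_tags)

# Compare the estimated character counts with and without the inline tags.
print(billable_characters(with_tags), billable_characters(without_tags))
```

Attributes on the span elements (such as source-linking offsets) would add to the first number as well.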

Concerning HTML, the Translation API model attempts to understand HTML so as to not alter it and only translate the content text contained within. Of course this is best-effort and there is no guarantee that it will always return the same HTML format. I would recommend using the free 'Try the API' link in the documentation to test out some HTML translations to see their results. You should also report any translation errors you find to the engineering team via their Public Issue Tracker. 
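As an illustration of sending markup rather than plain text, here is a sketch of a request body that marks the input as HTML. It assumes the v2 REST interface with its `q`, `target`, and `format` fields; the endpoint constant and the helper name `build_request_body` are only for illustration, and nothing is actually sent. Check the current API reference before relying on any of these names.

```python
import json

# Assumed v2 REST endpoint; confirm against the current documentation.
ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_request_body(html_fragment: str, target: str) -> str:
    """Build a JSON body that asks for the input to be treated as HTML."""
    body = {
        "q": html_fragment,   # text to translate, inline tags included
        "target": target,     # target language code, e.g. "de"
        "format": "html",     # treat the input as HTML rather than plain text
    }
    return json.dumps(body, ensure_ascii=False)


print(build_request_body(
    "Company X earned a profit of <span> $100000 </span> in year <span>2017</span>",
    "de",
))
```

Since tag handling is best-effort, it is worth comparing the tags in the response against those in the request before trusting the result.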

As for the mapping of text and translation, as you can see via the same 'Try the API' link the response returned from the Translation API only contains the translation. The actual text that was requested to be translated can be seen in the actual request sent. Therefore it is up to the client that is using the Translation API to remember the text that was requested to be translated, in order to perform its own manual mapping. 

Aniket Bezalwar

Apr 30, 2018, 11:59:26 AM
to Google Cloud Translation API
Thanks, Jordan, for your quick reply.
By mapping metadata, I meant: is there any information returned with the output that indicates which word in the input was moved to which part of the output?
I saw a reference to such a mapping in a response on this Stack Overflow thread.
For instance, if I strip out the span tag around the word 'X' in the input, and 'X' is converted to 'Y' in the output, is there a way to know that 'Y' is the word I need to enclose in the same span tag?

Thanks!

Jordan (Cloud Platform Support)

Apr 30, 2018, 3:44:10 PM
to Google Cloud Translation API
As that Stack Overflow post was made in 2012, it does not reflect the current behavior. You can find all of the information returned by the Translation API in the documentation. Currently the translated text, the model used for translation, and the detected source language are returned.

Therefore if you do choose to remove the HTML before translating, you should keep an array of each string to be translated, and translate each one individually. This way you are able to map each translation to each specific string, and position the translated text into your HTML accordingly. 
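The approach described above could be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: `translate_preserving_tags` is a hypothetical helper, the `translate` argument stands in for whatever call you make to the Translation API, and the regex split is a naive substitute for a real HTML parser.

```python
import re
from typing import Callable, List

# Capturing group so that re.split keeps the tags in the result list.
TAG_RE = re.compile(r"(<[^>]+>)")


def translate_preserving_tags(html: str, translate: Callable[[str], str]) -> str:
    """Translate each text segment on its own and leave tags untouched."""
    out: List[str] = []
    for part in TAG_RE.split(html):
        if TAG_RE.fullmatch(part) or not part.strip():
            # Tags and whitespace-only pieces are copied through as-is.
            out.append(part)
            continue
        # Keep the surrounding whitespace so spacing around tags survives.
        lead = part[: len(part) - len(part.lstrip())]
        trail = part[len(part.rstrip()):]
        out.append(lead + translate(part.strip()) + trail)
    return "".join(out)


# Stand-in "translator" for demonstration: upper-cases the text.
sample = 'Profit of <span data-offset="12"> $100000 </span> in <span>2017</span>'
print(translate_preserving_tags(sample, str.upper))
```

One trade-off to keep in mind: each segment is translated without the rest of the sentence, so word order and agreement across tag boundaries may suffer compared with translating the full sentence at once.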

- Note that Google Groups is reserved for general product discussions. If you require technical support for implementing the above it is recommended to post your detailed questions to Stack Exchange using the supported Cloud tags.  

Aniket Bezalwar

Apr 30, 2018, 4:37:11 PM
to Google Cloud Translation API
Thanks, Jordan, for the detailed response. At this point, I have all the information I need to work on my parser. If I have any technical questions, I will post them on Stack Exchange.