Google Cloud Translation API outputs HTML with syntax errors


Pedro Cristovão

Jun 9, 2022, 10:35:42 AM
to Google Cloud Translation API

We're using Google's Cloud Translation API on Node to translate HTML blog posts stored in a database into around 20 different languages. The translated HTML contains a variety of seemingly random HTML syntax errors.

Any idea why this is so and what (if anything) can be done to solve this issue?

As a reference, below are the links to a blog post in English (master source) and the same post translated to Dutch:

HTML master in English

HTML translated to Dutch

Thank you in advance.

Jose Gutierrez Paliza

Jun 10, 2022, 11:01:39 AM
to Google Cloud Translation API

I see that the translated HTML has a different structure from your source, which is why you are seeing something like < /span>. The tag comes back as plain text, with the angle brackets escaped as HTML character entities, so the output contains something like:

       -  &lt; /span&gt;


Instead of: 

      -   </span>.
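As a rough way to check translated output for this symptom, the sketch below scans a string for tags whose angle brackets came back as entities. The function name `findEscapedTags` and the sample strings are hypothetical, and the pattern is only a heuristic: it would also flag posts that intentionally show escaped markup, such as code samples.

```typescript
// Hypothetical helper: returns every substring that looks like an HTML tag
// whose brackets were escaped, e.g. "&lt; /span&gt;" or "&lt;p&gt;".
function findEscapedTags(translatedHtml: string): string[] {
  // "&lt;", optional whitespace, optional "/", a tag name, then "&gt;".
  const escapedTag = /&lt;\s*\/?\s*[a-zA-Z][^&]*?&gt;/g;
  return translatedHtml.match(escapedTag) ?? [];
}

// Illustrative inputs, not taken from the posts linked in the thread.
const brokenSample = "<p>Hallo <span>wereld&lt; /span&gt;</p>";
const cleanSample = "<p>Hallo <span>wereld</span></p>";

const brokenHits = findEscapedTags(brokenSample);
const cleanHits = findEscapedTags(cleanSample);
console.log(brokenHits.length, cleanHits.length);
```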


The API has an optional parameter named ‘format’.

This optional parameter allows you to indicate that the text to be translated is either plain-text or HTML. A value of "html" indicates HTML and a value of "text" indicates plain-text.

Use this as: format=html


https://cloud.google.com/translate/v2/using_rest
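For illustration, here is a minimal sketch of a v2 REST request body with the format parameter set explicitly. The helper name `buildHtmlTranslateBody` and the sample HTML are assumptions. Authentication and the HTTP call itself are left out, so this only shows where `format` goes.

```typescript
// Endpoint of the Cloud Translation v2 REST translate method.
const TRANSLATE_V2_URL = "https://translation.googleapis.com/language/translate/v2";

// Fields of the JSON request body used in this sketch.
interface TranslateRequestBody {
  q: string[];              // text(s) to translate
  target: string;           // target language code, e.g. "nl"
  format: "html" | "text";  // how the service should interpret q
  source?: string;          // optional source language code
}

// Hypothetical helper: builds a body that marks the input as HTML.
function buildHtmlTranslateBody(html: string, target: string, source?: string): TranslateRequestBody {
  const body: TranslateRequestBody = { q: [html], target, format: "html" };
  if (source !== undefined) {
    body.source = source;
  }
  return body;
}

const requestBody = buildHtmlTranslateBody("<p>Hello <span>world</span></p>", "nl", "en");
// The URL and JSON payload a client would POST with Content-Type: application/json.
const translateRequest = { url: TRANSLATE_V2_URL, payload: JSON.stringify(requestBody) };
console.log(translateRequest.payload);
```

The resulting payload can be sent with whichever HTTP client the project already uses, together with its existing credentials.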
