I want to build a solution that processes batches of documents written in multiple languages. According to the Google documentation, batch document translation accepts a list of documents, but the source language code is mandatory and only one language code can be given per request.
I want the application to work globally and translate batches of documents by auto-detecting each document's language. Please suggest whether there is an alternative way to do this with the Cloud Translation API.
Below is the code from the Google documentation:
from google.cloud import translate_v3beta1 as translate


def batch_translate_document(input_uri: str, output_uri: str, project_id: str, timeout=180):
    client = translate.TranslationServiceClient()

    # The ``global`` location is not supported for batch translation
    location = "us-central1"

    # Google Cloud Storage location for the source input. This can be a single file
    # (for example, ``gs://translation-test/input.docx``) or a wildcard
    # (for example, ``gs://translation-test/*``).
    gcs_source = {"input_uri": input_uri}

    batch_document_input_configs = {
        "gcs_source": gcs_source,
    }
    gcs_destination = {"output_uri_prefix": output_uri}
    batch_document_output_config = {"gcs_destination": gcs_destination}
    parent = f"projects/{project_id}/locations/{location}"

    # Start the long-running batch operation. A single source language code
    # applies to every document in the request.
    operation = client.batch_translate_document(
        request={
            "parent": parent,
            "source_language_code": "en-US",
            "target_language_codes": ["fr-FR"],
            "input_configs": [batch_document_input_configs],
            "output_config": batch_document_output_config,
        }
    )

    print("Waiting for operation to complete...")
    response = operation.result(timeout)

    print("Total Pages: {}".format(response.total_pages))
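
Since each batch request takes exactly one source language code, one possible workaround is to detect the language of each file first, group the files by language, and send one batch request per group. The sketch below shows only the grouping step and is not taken from the Google documentation. The helper names `group_uris_by_language` and `build_batch_requests` are hypothetical, and so is the `detect_language` callable: it is assumed to return a language code for a given file URI, for example by sending a short text sample from the file to the API's text language detection, which is not shown here. Writing each group to its own sub-prefix under the output location is also an assumption.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_uris_by_language(
    uris: Iterable[str],
    detect_language: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group document URIs by the language code the detector reports.

    `detect_language` is a hypothetical callable supplied by the caller.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for uri in uris:
        groups[detect_language(uri)].append(uri)
    return dict(groups)


def build_batch_requests(
    groups: Dict[str, List[str]],
    parent: str,
    target_language_code: str,
    output_uri_prefix: str,
) -> List[dict]:
    """Build one request dict per detected source language.

    The dict layout mirrors the request used in the snippet above.
    `output_uri_prefix` is assumed to end with "/".
    """
    requests = []
    for source_language, uris in sorted(groups.items()):
        if source_language == target_language_code:
            # Documents already in the target language need no translation.
            continue
        requests.append(
            {
                "parent": parent,
                "source_language_code": source_language,
                "target_language_codes": [target_language_code],
                # One input config per document in this language group.
                "input_configs": [{"gcs_source": {"input_uri": uri}} for uri in uris],
                # Assumption: each group writes to its own sub-prefix.
                "output_config": {
                    "gcs_destination": {
                        "output_uri_prefix": f"{output_uri_prefix}{source_language}/"
                    }
                },
            }
        )
    return requests
```

Each resulting dict could then be passed as `request=` to `client.batch_translate_document(...)` in the same way as in the snippet above, waiting on every returned operation in turn.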