Convert bilingual JSON to bilingual XLIFF


Manuel Souto Pico

May 31, 2022, 8:02:49 AM
to okapi-users
Dear all,

I'm trying to convert a bilingual JSON file to a bilingual XLIFF.

My input is something like this:

[
  {
    "id": 1,
    "source_text": "Hello world",
    "target_text": "Hallo Welt!"
  },
  {
    "id": 2,
    "source_text": "Foo",
    "target_text": "Bar"
  }
]

and I am trying to obtain something like this:

<trans-unit id="sg2_sf1_tu1" resname="source_text_1">
  <source xml:lang="en">Hello world</source>
  <target xml:lang="de">Hallo Welt!</target>
</trans-unit>
<trans-unit id="sg3_sf2_tu1" resname="source_text_1">
  <source xml:lang="en">Foo</source>
  <target xml:lang="de">Bar</target>
</trans-unit>
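
For reference, the intended mapping (one trans-unit per JSON entry, with source_text as the source and target_text as the target) can be sketched in a few lines of Python. This is only an illustration of the desired result, not how Rainbow or the Okapi filters work internally. The function name json_to_trans_units is hypothetical, and the sketch assumes the "id" value is used directly for both the id and resname attributes, which differs from the identifiers Okapi generates.

```python
import json
import xml.etree.ElementTree as ET

# Namespace URI behind the reserved "xml:" prefix (used for xml:lang).
XML_NS = "http://www.w3.org/XML/1998/namespace"


def json_to_trans_units(json_text, src_lang="en", tgt_lang="de"):
    """Build a <body> holding one <trans-unit> per JSON entry.

    Assumption: the input is a top-level array of objects with the keys
    "id", "source_text" and "target_text", as in the example above.
    """
    body = ET.Element("body")
    for entry in json.loads(json_text):
        entry_id = str(entry["id"])  # accept numeric or string ids
        unit = ET.SubElement(
            body, "trans-unit", {"id": entry_id, "resname": entry_id}
        )
        source = ET.SubElement(unit, "source", {f"{{{XML_NS}}}lang": src_lang})
        source.text = entry["source_text"]
        target = ET.SubElement(unit, "target", {f"{{{XML_NS}}}lang": tgt_lang})
        target.text = entry["target_text"]
    return body


sample = """
[
  {"id": 1, "source_text": "Hello world", "target_text": "Hallo Welt!"},
  {"id": 2, "source_text": "Foo", "target_text": "Bar"}
]
"""

body = json_to_trans_units(sample)
print(ET.tostring(body, encoding="unicode"))
```

The result is only the body fragment; a complete XLIFF 1.2 document would also need the enclosing xliff and file elements.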

I have tried creating a Rainbow project with the ID-based Copy step and two custom filters:

- On Input List 1, I add the JSON file and extract the 'source_text' key, and match the 'id' key for the resname.
- On Input List 2, I add the JSON file again and extract the 'target_text' key, and match the 'id' key for the resname.

I would assume the 'id' key is what allows Rainbow to copy the target text from the second input and use it as the target of the first input, based on the matching id. However, I get this:

=== Start process
Input: /home/pico/Sync/IEA/JSON/file.json
WARNING: Duplicate id detected: target_text_1
WARNING: Id 'target_text_1' is in the second file, but not in the main input.

Error count: 0, Warning count: 2
Process duration: 0h 0m 0s 982ms
=== End process

and the XLIFF file produced does not include the target text of the second input as the translation.

How can I manage to achieve this?

I'm attaching the filters, project and source files.
Thanks in advance for any tips.

Cheers, Manuel
okf_json@tgtLang.fprm
file.json
okp_json_prep.rnb
okf_json@srcLang.fprm

Manuel Souto Pico

Jun 3, 2022, 8:32:39 AM
to okapi-users
Hi there,

Could someone let me know whether I'm at least barking up the right tree?

Cheers, Manuel

Álvaro Mira del Amo

Jun 5, 2022, 10:29:08 AM
to okapi-users
Hi Manuel,

Would you mind updating your JSON content so that the values of your 'id' keys are strings?
If you do so, the id values you use in your filters will be picked up correctly and the resulting XLIFF file will be what you expect. I'm attaching the copy I obtained, where the resname is actually the ID you specified in the filter definitions.

Best,
A.
manuel.json.xlf

Manuel Souto Pico

Jun 7, 2022, 6:37:20 AM
to Álvaro Mira del Amo, okapi-users
Thank you so much for your tip, Álvaro.

If I understand correctly, I should change "id": 1001 to "id": "1001". That's a feasible change on my end.
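
A minimal sketch of that change in Python, assuming the file is a top-level JSON array of objects as in the example earlier in the thread. The helper name stringify_ids is hypothetical and the script is not part of Okapi.

```python
import json


def stringify_ids(entries, key="id"):
    """Return copies of the entries with the value of `key` turned into a string.

    Entries that lack the key are copied unchanged.
    """
    return [
        {**entry, key: str(entry[key])} if key in entry else dict(entry)
        for entry in entries
    ]


original = json.loads('[{"id": 1001, "source_text": "Foo", "target_text": "Bar"}]')
converted = stringify_ids(original)

# ensure_ascii=False keeps non-ASCII text (e.g. German umlauts) readable.
print(json.dumps(converted, ensure_ascii=False, indent=2))
```

The converted list can then be written back to the file with json.dump before running the pipeline.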

However, I still don't get your results. My target is still empty: <target xml:lang="de"></target>

I suppose you didn't use the rnb file that I sent. Could you please share yours, to see what is different? Same for the filters.

In case it matters, I'm using Okapi Rainbow 1.42.0 with Java 17.0.3 on Linux 5.16.

Thank you so much!

Cheers, Manuel


Manuel Souto Pico

Jun 7, 2022, 6:50:16 AM
to Álvaro Mira del Amo, okapi-users
Update:

My apologies, I was using the off-the-shelf translation kit creation rather than my custom pipeline. I can confirm it works!

Cheers, Manuel