latest XLIFF filter for OmegaT: no backwards compatibility for tags?

Manuel Souto Pico

Sep 19, 2023, 6:24:03 AM
to okapi-users
Dear all,

I have a problem with the latest XLIFF filter for OmegaT (I'm referring to version 1.13-1.45.0).

Some brief background: we were using version 1.11-1.43, which was compatible with Java 8, in OmegaT 5.7.1, which runs only on Java 8 on Windows. We have translated a number of projects using those versions, and they are now moving to the revision step.

In the meantime, we have upgraded both OmegaT (5.8) and the Okapi plugin (1.13-1.45.0) to get the bug fixes and enhancements we need. However, many translations are now broken in the projects that were translated with the previous version of the filter.

Here comes one example of many.

The source file has:

<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en-US" datatype="plaintext" original="ng2.template">
    <body>
      <trans-unit id="clients.clientFeatures.enableText" datatype="html">
        <source> Please confirm you would like to turn on <x id="INTERPOLATION" equiv-text="{{ togglingFeatureShortName }}" /> for this client. </source>
      </trans-unit>
    </body>
  </file>
</xliff>

Using plugin okapiFiltersForOmegaT-1.11-1.43.0.jar in OmegaT 5.7.1 (run on Java 8), that looked like this:
image.png
Now when we open the same project in OmegaT 6.0.0 (run on Java 11) with plugin okapiFiltersForOmegaT-1.13-1.45.0.jar, it looks like this:
image.png
As you can see, the translation has become an "orphan" translation, which means that it's still in the working TM of the project but does not populate the segment because there's no exact match.

Please also notice the strange tag that the new filter produces in OmegaT: <x619392636/>, rather than the expected <x1/>. I have no clue where that long number comes from.

I have tested the two plugins/filters in the same version of OmegaT (5.7.1) but running on Java 8 and Java 11. The problem is not in OmegaT, it's in the filter.

Would it be possible to revert the change that made the filter change XLIFF tags? Or if that's not possible, would it be possible to create some tool to reliably convert the TMs to transition from the old format to the new format?
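
For the conversion route, a rough sketch of what such a tool might do at the segment level is below. It assumes that tags appear in the segment text in OmegaT's shorthand (as in <x1/> and <x619392636/> above) and that the mapping from old to new tag numbers is already known. The names `retag` and `TAG_PATTERN` are hypothetical, and reading or writing the TM file itself is left out.

```python
import re

# Matches OmegaT-style shorthand tags such as <x1/>, <g2> or </g2>.
# Assumption: tags are stored as plain text in this shorthand form.
TAG_PATTERN = re.compile(r"<(/?)([a-zA-Z]+)(\d+)(/?)>")


def retag(segment: str, number_map: dict[int, int]) -> str:
    """Renumber shorthand tags in one segment; unmapped tags are left untouched."""

    def replace(match: re.Match) -> str:
        slash_open, name, number, slash_close = match.groups()
        if int(number) not in number_map:
            return match.group(0)
        return f"<{slash_open}{name}{number_map[int(number)]}{slash_close}>"

    # A single pass with a callback avoids chained replacements (1 -> 2 -> 3).
    return TAG_PATTERN.sub(replace, segment)


old = "Turn on <x1/> for this client."
print(retag(old, {1: 619392636}))  # Turn on <x619392636/> for this client.
```

The hard part, which this sketch does not attempt, is building the number map for each segment from the source XLIFF.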

Is there a ticket already covering this? I can't see any. Shall I file one?

Thanks a lot.
Cheers, Manuel

Manuel Souto Pico

Sep 19, 2023, 6:26:18 AM
to okapi-users
I forgot to attach a test project / file for testing purposes, attached now.
Thanks, Manuel
okapi_v11-v13_conflict_test_OMT.omt

Manuel Souto Pico

Sep 22, 2023, 12:50:18 PM
to okapi-users
Dear all,

Some reply about this issue would be greatly appreciated.

At the very least, it would be good to understand the reason for these long tags, e.g. <x619392636/>. Some change in the code of the latest version must be causing this.

Thanks a lot.
Cheers, Manuel

Jimbo

Sep 22, 2023, 3:19:56 PM
to Manuel Souto Pico, okapi-users

Hey Manuel!

I tested your XLIFF file on the latest dev code. Maybe there was a fix since 1.45.0 was released, but this is the output I get (attached). Note that the id of the <x> code is preserved as-is (id="INTERPOLATION").

Maybe there is some new OmegaT interaction with non-numeric ids? We did refactor the XLIFF filter to be more consistent with ids, so this code has changed. We would just need to understand all the code that is touching your files. Again, I'm not familiar with OmegaT.

cheers,

Jim

--
Jim Hargrave
Software Engineer

file.xlf.xliff_extracted
file.xlf.tkitMerged
file.xlf

Manuel Souto Pico

Sep 30, 2023, 6:27:31 AM
to Jimbo, okapi-users
Thank you for your reply, Jim.

Your test proves that the inline code is preserved in the roundtrip. That's fine, but that's not where the problem is. The problem is in how the tag (i.e. <x id="INTERPOLATION" ctype="x-x" equiv-text="{{ togglingFeatureShortName }}"/> in your extracted file) is represented in OmegaT.

I have tried to build the plugin on your latest dev code but I still get the same result:
image.png

I'm not sure I did that correctly, could you or someone review my steps and confirm?

git clone https://bitbucket.org/okapiframework/omegat-plugin.git
cd omegat-plugin
git checkout dev
git pull origin dev
cd filters
mvn clean package

Thanks a lot.

In the meantime, I'll share what you mentioned about non-numeric ids on the OmegaT dev list. That's probably where the problem is, since you made changes there recently. I think Hiroshi is on this list, but I don't know if he follows it closely.

Cheers, Manuel






Jimbo

Oct 6, 2023, 4:47:33 AM
to Manuel Souto Pico, okapi-users

Some more detail: 

Okapi uses two ids for a Code. Code.id is numeric and meant *only* to be an index into the TextFragment. Code.originalId, by contrast, is a string and is preferred in *all* cases if it is non-null. In this case, it looks like the string id ("INTERPOLATION") is being converted to an integer.
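
A quick way to test this hypothesis is to compare the number in the OmegaT tag with what Java's String.hashCode() returns for the id. The sketch below reimplements that hash in Python (the function name `java_string_hashcode` is made up for illustration and is not part of Okapi or OmegaT). For "INTERPOLATION" it yields -619392636, whose absolute value is exactly the number in <x619392636/>. That suggests the string id is being hashed into an integer somewhere between the filter and OmegaT, although it does not show which code does it or how the sign gets dropped.

```python
def java_string_hashcode(s: str) -> int:
    """Reproduce Java's String.hashCode(): h = 31*h + unit over UTF-16 code units."""
    h = 0
    data = s.encode("utf-16-be")  # Java strings are sequences of UTF-16 code units
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # wrap like a 32-bit Java int
    # Reinterpret the unsigned 32-bit value as a signed int
    return h - 0x100000000 if h >= 0x80000000 else h


code_id = "INTERPOLATION"
print(java_string_hashcode(code_id))       # -619392636
print(abs(java_string_hashcode(code_id)))  # 619392636
```

In Java itself, the equivalent check is simply "INTERPOLATION".hashCode().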

Manuel Souto Pico

Jul 15, 2024, 11:12:08 AM
to Jimbo, okapi-users