Issue #1374: Unescaped closing angle bracket in ITS-excluded target content (okapiframework/okapi)

msoutopico

Oct 3, 2024, 7:54:58 AM
to okapi-...@googlegroups.com
New issue 1374: Unescaped closing angle bracket in ITS-excluded target content
https://bitbucket.org/okapiframework/okapi/issues/1374/unescaped-closing-angle-bracket-in-its

Manuel Souto Pico:

### Background

The bug happens while using the XML filter together with ITS filter properties.

My content has some phrases and terms in angle brackets, e.g. `<foo>`. These expressions are encoded as `&lt;foo&gt;` in the source XML file.

The closing angle bracket (`&gt;`) is written out unescaped (`>`) in nodes excluded by ITS's locale filter, which breaks the application consuming the translation.
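
For context, a raw `>` in text content is still well-formed XML, so a standard parser reads both forms as the same text; only a consumer that depends on the serialized form is affected. A minimal sketch with Python's standard library, using the `<text>` content from this report (the variable names are illustrative only):

```python
import xml.etree.ElementTree as ET

# Serialized forms taken from the expected and actual target files.
escaped = "<text>&lt;I don’t know&gt;</text>"
unescaped = "<text>&lt;I don’t know></text>"

# Both strings parse without error and yield the same text content.
same_parsed_text = ET.fromstring(escaped).text == ET.fromstring(unescaped).text

# The raw serializations differ, which is what a strict consumer sees.
same_raw_form = escaped == unescaped

print(same_parsed_text, same_raw_form)
```

This does not explain why the consuming application fails; it only shows that the difference lies in the serialization and not in the parsed content.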

### Preconditions

Source file has:

```xml
<label key="ST801KAZ_ST801Q0104KAZ_54c6c210509bc38e7b8bea748938937e_6" its:localeFilterList="en-PH" its:localeFilterType="exclude">
<text>&lt;I don’t know&gt;</text>
</label>
<label key="ST801KAZ_ST801Q0104KAZ_54c6c210509bc38e7b8bea748938937e_6" its:localeFilterList="en-PH" its:localeFilterType="include">
<text>&lt;I don’t know&gt;</text>
</label>
```

### Expected result

Translation into en-PH should produce this target file:

```xml
<label its:localeFilterList="en-PH" its:localeFilterType="exclude" key="ST801KAZ_ST801Q0104KAZ_54c6c210509bc38e7b8bea748938937e_6">
<text>&lt;I don’t know&gt;</text>
</label>
<label its:localeFilterList="en-PH" its:localeFilterType="include" key="ST801KAZ_ST801Q0104KAZ_54c6c210509bc38e7b8bea748938937e_6">
<text>&lt;Je ne sais pas&gt;</text>
</label>
```

### Actual results

Translation into en-PH produces this target file:

```xml
<label its:localeFilterList="en-PH" its:localeFilterType="exclude" key="ST801KAZ_ST801Q0104KAZ_54c6c210509bc38e7b8bea748938937e_6">
<text>&lt;I don’t know></text>
</label>
<label its:localeFilterList="en-PH" its:localeFilterType="include" key="ST801KAZ_ST801Q0104KAZ_54c6c210509bc38e7b8bea748938937e_6">
<text>&lt;Je ne sais pas&gt;</text>
</label>
```
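
To spot affected lines in a target file, a rough check like the one below might help. It is only a sketch, not part of Okapi: `lines_with_raw_gt` is a hypothetical helper, and it assumes that each tag sits on a single line and that attribute values contain no raw `<` or `>`.

```python
import re

# Matches a start tag, end tag, or empty-element tag on one line.
# Assumption: attribute values contain no raw '<' or '>'.
TAG = re.compile(r"<[^<>]*>")


def lines_with_raw_gt(xml_text: str) -> list[int]:
    """Return 1-based numbers of lines whose text content has a raw '>'."""
    hits = []
    for number, line in enumerate(xml_text.splitlines(), start=1):
        # Remove the tags; any '>' left over belongs to text content.
        if ">" in TAG.sub("", line):
            hits.append(number)
    return hits


expected_sample = "<label>\n<text>&lt;I don’t know&gt;</text>\n</label>"
actual_sample = "<label>\n<text>&lt;I don’t know></text>\n</label>"

print(lines_with_raw_gt(expected_sample))
print(lines_with_raw_gt(actual_sample))
```

With the two samples above, the expected output reports no lines and the actual output reports line 2, the excluded `<text>` element.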

### Comments

I know that there is [a flag in the FPRM file for escaping or not escaping the > char](https://okapiframework.org/wiki/index.php/XML_Filter#escapeGT). The problem seems to be that this option only takes effect for included nodes: notice how the output differs depending on whether the label is excluded or included for the project's target language (en-PH).

I ran into this issue while using the XML filter in OmegaT via the filter plugin, but I could also reproduce it by creating an XLIFF file with Rainbow (also attached), which is why I’m reporting it in the general okapi tracker and not in the omegat-plugin tracker.

For your convenience, I'm attaching:

* the source file (STQ.xml)
* the target file
* the filter parameters file
* the OmegaT package including all of the above
* the target XLIFF file

### Other bugs that hamper testing

I could not create an OmegaT project in Rainbow 1.45 using Utilities > Translation Kit Creation. I selected the OmegaT Project option, but the output was Generic XLIFF (this is a separate bug).
