There is a change in 1.44 XLIFFFilter that affects the way we get <mrk> elements.
For example in 1.43 for this:
Texte avec <mrk mtype="x-sdl-location" mid="someID5"/>location.
We were getting an inline code with type=mrk and no annotation.
Now we get an inline code with type="_annotation_" and an inline annotation.
The change comes from those new lines:
Is there a reason for the compatibility-breaking change?
The default is to add a custom annotation that basically stores the mtype of the marker, which does not really add any useful information.
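For anyone adapting calling code, here is a minimal Java sketch of accepting both shapes. InlineCodeSketch and MarkerCompat are hypothetical stand-ins, not Okapi classes, and the two type strings are just the values quoted in this thread; the real Code and annotation APIs are deliberately not used here.

```java
import java.util.Optional;

// Hypothetical stand-in for an inline code; this is NOT Okapi's Code class.
final class InlineCodeSketch {
    final String type;   // "mrk" (as seen with 1.43) or "_annotation_" (as seen with 1.44)
    final String mtype;  // mtype carried by the annotation, or null when there is none

    InlineCodeSketch(String type, String mtype) {
        this.type = type;
        this.mtype = mtype;
    }
}

final class MarkerCompat {
    // Treats both the old and the new code type as a marker.
    static boolean isMarker(InlineCodeSketch code) {
        return "mrk".equals(code.type) || "_annotation_".equals(code.type);
    }

    // Returns the mtype when the code is a marker that carries one, otherwise empty.
    static Optional<String> markerType(InlineCodeSketch code) {
        return isMarker(code) ? Optional.ofNullable(code.mtype) : Optional.empty();
    }
}
```

With this kind of check, a caller does not need to care which release produced the code; it only loses the mtype when the old shape is used.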
It seems there is also a side effect with the change.
Some mrk elements processed like this end up as isolated tags in the segment, rather than the expected opening/closing pair (as they are with 1.43).
I still have to dig into that part to see why.
But I was wondering if there was a reason for the root cause first.
I'll look into this ASAP. I made these changes a while back to address some bugs we were finding, but I don't remember the details now. I'll try to find the specific files I was testing against so that, if we make changes, we can make sure those files still pass.
You received this message because you are subscribed to the Google Groups "okapi-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to okapi-devel...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/okapi-devel/007b01d8c4ce%2419d9a270%244d8ce750%24%40gmail.com.
Ok, this change was made to add support for Custom mrk elements in xliff 2 (with the ability to extract these to xliff 1.2 and preserve them).
Here's the commit comment: "various fixes to merger code
(TextUnitMerger) additional support for custom mrk elements in
xliff2 and xliff 1.2"
One reason I wanted to consistently add the annotation type to mrk
codes is that during "TextFragmentUtil.synchronizeCodeIds" we
treat them differently than normal inline codes. We don't expect
to align these between the source and target and can have any
number of added or missing annotation codes in the target. The
idea is that these can be added in the translation.
Ugly stuff for sure. Maybe the bug you noted is caused by
treating these mrk codes specially now.
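As a rough illustration of that idea (this is not the actual TextFragmentUtil.synchronizeCodeIds implementation, which is not shown in this thread), the Java sketch below aligns ids between source and target for normal codes and skips annotation codes entirely. SketchCode, SyncSketch, and matching by identical data are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an inline code; NOT Okapi's Code class.
final class SketchCode {
    final String type;  // "_annotation_" marks an annotation-only code in this sketch
    final String data;  // original markup, used here as the matching key
    int id;

    SketchCode(String type, String data, int id) {
        this.type = type;
        this.data = data;
        this.id = id;
    }

    boolean isAnnotation() {
        return "_annotation_".equals(type);
    }
}

final class SyncSketch {
    // Copies ids from source codes to target codes that have the same data.
    // Annotation codes on either side are ignored, so the target may contain
    // any number of added or missing annotation codes.
    // Returns how many target codes were aligned.
    static int synchronizeIds(List<SketchCode> source, List<SketchCode> target) {
        List<SketchCode> pool = new ArrayList<>();
        for (SketchCode s : source) {
            if (!s.isAnnotation()) {
                pool.add(s);
            }
        }
        int aligned = 0;
        for (SketchCode t : target) {
            if (t.isAnnotation()) {
                continue; // never aligned, id left untouched
            }
            for (int i = 0; i < pool.size(); i++) {
                if (pool.get(i).data.equals(t.data)) {
                    t.id = pool.remove(i).id; // each source code is used at most once
                    aligned++;
                    break;
                }
            }
        }
        return aligned;
    }
}
```

The point of the sketch is only that annotation codes fall outside the alignment, which is why giving every mrk code the annotation type changes how it is treated downstream.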
Removing the code in question I get a failure on this file. So at least we have something to debug with:
[ERROR] RoundTripXliffIT.xliffFiles:81->BaseRoundTripIT.realTestFiles:98->EventRoundTripIT.runTest:91 » OkapiTest lqiTest.xlf
[ERROR] RoundTripXliffIT.xliffSerialized:102->BaseRoundTripIT.realTestFiles:98->EventRoundTripIT.runTest:91 » OkapiTest lqiTest.xlf
I’m fine with keeping the change and the custom annotation, especially since it helps for 2.x.
The change to handle that on the filter caller side is minor.
But it’d be great to have the markers in the extracted coded-text set to OPENING/CLOSING rather than ISOLATED.
The attached file should trigger such behavior in the first target segment.
I’ll try to understand why it does that too.
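To make the OPENING/CLOSING versus ISOLATED distinction concrete, here is a small Java sketch of one way markers could be paired within a segment. It is not Okapi's actual balancing logic and says nothing about the cause of the behavior above; MarkerKind, SegmentMarker, and BalanceSketch are hypothetical names, and the stack-based pairing is just an assumption for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical tag kinds, named after the terms used in this thread.
enum MarkerKind { OPENING, CLOSING, ISOLATED }

// Hypothetical stand-in for a marker code inside one segment.
final class SegmentMarker {
    final String type;
    MarkerKind kind;

    SegmentMarker(String type, MarkerKind kind) {
        this.type = type;
        this.kind = kind;
    }
}

final class BalanceSketch {
    // Pairs each CLOSING marker with the most recent unmatched OPENING marker
    // of the same type. Any marker left without a partner in this segment is
    // downgraded to ISOLATED.
    static void balance(List<SegmentMarker> markers) {
        Deque<SegmentMarker> open = new ArrayDeque<>();
        for (SegmentMarker m : markers) {
            if (m.kind == MarkerKind.OPENING) {
                open.push(m);
            } else if (m.kind == MarkerKind.CLOSING) {
                if (!open.isEmpty() && open.peek().type.equals(m.type)) {
                    open.pop(); // matched pair keeps OPENING/CLOSING
                } else {
                    m.kind = MarkerKind.ISOLATED; // no partner in this segment
                }
            }
        }
        for (SegmentMarker unmatched : open) {
            unmatched.kind = MarkerKind.ISOLATED;
        }
    }
}
```

Under this model, a marker only ends up ISOLATED when its partner is missing from the segment or is not recognized as the same type, which is the kind of thing worth checking in the attached file.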
Oh, good I'm glad it was something small. Was worried there for a
I hate that getCode has side effects. I've looked over the balanceCodes several times to see if I could clean this up - but have never come up with a better solution.
I approved the PR with one comment.