Key-based alignment of two XML files


Manuel Souto Pico

Feb 24, 2023, 5:35:42 AM
to okapi-users
Dear all,

I have two XML files, source and target. The source XML file looks a bit like this:

<label key="585afd88cce860ed194a4eced247f124">
   <text>foo</text>
</label>
<label key="64b8a8ea560bc7e850a5db4918366c70">
   <text>bar</text>
</label>

The target file looks like this:

<label key="585afd88cce860ed194a4eced247f124">
   <text>FÜÜ</text>
</label>
<label key="64b8a8ea560bc7e850a5db4918366c70">
   <text>BÂR</text>
</label>

In other words, same keys, different text.

Here comes the question: Is it possible to align these two files in Okapi Rainbow based on the ID/keys?

In other words, the input would be those two XML files, and the output would be a TMX file that looks like this:

    <tu>
      <prop type="id">585afd88cce860ed194a4eced247f124</prop>
      <tuv lang="en">
        <seg>foo</seg>
      </tuv>
      <tuv lang="fr">
        <seg>FÜÜ</seg>
      </tuv>
    </tu>
    <tu>
      <prop type="id">64b8a8ea560bc7e850a5db4918366c70</prop>
      <tuv lang="en">
        <seg>bar</seg>
      </tuv>
      <tuv lang="fr">
        <seg>BÂR</seg>
      </tuv>
    </tu>

Thanks in advance.
Cheers, Manuel

jimbo

Feb 24, 2023, 10:29:42 AM
to Manuel Souto Pico, okapi-users

Have you tried the ID Aligner Step? In your filter configuration, you would need to set the ID to the value of the key attribute.


Manuel Souto Pico

Feb 28, 2023, 9:01:28 AM
to jimbo, okapi-users
Thanks, Jim.


Cheers, Manuel

jimbo

Feb 28, 2023, 10:04:03 AM
to Manuel Souto Pico, okapi-users

That's correct. As long as the filter can populate the "resname", that step should align all the text units.
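
For illustration only, the key-based pairing discussed in this thread can be sketched outside Okapi in a few lines of Python. This is not how the aligner step is implemented; it is a minimal sketch that assumes each file wraps its <label> elements in a single root element (the name <labels> below is hypothetical, since the excerpts in the question show no root). The function name align_to_tmx_body and the default language codes are also assumptions. It emits only the <tu> elements in the shape shown in the question, wrapped in <body>; a complete TMX file would also need the <tmx> root and a <header>.

```python
import xml.etree.ElementTree as ET


def read_labels(xml_text):
    """Map each <label key="..."> to the text of its <text> child."""
    root = ET.fromstring(xml_text)
    return {
        label.get("key"): (label.findtext("text") or "")
        for label in root.iter("label")
    }


def align_to_tmx_body(source_xml, target_xml, src_lang="en", tgt_lang="fr"):
    """Pair source and target entries by key and return a <body> of <tu> elements.

    Keys that appear in only one of the two files are skipped.
    """
    source = read_labels(source_xml)
    target = read_labels(target_xml)

    body = ET.Element("body")
    for key, src_text in source.items():
        if key not in target:
            continue  # no counterpart in the target file
        tu = ET.SubElement(body, "tu")
        ET.SubElement(tu, "prop", {"type": "id"}).text = key
        for lang, text in ((src_lang, src_text), (tgt_lang, target[key])):
            # The question's example uses a plain "lang" attribute;
            # TMX 1.4 itself uses xml:lang on <tuv>.
            tuv = ET.SubElement(tu, "tuv", {"lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical <labels> root wrapping the entries from the question.
    src = (
        '<labels>'
        '<label key="585afd88cce860ed194a4eced247f124"><text>foo</text></label>'
        '<label key="64b8a8ea560bc7e850a5db4918366c70"><text>bar</text></label>'
        '</labels>'
    )
    tgt = (
        '<labels>'
        '<label key="585afd88cce860ed194a4eced247f124"><text>FÜÜ</text></label>'
        '<label key="64b8a8ea560bc7e850a5db4918366c70"><text>BÂR</text></label>'
        '</labels>'
    )
    print(align_to_tmx_body(src, tgt))
```

Because the pairing is a dictionary lookup on the key, the order of the entries in the two files does not matter, which is the main difference from a purely positional alignment.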

Jim
