FXP file format


Mikhail Kudinov

Jan 15, 2013, 5:07:46 AM
to opentm2...@googlegroups.com
Hi, 

We want to add support for editing FXP files to our tool.
Is the code that handles this format isolated in a library, module, or set of source files that we could reuse?

I have read through the source code, but haven't yet found the part that deals with the file format.

Thanks in advance!

Best regards, Mikhail Kudinov.

GerhardF

Jan 15, 2013, 8:04:07 AM
to opentm2...@googlegroups.com
Dear Mikhail,

Many thanks for your interest in the OpenTM2 project; it is good to see that you have already studied the current source code of OpenTM2.

The OpenTM2 folder, which simply keeps all data of a translation project (e.g. the files to be translated, translation memories, translation dictionaries etc.), is a binary file which can't be processed without using OpenTM2 functions. We have a strong set of OpenTM2 APIs integrated into OpenTM2, but they only work if OpenTM2 is installed on a PC.

We have plans (in a future development cycle) to replace the proprietary, binary OpenTM2 folder with an open and flexible alternative - it may be a concept based on a ZIP container, similar to what we see in OpenOffice files or even Microsoft Open XML files. But this is the future :-)

Maybe you can let me know more about your tool, what you are developing etc. It may be worth cooperating in this area.

Best regards .... Gerhard

==================================================================

Mikhail Kudinov

Jan 15, 2013, 11:01:08 PM
to opentm2...@googlegroups.com
Dear Gerhard, 

Thank you for the reply!

Regarding the API: it could be useful for our case, especially if it provides a way to modify translation memory (and bilingual file content).
A ZIP container of XML files sounds great from an interoperability point of view.

Our tool, Verifika, is a translation QA tool that finds and corrects translation errors in bilingual files. It is based on the .NET technology stack.
Currently we support a lot of XLIFF/XLZ variations and some proprietary bilingual text formats.
In fact, we just need to get a set of (source, target) text pairs (possibly with tags) and put the updated target text back.

One of our customers asked us to support the FXP file format, so I started investigating.
Regarding cooperation, I think it may make sense. What kind of cooperation do you have in mind?

Best regards, Mikhail.

GerhardF

Jan 16, 2013, 5:30:25 AM
to opentm2...@googlegroups.com
Hi Mikhail,

One feature of OpenTM2 may be of interest to you (unfortunately it has a bug that is not yet fixed): you can export an OpenTM2 folder as an XLIFF-FOLDER. The XLIFF output then contains all documents of the folder (source and target) as well as a bunch of meta-information.

It looks like this (short excerpt):

<?xml version="1.0" encoding="UTF-8" ?>

<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1" xmlns:tmgr="http://www.ibm.com">
  <file datatype="html" original="count01.htm" source-language="en-US" target-language="de-DE" tool="OpenTM2">
    <header>
      <tmgr:properties>
        <tmgr:folder>XLIFF_TEST</tmgr:folder>
        <tmgr:markup>IBMHTM32</tmgr:markup>
        <tmgr:sourcelang>English(U.S.)</tmgr:sourcelang>
        <tmgr:targetlang>German(reform)</tmgr:targetlang>
        <tmgr:shortname>COUNT01H.000</tmgr:shortname>
        <tmgr:shipment>1</tmgr:shipment>
      </tmgr:properties>
    </header>
    <body>
      <trans-unit id="6" translate="yes" tmgr:segstatus="XLATED">
        <source>Test translation</source>
        <target state="translated">Test-Übersetzung</target>
        <count-group name="word count">
          <count count-type="word count" unit="word">2</count>
        </count-group>
      </trans-unit>

      <trans-unit id="66" translate="yes" tmgr:segstatus="XLATED">
        <source>This is a small test 1 sentence.</source>
        <target state="translated">Dies ist ein kleiner Test-1 Satz.</target>
        <count-group name="word count">
          <count count-type="word count" unit="word">6</count>
        </count-group>
      </trans-unit>

    </body>
  </file>
</xliff>

Would this be something you are looking for? Currently our development resources are tied up with the most important work, but I could imagine that the FOLDER XLIFF bug gets fixed during 2013.
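
The round trip asked about earlier in the thread (read source/target pairs, write an updated target back) can be sketched against an excerpt like the one above. This is only a minimal illustration in Python using the standard library, not OpenTM2 or Verifika code: the function names `read_pairs` and `set_target` are hypothetical, the sample is a shortened copy of the excerpt, the replacement German sentence is made up, and inline tags are flattened to plain text.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns declaration in the excerpt.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}

# Shortened copy of the excerpt (header and count-group left out).
SAMPLE = """<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1" xmlns:tmgr="http://www.ibm.com">
  <file datatype="html" original="count01.htm" source-language="en-US" target-language="de-DE" tool="OpenTM2">
    <body>
      <trans-unit id="6" translate="yes" tmgr:segstatus="XLATED">
        <source>Test translation</source>
        <target state="translated">Test-Übersetzung</target>
      </trans-unit>
      <trans-unit id="66" translate="yes" tmgr:segstatus="XLATED">
        <source>This is a small test 1 sentence.</source>
        <target state="translated">Dies ist ein kleiner Test-1 Satz.</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_pairs(root):
    """Return (unit id, source text, target text) for every trans-unit.

    itertext() flattens inline elements such as <ph>, so tags are lost here.
    """
    pairs = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.find("x:source", NS)
        target = unit.find("x:target", NS)
        pairs.append((
            unit.get("id"),
            "".join(source.itertext()) if source is not None else "",
            "".join(target.itertext()) if target is not None else "",
        ))
    return pairs


def set_target(root, unit_id, new_text):
    """Overwrite the text of an existing <target>; return True on success."""
    for unit in root.iterfind(".//x:trans-unit", NS):
        if unit.get("id") != unit_id:
            continue
        target = unit.find("x:target", NS)
        if target is None:
            return False  # units without a <target> are not handled here
        # Drop inline children and replace the content with plain text.
        for child in list(target):
            target.remove(child)
        target.text = new_text
        return True
    return False


root = ET.fromstring(SAMPLE)
pairs = read_pairs(root)
set_target(root, "66", "Dies ist ein kleiner Testsatz 1.")
updated = read_pairs(root)
```

A real QA tool would need to preserve inline tags rather than flatten them, and would serialize the tree back to a file after editing.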

AND: when I talk about "cooperation", I mainly mean getting contributors to the OpenTM2 project. Any help that improves the use of OpenTM2 is welcome, because our main goal is to make OpenTM2 a representative OPEN SOURCE application supporting all the most important open standards such as TMX, TBX, SRX, XLIFF etc. Our goal is also to become platform independent, so that OpenTM2 runs not only on Windows but also on e.g. Linux or OS X. It is also important that OpenTM2 can cooperate with as many applications and platforms as possible (e.g. with MT environments).


Best regards .... Gerhard

Mikhail Kudinov

Jan 16, 2013, 11:31:02 AM
to opentm2...@googlegroups.com
Hi Gerhard, 

Thank you for pointing out the export-to-XLIFF feature.
It seems to be exactly what we're looking for, so I'm looking forward to the bug being fixed.
Is the bug critical?

Regarding contribution, I'm afraid I'm not familiar enough with plain C code to make a useful contribution there.
We could help by supporting your XLIFF better, and so increase the number of tools interoperable with OpenTM2.

Best regards, Mikhail.


On Wednesday, January 16, 2013 at 17:30:25 UTC+7, GerhardF wrote:

GerhardF

Jan 17, 2013, 4:26:29 AM
to opentm2...@googlegroups.com
Hi Mikhail,

REF the XLIFF-bug: currently it does not have a high priority, because there are known translation-processes which are depending on this feature. It works fine in TM (the IBM TranslationManager/2, the "mother" of OpenTM2), but during the migration this FOLDER-XLIFF export was somehow broken. Unfortunately our development resources are fully occupied by other OpenTM2 related work, but I can check how much effort fixing the bug would take.

REF the contribution: it would help if you could run more intensive tests on your side as soon as we have fixed the XLIFF-bug. That alone would be a great help ;-)

Best regards ... Gerhard

GerhardF

Jan 17, 2013, 7:34:16 AM
to opentm2...@googlegroups.com
Ooops .... I was made aware of a little (but important) typo ....

I wrote " ... because there are known translation-processes which are..... ", but it should really read "... because there are NO known translation-processes which are ..."

Regards .... Gerhard

Mikhail Kudinov

Jan 22, 2013, 5:10:03 AM
to opentm2...@googlegroups.com
Hi Gerhard,

We tried to parse XLIFF files exported from the original IBM Translation Manager.
The export succeeded, but the resulting XLIFF does not follow the XLIFF 1.1 standard (declared in its header).
In particular, the <ph> elements lack the required "id" attribute.
So you might want to correct this in your version.

Hope it helps,
Mikhail.


On Thursday, January 17, 2013 at 19:34:16 UTC+7, GerhardF wrote:

GerhardF

Jan 22, 2013, 7:59:24 AM
to opentm2...@googlegroups.com
Hi Mikhail,

I know that we have not yet reached the point where we can call OpenTM2 "standard compliant", but it is one of our declared goals to turn OpenTM2 into a platform which supports the most important standards (such as TMX, TBX, XLIFF etc). The ultimate goal would be to declare OpenTM2 as THE reference implementation in the Open Source scene.

Would you help us get closer to this goal? You could e.g. provide an example XLIFF file which does NOT follow the standard, together with a hand-corrected version of the very same file that, in your view, follows the XLIFF standard perfectly.

This would be a great help ....

Regards ... Gerhard

Mikhail Kudinov

Jan 22, 2013, 10:02:29 AM
to opentm2...@googlegroups.com
That is not a problem.

In fact, correcting the whole XLIFF file is unnecessary, because I found only one non-compliant thing: the required "id" attribute is missing from <ph> elements (in <source> and <target>).
For example,
<target state="translated">Some text <ph>&lt;keyword&gt;</ph></target>
Valid XLIFF would be
<target state="translated">Some text <ph id="1">&lt;keyword&gt;</ph></target>
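
As a rough illustration of this fix, the missing ids could also be patched in after export. The Python sketch below is not how OpenTM2 does or should do it: the function name `add_missing_ph_ids` and the sample document are hypothetical, and numbering the <ph> elements in document order within each <source> or <target> is an assumption. It matches source and target placeholders only when both sides contain them in the same order.

```python
import xml.etree.ElementTree as ET

XLIFF = "urn:oasis:names:tc:xliff:document:1.1"
PH = f"{{{XLIFF}}}ph"

# Hypothetical sample built around the <ph> example from this post.
SAMPLE = """<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1">
  <file datatype="html" original="count01.htm" source-language="en-US" target-language="de-DE">
    <body>
      <trans-unit id="1">
        <source>Some text <ph>&lt;keyword&gt;</ph></source>
        <target state="translated">Some text <ph>&lt;keyword&gt;</ph></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def add_missing_ph_ids(root):
    """Give each <ph> that lacks an id a number unique within its enclosing
    <source> or <target>; return how many ids were added."""
    added = 0
    for tag in ("source", "target"):
        for container in root.iter(f"{{{XLIFF}}}{tag}"):
            # Ids already present in this container must not be reused.
            used = {ph.get("id") for ph in container.iter(PH) if ph.get("id")}
            next_id = 1
            for ph in container.iter(PH):
                if ph.get("id"):
                    continue
                while str(next_id) in used:
                    next_id += 1
                ph.set("id", str(next_id))
                used.add(str(next_id))
                added += 1
    return added


root = ET.fromstring(SAMPLE)
added = add_missing_ph_ids(root)
ids = [ph.get("id") for ph in root.iter(PH)]
```

Running the function a second time adds nothing, since every <ph> then already carries an id.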

Also, if you're thinking about interoperability and standards compliance, I highly recommend looking at Interoperability Now!
You might know that many custom XLIFF extensions represent the same concepts in different ways.
These XLIFF variations cannot easily be converted into one another, which prevents real interoperability between different tools.
That project's goal is to provide truly interoperable XLIFF documents and project packages.
You might also be interested in the Linport project (it will use Interoperability Now! as its XLIFF file format); it is a great initiative to provide an "open vendor-independent format for packaging translation materials".
They plan to build on existing open formats (XLIFF, TMX, TBX) and provide more project information in a standard way.

Best regards, Mikhail.

GerhardF

Jan 23, 2013, 8:41:00 AM
to opentm2...@googlegroups.com
Hi Mikhail .... thanks for your comments.

1. XLIFF-STANDARDS: Missing attributes in exported XLIFF files: I will talk to our developers to see the impact of changing the export code in the way you mentioned.
2. XLIFF-export: I opened a new ticket, #179, in TRAC. You can follow the progress directly in TRAC.
3. LINPORT: I will take this to the OpenTM2 steering committee.

Thanks again for your good suggestions, and for driving the improvement of OpenTM2 forward.

Best regards ... Gerhard

Mariel Varjão Azoubel

Sep 9, 2013, 9:58:06 PM
to opentm2...@googlegroups.com
Hi Mikhail/GerhardF,

I'm a Project Manager currently working with both your tools (IBM TM 6.5.1 and Verifika) and I found the workaround of exporting the FXP folder into XLIFF pretty useful so far.

But it does lack one thing - I can't figure out which of the Protection options to use with my folder so that the QA tool excludes Pretranslated segments (Exact matches pretranslated into the file with the Analyse function in TM before I received it) from the consistency checks.

I can do it with XBench pretty easily (just by ticking 'exclude pretranslated'), but Verifika only supports this for the memoQ, Lionbridge and Idiom kinds, so those are the only options I get. Is there any way to check the memory match ID in an XLIFF folder exported from FXP format? Or is the match information I'm looking for simply not transferable to other tools?

GerhardF

Sep 10, 2013, 2:35:55 AM
to opentm2...@googlegroups.com
Hi Mariel,

The current implementation in both TM and OpenTM2 does not indicate the "class" of proposal used for the translation. In other words, the XLIFF-folder doesn't show e.g. whether the translation came from an EXACT or FUZZY match.
Below is an excerpt of a segment in the exported XLIFF-folder. The element <target state="translated"> only shows that the segment was translated, but doesn't show where the proposal came from (e.g. EXACT or FUZZY match etc.). I could imagine that this is improved in 2014, when the main rework of OpenTM2 is done and a new OpenTM2 V.1.0 is released. Before that it's unfortunately not possible to spend development resources on this.

...
...
<trans-unit id="1" translate="yes" tmgr:segstatus="XLATED">

  <source>This is a small test 1 sentence.</source>
  <target state="translated">XL_This is a small test 1 sentence.</target>

  <count-group name="word count">
    <count count-type="word count" unit="word">6</count>
  </count-group>
</trans-unit>
...
...
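
To make the limitation concrete, the Python sketch below collects every status-like field that a trans-unit such as the one above carries. It is an illustration only: the function name `status_fields` is hypothetical, and the sample wraps the excerpt in a minimal document using the namespaces from the earlier XLIFF excerpt in this thread. XLIFF also defines an optional `state-qualifier` attribute on <target>, where a value such as `exact-match` would normally appear, but the excerpt does not contain it.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}
# tmgr: prefix as declared in the earlier excerpt (xmlns:tmgr="http://www.ibm.com").
SEGSTATUS = "{http://www.ibm.com}segstatus"

SAMPLE = """<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1" xmlns:tmgr="http://www.ibm.com">
  <file datatype="html" original="count01.htm" source-language="en-US" target-language="de-DE">
    <body>
      <trans-unit id="1" translate="yes" tmgr:segstatus="XLATED">
        <source>This is a small test 1 sentence.</source>
        <target state="translated">XL_This is a small test 1 sentence.</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def status_fields(root):
    """List the status information available for each trans-unit."""
    rows = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        target = unit.find("x:target", NS)
        rows.append({
            "id": unit.get("id"),
            "segstatus": unit.get(SEGSTATUS),
            "state": target.get("state") if target is not None else None,
            # Stays None for this export: no match origin is recorded.
            "state_qualifier": (
                target.get("state-qualifier") if target is not None else None
            ),
        })
    return rows


rows = status_fields(ET.fromstring(SAMPLE))
```

With only `segstatus` and `state` available, a QA tool has nothing in this file to separate pretranslated exact matches from segments translated by hand.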


Regards .... Gerhard