Validation of XLIFF 1.2 files using Okapi


Martin Wunderlich

Feb 22, 2017, 9:08:10 AM
to okapi-devel
Hi all,

I guess some people here are familiar with Rodolfo Raya's great tool XLIFFChecker (http://www.maxprograms.com/products/xliffchecker.html). I was wondering whether Okapi has similar functionality somewhere to validate XLIFF 1.2 files.
Thanks.

Cheers,

Martin
 

Yves Savourel

Feb 22, 2017, 9:52:32 AM
to okapi...@googlegroups.com

Hi Martin,

 

No, there is not. You can run XML validation using the XLIFF 1.2 schema, of course, but that won’t check anything beyond well-formedness and schema conformance.
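
A minimal sketch of that schema-only check, using the JDK's javax.xml.validation API rather than anything Okapi-specific. The class name XliffSchemaCheck is made up for illustration, and the first command-line argument is assumed to point at a local copy of the XLIFF 1.2 XSD (any schemas it imports must be resolvable too). It reports well-formedness and schema errors only, nothing beyond that.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of Okapi or XLIFFChecker.
final class XliffSchemaCheck {

    /** Returns one message per problem; an empty list means the document passed. */
    static List<String> validate(Source schema, Source document)
            throws SAXException, IOException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Validator validator = factory.newSchema(schema).newValidator();
        List<String> problems = new ArrayList<>();
        // Collect schema violations instead of stopping at the first one.
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { problems.add(format("warning", e)); }
            @Override public void error(SAXParseException e) { problems.add(format("error", e)); }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            validator.validate(document);
        } catch (SAXParseException e) {
            // The document is not well-formed, so parsing cannot continue.
            problems.add(format("fatal", e));
        }
        return problems;
    }

    private static String format(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the schema (assumed local copy), args[1]: path to the XLIFF file
        List<String> problems = validate(new StreamSource(new File(args[0])),
                                         new StreamSource(new File(args[1])));
        problems.forEach(System.out::println);
        System.out.println(problems.isEmpty() ? "valid" : problems.size() + " problem(s)");
    }
}
```

Collecting the errors in a list, rather than throwing on the first one, is a design choice here: it gives a full report per file, which is closer to how a checker tool is typically used.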

 

-ys

--
You received this message because you are subscribed to the Google Groups "okapi-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to okapi-devel...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Martin Wunderlich

Feb 22, 2017, 10:05:42 AM
to okapi-devel
OK, thanks, Yves.

The XLIFFChecker tool is published under the Eclipse Public License v1.0. Does anyone know whether this is compatible with the Okapi license? If so, we could integrate the code into Okapi. It could be a handy addition. The advantage would be that there are more potential maintainers in the Okapi project. The disadvantage would be that the code on the Okapi side would have to be kept in sync with the source. But since 1.2 is stable and the tool works very nicely, I don't see an issue here.
(I have already done some work locally to create a headless version of the XLIFFChecker, because originally the actual checking functionality was rather tied up with the GUI, but this never made it into the project.)

Cheers,

Martin

Jim Hargrave

Feb 22, 2017, 11:58:24 AM
to okapi...@googlegroups.com
I use xliffchecker quite often - I would welcome a library in Okapi and eventual integration into tikal. I would also welcome TMX and TBX checkers - with the full RelaxNG schema + schematron.

Jim

Mihai Nita

Feb 22, 2017, 3:47:21 PM
to Group: okapi-devel
I looked at xliffchecker, and it looks pretty easy to separate the UI from the checker proper.

Some bottlenecks:
* Uses Xerces, and I think Okapi does not
* To be really useful it would need to be published in maven repository (not mandatory, but nice)

I can reach out and contact the owner, if you want.


Mihai


Martin Wunderlich

Feb 23, 2017, 5:14:40 AM
to okapi-devel
Thanks, Jim and Mihai. Glad you like the idea. I know Rodolfo, so I can handle contacting him. However, I think we first need to clarify whether the licenses are compatible at all. If the code can't be included directly, then maybe an alternative solution would be to build a headless jar from the XLIFFChecker project, which can then be included as a Maven dependency. Again, I am not sure what the license restrictions in this case are - if any.
@Jim: Rodolfo has also created a TMX validator.

I'll try and make the call today so that we can discuss.

Cheers,

Martin

Mihai Nita

Feb 23, 2017, 9:56:06 AM
to Group: okapi-devel
The code is under Eclipse Public License v1.0

My "investigation" yesterday was into changing XLIFFChecker itself to separate a headless library from the UI part, not to use the code directly. This would help to keep things in sync.

I have the changes already; if Rodolfo wants, I can contribute them.

Mihai


Martin Wunderlich

Feb 28, 2017, 8:36:29 AM
to okapi-devel
Hi all,

I checked with Rodolfo and currently there are no plans to create an "official" headless version or to make it available via Maven. But we're free to use the XLIFFChecker code and include it in Okapi.
How about proceeding in the following way:
- include current state of XLIFFChecker as a separate project in the Okapi project
- turn it into a Maven project
- apply Mihai's or my changes to make it into headless version
- include a headless jar as a dependency, so that the XLIFFChecker can be used in Okapi

That way the code should be sufficiently separated to avoid some kind of maintenance nightmare. The project seems to be stable enough anyway, so I wouldn't expect too many changes that need to be carried over from the original project.
What do you think? Any suggestions or better ideas?

Cheers,

Martin

Jim Hargrave

Feb 28, 2017, 11:17:41 AM
to okapi...@googlegroups.com

A standalone project under the okapi umbrella makes sense to me. We could even make the project more general so that we can add other formats in the future. I'm sure there would be a lot of code sharing for schema validation.

Jim

Mihai Nita

Feb 28, 2017, 12:54:41 PM
to Group: okapi-devel
Yes, sounds reasonable.

I don't think we have anything like this though, an external project that we build from sources.
This is a first.
So I wonder what the location would be...
I was thinking something like <okapi_root>/third_party/xliffchecker ?

Mihai



Jim Hargrave

Feb 28, 2017, 1:08:18 PM
to okapi...@googlegroups.com

Maybe I misunderstood. Are we thinking of forking the code?

In that case it should be an independent bitbucket project under the okapiframework team. Our trend is to move more code out of the main okapi into separate projects - we have already done this for a number of sub-projects like longhorn. Hopefully the UI code will be next. At some point I would like to see "core" okapi with only filters, writers, segmentation and a few commonly used steps and libs.

Jim

Mihai Nita

Feb 28, 2017, 1:11:15 PM
to Group: okapi-devel
We might also need to decide what to do about overlaps...

We already have schema validation in lib-verification (net.sf.okapi.lib.verification.ValidateXliffSchema)
But XLIFFChecker uses the Xerces XMLCatalogResolver, and it seems to validate against multiple schemas.
So it looks like it is more powerful.

XLIFFChecker also validates the locale ID against language-subtag-registry.txt
Not sure if that is better or worse than what ICU does.
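
To illustrate what such a locale check involves, here is a minimal sketch using only the JDK. It is neither XLIFFChecker's registry lookup nor ICU's logic: java.util.Locale.Builder only checks that a tag is well-formed BCP 47 syntax, so a syntactically fine but unregistered tag still passes. The class name LanguageTagCheck is made up for illustration.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical helper class: a syntax-only check, not a registry lookup.
final class LanguageTagCheck {

    /** True if the tag is well-formed BCP 47; says nothing about registration. */
    static boolean isWellFormed(String tag) {
        // Locale.Builder treats null/empty as "reset", so reject those explicitly.
        if (tag == null || tag.isEmpty()) {
            return false;
        }
        try {
            new Locale.Builder().setLanguageTag(tag);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"de-DE", "zh-Hant-TW", "en_US", "xx-YY"}) {
            System.out.println(tag + " -> " + isWellFormed(tag));
        }
    }
}
```

A check against language-subtag-registry.txt goes one step further, since it can also reject subtags that are syntactically valid but not registered.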

We can probably keep things "as is" for now, and validate "both ways"

Mihai

Jim Hargrave

Feb 28, 2017, 1:25:18 PM
to okapi...@googlegroups.com

>>We can probably keep things "as is" for now, and validate "both ways"

+1 with the plan to deprecate lib-verification ValidateXliffSchema eventually and add more modern libraries and standards (e.g., RelaxNG/schematron) in the new project.

We should name the new project something generic like "Localization XML Validators" - then use the forked code to slowly add new validators - for example there is also a tmxchecker that probably uses similar code.

Jim

Mihai Nita

Feb 28, 2017, 1:26:22 PM
to Group: okapi-devel
I think it is a fork, whether we like it or not:
* we refactor the code to separate the UI from the "worker" part, so it is not just a mirror
* the owner has no interest to take our changes back in


I am also fine with a separate bitbucket project, sounds cleaner.

You think we should have a xliffvalidator bitbucket project, or a
third-party bitbucket project, where we can keep adding projects like this?
(hopefully not needed, but...)
Each bitbucket project adds maintenance overhead... keeping library versions in sync with other okapi projects, cloudbees projects, etc.


Mihai


Jim Hargrave

Feb 28, 2017, 1:32:35 PM
to okapi...@googlegroups.com

Personally I think a third-party project is too big - but common validators for xliff, tbx, tmx, srx, etc. sound like a good sweet spot.

Agreed the maintenance overhead is greater, but I think testing and deployment is simplified and more people can work independently across different projects. Less risk of breaking okapi core etc.

Jim

Mihai Nita

Feb 28, 2017, 2:00:29 PM
to Group: okapi-devel
After I spent some time on the xliffvalidator, I kind of doubt it can be easily adapted to deal with formats other than xliff.

I wonder if it is not cleaner / easier to start something from scratch, with a focus on reuse.
The current code hard-codes a lot of things, creates temp files, and does all kinds of other stuff that would need to change for a clean reuse.

Mihai

Jim Hargrave

Feb 28, 2017, 2:22:23 PM
to okapi...@googlegroups.com

>> I wonder it is not cleaner / easier to start something from scratch, with a focus on reuse.

+1

My hope is that any new project would rely more on RelaxNG/Schematron (xml schema/DTD is so old school :-)) - plus whatever custom checking we think is good.

In the near future I might have time to work on a new TBX validator - which could seed the new project if not started by then.

Another project to learn from is: https://sourceforge.net/p/tbxutil/git/ci/master/tree/

But this one is old, last commit 2012.

Jim

Martin Wunderlich

Feb 28, 2017, 2:43:44 PM
to okapi-devel
I think keeping other validation mechanisms in mind, e.g. for TMX, makes a lot of sense. I don't know how much effort it is to maintain the new project as a separate Bitbucket project, but this would also make sense, since it allows others to use the validation tool without having to include all of okapi. Should we call it something like "Okapi validation tools"? Should we aim to make it available through Maven Central, too?

Cheers,
Martin

Jim Hargrave

Feb 28, 2017, 3:31:11 PM
to okapi...@googlegroups.com

>>Should we call it something "Okapi validation tools"? Should we aim to make it available through Maven central, too?

That's a good name. Ideally we would have full cloudbees, integration test and maven central support, but that is optional in the short term, IMHO.

J

Jim Hargrave

Mar 1, 2017, 9:28:43 PM
to okapi...@googlegroups.com

For new okapi sub-projects the bitbucket pipelines plugin may be much easier than cloudbees. This could reduce our admin overhead and still give us automated testing and building. It's free for now - but may be $$$ in the future. Mihai has played with this for the okapi main project.

https://confluence.atlassian.com/bitbucket/get-started-with-bitbucket-pipelines-792298921.html

Jim



Chase Tingley

Mar 1, 2017, 9:38:33 PM
to okapi...@googlegroups.com
The bitbucket pipeline seems to work very well for the main project. What about artifact hosting, though?


Jim Hargrave

Mar 2, 2017, 1:59:31 PM
to okapi...@googlegroups.com

Looks like there are some workarounds for artifact hosting, but I'm not as worried about that initially for the sub-projects. I mostly would like to see things like unit and integration tests, code analysis, etc.

See: https://bitbucket.org/simpligility/ossrh-pipeline-demo

You are right, the build and unit tests are very easy to set up in Pipelines! I didn't realize this had been working for okapi.

Jim

Martin Wunderlich

Mar 3, 2017, 1:07:25 AM
to okapi-devel
I've logged this on bitbucket now as an enhancement:
https://bitbucket.org/okapiframework/okapi/issues/590/include-xliffchecker-code-headless-in-new

@Mihai, do you perhaps want to send me your code changes? Then I'll get started on setting up the new sub-project, if I can.

Cheers,

Martin

Mihai Nita

Mar 3, 2017, 1:15:00 PM
to Group: okapi-devel
Idea: can you start with the original code, untouched?
That way git can keep track of all changes.
I think this is what you proposed anyway...


- include current state of XLIFFChecker as a separate project in the Okapi project
- turn it into a Maven project
- apply Mihai's or my changes to make it into headless version
- include a headless jar as a dependency, so that the XLIFFChecker can be used in Okapi

I also have the "turn it into a Maven project" part

Mihai



Mihai Nita

Mar 3, 2017, 1:21:57 PM
to Group: okapi-devel
Oh, wait...
Are we going ahead with XLIFFChecker, or we start from scratch?

>> I wonder it is not cleaner / easier to start something from scratch, with a focus on reuse.

> +1
>
> My hope is that any new project would rely more on RelaxNG/Schematron
> (xml schema/DTD is so old school :-)) - plus whatever custom checking we think is good.
>
> In the near future I might have time to work on a new TBX validator - which could
> seed the new project if not started by then.
>
> Another project to learn from is: https://sourceforge.net/p/tbxutil/git/ci/master/tree/
>
> But this one is old, last commit 2012.



Martin Wunderlich

Mar 4, 2017, 2:55:05 AM
to okapi-devel
My suggestion would be to start with XLIFFChecker in its current state as the basis, because it already does a really good job of validating all sorts of stuff. Starting from scratch would mean having to plough through the XLIFF specs at a very detailed level and figuring out what needs to be validated.

Cheers,
Martin

Mihai Nita

Mar 6, 2017, 12:17:28 PM
to Group: okapi-devel
The "Okapi Validation Tools" repository is up:
    https://bitbucket.org/okapiframework/okapi-validation-tools

It is structured in a manner similar to okapi: a superpom and sub-projects.
xliffchecker is one of them, but the idea is to add more, as needed.

Builds with maven and the Bitbucket pipelines.

==

The library-proper is (almost) separated from the GUI.
It still builds into one single .jar, and there is a single .properties file with all the messages, GUI or LIB.
But the lib code uses an interface, and can be invoked from any other code.
The launch scripts are not updated, but running the resulting application "by hand" works like before.

I've contemplated splitting the GUI from the lib into different maven projects, and maybe merging them into one shadow jar.

But I still kind of hope that Rodolfo might want to "take it back"
So I tried to not diverge and only do what was really needed.

I kind of hate forking projects :-)
It feels like I'm not happy with the direction where the owner takes it and "I know better". And good projects with good owners don't deserve that...
Maybe we can try pinging Rodolfo and show him what it looks like now...

Cheers,
Mihai




Martin Wunderlich

Mar 7, 2017, 5:43:45 AM
to okapi-devel
Nice!! Thanks a lot, Mihai, for getting this project set up so quickly.

I understand your sentiments regarding the forking of this project. Then again, if it's not being maintained actively, I think it is also in the interest of the original developer to see the project come back to life and gain more maintainers and users.
I'll send Rodolfo a quick note to let him know that his XLIFFChecker has been included in Okapi.

Cheers,

Martin

Jim Hargrave

Mar 7, 2017, 12:47:06 PM
to okapi...@googlegroups.com

Good timing. I really need a good xliff 2 validator. We are starting to get xliff 2 that is non-conformant.

Does anyone know of a web based xliff 2 validator?

Jim


Yves Savourel

Mar 7, 2017, 12:55:18 PM
to okapi...@googlegroups.com

For XLIFF2: http://okapi-lynx.appspot.com/validation

With some file size limitation.

 

For large files you can use the Lynx tool (shipped with the Okapi XLIFF Toolkit), or just load the XLIFF2 file with XLIFFReader.

 

-ys

Jim Hargrave

Mar 7, 2017, 1:04:21 PM
to okapi...@googlegroups.com

Nice, I knew about this lynx tool but didn't know you had deployed to the web.

Is this something we want to port over to the validator project? I would like to have all the best/latest XML tools in one place with a consistent framework for all formats.

Do you know of  RelaxNG/Schematron files for xliff 2?

Jim

Yves Savourel

Mar 7, 2017, 1:26:18 PM
to okapi...@googlegroups.com

>> Is this something we want to port over to the validator project? I would like to have all the best/latest XML tools in one place with a consistent framework for all formats.

It would make sense. And it should be relatively easy.

>> Do you know of RelaxNG/Schematron files for xliff 2?

There is a set of schematron files planned for 2.1.
Which, hopefully, may be able to validate 2.0 files (but I’m not sure).

-ys
