> Thank you for posting this example! I had looked at the ITS specification,
> but not understood what it was really about and how it was meant to be
> used.
Note that this Glade example is an actual example used in the 'best practices' document, not something I came up with :)
> So if I understand it right, tools for extracting translatable strings and
> for merging back translated strings into XML documents could use this
> W3C ITS specification?
Yes, exactly. That is, for merging back you probably don't need it... But imagine this combined with xgettext, e.g. for extracting stuff from anything between ODF, XHTML, Glade, ts... The absolute path of a translation unit could be stored in the #: reference element, for example "/html/body/p[34]/table[3]/p", and be used as a locator when merging. The invocation could be something like "xgettext --its=myconfig.its mydoc.xml".
> There is no free implementation of it right now?
> An implementation of it would have to rely on XPath. For example, use
> libxml2.
> Right?
Yeah, the spec relies heavily on XPath expressions, and libxml2 is excellent for this. It should be possible to do a 'streaming' implementation that relies on XPath only for evaluating whether a given node is translatable/inline/comment etc., without loading the whole document into memory.
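To make the streaming idea a bit more concrete, here is a rough sketch in Python. It uses the standard library's ElementTree `iterparse` instead of libxml2, and a plain set of tag names stands in for the ITS rules that a real tool would evaluate with XPath. The names `iter_units` and `translatable` are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def iter_units(source, translatable):
    """Yield (absolute_path, text) for every element whose tag is in `translatable`.

    `translatable` is a hypothetical stand-in for ITS translate rules.
    """
    path = []             # one step like "p[2]" per currently open element
    counts = [Counter()]  # per open element: how many children of each tag so far
    open_units = 0        # number of translatable elements currently open
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            counts[-1][elem.tag] += 1
            path.append(f"{elem.tag}[{counts[-1][elem.tag]}]")
            counts.append(Counter())
            if elem.tag in translatable:
                open_units += 1
        else:
            if elem.tag in translatable:
                open_units -= 1
                text = "".join(elem.itertext()).strip()
                if text:
                    yield "/" + "/".join(path), text
            path.pop()
            counts.pop()
            if open_units == 0:
                # Drop text and children of elements outside any open unit.
                # The emptied element itself stays attached to its parent.
                elem.clear()


doc = b"<html><body><p>One</p><div>skip</div><p>Two <b>bold</b> end</p></body></html>"
for locator, text in iter_units(io.BytesIO(doc), {"p"}):
    print(locator, "->", text)
```

Inline children are kept until their enclosing unit is closed, so the unit's text is still complete when it is yielded. A nested translatable element would be reported once on its own and once as part of its ancestor; a real tool would have to decide how to handle that.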
One limitation with a PO-based implementation is of course the handling of inline elements.
Say you have the XML fragment:

  <para>Please email us at <email>i...@example.com</email>, or visit our website at <uri>http://www.example.com</uri>.</para>
Here, everything within para would become a msgid. However, we have no way of blocking translators from modifying the non-translatable email or uri elements... This could be noted in automatic comments by the extraction tool, though, and even be checked by msgfmt if we have the ITS configuration available...
A possible PO representation:
  #: //section/para[34]
  #. do not translate content within the <email> element
  #. do not translate content within the <uri> element
  #, xml-format
  msgid "Please email us at <email>i...@example.com</email>, or visit our website at <uri>http://www.example.com</uri>."
  msgstr ""
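As a rough illustration of how an extraction tool might produce such an entry, and how a msgfmt-like check might verify it, here is a small Python sketch using only the standard library. Everything in it is an assumption made for illustration: the `NO_TRANSLATE` set stands in for rules read from an ITS file, `po_entry` and `check` are made-up helper names, and the `xml-format` flag is the one proposed above, not an existing gettext flag.

```python
import xml.etree.ElementTree as ET

NO_TRANSLATE = {"email", "uri"}  # assumption: would come from the ITS rules


def inner_xml(elem):
    """Serialize the content of `elem` without its own start and end tags."""
    # ET.tostring() includes each child's tail text, so nothing is lost.
    return (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )


def po_escape(s):
    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def po_entry(elem, locator):
    """Build a PO entry with one automatic comment per protected inline tag."""
    lines = [f"#: {locator}"]
    seen = []
    for child in elem.iter():
        if child is not elem and child.tag in NO_TRANSLATE and child.tag not in seen:
            seen.append(child.tag)
            lines.append(
                f"#. do not translate content within the <{child.tag}> element"
            )
    lines.append("#, xml-format")  # hypothetical flag from the proposal
    lines.append(f'msgid "{po_escape(inner_xml(elem))}"')
    lines.append('msgstr ""')
    return "\n".join(lines)


def protected(fragment):
    """Return the (tag, text) pairs of all protected elements in a fragment."""
    root = ET.fromstring(f"<x>{fragment}</x>")
    return sorted(
        (e.tag, "".join(e.itertext())) for e in root.iter() if e.tag in NO_TRANSLATE
    )


def check(msgid, msgstr):
    """True if the translation left the protected elements untouched."""
    return protected(msgid) == protected(msgstr)


para = ET.fromstring(
    "<para>Please email us at <email>i...@example.com</email>, or visit our "
    "website at <uri>http://www.example.com</uri>.</para>"
)
print(po_entry(para, "//section/para[34]"))
```

The check only compares the protected elements and their text, so a translator is still free to reorder them within the sentence.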
> [: Asgeir Frimannsson :]
> [...] the absolute path for a translation unit could be stored in the #:
> reference elem, for example "/html/body/p[34]/table[3]/p" and be used as a
> locator when merging...
Notwithstanding the main line of the discussion, about which I know too little to add anything, this particular bit I do not like. The source reference should be a source reference: a link to a particular file and line, should the translator wish to venture there for more context.
Instead, I'd put the document-tree path as another automatic comment (#.), with a certain prefix to indicate it as such.
On Fri, Mar 14, 2008 at 11:50 PM, Chusslove Illich <caslav.i...@gmx.net> wrote:
> > [: Asgeir Frimannsson :]
> > [...] the absolute path for a translation unit could be stored in the #:
> > reference elem, for example "/html/body/p[34]/table[3]/p" and be used as a
> > locator when merging...
>
> Notwithstanding the main line of the discussion, which I know little of to
> add anything, this particular bit I do not like. The source reference should
> be a source reference; a link to a particular file and line should the
> translator wish to venture there for more context.
>
> Instead, I'd put the document-tree path as another automatic comment (#.),
> with a certain prefix to indicate it as such.
Well, yes, the link to the source *file* should be there somewhere. But with XML, the absolute path to an element is much more precise than a line number, and it is transferable. Imagine e.g. an XML file with all content on one long line.
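A small Python sketch of the one-long-line case, using the standard library's ElementTree: a line number would be useless for this document, but a path still finds the element. The helper `locate` is made up for illustration, and it relies on the limited XPath subset that ElementTree supports (child steps and positional predicates), not on full XPath.

```python
import xml.etree.ElementTree as ET


def locate(root, abs_path):
    """Resolve a path like /html/body/p[2] against `root` (hypothetical helper)."""
    steps = abs_path.strip("/").split("/")
    # The first step names the root element itself; a position on it is ignored.
    if steps[0].split("[")[0] != root.tag:
        return None
    if len(steps) == 1:
        return root
    # Element.find() only accepts relative paths, so pass the remaining steps.
    return root.find("/".join(steps[1:]))


# Everything on a single line, so every element is on "line 1".
doc = "<html><body><p>first</p><p>second</p><table><tr><td><p>cell</p></td></tr></table></body></html>"
root = ET.fromstring(doc)

for locator in ("/html/body/p[2]", "/html/body/table/tr/td/p", "/html/body/p[3]"):
    found = locate(root, locator)
    print(locator, "->", None if found is None else found.text)
```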
Having both is of course ideal. I've done XML processing before where we needed the line number and byte offset/length for each element, and it's a very tricky business to combine with the standard XML processing tools. But I'd be very happy to be proven wrong here :)