Gettext and ITS (was Re: Feature proposal: string extracting by RegExp for xgettext)
Newsgroups: gnu.utils.bug
From: "Asgeir Frimannsson" <asge...@gmail.com>
Date: Fri, 14 Mar 2008 22:13:29 +1000
Local: Fri, Mar 14 2008 8:13 am
Subject: Gettext and ITS (was Re: Feature proposal: string extracting by RegExp for xgettext)
Hi Bruno,

Note that this Glade example is an actual example used in the 'best
practices' document, not something I came up with :)

> So if I understand it right, tools for extracting translatable strings and
> for merging back translated strings into XML documents could use this
> W3C ITS specification?

Yes, exactly. That is, for merging back you probably don't need it... But
imagine this combined with xgettext, e.g. for extracting strings from anything
from ODF through XHTML to Glade, TS... The absolute path of a translation unit
could be stored in the #: reference elem, for example
"/html/body/p[34]/table[3]/p", and be used as a locator when merging...
Something like "xgettext --its=myconfig.its mydoc.xml".
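
To make the idea concrete, here is a minimal sketch in Python (standard library only) of storing an element's absolute path in the #: reference line. It is not how xgettext works; the function names, the sample document and the rule "every <p> is translatable" are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def absolute_paths(root):
    """Yield (path, element) pairs in document order.

    A 1-based index is added only where an element has same-named siblings.
    """
    def walk(elem, path):
        yield path, elem
        # Count children per tag so we know when an index is needed.
        totals = {}
        for child in elem:
            totals[child.tag] = totals.get(child.tag, 0) + 1
        seen = {}
        for child in elem:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            step = child.tag
            if totals[child.tag] > 1:
                step += "[%d]" % seen[child.tag]
            yield from walk(child, path + "/" + step)
    yield from walk(root, "/" + root.tag)

def extract(xml_text, translatable_tags):
    """Return (path, text) for each non-empty element whose tag is translatable."""
    root = ET.fromstring(xml_text)
    entries = []
    for path, elem in absolute_paths(root):
        text = (elem.text or "").strip()
        if elem.tag in translatable_tags and text:
            entries.append((path, text))
    return entries

def to_po(entries):
    """Format entries as PO text, using the path as the #: reference."""
    lines = []
    for path, text in entries:
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append("#: %s" % path)
        lines.append('msgid "%s"' % escaped)
        lines.append('msgstr ""')
        lines.append("")
    return "\n".join(lines)

# Hypothetical sample document, not taken from the thread.
DOC = "<html><body><p>Hello</p><p>World</p><div><p>Nested</p></div></body></html>"
ENTRIES = extract(DOC, {"p"})
print(to_po(ENTRIES))
```

When merging, the same path could be evaluated against the source document to find the element that receives the translation, as long as the document structure has not changed in between.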

> There is no free implementation of it right now?

There are a couple: http://www.w3.org/International/its/links.html

Rainbow (mono/.net) is LGPL, so is Spritser.

> An implementation of it would have to rely on XPath. For example, use
> libxml2.
> Right?

Yeah, the spec relies heavily on XPath expressions, and libxml2 is excellent for
this. It should be possible to do a 'streaming' implementation that relies
on XPath only for evaluating whether a given node is translatable/inline/comment
etc., without loading the whole document into memory.
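
A rough sketch of what such a streaming pass might look like, in Python rather than libxml2. It understands only selectors of the form "//name", which is a tiny subset of XPath, and it assumes that the translate flag is inherited from the parent element. The rule table and the sample document are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical rule table: (selector, translate?). Only the "//name" form
# is understood here; real ITS selectors are full XPath expressions.
RULES = [("//code", False), ("//email", False)]

def matches(selector, stack):
    """True if the innermost open element matches a '//name' selector."""
    return selector.startswith("//") and stack[-1] == selector[2:]

def translatable_flags(stream, rules, default=True):
    """Yield (path, translate) for every element as its start tag is seen."""
    stack, flags = [], []
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            # Assumption: the flag is inherited from the parent element.
            flag = flags[-1] if flags else default
            for selector, value in rules:  # later rules win
                if matches(selector, stack):
                    flag = value
            flags.append(flag)
            yield "/" + "/".join(stack), flag
        else:
            stack.pop()
            flags.pop()
            elem.clear()  # drop content that is no longer needed

DOC = "<doc><p>Run <code>make <b>all</b></code> now</p><email>x</email></doc>"
RESULT = list(translatable_flags(io.BytesIO(DOC.encode("utf-8")), RULES))
for path, flag in RESULT:
    print(path, "translate" if flag else "do not translate")
```

Only the stack of open elements is kept between events, so memory use depends on nesting depth rather than document size. Selectors that look ahead or at following siblings would not fit this scheme.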

cheers,
asgeir


Newsgroups: gnu.utils.bug
From: "Asgeir Frimannsson" <asge...@gmail.com>
Date: Fri, 14 Mar 2008 22:42:19 +1000
Local: Fri, Mar 14 2008 8:42 am
Subject: Re: Gettext and ITS (was Re: Feature proposal: string extracting by RegExp for xgettext)

One limitation with a PO-based implementation is of course the
handling of inline elements.

For example:

Specify non-translatable elements: <its:translateRule translate="no"
selector="//d:email|//d:uri"/>
Specify inline elements: <its:withinTextRule withinText="yes"
selector="//d:email|//d:uri"/>

Say you have the xml fragment:
<para>Please email us at <email>i...@example.com</email>, or visit our
website at <uri>http://www.example.com</uri>.</para>

Here, everything within para would become a msgid. However, we have no
way of blocking translators from modifying the non-translatable email
or uri elements... This could, however, be noted in automatic comments by
the extraction tool, and even be checked by msgfmt if we have the ITS
configuration available...

A possible PO representation:

#: //section/para[34]
#. do not translate content within the <email> element
#. do not translate content within the <uri> element
#, xml-format
msgid ""
"Please email us at <email>i...@example.com</email>, or visit "
"our website at <uri>http://www.example.com</uri>."
msgstr ""
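
The kind of check meant here could be sketched as follows. This is not an existing msgfmt feature; the function names are invented, the protected tag list is hard-coded instead of being read from an ITS file, and the email address and German translations are made-up sample data.

```python
import re

# Tags taken from the translateRule selector above; hard-coded for the sketch.
PROTECTED_TAGS = ("email", "uri")

def protected_spans(text, tags=PROTECTED_TAGS):
    """Return every <tag>...</tag> span for the protected tags, in order."""
    pattern = "|".join(r"<%s>.*?</%s>" % (t, t) for t in tags)
    return re.findall(pattern, text, flags=re.S)

def check_entry(msgid, msgstr):
    """Return the protected spans of msgid that are missing or altered in msgstr."""
    if not msgstr:  # untranslated entries are not an error
        return []
    translated = protected_spans(msgstr)
    return [span for span in protected_spans(msgid) if span not in translated]

# Hypothetical sample data.
MSGID = ("Please email us at <email>support@example.org</email>, or visit "
         "our website at <uri>http://www.example.com</uri>.")
GOOD = ("Schreiben Sie uns an <email>support@example.org</email> oder besuchen "
        "Sie <uri>http://www.example.com</uri>.")
BAD = ("Schreiben Sie uns an <email>hilfe@example.org</email> oder besuchen "
       "Sie <uri>http://www.example.com</uri>.")

print(check_entry(MSGID, GOOD))
print(check_entry(MSGID, BAD))
```

A regular expression is enough for flat inline elements like these; nested or attribute-bearing elements would need a real XML parser.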

cheers,
asgeir


Newsgroups: gnu.utils.bug
From: Chusslove Illich <caslav.i...@gmx.net>
Date: Fri, 14 Mar 2008 14:50:43 +0100
Local: Fri, Mar 14 2008 9:50 am
Subject: Re: Gettext and ITS (was Re: Feature proposal: string extracting by RegExp for xgettext)

> [: Asgeir Frimannsson :]
> [...] the absolute path for a translation unit could be stored in the #:
> reference elem, for example "/html/body/p[34]/table[3]/p" and be used as a
> locator when merging...

Notwithstanding the main line of the discussion, about which I know too little
to add anything, this particular bit I do not like. The source reference should
be a source reference; a link to a particular file and line should the
translator wish to venture there for more context.

Instead, I'd put the document-tree path as another automatic comment (#.),
with a certain prefix to indicate it as such.

--
Chusslove Illich (Часлав Илић)

Newsgroups: gnu.utils.bug
From: "Asgeir Frimannsson" <asge...@gmail.com>
Date: Sat, 15 Mar 2008 06:56:42 +1000
Local: Fri, Mar 14 2008 4:56 pm
Subject: Re: Gettext and ITS (was Re: Feature proposal: string extracting by RegExp for xgettext)

On Fri, Mar 14, 2008 at 11:50 PM, Chusslove Illich <caslav.i...@gmx.net> wrote:
> > [: Asgeir Frimannsson :]
> > [...] the absolute path for a translation unit could be stored in the #:
> > reference elem, for example "/html/body/p[34]/table[3]/p" and be used as a
> > locator when merging...
>
> Notwithstanding the main line of the discussion, about which I know too little
> to add anything, this particular bit I do not like. The source reference should
> be a source reference; a link to a particular file and line should the
> translator wish to venture there for more context.
>
> Instead, I'd put the document-tree path as another automatic comment (#.),
> with a certain prefix to indicate it as such.

Well, yes, the link to the source *file* should be there somewhere.
But with XML, the absolute path to an element is much more precise
than a line number, and transferable. Imagine e.g. an XML file with
all content on one long line.

Having both is of course ideal. I've been doing XML processing before where
we needed the line number and byte offset/length for the element, and
it's a very tricky business to combine with the standard XML
processing tools. But I'd be very happy to be proven wrong here :)
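
For what it's worth, a low-level parser does expose both pieces of information. The sketch below uses Python's xml.parsers.expat to record the absolute path together with the line and column of each start tag, and formats them with the path in an automatic comment as suggested earlier in the thread. The "xpath:" prefix, the function names and the file name are assumptions for illustration, and this covers only start-tag positions, not byte lengths.

```python
import xml.parsers.expat

def locate_elements(xml_text):
    """Return (path, line, column) for each start tag, in document order."""
    parser = xml.parsers.expat.ParserCreate()
    stack = []  # one (path, child_counts) pair per open element
    found = []

    def start(name, attrs):
        if stack:
            parent_path, counts = stack[-1]
            counts[name] = counts.get(name, 0) + 1
            path = "%s/%s[%d]" % (parent_path, name, counts[name])
        else:
            path = "/" + name
        # Position of the event that is currently being reported.
        found.append((path, parser.CurrentLineNumber, parser.CurrentColumnNumber))
        stack.append((path, {}))

    def end(name):
        stack.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return found

def po_comments(filename, path, line):
    """Path as an automatic comment (hypothetical prefix), file:line as reference."""
    return "#. xpath: %s\n#: %s:%d" % (path, filename, line)

ONE_LINE = "<doc><p>a</p><p>b</p></doc>"
MULTI_LINE = "<doc>\n<p>a</p>\n<p>b</p>\n</doc>"

LOCATED_ONE = locate_elements(ONE_LINE)
LOCATED_MULTI = locate_elements(MULTI_LINE)
for path, line, column in LOCATED_ONE:
    print(po_comments("mydoc.xml", path, line))
```

In the one-line document every element reports the same line number, so only the path (or the column) tells the two <p> elements apart, which is the case described above.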

cheers,
asgeir

