Sahana translation questions and comments :)


Gavin Treadgold

Jul 21, 2008, 4:56:52 AM
to Sahana developers' list, sahana-lo...@googlegroups.com
Hi all,

Gees, I first dug into translation over a month ago, and I'm only just
getting around to posting my comments and questions out. Sorry about
that :(

Anyway, I spent an evening translating about 15-20% of the English
(UK) translation to get a better feel for the process, and came away
with quite a few questions. I've cc'd this post to both the
localisation list and maindev as there is a mixture of comments and
questions here that I'd like advice on.

These are in no particular order...

* quite a few strings with leading/trailing space?
* how do we know which strings should end in a full stop?

I noted that there were quite a few strings that appeared to have 1 or
more leading/trailing spaces - is this appropriate? If these are
required, I'm guessing that they could be easily missed during
translation. Should all leading/trailing spaces be removed from
strings to make translation easier, with the spacing moved into the code?
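
One way to see how widespread the padding is would be a small check over the
message IDs. This is only a sketch in Python: the function name and the sample
strings are made up for illustration, and reading the real catalogue is left out.

```python
def padded_msgids(msgids):
    """Return the message IDs that start or end with whitespace."""
    return [msgid for msgid in msgids if msgid != msgid.strip()]

# Hypothetical sample strings, not taken from the Sahana catalogue.
sample = ["Latitude", " Longitude", "Name: ", "Save"]
flagged = padded_msgids(sample)

for msgid in flagged:
    # repr() makes the otherwise invisible padding visible in the report.
    print(repr(msgid))
```

Printing with repr() is deliberate: the quotes show exactly where the extra
spaces are, which is the part a translator is most likely to miss.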


* should <br> \n \t ' ' be hard coded into strings? will this affect
rendering on, say, phones?
* should ':' or '"' be included in the strings, or should the programmer
add them after the string, e.g. to avoid duplicate strings such as
'Test' & 'Test:'?

I saw a number of 'special' characters encoded in the strings that
didn't seem appropriate for translation, and I thought these would
have had a better home in the code?
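
As a rough illustration of keeping punctuation out of the catalogue, the sketch
below translates the bare word and adds the colon in code. It is written in
Python rather than Sahana's PHP, and field_label and heading are hypothetical
helpers, not existing Sahana functions.

```python
import gettext

# With no catalogue installed, gettext.gettext returns the msgid unchanged,
# which is enough to show the pattern.
_ = gettext.gettext

def field_label(msgid):
    """Translate the bare word, then add the colon and spacing in code."""
    return _(msgid) + ": "

def heading(msgid):
    """Translate the same bare word for use without punctuation."""
    return _(msgid)

label = field_label("Test")   # "Test: " when no translation is installed
title = heading("Test")       # "Test"
```

Both calls look up the single msgid 'Test', so the translator sees it once.
One caveat: punctuation rules differ between languages (French, for example,
puts a space before the colon), so a real implementation might instead keep the
punctuation pattern itself as one translatable format string.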


* some duplicate strings e.g. 'Latitude/Longitude is'
* why do we have 6 strings for 'Please enter the name for Level[123]'
and Level[123]

I saw multiple copies of strings such as Latitude and Longitude.
Surely we only need to have these once?


* strings such as 'mod_admin_ims_level2'

Do strings such as this need to be in the language db?


* we should avoid shorthand such as L10N - we know what it is, but we
can't expect everyone to
* capitalisation - what is the policy on what is/isn't capitalised?
* Is it language or locale? We can't use both
* consistent error messages e.g. www/temp path not writeable vs
Modules directory is not writeable.

We appear to be referring to both language and locale - can we pick one
term and stick with it? :)
We need a consistent approach to writing error messages.


* need consistent verbs e.g. click or press?

Inconsistent use of verbs for actions. We should pick some terms and
stick with them to keep it simple.


* policy for / separator e.g. ' / ' or '/'

E.g. should the words be separated by a space and the forward-slash,
or should they run together? Whatever - we should be consistent.


* Strings not consistent with menu structure

Some of the words used in strings seemed inconsistent with the menu
structure e.g. when referring to commands/instructions.


* can Pootle identify duplicate strings?

Does Pootle have the ability to identify duplicate strings to help
developers reuse the same string?


* duplicate 'cannot be negative' or 'is not a valid number'.

Sorry - I can't remember this one now, that will teach me for not
sending this out sooner ;)
There might have been multiple strings that were the same.


* should all-caps be stored in string, or done programmatically

I recall finding some strings that were all-caps - do they need to be
stored that way, or should all-caps be limited to abbreviations only?


* shortening of words e.g. admin, config

Should we be using contracted words, if so, which ones are allowable?


Anyway, I want to raise this for further discussion, but I'm thinking
that perhaps between the next stable 0.6.2 release and 0.7.0 we may
need to have a dedicated effort to:

1. Develop a policy to ensure we have better consistency in our
approach to language in Sahana strings.

2. Once the policy is approved, review strings and update code to ensure
that the strings going into 0.7.0 are more consistent (and that we are
not making more work for the translators than we need).

Oh, and I found a few strings that perhaps we should look at updating
grammar-wise. What is the best way to go about this, given that I'm
suggesting changing the root string at some point (after 0.6.2
probably)?

Also, how do we deal with strings for a particular module? Do modules
have their own files, or is it one monolithic string file? Where I'm
heading with this comment is I've been thinking about Haniffa's Needs
Assessment module and the fact that we may have multiple surveys to
record. I guess what I'm asking is: if we create or update a module
during deployment, can we update just the strings for that module?

I think that's enough questions from me for the time being... I should
let the dust settle a bit :)

Cheers Gav

Dominic

Jul 21, 2008, 7:04:37 AM
to Sahana Localization
Hello Gavin,

most of the questions should be directed to the Sahana developers,
since they assemble the message catalogue while the translators just
translate it.

Nevertheless, I'll try some answers here:

1. Trailing/leading/additional whitespace, as well as escape
sequences, special characters and HTML codes, have to be included in the
translated message as they are, since they affect the
formatting/appearance of the UI.

2. Having a consistent terminology and using 'user-friendly'
(comprehensible) labels, messages and phrases is a task
for the developers. Translators _might_ use more comprehensible
alternatives in their target language, but I think the originals from
the template file should be suitable as well.

3. There are in fact two additional Pootle features to ensure
consistent re-use of already translated strings or parts of phrases,
and I have already considered adding them to this service:

a) Terminology hinting - means that suggestions for the translation of
certain words (taken from a word list) are displayed while you work on
the Pootle web interface. Terminology hinting would even recognize
similar or similar-sounding words.

b) Translation memory - this would repeat translations of words and
phrases which have already been translated in this project.
Translation memory is dynamic and will be compiled from the actual work.

4. Similar or even duplicate messages in the PO template are produced by
the programmers (!), and they mean additional workload for the
translators - these should be avoided if feasible. Also, there are a
few "unused" messages in the .pot file that have remained from prior
versions, so it is important to "clean up" the PO template from time
to time. But, as said, this is a task for the developers.

5. Up to now, Sahana comes with one large message catalogue. This
might be an advantage if different modules use the same messages, but
it makes the maintenance of this large file quite complex and
inefficient. I would suggest partitioning the message catalogue into
module-related parts of 100-200 strings each (this makes it much
easier to maintain the files and keep them consistent and clean). But
this is a decision the Sahana developers should make, not the
translators.
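
To give a feel for what such a partition could look like, here is a minimal
Python sketch that groups message IDs by module using the "#:" source
references of a PO template. The sample entries, the file paths and the
mod/<module>/ layout are assumptions for illustration only, and multi-line
msgids are not handled.

```python
import re
from collections import defaultdict

# Hypothetical template excerpt; paths and strings are made up.
SAMPLE_POT = '''\
#: mod/admin/main.inc:12
msgid "Language"
msgstr ""

#: mod/gis/map.inc:40 mod/admin/conf.inc:7
msgid "Latitude"
msgstr ""
'''

def msgids_by_module(pot_text):
    """Map each module name to the set of msgids referenced from its files."""
    groups = defaultdict(set)
    for entry in pot_text.strip().split("\n\n"):
        match = re.search(r'^msgid "(.*)"$', entry, re.MULTILINE)
        if not match:
            continue
        # Only look at the "#:" reference lines, not at the message text.
        refs = " ".join(line[2:] for line in entry.splitlines()
                        if line.startswith("#:"))
        modules = set(re.findall(r"(?:^|\s)mod/([^/\s]+)/", refs))
        # Entries without a recognisable module go into a shared bucket.
        for module in modules or {"shared"}:
            groups[module].add(match.group(1))
    return dict(groups)

by_module = msgids_by_module(SAMPLE_POT)
```

In this sample 'Latitude' lands in both the admin and gis groups, which shows
the trade-off mentioned above: splitting by module means shared messages are
either duplicated or need a common file.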

Regards,
Dominic

Jorge Maturana

Jul 21, 2008, 5:32:55 PM
to sahana-lo...@googlegroups.com
Hello all,

Just an idea: define a role of "style writer" between programmer and translators.

At present, programmers are more focused on code, so they probably aren't much concerned about writing messages in a strictly coherent way, or correcting typos. On the other side, translators receive the .po file already generated, so they have no chance to correct messages in the source (and the errors are propagated to all translated languages).

The idea is that someone (maybe the same programmer, but not in 'programming mode') takes a look at the code BEFORE generating the .pot file and:

- correct typos

- detect duplicated strings (such as "login", "login " and "login: ")

- avoid personal directions (such as "we are collecting this information because..." instead of "The information is collected because...")

- simplify them (eg. things like "Please insert the name of the victim and the address of the victim in the fields above for completing information in the Sahana missing person module. If any problem occurs, please contact the administrator of your Sahana system" instead of "Please insert the name and address of the victim. In case of problem, contact Sahana administrator")

- detect possibly difficult messages (eg. "Report informations if displaced person to be found")

- eliminate things like "ebdcnt".

- other things that Gavin pointed out before

This would save translators a lot of time and help to build a more coherent vocabulary.
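
The duplicate check in the list above could be partly automated. The Python sketch below groups strings that differ only in case, surrounding whitespace or a trailing ':' or '.'; the normalisation rule and the function names are assumptions, and a real check would read the message IDs from the template file.

```python
from collections import defaultdict

def normalise(msgid):
    """Trim whitespace, drop a trailing ':' or '.', and lower-case the string."""
    return msgid.strip().rstrip(":.").strip().lower()

def near_duplicates(msgids):
    """Group strings whose normalised form is identical."""
    groups = defaultdict(list)
    for msgid in msgids:
        groups[normalise(msgid)].append(msgid)
    # Keep only the groups that really contain more than one variant.
    return {key: variants for key, variants in groups.items()
            if len(variants) > 1}

sample = ["login", "login ", "login: ", "Latitude", "Test", "Test:"]
duplicates = near_duplicates(sample)
```

Here the three "login" variants end up in one group and 'Test' / 'Test:' in another, while 'Latitude' is left alone. A person would still have to decide which variant to keep.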

Regards,

Jorge

Chamindra de Silva

Jul 22, 2008, 3:07:33 AM
to sahana-lo...@googlegroups.com, Sahana developers' list
Our PHP coding convention has some guidelines for gettext strings.

http://wiki.sahana.lk/doku.php?id=dev:php_coding_convention#localization

As part of this effort, and to encourage the developers, it would be valuable if you could recommend modifications and additions to this convention.
--
Chamindra de Silva
http://chamindra.googlepages.com