Re: i18n and Pootle progress


Bryan Berry

May 3, 2010, 12:49:20 AM
to Peter Gijsels, Kenny Meyer, David Farning, Bernie Innocenti, karmajs, Ashok Basnet
cc'ing karma.js google group

On Mon, May 3, 2010 at 12:41 AM, Peter Gijsels <peter....@gmail.com> wrote:
Hi Kenny, Bryan

See my comments inline below.

On Sun, May 2, 2010 at 9:27 AM, Kenny Meyer <knny...@gmail.com> wrote:
> Karma team,
>
> I have been continuing my work on Internationalisation.
>
> I have done what Peter suggested: I started translating the lessons.
>
> This was the point where I really investigated a lot about Pootle and gettext,
> and the documentation is extensive. [1]
>
> I noticed that in *all* lessons the tags of all translatable elements weren't
> annotated with the class attribute `class="translate"`. So the html2po script
> is useless, as it parses for this attribute.

Yes. This hasn't happened yet. The goal was first to get the lessons
working in Nepali. Bryan and I developed the scripts on a sample
lesson, but we didn't prepare the lessons for translation.
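
As a rough illustration of what "parsing for this attribute" means, here is a minimal sketch that collects the text of every element carrying the `translate` class. It is not the html2po script from the i18n/ folder: that script uses BeautifulSoup, while this sketch uses Python's standard `html.parser` so it runs without extra packages. The names `TranslateExtractor` and `extract_msgids` are made up for the example.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class TranslateExtractor(HTMLParser):
    """Collect the text of every element whose class list contains 'translate'."""

    def __init__(self):
        super().__init__()
        self.msgids = []   # extracted strings, in document order
        self._depth = 0    # > 0 while inside a class="translate" element
        self._chunks = []  # text pieces of the element being collected

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth or "translate" in classes:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace so indentation does not leak into the msgid.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.msgids.append(text)
            self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_msgids(html):
    """Return the translatable strings found in an HTML document."""
    parser = TranslateExtractor()
    parser.feed(html)
    parser.close()
    return parser.msgids
```

An element without the class is simply skipped, which is why lessons that were never annotated produce no msgids at all.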

> By "the script" or "html2po" I mean the script in the i18n/ folder of the
> repository.
>
> So my goal was to do that with all Grade 2 English lessons to start with. See
> my attached diff-file.

At first sight, that looks good.

>
> This is what I'm going to have to prioritize next:
>
> html2po.py enhancements:
> ========================
>
> 1) Handle keywords:
>
>        <meta name="keywords" content="javascript,html5,sugar,sugarlabs,gsoc,ole,nepal,Vocabulary, Birds, English" />
>
>   I suppose that the keywords should be translated, too?

I don't think this is high priority. As far as I understand, these
keywords were added for search-engine optimisation. But as far as I
know no major search engine takes them into account any longer. I
would be in favor of removing them altogether. So you can leave that
untranslated for now.

+1, keywords really aren't critical. They are for search engine optimization, which we don't actually need.
 
> 2) Make translatables out of tags with `title` attributes:
>
>        <div title="Back" id="linkBackLesson" class="linkBack">
>
>   This is a tricky part, but I think this is high-priority. So this will
>   probably be one of the *very next* things to do.
>
>   Any ideas on how those should look in the PO file?

Good catch, we didn't think of that.

The problem I see is deciding whether or not to add the value of the
title attribute (in this case "Back") to the keys in the PO file. We
could add it when the div has class 'translate', but then you cannot
deal with the case where the title needs translation but the contents
of the div do not, or vice versa. Maybe add a class translateTitle?

Is there any other problem I'm missing?

I may be missing something, but I really think each message stored in a PO file should have a msgctxt (message context). This would vastly reduce conflicts between identical strings. IMHO, the msgctxt should be the file that the string came from, perhaps w/ a suffix of the element ID to make it even more unique.

for example,

msgctxt "index.html"
msgid "Start"
msgstr "Empiezo"

msgctxt "ui.gamebuttons.js"
msgid "Start"
msgstr "Inicia"
 
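In the gettext PO format the context keyword is spelled `msgctxt`, and an entry is identified by the pair of context and msgid. A minimal sketch of how such entries could be written out follows. The helper names and the "file name plus `#element-id`" context scheme are assumptions for illustration, not something the existing scripts do.

```python
def po_escape(text):
    """Escape a string for use inside a double-quoted PO value."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def po_entry(msgid, msgstr="", context=None):
    """Format one PO entry; msgctxt is emitted only when a context is given."""
    lines = []
    if context is not None:
        lines.append('msgctxt "%s"' % po_escape(context))
    lines.append('msgid "%s"' % po_escape(msgid))
    lines.append('msgstr "%s"' % po_escape(msgstr))
    return "\n".join(lines)


def context_for(filename, element_id=None):
    """Hypothetical context scheme: file name plus optional '#element-id' suffix."""
    return filename if element_id is None else "%s#%s" % (filename, element_id)


# The same source string can now carry two different translations.
print(po_entry("Start", "Empiezo", context_for("index.html")))
print()
print(po_entry("Start", "Inicia", context_for("ui.gamebuttons.js")))
```

The trade-off is that a string shared by many files has to be translated once per context, so the suffix should probably only be as specific as needed.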

> 3) Handle <title> tags
>
>        <title class="translate">An English Title</title>
>
>   Mysteriously this doesn't seem to get translated by the html2po script, at
>   the moment.

This is strange. It seems to be working here on a small example. Do
you have an example where it doesn't work?

> 4) [LOW PRIORITY] Show file path in the PO file reference comment and position
>   of the translatable.
>
>   Showing the file path is not the tricky part, but showing the position (line
>   number) is currently not supported by the BeautifulSoup Parser.
>
>   See here:
>
>   - http://groups.google.com/group/beautifulsoup/browse_thread/thread/58fc89c6d5ae6b84/218edc54598f8609?lnk=gst&q=line+number#218edc54598f8609
>   - http://groups.google.com/group/beautifulsoup/browse_thread/thread/2b85fcede4814982/e815185482bd65fa?lnk=gst&q=line+number#e815185482bd65fa
>
>        Well, there are some "quick and dirty" patches, but 1) I don't want to
>        waste my time on that at the moment and 2) This doesn't seem all too
>        important to me.

I agree with you that it isn't that important and you should focus on
other things first.
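
For whenever this does become a priority: one possible route, sketched below, is Python's standard `html.parser`, whose `getpos()` method reports the line and column of the tag currently being handled. This swaps in a different parser than the BeautifulSoup one the script uses, so treat it as an option to evaluate rather than a drop-in patch. `PositionedExtractor` and `po_with_references` are hypothetical names, and msgid escaping is left out to keep the sketch short.

```python
from html.parser import HTMLParser


class PositionedExtractor(HTMLParser):
    """Record (line, text) for text found inside class="translate" tags."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._open_line = None  # line of the most recently opened translate tag

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "translate" in classes:
            # getpos() returns (line, column) of the tag being handled; lines start at 1.
            self._open_line = self.getpos()[0]

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current translate element.
        self._open_line = None

    def handle_data(self, data):
        text = " ".join(data.split())
        if self._open_line is not None and text:
            self.found.append((self._open_line, text))
            self._open_line = None


def po_with_references(path, html):
    """Build PO entries with '#: file:line' reference comments."""
    parser = PositionedExtractor()
    parser.feed(html)
    parser.close()
    entries = []
    for line, text in parser.found:
        entries.append('#: %s:%d\nmsgid "%s"\nmsgstr ""' % (path, line, text))
    return "\n\n".join(entries)
```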

>
> WORKAROUNDS:
> ============
>
> I have encountered a lot of sections looking like this:
>
>        <div id="lesson_title">
>            <img src="../../assets/image/title_block_lt.png" width="33" height="75" align="absmiddle" />
>                Vocabulary Body Parts
>            <img src="../../assets/image/title_block_rt.png" width="33" height="75" align="absmiddle" />
>        </div>
>
> Now, when applying the html2po.py script you can imagine that the `msgid`
> will look very ugly if I leave the script as it is.
>
> My workaround using <span>s:
>
>        <div id="lesson_title">
>            <img src="../../assets/image/title_block_lt.png" width="33" height="75" align="absmiddle" />
>                <span class="translate">Vocabulary Body Parts</span>
>            <img src="../../assets/image/title_block_rt.png" width="33" height="75" align="absmiddle" />
>        </div>
>
> You will see this in the diff.
>
> Is this acceptable? If not, please provide a suggestion.

For me this is a good solution.

> NOTES:
> ======
>
> Reminder:
> My goal was to do that with all Grade 2 English lessons to start with. The
> rest will eventually follow.
>
> Lessons where the tags of the translatable items are annotated with 'class="translate"'
> (currently only the index.html files):
> - 2_English_alphabetPuzzle_1_K
> - 2_English_animalIdentification_8_K
> - 2_English_colorIdentification_19_K
> - 2_English_directionOfArrows_1_K
> - 2_English_hangmanNumbers_9_K
> - 2_English_matchingWordsAndObjects_23_K
> - 2_English_numberWords
> - 2_English_vocabularyBirds_11_K
> - 2_English_vocabularyBodyParts_14_K
> - 2_English_vocabularyClothes_15_K
> - 2_English_vocabularyDomesticAnimals_4_K
> - 2_English_vocabularyFood_16_K
> - 2_English_vocabularyFruits_10_K
> - 2_English_vocabularyObjects_5_K
> - 2_English_vocabularyPlants_17_K
> - 2_English_vocabularyProfessions_6_K
> - 2_English_vocabularyStructures_11_K
> - 2_English_vocabularyTransportation_9_K
> - 2_English_vocabularyWildAnimals_6_K
> - 2_English_whatSomeoneIsDoing
> - 2_English_whatSomeoneIsDoing_15_K
>
> These lessons contain at least one of the following (partially Non-English)
> HTML documents:
>                * kdoc.html
>                * lessonPlan.html
>                * start.html
> This means these files should be translated first, before touching them.
> [?]
>
> - 2_English_animalIdentification_8_K
> - 2_English_colorIdentification_19_K
> - 2_English_directionOfArrows_1_K
> - 2_English_matchingWordsAndObjects_23_K
> - 2_English_numberWords_14_K
> - 2_English_vocabularyBirds_11_K
> - 2_English_vocabularyBodyParts_14_K
> - 2_English_vocabularyClothes_15_K
> - 2_English_vocabularyDomesticAnimals_4_K
> - 2_English_vocabularyFood_16_K
> - 2_English_vocabularyObjects_5_K
> - 2_English_vocabularyFruits_10_K
> - 2_English_vocabularyPlants_17_K
> - 2_English_vocabularyProfessions_6_K
> - 2_English_vocabularyStructures_11_K
> - 2_English_vocabularyTransportation_9_K
> - 2_English_vocabularyWildAnimals_6_K
> - 2_English_whatSomeoneIsDoing
> - 2_English_whatSomeoneIsDoing_15_K
>
> Comments and Questions
> ======================
>
> * Most lessons are pretty straight-forward, only a few need some manual edits.
> * Still, not all index.html files from the lessons are 100% English.
> * These HTML documents in the lessons are mostly still in a foreign language:
>        * kdoc.html
>        * lessonPlan.html
>        * start.html
>
>  What to do with them?

Bryan: do we have somebody who can translate these files into English
so that Kenny can forge ahead?

Let me look into it. We should have someone who can help. I hope Ujjwol is still around; he is the most likely candidate.

Kenny, this is a very long e-mail by this point :). Can you e-mail the karma.js group w/ a much shorter description of what you need translated?
 
> * Where and how to store the PO files/translations?
>
>  This may be discussed at a later point, I think, because first I have to make
>  all the lessons translatable and I will consult Sayamindu about his opinion,
>  too, when he has time.
>
>  @Bryan: Ideas, Suggestions, Warnings?
>
> * Should I get write-access to the Karma repository, send the patches to
>  you, create a personal clone on git.olenepal.org?

Bryan: how do you think this is best handled? Maybe it would be a good
idea to let the developers of the individual lessons review the
changes to their lessons. That way, 1. they can verify that nothing
was inadvertently broken, 2. see what they have to do in the future to
make their lessons easily translatable.

I think it best that Kenny work on his personal clone. Ashok Basnet is doing most of the development work at this point as Vaibhaw has delegated development duties to him.

I think it would work best if Ashok and Kenny can coordinate merges. Kenny can send merge requests to Ashok and Ashok can review them. This is how GitHub works, and Gitorious supports the same functionality.

 
> Please correct, overwrite, comment anything which may be useful to me.
>
> GOALSETS
> ========
>
> For today (Sunday):
>
>        * Write all the important enhancements for the script.
>        * Make the missing lessons translatable.
>        * Extract the strings from HTML *and* JavaScript.
>        * Get the shit rolling, in order to do the fun stuff. (Pootle
>          integration) :-)
>
> This week:
>
>        * (Hopefully) start to integrate with Pootle.
>        * Fix the other problems as they arise.
>
> (Hopefully) this week or next week:
>
>        * Get most of it rolling.
>        * Get it done.
>
> This is a very quick draft of what I will do in the future.
>
> ---
>
> I'll probably pop up with other questions during this week, and keep you more
> up-to-date about my work from now on.

Ok, great.

Great to see your progress, Kenny! Sorry I can't be of more help right now.

 
> Oh yes, I'm sorry for being quite silent during the week. I couldn't do a lot
> because of personal reasons. I apologize and hope you understand.

No problem. You did some good work. It seems to be heading in the
right direction.

> I will triple my efforts from now on, as I want to finish most of it before my
> school exams start (End of May), where my time will be limited to quite a
> minimum.

Once again: good work. And good luck with the exams!

Regards,
Peter

Bryan Berry

May 3, 2010, 1:02:36 AM
to Kenny Meyer, Peter Gijsels, David Farning, Bernie Innocenti, karmajs


On Mon, May 3, 2010 at 8:31 AM, Kenny Meyer <knny...@gmail.com> wrote:
Hey,


> >
> > This is what I'm going to have to prioritize next:
> >
> > html2po.py enhancements:
> > ========================
> >
> > 1) Handle keywords:
> >
> >        <meta name="keywords" content="javascript,html5,sugar,sugarlabs,gsoc,ole,nepal,Vocabulary, Birds, English" />
> >
> >   I suppose that the keywords should be translated, too?
>
> I don't think this is high priority. As far as I understand, these
> keywords were added for search-engine optimisation. But as far as I
> know no major search engine takes them into account any longer. I
> would be in favor of removing them altogether. So you can leave that
> untranslated for now.
>

Ok.


> > 2) Make translatables out of tags with `title` attributes:
> >
> >        <div title="Back" id="linkBackLesson" class="linkBack">
> >
> >   This is a tricky part, but I think this is high-priority. So this will
> >   probably be one of the *very next* things to do.
> >
> >   Any ideas on how those should look in the PO file?
>
> Good catch, we didn't think of that.
>
> The problem I see is deciding whether or not to add the value of the
> title attribute (in this case "Back") to the keys in the PO file. We
> could add it when the div has class 'translate', but then you cannot
> deal with the case where the title needs translation but the contents
> of the div do not, or vice versa. Maybe add a class translateTitle?
>
> Is there any other problem I'm missing?

So the HTML markup


   <div title="Back" id="linkBackLesson" class="linkBack">

which would be converted to

   <div title="Back" id="linkBackLesson" class="linkBack translateTitle">

at the end should look like this:

       msgid "Back"
       msgstr ""
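
A minimal sketch of that extraction step, assuming the `translateTitle` class proposed above is adopted: it picks up the `title` attribute only from tags that carry the class, independently of whether the tag's contents are marked `translate`. It uses the standard `html.parser` rather than the BeautifulSoup parser in html2po, the names `TitleAttributeExtractor` and `titles_to_po` are invented for the example, and msgid escaping is omitted.

```python
from html.parser import HTMLParser


class TitleAttributeExtractor(HTMLParser):
    """Collect title attribute values from tags marked with class 'translateTitle'."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        title = attributes.get("title")
        if "translateTitle" in classes and title:
            self.titles.append(title)


def titles_to_po(html):
    """Return PO entries for all translatable title attributes in the document."""
    parser = TitleAttributeExtractor()
    parser.feed(html)
    parser.close()
    # Deduplicate while keeping first-seen order ("Back" appears in many lessons).
    unique = list(dict.fromkeys(parser.titles))
    return "\n\n".join('msgid "%s"\nmsgstr ""' % t for t in unique)
```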

Hmm... the next thing would be checking whether `translate-html.py` can handle
this. At first glance, I would have to enhance it, too.

I'll first finish the translate annotations before tackling this.


> > 3) Handle <title> tags
> >
> >        <title class="translate">An English Title</title>
> >
> >   Mysteriously this doesn't seem to get translated by the html2po script, at
> >   the moment.
>
> This is strange. It seems to be working here on a small example. Do
> you have an example where it doesn't work?
>

Sorry, false alarm. I had forgotten that the output gets automatically sorted
by the script.



> > * Where and how to store the PO files/translations?
> >
> >  This may be discussed at a later point, I think, because first I have to make
> >  all the lessons translatable and I will consult Sayamindu about his opinion,
> >  too, when he has time.
> >
> >  @Bryan: Ideas, Suggestions, Warnings?
> >
> > * Should I get write-access to the Karma repository, send the patches to
> >  you, create a personal clone on git.olenepal.org?
>
> Bryan: how do you think this is best handled? Maybe it would be a good
> idea to let the developers of the individual lessons review the
> changes to their lessons. That way, 1. they can verify that nothing
> was inadvertently broken, 2. see what they have to do in the future to
> make their lessons easily translatable.
>

I'm still thinking of how to handle that best.

I think it is good to have a po/ folder in each lesson. Later the PO files in
the po/ folders will be sourced to create a single compendium, which will be
used for the Pootle translators.

To get an idea of what I mean:

karma/
       lessons/
               lesson1/
                       ...
                       po/
                               messages.po
                               ...
                       ...
               lesson2/
                       ...
                       po/
                               messages.po
                               ...
                       ...
       ...
       translations/
               karma.pot
               mo/
                       ...
               po/
                       ...
                       en_US.po
                       es_ES.po
                       ...
               ...
       ...

I'm not sure if this is realistic, but it may be a start. [?]
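
To make the "sourced to create a single compendium" step concrete, here is a minimal sketch that gathers the msgids from every `lessons/*/po/messages.po` and writes them out once each. It only understands simple single-line entries and is meant to show the idea; GNU gettext's `msgcat` tool is the robust way to concatenate PO files. The function names are hypothetical.

```python
import re
from pathlib import Path

# Matches simple single-line entries only; multi-line strings, msgctxt and
# plural forms are deliberately out of scope for this sketch.
ENTRY_RE = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)


def parse_entries(po_text):
    """Return (msgid, msgstr) pairs, skipping the header entry (empty msgid)."""
    return [(mid, mstr) for mid, mstr in ENTRY_RE.findall(po_text) if mid]


def merge_msgids(po_texts):
    """Merge several PO documents into one template with each msgid listed once."""
    seen = {}
    for text in po_texts:
        for msgid, _ in parse_entries(text):
            seen.setdefault(msgid, None)  # keep first occurrence, drop duplicates
    return "\n\n".join('msgid "%s"\nmsgstr ""' % m for m in seen)


def build_template(lessons_dir):
    """Read every <lesson>/po/messages.po below lessons_dir and merge them."""
    files = sorted(Path(lessons_dir).glob("*/po/messages.po"))
    return merge_msgids(f.read_text(encoding="utf-8") for f in files)
```

With the layout above, something like `build_template("karma/lessons")` would produce the text for translations/karma.pot.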

Sayamindu's Pootle infrastructure requires you to have one po/ directory at the root of the repository. Kenny, I think your solution is the most logical, but it will require Sayamindu's help (and consent ;)) to get the Pootle scripts to accommodate it.

It would be great if Kenny could meet w/ Sayamindu this week to work out how to handle the po/ directories.

The best way is to try it out. I'll put that on my TODO list.

This is what I am trying to discuss with people who are more experienced
than me on this topic.
I'll work on this actively once all the first steps are done.


One thing I forgot to ask. After reading the wiki, this caught
my attention:
> For the immediate future we will dynamically change out strings per locale on
> page load.

Please tell me what this will look like. JavaScript w/ gettext?

For strings in .js files, each string would have to be wrapped in a gettext function, just like in Python apps.

For strings in .html files, you would change out the strings using a jQuery CSS selector.

// very, very rough sketch; `mypo` stands for an object wrapping the loaded .po catalog
// get references to items w/ translate class (cached in a variable)
var $listTranslateItems = $(".translate");

// loop through items and change out strings from .po file
for (var i = 0; i < $listTranslateItems.length; i++) {
   var $item = $listTranslateItems.eq(i);
   var translatedText = mypo.gettext($item.text());
   $item.text(translatedText);
}

// you would want to use a for loop here rather than purely functional style
// because javascript functions have a lot of overhead
// and remember to cache your jquery object references


Sounds like fun.
---

Thank you very much for your time, Peter!

Let's get back to work.

--
 Regards,
 Kenny Meyer | http://kenny.alwaysdata.net

Peter Gijsels

May 3, 2010, 5:24:18 AM
to Bryan Berry, Kenny Meyer, David Farning, Bernie Innocenti, karmajs
>> One thing I have forgotten to ask. After reading the wiki this was
>> attracting
>> my attention:
>> > For the immediate future we will dynamically change out strings per
>> > locale on
>> > page load.
>>
>> Please, tell me how this will look like? JavaScript w/ gettext?
>
> For strings in .js files, each string would have to be wrapped in a gettext
> function, just like in Python apps.
> For strings in .html files, you would change out the strings using a jQuery CSS
> selector.
> // very, very rough sketch; `mypo` stands for an object wrapping the loaded .po catalog
> // get references to items w/ translate class (cached in a variable)
> var $listTranslateItems = $(".translate");
> // loop through items and change out strings from .po file
> for (var i = 0; i < $listTranslateItems.length; i++) {
>    var $item = $listTranslateItems.eq(i);
>    var translatedText = mypo.gettext($item.text());
>    $item.text(translatedText);
> }
>
> // you would want to use a for loop here rather than purely functional style
> // because javascript functions have a lot of overhead
> // and remember to cache your jquery object references

Just to make sure: for strings in html, we abandoned this approach in
favor of the translate-html.py script. (The strings in javascript
still use gettext.)
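
The thread does not show translate-html.py itself, so the following is only a sketch of the general idea of build-time substitution: read the English HTML, replace the text of `translate` elements from a catalog, and write out a translated page, so nothing has to be swapped on page load. It uses the standard `html.parser`, takes a plain dict as the catalog, and all names are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class StaticTranslator(HTMLParser):
    """Re-emit HTML, swapping text directly inside class="translate" elements."""

    def __init__(self, catalog):
        # convert_charrefs=False keeps entities such as &amp; as separate events
        # so they can be written back unchanged.
        super().__init__(convert_charrefs=False)
        self.catalog = catalog  # dict: source string -> translated string
        self.out = []
        self._stack = []        # one flag per open element: is it translatable?

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            classes = (dict(attrs).get("class") or "").split()
            self._stack.append("translate" in classes)
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form (<img ... />): emit as-is, nothing to track.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        key = data.strip()
        if key and self._stack and self._stack[-1]:
            data = self.catalog.get(key, data)  # untranslated text passes through
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def translate_html(html, catalog):
    """Return html with translatable text replaced from catalog."""
    parser = StaticTranslator(catalog)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Compared with swapping strings in the browser, this moves the work to a build step, at the cost of generating one set of HTML files per locale.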

Peter

Bryan Berry

May 3, 2010, 5:31:58 AM
to Peter Gijsels, Kenny Meyer, David Farning, Bernie Innocenti, karmajs
Sorry :p I am all confused after not looking at the code for so long. Forget what I said ;) and listen to Peter 
