Multilingual Property


hetsch

Jan 8, 2012, 11:30:36 AM
to appengine-...@googlegroups.com
Hi ndb group,

I've been using the new ndb for some time now and I really like it so far. But I can't get my head around implementing a multilingual property.
The value for the field should look something like
{
    'en' : 'value_en',
    'es' : 'value_es',
    'de' : 'value_de'
}
My first thought was to subclass JsonProperty or PickleProperty, but I don't know how to override _get_value or _set_value so that
the property returns a value of type string and not the whole object. I'm not sure if this is the way to go...

My second idea was to use a ComputedProperty for the task. Something like:

name = JsonProperty()
name_translated = ComputedProperty(lambda self: self.name.get(language))

I hope you understand what I'm trying to achieve and can give me a small hint in the right direction.

Many thanks in advance!

Regards,
Gernot

Jakob Holmelund

Jan 8, 2012, 2:39:55 PM
to appengine-...@googlegroups.com

First of all, have a look at i18n. This is the standard way of dealing with locales. If you want to pursue the approach you are taking now, which could be OK if you don't have a lot of language properties, you could use a StructuredProperty, something like this:

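from google.appengine.ext.ndb import model

# One string field per supported language.
class LanguageProp(model.Model):
    en_val = model.StringProperty()
    es_val = model.StringProperty()

# The translated field is stored as a nested structure.
class SomeClass(model.Model):
    title = model.StructuredProperty(LanguageProp)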

Sorry for typos and syntax.. Writing from tablet

On 08/01/2012 at 17.30, "hetsch" <gerno...@gmail.com> wrote:

Michael Robellard

Jan 8, 2012, 3:00:45 PM
to appengine-...@googlegroups.com
I would use a structured property so that you can query the data
inside. Language would be one of the elements in the structure and the
data would be the other. You could then query by language and/or data.

Sent from my iPhone

Niklas Rosencrantz

Jan 8, 2012, 3:16:44 PM
to appengine-...@googlegroups.com
But how to make the language parameter en-us or en_US default to en
when en-us is not available? Should a framework do this and how?
Django does this and also has get_language_from_request and solves
what is locale pt_BR and what is language pt-br since you usually have
one with the other and vice versa.
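
The fallback itself can be sketched in plain Python, independent of any
framework. This is only an illustration: resolve_locale and its
parameters are hypothetical names, not part of webapp2, Django or babel.
It treats 'pt-br', 'pt_BR' and 'PT_br' as the same code and falls back
to the bare language when the regional variant is missing.

```python
def resolve_locale(requested, available, default='en'):
    """Return the best match for `requested` among `available` codes."""

    def normalize(code):
        # 'pt-BR', 'pt_br' and 'PT-br' all become 'pt_br'.
        return code.replace('-', '_').lower()

    # Map the normalized form back to the spelling the app actually uses.
    lookup = dict((normalize(code), code) for code in available)
    wanted = normalize(requested or '')

    # 1. Exact match, e.g. 'pt-br' -> 'pt_BR'.
    if wanted in lookup:
        return lookup[wanted]

    # 2. Fall back to the bare language, e.g. 'en_us' -> 'en'.
    language = wanted.split('_')[0]
    if language in lookup:
        return lookup[language]

    # 3. Nothing matched: use the application default.
    return default


available = ['en', 'pt_BR', 'sv']
print(resolve_locale('en-us', available))  # en
print(resolve_locale('pt-br', available))  # pt_BR
print(resolve_locale('de', available))     # en (the default)
```

The result could then be handed to whatever sets the locale for the
request.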

I make apps in English and then localize to Portuguese or Swedish or
whatever the end usage is. It's more work but a better solution for
the programmers to be able to read what i/o is without having to
understand Portuguese or Swedish. A huge advantage over other i18n
systems for app engine is that with webapp2 you can localize
currencies and timezones using babel so I could write a localized
timesince filter, everything works and it is bug-free. Maybe webapp2
or some framework could include a useful getter for a user's language
ie user.get_locale something like django has that parses the request
so that pt-br falls back to pt if not available.

I think gettext is a very good project and I'm getting the user's language
from the request by HTTP GET parameter as described in the documentation,
by HTTP headers, by session value, by cookie, and one could also get it
from a datastore value. When making a locale file I find using poedit is
more convenient, and babel.cfg is not needed if you use poedit - I
can just compile the .po file directly in poedit. I change the
language in the base handler's dispatch function depending on the values
found. This is not buggy, but I don't understand why I cannot set the
session variable in the dispatch method:

from webapp2_extras import i18n
from webapp2_extras.i18n import lazy_gettext as _

def dispatch(self):
    """
    Save the sessions for preservation across requests
    """
    # self.session_store = sessions.get_store(request=self.request)
    # if self.request.host.find('localhost') > 0:  # for a Swedish domain that uses Swedish
    # or lang = os.environ.get("HTTP_ACCEPT_LANGUAGE")

    i18n.get_i18n().set_locale('sv')
    lang_code_get = self.request.get('hl', None)
    if lang_code_get:
        # self.session['i18n_language'] = lang_code_get  # why won't this work?
        i18n.get_i18n().set_locale(lang_code_get)
    try:
        response = super(NewBaseHandler, self).dispatch()
        self.response.write(response)
    finally:
        self.session_store.save_sessions(self.response)

So I'd like a way to solve this once for all my apps. Just as the user
model does a good job of being a user model, we want a localization
library that is ready to use, with built-in session management, when
starting a new app. My configuration is:

config = {
    'webapp2_extras.jinja2': {
        'template_path': 'template_files',
        'filters': {
            'makeid': filters.makeid,
            'timesince': filters.timesince,
            'datetimeformat': filters.datetimeformat,
        },
        'environment_args': {'extensions': ['jinja2.ext.i18n']},
    },
    'webapp2_extras.sessions': {'secret_key': 'RR-234234432432-secret-key'},
}
Thank you
Nick Rosencrantz

Robellard, Michael

Jan 8, 2012, 4:54:17 PM
to appengine-...@googlegroups.com
First of all, you need to think about how you are using the
localization. If you are localizing text in your app such as the
prompts on a form or some other part of your app that is fixed, then
you should not store it in the datastore. You should use a solution
that uses the built-in localization support of Python, Django, and some
other APIs. I have used babel myself and it works great. In the end
you store all your internationalized text in .mo and .po
files. This will be as fast as dictionary lookups when you go to load
the page. If you stored that data in the datastore you would have to
retrieve it each time, and that would be slow, or you would end up
adding a lot of caching complexity that's not necessary.

Now, if it is dynamic data, then I could see why you need it stored with
its language attributes in the datastore.

--
Michael Robellard
(216)288-2745
Play Railroad Empire: http://www.therailroadempire.com

hetsch

Jan 9, 2012, 3:52:16 AM
to appengine-...@googlegroups.com
Thank you so much for your detailed help!

It's clear to me that the i18n module is built for handling more static text like info messages and so on.
In my case, using a StructuredProperty seems to be the way to go. I will now have a look at the documentation and
try to play around with the information you gave me.

I'll post the solution to my problem if I make some progress!

Anyway, thanks a lot, and have a nice day!

Regards,
Gernot

hetsch

Jan 10, 2012, 6:57:09 AM
to appengine-...@googlegroups.com
I now have a proof of concept as a starting point. What do you think about this idea:

from google.appengine.api import datastore_errors
from google.appengine.ext.ndb import model
from tests import base

class MultilingualProperty(model.PickleProperty):
   
    def __init__(self, property_instance, *args, **kwargs):
        self.property_instance = property_instance
        super(MultilingualProperty, self).__init__(*args, **kwargs)
   
    """   
    def _db_get_value(self, v, p):
        return super(MultilingualProperty, self)._db_get_value(v, p)
       
    def _db_set_value(self, v, p, value):
        super(MultilingualProperty, self)._db_set_value(v, p, value)
    """
   
    def _set_value(self, entity, value):
        curr_val = self._get_user_value(entity)
        if curr_val is None:
            curr_val = {}
        curr_val[entity.language] = value
        super(MultilingualProperty, self)._set_value(entity, curr_val)
       
    def _get_value(self, entity):
        curr_val = super(MultilingualProperty, self)._get_value(entity)
        # Nothing stored yet: there is no translation to look up.
        if curr_val is None:
            return None
        return curr_val.get(entity.language, None)
       
    def _validate(self, value):
        if not isinstance(value, dict):
            raise datastore_errors.BadValueError('Expected dict, got %r' %
                                                 (value,))
        # Validate every per-language value with the wrapped property.
        # Distinct loop names keep the dict in `value` from being overwritten.
        for lang, item in value.iteritems():
            self.property_instance._validate(item)

        return super(MultilingualProperty, self)._validate(self._to_base_type(value))
   
class MultilingualModel(model.Model):
   
    title = MultilingualProperty(model.StringProperty())
   
    def __init__(self, language='en', *args, **kwargs):
        self._language = language
        super(MultilingualModel, self).__init__(*args, **kwargs)
   
    def _get_language(self):
        return self._language
    def _set_language(self, value):
        self._language = value
    language = property(_get_language, _set_language)
   
class MultilingualTest(base.BaseTest):

    def setUp(self):
        super(MultilingualTest, self).setUp()
        self.register_model('MultilingualModel', MultilingualModel)

    def testInsertEntity(self):
        entity = MultilingualModel()
        entity.put()
        self.assertEqual(1, len(MultilingualModel.query().fetch(2)))
       
        entity.language = 'en'
        entity.title = u'title_en'
        entity.put()
       
        db_saved = MultilingualModel.query().get()
        db_saved.language = 'en'
        self.assertEqual('title_en', db_saved.title)

        entity.language = 'de'
        entity.title = 'title_de'
        entity.put()
       
        db_saved = MultilingualModel.query().get()
        db_saved.language = 'de'
        self.assertEqual('title_de', db_saved.title)


I'm sure there's a better way to do it and I'm really interested in how I can improve it.

As always, thank you a lot!

Guido van Rossum

Jan 10, 2012, 1:55:05 PM
to appengine-...@googlegroups.com
This looks about right, at least as a starting point; overriding _set_value() and _get_value() seems to be the right way to address your issue. As a challenge you might look into how you can make it do the right thing for MultilingualProperty(<class>, repeated=True).

Also, perhaps surprisingly, this code finally clarifies to me what you wanted (probably I didn't take enough time reading your earlier posts :-( ). Summarizing, you want a property that stores a bunch of values, one for each language, and setting/getting the value should set/get the value corresponding to the currently selected language. That's a cool idea.
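
Stripped of the datastore, the behavior summarized above can be sketched with a plain Python descriptor. This is only an illustration of the get/set semantics, not the ndb implementation: PerLanguage and Page are hypothetical names, and nothing here is persisted.

```python
class PerLanguage(object):
    """Descriptor that keeps one value per language on the owning instance."""

    def __init__(self, name):
        # Name of the private dict that holds {language: value}.
        self._slot = '_' + name + '_by_language'

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        values = instance.__dict__.get(self._slot, {})
        # Only the value for the currently selected language is exposed.
        return values.get(instance.language)

    def __set__(self, instance, value):
        values = instance.__dict__.setdefault(self._slot, {})
        # Setting touches only the currently selected language.
        values[instance.language] = value


class Page(object):
    title = PerLanguage('title')

    def __init__(self, language='en'):
        self.language = language


page = Page()
page.title = 'title_en'
page.language = 'de'
page.title = 'title_de'
print(page.title)      # title_de
page.language = 'en'
print(page.title)      # title_en
```

Switching the language never overwrites the other translations; a language with no value yet simply reads as None.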

--Guido
--
--Guido van Rossum (python.org/~guido)

Gernot Cseh

Jan 10, 2012, 5:14:14 PM
to appengine-ndb-discuss
Guido, your second paragraph describes exactly what I need! :-) After
re-reading my posts, I see it's really hard to tell what I need.
Code is the best language... :-)

I'm trying hard to get that baby done (also the repeated
functionality) and your comment shows me that I'm on the right track,
which makes me happy. I have played around a little bit more and, if
it's allowed, I can post that code later on.
Maybe it's useful for others too.

Gernot

Robellard, Michael

Jan 10, 2012, 9:32:45 PM
to appengine-...@googlegroups.com
Guido, is pickle the right thing for him to use? I thought pickle was
the slowest of the encoding methods that we had available to us.
Aren't JSON and Protocol Buffers faster?

--

Guido van Rossum

Jan 10, 2012, 10:31:26 PM
to appengine-...@googlegroups.com
On Tue, Jan 10, 2012 at 18:32, Robellard, Michael <mi...@robellard.com> wrote:
> Guido, is pickle the right thing for him to use? I thought pickle was
> the slowest of the encoding methods that we had available to us.
> Aren't JSON and Protocol Buffers faster?

TBH I don't know which would be faster; it might be worth measuring carefully.

It may also depend on whether you have C code to accelerate it, which
you will when using the Python 2.7 runtime for all three, but which
only protobuf has for Python 2.5.

Downsides of JSON and PB are that you must conform to the limitations
of those respective serialization methods; pickle has limitations too
but is generally more lenient. When using PB, in particular, you would
have to use LocalStructuredProperty, which would require that you
define the structure using a Model class. I *think* the design here
requires being able to serialize a dict, so that would rule out PB --
it has no direct dict support. (In fact, PickleProperty and
JsonProperty were added in part to overcome this. :-)
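
A rough way to measure this, and to see one of the limitations, is
sketched below using only the standard library. It assumes the stored
value is a small dict of language codes to strings, as in the proof of
concept; the timings depend on the runtime and say nothing about the
datastore round trip itself.

```python
import json
import pickle
import timeit
from datetime import date

translations = {'en': u'title_en', 'es': u'title_es', 'de': u'title_de'}

# Both formats round-trip a plain dict of strings.
assert json.loads(json.dumps(translations)) == translations
assert pickle.loads(pickle.dumps(translations, 2)) == translations

# pickle is more lenient: it preserves types JSON cannot represent.
richer = {'en': date(2012, 1, 10)}
assert pickle.loads(pickle.dumps(richer, 2)) == richer
try:
    json.dumps(richer)
    json_accepts_dates = True
except TypeError:
    json_accepts_dates = False
print('json accepts dates: %s' % json_accepts_dates)

# Time a full encode/decode cycle for each format.
json_seconds = timeit.timeit(
    lambda: json.loads(json.dumps(translations)), number=10000)
pickle_seconds = timeit.timeit(
    lambda: pickle.loads(pickle.dumps(translations, 2)), number=10000)
print('json:   %.4fs' % json_seconds)
print('pickle: %.4fs' % pickle_seconds)
```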

Apologies to hetsch / Gernot / the OP, I don't have more time to
review your code. Good luck!

Niklas Rosencrantz

Jan 11, 2012, 1:34:04 AM
to appengine-...@googlegroups.com
For example, I'd like the property user.country to display depending
on user.language, so that "Sweden" displays as "Sverige" if the user is
Swedish and in Sweden, and "UK" displays in English if the user is
British and there. Could I just save everything in English and then

<dt>{% trans %}Country{% endtrans %}:</dt>
<dd>{% trans %}{{user.country}}{% endtrans %}</dd>

? I normally just write my translation in a .po file and compile it
with poedit to .mo and then Jinja2 can pick up the compiled
translations without use of the datastore. I think gettext is a good
project since it enables us to use the same compiled .mo translation
for many different projects like I can take the .po file from django
which has rather a lot of translations and use those translations in
my project under the BSD licence.

Thank you for the awesome ndb library,
Nick Rosencrantz

Gernot Cseh

Jan 11, 2012, 2:19:04 AM
to appengine-...@googlegroups.com
Guido, no problem - thanks for your new insights - you have been an immense help so far and I really enjoy working with the nice code of ndb!

Niklas, yes, you got it right. The whole point is that you are able to translate text/values that you don't know in advance and save them
to the database. Think about a Page model. In a really basic model, you would normally have a title property and a text property. The user is able
to enter the text in some kind of CMS backend. In this case you don't know what title or text the user enters. Gettext is a really useful piece of code, but
in this case it's not that useful anymore. With our approach you only have to change the language attribute of the MultilingualModel and save/resave
the translated values to the database.

On the view side you would be able to simply render the value like

<dt>{% trans %}Country{% endtrans %}:</dt>
<dd>{{user.country}}</dd> // the language currently set on the model
<dd>{{user.country_de}}</dd> // an explicitly requested language attribute
<dd>{{user.country_en}}</dd> // an explicitly requested language attribute

I hope this clarifies the whole story a little bit.
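
The explicit country_de / country_en style of access can be sketched in plain Python with __getattr__. This is a simplified illustration outside ndb: the Translated class and its constructor arguments are hypothetical, and a template engine that resolves attributes with getattr would see the same values.

```python
class Translated(object):
    """Holds {language: value} dicts and exposes '<field>_<lang>' attributes."""

    def __init__(self, language='en', **fields):
        self.language = language
        # e.g. {'country': {'en': 'Sweden', 'de': 'Schweden'}}
        self._fields = fields

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        fields = self.__dict__.get('_fields', {})
        if name in fields:
            # Plain field name: use the currently selected language.
            return fields[name].get(self.language)
        # 'country_de' -> field 'country', language 'de'.
        field, _, lang = name.rpartition('_')
        if field in fields:
            return fields[field].get(lang)
        raise AttributeError(name)


user = Translated(language='en', country={'en': 'Sweden', 'de': 'Schweden'})
print(user.country)     # Sweden
print(user.country_de)  # Schweden
print(user.country_en)  # Sweden
```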

Once I have some kind of stable code, I'll put up a gist on GitHub (might be better than posting tons of code here), and I would be happy if you all would join me in
developing a stable base for this problem.


Gernot Cseh

Jan 11, 2012, 5:06:57 AM
to appengine-...@googlegroups.com
Michael - that's a valid point. I used the PickleProperty instead of a StructuredProperty because I didn't want to create another model, but
I had no idea of the performance hit. Can you think of any better solution? I mainly use Python 2.7, but if there's a better way of achieving the same, I
would be happy to hear about it.

I've uploaded the latest changes to the following gist:

https://gist.github.com/1593960

Maybe this kind of property is also useful for other kinds of attributes, not only language parameters.
Consider the following:

class MrNiceGuy(MultilingualModel):
    answer = MultilingualProperty(model.StringProperty())

entity = MrNiceGuy()
entity.language = 'girlfriend'
entity.answer = 'Hmmm, it tasted yummie!'
entity.language = 'reality'
entity.answer = 'Maybe it could have taken a little bit more salt and pepper....'

<h1>How was your meal today?</h1>
{% if visitor.name == 'girlfriend' %}
    {{ entity.answer_girlfriend }}
{% else %}
    {{ entity.answer_reality }}
{% endif %}

Just joking... like the cooking of my girlfriend...

Yohai Rosen

Aug 27, 2012, 10:42:39 AM
to appengine-...@googlegroups.com
Hi Gernot,
I'm using your Multilingual model and it is great. The only thing is that my data is being uploaded to the datastore from CSV files using the BulkLoader. The problem is that in order to use your Multilingual model you first need to instantiate it and only then populate it with the data for each language. I'm not an expert in using BulkLoader, but as far as I understand you cannot do that (and I'll be happy if you could tell me otherwise); you have to give it the initial data. How would you suggest solving this issue? Maybe implement an option to pass the languages as a dict in the constructor (e.g. {en: english_val, he: hebrew_val})?
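
To make the dict idea concrete, here is a minimal sketch of how one CSV row could be turned into such a dict before it is handed to a constructor. The column naming (title_en, title_he) and the helper collect_translations are assumptions for illustration only; wiring this into the BulkLoader or into the Multilingual model is not shown.

```python
import csv


def collect_translations(row, field):
    """Gather columns named '<field>_<lang>' into a {lang: value} dict."""
    prefix = field + '_'
    return dict(
        (column[len(prefix):], value)
        for column, value in row.items()
        # Skip unrelated columns and languages with no text.
        if column.startswith(prefix) and value
    )


csv_text = "id,title_en,title_he\n1,Hello,Shalom\n2,Goodbye,\n"
rows = list(csv.DictReader(csv_text.splitlines()))

first = collect_translations(rows[0], 'title')
second = collect_translations(rows[1], 'title')
print(sorted(first.items()))   # [('en', 'Hello'), ('he', 'Shalom')]
print(sorted(second.items()))  # [('en', 'Goodbye')]
```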

Thanks,
Yohai

Gregory Nicholas

Aug 27, 2012, 2:32:34 PM
to appengine-...@googlegroups.com
Here's a link to some code I've posted on GitHub that I use to solve this issue: https://gist.github.com/3491119

It does have the trade-off of being based on a JsonProperty, so the property content is not queryable. Maybe it could help you out?

Yohai Rosen

Aug 28, 2012, 7:41:28 AM
to appengine-...@googlegroups.com
Hi Gregory, 
I don't understand how using your JSON property will solve the issue of uploading data using BulkLoader into Multilingual models.
Can you give more details about your solution?

Regards,
Yohai