Internationalization of table columns

104 views
Skip to first unread message

Alexander Jacob

unread,
Jan 5, 2013, 7:42:25 AM1/5/13
to sqlal...@googlegroups.com

Hi,

 

I want to implement my own internationalization type for tables like this one where the different languages are stores in columns rather than an extra i.e. country_i18n table.

I know this is not so flexible when adding extra languages but our goal is to eliminate as much joins as possible.

 

CREATE TABLE `country` (

  `id` int(11) auto_increment NOT NULL AUTO_INCREMENT,

  `iso2` char(2) NOT NULL,

  `name_en` varchar(64),

  `name_de` varchar(64),

  PRIMARY KEY (id)

) ENGINE=InnoDB;

 

insert into `country`(`id`,`iso2`,`name_de`,`name_en`) values (1,'DE','Germany','Deutschland');

insert into `country`(`id`,`iso2`,`name_de`,`name_en`) values (2,'IT', 'Italy','Italien');

 

The goal is to have a mapped class Country with a property “name” that is populated depending on a current language setting i.e. in the user session.

With the help of the documentation I did it this way:

 

 

# global language setting (for this example in a global var)

GLOBAL_LANGUAGE = 'en'

 

# custom column clause: modifies the column name depending on GLOBAL_LANGUAGE

class i18NColumnClause(ColumnClause):

    pass

 

@compiles(i18NColumnClause)

def compile_i18n_column_clause(element, compiler, **kw):

    return '%s_%s' % (element.name, GLOBAL_LANGUAGE)

 

# custom type

class i18NType(sqlalchemy.types.TypeDecorator):

    impl = sqlalchemy.types.Unicode

    def column_expression(self, colexpr):

        return i18NColumnClause(colexpr)

 

# test class for custom type

class Country(Base):

 

    __tablename__ = 'country'

    id = Column(Integer, primary_key=True)

    iso2 = Column(String(2))

    name = Column(i18NType(64))

 

    def __repr__(self):

        return "<Country(%d/%s/%s)>" % (self.id, self.iso2, self.name)

 

 

This works for read operations because the new type uses a custom column_expression so I can do something like this:

 

 

germany = session.query(Country).get(1)

print germany      #  <Country(1/DE/Germany)>

 

Unfortunately it does not work when saving an object:

 

australia = Country()

australia.iso2 = 'AU'

australia.name = 'Australia'

session.add(australia)

session.commit()

 

The following query is executed and fails because the column “name” is not in table, it should be name_en:

 

INSERT INTO country (iso2, name) VALUES ('AU', 'Australia')

 

 

What am I missing? Or is this a practicable approach at all? Or are there better ways to do that?

 

Can anyone help?

 

Cheers, Alex

 

(I am using Sqlalchemy 0.8.0b2, mysql 5.5.16 and python 2.7.3)

Michael Bayer

unread,
Jan 5, 2013, 12:50:45 PM1/5/13
to sqlal...@googlegroups.com
On Jan 5, 2013, at 7:42 AM, Alexander Jacob wrote:

@compiles(i18NColumnClause)
def compile_i18n_column_clause(element, compiler, **kw):
    return '%s_%s' % (element.name, GLOBAL_LANGUAGE)
 
# custom type
class i18NType(sqlalchemy.types.TypeDecorator):
    impl = sqlalchemy.types.Unicode
    def column_expression(self, colexpr):
        return i18NColumnClause(colexpr)
 
# test class for custom type
class Country(Base):
 
    __tablename__ = 'country'
    id = Column(Integer, primary_key=True)
    iso2 = Column(String(2))
    name = Column(i18NType(64))
 
    def __repr__(self):
        return "<Country(%d/%s/%s)>" % (self.id, self.iso2, self.name)
 


Interesting approach with the column_expression(), though that seems like it would circumvent the labeling behavior of the compiler and cause more complex expressions to fail.   The INSERT case might work if you also implemented bind_expression() but I'm not sure if that approach overall doesn't break too much of the existing compilation functionality.

A more direct route with @compiles would be to compile Column, but still tricky, in particular that the compilation process for a Column is very involved.   At the very least, for that to work it would have to be like this:

class i18NColumn(Column):   # note it has to be Column, not ColumnClause
   pass

@compiles(i18NColumn)
def compile(element, compiler, **kw):
    text = compile.visit_column(element, **kw)
    text = text.replace(element.name, '%s_%s' %(element.name, GLOBAL_LANGUAGE))
    return text

so that existing labeling rules are not circumvented.  but even that I'm not 100% sure might not have some gotchas.    

Another variant on this would be to use the before_cursor_execute event to just do a search and replace on the SQL statement before it's passed to the DBAPI, I sometimes recommend that as well.  Giving the "fake" column a name that clearly indicates internationalization, like Column("name_<i18N>"), is a good way to give a statement filter a clear path to where those names would go. If you need this translation behavior everywhere, that might be the best approach.

Though here, there's a less intrusive way to get this behavior if ORM level is all you need which is by using hybrids:

class Country(Base):
    # ...
 
   name_de = Column(String)
   name_en = Column(String)

   @hybrid_property
   def name(self):
        return getattr(self, "name_%s" % GLOBAL_LANGUAGE)

   @name.setter
   def name(self, value):
       setattr(self, "name_%s" % GLOBAL_LANGUAGE, value)

the above can be generalized:

def make_hybrid(name):
   @hybrid_property
   def name_attr(self):
        return getattr(self, "%s_%s" % (name, GLOBAL_LANGUAGE))

   @name_attr.setter
   def name_attr(self, value):
       setattr(self, "%s_%s" % (name, GLOBAL_LANGUAGE), value)

   name_attr.__name__ = name
   return name_attr
  

Alexander Jacob

unread,
Jan 5, 2013, 1:53:09 PM1/5/13
to sqlal...@googlegroups.com


Am Samstag, 5. Januar 2013 18:50:45 UTC+1 schrieb Michael Bayer:

On Jan 5, 2013, at 7:42 AM, Alexander Jacob wrote:

@compiles(i18NColumnClause)
def compile_i18n_column_clause(element, compiler, **kw):
    return '%s_%s' % (element.name, GLOBAL_LANGUAGE)
 
# custom type
class i18NType(sqlalchemy.types.TypeDecorator):
    impl = sqlalchemy.types.Unicode
    def column_expression(self, colexpr):
        return i18NColumnClause(colexpr)
 
# test class for custom type
class Country(Base):
 
    __tablename__ = 'country'
    id = Column(Integer, primary_key=True)
    iso2 = Column(String(2))
    name = Column(i18NType(64))
 
    def __repr__(self):
        return "<Country(%d/%s/%s)>" % (self.id, self.iso2, self.name)
 

Thank you for the answer.
 

Interesting approach with the column_expression(), though that seems like it would circumvent the labeling behavior of the compiler and cause more complex expressions to fail.   The INSERT case might work if you also implemented bind_expression() but I'm not sure if that approach overall doesn't break too much of the existing compilation functionality.

You are right, it causes problems when doing complex queries like joinedloads

session.query(IM).options(joinedload(IM.country)).get(4)

the generated sql gets broken like this:

 %(57544816 country)s.name_en AS country_1_name

and then fails here

  File "C:\Python27\lib\site-packages\sqlalchemy\engine\default.py", line 320, in do_execute
    cursor.execute(statement, parameters)
  File "C:\Python27\lib\site-packages\MySQLdb\cursors.py", line 159, in execute
    query = query % db.literal(args)
TypeError: format requires a mapping
 

A more direct route with @compiles would be to compile Column, but still tricky, in particular that the compilation process for a Column is very involved.   At the very least, for that to work it would have to be like this:

class i18NColumn(Column):   # note it has to be Column, not ColumnClause
   pass

@compiles(i18NColumn)
def compile(element, compiler, **kw):
    text = compile.visit_column(element, **kw)
    text = text.replace(element.name, '%s_%s' %(element.name, GLOBAL_LANGUAGE))
    return text

so that existing labeling rules are not circumvented.  but even that I'm not 100% sure might not have some gotchas.    

I will try that
 
Another variant on this would be to use the before_cursor_execute event to just do a search and replace on the SQL statement before it's passed to the DBAPI, I sometimes recommend that as well.  Giving the "fake" column a name that clearly indicates internationalization, like Column("name_<i18N>"), is a good way to give a statement filter a clear path to where those names would go. If you need this translation behavior everywhere, that might be the best approach.

I preferred the event solution but


class Country(Base):
    __tablename__ = 'country'
    name = Column('name_<I18N>', String(64))

'SELECT country.`name_<I18N>` AS `country_name_<I18N>`, country.id AS country_id, country.iso2 AS country_iso2
FROM country
WHERE country.id = %s'

gets properly replace in before_cursor_execute to

'SELECT country.`name_en` AS `country_name_en`, country.id AS country_id, country.iso2 AS country_iso2
FROM country
WHERE country.id = %s'

and finally fails here:

Traceback (most recent call last):
  File "C:\Program Files (x86)\JetBrains\PyCharm 2.5.2\helpers\pydev\pydevd.py", line 1401, in <module>
    debugger.run(setup['file'], None, None)
  File "C:\Program Files (x86)\JetBrains\PyCharm 2.5.2\helpers\pydev\pydevd.py", line 1055, in run
    pydev_imports.execfile(file, globals, locals) #execute the script
  File "C:/Projekte/interimexcellence/prototype/sqlalchemy/test.py", line 68, in <module>
    print germany
  File "C:/Projekte/interimexcellence/prototype/sqlalchemy/test.py", line 44, in __repr__

    return "<Country(%d/%s/%s)>" % (self.id, self.iso2, self.name)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\attributes.py", line 251, in __get__
    return self.impl.get(instance_state(instance), dict_)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\attributes.py", line 559, in get
    "correctly" % key)
KeyError: "Deferred loader for attribute 'name' failed to populate correctly"

The dict_ has no key 'name' :-(

Do you have any suggestions how to fix this?
 
Though here, there's a less intrusive way to get this behavior if ORM level is all you need which is by using hybrids:

class Country(Base):
    # ...
 
   name_de = Column(String)
   name_en = Column(String)

   @hybrid_property
   def name(self):
        return getattr(self, "name_%s" % GLOBAL_LANGUAGE)

   @name.setter
   def name(self, value):
       setattr(self, "name_%s" % GLOBAL_LANGUAGE, value)

the above can be generalized:

def make_hybrid(name):
   @hybrid_property
   def name_attr(self):
        return getattr(self, "%s_%s" % (name, GLOBAL_LANGUAGE))

   @name_attr.setter
   def name_attr(self, value):
       setattr(self, "%s_%s" % (name, GLOBAL_LANGUAGE), value)

   name_attr.__name__ = name
   return name_attr
  

This will work but this will load all columns from the db...
 

Michael Bayer

unread,
Jan 5, 2013, 2:12:49 PM1/5/13
to sqlal...@googlegroups.com

On Jan 5, 2013, at 1:53 PM, Alexander Jacob wrote:

>
> I preferred the event solution but
>
> class Country(Base):
> __tablename__ = 'country'
> name = Column('name_<I18N>', String(64))
>
> 'SELECT country.`name_<I18N>` AS `country_name_<I18N>`, country.id AS country_id, country.iso2 AS country_iso2
> FROM country
> WHERE country.id = %s'
>
> gets properly replace in before_cursor_execute to
>
> 'SELECT country.`name_en` AS `country_name_en`, country.id AS country_id, country.iso2 AS country_iso2
> FROM country
> WHERE country.id = %s'

ah yeah. there's a way to manipulate the column lookup as well here which is inside of the "context" passed to the event, I seem to recall working this out for someone but I'd have to re-think into a recipe here. It starts using things that aren't 100% public, though. or you could try a more elaborate regexp that skips the label names, or one that does a second replace for "tablename_colname_en" back to the original symbol since that's the labeling scheme.

but before I get into that....

>
> This will work but this will load all columns from the db...

because your country is a fixed global, you can actually just map the column here as needed:

class Country(Base):
# ...

name = Column('name_%s' % GLOBAL_COUNTRY, String)

then you don't even declare the other columns on this Table, how about that ? there's also ways to have the columns on the Table, but not mapped, using exclude_columns.

otherwise, if GLOBAL_COUNTRY changes at runtime, deferred() might be an approach:

class Country(Base):
# ...

name_de = deferred(Column(String))
name_en = deferred(Column(String))

name_de and name_en won't be part of queries, and only load when they are accessed. but this emits a second SELECT. You'd probably want some way to undefer the one you need at Query time, that looks like this:

query(Country).options(undefer(Country.name_en))

but you'd probably want to automate that somehow.


Reply all
Reply to author
Forward
0 new messages