Newbie problem with unicode in model (pgsql)

Mariano Mara

Nov 10, 2009, 2:43:54 PM

to pylons-discuss

Hi, I'm facing the following situation and I don't know how to fix it.
What I want to do, of course, is to store unicode data, have unicode data
in the database and retrieve unicode data. But I don't know what I'm
missing.

I defined this simple table in model/__init__.py

language = sa.Table('language', meta.metadata,
                    sa.Column('id', sa.types.Unicode(length=10),
                              primary_key=True),
                    sa.Column('name', sa.types.Unicode(length=255),
                              nullable=False))

class Language(object):

    def __repr__(self):
        return "Language<%s, %s>" % (self.id, self.name)

As backend I'm using pgsql 8.3 with a database encoded in UTF-8 and in
my development.ini I have the following line:
sqlalchemy.convert_unicode = true

If I try to store unicode data in the database, all I get back when I
retrieve it is garbage.
A full example with paster shell (as you can see, my database stores
garbage too):

In [1]: x = model.Language()

In [2]: x.id = u'es'

In [3]: x.name = u'Español'

In [4]: model.meta.Session.add(x)

In [5]: model.meta.Session.commit()

In [6]: for y in model.meta.Session.query(model.Language).all(): n = y

In [7]: n.id
Out[7]: u'es'

In [8]: n.name
Out[8]: u'Espa\xc3\xb1ol'

In [9]: !psql -d kalendar2
Welcome to psql 8.3.8, the PostgreSQL interactive terminal.

kalendar2=# select * from language;
 id |   name
----+----------
 es | EspaÃ±ol
(1 row)

In [10]: n
Out[10]:
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)

...traceback...

/usr/lib/python2.6/pprint.pyc in _safe_repr(object, context, maxlevels, level)
    318         return format % _commajoin(components), readable, recursive
    319
--> 320     rep = repr(object)
    321     return rep, (rep and not rep.startswith('<')), False
    322

UnicodeEncodeError: 'ascii' codec can't encode characters in position 17-18: ordinal not in range(128)

Jonathan Vanasco

Nov 10, 2009, 6:04:58 PM

to pylons-discuss

i've run into the same thing.

in config/environment.py try adding this to your load_environment()

tmpl_options = config['buffet.template_options']
tmpl_options['mako.input_encoding'] = 'utf-8'
tmpl_options['mako.output_encoding'] = 'utf-8'
tmpl_options['mako.default_filters'] = ['decode.utf8']

i don't know why this happens, i don't like my fix, but it has worked
for me.

Mariano Mara

Nov 10, 2009, 10:32:39 PM

to pylons-discuss

Excerpts from Jonathan Vanasco's message of Tue Nov 10 20:04:58 -0300 2009:

Thank you very much, Jonathan: so far so good.

Mariano

Marius Gedminas

Nov 11, 2009, 2:48:23 PM

to pylons-...@googlegroups.com

On Tue, Nov 10, 2009 at 04:43:54PM -0300, Mariano Mara wrote:
> Hi, I'm facing the current situation and I don't know how to fix it.
> What I want to do, of course, is to store unicode data, have unicode data
> in the database and retrieve unicode data. But I don't know what I'm
> missing.

Your code looks correct to me.

> If I try to store unicode data in the database, when I try to retrieve
> it all I get is garbage.
> A full example with paster shell (as you can see my database store
> garbage too):
>
> In [1]: x = model.Language()
>
> In [2]: x.id = u'es'
>
> In [3]: x.name = u'Español'

I see from the shape of your prompt that you're using IPython.

IPython has a bug where you cannot really input Unicode literals.

Try it:

$ ipython
In [1]: len(u'ñ')
Out[1]: 2 <--- WROOONG

$ python
>>> len(u'ñ')
1 <--- correct

The bug is reported upstream and might even be fixed in ipython's trunk.
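
A minimal sketch of what that does to the data, assuming the shell hands
each UTF-8 byte of the typed text to Python as a separate code point
(which is what Out[8] in the original post suggests). The variable names
are illustrative only, and the code is written for Python 3 so it can be
run on a current interpreter:

```python
typed = u"Espa\u00f1ol"            # what was meant: 7 characters

raw = typed.encode("utf-8")        # the bytes a UTF-8 terminal sends
mangled = raw.decode("latin-1")    # assumption: one code point per byte

print(len(typed), len(mangled))    # 7 8
print(ascii(mangled))              # the same escapes as Out[8]

# The single character from the len() test turns into two characters.
print(len(u"\u00f1".encode("utf-8").decode("latin-1")))  # 2

# Reversing the two steps recovers the intended text.
repaired = mangled.encode("latin-1").decode("utf-8")
print(repaired == typed)           # True
```

Under that assumption the model and the database are storing exactly
what they were given, which fits with the table definition being fine.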

Marius Gedminas
--
At most companies, programmers aren't trusted with words that a user might
actually see (and for good reason, much of the time).
-- Joel Spolsky

Mariano Mara

Nov 12, 2009, 12:43:29 PM

to pylons-discuss

Excerpts from Marius Gedminas's message of Wed Nov 11 16:48:23 -0300 2009:
