unicode question


Jose Galvez

Mar 16, 2009, 10:07:01 PM
to pylons
I've had issues in the past with unicode being "injected" into my apps,
usually from database entries. Most of these problems could most
likely be avoided if I had added # -*- coding: utf-8 to the top of my
controllers. But why isn't this the default behavior? If I make a
controller with paster controller <controllername>, shouldn't this be
added to the top by default? Especially since everything in Pylons is
supposed to be in unicode, it would make things easier, wouldn't it? Also,
will any of this be needed once Python switches to unicode as the default?

Martin Brabham

Mar 16, 2009, 10:15:12 PM
to pylons-...@googlegroups.com
A lot of times in our apps, we use the ignore option for utf8. We get unicode errors mostly in our Mako templates, from titles of auctions and other things of that nature. I've even experienced them in URLs. We catch it like this:

offendingString = offendingString.decode('utf8', 'ignore')

This might not be exactly what you are asking, but I hope it helps.
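
For illustration, here is a small sketch of what the 'ignore' error handler does compared with 'strict' and 'replace'. It is written for Python 3, where the input has to be a bytes object, and the sample title is made up. The point to notice is that 'ignore' silently drops whatever it cannot decode, so some data is lost.

```python
# Hypothetical auction title containing one byte (0xE9) that is not valid UTF-8.
raw_title = b"Caf\xe9 table"

# 'strict' (the default) raises on the bad byte.
try:
    raw_title.decode("utf8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' drops the offending byte without any warning.
ignored = raw_title.decode("utf8", "ignore")

# 'replace' substitutes U+FFFD so the damage stays visible.
replaced = raw_title.decode("utf8", "replace")

print(strict_failed, repr(ignored), repr(replaced))
```

If losing characters matters, 'replace' at least leaves a marker where something went wrong.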

Jose Galvez

Mar 16, 2009, 11:37:31 PM
to pylons-...@googlegroups.com
I've done similar stuff to try and catch unicode errors too. I guess my point is: since Pylons pushes unicode, Mako pushes unicode, and Python will be all unicode with version 3, why write code to catch an error when it doesn't have to be an error? Doesn't adding the magic encoding comment remove the need for much of the code we write to "catch them"?

Christopher Barker

Mar 17, 2009, 12:34:00 PM
to pylons-...@googlegroups.com
Jose Galvez wrote:
> I've done similar stuff to try and catch unicode errors too. I guess my
> point is since pylons pushes unicode and mako pushes unicode python will
> all be in unicode with version 3, why write code to catch an error, when
> it doesn't have to be an error? doesn't adding the magic encoding
> comment remove the need for much of the code we write to "catch them"?

nope -- all that does is specify what encoding the source file is in,
so it affects literals -- that's it. If you are getting unicode objects
and strings intermixed from DB queries, etc., that's another problem.
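
A rough sketch of that distinction, assuming Python 3 (where compile() accepts source as bytes and honours the coding declaration). The same bytes are compiled under two different declarations, and only the literal changes; the helper name literal_from and the db_bytes value are made up for the demo.

```python
# The same bytes "on disk", declared with two different source encodings.
# The string literal contains the two bytes 0xC3 0xA9.
src_utf8 = b"# -*- coding: utf-8 -*-\ns = '\xc3\xa9'\n"
src_latin1 = b"# -*- coding: latin-1 -*-\ns = '\xc3\xa9'\n"


def literal_from(source_bytes):
    """Compile source bytes and return the value bound to `s`."""
    namespace = {}
    exec(compile(source_bytes, "<demo>", "exec"), namespace)
    return namespace["s"]


as_utf8 = literal_from(src_utf8)      # one character: U+00E9
as_latin1 = literal_from(src_latin1)  # two characters: U+00C3 U+00A9

# Bytes arriving at runtime (say, from a database driver) are untouched by
# the declaration; they still have to be decoded explicitly.
db_bytes = b"\xc3\xa9"
decoded = db_bytes.decode("utf-8")

print(len(as_utf8), len(as_latin1), decoded == as_utf8)
```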

I like to think of it like this:

Use unicode entirely inside your app.

encode/decode (or make sure you know the encoding of the source) EVERY
TIME you do any IO -- reading/writing files, getting data to/from a
database, etc.
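
As a minimal sketch of that rule, assuming Python 3 and using in-memory byte streams as stand-ins for a real file or database: decode once on the way in, work only with text in the middle, encode once on the way out. The function names and the choice of Latin-1 input are assumptions for the example.

```python
import io


def read_titles(stream, encoding):
    """Decode at the input boundary: bytes in, text out."""
    return stream.read().decode(encoding).splitlines()


def write_titles(stream, titles, encoding):
    """Encode at the output boundary: text in, bytes out."""
    stream.write("\n".join(titles).encode(encoding))


# Stand-in for a source known to hold Latin-1 encoded bytes.
incoming = io.BytesIO("caf\u00e9\nna\u00efve".encode("latin-1"))

titles = read_titles(incoming, "latin-1")  # text from here on
shouting = [t.upper() for t in titles]     # app logic sees only text

outgoing = io.BytesIO()
write_titles(outgoing, shouting, "utf-8")
result = outgoing.getvalue()               # UTF-8 bytes, ready to send
```

The key design choice is that the encoding is named explicitly at each boundary rather than left to a default.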

Where this gets ugly is legacy data that may be in mixed encodings - arrgg!
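
One common workaround for mixed legacy data is to try a list of candidate encodings in order. The sketch below assumes Python 3; the helper name decode_legacy and the candidate order (UTF-8, then cp1252) are guesses for illustration, not something from this thread. It is only a heuristic: bytes that happen to be valid in an earlier candidate will be decoded that way even if they were written in a later one.

```python
def decode_legacy(raw, encodings=("utf-8", "cp1252")):
    """Return (text, encoding_used), trying each candidate in order."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte to a code point, so it never fails.
    return raw.decode("latin-1"), "latin-1"


# Hypothetical rows: valid UTF-8, a cp1252 byte, and a byte cp1252 rejects.
rows = [b"caf\xc3\xa9", b"caf\xe9", b"caf\x81"]
decoded = [decode_legacy(r) for r in rows]

for text, used in decoded:
    print(repr(text), used)
```

Recording which encoding was used makes it possible to audit or re-process the doubtful rows later.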

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

jj.galvez

Mar 18, 2009, 2:17:23 PM
to pylons-discuss
hmm ok, I guess I need to spend more time looking at the unicode
tutorials. Are there any Pylons tutorials that show an example of how to
properly use unicode? I would love to see an example, as I am really
new to using unicode.
Jose