Unicode Query Result Problems

Shawn Brown

Nov 6, 2009, 6:45:49 PM
to pyodbc
I'm having Unicode-related problems with pyodbc.

I can connect to my database and run queries without issue. I can
SELECT numeric values and use the results without problems but when I
select varchar fields, the results are scrambled.

For example, given the following query:

SELECT countyname FROM county_db WHERE countyname='ADAMS' LIMIT 1

I should get a single item containing the text "ADAMS". But instead, I
get the following:

u'\u4441\u4d41S\u3139\ud580'

The hex values for "ADAMS" are 41 44 41 4d 53. The above Unicode hex
contains these ASCII hex values plus some additional junk values. For
clarity, see below:

\u4441 => 44 (D), 41 (A)
\u4d41 => 4d (M), 41 (A)
S => S
\u3139 <- junk data from adjacent memory address
\ud580 <- junk data from adjacent memory address

I saw Daniel Holth's post
<http://groups.google.com/group/pyodbc/msg/7ba109cd82eab4ac> and added
'-fshort-wchar' to setup.py's extra_compile_args and reinstalled pyodbc,
but this didn't change the behavior noted above.

I've been at this for a while and I'm out of ideas. Can anyone help?
Suggestions are welcome.

P.S. I'm using Python 2.6.4 compiled with the default UCS2 option.

Daniel Holth

Nov 6, 2009, 10:43:02 PM
to pyodbc
Shawn,

You have provided almost no information about your environment. OS?
Database? ODBC driver vendor? ODBC driver manager? 32 or 64 bit?

Please try SQLSetConnectAttr() as mentioned in my original post on the
off chance your ODBC drivers were provided by DataDirect. If not, you
need to read the programming manuals until you find the part that
mentions Unicode. Mine return UTF-8 by default or UCS2, and it's less
work to get pyodbc to work with UCS2.

One imperfect solution I tried while I was working on this is to patch
pyodbc so it treats Unicode results the same way as ASCII. You could
probably do the UTF-8 -> unicode conversion at a higher layer.
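A rough sketch of what that higher-layer conversion might look like, assuming the patched driver hands back character columns as raw UTF-8 byte strings. The helper decode_row and the sample rows are hypothetical and only illustrate the idea; they are not pyodbc API.

```python
def decode_row(row, encoding="utf-8"):
    """Return a copy of a result row with byte-string columns decoded.

    Hypothetical helper: non-bytes values (numbers, None, dates) are
    passed through unchanged.
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )


# Sample rows standing in for what a cursor might return after the patch.
rows = [(b"ADAMS", 1), (b"C\xc3\xb4te", 2)]
decoded = [decode_row(r) for r in rows]
print(decoded)
```

Decoding in application code keeps the driver patch small, at the cost of having to remember the conversion wherever rows are fetched.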

Daniel

Shawn Brown

Nov 7, 2009, 9:13:58 AM
to pyodbc
Hi, Daniel. Thanks for the reply.

The environment I'm working with:
OS - Linux (SuSE)
Database - Netezza
ODBC driver vendor - possibly Netezza (but I'll check)
Driver Manager - unixODBC
32/64 bit - not sure (but I'll check)

> Please try SQLSetConnectAttr() as mentioned in my
> original post on the off chance your ODBC drivers
> were provided by DataDirect.

I'll look into this -- will have to be on Monday, though. The
documentation says that it's returning UTF-8.