Stuck again with charset from freetds.

qwig...@gmail.com

Dec 10, 2012, 7:37:27 PM
to pyo...@googlegroups.com
Hi.
I am experiencing Issue #222 again:
http://code.google.com/p/pyodbc/issues/detail?id=222

tsql and isql return correct results (i.e. Unicode data as UTF-8, according to the locale), so I believe FreeTDS itself works.
pyodbc, however, replaces non-ASCII characters with '?'.

python: 2.6.5 (r265:79063, Oct 1 2012, 22:04:36) [GCC 4.4.3]
pyodbc: 3.0.7-beta10 /usr/src/pyodbc/build/lib.linux-x86_64-2.6/pyodbc.so
odbc: 03.52 
driver: libtdsodbc.so 0.82 
supports ODBC version 03.00 
os: Linux 
unicode: Py_Unicode=4 SQLWCHAR=2
server: MSSQL-2000, Cyrillic_General_BIN 

I also tried pyodbc 3.0.5, 3.0.6 and various 2.1.x releases, with no luck.

Is there a way to debug the issue?
I have no idea how to use gdb to inspect either a PyObject or the internals of libodbc.
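
A low-tech check that needs no gdb is to look at the code points of whatever pyodbc hands back. That distinguishes real question marks (U+003F, meaning the data was damaged before it reached Python) from a terminal rendering problem. The sketch below is only an illustration: `describe` is a hypothetical helper, not part of pyodbc, and the sample values are made up rather than taken from the database.

```python
def describe(value):
    """Return hex byte values (for bytes) or code points (for text) of value."""
    if isinstance(value, bytes):
        return ["0x%02x" % b for b in bytearray(value)]
    return ["U+%04X" % ord(ch) for ch in value]

# A value that was damaged on the way in contains literal question marks.
damaged = describe(u"???")

# An intact Cyrillic value shows code points in the U+0400 block.
intact = describe(u"\u042f\u043d\u0430")

print(damaged)
print(intact)
```

Calling `describe(row[0])` on a fetched column (where `row` is assumed to come from `cursor.fetchone()`) shows which of the two cases applies.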

qwig...@gmail.com

Dec 11, 2012, 6:55:10 AM
to pyo...@googlegroups.com
It seems the error appears when GetDataString in getdata.c uses SQL_C_WCHAR,
which happens with unicode_results=True, with Python version >= 3, or when the code is explicitly patched to do so:

ret = SQLGetData(cur->hstmt, (SQLUSMALLINT)(iCol+1), nTargetType, buffer.GetBuffer(), buffer.GetRemaining(), &cbData);

I'm not sure the error appears exactly when the Unicode field is retrieved, because I still cannot build pyodbc with debug info.
But the only statement executed is: "SELECT TOP 1 LastName FROM Interviewer"

And here is how isql (unixODBC) retrieves a Unicode field:

SQLCHAR                 szColumnValue[MAX_DATA_WIDTH+1];
/* include/sqltypes.h: typedef unsigned char   SQLCHAR; */

nReturn = SQLGetData( hStmt, nCol, SQL_C_CHAR, (SQLPOINTER)szColumnValue, sizeof(szColumnValue), &nIndicator );
fputs((char*) szColumnValue, stdout );

The last statement displays the string correctly on a UTF-8 console.
That means szColumnValue already contains a UTF-8 encoded string right after SQLGetData returns.
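
To make the isql observation concrete, here is a minimal Python sketch of what the narrow-character path amounts to. It does not call ODBC. `raw` is a hypothetical stand-in for the contents of szColumnValue, the buffer size of 32 is arbitrary, and `decode_c_string` is an illustrative helper. It is written for current Python 3 rather than the 2.6 interpreter from the report.

```python
# Stand-in for the buffer filled by SQLGetData(..., SQL_C_CHAR, ...):
# a NUL-terminated UTF-8 string inside a larger fixed-size buffer.
buffer_size = 32  # arbitrary; isql uses MAX_DATA_WIDTH + 1
payload = "Васильев".encode("utf-8")
raw = payload + b"\x00" * (buffer_size - len(payload))

def decode_c_string(buf, encoding="utf-8"):
    """Decode the bytes up to the first NUL, i.e. what fputs() would print."""
    end = buf.find(b"\x00")
    if end == -1:
        end = len(buf)
    return buf[:end].decode(encoding)

text = decode_c_string(raw)
print(len(payload), text)
```

Each Cyrillic letter takes two bytes in UTF-8, so the eight-letter name occupies sixteen bytes before the terminator, and nothing is lost when the bytes are decoded with the right encoding.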

Максим Васильев

Dec 12, 2012, 5:44:14 AM
to pyo...@googlegroups.com
Here is the solution, for anyone who runs into the same problem and finds this message.

1. Avoid using MSSQL-2000 (8.00).
It only supports tds_version 4.2 (or may be misconfigured to behave that way):
it can store Unicode in nvarchar but cannot transfer it as Unicode;
it can store Unicode in ntext but cannot transfer it at all.

2. Avoid using freetds-0.82 (it does not support WCHAR at all).
freetds-0.91 supports WCHAR and also iconv recoding of data from the server.
The 'client charset' parameter is required
(so that iconv is initialized before login, when the server sets up the encoding).

The combination of MSSQL-2000 and freetds-0.91/unixodbc-2.2.14 works, but with restricted Unicode support:
strings whose last character has code 0xFF (in the server's native charset) cause a failure inside libodbc
(0xFF was used to mark end-of-string in old Windows conventions,
whereas in some charsets it is occupied by a letter of the national alphabet).
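
The 0xFF restriction is easy to hit with Cyrillic data. The sketch below assumes the server's native charset is Windows-1251, which is what a Cyrillic_General collation normally implies. That is an assumption, not something verified against the server in this thread. The sample word is made up.

```python
# In Windows-1251, byte 0xFF is the Cyrillic small letter "ya" (U+044F);
# in Latin-1 the same byte is "y with diaeresis" (U+00FF).
last_byte = b"\xff"
as_cp1251 = last_byte.decode("cp1251")
as_latin1 = last_byte.decode("latin-1")

# Any word ending in that letter therefore ends in 0xFF on the wire.
word = "Мария"
encoded = word.encode("cp1251")
ends_with_ff = encoded.endswith(b"\xff")

print(repr(as_cp1251), repr(as_latin1), ends_with_ff)
```

Many Russian names and words end in this letter, so values ending in 0xFF are common in such data.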