SA 0.4 with SQLServer and PyODBC Unicode data into nvarchar column

385 views
Skip to first unread message

cristiroma

unread,
Jun 18, 2009, 8:30:25 AM6/18/09
to sqlalchemy
Hello, I have a question regarding SQLAlchemy(0.4) with PyODBC,
SQLServer 2008.
I want to insert Unicode data and query values should be prefixed with
'N' like INSERT INTO test VALUES(N'Сърцето и черният й дроб'). Is this
possible with SA 0.4?

The main problem with SQL Server is that needs explicit 'N' prefix to
string values containing Unicode and I cannot find a way to insert
Unicode data other than modifying SA queries. Before digging into this
I want to make sure there is no other way.

Best regards,
Cristian.

Michael Bayer

unread,
Jun 18, 2009, 9:46:07 AM6/18/09
to sqlal...@googlegroups.com
PyODBC accepts unicode strings (and by that I mean Python unicode,
u'some string') directly on most platforms. No "N" character is
needed (unless PyODBC or the odbc driver does this behind the
scenes). Our MSSQL dialect does detect certain builds where this is
not possible and instead encodes to utf-8 or whatever encoding is
configured on the engine before passing in unicode strings.

phrr...@googlemail.com

unread,
Jun 18, 2009, 11:43:15 AM6/18/09
to sqlalchemy
You should put this into your .freetds.conf file to ensure that
FreeTDS will tell iconv to do the right thing (my understanding is
that all unicode data is encoded to UCS-2 by FreeTDS)

tds version = 8.0
client charset = UTF-8

SQL Alchemy create_engine has an encoding kwarg:
encoding=’utf-8’ – the encoding to use for all Unicode translations,
both by engine-wide unicode conversion as well as the Unicode type
object

This should be set to the same value as you have for 'client charset'
in the .freetds.conf file.

You can run into problems when bogus data has been stuff into the
nvarchar field at the dataserver as it will cause the python codec to
blow up when retrieving the data so "Don't Do That" (I discovered this
the hard way by having a server-side job that was populating the data
and not ensuring that the encoding was well-formed)


pjjH


Cristian Romanescu

unread,
Jun 18, 2009, 1:46:19 PM6/18/09
to sqlal...@googlegroups.com
Well, out of the box it didn't work, I pasted a sample code here: http://pastebin.com/m104b32e0 (note that strings are bulgarian characters not &#1040 as put by pastebin. My FreeTDS/ODBC are:

----------- freetds.conf
[Observations_TDS]
        host = 10.0.0.50
        port = 1433

        tds version = 8.0
        client charset = UTF-8

odbc.ini
[Observations]
Driver          = MSSQLunixODBC
Description     = SQL Server
Servername      = Observations_TDS
User            = sa
Password        = vmuser
Language        =
Database        = 20090610OBS
Logging         = 1
LogFile         = /tmp/log_observations
QuotedId        = Yes
AnsiNPW         = Yes
Mars_Connection = No


Sample code is:
#!/var/local/eoe/python245/bin/python
# -*- coding: UTF-8 -*-
# vim: set fileencoding=UTF-8 :

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Table, Column, String, MetaData, Unicode, Integer
from sqlalchemy.orm import mapper

# Table definition in SQL
#CREATE TABLE [dbo].[test2] (
#  [ID] int IDENTITY(1, 1) NOT NULL,
#  [comment] nvarchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
#  CONSTRAINT [PK_test2] PRIMARY KEY CLUSTERED ([ID])
#)
#ON [PRIMARY]
#GO
# sqlalchemy 0.4.8
# python 2.4
# freetds-0.8.2 officialy patched
# pyodbc 2.1.6

metadata = MetaData()
test2_table = Table('test2', metadata,
    Column('comment', Unicode, primary_key=True)
)

class Test2(object):
    def __init__(self, comment):
        self.comment = comment

mapper(Test2, test2_table)

engine = create_engine('mssql://sa:vmuser@/?dsn=Observations', echo=True)
Session = sessionmaker(bind=engine)
session = Session()

ob = Test2(u'товарни автомобили')
session.save(ob)
session.commit()

ob_list = session.query(Test2).all()
print ob_list

Output is:
[cornel@localhost bin]$ ./python test2.py
2009-06-18 16:55:22,754 INFO sqlalchemy.engine.base.Engine.0x..94 BEGIN
2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 INSERT INTO test2 (comment) VALUES (?)
2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 ['\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd0\xbd\xd0\xb8 \xd0\xb0\xd0\xb2\xd1\x82\xd0\xbe\xd0\xbc\xd0\xbe\xd0\xb1\xd0\xb8\xd0\xbb\xd0\xb8']
2009-06-18 16:55:22,787 INFO sqlalchemy.engine.base.Engine.0x..94 COMMIT
2009-06-18 16:55:22,791 INFO sqlalchemy.engine.base.Engine.0x..94 BEGIN
2009-06-18 16:55:22,791 INFO sqlalchemy.engine.base.Engine.0x..94 SELECT test2.comment AS test2_comment
FROM test2 ORDER BY test2.comment
2009-06-18 16:55:22,791 INFO sqlalchemy.engine.base.Engine.0x..94 []
[<__main__.Test2 object at 0xb7ab5d0c>]







2009/6/18 Michael Bayer <mik...@zzzcomputing.com>



--
The brain is a wonderful organ. It starts working
the moment you get up in the morning, and
does not stop until you get into the office.
Robert Frost (1874-1963)

Cristian Romanescu

unread,
Jun 18, 2009, 1:54:41 PM6/18/09
to sqlal...@googlegroups.com
The wirdest thing is that if I write:

ob = Test2('just a test')
session.save(ob)
session.commit()

2009-06-18 17:08:15,251 INFO sqlalchemy.engine.base.Engine.0x..74 INSERT INTO test2 (comment) VALUES (?)
2009-06-18 17:08:15,251 INFO sqlalchemy.engine.base.Engine.0x..74 ['just a test']
2009-06-18 17:08:15,262 INFO sqlalchemy.engine.base.Engine.0x..74 COMMIT


Everything works fine, but if I write:

ob = Test2(u'товарни автомобили')
session.save(ob)
session.commit()

Output is:
[cornel@localhost bin]$ ./python test2.py
2009-06-18 16:55:22,754 INFO sqlalchemy.engine.base.Engine.0x..94 BEGIN
2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 INSERT INTO test2 (comment) VALUES (?)
2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 ['\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd0\xbd\xd0\xb8 \xd0\xb0\xd0\xb2\xd1\x82\xd0\xbe\xd0\xbc\xd0\xbe\xd0\xb1\xd0\xb8\xd0\xbb\xd0\xb8']
2009-06-18 16:55:22,787 INFO sqlalchemy.engine.base.Engine.0x..94 COMMIT

And in the database, column is empty! no data, just blank.
Reply all
Reply to author
Forward
0 new messages