Unicode issues with Zope, SQLAlchemy and ibm_db_sa/ibm_db

Frank Hauptmann

Apr 13, 2010, 4:50:48 AM
to ibm_db
Hello Group,

I'm currently working on a project that uses Zope application server
2.12.4, Products.SQLAlchemyDA-0.4.1-py2.4, SQLAlchemy-0.4.8-py2.6,
ibm_db_sa-0.1.6-py2.6, ibm_db-1.0.1-py2.6-linux-i686 and IBM DB2 9 on
an OpenSuSE Linux box. Basically, database access from Zope to DB2
is working. But whenever an INSERT or UPDATE statement contains
German umlauts (äöüÄÖÜ) or other special characters, the whole
Zope server crashes and restarts without changing any data in the
database.
So I tried it directly in the Python shell:

>>> import ibm_db
>>> import ibm_db_sa
>>> conn = ibm_db.connect('HSI_NEU2', 'db2inst1', '*******')
>>> stmt = ibm_db.exec_immediate(conn, "UPDATE T_ESSENZ set s_essenz = 'Agrimonia ö' where i_pk_essenz = 1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: [IBM][CLI Driver] CLI0124E Invalid argument value. SQLSTATE=HY009 SQLCODE=-99999

I get this "Invalid argument value" exception.
But:

stmt = ibm_db.exec_immediate(conn, unicode("UPDATE T_ESSENZ set s_essenz = 'Agrimonia ö' where i_pk_essenz = 1", "iso-8859-1"))

works without any problem.
Executing

db2inst1@hsits100:~> db2 "UPDATE T_ESSENZ set s_essenz = 'Agrimonia ä' where i_pk_essenz = 1"

from the command line works without any problems, too.

Obviously somewhere in the chain from Zope through SQLAlchemy to
ibm_db the encoding gets messed up.
In SQLAlchemyDA I tried setting and unsetting the "encoding" and
"convert_unicode" properties, but it had no effect.
Finally I dug through the source code of SQLAlchemyDA and added one
line to the "query" method of da.py:

qs = unicode(qs, 'ISO-8859-1')

This at least stopped Zope from crashing. But now
the database contains "Agrimonia ö" as the value for "S_ESSENZ". So
somewhere down the chain something is still going wrong with the
encoding.
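
A garbled stored value after this change would be consistent with decoding the query string with the wrong codec. As an illustration only, assuming (this is not confirmed anywhere in the thread) that the bytes arriving in the "query" method were UTF-8 encoded, decoding them as ISO-8859-1 turns each umlaut into two characters:

```python
# Assumption for illustration: the bytes reaching the query method are UTF-8.
utf8_bytes = u"Agrimonia \u00f6".encode("utf-8")  # o-umlaut becomes 0xC3 0xB6

wrong = utf8_bytes.decode("iso-8859-1")  # every byte maps to one character
right = utf8_bytes.decode("utf-8")       # the two bytes map back to one character

print(repr(wrong))
print(repr(right))
```

If that is what happens here, the added line would need to name the encoding the string actually carries rather than a fixed 'ISO-8859-1'.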
After digging deeper through the code I found this method in
sqlalchemy/engine/base.py:

    def execute_string(self, stmt, params=None):
        """execute a string statement, using the raw cursor,
        and return a scalar result."""
        conn = self.context._connection
        if isinstance(stmt, unicode) and not self.dialect.supports_unicode_statements:
            stmt = stmt.encode(self.dialect.encoding)
        conn._cursor_execute(self.cursor, stmt, params)
        return self.cursor.fetchone()[0]


It is in the class DefaultRunner. IBM_DBDefaultRunner seems to
inherit everything from this class.

I observed that supports_unicode_statements property in IBM_DBDialect
is set to False (it is inherited from DefaultDialect). Is this right?
DB2 obviously supports unicode statements. But adding the line:

supports_unicode_statements = True

to ibm_db_sa.py (IBM_DBDialect) doesn't seem to change anything.
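
To make the effect of the flag concrete, here is a standalone sketch of the branch from execute_string. `FakeDialect` and `prepare_statement` are hypothetical names used for illustration, not SQLAlchemy API, and the 'utf-8' default is an assumption of the sketch; only the condition itself is taken from the quoted method:

```python
text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


class FakeDialect(object):
    """Hypothetical stand-in with just the two attributes the branch reads."""

    def __init__(self, supports_unicode_statements, encoding="utf-8"):
        self.supports_unicode_statements = supports_unicode_statements
        self.encoding = encoding


def prepare_statement(stmt, dialect):
    # Same condition as in execute_string: encode only when the dialect
    # says the driver cannot take unicode statements.
    if isinstance(stmt, text_type) and not dialect.supports_unicode_statements:
        stmt = stmt.encode(dialect.encoding)
    return stmt


stmt = u"UPDATE T_ESSENZ set s_essenz = 'Agrimonia \u00f6' where i_pk_essenz = 1"
as_bytes = prepare_statement(stmt, FakeDialect(False))  # encoded byte string
as_text = prepare_statement(stmt, FakeDialect(True))    # passed through unchanged
```

With the flag set to False the driver receives an encoded byte string, which is the kind of input that failed in the shell session. The sketch mirrors only execute_string; other code paths may encode statements separately, which could explain why changing the flag alone shows no effect.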

I don't know whether this is an issue with ibm_db_sa, SQLAlchemy,
SQLAlchemyDA or even the configuration of the Linux box. The locale for
the Linux box and Zope is set to "de_DE@euro". Changing the locale
didn't help either.
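
One possible reason locale changes have no effect: Python 2's implicit str-to-unicode conversion uses sys.getdefaultencoding(), which is 'ascii' unless a site customization changes it, and it is not derived from the locale. A small diagnostic sketch (the helper name `encoding_report` is made up here, and the values printed depend on the machine):

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python reports for this process."""
    return {
        # Python 2 uses this one for implicit str/unicode conversions.
        "default": sys.getdefaultencoding(),
        # This one is derived from the locale settings.
        "preferred": locale.getpreferredencoding(),
        "filesystem": sys.getfilesystemencoding(),
    }


for name, value in sorted(encoding_report().items()):
    print("%s: %s" % (name, value))
```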

Any advice on where to look or which parameter to change would be
welcome. Thanks in advance.


Frank

Rahul

Apr 13, 2010, 7:37:47 AM
to ibm_db
<<<<<
>>> stmt = ibm_db.exec_immediate(conn, "UPDATE T_ESSENZ set s_essenz = 'Agrimonia ö' where i_pk_essenz = 1")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: [IBM][CLI Driver] CLI0124E Invalid argument value. SQLSTATE=HY009 SQLCODE=-99999
<<<<<

You got this error because your UPDATE statement contains non-ASCII
characters, and internally we convert every statement to a unicode
object through PyUnicode_FromObject().
PyUnicode_FromObject() returns NULL if a string object contains
non-ASCII characters.
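
The failure can be reproduced without a database. A minimal sketch, assuming the driver's conversion behaves like a strict decode with Python 2's default ASCII codec, which is how PyUnicode_FromObject() treats byte strings. The byte string below is the ISO-8859-1 encoding of the statement and stands in for a Python 2 str; `implicit_decode` is a hypothetical helper, not part of ibm_db:

```python
# The statement as a Python 2 `str` would hold it when typed in an
# ISO-8859-1 terminal: the o-umlaut is the single byte 0xF6.
raw_stmt = b"UPDATE T_ESSENZ set s_essenz = 'Agrimonia \xf6' where i_pk_essenz = 1"


def implicit_decode(data):
    """Mimic the implicit conversion: strict decode with the ASCII codec."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        return None  # the C API signals the same failure by returning NULL


failed = implicit_decode(raw_stmt)          # None: 0xF6 is not ASCII
explicit = raw_stmt.decode("iso-8859-1")    # succeeds: the codec is named
```

This matches what the shell session shows: the plain string fails, while the same bytes wrapped in unicode(..., "iso-8859-1") go through.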

<<<<<
I observed that supports_unicode_statements property in IBM_DBDialect
is set to False (it is inherited from DefaultDialect). Is this right?
DB2 obviously supports unicode statements.
<<<<<

Yes, you are correct: “supports_unicode_statements” should be True.
ibm_db_sa-0.1.6 was written at a time when the ibm_db driver
didn't support unicode.
Now the ibm_db driver supports unicode, so “supports_unicode_statements”
must be True.

To reproduce your problem I need to set up an environment for it, which
will take some time given my current schedule.
Once I am able to reproduce your problem, or if I run into trouble
reproducing it, I will contact you.
__
Thanks,
Rahul Priyadarshi

Frank Hauptmann

Apr 13, 2010, 8:06:26 AM
to ibm_db
Hello Rahul,

thanks for the quick answer.

On 13 Apr., 13:37, Rahul <rahul.priyadar...@in.ibm.com> wrote:


> You got this error because your UPDATE statement contains non-ASCII
> characters, and internally we convert every statement to a unicode
> object through PyUnicode_FromObject().
> PyUnicode_FromObject() returns NULL if a string object contains
> non-ASCII characters.

But it only does this for string objects? As long as I send a unicode
object to ibm_db it works for me.
Where is PyUnicode_FromObject() called?

> Yes, you are correct: “supports_unicode_statements” should be True.
> ibm_db_sa-0.1.6 was written at a time when the ibm_db driver
> didn't support unicode.
> Now the ibm_db driver supports unicode, so “supports_unicode_statements”
> must be True.

But there is still a conversion happening somewhere, even when
supports_unicode_statements is set to True?

> To reproduce your problem I need to set up an environment for it, which
> will take some time given my current schedule.
> Once I am able to reproduce your problem, or if I run into trouble
> reproducing it, I will contact you.

Feel free to contact me, if you need further details.

Frank

Rahul

Apr 20, 2010, 1:53:06 AM
to ibm_db
<<<<<<
>But it only does this for string objects? As long as I send a unicode
>object to ibm_db it works for me.
>Where is PyUnicode_FromObject() called?
<<<<<<

For both (string and unicode objects) we use PyUnicode_FromObject() to
get a unicode object.
This function is called from the ibm_db driver.

If a string object contains non-ASCII characters, then
PyUnicode_FromObject() is not able to convert it to a unicode object
and returns NULL. So make sure to pass a unicode object if your
statement contains non-ASCII characters.
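
Following that advice, statements can be normalized before they reach the driver. A minimal sketch: `ensure_unicode` is a hypothetical helper, not part of ibm_db, and the 'iso-8859-1' default is an assumption chosen to match the call that worked earlier in the thread:

```python
text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def ensure_unicode(stmt, encoding="iso-8859-1"):
    """Return stmt as a unicode object, decoding byte strings explicitly."""
    if isinstance(stmt, text_type):
        return stmt  # already unicode, nothing to do
    return stmt.decode(encoding)


sql = ensure_unicode(
    b"UPDATE T_ESSENZ set s_essenz = 'Agrimonia \xf6' where i_pk_essenz = 1"
)
# sql could then be passed on, e.g. ibm_db.exec_immediate(conn, sql)
```

Naming the encoding explicitly avoids relying on the implicit ASCII conversion, at the cost of having to know which encoding the incoming bytes really use.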

<<<<<<
But there is still a conversion happening somewhere, even when
supports_unicode_statements is set to True?
<<<<<<

I can't tell you this without checking.

Thanks,
Rahul Priyadarshi