Fwd: [elixir] sqlite3.OperationalError: Could not decode to UTF-8 column

Skip to first unread message

Gaetan de Menten

Dec 5, 2007, 3:07:23 AM12/5/07
to sqlal...@googlegroups.com
Anybody knows about this?

---------- Forwarded message ----------
From: Mitch <mchapm...@gmail.com>
Date: Dec 5, 2007 1:06 AM
Subject: [elixir] sqlite3.OperationalError: Could not decode to UTF-8 column
To: SQLElixir <sqle...@googlegroups.com>

Apologies in advance if this should be sent to the SQLAlchemy list

I have an SQLite3 database created with SQLElixir. One of the table
columns, of type String, is being populated from a file which contains
ISO-8859 data. When I try to query all records from the table I get a
traceback which ends with:

File "/Library/Frameworks/Python.framework/Versions/2.5/lib/
base.py", line 1497, in fetchall
l = [self._process_row(self, row) for row in
File "/Library/Frameworks/Python.framework/Versions/2.5/lib/
base.py", line 1492, in _fetchall_impl
return self.cursor.fetchall()
sqlite3.OperationalError: Could not decode to UTF-8 column '[...]'
with text [...]

If I were using the sqlite3 module directly, an acceptable workaround
would be to override the default text factory for the database
connection, e.g. "conn.text_factory = str". Is there any way to do
this via the elixir or sqlalchemy APIs? Is there a better solution?

Thanks for the help.

Gaëtan de Menten

Michael Bayer

Dec 5, 2007, 10:00:57 AM12/5/07
to sqlal...@googlegroups.com

he needs to supply data to the DB as python unicode objects..the
strings from the file should be decoded first from ISO-8859.

if he wants to sqlite's "text_factory" feature, he can use a custom
connection function described in http://www.sqlalchemy.org/docs/04/dbengine.html#dbengine_establishing_custom


Dec 5, 2007, 1:32:24 PM12/5/07
to sqlalchemy
On Dec 5, 8:00 am, Michael Bayer <mike...@zzzcomputing.com> wrote:
> he needs to supply data to the DB as python unicode objects..the
> strings from the file should be decoded first from ISO-8859.

Thanks for the help. After some experimentation I found that encoding
all string data to UTF-8, ignoring errors, did the trick:
s = unicode(s, "utf-8", "ignore")

In case anybody is curious, here are tests to demonstrate my problem
and some solutions. N.B. UTF-8/replace worked in these tests, but not
in my SQLElixir application.

import unittest, sqlite3

class TestCase(unittest.TestCase):
def setUp(self):
self.conn = sqlite3.connect(":memory:")
self.conn.cursor().execute('CREATE TABLE demo (value TEXT)')

def insertAndDump(self, encoding, errors, expectFailure=False):
caseName = repr([encoding, errors])
data = 'K\xf6 1366'
if encoding is not None:
data = unicode(data, encoding=encoding, errors=errors)
cursor = self.conn.cursor()
cursor.execute("INSERT INTO demo (value) VALUES (?)",
cursor.execute('SELECT * FROM demo')
self.failUnless(len(cursor.fetchall()) == 1)
self.failIf(expectFailure, "Unexpected success for %s" %
except Exception, info:
self.failUnless(expectFailure, "Failed %s: %s" %
(caseName, info))

def testCannotRetrieveUnencoded(self):
self.insertAndDump(None, None, True)

def testCannotEncodeStrict(self):
self.insertAndDump("utf-8", "strict", True)

def testCanRetrieve8859Strict(self):
self.insertAndDump("8859", "strict")

def testCanRetrieveUTF8Ignored(self):
self.insertAndDump("utf-8", "ignore")

def testCanRetrieveUTF8Replaced(self):
self.insertAndDump("utf-8", "replace")


Reply all
Reply to author
0 new messages