I can see that this has been discussed before, but I am totally
baffled as to what is happening.
I.
Ubuntu
mysql version 5.0.38
with:
MySQL_python-1.2.2
SQLAlchemy-0.3.10
python-2.5
Using SA, I can insert and select unicode data with no problem. All
the mysql stuff looks like it is set to "latin1", the database wasn't
created with any special options etc., I added nothing special to the
connection string, but if I insert Russian characters (encoded in
utf-8) into a column with type Unicode, it *works*, and if I select, I
get back the correct data. It even looks fine with the "mysql"
program.
II.
Now, I try the same thing on a Red Hat system
with:
mysql version 4.1.9
MySQL_python-1.2.2
SQLAlchemy-0.3.10
python-2.4.2
I realize I am changing two things (besides the os): mysql version,
and python version.
BUT, I can't get *anything* with SA and unicode and mysql to work!
I ended up creating the database specifically with utf-8 charset to
get even things with MySQLdb to work (which I finally did). But for
anything to work, I had to do this beforehand:
cursor.execute("set collation_connection=utf8_general_ci")
cursor.execute("set collation_server=utf8_general_ci")
cursor.execute("set character_set_results=utf8")
With SA, when I try to do, say, an insert with something like:
ins = utest_t.insert({'lastname': "hello"})
conn.execute(ins)
no matter what I do I get errors like:
sqlalchemy.exceptions.DBAPIError: (LookupError) unknown encoding: latin1_swedish_ci
Does anyone know what I am doing wrong here? Or, how I can make things
right?
BTW, I have tried various versions of the connection string:
conn_string = "mysql://xx:xxx@localhost/utest?use_unicode=1&charset=utf8"
conn_string = "mysql://xx:xxx@localhost/utest?use_unicode=0&charset=utf8"
conn_string = "mysql://xx:xxx@localhost/utest"
but with the same results.
All I can say is, unicode on mysql is, well...... I won't say it.
Thanks!
David
can't reproduce this on 0.3.10 (or 0.3.11 or 0.4). the attached test
passes fine for me with zero special setup on a nearly identical setup.
(i'm testing with python 2.4.4.)
all that's needed for unicode in mysql is to ensure that the connection
and table encodings aren't configured with a latin1 legacy default. the
mysql unicode data support is actually the best of all the open source
databases, and it's tied with postgres for nearly functional unicode
schemas. (sqlite wins that one.)
to guess wildly, this:
> sqlalchemy.exceptions.DBAPIError: (LookupError) unknown encoding:
> latin1_swedish_ci
could be due to specifying a collation where a charset should be in the
database configuration somewhere.
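
As a rough illustration of that guess: Python raises exactly this kind of LookupError when it is asked for a codec under a collation name rather than a charset name. The sketch below is not taken from MySQLdb or SQLAlchemy. The helper names is_python_codec and charset_from_collation are hypothetical, and the second one only relies on the MySQL convention that a collation name begins with its charset name.

```python
import codecs


def is_python_codec(name):
    """Return True if Python can resolve `name` as a text encoding."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # Same exception class that appears inside the DBAPIError above.
        return False


def charset_from_collation(collation):
    """Hypothetical helper: take the charset prefix of a MySQL collation name.

    Assumes the MySQL naming convention, e.g. 'latin1_swedish_ci' -> 'latin1'.
    """
    return collation.split("_", 1)[0]


if __name__ == "__main__":
    for name in ("latin1", "latin1_swedish_ci", "utf8", "utf8_general_ci"):
        print(name, is_python_codec(name), charset_from_collation(name))
```

If a collation such as latin1_swedish_ci has ended up in a setting that expects a charset (for example a default-character-set entry in my.cnf), replacing it with the bare charset name would be the first thing to try.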
My output from a 5.0.41 instance is attached. The 4.1.9 output is the same
except for some wording changes in the encoding warnings. Both
instances have stock configurations right from the MySQL binary tarball,
so when 'table_options={}' runs here, the varchar columns are stored in
the default 'latin1'.
When you're seeing all question marks, is that in the script output or
in the mysql client? I don't think you'd see the Cyrillic characters in
the client unless you're using a capable terminal and possibly doing a
'set charset utf8'.
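
For what it's worth, question marks are what you would expect whenever Cyrillic text gets converted to latin1 somewhere along the way. The following is only an analogy in plain Python, not what the server or client literally runs. The helper name convert_to is hypothetical, and the "replace" error handler stands in for the way MySQL substitutes '?' for characters the target charset cannot represent.

```python
# -*- coding: utf-8 -*-

# "Привет" written with escapes so the file itself stays ASCII-safe.
TEXT = u"\u041f\u0440\u0438\u0432\u0435\u0442"


def convert_to(text, charset):
    """Hypothetical helper: encode `text`, replacing unrepresentable characters."""
    return text.encode(charset, "replace")


if __name__ == "__main__":
    # latin1 has no Cyrillic letters, so every character is replaced.
    print(repr(convert_to(TEXT, "latin-1")))
    # utf-8 can represent the text, so it round-trips unchanged.
    print(convert_to(TEXT, "utf-8").decode("utf-8") == TEXT)
```

If the question marks only show up in the mysql client, the stored data may well be intact and only the client-side conversion is lossy.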