How to deal with non-ASCII(such as Chinese)table name?

226 views
Skip to first unread message

Chen Houwu

unread,
Mar 25, 2007, 1:04:02 AM3/25/07
to sqlalchemy
from the sqlalchemy documentation,
I found when table definition
(e.g
>>> users_table = Table('users', metadata,
... Column('user_id', Integer, primary_key=True),
... Column('user_name', String(40)),
... Column('password', String(10))
... )
)

and a class
(e.g.
>>> class User(object):
... pass
)

are mapped together,
the column name in the column definition(e.g. 'password') are mapped
into an attribute of the class,
but python only accept ASCII attribute name.
so,
What if the above "password" is a word in Chinese GB2312 encoding?

Does it mean I can't deal with column name which is not ASCII?

sdo...@sistechnology.com

unread,
Mar 25, 2007, 3:05:53 AM3/25/07
to sqlal...@googlegroups.com
try Column()'s parameters quote=.. and key=..,
doc/docstrings.html#docstrings_sqlalchemy.schema_Column

and/or mapper's explicit column-naming:
doc/adv_datamapping.html#advdatamapping_properties_colname

> -~----------~----~----~----~------~----~------~--~---


Chen Houwu

unread,
Mar 25, 2007, 5:57:44 AM3/25/07
to sqlalchemy
thanks you very much!

Michael Bayer

unread,
Mar 25, 2007, 10:55:27 AM3/25/07
to sqlal...@googlegroups.com
non-ascii attribute names in your Python code ? or non-ascii column
names defined in your database table ? I didnt know the latter was
possible with most databases.

Jorge Godoy

unread,
Mar 25, 2007, 11:25:33 AM3/25/07
to sqlal...@googlegroups.com
Michael Bayer <mik...@zzzcomputing.com> writes:

> non-ascii attribute names in your Python code ? or non-ascii column
> names defined in your database table ? I didnt know the latter was
> possible with most databases.

I know that (ARGH!) MySQL would allow the latter. I've seen column names with
diacriticals... I dunno if it would allow table names with those, though...

Also, with quotes you can get a lot of weird things (such as spaces in column
names).

This is for PostgreSQL:

# create table weird_test ("column test" serial, "é um teste" varchar);
NOTA: CREATE TABLE criará sequência implícita "weird_test_column test_seq" para coluna serial "weird_test.column test"
CREATE TABLE
# \d weird_test
Tabela "public.weird_test"
Coluna | Tipo | Modificadores
-------------+-------------------+--------------------------------------------------------------------
column test | integer | not null default nextval('"weird_test_column test_seq"'::regclass)
é um teste | character varying |

# select * from weird_test ;
column test | é um teste
-------------+------------
(0 registros)

# insert into weird_test ("é um teste") values ('testing 123');
INSERT 0 1
# select * from weird_test ;
column test | é um teste
-------------+-------------
1 | testing 123
(1 registro)

#


So, if you don't use quotes you're a lot more restricted than with quotes. I
dunno, though, who would be crazy to create such names.

--
Jorge Godoy <jgo...@gmail.com>

Paul Johnston

unread,
Mar 25, 2007, 4:12:10 PM3/25/07
to sqlal...@googlegroups.com
Hi,

Unicode characters in table/column names is fully supported by MSSQL.

Paul

Michael Bayer

unread,
Mar 25, 2007, 10:27:13 PM3/25/07
to sqlal...@googlegroups.com
yah but thats just MS-SQL who cares :)

Chen Houwu

unread,
Mar 27, 2007, 7:27:25 AM3/27/07
to sqlalchemy
I mean non-ascii column names defined in my database table.
SqlServer, Oracle, MySql...all of the mainstream DBMS support this
feature,
In east asia(China, Japan, Korea), non-ascii column names and table
names,together with non-ascii strings in record
are widely used.

Michael Bayer

unread,
Mar 27, 2007, 10:24:22 AM3/27/07
to sqlal...@googlegroups.com
then there is currently no fix for your situation until this feature
is added to SA.

Chen Houwu

unread,
Mar 27, 2007, 10:55:08 AM3/27/07
to sqlalchemy
But following the second post (see below), I have read the
documentation,
though not very carefully, and have not validated it by test code,
but the documentation really shows the solution to my problem.

Have I misunderstood the documentation?
----------------------
second post
----------------------


"
try Column()'s parameters quote=.. and key=..,
doc/docstrings.html#docstrings_sqlalchemy.schema_Column

and/or mapper's explicit column-naming:
doc/adv_datamapping.html#advdatamapping_properties_colname
"

-----------------------

Chen Houwu

unread,
Mar 27, 2007, 11:12:37 AM3/27/07
to sqlalchemy
from the docstring in
----
class Column:
def __init__(....)
----
name
The name of this column. This should be the identical name as it
appears, or will appear, in the database.

key
Defaults to None: an optional alias name for this column. The
column will then be identified everywhere in an application, including
the column list on its Table, by this key, and not the given name.
Generated SQL, however, will still reference the column by its actual
name.

If I use a ascii key, and a non-ascii name according to the column in
the database, is there anything wrong?

I *won't* solve the problem by "Overriding Column name in mapper
function", which is showed as following.
---------------------
Overriding Column Names

When mappers are constructed, by default the column names in the Table
metadata are used as the names of attributes on the mapped class. This
can be customzed within the properties by stating the key/column
combinations explicitly:

user_mapper = mapper(User, users_table, properties={
'id' : users_table.c.user_id,
'name' : users_table.c.user_name,
})
-----------------------

sdo...@sistechnology.com

unread,
Mar 27, 2007, 12:54:17 PM3/27/07
to sqlal...@googlegroups.com
On Tuesday 27 March 2007 18:12:37 Chen Houwu wrote:
> from the docstring in
> ----
> class Column:
> def __init__(....)
> ----
> name
> The name of this column. This should be the identical name as
> it appears, or will appear, in the database.
>
> key
> Defaults to None: an optional alias name for this column. The
> column will then be identified everywhere in an application,
> including the column list on its Table, by this key, and not the
> given name. Generated SQL, however, will still reference the column
> by its actual name.
>
> If I use a ascii key, and a non-ascii name according to the column
> in the database, is there anything wrong?
well, try it... and tell me, i also have cyrrilic to handle...

Michael Bayer

unread,
Mar 27, 2007, 12:09:14 PM3/27/07
to sqlal...@googlegroups.com
I am doing some experimenting with this, and it would appear that
Psycopg2 (not sure about postgres itself) can not handle unicode
column names. sqlite and mysql adapters do handle unicode table and
column names.

so in changeset 2447, I have made some adjustments so that unicode
table names and column names are supported. there is also a unit
test that does some INSERTs and tests out a basic mapping with one
relationship both lazy- and eager-loaded, and it passes on sqlite and
mysql so far. however the names I am using have just a single non-
ascii character present; with a full set of chinese characters, that
might raise more issues.

In particular the following regular expression has to work with
chinese characters (which it *should*...)

match = re.compile(r'(?<!:):([\w_]+)', re.UNICODE)

the main thing being that the "\w" matcher will allow all characters
in your column name to be matched as "alphanumeric" with the UNICODE
flag enabled.

So you should check out the latest trunk and begin working with it.
I would label this support as "preliminary" since more testing will
be needed.

Additionally, check out the unit tests test/sql/unicode.py, which
illustrates two ways to do mappings with unicode names. one defines
a "key" inside each Column with a plain string name, the other
defines a plain string name on the mapper itself.

On Mar 27, 2007, at 7:27 AM, Chen Houwu wrote:

Michael Bayer

unread,
Mar 27, 2007, 12:15:10 PM3/27/07
to sqlal...@googlegroups.com

On Mar 27, 2007, at 10:55 AM, Chen Houwu wrote:

>
> But following the second post (see below), I have read the
> documentation,
> though not very carefully, and have not validated it by test code,
> but the documentation really shows the solution to my problem.
>
> Have I misunderstood the documentation?


until the fix I just made, you could not say:

t = Table(u'tablename', meta, Column(u'columname'))

i.e. the u'' strings were immediately converted to strings. so i
cant see how non-ascii column names could possibly work.

Reply all
Reply to author
Forward
0 new messages