DAL returns string fields not as unicode strings


Dandelion Mine

Nov 4, 2015, 9:19:41 AM
to web2py-users
Hello!
According to the web2py book, 'by default web2py uses utf8 character encoding for databases'. My results seem to contradict this: I have fields of type 'string', and MySQL shows that they have collation utf8_general_ci, but when I select them with the DAL, the returned values are of type 'str', not 'unicode'.

db.define_table('customers',
                Field('name', 'string'))

+-------+--------------+-----------------+
| Field | Type         | Collation       |
+-------+--------------+-----------------+
| name  | varchar(512) | utf8_general_ci |
+-------+--------------+-----------------+

print type(db(db.customers).select().first().name)
<type 'str'>

I tried removing the *.table files in the databases folder and passing the db_codec parameter to DAL, but nothing changed.

Web2py version 2.9.11-stable, Python 2.7.9, MySQL ver 14.14 Distrib 5.5.40

Is this a known bug, or am I doing something wrong?
Thanks in advance.

Massimo Di Pierro

Nov 11, 2015, 9:52:47 AM
to web2py-users
The strings you get are probably UTF-8 encoded byte strings. Can you confirm?
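
One way to confirm this, sketched outside web2py: try to decode the returned byte string as UTF-8 and see whether it fails. The helper name is_valid_utf8 is hypothetical and not part of the DAL; the sample bytes are the ones that appear later in this thread.

```python
def is_valid_utf8(raw):
    """Return True if the byte string `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# Sample text from this thread, encoded two different ways.
utf8_bytes = b'p\xc3\xa5 Facebook'    # UTF-8: the non-ASCII letter takes two bytes
latin1_bytes = b'p\xe5 Facebook'      # Latin-1: the same letter takes one byte

print(is_valid_utf8(utf8_bytes))      # True
print(is_valid_utf8(latin1_bytes))    # False
```

A successful decode does not prove the data was written as UTF-8 (pure ASCII decodes under both encodings), but a failure does rule it out.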

Dandelion Mine

Nov 12, 2015, 2:03:25 PM
to web2py-users
I tried it on 2.12.3-stable:
>>> db = DAL('sqlite://storage.sqlite')
>>> db._db_codec
'UTF-8'
>>> db.define_table('unicode_test', Field('test', 'string'))
<Table unicode_test (id,test)>
>>> test_val = unicode('på Facebook', 'utf-8')
>>> db.unicode_test.insert(test=test_val)
1L
>>> for r in db(db.unicode_test).select(): print r.test, type(r.test)
...
på Facebook <type 'str'>
>>> db.unicode_test[1].test
'p\xc3\xa5 Facebook'
>>> db.unicode_test[1].test.decode('utf-8')
u'p\xe5 Facebook'
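
The session above suggests the value round-trips intact and only its type differs. A minimal sketch of that relationship, independent of the DAL (the variable names are illustrative only):

```python
text = u'p\xe5 Facebook'        # the unicode value that was inserted
raw = text.encode('utf-8')      # the byte string the select returned

# Decoding the bytes gives back exactly the original text.
assert raw == b'p\xc3\xa5 Facebook'
assert raw.decode('utf-8') == text

# The two differ in length: one character is stored as two bytes.
print('%d characters, %d bytes' % (len(text), len(raw)))
```

The length difference matters for anything that slices or counts characters on the returned value before decoding it.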

Paolo Valleri

Nov 14, 2015, 8:48:30 AM
to web2py-users
I ran the following script (test.py) with the latest stable web2py, using python web2py.py -S welcome -R test.py:
db = DAL('sqlite:///tmp/storage.sqlite')
db.define_table('unicode_test', Field('test', 'string'))
test_val = unicode('på Facebook', 'utf-8')
db.unicode_test.insert(test=test_val)
db.commit()

for r in db(db.unicode_test).select(): 
    print r.test, type(r.test)

print db.unicode_test[1].test

I got

på Facebook <type 'str'>
på Facebook

Paolo

Alexandr Presniakov

Nov 16, 2015, 9:43:50 AM
to web...@googlegroups.com
Yes, the only difference in my example was that the strings were displayed by the interpreter rather than by print.
The question is: why are these strings returned as byte strings? If they are encoded as UTF-8 by default, why not return unicode-typed strings?
If I'm not mistaken, this was the default behaviour before. Or maybe I am confusing it with the strings returned by the json field type.
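
Until that is answered, a possible workaround is to decode the values after selecting them. This is only a sketch: decode_fields is a hypothetical helper, not a DAL feature, and the plain dict stands in for what a selected row might look like after conversion to a dict (for example via row.as_dict(), assuming that suits your case).

```python
def decode_fields(record, encoding='utf-8'):
    """Return a copy of `record` with every byte-string value decoded to text."""
    decoded = {}
    for key, value in record.items():
        if isinstance(value, bytes):   # on Python 2, `bytes` is the same type as `str`
            value = value.decode(encoding)
        decoded[key] = value
    return decoded

# Stand-in for a selected row; the values mirror the session earlier in this thread.
row = {'id': 1, 'test': b'p\xc3\xa5 Facebook'}
clean = decode_fields(row)
print(type(clean['test']))
```

Non-string values such as the id pass through unchanged, so the helper can be applied to a whole record without knowing its field types.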
