cyrillic text garbled if used as question in polls example

Anton Daneika

Dec 9, 2006, 1:49:25 PM
to django...@googlegroups.com
Hello, django users.

I tried entering Cyrillic text for the poll question attribute while playing with the example polls application. This resulted in a bunch of question marks on the view page instead of the expected "Как дела?" poll question.
My Firefox character encoding is set to UTF-8;
the following command:
$ env | grep -i utf
yields
LANG=en_AU.UTF-8

Has anyone faced that problem before?
Am I supposed to change some other setting in Django or MySQL to get the Cyrillic text displayed, or am I doing something stupid?

Baurzhan Ismagulov

Dec 9, 2006, 4:25:08 PM
to django...@googlegroups.com
Hello Anton,

On Sat, Dec 09, 2006 at 08:49:25PM +0200, Anton Daneika wrote:
> I tried entering a Cyrillic text for the poll question attribute, while
> playing with example polls application. This resulted in a bunch of question
> marks on the view page, instead of the expected "Как дела?" poll question,
> which was in Cyrillic.

In my experience, text can get broken on each and every step.

First, I would try to see whether the browser sends it correctly using a
dumping HTTP proxy (netcat or a simple perl script). Then, check whether
it is written correctly to the database (e.g., issue a SELECT in mysql).
Then, see if it is displayed correctly.

With kind regards,
--
Baurzhan Ismagulov
http://www.kz-easy.com/

mezhaka

Dec 11, 2006, 5:32:20 AM
to Django users
On Dec 9, 11:25 pm, Baurzhan Ismagulov <i...@radix50.net> wrote:
> Hello Anton,
>
> On Sat, Dec 09, 2006 at 08:49:25PM +0200, Anton Daneika wrote:
> > I tried entering a Cyrillic text for the poll question attribute, while
> > playing with example polls application. This resulted in a bunch of question
> > marks on the view page, instead of the expected "Как дела?" poll question,
> > which was in Cyrillic.
>
> In my experience, text can get broken on each and every step.
>
> First, I would try to see whether the browser sends it correctly using a
> dumping HTTP proxy (netcat or a simple perl script). Then, check whether
> it is written correctly to the database (e.g., issue a SELECT in mysql).
> Then, see if it is displayed correctly.
>
> With kind regards,
> --
> Baurzhan Ismagulov
> http://www.kz-easy.com/

*The netcat part*:
Here's what I did:
1. Opened the admin interface, went to
http://localhost:8000/admin/polls/poll/add/ and filled in the form with some
Cyrillic text.
2. Ran the shell command:
$ nc -l -vv -p 8000 localhost > cyrillic_trouble
3. Pressed "Save and continue editing" on the admin page.
This captured the browser's POST request in the cyrillic_trouble file. The
question parameter of the POST, taken from that file, looks like this:

question=%D0%9A%D0%B0%D0%B3+%D0%B4%D0%B8%D0%BB%D0%B0%2C+%D0%BA%D1%80%D0%BE%D1%81%D0%B0%D1%84%D1%87%D0%B5%D0%93%3F

which is the URL-encoded form of the text I entered on the admin page.

So is it supposed to be like this?

I also used netcat to capture the response to
GET /admin/polls/poll/3/ HTTP/1.1
Host: localhost:8000

The troubled line (the one with question marks) looks like this:
<input type="text" id="id_question" class="vTextField required"
name="question" size="30" value="??? ?????" maxlength="200" />


*The MySQL part*:
I did:
<code>
mysql> select * from polls_poll;
+----+--------------------+---------------------+
| id | question           | pub_date            |
+----+--------------------+---------------------+
|  1 | what's up?         | 2006-12-05 15:05:00 |
|  2 | how are you today? | 2006-12-03 15:22:00 |
|  3 | ??? ?????          | 2006-12-09 12:24:13 |
+----+--------------------+---------------------+
</code>

the third row was inserted via admin interface and as you can see it's
just question marks.
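
One plausible way such question marks arise (an assumption, not something confirmed in this thread) is that the text gets converted to a single-byte character set such as latin1, which has no Cyrillic letters, so every unrepresentable character is replaced with "?". A minimal Python sketch of that effect, with Python's codec standing in for the database conversion:

```python
# Assumption: a latin1 column or connection is what loses the characters.
# Python's "latin-1" codec is used here only to imitate that conversion.
question = "Как дела?"

# errors="replace" substitutes "?" for every character latin-1 cannot hold.
stored = question.encode("latin-1", errors="replace")

print(stored)  # b'??? ?????'
```

If this is what happened, the original letters are gone from the stored row and cannot be recovered by changing the display encoding later.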

Then I tried running the following SQL:
mysql> insert into polls_poll (`question`, `pub_date`) values ('Ты
используешь Джанго?', NOW());

Now it is displayed correctly in the mysql client, but the admin interface
shows a different kind of garbled text instead:
Ты Ð¸Ñ Ð¿Ð¾Ð»ÑŒÐ·ÑƒÐµÑˆÑŒ
Джанго?
which should be
Ты используешь Джанго?


I'm desperately stuck for now. I hope someone can push me in the right direction.

Mikhail Gusarov

Dec 11, 2006, 6:18:19 AM
to django...@googlegroups.com

You (mez...@gmail.com) wrote:

m> then I tried to ran the following SQL:
m> mysql> insert into polls_poll (`question`, `pub_date`) values ('Ты
m> используешь Джанго?', NOW());

m> now it's displayed ok in the mysql client, but the admin interface
m> shows a different style garbled text instead:
m> Ты Ð¸Ñ Ð¿Ð¾Ð»ÑŒÐ·ÑƒÐµÑˆÑŒ
m> Джанго?
m> which is
m> Ты используешь Джанго?

Here UTF-8 text has been treated as Latin1 and encoded again as UTF-8.
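
This kind of double encoding can be reproduced in a few lines of Python. The sketch below is only an illustration: the helper names `misread_as_latin1` and `repair` are made up for this example, and Python's "latin-1" codec stands in for whatever single-byte code page was actually involved, so the exact glyphs may differ from the ones shown in the thread.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "Джанго"
garbled = misread_as_latin1(original)

# Each Cyrillic letter is two UTF-8 bytes, so it turns into two characters.
print(len(original), len(garbled))

# Writing the garbled string out as UTF-8 again is the second encoding pass.
double_encoded = garbled.encode("utf-8")
print(len(original.encode("utf-8")), len(double_encoded))

print(repair(garbled) == original)
```

Unlike the question-mark case, this damage is reversible as long as no bytes were dropped along the way.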

--

Georgi Stanojevski

Dec 11, 2006, 6:29:59 AM
to django...@googlegroups.com
Anton Daneika wrote:

> playing with example polls application. This resulted in a bunch of question
> marks on the view page, instead of the expected "Как дела?" poll question,
> which was in Cyrillic.
> My firefox character encoding is set to UTF-8;
> the following command:
> $ env | grep -i utf
> yields to
> LANG=en_AU.UTF-8
>
> Has anyone faced that problem before?
> Am I supposed to switch something else in django or MySQL to get the
> cyrillic text or am I doing something stupid?

Which mysql version are you using?

I'm using mysql 5.0.24a and django from svn - utf-8 cyrillic text works
fine out of the box in a browser with a default codepage of utf-8. The
system locale is mk_MK.utf8.

To get mysql client in the console working with utf-8 (doesn't have
anything to do with Django) I have this in /etc/my.cnf:

[client]
.
.
default-character-set = utf8

[mysqld]
.
.
character-set-server=utf8
collation-server=utf8_unicode_ci
init_connect='set collation_connection = utf8_unicode_ci;'

--
Glisha
The perfect OS, MS-DOS!
No patches, no root exploits for 21 years.

Anton Daneika

Dec 11, 2006, 7:32:59 AM
to django...@googlegroups.com
On 12/11/06, Georgi Stanojevski <gli...@gmail.com> wrote:

> Anton Daneika wrote:
>
> > playing with example polls application. This resulted in a bunch of question
> > marks on the view page, instead of the expected "Как дела?" poll question,
> > which was in Cyrillic.
> > My firefox character encoding is set to UTF-8;
> > the following command:
> > $ env | grep -i utf
> > yields to
> > LANG=en_AU.UTF-8
> >
> > Has anyone faced that problem before?
> > Am I supposed to switch something else in django or MySQL to get the
> > cyrillic text or am I doing something stupid?
>
> Which mysql version are you using?

$ mysql --version
mysql  Ver 14.12 Distrib 5.0.22, for pc-linux-gnu (i486) using readline 5.1
 

> I'm using mysql 5.0.24a and django from svn - utf-8 cyrillic text works
> fine out of the box in a browser with a default codepage of utf-8. The
> system locale is mk_MK.utf8.

I use the official Django 0.95 release.

> To get mysql client in the console working with utf-8 (doesn't have
> anything to do with Django) I have this in /etc/my.cnf:
>
> [client]
> .
> .
> default-character-set = utf8
>
> [mysqld]
> .
> .
> character-set-server=utf8
> collation-server=utf8_unicode_ci
> init_connect='set collation_connection = utf8_unicode_ci;'

Well, I tried this conf modification.
Before it I could do
mysql> insert into polls_poll  (`question`, `pub_date`) values ('Часто ли у вас возникают проблемы с кирилицей?', NOW());

Then the select statement would produce a correct cyrillic output, but after the proposed conf modification I get garbled text in mysql client...

Gábor Farkas

Dec 11, 2006, 7:38:09 AM
to django...@googlegroups.com
Anton Daneika wrote:
> Well, I tried this conf modification.
> Before it I could do
> mysql> insert into polls_poll (`question`, `pub_date`) values ('Часто
> ли у вас возникают проблемы с кирилицей?', NOW());
>
> Then the select statement would produce a correct cyrillic output, but
> after the proposed conf modification I get garbled text in mysql client...

:)

welcome to the wonderful world of charsets :)

(sorry, i have no idea how this works in mysql (i'm a postgresql user),
but if you see it garbled in mysql-client, that does not mean things went
wrong... unfortunately there are many, many factors that may affect these
things)

so, good luck, and don't give up :)

gabor

Anton Daneika

Dec 11, 2006, 7:45:18 AM
to django...@googlegroups.com
Thank you for the support :)
It helps to keep my spirits up.

Baurzhan Ismagulov

Dec 12, 2006, 5:28:08 PM
to django...@googlegroups.com
On Mon, Dec 11, 2006 at 10:32:20AM -0000, mezhaka wrote:
> question=%D0%9A%D0%B0%D0%B3+%D0%B4%D0%B8%D0%BB%D0%B0%2C+%D0%BA%D1%80%D0%BE%D1%81%D0%B0%D1%84%D1%87%D0%B5%D0%93%3F

This looks good, at least I have no problems with Cyrillic characters
passed to the server in that way.
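
The quoted parameter can also be checked without a browser. The following is a small sketch using Python's standard `urllib.parse` module; it assumes the form was submitted as UTF-8, and `errors="strict"` makes the decode fail loudly if the bytes are not valid UTF-8.

```python
from urllib.parse import parse_qs, quote_plus

# The POST body fragment captured with netcat, copied from the message above.
captured = (
    "question=%D0%9A%D0%B0%D0%B3+%D0%B4%D0%B8%D0%BB%D0%B0%2C+"
    "%D0%BA%D1%80%D0%BE%D1%81%D0%B0%D1%84%D1%87%D0%B5%D0%93%3F"
)

# Percent-decoding yields bytes; they are then interpreted as UTF-8.
fields = parse_qs(captured, encoding="utf-8", errors="strict")
question = fields["question"][0]
print(question)

# Re-encoding the decoded text reproduces the captured value exactly,
# which suggests the browser sent well-formed UTF-8.
print(quote_plus(question) == captured.split("=", 1)[1])
```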


> The troubled line (the one with question marks) looks like this:
> <input type="text" id="id_question" class="vTextField required"
> name="question" size="30" value="??? ?????" maxlength="200" />

Could you please od the value part, to eliminate the possibility of the
terminal - shell - catting tool chain scrambling the output?
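
For readers without od at hand, the same check can be sketched in Python. The helper name `hex_dump` is made up for this illustration. Literal question marks are the single byte 3f, while intact UTF-8 Cyrillic shows up as two-byte sequences starting with d0 or d1, so the dump tells the two cases apart regardless of what the terminal displays.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{byte:02x}" for byte in data)


# What real question marks look like at the byte level.
print(hex_dump(b"??? ?????"))  # 3f 3f 3f 20 3f 3f 3f 3f 3f

# What intact UTF-8 Cyrillic looks like at the byte level.
print(hex_dump("Как".encode("utf-8")))  # d0 9a d0 b0 d0 ba
```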


> mysql> select * from polls_poll;

...


> | 3 | ??? ????? | 2006-12-09 12:24:13 |

...


> the third row was inserted via admin interface and as you can see it's
> just question marks.

Could you please od this, too?


> then I tried to ran the following SQL:
> mysql> insert into polls_poll (`question`, `pub_date`) values ('Ты
> используешь Джанго?', NOW());
>
> now it's displayed ok in the mysql client, but the admin interface
> shows a different style garbled text instead:
> Ты Ð¸Ñ Ð¿Ð¾Ð»ÑŒÐ·ÑƒÐµÑˆÑŒ
> Джанго?

Aha. So, what is the output of "locale" in the shell you started mysql
in? Does mysql inherit it? What is the encoding of the database you are
writing to?

Anton Daneika

Dec 21, 2006, 7:26:14 PM
to django...@googlegroups.com
Well, at last the problem is solved.

I was about to start od'ing everything along the way, but decided to try changing the MySQL configuration first, and it helped.

1.
at first i did what Georgi Stanojevski suggested:

> To get mysql client in the console working with utf-8 (doesn't have
> anything to do with Django) I have this in /etc/my.cnf:
>
> [client]
> .
> .
> default-character-set = utf8
>
> [mysqld]
> .
> .
> character-set-server=utf8
> collation-server=utf8_unicode_ci
> init_connect='set collation_connection = utf8_unicode_ci;'

2.
Then I restarted the MySQL server (I use Ubuntu):
$ sudo /etc/init.d/mysql restart

3.
Then, from the mysql client:
mysql> alter table polls_choice convert to character set utf8;
mysql> alter table polls_poll convert to character set utf8;

and finally the input of cyrillic in the admin area started to work!
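
With more than two tables, the conversion statements from step 3 could be generated rather than typed by hand. This is only a sketch: `convert_statements` is a hypothetical helper, the table names are the two from this thread, and the output is meant to be reviewed and pasted into the mysql client, not executed blindly.

```python
def convert_statements(tables, charset="utf8"):
    """Build one ALTER TABLE ... CONVERT TO CHARACTER SET statement per table."""
    return [
        f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset};"
        for table in tables
    ]


# Table names taken from the polls example; adjust for other applications.
for statement in convert_statements(["polls_choice", "polls_poll"]):
    print(statement)
```

Rows that already contain literal question marks will stay that way after the conversion, since the original characters were never stored.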

Interestingly, running a SELECT statement from the mysql client at the same time still outputs question marks:
mysql> select * from polls_choice;
+----+---------+-------------+-------+
| id | poll_id | choice      | votes |
+----+---------+-------------+-------+
|  1 |       1 | ??? ?????   |     0 |
|  2 |       1 | ????? ????? |     0 |
+----+---------+-------------+-------+

which, I suppose, shows that the client does not handle utf8.

I want to thank everyone who tried to help me. I will try my best to be helpful as well :)