Migrating from WordPress with Cyrillic

Pavel

Apr 9, 2007, 5:21:28 AM
to habari-users
Has anyone tried importing non-Latin posts from WordPress?

I ran the import plugin and got the posts from WP, but the encoding is all
broken. WordPress uses UTF-8 and all data is stored as utf8 (MySQL 4.1),
but when I import posts with the plugin, it replaces the whole text with
question marks.

http://sandbox.makesense.ru/habari/ -- result.
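
Question marks like these usually mean the text went through a conversion to a character set that has no Cyrillic letters (latin1, for example) somewhere between reading and writing. The sketch below only imitates such a step in PHP with the mbstring extension; it does not touch MySQL or the import plugin, and the function name to_latin1 is made up for illustration.

```php
<?php
// Illustration only: imitate a UTF-8 -> latin1 conversion step.
// to_latin1() is a hypothetical helper, not part of Habari or WordPress.
function to_latin1($utf8_text)
{
    // Use "?" (0x3F) for every character the target charset cannot hold.
    mb_substitute_character(0x3F);
    return mb_convert_encoding($utf8_text, 'ISO-8859-1', 'UTF-8');
}

echo to_latin1('Привет'), "\n"; // prints ??????
echo to_latin1('Hello'), "\n";  // prints Hello
```

If the imported posts look like the first line, a lossy conversion of this kind is a likely suspect, and the original bytes cannot be recovered from the imported copy.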

Slaff

Apr 9, 2007, 9:47:51 AM
to habari-users
* Try not to use plugins. Try a direct import with phpMyAdmin or from the
terminal if you can.

* Check your tables; maybe the tables in your database have the wrong
collation.
This article may be helpful: http://www.slaff.net/2006/06/29/kak-vyilechit-utf.html

* Try adding the correct <meta> tag (your page doesn't define a charset).
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

天佑

Apr 9, 2007, 9:57:31 AM
to habari...@googlegroups.com
Hi Pavel,

Habari does not yet fully support UTF-8 encoding. You can use the following hack to import a UTF-8 WordPress blog successfully.

Modify system/classes/databaseconnection.php.  Add the following line after line 127.

$this->pdo->exec('SET CHARACTER SET UTF8');

That is, before the change:

         $this->pdo->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING );
         $this->load_tables();

After the change:

         $this->pdo->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING );
         $this->pdo->exec('SET CHARACTER SET UTF8');
         $this->load_tables();
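
For anyone who wants to try the same idea outside Habari, here is a minimal standalone sketch with PDO. It uses SET NAMES rather than SET CHARACTER SET: per the MySQL manual, SET NAMES sets the client, connection, and results character sets together, while SET CHARACTER SET takes the connection character set from the database default. All three function names are hypothetical and do not exist in Habari, and the DSN and credentials are placeholders.

```php
<?php
// Hypothetical helpers for illustration; none of these names exist in Habari.

// Build the session statement. A charset name cannot be bound as a query
// parameter, so it is checked against a strict pattern instead.
function charset_session_sql($charset = 'utf8')
{
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException("Bad charset name: $charset");
    }
    return "SET NAMES $charset";
}

// Open a MySQL connection and switch it to the given charset before any
// other query runs. DSN, user, and password are supplied by the caller.
function open_utf8_connection($dsn, $user, $password, $charset = 'utf8')
{
    $pdo = new PDO($dsn, $user, $password);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec(charset_session_sql($charset));
    return $pdo;
}

// Return the character_set_* session variables as name => value pairs,
// so the effect of the statement above can be inspected.
function connection_charsets(PDO $pdo)
{
    $rows = $pdo->query("SHOW VARIABLES LIKE 'character_set_%'");
    return $rows->fetchAll(PDO::FETCH_KEY_PAIR);
}
```

Calling open_utf8_connection('mysql:host=localhost;dbname=habari', 'user', 'secret') and then print_r(connection_charsets($pdo)) should show whether the client, connection, and results values are utf8 or still the server default. The host, database name, and credentials here are examples only.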
--
Cheers,
tinyau
Blog: http://blog.tinyau.net

Owen Winkler

Apr 9, 2007, 11:54:02 AM
to habari...@googlegroups.com
On 4/9/07, 天佑 <tinyau....@gmail.com> wrote:
> Hi Pavel,
>
> Habari is not yet fully support UTF-8 encoding. You can use the following
> hack to make the import of UTF-8 WordPress blog become successful.
>
> Modify system/classes/databaseconnection.php. Add the
> following line after line 127.
>
> $this->pdo->exec('SET CHARACTER SET UTF8');

Is this something that needs to be part of the install, or is it
something that users should do when creating their databases?
Meaning, if you create your database in a non-UTF-8 character set,
then should you expect not to have UTF-8 compatibility? Or do you
need to set the character set regardless of the character set you've
used to create your database?

Owen

天佑

Apr 9, 2007, 10:31:43 PM
to habari...@googlegroups.com
I'm using MySQL 4.1+, and both my WordPress and Habari databases were created with UTF-8 encoding and collation. However, the default collation of the MySQL server at my web host is 'latin1_swedish_ci'.

Even though all the tables use the 'utf8_general_ci' collation, if the collation is not explicitly set to 'utf8_general_ci', the data is stored with the server default, i.e. 'latin1_swedish_ci'. I blog in Chinese. Although the Chinese characters are displayed properly on Habari pages, they are stored in the wrong collation and I can view them properly in phpMyAdmin using the UTF-8 character set.

This is the same problem that WordPress 2.1.x has now. As non-Latin bloggers, we have to apply a similar hack in WordPress in order to store the data with the correct collation. The WordPress 2.2 trunk has addressed this issue, and I hope Habari can handle it as well.


天佑

Apr 9, 2007, 10:36:23 PM
to habari...@googlegroups.com
Typo. 

Although the Chinese characters are displayed properly on Habari pages, they are stored in the wrong collation and I can't view them properly in phpMyAdmin using the UTF-8 character set.

Cheers,
tinyau
Blog: http://blog.tinyau.net

Pavel

Apr 10, 2007, 1:15:06 AM
to habari-users
I'll try everything you mentioned here, but my point was
different.

I believe a direct mysqldump and subsequent import might help, but I
specifically deployed MySQL with default utf8 encoding and collation
for this very purpose :-) So I was expecting the export/import
to run rather smoothly :-))

I'll try looking through the Habari code -- it seems as if it's setting
NAMES directly somewhere in the code. Is it not?
