charset on c4lj db

Ed Summers

Oct 28, 2011, 12:50:11 AM
to c4lj-d...@googlegroups.com
Hi all,

I have some Japanese characters in an article I'm editing and noticed
that the UTF-8 I paste into the form gets converted to question marks
after I post it. I did a little bit of poking around and believe this
is the result of the c4lj database having a default character set of
latin1.

mysql> show create database c4lj;
+----------+-----------------------------------------------------------------+
| Database | Create Database                                                 |
+----------+-----------------------------------------------------------------+
| c4lj     | CREATE DATABASE `c4lj` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+----------+-----------------------------------------------------------------+
1 row in set (0.00 sec)

I just thought I'd mention it in case it hasn't come up before.
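
For illustration, the symptom can be reproduced outside MySQL. This is a
minimal Python sketch, assuming the question marks come from characters
with no latin1 equivalent being replaced during conversion. The sample
string is hypothetical, not taken from the article.

```python
# Hypothetical sample text containing three Japanese characters.
text = "日本語 text"

# latin1 has no code points for Japanese characters, so a lossy
# conversion substitutes "?" for each character it cannot represent.
as_latin1 = text.encode("latin-1", errors="replace")

print(as_latin1.decode("latin-1"))  # prints "??? text"
```

The ASCII part survives while each Japanese character becomes a single
question mark, which matches what shows up after posting.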

//Ed

Tom Keays

Oct 28, 2011, 11:17:56 AM
to c4lj-d...@googlegroups.com
Hi Ed,

Not knowing much about character sets, is your recommendation that we
change the setting in MySQL to UTF-8? Will that have any consequences
for our existing articles that were created with the latin1 setting?

I would want to back up the Wordpress database before making any
change -- so probably not until Saturday if we decide to do this.

Tom

> --
> You received this message because you are subscribed to the Google Groups "Code4Lib Journal-discuss" group.
> To post to this group, send email to c4lj-d...@googlegroups.com.
> To unsubscribe from this group, send email to c4lj-discuss...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/c4lj-discuss?hl=en.
>
>

Ed Summers

Oct 28, 2011, 11:19:19 AM
to c4lj-d...@googlegroups.com
On Fri, Oct 28, 2011 at 11:17 AM, Tom Keays <tomk...@gmail.com> wrote:
> I would want to back up the Wordpress database before making any
> change -- so probably not until Saturday if we decide to do this.

I'm not actually recommending anything. charset issues can be thorny.
Just pointing out something you all may have known already.
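
One example of how it can get thorny, sketched in Python. This assumes
(not confirmed for c4lj) that some existing rows hold UTF-8 bytes that
were stored through a latin1 connection; converting such data as if it
were real latin1 double-encodes it. The sample string is hypothetical.

```python
# Hypothetical existing content with one non-ASCII character.
original = "café"

# The application sent UTF-8 bytes, but the database treated them as latin1.
stored_bytes = original.encode("utf-8")
misread = stored_bytes.decode("latin-1")  # shows up as "cafÃ©"

# A naive latin1 -> utf8 conversion then encodes the misread text again.
double_encoded = misread.encode("utf-8")
assert double_encoded != stored_bytes

# Undoing it means reversing the mistaken step before decoding as UTF-8.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(misread, "->", repaired)
```

Whether this applies depends on how the existing articles were actually
written to the database, which is worth checking on a backup copy first.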

//Ed

Tod Olson

Oct 29, 2011, 10:50:26 AM
to c4lj-d...@googlegroups.com, Tod Olson
In an ideal world I'd like the database to be in Unicode. It would certainly make it easier to get non-Roman characters in. But I'm not certain how much effort it is worth, especially if the conversion is non-trivial.

-Tod

Tom Keays

Oct 29, 2011, 12:02:47 PM
to c4lj-d...@googlegroups.com, Tod Olson
OK. I don't want to make any changes while we're in article editing
mode (the site is always active, since we allow comments, but they are
of less concern than articles, at least regarding formatting). A few
weeks after we publish, I'll see if we can upgrade to unicode. Say,
mid-November...

Tod Olson

Oct 29, 2011, 10:30:45 PM
to c4lj-d...@googlegroups.com, Tod Olson
Oh, certainly. And I was quite seriously wondering how much effort a conversion is actually worth.

-Tod

Ed Summers

Oct 30, 2011, 6:15:57 AM
to c4lj-d...@googlegroups.com
If you want to have content that isn't restricted to LATIN-1 it's worth it.