I have some Japanese characters in an article I'm editing and noticed that the UTF-8 I paste into the form gets converted to question marks after I post it. I did a little bit of poking around and believe this is the result of the c4lj database having a default character set of latin1.
mysql> show create database c4lj;
+----------+-----------------------------------------------------------------+
| Database | Create Database                                                 |
+----------+-----------------------------------------------------------------+
| c4lj     | CREATE DATABASE `c4lj` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+----------+-----------------------------------------------------------------+
1 row in set (0.00 sec)
I just thought I'd mention it in case it hasn't come up before.
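For anyone wondering why the characters come back as question marks: latin1 has no code points for Japanese, so a converter that substitutes instead of failing emits "?" for each character it cannot represent. A minimal Python sketch of that effect follows. The sample string and the function name are made up for illustration, and this only imitates the symptom; it does not reproduce MySQL's own conversion code.

```python
def store_as_latin1(text: str) -> str:
    """Simulate a lossy round trip through a latin1-only column."""
    # errors="replace" substitutes "?" for every character latin1 cannot encode.
    stored = text.encode("latin-1", errors="replace")
    return stored.decode("latin-1")


sample = "日本語 café"  # hypothetical article text
print(store_as_latin1(sample))  # the accented "é" survives, the Japanese does not
```

Note that the accented Latin character survives, which may be why the problem had not been noticed with earlier articles.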
Not knowing much about character sets, is your recommendation that we change the setting in MySQL to UTF-8? Will that have any consequences for our existing articles that were created with the latin1 setting?
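On the question of existing articles, a rough Python sketch of the best case may help. It assumes the existing rows really hold latin1 bytes; the sample strings and the function name are hypothetical. Under that assumption, text that fit in latin1 converts to UTF-8 without loss, while characters already stored as "?" stay lost. If WordPress instead wrote UTF-8 bytes into the latin1 columns, a conversion could double-encode them, and which case applies to c4lj is not known from this thread.

```python
def latin1_bytes_to_utf8(stored: bytes) -> bytes:
    """Re-encode bytes that are genuinely latin1 as UTF-8."""
    return stored.decode("latin-1").encode("utf-8")


# Text that latin1 can represent converts cleanly.
existing = "Université de Genève".encode("latin-1")
print(latin1_bytes_to_utf8(existing).decode("utf-8"))

# Text that was already replaced by "?" on the way in cannot be recovered.
already_lost = "日本語".encode("latin-1", errors="replace")
print(latin1_bytes_to_utf8(already_lost).decode("utf-8"))
```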
I would want to back up the Wordpress database before making any change -- so probably not until Saturday if we decide to do this.
On Fri, Oct 28, 2011 at 12:50 AM, Ed Summers <e...@pobox.com> wrote:
> Hi all,
> I have some Japanese characters in an article I'm editing and noticed
> that the UTF-8 I paste into the form gets converted to question marks
> after I post it. I did a little bit of poking around and believe this
> is the result of the c4lj database having a default character set of
> latin1.
> mysql> show create database c4lj;
> +----------+-----------------------------------------------------------------+
> | Database | Create Database                                                 |
> +----------+-----------------------------------------------------------------+
> | c4lj     | CREATE DATABASE `c4lj` /*!40100 DEFAULT CHARACTER SET latin1 */ |
> +----------+-----------------------------------------------------------------+
> 1 row in set (0.00 sec)
> I just thought I'd mention it in case it hasn't come up before.
> //Ed
> --
> You received this message because you are subscribed to the Google Groups "Code4Lib Journal-discuss" group.
> To post to this group, send email to c4lj-discuss@googlegroups.com.
> To unsubscribe from this group, send email to c4lj-discuss+unsubscribe@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/c4lj-discuss?hl=en.
On Fri, Oct 28, 2011 at 11:17 AM, Tom Keays <tomke...@gmail.com> wrote:
> I would want to back up the Wordpress database before making any
> change -- so probably not until Saturday if we decide to do this.
I'm not actually recommending anything. Charset issues can be thorny. Just pointing out something you all may have known already.
In an ideal world I'd like the database to be in Unicode. It would certainly make it easier to get non-Roman characters in. But I'm not certain how much effort it is worth, especially if the conversion is non-trivial.
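To give a feel for the SQL side of the effort, here is a small Python sketch that only builds and prints the statements; it does not connect to anything. `ALTER DATABASE ... CHARACTER SET` and `ALTER TABLE ... CONVERT TO CHARACTER SET` are standard MySQL statements, but the table names below are the WordPress defaults and are an assumption about the c4lj install, as is the choice of `utf8` as the target. The function name is made up for illustration.

```python
def conversion_statements(database, tables, charset="utf8"):
    """Build the MySQL statements for a character set conversion (not executed)."""
    # Changing the database default only affects tables created afterwards.
    statements = [f"ALTER DATABASE `{database}` CHARACTER SET {charset};"]
    for table in tables:
        # CONVERT TO also rewrites the existing text columns of each table.
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset};"
        )
    return statements


# Hypothetical table list: WordPress default names, not verified for c4lj.
for stmt in conversion_statements("c4lj", ["wp_posts", "wp_comments"]):
    print(stmt)
```

The statements themselves are short; the backup beforehand and spot-checking articles afterwards would probably take more time than running them.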
OK. I don't want to make any changes while we're in article editing mode (the site is always active, since we allow comments, but they are of less concern than articles, at least regarding formatting). A few weeks after we publish, I'll see if we can upgrade to Unicode. Say, mid-November...
On Sat, Oct 29, 2011 at 10:50 AM, Tod Olson <t...@uchicago.edu> wrote:
> In an ideal world I'd like the database to be in Unicode. It would certainly make it easier to get non-Roman characters in. But I'm not certain how much effort it is worth, especially if the conversion is non-trivial.
> -Tod
> On Oct 28, 2011, at 10:19 AM, Ed Summers wrote:
>> On Fri, Oct 28, 2011 at 11:17 AM, Tom Keays <tomke...@gmail.com> wrote:
>>> I would want to back up the Wordpress database before making any
>>> change -- so probably not until Saturday if we decide to do this.
>> I'm not actually recommending anything. charset issues can be thorny.
>> Just pointing out something you all may have known already.
>> //Ed
On Sat, Oct 29, 2011 at 10:30 PM, Tod Olson <t...@uchicago.edu> wrote:
> Oh, certainly. And I was quite seriously wondering how much effort a conversion is actually worth.
> -Tod
> On Oct 29, 2011, at 11:02 AM, Tom Keays wrote:
>> OK. I don't want to make any changes while we're in article editing
>> mode (the site is always active, since we allow comments, but they are
>> of less concern than articles, at least regarding formatting). A few
>> weeks after we publish, I'll see if we can upgrade to Unicode. Say,
>> mid-November...