in 1.9.2, with force_encoding, we still need iconv?

Zhenning Guan

Jan 31, 2011, 9:55:28 PM
to rails...@googlegroups.com
I have this code in Ruby 1.8:
Iconv.iconv('gbk', 'utf-8', string)

Now Ruby 1.9.2 has force_encoding('utf-8'), so can I just call
force_encoding('utf-8') instead?

--
Posted via http://www.ruby-forum.com/.

"Martin J. Dürst"

Jan 31, 2011, 10:39:50 PM
to rails...@googlegroups.com, Zhenning Guan
On 2011/02/01 11:55, Zhenning Guan wrote:
> I have this code in Ruby 1.8:
> Iconv.iconv('gbk', 'utf-8', string)
>
> Now Ruby 1.9.2 has force_encoding('utf-8'), so can I just call
> force_encoding('utf-8') instead?

No. force_encoding just changes the encoding label, but leaves the bytes
in the string as they are. That would result in garbage (unless
everything is ASCII anyway). The main use of force_encoding is to set
the encoding label on raw byte strings (e.g. data coming from outside)
when you already know what the encoding is.
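To make the difference concrete, here is a small sketch. The sample string and the variable names are only illustrative assumptions; the methods are the standard String#encode and String#force_encoding.

```ruby
utf8 = "\u4E2D\u6587"   # two Chinese characters, stored as 6 UTF-8 bytes

# encode transcodes: both the bytes and the encoding label change.
gbk = utf8.encode('gbk')
gbk_bytesize = gbk.bytesize          # GBK uses 2 bytes per character here

# force_encoding only relabels: the bytes stay exactly the same.
relabeled = gbk.dup.force_encoding('utf-8')
same_bytes = (relabeled.bytes == gbk.bytes)
still_valid = relabeled.valid_encoding?   # GBK bytes are not valid UTF-8 here

# The intended use: raw bytes from outside that are known to be GBK.
raw = gbk.dup.force_encoding('binary')
restored = raw.force_encoding('gbk').encode('utf-8')
```

In the last two lines force_encoding supplies the label that was missing, and encode then does the actual conversion.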

The equivalent of your Iconv call, in Ruby 1.9, is:

string.encode('gbk', 'utf-8')

But I'm a bit wary about the order of the arguments. Both


Iconv.iconv('gbk', 'utf-8', string)

string.encode('gbk', 'utf-8')

encode from UTF-8 to GBK, but the result of force_encoding('utf-8') is
labeled UTF-8, so if you want the result to be UTF-8, you have to swap
the order of the parameters. I was never happy with the TO-FROM order in
iconv, and I'm also not happy with the TO-FROM order in String#encode,
but String#encode can also be used with just the TO parameter, e.g.
string.encode('gbk')
if the string already has the correct encoding label at that point.
Since the FROM parameter is optional, it has to come last, so when we
(Matz and I, mainly) designed String#encode, unfortunately TO-FROM was
the only order that made sense.
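A short sketch of the two call forms, again with an assumed sample string and illustrative variable names:

```ruby
utf8 = "\u4E2D\u6587"   # labeled UTF-8 because of the \u escapes

# One argument: TO only. The source is the string's current encoding label.
gbk_a = utf8.encode('gbk')

# Two arguments: TO first, then FROM.
gbk_b = utf8.encode('gbk', 'utf-8')
same_result = (gbk_a == gbk_b)

# The opposite direction needs the arguments the other way around.
back = gbk_a.encode('utf-8', 'gbk')
round_trip_ok = (back == utf8)
```

When the string already carries the right label, the one-argument form is usually the less error-prone choice, since there is no argument order to get wrong.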

Please also note that there might be slight differences between Iconv
and String#encode for some characters, but these should be very small in
number.
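This sketch does not demonstrate any actual difference between Iconv and String#encode. It only shows how String#encode reports a character that has no mapping in the target encoding, and how the standard :undef and :replace options substitute a placeholder. The sample character is an assumption chosen because it lies outside what GBK covers.

```ruby
text = "\u{1F600}"   # a character outside the BMP, with no GBK mapping

# By default, an unmappable character raises an exception.
begin
  text.encode('gbk')
  raised = false
rescue Encoding::UndefinedConversionError
  raised = true
end

# With :undef => :replace, the character is replaced instead.
replaced = text.encode('gbk', :undef => :replace, :replace => '?')
```

If code migrated from Iconv relied on options such as //IGNORE or //TRANSLIT, its behavior for such characters is worth checking separately.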

Regards, Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:due...@it.aoyama.ac.jp

قصص عربية

Dec 29, 2013, 12:52:28 PM
to rails...@googlegroups.com