
Saving UTF-8 characters to YAML problem


Paweł Radecki

Apr 7, 2008, 6:24:45 PM
Hi there,

I'm trying to save multinational data to YAML with the following simple
program:

require 'yaml'

# show "James Bond 007: Nightfire in Chinese"
# text in YAML form
puts '詹姆斯邦德007:暗夜之火'.to_yaml

and what I get is:
--- "\xE8\xA9\xB9\xE5\xA7\x86\xE6\x96\xAF\xE9\x82\xA6\xE5\xBE
\xB7007\xEF\xBC\x9A\xE6\x9A\x97\xE5\xA4\x9C\xE4\xB9\x8B\xE7\x81\xAB"

How can I make this human-readable text?

Any help much appreciated.
Thanks in advance!

--
Paweł Radecki
e: pawel.j...@gmail.com
w: http://radeckimarch.blogspot.com/

Phillip Gawlowski

Apr 7, 2008, 6:56:09 PM
Paweł Radecki wrote:
| Hi there,
|
| I'm trying to save multinational data to YAML with following simple
| program:
|
| require 'yaml'
|
| # show "James Bond 007: Nightfire in Chinese"
| # text in YAML form
| puts '詹姆斯邦德007:暗夜之火'.to_yaml
|
| and what I get is:
| --- "\xE8\xA9\xB9\xE5\xA7\x86\xE6\x96\xAF\xE9\x82\xA6\xE5\xBE
| \xB7007\xEF\xBC\x9A\xE6\x9A\x97\xE5\xA4\x9C\xE4\xB9\x8B\xE7\x81\xAB"
|
| How can I make this to be human readable text?

With an editor that understands UTF-8, I guess (considering that you use
characters I know from Polish, you probably already do). The OS has to
support UTF-8, too (I *think* that Windows does, and so should Mac OS X.
I'm not sure about other *NIX flavors).

If you mean within Ruby, Iconv and Kconv are the way to go, AFAIK, for
converting strings between character sets.
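
A minimal sketch of that kind of conversion, assuming Ruby 1.9 or later, where String#encode is built in (Iconv, named above, was removed from the standard library in Ruby 2.0). The sample string is the one from the question, and UTF-16LE is chosen only for illustration. Note that transcoding changes the bytes of a string, which is a separate matter from how a YAML emitter escapes them.

```ruby
# encoding: utf-8
text = '詹姆斯邦德007:暗夜之火'

# Transcode to UTF-16LE and back again (illustrative target encoding).
utf16 = text.encode('UTF-16LE')
round_trip = utf16.encode('UTF-8')

puts utf16.encoding       # name of the target encoding
puts round_trip == text   # whether the round trip preserved the text
```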


--
Phillip Gawlowski
Twitter: twitter.com/cynicalryan

Rule of Open-Source Programming #8:

Open-Source is not a panacea.

Michael Fellinger

Apr 8, 2008, 8:29:15 AM
On Tue, Apr 8, 2008 at 12:25 AM, Paweł Radecki
<pawel.j...@gmail.com> wrote:
> Hi there,
>
> I'm trying to save multinational data to YAML with following simple
> program:
>
> require 'yaml'
>
> # show "James Bond 007: Nightfire in Chinese"
> # text in YAML form
> puts '詹姆斯邦德007:暗夜之火'.to_yaml
>
> and what I get is:
> --- "\xE8\xA9\xB9\xE5\xA7\x86\xE6\x96\xAF\xE9\x82\xA6\xE5\xBE
> \xB7007\xEF\xBC\x9A\xE6\x9A\x97\xE5\xA4\x9C\xE4\xB9\x8B\xE7\x81\xAB"
>
> How can I make this to be human readable text?
>

Install the ya2yaml gem and use "詹姆斯邦德007:暗夜之火".ya2yaml instead of .to_yaml.
The problem with the YAML library that ships with Ruby is that it doesn't
have good Unicode support, so any "strange" text is saved as binary;
this behaviour has existed since around 1.8.5.
You don't have to change the way your YAML is read, but you do have to
use ya2yaml for serializing.
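
For readers on a newer Ruby, here is a minimal sketch under a different assumption than the rest of this thread: Ruby 1.9.3 or later, where the bundled YAML engine is Psych rather than the 1.8-era one described above. There, plain to_yaml is expected to emit the characters as UTF-8 text without any extra gem, and the result loads back unchanged.

```ruby
# encoding: utf-8
require 'yaml'

text = '詹姆斯邦德007:暗夜之火'

# Assumption: Psych is the YAML engine (the default since Ruby 1.9.3).
yaml = text.to_yaml
puts yaml

# Loading the emitted document gives back the original string.
restored = YAML.load(yaml)
puts restored == text
```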

^ manveru

Paweł Radecki

Apr 8, 2008, 5:30:27 PM
> install the ya2yaml gem and use "詹姆斯邦德007:暗夜之火".ya2yaml instead of .to_yaml

I really hoped this would work, but it didn't work 100%.

Here is what I did:
I installed the ya2yaml gem using "gem install ya2yaml" (Windows
box) and "sudo gem install ya2yaml" (Linux box). Got: ya2yaml-0.26.

Then I modified my simple program to be:
#!/usr/bin/env ruby

require 'yaml'
require 'ya2yaml'
require 'jcode'
$KCODE = 'u'

# show "James Bond 007: Nightfire" text
# in Chinese in YAML form
puts '詹姆斯邦德007:暗夜之火'.ya2yaml

and ran it on both my Windows and Linux boxes.

Windows: When I redirect the output to a file (running "yaml_test.rb >
test.yaml") it works perfectly, but when I try to display the output on
screen (running "yaml_test.rb") with code page 65001 active (set
through "chcp 65001" on the command line) I see:
(squares)007(squares) and an error message: "in 'write' Bad file
descriptor (Errno::EBADF)".

Linux: Running the above program, I get:
"./yaml_test.rb:4:in `require': no such file to load -- ya2yaml
(LoadError) from ./yaml_test.rb:4"

Any clue how to work around this?
Any help much appreciated!

Justin Collins

Apr 8, 2008, 6:25:04 PM

Use

require 'rubygems'

before

require 'ya2yaml'
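
Put together, the load order might look like the sketch below. The dump_readable helper and the fallback to to_yaml are illustrative assumptions, not part of ya2yaml or of the program in this thread; they only keep the script running on a machine where the gem is missing.

```ruby
# encoding: utf-8
require 'yaml'
require 'rubygems' # on Ruby 1.8 this must come before requiring any gem

# Try to load the gem; carry on without it if it is not installed.
begin
  require 'ya2yaml'
rescue LoadError
  warn 'ya2yaml not installed, falling back to to_yaml'
end

# Hypothetical helper: use ya2yaml when it was loaded, else plain to_yaml.
def dump_readable(obj)
  obj.respond_to?(:ya2yaml) ? obj.ya2yaml : obj.to_yaml
end

puts dump_readable('詹姆斯邦德007:暗夜之火')
```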


-Justin

Paweł Radecki

Sep 8, 2008, 8:57:32 PM
> Use
>
> require 'rubygems'
>
> before
>
> require 'ya2yaml'
>
> -Justin

It worked. Thanks, guys! You saved me hours!

Joao Silva

Sep 9, 2008, 3:45:41 PM
Here's a patch for "rake extract_fixtures" that also uses ya2yaml:
http://fukamachi.org/wp/2007/05/18/rails-dump-database-to-fixtures-preserving-utf8/
Saving YAML as UTF-8 should really be default behavior. UnicodeDammit!
--
Posted via http://www.ruby-forum.com/.
