problem with UTF-8 characters?

Chris

Jul 18, 2006, 12:01:25 PM
to soap4r
I've attempted to send the string

è or à

in a SOAP message with soap4r, and I get this error back from the server:

XSD::ValueSpaceError: {http://www.w3.org/2001/XMLSchema}string: cannot accept 'è or à'.

However, I have a SOAP client written in Java that handles the UTF-8
string properly. Is this a soap4r problem? Any pointers to why the
Ruby SOAP client mishandles this string would be appreciated.

zinsser

Jul 19, 2006, 6:43:37 AM
to soap4r
Have you tried setting
XSD::Charset.encoding = 'UTF8'
in your client code?

Timo

Chris

Jul 19, 2006, 4:40:15 PM
to soap4r

I'm having trouble figuring out how to set this correctly. Could you
show a code snippet demonstrating how to set the encoding?
Thanks...

Tiago Macedo

Jul 19, 2006, 4:50:27 PM
to soa...@googlegroups.com
This might be unrelated to your issue, but I also had some trouble
with UTF-8 in soap4r. The cause was that my Ruby installation
(a normal compile) didn't have iconv (an issue on FreeBSD).

Make sure that typing "require 'iconv'" in a fresh irb session returns true.
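
The same check can be run as a script. This is a minimal sketch that relies only on Kernel#require raising LoadError when a library is missing; the method name iconv_available? is made up for illustration. Because require returns false when the library was already loaded, rescuing LoadError is a more reliable signal than the return value.

```ruby
# Hypothetical helper: reports whether the iconv extension can be loaded.
def iconv_available?
  require 'iconv'   # true on first load, false if it was already loaded
  true
rescue LoadError
  false             # iconv is not installed for this Ruby
end

puts "iconv available: #{iconv_available?}"
```

On Ruby 2.0 and later iconv is no longer bundled, so this prints false unless the separate iconv gem is installed; String#encode covers the same ground there.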

Tiago Macedo


Chris

Jul 19, 2006, 5:23:08 PM
to soap4r

Tiago Macedo wrote:

> This might be unrelated to your issues, but I also had some troubles
> with UTF8 in soap4r and it was due to the fact that my ruby installation
> (a normal compile) didn't have iconv (an issue with freebsd).
>
> Make sure typing "require 'iconv'" in irb returns true.

I'm on WinXP and I have iconv, but thanks for the suggestion, it was
worth checking.

zinsser

Jul 20, 2006, 6:08:29 AM
to soap4r
> I'm having trouble figuring out how to set this correctly. Could you
> show a code snippet demonstrating how to set the encoding?
> Thanks...

Here you are:

class Api
  def initialize(amazon_key)
    @amazon_key = amazon_key

    @driver = AWSECommerceServicePortType.new
    @driver.wiredump_dev = STDOUT if $DEBUG   # dump the SOAP traffic when debugging

    XSD::Charset.encoding = 'UTF8' ### set encoding
  end

  # Forward any unknown method to the SOAP driver, adding the access key.
  def method_missing(m, *args)
    request = args[0]
    request[:AWSAccessKeyId] = @amazon_key
    response = @driver.send(m, request)

    # [...]

    return response
  end
end

Timo

Chris

Jul 20, 2006, 5:52:09 PM
to soap4r

> @driver = AWSECommerceServicePortType.new
> @driver.wiredump_dev = STDOUT if $DEBUG
>
> XSD::Charset.encoding = 'UTF8' ### set encoding

Hmmm. Seems to have had no effect on my client script. Perhaps
something else is going on.

NAKAMURA, Hiroshi

Jul 22, 2006, 10:55:11 PM
to soa...@googlegroups.com
Hi,

You said that the server returns the error, IIRC. Your server is written
in Ruby + soap4r, right? If the server really is returning the error,
you should add the line above to the server.

I can look into it more deeply once you post or send a working sample.

Regards,
// NaHi

Chris

Jul 24, 2006, 1:36:18 PM
to soap4r

> Chris wrote:
> >> @driver = AWSECommerceServicePortType.new
> >> @driver.wiredump_dev = STDOUT if $DEBUG
> >>
> >> XSD::Charset.encoding = 'UTF8' ### set encoding
> >
> > Hmmm. Seems to have had no effect on my client script. Perhaps
> > something else is going on.
>
> You said that 'Server returns the error' IIRC. Your server is written
> in Ruby + soap4r, right? If really the server is returning the error,
> you should add above line to the server.

Sorry, I misspoke. The server is actually in C++. I don't know the
significance of the

XSD::ValueSpaceError: {http://www.w3.org/2001/XMLSchema}string: cannot
accept 'è or à'.

message. But it seems like it should work, since the same string
passed to the same server by a Java SOAP client is handled correctly by
the server.

NAKAMURA, Hiroshi

Aug 7, 2006, 2:08:24 AM
to soa...@googlegroups.com
Hi,

Chris wrote:
> Sorry, I misspoke. The server is actually in C++. I don't know the
> significance of the
>
> XSD::ValueSpaceError: {http://www.w3.org/2001/XMLSchema}string: cannot
> accept 'è or à'.
>
> message. But it seems like like it should work, since the same string
> passed to the same server by a Java SOAP client is handled correctly by
> the server.

Yes, it should work. I think the soap4r client-side library raises
XSD::ValueSpaceError itself and stops before sending the request to the server.

1) What is dumped when you execute:

   % ruby -riconv -e 0

2) What is dumped when you execute the following Ruby code:

   require 'soap/marshal'
   str = 'è or à'
   puts SOAP::Marshal.dump(str)

3) What is dumped when you execute the following Ruby code:

   require 'soap/marshal'
   str = 'è or à'
   XSD::Charset.encoding = 'UTF8' # add this line
   puts SOAP::Marshal.dump(str)

Regards,
// NaHi


Chris

Aug 7, 2006, 5:17:29 PM
to soap4r
Seems to be exactly the same:

##############################

require 'soap/marshal'
str = 'è or à'

#XSD::Charset.encoding = 'UTF8' # add this line
puts SOAP::Marshal.dump(str)

>ruby blah.rb
<?xml version="1.0" encoding="utf-8" ?>
<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <env:Body>
    <String xmlns:n1="http://schemas.xmlsoap.org/soap/encoding/"
        xsi:type="n1:base64"
        env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">6CBvciDg</String>
  </env:Body>
</env:Envelope>
>Exit code: 0
##############################
>ruby blah.rb
<?xml version="1.0" encoding="utf-8" ?>
<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <env:Body>
    <String xmlns:n1="http://schemas.xmlsoap.org/soap/encoding/"
        xsi:type="n1:base64"
        env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">6CBvciDg</String>
  </env:Body>
</env:Envelope>
>Exit code: 0
##############################

I was going to do this in IRB, but pasting the string into IRB yields an
odd result:

normal CMD prompt:


>str = 'è or à'

IRB prompt:
irb(main):002:0* >str = 'ach or acc'
###################################

and adding the line

XSD::Charset.encoding = 'UTF8'

to the script has no effect on the XSD::ValueSpaceError message.

NAKAMURA, Hiroshi

Aug 7, 2006, 11:29:49 PM
to soa...@googlegroups.com
Hi,

Chris wrote:
> seems to be exactly the same:

> ##############################3
> require 'soap/marshal'
> str = 'è or à'
> #XSD::Charset.encoding = 'UTF8' # add this line
> puts SOAP::Marshal.dump(str)
>
>> ruby blah.rb
> <?xml version="1.0" encoding="utf-8" ?>
> <env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> <env:Body>
> <String xmlns:n1="http://schemas.xmlsoap.org/soap/encoding/"
> xsi:type="n1:base64"
>
> env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">6CBvciDg</String>
> </env:Body>
> </env:Envelope>
>> Exit code: 0

Hmm. The declaration says encoding="utf-8", but the string itself is sent as base64. Please show me the result of:

> ruby -Ku blah.rb
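
The base64 payload in the dump is itself a clue. A minimal sketch that decodes it and inspects the bytes, assuming a Ruby with per-string encodings (1.9 or later, so newer than the 1.8 series this thread is about); the helper names are illustrative and not part of soap4r:

```ruby
# Decode a base64 payload ('m' is the base64 directive of String#unpack).
def decode_payload(b64)
  b64.unpack('m').first
end

# Render raw bytes as a lowercase hex string.
def hex_of(bytes)
  bytes.unpack('H*').first
end

raw = decode_payload('6CBvciDg')
puts hex_of(raw)                                           # e8206f7220e0
puts raw.dup.force_encoding('UTF-8').valid_encoding?       # false
puts raw.dup.force_encoding('ISO-8859-1').encode('UTF-8')  # è or à
```

The bytes e8 and e0 are the ISO-8859-1 codes for è and à, which suggests the string was never UTF-8 when it reached soap4r.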

By the way, what version of soap4r are you using?

Regards,
// NaHi

Chris

Aug 8, 2006, 11:37:57 AM
to soap4r

> Hmm. encoding="utf-8" and base64 encoding. Please show me the result of;
>
> > ruby -Ku blah.rb

Interesting:

C:\Documents and Settings\user\Desktop>ruby -Ku blah.rb
blah.rb:4: parse error, unexpected tCONSTANT, expecting $


XSD::Charset.encoding = 'UTF8' # add this line

^

here is blah.rb:
###########################


require 'soap/marshal'
str = 'è or à'

XSD::Charset.encoding = 'UTF8' # add this line
puts SOAP::Marshal.dump(str)

##############################

> By the way, what version of soap4r are you using?

20051204

NAKAMURA, Hiroshi

Aug 9, 2006, 9:18:18 AM
to soa...@googlegroups.com
Hi,

Chris wrote:
>>> ruby -Ku blah.rb
>
> Interesting:
>
> C:\Documents and Settings\user\Desktop>ruby -Ku blah.rb
> blah.rb:4: parse error, unexpected tCONSTANT, expecting $
> XSD::Charset.encoding = 'UTF8' # add this line
> ^
>
> here is blah.rb:
> ###########################
> require 'soap/marshal'
> str = 'è or à'
> XSD::Charset.encoding = 'UTF8' # add this line
> puts SOAP::Marshal.dump(str)
> ##############################

Interesting. The error message says it's at line #4, but it seems to be at
line #3.

1) What version of Ruby are you using?
   ruby -v

2) Please show a hexdump of blah.rb using the attached hexdump.rb:
   ruby -rhexdump.rb -e 'puts PGP::HexDump.encode(File.read("blah.rb"))'
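
For readers without the attachment, a rough stand-in is sketched below. It is not the original hexdump.rb; the hexdump method and its layout (offset, four groups of four bytes, printable characters) are assumptions made to resemble the dumps posted in this thread.

```ruby
# Hypothetical stand-in for the attached hexdump.rb (not the original file).
# Returns one line per 16 bytes: offset, hex groups, printable ASCII.
def hexdump(data, width = 16)
  lines = []
  data.unpack('C*').each_slice(width).with_index do |row, i|
    hex  = row.map { |b| format('%02x', b) }.each_slice(4).map(&:join).join(' ')
    text = row.map { |b| (0x20..0x7e).cover?(b) ? b.chr : '.' }.join
    lines << format('%08x  %-35s  %s', i * width, hex, text)
  end
  lines.join("\n")
end

puts hexdump("require 'soap/marshal'\n")
```

To dump a file, pass its raw bytes, for example hexdump(File.binread('blah.rb')) on Ruby 1.9.3 or later.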

>> By the way, what version of soap4r are you using?
>
> 20051204

Thanks.

Regards,
// NaHi

hexdump.rb

Chris

Aug 9, 2006, 12:27:39 PM
to soap4r

> Interesting. Error message says it's in line #4 but it seems to be at
> line #3.

line #4 is correct (line #1 is blank)


> 1) What version of ruby are you using?

ruby 1.8.4 (2005-12-24) [i386-mswin32]

> 2) Please show hexdump of blah.rb with attached hexdump.rb
> ruby -rhexdump.rb -e 'puts PGP::HexDump.encode(File.read("blah.rb"))'


C:\Documents and Settings\user\Desktop>ruby -rhexdump.rb -e 'puts PGP::HexDump.encode(File.read("blah.rb"))'
00000000  0a726571 75697265 2027736f 61702f6d  .require 'soap/m
00000010  61727368 616c270a 73747220 3d2027e8  arshal'.str = '.
00000020  206f7220 e0270a58 53443a3a 43686172   or .'.XSD::Char
00000030  7365742e 656e636f 64696e67 203d2027  set.encoding = '
00000040  55544638 27202320 61646420 74686973  UTF8' # add this
00000050  206c696e 650a7075 74732053 4f41503a   line.puts SOAP:
00000060  3a4d6172 7368616c 2e64756d 70287374  :Marshal.dump(st
00000070  722920                               r)
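
As an aid to reading the dump, here is a small sketch of what the same literal looks like in each encoding, assuming a Ruby with String#encode (1.9 or later). hex_in is an illustrative helper, not part of soap4r.

```ruby
# Hypothetical helper: hex bytes of a string after converting it to an encoding.
def hex_in(text, encoding)
  text.encode(encoding).unpack('H*').first
end

sample = "\u00e8 or \u00e0"        # è or à, written with escapes
puts hex_in(sample, 'ISO-8859-1')  # e8206f7220e0
puts hex_in(sample, 'UTF-8')       # c3a8206f7220c3a0
```

The dump shows e8 and e0 as single bytes, which matches the ISO-8859-1 form rather than the two-byte UTF-8 sequences c3 a8 and c3 a0.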

NAKAMURA, Hiroshi

Aug 9, 2006, 8:05:45 PM
to soa...@googlegroups.com
Hi,

Chris wrote:
>> 1) What version of ruby are you using?
>
> ruby 1.8.4 (2005-12-24) [i386-mswin32]

It's new enough.

>> 2) Please show hexdump of blah.rb with attached hexdump.rb
>> ruby -rhexdump.rb -e 'puts PGP::HexDump.encode(File.read("blah.rb"))'
>
> C:\Documents and Settings\user\Desktop>ruby -rhexdump.rb -e 'puts
> PGP::HexDump.encode(File.read("
> blah.rb"))'
> 00000000 0a726571 75697265 2027736f 61702f6d .require 'soap/m
> 00000010 61727368 616c270a 73747220 3d2027e8 arshal'.str = '.
> 00000020 206f7220 e0270a58 53443a3a 43686172 or .'.XSD::Char

Thanks. blah.rb is saved in ISO-8859-1 encoding: è and à appear as the single bytes e8 and e0.
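
This also offers a plausible reading of the earlier parse error under -Ku. The sketch below assumes that the Ruby 1.8 lexer, in UTF-8 mode, decides how many bytes a character occupies from the lead byte alone; that is an assumption about the interpreter, and both method names are made up for illustration.

```ruby
# Length of a UTF-8 sequence judged from the lead byte alone,
# without validating the continuation bytes.
def naive_utf8_length(byte)
  case byte
  when 0xf0..0xf7 then 4
  when 0xe0..0xef then 3
  when 0xc0..0xdf then 2
  else 1
  end
end

# Group raw bytes into "characters" the way such a lexer would.
def naive_utf8_chunks(bytes)
  chunks = []
  i = 0
  while i < bytes.length
    n = naive_utf8_length(bytes[i])
    chunks << bytes[i, n]
    i += n
  end
  chunks
end

# Bytes of the quoted literal plus the newline, taken from the hexdump of blah.rb.
literal = [0x27, 0xe8, 0x20, 0x6f, 0x72, 0x20, 0xe0, 0x27, 0x0a]
naive_utf8_chunks(literal).each do |chunk|
  puts chunk.map { |b| format('%02x', b) }.join(' ')
end
```

The last group printed is e0 27 0a: the byte e0 takes the closing quote and the newline with it. Under that assumption the string literal would run on into the next line and end at the quote before UTF8, which would leave UTF8 as an unexpected constant on line 4.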

1) Save blah.rb in UTF-8 encoding, or
2) set XSD::Charset.encoding = 'X_ISO_8859_1'

Option 2) may not work in your version. I don't remember when I added it.
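
Option 1) can be done in most editors. As a scripted alternative, here is a minimal sketch assuming Ruby 1.9.3 or later (for File.binread and File.binwrite); resave_as_utf8 is a hypothetical helper, not something soap4r provides.

```ruby
# Hypothetical helper: rewrite a file stored as ISO-8859-1 so that it is UTF-8.
def resave_as_utf8(path, from = 'ISO-8859-1')
  raw  = File.binread(path)                        # raw bytes, no transcoding
  utf8 = raw.force_encoding(from).encode('UTF-8')  # reinterpret, then convert
  File.binwrite(path, utf8)
  utf8
end

# Example call (commented out so the sketch has no side effects):
# resave_as_utf8('blah.rb')
```

Run it only once per file: every byte sequence is valid ISO-8859-1, so applying it to a file that is already UTF-8 would silently double-encode the accented characters.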

Regards,
// NaHi

Chris

Aug 10, 2006, 1:42:02 PM
to soap4r

> > C:\Documents and Settings\user\Desktop>ruby -rhexdump.rb -e 'puts
> > PGP::HexDump.encode(File.read("
> > blah.rb"))'
> > 00000000 0a726571 75697265 2027736f 61702f6d .require 'soap/m
> > 00000010 61727368 616c270a 73747220 3d2027e8 arshal'.str = '.
> > 00000020 206f7220 e0270a58 53443a3a 43686172 or .'.XSD::Char
>
> Thanks. blah.rb is saved as iso-8859-1 encoding.
>
> 1) save blah.rb as UTF-8 encoding or
> 2) XSD::Charset.encoding = 'X_ISO_8859_1'
>
> 2) may not work. I don't remember when I added it.


Aha! Thank you very much! I understand what's going on now. And
you're correct, 2) does not work. :)

I really appreciate the time you spent on this.
-Chris
