Hi all,
I created a recordset with 1 row, containing a field with
the following test-values (without the bordering '|'-chars):
Fieldname: VORNAME
Value: |Töü'ßst"a&<>|
(I hope your news reader can display these characters.)
When I export the recordset to XML with the Save method,
an XML file is generated which contains invalid characters.
See http://www.wana.at/xml which is the saved XML file.
The problem seems to be that the 'ß' character doesn't get
converted to UTF-8 correctly, as the following hexdump shows:
0000a280 44 56 52 5f 4e 52 3d 27 31 31 39 32 27 20 4e 41 |DVR_NR='1192' NA|
0000a290 43 48 4e 41 4d 45 3d 27 57 61 6e 61 27 20 56 4f |CHNAME='Wana' VO|
0000a2a0 52 4e 41 4d 45 3d 27 54 c3 b6 c3 bc 26 23 78 32 |RNAME='T....&#x2|
0000a2b0 37 3b c3 78 73 74 26 23 78 32 32 3b 61 26 23 78 |7;.xst&#x22;a&#x|
0000a2c0 32 36 3b 26 23 78 33 63 3b 26 23 78 33 65 3b 27 |26;&#x3c;&#x3e;'|
Look at the part between &#x27; and "st": there should be
the escaped 'ß' (or the UTF-8-encoded character; I am not
quite sure yet). I only know that 0xc3 0x78 is definitely wrong:
'ß' in UTF-8 would be 0xc3 0x9f.
http://www.hcrc.ed.ac.uk/~richard/xml-check.html also tells
me that the file contains invalid characters.
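The byte-level claim above can be checked independently of ADO. The following is only a minimal Python sketch; the helper name is_valid_utf8 is made up for illustration, and the four bytes are copied by hand from the hexdump at offset 0xa2b2:

```python
# Expected UTF-8 encoding of 'ß' (U+00DF), written as an escape so the
# example does not depend on the encoding of this file.
expected = "\u00df".encode("utf-8")
print(expected.hex(" "))  # c3 9f

# Bytes copied from the hexdump (offset 0xa2b2): c3 78 73 74
dumped = bytes.fromhex("c3 78 73 74")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(expected + b"st"))  # True
print(is_valid_utf8(dumped))            # False
```

0xc3 announces a two-byte sequence, so the next byte must be a continuation byte in the range 0x80..0xbf. 0x78 (a plain 'x') is not, which is why a validator rejects the file.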
If you want to reproduce the error, try the above. I was using
the ISO-8859-1 charset. Can someone confirm this bug?
regards,
Tom
"Thomas Wana" <thomas_h...@wana.at> wrote in message
news:10576717...@newsmaster-03.atnet.at...
> I found that saving it in binary form works better than XML. The XML is
> unreadable anyway, and you can Base64-encode the binary form (ADTG) if you
> need to transfer it as plain text.
>
Hi,
thanks for the tip, but we need to save it to XML because we are
transmitting the recordsets to SOAP services that have to understand
the XML again.
Tom
> It doesn't matter. If you save it in binary form, you can encode it for
> transmission and decode it when you receive it in your SOAP service, before
> you load it as a disconnected recordset.
Yeah, that would work if it was a MS-SOAP-Service ... but it isn't.
It doesn't even run on Windows.
Tom
I resolved the bug on my own. It seems that UTF-8-encoding
a string that is already UTF-8 irreversibly corrupts the data. The
solution was to save the ADO recordset in ISO-8859-1 and then transmit
it as UTF-8 XML.
tom
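For anyone who runs into the same thing, here is a rough Python sketch of the general idea of converting exactly once. It is not ADO code, the function name to_utf8_once is made up, and the generic double encoding shown here does not reproduce the exact 0xc3 0x78 bytes from the dump; it only illustrates why a second conversion damages the text:

```python
def to_utf8_once(raw: bytes, source_charset: str = "iso-8859-1") -> bytes:
    """Decode raw from its source charset and encode it to UTF-8 (hypothetical helper)."""
    return raw.decode(source_charset).encode("utf-8")


# Part of the test value ("Töüßst"), written with escapes, as ISO-8859-1 bytes.
text = "T\u00f6\u00fc\u00dfst"
latin1_bytes = text.encode("iso-8859-1")

# Correct: one conversion from ISO-8859-1 to UTF-8.
good = to_utf8_once(latin1_bytes)
print(good.hex(" "))  # 54 c3 b6 c3 bc c3 9f 73 74

# Wrong: the UTF-8 bytes are mistaken for ISO-8859-1 and converted again.
double = to_utf8_once(good)
print(double.decode("utf-8") == text)  # False
```

In the first case the 'ß' ends up as 0xc3 0x9f, as expected. In the second case every non-ASCII character is expanded a second time and the text no longer matches the original.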