This is in C#.
When reading some UTF-8 text from an Access notes field using ADO.NET, I get
the UTF-8 bytes back as separate characters in a UTF-16 string. In this
example I'm going to use the Hebrew het character 'ח'. This is U+05D7 in
Unicode and D7 97 in UTF-8. I can easily convert the U+05D7 character held in
a C# string to its corresponding UTF-8 bytes:
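// Requires: using System.Text;
UnicodeEncoding unicode = new UnicodeEncoding(); // UTF-16, little-endian
UTF8Encoding utf8 = new UTF8Encoding();
string het = "ח"; // U+05D7
byte[] UnicodeHet = unicode.GetBytes(het); // UTF-16 bytes: D7 05
byte[] UTF8Bytes = Encoding.Convert(unicode, utf8, UnicodeHet); // UTF-8 bytes: D7 97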
UTF8Bytes is then written to the database.
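
For comparison, a minimal sketch of the same conversion done in one step with Encoding.UTF8, skipping the intermediate UTF-16 byte array. The class and method names (HetToUtf8, ToUtf8) are hypothetical and only there to make the snippet self-contained.

```csharp
using System;
using System.Text;

public static class HetToUtf8
{
    // Encodes a string straight to UTF-8 bytes (no byte order mark is added).
    public static byte[] ToUtf8(string s)
    {
        return Encoding.UTF8.GetBytes(s);
    }

    public static void Main()
    {
        byte[] bytes = ToUtf8("\u05D7"); // Hebrew het
        Console.WriteLine(BitConverter.ToString(bytes)); // prints D7-97
    }
}
```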
When I read this back from the database I get two characters, one per UTF-8
byte, held in a UTF-16 C# string. I can convert these back to the het
character using the following code:
// Requires: using System.Text;
UnicodeEncoding unicode = new UnicodeEncoding();
UTF8Encoding utf8 = new UTF8Encoding();
Encoding local = Encoding.GetEncoding(1252);
// Normally read from the database but hardcoded here: the bytes D7 97 as
// they appear when decoded under code page 1252 (U+00D7, U+2014).
string utf8het = "\u00D7\u2014";
byte[] utf8hetbytes = local.GetBytes(utf8het); // D7 97 on a 1252 machine
byte[] utf8result = Encoding.Convert(utf8, unicode, utf8hetbytes);
string result = unicode.GetString(utf8result); // "ח"
If the code page for the machine is set to 1252 this works correctly. For
example, if the value stored in the database was the Hebrew het character 'ח',
local.GetBytes returns the UTF-8 bytes D7 97, which are then correctly decoded
to U+05D7.
Problem: If I subsequently change the code page of the machine to Hebrew,
byte[] utf8hetbytes = local.GetBytes(utf8het);
will start returning 3F 97. 3F is '?', which generally means the character
could not be translated into the target encoding.
Why?
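
A sketch of one possible cause, under an assumption the post does not confirm: the data layer decodes the stored bytes with the machine's ANSI code page, and the Hebrew machine uses code page 1255. In that case the string itself changes with the machine setting (0xD7 becomes U+05F3 instead of U+00D7), and the 1252 encoding has no byte for U+05F3. The class and method names below are hypothetical.

```csharp
using System;
using System.Text;

public static class CodePageMismatch
{
    static CodePageMismatch()
    {
        // Needed on .NET Core / .NET 5+ to make code pages 1252 and 1255
        // available. On .NET Framework they are built in and this line
        // can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Decodes raw bytes with one code page, re-encodes the resulting string
    // with another, and returns the final bytes as hex.
    public static string Reencode(byte[] raw, int decodeCodePage, int encodeCodePage)
    {
        string text = Encoding.GetEncoding(decodeCodePage).GetString(raw);
        byte[] back = Encoding.GetEncoding(encodeCodePage).GetBytes(text);
        return BitConverter.ToString(back);
    }

    public static void Main()
    {
        byte[] het = { 0xD7, 0x97 }; // UTF-8 bytes of U+05D7

        // Decoded and re-encoded with the same code page: bytes survive.
        Console.WriteLine(Reencode(het, 1252, 1252)); // D7-97

        // Decoded as 1255 (assumed Hebrew machine), re-encoded as 1252:
        // U+05F3 has no 1252 byte, so it becomes '?' (3F).
        Console.WriteLine(Reencode(het, 1255, 1252)); // 3F-97
    }
}
```

If this assumption holds, Encoding.GetEncoding(1252) itself behaves the same on both machines; what differs is the input string handed to it.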
If I switch to using the default code page, it always works. Unfortunately
it appears that the rest of the (poor) code requires 1252. Am I wrong in
assuming that if I ask for the 1252 encoding it should not be affected by the
code page of the machine? It appears that I am faced with a major rework
because of this.
Is there another way to get the two UTF-8 bytes held in a C# string into a
byte array without going through a code page?
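
A sketch of one workaround, assuming (unconfirmed) that the string was produced by decoding the stored bytes with a single-byte code page. It does not avoid the code page, but it makes the dependency explicit by re-encoding with the same encoding that did the decoding instead of a hardcoded 1252. RecoverUtf8Text and decodedWith are hypothetical names, and 1255 stands in for the Hebrew code page.

```csharp
using System;
using System.Text;

public static class Utf8Recovery
{
    static Utf8Recovery()
    {
        // Needed on .NET Core / .NET 5+; .NET Framework has these code
        // pages built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Reverses the decoding step with the same code page that produced
    // `mangled`, then interprets the recovered bytes as UTF-8.
    public static string RecoverUtf8Text(string mangled, Encoding decodedWith)
    {
        byte[] raw = decodedWith.GetBytes(mangled);
        return Encoding.UTF8.GetString(raw);
    }

    public static void Main()
    {
        string on1252 = "\u00D7\u2014"; // D7 97 as decoded under 1252
        string on1255 = "\u05F3\u2014"; // D7 97 as decoded under 1255

        Console.WriteLine(RecoverUtf8Text(on1252, Encoding.GetEncoding(1252)) == "\u05D7"); // True
        Console.WriteLine(RecoverUtf8Text(on1255, Encoding.GetEncoding(1255)) == "\u05D7"); // True
    }
}
```

On .NET Framework, Encoding.Default returns the machine's ANSI code page encoding and could be passed as decodedWith. On .NET Core and .NET 5+, Encoding.Default is UTF-8, so the code page would have to be named explicitly. Round-tripping is only reliable for byte values the code page actually defines.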
Thanks in advance
Alex