On Sep 22, 11:14 am, Tim Mensch <tim.men...@gmail.com> wrote:
> See:
> http://en.wikipedia.org/wiki/UTF-8
See also:
http://download.oracle.com/javase/6/docs/technotes/guides/jni/spec/types.html#wp16542
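This page has the definition of "Modified UTF-8" as recognized by the
VM. The main difference from standard UTF-8 is that '\0' (U+0000) is
encoded as the two-byte sequence 0xC0 0x80, so the encoded data never
contains an embedded zero byte and you can use C-style strings. The
other difference is that the 4-byte format isn't recognized;
characters above U+FFFF are encoded as a surrogate pair, with three
bytes for each surrogate.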
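To make those two differences concrete, here is a minimal sketch of an encoder for a single code point. The name mutf8_encode is made up for illustration and is not part of JNI; it follows the encoding rules described on the page above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a JNI function: encode one 16-bit code unit
 * (0x0800..0xFFFF) in the 3-byte form. */
static void put3(uint32_t u, uint8_t *out)
{
    out[0] = (uint8_t)(0xE0 | (u >> 12));
    out[1] = (uint8_t)(0x80 | ((u >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (u & 0x3F));
}

/* Hypothetical helper, not a JNI function: encode one Unicode code point
 * as Modified UTF-8. Writes up to 6 bytes to out and returns the number
 * written, or 0 if cp is a surrogate or is above U+10FFFF. */
size_t mutf8_encode(uint32_t cp, uint8_t out[6])
{
    assert(out != NULL);
    if (cp == 0) {                      /* U+0000: two bytes, never 0x00 */
        out[0] = 0xC0;
        out[1] = 0x80;
        return 2;
    }
    if (cp < 0x80) {                    /* ASCII: same as standard UTF-8 */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 2-byte form */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 3-byte form */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogates are not code points */
        put3(cp, out);
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* no 4-byte form: surrogate pair */
        uint32_t v = cp - 0x10000;
        put3(0xD800 | (v >> 10), out);       /* high surrogate, 3 bytes */
        put3(0xDC00 | (v & 0x3FF), out + 3); /* low surrogate, 3 bytes */
        return 6;
    }
    return 0;
}
```

Everything below U+10000 except U+0000 comes out byte-for-byte identical to standard UTF-8, which is why plain ASCII strings work either way.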
The key thing to remember is that JNI provides NewStringUTF, not
NewStringASCII or NewStringISOLatin1. If you pass in a non-UTF-8
string, the VM will convert it to something other than what you had in
mind. (Or, if you have CheckJNI enabled, complain bitterly and
abort.)
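If you aren't sure where a string came from, one option is to sanity-check the bytes before handing them to NewStringUTF. The following is only a rough sketch: looks_like_modified_utf8 is a hypothetical helper (not something JNI provides), and it only checks the lead-byte/continuation-byte structure. It does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of JNI: returns 1 if the NUL-terminated
 * string s consists only of well-formed 1-, 2-, or 3-byte sequences,
 * and 0 otherwise. Structural check only; overlong forms are allowed. */
int looks_like_modified_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    assert(s != NULL);
    while (*p != 0) {
        size_t extra;
        if (*p < 0x80)
            extra = 0;                  /* plain ASCII */
        else if ((*p & 0xE0) == 0xC0)
            extra = 1;                  /* 2-byte lead */
        else if ((*p & 0xF0) == 0xE0)
            extra = 2;                  /* 3-byte lead */
        else
            return 0;                   /* stray continuation or 4-byte lead */
        p++;
        for (; extra > 0; extra--, p++) {
            /* A terminating NUL fails this test too, so we never run
             * past the end of the string. */
            if ((*p & 0xC0) != 0x80)
                return 0;
        }
    }
    return 1;
}
```

With this, an ISO Latin-1 string such as "caf\xE9" is rejected (0xE9 looks like a 3-byte lead with nothing after it), while the UTF-8 form "caf\xC3\xA9" passes. A standard 4-byte UTF-8 sequence is rejected as well, since the VM doesn't recognize that format.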