Is there any solution to support unicode on Android?


schumacher

Feb 22, 2010, 12:14:02 AM
to android-ndk
My project is built on Unicode for Windows, and now I want to port it
to Android in native C/C++. But on Android, wchar_t is 4 bytes, and
many important functions are missing, such as string conversion between
different code pages.

If I really want to make it work on Android, what should I do? Are
there any suggestions?

craig.mautner

Feb 22, 2010, 10:12:13 AM
to android-ndk
I've had some success using the
-fshort-wchar
compiler flag.

Dianne Hackborn

Feb 22, 2010, 12:59:26 PM
to andro...@googlegroups.com
Use UTF-8.  The One True Encoding (except for $#^%!! Java).
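
For code coming from Windows, where wchar_t text is UTF-16, one way to follow this advice is to stop depending on the size of wchar_t and convert to UTF-8 at the boundary. The following is only a minimal sketch, assuming the text is held in a std::u16string (C++11); the names utf16_to_utf8 and append_utf8 are hypothetical helpers, not NDK functions, and unpaired surrogates are replaced with U+FFFD as one possible policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append one Unicode code point to `out` as standard
// UTF-8 (1 to 4 bytes).
static void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert UTF-16 to standard UTF-8.
// Surrogate pairs are combined into one code point; an unpaired surrogate
// is replaced with U+FFFD (an assumed policy, pick your own).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: needs a low surrogate right after it.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With this, utf16_to_utf8(u"\u20AC") yields the three bytes E2 82 AC, and the same std::string works regardless of how wide wchar_t is on the target.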


--
You received this message because you are subscribed to the Google Groups "android-ndk" group.
To post to this group, send email to andro...@googlegroups.com.
To unsubscribe from this group, send email to android-ndk...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/android-ndk?hl=en.




--
Dianne Hackborn
Android framework engineer
hac...@android.com

Note: please don't send private questions to me, as I don't have time to provide private support, and so won't reply to such e-mails.  All such questions should be posted on public forums, where I and others can see and answer them.

nagamatu

Feb 26, 2010, 6:31:18 PM
to android-ndk
Dalvik VM supports modified UTF-8. Not standard UTF-8.
There is no one true encoding.

--
nagamatu

On Feb 23, 2:59 AM, Dianne Hackborn <hack...@android.com> wrote:
> Use UTF-8.  The One True Encoding (except for $#^%!! Java).

Dan Bornstein

Feb 28, 2010, 8:33:33 PM
to andro...@googlegroups.com
On Fri, Feb 26, 2010 at 3:31 PM, nagamatu <naga...@gmail.com> wrote:
> Dalvik VM supports modified UTF-8. Not standard UTF-8.

I think you may have misunderstood something.

The modified UTF-8 format[*] is certainly used in Dalvik in a couple
of areas, in particular strings in .dex files are represented with it,
and there are some standard APIs that are specified to use it (such as
Data{Input,Output}Stream). And yes, when you deal with "UTF-8" through
the JNI calls defined specifically to use that encoding, those calls
actually expect or produce modified UTF-8.
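
In practice the two encodings only differ for U+0000 and for code points above U+FFFF, so one cheap guard before handing bytes to a modified-UTF-8 call such as NewStringUTF is to check for those two cases. This is a rough sketch, assuming the input is already well-formed standard UTF-8; the function name is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns true when `utf8` (assumed to be well-formed
// standard UTF-8) has exactly the same bytes in modified UTF-8, so it can be
// passed to a modified-UTF-8 API unchanged.
bool same_in_modified_utf8(const std::string& utf8) {
    for (unsigned char byte : utf8) {
        if (byte == 0x00) {
            return false;  // modified UTF-8 writes U+0000 as C0 80
        }
        if (byte >= 0xF0) {
            return false;  // 4-byte lead byte: code point above U+FFFF
        }
    }
    return true;
}
```

When the check fails, the bytes need a real conversion (or the String(byte[], "UTF-8") route) rather than a direct call.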

However, the core library fully supports standard UTF-8 (as well as a
host of other encodings). You can pass in "UTF-8" as the encoding name
(aka a "charset name") to use in any of the APIs that take one, such
as, but not limited to, InputStreamReader and at least a couple of the
String constructors. You may use these APIs directly from native code
by making method calls via JNI.

-dan

[*] Technically, it's actually closer to CESU-8 than UTF-8.
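
To make the CESU-8 comparison concrete: modified UTF-8 encodes each UTF-16 code unit on its own, so a supplementary character becomes two 3-byte sequences (one per surrogate) instead of one 4-byte sequence, and U+0000 becomes the two bytes C0 80. A minimal sketch, assuming the input is a std::u16string; the function name is hypothetical and this is an illustration, not the Dalvik implementation.

```cpp
#include <string>

// Hypothetical helper: encode UTF-16 code units as modified UTF-8.
// Each code unit is encoded independently, so surrogate halves are
// written as separate 3-byte sequences.
std::string utf16_to_modified_utf8(const std::u16string& in) {
    std::string out;
    for (char16_t unit : in) {
        if (unit != 0 && unit < 0x80) {
            out += static_cast<char>(unit);
        } else if (unit < 0x800) {
            // Also covers U+0000, which becomes C0 80.
            out += static_cast<char>(0xC0 | (unit >> 6));
            out += static_cast<char>(0x80 | (unit & 0x3F));
        } else {
            // Covers the rest of the BMP, including each surrogate half.
            out += static_cast<char>(0xE0 | (unit >> 12));
            out += static_cast<char>(0x80 | ((unit >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (unit & 0x3F));
        }
    }
    return out;
}
```

For U+1F600 this produces ED A0 BD ED B8 80 (six bytes), where standard UTF-8 would produce F0 9F 98 80.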

Dawei Xu

Feb 28, 2010, 8:57:36 PM
to andro...@googlegroups.com
Hi, Dan,

Thank you for your explanation on UTF-8.

Could you please tell me which header file to include and what libxxx.so to link
so as to use these character-encoding APIs directly from native code
by making method calls via JNI?

Best regards,
Dawei Xu


Dan Bornstein

Mar 1, 2010, 6:31:37 PM
to andro...@googlegroups.com
On Sun, Feb 28, 2010 at 5:57 PM, Dawei Xu <davidj...@gmail.com> wrote:
> Could you please tell me which header file to include and what libxxx.so to link
> so as to use these character-encoding APIs directly from native code
> by making method calls via JNI?

Just include the standard JNI header file, and then use the standard
core classes (such as String, InputStreamReader, etc.) via the normal
lookup mechanisms.

I'm not an NDK expert, but I don't think you have to do any special linking.

-dan

nagamatu

Mar 1, 2010, 6:44:19 PM
to android-ndk
How can we create a java.lang.String object from standard UTF-8
characters in the NDK?
NewStringUTF does not work for this.

--
nagamatu

On Mar 2, 8:31 AM, Dan Bornstein <danf...@android.com> wrote:

Dan Bornstein

Mar 1, 2010, 7:20:51 PM
to andro...@googlegroups.com
On Mon, Mar 1, 2010 at 3:44 PM, nagamatu <naga...@gmail.com> wrote:
> How can we create a java.lang.String object from standard UTF-8
> characters in the NDK?
> NewStringUTF does not work for this.

As I said, you can do so by using the standard core library APIs via
JNI. The equivalent Java code would be something like this:

byte[] utfEncodedData = [... get your bytes of UTF-8 from somewhere ...];
String decoded = new String(utfEncodedData, "UTF-8");

It is admittedly a minor pain to get references to the classes and
methods with JNI, but it's not rocket science. The JNI code would look
something like this (omitting error-checking for ease of exposition):

// top-level declarations
jclass String_class;
jmethodID String_constructor;
jstring utf8_name;

// in your setup function (assuming JNIEnv *env, C++ calling syntax)
jclass localStringClass = env->FindClass("java/lang/String");
// FindClass returns a local reference; keep a global one for later calls
String_class = (jclass) env->NewGlobalRef(localStringClass);
// method IDs are not object references, so they need no global reference
String_constructor = env->GetMethodID(String_class, "<init>",
        "([BLjava/lang/String;)V");
jstring localName = env->NewStringUTF("UTF-8");
utf8_name = (jstring) env->NewGlobalRef(localName);

// in your main code when you want to make a String (assuming JNIEnv *env)
jbyteArray utfEncodedData = [... get your bytes of UTF-8 from somewhere ...];
jstring decoded = (jstring) env->NewObject(String_class, String_constructor,
        utfEncodedData, utf8_name);

You may find it easier to avoid most of this and just do the
conversion in Java code. That is, construct your byte[]s in native
code if you need to, but simply return those to a driver written in
Java, and then you can just use a single line of code to do the
conversion. Something like:

// where callNativeCode() is defined like:
//     static native byte[] callNativeCode(...);
byte[] utfEncodedData = callNativeCode(...);
String decoded = new String(utfEncodedData, "UTF-8");

It's hard to say what's the most preferable way to do this without a
lot more context about what you're actually trying to accomplish.

I hope this helps.

-dan

Dawei Xu

Mar 4, 2010, 4:07:27 AM
to andro...@googlegroups.com
Hi, Dan,

Thank you very much for explaining how to use UTF-8 via JNI.

Best Regards,
Dawei Xu

