The way Dalvik VM native code uses Java objects


Wang, Ligang

Dec 25, 2009, 4:47:04 AM
to android-...@googlegroups.com

Hi all,

 

I have two questions about the design of object reference in Dalvik VM.

 

1. I noticed that Dalvik VM internal native code accesses Java objects through direct references rather than in a JNI-like way. Perhaps this was designed for simplicity and better performance. But it means Dalvik cannot use a GC that may move objects: function-local variables hold direct references to objects, and because these variables are not enumerated as roots, they may hold stale references after a GC. I saw many such occurrences in the vm/oo and vm/reflect directories and elsewhere.

2. Many of these occurrences are references to ClassObject. So, related to this, another question is why ClassObject combines the Java object part and the native part into one structure stored in the managed heap. If it were split into a Java object part stored in the managed heap and a native description part stored in the native heap, many direct references to ClassObject would no longer be a problem, because the native part is what the code actually wants.

 

So I am wondering what the original intentions behind these two designs were. Thanks in advance for any help.

 

-Ligang

Wang, Ligang

Dec 24, 2009, 9:02:09 PM
to android-...@googlegroups.com

Carl Shapiro

Jan 4, 2010, 8:56:57 PM
to android-platform
Hi Ligang,

You are correct that objects are passed as direct references. If the
USE_INDIRECT_REF macro is defined at compile time the JNI will pass
objects to native code using an indirect reference. However, there
are Android APIs that assume that references to the same object will
have the same value. Until such assumptions are eliminated objects
must be passed directly.
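
The identity assumption described above can be illustrated with a minimal sketch in plain C. This is not Dalvik source; `RefTable`, `IndirectRef`, `addRef`, `decodeRef`, and `isSameObject` are hypothetical names chosen for illustration. Two handles to the same object differ as raw values, so code that compares references with `==` stops working once references are indirect, while a decode-then-compare helper (the role `IsSameObject` plays in the JNI) still gives the right answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Object { int dummy; } Object;

#define REF_TABLE_SIZE 16

/* Hypothetical indirect reference table: native code only ever sees
 * slot handles, never raw object addresses. */
typedef struct RefTable {
    Object *slots[REF_TABLE_SIZE];
    size_t  count;
} RefTable;

/* Opaque handle: slot index + 1, so that 0 can mean "null". */
typedef uintptr_t IndirectRef;

/* Each call hands out a fresh handle, even for an object already in the table. */
static IndirectRef addRef(RefTable *t, Object *obj) {
    assert(t->count < REF_TABLE_SIZE);
    t->slots[t->count] = obj;
    return (IndirectRef)(++t->count);
}

/* Translate a handle back to the object's current address. */
static Object *decodeRef(const RefTable *t, IndirectRef ref) {
    if (ref == 0 || ref > t->count) {
        return NULL;
    }
    return t->slots[ref - 1];
}

/* Identity must be checked on the decoded objects, not on the handles. */
static int isSameObject(const RefTable *t, IndirectRef a, IndirectRef b) {
    return decodeRef(t, a) == decodeRef(t, b);
}
```

With this scheme a moving collector would only need to rewrite `slots[]`, but any caller that relied on `a == b` for two references to one object would have to be changed to call something like `isSameObject`.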

It is important to keep in mind that the JNI gives the implementor
substantial latitude in this particular corner of the design space.
Passing objects indirectly is not truly the "JNI-like way" as the JNI
explicitly permits passing object references directly or indirectly.
No one strategy is intrinsically better than the other and the VM is
allowed to make this decision on a case-by-case basis.

Regarding the GC implications of direct references, the use of direct
references does not strictly prohibit the use of a copying collector.
While it could limit the number of objects which may otherwise be
relocated during a collection, the vast majority of objects are never
passed to native code and therefore would not be subject to "pinning"
during any part of their lifetime.

On a related note, certain objects in the DVM, such as certain strings
and class objects, have indefinite extent. (This is subject to
change.) In other words, once they are allocated they never
become garbage. A well-engineered relocating collector would allocate
such objects separately from volatile objects, thereby ensuring that
the efforts of the collector are spent on objects with a non-zero
probability of becoming garbage. Also, because these objects are
known to be permanent, there is no need to relocate the object because
its space is never threatened or, similarly, never needs compaction,
and no need to pass the object indirectly as exposing its address is
not a liability.
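
As a rough sketch of the separation described above (plain C, not Dalvik source; the `Heap` layout and the `allocPermanent`, `allocVolatile`, and `compactVolatile` names are assumptions made for illustration), permanent objects live in their own arena that the collector never compacts, so their addresses stay valid, while only the volatile arena is compacted.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SLOTS 8

typedef struct Obj {
    int live;     /* 0 once the object has become garbage */
    int payload;
} Obj;

typedef struct Heap {
    Obj    permanent[ARENA_SLOTS];     /* never collected, never moved */
    size_t permCount;
    Obj    volatileObjs[ARENA_SLOTS];  /* compacted by the collector */
    size_t volCount;
} Heap;

static Obj *allocPermanent(Heap *h, int payload) {
    assert(h->permCount < ARENA_SLOTS);
    Obj *o = &h->permanent[h->permCount++];
    o->live = 1;
    o->payload = payload;
    return o;
}

static Obj *allocVolatile(Heap *h, int payload) {
    assert(h->volCount < ARENA_SLOTS);
    Obj *o = &h->volatileObjs[h->volCount++];
    o->live = 1;
    o->payload = payload;
    return o;
}

/* Slide live volatile objects toward the start of their arena and return
 * the number of survivors. Raw pointers into the volatile arena may be
 * stale afterwards; the permanent arena is not touched at all. */
static size_t compactVolatile(Heap *h) {
    size_t dst = 0;
    for (size_t src = 0; src < h->volCount; src++) {
        if (h->volatileObjs[src].live) {
            h->volatileObjs[dst++] = h->volatileObjs[src];
        }
    }
    h->volCount = dst;
    return dst;
}
```

In this model a direct pointer returned by `allocPermanent` can be handed to native code safely, because nothing ever relocates it.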

Regards,

Carl

Wang, Ligang

Jan 8, 2010, 5:19:43 AM
to android-...@googlegroups.com

Hi Carl,

 

Many thanks for your helpful explanation. I have added some comments below. Thanks again for your time. :)

 

Regards,

Ligang

 

-----Original Message-----
From: android-...@googlegroups.com [mailto:android-...@googlegroups.com] On Behalf Of Carl Shapiro
Sent: January 5, 2010 9:57
To: android-platform
Subject: Re: The way Dalvik VM native code uses Java objects

 

Hi Ligang,

 

You are correct that objects are passed as direct references.  If the
USE_INDIRECT_REF macro is defined at compile time the JNI will pass
objects to native code using an indirect reference.  However, there
are Android APIs that assume that references to the same object will
have the same value.  Until such assumptions are eliminated objects
must be passed directly.

 

[Ligang] Yes, Eclair introduced indirect references for global/local JNI references. But as I understand it, they are used only for JNI references, not for references in the VM native code. So they do not solve the problem I mentioned unless the VM native code adopts a similar mechanism.

 

It is important to keep in mind that the JNI gives the implementor
substantial latitude in this particular corner of the design space.
Passing objects indirectly is not truly the "JNI-like way" as the JNI
explicitly permits passing object references directly or indirectly.
No one strategy is intrinsically better than the other and the VM is
allowed to make this decision on a case-by-case basis.

 

[Ligang] Yes, you are right. “JNI-like way” is not an accurate expression. What I meant was that if direct references in the VM native code were replaced by a structure like the JNI indirect reference, the GC could enumerate them and update them when objects move.

 

Regarding the GC implications of direct references, the use of direct
references does not strictly prohibit the use of a copying collector.
While it could limit the number of objects which may otherwise be
relocated during a collection, the vast majority of objects are never
passed to native code and therefore would not be subject to "pinning"
during any part of their lifetime.

On a related note, certain objects in the DVM, such as certain strings
and class objects, have indefinite extent.  (This is subject to
change.)  In other words, once they are allocated they never
become garbage.  A well-engineered relocating collector would allocate
such objects separately from volatile objects, thereby ensuring that
the efforts of the collector are spent on objects with a non-zero
probability of becoming garbage.  Also, because these objects are
known to be permanent, there is no need to relocate the object because
its space is never threatened or, similarly, never needs compaction,
and no need to pass the object indirectly as exposing its address is
not a liability.

 

[Ligang] Yeah. If these objects are known to be permanent, they can be treated specially, for example by storing them in the zygote heap source. But these two kinds of objects may not cover all objects used in the VM native code. For example, an exception object is created and referenced by the VM native code while an application runs. This object cannot be stored in the zygote heap source, so it has to be pinned in a relocating collector. But pinning objects complicates a collector, even though pinned objects are few. So rewriting reference accesses in the VM native code to use indirect references is another way to deal with this problem. The advantages might be a simpler GC algorithm and better space efficiency. The disadvantage might be a slight performance drop when accessing references in the VM native code.

 

[Ligang] Another question is, you must have thought of decoupling the native and Java part of ClassObject when you designed this. Well, why did you choose not to? :)

 

Regards,

 

Carl

 


fadden

Jan 12, 2010, 3:50:46 PM
to android-platform
On Jan 8, 2:19 am, "Wang, Ligang" <ligang.w...@intel.com> wrote:

> [Ligang] Yes, Eclair introduced indirect references for global/local JNI references. But as I understand it, they are used only for JNI references, not for references in the VM native code. So they do not solve the problem I mentioned unless the VM native code adopts a similar mechanism.

"Internal native" code is very different from JNI code. Internal
natives cannot be interrupted by the garbage collector (see the note
in dalvik/vm/native/README.txt), and don't hang on to objects across
calls, so they should not interfere with object relocation. Because
they're more "raw" than normal JNI calls, we try to use them as little
as possible.


> [Ligang] Yes, you are right. “JNI-like way” is not an accurate expression. What I meant was that if direct references in the VM native code were replaced by a structure like the JNI indirect reference, the GC could enumerate them and update them when objects move.

Not necessary, and indirection tables are expensive on embedded CPUs
with small caches.


> [Ligang] Yeah. If these objects are known to be permanent, they can be treated specially, for example by storing them in the zygote heap source. But these two kinds of objects may not cover all objects used in the VM native code. For example, an exception object is created and referenced by the VM native code while an application runs. This object cannot be stored in the zygote heap source, so it has to be pinned in a relocating collector. But pinning objects complicates a collector, even though pinned objects are few. So rewriting reference accesses in the VM native code to use indirect references is another way to deal with this problem. The advantages might be a simpler GC algorithm and better space efficiency. The disadvantage might be a slight performance drop when accessing references in the VM native code.

There are other situations in which it's very useful to pin memory,
notably with regard to the JNI functions that access large primitive
arrays (e.g. bitmaps). The VM can choose to pin or copy the primitive
array, but with the poor memory bandwidth on embedded devices it's
much better to pin.

Given that, there's no need to jump through hoops that slow execution
speed.
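
The pin-versus-copy trade-off can be sketched in plain C as follows. This is a toy model, not Dalvik or JNI code: `IntArray`, `acquirePinned`, `releasePinned`, `acquireCopy`, and `releaseCopy` are hypothetical names. In the real JNI the same choice is reported to the caller through the `isCopy` out-parameter of the `Get<PrimitiveType>ArrayElements` functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct IntArray {
    size_t length;
    int    pinCount;  /* > 0 means the collector must not move this array */
    int   *data;      /* the array's storage in the managed heap */
} IntArray;

/* Pin strategy: hand out the real storage. No memory traffic, but the
 * array cannot be relocated until it is released. */
static int *acquirePinned(IntArray *a) {
    a->pinCount++;
    return a->data;
}

static void releasePinned(IntArray *a) {
    assert(a->pinCount > 0);
    a->pinCount--;
}

/* Copy strategy: the array stays movable, but every acquire/release pair
 * copies the whole array twice. Returns NULL if allocation fails. */
static int *acquireCopy(const IntArray *a) {
    int *copy = malloc(a->length * sizeof *copy);
    if (copy != NULL) {
        memcpy(copy, a->data, a->length * sizeof *copy);
    }
    return copy;
}

/* Write the caller's changes back and free the temporary buffer. */
static void releaseCopy(IntArray *a, int *copy) {
    assert(copy != NULL);
    memcpy(a->data, copy, a->length * sizeof *copy);
    free(copy);
}
```

For a large bitmap the two `memcpy` calls in the copy path are exactly the memory-bandwidth cost the post is warning about.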


> [Ligang] Another question is, you must have thought of decoupling the native and Java part of ClassObject when you designed this. Well, why did you choose not to? :)

One less allocation per class, better locality of reference, no
indirection to go from the Object pointer to class attributes.

We're re-examining this decision, chiefly because special cases are
annoying. As you mentioned, most things interested in a class object
can point directly at the attribute rather than the object, so the
only thing we'd pay is an extra allocation per class.
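
The two layouts under discussion can be sketched in plain C. The struct and field names below are simplified stand-ins chosen for illustration (assumptions, not the actual Dalvik definitions): `CombinedClass` keeps the object header and the class attributes in one managed allocation, while `SplitClass` plus `ClassInfo` decouples them.

```c
#include <assert.h>
#include <stddef.h>

/* Every managed object starts with a pointer to its class. */
typedef struct Object {
    const void *clazz;
} Object;

/* Combined layout: one allocation in the managed heap holds both the
 * Java-visible object header and the VM's class attributes. */
typedef struct CombinedClass {
    Object      header;      /* must come first so it can be used as an Object */
    const char *descriptor;
    size_t      objectSize;
} CombinedClass;

/* Decoupled layout: attributes live in the native heap and never move;
 * only a small Java object remains in the managed heap. */
typedef struct ClassInfo {
    const char *descriptor;
    size_t      objectSize;
} ClassInfo;

typedef struct SplitClass {
    Object     header;
    ClassInfo *info;         /* the extra allocation and the extra hop */
} SplitClass;

/* Combined: the attribute is one load away from the class pointer. */
static size_t sizeViaCombined(const Object *obj) {
    const CombinedClass *c = (const CombinedClass *)obj->clazz;
    return c->objectSize;
}

/* Decoupled, going through the Java object: one more indirection. */
static size_t sizeViaSplitObject(const Object *obj) {
    const SplitClass *c = (const SplitClass *)obj->clazz;
    return c->info->objectSize;
}

/* Decoupled, when VM code already holds the native part: no managed
 * pointer is involved, so relocation of the Java part is harmless. */
static size_t sizeViaInfo(const ClassInfo *info) {
    return info->objectSize;
}
```

The last function reflects the point made in the post: code that only needs class attributes can hold a `ClassInfo *` directly, leaving the extra allocation per class as the main cost.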

Wang, Ligang

Jan 13, 2010, 10:19:02 PM
to android-...@googlegroups.com

Fadden, thanks for your informative answer. But I still have doubts about whether internal native code really cannot be interrupted by GC.

 

As far as I have observed, the only thing that prevents a thread executing internal native code from being interrupted by GC is the status check in waitForThreadSuspend(), called by dvmSuspendAllThreads(). If a thread’s status is THREAD_RUNNING and its “isSuspended” flag is not set to true, any other thread will wait for it. Internal native code executes with the status THREAD_RUNNING and isSuspended false. This works when one thread is executing internal native code and another is requesting GC. But what if the first thread itself triggers GC while allocating an object? For example, Dalvik_dalvik_system_DexFile_getClassNameList() creates several StringObjects and stores the references in a String array. We cannot prevent one of these allocations from triggering GC. Am I right?
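
The check described above can be modeled with a small C sketch. It is a simplification based only on the description in this post, not the actual Dalvik code: `THREAD_OTHER`, `mustWaitFor`, and `countBlockingThreads` are hypothetical, and skipping the requesting thread is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified status set: THREAD_RUNNING comes from the post,
 * THREAD_OTHER is a placeholder for every other status. */
typedef enum ThreadStatus { THREAD_RUNNING, THREAD_OTHER } ThreadStatus;

typedef struct Thread {
    ThreadStatus status;
    bool         isSuspended;
} Thread;

/* A requester has to wait for a thread that is still running
 * and has not yet marked itself as suspended. */
static bool mustWaitFor(const Thread *t) {
    return t->status == THREAD_RUNNING && !t->isSuspended;
}

/* Count the threads a GC requester would still have to wait for.
 * The requester (self) is skipped: it cannot wait on itself. */
static int countBlockingThreads(const Thread *threads, int n, const Thread *self) {
    int blocking = 0;
    for (int i = 0; i < n; i++) {
        if (&threads[i] == self) {
            continue;
        }
        if (mustWaitFor(&threads[i])) {
            blocking++;
        }
    }
    return blocking;
}
```

The model shows the gap being asked about: the check protects a running thread from a GC requested by another thread, but when the running thread is itself the requester it is skipped, so nothing in this check stops the collection.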

 

Thanks,

Ligang

 


fadden

Jan 14, 2010, 4:18:29 PM
to android-platform
On Jan 13, 7:19 pm, "Wang, Ligang" <ligang.w...@intel.com> wrote:
> As far as I have observed, the only thing that prevents a thread executing internal native code from being interrupted by GC is the status check in waitForThreadSuspend(), called by dvmSuspendAllThreads(). If a thread’s status is THREAD_RUNNING and its “isSuspended” flag is not set to true, any other thread will wait for it. Internal native code executes with the status THREAD_RUNNING and isSuspended false. This works when one thread is executing internal native code and another is requesting GC. But what if the first thread itself triggers GC while allocating an object? For example, Dalvik_dalvik_system_DexFile_getClassNameList() creates several StringObjects and stores the references in a String array. We cannot prevent one of these allocations from triggering GC. Am I right?

It is possible for an "internal native" thread to cause a GC. I
wouldn't call that an interruption though, since you know exactly the
points at which it can occur. (More like a "side effect" than an
"interrupt".)

Looking at the specific function you mention, you can see that the
string array and its contents are allocated with ALLOC_DEFAULT, which
causes the reference to be held in the "tracked allocation" table.
This table is part of the root set, and the GC knows that anything
referenced by it cannot be moved. After the String object is placed
inside the String array, the tracked allocation table entry is
released, since (a) the GC can see it inside the array, and (b) the
internal-native function no longer holds a pointer.

At the end of the function, we release the tracked allocation on the
string array, and return the array reference. The result goes into
the secret "result register", which is moved into a "real" Dalvik
register with a move-result instruction (which must immediately follow
the method invocation instruction).

If it were possible to GC between the point where the method returned
and the move-result instruction completed, we could have a problem.
It can't, so we're fine. See dalvik/vm/mterp/NOTES.txt for
information about this.

At any rate, the tracked allocation table is always empty by the time
we return to the interpreter. It typically has few entries and
they're always short-lived. (To be more accurate, the table is always
at the same level it was when the interpreter started. If you
allocate some stuff, call native code through JNI, and the native code
calls back into the interpreter, then those tracked allocations are
going to sit in the list until you come all the way back out.)
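
The tracked-allocation pattern described in this post can be sketched in plain C. This is an illustrative model, not the Dalvik implementation: `TrackedAllocTable`, `trackAlloc`, `releaseTracked`, `isTracked`, and `fillArray` are hypothetical names, and the objects come from a caller-supplied pool instead of a real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 32

typedef struct Object { int marker; } Object;

/* Model of the table: the GC treats every entry as a root and
 * does not move the object it points to. */
typedef struct TrackedAllocTable {
    Object *entries[MAX_TRACKED];
    size_t  count;
} TrackedAllocTable;

static void trackAlloc(TrackedAllocTable *t, Object *obj) {
    assert(t->count < MAX_TRACKED);
    t->entries[t->count++] = obj;
}

/* Remove the most recent entry for obj; returns false if it was not tracked. */
static bool releaseTracked(TrackedAllocTable *t, const Object *obj) {
    for (size_t i = t->count; i > 0; i--) {
        if (t->entries[i - 1] == obj) {
            t->entries[i - 1] = t->entries[t->count - 1];
            t->count--;
            return true;
        }
    }
    return false;
}

static bool isTracked(const TrackedAllocTable *t, const Object *obj) {
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i] == obj) {
            return true;
        }
    }
    return false;
}

/* Fill an already tracked array with new elements. Each element is tracked
 * only until it is stored in the array, where the GC can find it. */
static void fillArray(TrackedAllocTable *t, Object **array, Object *pool, size_t n) {
    for (size_t i = 0; i < n; i++) {
        Object *str = &pool[i];   /* stands in for an allocation that may trigger GC */
        trackAlloc(t, str);       /* protected while only a local variable holds it */
        array[i] = str;           /* now reachable through the array */
        releaseTracked(t, str);   /* the local pointer is no longer needed */
    }
}
```

A caller would track the array itself before calling `fillArray` and release it just before returning, so the table ends at the same level at which it started.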

Wang, Ligang

Jan 15, 2010, 8:35:30 AM
to android-...@googlegroups.com

Fadden, your answer gave me a deeper understanding of the Dalvik design. Thanks.

 

As for the “tracked allocation” table, I knew it is part of the GC root set, but I didn’t know that objects tracked by it are treated as non-movable. I didn’t see any hints of that in the code. :) In this case, these objects have to stay pinned from the time they are allocated until we release them from that table.

 

Regarding pinned objects, in my personal opinion, if we cannot completely eliminate pinning in a relocating collector (because of the large space and time overheads of copying large arrays in JNI functions), we should keep pinned objects as few as possible to make GC and space usage more efficient. To summarize, there are currently two kinds of pinned objects: 1. those tracked by the “tracked allocation” table; 2. those pinned by pinPrimitiveArray(), a new feature introduced in Eclair. Kind 1 can be eliminated by using indirect references in the internal native code, so that the collector can relocate those objects. This would improve space efficiency but cost a little performance because of the extra dereference. That cost should not be a big problem, though, since as you said those objects are few and short-lived. Kind 2 can fall back to copying if we want.

 

So, if we want to implement a relocating collector for Dalvik, object pinning may well be a necessary feature. The following two changes might also be worth trying: 1. using indirect references instead of direct references for local variables in the internal native code when there is a GC safe point within the lifetime of those variables; 2. decoupling the native and Java parts of ClassObject. What do you think?

 

Btw, have you ever measured how many arrays are pinned by pinPrimitiveArray() in typical apps, and how large they are?

 

Thanks,

Ligang

 
