That's correct. At some point, there will be an extension to the
bytecode format to allow for wider field and method references. When
we were developing the dex format originally, it looked like we were
far enough away from that being necessary that we decided not to
bother in the short term. It looks like the time is finally drawing
close: I've been told that the Scala core library has to be split in
two precisely because of this issue.

With extended versions of these opcodes, the remaining 16-bit limits
will be on (a) class references and (b) method prototypes. That is,
you will still be limited to no more than 65536 classes in a dex file,
and though you will be able to have 2^32 methods, you will only be
able to have 65536 different prototypes (argument list plus return
type). You might be concerned about the latter, but the last time I
checked, it seemed that even method-heavy code shared enough prototypes
that this won't become an issue for a while yet.
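
As a rough illustration of the arithmetic above, here is a minimal sketch of checking counts against a 16-bit index limit and of counting distinct prototypes. The class and method names are made up for this example and are not part of any dex tool; prototypes are represented simply as descriptor strings.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class DexLimitSketch {
    // A 16-bit index can address at most 2^16 = 65536 distinct entries.
    static final int MAX_16_BIT_ENTRIES = 1 << 16;

    // Counts distinct prototypes, with each method's prototype written as a
    // descriptor string such as "(ILjava/lang/String;)V". Methods that share
    // an argument list and return type share a single prototype entry.
    static int distinctPrototypes(List<String> methodDescriptors) {
        Set<String> protos = new HashSet<>(methodDescriptors);
        return protos.size();
    }

    // True if a table with this many entries can be indexed with 16 bits.
    static boolean fitsIn16Bits(int entryCount) {
        return entryCount <= MAX_16_BIT_ENTRIES;
    }
}
```

The point of the sketch is that the method count and the prototype count are separate numbers: many methods can map onto few prototypes.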
> On the other hand it seems that strings with large index (< 2^32) can
> be accessed.
Yep. The writing was already on the wall about that in the 1.0
timeframe. Hence, const-string/jumbo.
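
For illustration, a sketch of how a tool emitting bytecode might choose between the two instructions based on the string index. The opcode values are the ones I recall from the Dalvik bytecode documentation (0x1a for const-string, 0x1b for const-string/jumbo), so treat them as an assumption to verify; the class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class ConstStringSketch {
    // Assumed opcode values; check them against the bytecode reference.
    static final int OP_CONST_STRING = 0x1a;        // 16-bit string index
    static final int OP_CONST_STRING_JUMBO = 0x1b;  // 32-bit string index

    // Picks the narrowest instruction that can hold the given string index.
    static int opcodeForStringIndex(long stringIndex) {
        if (stringIndex < 0 || stringIndex > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("index out of 32-bit range");
        }
        return stringIndex <= 0xFFFF ? OP_CONST_STRING : OP_CONST_STRING_JUMBO;
    }
}
```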
> 2. All references to data within the bytecode (jumps, packed-switch
> and family) are relative offsets to data within the same method's
> bytecode, correct?
Correct.
> Really what I'm wondering is if a method's
> bytecode is self contained and if changing it could have an effect on
> anything outside of the method.
The only thing that would matter is the overall size, since that would
end up affecting the file offsets of everything in the file after the
code in question.
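
To make the relative-offset point concrete, here is a small sketch of resolving a branch target within one method. It assumes addresses and offsets are counted in 16-bit code units from the start of the method's instruction array; the names are invented for the example.

```java
// Hypothetical helper, for illustration only.
final class RelativeBranchSketch {
    // The target is the address of the branching instruction plus its
    // signed relative offset, both in 16-bit code units.
    static int branchTarget(int instructionAddress, int relativeOffset) {
        return instructionAddress + relativeOffset;
    }

    // A well-formed branch never leaves the method's own instruction array.
    static boolean targetIsInsideMethod(int instructionAddress,
                                        int relativeOffset,
                                        int methodSizeInCodeUnits) {
        int target = branchTarget(instructionAddress, relativeOffset);
        return target >= 0 && target < methodSizeInCodeUnits;
    }
}
```

Because nothing here depends on where the method sits in the file, moving or rewriting one method leaves the branches in every other method untouched; only file offsets after it shift.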
> 3. In the dex file format static fields can be initialized with a
> list of encoded-values. None of the encoded values seem to allow for
> the static field to be initialized to an instance of an object,
> however this is possible in the Java language. How is this kind
> of initialization done?
It's done by a <clinit> method (the class's static initializer).
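
As a source-level illustration (the class and field names are made up), the first field below can be stored as an encoded value, while the other two need code to run and so are set up in <clinit>:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
final class ClinitSketch {
    // A compile-time constant: representable as an encoded value.
    static final int ANSWER = 42;

    // Needs an object to be constructed, so the compiler emits the
    // construction into the class's static initializer, <clinit>.
    static final StringBuilder BUFFER = new StringBuilder("built in <clinit>");

    // An explicit static block is compiled into the same <clinit> method.
    static final List<String> NAMES = new ArrayList<>();
    static {
        NAMES.add("also from <clinit>");
    }
}
```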
Cheers,
-dan

The nominal plan is to introduce the concept of "wide opcodes," where,
e.g., an 0xff in the opcode position of the first code unit of an
instruction would mean that the other 8 bits are taken to be part of
the opcode rather than used as arguments. Basically, if we have to
burn 16 more bits for a member reference, we can afford to burn 8
more bits for an opcode too, and we only impose this penalty on code
that's already going to be huge enough that it won't make a
meaningful difference.
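
A sketch of how a decoder might treat that first code unit under this plan. This is only one reading of the idea, not a finished design: it assumes the opcode sits in the low 8 bits of the code unit, and the names are invented for the example.

```java
// Hypothetical decoder fragment, for illustration only.
final class WideOpcodeSketch {
    // Assumed marker value in the opcode position.
    static final int WIDE_MARKER = 0xff;

    // True if the low byte of the first code unit is the wide marker.
    static boolean isWide(int firstCodeUnit) {
        return (firstCodeUnit & 0xff) == WIDE_MARKER;
    }

    // For an ordinary instruction, returns the 8-bit opcode from the low
    // byte. For a wide one, returns the whole 16-bit unit, since the high
    // byte is then part of the opcode instead of holding arguments.
    static int opcode(int firstCodeUnit) {
        int unit = firstCodeUnit & 0xffff;
        return isWide(unit) ? unit : (unit & 0xff);
    }
}
```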
> I wonder if you started to get libraries that
> actually exceeded these criteria if they would be too large to be put
> on android phones at all.
Moore's Law does seem to apply to all aspects of portable devices
except battery capacity.
> I still have one small question about how native code is handled. In
> the encoded_method format description it says that code_off will be 0
> if the method is abstract or native. If the method is native how does
> the vm know what native code to call? The native code in question
> would be stored in a dynamic linked library, right? Is this library
> and the method name somehow in the dex file? I didn't see anything on
> this.
Native libraries get hooked in by calling System.loadLibrary(). Look
for the jni-tips.html document in the Dalvik docs directory for more
details.
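
A minimal sketch of the Java side, with a made-up class, method, and library name. The native method has no bytecode body; its implementation is looked up in whatever library gets loaded.

```java
// Hypothetical class, for illustration only.
final class NativeSketch {
    // Declared here, implemented in a native shared library. There is no
    // bytecode for this method in the dex file.
    static native int nativeAdd(int a, int b);

    // Tries to load a native library by name. System.loadLibrary throws
    // UnsatisfiedLinkError if no matching library can be found.
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }
}
```

In real code the load usually happens in a static initializer of the class that declares the native methods, so the library is in place before any of them is called.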
-dan