Byte Code Injection for Dalvik through modified Class loader


Tez

Feb 11, 2011, 10:40:59 AM
to android-platform
I need to inject some "monitor" code into compiled android
applications at runtime. For this, I have thought of Byte Code
Injection. As such, it makes sense to modify the class loader for
android applications such that it accepts a "bytecode payload" to be
injected depending upon the current class being loaded.

1. Are there any other ways of accomplishing something similar?
2. Where is the corresponding code located? (the class loader etc)

Cheers,
Earlence

Tez

Feb 14, 2011, 2:14:17 AM
to android-platform
any ideas?

Tez

Feb 15, 2011, 5:06:13 AM
to android-platform
I have located "PathClassLoader"
From the comments, "Android uses this class for its system class
loader and for its application class loader(s)."

so this seems like a reasonable place where the bytecode can be
scanned and code can be injected?

Cheers,
Earlence

Tez

Feb 20, 2011, 8:01:21 AM
to android-platform
from the dalvik docs (Dalvik Debugger support),
I noticed that they mention that bytecode insertion is not supported
currently by dalvik.

Can anyone (hopefully someone from the Dalvik team) explain why
this is so, and what changes would be needed to support something like
this?

Cheers,
Earlence

fadden

Feb 22, 2011, 1:38:42 PM
to android-platform
On Feb 20, 5:01 am, Tez <earlencefe...@gmail.com> wrote:
> from the dalvik docs (Dalvik Debugger support),
> I noticed that they mention that bytecode insertion is not supported
> currently by dalvik.
>
> Can anyone one (hopefully someone from the Dalvik team) explain why
> this is so? and what changes would it encompass to have something like
> this?

There's no specific reason; it just hasn't been a priority.

Tez

Feb 22, 2011, 3:12:20 PM
to android-platform
My scenario is this:
An APK is installed on the system. When a class is being loaded for
execution, I need to inject certain bytecode check statements at
strategic points in the instruction stream.
(These checks validate certain runtime conditions based on inputs from
a static code analysis tool)
Is it technically possible to do something like this with dalvik?

Cheers,
Earlence

fadden

Feb 22, 2011, 4:08:31 PM
to android-platform
On Feb 22, 12:12 pm, Tez <earlencefe...@gmail.com> wrote:
> An APK is installed on the system. When a class is being loaded for
> execution, I need to inject certain bytecode check statements at
> strategic points in the instruction stream.
> (These checks validate certain runtime conditions based on inputs from
> a static code analysis tool)
> Is it technically possible to do something like this with dalvik?


The VM would need to provide hooks to get and set the bytecode for a
method. It's possible to do so but there is currently no support for
it.

Tez

Feb 22, 2011, 2:59:42 PM
to android-platform
okay, can you explain what would be the steps (logically) to have
something like this?
or where can I start something like this?


Tez

Feb 23, 2011, 2:25:29 AM
to android-platform
oops...double post. thanks for the answer.

by support, can you be a little more specific?

Tez

Feb 23, 2011, 2:44:33 AM
to android-platform
coz, If it works out, I can submit it as a patch to the dalvik code.

Tez

Feb 24, 2011, 6:09:00 AM
to android-platform
I have located "dvmDexFileOpenFromFd" inside dalvik/vm/DvmDex.c

It creates a "DvmDex" structure which contains a lot of information
(and hopefully the bytecode instructions for methods?) which is mapped
read only via "sysMapFileInShmemWritableReadOnly".

I think this is the location where I can push in hooks to read
methodIDs, locate their corresponding code and modify pDexFile (the
parsed optimized dex file) to insert additional bytecode.

Is this hypothesis correct? Or there are other things I should take
care of?

Cheers,
Earlence


Tez

Feb 24, 2011, 8:37:10 AM
to android-platform
some more updates!

inside Class.c

there is loadMethodFromDex(Clazz,pDexMethod,Method)

and commented code - dvmMakeCodeReadWrite which has code that sort of
"replaces" method bytecode.
I am thinking of using a special flag inside loadMethodFromDex, which
upon being set, will add additional bytecode using the logic from
dvmMakeCodeReadWrite

meth->insns = newCode->insns
This will contain the injected payload.

I'll post more results here. If anyone finds something wrong with
this, let me know!

Cheers,
Earlence

Tez

Feb 24, 2011, 11:54:51 AM
to android-platform
I have changed the code in the above method such that meth->insns is
pointed to a new array containing the SAME bytes.
(just to see if I can actually play around with it later)

if (strcmp(meth->name, "call_hello") == 0)
{
    DexCode* methodDexCode = (DexCode*) dvmGetMethodCode(meth);
    //make code read-write
    dvmLinearReadWrite(meth->clazz->classLoader, methodDexCode);

    //inject additional bytecode
    meth->registersSize = pDexCode->registersSize;
    meth->insSize = pDexCode->insSize;
    meth->outsSize = pDexCode->outsSize;

    /* pointer to code area */

    memcpy(newInst, pDexCode->insns, 8 * sizeof(unsigned short));

    meth->insns = newInst;

    //print out the instruction bytes
    for (i = 0; i < 8; i++)
        LOGI("--EARL-- modded, %4x ", meth->insns[i]);

    //make code read-only
    dvmLinearReadOnly(meth->clazz->classLoader, methodDexCode);
}

The target program in question is a simple hello world which has one
method "call_hello" which just sysouts a string.

This is the dexdump of the call_hello method

#1              : (in Lhello;)
  name          : 'call_hello'
  type          : '()V'
  access        : 0x0008 (STATIC)
  code          -
  registers     : 2
  ins           : 0
  outs          : 2
  insns size    : 8 16-bit code units
000154:                     |[000154] hello.call_hello:()V
000164: 6200 0000           |0000: sget-object v0, Ljava/lang/System;.out:Ljava/io/PrintStream; // field@0000
000168: 1a01 0a00           |0002: const-string v1, "call_hello invoked" // string@000a
00016c: 6e20 0300 1000      |0004: invoke-virtual {v0, v1}, Ljava/io/PrintStream;.println:(Ljava/lang/String;)V // method@0003
000172: 0e00                |0007: return-void
  catches       : (none)
  positions     :
    0x0000 line=10
    0x0007 line=11
  locals        :



This is the tombstone when a seg fault occurs:
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'unknown'
pid: 275, tid: 275 >>> dalvikvm <<<
signal 11 (SIGSEGV), fault addr 4104c234
r0 00000000 r1 00000000 r2 000000a5 r3 ffffaca5
r4 beaf9a2c r5 4104bfa0 r6 beaf9c28 r7 0000aed9
r8 aca0fb80 r9 000000ae 10 4104bf8c fp 00000000
ip 000000d9 sp beaf9be8 lr aca62991 pc aca131cc cpsr 40000010
#00 pc 000131cc /system/lib/libdvm.so
#01 pc 0001a944 /system/lib/libdvm.so (dvmMterpStd)
#02 pc 00019810 /system/lib/libdvm.so (dvmInterpret)
#03 pc 00050014 /system/lib/libdvm.so (dvmCallMethodV)
#04 pc 0003d108 /system/lib/libdvm.so
#05 pc 00008918 /system/bin/dalvikvm
#06 pc 0000d066 /system/lib/libc.so (__libc_init)

code around pc:
aca131ac e1a00000 e1a00000 e1a00000 e1a00000
aca131bc e1a00000 e1d430f2 e1a09427 e20320ff
aca131cc e7950102 e1b01443 e1f470b4 e0600001
aca131dc e207c0ff e7850109 e088f30c e1a00000
aca131ec e1a00000 e1a00000 e1a00000 e1a00000

code around lr:
aca62970 42be2001 2000dc00 bdf0b005 2200b510
aca62980 ff6cf7ff 46c0bd10 2201b510 ff66f7ff
aca62990 46c0bd10 e000b510 78033001 d0fb2b5b
aca629a0 061b3b42 2a180e1a 4909d810 2201161b
aca629b0 1c13409a 420a3001 055ad109 213bd506

stack:
beaf9ba8 beaf9bc8 [stack]
beaf9bac beaf9c28 [stack]
beaf9bb0 0006e180 [heap]
beaf9bb4 aca5cc21 /system/lib/libdvm.so
beaf9bb8 4104bfb4
beaf9bbc aca83fd0 /system/lib/libdvm.so
beaf9bc0 00000002
beaf9bc4 00000001
beaf9bc8 00071328 [heap]
beaf9bcc 00000000
beaf9bd0 000164b0 [heap]
beaf9bd4 429bf1ac /data/dalvik-cache/
sdc...@hello.jar@classes.dex
beaf9bd8 4104bfbc
beaf9bdc beaf9c28 [stack]
beaf9be0 00000071
beaf9be4 aca117ec /system/lib/libdvm.so
#00 beaf9be8 00000000
beaf9bec beaf9c28 [stack]
beaf9bf0 0006e180 [heap]
beaf9bf4 beaf9c28 [stack]
beaf9bf8 beaf9cb8 [stack]
beaf9bfc 00000000
beaf9c00 00000000
beaf9c04 00000000
beaf9c08 00000000
beaf9c0c aca1a948 /system/lib/libdvm.so
#01 beaf9c10 0000c320 [heap]
beaf9c14 aca1a8f0 /system/lib/libdvm.so
beaf9c18 beaf9c28 [stack]
beaf9c1c aca19814 /system/lib/libdvm.so
#02 beaf9c20 0000c320 [heap]
beaf9c24 beaf9bc8 [stack]
beaf9c28 429bf1ac /data/dalvik-cache/
sdc...@hello.jar@classes.dex
beaf9c2c 4104bfbc
beaf9c30 aca8505e
beaf9c34 aca850a0
beaf9c38 4109356c /dev/ashmem/dalvik-LinearAlloc (deleted)
beaf9c3c 0006e180 [heap]
beaf9c40 0000c320 [heap]
beaf9c44 beaf9be8 [stack]
beaf9c48 41049300
beaf9c4c 0000c328 [heap]
beaf9c50 aca8505e
beaf9c54 aca850a0
beaf9c58 00000000
beaf9c5c 00000000
beaf9c60 40024401 /dev/ashmem/mspace/dalvik-heap/0 (deleted)
beaf9c64 aca4fd49 /system/lib/libdvm.so
beaf9c68 00000000
beaf9c6c 00000009
beaf9c70 410935a0 /dev/ashmem/dalvik-LinearAlloc (deleted)
beaf9c74 0000c320 [heap]
beaf9c78 429bf22c /data/dalvik-cache/
sdc...@hello.jar@classes.dex
beaf9c7c aca50017 /system/lib/libdvm.so
#03 beaf9c80 aca83fd0 /system/lib/libdvm.so
beaf9c84 429bf22c /data/dalvik-cache/
sdc...@hello.jar@classes.dex
beaf9c88 beaf9c90 [stack]
beaf9c8c 00000001
beaf9c90 00000001
beaf9c94 00000007
beaf9c98 40024428 /dev/ashmem/mspace/dalvik-heap/0 (deleted)
beaf9c9c beaf9cdc [stack]
beaf9ca0 0000c320 [heap]
beaf9ca4 410935a0 /dev/ashmem/dalvik-LinearAlloc (deleted)
beaf9ca8 40024428 /dev/ashmem/mspace/dalvik-heap/0 (deleted)
beaf9cac aca3d10d /system/lib/libdvm.so
#04 beaf9cb0 beaf9cb8 [stack]
beaf9cb4 beaf9cdc [stack]
beaf9cb8 aca3b645 /system/lib/libdvm.so
beaf9cbc 00008747 /system/bin/dalvikvm
beaf9cc0 0000c320 [heap]
beaf9cc4 beaf9cdc [stack]
beaf9cc8 aca3d0e5 /system/lib/libdvm.so
beaf9ccc 00009118 /system/bin/dalvikvm
beaf9cd0 0000a180 [heap]
beaf9cd4 0000891b /system/bin/dalvikvm
beaf9cd8 410935a0 /dev/ashmem/dalvik-LinearAlloc (deleted)
beaf9cdc 4001dcb8 /dev/ashmem/mspace/dalvik-heap/0 (deleted)
#05 beaf9ce0 beaf9d70 [stack]
beaf9ce4 410935a0 /dev/ashmem/dalvik-LinearAlloc (deleted)
beaf9ce8 00000003
beaf9cec 0000a120 [heap]
beaf9cf0 4001dcb8 /dev/ashmem/mspace/dalvik-heap/0 (deleted)
beaf9cf4 00008b59 /system/bin/dalvikvm
beaf9cf8 00010004 [heap]
beaf9cfc 00000002
beaf9d00 0000a120 [heap]
beaf9d04 00000000
beaf9d08 0000a190 [heap]
beaf9d0c 0000a168 [heap]
beaf9d10 00000000
beaf9d14 00000000
beaf9d18 00000000
beaf9d1c 00000000
beaf9d20 00000000
beaf9d24 afd0d069 /system/lib/libc.so
#06 beaf9d28 00000000
beaf9d2c b000293b /system/bin/linker
beaf9d30 00000004
beaf9d34 beaf9e1e [stack]
beaf9d38 beaf9e27 [stack]
beaf9d3c beaf9e2b [stack]
beaf9d40 beaf9e3d [stack]
beaf9d44 00000000
beaf9d48 beaf9e43 [stack]
beaf9d4c beaf9e58 [stack]
beaf9d50 beaf9e74 [stack]
beaf9d54 beaf9ea5 [stack]
beaf9d58 beaf9ebf [stack]
beaf9d5c beaf9f66 [stack]
beaf9d60 beaf9f79 [stack]
beaf9d64 beaf9f94 [stack]
beaf9d68 beaf9fb1 [stack]
beaf9d6c beaf9fc4 [stack]

and these are the bytes that are printed on the logger (I see that
they are not the same as those from dexdump)
I/dalvikvm( 275): --EARL--, 62
I/dalvikvm( 275): --EARL--, 0
I/dalvikvm( 275): --EARL--, 11a
I/dalvikvm( 275): --EARL--, a
I/dalvikvm( 275): --EARL--, 20f8
I/dalvikvm( 275): --EARL--, 2b
I/dalvikvm( 275): --EARL--, 10
I/dalvikvm( 275): --EARL--, e
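
A note on the mismatch with dexdump, offered as a likely explanation rather than a confirmed diagnosis: dexdump prints the raw bytes in file order, while the interpreter reads each instruction as a little-endian 16-bit code unit, so the bytes "62 00" show up in the log as "62". The small sketch below decodes the dexdump bytes that way. The helper names (code_unit_from_bytes, decode_code_units, opcode_of) are made up for illustration and are not Dalvik functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: two bytes in file order become one
 * little-endian 16-bit code unit, as the interpreter reads them. */
static uint16_t code_unit_from_bytes(uint8_t first, uint8_t second)
{
    return (uint16_t)(first | (second << 8));
}

/* Bytes of call_hello exactly as dexdump prints them. */
static const uint8_t call_hello_bytes[16] = {
    0x62, 0x00, 0x00, 0x00,             /* sget-object v0, field@0000 */
    0x1a, 0x01, 0x0a, 0x00,             /* const-string v1, string@000a */
    0x6e, 0x20, 0x03, 0x00, 0x10, 0x00, /* invoke-virtual {v0, v1}, method@0003 */
    0x0e, 0x00                          /* return-void */
};

/* Decode a byte buffer into code units; nbytes must be even and
 * units must have room for nbytes / 2 entries. */
static void decode_code_units(const uint8_t *bytes, size_t nbytes,
                              uint16_t *units)
{
    size_t i;

    assert(nbytes % 2 == 0);
    for (i = 0; i < nbytes; i += 2)
        units[i / 2] = code_unit_from_bytes(bytes[i], bytes[i + 1]);
}

/* The opcode is the low byte of an instruction's first code unit. */
static uint8_t opcode_of(uint16_t first_unit)
{
    return (uint8_t)(first_unit & 0xff);
}
```

Decoded this way the sixteen bytes become 62, 0, 11a, a, 206e, 3, 10, e. The first four and the last two match the log. The remaining difference (20f8 2b in the log instead of 206e 3) is consistent with dexopt having rewritten invoke-virtual (opcode 0x6e) into invoke-virtual-quick (opcode 0xf8) in the optimized DEX, which replaces the method index with a vtable index. It also means hand-written code units should be given as 0x0062, 0x011a and so on, not 0x6200, 0x1a01.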

fadden

Feb 24, 2011, 3:20:27 PM
to android-platform
On Feb 24, 8:54 am, Tez <earlencefe...@gmail.com> wrote:
> I have changed the code in the above method such that meth->insns is
> pointed to a new array containing the SAME bytes.
> (just to see if I can actually play around with it later)

You can't just redirect meth->insns. Look at dvmGetMethodCode to see
why. This is why dvmMakeCodeReadWrite copies the entire DexCode
block.

What you'd want to do instead is modify dvmMakeCodeReadWrite to pass
dexCodeSize + (some value) to dvmLinearAlloc, so you have enough room
there to add additional bytecode.

The dvmMakeCodeReadWrite function was #if 0ed in Froyo and removed
entirely in Honeycomb, but since you're doing a custom VM that
shouldn't be a problem. (The newer implementation uses mprotect to
allow the original code to be modified in place, but that won't work
for you.)
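
To make the point about dvmGetMethodCode concrete, here is a minimal stand-alone model, not the actual Dalvik source. It assumes a simplified ModelDexCode struct with the same header fields as DexCode and no tries or handlers after the code, and it uses calloc where the VM would use dvmLinearAlloc. All model_* names are hypothetical. The idea it illustrates: the header is located by stepping back from the insns pointer, so insns has to point into a complete DexCode block, and growing the code means copying header and code together into a larger block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for DexCode: fixed header followed directly by
 * the instruction array (try/handler data omitted). */
typedef struct ModelDexCode {
    uint16_t registersSize;
    uint16_t insSize;
    uint16_t outsSize;
    uint16_t triesSize;
    uint32_t debugInfoOff;
    uint32_t insnsSize;   /* length of insns[] in 16-bit code units */
    uint16_t insns[];     /* insnsSize entries */
} ModelDexCode;

/* Same idea as dvmGetMethodCode: recover the header from insns. */
static const ModelDexCode *model_code_from_insns(const uint16_t *insns)
{
    return (const ModelDexCode *)
        ((const uint8_t *)insns - offsetof(ModelDexCode, insns));
}

/* Size in bytes of a model block holding `units` code units. */
static size_t model_code_size(uint32_t units)
{
    return offsetof(ModelDexCode, insns) + (size_t)units * sizeof(uint16_t);
}

/* Copy header + code into a larger block and append extra code units.
 * Returns the new insns pointer, or NULL if allocation fails. */
static uint16_t *model_copy_and_append(const uint16_t *old_insns,
                                       const uint16_t *extra,
                                       uint32_t extra_units)
{
    const ModelDexCode *old_code = model_code_from_insns(old_insns);
    uint32_t new_units = old_code->insnsSize + extra_units;
    ModelDexCode *new_code = calloc(1, model_code_size(new_units));

    if (new_code == NULL)
        return NULL;
    memcpy(new_code, old_code, model_code_size(old_code->insnsSize));
    memcpy(new_code->insns + old_code->insnsSize, extra,
           (size_t)extra_units * sizeof(uint16_t));
    new_code->insnsSize = new_units;   /* keep the header consistent */
    return new_code->insns;
}
```

Pointing meth->insns at a bare array breaks the first function, because whatever sits 16 bytes before that array is then read as a header. In the real DexCode the try and handler tables follow the code, so a real version would also have to move those; the model leaves that out.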

Tez

Feb 24, 2011, 3:30:32 PM
to android-platform
> What you'd want to do instead is modify dvmMakeCodeReadWrite to pass
> dexCodeSize + (some value) to dvmLinearAlloc, so you have enough room
> there to add additional bytecode.

Yes, I kinda figured that out the hard way :)

I see that newCode is allocated in dvmMakeCodeReadWrite.
I presume that I should place modified byte code here and then make
meth->insns point to newCode->insns?

Yes, I tried increasing the size of the newly allocated DexCode
structure.
I will copy out the original DexCode data and add new byte code (in
hex)

Is there any other structure that needs to be updated (besides the
insns and insSize array) ?

Cheers,
Earlence

Tez

Feb 25, 2011, 5:00:19 AM
to android-platform
Hi,

This is what I am trying (but I get a seg fault on dexGetDexCodeSize and
can't figure out why)

if (strcmp(meth->name, "call_hello") == 0)
{
    DexCode* methodDexCode = (DexCode*) dvmGetMethodCode(meth);

    if (IS_METHOD_FLAG_SET(meth, METHOD_ISWRITABLE))
    {
        dvmLinearReadWrite(meth->clazz->classLoader, methodDexCode);
    }
    else
    {
        assert(!dvmIsNativeMethod(meth) && !dvmIsAbstractMethod(meth));

        size_t dexCodeSize = dexGetDexCodeSize(methodDexCode);
        LOGD("Making a copy of %s.%s code (%d bytes)\n",
             meth->clazz->descriptor, meth->name, dexCodeSize + 14);

        DexCode* newCode = (DexCode*) dvmLinearAlloc(meth->clazz->classLoader,
                                                     dexCodeSize + 14);
        memcpy(newCode, methodDexCode, dexCodeSize);

        meth->insns = newCode->insns;
        //newCode->insnsSize = 15; //fifteen 16-bit code units
        SET_METHOD_FLAG(meth, METHOD_ISWRITABLE);

        newCode->insns[7] = 0x6200;
        newCode->insns[8] = 0x0000;
        newCode->insns[9] = 0x1a01;
        newCode->insns[10] = 0x0a00;
        newCode->insns[11] = 0x6e20;
        newCode->insns[12] = 0x0300;
        newCode->insns[13] = 0x1000;
        newCode->insns[14] = 0x0e00;

        //print out the instruction bytes
        for (i = 0; i < 15; i++)
            LOGI("--EARL-- modded, %4x ", meth->insns[i]);

        //make code read-only
        dvmLinearReadOnly(meth->clazz->classLoader, methodDexCode);
    }
}

Here, basically, I am allocating 14 bytes more than the dex code size (to
hold 7 more 16-bit code units). This is to hold basically the same
opcodes that appear in the previous statement (which is a sysout).
Why is a seg fault occurring at dexGetDexCodeSize? (It does not print
the message that code is being copied.)

Cheers,
Earlence

Tez

Feb 25, 2011, 5:28:34 AM
to android-platform
I found that meth->insns was not set correctly to pDexCode->insns in
the first place.

anyway, I think that if additional bytecode is inserted, the try_item
addresses will no longer be valid and they will have to be adjusted as
well. Am I correct?

-E

Tez

Feb 25, 2011, 5:43:22 AM
to android-platform
I did it!!!
I managed to inject another call to sysout.println!!!

1. But my earlier question holds. Right now, it's very simple. What if
the code has try/catch blocks? Will insertion of additional bytecode
foul up the try_item data structures?
2. If so, do you have any strategies on how I can effectively handle
situations like this?
3. Suppose I need to inject other "data" like strings: any pointers on
how that can be done (so that the injected bytecode can make use of
this data)?

Cheers,
Earlence


Tez

Feb 28, 2011, 3:10:52 PM
to android-platform
any suggestions?

-Earlence


fadden

Mar 1, 2011, 4:34:55 PM
to android-platform
On Feb 25, 2:43 am, Tez <earlencefe...@gmail.com> wrote:
> I managed to inject another call to sysout.println!!!

Congratulations! I think you may be the first.

> 1. but my earlier question holds. right now, its very simple. what if
> the code has try catch blocks. will insertion of additional bytecode
> foul up the try_item data structures?

Yes. You have to adjust them. If you put all of your new code at the
end of the method and branch to it and then back to start+1, your
adjustment will be a fixed offset, which might be easier to manage.

Same applies to the local variable info used by the debugger and
exception trace display.

> 2. If so, do you have any strategies on how I can effectively handle
> situations like this?

Write lots of code. :-)

> 3. Suppose I need to inject other "data" like strings, any pointers on
> how that can be done (so that the injected bytecode can make used of
> this data?)

That will be tricky. You're already running, which means all of the
DEX "constant pool" areas have been processed, which makes them hard
to update. Essentially you'd need to realloc() the tables in the
DvmDex struct and "pre-resolve" them so that the resolver doesn't go
chasing through non-existent DEX data.
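
On the try_item adjustment mentioned earlier in this reply, a rough sketch of the bookkeeping might look like the following. It assumes the three try_item fields from the DEX format (start address, instruction count, handler offset, all counted in 16-bit code units) and a single insertion of `added` code units at position `insert_at`. ModelTry and model_shift_tries are hypothetical names, and the catch handler addresses and debug info positions, which need the same kind of shift, are not handled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as the DEX format's try_item; addresses and counts are
 * measured in 16-bit code units. */
typedef struct ModelTry {
    uint32_t startAddr;
    uint16_t insnCount;
    uint16_t handlerOff;
} ModelTry;

/* Hypothetical fix-up after inserting `added` code units at
 * `insert_at`. */
static void model_shift_tries(ModelTry *tries, size_t count,
                              uint32_t insert_at, uint16_t added)
{
    size_t i;

    for (i = 0; i < count; i++) {
        uint32_t start = tries[i].startAddr;
        uint32_t end = start + tries[i].insnCount;   /* exclusive */

        if (start >= insert_at) {
            /* Whole range lies at or after the insertion: slide it. */
            tries[i].startAddr = start + added;
        } else if (end > insert_at) {
            /* Insertion lands inside the range: the range grows. */
            assert((uint32_t)tries[i].insnCount + added <= UINT16_MAX);
            tries[i].insnCount = (uint16_t)(tries[i].insnCount + added);
        }
        /* Ranges ending at or before insert_at are unaffected. */
    }
}
```

One design choice worth noting: when the insertion point coincides with the start of a try range, this sketch slides the range so the injected code stays outside the try block. Appending all new code at the end of the method, as suggested above, avoids most of this because no existing address after the branch moves.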

Tez

Mar 2, 2011, 7:57:40 AM
to android-platform
> Congratulations! I think you may be the first.
Thanks!

>"pre-resolve" them so that the resolver doesn't go
> chasing through non-existent DEX data.
I'm not quite sure what "pre-resolve" means.

I think I will need to realloc DvmDex and update the constant table to
hold additional values that the injected code will use.
The docs mention that the method/field tables are sorted. But on what
field is the sorting done?

Cheers,
Earlence

Tez

Mar 3, 2011, 8:05:21 AM
to android-platform
okay...from the dump, I see that methods etc are sorted by defining
type.
Why is it necessary to have these lists sorted?

eg: if I have to add a method reference whose sorted position is in
the middle of the list, then every reference below it changes, and
these changes imply massive reconstruction of the entire dex file
in memory. This seems like quite a problem for bytecode injection.

Cheers,
Earlence

Tez

Mar 3, 2011, 9:28:30 AM
to android-platform
But if I modify the DvmDex structure such that it holds additional data,
then loadClassFromDex0 will use this modified data and handle the
changes itself while loading.
Finally, the hook in loadMethodFromDex will push in the injected code.

so 2 steps

1. intercept the load of DvmDex into memory. Have an additional step
which scans and modifies the struct.
2. intercept loadMethodFromDex and inject bytecode as shown earlier.

What do you think of this approach?

Cheers,
Earlence

fadden

Mar 4, 2011, 2:21:30 PM
to android-platform
On Mar 2, 4:57 am, Tez <earlencefe...@gmail.com> wrote:
> >"pre-resolve" them so that the resolver doesn't go
> > chasing through non-existent DEX data.
>
> Im not quite sure what "pre-resolve" means.

The field/method/class/string resolver first checks to see if there's
a non-NULL entry in the DvmDex tables. If it's NULL, it goes chasing
through the DEX file to find the info. So long as you populate the
entry, you don't need to e.g. create a new string constant in the DEX
file to back up a string constant.

Since you're bypassing the DEX data entirely in this scenario, I don't
think the sort order of stuff there matters.
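
The resolver behaviour described above can be pictured with a small model. This is a sketch under stated assumptions, not the VM's resolver: plain C strings stand in for StringObject pointers, ModelDex stands in for the pResStrings table in DvmDex, and every model_* name is hypothetical. In the real VM the entries are heap objects, so an injected string would also have to be created and interned properly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct ModelDex {
    const char **resolved;    /* stands in for pResStrings */
    size_t table_size;        /* entries in resolved[] */
    size_t dex_string_count;  /* entries actually backed by DEX data */
    int dex_lookups;          /* counts slow-path lookups */
} ModelDex;

/* Allocate a table with room for `extra` injected entries. */
static int model_init(ModelDex *dex, size_t dex_count, size_t extra)
{
    dex->resolved = calloc(dex_count + extra, sizeof(*dex->resolved));
    if (dex->resolved == NULL)
        return -1;
    dex->table_size = dex_count + extra;
    dex->dex_string_count = dex_count;
    dex->dex_lookups = 0;
    return 0;
}

/* Slow path: pretend to parse the string out of the DEX file. */
static const char *model_lookup_in_dex(ModelDex *dex, size_t idx)
{
    /* Injected indices have no DEX data behind them. */
    assert(idx < dex->dex_string_count);
    dex->dex_lookups++;
    return "from-dex";
}

/* Resolver: use the cached entry if present, else consult the DEX. */
static const char *model_resolve_string(ModelDex *dex, size_t idx)
{
    assert(idx < dex->table_size);
    if (dex->resolved[idx] == NULL)
        dex->resolved[idx] = model_lookup_in_dex(dex, idx);
    return dex->resolved[idx];
}

/* "Pre-resolve": fill an injected slot so the slow path never runs. */
static void model_pre_resolve(ModelDex *dex, size_t idx, const char *value)
{
    assert(idx >= dex->dex_string_count && idx < dex->table_size);
    dex->resolved[idx] = value;
}
```

Because the injected slot is already non-NULL, resolving it never reaches model_lookup_in_dex, which is why the sorted order of the DEX tables does not come into play for injected entries.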

Tez

Mar 4, 2011, 2:51:33 PM
to android-platform
okay. so if i got this right, I just add my method/field/string
references in memory (newly allocated) for the DvmDex table.
and refer to the addresses contained therein from the bytecode?

If it's like that, then it saves quite an amount of work :)

Cheers,
Earlence

Tez

Mar 4, 2011, 2:56:54 PM
to android-platform
inside DvmDex.h, I see the following:

/* interned strings; parallel to "stringIds" */
struct StringObject** pResStrings;

/* resolved methods; parallel to "methodIds" */
struct Method** pResMethods;

/* resolved instance fields; parallel to "fieldIds" */
/* (this holds both InstField and StaticField) */
struct Field** pResFields;

I think updating these with new data should help me make some
progress. Let's see how it goes.

Cheers,
Earlence

Tez

Mar 7, 2011, 8:40:42 AM
to android-platform
typedef struct DvmDex {
    /* pointer to the DexFile we're associated with */
    DexFile* pDexFile;

    /* clone of pDexFile->pHeader (it's used frequently enough) */
    const DexHeader* pHeader;

    /* interned strings; parallel to "stringIds" */
    struct StringObject** pResStrings;

    /* resolved classes; parallel to "typeIds" */
    struct ClassObject** pResClasses;

    /* resolved methods; parallel to "methodIds" */
    struct Method** pResMethods;

    /* resolved instance fields; parallel to "fieldIds" */
    /* (this holds both InstField and StaticField) */
    struct Field** pResFields;

    /* interface method lookup cache */
    struct AtomicCache* pInterfaceCache;

    /* shared memory region with file contents */
    MemMapping memMap;
} DvmDex;

As per this, if I have a new string reference, and if I understand you
correctly, I just need to reallocate this structure as read-write and
add an entry with the new string at the end of pResStrings (and
similarly for field/method refs)?

-Earlence


fadden

Mar 7, 2011, 5:13:52 PM
to android-platform
On Mar 4, 11:51 am, Tez <earlencefe...@gmail.com> wrote:
> okay. so if i got this right, I just add my method/field/string
> references in memory (newly allocated) for the DvmDex table.
> and refer to the addresses contained therein from the bytecode?
>
> If its like that, then it saves quite an amount of work :)

That's the idea. I think that'll "just work", shaky as it sounds.
The trick is to avoid reallocating stuff while another thread is
trying to use it -- the tables and methods aren't normally expected to
move around. (If this is for "academic" purposes, you could just over-
alloc up front, and blow up if you exceed the new limit.)
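
The "over-alloc up front, blow up on overflow" idea could look roughly like this. It is an illustrative sketch only: ModelTable and its functions are hypothetical names, and void pointers stand in for whatever the real table holds. The point is that the entries array is allocated once and never moved, so a pointer another thread already holds stays valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct ModelTable {
    void **entries;    /* allocated once, never reallocated */
    size_t used;       /* slots in use, original entries first */
    size_t capacity;   /* original count plus fixed headroom */
} ModelTable;

/* Allocate room for the original entries plus `headroom` extras. */
static int model_table_init(ModelTable *t, size_t original, size_t headroom)
{
    t->entries = calloc(original + headroom, sizeof(*t->entries));
    if (t->entries == NULL)
        return -1;
    t->used = original;
    t->capacity = original + headroom;
    return 0;
}

/* Append an injected entry and return its index. Running out of
 * headroom is treated as fatal instead of growing the table. */
static size_t model_table_append(ModelTable *t, void *entry)
{
    if (t->used == t->capacity) {
        fprintf(stderr, "injection table full (%zu entries)\n", t->capacity);
        abort();
    }
    t->entries[t->used] = entry;
    return t->used++;
}
```

The returned index is what injected bytecode would use to refer to the new entry.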

Tez

Mar 7, 2011, 5:21:17 PM
to android-platform
I was thinking that I will pre-calculate how much memory is needed
when a load is about to take place.
Each time, the injected data is constant. so I think I will be safe
from going over the limit.

Yes, this is for academic research. But I want to have a stable
"production" level modification. (maybe the code can be contributed
back to dalvik :) )

> The trick is to avoid reallocating stuff while another thread is
> trying to use it -
When DvmDex is first created, I will hook there and modify it. Does
this solve the problem?

-Earlence

Tez

Mar 8, 2011, 12:23:09 PM
to android-platform
The reason I say this is because, I plan on injecting code once at
method load time only.
this is definitely before it is executed, and hence only one thread
will be accessing it.
extending this, I will know exactly how much more space is needed.
Hence, I will not run the risk of overflow.
Does this sound reasonable?

-Earlence

fadden

Mar 8, 2011, 7:40:30 PM
to android-platform
On Mar 8, 9:23 am, Tez <earlencefe...@gmail.com> wrote:
> extending this, I will know exactly how much more space is needed.
> Hence, I will not run the risk of overflow.
> Does this sound reasonable?

That does make it easier to do stuff up front. :-)

I'm curious to know how this turns out.

Tez

Mar 9, 2011, 2:04:35 AM
to android-platform
> I'm curious to know how this turns out.
Yes, I'll keep this thread updated.

Cheers,
Earlence

Deng Yao

Mar 12, 2012, 1:43:39 AM
to android-...@googlegroups.com
Sorry for bumping this old thread.

Tez, is there any progress on the injection? I need to do something similar. But my problem is more difficult: I cannot build a customized dalvik vm, I can only inject the "monitor" java code by native code injection, which means firstly I have to inject native code into libdvm.so like a virus, then the injected libdvm.so would load extra "monitor" java code.
Currently I can inject instructions on the entry of any function whose address is known. Which function from libdvm.so should I inject to get this job done? Although with different versions the function name may be different, I can handle this problem by injecting different native code according to the version number.

Thanks
-Yao

Tez

Mar 12, 2012, 4:58:37 AM
to android-platform
can you explain your mechanism in more detail?
my arch. is different.

-Earlence

邓尧

Mar 12, 2012, 9:49:20 PM
to Tez, android-...@googlegroups.com
Since apk decompilers already exist, I believe the information in a pre-compiled library should be rich enough to identify any Java method.

On Mon, Mar 12, 2012 at 3:02 PM, Tez <earlen...@gmail.com> wrote:
> Sounds like an interesting approach. However, how is it possible that
> you identify a Java method from a pre-compiled library?
> I made some additional progress on the injection, but I have only
> recently restarted work on this.
>
> My approach was different. I was thinking of creating a new bytecode
> instruction. The interpreter upon seeing this instruction will load
> monitor code maintained in binary form on the device. This is executed
> by the interpreter.
> The only "injection" that will happen is of the new instruction I have
> created and its parameters
>
> eg:
>
> <instruction stream>
> mon 1 <----- injected via Dalvik VM
> <instruction stream>
>
> the parameter specifies which function from a dynamic library the
> interpreter needs to execute.
>
> -Earlence
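
The "mon" instruction idea quoted above can be pictured with a toy interpreter loop. This is only an illustration of the dispatch, not Dalvik's interpreter: the opcode value 0xff for mon is an arbitrary placeholder, the monitor functions live in a static table instead of being loaded from a dynamic library, and all names are hypothetical. The only parts taken from Dalvik are the nop (0x00) and return-void (0x0e) opcode values and the convention that the opcode is the low byte of the code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    OP_NOP = 0x00,
    OP_RETURN_VOID = 0x0e,
    OP_MON = 0xff          /* placeholder value for the new opcode */
};

typedef void (*MonitorFn)(void);

static int g_monitor_calls[2];

static void monitor_zero(void) { g_monitor_calls[0]++; }
static void monitor_one(void)  { g_monitor_calls[1]++; }

/* Stand-in for functions that would come from a dynamic library. */
static const MonitorFn g_monitors[] = { monitor_zero, monitor_one };

/* Runs until return-void. Returns the number of code units executed,
 * or -1 on an unknown opcode, a bad monitor index, or a missing
 * return. */
static int toy_interpret(const uint16_t *insns, size_t count)
{
    size_t pc = 0;

    while (pc < count) {
        uint8_t op = (uint8_t)(insns[pc] & 0xff);   /* low byte: opcode */
        uint8_t arg = (uint8_t)(insns[pc] >> 8);    /* high byte: operand */

        switch (op) {
        case OP_NOP:
            break;
        case OP_MON:
            if (arg >= sizeof(g_monitors) / sizeof(g_monitors[0]))
                return -1;
            g_monitors[arg]();   /* run the selected monitor */
            break;
        case OP_RETURN_VOID:
            return (int)(pc + 1);
        default:
            return -1;
        }
        pc++;
    }
    return -1;
}
```

With this shape the injected payload is a single code unit per check, so the amount of bytecode that has to be relocated stays small.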


邓尧

Mar 12, 2012, 10:34:02 PM
to android-...@googlegroups.com
My approach works on arm-linux.

1. Attach to the target process with ptrace() system call.
2. Create a break point in the target process like a debugger.
3. When the break point is hit, copy some bootstrap code (more on bootstrap code below) on the stack of the target process. The stack of the target process can be found by parsing /proc/<pid>/maps, the corresponding line entry ends with "[stack]"
4. Replace the break point with a "bx ip" instruction (similar to "jmp" in x86), modify the value of register "ip", then resume the target process. The target process will execute the injected bootstrap code.

About the bootstrap code:
a) The first instruction of the bootstrap code is a break point. When this break point is hit, do the following:
* Clear the break point created in step 2.
* Replace the break point created in step a) with a "nop" instruction.
* Replace the first instruction of the injected function with any illegal instruction; I picked "0xdead" for THUMB code and "0xdeaddead" for ARM code.
* Resume the target process.
b) The bootstrap code just registers a signal handler for SIGILL with the sigaction() system call. The third argument of the signal handler is a pointer to a "struct ucontext_t" structure (this structure isn't defined in the Android NDK, but one can simply be copied from a GNU cross toolchain). Inside the signal handler we can do anything we like.
c) Modify the "pc" register in the "ucontext_t" structure and return from the signal handler; the target process will jump to the corresponding address.
d) After the jump in step c), first restore the registers (the values of the registers are saved in the "ucontext_t" structure), then execute the first instruction of the injected function if the instruction is position independent; otherwise interpret it (I haven't implemented this part yet; it is better to do the interpretation before returning from the signal handler).
e) Execute the rest of the injected function or return to its caller.

BTW, according to C standard the behavior of returning from a SIGILL handler is undefined, but on arm-linux it just works.
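
Step b) can be sketched in isolation as follows. This is a minimal example assuming a Linux system whose <signal.h> provides ucontext_t (unlike the NDK of the time, as noted above); on_sigill and install_sigill_handler are made-up names. It only installs the three-argument handler and records that it ran. Rewriting the saved pc, as in step c), is left as a comment because the field name inside uc_mcontext is architecture specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t g_sigill_seen;

/* Three-argument handler form selected by SA_SIGINFO. */
static void on_sigill(int signo, siginfo_t *info, void *context)
{
    ucontext_t *uc = (ucontext_t *)context;

    (void)signo;
    (void)info;
    (void)uc;
    /* A real hook would change the saved program counter inside
     * uc->uc_mcontext here; the field name depends on the CPU. */
    g_sigill_seen = 1;
}

/* Returns 0 on success, -1 on failure (same as sigaction). */
static int install_sigill_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}
```

The handler can be exercised safely with raise(SIGILL), since returning from a handler for a signal sent that way is well defined; the undefined case is returning after a genuinely illegal instruction without changing pc.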

Thanks
-Yao



邓尧

Mar 12, 2012, 11:26:32 PM
to android-...@googlegroups.com
> d) After the jump in step c), first restore the registers (the values of the registers are saved in the "ucontext_t" structure)

I made a mistake, the kernel will restore the registers according to the values saved in the "ucontext_t" structure automatically, sorry for this.

Thanks
-Yao

Tez

Mar 13, 2012, 4:25:28 AM
to android-platform
as far as a linux process is concerned, this may work.
but I am afraid that this is not going to work as far as Java is
concerned.
The VM interprets instructions, and thus the injection should
manipulate the VM state.
From the point of view of a linux process, this information is not
accessible.

from a process point of view, the address of an instruction could be
the same in 2 cases, but the java code being executed at that point is
something else altogether.

this is a wrong approach. you have to inject into the bytecode stream
and not native code stream

-Earlence

Tez

Mar 13, 2012, 4:28:57 AM
to android-platform
but if you still want to try,
you can use the following function

dvmGetMethodCode

this is called to load a method's code when it is about to be executed.

-Earlence

邓尧

Mar 13, 2012, 9:17:30 PM
to android-...@googlegroups.com
I understand this problem. Since I cannot build a customized dalvikvm, I cannot get access to the bytecode stream unless native code is injected first.
dvmGetMethodCode isn't a good function to inject: it's an inline function defined in a header file, so it could easily be inlined into its caller. IMO, it's better to modify the bytecode stream right after loading it, but I haven't figured out how dalvik loads a bytecode stream.

-Yao

Tez

Mar 14, 2012, 2:49:29 AM
to android-platform
inject in the caller of dvmGetMethodCode in that case.

-Earlence

Tez

Mar 14, 2012, 5:08:46 AM
to android-platform
the other thing that came to mind is that this could be inefficient.
essentially, the injection point would stop execution each time the
function is called, since you have to check which method is going to
be executed.

-Earlence

邓尧

Mar 14, 2012, 6:02:08 AM
to android-...@googlegroups.com
True, that's why I want to modify the bytecode stream right after loading it. Class loading happens just once, but execution could be thousands of times.

-Yao

Tez

Mar 14, 2012, 7:02:28 AM
to android-platform
I am interested in how this turns out.
If you manage to do it, can you post about it here?

-Earlence

邓尧

Mar 16, 2012, 1:15:25 AM
to android-...@googlegroups.com
Sure, but I'm now on a different project, so it would take some time.
-Yao

baron.s...@gmail.com

Oct 10, 2013, 4:27:54 AM
to android-...@googlegroups.com
Hi Yao, 

Any update about your post ? :)

THX