SIGSEGV when dereferencing an address which should be in the code cache


Greg Cawthorne

Jul 29, 2020, 2:36:38 PM
to Dr. Memory Users
Hi.

I have been trying to add aarch64 support to DrMemory by closely following the x86 code, and I am currently getting an error I wouldn't mind some help with.

In instrument_esp_adjust_slowpath in stack_arm.c, I try to save the address of the label retaddr to the ESP_SLOW_SCRATCH2 reg:
    ...
    instru_insert_mov_pc(drcontext, bb, inst, opnd_create_reg(ESP_SLOW_SCRATCH2),
                         opnd_create_instr(retaddr));

    app_pc pc = (sp_action == SP_ADJUST_ACTION_ZERO) ? shared_esp_slowpath_zero :
        ((sp_action == SP_ADJUST_ACTION_DEFINED) ? shared_esp_slowpath_defined :
         shared_esp_slowpath_shadow);

    branch_aarch64(drcontext, bb, inst, opnd_create_pc(pc));

    PRE(bb, inst, retaddr);
    ...

Then generate_shared_esp_slowpath_helper in stack.c saves this value to esp_spill_slot_base(sp_action), which is always SPILL_SLOT_6 because whole_bb_spills_enabled() is true. It then does a clean call to handle_esp_adjust_shared_slowpath:
    ...
    mov_str_aarch64(drcontext, ilist, NULL,
                    spill_slot_opnd(drcontext, esp_spill_slot_base(sp_action)),
                    opnd_create_reg(ESP_SLOW_SCRATCH2), NULL);

    dr_insert_clean_call(drcontext, ilist, NULL,
                         (void *)handle_esp_adjust_shared_slowpath, false, 2,
                         opnd_create_reg(ESP_SLOW_SCRATCH1), OPND_CREATE_INT32(sp_action));
    ...

Then in handle_esp_adjust_shared_slowpath, also in stack.c, we try to get the stored value and then call decode, which eventually attempts the failing dereference:
    ...
    app_pc pc = (app_pc) get_own_tls_value(esp_spill_slot_base(sp_action));
    instr_t inst;
    void *drcontext = dr_get_current_drcontext();
    /* We decode forward past eflags and register restoration, none of which
     * should reference esp. The next instr is the app instr.
     */
    instr_init(drcontext, &inst);
    while (true) {
        pc = decode(drcontext, pc, &inst);
        ...

Dereference:
    byte *
    decode_common(dcontext_t *dcontext, byte *pc, byte *orig_pc, instr_t *instr)
    {
        byte *next_pc = pc + 4;
        uint enc = *(uint *)pc; // <- SIGSEGV here
        ...

Then the actual error in lldb:

* thread #1, name = 'simple', stop reason = signal SIGSEGV: invalid address (fault address: 0x51c81000)
    frame #0: 0x00000000712b3af4
->  0x712b3af4: ldr    w0, [x0]
    0x712b3af8: str    w0, [sp, #0x4c]
    0x712b3afc: adrp   x0, 238
    0x712b3b00: add    x5, x0, #0x630            ; =0x630

Note:
If I replace the line in stack_arm.c:

 instru_insert_mov_pc(drcontext, bb, inst, opnd_create_reg(ESP_SLOW_SCRATCH2), opnd_create_instr(retaddr));

with:
 instru_insert_mov_pc(drcontext, bb, inst, opnd_create_reg(ESP_SLOW_SCRATCH2), OPND_CREATE_INT64(7));

I get the same error but the fault address is 7, just as a sanity check.

So I am wondering if it seems likely that either:
  1. I am not inserting the address correctly.
  2. There is another step I must do to make this code cache memory addressable/in use.
  3. There is a likely way it could be corrupted that I am missing.
I appreciate it might be hard to know without seeing the rest of the code I have changed. Just wondering if there is any intuition around on this. There is hope to upstream this if I can get this working further.

Many thanks in advance :)
Greg

Greg Cawthorne

Jul 29, 2020, 5:03:15 PM
to Dr. Memory Users
What is curious is that in the first while loop of umbra_address_space_init() in umbra_64.c when I print out the debug info from the successive umbra_add_app_segment() calls I eventually get:

addres space init pc 0x45d83000 info.base_pc 0x45d83000 info.size 268238848 info.type 2
umbra_add_app_segment entered MAX_NUM_APP_SEGMENTS 7
umbra_add_app_segment i: 0
size 268238848 base 0x45d83000 >= app_segments[i].app_base 0xff0000000000 base + size 0x55d53000 <= app_segments[i].app_end 0x1000000000000
umbra_add_app_segment i: 1
umbra_add_app_segment app_segment: 1 return true 1
addres space init pc 0x55d53000 info.base_pc 0x55d53000 info.size 455790592 info.type 0
addres space init pc 0x71000000 info.base_pc 0x71000000 info.size 4087808 info.type 1

where 0x55d53000 always matches the eventual fault address (which was 0x51c81000 in my previous post, but differs in this run as it varies between runs). It seems to be the end of a region, and the next region is of info.type 0, which is DR_MEMTYPE_FREE (/**< No memory is allocated here */), so 0x55d53000 is possibly the end of something like the code cache?
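
The arithmetic behind that observation can be checked directly. This is a minimal sketch, assuming a region is described by a base and a size as in the debug output above; the struct and function names are hypothetical and are not part of Umbra or DynamoRIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the (base_pc, size) pair printed above. */
typedef struct {
    uint64_t base;
    uint64_t size;
} region_t;

/* Returns the address one past the last byte of the region. */
static uint64_t
region_end(region_t r)
{
    assert(r.size <= UINT64_MAX - r.base); /* base + size must not wrap */
    return r.base + r.size;
}

/* Returns true if addr lies inside the half-open range [base, base + size). */
static bool
region_contains(region_t r, uint64_t addr)
{
    return addr >= r.base && addr < region_end(r);
}
```

With base 0x45d83000 and size 268238848 (0xffd0000), region_end gives 0x55d53000, so the faulting address is the first byte past that region rather than an address inside it.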

Derek Bruening

Jul 29, 2020, 5:39:42 PM
to drmemor...@googlegroups.com
Thank you for the work on porting Dr. Memory!  Here, my first thought is to go confirm that everything is properly implemented for instrlist_insert_mov_instr_addr() on AArch64.  Inspecting insert_mov_immed_arch() (https://github.com/DynamoRIO/dynamorio/blob/master/core/ir/aarchxx/ir_utils.c#L166) it looks like it simply fails to create an instruction operand: so I think that is the bug.  Looking at the test that covers that: suite/tests/client-interface/alloc.dll.c is currently running on x86 only.  Unfortunately there are some missing features in DR for AArch64, including porting some tests...help appreciated!  For this particular bug, I would suggest filing a bug in the DR tracker (and ideally posting a pull request with a fix?  I would think the AArch64 encoder supports instr operands in general and thus it should be an easy fix?).

--

---
You received this message because you are subscribed to the Google Groups "Dr. Memory Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to drmemory-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/drmemory-users/226580e2-c148-46f9-b4ad-71c50272beeeo%40googlegroups.com.

Greg Cawthorne

Jul 29, 2020, 7:04:13 PM
to Dr. Memory Users
Thanks for the speedy reply Derek.

I have looked at instrlist_insert_mov_instr_addr() and insert_mov_immed_arch() previously and have seen the latter work nicely for inserting immediate values with movs. When called with the instr code path, it seems to get the encode_pc value from vmcode_get_end(), which might explain my previous post's analysis. It then tries to encode the result of that as an immediate value:

    if (src_inst != NULL)
        val = (ptr_int_t)encode_estimate;

So I could possibly check whether the src_inst already has a value?

    if (src_inst != NULL && instr_get_app_pc(src_inst) != NULL)
        val = (ptr_int_t)instr_get_app_pc(src_inst);
    else
        val = (ptr_int_t)encode_estimate;

Although instr_get_app_pc often returns 0x0, and I am not sure about this usage if instru_insert_mov_pc is called before retaddr is inserted (where it is more likely to return 0x0) with:

PRE(bb, inst, retaddr);

So this suggestion might work if I do it after the insert or I could simply call instru_insert_mov_pc with the 'where' inst instead?:

 instru_insert_mov_pc(drcontext, bb, inst, opnd_create_reg(ESP_SLOW_SCRATCH2), opnd_create_instr(inst));

I am keen to fix this if I can wrap my head around it :)

Derek Bruening

Jul 29, 2020, 7:52:26 PM
to drmemor...@googlegroups.com
Look at the ARM implementation just below and the x86 implementation in another file: you need opnd_create_instr* instead of OPND_CREATE_INT*.  The vmcode_get_end() is just for reachability of displacements.


Greg Cawthorne

Jul 29, 2020, 7:54:18 PM
to Dr. Memory Users
Should read:

    if (src_inst != NULL && instr_get_app_pc(src_inst) != NULL)
        val = (ptr_int_t)instr_get_app_pc(src_inst);
    else if (src_inst != NULL)
        val = (ptr_int_t)encode_estimate;

Greg Cawthorne

Jul 29, 2020, 7:55:58 PM
to Dr. Memory Users
Right, I will have a play with this tomorrow, thanks.

Greg Cawthorne

Jul 30, 2020, 3:44:13 AM
to Dr. Memory Users
I had a try at adding the opnd_create_instr_ex as you said. In the encode_gen.h file generated from aarch64/codec.txt, the case for handling movz has a strict check that the operand is an imm16, and it fails on something of type instr. I know the b encoding handles instructions, so I had a poke in there: it has a bespoke function handler in codec.c, encode_opnds_b(), which calls encode_pc_off(). So it looks like with the current setup I'd have to create another similar function in codec.c. If I add an entry for movz with blank params in codec.txt, codec.py might then generate the correct call if I follow the naming scheme.

Derek Bruening

Jul 30, 2020, 8:31:08 AM
to drmemor...@googlegroups.com
I see, so the AArch64 encoder does *not* handle an instr_t* as an immediate today. Could you file an issue in the dynamorio tracker on that? (I guess one issue could cover both instrlist_insert_mov_instr_addr() and the encoder.) Even better would be to implement it and send a PR. I guess with the 16-bit immediate limit in MOVK we're going to generate 4 MOVK instructions (with opnd_create_instr_ex() specifying a shift for the instr_t final PC immediate) on AArch64 (vs 1 x86 instr or 2 ARM).

The alternative, local embedded data plus a PC-relative load, does not have first-class IR support. It can be done: make a raw instr whose raw bits are the data, and arrange to not execute the data. But it complicates disassembly in the middle of a fragment, and it is rarely done in DR, though it does exist (one case is in the rseq handling code). So it is simpler to go with the 4 MOVK (presumably that executes more quickly too?).
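
To make the 16-bit immediate limit concrete, here is a minimal sketch of how a 64-bit address splits into four move-wide instructions. It uses MOVZ for the low halfword followed by three MOVK, which is one common form of the sequence. The function names are hypothetical; this is plain bit packing of the A64 move-wide encoding and not the DynamoRIO encoder.

```c
#include <assert.h>
#include <stdint.h>

/* A64 "move wide (immediate)" base encodings, 64-bit variants. */
#define A64_MOVZ_X 0xD2800000u
#define A64_MOVK_X 0xF2800000u

/* Packs one move-wide instruction: imm16 goes in bits [20:5], the
 * halfword index hw (shift = 16 * hw) in bits [22:21], Rd in bits [4:0].
 */
static uint32_t
encode_mov_wide(uint32_t base, unsigned rd, unsigned hw, uint16_t imm16)
{
    assert(rd < 32 && hw < 4);
    return base | ((uint32_t)hw << 21) | ((uint32_t)imm16 << 5) | rd;
}

/* Fills out[0..3] with MOVZ + 3 MOVK that load the 64-bit value into Xrd. */
static void
encode_load_addr64(uint64_t value, unsigned rd, uint32_t out[4])
{
    for (unsigned hw = 0; hw < 4; hw++) {
        uint16_t chunk = (uint16_t)(value >> (16 * hw));
        uint32_t base = (hw == 0) ? A64_MOVZ_X : A64_MOVK_X;
        out[hw] = encode_mov_wide(base, rd, hw, chunk);
    }
}
```

For an instr_t operand the value is only known at encode time, so the shift passed to opnd_create_instr_ex() would select which 16-bit chunk of the final address each instruction receives.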


Greg Cawthorne

Jul 31, 2020, 6:55:26 PM
to Dr. Memory Users
Hi Derek.

I've had a play and gotten movz and movk to go into a custom encode_opnds_mov function.

I have been trying to get the appropriate values put into the immediate bits using the ->note field of the instr_t struct. As it currently stands, at the point of encoding, the note fields of both the parent instr_t and the instr_t in the source operand hold very small addresses (within 8 bits in my limited testing). This must be okay for relative branches, as you just need the difference between any two, but it is clear they haven't had their actual addresses set yet.

I will try to see how the absolute addresses are fetched for ARM32 and whether I can do something similar. If you have any suggestions about this, that would be greatly appreciated :)

Thanks.

Greg Cawthorne

Jul 31, 2020, 8:13:32 PM
to Dr. Memory Users
Actually, I just added the pc passed into encode_opnds_mov to the offset a branch would use, and that seems to have worked (potentially).
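
A minimal sketch of that calculation, assuming the per-instruction offsets behave like the small note values described earlier in the thread. The function and parameter names are hypothetical and do not come from codec.c.

```c
#include <assert.h>
#include <stdint.h>

/* Computes the absolute address of a target instruction from:
 *   encode_pc     - the pc at which the mov is being encoded
 *   src_offset    - the mov's offset within the instruction list
 *   target_offset - the target instruction's offset within the same list
 */
static uint64_t
absolute_target(uint64_t encode_pc, int64_t src_offset, int64_t target_offset)
{
    assert(encode_pc % 4 == 0); /* A64 instructions are 4-byte aligned */
    /* The displacement a relative branch at the mov would use. */
    int64_t disp = target_offset - src_offset;
    /* Adding the encode pc turns it into an absolute address; unsigned
     * arithmetic handles a negative displacement by wrapping modulo 2^64.
     */
    return encode_pc + (uint64_t)disp;
}
```

Each movz or movk in the sequence sits at a different pc and so sees a different displacement, but the sum should come out the same for all of them, which is what lets every instruction take its own 16-bit slice of one address.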

Greg Cawthorne

Aug 1, 2020, 5:38:51 AM
to Dr. Memory Users
The one thing that is a shame is that I am unsure how to determine how many movks will be needed in insert_mov_immed_arch in mangle.c (I'm a bit behind master). So at the moment I always emit the movz + 3 movk when the source operand is an instruction.

Greg Cawthorne

Aug 1, 2020, 9:11:26 AM
to Dr. Memory Users
Could potentially encode a nop if a 0 is detected?
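
A minimal sketch of that idea, assuming the sequence starts with a movz (which zeroes the other halfwords) so that a movk of a zero chunk is redundant. The function name is hypothetical. Substituting a NOP at encode time keeps the instruction count fixed, which matters because the count is chosen before the final address is known.

```c
#include <assert.h>
#include <stdint.h>

#define A64_MOVK_X 0xF2800000u /* MOVK Xd, #imm16, LSL #(16 * hw) */
#define A64_NOP    0xD503201Fu /* NOP */

/* Encodes the movk for halfword hw (1..3) of value, or a NOP when that
 * halfword is zero and the movk would therefore have no effect after a
 * preceding movz.
 */
static uint32_t
encode_movk_or_nop(uint64_t value, unsigned rd, unsigned hw)
{
    assert(rd < 32 && hw >= 1 && hw < 4);
    uint16_t chunk = (uint16_t)(value >> (16 * hw));
    if (chunk == 0)
        return A64_NOP;
    return A64_MOVK_X | ((uint32_t)hw << 21) | ((uint32_t)chunk << 5) | rd;
}
```

For an address such as 0x51c81000 this would turn the two upper movk instructions into NOPs while leaving the sequence length at four.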

Derek Bruening

Aug 3, 2020, 11:09:41 AM
to drmemor...@googlegroups.com
It would be nice to optimize the immediate.  Note that this mailing list is for end users of the Dr. Memory tool, who likely know little about and are perhaps not interested in the implementation internals.  Moving the conversation to the dynamorio-users list might pull out more discussion as the list members there are tool *builders*.
