Incomplete source operand when decoding AArch64


Artem Shcherbak

Feb 15, 2024, 3:39:28 AM
to DynamoRIO Users

Hi, @derekbruening and @AssadHashmi.

We have a question about how DR decodes instructions, in particular in decoder_v80 for AArch64.

We observed that for some instructions, such as “ldr +0x08(%x0) -> %x1”, decoded by the function decode_opndsgen_39800000_003fffff in the file opnd_decode_funcs.h (see below), the only source operand is the memory reference used for the load. The x0 register is not listed as a source operand of its own; it appears only as the base register field of that memory operand.

This causes problems for tools that analyze register read and write dependencies, and for other tools that rely on the source and destination operand lists of such instructions: they miss the base register and produce incorrect data. In our case, we see a register that is written but apparently never read.

Is there a rationale for this decoding behavior, or is it a defect?


opnd_decode_funcs.h: 4555

/* ['x0'] <- ['mem12'] */
static bool
decode_opndsgen_39800000_003fffff(uint enc, dcontext_t *dcontext, byte *pc, instr_t *instr, int opcode)
{
    opnd_t dst0, src0;
    if (!decode_opnd_x0(enc & 0x0000001f, opcode, pc, &dst0) ||
        !decode_opnd_mem12(enc & 0xc03fffe0, opcode, pc, &src0))
        return false;
    instr_set_opcode(instr, opcode);
    instr_set_num_opnds(dcontext, instr, 1, 1);
    instr_set_dst(instr, 0, dst0);
    instr_set_src(instr, 0, src0);
    return true;
}

Artem Shcherbak

Feb 15, 2024, 5:57:57 AM
to DynamoRIO Users

By contrast, for a pre-index instruction such as “ldr    +0x10(%x1)[8byte] %x1 $0x0000000000000010 -> %x0 %x1”, the base register x1 is placed in the list of source operands as a full operand of its own.

See the function decode_opndsgen_38800c00_001ff3ff below.


/* ['x0', 'x5sp'] <- ['mem9', 'x5sp', 'mem9off'] */
static bool
decode_opndsgen_38800c00_001ff3ff(uint enc, dcontext_t *dcontext, byte *pc, instr_t *instr, int opcode)
{
    opnd_t dst0, dst1, src0, src1, src2;
    if (!decode_opnd_x0(enc & 0x0000001f, opcode, pc, &dst0) ||
        !decode_opnd_x5sp(enc & 0x000003e0, opcode, pc, &dst1) ||
        !decode_opnd_mem9(enc & 0xc01ff3e0, opcode, pc, &src0) ||
        !decode_opnd_x5sp(enc & 0x000003e0, opcode, pc, &src1) ||
        !decode_opnd_mem9off(enc & 0x001ff000, opcode, pc, &src2))
        return false;
    instr_set_opcode(instr, opcode);
    instr_set_num_opnds(dcontext, instr, 2, 3);
    instr_set_dst(instr, 0, dst0);
    instr_set_dst(instr, 1, dst1);
    instr_set_src(instr, 0, src0);
    instr_set_src(instr, 1, src1);
    instr_set_src(instr, 2, src2);
    return true;
}

BR,

Artem

On Thursday, February 15, 2024 at 11:39:28 UTC+3, Artem Shcherbak wrote:

Derek Bruening

Feb 15, 2024, 6:31:53 PM
to Artem Shcherbak, DynamoRIO Users
The design of the IR treats a memory reference as one operand even when it uses multiple registers to compute its address, which simplifies some use cases.  Tools interested in all registers inside operands can use interfaces like instr_reads_from_reg(), which as you can see looks inside the operands, or opnd_get_reg_used() to generically walk the registers inside an operand.
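
As a rough illustration of why walking the registers inside each operand recovers the base register, here is a minimal self-contained sketch. It is a toy model, not DynamoRIO code: every toy_* type and function is hypothetical and only mimics the idea of a memory operand that carries its base register internally. In a real client the inner loop would presumably be built on opnd_num_regs_used() and opnd_get_reg_used() over each operand returned by instr_get_src().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model for illustration; these are NOT DynamoRIO types. */
typedef enum { TOY_REG_NULL = 0, TOY_REG_X0, TOY_REG_X1 } toy_reg_t;
typedef enum { TOY_OPND_REG, TOY_OPND_MEM, TOY_OPND_IMM } toy_kind_t;

typedef struct {
    toy_kind_t kind;
    toy_reg_t reg;   /* TOY_OPND_REG: the register itself */
    toy_reg_t base;  /* TOY_OPND_MEM: base register, or TOY_REG_NULL */
    toy_reg_t index; /* TOY_OPND_MEM: index register, or TOY_REG_NULL */
    long disp;       /* TOY_OPND_MEM displacement or TOY_OPND_IMM value */
} toy_opnd_t;

/* Number of registers an operand uses, including those inside a memory reference. */
int
toy_opnd_num_regs_used(const toy_opnd_t *op)
{
    if (op->kind == TOY_OPND_REG)
        return 1;
    if (op->kind == TOY_OPND_MEM)
        return (op->base != TOY_REG_NULL) + (op->index != TOY_REG_NULL);
    return 0;
}

/* The i-th register used by an operand (base first, then index, for memory). */
toy_reg_t
toy_opnd_get_reg_used(const toy_opnd_t *op, int i)
{
    assert(i >= 0 && i < toy_opnd_num_regs_used(op));
    if (op->kind == TOY_OPND_REG)
        return op->reg;
    if (op->base != TOY_REG_NULL && i == 0)
        return op->base;
    return op->index;
}

/* Naive check: only sees registers that are operands in their own right. */
bool
toy_srcs_list_reg(const toy_opnd_t *srcs, size_t n, toy_reg_t reg)
{
    for (size_t i = 0; i < n; i++) {
        if (srcs[i].kind == TOY_OPND_REG && srcs[i].reg == reg)
            return true;
    }
    return false;
}

/* Operand-walking check: also sees registers used for address computation. */
bool
toy_srcs_read_reg(const toy_opnd_t *srcs, size_t n, toy_reg_t reg)
{
    for (size_t i = 0; i < n; i++) {
        int nregs = toy_opnd_num_regs_used(&srcs[i]);
        for (int j = 0; j < nregs; j++) {
            if (toy_opnd_get_reg_used(&srcs[i], j) == reg)
                return true;
        }
    }
    return false;
}
```

For a load modelled like “ldr +0x08(%x0) -> %x1”, whose single source is a memory operand with base x0, toy_srcs_list_reg() does not find x0 while toy_srcs_read_reg() does, which mirrors the difference between scanning top-level operands and looking inside them.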


Artem Shcherbak

Feb 28, 2024, 6:10:18 AM
to DynamoRIO Users
Thank you! We used the function opnd_get_reg_used() and it worked for our tool.

On Friday, February 16, 2024 at 02:31:53 UTC+3, Derek Bruening wrote: