DR Modules and ELF Sections


Mohammad Ewais

Dec 30, 2021, 6:50:48 PM
to DynamoRIO Users
Hi,

I have a bit of a specific question.

I am trying to find the memory locations of .bss, .data, and .rodata for the running binary and for most of the modules it loads (not including the VDSO).

This is usually quite simple to do:
1. I get the module's full path.
2. Use libelf on that file to extract the sh_size and the sh_addr or sh_offset (they are usually the same) of .bss, .data, and .rodata.
3. Using dr_module, find the start and end of the module, then work out which memory location the address corresponds to (also minding the possible gaps in the module).
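
Step 3 can be sketched roughly as below. This is only an illustration under stated assumptions, not DynamoRIO or libelf code: `mapped_segment`, `load_bias`, `section_runtime_addr`, and `range_is_mapped` are hypothetical names, and the segment bounds are assumed to come from whatever the module query reports. It also assumes the section's runtime address is the load bias plus sh_addr, where the load bias is the module start minus the lowest (page-aligned) PT_LOAD p_vaddr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped piece of a module: [start, end). */
typedef struct {
    uint64_t start;
    uint64_t end;
} mapped_segment;

/* Load bias: where the module actually starts in memory minus the lowest
 * page-aligned p_vaddr among its PT_LOAD program headers. For a typical
 * shared library the lowest p_vaddr is 0, so the bias equals the start. */
static uint64_t load_bias(uint64_t module_start, uint64_t min_load_vaddr)
{
    return module_start - min_load_vaddr;
}

/* Runtime address of a section, computed from sh_addr (not sh_offset). */
static uint64_t section_runtime_addr(uint64_t bias, uint64_t sh_addr)
{
    return bias + sh_addr;
}

/* True if [addr, addr + size) lies entirely inside one mapped segment,
 * so a range that lands in a gap between segments is rejected. */
static bool range_is_mapped(const mapped_segment *segs, size_t n,
                            uint64_t addr, uint64_t size)
{
    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (addr >= segs[i].start && addr + size <= segs[i].end)
            return true;
    }
    return false;
}
```

For a non-PIE main executable the same arithmetic applies, but the bias comes out as 0 because sh_addr is already an absolute address.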

Overall, this is pretty simple since, as I said earlier, sh_offset and sh_addr are usually the same. However, I encountered a couple of libraries where they are NOT the same. For example, here is the readelf output for one shared library:

There are 29 section headers, starting at offset 0x192d8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .note.gnu.property NOTE            00000000000002a8 0002a8 000020 00   A  0   0  8
  [ 2] .note.gnu.build-id NOTE            00000000000002c8 0002c8 000024 00   A  0   0  4
  [ 3] .gnu.hash         GNU_HASH        00000000000002f0 0002f0 0004f8 00   A  4   0  8
  [ 4] .dynsym           DYNSYM          00000000000007e8 0007e8 001020 18   A  5   1  8
  [ 5] .dynstr           STRTAB          0000000000001808 001808 0008a8 00   A  0   0  1
  [ 6] .gnu.version      VERSYM          00000000000020b0 0020b0 000158 02   A  4   0  2
  [ 7] .gnu.version_d    VERDEF          0000000000002208 002208 0001c4 00   A  5  13  8
  [ 8] .gnu.version_r    VERNEED         00000000000023d0 0023d0 000030 00   A  5   1  8
  [ 9] .rela.dyn         RELA            0000000000002400 002400 000108 18   A  4   0  8
  [10] .rela.plt         RELA            0000000000002508 002508 000438 18  AI  4  24  8
  [11] .init             PROGBITS        0000000000003000 003000 00001b 00  AX  0   0  4
  [12] .plt              PROGBITS        0000000000003020 003020 0002e0 10  AX  0   0 16
  [13] .plt.got          PROGBITS        0000000000003300 003300 000010 10  AX  0   0 16
  [14] .plt.sec          PROGBITS        0000000000003310 003310 0002d0 10  AX  0   0 16
  [15] .text             PROGBITS        00000000000035e0 0035e0 011355 00  AX  0   0 16
  [16] .fini             PROGBITS        0000000000014938 014938 00000d 00  AX  0   0  4
  [17] .rodata           PROGBITS        0000000000015000 015000 000984 00   A  0   0 32
  [18] .eh_frame_hdr     PROGBITS        0000000000015984 015984 000654 00   A  0   0  4
  [19] .eh_frame         PROGBITS        0000000000015fd8 015fd8 001ec4 00   A  0   0  8
  [20] .init_array       INIT_ARRAY      0000000000019dc0 018dc0 000010 08  WA  0   0  8
  [21] .fini_array       FINI_ARRAY      0000000000019dd0 018dd0 000008 08  WA  0   0  8
  [22] .dynamic          DYNAMIC         0000000000019dd8 018dd8 0001f0 10  WA  5   0  8
  [23] .got              PROGBITS        0000000000019fc8 018fc8 000038 08  WA  0   0  8
  [24] .got.plt          PROGBITS        000000000001a000 019000 000180 08  WA  0   0  8
  [25] .data             PROGBITS        000000000001a180 019180 000010 00  WA  0   0  8
  [26] .bss              NOBITS          000000000001a1a0 019190 0002a8 00  WA  0   0 32
  [27] .gnu_debuglink    PROGBITS        0000000000000000 019190 000034 00      0   0  4
  [28] .shstrtab         STRTAB          0000000000000000 0191c4 000112 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Now this brings up the following question:

It is my understanding that when sh_offset and sh_addr are different, as in the output above, the offset represents the offset in the file, while the address represents the offset in memory when the module is loaded (for example: This question). However, based on that assumption, i.e. by using sh_addr, I found that .data and .bss in the lib above fall outside the module boundaries (again, minding the gaps).
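
To make the difference concrete, here is a small sketch that works through the three sections from the readelf output above. The numbers are copied from that table; `section_row`, `find_row`, `addr_offset_gap`, and `highest_section_end` are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of the readelf section table: Address, Off, Size. */
typedef struct {
    const char *name;
    uint64_t sh_addr;
    uint64_t sh_offset;
    uint64_t sh_size;
} section_row;

/* Values copied from the readelf listing for the shared library. */
static const section_row rows[] = {
    { ".rodata", 0x15000, 0x15000, 0x984 },
    { ".data",   0x1a180, 0x19180, 0x010 },
    { ".bss",    0x1a1a0, 0x19190, 0x2a8 },
};
static const size_t num_rows = sizeof(rows) / sizeof(rows[0]);

/* Look up a row by section name; returns NULL if it is not in the table. */
static const section_row *find_row(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < num_rows; i++) {
        if (strcmp(rows[i].name, name) == 0)
            return &rows[i];
    }
    return NULL;
}

/* How far the in-memory address is from the in-file offset. */
static uint64_t addr_offset_gap(const section_row *s)
{
    return s->sh_addr - s->sh_offset;
}

/* Highest virtual address (exclusive) reached by any listed section. */
static uint64_t highest_section_end(void)
{
    uint64_t end = 0;
    for (size_t i = 0; i < num_rows; i++) {
        uint64_t e = rows[i].sh_addr + rows[i].sh_size;
        if (e > end)
            end = e;
    }
    return end;
}
```

With these values the gap is 0 for .rodata, 0x1000 for .data, and 0x1010 for .bss (.bss is NOBITS, so it takes no bytes in the file), and the listed sections end at virtual address 0x1a448. Assuming the lowest PT_LOAD p_vaddr is 0 (the program headers are not shown here), the mapped module would have to reach at least base + 0x1a448 for .bss to fall inside it.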

Does DR change the way libraries are loaded? Or am I wrong with my assumption? Are things different if the module is the main app vs a shared library? Can someone provide some guidance here? I need to be 100% sure of the locations I get for all three sections as I build up upon them later.



Derek Bruening

Jan 5, 2022, 5:44:08 PM
to dynamor...@googlegroups.com
No, DR does not change how application libraries are loaded: the same dynamic loader code runs as would run without DR and it is that code that loads application shared libraries.  The only impact would be that DR is occupying some address space, so maybe load locations get shifted: but these days with ASLR that is already varying.

DR does map the application itself, but the app's dynamic loader then takes over: this should match what happens without DR as well.

Maybe showing precise values for where the outside-module-boundaries occurs would help to understand the details.


Mohammad Ewais

Jan 7, 2022, 3:14:40 PM
to DynamoRIO Users
Thanks a lot for the info, Derek. As it turns out, I was misinformed. As I discovered, sh_offset is the offset in the file, while sh_addr is the offset in memory. Nothing changes with DR.