Hi,
I have a bit of a specific question.
I am trying to find the memory locations of .bss, .data, and .rodata for the running binary and most of the modules it loads (not including VDSO).
This is usually quite simple to do:
1. I get the module's full path.
2. Use libelf on that file to extract sh_size and sh_addr or sh_offset (they are usually the same) for .bss, .data, and .rodata.
3. Using dr_module, find the start and end of the module, then map the section address to a memory location within it (minding possible gaps in the module).
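A minimal sketch of the address computation in step 3, under the assumption (which the rest of this post questions) that a section's run-time address is its sh_addr plus the module's load bias. The helper name and its parameters are hypothetical and not part of libelf or DynamoRIO: module_start stands for the module's start address, and min_load_vaddr for the page-aligned lowest p_vaddr among the PT_LOAD program headers (often 0 for shared libraries).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: run-time address of an allocated section.
 *   module_start   - address where the module's lowest segment is mapped
 *   min_load_vaddr - page-aligned lowest p_vaddr of the PT_LOAD segments
 *   sh_addr        - the section's sh_addr from the section header
 */
static uint64_t section_runtime_addr(uint64_t module_start,
                                     uint64_t min_load_vaddr,
                                     uint64_t sh_addr)
{
    /* An allocated section cannot start below the first loadable segment. */
    assert(sh_addr >= min_load_vaddr);
    return module_start + (sh_addr - min_load_vaddr);
}
```

For example, with a module mapped at 0x7f0000000000 and min_load_vaddr of 0, a section with sh_addr 0x1a180 would be expected at 0x7f000001a180.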
Overall, this is pretty simple since, as I said, sh_offset and sh_addr are usually the same. However, I encountered a couple of libraries where they are NOT the same. For example (readelf output for one shared library):
There are 29 section headers, starting at offset 0x192d8:
Section Headers:
[Nr] Name               Type       Address          Off    Size   ES Flg Lk Inf Al
[ 0]                    NULL       0000000000000000 000000 000000 00      0   0  0
[ 1] .note.gnu.property NOTE       00000000000002a8 0002a8 000020 00   A  0   0  8
[ 2] .note.gnu.build-id NOTE       00000000000002c8 0002c8 000024 00   A  0   0  4
[ 3] .gnu.hash          GNU_HASH   00000000000002f0 0002f0 0004f8 00   A  4   0  8
[ 4] .dynsym            DYNSYM     00000000000007e8 0007e8 001020 18   A  5   1  8
[ 5] .dynstr            STRTAB     0000000000001808 001808 0008a8 00   A  0   0  1
[ 6] .gnu.version       VERSYM     00000000000020b0 0020b0 000158 02   A  4   0  2
[ 7] .gnu.version_d     VERDEF     0000000000002208 002208 0001c4 00   A  5  13  8
[ 8] .gnu.version_r     VERNEED    00000000000023d0 0023d0 000030 00   A  5   1  8
[ 9] .rela.dyn          RELA       0000000000002400 002400 000108 18   A  4   0  8
[10] .rela.plt          RELA       0000000000002508 002508 000438 18  AI  4  24  8
[11] .init              PROGBITS   0000000000003000 003000 00001b 00  AX  0   0  4
[12] .plt               PROGBITS   0000000000003020 003020 0002e0 10  AX  0   0 16
[13] .plt.got           PROGBITS   0000000000003300 003300 000010 10  AX  0   0 16
[14] .plt.sec           PROGBITS   0000000000003310 003310 0002d0 10  AX  0   0 16
[15] .text              PROGBITS   00000000000035e0 0035e0 011355 00  AX  0   0 16
[16] .fini              PROGBITS   0000000000014938 014938 00000d 00  AX  0   0  4
[17] .rodata            PROGBITS   0000000000015000 015000 000984 00   A  0   0 32
[18] .eh_frame_hdr      PROGBITS   0000000000015984 015984 000654 00   A  0   0  4
[19] .eh_frame          PROGBITS   0000000000015fd8 015fd8 001ec4 00   A  0   0  8
[20] .init_array        INIT_ARRAY 0000000000019dc0 018dc0 000010 08  WA  0   0  8
[21] .fini_array        FINI_ARRAY 0000000000019dd0 018dd0 000008 08  WA  0   0  8
[22] .dynamic           DYNAMIC    0000000000019dd8 018dd8 0001f0 10  WA  5   0  8
[23] .got               PROGBITS   0000000000019fc8 018fc8 000038 08  WA  0   0  8
[24] .got.plt           PROGBITS   000000000001a000 019000 000180 08  WA  0   0  8
[25] .data              PROGBITS   000000000001a180 019180 000010 00  WA  0   0  8
[26] .bss               NOBITS     000000000001a1a0 019190 0002a8 00  WA  0   0 32
[27] .gnu_debuglink     PROGBITS   0000000000000000 019190 000034 00      0   0  4
[28] .shstrtab          STRTAB     0000000000000000 0191c4 000112 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
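To make the mismatch concrete, here is a small sketch that works through the arithmetic for the three sections of interest, using values copied from the listing above. The struct and function names are made up for illustration. Note that .bss is NOBITS, so it occupies no bytes in the file and its sh_offset says little about where its contents live.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative record for one section header (names are hypothetical). */
typedef struct {
    const char *name;
    uint64_t sh_addr;
    uint64_t sh_offset;
    uint64_t sh_size;
} sec_t;

/* Values copied from the readelf listing. */
static const sec_t secs[] = {
    { ".rodata", 0x15000, 0x15000, 0x984 },
    { ".data",   0x1a180, 0x19180, 0x010 },
    { ".bss",    0x1a1a0, 0x19190, 0x2a8 },
};

/* Difference between the virtual address and the file offset. */
static uint64_t addr_minus_offset(const sec_t *s)
{
    assert(s->sh_addr >= s->sh_offset);
    return s->sh_addr - s->sh_offset;
}

/* First virtual address past the end of the section. */
static uint64_t end_vaddr(const sec_t *s)
{
    return s->sh_addr + s->sh_size;
}
```

With these values the difference is 0 for .rodata, 0x1000 for .data, and 0x1010 for .bss, and .bss ends at virtual address 0x1a448 relative to the load base.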
Now this brings up the following question:
It is my understanding that when sh_offset and sh_addr differ, as in the output above, sh_offset is the section's offset in the file, while sh_addr is its offset in memory once the module is loaded (see, for example, this question). However, based on that assumption, i.e. by using sh_addr, I found that .data and .bss in the lib above fall outside the module boundaries (again, minding the gaps).
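The boundary check described here can be sketched as follows, assuming the module is represented as a list of mapped [start, end) ranges so that gaps are respected. The seg_range_t type and range_in_segments function are hypothetical stand-ins, not DynamoRIO types; how the ranges are obtained from DR is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped segment of a module: [start, end). */
typedef struct {
    uint64_t start;
    uint64_t end;
} seg_range_t;

/* Returns true if [addr, addr + size) lies entirely inside one segment,
 * so a range that starts in a segment but runs into a gap is rejected. */
static bool range_in_segments(const seg_range_t *segs, size_t nsegs,
                              uint64_t addr, uint64_t size)
{
    assert(segs != NULL || nsegs == 0);
    for (size_t i = 0; i < nsegs; i++) {
        /* Compare size against the remaining room to avoid overflow. */
        if (addr >= segs[i].start && addr <= segs[i].end &&
            size <= segs[i].end - addr)
            return true;
    }
    return false;
}
```

A section passes only if its whole extent fits in a single mapped range; an address that lands in a gap, or a section that straddles the end of a range, is reported as outside the module.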
Does DR change the way libraries are loaded? Or is my assumption wrong? Are things different if the module is the main app vs. a shared library? Can someone provide some guidance here? I need to be 100% sure of the locations I get for all three sections, since I build on them later.