Hello all
Quite new to OSv, just started playing around with OSv recently and I came across this. I had a problem where when I do `scripts/run.py -V -c 1` on a few C++ test binaries, I was getting errors and eventually a failure like so:
```
could not load f(YYDX\XA^A^(f.f.fE
ignoring missing symbol
ignoring missing symbol tA
...
failed looking up symbol UHH!33333333H)HH!!HHHH!HD
```
which seemed odd since it looks like it was getting junk values for symbol names. On debugging a bit, I found that the .strtab section of the failing library was somehow getting overwritten from some arbitrary address. On inspecting the memory, I could see an ELF header there which made me think that the libraries are getting loaded at overlapping memory regions - which eventually led me to debugging the load part - specifically the elf::
object::set_base function where it calculates the _base and _end. Here is what I found in from gdb for the problematic library:
(gdb) info locals
p = {p_type = 1, p_flags = 5, p_offset = 0, p_vaddr = 0, p_paddr = 0, p_filesz = 110200, p_memsz = 110200, p_align = 4096}
q = {p_type = 6, p_flags = 4, p_offset = 598016, p_vaddr = 122880, p_paddr = 122880, p_filesz = 448, p_memsz = 448, p_align = 8}
(gdb) p _ phdrs
$8 = std::vector of length 8, capacity 8 = {
{p_type = 6, p_flags = 4, p_offset = 598016, p_vaddr = 122880, p_paddr = 122880, p_filesz = 448, p_memsz = 448, p_align = 8},
{p_type = 1, p_flags = 5, p_offset = 0, p_vaddr = 0, p_paddr = 0, p_filesz = 110200, p_memsz = 110200, p_align = 4096},
{p_type = 1685382481, p_flags = 6, p_offset = 0, p_vaddr = 0, p_paddr = 0, p_filesz = 0, p_memsz = 0, p_align = 16},
{p_type = 1685382480, p_flags = 4, p_offset = 107228, p_vaddr = 107228, p_paddr = 107228, p_filesz = 2972, p_memsz = 2972, p_align = 4},
{p_type = 1, p_flags = 6, p_offset = 112560, p_vaddr = 116656, p_paddr = 116656, p_filesz = 3736, p_memsz = 4073, p_align = 4096},
{p_type = 1685382482, p_flags = 6, p_offset = 112560, p_vaddr = 116656, p_paddr = 116656, p_filesz = 2128, p_memsz = 2128, p_align = 16},
{p_type = 2, p_flags = 6, p_offset = 113648, p_vaddr = 117744, p_paddr = 117744, p_filesz = 592, p_memsz = 592, p_align = 8},
{p_type = 1, p_flags = 6, p_offset = 598016, p_vaddr = 122880, p_paddr = 122880, p_filesz = 22584, p_memsz = 22584, p_align = 4096}
}
(gdb)
As I understand, 'q' should be the PT_LOAD header with the highest p_vaddr but note that it is pointing to the PT_PHDR section - as both of them has the same p_vaddr but the PT_PHDR obviously comes first in the program headers and std:min_element starts comparison starting the first element. That causes it to set the next offset to a lower address and seems to be the cause of the issue. I can get past it with this patch:
diff --git a/core/elf.cc b/core/elf.cc
index ffb16004..af5dca02 100644
--- a/core/elf.cc
+++ b/core/elf.cc
@@ -354,7 +354,7 @@ void object::set_base(void* base)
auto q = std::min_element(_phdrs.begin(), _phdrs.end(),
[](Elf64_Phdr a, Elf64_Phdr b)
{ return a.p_type == PT_LOAD
- && a.p_vaddr > b.p_vaddr; });
+ && a.p_vaddr >= b.p_vaddr; });
I am compiling the test binaries and OSv with gcc-7.3.0. Is that a real bug or am I missing something very obvious?
Thanks
Deepak