Libraries overlap when loaded into OSv

33 views
Skip to first unread message

Deepak Krishnan

unread,
Jul 4, 2020, 10:50:21 AM7/4/20
to OSv Development
Hello all
Quite new to OSv, just started playing around with OSv recently and I came across this. I had a problem where when I do `scripts/run.py -V -c 1` on a few C++ test binaries, I was getting errors and eventually a failure like so:

```
could not load f(YYDX\XA^A^(f.f.fE
ignoring missing symbol
ignoring missing symbol tA
...
failed looking up symbol UHH!33333333H)HH!!HHHH!HD
```

which seemed odd since it looks like it was getting junk values for symbol names. On debugging a bit, I found that the .strtab section of the failing library was somehow getting overwritten from some arbitrary address. On inspecting the memory, I could see an ELF header there which made me think that the libraries are getting loaded at overlapping memory regions - which eventually led me to debugging the load part - specifically the elf::object::set_base function where it calculates the _base and _end. Here is what I found in from gdb for the problematic library:

(gdb) info locals
p = {p_type = 1, p_flags = 5, p_offset = 0, p_vaddr = 0, p_paddr = 0, p_filesz = 110200, p_memsz = 110200, p_align = 4096}
q = {p_type = 6, p_flags = 4, p_offset = 598016, p_vaddr = 122880, p_paddr = 122880, p_filesz = 448, p_memsz = 448, p_align = 8}
(gdb) p _ phdrs
$8 = std::vector of length 8, capacity 8 = {
{p_type = 6, p_flags = 4, p_offset = 598016, p_vaddr = 122880, p_paddr = 122880, p_filesz = 448, p_memsz = 448, p_align = 8},
{p_type = 1, p_flags = 5, p_offset = 0, p_vaddr = 0, p_paddr = 0, p_filesz = 110200, p_memsz = 110200, p_align = 4096},
{p_type = 1685382481, p_flags = 6, p_offset = 0, p_vaddr = 0, p_paddr = 0, p_filesz = 0, p_memsz = 0, p_align = 16},
{p_type = 1685382480, p_flags = 4, p_offset = 107228, p_vaddr = 107228, p_paddr = 107228, p_filesz = 2972, p_memsz = 2972, p_align = 4},
{p_type = 1, p_flags = 6, p_offset = 112560, p_vaddr = 116656, p_paddr = 116656, p_filesz = 3736, p_memsz = 4073, p_align = 4096},
{p_type = 1685382482, p_flags = 6, p_offset = 112560, p_vaddr = 116656, p_paddr = 116656, p_filesz = 2128, p_memsz = 2128, p_align = 16},
{p_type = 2, p_flags = 6, p_offset = 113648, p_vaddr = 117744, p_paddr = 117744, p_filesz = 592, p_memsz = 592, p_align = 8},
{p_type = 1, p_flags = 6, p_offset = 598016, p_vaddr = 122880, p_paddr = 122880, p_filesz = 22584, p_memsz = 22584, p_align = 4096}
}
(gdb)

As I understand, 'q' should be the PT_LOAD header with the highest p_vaddr but note that it is pointing to the PT_PHDR section - as both of them has the same p_vaddr but the PT_PHDR obviously comes first in the program headers and std:min_element starts comparison starting the first element. That causes it to set the next offset to a lower address and seems to be the cause of the issue. I can get past it with this patch:

diff --git a/core/elf.cc b/core/elf.cc
index ffb16004..af5dca02 100644
--- a/core/elf.cc
+++ b/core/elf.cc
@@ -354,7 +354,7 @@ void object::set_base(void* base)
     auto q = std::min_element(_phdrs.begin(), _phdrs.end(),
                               [](Elf64_Phdr a, Elf64_Phdr b)
                                   { return a.p_type == PT_LOAD
-                                        && a.p_vaddr > b.p_vaddr; });
+                                        && a.p_vaddr >= b.p_vaddr; });

I am compiling the test binaries and OSv with gcc-7.3.0. Is that a real bug or am I missing something very obvious?

Thanks
Deepak

Waldek Kozaczuk

unread,
Jul 6, 2020, 3:01:03 PM7/6/20
to OSv Development
Hi,
It seems that you indeed have found an old bug which seems to cause a harm when PT_LOAD and non-PT_LOAD headers share the same p_vaddr maybe. Thanks for reporting it.

I am also not 100% sure what the exact intention behind what p and q should be but the fact that q ends up being a non-PT_LOAD headers seems to prove there is a bug. I think the issue lies in the fact that we are filtering and comparing at the same time. And while your patch might fix your specific case it may not be 100% correct either. I think we should separate filtering from comparing. My proposal is this:

diff --git a/core/elf.cc b/core/elf.cc
index ffb16004
..61537756 100644
--- a/core/elf.cc
+++ b/core/elf.cc
@@ -347,14 +347,17 @@ static bool intersects_with_kernel(Elf64_Addr elf_addr)
 
 
void object::set_base(void* base)
 
{
-    auto p = std::min_element(_phdrs.begin(), _phdrs.end(),
+    std::vector<Elf64_Phdr> pt_load_headers;
+    std::copy_if(_phdrs.begin(), _phdrs.end(), std::back_inserter(pt_load_headers),
+                 [](Elf64_Phdr p)
+                { return p.p_type == PT_LOAD; });
+
+    auto p = std::min_element(pt_load_headers.begin(), pt_load_headers.end(),
                               
[](Elf64_Phdr a, Elf64_Phdr b)
-                                  { return a.p_type == PT_LOAD
-                                        && a.p_vaddr < b.p_vaddr; });
-    auto q = std::min_element(_phdrs.begin(), _phdrs.end(),
+                                  { return a.p_vaddr < b.p_vaddr; });
+    auto q = std::max_element(pt_load_headers.begin(), pt_load_headers.end(),
                               
[](Elf64_Phdr a, Elf64_Phdr b)
-                                  { return a.p_type == PT_LOAD
-                                        && a.p_vaddr > b.p_vaddr; });
+                                  { return a.p_vaddr < b.p_vaddr; });
 
     
if (!is_core() && is_non_pie_executable()) {
         
// Verify non-PIE executable does not collide with the kernel

Does it fix you case?

I wonder what others think about. I also wonder if we should add a member variable or a helper method that will give us pt_load_headers as we filter PT_LOAD segments in other places in core/elf.cc

Waldek


Thanks
Deepak

Waldek Kozaczuk

unread,
Jul 7, 2020, 2:48:41 PM7/7/20
to Deepak Krishnan, OSv Development

On Tue, Jul 7, 2020 at 2:44 PM Deepak Krishnan <deep...@gmail.com> wrote:
Thank you for the response.
I agree - whatever I had in my email would not cover all cases here - was just a quick and dirty fix I tried while debugging.
Since that code hasn't changed in years, I ended up second guessing - if I was missing something very obvious in my understanding of it and eventually decided to report it.
The patch you suggested should fix it for all cases - I tested it too, it fixes my case.

Thanks. I will send the proper patch soon. 
 
Thanks
Deepak

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/6035c111-a678-4a3f-b844-725c425e3bf9o%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages