[PATCH] memory: always allocate buf using "malloc" for non reactor

49 views
Skip to first unread message

Kefu Chai

<tchaikov@gmail.com>
unread,
Oct 4, 2021, 8:39:11 AM10/4/21
to seastar-dev@googlegroups.com, Kefu Chai
before this change, there is chance that Seastar reallocate a memory
buffer allocated by the c runtime even before "original_malloc_func"
is initialized. the runtime might want to allocate memory for some TLS
variables or just for itself, like env variables.

but we might want to call functions like setenv() from a reactor thread.
and setenv() could reallocates the memory buffer previously allocated by
the "main" thread. in this case, cpu_pages::shrink() panics if it is
asked to resize a memory chunk allocated by another reactor thread.

in this change, if allocate() is called in a non-reactor thread, the
original malloc() is always used. so it's not allocated in the Seastar
memory pool, hence the assertion failure can be avoided.

Signed-off-by: Kefu Chai <tcha...@gmail.com>
---
src/core/memory.cc | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/core/memory.cc b/src/core/memory.cc
index e9e828cc..6ddfbd06 100644
--- a/src/core/memory.cc
+++ b/src/core/memory.cc
@@ -1365,13 +1365,12 @@ static constexpr int debug_allocation_pattern = 0xab;

void* allocate(size_t size) {
if (!is_reactor_thread) {
- if (original_malloc_func) {
- alloc_stats::increment(alloc_stats::types::foreign_mallocs);
- return original_malloc_func(size);
+ malloc_func_type malloc_func = original_malloc_func;
+ if (!malloc_func) {
+ malloc_func = reinterpret_cast<malloc_func_type>(dlsym(RTLD_NEXT, "malloc"));
}
- // original_malloc_func might be null for allocations before main
- // in constructors before original_malloc_func ctor is called
- init_cpu_mem();
+ alloc_stats::increment(alloc_stats::types::foreign_mallocs);
+ return malloc_func(size);
}
if (size <= sizeof(free_object)) {
size = sizeof(free_object);
--
2.33.0

tcha...@gmail.com

<tchaikov@gmail.com>
unread,
Oct 4, 2021, 8:40:28 AM10/4/21
to seastar-dev
hi folks,

i came across an issue while testing Seastar master using 

./test.py --mode release --smp 1 --name Seastar.unit.file_utils

when debugging build/release/tests/unit/file_utils_test, i have following backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007ffff4508536 in __GI_abort () at abort.c:79
#2  0x00007ffff450841f in __assert_fail_base (fmt=0x7ffff466f6b0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x555556efa503 "obj_cpu == cpu_id",
    file=0x555556efa6e0 "/var/ssd/ceph/src/seastar/src/core/memory.cc", line=964, function=<optimized out>) at assert.c:92
#3  0x00007ffff45177f2 in __GI___assert_fail (assertion=assertion@entry=0x555556efa503 "obj_cpu == cpu_id", file=file@entry=0x555556efa6e0 "/var/ssd/ceph/src/seastar/src/core/memory.cc",
    line=line@entry=964, function=function@entry=0x555556efa8b0 "void seastar::memory::cpu_pages::shrink(void*, size_t)") at assert.c:101
#4  0x0000555556ba500d in seastar::memory::cpu_pages::shrink (this=<optimized out>, ptr=ptr@entry=0x6000000c3c00, new_size=new_size@entry=600) at /var/ssd/ceph/src/seastar/src/core/memory.cc:964
#5  0x0000555556ba7638 in seastar::memory::shrink (new_size=600, obj=0x6000000c3c00) at /var/ssd/ceph/src/seastar/src/core/memory.cc:1359
#6  realloc (ptr=0x6000000c3c00, size=600) at /var/ssd/ceph/src/seastar/src/core/memory.cc:1901
#7  0x00007ffff4520ea4 in __add_to_environ (name=name@entry=0x555556ee7dfd "TMPDIR", value=value@entry=0x555556ee7deb "/tmp/non-existing-TMPDIR", combined=combined@entry=0x0, replace=replace@entry=1)
    at setenv.c:154
#8  0x00007ffff45211cc in __setenv (name=name@entry=0x555556ee7dfd "TMPDIR", value=value@entry=0x555556ee7deb "/tmp/non-existing-TMPDIR", replace=replace@entry=1) at setenv.c:259
#9  0x0000555556b5f64d in test_non_existing_TMPDIR::do_run_test_case (this=<optimized out>) at /var/ssd/ceph/src/seastar/tests/unit/file_utils_test.cc:82
#10 0x0000555556b6f0f4 in test_non_existing_TMPDIR::run_test_case()::{lambda()#1}::operator()() const (__closure=<optimized out>) at /var/ssd/ceph/src/seastar/tests/unit/file_utils_test.cc:80
#11 std::__invoke_impl<void, test_non_existing_TMPDIR::run_test_case()::{lambda()#1}>(std::__invoke_other, test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&) (__f=...)
    at /usr/include/c++/10/bits/invoke.h:60
#12 std::__invoke<test_non_existing_TMPDIR::run_test_case()::{lambda()#1}>(test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&) (__fn=...) at /usr/include/c++/10/bits/invoke.h:95
#13 std::__apply_impl<test_non_existing_TMPDIR::run_test_case()::{lambda()#1}, std::tuple<>>(test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&, std::tuple<>&&, std::integer_sequence<unsigned long>) (__t=..., __f=...) at /usr/include/c++/10/tuple:1727
#14 std::apply<test_non_existing_TMPDIR::run_test_case()::{lambda()#1}, std::tuple<> >(test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&, std::tuple<>&&) (__t=..., __f=...)
    at /usr/include/c++/10/tuple:1738
#15 seastar::futurize<void>::apply<test_non_existing_TMPDIR::run_test_case()::{lambda()#1}>(test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&, std::tuple<>&&) (func=..., func=..., args=...)
    at /var/ssd/ceph/src/seastar/include/seastar/core/future.hh:2099
#16 seastar::async<test_non_existing_TMPDIR::run_test_case()::{lambda()#1}>(seastar::thread_attributes, test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&)::{lambda()#1}::operator()() const (
    this=0x6010000ddb20) at /var/ssd/ceph/src/seastar/include/seastar/core/thread.hh:258
#17 seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::async<test_non_existing_TMPDIR::run_test_case()::{lambda()#1}>(seastar::thread_attributes, test_non_existing_TMPDIR::run_test_case()::{lambda()#1}&&)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) (func=0x6010000ddb20) at /var/ssd/ceph/src/seastar/include/seastar/util/noncopyable_function.hh:124
#18 0x000055555574a07e in seastar::noncopyable_function<void ()>::operator()() const (this=0x6010000ddb20) at /var/ssd/ceph/src/seastar/include/seastar/util/noncopyable_function.hh:209
#19 seastar::thread_context::main (this=0x6010000ddb00) at /var/ssd/ceph/src/seastar/src/core/thread.cc:299

turns out we are trying to realloc a memory buffer previously allocated by another core. i managed to print out `obj_cpu` and `this->cpu_id`. they are 0 and 1 respectively. So, Seastar is trying to resize the memory buffer allocated on core#0 on core#1. and the interesting thing is that it is resizing the internal buffer (i.e. last_environ) used by setenv(). see https://github.com/lattera/glibc/blob/master/stdlib/setenv.c#L154

i think the buffer used by setenv should not be managed a reactor thread. but what happened was that the Seastar allocator did kick in when libc was allocating memory buffer for env variables. i verified my guess by setting a break point in seastar::memory::allocate(). and Avi's comment of

// original_malloc_func might be null for allocations before main
// in constructors before original_malloc_func ctor is called

also convinced me that this behavior is expected. but the thing is, test_non_existing_TMPDIR is performed on another thread (cpu#1) which is *not* the one (i.e. cpu#0) where last_environ was allocated. this fails the assert() in cpu_pages::shrink().

i am wondering if we can just use dlsym()'ed malloc as a fallback when original_malloc_func is not set yet in the "if (!is_reactor_thread)" branch.

Avi Kivity

<avi@scylladb.com>
unread,
Oct 4, 2021, 9:43:17 AM10/4/21
to tcha...@gmail.com, seastar-dev

Unfortunately, I don't see a good way to work around it. In Seastar we insist on not using globals, but libraries are free to use globals. Most don't, but some do. In many cases there's a reentrant alternative (_r suffix) that should be used anyway.



and Avi's comment of

// original_malloc_func might be null for allocations before main
// in constructors before original_malloc_func ctor is called

also convinced me that this behavior is expected. but the thing is, test_non_existing_TMPDIR is performed on another thread (cpu#1) which is *not* the one (i.e. cpu#0) where last_environ was allocated. this fails the assert() in cpu_pages::shrink().

i am wondering if we can just use dlsym()'ed malloc as a fallback when original_malloc_func is not set yet in the "if (!is_reactor_thread)" branch.


I think that ended with an endless loop because dlsym() also uses malloc. Maybe it can be resolved by not overriding __libc_malloc() and friends - perhaps dlsym() uses __libc_malloc() and not malloc().


--
You received this message because you are subscribed to the Google Groups "seastar-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to seastar-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/seastar-dev/3e0888ef-dd0b-42f1-849f-11cf4dfd8f5cn%40googlegroups.com.

Avi Kivity

<avi@scylladb.com>
unread,
Oct 4, 2021, 9:50:33 AM10/4/21
to Kefu Chai, seastar-dev@googlegroups.com
Note there is a similar issue in allocate_aligned().


As I mentioned in another place, I remember a dependency loop in this
area, where dlsym() wanted to call malloc() too. Maybe a glibc change
means dlsym() no longer calls malloc and we avoid the loop, or maybe I
misremember. If it's a glibc change, then this fix will cause a
regression on older glibc. Otherwise, this is a good fix, since we don't
want to use the Seastar allocator for non-reactor threads.

Avi Kivity

<avi@scylladb.com>
unread,
Oct 4, 2021, 9:56:46 AM10/4/21
to tcha...@gmail.com, seastar-dev


On 04/10/2021 16.50, tcha...@gmail.com wrote:


Avi, i tested my patch it survived the tests. so probably dlsym() does not use malloc? see https://github.com/lattera/glibc/blob/master/dlfcn/dlsym.c .
 


Please test with Fedora 32, which was the latest in the timeframe the alien allocator was committed. If the change works there then it probably won't cause a regression. Also make sure to update allocate_aligned to be sure it isn't the one that calls init_cpu_mem().

kefu chai

<tchaikov@gmail.com>
unread,
Oct 4, 2021, 10:21:03 AM10/4/21
to Avi Kivity, seastar-dev
just tested this change on centos7. the test ends up with following backtrace

#0 0x000000000143da74 in seastar::memory::allocate (size=32) at
../../src/core/memory.cc:1367
#1 0x000000000143de1d in malloc (n=<optimized out>) at
../../src/core/memory.cc:1855
#2 calloc (nmemb=<optimized out>, size=<optimized out>) at
../../src/core/memory.cc:1855
#3 0x00007ffff79c5a15 in _dlerror_run () from /lib64/libdl.so.2
#4 0x00007ffff79c53c3 in dlsym () from /lib64/libdl.so.2
#5 0x000000000143daf9 in seastar::memory::allocate (size=32) at
../../src/core/memory.cc:1370
#6 0x000000000143de1d in malloc (n=<optimized out>) at
../../src/core/memory.cc:1855
#7 calloc (nmemb=<optimized out>, size=<optimized out>) at
../../src/core/memory.cc:1855
#8 0x00007ffff79c5a15 in _dlerror_run () from /lib64/libdl.so.2
#9 0x00007ffff79c53c3 in dlsym () from /lib64/libdl.so.2
#10 0x000000000143daf9 in seastar::memory::allocate (size=32) at
../../src/core/memory.cc:1370
#11 0x000000000143de1d in malloc (n=<optimized out>) at
../../src/core/memory.cc:1855
#12 calloc (nmemb=<optimized out>, size=<optimized out>) at
../../src/core/memory.cc:1855
#13 0x00007ffff79c5a15 in _dlerror_run () from /lib64/libdl.so.2
#14 0x00007ffff79c53c3 in dlsym () from /lib64/libdl.so.2
#15 0x000000000143daf9 in seastar::memory::allocate (size=32) at
../../src/core/memory.cc:1370
...

and https://elixir.bootlin.com/glibc/latest/source/dlfcn/dlerror.c#L168
seems to echo this failure.
https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=dlfcn/dlerror.c;hb=HEAD
has the same malloc call.

but the glibc shipped by debian unstable uses a different
implementation: it uses a static buffer. see
https://sources.debian.org/src/glibc/2.31-13/dlfcn/dlerror.c/#L76.
guess that's why the fix survived the test on my debian box.

in short, this change does introduce a regression when Seastar is
linked against some glibc versions. =(

> want to use the Seastar allocator for non-reactor threads.
>
>
>
>
> > }
> > if (size <= sizeof(free_object)) {
> > size = sizeof(free_object);



--
Regards
Kefu Chai

Avi Kivity

<avi@scylladb.com>
unread,
Oct 4, 2021, 10:29:16 AM10/4/21
to kefu chai, seastar-dev
How about this:


1. don't override __libc_malloc and friends

2. instead of using dlsym(RTLD_NEXT) for original_malloc_func, use
__libc_malloc.

Kefu Chai

<tchaikov@gmail.com>
unread,
Oct 4, 2021, 11:52:58 AM10/4/21
to seastar-dev@googlegroups.com, Kefu Chai
before this change, there is chance that Seastar reallocate a memory
buffer allocated by the c runtime even before "original_malloc_func"
is initialized. the runtime might want to allocate memory for some TLS
variables or just for itself, like env variables.

but we might want to call functions like setenv() from a reactor thread.
and setenv() could reallocates the memory buffer previously allocated by
the "main" thread. in this case, cpu_pages::shrink() panics if it is
asked to resize a memory chunk allocated by another reactor thread.

in this change, __libc_malloc and friends are not overridden anymore.
instead of using dlsym(RTLD_NEXT) for original_malloc_func, use
__libc_malloc.

the same applies to other malloc related functions.

because musl libc does not define __malloc_trim() yet, see
https://git.musl-libc.org/cgit/musl/tree/src/include/stdlib.h .
the malloc_trim() implementation calls __malloc_trim() only if __GLIBC__
is defined.

the solution was proposed by Avi Kivity <a...@scylladb.com>

Signed-off-by: Kefu Chai <tcha...@gmail.com>
---
src/core/memory.cc | 109 +++++----------------------------------------
1 file changed, 12 insertions(+), 97 deletions(-)

diff --git a/src/core/memory.cc b/src/core/memory.cc
index e9e828cc..4da69eef 100644
--- a/src/core/memory.cc
+++ b/src/core/memory.cc
@@ -240,22 +240,6 @@ static uint64_t get(types stat_type)

}

-// original memory allocator support
-// note: allocations before calling the constructor would use seastar allocator
-using malloc_func_type = void * (*)(size_t);
-using free_func_type = void * (*)(void *);
-using realloc_func_type = void * (*)(void *, size_t);
-using aligned_alloc_type = void * (*)(size_t alignment, size_t size);
-using malloc_trim_type = int (*)(size_t);
-using malloc_usable_size_type = size_t (*)(void *);
-
-malloc_func_type original_malloc_func = reinterpret_cast<malloc_func_type>(dlsym(RTLD_NEXT, "malloc"));
-free_func_type original_free_func = reinterpret_cast<free_func_type>(dlsym(RTLD_NEXT, "free"));
-realloc_func_type original_realloc_func = reinterpret_cast<realloc_func_type>(dlsym(RTLD_NEXT, "realloc"));
-aligned_alloc_type original_aligned_alloc_func = reinterpret_cast<aligned_alloc_type>(dlsym(RTLD_NEXT, "aligned_alloc"));
-malloc_trim_type original_malloc_trim_func = reinterpret_cast<malloc_trim_type>(dlsym(RTLD_NEXT, "malloc_trim"));
-malloc_usable_size_type original_malloc_usable_size_func = reinterpret_cast<malloc_usable_size_type>(dlsym(RTLD_NEXT, "malloc_usable_size"));
-
using allocate_system_memory_fn
= std::function<mmap_area (void* where, size_t how_much)>;

@@ -952,7 +936,7 @@ cpu_pages::try_foreign_free(void* ptr) {
} else {
alloc_stats::increment(alloc_stats::types::foreign_frees);
}
- original_free_func(ptr);
+ __libc_free(ptr);
return true;
}
free_cross_cpu(object_cpu_id(ptr), ptr);
@@ -1365,13 +1349,8 @@ static constexpr int debug_allocation_pattern = 0xab;

void* allocate(size_t size) {
if (!is_reactor_thread) {
- if (original_malloc_func) {
- alloc_stats::increment(alloc_stats::types::foreign_mallocs);
- return original_malloc_func(size);
- }
- // original_malloc_func might be null for allocations before main
- // in constructors before original_malloc_func ctor is called
- init_cpu_mem();
+ alloc_stats::increment(alloc_stats::types::foreign_mallocs);
+ return __libc_malloc(size);
}
if (size <= sizeof(free_object)) {
size = sizeof(free_object);
@@ -1396,13 +1375,8 @@ void* allocate(size_t size) {

void* allocate_aligned(size_t align, size_t size) {
if (!is_reactor_thread) {
- if (original_aligned_alloc_func) {
- alloc_stats::increment(alloc_stats::types::foreign_mallocs);
- return original_aligned_alloc_func(align, size);
- }
- // original_realloc_func might be null for allocations before main
- // in constructors before original_realloc_func ctor is called
- init_cpu_mem();
+ alloc_stats::increment(alloc_stats::types::foreign_mallocs);
+ return __libc_posix_memalign(align, size);
}
if (size <= sizeof(free_object)) {
size = std::max(sizeof(free_object), align);
@@ -1817,16 +1791,6 @@ void* malloc(size_t n) throw () {
return allocate(n);
}

-extern "C"
-[[gnu::alias("malloc")]]
-[[gnu::visibility("default")]]
-[[gnu::malloc]]
-[[gnu::alloc_size(1)]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void* __libc_malloc(size_t n) throw ();
-
extern "C"
[[gnu::visibility("default")]]
[[gnu::used]]
@@ -1836,14 +1800,6 @@ void free(void* ptr) {
}
}

-extern "C"
-[[gnu::alias("free")]]
-[[gnu::visibility("default")]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void __libc_free(void* obj) throw ();
-
extern "C"
[[gnu::visibility("default")]]
void* calloc(size_t nmemb, size_t size) {
@@ -1860,16 +1816,6 @@ void* calloc(size_t nmemb, size_t size) {
return p;
}

-extern "C"
-[[gnu::alias("calloc")]]
-[[gnu::visibility("default")]]
-[[gnu::alloc_size(1, 2)]]
-[[gnu::malloc]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void* __libc_calloc(size_t n, size_t m) throw ();
-
extern "C"
[[gnu::visibility("default")]]
void* realloc(void* ptr, size_t size) {
@@ -1881,10 +1827,7 @@ void* realloc(void* ptr, size_t size) {
if (is_reactor_thread) {
abort();
}
- // original_realloc_func might be null when previous ctor allocates
- if (original_realloc_func) {
- return original_realloc_func(ptr, size);
- }
+ return __libc_realloc(ptr, size);
}
// if we're here, it's either ptr is a seastar memory ptr
// or a nullptr, or, original functions aren't available
@@ -1912,15 +1855,6 @@ void* realloc(void* ptr, size_t size) {
return nptr;
}

-extern "C"
-[[gnu::alias("realloc")]]
-[[gnu::visibility("default")]]
-[[gnu::alloc_size(2)]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void* __libc_realloc(void* obj, size_t size) throw ();
-
extern "C"
[[gnu::visibility("default")]]
[[gnu::used]]
@@ -1939,15 +1873,6 @@ int posix_memalign(void** ptr, size_t align, size_t size) throw () {
return 0;
}

-extern "C"
-[[gnu::alias("posix_memalign")]]
-[[gnu::visibility("default")]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-[[gnu::nonnull(1)]]
-int __libc_posix_memalign(void** ptr, size_t align, size_t size) throw ();
-
extern "C"
[[gnu::visibility("default")]]
[[gnu::malloc]]
@@ -1971,31 +1896,17 @@ void *aligned_alloc(size_t align, size_t size) throw () {
return allocate_aligned(align, size);
}

-extern "C"
-[[gnu::alias("memalign")]]
-[[gnu::visibility("default")]]
-[[gnu::malloc]]
-#if defined(__GLIBC__) && __GLIBC_PREREQ(2, 30)
-[[gnu::alloc_size(2)]]
-#endif
-void* __libc_memalign(size_t align, size_t size) throw ();
-
extern "C"
[[gnu::visibility("default")]]
void cfree(void* obj) throw () {
return ::free(obj);
}

-extern "C"
-[[gnu::alias("cfree")]]
-[[gnu::visibility("default")]]
-void __libc_cfree(void* obj) throw ();
-
extern "C"
[[gnu::visibility("default")]]
size_t malloc_usable_size(void* obj) {
if (!is_seastar_memory(obj)) {
- return original_malloc_usable_size_func(obj);
+ return __malloc_usable_size(obj);
}
return object_size(obj);
}
@@ -2004,7 +1915,11 @@ extern "C"
[[gnu::visibility("default")]]
int malloc_trim(size_t pad) {
if (!is_reactor_thread) {
- return original_malloc_trim_func(pad);
+#ifdef __GLIBC__
+ return __malloc_trim(pad);
+#else
+ return 0;
+#endif
}
return 0;
}
--
2.33.0

Commit Bot

<bot@cloudius-systems.com>
unread,
Oct 4, 2021, 12:13:31 PM10/4/21
to seastar-dev@googlegroups.com, Kefu Chai
From: Kefu Chai <tcha...@gmail.com>
Committer: Avi Kivity <a...@scylladb.com>
Branch: master

memory: always allocate buf using "malloc" for non reactor

before this change, there is chance that Seastar reallocate a memory
buffer allocated by the c runtime even before "original_malloc_func"
is initialized. the runtime might want to allocate memory for some TLS
variables or just for itself, like env variables.

but we might want to call functions like setenv() from a reactor thread.
and setenv() could reallocates the memory buffer previously allocated by
the "main" thread. in this case, cpu_pages::shrink() panics if it is
asked to resize a memory chunk allocated by another reactor thread.

in this change, __libc_malloc and friends are not overridden anymore.
instead of using dlsym(RTLD_NEXT) for original_malloc_func, use
__libc_malloc.

the same applies to other malloc related functions.

because musl libc does not define __malloc_trim() yet, see
https://git.musl-libc.org/cgit/musl/tree/src/include/stdlib.h .
the malloc_trim() implementation calls __malloc_trim() only if __GLIBC__
is defined.

the solution was proposed by Avi Kivity <a...@scylladb.com>

Signed-off-by: Kefu Chai <tcha...@gmail.com>
Message-Id: <20211004155240.1...@gmail.com>

---
diff --git a/src/core/memory.cc b/src/core/memory.cc

Avi Kivity

<avi@scylladb.com>
unread,
Oct 4, 2021, 12:28:59 PM10/4/21
to Kefu Chai, seastar-dev@googlegroups.com, Commit Bot
The build bot didn't like it, looks like __libc_malloc isn't declared in
the headers it sees.

Kefu Chai

<tchaikov@gmail.com>
unread,
Oct 4, 2021, 1:58:31 PM10/4/21
to seastar-dev@googlegroups.com, Kefu Chai
before this change, there is chance that Seastar reallocate a memory
buffer allocated by the c runtime even before "original_malloc_func"
is initialized. the runtime might want to allocate memory for some TLS
variables or just for itself, like env variables.

but we might want to call functions like setenv() from a reactor thread.
and setenv() could reallocates the memory buffer previously allocated by
the "main" thread. in this case, cpu_pages::shrink() panics if it is
asked to resize a memory chunk allocated by another reactor thread.

in this change, __libc_malloc and friends are not overridden anymore.
instead of using dlsym(RTLD_NEXT) for original_malloc_func, use
__libc_malloc.

the same applies to other malloc related functions.

because musl libc does not define __malloc_trim(), __libc_memalign()
or __malloc_usable_size() yet, see
https://git.musl-libc.org/cgit/musl/tree/src/include/stdlib.h .
so the corresponding overrides just call the underlying function
only if it is available, and call the dlsym()'ed function as a fallback.
this is based on the assumption that these fallback functions is not
called by dlsym().

the solution was originally proposed by Avi Kivity <a...@scylladb.com>

Signed-off-by: Kefu Chai <tcha...@gmail.com>
---
src/core/memory.cc | 126 +++++++++++----------------------------------
1 file changed, 30 insertions(+), 96 deletions(-)

diff --git a/src/core/memory.cc b/src/core/memory.cc
index e9e828cc..df92abff 100644
--- a/src/core/memory.cc
+++ b/src/core/memory.cc
@@ -168,6 +168,16 @@ struct hash<seastar::allocation_site> {

}

+extern "C" {
+extern void *__libc_malloc (size_t);
+extern void __libc_free (void *);
+extern void *__libc_realloc (void *, size_t);
+
+#ifdef __GLIBC__
+extern void *__libc_memalign (size_t, size_t);
+#endif
+}
+
namespace seastar {

using allocation_site_ptr = const allocation_site*;
@@ -240,22 +250,6 @@ static uint64_t get(types stat_type)

}

-// original memory allocator support
-// note: allocations before calling the constructor would use seastar allocator
-using malloc_func_type = void * (*)(size_t);
-using free_func_type = void * (*)(void *);
-using realloc_func_type = void * (*)(void *, size_t);
-using aligned_alloc_type = void * (*)(size_t alignment, size_t size);
-using malloc_trim_type = int (*)(size_t);
-using malloc_usable_size_type = size_t (*)(void *);
-
-malloc_func_type original_malloc_func = reinterpret_cast<malloc_func_type>(dlsym(RTLD_NEXT, "malloc"));
-free_func_type original_free_func = reinterpret_cast<free_func_type>(dlsym(RTLD_NEXT, "free"));
-realloc_func_type original_realloc_func = reinterpret_cast<realloc_func_type>(dlsym(RTLD_NEXT, "realloc"));
-aligned_alloc_type original_aligned_alloc_func = reinterpret_cast<aligned_alloc_type>(dlsym(RTLD_NEXT, "aligned_alloc"));
-malloc_trim_type original_malloc_trim_func = reinterpret_cast<malloc_trim_type>(dlsym(RTLD_NEXT, "malloc_trim"));
-malloc_usable_size_type original_malloc_usable_size_func = reinterpret_cast<malloc_usable_size_type>(dlsym(RTLD_NEXT, "malloc_usable_size"));
-
using allocate_system_memory_fn
= std::function<mmap_area (void* where, size_t how_much)>;

@@ -952,7 +946,7 @@ cpu_pages::try_foreign_free(void* ptr) {
} else {
alloc_stats::increment(alloc_stats::types::foreign_frees);
}
- original_free_func(ptr);
+ __libc_free(ptr);
return true;
}
free_cross_cpu(object_cpu_id(ptr), ptr);
@@ -1365,13 +1359,8 @@ static constexpr int debug_allocation_pattern = 0xab;

void* allocate(size_t size) {
if (!is_reactor_thread) {
- if (original_malloc_func) {
- alloc_stats::increment(alloc_stats::types::foreign_mallocs);
- return original_malloc_func(size);
- }
- // original_malloc_func might be null for allocations before main
- // in constructors before original_malloc_func ctor is called
- init_cpu_mem();
+ alloc_stats::increment(alloc_stats::types::foreign_mallocs);
+ return __libc_malloc(size);
}
if (size <= sizeof(free_object)) {
size = sizeof(free_object);
@@ -1396,13 +1385,15 @@ void* allocate(size_t size) {

void* allocate_aligned(size_t align, size_t size) {
if (!is_reactor_thread) {
- if (original_aligned_alloc_func) {
- alloc_stats::increment(alloc_stats::types::foreign_mallocs);
- return original_aligned_alloc_func(align, size);
- }
- // original_realloc_func might be null for allocations before main
- // in constructors before original_realloc_func ctor is called
- init_cpu_mem();
+ alloc_stats::increment(alloc_stats::types::foreign_mallocs);
+#ifdef __GLIBC__
+ return __libc_memalign(align, size);
+#else
+ using aligned_alloc_type = void * (*)(size_t alignment, size_t size);
+ aligned_alloc_type original_aligned_alloc_func =
+ reinterpret_cast<aligned_alloc_type>(dlsym(RTLD_NEXT, "aligned_alloc"));
+ return original_aligned_alloc_func(obj);
+#endif
}
if (size <= sizeof(free_object)) {
size = std::max(sizeof(free_object), align);
@@ -1817,16 +1808,6 @@ void* malloc(size_t n) throw () {
return allocate(n);
}

-extern "C"
-[[gnu::alias("malloc")]]
-[[gnu::visibility("default")]]
-[[gnu::malloc]]
-[[gnu::alloc_size(1)]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void* __libc_malloc(size_t n) throw ();
-
extern "C"
[[gnu::visibility("default")]]
[[gnu::used]]
@@ -1836,14 +1817,6 @@ void free(void* ptr) {
}
}

-extern "C"
-[[gnu::alias("free")]]
-[[gnu::visibility("default")]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void __libc_free(void* obj) throw ();
-
extern "C"
[[gnu::visibility("default")]]
void* calloc(size_t nmemb, size_t size) {
@@ -1860,16 +1833,6 @@ void* calloc(size_t nmemb, size_t size) {
return p;
}

-extern "C"
-[[gnu::alias("calloc")]]
-[[gnu::visibility("default")]]
-[[gnu::alloc_size(1, 2)]]
-[[gnu::malloc]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void* __libc_calloc(size_t n, size_t m) throw ();
-
extern "C"
[[gnu::visibility("default")]]
void* realloc(void* ptr, size_t size) {
@@ -1881,10 +1844,7 @@ void* realloc(void* ptr, size_t size) {
if (is_reactor_thread) {
abort();
}
- // original_realloc_func might be null when previous ctor allocates
- if (original_realloc_func) {
- return original_realloc_func(ptr, size);
- }
+ return __libc_realloc(ptr, size);
}
// if we're here, it's either ptr is a seastar memory ptr
// or a nullptr, or, original functions aren't available
@@ -1912,15 +1872,6 @@ void* realloc(void* ptr, size_t size) {
return nptr;
}

-extern "C"
-[[gnu::alias("realloc")]]
-[[gnu::visibility("default")]]
-[[gnu::alloc_size(2)]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-void* __libc_realloc(void* obj, size_t size) throw ();
-
extern "C"
[[gnu::visibility("default")]]
[[gnu::used]]
@@ -1939,15 +1890,6 @@ int posix_memalign(void** ptr, size_t align, size_t size) throw () {
return 0;
}

-extern "C"
-[[gnu::alias("posix_memalign")]]
-[[gnu::visibility("default")]]
-#ifndef __clang__
-[[gnu::leaf]]
-#endif
-[[gnu::nonnull(1)]]
-int __libc_posix_memalign(void** ptr, size_t align, size_t size) throw ();
-
extern "C"
[[gnu::visibility("default")]]
[[gnu::malloc]]
@@ -1971,30 +1913,19 @@ void *aligned_alloc(size_t align, size_t size) throw () {
return allocate_aligned(align, size);
}

-extern "C"
-[[gnu::alias("memalign")]]
-[[gnu::visibility("default")]]
-[[gnu::malloc]]
-#if defined(__GLIBC__) && __GLIBC_PREREQ(2, 30)
-[[gnu::alloc_size(2)]]
-#endif
-void* __libc_memalign(size_t align, size_t size) throw ();
-
extern "C"
[[gnu::visibility("default")]]
void cfree(void* obj) throw () {
return ::free(obj);
}

-extern "C"
-[[gnu::alias("cfree")]]
-[[gnu::visibility("default")]]
-void __libc_cfree(void* obj) throw ();
-
extern "C"
[[gnu::visibility("default")]]
size_t malloc_usable_size(void* obj) {
if (!is_seastar_memory(obj)) {
+ using malloc_usable_size_type = size_t (*)(void *);
+ static malloc_usable_size_type original_malloc_usable_size_func =
+ reinterpret_cast<malloc_usable_size_type>(dlsym(RTLD_NEXT, "malloc_usable_size"));
return original_malloc_usable_size_func(obj);
}
return object_size(obj);
@@ -2004,7 +1935,10 @@ extern "C"
[[gnu::visibility("default")]]
int malloc_trim(size_t pad) {
if (!is_reactor_thread) {
- return original_malloc_trim_func(pad);
+ using malloc_trim_type = int (*)(size_t);
+ auto original_malloc_trim_func =
+ reinterpret_cast<malloc_trim_type>(dlsym(RTLD_NEXT, "malloc_trim"));
+ original_malloc_trim_func(pad);
}
return 0;
}
--
2.33.0

tcha...@gmail.com

<tchaikov@gmail.com>
unread,
Oct 4, 2021, 2:02:43 PM10/4/21
to seastar-dev
change since v2:

- add forward declarations for the "private" functions.
- use dlsym()'ed functions as fallbacks for non-glibc libc (most likely musl libc)

tested on 
- centos8, where glibc 2.28 is used.
- debian sid, where glibc 2.32 is used.

Commit Bot

<bot@cloudius-systems.com>
unread,
Oct 5, 2021, 5:20:44 AM10/5/21
to seastar-dev@googlegroups.com, Kefu Chai
From: Kefu Chai <tcha...@gmail.com>
Committer: Avi Kivity <a...@scylladb.com>
Branch: master

memory: always allocate buf using "malloc" for non reactor

before this change, there is chance that Seastar reallocate a memory
buffer allocated by the c runtime even before "original_malloc_func"
is initialized. the runtime might want to allocate memory for some TLS
variables or just for itself, like env variables.

but we might want to call functions like setenv() from a reactor thread.
and setenv() could reallocates the memory buffer previously allocated by
the "main" thread. in this case, cpu_pages::shrink() panics if it is
asked to resize a memory chunk allocated by another reactor thread.

in this change, __libc_malloc and friends are not overridden anymore.
instead of using dlsym(RTLD_NEXT) for original_malloc_func, use
__libc_malloc.

the same applies to other malloc related functions.

because musl libc does not define __malloc_trim(), __libc_memalign()
or __malloc_usable_size() yet, see
https://git.musl-libc.org/cgit/musl/tree/src/include/stdlib.h .
so the corresponding overrides just call the underlying function
only if it is available, and call the dlsym()'ed function as a fallback.
this is based on the assumption that these fallback functions is not
called by dlsym().

the solution was originally proposed by Avi Kivity <a...@scylladb.com>

Signed-off-by: Kefu Chai <tcha...@gmail.com>
Message-Id: <20211004175823.1...@gmail.com>

---
diff --git a/src/core/memory.cc b/src/core/memory.cc

Pavel Emelyanov

<xemul@scylladb.com>
unread,
Oct 20, 2021, 11:09:14 AM10/20/21
to Kefu Chai, seastar-dev@googlegroups.com, Avi Kivity
Kefu, I had to revert it again. Scylla unit tests segfault in random places after
this change. Apparently asking you to debug these failures and fix may result in
frustrating experience. Maybe Avi can suggest how to move forward with it.

-- Pavel

Avi Kivity

<avi@scylladb.com>
unread,
Oct 20, 2021, 11:30:18 AM10/20/21
to Pavel Emelyanov, Kefu Chai, seastar-dev@googlegroups.com
One possible problem is that somewhere in glibc something allocates with
malloc() and deallocates with __libc_free() or vice versa. So my idea of
not intercepting __libc_malloc/__libc_free is wrong. That's just a guess
though, maybe the real problem is elsewhere.
Reply all
Reply to author
Forward
0 new messages