
Putting a function in shared memory


Frederick Gotham

Aug 1, 2020, 4:57:36 PM
So let's start off with a simple function:

void cpystring(char *to, char const *from)
{
while ( *to++ = *from++ );
}

Note that this function does not access any global variables, nor does it call any other functions. Everything it touches is reached through its two pointer arguments.

So let's say I compile this function to 20 bytes of machine code.

I then copy this machine code into memory that is shared between processes (for example on Linux: a shared memory object). MS-Windows has a similar interprocess memory feature. I then set the permissions on this interprocess memory to allow execution.

I can then have any number of separate processes that do the following:

char buf1[] = "dog";
char buf2[] = "cat";

void (*cpystring)(char *to, char const *from);

int main(void)
{
void const *const mem = MapInterprocessMemory("gotham_cpystring");

cpystring = (void (*)(char*,char const *))mem;

cpystring(buf1,buf2);
}

This will work, right?

But if I make it a little more complicated, for example if I call a function from within cpystring like this:

void cpystring(char *const to, char const *const from)
{
unsigned const len = strlen(from);

for (unsigned i = 0; i <= len; ++i)
to[i] = from[i];
}

I don't think this will work, because the address of 'strlen' hardcoded into the machine code will be different for each process (...I think?). I think the way around this would be to pass the address of strlen as an extra parameter, like this:

void cpystring(char *const to, char const *const from, unsigned (*const lenstring)(char const*))
{
unsigned const len = lenstring(from);

for (unsigned i = 0; i <= len; ++i)
to[i] = from[i];
}

I think this should work. I can try this on a PC when I'm at home later.
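
A minimal in-process sketch of this function-pointer variant, before any shared memory is involved. The adapter name `lenstring_wrapper` is hypothetical; it is there because `strlen` returns `std::size_t`, so its address does not convert directly to `unsigned (*)(char const*)`.

```cpp
#include <cassert>
#include <cstring>

// Same shape as the function above: the callee is supplied by the caller,
// so the body does not need the address of strlen baked into it.
void cpystring(char *const to, char const *const from,
               unsigned (*const lenstring)(char const *))
{
    unsigned const len = lenstring(from);

    for (unsigned i = 0; i <= len; ++i) // <= so the terminating '\0' is copied too
        to[i] = from[i];
}

// Hypothetical adapter: gives strlen the exact signature cpystring expects.
unsigned lenstring_wrapper(char const *const s)
{
    return static_cast<unsigned>(std::strlen(s));
}
```

Calling `cpystring(buf1, buf2, lenstring_wrapper)` exercises the logic only. Whether the compiled body is free of other external references still has to be checked in the disassembly.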

Is there anything else to consider when trying to execute a function from shared interprocess memory?

What I'm trying to do here is to come up with a third way of dynamically loading a shared library and executing code from it. So far I have two methods, (1) and (2); the third, (3), is the one I'm proposing:
(1) When building, at the linking stage just specify -l:libmonkey.so
(2) At runtime, call dlopen (LoadLibrary on MS-Windows) followed by dlsym (GetProcAddress on MS-Windows)
(3) Copy executable code into a shared memory object, and then just get other processes to map the shared memory and call the function

Juha Nieminen

Aug 1, 2020, 5:28:31 PM
Frederick Gotham <cauldwel...@gmail.com> wrote:
> But if I make it a little more complicated, for example if I call a function from within cpystring like this:
>
> void cpystring(char *const to, char const *const from)
> {
> unsigned const len = strlen(from);
>
> for (unsigned i = 0; i <= len; ++i)
> to[i] = from[i];
> }

I think that strlen() is not a good example because there is a good
chance that the compiler will completely inline it.

But anyway, your question relates to function calls that are not
inlined. Are they functions in your code, or are they functions
e.g. from the standard library or something else?

> What I'm trying to do here is to come up with a third way of dynamically loading a shared library and executing code from it. So far I have two methods:
> (1) When building, at the linking stage just specify -l:libmonkey.so
> (2) At runtime, call dlopen, LoadLibrary followed by dlsym, GetProcAddress
> (3) Copy executable code into a shared memory object, and then just get other processes to map the shared memory and call the function

Maybe the concept you are looking for is "position-independent code" (PIC).
Compilers have an option for that, but I'm not sure it's enough for what
you are trying to do. I think what you are trying to do would require
your code to contain nothing that requires a dynamic linker (which is
what PIC might or might not do, I'm not completely certain).

Alf P. Steinbach

Aug 1, 2020, 6:08:16 PM
On 01.08.2020 22:57, Frederick Gotham wrote:
> [snip]
> What I'm trying to do here is to come up with a third way of dynamically loading a shared library and executing code from it.

Windows already does that automatically for you.


- Alf

Frederick Gotham

Aug 1, 2020, 7:23:39 PM
The following function:

void cpystring(char *to, char const *from)
{
while ( *to++ = *from++ );
}

gets compiled on an x86_64 Linux machine to:

00000000000005ca <_Z9cpystringPcPKc>:
5ca: 55 push %rbp
5cb: 48 89 e5 mov %rsp,%rbp
5ce: 48 89 7d f8 mov %rdi,-0x8(%rbp)
5d2: 48 89 75 f0 mov %rsi,-0x10(%rbp)
5d6: 48 8b 55 f0 mov -0x10(%rbp),%rdx
5da: 48 8d 42 01 lea 0x1(%rdx),%rax
5de: 48 89 45 f0 mov %rax,-0x10(%rbp)
5e2: 48 8b 45 f8 mov -0x8(%rbp),%rax
5e6: 48 8d 48 01 lea 0x1(%rax),%rcx
5ea: 48 89 4d f8 mov %rcx,-0x8(%rbp)
5ee: 0f b6 12 movzbl (%rdx),%edx
5f1: 88 10 mov %dl,(%rax)
5f3: 0f b6 00 movzbl (%rax),%eax
5f6: 84 c0 test %al,%al
5f8: 0f 95 c0 setne %al
5fb: 84 c0 test %al,%al
5fd: 74 02 je 601 <_Z9cpystringPcPKc+0x37>
5ff: eb d5 jmp 5d6 <_Z9cpystringPcPKc+0xc>
601: 90 nop
602: 5d pop %rbp
603: c3 retq


And so here's the first program that allocates the interprocess memory and copies the machine code into it:

char unsigned const g_cpystring_bytes[] = {
0x55, 0x48, 0x89, 0xE5, 0x48, 0x89, 0x7D, 0xF8, 0x48, 0x89, 0x75, 0xF0, 0x48, 0x8B, 0x55, 0xF0,
0x48, 0x8D, 0x42, 0x01, 0x48, 0x89, 0x45, 0xF0, 0x48, 0x8B, 0x45, 0xF8, 0x48, 0x8D, 0x48, 0x01,
0x48, 0x89, 0x4D, 0xF8, 0x0F, 0xB6, 0x12, 0x88, 0x10, 0x0F, 0xB6, 0x00, 0x84, 0xC0, 0x0F, 0x95,
0xC0, 0x84, 0xC0, 0x74, 0x02, 0xEB, 0xD5, 0x90, 0x5D, 0xC3
};

#include <cstring> // memcpy
#include <boost/interprocess/shared_memory_object.hpp> // shared_memory_object
#include <boost/interprocess/mapped_region.hpp> // mapped_region

using namespace boost::interprocess;

int main(void)
{
shared_memory_object shm_obj(create_only, //only create
"gotham_cpystring", //name
read_write ); //read-write mode

shm_obj.truncate(sizeof g_cpystring_bytes);

mapped_region region(shm_obj, read_write);

std::memcpy(region.get_address(), g_cpystring_bytes, region.get_size());
}


This first program compiles fine and seems to do its job properly, the shared memory object is created in "/dev/shm" and it's 58 bytes in size.

And then next I have the second program that tries to call the function:

#include <boost/interprocess/shared_memory_object.hpp> // shared_memory_object
#include <boost/interprocess/mapped_region.hpp> // mapped_region

using namespace boost::interprocess;

char buf1[] = "cat",
buf2[] = "dog";

//Not sure if I need to specify the calling convention (e.g. sysv_abi)
void (*cpystring)(char *to, char const *from) = nullptr;

int main(void)
{
//Open already created shared memory object.
shared_memory_object shm_obj(open_only, "gotham_cpystring", read_write);

//Map the whole shared memory in this process
mapped_region region(shm_obj, read_write);

cpystring = reinterpret_cast<decltype(cpystring)>(region.get_address());

cpystring(buf1, buf2);
}


This second program fails with a segfault. I thought the problem might be that the shared memory is in a page that isn't marked for execution, and so I set it as executable as follows:

void Set_Writeability_Of_Memory(void *const p, bool const writeable)
{
uintptr_t const page_size = sysconf(_SC_PAGE_SIZE);

union {
void *p_start_of_page;
uintptr_t i_start_of_page;
};

p_start_of_page = p;

i_start_of_page -= (i_start_of_page % page_size);

mprotect(p_start_of_page, page_size, PROT_READ | (writeable ? PROT_WRITE : 0u));
}

however this hasn't fixed it. I call the function "Set_Writeability_Of_Memory" from within my first program and also from within my second program, but the second program still segfaults.

Anyone got any ideas?

Here's what I'm getting from the GDB debugger:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7ff7000 in ?? ()
(gdb) bt
#0 0x00007ffff7ff7000 in ?? ()
#1 0x0000555555555366 in main () at drone.cpp:26
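
Frame #0 has no symbol and its address looks like the start of a mapping, which is what a jump into a page without execute permission would look like. One way to rule out the byte array itself is to run it in-process from a private anonymous mapping, with no shared memory involved. The sketch below assumes x86-64 Linux and the System V calling convention; the helper name `copy_via_mapped_code` is hypothetical, and on any other platform it simply reports failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

#if defined(__linux__) && defined(__x86_64__)
#include <sys/mman.h> // mmap, mprotect, munmap
#endif

// The 58 bytes of machine code for cpystring, as listed in the first program.
static unsigned char const k_cpystring_bytes[] = {
    0x55, 0x48, 0x89, 0xE5, 0x48, 0x89, 0x7D, 0xF8, 0x48, 0x89, 0x75, 0xF0, 0x48, 0x8B, 0x55, 0xF0,
    0x48, 0x8D, 0x42, 0x01, 0x48, 0x89, 0x45, 0xF0, 0x48, 0x8B, 0x45, 0xF8, 0x48, 0x8D, 0x48, 0x01,
    0x48, 0x89, 0x4D, 0xF8, 0x0F, 0xB6, 0x12, 0x88, 0x10, 0x0F, 0xB6, 0x00, 0x84, 0xC0, 0x0F, 0x95,
    0xC0, 0x84, 0xC0, 0x74, 0x02, 0xEB, 0xD5, 0x90, 0x5D, 0xC3
};

// Hypothetical helper: copies `from` into `out` by executing the bytes above.
// Returns false if the platform is not x86-64 Linux or a system call fails.
bool copy_via_mapped_code(char const *const from, std::string &out)
{
#if defined(__linux__) && defined(__x86_64__)
    std::size_t const map_len = 4096; // the kernel rounds this up to whole pages

    void *const mem = mmap(nullptr, map_len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return false;

    std::memcpy(mem, k_cpystring_bytes, sizeof k_cpystring_bytes);

    // Drop write access and add execute access before jumping in.
    if (mprotect(mem, map_len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, map_len);
        return false;
    }

    auto const fn = reinterpret_cast<void (*)(char *, char const *)>(mem);

    std::string buf(std::strlen(from) + 1, '\0'); // room for the terminator
    fn(&buf[0], from);
    munmap(mem, map_len);

    buf.resize(std::strlen(buf.c_str())); // trim to the copied string
    out = buf;
    return true;
#else
    (void)from;
    (void)out;
    return false;
#endif
}
```

If this succeeds while the shared-memory version still faults, the remaining suspect is the protection on the shared mapping rather than the machine code.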

Frederick Gotham

Aug 1, 2020, 7:44:55 PM
On Sunday, August 2, 2020 at 12:23:39 AM UTC+1, Frederick Gotham wrote:

> void Set_Writeability_Of_Memory(void *const p, bool const writeable)


I meant to set the page as 'executable' -- not as 'writeable'.

I have re-written my second program and everything is working now. Here's the second program:

void Set_Executability_Of_Memory(void*, bool);

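#include <cstdint> // uintptr_t
#include <sys/mman.h> // mprotect, PROT_READ, PROT_WRITE, PROT_EXEC
#include <unistd.h> // sysconf, _SC_PAGE_SIZE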

#include <iostream>
using std::cout;
using std::endl;

#include <boost/interprocess/shared_memory_object.hpp> // shared_memory_object
#include <boost/interprocess/mapped_region.hpp> // mapped_region

using namespace boost::interprocess;

char buf1[] = "cat";
char buf2[] = "dog";

void (*cpystring)(char *to, char const *from) = nullptr;

int main(void)
{
//Open already created shared memory object.
shared_memory_object shm_obj(open_only, "gotham_cpystring", read_write);

//Map the whole shared memory in this process
mapped_region region(shm_obj, read_write);

cout << "About to set memory page as executable" << endl;
Set_Executability_Of_Memory(region.get_address(), true);

cout << "About to call cpystring" << endl;
cpystring = reinterpret_cast< decltype(cpystring) >( region.get_address() );

cpystring(buf1, buf2);

cout << "buf1 : " << buf1 << endl;
cout << "buf2 : " << buf2 << endl;
}

void Set_Executability_Of_Memory(void *const p, bool const executable)
{
uintptr_t const page_size = sysconf( _SC_PAGE_SIZE );

union {
void *p_start_of_page;
uintptr_t i_start_of_page;
};

p_start_of_page = p;

i_start_of_page -= (i_start_of_page % page_size);

mprotect(p_start_of_page, page_size, PROT_READ | PROT_WRITE | (executable ? PROT_EXEC : 0u));
}
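
Set_Executability_Of_Memory changes the protection of exactly one page. That is enough for a 58-byte function at the start of a mapping, but a longer blob, or one placed at an offset, could straddle a page boundary. A hedged sketch of the rounding arithmetic follows; the names `PageSpan` and `page_span` are hypothetical, and the result would be what gets passed to mprotect as address and length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct PageSpan {
    std::uintptr_t start; // address rounded down to a page boundary
    std::size_t length;   // byte count, always a whole number of pages
};

// Computes the smallest page-aligned range covering [addr, addr + len).
// page_size must be non-zero.
PageSpan page_span(std::uintptr_t const addr, std::size_t const len,
                   std::size_t const page_size)
{
    std::uintptr_t const start = addr - (addr % page_size);
    std::uintptr_t const end = addr + len; // one past the last byte
    std::uintptr_t const rounded_end =
        end + (page_size - end % page_size) % page_size; // round up, no-op if aligned
    return PageSpan{ start, static_cast<std::size_t>(rounded_end - start) };
}
```

For example, 58 bytes starting at 0x1ff0 with 4096-byte pages give a start of 0x1000 and a length of 0x2000, i.e. two pages rather than one.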


Of course it will be cleaner to put the call to "Set_Exec..." in the first program.

It's late now but tomorrow I'll play around with calling other functions from inside "cpystring".

When I have this working the way I want to, I'll be able to unload all libraries and just leave some phantom code in RAM for doing some really strange things (for example I can leave a stub function in memory when I unload a library).

Manfred

Aug 2, 2020, 10:29:22 AM
What you are trying to do here is replace the functionality of the
dynamic linker within your program.
None of this is specified by C++ (or C) as a language (i.e. its
standard), in fact it is about the platform ABI, out of the scope of the
language.
C++ (and C) do specify some linkage properties, but they are all high
level requirements, nothing anywhere near the level required to execute
some actual code.

This is to say that you won't get answers to what you are trying to do
from the language; as a start, you need to study the ABI of your target
platform.
Second, you should consider the compiled version of your cpystring
function as a flat bytearray that happens to contain executable code;
you shouldn't rely on how your 4 lines of code get compiled by a C or
C++ compiler - for example you can't even rely on the fact that the
entry point of the function will be the start address of the compiled
bytearray. You may get closer to what you want using assembly or even
machine code (and if you use assembly you should inspect closely your
assembler to verify what its output is).
Third, later on you rely on a blind reinterpret_cast in the hope you'll
get the right call. This is not enough, for example on Windows there are
at least 3 different calling conventions used by the compiler within a
single program for a single architecture. This is why even for
conventional library calls their header files thoroughly decorate each
function prototype to ensure that the correct ABI convention is enforced
by the compiler.
And then there's the whole chapter about memory protection.

Back to your list above, (1) and (2) should work, if used properly.
I would consider (3) unusable for any serious work - toying is always
free of course.

Paavo Helde

Aug 2, 2020, 1:10:03 PM
Not quite sure about your motivations, so far it seems you have just
duplicated a small part of dynamic loader functionality. Is this meant
as a stress test for antivirus products, to see how capable they are in
detecting suspicious activities?




Frederick Gotham

Aug 2, 2020, 6:52:10 PM
I've re-written my first program like this:

#include <cstring> // memcpy

#include <iostream>
using std::cout;
using std::endl;

#include <boost/interprocess/shared_memory_object.hpp> // shared_memory_object
#include <boost/interprocess/mapped_region.hpp> // mapped_region

void cpystring(char *to, char const *from) __attribute__ ((noinline));

using namespace boost::interprocess;

int main(void)
{
shared_memory_object::remove("gotham_cpystring");

shared_memory_object shm_obj(create_only, //only create
"gotham_cpystring", //name
read_write); //read-write mode

shm_obj.truncate(128u); // Averages about 17 - 58 bytes, so let's be safe with 128 bytes

mapped_region region(shm_obj, read_write);

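// Copies a fixed 128 bytes starting at cpystring. This assumes the function body is contiguous and that the bytes following it are readable.
std::memcpy(region.get_address(), reinterpret_cast<void const*>(&cpystring), region.get_size());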
}

void cpystring(char *to, char const *from)
{
while ( *to++ = *from++ );
}


And I've re-written my second program like this:

#include <iostream>
using std::cout;
using std::endl;

#include <boost/interprocess/shared_memory_object.hpp> // shared_memory_object
#include <boost/interprocess/mapped_region.hpp> // mapped_region

using namespace boost::interprocess;

char buf1[] = "cat";
char buf2[] = "dog";

void (*cpystring)(char *to, char const *from) = nullptr;

int main(void)
{
//Open already created shared memory object.
shared_memory_object shm_obj(open_only, "gotham_cpystring", read_only);

//Map the whole shared memory in this process
mapped_region region(shm_obj, read_only);

cout << "About to set memory page as executable" << endl;
extern void Set_Executability_Of_Memory(void *const p, bool const executable);
Set_Executability_Of_Memory(region.get_address(), true);

cout << "About to call cpystring" << endl;
cpystring = reinterpret_cast< decltype(cpystring) >( region.get_address() );

cpystring(buf1, buf2);

cout << "buf1 : " << buf1 << endl;
cout << "buf2 : " << buf2 << endl;
}

#include <cstdint> // std::uintptr_t

#ifdef BOOST_INTERPROCESS_WINDOWS
#include <windows.h> // GetSystemInfo, VirtualProtect, PAGE_EXECUTE_READ, PAGE_READONLY
#else
#include <sys/mman.h> // mprotect, PROT_READ, PROT_EXEC
#include <unistd.h> // sysconf, _SC_PAGE_SIZE
#endif

void Set_Executability_Of_Memory(void *const p, bool const executable)
{
std::uintptr_t i_start_of_page = reinterpret_cast<std::uintptr_t>(p);

#ifdef BOOST_INTERPROCESS_WINDOWS
SYSTEM_INFO sysinfo;
GetSystemInfo(&sysinfo);
std::uintptr_t const page_size = sysinfo.dwPageSize;
i_start_of_page -= (i_start_of_page % page_size); // round down to the start of the page
DWORD old_perms;
VirtualProtect(reinterpret_cast<void*>(i_start_of_page), page_size, executable ? PAGE_EXECUTE_READ : PAGE_READONLY, &old_perms);
#else
// Linux
std::uintptr_t const page_size = sysconf(_SC_PAGE_SIZE);
i_start_of_page -= (i_start_of_page % page_size); // round down to the start of the page
mprotect(reinterpret_cast<void*>(i_start_of_page), page_size, PROT_READ | (executable ? PROT_EXEC : 0));
#endif
}

Whether the first program marks its own mapping as executable makes no difference to the second program, so I've removed that code (it's only needed in the second program).

I tried using an assembler trick, "asm { _emit 0xCC _emit 0xCC _emit 0xCC _emit 0xCC }", to put a marker on the end of the machine code for "cpystring", and then I tried to use 'strstr' to determine the length of the function by searching for 0xCC. I think this failed on my machine because the first argument to strstr wasn't a null-terminated string (even though it's possible to implement strstr without requiring a null-terminated string for the first argument).
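
Since strstr treats both of its arguments as null-terminated strings, it stops at the first zero byte in the machine code, or runs past the end if there is none. A length-based search avoids that. Here is a minimal sketch using std::search over raw bytes; the function name `find_marker` is hypothetical, and it assumes the four 0xCC bytes really do end up directly after the function, which the compiler does not guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Returns the offset of the first run of four 0xCC bytes within
// [code, code + max_len), or max_len if no such run is found.
std::size_t find_marker(unsigned char const *const code, std::size_t const max_len)
{
    static unsigned char const marker[] = { 0xCC, 0xCC, 0xCC, 0xCC };

    unsigned char const *const end = code + max_len;
    unsigned char const *const hit =
        std::search(code, end, marker, marker + sizeof marker);

    return static_cast<std::size_t>(hit - code); // equals max_len when hit == end
}
```

The same four bytes could in principle also appear inside an instruction's immediate operand, so a hit is a hint about the function's length rather than proof.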

Frederick Gotham

Aug 4, 2020, 7:38:11 AM
On Sunday, August 2, 2020 at 3:29:22 PM UTC+1, Manfred wrote:

> Third, later on you rely on a blind reinterpret_cast in the hope you'll
> get the right call. This is not enough, for example on Windows there are
> at least 3 different calling conventions used by the compiler within a
> single program for a single architecture. This is why even for
> conventional library calls their header files thoroughly decorate each
> function prototype to ensure that the correct ABI convention is enforced
> by the compiler.



On 32-bit x86, the function calling conventions were cdecl, stdcall, fastcall, and then with C++ there was also thiscall.

However, since things went 64-bit with x86_64, the function calling convention is pretty much always 'ms_abi' or 'sysv_abi'.

Making the calling convention a part of the function pointer isn't a big deal:

void (*Func)(void) __attribute__((__sysv_abi__));
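
A hedged sketch of carrying the convention in a pointer type, assuming GCC or Clang targeting x86_64, where the `sysv_abi` attribute is available. The names `SHM_SYSV_ABI`, `AddFunc`, `add` and `call_through` are hypothetical; on other compilers or targets the macro expands to nothing and the default convention applies.

```cpp
#include <cassert>

// Assumption: only GCC-compatible compilers on x86_64 understand sysv_abi.
#if defined(__GNUC__) && defined(__x86_64__)
#define SHM_SYSV_ABI __attribute__((sysv_abi))
#else
#define SHM_SYSV_ABI
#endif

// Pointer-to-function type whose pointee uses the selected convention.
typedef int (SHM_SYSV_ABI *AddFunc)(int, int);

// A function compiled with the same convention, standing in for mapped code.
SHM_SYSV_ABI int add(int const a, int const b)
{
    return a + b;
}

// The call site picks up the convention from the pointer type.
int call_through(AddFunc const fn, int const a, int const b)
{
    return fn(a, b);
}
```

With the convention in the type, a cast from the mapped address to `AddFunc` states which ABI the copied bytes were compiled for, instead of relying on the platform default.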



> And then there's the whole chapter about memory protection.



On Linux I used "mprotect", and on MS-Windows I use "VirtualProtect" in order to mark the page of memory as executable. It works.



> Back to your list above, (1) and (2) should work, if used properly.
> I would consider (3) unusable for any serious work - toying is always
> free of course.



Loads of things started out as toys and went on to become something used extensively worldwide.

Frederick Gotham

Aug 4, 2020, 7:39:09 AM
On Sunday, August 2, 2020 at 6:10:03 PM UTC+1, Paavo Helde wrote:

> Not quite sure about your motivations, so far it seems you have just
> duplicated a small part of dynamic loader functionality. Is this meant
> as a stress test for antivirus products, to see how capable they are in
> detecting suspicious activities?


Once I've gotten this working in a very basic way, I'll try to make it more elaborate. I suppose you could say I'm developing a new way of dynamically sharing code.