Make sure shared library doesn't execute any code at 'dlopen'


Frederick Virchanza Gotham

Jan 26, 2023, 11:50:38 AM

On Linux x86_64, I've taken the code for an opensource executable program, and I've changed the final linker command to add "-fPIC -shared" so that I get a dynamic shared library instead of an executable file.

So I take this ".so" file and I load it into an executable program:

void *h = dlopen("libSomeProgram.so", RTLD_NOW);

int (*prog_main)(int,char**) = reinterpret_cast<int (*)(int,char**)>(dlsym(h,"main"));

And so now I can start this other program by executing its 'main' function as follows:

char *args[] = { "prog", "-v", "-k", "-c", nullptr, nullptr, nullptr };
prog_main(4, args);

I've already built this and tested it and it works fine.
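For reference, here is a minimal sketch of the same load-and-look-up sequence with error checking added. The names `find_symbol`, `main_fn` and `as_main` are made up for this illustration; only `dlopen`, `dlsym`, `dlerror` and `dlclose` are real `<dlfcn.h>` calls. Since `dlsym` returns `void*`, C++ needs an explicit cast to get a function pointer.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlerror, dlclose
#include <string>

// Hypothetical helper (not from the original post): look up `symbol_name`
// in the library at `library_path`. Passing nullptr as the path searches the
// running program and the libraries that were loaded with it.
// On failure it returns nullptr and stores the dlerror() text in `error`.
void* find_symbol(const char* library_path, const char* symbol_name, std::string& error)
{
    error.clear();

    void* handle = dlopen(library_path, RTLD_NOW);
    if (handle == nullptr) {
        const char* message = dlerror();
        error = message ? message : "dlopen failed";
        return nullptr;
    }

    dlerror();  // clear any stale error before calling dlsym
    void* address = dlsym(handle, symbol_name);
    const char* message = dlerror();
    if (message != nullptr) {
        error = message;
        dlclose(handle);
        return nullptr;
    }

    // The handle is deliberately kept open: closing it could unmap the code
    // that `address` points to.
    return address;
}

// Signature of the program's entry function, as used in the post.
using main_fn = int (*)(int, char**);

// dlsym returns an object pointer; convert it to a function pointer.
main_fn as_main(void* address)
{
    return reinterpret_cast<main_fn>(address);
}
```

A call would then look like `main_fn prog_main = as_main(find_symbol("libSomeProgram.so", "main", err));` followed by a null check. Older glibc versions need `-ldl` at link time.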

At the beginning of all this, I had the choice of building the original program as either a static shared library or a dynamic shared library. I went with dynamic because I didn't want to be burdened with the library's start-up code (i.e. whatever happens inside '_init') until I actually needed the functionality of the library (i.e. until the point that 'dlopen' is called).

But then I thought to myself, I can just check what the dynamic shared library does when it's loaded. First of all, I used 'readelf' on the ".so" file, and here's what I saw:

(INIT) 0x3000

So when the library gets loaded with "dlopen", it starts executing code at the address 0x3000. So next I used 'objdump' to see what's located at address 0x3000:

endbr64
sub rsp, 8
mov rax, cs:__gmon_start__ptr
test rax, rax
jz short locationA
call rax ; __gmon_start__
locationA:
add rsp, 8
retn

So in C++ code, this basically just does:

int init(void)
{
if ( gmon_start_ptr ) return gmon_start_ptr();
return 0;
}

So next I checked what the function '__gmon_start__' does, however there is no such function to be found inside the library. It's actually a 'weak' symbol. I used "nm -D" at the command line, and here's what I see:

w __gmon_start__

By the way I linked the shared library with "-Wl,--no-undefined" so there should be no unresolved symbols, but undefined weak symbols are allowed.

So I think what's happening here is that my shared library is seeking a function called '__gmon_start__' inside the executable file that it's loaded into. I've seen this technique before. For example let's say I were to write a library that optionally prints debug info, well I could have a weak symbol called "void debug_print(char const*)" inside my library, and then the executable file can provide that function to my library if it wants to.
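A minimal sketch of that optional-hook technique, assuming GCC or Clang (the `weak` attribute is a compiler extension, not standard C++). `debug_print` is the hook named above; `report` is a hypothetical library function added only for illustration.

```cpp
// Weak declaration: if no loaded object defines debug_print, the reference is
// left unresolved and its address compares equal to null at run time.
// (GCC/Clang extension.)
extern "C" void debug_print(const char* message) __attribute__((weak));

// Hypothetical library function: forwards to the hook only when the host
// program has supplied one. Returns true if the hook was called.
bool report(const char* message)
{
    if (debug_print != nullptr) {
        debug_print(message);
        return true;
    }
    return false;
}
```

If the executable defines `debug_print` and exports it (for an executable this typically means linking with `-rdynamic`), `report` calls it; otherwise the call is skipped. This is the same null test that the disassembled `_init` performs on `__gmon_start__`.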

Given that the dynamic shared library I built was entirely a C program, I don't have to worry about the constructors of global objects being invoked when 'dlopen' is called. However there are still other ways that code could get executed, for example there might be a global variable that gets initialised with the return value of a function:

int some_global_variable = SomeFunction();
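A short sketch of two ways code can run at load time. In C the line above is rejected, because a file-scope initialiser must be a constant expression; in C++ it is dynamic initialisation. C code can get the same effect with the GCC/Clang `constructor` attribute. As far as I know, current GNU toolchains reach both through `.init_array` rather than through the `_init` function, so a near-empty `_init` does not by itself show that nothing runs; `readelf -d` would list an `INIT_ARRAY` entry if there is any. The names `load_time_events`, `on_load` and `events_seen_at_load` are invented for this example.

```cpp
// Counts how many pieces of code ran before main() started
// (or, for a shared library, while dlopen was loading it).
static int load_time_events = 0;

int SomeFunction()
{
    ++load_time_events;
    return 42;
}

// C++ only: the initialiser is a function call, so it runs at load time.
int some_global_variable = SomeFunction();

// GCC/Clang extension that also works in plain C: this function is called
// automatically when the object is loaded.
__attribute__((constructor)) static void on_load()
{
    ++load_time_events;
}

// Accessor so that other code can inspect the counter.
int events_seen_at_load()
{
    return load_time_events;
}
```

Checking the source for either pattern (or the binary for `INIT_ARRAY`) answers the "does anything run at dlopen" question more reliably than reading `_init` alone.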

To get to the bottom line on this though, it seems that my dynamic shared library doesn't do anything when it's loaded except call the function "__gmon_start__". So if I were to build it as a static shared library and link it at compile time, then my executable program shouldn't be burdened with any extra start-up code. Am I right here? Am I missing something?

I need to look into why this executable file looks for "__gmon_start__" in the first place, because the shared library wasn't built with "-pg". It seems that GNU g++ places this weak symbol inside its ELF files even if you don't specify "-pg" at the command line. I don't know why it does that.





Scott Lurndal

Jan 26, 2023, 12:21:17 PM
Frederick Virchanza Gotham <cauldwel...@gmail.com> writes:
>
>On Linux x86_64, I've taken the code for an opensource executable program,
>and I've changed the final linker command to add "-fPIC -shared" so that I
>get a dynamic shared library instead of an executable file.
>
>So I take this ".so" file and I load it into an executable program:
>
> void *h = dlopen("libSomeProgram.so", RTLD_NOW);
>
> int (*prog_main)(int,char**) = dlsym(h,"main");
>
>And so now I can start this other program by executing its 'main' function
>as follows:
>
> char *args[] = { "prog", "-v", "-k", "-c", nullptr, nullptr, nullptr };
> prog_main(4, args);
>
>I've already built this and tested it and it works fine.

It is generally not considered to be a good idea
to include a function called 'main' in a dynamic shared object
as it must invariably conflict with the same symbol in the
executable loading the dynamic shared object using dlopen()
which has its own "main" function.


>So when the library gets loaded with "dlopen", it starts executing code at
>the address 0x3000. So next I used 'objdump' to see what's located at
>address 0x3000:
>
> endbr64
> sub rsp, 8
> mov rax, cs:__gmon_start__ptr
> test rax, rax
> jz short locationA
> call rax ; __gmon_start__
>locationA:
> add rsp, 8
> retn
>
>So in C++ code, this basically just does:
>
> int init(void)
> {
> if ( gmon_start_ptr ) return gmon_start_ptr();
> return 0;
> }
>
<snip>
>I need to look into why this executable file looks for "__gmon_start__" in
>the first place, because the shared library wasn't build with "-pg". It
>seems that GNU g++ places this weak symbol inside its ELF files even if you
>don't specify "-pg" at the command line. I don't know why it does that.

__gmon_start_ptr__ (and a few other static global symbols) are typically provided
by the crt (C-runtime) that gets linked in with the executable. Those symbols
may not be "linked" to the shared object when it is loaded, and thus dlopen
may fail. In any case, if you call any of the crt startup functions when
the dynamic object is loaded, you'll conflict with the crt startup functions
that were executed when your program that dlopens the shared object started
(you'll note from objdump that the executable entry point is _start, not main, and _start
calls a bunch of library initialization functions before invoking main).

Which is another reason to not use 'main' as the dynamic symbol, or
include any run-time initialization (other than static class initializers) in
the shared object.

Frederick Virchanza Gotham

Jan 26, 2023, 12:37:47 PM
On Thursday, January 26, 2023 at 5:21:17 PM UTC, Scott Lurndal wrote:
>
> It is generally not considered to be a good idea
> to include a function called 'main' in a dynamic shared object
> as it must invariably conflict with the same symbol in the
> executable loading the dynamic shared object using dlopen()
> which has its own "main" function.


This of course will matter when I make a static library because the linker will tell me I have a multiple definition error. But when it comes to dynamic libraries on Linux, you can have more than one function with the same name, and you can even get the other function's address with dlsym(RTLD_NEXT, "main").


> __gmon_start_ptr__ (and a few other static global symbols) are typically provided
> by the crt (C-runtime) that gets linked in with the executable. Those symbols
> may not be "linked" to the share object when it is loaded, and thus dlopen
> may fail. In any case, if you call any of the crt startup functions when
> the dynamic object is loaded, you'll conflict with the crt startup functions
> that were executed when your program that dlopens the shared object started
> (you'll note from objdump that the executable entry point is _start, not main, and _start
> calls a bunch of library initialization functions before invoking main).


I'll build the original program as an executable, check what its entry point is, and see what it does before main.

Paavo Helde

Jan 26, 2023, 12:43:51 PM
26.01.2023 18:50 Frederick Virchanza Gotham wrote:
>
> On Linux x86_64, I've taken the code for an opensource executable program, and I've changed the final linker command to add "-fPIC -shared" so that I get a dynamic shared library instead of an executable file.

-fPIC is a compiler option, so in general it's not enough to add it only
to the linker command.

[...]

> At the beginning of all this, I had the choice of building the original program as either a static shared library

What does "static shared library" mean? I guess you mean "static library".

> or a dynamic shared library. I went with dynamic because I didn't want
> to be burdened with the library's start-up code (i.e. whatever happens
> inside '_init') until I actually needed the functionality of the library
> (i.e. until the point that 'dlopen' is called).

I think you said this is open-source program, so why don't you just look
in the source to see if there is any expensive static initialization
taking place. If it does, change the code to perform this initialization
only on demand, and add an initialization function to be called from
your program to trigger this.

And why do you care? Does the program startup take 10 seconds more with
the extra library linked in? If not, why do you care?

Frederick Virchanza Gotham

Jan 26, 2023, 6:07:19 PM
On Thursday, January 26, 2023 at 5:43:51 PM UTC, Paavo Helde wrote:

> What does "static shared library" mean? I guess you mean "static library".
> or a dynamic shared library. I went with dynamic because I didn't want
> to be burdened with the library's start-up code (i.e. whatever happens
> inside '_init') until I actually needed the functionality of the library
> (i.e. until the point that 'dlopen' is called).
> I think you said this is open-source program, so why don't you just look
> in the source to see if there is any expensive static initialization
> taking place. If it does, change the code to perform this initialization
> only on demand, and add an initialization function to be called from
> your program to trigger this.
>
> An why do you care? Does the program startup take 10 seconds more with
> the extra library linked in? If not, why do you care?


I have forked two opensource projects on Github and I'm amalgamating them together.

With regard to the program which I want to turn into a static library, well I've gathered all of the ".a" files it needs and I've unzipped them and then combined them with the object files of the main program, and made one big ".a" file out of them, and so now I have one big file called SomeProgram.a.

Now if I write a new program and get it to link with SomeProgram.a, I might get a few 'multiple definition' errors, such as 'main' and 'options'.

So first I got a list of all the exported symbols in all the object files: find -iname "*.o" | xargs -i -r -n1 nm "{}" | grep -Ev "( U )|( W )|( w )" | cut -d ' ' -f3- | sort | uniq > all_symbols.txt

Next I made a command line argument list from them: cat all_symbols.txt | awk '{print "--redefine-sym " $1 "=SomeProgram_" $1}' | tr '\n' ' ' > cmd_line_args.txt

Next I renamed all of the symbols in all of the object files: find -iname "*.o" | xargs -i -r -n1 objcopy `cat cmd_line_args.txt` "{}"

After doing all that, I was assured that there wouldn't be a name collision, so I linked it all together and I didn't get a multiple definition error.
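The first two steps (filter the `nm` output, then build the `--redefine-sym` arguments) can also be sketched in C++. This is only an illustration under stated assumptions: `redefine_args` is a hypothetical helper, it expects the default three-column `nm` output for defined symbols (`address type name`), and it skips the `U`, `W` and `w` types just as the `grep` above does.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn the text printed by `nm` into the argument list
// for `objcopy`, giving every defined symbol the same prefix.
std::vector<std::string> redefine_args(const std::string& nm_output,
                                       const std::string& prefix)
{
    std::set<std::string> names;  // sorted and de-duplicated, like `sort | uniq`

    std::istringstream lines(nm_output);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string address, type, name;
        // Undefined symbols have no address column, so they yield only two
        // fields and are skipped here.
        if (!(fields >> address >> type >> name)) continue;
        // Mirror the grep filter from the post: drop U, W and w entries.
        if (type == "U" || type == "W" || type == "w") continue;
        names.insert(name);
    }

    std::vector<std::string> args;
    for (const std::string& name : names) {
        args.push_back("--redefine-sym");
        args.push_back(name + "=" + prefix + name);
    }
    return args;
}
```

Each pair in the result corresponds to one `--redefine-sym old=new` option on the `objcopy` command line.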

Michael S

Jan 27, 2023, 5:37:57 AM
You still haven't explained what exactly you are trying to achieve. And you
haven't explained what you don't like about the normal method, where you keep
your executable programs as they are and run them with spawn().

Paavo Helde

Jan 27, 2023, 6:40:19 AM
You sure get some points for originality in the software development.
Brings the no-code movement to new heights. Maybe we should call it
blind-source development?

My only question still remains: why???

Frederick Virchanza Gotham

Jan 27, 2023, 6:44:09 AM
On Friday, January 27, 2023 at 11:40:19 AM UTC, Paavo Helde wrote:

> You sure get some points for originality in the software development.
> Brings the no-code movement to new heights. Maybe we should call it
> blind-source development?
>
> My only question still remains: why???


Program A writes into a TCP socket.

Program B reads in from a TCP socket.

I'm making one program, i.e. Program C, out of them.

Program A is about 10 times the size of Program B, so it makes sense to put B into A rather than put A into B.

Program C will have two threads, one thread running the 'main' of Program A, and one thread running the 'main' of Program B.

Since Program A and Program B are now in the one process and have access to each other's memory, I can do away with the TCP socket between them, and replace it with a lockfree container, e.g. boost::lockfree::spsc_queue.
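A minimal sketch of the single-producer/single-consumer idea, written with `std::atomic` only so that it is self-contained. It is a stand-in for illustration, not the `boost::lockfree::spsc_queue` implementation, and the class name `SpscRing` is hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Fixed-size ring buffer that is safe for exactly one pushing thread and
// exactly one popping thread. One slot is kept empty to tell "full" apart
// from "empty", hence Capacity + 1 slots.
template <typename T, std::size_t Capacity>
class SpscRing {
public:
    // Called only by the producer thread. Returns false if the ring is full.
    bool push(const T& value)
    {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % kSlots;
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full
        }
        slots_[tail] = value;
        // Publish the slot: the consumer's acquire load of tail_ sees the write.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Called only by the consumer thread. Returns false if the ring is empty.
    bool pop(T& out)
    {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = slots_[head];
        // Hand the slot back to the producer.
        head_.store((head + 1) % kSlots, std::memory_order_release);
        return true;
    }

private:
    static constexpr std::size_t kSlots = Capacity + 1;
    std::array<T, kSlots> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to read
    std::atomic<std::size_t> tail_{0};  // next slot to write
};
```

Neither call blocks: when `pop` returns false the consumer has to poll again or wait by some other means, which matters if the goal is low CPU usage.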


Paavo Helde

Jan 27, 2023, 8:14:03 AM
27.01.2023 13:44 Frederick Virchanza Gotham wrote:
> On Friday, January 27, 2023 at 11:40:19 AM UTC, Paavo Helde wrote:
>
>> You sure get some points for originality in the software development.
>> Brings the no-code movement to new heights. Maybe we should call it
>> blind-source development?
>>
>> My only question still remains: why???
>
>
> Program A writes into a TCP socket.
>
> Program B reads in from a TCP socket.
>
> I'm making one program, i.e. Program C, out of them.
>
> Program A is about 10 times the size of Program B, so it makes sense to put B into A rather than put A into B.

This is a non sequitur. Also, in the previous line you said you put both
of them in C.

>
> Program C will have two threads, one thread running the 'main' of Program A, and one thread running the 'main' of Program B.
>
> Since Program A and Program B are now in the one process and have access to each other's memory, I can do away with the TCP socket between them, and replace it with a lockfree container, e.g. boost::lockfree::spsc_queue.

So what are the timings and which operations are too slow, and by how
much? I.e. is this project going to solve a real or an imagined problem?

You are planning extensive modifications in the source code of both A
and B. What prevents you to change the name of main() and other
conflicting symbols in the source code of A and B?





Frederick Virchanza Gotham

Jan 27, 2023, 9:20:44 AM
On Friday, January 27, 2023 at 1:14:03 PM UTC, Paavo Helde wrote:
>>
> > Program A is about 10 times the size of Program B, so it makes sense to put B into A rather than put A into B.
> This is non-sequitur. Also, in the previous line you said you put both of them in C.

C is the combination of A and B. In order to create C, I had three options:
(1) Start with nothing, then add in A, then add in B
(2) Start with A, then add in B
(3) Start with B, then add in A

I opted for option 2. So I forked A on Github, and started copying files from B into A.

> So what are the timings and which operations are too slow, and by how
> much? I.e. is this project going to solve a real or an imagined problem?

The main reason I'm doing this is to greatly simplify the running of these two programs. As things stand now, you have to run program A with a load of options:

progA --opt1 --opt2 --opt3=monkey.txt -k2 -f6 -m8 -c7 --save-n2

and then you have to create a virtual network device, then you've to alter the routing table, then you've to wait for progA to take effect, and then you've to analyse the effect that progA has had, and then you take what you analysed about progA and feed it as the command line to progB along with another bunch of command line arguments:

progB --opt1 --opt2=something_from_progA --opt3 -k -m -n3 --open-n2

I will be able to reduce this all to a simple one-liner at the command line:

progC --my-simple-option

Literally you will only need to give one simple command line argument when starting Program C.

Program C will start Program A (as a new thread), it will wait until A's ready, then it will create a virtual network device, then it will analyse the routing table and make changes, then it will run program B to communicate with program A.

> You are planning extensive modifications in the source code of both A
> and B. What prevents you to change the name of main() and other
> conflicting symbols in the source code of A and B?

I *do* change the symbol names, but not in the C++ source and header files. I wait until the object files are produced and then I use 'objcopy --redefine-sym' on the object files. It works and it means I can automate the process without having to write a C/C++ parser.

People have written scripts to do what I'm doing, i.e. to combine Program A and Program B, but my program (i.e. Program C) will be much more capable of dealing with adversity, for example it will analyse the routing table and try to find a free network even if there's already 8 entries in there. It will use ephemeral port numbers where possible.

The lockfree container between the two threads is just the icing on the cake although I'll smile when it's working and I get the CPU usage down below 1%.

Scott Lurndal

Jan 27, 2023, 10:20:27 AM
Frederick Virchanza Gotham <cauldwel...@gmail.com> writes:
So use mmap(2) or shmat(2) to share memory between the processes.

Kenny McCormack

Jan 27, 2023, 10:32:26 AM
In article <tr0d6f$1m1sc$1...@dont-email.me>,
Paavo Helde <ees...@osa.pri.ee> wrote:
...
>You sure get some points for originality in the software development.
>Brings the no-code movement to new heights. Maybe we should call it
>blind-source development?

Blah, blah, blah.

>My only question still remains: why???

This isn't a question. It's just a slam.

(OP's situation is perfectly clear to me)


Paavo Helde

Jan 27, 2023, 10:57:27 AM
27.01.2023 16:20 Frederick Virchanza Gotham wrote:
> On Friday, January 27, 2023 at 1:14:03 PM UTC, Paavo Helde wrote:
>>>
>>> Program A is about 10 times the size of Program B, so it makes sense to put B into A rather than put A into B.
>> This is non-sequitur. Also, in the previous line you said you put both of them in C.
>
> C is the combination of A and B. In order to create C, I had three options:
> (1) Start with nothing, then add in A, then add in B
> (2) Start with A, then add in B
> (3) Start with B, then add in A
>
> I opted for option 2. So I forked A on Github, and started copying files from B into A.
>
>> So what are the timings and which operations are too slow, and by how
>> much? I.e. is this project going to solve a real or an imagined problem?
>
> The main reason I'm doing this is to greatly simplify the running of these two programs. As things stand now, you have to run program A with a load of options:
>
> progA --opt1 --opt2 --opt3=monkey.txt -k2 -f6 -m8 -c7 --save-n2
>
> and then you have to create a virtual network device, then you've to alter the routing table, then you've to wait for progA to take effect, and then you've to analyse the effect that progA has had, and then you take what you analysed about progA and feed it as the command line to progB along with another bunch of command line arguments:
>
> progB --opt1 --opt2=something_from_progA --opt3 -k -m -n3 --open-n2
>
> I will be able to reduce this all to a simple one-liner at the command line:
>
> progC --my-simple-option
>
> Literally you will only need to give one simple command line argument when starting Program C.

Sounds like a perfect job for a shell script.

>
> Program C will start Program A (as a new thread), it will wait until A's ready, then it will create a virtual network device, then it will analyse the routing table and make changes, then it will run program B to communicate with program A.
>
>> You are planning extensive modifications in the source code of both A
>> and B. What prevents you to change the name of main() and other
>> conflicting symbols in the source code of A and B?
>
> I *do* change the symbol names, but not in the C++ source and header files. I wait until the object files are produced and then I use 'objcopy --redefine-sym' on the object files. It works and it means I can automate the process without having to write a C/C++ parser.

This is insane. Why would you need a C++ parser? How many name conflicts
do you exactly have, something like 3? Why do you need to automate
replacing them?

Compiling and linking libraries (static or dynamic) is very common
practice. I have some programs with tens of third-party libraries linked
in, mostly as static libraries. Never ever have I needed to use objcopy
or C++ parser with that.

It's true that when some C code is not written with the mindset to be
used in a library, it might contain some too generic names which may
easily get into conflict with other code. In C they have their own hacks
to cope with that. In C++ we luckily have a standard way to solve this,
just put all code in a library-specific namespace.


Christian Gollwitzer

Jan 27, 2023, 1:53:23 PM
On 27.01.23 at 15:20, Frederick Virchanza Gotham wrote:
> The main reason I'm doing this is to greatly simplify the running of these two programs. As things stand now, you have to run program A with a load of options:
>
> progA --opt1 --opt2 --opt3=monkey.txt -k2 -f6 -m8 -c7 --save-n2
>
> and then you have to create a virtual network device, then you've to alter the routing table, then you've to wait for progA to take effect, and then you've to analyse the effect that progA has had, and then you take what you analysed about progA and feed it as the command line to progB along with another bunch of command line arguments:
>
> progB --opt1 --opt2=something_from_progA --opt3 -k -m -n3 --open-n2
>
> I will be able to reduce this all to a simple one-liner at the command line:
>
> progC --my-simple-option
>
> Literally you will only need to give one simple command line argument when starting Program C.

Sounds like Program C could be a smallish shell script. Bash is an
excellent language for this kind of thing.


Christian

Christian Gollwitzer

Jan 27, 2023, 2:03:25 PM
On 27.01.23 at 16:32, Kenny McCormack wrote:
> In article <tr0d6f$1m1sc$1...@dont-email.me>,
> Paavo Helde <ees...@osa.pri.ee> wrote:
> ...
>> You sure get some points for originality in the software development.
>> Brings the no-code movement to new heights. Maybe we should call it
>> blind-source development?
>
> Blah, blah, blah.
>
>> My only question still remains: why???
>
> This isn't a question. It's just a slam.
>
> (OP's situation is perfectly clear to me)
>

The situation may be clear, but I can't understand how anyone in their
right mind could think that editing object files is a good idea *when
you have the source code*


Christian

Chris M. Thomasson

Jan 27, 2023, 3:31:42 PM
On 1/27/2023 5:13 AM, Paavo Helde wrote:
> 27.01.2023 13:44 Frederick Virchanza Gotham wrote:
>> On Friday, January 27, 2023 at 11:40:19 AM UTC, Paavo Helde wrote:
>>
>>> You sure get some points for originality in the software development.
>>> Brings the no-code movement to new heights. Maybe we should call it
>>> blind-source development?
>>>
>>> My only question still remains: why???
>>
>>
>> Program A writes into a TCP socket.
>>
>> Program B reads in from a TCP socket.
>>
>> I'm making one program, i.e. Program C, out of them.
>>
>> Program A is about 10 times the size of Program B, so it makes sense
>> to put B into A rather than put A into B.
>
> This is non-sequitur. Also, in the previous line you said you put both
> of them in C.
>
>>
>> Program C will have two threads, one thread running the 'main' of
>> Program A, and one thread running the 'main' of Program B.
>>
>> Since Program A and Program B are now in the one process and have
>> access to each other's memory, I can do away with the TCP socket
>> between them, and replace it with a lockfree container, e.g.
>> boost::lockfree::spsc_queue.

You have to be careful with them.


> So what are the timings and which operations are too slow, and by how
> much? I.e. is this project going to solve a real or an imagined problem?
>
> You are planning extensive modifications in the source code of both A
> and B. What prevents you to change the name of main() and other
> conflicting symbols in the source code of A and B?

Fwiw, are you familiar with the two lock queue?

https://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html

Chris M. Thomasson

Jan 27, 2023, 3:34:08 PM
You can share memory between processes.

Chris M. Thomasson

Jan 27, 2023, 3:35:43 PM
Oops, I meant to respond to:

Frederick Virchanza Gotham

Sorry Paavo.

Frederick Virchanza Gotham

Jan 28, 2023, 8:34:25 AM
On Thursday, January 26, 2023 at 4:50:38 PM UTC, Frederick Virchanza Gotham wrote:
> On Linux x86_64, I've taken the code for an opensource executable program,
> and I've changed the final linker command to add "-fPIC -shared" so that I get
> a dynamic shared library instead of an executable file.

It's actually three programs instead of two now.

I've taken the ssh client from 'openssh' and I've added two other programs to it:
- The 'tun2socks' program from badvpn
- The 'route' program from busybox

So my new program will analyse the routing table, find an available private network (e.g. 10.10.10.0/24), create a TUN device and set its IP address, then get the ssh client to connect and establish a SOCKS server, then get tun2socks to forward traffic from the TUN to the SOCKS. So then you will be able to use any remote SSH server as a transparent proxy simply by doing:

ssh user@server --vpn

The selling point here though is that you don't need admin rights on the remote server.
I hope to have this in good working order by the end of February. Plus I'll build it as a static executable that doesn't need any shared libraries, and then I'll make a fat binary for a few different architectures (x86, x86_64, arm32, aarch64), I might even build it for macOS too. Afterward I might make a GUI in wxWidgets.

Frederick Virchanza Gotham

Jan 28, 2023, 8:44:18 AM
On Friday, January 27, 2023 at 7:03:25 PM UTC, Christian Gollwitzer wrote:
>
> The situation maybe clear, but I can't understand how anyone in their
> right mind could think that editing object files is a good idea *when
> you have the source code*

Because editing object files can be automated. Use 'nm file.o' to get all the symbols, then use 'objcopy file.o --redefine-sym main=pro_main' to edit the symbol names. You can put a prefix on *every* symbol name and then forget your worries about name collisions.

If I want to edit the name of a function or a variable in C++ source and header files, I need to do it manually myself. I can't just do a 'Find & Replace' in files because there might be a global variable named 'monkey' and also a stack variable within a function named 'monkey'.

In my last job writing firmware for embedded Linux cameras, I made good use of 'objcopy' and also 'patchelf' to automate these processes. Once you have automated these processes, you can upgrade the 3rd party libraries to the latest version without having to go fixing name collisions all over again.

The formats of object files and ELF files are well documented, so you don't need to be squeamish about editing these files.

You'd wanna see some of the sorcery I've been able to pull off with 'patchelf', it is a very beautiful little program.

Paavo Helde

Jan 28, 2023, 9:00:51 AM
Just a side note: BusyBox is distributed under GPL v2, so if you want to
distribute your program, you must also make your source code available.

Mut...@dastardlyhq.com

Jan 28, 2023, 10:54:00 AM
On Sat, 28 Jan 2023 05:44:10 -0800 (PST)
Frederick Virchanza Gotham <cauldwel...@gmail.com> wrote:
>On Friday, January 27, 2023 at 7:03:25 PM UTC, Christian Gollwitzer wrote:
>>
>> The situation maybe clear, but I can't understand how anyone in their
>> right mind could think that editing object files is a good idea *when
>> you have the source code*
>
>Because editing object files can be automated. Use 'nm file.o' to get all
>the symbols, then use 'objcopy file.o --redefine-sym main=pro_main' to
>edit the symbol names. You can put a prefix on *every* symbol name and then
>forget your worries about name collisions.

Brilliant idea! Until a bug occurs in the program and some poor maintenance
programmer comes along who isn't aware that the binary doesn't match the source
code.

If anyone in my team edited binaries directly on a production system they'd be
out the door.

Frederick Virchanza Gotham

Jan 28, 2023, 12:22:44 PM
On Saturday, January 28, 2023 at 3:54:00 PM UTC, Mut...@dastardlyhq.com wrote:
>
> >Because editing object files can be automated. Use 'nm file.o' to get all
> >the symbols, then use 'objcopy file.o --redefine-sym main=pro_main' to
> >edit the symbol names. You can put a prefix on *every* symbol name and then
> >forget your worries about name collisions.
>
> Brilliant idea! Until a bug occurs in the program and some poor maintenance
> programmer comes along who isn't aware that the binary doesn't match the source
> code.


I don't know what you mean here when you say 'The binary doesn't match the source code'.

The object files will become part of an executable, which I will later strip, so it doesn't matter what the symbols were.


> If anyone in my team edited binaries directly on a production system they'd be
> out the door.


Object files and ELF files have a more rigid structure and format than C++ source and header files. It doesn't make sense that you're more eager to alter the latter.

When making any kind of alteration to a program, there's always the risk of introducing a bug, however I minimise this risk by making the most risk-free alteration.

Renaming a variable or function in a C++ source file has a lot more implications than renaming it in an object file or in a dynamic shared library file.

Mut...@dastardlyhq.com

Jan 28, 2023, 12:31:14 PM
On Sat, 28 Jan 2023 09:22:32 -0800 (PST)
Frederick Virchanza Gotham <cauldwel...@gmail.com> wrote:
>On Saturday, January 28, 2023 at 3:54:00 PM UTC, Mut...@dastardlyhq.com wrote:
>>
>> >Because editing object files can be automated. Use 'nm file.o' to get all
>> >the symbols, then use 'objcopy file.o --redefine-sym main=pro_main' to
>> >edit the symbol names. You can put a prefix on *every* symbol name and then
>> >forget your worries about name collisions.
>>
>> Brilliant idea! Until a bug occurs in the program and some poor maintenance
>> programmer comes along who isn't aware that the binary doesn't match the
>> source code.
>
>
>I don't know what you mean here when you say 'The binary doesn't match the
>source code'.

You know the symbol names map on to functions and variables in the source
right? How do you think debuggers work?

>The object files will become part of an executable, which I will later strip,
>so it doesn't matter what the symbols were.

Well if you strip them then it doesn't matter. Most people don't.

>> If anyone in my team edited binaries directly on a production system they'd
>> be out the door.
>
>
>Object files and ELF files have a more rigid structure and format than C++
>source and header files. It doesn't make sense that you're more eager to alter
>the latter.

??????!!!!

>Renaming a variable or function in a C++ source file has a lot more
>implications that renaming it in an object file or in a dynamic shared library
>file.

I'm lost for words.

Paavo Helde

Jan 28, 2023, 4:54:25 PM
Muttley, it's rare that I agree with you, but this time I do.

Frederick Virchanza Gotham

Jan 28, 2023, 6:28:28 PM
On Saturday, January 28, 2023 at 5:31:14 PM UTC, Mut...@dastardlyhq.com wrote:
>
> You know the symbol names map on to functions and variables in the source
> right? How do you think debuggers work?

In the debugger, the symbol will be "busybox_read_back_twice" instead of "read_back_twice".

> Well if you strip them then it doesn't matter. Most people don't.

I can't recall if I've ever been given a release executable with the symbols left inside. I don't think I ever have.

> ??????!!!!
> I'm lost for words.

Let's say you combine two programs together and you try to link and you get a multiple definition error for a symbol called 'options' -- just like I did when I tried to combine 'busybox' with 'openssh'. Let's say that this symbol is used 17 times in busybox, and 63 times in openssh. Are you going to do a 'Find & Replace in Files' to change all of them? And what about if you see the following in the code:

int options;

namespace UDP {
    int options;
}

namespace TCP {
    int options;

    void Func(void)
    {
        using namespace UDP;

        int j = options;
    }
}

Will you meticulously check all 80 uses of 'options' in both programs to make sure you're replacing the correct one? Or will you just replace all of them, so that even local stack variables are affected?

And what about when the preprocessor is used to make a variable name? For example:

#define OPT(name) name ## _option

int OPT(global) = 7;

int main(void)
{
    global_option = 7;
}

After you've gone over all 80 uses, a few months go by and a new version of the library comes out, so now you have 76 uses instead of 80, and they're in different places, so you have to go over them all over again.

Are you seriously telling me that editing the C++ source files will be less susceptible to introducing bugs than if you were to simply compile the source files to object files and then follow three simple steps:

Step 1) Get a list of all the exported symbols in all the object files:
    find -iname "*.o" | xargs -i -r -n1 nm "{}" | grep -Ev "( U )|( W )|( w )" | cut -d ' ' -f3 | sort | uniq > a.txt
Step 2) Make a list of command line options to give to 'objcopy':
    cat a.txt | awk '{print "--redefine-sym "$$s"=busybox_"$$s }' > b.txt
Step 3) Run 'objcopy' on all the object files:
    find -iname "*.o" | while read line; do cat b.txt | xargs -r -n2 objcopy $$line; done

I've put these 3 steps into a Makefile, and so now in the future if I upgrade either 'busybox' or 'badvpn', I don't need to go sorting out name collisions all over again.
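
For readers who want to see what Step 2 produces, here is a minimal C++ sketch of the same idea: turning a list of symbol names into '--redefine-sym old=new' arguments for objcopy. The function name 'make_redefine_args' and the sample symbols are hypothetical, chosen only for illustration, and the sketch only builds strings; it does not run objcopy. (The doubled '$$' in the commands above appears to be Makefile escaping for a single '$'.)

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the objcopy arguments that rename each symbol
// in `symbols` to `prefix + symbol`. It only builds strings.
std::vector<std::string> make_redefine_args(const std::vector<std::string>& symbols,
                                            const std::string& prefix)
{
    std::vector<std::string> args;
    for (const std::string& sym : symbols) {
        if (sym.empty()) continue;          // skip blank lines from the symbol list
        args.push_back("--redefine-sym");
        args.push_back(sym + "=" + prefix + sym);
    }
    return args;
}
```

Called with the symbols "options" and "read_back_twice" and the prefix "busybox_", this yields the four arguments '--redefine-sym options=busybox_options --redefine-sym read_back_twice=busybox_read_back_twice'.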

I think your main boggle with what I'm doing owes to a decades-old belief that object files shouldn't be meddled with. The two programs, 'objcopy' and 'patchelf', are well written, and they do their job properly. There's nothing proprietary or elusive about the format of object files. In my last job I used 'patchelf' in a Makefile to compensate for a bug in a 3rd-party proprietary driver that I only had machine code for.

Editing object files is safer and quicker than editing C++ source files.

Paavo Helde

Jan 29, 2023, 3:20:07 AM

On 29.01.2023 01:28, Frederick Virchanza Gotham wrote:
> On Saturday, January 28, 2023 at 5:31:14 PM UTC, Mut...@dastardlyhq.com wrote:
>>
>> You know the symbol names map on to functions and variables in the source
>> right? How do you think debuggers work?
>
> In the debugger, the symbol will be "busybox_read_back_twice" instead of "read_back_twice".
>
>> Well if you strip them then it doesn't matter. Most people don't.
>
> I can't recall if I've ever been given a release executable with the symbols left inside. I don't think I ever have.
>
>> ??????!!!!
>> I'm lost for words.
>
> Let's say you combine two programs together and you try to link and you get a multiple definition error for a symbol called 'options' -- just like I did when I tried to combine 'busybox' with 'openssh'. Let's say that this symbol is used 17 times in busybox, and 63 times in openssh. Are you going to do a 'Find & Replace in Files' to change all of them? And what about if you see the following in the code:
>
> int options;
>
> namespace UDP {
>     int options;
> }
>
> namespace TCP {
>     int options;
>
>     void Func(void)
>     {
>         using namespace UDP;
>
>         int j = options;
>     }
> }
>
> Will you meticulously check all 80 uses of 'options' in both programs to make sure you're replacing the correct one? Or will you just replace all of them, so that even local stack variables are affected?

Why on earth should you search for 'options'? If there is a conflicting
function name 'options', you search for 'options(' in the "match whole
word" mode. It is enough to change the declaration and the definition (2
places in normal source code). Then you recompile the project and fix
the error lines (in case of C, you might need to turn the corresponding
warning into an error first).

This is a one-time activity and you get clean code as a result. For
updating git forks one is supposed to use git merge anyway, which will
cope with such changes, so there is no need to automate such things.
Without git merge, how else are you planning to keep your added no-lock
queue changes intact in the source code?

An alternative would be to compile the library as a shared .so with
hidden symbols, except for the one which you will call. I do not like
that approach very much because it requires a platform-specific compiler
option, but it's still better than a whole build step consisting of
platform-specific hacks.
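
A minimal sketch of the hidden-symbols alternative, assuming GCC or Clang on an ELF platform and a build along the lines of 'g++ -shared -fPIC -fvisibility=hidden'. The names 'PROG_EXPORT', 'prog_main' and 'count_flags' are hypothetical. Under that assumption only the explicitly marked entry point is exported from the library, so a global such as 'options' no longer collides with another library's 'options'.

```cpp
// Assumption: GCC/Clang visibility attribute; other compilers need a different spelling.
#define PROG_EXPORT __attribute__((visibility("default")))

// With -fvisibility=hidden this global is not exported from the .so.
int options = 0;

// Internal helper, also hidden: counts arguments that start with '-'.
int count_flags(int argc, char** argv)
{
    int n = 0;
    for (int i = 1; i < argc; ++i)
        if (argv[i] != nullptr && argv[i][0] == '-') ++n;
    return n;
}

// The single exported entry point, with C linkage so dlsym can find it by name.
extern "C" PROG_EXPORT int prog_main(int argc, char** argv)
{
    options = count_flags(argc, argv);
    return options;
}
```

The caller would then look up "prog_main" with dlsym, in the same way 'main' was looked up at the start of this thread.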

When developing software, the aim is to make things simpler after each
alteration, not more complicated. Each time when you add a kludgy hack,
you make the code twice worse. Add 4 such hacks, and you have a program
which is 16 times more difficult to deal with, meaning that you are not
able to maintain it any more.


David Brown

Jan 29, 2023, 5:47:05 AM

On 29/01/2023 00:28, Frederick Virchanza Gotham wrote:
> On Saturday, January 28, 2023 at 5:31:14 PM UTC, Mut...@dastardlyhq.com wrote:
>>
>> You know the symbol names map on to functions and variables in the source
>> right? How do you think debuggers work?
>
> In the debugger, the symbol will be "busybox_read_back_twice" instead of "read_back_twice".
>
>> Well if you strip them then it doesn't matter. Most people don't.
>
> I can't recall if I've ever been given a release executable with the symbols left inside. I don't think I ever have.
>
>> ??????!!!!
>> I'm lost for words.
>

It is certainly one of the most bizarre hacks I have heard of for a while.


> Let's say you combine two programs together and you try to link and
> you get a multiple definition error for a symbol called 'options' --
> just like I did when I tried to combine 'busybox' with 'openssh'.
> Let's say that this symbol is used 17 times in busybox, and 63 times
> in openssh. Are you going to do a 'Find & Replace in Files' to change
> all of them? And what about if you see the following in the code:

Yes, going through the source code and making the changes in the right
places is /absolutely/ the thing you have to do. Hacking the generated
object code is insanity and a maintainer's worst nightmare.

But a simple search-and-replace is a clumsy way to handle it - you can
use better tools and get better results. A good IDE can figure out
every point in a project that references a particular symbol,
differentiating between local variables, functions, internal and
external linkage. Often it is just a matter of choosing the "refactor -
rename identifier" tool and the job is done.

Another good method is to rename the original variable at definition and
declaration. If the code is well-written, with a single declaration in
a single header, then a re-compile will show errors for all the
references. Fix the errors one by one, and you have re-named the
variable (or function, or whatever).


For a different kind of hack, that is IMHO significantly less bad than
hacking the object code, you could add gcc flags like:

-Doptions=BUSY_BOX_options

and

-Doptions=OPENSSH_options

to the different makefile parts. That will lead to a renaming of the
symbols.
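
A minimal sketch of what those flags do, assuming each program keeps its own global called 'options'. The '#define' lines stand in for passing -Doptions=BUSY_BOX_options or -Doptions=OPENSSH_options on the command line, and both programs are squeezed into one file purely for illustration; in a real build they would be separate translation units with per-directory flags. The accessor function names and the values are hypothetical.

```cpp
// Stands in for compiling the busybox sources with -Doptions=BUSY_BOX_options.
#define options BUSY_BOX_options
int options = 17;                          // actually defines BUSY_BOX_options
int busybox_get_options() { return options; }
#undef options

// Stands in for compiling the openssh sources with -Doptions=OPENSSH_options.
#define options OPENSSH_options
int options = 63;                          // actually defines OPENSSH_options
int openssh_get_options() { return options; }
#undef options

// After the renaming, both globals coexist in one link without a clash.
int sum_of_both() { return BUSY_BOX_options + OPENSSH_options; }
```

Note that the macro rewrites every 'options' token in the affected sources, including local variables and struct members, so it has to be applied consistently to all files of a given program.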



Frederick Virchanza Gotham

Jan 29, 2023, 10:57:43 AM

Paavo Helde wrote:
> When developing software, the aim is to make things
> simpler after each alteration, not more complicated.
> Each time when you add a kludgy hack, you make the
> code twice worse. Add 4 such hacks, and you have a
> program which is 16 times more difficult to deal with,
> meaning that you are not able to maintain it any more.

David Brown wrote:
> It is certainly one of the most bizarre hacks I have
> heard of for a while.

Both of you are speaking from a viewpoint that's been
engendered and indoctrinated in you, rather than just
looking at my solution for what it is. If you consider the
editing of compiled files to be an abomination, then my
solution is an abomination. If you don't have any qualms
about editing object files, then I've given a few reasons
why my solution is superior to editing source files.

You have suggested just changing the declaration and
definition and then cleaning up the resultant compiler errors,
but that's work that might introduce bugs. And you've to
re-do it every time the library is upgraded.

This all boils down to one simple issue: Can we depend on
'objcopy' and 'patchelf' to do their job properly without
creating unforeseen problems? I believe that we can, and so I
depend on them.

Let's not make this out to be a simple case of "I think my solution
is better than your solution". This is more of a cultural matter
-- with the binary editors on one side, and the binary intacters
on the other. Cultural clash. Within one lifetime it's unlikely
either of us will defect.

Frederick Virchanza Gotham

Jan 29, 2023, 5:44:21 PM

On Sunday, January 29, 2023 at 3:57:43 PM UTC, Frederick Virchanza Gotham wrote:
<snip>

Just now I got the 'ssh client' from 'openssh' to compile and link as a static executable, with both 'busybox_route' and 'badvpn_tun2socks' built in. I'm really looking forward to seeing how this turns out.

Here's what my final linker command looks like:

g++ -o ssh -static -std=c++23 ssh.o readconf.o clientloop.o sshtty.o sshconnect.o sshconnect2.o mux.o ssh-sk-client.o vpn/vpn-core.cpp.o vpn/vpn-linux-tun.o vpn/from_busybox_route/xconnect.c.o vpn/from_busybox_route/inet_common.c.o vpn/from_busybox_route/route.c.o vpn/from_busybox_route/xfuncs_printf.c.o vpn/from_busybox_route/perror_msg.c.o vpn/from_busybox_route/xfunc_die.c.o vpn/from_busybox_route/ptr_to_globals.c.o vpn/from_busybox_route/signals.c.o vpn/from_busybox_route/verror_msg.c.o vpn/from_busybox_route/read.c.o vpn/from_busybox_route/fflush_stdout_and_exit.c.o vpn/from_busybox_route/time.c.o vpn/from_busybox_route/messages.c.o vpn/from_busybox_route/wfopen.c.o vpn/from_busybox_route/xfuncs.c.o vpn/from_busybox_route/full_write.c.o vpn/from_busybox_route/default_error_retval.c.o vpn/from_busybox_route/xatonum.c.o vpn/from_busybox_route/sysconf.c.o vpn/from_busybox_route/copyfd.c.o vpn/from_busybox_route/bb_strtonum.c.o vpn/from_busybox_route/getopt32.c.o vpn/from_busybox_route/safe_write.c.o vpn/from_busybox_route/safe_strncpy.c.o vpn/from_busybox_route/llist.c.o vpn/from_busybox_route/appletlib_STRIPPED_DOWN.c.o vpn/from_badvpn_tun2socks/SingleStreamReceiver.c.o vpn/from_badvpn_tun2socks/BReactor_badvpn.c.o vpn/from_badvpn_tun2socks/ip6_addr.c.o vpn/from_badvpn_tun2socks/ip4.c.o vpn/from_badvpn_tun2socks/BLog.c.o vpn/from_badvpn_tun2socks/BufferWriter.c.o vpn/from_badvpn_tun2socks/StreamPacketSender.c.o vpn/from_badvpn_tun2socks/PacketPassConnector.c.o vpn/from_badvpn_tun2socks/BTap.c.o vpn/from_badvpn_tun2socks/ip6.c.o vpn/from_badvpn_tun2socks/BNetwork.c.o vpn/from_badvpn_tun2socks/SocksUdpClient.c.o vpn/from_badvpn_tun2socks/tcp_in.c.o vpn/from_badvpn_tun2socks/BTime.c.o vpn/from_badvpn_tun2socks/icmp.c.o vpn/from_badvpn_tun2socks/KeepaliveIO.c.o vpn/from_badvpn_tun2socks/SinglePacketSender.c.o vpn/from_badvpn_tun2socks/UdpGwClient.c.o vpn/from_badvpn_tun2socks/nd6.c.o vpn/from_badvpn_tun2socks/BProcess.c.o vpn/from_badvpn_tun2socks/mem.c.o 
vpn/from_badvpn_tun2socks/timeouts.c.o vpn/from_badvpn_tun2socks/pbuf.c.o vpn/from_badvpn_tun2socks/udp.c.o vpn/from_badvpn_tun2socks/def.c.o vpn/from_badvpn_tun2socks/ip4_addr.c.o vpn/from_badvpn_tun2socks/BInputProcess.c.o vpn/from_badvpn_tun2socks/icmp6.c.o vpn/from_badvpn_tun2socks/BDatagram_common.c.o vpn/from_badvpn_tun2socks/init.c.o vpn/from_badvpn_tun2socks/inet_chksum.c.o vpn/from_badvpn_tun2socks/PacketRecvBlocker.c.o vpn/from_badvpn_tun2socks/tcp_out.c.o vpn/from_badvpn_tun2socks/sys.c.o vpn/from_badvpn_tun2socks/RouteBuffer.c.o vpn/from_badvpn_tun2socks/PacketPassNotifier.c.o vpn/from_badvpn_tun2socks/BDatagram_unix.c.o vpn/from_badvpn_tun2socks/StreamPassConnector.c.o vpn/from_badvpn_tun2socks/netif.c.o vpn/from_badvpn_tun2socks/PacketPassPriorityQueue.c.o vpn/from_badvpn_tun2socks/BUnixSignal.c.o vpn/from_badvpn_tun2socks/BPending.c.o vpn/from_badvpn_tun2socks/memp.c.o vpn/from_badvpn_tun2socks/StreamRecvInterface.c.o vpn/from_badvpn_tun2socks/ip4_frag.c.o vpn/from_badvpn_tun2socks/PacketPassFairQueue.c.o vpn/from_badvpn_tun2socks/stats.c.o vpn/from_badvpn_tun2socks/BSignal.c.o vpn/from_badvpn_tun2socks/DebugObject.c.o vpn/from_badvpn_tun2socks/PacketProtoDecoder.c.o vpn/from_badvpn_tun2socks/BThreadSignal.c.o vpn/from_badvpn_tun2socks/BConnection_unix.c.o vpn/from_badvpn_tun2socks/PacketRecvConnector.c.o vpn/from_badvpn_tun2socks/PacketRecvInterface.c.o vpn/from_badvpn_tun2socks/ip6_frag.c.o vpn/from_badvpn_tun2socks/SinglePacketBuffer.c.o vpn/from_badvpn_tun2socks/BConnection_common.c.o vpn/from_badvpn_tun2socks/PacketPassFifoQueue.c.o vpn/from_badvpn_tun2socks/PacketPassInactivityMonitor.c.o vpn/from_badvpn_tun2socks/PacketProtoEncoder.c.o vpn/from_badvpn_tun2socks/SingleStreamSender.c.o vpn/from_badvpn_tun2socks/PacketStreamSender.c.o vpn/from_badvpn_tun2socks/PacketProtoFlow.c.o vpn/from_badvpn_tun2socks/ip.c.o vpn/from_badvpn_tun2socks/tcp.c.o vpn/from_badvpn_tun2socks/BSocksClient.c.o vpn/from_badvpn_tun2socks/LineBuffer.c.o 
vpn/from_badvpn_tun2socks/PacketPassInterface.c.o vpn/from_badvpn_tun2socks/StreamRecvConnector.c.o vpn/from_badvpn_tun2socks/BLog_syslog.c.o vpn/from_badvpn_tun2socks/PacketBuffer.c.o vpn/from_badvpn_tun2socks/PacketCopier.c.o vpn/from_badvpn_tun2socks/PacketRouter.c.o vpn/from_badvpn_tun2socks/StreamPassInterface.c.o vpn/from_badvpn_tun2socks/BLockReactor.c.o -L. -Lopenbsd-compat/ -Wl,-z,relro -Wl,-z,now -Wl,-z,noexecstack -fstack-protector-strong -pie -lssh -lopenbsd-compat -lcrypto -lz

David Brown

Jan 30, 2023, 2:19:14 AM

On 29/01/2023 16:57, Frederick Virchanza Gotham wrote:
> Paavo Helde wrote:
>> When developing software, the aim is to make things
>> simpler after each alteration, not more complicated.
>> Each time when you add a kludgy hack, you make the
>> code twice worse. Add 4 such hacks, and you have a
>> program which is 16 times more difficult to deal with,
>> meaning that you are not able to maintain it any more.
>
> David Brown wrote:
>> It is certainly one of the most bizarre hacks I have
>> heard of for a while.
>
> Both of you are speaking from a viewpoint that's been
> engendered and indoctrinated in you, rather than just
> looking at my solution for what it is. If you consider the
> editing of compiled files to be an abomination, then my
> solution is an abomination. If you don't have any qualms
> about editing object files, then I've given a few reasons
> why my solution is superior to editing source files.
>

That is a completely meaningless thing to write. Yes, if I think that
editing compiled files is a terrible idea, then I will think your idea
of editing compiled files is terrible - and if I don't have anything
against editing compiled files, then I won't object to it.

I /do/ have something against it - and it is not indoctrination.
Frankly, I haven't heard anyone suggest it before now, much less argue
either for or against it.

The norm, however, is that programmers write or edit source code, and it
is compiled and then linked. If you do something that is wildly
breaking that norm, you are going to cause chaos to anyone maintaining
the code or working with it later - so it is not something to consider
without a /huge/ benefit. And you don't have a huge benefit - you've
got nothing more than the laziness of not wanting to edit a few files.
(I don't think combining these programs like this in the first place is
a great idea, but that's a different matter.)

> You have suggested just changing the declaration and
> definition and then cleaning up the resultant compiler errors,
> but that's work that might introduce bugs. And you've to
> re-do it every time the library is upgraded.

You've got to re-do your Frankenstein program for every code change
anyway. This is one reason why it is a bad idea to mix the different
programs (especially when they are security-related programs). You are
far better off using the programs as separate programs, or using libraries.

Once you have done the renaming once, you have a patchset and a git
branch. For many small updates to the original programs, you merely
need to re-apply the patches.

>
> This all boils down to one simple issue: Can we depend on
> 'objcopy' and 'patchelf' to do their job properly without
> creating unforeseen problems? I believe that we can, and so I
> depend on them.
>
> Let's not make this out to be a simple case of "I think my solution
> is better than your solution". This is more of a cultural matter
> -- with the binary editors on one side, and the binary intacters
> on the other. Cultural clash. Within one lifetime it's unlikely
> either of us will defect.

It is not about reliability of tools - it's about traceability and
making a clear, maintainable build that takes source and results in a
binary.

"Binary editing" can have its place. I use it all the time in my own
work in embedded systems - it is standard practice that after my
programs are built, debugged and tested as ELF files, I use objcopy to
generate binary images to be flashed into an embedded system. It is
quite common that the binary file is augmented after the objcopy, with
checksums, programming information, and the like.

But that is done in a /controlled/ manner, an /expected/ manner - it is
a natural part of the build process. It is not some weird hack done in
code you don't really understand, in a way you don't fully comprehend,
as a lazy alternative to better methods.
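
As an illustration of the controlled post-objcopy step described above, here is a minimal sketch that appends a checksum to a raw image held in memory. The checksum used (a 32-bit sum of all bytes, stored little-endian) and the name 'append_checksum' are assumptions made for the example; the post does not say which algorithm or layout is used in practice.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: appends a 32-bit additive checksum (little-endian)
// to the end of a binary image. Real projects often use a CRC instead.
void append_checksum(std::vector<std::uint8_t>& image)
{
    std::uint32_t sum = 0;
    for (std::uint8_t byte : image)
        sum += byte;                       // unsigned arithmetic wraps modulo 2^32

    // Store the checksum least significant byte first.
    for (int shift = 0; shift < 32; shift += 8)
        image.push_back(static_cast<std::uint8_t>((sum >> shift) & 0xFFu));
}
```

A build step of this kind is repeatable and visible in the build scripts, which is the distinction being drawn here.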


Frederick Virchanza Gotham

Jan 30, 2023, 3:25:48 AM