
Getting rid of libstdc++?


John Smith

Mar 8, 2005, 4:19:02 PM
I had this crazy dream the other day... Imagine a static library written in
C++ being deployed and used by many users who would have no problem linking
it and using it no matter their gcc/g++ version... ok, back to reality!

After a while it has become obvious to me that C++ lacks a stable ABI and, as
another member of this group pointed out earlier, that makes a static
library depend on both libc and libstdc++. I'd like to make the code "act"
to the outside world as if it were written in C, to overcome some of the
difficulties that come from using C++.
This means the exported functions have C prototypes (that's easy to do with
extern "C"), but the functions imported from external libs should not be
C++ either.
My library is almost C as far as external references go, but I do use a tiny bit
of the STL. More specifically, it's map, vector, list and string. I know I can find
alternative STL implementations like STLport, but I don't know how much that
will help me.

What are my chances of removing all dependencies on libstdc++ and being able
to use gcc as the frontend linker instead of g++?
Someone pointed out to me earlier that because I use C++ there is a
compatibility problem. However, I haven't understood the whole problem yet.
Let's assume I use Red Hat 9 to compile my code (static library libabcd.a).
Under what circumstances could I expect (or not expect) that it would link
under e.g. SuSE 8.1 or some other popular distribution used in the commercial
world?
The same question could be applied in many scenarios. How about code
compiled under Mac OS X 10.3 and linked under 10.2? (10.3 comes with gcc 3.3
and 10.2 comes with gcc 3.1)

-- John


Paul Pluzhnikov

Mar 9, 2005, 11:22:09 PM
John Smith wrote:

> I had this crazy dream the other day... Imagine a static library written in
> C++ being deployed and used by many users who would have no problem linking
> it and using it no matter their gcc/g++ version... ok back to reality!

If you replace "static" with "dynamic", your dream becomes
quite achievable.

> How would my chances be to remove all dependencies to libstdc++ and be able
> to use gcc as frontend linker instead of g++?

For an "archive" (aka "static") library the chances are nil.

> Someone pointed out earlier to me that because I used C++ there is a problem
> regarding compatibility. However I haven't understood the whole problem yet.

One of the problems is that your library depends on symbols that may not
be defined on the target system (even for linking with 'g++').

> Let's assume I use Redhat 9 to compile my code (static library libabcd.a).

Ok, that OS shipped with gcc-3.2.2

> Under what circumstances could I expect (or the opposite) that it would link
> under e.g. Suse 8.1

That shipped with gcc-3.2, so your chances of success (with g++, not
with gcc) here are pretty good.

> or some other popular distribution used by commercial world?

Let's take SuSE 7.3 instead, which shipped with gcc-2.95.3.
On that OS, your end-user will not be able to link with your library at
all, because none of its external dependencies will be satisfied (the
name-mangling changed between these two versions, so even something as
simple as '::operator new(size_t)' will be unresolved).

If OTOH you decided to ship a dynamic library, then you can:
- link it against libstdc++.a (so that you don't have any C++ external
dependencies) [1,2], and
- hide all externally-visible C++ symbols with a linker version script,
leaving only the C interface externally visible.

The latter step is important because if you don't, the user may call
your external C++ symbols instead of the ones in his libstdc++, and if
object layout has changed between his version of libstdc++ and yours,
but the name mangling didn't, he will crash and burn.

[1] Provided you comply with libstdc++ license.
[2] On platforms that require PIC code you may need to rebuild your
libstdc++.a with '-fPIC'.
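
As a rough illustration of the two steps above (a sketch under stated assumptions, not an actual build from this thread): the library name abcd, the functions abcd_create, abcd_set, abcd_get and abcd_destroy, and the file name abcd.map are all hypothetical. The source exposes only extern "C" functions over an opaque handle and catches every exception at the boundary; the comment at the top shows one possible GNU ld version script and link line.

```cpp
// abcd.cc: hypothetical library source; every name below is illustrative.
//
// One possible GNU ld version script (abcd.map) that keeps only the C API
// global and makes everything else, including libstdc++ symbols, local:
//     { global: abcd_*; local: *; };
//
// One possible link line, using gcc as the driver so that libstdc++.so is
// not pulled in implicitly (footnotes [1] and [2] above still apply, and
// extra libraries such as -lm may be needed on some systems):
//     g++ -fPIC -c abcd.cc
//     gcc -shared abcd.o `g++ -print-file-name=libstdc++.a` \
//         -Wl,--version-script=abcd.map -o libabcd.so

#include <map>
#include <string>

// C callers only ever hold a pointer to this type, never its layout.
struct abcd_table {
    std::map<std::string, std::string> entries;
};

extern "C" {

// Returns a new table, or a null pointer if allocation fails.
abcd_table* abcd_create() {
    try {
        return new abcd_table();
    } catch (...) {
        return 0;
    }
}

// Stores copies of key and value. Returns 0 on success, -1 on failure.
int abcd_set(abcd_table* table, const char* key, const char* value) {
    if (!table || !key || !value) return -1;
    try {
        table->entries[key] = value;
        return 0;
    } catch (...) {
        return -1;  // no C++ exception may cross the C boundary
    }
}

// Returns the stored value, or a null pointer if the key is absent.
// The pointer is only valid until the entry is overwritten or the
// table is destroyed.
const char* abcd_get(const abcd_table* table, const char* key) {
    if (!table || !key) return 0;
    try {
        std::map<std::string, std::string>::const_iterator it =
            table->entries.find(key);
        return it == table->entries.end() ? 0 : it->second.c_str();
    } catch (...) {
        return 0;
    }
}

// Releases the table; passing a null pointer is harmless.
void abcd_destroy(abcd_table* table) {
    delete table;
}

}  // extern "C"
```

A matching C header would only forward-declare the handle (typedef struct abcd_table abcd_table;) and list the four prototypes, so nothing about std::map or std::string leaks into the interface.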

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.

"Nils O. Selåsdal"

Mar 10, 2005, 5:49:27 AM
Paul Pluzhnikov wrote:
> John Smith wrote:
>
>> I had this crazy dream the other day... Imagine a static library
>> written in
>> C++ being deployed and used by many users who would have no problem
>> linking
>> it and using it no matter their gcc/g++ version... ok back to reality!
>
>
> If you replace "static" with "dynamic", your dream becomes
> quite achievable.
>
>> How would my chances be to remove all dependencies to libstdc++ and be
>> able
>> to use gcc as frontend linker instead of g++?
>
>
> For "archive" (aka "static") library the chances are nil.
>
Why? I've played tricks quite a few times, linking libstdc++
statically to avoid library hell for those installing or using the
binary shipped to the customer.

Paul Pluzhnikov

Mar 10, 2005, 10:53:55 PM
Nils O. Selåsdal wrote:

>> For "archive" (aka "static") library the chances are nil.

Actually, I'd like to revise that statement.

> Why ? I've quite a few times played tricks, and linked libstdc++
> statically

Into an executable? No problem.
Into an archive library? Not likely, unless you did this:

--- revision ---
You can make an archive C++ library that has no external C++
dependencies, by essentially performing the same steps that are required
by the shared library -- statically linking libstdc++.a and hiding
external C++ symbols -- for each individual object. Something along the
lines of:

ar x libfoo.a foo.o
ld -r foo.o /usr/lib/libstdc++.a -o foo1.o
objcopy -G foo1 -G foo2 ... -G fooN foo1.o foo.o
ar ru libfoo.a foo.o
... repeat for other objects ...

Cheers,

Markus Elfring

Mar 11, 2005, 4:51:49 PM
> How would my chances be to remove all dependencies to libstdc++ and be able
> to use gcc as frontend linker instead of g++?
> Someone pointed out earlier to me that because I used C++ there is a problem
> regarding compatibility. However I haven't understood the whole problem yet.

What do you think of the technical details in the following information sources?
1. "How to mix C and C++"
http://www.inf.uni-konstanz.de/~kuehl/c++-faq/mixing-c-and-cpp.html

2. http://en.wikipedia.org/wiki/Application_binary_interface
http://en.wikipedia.org/wiki/Name_mangling

3. Chapters "7 ABI" and "8 Objects Across Borders" of the book "Imperfect C++" (ISBN
0-321-22877-4) by Matthew Wilson
http://imperfectcplusplus.com/

Regards,
Markus


Markus Elfring

Mar 11, 2005, 4:55:47 PM
> If OTOH you decided to ship a dynamic library, then you can:
> [...]

> - hide all externally-visible C++ symbols with a linker version script,
> leaving only the C interface externally visible.

Which commands or instructions have you got in mind here?

Regards,
Markus


John Smith

Mar 13, 2005, 10:39:23 AM
This is certainly very interesting!

> --- revision ---
> You can make an archive C++ library that has no external C++
> dependencies, by essentially performing the same steps that are required
> by the shared library -- statically linking libstdc++.a and hiding
> external C++ symbols -- for each individual object. Something along the
> lines of:
>
> ar x libfoo.a foo.o
> ld -r foo.o /usr/lib/libstdc++.a -o foo1.o
> objcopy -G foo1 -G foo2 ... -G fooN foo1.o foo.o

What is the objcopy step supposed to do?
I just did a quick test on Mac OS X and it seems not to have objcopy at all.
Searching Google for objcopy man pages, it seems it doesn't support a "-G"
parameter.

> ar ru libfoo.a foo.o
> ... repeat for other objects ...

I did a quick test where I used your method above for ld -r foo.o ...
and then linked the file with the new object. What seems very interesting is
that it linked with no errors using gcc as frontend driver instead of g++.
When I went back to the old lib it stopped working with gcc as frontend.

Could you tell me what goes on at the ld step? I'm wondering where the C++ deps
go. I mean, obviously they get merged into the new object, but do the names
disappear, so that you can assume there will be no trouble later?
Let's assume you have machine 1 with GCC 3.3 and libstdc++ version X and
machine 2 with GCC 3.4 and libstdc++ version Y. Then you compile
the object on machine 1 and link on the 2nd. What will happen then?
Let me put it another way... If I used C only, what would the result be, and
what if I use C++ with this method?

-- John


Paul Pluzhnikov

Mar 13, 2005, 8:55:40 PM
"John Smith" <john....@x-formation.com> writes:

> What is the objcopy step supposed to do?

It hides everything except foo1, foo2, ... fooN.

> I just did a quick test on Mac OS X and it seems not to have objcopy at all.
> Searching google for man pages on objcopy it seems it doesn't support "-G"
> parameter.

From info objcopy:

`-G SYMBOLNAME'
`--keep-global-symbol=SYMBOLNAME'
Keep only symbol SYMBOLNAME global. Make all other symbols local
to the file, so that they are not visible externally. This option
may be given more than once.


> I did a quick test where I used your method above for ld -r foo.o ...
> and then linked the file with the new object. What seems very interesting is
> that it linked with no errors using gcc as frontend driver instead of g++.

Yes, the 'ld -r' step is necessary, but not sufficient (as I said,
if you don't hide the symbols, your users are likely to crash and burn
if they link with an ABI-incompatible version of g++).

> When I went back to the old lib it stopped working with gcc as frontend.

Did it result in unresolved symbols? Which ones?

> Could you tell what goes on at the ld step? I'm wondering where the C++ deps
> goes? I mean obviously it gets merged into the new object but does the names
> disappear so you can assume there will be no troubles later?

Well, let's take a practical example:

$ cat junk.cc
extern "C" int *foo() { return new int(42); }
$ /usr/local/gcc-3.4.0/bin/g++ -c junk.cc
$ nm junk.o
U _Znwj
U __gxx_personality_v0
00000000 T foo

So we've got two undefined C++ symbols, the '::operator new' and
the magic __gxx_personality_v0 thingy. Both are defined in the
libstdc++.a, so relinking with 'ld -r' produces:

$ ld -r junk.o `/usr/local/gcc-3.4.0/bin/g++ -print-file-name=libstdc++.a` -o junk1.o
$ nm junk1.o | egrep '_Znwj|__gxx_personality_v0'
00000000 V DW.ref.__gxx_personality_v0
00000000 T _Znwj
00000000 T __gxx_personality_v0

As you can see, the name doesn't "disappear"; it merely is now defined
(which is why you must hide it with objcopy).

The reason it is now defined is that the linker took junk.o, and
objects from libstdc++.a that satisfied any undefined references
from junk.o (new_op.o and eh_personality.o in this case), and
recursively any objects from libstdc++.a that satisfied references
from new_op.o, etc. and merged them all together into junk1.o

You can see this in their sizes:
$ ls -l junk.o junk1.o
-rw-rw-r-- 1 paul users 964 Mar 13 17:19 junk.o
-rw-rw-r-- 1 paul users 179950 Mar 13 17:24 junk1.o

You may wish to read this:
http://webpages.charter.net/ppluzhnikov/linker.html
if you still do not understand what is happening here.

> Let's assume you have machine 1 with GCC 3.3 and libstdc++ version X and then
> you have machine 2 with GCC 3.4 and libstdc++ version Y. Then you compile
> the object on machine 1 and link on the 2nd. What will then happen?

Any number of things could happen:
- the link on 2nd machine could fail (if there are any symbols that
you referenced and that are defined in gcc-3.3-libstdc++ but aren't
in gcc-3.4-libstdc++).

There are quite a number of differences between the gcc-3.3 and
gcc-3.4.3 libstdc++ versions.

Here is but one example:
< _ZNSt8ios_base13_M_grow_wordsEi ## gcc-3.3
> _ZNSt8ios_base13_M_grow_wordsEib ## gcc-3.4.3

So the 'std::ios_base::_M_grow_words()' used to take one parameter,
but now takes 2.

- the link on the 2nd machine may succeed, but the resulting
executable will crash in mysterious ways (e.g. because called
function itself did not change, but the layout of the object upon
which it operates, did).

- the link may succeed and the executable may appear to work,
until e.g. an exception is thrown, or some other rare execution
path is taken.

- the exe will just work. This is only expected when the object C++
dependencies are extremely trivial, such as in the case of
junk.cc above.
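
The _M_grow_words difference listed above can be made visible by demangling both names. A small sketch, assuming a GCC or Clang toolchain that provides abi::__cxa_demangle in <cxxabi.h> (a compiler extension, not standard C++); the helper name demangle is made up for this example.

```cpp
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang extension)
#include <string>

// Returns the readable form of an Itanium-ABI mangled symbol.
// Anything that does not start with "_Z", or that fails to demangle,
// is returned unchanged.
std::string demangle(const std::string& symbol) {
    // Only names with the "_Z" prefix are mangled function/object names;
    // skipping the rest avoids having plain C names parsed as type codes.
    if (symbol.compare(0, 2, "_Z") != 0) return symbol;

    int status = 0;
    char* text = abi::__cxa_demangle(symbol.c_str(), 0, 0, &status);
    if (status != 0 || text == 0) return symbol;

    std::string result(text);
    std::free(text);  // the buffer is malloc'ed by __cxa_demangle
    return result;
}
```

Calling demangle("_ZNSt8ios_base13_M_grow_wordsEi") yields std::ios_base::_M_grow_words(int), while the "Eib" variant yields std::ios_base::_M_grow_words(int, bool). The two are different symbols as far as the linker is concerned, which is why an object referencing one cannot be resolved against a library that defines only the other.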

> If I would use C only, what would be result

In that case you would have no problems whatsoever.

> what if I use C++ with this method?

Any of the 4 possible outcomes above.

Bjorn Reese

Mar 14, 2005, 4:55:28 AM
John Smith wrote:

>> objcopy -G foo1 -G foo2 ... -G fooN foo1.o foo.o
>
>
> What is the objcopy step supposed to do?
> I just did a quick test on Mac OS X and it seems not to have objcopy at all.

Paul has already answered your questions, so I am just going to add a
bit more information.

objcopy is part of the GNU binutils package.

The same package also has a strip command with a -K option, which can
be used to achieve the same result as objcopy -G.

Although both are part of the same package, and you therefore can use
either command, I prefer strip because I find it more intuitive.

--
mail1dotstofanetdotdk

John Smith

Mar 14, 2005, 11:34:18 AM
>
> > When I went back to the old lib it stopped working with gcc as frontend.
>
> Did it result in unresolved symbols? Which ones?
>
Yes, unresolved symbols. Not surprisingly, all the C++ ones from libstdc++.

Here are some of them (the list was rather long):
std::terminate()
vtable for __cxxabiv1::__class_type_info
vtable for __cxxabiv1::__si_class_type_info
operator delete(void*)
operator new(unsigned long)
___cxa_begin_catch
___cxa_end_catch
___cxa_rethrow
___gxx_personality_v0

> Well, let's take a practical example:
>

Thank you for the example. It was really to the point.

> You can see this in their sizes:
> $ ls -l junk.o junk1.o
> -rw-rw-r-- 1 paul users 964 Mar 13 17:19 junk.o
> -rw-rw-r-- 1 paul users 179950 Mar 13 17:24 junk1.o
>

Yeah, and in my case the file grew by 1 MB under Linux. But after all, it's
a small price to pay for better compatibility.

> > Lets assume you have machine 1 with GCC 3.3 and libstdc++ version X and then
> > you have machine 2 with GCC 3.4 and libstdc++ version Y. Then you compile
> > the object on machine 1 and link on the 2nd. What will then happen?
>
> Any number of things could happen:

I assume you mean -without- your trick.


> - the exe will just work. This is only expected when the object C++
> dependencies are extremely trivial, such as in the case of
> junk.cc above.
>

All this makes me wonder why there is no check that prevents bad
linking. It's bad enough that compatibility is almost non-existent, but it
doesn't make it better when you can't even rely on linked files.

> > If I would use C only, what would be result
>
> In that case you would have no problems whatsoever.

And with your method I hope the same will happen. One little irritating thing
about your method is that you need to manually enumerate all the functions which
should be exported. It would be a lot smarter if you could remove only the C++
names (e.g. only those which libstdc++ exports). However, since I only have
about 20 functions, it is still not too much work even if I cannot automate this
step.

Thanks a lot for your explanation.
-- John


Paul Pluzhnikov

Mar 14, 2005, 11:23:18 PM
"John Smith" <john....@x-formation.com> writes:

> > > When I went back to the old lib it stopped working with gcc as frontend.
> >
> > Did it result in unresolved symbols? Which ones?
> >
> Yes unresolved symbols. Not very surprising all the C++ ones from libstdc++.
>
> Here are some of them (the list was rather long):

I am afraid I lost track of what exactly you are doing.

If you used the "link against libstdc++.a" trick to build your "old
library" (i.e. you tried to eliminate libstdc++.so dependency),
then these symbols:

> std::terminate()
> vtable for __cxxabiv1::__class_type_info ... etc.

should not be unresolved.

> All this makes me wonder why there is no control so you cannot do bad
> linking.

There is some control -- whenever the ABI changes, the gcc developers
change name mangling as well, so as to avoid the "bad linking".

Nobody's perfect, so sometimes ABI changes sneak in unnoticed (at
least I think that's what causes the possibility of a "bad link").

> It's bad
> enough the compatibility is almost non-existent but it doesn't make it
> better when you can't even rely on linked files.

What do you mean?
Fully-linked files (executables and DSOs) you can rely on.
It's only the relocatable objects you must link with the same
version of compiler they were compiled with.

> One little irritating thing about
> your method is that you need to manually enumerate all the functions which
> should be exported. It would be a lot smarter if you could remove C++ names
> only

The linker, or objcopy, or strip, have no notion of "C++ names";
they all are "just names" to them.

How can you tell that '_ZdlPv' is a C++ name, but '__cxa_rethrow'
isn't?
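
To illustrate the point: a filter that goes by the mangling prefix alone catches '_ZdlPv' but misses runtime support symbols such as '__cxa_rethrow', which have plain C linkage names even though they come from libstdc++. A minimal sketch, assuming the Itanium C++ ABI mangling used by gcc 3.x and later ("_Z" prefix, with one extra leading underscore on Mac OS X); looks_mangled is a hypothetical helper, not part of any tool.

```cpp
#include <string>

// True if the symbol carries the Itanium C++ ABI mangling prefix.
// Mach-O (Mac OS X) prepends one underscore to every symbol, so "__Z"
// is accepted as well as "_Z". This is a heuristic on the name only.
bool looks_mangled(const std::string& symbol) {
    return symbol.compare(0, 2, "_Z") == 0 ||
           symbol.compare(0, 3, "__Z") == 0;
}
```

With this check, '_ZdlPv' and '_Znwj' are classified as mangled, while '__cxa_rethrow' and '__gxx_personality_v0' are not, so a prefix-based "strip the C++ names" step would still leave those libstdc++ symbols visible.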

John Smith

Mar 16, 2005, 3:10:04 AM
> > Here are some of them (the list was rather long):
>
> I am afraid I lost track of what exactly you are doing.
>
Don't worry about this. I think I got it working under Mac OS X so far.

> How can you tell that '_ZdlPv' is a C++ name, but '__cxa_rethrow'
> isn't?

By checking whether it is listed in libstdc++.a: a C++ name is, while a C name
is not. So in effect you have a list of C++ symbols that appear both in the
library to be stripped and in libstdc++.a.

Under Mac OS X I got the library stripped so that it contains only a few C++
name-mangled names, the needed exports and the unreferenced ones.

I'm in doubt about some of the names, though, like these (from strip -u -s
keepsymbols.txt myobj.o -o newobj.o):
0001f2ec s __ZNSt6vectorI15HKYIJVFLWLBNGYPSaIS0_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS0_S2_EERKS0_
0001d730 s __ZNSt6vectorI15HKYIJVFLWLBNGYPSaIS0_EE5eraseEN9__gnu_cxx17__normal_iteratorIPS0_S2_EE
00020b2c s __ZNSt6vectorI15JMKJHOLHBHRIZJUSaIS0_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS0_S2_EERKS0_
00023680 s __ZSt13__destroy_auxIP6kkfdsfEvT_S2_12__false_type
0002084c s __ZSt24__uninitialized_copy_auxIN9__gnu_cxx17__normal_iteratorIP15HKYIJVFLWLBNGYPSt6vectorIS2_SaIS2_EEEES7_ET0_T_S9_S8_12__false_type
00020e18 s __ZSt24__uninitialized_copy_auxIN9__gnu_cxx17__normal_iteratorIP15JMKJHOLHBHRIZJUSt6vectorIS2_SaIS2_EEEES7_ET0_T_S9_S8_12__false_type

I think these names should not cause problems, though, as the function/class
names are present in each of the references (in the names shown here they
are obfuscated, so don't worry about the meaningless names).

Under Linux the strip command has no "-u" switch to save the
undefined references, so listing all the undefined symbols could become a
tiresome task. Unfortunately, objcopy has no similar switch either.
Hopefully there is a better solution than manually specifying the references?

Thanks.

-- John

