-- Chris
--- begin transcript ---
Script started on Fri 20 Feb 2009 01:23:10 PM CST
% dietcc forktest.c # link statically against dietlibc
% ./a.out
30864.1 forks per second
% gcc forktest.c # compile with gcc, dynamic linking
% ./a.out
15723.3 forks per second
% gcc forktest.c -lm # pull in the math library
% ./a.out
14792.9 forks per second
% gcc forktest.c -lm -lcurses
% ./a.out
13888.9 forks per second
% gcc forktest.c -lm -lcurses -lpthread
% ./a.out
11961.7 forks per second
% gcc forktest.c -lm -lcurses -lpthread -lresolv
% ./a.out
11013.2 forks per second
% gcc forktest.c -lm -lcurses -lpthread -lresolv -lssl
% ./a.out
8250.83 forks per second
% gcc forktest.c -lm -lcurses -lpthread -lresolv -lssl -lreadline
% ./a.out
7961.78 forks per second
% exit
Script done on Fri 20 Feb 2009 01:27:20 PM CST
--- end transcript ---
---begin forktest.c---
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int
main(void)
{
    double elapsed_secs;
    int forks = 0;
    clock_t start = clock(), elapsed;   /* processor time, not wall-clock */
    int i;

    for(i = 0; i < 25000; i++) {
        int waitstatus;

        switch (fork()) {
        case -1:
            perror("fork");
            break;
        case 0:
            exit(42);                   /* child: exit immediately */
        default:
            wait(&waitstatus);          /* parent: reap the child */
            ++forks;
            break;
        }
    }
    elapsed = clock() - start;
    elapsed_secs = (double)elapsed / CLOCKS_PER_SEC;
    printf("%g forks per second\n", (double)forks / elapsed_secs);
    exit(0);
}
---end forktest.c---
What fascinates me here is that forktest doesn't even
use anything from those other libraries. In the statically
linked case, listing an unneeded library is basically a
no-op. It appears to be rather more involved in the
world of GNU shared libraries.
BLS
very nice.
i wonder if you may be measuring the performance
of memory management more than the performance of dynamic
linking. the reason i suspect this is because on my p3 machine i see
gcc static 13440 f/s
gcc dyn 12953 f/s
gcc static 6963 fork + exec/s
or the tipoff is the size of the executable:
; ls -l a.out
-rwxr-xr-x 1 quanstro users 609298 Feb 20 14:36 a.out
for snarky comparison, here are the programs on my system that
are that large or larger:
--rwxrwxr-x M 48625 sys sys 13275174 Jan 16 2006 /bin/gs
--rwxrwxr-x M 48625 sys sys 758520 Dec 9 2005 /bin/spin
the hard bit would be keeping the memory footprint the
same while increasing the number of shared libraries.
even in bad cases, this is like 0.2ms/fork + exec.
i wonder if the reason that there can be such big
differences is that linux fork+exec may have been
massaged for such synthetic benchmarks. thus small
amounts of extra work might look big.
- erik
Chris beat me to the punch, but I'm posting anyway because I went a
different direction. I wrote some rc scripts that make static and
dynamic libraries of various sizes and programs that use those
libraries (trivially). For each number of functions 1, 10, 100, 1000,
10000, 100000 I timed static and dynamic execution of a program that
conditionally calls that many functions in 1 or 10 libraries. The
scripts are attached; run mklibs, then mkprogs, then runtests. Below
are the results of a single run on my laptop (a fixed-width font looks
better). I can't spend any more time on this, but it was a fun
morning goof-off.
Static
functions  libraries  binary-size  user  system  elapsed
        1          1       556898  0.24    0.50     0.78
       10          1       557324  0.28    0.44     0.82
       10         10       557913  0.32    0.42     0.81
      100          1       561737  0.24    0.50     0.84
      100         10       562196  0.28    0.46     0.79
     1000          1       609496  0.29    0.47     0.83
     1000         10       606381  0.26    0.48     0.84
    10000          1      1105475  0.30    0.44     0.87
    10000         10      1083834  0.28    0.47     0.82
   100000          1      6245494  0.27    0.48     0.88
   100000         10      6043871  0.28    0.48     0.81
Dynamic
functions  libraries  binary-size  user  system  elapsed
        1          1         6489  0.49    0.86     1.39
       10          1         7322  0.52    0.86     1.45
       10         10         7464  0.83    1.14     2.03
      100          1        16366  0.59    0.78     1.42
      100         10        16177  1.14    1.11     2.35
     1000          1       108268  0.55    0.87     1.47
     1000         10       104496  0.88    1.12     2.07
    10000          1      1077758  0.81    0.98     1.89
    10000         10      1037387  1.12    1.36     2.63
   100000          1     10915272  2.79    2.50     6.31
   100000         10     10517862  3.13    3.68     7.13
I think dynamic 100-10 is a fluke. I also think it's interesting that
the dynamic binaries are bigger above 10000 function calls. I don't
know why, and I don't have time to figure it out.
Micah