Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

function call optimization question

8 views
Skip to first unread message

Szabolcs Nagy

unread,
Sep 12, 2007, 6:01:26 AM9/12/07
to
in the code below i thought the function call in g() could be easily
optimized out so that g() becomes the same as h() (which becomes
{return 0;})

executing 'gcc -O3 -S' i found that gcc does not do this

now i'm wondering: is there something in the standard (eg c99) that
prevents this optimization (theoretically)


#include <stdio.h>
#include <stdlib.h>

static inline int c(int a, int b) {
return a == b;
}

static int f(int a, int b, int(*c)(int, int)) {
return c(a, b) - c(b, a);
}

int g(int a, int b) {
return f(a, b, c);
}

int h(int a, int b) {
return c(a, b) - c(b, a);
}


int main(int argc, char *argv[]) {
int a, b;

if (argc < 3)
return printf("usage: %s a b\n", argv[0]);
a = atoi(argv[1]);
b = atoi(argv[2]);
printf("f: %d\n", f(a, b, c));
printf("g: %d\n", g(a, b));
printf("h: %d\n", h(a, b));
return 0;
}

Army1987

unread,
Sep 12, 2007, 6:43:00 AM9/12/07
to
On Wed, 12 Sep 2007 03:01:26 -0700, Szabolcs Nagy wrote:

> in the code below i thought the function call in g() could be easily
> optimized out so that g() becomes the same as h() (which becomes
> {return 0;})
>
> executing 'gcc -O3 -S' i found that gcc does not do this
>
> now i'm wondering: is there something in the standard (eg c99) that
> prevents this optimization (theoretically)

The standard only requires that files and volatile objects are
written/read in the right order. But I won't expect a program to
compute the 100008th prime number to generate the same machine
code as
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
return (fwrite("1299827\n", 1, 8, stdout) < 8) * EXIT_FAILURE;
}

IOW, while you shouldn't do micro-optimizations yourself, you
shouldn't write silly code hoping that the compiler will make it
decent.
--
Army1987 (Replace "NOSPAM" with "email")
If you're sending e-mail from a Windows machine, turn off Microsoft's
stupid “Smart Quotes” feature. This is so you'll avoid sprinkling garbage
characters through your mail. -- Eric S. Raymond and Rick Moen

Michal Nazarewicz

unread,
Sep 12, 2007, 7:37:44 AM9/12/07
to
Szabolcs Nagy <nsza...@gmail.com> writes:

> in the code below i thought the function call in g() could be easily
> optimized out so that g() becomes the same as h() (which becomes
> {return 0;})
>
> executing 'gcc -O3 -S' i found that gcc does not do this

Well... You can go complain to GCC developers if you like. ;)

> now i'm wondering: is there something in the standard (eg c99) that
> prevents this optimization (theoretically)

No. Standard doesn't prevents optimisation.

--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86*tlen.pl>---<jid:mina86*chrome.pl>--ooO--(_)--Ooo--

Szabolcs Nagy

unread,
Sep 12, 2007, 12:04:09 PM9/12/07
to

Michal Nazarewicz wrote:
>
> No. Standard doesn't prevents optimisation.
>

thanks for your answers

actually what i was thinking about is the situations like sorting int
arrays

static inline int intcmp(int *a, int *b) {
return *a < *b ? -1: *a > *b;
}

void intqsort(int *arr, size_t n) {
qsort(arr, n, sizeof(int), intcmp);
}

imho this is not a silly thing to optimize out, since many algorithms
can be done in a (type) generic way with a couple of function
arguments and most of these algorithms are performance critical (eg
when we cannot allow an additional function call for each int
comparison)

other possible examples:
int find_if(int *arr, size_t n, int (*pred)(int));
int hash_get(const hashtable_t *ht, const key_t *key, int (*hash)
(key_t *), int (*isempty)(item_t *), int (*isdeleted)(item_t *));
...

Richard Tobin

unread,
Sep 12, 2007, 12:55:01 PM9/12/07
to
In article <1189613049.0...@d55g2000hsg.googlegroups.com>,
Szabolcs Nagy <nsza...@gmail.com> wrote:

>actually what i was thinking about is the situations like sorting int
>arrays
>
>static inline int intcmp(int *a, int *b) {
> return *a < *b ? -1: *a > *b;
>}
>
>void intqsort(int *arr, size_t n) {
> qsort(arr, n, sizeof(int), intcmp);

This is unlikely to be useful, because

(a) intcmp is an argument to qsort(), and will be different for different
calls;
(b) even if it wasn't an argument, to inline the calls to intcmp()
its source would have to be available when qsort was compiled,
and typically qsort() is in a pre-compiled library.

One possibility would be for qsort itself to be inline, with its definition
in the header.

If you really need this efficiency, you could take one of the many free
implementations of qsort() and produce a specialised version yourself.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Szabolcs Nagy

unread,
Sep 12, 2007, 2:07:21 PM9/12/07
to

Richard Tobin wrote:
> This is unlikely to be useful, because
>
> (a) intcmp is an argument to qsort(), and will be different for different
> calls;
> (b) even if it wasn't an argument, to inline the calls to intcmp()
> its source would have to be available when qsort was compiled,
> and typically qsort() is in a pre-compiled library.
true
you are right

well i'm writing an algorithm lib for my own amusement and i thought
it would make things easy if i could write

static void sort_internal(int *arr, size_t len, int (*less)(int, int))
{..}
static inline intless(int a, int b) {return a < b;}

void sort_f(int *arr, int len, int (*less)(int, int)) {
return sort_internal(arr, len, less);
}

void sort(int *arr, int len) {
return sort_internal(arr, len, intless);
}

so i don't need to write down the algorithm 2 times for sort() and
sort_f().
also type generic code can be written in this way so i can make my own
c++ stl like thing.

Malcolm McLean

unread,
Sep 12, 2007, 4:47:45 PM9/12/07
to

"Army1987" <army...@NOSPAM.it> wrote in message
news:pan.2007.09.12....@NOSPAM.it...

> On Wed, 12 Sep 2007 03:01:26 -0700, Szabolcs Nagy wrote:
>
> The standard only requires that files and volatile objects are
> written/read in the right order. But I won't expect a program to
> compute the 100008th prime number to generate the same machine
> code as
> #include <stdio.h>
> #include <stdlib.h>
> int main(void)
> {
> return (fwrite("1299827\n", 1, 8, stdout) < 8) * EXIT_FAILURE;
> }
>
I'd expect a Fortran 77 compiler to do this.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

0 new messages