Why is so much space allocated on the stack?

bartab4u

Dec 11, 2008, 3:48:35 PM

I have some c code that I compile with gcc in cygwin and the stack for
the function call grows by more than I expect it to. I don't know why
since the local variables within the function should take 24 bytes
max...

Here's the code:

main()
{
    function(1,2,3);
}

void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
    ...
}

It gives me the following assembly output:

(gdb) disassemble function
Dump of assembler code for function function:
0x00401050 <function+0>: push %ebp
0x00401051 <function+1>: mov %esp,%ebp
0x00401053 <function+3>: sub $0x28,%esp

Why does the stack increase by 40? I expect to increase by 24 bytes.

Thanks

DJ Delorie

Dec 11, 2008, 4:53:54 PM

bartab4u <spam...@crayne.org> writes:
> void function(int a, int b, int c) {
> char buffer1[5];
> char buffer2[10];
> ...
> }
>

> Why does the stack increase by 40? I expect to increase by 24 bytes.

Since you didn't show the rest of your function, how can we tell? It
might be used for temporaries, outgoing arguments, alloca(), or who
knows what else.

bartab4u

Dec 12, 2008, 8:54:59 AM

On Dec 11, 4:53 pm, DJ Delorie <d...@delorie.com> wrote:

void main() {
    function(1,2,3);
}

Assembly output:

(gdb) disassemble main
Dump of assembler code for function main:
0x00401058 <main+0>: push %ebp
0x00401059 <main+1>: mov %esp,%ebp
0x0040105b <main+3>: sub $0x18,%esp
0x0040105e <main+6>: and $0xfffffff0,%esp
0x00401061 <main+9>: mov $0x0,%eax
0x00401066 <main+14>: add $0xf,%eax
0x00401069 <main+17>: add $0xf,%eax
0x0040106c <main+20>: shr $0x4,%eax
0x0040106f <main+23>: shl $0x4,%eax
0x00401072 <main+26>: mov %eax,0xfffffffc(%ebp)
0x00401075 <main+29>: mov 0xfffffffc(%ebp),%eax
0x00401078 <main+32>: call 0x4010a0 <_alloca>
0x0040107d <main+37>: call 0x401130 <__main>
0x00401082 <main+42>: movl $0x3,0x8(%esp)
0x0040108a <main+50>: movl $0x2,0x4(%esp)
0x00401092 <main+58>: movl $0x1,(%esp)
0x00401099 <main+65>: call 0x401050 <function>
0x0040109e <main+70>: leave
0x0040109f <main+71>: ret
End of assembler dump.

(gdb) disassemble function
Dump of assembler code for function function:
0x00401050 <function+0>: push %ebp
0x00401051 <function+1>: mov %esp,%ebp
0x00401053 <function+3>: sub $0x28,%esp

0x00401056 <function+6>: leave
0x00401057 <function+7>: ret
End of assembler dump.

Tim Roberts

Dec 13, 2008, 2:04:37 AM

bartab4u <spam...@crayne.org> wrote:
>
>I have some c code that I compile with gcc in cygwin and the stack for
>the function call grows by more than I expect it to. I don't know why
>since the local variables within the function should take 24 bytes
>max...

>...

>Why does the stack increase by 40? I expect to increase by 24 bytes.

This question has been asked many times in the past couple of years, and to
my knowledge the only answer has been "because it does". gcc tries to
maintain some kind of alignment in the stack (16? 32?), which accounts for
some of the difference, but that doesn't explain everything.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Andrew Tomazos

Dec 14, 2008, 8:16:39 PM

On Dec 13, 8:04 am, Tim Roberts <spamt...@crayne.org> wrote:

> bartab4u <spamt...@crayne.org> wrote:
>
> >I have some c code that I compile with gcc in cygwin and the stack for
> >the function call grows by more than I expect it to. I don't know why
> >since the local variables within the function should take 24 bytes
> >max...
> >...
> >Why does the stack increase by 40? I expect to increase by 24 bytes.
>
> This question has been asked many times in the past couple of years, and to
> my knowledge the only answer has been "because it does". gcc tries to
> maintain some kind of alignment in the stack (16? 32?), which accounts for
> some of the difference, but that doesn't explain everything.

So I wrote a quick script that assembles (gcc -S) a function as
follows:

void function()
{
    char buffer[X];
}

for each X = 0, 1, 2, ..., 100

Then I partitioned the 101 assembled functions by equality.

I did see some kind of stack padding. For buffer sizes 13 through 28
(which includes the 5+10=15 bytes of buffers from the original
question), the stack grows by 40 bytes. It seems to have nothing to do
with the function's parameters.

From the results below, can anyone spot a pattern between the buffer
size and the stack size?

Also, I noticed that when the buffer size is 8 or above the following
is inserted into the function:

movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
...

Anyone know what this does? What does "%gs:20" mean and what is it
doing in this case? And what are those xors about?

I have included a summary, the output of the script and the script
itself below.

Regards,
Andrew.

SUMMARY

For a buffer size of 0 the stack size is 0
For a buffer size of 1 to 7 the stack size is 16
For a buffer size of 8 to 12 the stack size is 24
For a buffer size of 13 to 28 the stack size is 40
For a buffer size of 29 to 44 the stack size is 56
For a buffer size of 45 to 60 the stack size is 72
For a buffer size of 61 to 76 the stack size is 88
For a buffer size of 77 to 92 the stack size is 104
For a buffer size of 93 to 100 the stack size is 120
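
One way to look for a pattern in the table above is to guess a formula and check it against every row. The sketch below is only an empirical fit to these numbers, not gcc's documented algorithm. It assumes that buffers of 8 bytes or more get a 4-byte stack-check word, that the sum is rounded up to a multiple of 16, and that 8 more bytes are then added. The function names are made up for illustration.

```c
#include <stddef.h>

/* Round n up to the next multiple of 16. */
unsigned round_up_16(unsigned n)
{
    return (n + 15u) & ~15u;
}

/*
 * Empirical fit to the SUMMARY table (an assumption, not gcc's rule):
 * sizes 1..7 get a flat 16 bytes; sizes 8 and up get
 * round_up_16(buffer + 4-byte check word) + 8.
 */
unsigned fitted_frame_size(unsigned buffer_size)
{
    if (buffer_size == 0)
        return 0;
    if (buffer_size < 8)
        return 16;              /* no stack-check word in these cases */
    return round_up_16(buffer_size + 4u) + 8u;
}

/* Return 1 if the fit reproduces every row of the SUMMARY table, else 0. */
int fit_matches_summary(void)
{
    static const struct { unsigned lo, hi, size; } rows[] = {
        {  0,   0,   0 }, {  1,   7,  16 }, {  8,  12,  24 },
        { 13,  28,  40 }, { 29,  44,  56 }, { 45,  60,  72 },
        { 61,  76,  88 }, { 77,  92, 104 }, { 93, 100, 120 },
    };
    size_t r;
    unsigned n;

    for (r = 0; r < sizeof rows / sizeof rows[0]; r++)
        for (n = rows[r].lo; n <= rows[r].hi; n++)
            if (fitted_frame_size(n) != rows[r].size)
                return 0;
    return 1;
}
```

With this fit, fitted_frame_size(15) gives 40, the value from the original question. Sizes 1 to 7 do not follow the same formula and are special-cased.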

GENF.OUT

For a buffer size of: 0 gcc output:

pushl %ebp
movl %esp, %ebp
popl %ebp
ret

For a buffer size of: 1, 2, 3, 4, 5, 6, 7 gcc output:

pushl %ebp
movl %esp, %ebp
subl $16, %esp
leave
ret

For a buffer size of: 8, 9, 10, 11, 12 gcc output:

pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

For a buffer size of: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28 gcc output:

pushl %ebp
movl %esp, %ebp
subl $40, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

For a buffer size of: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44 gcc output:

pushl %ebp
movl %esp, %ebp
subl $56, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

For a buffer size of: 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
57, 58, 59, 60 gcc output:

pushl %ebp
movl %esp, %ebp
subl $72, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

For a buffer size of: 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
73, 74, 75, 76 gcc output:

pushl %ebp
movl %esp, %ebp
subl $88, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

For a buffer size of: 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92 gcc output:

pushl %ebp
movl %esp, %ebp
subl $104, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

For a buffer size of: 93, 94, 95, 96, 97, 98, 99, 100 gcc output:

pushl %ebp
movl %esp, %ebp
subl $120, %esp
movl %gs:20, %eax
movl %eax, -4(%ebp)
xorl %eax, %eax
movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret

GENF.PL

use strict;
use warnings;

# Write $source to test.c, run gcc -S on it, and return the text of test.s.
sub assemble {
    my ($source) = @_;

    open my $src, '>', 'test.c' or die "cannot write test.c: $!";
    print {$src} $source . "\n";
    close $src;

    system("gcc -S test.c") == 0 or die "gcc -S failed";

    open my $asm, '<', 'test.s' or die "cannot read test.s: $!";
    my $output = "";
    while (<$asm>) { $output .= $_; }
    close $asm;

    return $output;
}

# Map each distinct assembly output to the buffer sizes that produce it.
my %outputs;

for my $i (0..100)
{
    my $source;

    if ($i == 0)
    {
        $source = "void function() { /* no buffer */ }";
    }
    else
    {
        $source = "void function() { char buffer[$i]; }";
    }

    my $output = assemble($source);

    if (!(exists $outputs{$output}))
    {
        $outputs{$output} = [];
    }

    push @{$outputs{$output}}, $i;
}

# Print the groups, ordered by the smallest buffer size in each.
for my $output (sort { $outputs{$a}->[0] <=> $outputs{$b}->[0] } keys %outputs)
{
    print "For a buffer size of: ";

    print join(", ", @{$outputs{$output}});
    print " gcc output: \n\n";

    print $output;

    print "\n\n";
}

--
Andrew Tomazos <and...@tomazos.com> <http://www.tomazos.com>

Andrew Tomazos

Dec 15, 2008, 6:26:36 PM

> movl %gs:20, %eax
> movl %eax, -4(%ebp)
> xorl %eax, %eax
> movl -4(%ebp), %eax
> xorl %gs:20, %eax
> je .L3
> call __stack_chk_fail
> .L3:

> Anyone know what this does? What does "%gs:20" mean and what is it
> doing in this case? And what are those xors about?

On further reflection I've figured this part out at least:

It looks like when the buffer size is 8 or more gcc puts in a stack
corruption check. It stores the value at %gs:20 in the stack frame, and
then once the function is complete it checks to see that it is still
intact....

> movl %gs:20, %eax
> movl %eax, -4(%ebp)

...Copy %gs:20 to -4(%ebp), the slot just below the saved %ebp.

> xorl %eax, %eax

...Zero %eax, presumably so the check value is not left in a register.

> movl -4(%ebp), %eax
> xorl %gs:20, %eax
> je .L3
> call __stack_chk_fail
> .L3:

...Retrieve the saved value and compare it with %gs:20 again. If they
are not equal, the stack frame must have become corrupted, so call
__stack_chk_fail.

This check increases the stack frame size by 4. It still doesn't
explain the rest of the padding.
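
The store-then-compare idea can be imitated in plain C. This is a minimal sketch, not what gcc emits: a hypothetical struct stands in for the frame, a caller-supplied guard stands in for the value at %gs:20, and a memset that stays inside the struct stands in for a write that runs past the end of the buffer. All names here are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a frame: the check word sits above the buffer. */
struct frame_sketch {
    char     buffer[8];
    uint32_t check_word;    /* plays the role of -4(%ebp) */
};

/*
 * Simulate a function body that writes `count` bytes starting at buffer[0].
 * `guard` plays the role of the value at %gs:20. Returns 1 if the check
 * word survived, 0 where the real code would call __stack_chk_fail.
 */
int check_word_survives(uint32_t guard, size_t count)
{
    struct frame_sketch frame;

    memset(&frame, 0, sizeof frame);
    frame.check_word = guard;           /* prologue: store the guard */

    if (count > sizeof frame)           /* never write outside the struct */
        count = sizeof frame;
    memset(&frame, 'A', count);         /* body: may run past buffer[7] */

    /* epilogue: xor the stored word with the guard; zero means intact */
    return (frame.check_word ^ guard) == 0;
}
```

Writing 8 bytes or fewer leaves the check word alone. Writing 9 or more reaches it, assuming the usual layout where check_word directly follows the 8-byte buffer.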

-Andrew.

spam...@crayne.org

Dec 16, 2008, 1:20:55 AM

At first glance it appears that GCC is simply trying to keep 16 byte
alignment on the stack so long as there are more than 7 bytes of
locals. Remember that the stack has not only the stack frame, it has
the old ebp value and the return address, so there's an extra 8 bytes
right there that you have to deal with for alignment purposes, plus
the stack check word.

For routines with smaller amounts of locals, it looks like he's just
forcing 8 byte alignment, which is odd, unless he's only allowing
small leaf functions (those that don't call anything, so that
continuing alignment doesn't matter) to have alignment less than 16
bytes (which is quite plausible). I'd test it, but I'm away from any
useful copy of GCC at the moment.
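
The alignment argument can be checked with a little arithmetic. The sketch below tracks %esp modulo 16 through the prologue, under the assumption (not confirmed anywhere in this thread) that %esp is a multiple of 16 just before the call instruction. The starting address and the function name are arbitrary choices for illustration.

```c
#include <stdint.h>

/*
 * Value of %esp modulo 16 after the prologue, assuming %esp was a
 * multiple of 16 immediately before the call instruction.
 */
unsigned esp_mod_16_after_prologue(unsigned sub_bytes)
{
    uint32_t esp = 0x1000;      /* arbitrary 16-byte-aligned start (assumption) */

    esp -= 4;                   /* call pushes the return address */
    esp -= 4;                   /* push %ebp */
    esp -= sub_bytes;           /* sub $N, %esp */

    return esp % 16u;
}
```

Under that assumption, the sub amounts 24 and 40 from the summary leave %esp on a 16-byte boundary, while 16 (used for buffers of 1 to 7 bytes) leaves it 8 bytes off.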

Andrew Tomazos

Dec 17, 2008, 12:39:56 AM

On Dec 16, 7:20 am, "robertwess...@yahoo.com" <spamt...@crayne.org>
wrote:

> At first glance it appears that GCC is simply trying to keep 16 byte
> alignment on the stack so long as there are more than 7 bytes of
> locals. Remember that the stack has not only the stack frame, it has
> the old ebp value and the return address, so there's an extra 8 bytes
> right there that you have to deal with for alignment purposes, plus
> the stack check word.

From Tim Prince on the gcc list I learned that the reason that GCC
keeps the stack aligned to 16 bytes (by default) is that it is
required by the Streaming SIMD Extensions (SSE).
<http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions>

However, I am still not convinced that what we are seeing is 16-byte
alignment.

Take the following example:

$ cat test.c

void function()
{
    char buffer[49];

    buffer[0]++;
}

$ gcc -S test.c

$ cat test.s

.file "test.c"
.text
.globl function
.type function, @function
function:

pushl %ebp
movl %esp, %ebp
subl $72, %esp

movl %gs:20, %eax

movl %eax, -4(%ebp)
xorl %eax, %eax

movzbl -53(%ebp), %eax
addl $1, %eax
movb %al, -53(%ebp)

movl -4(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:

leave
ret
.size function, .-function
.ident "GCC: (GNU) 4.2.4 (Ubuntu 4.2.4-1ubuntu3)"
.section .note.GNU-stack,"",@progbits

$

Here we see that the space between -53(%ebp) and (%esp) (== -72(%ebp)),
19 bytes, is unused padding.

Why do you need 19 bytes to pad anything to 16 byte alignment? You
should never need more than 15 bytes, right?
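
One possible reading of the 19 bytes, consistent with the numbers in this example but not confirmed as gcc's actual rule, is that they are two separate pieces: padding that rounds the buffer plus the 4-byte check word up to a multiple of 16, and a further fixed amount on top. The sketch below just does that split. The struct and function names are hypothetical, and it expects sub_bytes to be at least the rounded size.

```c
struct frame_breakdown {
    unsigned buffer;        /* bytes requested for the array */
    unsigned check_word;    /* stack-check slot at -4(%ebp) */
    unsigned round_pad;     /* rounds buffer + check word up to 16 */
    unsigned extra;         /* whatever is left of the sub amount */
};

/* Split a sub amount into the four parts above (assumed model, not gcc's). */
struct frame_breakdown break_down(unsigned buffer_size, unsigned sub_bytes)
{
    struct frame_breakdown f;
    unsigned locals  = buffer_size + 4u;        /* buffer plus check word */
    unsigned rounded = (locals + 15u) & ~15u;   /* next multiple of 16 */

    f.buffer     = buffer_size;
    f.check_word = 4u;
    f.round_pad  = rounded - locals;
    f.extra      = sub_bytes - rounded;         /* requires sub_bytes >= rounded */
    return f;
}
```

For break_down(49, 72) this gives 11 bytes of rounding padding and 8 bytes of extra, which add up to the 19 unused bytes. The 8 happens to equal the size of the return address plus the saved %ebp, so 72 + 8 is a multiple of 16.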

Regards,