Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Avoiding recursive stack overflow in C on Unix/Linux?

146 views
Skip to first unread message

bolta...@boltar.world

unread,
May 5, 2011, 5:37:57 AM5/5/11
to
Hello

If a program recurses too deeply there's always a chance that it could
run out of stack space and die with a SIGSEGV. Is there any API that can
tell you how much stack space is left or some other method to prevent this
in C/C++? I realise I could catch the signal but thats a pretty damn ugly
way to do it.

B2003

Message has been deleted

bolta...@boltar.world

unread,
May 5, 2011, 5:58:18 AM5/5/11
to
On Thu, 05 May 2011 02:51:35 -0700
China Blue Veins <chine...@yahoo.com> wrote:
>On some systems there is setrlimit which can be used to set the stack size. On
>unix you can start with 'man -k limit' and examine individual man pages.

Actually I suppose i could use getrlimit to get the stack size but is there
a clean way of actually finding the current position of top of the stack? I
could take the addresses of local variables but that seems a bit prone to
error.

B2003

Xavier Roche

unread,
May 5, 2011, 5:58:32 AM5/5/11
to

There is no portable way to find the stack bottom/size ; I am not even
sure what getrlimit(RLIMIT_STACK) returns exactly (current thread stack
size ? main thread stack size ? default stack size ?)

(And there is no portable way to find the stack top/bottom -- you'll
have to play with an address on the current stack around main, and play
with arithmetics with the page size)

Message has been deleted

Kleuskes & Moos

unread,
May 5, 2011, 3:24:26 PM5/5/11
to

The only proper way to avoid stack-overflows is to prevent it from
happening in the first place. If at all possible (and frequently it
is) avoid recursion, if it's unavoidable, as it sometimes is, make
sure your design prevents those situations from arising in the first
place.

Trying to solve a design problem by imposing artificial limitations
seems a bad idea.

Lowell Gilbert

unread,
May 5, 2011, 3:40:10 PM5/5/11
to

Without in any way disagreeing, I can't imagine being able to come up
with a contextual definition for "artificial" in that sentence.

--
Lowell Gilbert, embedded/networking software engineer
http://be-well.ilk.org/~lowell/

Ben Bacarisse

unread,
May 5, 2011, 4:17:09 PM5/5/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

> On May 5, 11:37 am, boltar2...@boltar.world wrote:
>> If a program recurses too deeply there's always a chance that it could
>> run out of stack space and die with a SIGSEGV. Is there any API that can
>> tell you how much stack space is left or some other method to prevent this
>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>> way to do it.
>

> The only proper way to avoid stack-overflows is to prevent it from
> happening in the first place. If at all possible (and frequently it
> is) avoid recursion, if it's unavoidable, as it sometimes is, make
> sure your design prevents those situations from arising in the first
> place.

I find this answer interesting, mainly because I think is suggests a
very different view about programming. I would not try to avoid
recursion "if at all possible". In general I consider that it is up to
the system to provide the resources needed by a program and this
includes stack for a reasonable amount of recursion. These resources
can run out of course, and the stack is special in that few languages
provide any way to handle its exhaustion elegantly, but that does not
seem enough reason to design out all recursion.

There are special cases: some environments are severely resource limited
but there's no indication that the OP is using such a system and,
anyway, I don't think you can make general rules from specific situations.

> Trying to solve a design problem by imposing artificial limitations
> seems a bad idea.

Isn't this what you are doing? Isn't the ban on recursion not an
artificial limitation?

--
Ben.

Datesfat Chicks

unread,
May 5, 2011, 6:45:59 PM5/5/11
to

The restrictions on stack depth are no different than the restrictions
on memory size or disk space that are present on any computer system.

These limits have been relaxed over time, and certainly programs will
run today on a typical computer that would not have run 10 years ago
on a typical computer.

"Don't use recursion" seems to be logically the same as "don't declare
very large arrays" or "don't create really big files". It is
dependent on the technology of the day, rather than being an absolute
standard.

These days I have a few files on my server that are on the order of
20G. That would have been unthinkable 10 years ago.

DFC

James Kuyper

unread,
May 5, 2011, 8:07:21 PM5/5/11
to
On 05/05/2011 03:24 PM, Kleuskes & Moos wrote:
> On May 5, 11:37�am, boltar2...@boltar.world wrote:
>> Hello
>>
>> If a program recurses too deeply there's always a chance that it could
>> run out of stack space and die with a SIGSEGV. Is there any API that can
>> tell you how much stack space is left or some other method to prevent this
>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>> way to do it.
>>
>> B2003
>
> The only proper way to avoid stack-overflows is to prevent it from
> happening in the first place.

And how do you do that? If the stack size is sufficiently limited, even

int main(int argc, char*argv[]) { return 0; }

could overflow the stack.
You can take measures to reduce the likelihood of overflow, but there's
no easy way to know whether you've done enough. And doing enough on one
system might be a complete waste of time on others. The systems I use
most frequently allow me to put a gigabyte or more of data on the stack;
other systems are far more limited.

...


> Trying to solve a design problem by imposing artificial limitations
> seems a bad idea.

So what's the design solution you propose to "prevent overflow"?

--
James Kuyper

Ben Bacarisse

unread,
May 5, 2011, 9:53:43 PM5/5/11
to
James Kuyper <james...@verizon.net> writes:

> On 05/05/2011 03:24 PM, Kleuskes & Moos wrote:
>> On May 5, 11:37�am, boltar2...@boltar.world wrote:
>>> Hello
>>>
>>> If a program recurses too deeply there's always a chance that it could
>>> run out of stack space and die with a SIGSEGV. Is there any API that can
>>> tell you how much stack space is left or some other method to prevent this
>>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>>> way to do it.
>>>
>>> B2003
>>
>> The only proper way to avoid stack-overflows is to prevent it from
>> happening in the first place.
>
> And how do you do that? If the stack size is sufficiently limited, even
>
> int main(int argc, char*argv[]) { return 0; }
>
> could overflow the stack.

Hmm... only in the most contrived implementation. The standard mandates
that an implementation must be able to translate and run at least one
program that is very much more complex than this. The system would have
to penalise simple, small programs deliberately!

It's a detail. I agree with your point that almost any call can be the
one that causes a stack overflow. Recursion need not be to blame.

<snip>
--
Ben.

Nobody

unread,
May 5, 2011, 10:27:42 PM5/5/11
to
On Thu, 05 May 2011 11:58:32 +0200, Xavier Roche wrote:

> There is no portable way to find the stack bottom/size ; I am not even
> sure what getrlimit(RLIMIT_STACK) returns exactly (current thread stack
> size ? main thread stack size ? default stack size ?)

It's the size of the region of address space allocated to the main
thread's stack.

> (And there is no portable way to find the stack top/bottom -- you'll
> have to play with an address on the current stack around main, and play
> with arithmetics with the page size)

If available, the value returned by alloca() will typically be the exact
bottom of the stack.

alloca() doesn't conform to any standard, but it's a common extension.


robert...@yahoo.com

unread,
May 6, 2011, 3:03:14 AM5/6/11
to


The more conventional terminology would have alloca() returning
something near the *top* of the stack - and the direction of stack
growth is irrelevant. But so will taking the address of an automatic,
which is much simpler. The approximate bottom of the stack could be
stored at entry to main().

bolta...@boltar.world

unread,
May 6, 2011, 4:59:23 AM5/6/11
to
On Thu, 5 May 2011 12:24:26 -0700 (PDT)
"Kleuskes & Moos" <kle...@xs4all.nl> wrote:

>On May 5, 11:37=A0am, boltar2...@boltar.world wrote:
>> Hello
>>
>> If a program recurses too deeply there's always a chance that it could
>> run out of stack space and die with a SIGSEGV. Is there any API that can
>> tell you how much stack space is left or some other method to prevent thi=

>s
>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>> way to do it.
>>
>> B2003
>
>The only proper way to avoid stack-overflows is to prevent it from
>happening in the first place. If at all possible (and frequently it
>is) avoid recursion, if it's unavoidable, as it sometimes is, make
>sure your design prevents those situations from arising in the first
>place.

Well its always theoretically possible to re-write recursive code in an
iterative format but generally it leads to any reasonably complex recursive
code becoming an unreadable mess.

B2003


Måns Rullgård

unread,
May 6, 2011, 5:07:14 AM5/6/11
to
James Kuyper <james...@verizon.net> writes:

> On 05/05/2011 03:24 PM, Kleuskes & Moos wrote:
>> On May 5, 11:37�am, boltar2...@boltar.world wrote:
>>> Hello
>>>
>>> If a program recurses too deeply there's always a chance that it could
>>> run out of stack space and die with a SIGSEGV. Is there any API that can
>>> tell you how much stack space is left or some other method to prevent this
>>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>>> way to do it.
>>>
>>> B2003
>>
>> The only proper way to avoid stack-overflows is to prevent it from
>> happening in the first place.
>
> And how do you do that? If the stack size is sufficiently limited, even
>
> int main(int argc, char*argv[]) { return 0; }
>
> could overflow the stack.
> You can take measures to reduce the likelihood of overflow, but there's
> no easy way to know whether you've done enough. And doing enough on one
> system might be a complete waste of time on others. The systems I use
> most frequently allow me to put a gigabyte or more of data on the stack;
> other systems are far more limited.

Recursion or not isn't really important. What matters is having a
static bound on the amount of stack space required. Any design with
potentially unbounded stack usage (determined by runtime input) is
flawed since there is no way to reliably detect and recover from a stack
overflow.

--
Måns Rullgård
ma...@mansr.com

Kleuskes & Moos

unread,
May 6, 2011, 5:18:37 AM5/6/11
to
On 5 mei, 21:40, Lowell Gilbert <lguse...@be-well.ilk.org> wrote:

> "Kleuskes & Moos" <kleu...@xs4all.nl> writes:
>
>
>
>
>
>
>
>
>
> > On May 5, 11:37 am, boltar2...@boltar.world wrote:
> >> Hello
>
> >> If a program recurses too deeply there's always a chance that it could
> >> run out of stack space and die with a SIGSEGV. Is there any API that can
> >> tell you how much stack space is left or some other method to prevent this
> >> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
> >> way to do it.
>
> >> B2003
>
> > The only proper way to avoid stack-overflows is to prevent it from
> > happening in the first place. If at all possible (and frequently it
> > is) avoid recursion, if it's unavoidable, as it sometimes is, make
> > sure your design prevents those situations from arising in the first
> > place.
>
> > Trying to solve a design problem by imposing artificial limitations
> > seems a bad idea.
>
> Without in any way disagreeing, I can't imagine being able to come up
> with a contextual definition for "artificial" in that sentence.


Not inherent to the problem being solved or not part of the original,
abstract, algorithm being implemented. But you're right, "artificial"
is a weird word to use in the programming context.

bolta...@boltar.world

unread,
May 6, 2011, 5:18:43 AM5/6/11
to
On Fri, 6 May 2011 00:03:14 -0700 (PDT)
"robert...@yahoo.com" <robert...@yahoo.com> wrote:
>> If available, the value returned by alloca() will typically be the exact
>> bottom of the stack.
>>
>> alloca() doesn't conform to any standard, but it's a common extension.
>
>
>The more conventional terminology would have alloca() returning
>something near the *top* of the stack - and the direction of stack
>growth is irrelevant. But so will taking the address of an automatic,
>which is much simpler. The approximate bottom of the stack could be
>stored at entry to main().

I'd not heard of that function before. Calling it with zero - alloca(0) -
seems to work and return an address but the man page doesn't hint at
whether this should work or work in the future. Would it be best to call it
with a non zero value just in case the implementation changes?

B2003

Kleuskes & Moos

unread,
May 6, 2011, 5:34:50 AM5/6/11
to
On 5 mei, 22:17, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> "Kleuskes & Moos" <kleu...@xs4all.nl> writes:
>
> > On May 5, 11:37 am, boltar2...@boltar.world wrote:
> >> If a program recurses too deeply there's always a chance that it could
> >> run out of stack space and die with a SIGSEGV. Is there any API that can
> >> tell you how much stack space is left or some other method to prevent this
> >> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
> >> way to do it.
>
> > The only proper way to avoid stack-overflows is to prevent it from
> > happening in the first place. If at all possible (and frequently it
> > is) avoid recursion, if it's unavoidable, as it sometimes is, make
> > sure your design prevents those situations from arising in the first
> > place.
>
> I find this answer interesting, mainly because I think is suggests a
> very different view about programming.  I would not try to avoid
> recursion "if at all possible".  In general I consider that it is up to
> the system to provide the resources needed by a program and this
> includes stack for a reasonable amount of recursion.  These resources
> can run out of course, and the stack is special in that few languages
> provide any way to handle its exhaustion elegantly, but that does not
> seem enough reason to design out all recursion.

If you compare recursive to equivalent non-recursive algorithms, i
think you will find recursion is usually slower and demands (much)
more resources. So as a rule of thumb, don't recurse, unless you have
to.

And yes, i do have a background in embedded systems, where these
considerations are more important than on modern PC's.

> There are special cases: some environments are severely resource limited
> but there's no indication that the OP is using such a system and,
> anyway, I don't think you can make general rules from specific situations.

well... It's bread and butter for the science guys and the practice is
called "inference". Nevertheless, it's easy to show that recursive
algorithms are outperformed my equivalent non-recursive ones.

The saving grace of recursion is that recursive implementations are
usually easier to understand. If it weren't for that, i'd ban the
practice outright.

> > Trying to solve a design problem by imposing artificial limitations
> > seems a bad idea.
>
> Isn't this what you are doing?  Isn't the ban on recursion not an
> artificial limitation?

Nope. It follows from the design of processors and a wish to avoid
unneccesary overhead (passing arguments, stack-frame maintenance,
pushing and popping return adresses) and that's without even
considering any effect a call may have on things like branch-
prediction, instruction pipe-lines and such. A 'call' instruction is
(relative to a branch) quite expensive and prevents some optimisations
like loop-unrolling.

So no, that's not an artificial limitation, but a design
consideration.

Kleuskes & Moos

unread,
May 6, 2011, 5:44:06 AM5/6/11
to
On 6 mei, 00:45, Datesfat Chicks <datesfat.chi...@gmail.com> wrote:
> On Thu, 05 May 2011 21:17:09 +0100, Ben Bacarisse
>
>
>
>
>
>
>
>
>
> <ben.use...@bsb.me.uk> wrote:

Ah yes... Computers today are so fast and so big that sound design is
superfluous. Unless of course you are dealing with huge amounts of
data, since not only have memory sizes and processor speeds increased,
the amount of data they are supposed to work on have also increased
dramatically, thus cancelling out said improvements in size and speed.

If you don't believe me, try importing planet.osm (see
http://wiki.openstreetmap.org/wiki/Planet.osm)

25 years ago i never imagined i would fill a 54 Gb partition (20 Mb
was a pretty big HD those days), now i'm using a 1Tb drive and looking
at options for an 10 Tb Raid-server.

robert...@yahoo.com

unread,
May 6, 2011, 5:56:14 AM5/6/11
to
On May 6, 4:18 am, boltar2...@boltar.world wrote:
> On Fri, 6 May 2011 00:03:14 -0700 (PDT)
>


While still non-portable, just write a function like:

char *getappxroximatestackpointer()
{
char c;
return &c;
}


Call it once at the beginning of main(), and save that value. Call it
again when you're interested, subtract the two (watch the order of the
subtraction based on the direction of stack growth - or do an abs() ),
and you'll have a pretty close measurement of the amount of stack
space you're current using on many implementations.

As I said, non-portable (not least, the stack doesn't have to be
contiguous storage), but it will work on many implementations.

bolta...@boltar.world

unread,
May 6, 2011, 5:58:43 AM5/6/11
to
I implemented a simple program that would check when it was getting near
the limit of the stack but for some reason Linux sends it SEGV when
(assuming my program is correct) there should be 10 or 20K of space left
on the stack. Is this normal? I include the code below for someone to point
out the stupid mistake I must have made somewhere. I changed all void * to
char * just in case that was throwing the maths but it made no difference:

#include <stdio.h>
#include <stdlib.h>
#include <alloca.h>
#include <sys/time.h>
#include <sys/resource.h>

char *stack_top;
unsigned int depth;

void recurse()
{
unsigned long int dist;

++depth;
dist = labs((long int)(stack_top - (char *)&dist));
if (dist < 20000)
{
printf("Nearing limit: depth = %u, addr = %lu, dist to end = %lu
\n",
depth,&dist,dist);
}
recurse();
}


int main()
{
struct rlimit rl;
long int mult;
char *top;

depth = 0;

/* Find out which way stack is growing */
top = (char *)alloca(0);
stack_top = (char *)alloca(1);

if (stack_top < top)
{
puts("Stack grows downwards");
mult = -1;
}
else
{
puts("Stack grows upwards");
mult = 1;
}

/* Now find where max extent of the stack by getting its max size */
if (getrlimit(RLIMIT_STACK,&rl) == -1)
{
perror("getrlimit()");
return 1;
}
stack_top += (rl.rlim_cur * mult);
printf("Stack max size = %lu\n",rl.rlim_cur);
printf("Stack max extent = %lu\n",stack_top);
getchar();

recurse();
return 0;
}


B2003


Kleuskes & Moos

unread,
May 6, 2011, 5:58:49 AM5/6/11
to
On 6 mei, 02:07, James Kuyper <jameskuy...@verizon.net> wrote:
> On 05/05/2011 03:24 PM, Kleuskes & Moos wrote:
>
> > On May 5, 11:37 am, boltar2...@boltar.world wrote:
> >> Hello
>
> >> If a program recurses too deeply there's always a chance that it could
> >> run out of stack space and die with a SIGSEGV. Is there any API that can
> >> tell you how much stack space is left or some other method to prevent this
> >> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
> >> way to do it.
>
> >> B2003
>
> > The only proper way to avoid stack-overflows is to prevent it from
> > happening in the first place.
>
> And how do you do that? If the stack size is sufficiently limited, even
>
>         int main(int argc, char*argv[]) { return 0; }
>
> could overflow the stack.

That would be _very_ bad design, omitting to consider the basic
processing needs of the system.

> You can take measures to reduce the likelihood of overflow, but there's
> no easy way to know whether you've done enough.

Nope, there isn't. But then again, there's no easy way to produce any
software. Designing software _is_ hard and it will remain hard. Anyone
offering an "easy" solution to this kind of considerartions is either
a fool or a fraud.

> And doing enough on one
> system might be a complete waste of time on others.

True. But then again, considering the platform(s) your software will
run on is an integral part of the design process.

> The systems I use
> most frequently allow me to put a gigabyte or more of data on the stack;
> other systems are far more limited.

Nice. So what?

> > Trying to solve a design problem by imposing artificial limitations
> > seems a bad idea.
>
> So what's the design solution you propose to "prevent overflow"?

Avoid recursion if at all possible, for instance by using things like
finite state machines for parsing data. Recursion is quite costly. I'm
surprised i have to explain this in this forum.

And no, it's not _always_ possible to avoid recursion, not is it
_always_ wise to do so, but as a rule of thumb, it's good.

Kleuskes & Moos

unread,
May 6, 2011, 6:01:27 AM5/6/11
to

Bravo! That's he first sensible objection I've read in this thread,
and you're right. Recursive algorithms are usually easier to grasp.

Richard Kettlewell

unread,
May 6, 2011, 6:11:38 AM5/6/11
to
bolta...@boltar.world writes:

> I implemented a simple program that would check when it was getting near
> the limit of the stack but for some reason Linux sends it SEGV when
> (assuming my program is correct) there should be 10 or 20K of space left
> on the stack. Is this normal? I include the code below for someone to point
> out the stupid mistake I must have made somewhere. I changed all void * to
> char * just in case that was throwing the maths but it made no difference:

You're not taking into account the stack space above the call to main(),
which will include some amount of space used by Libc plus the command
line and environment.

--
http://www.greenend.org.uk/rjk/

bolta...@boltar.world

unread,
May 6, 2011, 6:16:57 AM5/6/11
to

Ah yes, of course. Doh!

Would there be any way to find out how much space they take up or alternatively
a way to find the starting base address of the stack when the process was
first exec'd?

B2003

Richard Kettlewell

unread,
May 6, 2011, 6:28:05 AM5/6/11
to

You could extend the stack down (or up as the case may be) until you get
a segfault and round the last good address to the page size. Not very
efficient though.

If you have access to the process's memory map you may be able to
identify the stack (e.g. /proc/self/maps on Linux, possibly analogous
interfaces on other platforms).

Split stack tricks like the gcc feature mentioned earlier will probably
betray you whatever you do...

--
http://www.greenend.org.uk/rjk/

James Kuyper

unread,
May 6, 2011, 7:08:37 AM5/6/11
to
On 05/06/2011 05:58 AM, Kleuskes & Moos wrote:
> On 6 mei, 02:07, James Kuyper <jameskuy...@verizon.net> wrote:
>> On 05/05/2011 03:24 PM, Kleuskes & Moos wrote:
>>
>>> On May 5, 11:37 am, boltar2...@boltar.world wrote:
>>>> Hello
>>
>>>> If a program recurses too deeply there's always a chance that it could
>>>> run out of stack space and die with a SIGSEGV. Is there any API that can
>>>> tell you how much stack space is left or some other method to prevent this
>>>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>>>> way to do it.
>>
>>>> B2003
>>
>>> The only proper way to avoid stack-overflows is to prevent it from
>>> happening in the first place.
>>
>> And how do you do that? If the stack size is sufficiently limited, even
>>
>> � � � � int main(int argc, char*argv[]) { return 0; }
>>
>> could overflow the stack.
>
> That would be _very_ bad design, omitting to consider the basic
> processing needs of the system.
>
>> You can take measures to reduce the likelihood of overflow, but there's
>> no easy way to know whether you've done enough.
>
> Nope, there isn't. But then again, there's no easy way to produce any
> software. Designing software _is_ hard and it will remain hard. Anyone
> offering an "easy" solution to this kind of considerartions is either
> a fool or a fraud.

I'm not complaining about it being too much work to deal with this
problem; I'm complaining about the fact that there's absolutely no
portable way to deal with it, at all. There's not too many non-portable
ways, either, and none that I'm aware of that are not prohibitively
expensive in terms of developer time.

>> And doing enough on one
>> system might be a complete waste of time on others.
>
> True. But then again, considering the platform(s) your software will
> run on is an integral part of the design process.

There are programs whose purpose is inherently platform specific;
failing to take into consideration platform-specific features is indeed
a design flaw for such programs. There are other programs, such as the
ones I work on, that are required to be portable to wide variety of
platforms.

My own programs are allowed to assume support for C90, a correspondingly
early version of POSIX, and three widely ported third-party libraries.
They are required to either work correctly or fail gracefully (if, for
instance, malloc() returns a null pointer) on any platform which meets
those requirements. Thankfully, one of those libraries (HDF) handles a
lot of the portability issues itself, freeing my code to concentrate on
other issues. If my code has to take into consideration
platform-specific characteristics that cannot be tested using code that
meets these requirements, then it has violated a design requirement.

It shouldn't be difficult to meet such a requirement. The fact that C
provides no mechanism for determining whether or not you can safely call
a particular function without overflowing the stack makes it
unnecessarily difficult to do so.

>> The systems I use
>> most frequently allow me to put a gigabyte or more of data on the stack;
>> other systems are far more limited.
>
> Nice. So what?

So, it would be ridiculous to blindly prohibit the use of techniques
that take advantage of the capabilities of such platforms, just because
they won't work on other platforms. It's a problem with C (though it's
not unique to C) that it provides no way for my code to determine
whether it can safely take advantage of such capabilities.

>>> Trying to solve a design problem by imposing artificial limitations
>>> seems a bad idea.
>>
>> So what's the design solution you propose to "prevent overflow"?
>
> Avoid recursion if at all possible, for instance by using things like
> finite state machines for parsing data. Recursion is quite costly. I'm
> surprised i have to explain this in this forum.
>
> And no, it's not _always_ possible to avoid recursion, not is it
> _always_ wise to do so, but as a rule of thumb, it's good.

I'm not strongly tempted by recursion, except for traversal of certain
tree-like data structures - an issue which has seldom come up in my work.

But arbitrarily avoiding recursion makes as much sense to me as
arbitrarily avoiding binary trees, or any other design option.
--
James Kuyper

James Kuyper

unread,
May 6, 2011, 7:13:07 AM5/6/11
to
On 05/06/2011 06:01 AM, Kleuskes & Moos wrote:
> On 6 mei, 10:59, boltar2...@boltar.world wrote:
...

>> Well its always theoretically possible to re-write recursive code in an
>> iterative format but generally it leads to any reasonably complex recursive
>> code becoming an unreadable mess.
>
> Bravo! That's he first sensible objection I've read in this thread,
> and you're right. Recursive algorithms are usually easier to grasp.

I think that this objection was considered so obvious by many of us that
we didn't feel it needed emphasizing. It was referred to indirectly in
some of the other messages people did post. As it turns out, we were
right: you were in fact fully aware of the issue, and reached your
conclusion despite that fact.

--
James Kuyper

Kleuskes & Moos

unread,
May 6, 2011, 11:18:49 AM5/6/11
to

Of course there are portable ways of dealing with it. For instance:
it's simple enough to pass long an extra integer parameter and
decrementing some initial setting on every recursion. If the counter
reaches zero, print an error message and abort. Portable as hell. The
only thing you need to take a stab at are sensible initial values,
high enough to allow wellbehaved programs to run, but low enough not
to allow runaway recursion to blast the stack to kingdom come.

> >> And doing enough on one
> >> system might be a complete waste of time on others.
>
> > True. But then again, considering the platform(s) your software will
> > run on is an integral part of the design process.
>
> There are programs whose purpose is inherently platform specific;
> failing to take into consideration platform-specific features is indeed
> a design flaw for such programs. There are other programs, such as the
> ones I work on, that are required to be portable to wide variety of
> platforms.

Which is difficult and takes different and way more consideration of
the platforms involved.

> My own programs are allowed to assume support for C90, a correspondingly
> early version of POSIX, and three widely ported third-party libraries.
> They are required to either work correctly or fail gracefully (if, for
> instance, malloc() returns a null pointer) on any platform which meets
> those requirements. Thankfully, one of those libraries (HDF) handles a
> lot of the portability issues itself, freeing my code to concentrate on
> other issues. If my code has to take into consideration
> platform-specific characteristics that cannot be tested using code that
> meets these requirements, then it has violated a design requirement.

Wonderful. Sounds like you're doing your job. I wasn't talking about
introducing anything platform specific, though. Hence the '(s)' at the
end of 'platform(s)'. And judging by what you write, you ponder your
target platforms as diligently as any good programmer would. Otherwise
your would not choose C90.

> It shouldn't be difficult to meet such a requirement. The fact that C
> provides no mechanism for determining whether or not you can safely call
> a particular function without overflowing the stack makes it
> unnecessarily difficult to do so.
>
> >> The systems I use
> >> most frequently allow me to put a gigabyte or more of data on the stack;
> >> other systems are far more limited.
>
> > Nice. So what?
>
> So, it would be ridiculous to blindly prohibit the use of techniques
> that take advantage of the capabilities of such platforms, just because
> they won't work on other platforms.

Replacing a recursive algorithm with a non-recursive one is not
detrimental to performance on high-volume machines. In fact, it may
improve performance quite dramatically, depending on what gets
replaced with what.

Besides, i'm not 'prohibiting' anything, 'blindly' or otherwise. I'm
providing a simple bit of advice: 'lay off recursion, if at all
possible. Often, there are other ways'.

> It's a problem with C (though it's
> not unique to C) that it provides no way for my code to determine
> whether it can safely take advantage of such capabilities.

So pick another language. Besides, you realize that you are argueing
my point? That is if 'such capabilities' did indeed refer to
'employing recursion safely'.

Depending on your definition of 'safety' recursion can be safely
employed. If the greatest risk you run is getting a SIGSEGV and you
and your customer can live with that, recursion isn't really a
problem, especially since the risk of an actual stack-overflow and
consequent SIGSEGV is mostly hypothetical in many cases.

In other cases, it may kill someone.

> >>> Trying to solve a design problem by imposing artificial limitations
> >>> seems a bad idea.
>
> >> So what's the design solution you propose to "prevent overflow"?
>
> > Avoid recursion if at all possible, for instance by using things like
> > finite state machines for parsing data. Recursion is quite costly. I'm
> > surprised i have to explain this in this forum.
>
> > And no, it's not _always_ possible to avoid recursion, not is it
> > _always_ wise to do so, but as a rule of thumb, it's good.
>
> I'm not strongly tempted by recursion, except for traversal of certain
> tree-like data structures - an issue which has seldom come up in my work.
>
> But arbitrarily avoiding recursion makes as much sense to me as
> arbitrarily avoiding binary trees, or any other design option.

The word "arbitrarily" should be taken out of that sentence and shot
at dawn. That's the second time your argumen is maily in the
adjectives and i resent that. There is nothing arbitrary about it.

Marco Parrone

unread,
May 6, 2011, 11:32:25 AM5/6/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

What about tail-calls? Sure not all recursive functions use tail-calls,
but for tail-calls gcc (and I guess other good compilers) do
tail-call-optimization. Can they be considered safe?

--
Marco Parrone <ma...@marcoparrone.com>
PGP Key fingerprint = 5E21 BED2 BF47 B3FB F17F 1DB4 D9BE B2B7 3C3A 07E2

Kleuskes & Moos

unread,
May 6, 2011, 11:51:53 AM5/6/11
to
On May 6, 5:32 pm, Marco Parrone <ma...@marcoparrone.com> wrote:

Who do you think i am? The Omniscient Oracle of C-Implementations?
well i'm not. If tail-recursion can be optimized away, then i guess it
should be no problem on platforms that actually do that. But if
portability is a issue, then it may produces a lot of other headaches,
since not ll compiler may support it, or support it in a subtly
different way.

RTFM, would be my advice, and use your own brain instead of mine.

Kleuskes & Moos

unread,
May 6, 2011, 12:02:12 PM5/6/11
to
On May 6, 1:13 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> On 05/06/2011 06:01 AM, Kleuskes & Moos wrote:
>
> > On 6 mei, 10:59, boltar2...@boltar.world wrote:
> ...
> >> Well its always theoretically possible to re-write recursive code in an
> >> iterative format but generally it leads to any reasonably complex recursive
> >> code becoming an unreadable mess.
>
> > Bravo! That's he first sensible objection I've read in this thread,
> > and you're right. Recursive algorithms are usually easier to grasp.
>
> I think that this objection was considered so obvious by many of us that
> we didn't feel it needed emphasizing.

It's always nice to provide your own applauding audience, so unless
the 'we' is a 'pluralis maiestatis' i'm no likely to take that very
seriously.

> It was referred to indirectly in
> some of the other messages people did post. As it turns out, we were
> right: you were in fact fully aware of the issue, and reached your
> conclusion despite that fact.

Of course i was aware of the issue, and yes i reached that onclusion
in spite of the fact that non-recursive algorithm's tend to be a bit
less intuitive.

But since you have already (and wrongly, i might add) argued that
recursive algorithms are unsafe to begin with i still feel my position
is sound on this matter.

Michael Press

unread,
May 6, 2011, 6:09:36 PM5/6/11
to
In article
<3bd2f399-754b-40dc...@k25g2000yqf.googlegroups.com>,

"Kleuskes & Moos" <kle...@xs4all.nl> wrote:

> The saving grace of recursion is that recursive implementations are
> usually easier to understand. If it weren't for that, i'd ban the
> practice outright.

First it's recursive functions,
then world domination.
You are on a slippery slope.

--
Michael Press

Michael Press

unread,
May 6, 2011, 6:41:03 PM5/6/11
to
In article <iq0d9b$1qe$1...@speranza.aioe.org>, bolta...@boltar.world
wrote:

Does not have to. Implement a dynamically growable list
with caching and memory recovery. Design a data
structure for the data in the list. Design for list
growing failure. Write push and pop. Presto! Chango!
You have written a recursive function.

--
Michael Press

James Kuyper

unread,
May 6, 2011, 9:46:51 PM5/6/11
to
On 05/06/2011 11:18 AM, Kleuskes & Moos wrote:
> On May 6, 1:08�pm, James Kuyper <jameskuy...@verizon.net> wrote:
...

>> I'm not complaining about it being too much work to deal with this
>> problem; I'm complaining about the fact that there's absolutely no
>> portable way to deal with it, at all. There's not too many non-portable
>> ways, either, and none that I'm aware of that are not prohibitively
>> expensive in terms of developer time.
>
> Of course there are portable ways of dealing with it. For instance:
> it's simple enough to pass long an extra integer parameter and
> decrementing some initial setting on every recursion. If the counter
> reaches zero, print an error message and abort. Portable as hell.

That has nothing to do with the issue I was referring to.

> ... The


> only thing you need to take a stab at are sensible initial values,

That is the issue I was referring to: there's so portable solution to
determining what initial values are "sensible".

>> My own programs are allowed to assume support for C90, a correspondingly
>> early version of POSIX, and three widely ported third-party libraries.
>> They are required to either work correctly or fail gracefully (if, for
>> instance, malloc() returns a null pointer) on any platform which meets
>> those requirements. Thankfully, one of those libraries (HDF) handles a
>> lot of the portability issues itself, freeing my code to concentrate on
>> other issues. If my code has to take into consideration
>> platform-specific characteristics that cannot be tested using code that
>> meets these requirements, then it has violated a design requirement.
>
> Wonderful. Sounds like you're doing your job. I wasn't talking about

> introducing anything platform specific, though. ...

Any non-zero "sensible initial value' is definitely platform specific.

> ... Hence the '(s)' at the


> end of 'platform(s)'. And judging by what you write, you ponder your
> target platforms as diligently as any good programmer would. Otherwise
> your would not choose C90.

Don't give me the credit for that decision; it was made by our clients
at a time when C90 was half the age that C99 is now. At the time they
made that choice, C90 was substantially more widely supported than C99
is now, but it was still a decision that cost them significant
portability at that time. If it were up to me, right now, I'd choose C99
and C++98.

>>>> The systems I use
>>>> most frequently allow me to put a gigabyte or more of data on the stack;
>>>> other systems are far more limited.
>>
>>> Nice. So what?
>>
>> So, it would be ridiculous to blindly prohibit the use of techniques
>> that take advantage of the capabilities of such platforms, just because
>> they won't work on other platforms.
>
> Replacing a recursive algorithm with a non-recursive one is not
> detrimental to performance on high-volume machines. In fact, it may
> improve performance quite dramatically, depending on what gets
> replaced with what.

The main reason I'd go for a recursive algorithm, when appropriate,
would be maintainability, not performance, because the code would be
easier to write and easier to read. My comments about those
platform-specific capabilities is that I don't like being forced away
from that choice by reason of the existence of platforms where there
wasn't enough stack to support the recursion, unless it's actually
necessary to port my code to such platforms.

> Besides, i'm not 'prohibiting' anything, 'blindly' or otherwise. I'm
> providing a simple bit of advice: 'lay off recursion, if at all
> possible. Often, there are other ways'.

Your use of "if at all possible" is a bit disingenuous. As I'm sure you
already know, it's always possible to avoid recursion; it's never a
mandatory approach - just a convenient one.

>> It's a problem with C (though it's
>> not unique to C) that it provides no way for my code to determine
>> whether it can safely take advantage of such capabilities.
>
> So pick another language.

I'm not aware of any other language that provides a mechanism for
avoiding or dealing with this problem, either. There might be one, but I
don't know it's name.

> ... Besides, you realize that you are argueing


> my point? That is if 'such capabilities' did indeed refer to
> 'employing recursion safely'.

I was thinking more broadly than that: "employing function calls
safely". C provides no mechanism for determining whether or not there's
enough stack left to safely call any function - recursively or not -
note even the initial call to main(). Nor does it provide a mechanism
for dealing with the consequences, if there's not enough stack.
As I said above, I don't know of any other language that provides such
mechanisms, either.

...


>> But arbitrarily avoiding recursion makes as much sense to me as
>> arbitrarily avoiding binary trees, or any other design option.
>
> The word "arbitrarily" should be taken out of that sentence and shot
> at dawn. That's the second time your argumen is maily in the
> adjectives and i resent that. There is nothing arbitrary about it.

Your choice seems insufficiently motivated to me; "arbitrary" is a fine
word for conveying that judgment. It might be a mistaken judgment, but
then the fault is in the judgment, not in the word used to express it.
--
James Kuyper

Kleuskes & Moos

unread,
May 7, 2011, 3:28:20 AM5/7/11
to
On May 7, 3:46 am, James Kuyper <jameskuy...@verizon.net> wrote:
> On 05/06/2011 11:18 AM, Kleuskes & Moos wrote:
>
> > On May 6, 1:08 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> ...
> >> I'm not complaining about it being too much work to deal with this
> >> problem; I'm complaining about the fact that there's absolutely no
> >> portable way to deal with it, at all. There's not too many non-portable
> >> ways, either, and none that I'm aware of that are not prohibitively
> >> expensive in terms of developer time.
>
> > Of course there are portable ways of dealing with it. For instance:
> > it's simple enough to pass long an extra integer parameter and
> > decrementing some initial setting on every recursion. If the counter
> > reaches zero, print an error message and abort. Portable as hell.
>
> That has nothing to do with the issue I was referring to.

In that case, you failed to make it clear what you were talking about.
I took it to be "applying recursion safely", but if it did not,
please...

> > ... The
> > only thing you need to take a stab at are sensible initial values,
>
> That is the issue I was referring to: there's so portable solution to
> determining what initial values are "sensible".

Usually there are. You can inspect the application domain and make an
estimate how deep the recursion could sensibly be. It is application
dependent, but then again, any software is application specific.

<snip>

> Any non-zero "sensible initial value' is definitely platform specific.

Nope, it's application specific, and merely assumes the platform has
enough memory to cope, and that's an assumption inherent to any
program, even your ultra-portable stuff.

If you're not capable of making an estimate of the maximum depth of
reccusion which will occur and how that jives with your target
platforms, every installation of your software is equivalent to just
rolling the dice.

And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
suggest you cap it yourself, since SIGSEGV is a luxury item on many
platforms.

> > ... Hence the '(s)' at the
> > end of 'platform(s)'. And judging by what you write, you ponder your
> > target platforms as diligently as any good programmer would. Otherwise
> > your would not choose C90.
>
> Don't give me the credit for that decision; it was made by our clients
> at a time when C90 was half the age that C99 is now. At the time they
> made that choice, C90 was substantially more widely supported than C99
> is now, but it was still a decision that cost them significant
> portability at that time. If it were up to me, right now, I'd choose C99
> and C++98.

And why? Perhaps because you know that version is supported on all
your target platforms? If you're not sure that it's supported across
all your targets, your choice would be irresponsable, and you do not
strike me as being that.

Besides, my point was only to note that target platforms must be
considered, and i consider it made and acknowledged.

<snip>

> > Replacing a recursive algorithm with a non-recursive one is not
> > detrimental to performance on high-volume machines. In fact, it may
> > improve performance quite dramatically, depending on what gets
> > replaced with what.
>
> The main reason I'd go for a recursive algorithm, when appropriate,
> would be maintainability, not performance, because the code would be
> easier to write and easier to read. My comments about those
> platform-specific capabilities is that I don't like being forced away
> from that choice by reason of the existence of platforms where there
> wasn't enough stack to support the recursion, unless it's actually
> necessary to port my code to such platforms.

Nobody's forcing anybody. You're right o consider readability and
maintainability important objectives. As a matter of fact, so do i.

However in my experience, the difference between recursive and non-
recursive algorithm's isn't that big. Unless of course you merely
produce a kludge instead of a design.

> > Besides, i'm not 'prohibiting' anything, 'blindly' or otherwise. I'm
> > providing a simple bit of advice: 'lay off recursion, if at all
> > possible. Often, there are other ways'.
>
> Your use of "if at all possible" is a bit disingenuous. As I'm sure you
> already know, it's always possible to avoid recursion; it's never a
> mandatory approach - just a convenient one.

Perhaps, and i tend to disagree without your providing some sort of
proof for that suggested theorem. It' also a costly and (at times)
risky one. For the reasons me, you and others have already mentioned.
The risk is in fact what gave rise to this thread.

The advice i provided still seems reasonable.

> >> It's a problem with C (though it's
> >> not unique to C) that it provides no way for my code to determine
> >> whether it can safely take advantage of such capabilities.
>
> > So pick another language.
>
> I'm not aware of any other language that provides a mechanism for
> avoiding or dealing with this problem, either. There might be one, but I
> don't know it's name.

So why say it's a problem with 'C', when in fact it's a problem in any
given language you know of?

> > ... Besides, you realize that you are argueing
> > my point? That is if 'such capabilities' did indeed refer to
> > 'employing recursion safely'.
>
> I was thinking more broadly than that: "employing function calls
> safely".

Oh, wow...

> C provides no mechanism for determining whether or not there's
> enough stack left to safely call any function - recursively or not -
> note even the initial call to main(). Nor does it provide a mechanism
> for dealing with the consequences, if there's not enough stack.
> As I said above, I don't know of any other language that provides such
> mechanisms, either.

C does not have a mechanism to guarantee 'any' statement will execute
successfully. Nor does any other (compiled) language. In interpreted
languages, however, it's a doddle.

But then again, the interpreter for such a language is compiled and so
the problem is merely shifted. Is there a point in any of this, or are
you merely providing verbal foliage?

> >> But arbitrarily avoiding recursion makes as much sense to me as
> >> arbitrarily avoiding binary trees, or any other design option.
>
> > The word "arbitrarily" should be taken out of that sentence and shot
> > at dawn. That's the second time your argumen is maily in the
> > adjectives and i resent that. There is nothing arbitrary about it.
>
> Your choice seems insufficiently motivated to me; "arbitrary" is a fine
> word for conveying that judgment. It might be a mistaken judgment, but
> then the fault is in the judgment, not in the word used to express it.

Well, pigheadedness seems a nice word to describe my judgment, so i'll
guess i'll use that.

Have a nice day. Remind me never to buy your software.

Kleuskes & Moos

unread,
May 7, 2011, 5:53:07 AM5/7/11
to
On May 7, 12:41 am, Michael Press <rub...@pacbell.net> wrote:
> In article <iq0d9b$1q...@speranza.aioe.org>, boltar2...@boltar.world

:) Thanx...

Dr Nick

unread,
May 7, 2011, 8:03:47 AM5/7/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

> The saving grace of recursion is that recursive implementations are
> usually easier to understand. If it weren't for that, i'd ban the
> practice outright.

I'd be intrigued to see a non-recursive implementation of the code of a
library for creating, reading into memory and printing JSON data
structures as an obvious example.
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Barry Margolin

unread,
May 7, 2011, 9:01:15 AM5/7/11
to
In article <87ei4ar...@temporary-address.org.uk>,
Dr Nick <3-no...@temporary-address.org.uk> wrote:

> "Kleuskes & Moos" <kle...@xs4all.nl> writes:
>
> > The saving grace of recursion is that recursive implementations are
> > usually easier to understand. If it weren't for that, i'd ban the
> > practice outright.
>
> I'd be intrigued to see a non-recursive implementation of the code of a
> library for creating, reading into memory and printing JSON data
> structures as an obvious example.

Recursive algorithms for applications like this are generally not a
problem. The recursion depth is proportional to the level of nesting of
the data structures, which is usually in the single digits, and
practically never more than a couple dozen. This is negligible on any
non-toy implementation.

Where recursion becomes an issue is when you use it for every element of
a sequential data structure. For instance, if a parser's algorithmic
structure were something like:

read_first_token
process_token
recurse(rest of document)

it would probably run into a stack limit on most implementations when
processing any real input.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

James Kuyper

unread,
May 7, 2011, 9:01:59 AM5/7/11
to
On 05/07/2011 03:28 AM, Kleuskes & Moos wrote:
> On May 7, 3:46�am, James Kuyper <jameskuy...@verizon.net> wrote:
>> On 05/06/2011 11:18 AM, Kleuskes & Moos wrote:
>>
>>> On May 6, 1:08 pm, James Kuyper <jameskuy...@verizon.net> wrote:
...
>>> ... The
>>> only thing you need to take a stab at are sensible initial values,
>>
>> That is the issue I was referring to: there's so portable solution to
>> determining what initial values are "sensible".
>
> Usually there are. You can inspect the application domain and make an
> estimate how deep the recursion could sensibly be. It is application
> dependent, but then again, any software is application specific.

If your application must run on a given platform, and that platform
doesn't provide enough room on the stack to handle the worst case, it
would be helpful to have some way of detecting, from within the program,
that it is in fact the worst case - before it causes your program to
abort. That is just as much a problem with recursive solutions, as
non-recursive ones.

...


> And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
> suggest you cap it yourself, since SIGSEGV is a luxury item on many
> platforms.

I don't rely on SIGSEGV. Among the restrictions placed on my code by our
client is a prohibition on using the features of <signal.h>. My code has
to play nicely with a library provided by another vendor, and that
library presumably (I've never checked) does things with signals that my
code is not allowed to interfere with.

...


>>>> It's a problem with C (though it's
>>>> not unique to C) that it provides no way for my code to determine
>>>> whether it can safely take advantage of such capabilities.
>>
>>> So pick another language.
>>
>> I'm not aware of any other language that provides a mechanism for
>> avoiding or dealing with this problem, either. There might be one, but I
>> don't know it's name.
>
> So why say it's a problem with 'C', when in fact it's a problem in any
> given language you know of?

Yes. The fact that it's a problem with every language I know of implies,
in particular, that it is a problem with C. There's no contradiction
there, and I even emphasized precisely that point by saying "(though
it's not unique to C)".

...


> Have a nice day. Remind me never to buy your software.

You can't buy it - it's only available for free - another requirement
imposed by our client.
--
James Kuyper

Dr Nick

unread,
May 7, 2011, 9:32:44 AM5/7/11
to
Barry Margolin <bar...@alum.mit.edu> writes:

> In article <87ei4ar...@temporary-address.org.uk>,
> Dr Nick <3-no...@temporary-address.org.uk> wrote:
>
>> "Kleuskes & Moos" <kle...@xs4all.nl> writes:
>>
>> > The saving grace of recursion is that recursive implementations are
>> > usually easier to understand. If it weren't for that, i'd ban the
>> > practice outright.
>>
>> I'd be intrigued to see a non-recursive implementation of the code of a
>> library for creating, reading into memory and printing JSON data
>> structures as an obvious example.
>
> Recursive algorithms for applications like this are generally not a
> problem. The recursion depth is proportional to the level of nesting of
> the data structures, which is usually in the single digits, and
> practically never more than a couple dozen. This is negligible on any
> non-toy implementation.

I know. It's not me who is arguing that recursion is evil and in an
ideal world ought to be banned!

> Where recursion becomes an issue is when you use it for every element of
> a sequential data structure. For instance, if a parser's algorithmic
> structure were something like:
>
> read_first_token
> process_token
> recurse(rest of document)
>
> it would probably run into a stack limit on most implementations when
> processing any real input.

Most implementations where the compiler doesn't optimise tail recursion
away, anyway.

Kleuskes & Moos

unread,
May 7, 2011, 9:35:08 AM5/7/11
to
On May 7, 3:01 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> On 05/07/2011 03:28 AM, Kleuskes & Moos wrote:
>
> > On May 7, 3:46 am, James Kuyper <jameskuy...@verizon.net> wrote:
> >> On 05/06/2011 11:18 AM, Kleuskes & Moos wrote:
>
> >>> On May 6, 1:08 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> ...
> >>> ... The
> >>> only thing you need to take a stab at are sensible initial values,
>
> >> That is the issue I was referring to: there's so portable solution to
> >> determining what initial values are "sensible".
>
> > Usually there are. You can inspect the application domain and make an
> > estimate how deep the recursion could sensibly be. It is application
> > dependent, but then again, any software is application specific.
>
> If your application must run on a given platform, and that platform
> doesn't provide enough room on the stack to handle the worst case, it
> would be helpful to have some way of detecting, from within the program,
> that it is in fact the worst case - before it causes your program to
> abort. That is just as much a problem with recursive solutions, as
> non-recursive ones.

Sure, and theres a slew of profilers and other nifty gadgets around
that will help you with such problems. That's still at compile time,
but if you need to inspect such things at runtime, either the design
(and with that, the hardware restrictions such as minimum required
memory size) or the implementation is seriously flawed.

In non-recursive software, the problem is relatively easily solved
using such tools, recursive software is a bit more tricky, since you
CANNOT calculate a maximum call-tree height in advance. Hence my
suggestion not to use recursive algorithms, if you can avoid it.

> > And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
> > suggest you cap it yourself, since SIGSEGV is a luxury item on many
> > platforms.
>
> I don't rely on SIGSEGV. Among the restrictions placed on my code by our
> client is a prohibition on using the features of <signal.h>.

Let me see if i got that straight... You've been arguing that there's
no portable way to cap recursion and a runaway recursive process will
overflow the stack, resulting in a SIGSEGV...

That means the only cap placed on recursion in such a case is the
kernel THROWING a SIGSEGV. This will terminate the process in question
quite effectively, but not to the liking f your client, i imagine.

> My code has
> to play nicely with a library provided by another vendor, and that
> library presumably (I've never checked) does things with signals that my
> code is not allowed to interfere with.

I guess it's more a question of portability, even the POSIX standard
leaves room for variation across platforms, so using <signal.h> would
be a major liability if portability is a major concern.

<snip>

> > So why say it's a problem with 'C', when in fact it's a problem in any
> > given language you know of?
>
> Yes. The fact that it's a problem with every language I know of implies,
> in particular, that it is a problem with C. There's no contradiction
> there, and I even emphasized precisely that point by saying "(though
> it's not unique to C)".

It's like telling a man "the problem with this car is that you might
run out of gas driving it, if you don't pay attention to the fuel-
gauge". No contradiction, but that doesn't mean nothing is wrong with
it.

> > Have a nice day. Remind me never to buy your software.
>
> You can't buy it - it's only available for free - another requirement
> imposed by our client.


I like your client.

Rainer Weikusat

unread,
May 7, 2011, 11:27:36 AM5/7/11
to
Dr Nick <3-no...@temporary-address.org.uk> writes:
> Barry Margolin <bar...@alum.mit.edu> writes:

[...]

>> Where recursion becomes an issue is when you use it for every element of
>> a sequential data structure. For instance, if a parser's algorithmic
>> structure were something like:
>>
>> read_first_token
>> process_token
>> recurse(rest of document)
>>
>> it would probably run into a stack limit on most implementations when
>> processing any real input.
>
> Most implementations where the compiler doesn't optimise tail recursion
> away, anyway.

Eh ... you do understand that 'compiler detects that programmer was a
crackpot and works around that automatically' implies that recursion
is probematic, do you?

Dr Nick

unread,
May 7, 2011, 11:43:12 AM5/7/11
to
Rainer Weikusat <rwei...@mssgmbh.com> writes:

I don't even understand the question.

Niklas Holsti

unread,
May 7, 2011, 12:11:52 PM5/7/11
to
James Kuyper wrote:

> I'm not aware of any other language that provides a mechanism for
> avoiding or dealing with this problem, either. There might be one, but I
> don't know it's name.

Stack overflow in the Ada language is expected to raise the
Storage_Error exception, which can be caught and handled in the Ada
program itself. However, this can depend on compilation options, for
example the GNU Ada compiler gnat needs the option -fstack-check for
Storage_Error to work in this way, otherwise stack overflow just signals
a SIGSEGV.

You can also do the recursive computation in a dedicated thread (an Ada
"task") and control the amount of stack space allocated to the thread by
means of pragma Storage_Size, with a static constant value or a
dynamically calculated value, when the thread is created.

If you expect that Storage_Error may be raised by stack overflow during
recursion, it is a good idea to put the handler for Storage_Error
outside the recursion, so that the recursion unwinds and stack space is
again available before the handler is entered.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .

bolta...@boltar.world

unread,
May 7, 2011, 12:17:17 PM5/7/11
to
On Fri, 06 May 2011 15:41:03 -0700
Michael Press <rub...@pacbell.net> wrote:
>> Well its always theoretically possible to re-write recursive code in an
>> iterative format but generally it leads to any reasonably complex recursive
>> code becoming an unreadable mess.
>
>Does not have to. Implement a dynamically growable list
>with caching and memory recovery. Design a data
>structure for the data in the list. Design for list
>growing failure. Write push and pop. Presto! Chango!
>You have written a recursive function.

So to implement a recursive function iteratively you manual stacks?
Wow, who knew!
</sarcasm>

B2003

bolta...@boltar.world

unread,
May 7, 2011, 12:23:26 PM5/7/11
to
On Sat, 07 May 2011 09:01:15 -0400
Barry Margolin <bar...@alum.mit.edu> wrote:
>Where recursion becomes an issue is when you use it for every element of
>a sequential data structure. For instance, if a parser's algorithmic
>structure were something like:
>
>read_first_token
>process_token
>recurse(rest of document)
>
>it would probably run into a stack limit on most implementations when
>processing any real input.

Some language interpreters internally recurse along with any recursing in the
program they're interpreting. Awk is a good example and will happily crash
with a SIGSEGV due to recursion. eg:

BEGIN {
blah()
}

function blah()
{
blah()
}


B2003


William Ahern

unread,
May 7, 2011, 12:32:44 PM5/7/11
to
In comp.lang.c Dr Nick <3-no...@temporary-address.org.uk> wrote:
> "Kleuskes & Moos" <kle...@xs4all.nl> writes:

> > The saving grace of recursion is that recursive implementations are
> > usually easier to understand. If it weren't for that, i'd ban the
> > practice outright.

> I'd be intrigued to see a non-recursive implementation of the code of a
> library for creating, reading into memory and printing JSON data
> structures as an obvious example.

I would hope such library was non-recursive. This has Denial of Service
written all over it, considering that JSON data usually comes from untrusted
sources, and even many scripting languages use the C stack for calls in the
scripting language.

I suppose one could add kludges like a depth counter. I say kludge because
stack memory in C is such an opaque and unmeasureable quantity.

Kleuskes & Moos

unread,
May 7, 2011, 12:59:49 PM5/7/11
to
On May 7, 2:03 pm, Dr Nick <3-nos...@temporary-address.org.uk> wrote:

> "Kleuskes & Moos" <kleu...@xs4all.nl> writes:
>
> > The saving grace of recursion is that recursive implementations are
> > usually easier to understand. If it weren't for that, i'd ban the
> > practice outright.
>
> I'd be intrigued to see a non-recursive implementation of the code of a
> library for creating, reading into memory and printing JSON data
> structures as an obvious example.

Not that difficult. I suspect most (stable) JSON versions are non-
recursive and (like many other parsers and parser generators such as
flex), employ a FSM instead.

For good reason, too, as William Ahern points out.

William Ahern

unread,
May 7, 2011, 1:08:19 PM5/7/11
to

Not necessarily... or rather, not exactly. Take a red-black tree
implementation as an example. You can recurse, you can keep a manual stack,
or you can use parent pointers. The beauty of the last option is that once
you've allocated an object there's no possibility for collateral insertion
failure (i.e. out-of-memory). If the sibling and parent pointers are members
of the application object, then you can guarantee tree insertion.

The downside is that maintaining parent pointers is more of a headache than
a simple stack, and also most of those parent pointers are wasting memory,
because you only required as many as were necessary to maintain a stack for
the particular operation.

I prefer all of my data structures to be free from resource acquisition
failures. This makes RAII more convenient in C.

My point is, you can roll the stack into whichever data structure you're
working on, and a linked-list stack can be far more informative than an
array stack (manual or implicit). Sometimes iterative algorithms like this
are easier to comprehend. I'm writing a non-recursive implementation of
left-leaning red-black trees, and insertion in particular seems much simpler
in its iterative form.

Casper H.S. Dik

unread,
May 7, 2011, 1:22:04 PM5/7/11
to
bolta...@boltar.world writes:

And instead you run out of heap.

Casper
--

Kleuskes & Moos

unread,
May 7, 2011, 1:24:42 PM5/7/11
to
On May 7, 7:22 pm, Casper H.S. Dik <Casper....@OrSPaMcle.COM> wrote:

Aye. But there's portable ways of checking that: malloc will just
return NULL, which can be handled nicely and safely, instead of the
system throwing a SIGSEGV.

Kleuskes & Moos

unread,
May 7, 2011, 1:30:04 PM5/7/11
to
On May 7, 3:01 pm, Barry Margolin <bar...@alum.mit.edu> wrote:
<snip>

> read_first_token
> process_token
> recurse(rest of document)

Where i work, code like that is likely to get you fired on the spot.

Casper H.S. Dik

unread,
May 7, 2011, 1:32:55 PM5/7/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

>On May 7, 7:22=A0pm, Casper H.S. Dik <Casper....@OrSPaMcle.COM> wrote:
>> boltar2...@boltar.world writes:
>> >On Fri, 06 May 2011 15:41:03 -0700
>> >Michael Press <rub...@pacbell.net> wrote:

>> >>> Well its always theoretically possible to re-write recursive code in =
>an
>> >>> iterative format but generally it leads to any reasonably complex rec=


>ursive
>> >>> code becoming an unreadable mess.
>>
>> >>Does not have to. Implement a dynamically growable list
>> >>with caching and memory recovery. Design a data
>> >>structure for the data in the list. Design for list
>> >>growing failure. Write push and pop. Presto! Chango!
>> >>You have written a recursive function.
>> >So to implement a recursive function iteratively you manual stacks?
>> >Wow, who knew!
>> ></sarcasm>
>>
>> And instead you run out of heap.

>Aye. But there's portable ways of checking that: malloc will just
>return NULL, which can be handled nicely and safely, instead of the
>system throwing a SIGSEGV.

As other have pointed out, it is possible to catch this and properly
terminate (the same when you run out of stack)

Casper
--

Kleuskes & Moos

unread,
May 7, 2011, 2:43:39 PM5/7/11
to
On 7 mei, 19:32, Casper H.S. Dik <Casper....@OrSPaMcle.COM> wrote:

Ah... That's news. I was under the distinct impression that stack-
overflows are impossible to catch. Please expound on your method to
reliably (and portably) catch stack-overflows in recursive algorithms.

I'm quite curious.

Nobody

unread,
May 7, 2011, 2:57:06 PM5/7/11
to
On Sat, 07 May 2011 13:03:47 +0100, Dr Nick wrote:

>> The saving grace of recursion is that recursive implementations are
>> usually easier to understand. If it weren't for that, i'd ban the
>> practice outright.
>
> I'd be intrigued to see a non-recursive implementation of the code of a
> library for creating, reading into memory and printing JSON data
> structures as an obvious example.

"Real" parsers are seldom recursive, but instead use an explicit stack
(pushdown automaton). Recursive-descent parsers tend to be used by people
who are unfamiliar with existing practice and just used the first approach
they thought of.

The main drawbacks of recursive parsers are:

1. They need hacks to avoid infinite recursion when dealing with
left-recursive grammars.
2. They're "pull" orientated; once they start reading an entity, they
don't return until they've finished.

For anything dealing with networking, point #2 favours the automaton
approach, as you can push tokens into the parser as and when they arrive
over the network, and get on with something else (e.g. redraw) when no
more data is available. To obtain a similar interface, a recursive parser
needs to be turned inside-out, using continuations, threads (coroutines),
etc.

OTOH, actually processing the subsequent parse tree naturally lends itself
to a recursive algorithm. Of course, any such algorithm can be converted
to a non-recursive algorithm, but the result will typically resemble
machine code (i.e. with explicit "call" and "return" operations), which
largely defeats the point of having high-level languages.

Nobody

unread,
May 7, 2011, 3:01:37 PM5/7/11
to
On Sat, 07 May 2011 10:30:04 -0700, Kleuskes & Moos wrote:

>> read_first_token
>> process_token
>> recurse(rest of document)
>
> Where i work, code like that is likely to get you fired on the spot.

OTOH, in functional languages (ML, Haskell, etc) it's the correct
approach. Iteration is just tail recursion; it's the compiler's job to
optimise it.

Such languages don't have iteration primitives, and any standard
functions for performing iteration are written using recursion.

Nobody

unread,
May 7, 2011, 3:20:55 PM5/7/11
to
On Sat, 07 May 2011 10:08:19 -0700, William Ahern wrote:

> Not necessarily... or rather, not exactly. Take a red-black tree
> implementation as an example. You can recurse, you can keep a manual stack,
> or you can use parent pointers. The beauty of the last option is that once
> you've allocated an object there's no possibility for collateral insertion
> failure (i.e. out-of-memory). If the sibling and parent pointers are members
> of the application object, then you can guarantee tree insertion.

OTOH, if the parent pointers are part of the object, the object can only
be inserted into a single tree (or, at least, a fixed set of trees).

Dr Nick

unread,
May 7, 2011, 3:24:40 PM5/7/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

It's not so much the parser, it's the data store and the output.

suppose you have the following sort of interface:

object *new_atom(char *contents);
object *new_list(void);
object *new_array(void);
void add_to_list(object* destination, object *source);
void add_to_array(object* destination, char *name, object *source);
void print_object(object *item);

How do you code print_object in a non-recursive way? Remember, we might
have done something like this:

object *s1 = new_atom("hello");
object *s2 = new_atom("world");
object *l1 = new_list();
object *l2 = new_list();
add_to_list(l1,s1);
add_to_list(l1,s2);
add_to_list(l2,s2);
add_to_list(l2,s1);
add_to_list(l2,l1);
object a1 = new_array();
add_to_array(a1,"hi",s1);
add_to_array(a1,"simple",l1);
add_to_array(a1,"notsosimple",l2);
add_to_array(a1,"what",s2);
print_object(a1);

Somewhere you are going to need a queue. I'm far from convinced
(particularly given the way many modern machines act when memory gets
tight; even if you turn off optimistic allocation, many will grind to a
halt thrashing the swap months before malloc returns NULL) that catching
malloc failures in a manual queue is any better than catching a signal
from the stack filling up.

Dr Nick

unread,
May 7, 2011, 3:26:08 PM5/7/11
to
Nobody <nob...@nowhere.com> writes:

> On Sat, 07 May 2011 13:03:47 +0100, Dr Nick wrote:
>
>>> The saving grace of recursion is that recursive implementations are
>>> usually easier to understand. If it weren't for that, i'd ban the
>>> practice outright.
>>
>> I'd be intrigued to see a non-recursive implementation of the code of a
>> library for creating, reading into memory and printing JSON data
>> structures as an obvious example.
>

> OTOH, actually processing the subsequent parse tree naturally lends itself
> to a recursive algorithm. Of course, any such algorithm can be converted
> to a non-recursive algorithm, but the result will typically resemble
> machine code (i.e. with explicit "call" and "return" operations), which
> largely defeats the point of having high-level languages.

Indeed. Parsing is a red herring here. It's how you represent and then
how you traverse a naturally recursive data structure, such as pretty
well any modern language has.

Kleuskes & Moos

unread,
May 7, 2011, 3:37:31 PM5/7/11
to
On May 7, 9:01 pm, Nobody <nob...@nowhere.com> wrote:
> On Sat, 07 May 2011 10:30:04 -0700, Kleuskes & Moos wrote:
> >> read_first_token
> >> process_token
> >> recurse(rest of document)
>
> > Where i work, code like that is likely to get you fired on the spot.
>
> OTOH, in functional languages (ML, Haskell, etc) it's the correct
> approach. Iteration is just tail recursion; it's the compiler's job to
> optimise it.

True, and you can add Prolog to the list. But we're discussing 'C'
IIRC, and my remark was made in that context.

> Such languages don't have iteration primitives, and any standard
> functions for performing iteration are written using recursion.

Correct.

Nobody

unread,
May 7, 2011, 3:41:33 PM5/7/11
to
On Sat, 07 May 2011 10:24:42 -0700, Kleuskes & Moos wrote:

>> And instead you run out of heap.
>
> Aye. But there's portable ways of checking that: malloc will just
> return NULL, which can be handled nicely and safely, instead of the
> system throwing a SIGSEGV.

You forgot a case: malloc() succeeds but the process receives SIGKILL when
it actually tries to use the allocated memory (OOM-killer).

This can be prevented by disabling overcommit, but on some systems the
cost of doing so is unacceptably high, i.e. you may need to provide ten
times (or worse) as much swap space as will ever be used in practice.

Keith Thompson

unread,
May 7, 2011, 3:48:14 PM5/7/11
to
Nobody <nob...@nowhere.com> writes:
> On Sat, 07 May 2011 10:24:42 -0700, Kleuskes & Moos wrote:
>>> And instead you run out of heap.
>>
>> Aye. But there's portable ways of checking that: malloc will just
>> return NULL, which can be handled nicely and safely, instead of the
>> system throwing a SIGSEGV.
>
> You forgot a case: malloc() succeeds but the process receives SIGKILL when
> it actually tries to use the allocated memory (OOM-killer).

Actaully the OOM-killer kills *some* process, not necessarily the one
that just tried to use allocated memory.

> This can be prevented by disabling overcommit, but on some systems the
> cost of doing so is unacceptably high, i.e. you may need to provide ten
> times (or worse) as much swap space as will ever be used in practice.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Nobody

unread,
May 7, 2011, 4:06:44 PM5/7/11
to
On Sat, 07 May 2011 20:26:08 +0100, Dr Nick wrote:

>> OTOH, actually processing the subsequent parse tree naturally lends itself
>> to a recursive algorithm. Of course, any such algorithm can be converted
>> to a non-recursive algorithm, but the result will typically resemble
>> machine code (i.e. with explicit "call" and "return" operations), which
>> largely defeats the point of having high-level languages.
>
> Indeed. Parsing is a red herring here. It's how you represent and then
> how you traverse a naturally recursive data structure, such as pretty
> well any modern language has.

An interpreted programming language may still use a "hand-rolled" call
stack, as it can simplify the implementation of features such as
exceptions and continuations.

More mundane processing of recursive structures[1] is likely to just use
recursive function calls.

[1] And also divide-and-conquer algorithms on linear structures, e.g.
quicksort.

Kleuskes & Moos

unread,
May 7, 2011, 4:53:41 PM5/7/11
to

Spot on. I hadn't thought of that one.

<reads up on OOM-Killer>

However, if i understand http://linux-mm.org/OOM_Killer correctly
(interesting read, BTW), processes may be killed, but not with a
SIGSEGV, so the program does not crash, but is terminated by the
system. This may not be the desired outcome, but it's not a bug.

And as K. Thompson points out, it may actually kill a different
process, based on the result of badness() and the Law of Least
Surprize.

In any case, malloc should still return NULL if allocation fails, and
the manpage says it does. That processes may be terminated by the
system as a consequence of that call is another matter entirely.

William Ahern

unread,
May 7, 2011, 5:26:24 PM5/7/11
to

True, but I don't think I've worked on a software application that required
insertion into an arbitrary number of trees, excepting actual language
implementations. In fact, I don't think I've ever needed an object to be
directly contained in more than two or three trees or lists simultaneously.
An application basically boils down to the selection and use of data
structures. If you modify the application, and add or remove a data
structure, then modify the code accordingly. I can't remember the last time
I used a void pointer in a tree or list structure in C. It just seems wrong.
I don't believe I've ever needed a complex data structure to contain objects
of unrelated type.

Things get fuzzy when one has to be concerned with modules or other third
party code. But if objects are being wrapped or subclassed, then the user
can handle implementing a container. Anyhow, those patterns are more common
in languages that handle memory management, provide very generic or
universal primitives like hashes, and often are loosely typed. In those
cases a failed allocation typically leads to the entire application
aborting; for good reason, too. If statically describing a data and object
model is too onerous, then recovering gracefully from memory errors is also
likely to be impractical or error prone.

In my opinion there's a misunderstanding about code reuse and modularity.
The myth is that reuse means the code itself should be reuseable and
extendable. But reuse is about the reuse of a solution, not code. A suitable
high-level API oriented toward the solution is usually all the programmer
should be concerned with. You don't subclass business solutions. That's
nonsensical. It's exceedingly rare to need to subclass an HTTP parser, for
example. If people are trying to tinker with the innards, then the API is
poor. This is where the beauty of Open Source comes in; you can actually fix
the API and improve the solution rather than hack around things.

Michael Press

unread,
May 7, 2011, 5:31:18 PM5/7/11
to
In article <3tag98-...@wilbur.25thandClement.com>,
William Ahern <wil...@wilbur.25thandClement.com> wrote:

I thought of that, but technically it is not a tree;
and I mean to stand on that technicality. It needs
O(depth of the tree) storage to traverse a tree.

> I prefer all of my data structures to be free from resource acquisition
> failures. This makes RAII more convenient in C.
>
> My point is, you can roll the stack into whichever data structure you're
> working on, and a linked-list stack can be far more informative than an
> array stack (manual or implicit). Sometimes iterative algorithms like this
> are easier to comprehend. I'm writing a non-recursive implementation of
> left-leaning red-black trees, and insertion in particular seems much simpler
> in its iterative form.

This means that you allocate all memory necessary
before beginning to build the data structure? And
never need to allocate more memory?

--
Michael Press

Michael Press

unread,
May 7, 2011, 6:04:04 PM5/7/11
to
In article
<8f882810-2e4e-404f...@e35g2000yqc.googlegroups.com>,

A FSM has a hard coded bound on depth.
Can you make these parser generators
admit to reaching their limit by asking
them to parse input that goes too deep?

--
Michael Press

Barry Margolin

unread,
May 7, 2011, 6:12:28 PM5/7/11
to
In article <iq3rlr$6gp$1...@speranza.aioe.org>, bolta...@boltar.world
wrote:

That's not recursing in the parser, that's recursing in the interpreter.
It't not surprising that a script with infinite recursion might cause
infinite recursion in the interpreter.

And even if the interpreter didn't recurse when the script does, it
still has to keep the script's stack somewhere. So if it doesn't blow
out the interpreter's stack, it will run out of VM eventually.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Kleuskes & Moos

unread,
May 7, 2011, 6:15:29 PM5/7/11
to
On May 8, 12:04 am, Michael Press <rub...@pacbell.net> wrote:
> In article
> <8f882810-2e4e-404f-a660-fdcbb6736...@e35g2000yqc.googlegroups.com>,

>  "Kleuskes & Moos" <kleu...@xs4all.nl> wrote:
>
> > On May 7, 2:03 pm, Dr Nick <3-nos...@temporary-address.org.uk> wrote:
> > > "Kleuskes & Moos" <kleu...@xs4all.nl> writes:
>
> > > > The saving grace of recursion is that recursive implementations are
> > > > usually easier to understand. If it weren't for that, i'd ban the
> > > > practice outright.
>
> > > I'd be intrigued to see a non-recursive implementation of the code of a
> > > library for creating, reading into memory and printing JSON data
> > > structures as an obvious example.
>
> > Not that difficult. I suspect most (stable) JSON versions are non-
> > recursive and (like many other parsers and parser generators such as
> > flex), employ a FSM instead.
>
> A FSM has a hard coded bound on depth.

Since when?

> Can you make these parser generators
> admit to reaching their limit by asking
> them to parse input that goes too deep?

Huh?

Barry Margolin

unread,
May 7, 2011, 6:15:53 PM5/7/11
to
In article
<614e7c8f-8298-45d1...@g12g2000yqd.googlegroups.com>,

"Kleuskes & Moos" <kle...@xs4all.nl> wrote:

I wasn't actually suggesting it, I think it's actually an unlikely way
to write C programs. Most parsers are either state machines, which are
purely iterative, or they recurse for nested structures while iterating
for the tokens within it.

Keith Thompson

unread,
May 7, 2011, 6:16:26 PM5/7/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

The problem, as I see it, is that by returning a non-null value, the
system promises the program that it can use the allocated memory.
Later invoking the OOM killer when the program actually tries to
use it seems to me to be a violation of that promise.

I understand that overly optimistic allocation can make sense
for stack space; a program will likely never use as much stack
as it could. But I'm not convinced that the same thing applies
to malloc(). If a program calls malloc(), it's telling the OS that
it *will* use that memory (and probably soon). If the system can't
fulfill the request, it should refuse it in the first place.

Do many actual programs malloc() much more memory than they're
actually going to use?

Måns Rullgård

unread,
May 7, 2011, 6:42:02 PM5/7/11
to
Keith Thompson <ks...@mib.org> writes:

They do indeed, and that practice is the very reason for overcommit
existing in the first place.

--
Måns Rullgård
ma...@mansr.com

Ian Collins

unread,
May 7, 2011, 6:46:50 PM5/7/11
to
On 05/ 8/11 04:32 AM, William Ahern wrote:
> In comp.lang.c Dr Nick<3-no...@temporary-address.org.uk> wrote:

>> "Kleuskes& Moos"<kle...@xs4all.nl> writes:
>
>>> The saving grace of recursion is that recursive implementations are
>>> usually easier to understand. If it weren't for that, i'd ban the
>>> practice outright.
>
>> I'd be intrigued to see a non-recursive implementation of the code of a
>> library for creating, reading into memory and printing JSON data
>> structures as an obvious example.
>
> I would hope such library was non-recursive. This has Denial of Service
> written all over it, considering that JSON data usually comes from untrusted
> sources, and even many scripting languages use the C stack for calls in the
> scripting language.
>
> I suppose one could add kludges like a depth counter. I say kludge because
> stack memory in C is such an opaque and unmeasureable quantity.

It would take an extremely large JSON object (at least as big as the
stack) to overflow a server's stack. A prudent sever admin would set a
realistic limit on request sizes.

--
Ian Collins

Scott Lurndal

unread,
May 7, 2011, 7:22:16 PM5/7/11
to
bolta...@boltar.world writes:
>On Thu, 05 May 2011 02:51:35 -0700
>China Blue Veins <chine...@yahoo.com> wrote:
>>In article <iptr5l$tkb$1...@speranza.aioe.org>, bolta...@boltar.world wrote:
>>
>>> Hello
>>>
>>> If a program recurses too deeply there's always a chance that it could
>>> run out of stack space and die with a SIGSEGV. Is there any API that can
>>> tell you how much stack space is left or some other method to prevent this
>>> in C/C++? I realise I could catch the signal but thats a pretty damn ugly
>>> way to do it.
>>>
>>> B2003
>>
>>On some systems there is setrlimit which can be used to set the stack size. On
>>unix you can start with 'man -k limit' and examine individual man pages.
>
>Actually I suppose i could use getrlimit to get the stack size but is there
>a clean way of actually finding the current position of top of the stack? I
>could take the addresses of local variables but that seems a bit prone to
>error.

Why would it be prone to error? It may be off by a couple of bytes, depending
on how many automatics (and alloca() calls) before you take the address. There
is no other portable way. (When I've written debuggers, I've used the following
non-portable function with G++ - lose the static keyword for GCC):

#define ALWAYS_INLINE inline __attribute__((always_inline))

static ALWAYS_INLINE uintptr_t
get_rsp()
{
uintptr_t rsp;

__asm__ __volatile__ ("movq %%rsp, %0": "=r" (rsp));

return rsp;
}

in 32-bit mode:

__asm__ __volatile__ ("movl %%esp, %0": "=r" (rsp));


You can also use __builtin_frame_address (see info gcc).

scott

Scott Lurndal

unread,
May 7, 2011, 7:27:46 PM5/7/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

>Nope. It follows from the design of processors and a wish to avoid
>unneccesary overhead (passing arguments, stack-frame maintenance,
>pushing and popping return adresses) and that's without even
>considering any effect a call may have on things like branch-
>prediction, instruction pipe-lines and such.

> A 'call' instruction is
>(relative to a branch) quite expensive and prevents some optimisations
>like loop-unrolling.

Actually, it is not that expensive with modern processors (which include
internal return stacks, and for Intel x86_64, passes parameters in
general purpose registers (up to the first 6 args)). Branch prediction
doesn't have anything to do with calls (which are just unconditional branches).

A function call to a recursive function with less than 6 args will involve
one memory reference on the call instruction (to push the return address).

scott

Keith Thompson

unread,
May 7, 2011, 10:37:19 PM5/7/11
to
Måns Rullgård <ma...@mansr.com> writes:
> Keith Thompson <ks...@mib.org> writes:
[...]

>> The problem, as I see it, is that by returning a non-null value, the
>> system promises the program that it can use the allocated memory.
>> Later invoking the OOM killer when the program actually tries to
>> use it seems to me to be a violation of that promise.
>>
>> I understand that overly optimistic allocation can make sense
>> for stack space; a program will likely never use as much stack
>> as it could. But I'm not convinced that the same thing applies
>> to malloc(). If a program calls malloc(), it's telling the OS that
>> it *will* use that memory (and probably soon). If the system can't
>> fulfill the request, it should refuse it in the first place.
>>
>> Do many actual programs malloc() much more memory than they're
>> actually going to use?
>
> They do indeed, and that practice is the very reason for overcommit
> existing in the first place.

(I'm keeping this in comp.unix.programmer and comp.lang.c for now, since
it still seems relevant to both. Feel free to set followups if that
changes.)

Why? If a program finds it needs more memory, it can always call
malloc() again, or realloc() to increase the size of an existing block.
Can you give an example where it makes sense to allocate a large block
of memory and not use it (or use only a small initial portion of it)?

James Kuyper

unread,
May 7, 2011, 10:52:12 PM5/7/11
to
On 05/07/2011 09:35 AM, Kleuskes & Moos wrote:
> On May 7, 3:01�pm, James Kuyper <jameskuy...@verizon.net> wrote:
>> On 05/07/2011 03:28 AM, Kleuskes & Moos wrote:
...
>>> And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
>>> suggest you cap it yourself, since SIGSEGV is a luxury item on many
>>> platforms.
>>
>> I don't rely on SIGSEGV. Among the restrictions placed on my code by our
>> client is a prohibition on using the features of <signal.h>.
>
> Let me see if i got that straight... You've been arguing that there's
> no portable way to cap recursion and a runaway recursive process will

Nor any portable way to cap non-recursive function calls that overflow
the stack, either. My complaint has nothing to do with recursion; which
is really the point of my objection to your objections to recursion -
the problem you're trying to avoid cannot be absolutely avoided in any
context; and it can be avoided (with less than absolute success) for
recursive programs just as it can be with non-recursive ones.

> overflow the stack, resulting in a SIGSEGV...

Not necessarily; that's just what happens on systems which have SIGSEGV,
or some equivalent mechanism.

> That means the only cap placed on recursion in such a case is the
> kernel THROWING a SIGSEGV. This will terminate the process in question
> quite effectively, but not to the liking f your client, i imagine.

I'm commenting on two entirely distinct issues.

1. The lack of any portable way for a C code to avoid making a function
call that requires more stack space than is currently available; there's
not even a portable way to recover from the fact that such a call was
made, which would be a markedly inferior solution to this problem.
Handling a SIGSEGV is indeed an example of such an inferior solution,
but it's not mandated by the C standard.

2. You made a comment about me relying on SIGSEGV; I don't, and I
explained why I can't; but this is only a side issue, not directly
relevant to the first point. My first point would still apply even if I
had no such restrictions placed on my code, and even if it ran on a
platform which misbehaved in some entirely different way when it ran out
of automatically allocated memory. It doesn't even matter whether or not
it uses a stack to provide that memory, or some other mechanism, such as
activation records.

>>> So why say it's a problem with 'C', when in fact it's a problem in any
>>> given language you know of?
>>
>> Yes. The fact that it's a problem with every language I know of implies,
>> in particular, that it is a problem with C. There's no contradiction
>> there, and I even emphasized precisely that point by saying "(though
>> it's not unique to C)".
>
> It's like telling a man "the problem with this car is that you might
> run out of gas driving it, if you don't pay attention to the fuel-
> gauge". No contradiction, but that doesn't mean nothing is wrong with
> it.

Every car I've looked at since my second car search, about 1994, has had
as standard equipment either a buzzer or a flashing light (usually both)
that would provided advance warning if the fuel level was too low. I
would in fact consider any car lacking such warning mechanisms to be
seriously defective.

Where's the buzzer/warning light for approaching the stack limits in C?
The non-portable best that I can do is handling SIGSEGV, which is a
little late; the problem I wanted to avoid has already happened by the
time that signal is raised.
For that matter, I'd settle for the analog of a gas gauge - where can I
find it?

I don't want to make this sound like more of a problem than it really
is; I've been writing programs in C for 30 years now despite the lack of
this feature - it hasn't stopped me, and has only occasionally slowed me
down. However, but I'm aware of the problems that the lack of such a
feature can cause, and of my helplessness in dealing with those problems.
My point is that recursive code makes only a quantitative difference in
how difficult it is to avoid such problems; it's not a fundamental
qualitative difference. The improvement in the maintainability of the
code because of the use of a recursive algorithm can easily be
sufficient to make up for that difference. Your advice to "avoid
recursion, if at all possible" is an overreaction.
--
James Kuyper

Barry Margolin

unread,
May 8, 2011, 12:28:15 AM5/8/11
to
In article <lnaaeyp...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

> Do many actual programs malloc() much more memory than they're
> actually going to use?

Probably not, but that's not relevant. malloc isn't the system call
that adds memory to the process, sbrk is. When malloc find that the
heap is used up, it probably DOES request more memory with sbrk than it
needs just to satisfy the current request.

William Ahern

unread,
May 8, 2011, 12:49:18 AM5/8/11
to

Relying on prudent admins, instead of safe by design, explains where the
industry is in terms of security. But I'll admit that blowing the stack is
low on the list of things to freak out over, especially given the behavior
on most systems is to safely terminate. And if a recursive algorithm is much
simpler and easier to code safely, it's a compromise I'd make, everything
else being equal.

io_x

unread,
May 8, 2011, 1:17:16 AM5/8/11
to

"Keith Thompson" <ks...@mib.org> ha scritto nel messaggio
news:ln62plq...@nuthaus.mib.org...

> Måns Rullgård <ma...@mansr.com> writes:
>> Keith Thompson <ks...@mib.org> writes:
> [...]
>>> The problem, as I see it, is that by returning a non-null value, the
>>> system promises the program that it can use the allocated memory.
>>> Later invoking the OOM killer when the program actually tries to
>>> use it seems to me to be a violation of that promise.
>>>
>>> I understand that overly optimistic allocation can make sense
>>> for stack space; a program will likely never use as much stack
>>> as it could. But I'm not convinced that the same thing applies
>>> to malloc(). If a program calls malloc(), it's telling the OS that
>>> it *will* use that memory (and probably soon). If the system can't
>>> fulfill the request, it should refuse it in the first place.
>>>
>>> Do many actual programs malloc() much more memory than they're
>>> actually going to use?
>>
>> They do indeed, and that practice is the very reason for overcommit
>> existing in the first place.
>
> (I'm keeping this in comp.unix.programmer and comp.lang.c for now, since
> it still seems relevant to both. Feel free to set followups if that
> changes.)
>
> Why? If a program finds it needs more memory, it can always call
> malloc() again, or realloc() to increase the size of an existing block.
> Can you give an example where it makes sense to allocate a large block
> of memory and not use it (or use only a small initial portion of it)?

for to make the OS in trouble


io_x

unread,
May 8, 2011, 1:17:23 AM5/8/11
to

"Niklas Holsti" <niklas...@tidorum.invalid> ha scritto nel messaggio
news:92l9a8...@mid.individual.net...
> James Kuyper wrote:
>
>> I'm not aware of any other language that provides a mechanism for
>> avoiding or dealing with this problem, either. There might be one, but I
>> don't know it's name.
>
> Stack overflow in the Ada language is expected to raise the Storage_Error
> exception, which can be caught and handled in the Ada program itself. However,
> this can depend on compilation options, for example the GNU Ada compiler gnat
> needs the option -fstack-check for Storage_Error to work in this way,
> otherwise stack overflow just signals a SIGSEGV.

for me it is the same recursive function that has to return error; as any other
visible and invisible error; something like

int recursivefunction(int* result, int* value)
{int a, b, c; // esp-=1024 esp<
if(GetCurrentStackPointerValue()<StackLimit-1000)
return -1
....
return recursivefunction(result, value)
}

> You can also do the recursive computation in a dedicated thread (an Ada
> "task") and control the amount of stack space allocated to the thread by means
> of pragma Storage_Size, with a static constant value or a dynamically
> calculated value, when the thread is created.

ok if understand well

> If you expect that Storage_Error may be raised by stack overflow during
> recursion, it is a good idea to put the handler for Storage_Error outside the
> recursion, so that the recursion unwinds and stack space is again available
> before the handler is entered.
>
> --
> Niklas Holsti
> Tidorum Ltd
> niklas holsti tidorum fi
> . @ .


William Ahern

unread,
May 8, 2011, 1:49:03 AM5/8/11
to
In comp.unix.programmer Michael Press <rub...@pacbell.net> wrote:
> In article <3tag98-...@wilbur.25thandClement.com>,
> William Ahern <wil...@wilbur.25thandClement.com> wrote:
<snip>

> > I prefer all of my data structures to be free from resource acquisition
> > failures. This makes RAII more convenient in C.
> >
> > My point is, you can roll the stack into whichever data structure you're
> > working on, and a linked-list stack can be far more informative than an
> > array stack (manual or implicit). Sometimes iterative algorithms like this
> > are easier to comprehend. I'm writing a non-recursive implementation of
> > left-leaning red-black trees, and insertion in particular seems much simpler
> > in its iterative form.

> This means that you allocate all memory necessary
> before beginning to build the data structure? And
> never need to allocate more memory?

On BSD and Linux systems see, for example, the list implementations in
/usr/include/sys/queue.h.

You're allocating new memory for each object at construction time, but
there's no separate allocation when manipulating a list, tree, etc. So list
insertion has no failure mode. Tree insertion can often be arranged to have
no failure mode, but in any event it wouldn't be because of an OOM
condition.

I eschew hash tables in C because they require dynamic memory management,
and because of algorithmic complexity attacks--unless your hash is
cryptographically strong, an attacker could make your hash table work in
O(N), but an O(1) crypto-strong hashed table tends to be slower than
O(lg(N)) with my data sets and usage patterns, or at least not worth the
hassle. Most of my work involves network-facing software, where paranoia is
a virtue.

William Ahern

unread,
May 8, 2011, 2:06:00 AM5/8/11
to
In comp.unix.programmer Barry Margolin <bar...@alum.mit.edu> wrote:
> In article <lnaaeyp...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:

> > Do many actual programs malloc() much more memory than they're
> > actually going to use?

> Probably not, but that's not relevant. malloc isn't the system call
> that adds memory to the process, sbrk is. When malloc find that the
> heap is used up, it probably DOES request more memory with sbrk than it
> needs just to satisfy the current request.

But not that much relative to overall system memory on the systems vanilla
Linux tends to run on. We're talking a few megabytes per sbrk() or mmap()
request, not gigabytes. And where system memory tends to be quite limited,
as on embedded systems, the case for overcommit is even worse. This is why
the iPhone didn't support multitasking originally; and why Android requires
all applications to run effectively stateless, AFAIU.

The best argument for overcommit involves fork(), but I don't think it's
persuasive today, if it ever was persuasive. Memory hogs like databases
don't fork+exec programs once up and running, so you're not sacrificing
convenience or practicality by preventing a 1GB process from forking.

Jorgen Grahn

unread,
May 8, 2011, 2:19:00 AM5/8/11
to
On Sat, 2011-05-07, Rainer Weikusat wrote:
> Dr Nick <3-no...@temporary-address.org.uk> writes:
>> Barry Margolin <bar...@alum.mit.edu> writes:
>
> [...]

>
>>> Where recursion becomes an issue is when you use it for every element of
>>> a sequential data structure. For instance, if a parser's algorithmic
>>> structure were something like:
>>>
>>> read_first_token
>>> process_token
>>> recurse(rest of document)
>>>
>>> it would probably run into a stack limit on most implementations when
>>> processing any real input.
>>
>> Most implementations where the compiler doesn't optimise tail recursion
>> away, anyway.
>
> Eh ... you do understand that 'compiler detects that programmer was a
> crackpot and works around that automatically' implies that recursion
> is probematic, do you?

It's the other way around -- recursion is less problematic if it can
be compiled into something which runs efficiently. Compilers and
interpreters for functional languages (Haskell, ML, Lisp ...) where
recursion replaces loops, have been doing this for decades.

(Not that I use recursion a lot myself.)

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Dr Nick

unread,
May 8, 2011, 2:30:27 AM5/8/11
to
Barry Margolin <bar...@alum.mit.edu> writes:

> In article <lnaaeyp...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>
>> Do many actual programs malloc() much more memory than they're
>> actually going to use?
>
> Probably not, but that's not relevant. malloc isn't the system call
> that adds memory to the process, sbrk is. When malloc find that the
> heap is used up, it probably DOES request more memory with sbrk than it
> needs just to satisfy the current request.

Ah! Like Keith I'd often wondered about just why this happens. That
makes a lot of sense: malloc wants big chunks of memory to parcel out
from.
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Nobody

unread,
May 8, 2011, 3:15:08 AM5/8/11
to
On Sat, 07 May 2011 13:53:41 -0700, Kleuskes & Moos wrote:

> However, if i understand http://linux-mm.org/OOM_Killer correctly
> (interesting read, BTW), processes may be killed, but not with a
> SIGSEGV, so the program does not crash, but is terminated by the
> system. This may not be the desired outcome, but it's not a bug.

It's effectively sent SIGKILL, which cannot be caught, rather than
SIGSEGV, which can be caught.

SIGSEGV isn't really the same thing as a crash (largely because "crash"
isn't a clearly-defined concept). I normally consider a crash to be when
the process' data gets corrupted resulting in the process going out of
control (which usually ends when it receives SIGSEGV due to dereferencing
a bogus pointer).

Niklas Holsti

unread,
May 8, 2011, 3:16:34 AM5/8/11
to
io_x wrote:
> "Niklas Holsti" <niklas...@tidorum.invalid> ha scritto nel messaggio
> news:92l9a8...@mid.individual.net...
>> James Kuyper wrote:
>>
>>> I'm not aware of any other language that provides a mechanism for
>>> avoiding or dealing with this problem, either. There might be one, but I
>>> don't know it's name.
>> Stack overflow in the Ada language is expected to raise the Storage_Error
>> exception, which can be caught and handled in the Ada program itself. However,
>> this can depend on compilation options, for example the GNU Ada compiler gnat
>> needs the option -fstack-check for Storage_Error to work in this way,
>> otherwise stack overflow just signals a SIGSEGV.
>
> for me it is the same recursive function that has to return error; as any other
> visible and invisible error; something like
>
> int recursivefunction(int* result, int* value)
> {int a, b, c; // esp-=1024 esp<
> if(GetCurrentStackPointerValue()<StackLimit-1000)
> return -1
> ....
> return recursivefunction(result, value)
> }

Can be done in Ada:

function recursivefunction (result, value : access integer)
return integer
is
<declarations...>
begin
<statements...>
return recursivefunction (result, value);
exception
when Storage_Error => return -1;
end recursivefunction;

In the function above, if any of the <statements> or the recursive call
causes stack overflow, the Storage_Error handler is entered and the call
returns -1. If the <declarations> cause stack overflow, the handler for
the present call of recursivefunction is bypassed and the Storage_Error
exception is propagated to the outer call level, entering the handler at
that level.

With the gnat Ada compiler it seems that stack overflow at any point in
the function (declarations or statements) abandons the present call and
propagates Storage_Error to the outer call. So if the stack overflow is
detected on recursion level 9999 it is handled on level 9998. But that
should not matter much.

io_x

unread,
May 8, 2011, 3:24:57 AM5/8/11
to

"io_x" <a...@b.c.invalid> ha scritto nel messaggio
news:4dc6271d$0$38643$4faf...@reader1.news.tin.it...

for meke the prog in trouble becouse loose the controll


Kleuskes & Moos

unread,
May 8, 2011, 4:26:13 AM5/8/11
to
On May 8, 4:52 am, James Kuyper <jameskuy...@verizon.net> wrote:
> On 05/07/2011 09:35 AM, Kleuskes & Moos wrote:
>
> > On May 7, 3:01 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> >> On 05/07/2011 03:28 AM, Kleuskes & Moos wrote:
> ...
> >>> And you don't strike me as a gambler. You rely on SIGSEGV so cap it, I
> >>> suggest you cap it yourself, since SIGSEGV is a luxury item on many
> >>> platforms.
>
> >> I don't rely on SIGSEGV. Among the restrictions placed on my code by our
> >> client is a prohibition on using the features of <signal.h>.
>
> > Let me see if i got that straight... You've been arguing that there's
> > no portable way to cap recursion and a runaway recursive process will
>
> Nor any portable way to cap non-recursive function calls that overflow
> the stack, either.

Of course there is. Do not make the call-tree big enough to do that,
and if there's no recursion involved, it's rather easy to derive a max
stack size and put it in your hardware requirements.

> My complaint has nothing to do with recursion; which
> is really the point of my objection to your objections to recursion -
> the problem you're trying to avoid cannot be absolutely avoided in any
> context; and it can be avoided (with less than absolute success) for
> recursive programs just as it can be with non-recursive ones.


So, basically, you're saying, since your cannot avoid it absolutely
anyway, all advice is useless, and it's OK to use tail-recursion to
skip through a 2 Mb file. Ok.

> > overflow the stack, resulting in a SIGSEGV...
>
> Not necessarily; that's just what happens on systems which have SIGSEGV,
> or some equivalent mechanism.

Aye. And on systems that haven't got it, results are undefined and
(under some circumstances) may kill someone. As i said, a SIGSEGV is a
luxury item, if you overflow he stack and your system does NOT signal
a segfault, you're in even more trouble. Or rather, the poor sod who's
using your software is in deep shit.

You're like a mechanic telling me how brakes are useless since they
cannot guarantee that you won't run into anything, anyway. Even when
applying the brakes, you can still run over a pedestrian or a 20-ton
truck, so basically brakes are useless.

> > That means the only cap placed on recursion in such a case is the
> > kernel THROWING a SIGSEGV. This will terminate the process in question
> > quite effectively, but not to the liking f your client, i imagine.
>
> I'm commenting on two entirely distinct issues.
>
> 1. The lack of any portable way for a C code to avoid making a function
> call that requires more stack space than is currently available; there's
> not even a portable way to recover from the fact that such a call was
> made, which would be a markedly inferior solution to this problem.
> Handling a SIGSEGV is indeed an example of such an inferior solution,
> but it's not mandated by the C standard.

Nope, it's mandated by the POSIX standard instead. But i'm getting
curious, pray describe the superior solution you have in mind.

Secondly if you are making function calls that exhaust the stack (this
time in one go, apparently) either the software you write is pretty
shitty or the platform you're running it on is not suited to your
software and your analysis was pretty shitty.

> 2. You made a comment about me relying on SIGSEGV; I don't, and I
> explained why I can't; but this is only a side issue, not directly
> relevant to the first point.

Ok. So you have different ways of dealing with a stack-overflow. Again
i'm curious how you do that. Portably, since you've stressed the
importance of that quality in your software.

> My first point would still apply even if I
> had no such restrictions placed on my code, and even if it ran on a
> platform which misbehaved in some entirely different way when it ran out
> of automatically allocated memory.

Ok. lets say that if your software crashes, chances are someone gets
killed. It may seem ridiculous to you, but on one of the projects i
worked on, that was a possible consequence. The project employed 16kB
of ROM and 16kB of RAM, has a clock-frequency of 1-kHz and needs to
run for 10 years on a single battery charge.

So your point applies, and i can't guarantee that a single function-
call won't crash the system. Mission impossible? Mission
irresponsable?

What alternatives do you have?

I'm just glad i _can_ guarantee it does not run out of stack space,
since the call-tree wasn't that high.

> It doesn't even matter whether or not
> it uses a stack to provide that memory, or some other mechanism, such as
> activation records.

I'm getting curious again, what sort of killer-functioncall did you
have in mind? Something which allocates a 2Gb array on stack and
passes it around by value? If that's the case, your design stinks.

> >>> So why say it's a problem with 'C', when in fact it's a problem in any
> >>> given language you know of?
>
> >> Yes. The fact that it's a problem with every language I know of implies,
> >> in particular, that it is a problem with C. There's no contradiction
> >> there, and I even emphasized precisely that point by saying "(though
> >> it's not unique to C)".
>
> > It's like telling a man "the problem with this car is that you might
> > run out of gas driving it, if you don't pay attention to the fuel-
> > gauge". No contradiction, but that doesn't mean nothing is wrong with
> > it.
>
> Every car I've looked at since my second car search, about 1994, has had
> as standard equipment either a buzzer or a flashing light (usually both)
> that would provided advance warning if the fuel level was too low. I
> would in fact consider any car lacking such warning mechanisms to be
> seriously defective.

That wasn't the point.

> Where's the buzzer/warning light for approaching the stack limits in C?

If your design is good, you don't need one. If you feel you need one,
C is not the language for you.

> The non-portable best that I can do is handling SIGSEGV, which is a
> little late; the problem I wanted to avoid has already happened by the
> time that signal is raised.

True. And catching a SIGSEGV isn't a good idea, unless you're writing
a debugger. The only portable option (and it's been dubbed a kludge)
is to keep track of recursions manually and implement the damn buzzer
yourself. A better solution is to avoid that kind of situation in the
first place. Laying off recursion goes a long way of achieving that.

> For that matter, I'd settle for the analog of a gas gauge - where can I
> find it?

Again, if you feel you need thngs like that, C is not the language for
you. Get used to doing you thinking in design-time instead of at run-
time.

> I don't want to make this sound like more of a problem than it really
> is; I've been writing programs in C for 30 years now despite the lack of
> this feature - it hasn't stopped me, and has only occasionally slowed me
> down.

Glad to hear that.

> However, but I'm aware of the problems that the lack of such a
> feature can cause, and of my helplessness in dealing with those problems.

You're NOT helpless. You're merely LATE.

> My point is that recursive code makes only a quantitative difference in
> how difficult it is to avoid such problems; it's not a fundamental
> qualitative difference.

It's perhaps the difference between p=1E10-6 and 0.9, which makes it a
pretty big quantitative difference. And again, with proper analysis
beforehand (and that has been my job for a number of years) you can
reduce it to p=0, which makes it a quantitative difference.

> The improvement in the maintainability of the
> code because of the use of a recursive algorithm can easily be
> sufficient to make up for that difference.

Perhaps on your job. As i mentioned earlier, if you and your customer
can live with the occasional crash, there's no real problem. But
that's your design choice.

> Your advice to "avoid recursion, if at all possible" is an overreaction.

In the thread at hand, which started with someone being worried about
stack overflow, it still makes more sense than your "It's the fault of
C for not providing me with a buzzer, flashing lights and a fuel-
gauge. there's nothing you can do about it, so live with it and
recurse the hell out of your processor, since it's much easier that
way."

So i'll stick with my advice, thank-you-very-much.

Dr Nick

unread,
May 8, 2011, 4:37:52 AM5/8/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

> You're like a mechanic telling me how brakes are useless since they
> cannot guarantee that you won't run into anything, anyway. Even when
> applying the brakes, you can still run over a pedestrian or a 20-ton
> truck, so basically brakes are useless.

No, you're the person who is telling me that because I can drive too
fast and kill people I shouldn't be using the accelerator.

Måns Rullgård

unread,
May 8, 2011, 5:36:55 AM5/8/11
to
Keith Thompson <ks...@mib.org> writes:

It sometimes makes sense to allocate a large block and only use small
scattered portions of it to simplify indexing. Using only a small
_initial_ portion obviously never makes sense.

--
Måns Rullgård
ma...@mansr.com

Malcolm McLean

unread,
May 8, 2011, 6:21:20 AM5/8/11
to
On May 8, 11:26 am, "Kleuskes & Moos" <kleu...@xs4all.nl> wrote:
>
> The only portable option (and it's been dubbed a kludge)
> is to keep track of recursions manually and implement the damn buzzer
> yourself. A better solution is to avoid that kind of situation in the
> first place. Laying off recursion goes a long way of achieving that.
>
The problem is that if a data structure is inherently recursive then
it can be very difficult to write non-recursive code to traverse it.
The usual solution is to use a stack. However this doesn't really
solve the problem, it just means that the stack overflow occurs, if it
occurs, in a different place.

Willem

unread,
May 8, 2011, 6:44:42 AM5/8/11
to
Malcolm McLean wrote:
) The problem is that if a data structure is inherently recursive then
) it can be very difficult to write non-recursive code to traverse it.
) The usual solution is to use a stack. However this doesn't really
) solve the problem, it just means that the stack overflow occurs, if it
) occurs, in a different place.

In a place where you can catch the error and terminate cleanly.

One of the major problems with C is that you cannot reliably handle
stack overflows, unless you roll your own stack.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Casper H.S. Dik

unread,
May 8, 2011, 6:57:35 AM5/8/11
to
"Kleuskes & Moos" <kle...@xs4all.nl> writes:

>Ah... That's news. I was under the distinct impression that stack-
>overflows are impossible to catch. Please expound on your method to
>reliably (and portably) catch stack-overflows in recursive algorithms.

>I'm quite curious.

Using an alternate signal stack is able to catch this, but
don't be an a* about asking for "reliably and portability".

Casper
--

Casper H.S. Dik

unread,
May 8, 2011, 7:01:35 AM5/8/11
to
Keith Thompson <ks...@mib.org> writes:

>Nobody <nob...@nowhere.com> writes:
>> On Sat, 07 May 2011 10:24:42 -0700, Kleuskes & Moos wrote:
>>>> And instead you run out of heap.
>>>
>>> Aye. But there's portable ways of checking that: malloc will just
>>> return NULL, which can be handled nicely and safely, instead of the
>>> system throwing a SIGSEGV.
>>
>> You forgot a case: malloc() succeeds but the process receives SIGKILL when
>> it actually tries to use the allocated memory (OOM-killer).

>Actaully the OOM-killer kills *some* process, not necessarily the one
>that just tried to use allocated memory.

Usually the large process such as the most important process :-)
(I'd start with webbrowser as people will expect it to crash anyway)

Casper
--

Casper H.S. Dik

unread,
May 8, 2011, 7:08:13 AM5/8/11
to
Keith Thompson <ks...@mib.org> writes:

>I understand that overly optimistic allocation can make sense
>for stack space; a program will likely never use as much stack
>as it could. But I'm not convinced that the same thing applies
>to malloc(). If a program calls malloc(), it's telling the OS that
>it *will* use that memory (and probably soon). If the system can't
>fulfill the request, it should refuse it in the first place.

There are many usage of anonymous memory in a system: stack,
heap but also the copy-on-write pages after a fork().

The system may not allocate the different types of memory
in a different way, but some make a difference between the
stack (no reservation is made) or fork/malloc (a reservation
is made). Specifically in the case of fork, there is an argument
to be made that that memory is unlikely ever used.

>Do many actual programs malloc() much more memory than they're
>actually going to use?

Not a lot; but a lot of fork()ing happens without ever touching
the copy-on-write pages. That is likely the most important reason
for overcommit.

Casper
--

Kenny McCormack

unread,
May 8, 2011, 9:10:41 AM5/8/11
to
In article <lnei4ap...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>Nobody <nob...@nowhere.com> writes:
>> On Sat, 07 May 2011 10:24:42 -0700, Kleuskes & Moos wrote:
>>>> And instead you run out of heap.
>>>
>>> Aye. But there's portable ways of checking that: malloc will just
>>> return NULL, which can be handled nicely and safely, instead of the
>>> system throwing a SIGSEGV.
>>
>> You forgot a case: malloc() succeeds but the process receives SIGKILL when
>> it actually tries to use the allocated memory (OOM-killer).
>
>Actaully the OOM-killer kills *some* process, not necessarily the one
>that just tried to use allocated memory.

Show me where this thing of which you speak, this "OOM-killer", is mentioned
in (any of) the C standards documents. Thank you.

P.S. Having just Googled it, I see that this "OOM-killer" is a
Linux-specific thing, so discussion of it is OT in both newsgroups to which
this is being posted.

--
"We should always be disposed to believe that which appears to us to be
white is really black, if the hierarchy of the church so decides."

- Saint Ignatius Loyola (1491-1556) Founder of the Jesuit Order -

James Kuyper

unread,
May 8, 2011, 9:27:02 AM5/8/11
to
On 05/08/2011 04:26 AM, Kleuskes & Moos wrote:
> On May 8, 4:52�am, James Kuyper <jameskuy...@verizon.net> wrote:
>> On 05/07/2011 09:35 AM, Kleuskes & Moos wrote:
>>
>>> On May 7, 3:01 pm, James Kuyper <jameskuy...@verizon.net> wrote:
>>>> On 05/07/2011 03:28 AM, Kleuskes & Moos wrote:
...
>> Nor any portable way to cap non-recursive function calls that overflow
>> the stack, either.
>
> Of course there is. Do not make the call-tree big enough to do that,
> and if there's no recursion involved, it's rather easy to derive a max
> stack size and put it in your hardware requirements.

Imposing a hardware requirement is what makes that solution
non-portable. It's not a bad solution; but it cannot be ported to a
machine that doesn't meet those requirements. Some mechanism for
determining, within your program, whether a given function call would
require more stack than is currently available would make it possible to
write code which at least fails gracefully on such hardware, instead of
simply aborting (or worse).

> So, basically, you're saying, since your cannot avoid it absolutely
> anyway, all advice is useless, and it's OK to use tail-recursion to
> skip through a 2 Mb file. Ok.

No, I'm saying that any precautions you need to take to make recursion
safe necessarily limit the portability of your code; the same is true of
making non-recursive function calls safe; it's just a little easier to
take the appropriate precautions with non-recursive code. As long as you
take those precautions, and consider the associated limitations on the
portability of your code acceptable, there's no other reason to avoid
recursion. And, where it's appropriate, there can be strong reasons to
favor a recursive approach.

...


> You're like a mechanic telling me how brakes are useless since they
> cannot guarantee that you won't run into anything, anyway. Even when
> applying the brakes, you can still run over a pedestrian or a 20-ton
> truck, so basically brakes are useless.

No, you should not confuse non-portable with useless.

...


>> 1. The lack of any portable way for a C code to avoid making a function
>> call that requires more stack space than is currently available; there's
>> not even a portable way to recover from the fact that such a call was
>> made, which would be a markedly inferior solution to this problem.
>> Handling a SIGSEGV is indeed an example of such an inferior solution,
>> but it's not mandated by the C standard.
>
> Nope, it's mandated by the POSIX standard instead. But i'm getting
> curious, pray describe the superior solution you have in mind.

I haven't come up with an alternative that is easy to use.
The simplest approach I can think of, from the point of view of the
implementor, would be two constructs. I would call them standard library
functions, and using function-call syntax might be the appropriate way
to specify them in the standard, but they must be implemented in such a
way that they can always be called safely, regardless of how little
stack there is left. One function determines how much memory is
currently available for objects with automatic storage duration (if the
exact amount is hard to determine, it may instead return a guaranteed
minimum amount).
The second function takes a function pointer as an argument, and reports
the amount of memory required for objects with automatic storage
duration needed to call that function. It would not include the memory
needed by function calls within the pointed-at function; avoiding stack
overflow due to those calls would be the responsibility of the function
that directly performs them.

There's a number of reasons why this simple approach would be inadequate
for C: VLAs and objects with automatic storage duration defined inside
conditionally executed blocks are the two biggest problems I can see;
I'm sure there are others. However, a C-like language without those
features could implement such constructs. It might be possible to
implement some variant of this idea in C itself.

The key advantage of this idea, over SIGSEGV, is that it allows a
program to avoid the problem, rather than merely react to the problem
after it has already happened.

Understand: I am not advocating the creation of such a mechanism; my
main point is that the problem that such a mechanism would solve, exists
whether or not the program in question uses recursion. You shouldn't
avoid recursion just because that problem is a little harder to deal
with in recursive code than in non-recursive code.

...


>> 2. You made a comment about me relying on SIGSEGV; I don't, and I
>> explained why I can't; but this is only a side issue, not directly
>> relevant to the first point.
>
> Ok. So you have different ways of dealing with a stack-overflow. Again
> i'm curious how you do that. Portably, since you've stressed the
> importance of that quality in your software.

I can't - because neither C, nor any other language that I'm familiar
with, has portable methods to avoid stack overflow. I use non-portable
methods that are probably not too different from yours. They basically
amount to making sure that it works on every machine I have available to
test it on, and hoping that it works elsewhere.

While my software is required to be available to the public, the only
machines it absolutely must run on are machines operated by my own
company, and I have development machines I can test my code on that are
(supposed to be) configured identically to the production machines.
But if a user is having trouble porting my code to their machine, and is
willing to give me access to that machine for testing, I would certainly
be glad to fix the problem. I have anecdotal evidence suggesting that
most users don't bother; many of them simply make the fix themselves,
without even bothering to complain about it. I found a bug last year
when we first ported our code to a 64-bit CentOS machine, and when I
announced my bug fix, three users wrote to tell me they'd already
noticed the problem and fixed it in their own copies of the code,
without bothering to tell me about the bug.

...


> So your point applies, and i can't guarantee that a single function-
> call won't crash the system. Mission impossible? Mission
> irresponsable?
>
> What alternatives do you have?

None - which is the substance of my complaint.

>> It doesn't even matter whether or not
>> it uses a stack to provide that memory, or some other mechanism, such as
>> activation records.
>
> I'm getting curious again, what sort of killer-functioncall did you
> have in mind? Something which allocates a 2Gb array on stack and
> passes it around by value? If that's the case, your design stinks.

No, just a function that requires more total stack space than is
available; it doesn't all have to be allocated in a single object.
Virtual memory makes such a limit harder to exceed, but not all systems
have virtual memory. On some small machines, it could take a whole lot
less than 2GB to exceed that limit.
--
James Kuyper

Kleuskes & Moos

unread,
May 8, 2011, 10:23:46 AM5/8/11
to
On May 8, 12:44 pm, Willem <wil...@toad.stack.nl> wrote:
> Malcolm McLean wrote:
>
> ) The problem is that if a data structure is inherently recursive then
> ) it can be very difficult to write non-recursive code to traverse it.
> ) The usual solution is to use a stack. However this doesn't really
> ) solve the problem, it just means that the stack overflow occurs, if it
> ) occurs, in a different place.
>
> In a place where you can catch the error and terminate cleanly.
>
> One of the major problems with C is that you cannot reliably handle
> stack overflows, unless you roll your own stack.

Exactly. And since implementing a stack mechanism with all the
goodities and niceties a demanding programmer wants, is in the range
of Computer Science 101, it's rather superfluous as a language
element.

The language C is so powerful and lean since it does put
responsability on the programmer. So take it.

It is loading more messages.
0 new messages