Allocation failure of automatic objects


FrankHB1989

Dec 1, 2017, 10:41:20 AM
to ISO C++ Standard - Discussion
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

And what about requirements/limitations on nested function calls, with or without such guarantee?


Nicol Bolas

Dec 1, 2017, 11:31:14 AM
to ISO C++ Standard - Discussion
On Friday, December 1, 2017 at 10:41:20 AM UTC-5, FrankHB1989 wrote:
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

Consider proper tail recursion. This is an optimization that a compiler can make to recursive functions that essentially removes local variables from the stack. But whether it can do so depends on a myriad of factors: the target ABI, the particular function arguments, possibly even aspects of the CPU itself.

If tail recursion is so implementation and build-target dependent, how could the standard spell out requirements for when it can happen? It can't. Instead, by not detailing specifically how "consistent and predictable" automatic object allocation is, it permits tail recursion as an optimization.
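
To make the optimization concrete, here is a minimal sketch (function names are illustrative, not from the discussion): a tail-recursive sum, and the loop a tail-call-optimizing compiler may effectively reduce it to. C++ implementations are permitted, but never required, to perform this rewrite.

```cpp
#include <cstdint>

// Tail-recursive form: the recursive call is the last action taken,
// so an implementation *may* reuse the current frame (never guaranteed
// by the C++ standard).
std::uint64_t sum_rec(std::uint64_t n, std::uint64_t acc = 0) {
    if (n == 0) return acc;
    return sum_rec(n - 1, acc + n);  // tail call
}

// The loop a tail-call-optimizing compiler can effectively emit:
// no per-call frame, so stack usage is constant regardless of n.
std::uint64_t sum_loop(std::uint64_t n) {
    std::uint64_t acc = 0;
    for (; n != 0; --n) acc += n;
    return acc;
}
```

Whether `sum_rec` actually runs in constant stack depends on the compiler, optimization level, and ABI; `sum_loop` always does.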

Similarly, if all of a function's automatic variables fit into registers, and that function doesn't call any other functions... why take up precious stack space for them? That optimization would again require automatic object allocation to not be "consistent and predictable", since such allocation patterns would change based on adding a (non-inlined) function call.
 
And what about requirements/limitations on nested function calls, with or without such guarantee?

There are no requirements or limitations. Or rather, there are limitations, but the standard doesn't require that they be spelled out. There's a recognition that implementations have finite resources and that these resources can be exhausted, leading to program termination.

Nothing more specific than that is said with regard to function recursion or automatic object "allocation" failure.
 

FrankHB1989

Dec 1, 2017, 12:32:08 PM
to ISO C++ Standard - Discussion


On Saturday, December 2, 2017 at 12:31:14 AM UTC+8, Nicol Bolas wrote:
On Friday, December 1, 2017 at 10:41:20 AM UTC-5, FrankHB1989 wrote:
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

Consider proper tail recursion. This is an optimization that a compiler can make to recursive functions that essentially removes local variables from the stack. But whether it can do so depends on a myriad of factors: the target ABI, the particular function arguments, possibly even aspects of the CPU itself.

Well, that's exactly what I expect. However, IIRC it is only mandated in languages like Scheme. Without such a guarantee, I can't even write, in portable C++, an interpreter targeting an object language that guarantees proper tail recursion, without some sophisticated magic.

If tail recursion is so implementation and build-target dependent, how could the standard spell out requirements for when it can happen? It can't. Instead, by not detailing specifically how "consistent and predictable" automatic object allocation is, it permits tail recursion as an optimization.

So, is it essentially intentionally unspecified? In reality, a stack overflow is likely to leave the state of the program unrecoverable, which is much like some cases of undefined behavior... That also sounds strange compared to the definition of "unspecified behavior", which concerns "well-formed program construct and correct data". I can hardly imagine a case where a stack overflow is not a bug.

BTW, as per ISO C 4p2, it seems to be undefined in C.

Similarly, if all of a function's automatic variables fit into registers, and that function doesn't call any other functions... why take up precious stack space for them? That optimization would again require automatic object allocation to not be "consistent and predictable", since such allocation patterns would change based on adding a (non-inlined) function call.
 
And what about requirements/limitations on nested function calls, with or without such guarantee?

There are no requirements or limitations. Or rather, there are limitations, but the standard doesn't require that they be spelled out. There's a recognition that implementations have finite resources and that these resources can be exhausted, leading to program termination.

Interestingly, [implimits] suggests there are limits that do not determine conformance. Would such a limit (if any) be a candidate for that list?

Nicol Bolas

Dec 1, 2017, 1:33:44 PM
to ISO C++ Standard - Discussion
On Friday, December 1, 2017 at 12:32:08 PM UTC-5, FrankHB1989 wrote:
On Saturday, December 2, 2017 at 12:31:14 AM UTC+8, Nicol Bolas wrote:
On Friday, December 1, 2017 at 10:41:20 AM UTC-5, FrankHB1989 wrote:
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

Consider proper tail recursion. This is an optimization that a compiler can make to recursive functions that essentially removes local variables from the stack. But whether it can do so depends on a myriad of factors: the target ABI, the particular function arguments, possibly even aspects of the CPU itself.

Well, that's exactly what I expect. However, IIRC it is only mandated in languages like Scheme. Without such a guarantee, I can't even write, in portable C++, an interpreter targeting an object language that guarantees proper tail recursion, without some sophisticated magic.

Nonsense. If you're writing an interpreter, then your interpretation loop is simply to interpret each instruction as it appears. A function call instruction does not mean you do a function call in the interpreter. It simply changes where you get your next instruction from. The return address is stored on the interpretation stack, along with whatever values are needed for the function call.

Go look at the Lua interpreter, which does have guaranteed proper tail calls (not just recursion; any call of the form `return some_function(...)` does not increase the stack). It doesn't need to do "sophisticated magic" to do that.
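
To illustrate what such an interpreter does, here is a hedged sketch (the types are invented for illustration, not taken from Lua's sources): calls push frames onto an interpreter-managed stack, and a tail call overwrites the current frame instead of pushing, so interpreted tail calls grow neither the interpreter's stack nor the host's.

```cpp
#include <vector>

// A toy activation record: which function runs and where it resumes.
struct Frame { int func; int pc; };

// An ordinary call grows the interpreter's own stack; a tail call
// reuses the top frame, giving proper tail calls regardless of what
// the host C++ implementation does with its native stack.
void enter(std::vector<Frame>& frames, int func, bool is_tail) {
    if (is_tail && !frames.empty())
        frames.back() = Frame{func, 0};   // replace: stack depth unchanged
    else
        frames.push_back(Frame{func, 0}); // push: stack depth grows by one
}
```

The interpretation loop itself never recurses in C++, so the host's lack of a tail-call guarantee is irrelevant.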

If tail recursion is so implementation and build-target dependent, how could the standard spell out requirements for when it can happen? It can't. Instead, by not detailing specifically how "consistent and predictable" automatic object allocation is, it permits tail recursion as an optimization.

So, is it essentially intentionally unspecified? In reality, a stack overflow is likely to leave the state of the program unrecoverable, which is much like some cases of undefined behavior... That also sounds strange compared to the definition of "unspecified behavior", which concerns "well-formed program construct and correct data". I can hardly imagine a case where a stack overflow is not a bug.

BTW, as per ISO C 4p2, it seems to be undefined in C.

Similarly, if all of a function's automatic variables fit into registers, and that function doesn't call any other functions... why take up precious stack space for them? That optimization would again require automatic object allocation to not be "consistent and predictable", since such allocation patterns would change based on adding a (non-inlined) function call.
 
And what about requirements/limitations on nested function calls, with or without such guarantee?

There are no requirements or limitations. Or rather, there are limitations, but the standard doesn't require that they be spelled out. There's a recognition that implementations have finite resources and that these resources can be exhausted, leading to program termination.

Interestingly, [implimits] suggests there are limits that do not determine conformance. Would such a limit (if any) be a candidate for that list?

No. No implementation could reasonably quantify these limitations, so they would all say "fixed limits exist, but are unknown". Which tells you nothing you didn't already know.

Michael Kilburn

Dec 1, 2017, 2:04:16 PM
to std-dis...@isocpp.org
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

Afaik, yes -- it assumes local storage is large enough to run your program (i.e. automatic object allocation failure never happens). The idea is that (knowing the details of your implementation) you've analyzed your program, figured out its maximum local storage usage, and ensured that much is available before running it. But an implementation is allowed to use "run-time discovery" instead -- i.e. it runs your program (in the hope that whatever resources are available are enough) until it hits a limit, at which point what happens is up to the implementation -- normally it terminates your program, simply because it has discovered that the environment was not capable of providing the guarantees the C++ runtime requires (and therefore it can't expect the program to behave). An implementation can also suspend your program (possibly indefinitely), aiming to resume execution when resources become available.

In any case, from the C++ standpoint, local storage allocation never fails. If the underlying environment can't provide this guarantee, the program cannot execute; no diagnostic (from C++) is required.


On Fri, Dec 1, 2017 at 9:41 AM, FrankHB1989 <frank...@gmail.com> wrote:
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

And what about requirements/limitations on nested function calls, with or without such guarantee?


--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.



--
Sincerely yours,
Michael.

Ville Voutilainen

Dec 1, 2017, 2:14:14 PM
to std-dis...@isocpp.org
On 1 December 2017 at 21:04, Michael Kilburn <crusad...@gmail.com> wrote:
>> Does the standard have normative rules to guarantee allocation failure of
>> automatic objects always consistent and predictable?
>
> Afaik, yes

No, not at all. See
http://eel.is/c++draft/intro.compliance#2.1

If the mentioned resource limits are exceeded, you get undefined
behavior. There's no hint of "always consistent and predictable"
anywhere near any of it.

Michael Kilburn

Dec 1, 2017, 2:22:42 PM
to std-dis...@isocpp.org
That was a somewhat "tongue-in-cheek" answer. It can never happen (from the C++ program's POV) -- therefore it is "always consistent and predictable". :-)

FrankHB1989

Dec 1, 2017, 9:21:46 PM
to ISO C++ Standard - Discussion


On Saturday, December 2, 2017 at 2:33:44 AM UTC+8, Nicol Bolas wrote:
On Friday, December 1, 2017 at 12:32:08 PM UTC-5, FrankHB1989 wrote:
On Saturday, December 2, 2017 at 12:31:14 AM UTC+8, Nicol Bolas wrote:
On Friday, December 1, 2017 at 10:41:20 AM UTC-5, FrankHB1989 wrote:
Does the standard have normative rules to guarantee allocation failure of automatic objects always consistent and predictable?

Consider proper tail recursion. This is an optimization that a compiler can make to recursive functions that essentially removes local variables from the stack. But whether it can do so depends on a myriad of factors: the target ABI, the particular function arguments, possibly even aspects of the CPU itself.

Well, that's exactly what I expect. However, IIRC it is only mandated in languages like Scheme. Without such a guarantee, I can't even write, in portable C++, an interpreter targeting an object language that guarantees proper tail recursion, without some sophisticated magic.

Nonsense. If you're writing an interpreter, then your interpretation loop is simply to interpret each instruction as it appears. A function call instruction does not mean you do a function call in the interpreter. It simply changes where you get your next instruction from. The return address is stored on the interpretation stack, along with whatever values are needed for the function call.

True. However, simulating the low-level architecture of a concrete machine (virtual or not) does seem like magic compared to a simple REPL. With C++, I have to reify the activation records explicitly by hand. I know the transition has to occur somewhere in the language implementation stack (typically consisting of one or more kinds of ISA-level machine code, portable IRs, and high-level languages), but here it occurs at too high a level, which is not what I want, because it exposes too many details that should have been buried in the underlying implementations. Does C++ have to be bare-metal in this case?

Go look at the Lua interpreter, which does have guaranteed proper tail calls (not just recursion; any call of the form `return some_function(...)` does not increase the stack). It doesn't need to do "sophisticated magic" to do that.

Well, "recursion" and "calls" are synonymous here: the recursion means unbounded calls. (Though I actually use the latter term in my documents, to avoid misinterpretation by readers -- even before I heard that similar wording is also used in several other non-Scheme-based languages.)

Lua 5+ does a good job here if all I want is an interpreter (leaving aside the language design), by providing a low-level (but still portable) abstraction in the form of a VM. But it is a pain to do more with it (like partial specialization of the interpreter). PyPy sounds better here, but its object languages support proper tail calls no better than CPython does.

BTW, http://lua-users.org/wiki/ProperTailRecursion: "Complain if your favourite programming language does not." :)

And more importantly, as a step of evolution, their designs are not qualified to address my needs in the object language:
  1. No mandatory global separation of phases (to be friendly to program specification, supercompilation, etc.)
  2. No mandatory GC (to be friendly to deterministic & bounded resource management)
  3. First-class functions (to be friendly to ... modern programmers)
  4. First-class delimited continuations (to be friendly to both programmers and underlying implementations)
  5. Good interoperation with C++, extensible to others with reasonably little effort
  6. Allowing JIT/AOT compilation to multiple targets (in the future)
Or in short (but less precisely), I want a general-purpose language implementation like klisp + Racket + Chez Scheme/nanopass, with opt-out GC, opt-in interop, and infrastructure as portable as possible.

The least magical approach I know of is trampolined-style code (as klisp uses).
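
For reference, a minimal sketch of that trampolined style in C++ (names are illustrative; klisp's actual machinery is more elaborate): each step returns either a result or a thunk for the next step, and a driver loop bounces between them, so the native stack depth stays bounded no matter how deep the logical recursion goes.

```cpp
#include <functional>
#include <optional>

// One bounce of the trampoline: either a final result or the next thunk.
struct Step {
    std::optional<long long> result;  // set when the computation is done
    std::function<Step()> next;       // set when more work remains
};

// A logically recursive sum written as a step function: it never calls
// itself directly, it returns its continuation instead.
Step sum_step(long long n, long long acc) {
    if (n == 0) return Step{acc, {}};
    return Step{std::nullopt, [n, acc] { return sum_step(n - 1, acc + n); }};
}

// The driver: constant native stack depth regardless of n.
long long trampoline(Step s) {
    while (!s.result) s = s.next();   // bounce until a result appears
    return *s.result;
}
```

The cost is an allocation per bounce for the closure, which is the "least magical" trade-off mentioned above: portability in exchange for overhead.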

FrankHB1989

Dec 1, 2017, 9:25:44 PM
to ISO C++ Standard - Discussion


On Saturday, December 2, 2017 at 3:14:14 AM UTC+8, Ville Voutilainen wrote:
Ah, great. That is what I was missing. Thank you.


FrankHB1989

Dec 1, 2017, 9:29:44 PM
to ISO C++ Standard - Discussion


On Saturday, December 2, 2017 at 3:22:42 AM UTC+8, Michael Kilburn wrote:
It does not occur in the pure abstract machine semantics. The problem is that, in reality (even with a conforming implementation), it does occur. I have no way even to detect it reliably with the aid of the language standard (unless I implement the language myself).


Michael Kilburn

Dec 1, 2017, 9:50:20 PM
to std-dis...@isocpp.org
It does not occur in the pure abstract machine semantics. The problem is that, in reality (even with a conforming implementation), it does occur.

I explained why it "does occur" -- because an implementation is allowed "to discover" its limitations (which would otherwise prevent it from running your program) at run time.


I have no way even to detect it reliably with the aid of the language standard (unless I implement the language myself).

This is because, at the language level, this situation (failure to allocate from local storage) simply doesn't happen. To deal with it you have to use implementation-specific tools (if the implementation provides them) that are outside the language.

Let me provide an analogy:
- let's say you run a program on some hardware, and your program is written on the assumption that if a value is written at a certain address in memory, all subsequent reads from that location will return the same value
- unfortunately, such hardware can't be built, thanks to the entropy of the universe and cosmic rays -- with time, sooner or later, this assumption will break
- instead of giving up, we relax our requirements for the underlying layer (i.e. the hardware, in our case) -- it is no longer required to give us this guarantee up front, but we still need it to provide correct program execution
- ... so we design the hardware in such a way that it checks on every access whether the guarantee still holds (e.g. via a checksum or a check against a copy) -- and if it discovers that something went wrong, from your program's perspective time stops
- the hardware may or may not recover from the issue; if it does, your program won't notice; and if it doesn't, the program won't notice either, because its execution will cease to happen

same for local storage allocations in C++...





--
Sincerely yours,
Michael.

FrankHB1989

Dec 1, 2017, 11:37:06 PM
to ISO C++ Standard - Discussion


On Saturday, December 2, 2017 at 10:50:20 AM UTC+8, Michael Kilburn wrote:
It does not occur in the pure abstract machine semantics. The problem is that, in reality (even with a conforming implementation), it does occur.

I explained why it "does occur" -- because an implementation is allowed "to discover" its limitations (which would otherwise prevent it from running your program) at run time.


I have no way even to detect it reliably with the aid of the language standard (unless I implement the language myself).

This is because, at the language level, this situation (failure to allocate from local storage) simply doesn't happen. To deal with it you have to use implementation-specific tools (if the implementation provides them) that are outside the language.

Let me provide an analogy:
- let's say you run a program on some hardware, and your program is written on the assumption that if a value is written at a certain address in memory, all subsequent reads from that location will return the same value
- unfortunately, such hardware can't be built, thanks to the entropy of the universe and cosmic rays -- with time, sooner or later, this assumption will break
- instead of giving up, we relax our requirements for the underlying layer (i.e. the hardware, in our case) -- it is no longer required to give us this guarantee up front, but we still need it to provide correct program execution
- ... so we design the hardware in such a way that it checks on every access whether the guarantee still holds (e.g. via a checksum or a check against a copy) -- and if it discovers that something went wrong, from your program's perspective time stops
- the hardware may or may not recover from the issue; if it does, your program won't notice; and if it doesn't, the program won't notice either, because its execution will cease to happen

same for local storage allocations in C++...

I think I understand the situation... It's just too bad for the cases that are quite easily observable by end users: I have to fill the gap between abstraction and expectation myself, relying on platform-dependent means or even case-by-case analysis...


On Fri, Dec 1, 2017 at 8:29 PM, FrankHB1989 <frank...@gmail.com> wrote:


On Saturday, December 2, 2017 at 3:22:42 AM UTC+8, Michael Kilburn wrote:
On Fri, Dec 1, 2017 at 1:14 PM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
On 1 December 2017 at 21:04, Michael Kilburn <crusad...@gmail.com> wrote:
>> Does the standard have normative rules to guarantee allocation failure of
>> automatic objects always consistent and predictable?
>
> Afaik, yes

No, not at all. See
http://eel.is/c++draft/intro.compliance#2.1

If the mentioned resource limits are exceeded, you get undefined
behavior. There's no hint of "always consistent and predictable"
anywhere near any of it.

That was a somewhat "tongue-in-cheek" answer. It can never happen (from the C++ program's POV) -- therefore it is "always consistent and predictable". :-)

It does not occur in the pure abstract machine semantics. The problem is that, in reality (even with a conforming implementation), it does occur. I have no way even to detect it reliably with the aid of the language standard (unless I implement the language myself).





--
Sincerely yours,
Michael.

Michael Kilburn

Dec 2, 2017, 3:25:29 AM
to std-dis...@isocpp.org
On Fri, Dec 1, 2017 at 10:37 PM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 10:50:20 AM UTC+8, Michael Kilburn wrote:
It does not occur in the pure abstract machine semantics. The problem is that, in reality (even with a conforming implementation), it does occur.

I explained why it "does occur" -- because an implementation is allowed "to discover" its limitations (which would otherwise prevent it from running your program) at run time.


I have no way even to detect it reliably with the aid of the language standard (unless I implement the language myself).

This is because, at the language level, this situation (failure to allocate from local storage) simply doesn't happen. To deal with it you have to use implementation-specific tools (if the implementation provides them) that are outside the language.

Let me provide an analogy:
- let's say you run a program on some hardware, and your program is written on the assumption that if a value is written at a certain address in memory, all subsequent reads from that location will return the same value
- unfortunately, such hardware can't be built, thanks to the entropy of the universe and cosmic rays -- with time, sooner or later, this assumption will break
- instead of giving up, we relax our requirements for the underlying layer (i.e. the hardware, in our case) -- it is no longer required to give us this guarantee up front, but we still need it to provide correct program execution
- ... so we design the hardware in such a way that it checks on every access whether the guarantee still holds (e.g. via a checksum or a check against a copy) -- and if it discovers that something went wrong, from your program's perspective time stops
- the hardware may or may not recover from the issue; if it does, your program won't notice; and if it doesn't, the program won't notice either, because its execution will cease to happen

same for local storage allocations in C++...

I think I understand the situation...

Another reason for this is that your call tree depth often depends on external data provided at run time, i.e. it is simply impossible to pre-calculate your local storage requirements. Even when it is possible -- some implementations (MSVC) use the stack for exceptions, which makes it much harder to figure out the maximum possible stack usage (because while an exception is in flight, the stack "below" the exception is not used for function calls made during unwinding).

 
It's just too bad for the cases that are quite easily observable by end users: I have to fill the gap between abstraction and expectation myself, relying on platform-dependent means or even case-by-case analysis...

The language certainly could recognize this situation and provide some model for a handling mechanism. But which one? It certainly can't be via exceptions or return values. That leaves signals (inherited from C), or adding something new.

I'm not sure how the C++ standard treats signals, or whether they are supported on every platform. I know they behave differently even from one Linux version to another.
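
For what it's worth, POSIX (not standard C++) does provide just enough machinery to report an overflow: an alternate signal stack, so that a handler for an overflow-induced SIGSEGV has room to run. A hedged sketch, assuming a POSIX system where stack overflow is delivered as SIGSEGV (none of this is portable C++; the function name is invented):

```cpp
#include <signal.h>

// A dedicated stack for the handler: the thread's own stack is the
// thing that just overflowed, so the handler cannot run on it.
static char alt_stack[64 * 1024];

// Install `handler` for SIGSEGV, arranged to run on the alternate
// stack. Returns false if either POSIX call fails.
bool install_overflow_reporter(void (*handler)(int)) {
    stack_t ss{};
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, nullptr) != 0) return false;

    struct sigaction sa{};
    sa.sa_handler = handler;
    sa.sa_flags = SA_ONSTACK;   // run on alt_stack, not the dead stack
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Even then, about all the handler can safely do is report and exit; resuming the overflowed thread is exactly the part no language guarantee covers.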

Even if you come up with something -- suppose you ran out of local storage and that mechanism called another part of your code (presumably using another pre-allocated stack)... What is it going to do? There is no way to "add more stack to that thread" on a typical implementation. Nor is there a good way to interrupt execution of that thread -- noexcept functions are way too important/convenient to make the same mistake early versions of Java did.

If all you want to do is print something to the console (using another thread/stack) and exit (or freeze that particular thread) -- then yes, it is probably possible to change the language to provide for this. But then you'll face questions about the practical usefulness of that.

FrankHB1989

Dec 2, 2017, 7:45:26 AM
to ISO C++ Standard - Discussion


On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:
On Fri, Dec 1, 2017 at 10:37 PM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 10:50:20 AM UTC+8, Michael Kilburn wrote:
It does not occur in the pure abstract machine semantics. The problem is that, in reality (even with a conforming implementation), it does occur.

I explained why it "does occur" -- because an implementation is allowed "to discover" its limitations (which would otherwise prevent it from running your program) at run time.


I have no way even to detect it reliably with the aid of the language standard (unless I implement the language myself).

This is because, at the language level, this situation (failure to allocate from local storage) simply doesn't happen. To deal with it you have to use implementation-specific tools (if the implementation provides them) that are outside the language.

Let me provide an analogy:
- let's say you run a program on some hardware, and your program is written on the assumption that if a value is written at a certain address in memory, all subsequent reads from that location will return the same value
- unfortunately, such hardware can't be built, thanks to the entropy of the universe and cosmic rays -- with time, sooner or later, this assumption will break
- instead of giving up, we relax our requirements for the underlying layer (i.e. the hardware, in our case) -- it is no longer required to give us this guarantee up front, but we still need it to provide correct program execution
- ... so we design the hardware in such a way that it checks on every access whether the guarantee still holds (e.g. via a checksum or a check against a copy) -- and if it discovers that something went wrong, from your program's perspective time stops
- the hardware may or may not recover from the issue; if it does, your program won't notice; and if it doesn't, the program won't notice either, because its execution will cease to happen

same for local storage allocations in C++...

I think I understand the situation...

Another reason for this is that your call tree depth often depends on external data provided at run time, i.e. it is simply impossible to pre-calculate your local storage requirements. Even when it is possible -- some implementations (MSVC) use the stack for exceptions, which makes it much harder to figure out the maximum possible stack usage (because while an exception is in flight, the stack "below" the exception is not used for function calls made during unwinding).

 
It's just too bad for the cases that are quite easily observable by end users: I have to fill the gap between abstraction and expectation myself, relying on platform-dependent means or even case-by-case analysis...

The language certainly could recognize this situation and provide some model for a handling mechanism. But which one? It certainly can't be via exceptions or return values. That leaves signals (inherited from C), or adding something new.
Is throwing a std::bad_alloc object allowed by the current C++ standard?

I'm not sure how the C++ standard treats signals, or whether they are supported on every platform. I know they behave differently even from one Linux version to another.

It seems that they are quite limited. C++ does not extend much compared to C, and C mandates only a few cases.

Even if you come up with something -- suppose you ran out of local storage and that mechanism called another part of your code (presumably using another pre-allocated stack)... What is it going to do? There is no way to "add more stack to that thread" on a typical implementation. Nor is there a good way to interrupt execution of that thread -- noexcept functions are way too important/convenient to make the same mistake early versions of Java did.

At least the program would have an opportunity to report the error... as in the usual std::bad_alloc cases.

Allowing reallocation of the whole stack may also be feasible sometimes, though I'd admit it may be error-prone in general.

If all you want to do is print something to the console (using another thread/stack) and exit (or freeze that particular thread) -- then yes, it is probably possible to change the language to provide for this. But then you'll face questions about the practical usefulness of that.

Well, despite the fact that what I actually need now is radically different, I believe this is more or less useful. It is like the cases where std::terminate is called -- why not just leave them UB and allow a crash?

Thiago Macieira

Dec 2, 2017, 2:03:04 PM
to std-dis...@isocpp.org
On Saturday, 2 December 2017 04:45:26 PST FrankHB1989 wrote:
> > The language certainly could recognize this situation and provide some model
> > for a handling mechanism. But which one? It certainly can't be via exceptions
> > or return values. That leaves signals (inherited from C), or adding
> > something new.
>
> Is throwing std::bad_alloc object allowed by current C++ standard?

From where?

Paraphrasing Monty Python, when a thread reaches stack overflow, it is dead,
it's gone from this to a better one, it has gone to meet its maker. It's an
ex-thread.

No generic recovery is possible, no generic reporting mechanism other than an
out-of-stack (including other threads) handler. Not unless you add overhead to
every single stack allocation.

> > and that mechanism called another part of your code (presumably using
> > another pre-allocated stack)... What is it going to do? There is no way to
> > "add more stack to that thread" on a typical implementation. Nor is there a
> > good way to interrupt execution of that thread -- noexcept functions
> > are way too important/convenient to make the same mistake early versions of
> > Java did.
> >
> At least the program would have an opportunity to report the error... as in
> the usual std::bad_alloc cases.

It's too late if the stack has already been exhausted. That's the same as a heap
allocation succeeding, but the faulting-in of the page failing.

You need to detect that the exhaustion would happen before it actually
happens. That's what std::bad_alloc does.
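
That distinction can be shown directly (a minimal sketch; the function name and the huge size are illustrative only): heap exhaustion is reported by operator new before any doomed access happens, which is exactly what a stack overflow lacks.

```cpp
#include <cstddef>
#include <new>

// Heap exhaustion is detectable up front: operator new reports failure
// by throwing std::bad_alloc (or a derived type) before the program
// ever touches the memory it didn't get.
bool heap_alloc_reports_failure(std::size_t bytes) {
    try {
        char* p = new char[bytes];
        delete[] p;
        return true;    // allocation succeeded
    } catch (const std::bad_alloc&) {
        return false;   // failure reported; the program is still intact
    }
}
```

There is no analogous portable hook for automatic storage: by the time the stack is gone, so is the frame any handler would need.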

> Allowing reallocation of the whole stack may also be feasible sometimes,
> though I'd admit it may be error-prone in general.

The stack cannot be reallocated, except in very restricted, specially constructed
cases.

It can be grown, but that's what it's already doing on any modern OS. If you
exhausted the stack, it's because the growth reached its limit.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Michael Kilburn

Dec 2, 2017, 4:31:45 PM
to std-dis...@isocpp.org
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:
It's just too bad for the cases that are quite easily observable by end users: I have to fill the gap between abstraction and expectation myself, relying on platform-dependent means or even case-by-case analysis...

The language certainly could recognize this situation and provide some model for a handling mechanism. But which one? It certainly can't be via exceptions or return values. That leaves signals (inherited from C), or you have to add something new.
Is throwing a std::bad_alloc object allowed by the current C++ standard?

Certainly not from "noexcept" functions. Besides, "throwing" in all current implementations means "call another function (or several)", and you can't do that -- there is no stack space left. I doubt anyone will be thrilled if you try to require "throw" not to use the current stack.

Even if you come up with something -- suppose you ran out of local storage and that mechanism called another part of your code (presumably using another pre-allocated stack)... What it is going to do? There is no way to "add more stack to that thread" on typical implementation. Nor there is a good way to interrupt execution of that thread -- noexcept functions are way too important/convenient to make same mistake early versions of Java did.

At least the program would have an opportunity to report the error... as in the usual std::bad_alloc cases.

Well, it depends on what "reporting" means. You'll have to define its meaning, figure out how to do it without using the current stack, and then argue with language lawyers about its practical necessity and its effect on other C++ implementations that may not even use a stack to model local storage.

 
Allowing reallocation of the whole stack may also be feasible sometimes, though I'd admit it may be error-prone in general.

If all you want to do is print something to the console (using another thread/stack) and exit (or freeze that particular thread) -- then yes, it is probably possible to change the language to provide for this. But then you'll have questions about the practical usefulness of that.

Well, although what I actually need now is radically different, I believe this would be more or less useful. It is like the cases where std::terminate gets called: why not just leave those UB and allow a crash?

Well, first of all, UB doesn't mean crash -- it can be anything. AFAIK, the language doesn't define "crash" at all. Second, in your case -- how would you call std::terminate()? It needs stack space too...

I don't think it is impossible to devise such a mechanism and put it into the language (but then, I also argued that throwing from destructors should be allowed ;) ). The questions are:
- the details of such a mechanism's behavior
- its practical benefits
- its implementation costs
- the details of a typical implementation (e.g. in GCC on x86)

--
Sincerely yours,
Michael.

FrankHB1989

Dec 2, 2017, 9:55:13 PM12/2/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 3:03:04 AM UTC+8, Thiago Macieira wrote:
On Saturday, 2 December 2017 04:45:26 PST FrankHB1989 wrote:
> > Language certainly can recognize this situation and provide some model for
> > handling mechanism. But which one? It certainly can't be via exceptions or
> > return values. That leaves signals (inherited from C) or you have to add
> > smth new.
>
> Is throwing std::bad_alloc object allowed by current C++ standard?

From where?
... Another emergency area?


Paraphrasing Monty Python, when a thread reaches stack overflow, it is dead,
it's gone from this to a better one, it has gone to meet its maker. It's an
ex-thread.

No generic recovery is possible, no generic reporting mechanism other than an
out-of-stack (including other threads) handler. Not unless you add overhead to
every single stack allocation.

OK. The case should then be that the thread has exited with a pending allocation-failure status (bad_alloc, or whatever else).

> > and that mechanism called another part of your code (presumably using
> > another pre-allocated stack)... What it is going to do? There is no way to
> > "add more stack to that thread" on typical implementation. Nor there is a
> > good way to interrupt execution of that thread -- *noexcept *functions
> > are way too important/convenient to make same mistake early versions of
> > Java did.
> >
> At least the program would have opportunity to report error... as usual
> std::bad_alloc cases.

It's too late if the stack has already exhausted. That's the same as heap
allocation succeeding, but the page faulting in failing.

You need to detect that the exhaustion would happen before it actually
happens. That's what std::bad_alloc does.

I see. So that brings extra overhead and complexity, sigh...

I wish page faults had never been invented in today's style, so that everyone had to face and solve the real problems, without half-baked solutions (and other messes like overcommit)... The MMU actually does a lot, but programmers can use only a little of it, sigh.


> Allowing reallocation of the whole stack may be also feasible sometimes,
> though I'd admit it may be error-prone in general. .

Stack cannot be reallocated, except in very restricted, specially constructed
cases.

It can be grown, but that's what it's already doing on any modern OS. If you
exhausted the stack, it's because the growth reached its limit.

The actual work should be done by the processor; the OS would only configure it.

Thiago Macieira

Dec 2, 2017, 10:19:43 PM12/2/17
to std-dis...@isocpp.org
On Saturday, 2 December 2017 18:55:13 PST FrankHB1989 wrote:
> > Paraphrasing Monty Python, when a thread reaches stack overflow, it is
> > dead,
> > it's gone from this to a better one, it has gone to meet its maker. It's
> > an
> > ex-thread.
> >
> > No generic recovery is possible, no generic reporting mechanism other than
> > an
> > out-of-stack (including other threads) handler. Not unless you add
> > overhead to
> > every single stack allocation.
>
> OK. The case should be that the thread had been exited with a pending
> allocation failure status (bad_alloc, or whatever else).

You're mixing concepts here. An allocation failure does not exit the thread.
It's a recoverable condition.

Failing to serve a page during a page fault is an unrecoverable condition.
Failing to serve a page that is part of the stack implies the stack state is
now corrupt, which in turn means the cornerstone of throwing known as "stack
unwinding" is not possible either. A stack overflow cannot be easily predicted
before it happens (and even then it would incur overhead), and after it has
happened, the thread can't save anything or do anything. It's over.

Besides, there is one thing that can react to a segmentation fault: an alt-stack
signal handler. If you want that in your application, you already have
the tools.

> I'd hope that page faults have never be invented in the style today so
> everyone has to meet and solve the real problems, without half-baked
> solutions (and other mess like overcommit)...MMU actually does a lot but
> only a few can be used by programmers, sigh.

The MMU is not the problem. It's your OS's virtual memory manager. It's
perfectly possible to use paging with static memory maps that are always
backed by RAM. That's how functionally-safe operating systems and hypervisors
do it. Most real-time operating systems will do that too.

You can choose to program only for them and leave the mainstream OSes behind.

FrankHB1989

Dec 2, 2017, 10:29:32 PM12/2/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 5:31:45 AM UTC+8, Michael Kilburn wrote:
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:
Too bad for the cases that are quite easily observable by end users; I have to fill the gap between abstraction and expectation myself, relying on platform-dependent techniques or even case-by-case analysis...

Language certainly can recognize this situation and provide some model for handling mechanism. But which one? It certainly can't be via exceptions or return values. That leaves signals (inherited from C) or you have to add smth new.
Is throwing a std::bad_alloc object allowed by the current C++ standard?

Certainly not from "noexcept" functions. Besides, "throwing" in all current implementations means "call another function(s)" and you can't do it -- no stack space left. Doubt anyone will be thrilled if you try to enforce "throw" to not use current stack.
Hmm, it can be elided in some cases, but that is just too difficult in general. Certainly, I admit the stack can't be reused normally, so some other buffer is needed if space is required. But... if the thread is always expected to die, why not just jump instead of call?

Well, this may have ABI issues... I'm merely talking about possibilities.

Even if you come up with something -- suppose you ran out of local storage and that mechanism called another part of your code (presumably using another pre-allocated stack)... What it is going to do? There is no way to "add more stack to that thread" on typical implementation. Nor there is a good way to interrupt execution of that thread -- noexcept functions are way too important/convenient to make same mistake early versions of Java did.

At least the program would have opportunity to report error... as usual std::bad_alloc cases.

Well, depends on what "reporting" means. You'll have to define it's meaning, figure out a way how to do it without using current stack and then argue with language lawyers about it practical necessity and effect on other C++ implementations that may not even use stack to model local storage.

This is like handling std::bad_alloc. It is ultimately up to the user, with certain limitations (stricter than in the usual cases). It could even be conditionally supported; even so, UB would not be involved, and other threads could remain alive.
 
Allowing reallocation of the whole stack may also be feasible sometimes, though I'd admit it may be error-prone in general.

If all you want to do is to print something to console (using another thread/stack) and exit (or freeze that particular thread) -- then yes, it is probably possible to change language to provide for this. But then you'll have questions about practical usefulness of that.

Well, despite that I actually need things radically different now, I believe this is more or less useful. It is like the cases of std::terminated to be called - why not just leave them UB and allow crash?

Well, first of all UB doesn't mean crash -- it can be anything. Afaik, language doesn't define "crash" at all. Second, in your case -- how would you call std::terminate()? it needs stack space too...

Yes, it is "allowed". So they are similar. The handler of "std::terminated" is called by the implementation, not by user. It has freedom to pre-allocate the needed buffer, at least in hosted implementations. (Note now exception handling is even required in freestanding ones...)

I don't think it is impossible to devise such mechanism and put it into language (but I also argued that throwing from destructors should be allowed ;) ). Questions are:
- details of such mechanism behavior
- it's practical benefits
- it's implementation costs
- details of typical implementation (e.g. in GCC on x86)

Well, that may need a detailed proposal. The points here are allowance and conformance. Since resource exhaustion currently causes UB, the former is resolved; all that remains (besides the things I believe have no sane resolution, like mandating a spaghetti stack) is the question of why it has to be UB. Does it directly contradict the current language design? AFAIK, no.

There are still some points here that are not too far off-topic (I think). They are mostly about "practical benefits".

Portability is at least one answer, and avoiding UB is one of the necessary conditions. A detailed mechanism would be more persuasive, but let that be for std-proposals. First, let's try to disprove its feasibility here, if you think that necessary.

Second, UB affects the whole program. Keeping other threads working as expected is the practical need. In reality, unless there is only one thread, it is unlikely that all threads would have to die in the same manner.

As for costs... they are at least not easily zero. It is probably not as useful for freestanding implementations as for hosted ones, so it might be excluded there. Otherwise, I hope mainstream implementations can afford it.

--
Sincerely yours,
Michael.

FrankHB1989

Dec 2, 2017, 10:44:22 PM12/2/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 11:19:43 AM UTC+8, Thiago Macieira wrote:
On Saturday, 2 December 2017 18:55:13 PST FrankHB1989 wrote:
> > Paraphrasing Monty Python, when a thread reaches stack overflow, it is
> > dead,
> > it's gone from this to a better one, it has gone to meet its maker. It's
> > an
> > ex-thread.
> >
> > No generic recovery is possible, no generic reporting mechanism other than
> > an
> > out-of-stack (including other threads) handler. Not unless you add
> > overhead to
> > every single stack allocation.
>
> OK. The case should be that the thread had been exited with a pending
> allocation failure status (bad_alloc, or whatever else).

You're mixing concepts here. An allocation failure does not exit the thread.
It's a recoverable condition.

Yes, they are separate concepts. I put them together only to weaken the requirements, to make them more implementable in the sense of standard conformance.


Failing to serve a page during a page fault is an unrecoverable condition.
Failing to serve a page that is part of the stack implies the stack state is
now corrupt, which in turn means the corner stone of throwing known as "stack
unwinding" is not possible either. A stack overflow cannot be easily predicted
before it happens (and even then it would incur overhead), and after it has
happened, the thread can't save anything or do anything. It's over.

Besides, there is one thing that can react to a segmentation fault: an alt-
stack signal handler. If you want that in your application, you already have
the tools.
That falls back to platform-dependent solutions. And I don't actually need such a powerful tool currently. (Especially when I have to work around platforms without an MMU... I don't even have segmentation faults there... though those are more or less freestanding cases.)

> I'd hope that page faults have never be invented in the style today so
> everyone has to meet and solve the real problems, without half-baked
> solutions (and other mess like overcommit)...MMU actually does a lot but
> only a few can be used by programmers, sigh.

The MMU is not the problem. It's your OS's virtual memory manager. It's
perfectly possible to use paging with static memory maps that are always
backed by RAM. That's how functionally-safe operating systems and hypervisors
do it. Most real-time operating systems will do that too.

So why don't mainstream OSes work like them? If the problem is that too much overhead would be incurred, so manufacturers abandoned the mechanism, then that is exactly the case I described.

You can choose to program only for them and leave the mainstream OSes behind.

If it can run on my machine for daily work, I'd like to...

Nicol Bolas

Dec 2, 2017, 10:47:26 PM12/2/17
to ISO C++ Standard - Discussion
On Saturday, December 2, 2017 at 10:29:32 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 5:31:45 AM UTC+8, Michael Kilburn wrote:
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:
Even if you come up with something -- suppose you ran out of local storage and that mechanism called another part of your code (presumably using another pre-allocated stack)... What it is going to do? There is no way to "add more stack to that thread" on typical implementation. Nor there is a good way to interrupt execution of that thread -- noexcept functions are way too important/convenient to make same mistake early versions of Java did.

At least the program would have opportunity to report error... as usual std::bad_alloc cases.

Well, depends on what "reporting" means. You'll have to define it's meaning, figure out a way how to do it without using current stack and then argue with language lawyers about it practical necessity and effect on other C++ implementations that may not even use stack to model local storage.

This is like handling std::bad_alloc.

It is nothing like handling `std::bad_alloc`. That is an exception, and there are rules for when those get thrown, where they get caught, and what happens when they do get caught.

You cannot unwind the stack if you have a stack overflow, since that risks overflowing again. And if you cannot unwind the stack, then you cannot destroy any of the objects on that stack. The absolute best you could do is say that the thread-stack never terminates in this situation. But that leaves the program broken; who knows if that stack is holding a mutex that will never be released?

Stack exhaustion is not a survivable condition. The entire C and C++ execution model relies on the stack existing. It would be exceedingly difficult to write a multithreaded program of significance where a thread can stack overflow, terminate in some meaningful way, and the program still proceed reasonably.

The most you could do is give people a standard function to test how close you are to exhaustion, so that the caller can choose to terminate in some way.

It is eventually up to user, with certain limitations (more strictly than usual cases). They can even be conditionally supported, however, UB is still not involved and other threads can remain alive.

I don't think it is impossible to devise such mechanism and put it into language (but I also argued that throwing from destructors should be allowed ;) ). Questions are:
- details of such mechanism behavior
- it's practical benefits
- it's implementation costs
- details of typical implementation (e.g. in GCC on x86)

Well, that may need a detailed proposal. The points here are allowance and conformance. Since resource exhaustion causes UB currently, the former is resolved; all I want (except ones I believed no sane way to be resolved like deploying spaghetti stack as requirements) remained is why it has to be UB Is it contradict with the current language design directly? AFAIK, no.

There still be some insights which are not off-topic too much (I think). They are mostly about "practical benefits".

Portability is at least one answer to it, and avoiding UB is one of the necessary condition.

C and C++ have existed for a combined total of over half a century. And yet, programmers have managed to write portable programs just fine despite this being UB the entire time.

Why is this a problem for portability in the real world?

This sounds very much like people who want `int` to be required to be exactly 32-bits, 2's complement, and claim that if it isn't, then they cannot write portable programs. Not everything has to be well-defined in order for a program to be portable.

Detailed mechanism would bring more persuasiveness, but let it to be in std-proposals. First let's try proving it false about feasibility here, if you think it necessary.

Second, UB effect the whole program. To keep other thread work as expected is the practical need. In reality, unless there is only one thread, it is not likely that all threads have to died in same manner.

For costs... at least they are not easy to be zero. It is probably not useful for freestanding implementations as in hosted implementations, so it may be excluded there. Otherwise, I hope it can be afford by mainstream implementations.

To what end? What would be accomplished by this which cannot otherwise be done?


FrankHB1989

Dec 2, 2017, 11:26:41 PM12/2/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 11:47:26 AM UTC+8, Nicol Bolas wrote:
On Saturday, December 2, 2017 at 10:29:32 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 5:31:45 AM UTC+8, Michael Kilburn wrote:
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:
Even if you come up with something -- suppose you ran out of local storage and that mechanism called another part of your code (presumably using another pre-allocated stack)... What it is going to do? There is no way to "add more stack to that thread" on typical implementation. Nor there is a good way to interrupt execution of that thread -- noexcept functions are way too important/convenient to make same mistake early versions of Java did.

At least the program would have opportunity to report error... as usual std::bad_alloc cases.

Well, depends on what "reporting" means. You'll have to define it's meaning, figure out a way how to do it without using current stack and then argue with language lawyers about it practical necessity and effect on other C++ implementations that may not even use stack to model local storage.

This is like handling std::bad_alloc.

It is nothing like handling `std::bad_alloc`. That is an exception, and there are rules for when those get thrown, where they get caught, and what happens when they do get caught.

You cannot unwind the stack if you have a stack overflow; since that risks overflowing again. And if you cannot unwind the stack, then you cannot destroy any of the objects on that stack. The absolute best you could do is say that the thread-stack never terminates in this situation. But that leaves the program broken; who knows if that stack is holding a mutex that will never be released?

As I have said, if it works, I don't expect it to directly reuse the current stack.
But it is a good question how to handle resources shared with other threads. In such a case, is it possible for the stack to be unwound remotely/asynchronously?

Stack exhaustion is not a survivable condition. The entire C and C++ execution model relies on the stack existing. It would be exceedingly difficult to write a multithreaded program of significance where a thread can stack overflow, terminate in some meaningful way, and the program still proceed reasonably.

Not every case. Consider a strictly conforming `int main(){}` :)

I agree with the points about the difficulties. What I want (ideally) is a guarantee that the stack never overflows until the free store is exhausted (which could be detected, e.g., by catching std::bad_alloc). That seems far too difficult to be supported even by hosted implementations. (Otherwise I always have to work around it in my tasks with trampolines, etc.)

The most you could do is give people a standard function to test how close you are to exhaustion, so that the caller can choose to terminate in some way.

It is eventually up to user, with certain limitations (more strictly than usual cases). They can even be conditionally supported, however, UB is still not involved and other threads can remain alive.

I don't think it is impossible to devise such mechanism and put it into language (but I also argued that throwing from destructors should be allowed ;) ). Questions are:
- details of such mechanism behavior
- it's practical benefits
- it's implementation costs
- details of typical implementation (e.g. in GCC on x86)

Well, that may need a detailed proposal. The points here are allowance and conformance. Since resource exhaustion causes UB currently, the former is resolved; all I want (except ones I believed no sane way to be resolved like deploying spaghetti stack as requirements) remained is why it has to be UB Is it contradict with the current language design directly? AFAIK, no.

There still be some insights which are not off-topic too much (I think). They are mostly about "practical benefits".

Portability is at least one answer to it, and avoiding UB is one of the necessary condition.

C and C++ have existed for a combined total of over half a century. And yet, programmers have managed to write portable programs just fine despite this being UB the entire time.

That depends on how much portability you have to meet.

Why is this a problem for portability in the real world?

In reality, portability via a high-level language (like C++) and portability via platform-dependent solutions differ by a factor of N, where N is the number of radically different categories of platforms you have to support. If you only need portability across several platforms that resemble each other closely enough, and there is sufficient documentation to paper over the minor differences, then N ≈ 1; you won't have much extra work to do beyond some assumptions common to those platforms, so the process is still "as-if" portable after the extra work, and the result stays reasonably reusable.

Sorry, that is really not my case in reality... and I have found nowhere other than the language standards to put the combined description of portable behavior.

This sounds very much like people who want `int` to be required to be exactly 32-bits, 2's complement, and claim that if it isn't, then they cannot write portable programs. Not everything has to be well-defined in order for a program to be portable.
It isn't. If there is no exact 32-bit two's-complement integer, (correct) use of another integer type does not directly incur extra UB. The extra work to retain portability is quite limited and trivial: testing all the integer representations the standard allows, using conditional inclusion, etc. Note that one reason for adding `std::int32_t` to the standard was to prevent wheels from being reinvented in vain; that is also different from my case, where there are almost no reliable wheels unless the standard requires them.

(BTW, for maximum portability I prefer `std::ptrdiff_t`, `std::intptr_t`, and `std::int_leastN_t` over `int` or `intN_t`, for the significance of their semantic properties.)


Detailed mechanism would bring more persuasiveness, but let it to be in std-proposals. First let's try proving it false about feasibility here, if you think it necessary.

Second, UB effect the whole program. To keep other thread work as expected is the practical need. In reality, unless there is only one thread, it is not likely that all threads have to died in same manner.

For costs... at least they are not easy to be zero. It is probably not useful for freestanding implementations as in hosted implementations, so it may be excluded there. Otherwise, I hope it can be afford by mainstream implementations.

To what end? What would be accomplished by this which cannot otherwise be done?


Conformance, if you insist the other differences are not important in reality.


Thiago Macieira

Dec 3, 2017, 12:25:46 AM12/3/17
to std-dis...@isocpp.org
On Saturday, 2 December 2017 19:44:22 PST FrankHB1989 wrote:
> So why mainstream OSes does not work like them? If the problem is too much
> overhead is incurred so manufacturers abandon the mechanism, it is the case
> what I said.

Going off-topic here, but the long story short is that there are decades of
experience with VMMs saying it should be that way. A lot of applications
allocate memory they don't use, and for a great deal of the memory they do use,
it's only for a short time, so it can be swapped out. Then there's clean memory,
which can simply be dropped, as it can be reloaded from disk.

There's not much overhead incurred. It will not get dropped.

You need to study the subject a little more before you can make suggestions
for improvements.

> > You can choose to program only for them and leave the mainstream OSes
> > behind.
>
> If it can run on my machine for daily work, I'd like to...

Obviously not. Modern VMM works just fine for "daily work" type of
applications. And that's not even talking about applications that use (gasp!)
garbage collection...

In any case, how is a modern VMM hindering your daily work? Are you
programming functionally-safe applications?

Thiago Macieira

Dec 3, 2017, 12:34:33 AM12/3/17
to std-dis...@isocpp.org
On Saturday, 2 December 2017 19:29:32 PST FrankHB1989 wrote:
> Second, UB effect the whole program.

Correct. All UB affects the whole program, all threads.

If you want to survive stack overflow, then we start by defining "stack
overflow". That's not part of the standard. Therefore, your solution is not
provided by the standard. Please read your platform's documentation.

If you want to survive a null pointer dereference, then read what your
platform does in that case. If you want to survive a division by zero, find
out how your platform delivers those.

Thiago Macieira

Dec 3, 2017, 12:45:46 AM12/3/17
to std-dis...@isocpp.org
On Saturday, 2 December 2017 20:26:41 PST FrankHB1989 wrote:
> Is I have said, if it works, I don't expect it would directly reuse the
> current stack.
> But it is a good question about how to handling resources shared with other
> threads. In such case, is it possible that the stack can be unwound
> remotely/asynchronously?

Again, please read up on your platform's architecture about stacks and the
concept of "context switching" before you make suggestions like this. There
are many things that are unique to a thread, like the thread ID, the platform-
specific thread pointer and the *stack*.

Unwinding the stack means running code *with* that stack. If the stack is
exhausted, the unwinding won't work. Even completely noexcept destructors may
call functions and create new automatic-storage variables. That requires
stack. If you could add more memory to the stack, you wouldn't have exhausted
it in the first place.

Not to mention that if the stack is exhausted, then it's corrupt. Trying to
unwind a corrupt stack is GIGO - garbage in, garbage out.

> Sorry, this is really not my cases in reality... and I found nowhere to put
> the combined description of portable behavior other than the language
> standards.

Ever heard of POSIX? Single UNIX Specification and XPG?

Nicol Bolas

Dec 3, 2017, 10:42:56 AM12/3/17
to ISO C++ Standard - Discussion
On Saturday, December 2, 2017 at 11:26:41 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 11:47:26 AM UTC+8, Nicol Bolas wrote:
On Saturday, December 2, 2017 at 10:29:32 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 5:31:45 AM UTC+8, Michael Kilburn wrote:
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:

Why is this a problem for portability in the real world?

In reality, the level of portability on high-level languages (like C++) and platform-dependent solutions has difference of a factor N where N is how many radically different categories of platforms have to support. If you need only portability across several platforms look alike each other enough, and there is sufficient documentation to elide the minor differences, then N ≈ 1, and you won't have too much extra work to do besides some common assumptions from these platforms, so the process is still "as-if" they are portable after the extra work and the result would kept reasonably reusable.

Sorry, this is really not my cases in reality... and I found nowhere to put the combined description of portable behavior other than the language standards.

OK, consider a program that takes a number as input and will perform recursion based on that number; larger numbers mean more recursion. Is that program portable?

Right now, the answer as far as the standard is concerned is "yes". Sure, for each platform, there is some maximum number that you can provide, beyond which you'll exhaust the available resources. But the program itself is portable.

You're effectively saying that you want the program to be considered non-portable, simply because it could exhaust resources. That is, a program is not "portable" to you unless it cannot overflow the stack.

This is not a reasonable definition of portability. It is certainly not a useful one...

This sounds very much like people who want `int` to be required to be exactly 32-bits, 2's complement, and claim that if it isn't, then they cannot write portable programs. Not everything has to be well-defined in order for a program to be portable.
It isn't. If there is no exact 32-bits 2's complement integer, use of other integer (correctly) does not incur extra UB directly. The extra work to remain portability is quite limited and trivial: to test all options of integer representations allowed by the standard, using conditional inclusion, etc. Note one reason of adding `std::int32_t` to the standard is to prevent reinvented wheels in vain; that is also different to my case where there is almost no reliable wheels unless the standard has required it.

(BTW, I prefer `std::ptrdiff_t`, `std::intptr_t` and `std::int_leastN_t` over `int` or `intN_t` for the sake of portability, for the significance of their semantic properties.)

A detailed mechanism would be more persuasive, but let that be for std-proposals. First, let's try to disprove its feasibility here, if you think that necessary.

Second, UB affects the whole program. Keeping the other threads working as expected is the practical need. In reality, unless there is only one thread, it is unlikely that all threads have to die in the same manner.

As for costs... at least they are not easily zero. It is probably not as useful for freestanding implementations as for hosted ones, so it may be excluded there. Otherwise, I hope mainstream implementations can afford it.

To what end? What would be accomplished by this which cannot otherwise be done?


Conformance, if you insist the other differences are not important in reality.

"Conformance" to what?


FrankHB1989

unread,
Dec 3, 2017, 12:05:33 PM12/3/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 1:25:46 PM UTC+8, Thiago Macieira wrote:
On Saturday, 2 December 2017 19:44:22 PST FrankHB1989 wrote:
> So why mainstream OSes does not work like them? If the problem is too much
> overhead is incurred so manufacturers abandon the mechanism, it is the case
> what I said.

Going off-topic here, but the long story short is that there are decades of
experience with VMMs that say it should be that way. A lot of applications
allocate memory they don't use; and for a great deal of the memory they do
use, it's only for a short time, so it can be swapped out. Then there's clean
memory, which can simply be dropped, as it can be reloaded from disk.

There's not much overhead incurred. It will not get dropped.

You need to study the subject a little more before you can make suggestions
for improvements.

I think I've misread your previous words. As for paging, your points are mostly valid. But pages (backed by RAM or not) still do not solve the problem magically. Paging is not the right tool for this problem in nature, because its allocation granularity is too coarse. The original point here is that the MMU is involved in address translation, not necessarily in a concrete management mechanism like paging. Segmentation can actually go further on this problem, but it does incur some other unnecessary complexity and inflexibility. I'd expect some other internal state, exported by the MMU differently from current ISAs, to aid runtime checking of the frames of activation records (not necessarily the stack) in some hardware-accelerated way.

> > You can choose to program only for them and leave the mainstream OSes
> > behind.
>
> If it can run on my machine for daily work, I'd like to...

Obviously not. Modern VMM works just fine for "daily work" type of
applications. And that's not even talking about applications that use (gasp!)
garbage collection...

In any case, how is a modern VMM hindering your daily work? Are you
programming functionally-safe applications?

OK. My daily work relies on mainstream OSes because most applications are only available on them. A modern VMM does not always hinder my work, but it does not work well in every case, especially when I am working on a machine without enough RAM. As an end-user, I always have to terminate some programs carefully and periodically to avoid too many GBs being eaten in vain, or to keep my HDD quiet. Garbage-collected programs generally do worse in these cases. But anyway, this is another topic.

And it sounds strange if I have to write programs that only work with a hypervisor...

FrankHB1989

unread,
Dec 3, 2017, 12:12:32 PM12/3/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 1:34:33 PM UTC+8, Thiago Macieira wrote:
On Saturday, 2 December 2017 19:29:32 PST FrankHB1989 wrote:
> Second, UB effect the whole program.

Correct. All UB affects the whole program, all threads.

If you want to survive stack overflow, then we start by defining "stack
overflow". That's not part of the standard. Therefore, your solution is not
provided by the standard. Please read your platform's documentation.

Well, this is normally not enough. A platform's documentation does not generally provide enough detail about the implementation of the language. Some assumptions can be provided by the ISA manual; the rest relies on... guessing. It can still be as unpredictable as UB if not dug into deeply.

If you want to survive a null pointer dereference, then read what your
platform does in that case. If you want to survive a division by zero, find
out how your platform delivers those.

Perhaps also some DRs, in some cases...

Thiago Macieira

unread,
Dec 3, 2017, 12:18:24 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 09:05:32 PST FrankHB1989 wrote:
> I think I've misread your previous words. As of paging, your points are
> most valid. But pages (backed by RAM or not) still do not solve the problem
> magically. Paging is not the right tool to solve the problem in nature
> because it has too coarse granularity in allocation. The original point
> here is that MMU involves in address translation, not necessarily concrete
> management mechanism like paging. Segmentation can actually do something
> further for this problem, but it do incur some other unnecessary complexity
> and inflexibility. I'd expect some other internal state exported by MMU
> available differently from current ISA to aid runtime check of frames of
> activation records (*not necessarily the stack*) in some
> hardware-accelerated way.

Explain a little more. What would you like to see done that can't be done
right now? What purpose would this serve? What could applications do that they
can't right now?

> OK. My daily work relies on mainstream OSes because most applications only
> available on them. Modern VMM does not always hindering my work, but it
> does not work well in every case especially when I am work on the machine
> where is no enough RAM space. As an end-user, I have to always carefully
> terminate some programs periodically to avoid too many GBs are ate in vein,
> or make my HDD quiet. Garbage collected programs do work worse for these
> cases in general. But anyway, this is another topic.

More than likely, you can only run those programs in the first place *because*
of modern VMM. If paging and overcommitting were not available, you'd only run
half of those programs at a time and some of them would crash unexpectedly
before you realised you should close them to free up memory.

> And it sounds strange if I have to write programs only work with
> hypervisor...

It's not strange. If you *need* the guarantee that memory you asked for was
delivered and was not taken away, like in a functionally-safe application, you
need an OS that gives you that. So choose your OS carefully. Don't use a
mainstream, multi-purpose OS.

FrankHB1989

unread,
Dec 3, 2017, 12:32:02 PM12/3/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 1:45:46 PM UTC+8, Thiago Macieira wrote:
On Saturday, 2 December 2017 20:26:41 PST FrankHB1989 wrote:
> Is I have said, if it works, I don't expect it would directly reuse the
> current stack.
> But it is a good question about how to handling resources shared with other
> threads. In such case, is it possible that the stack can be unwound
> remotely/asynchronously?

Again, please read up on your platform's architecture about stacks and the
concept of "context switching" before you make suggestions like this. There
are many things that are unique to a thread, like the thread ID, the platform-
specific thread pointer and the *stack*.
This time I'd like to say no. The stack is not in the standard. Neither is context switching. Neither is the threading model.

Even though native threads provided by the OS are expected in most implementations, they are not mandated. Even if there is a stack, it is not necessarily the native one. Even if the threads are mapped one-to-one to OS threads, they are not necessarily native to the processor (at the ISA level; SMT at the microarchitectural level is another story). There is too much room...


Unwinding the stack means running code *with* that stack. If the stack is
exhausted, the unwinding won't work. Even completely noexcept destructors may
call functions and create new automatic-storage variables. That requires
stack. If you could add more memory to the stack, you wouldn't have exhausted
it in the first place.

No. Stack unwinding is mandated, but the stack is not. It only means the code might be run as-if on that stack. I don't see why it can't be simulated where the runtime has full control, interop/compatibility issues aside.

Not to mention that if the stack exhausted, then it's corrupt. Trying to
unwind a corrupt stack is GIGO - garbage in, garbage out.

This is the status quo, a true problem to be resolved.

> Sorry, this is really not my cases in reality... and I found nowhere to put
> the combined description of portable behavior other than the language
> standards.

Ever heard of POSIX? Single UNIX Specification and XPG?

Seriously, do you really rely on portability across platforms in reality? Not to mention freestanding implementations; these specifications do not even apply well to Windows...

(POSIX support has never been a first-class citizen on Windows. WSL is still in beta and still suffers from several severe syscall bugs in its current version, so it is not ready to replace the host environment.)

Another question... are there working drafts of them available as conveniently as WG21 documents are?

FrankHB1989

unread,
Dec 3, 2017, 12:47:09 PM12/3/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 11:42:56 PM UTC+8, Nicol Bolas wrote:
On Saturday, December 2, 2017 at 11:26:41 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 11:47:26 AM UTC+8, Nicol Bolas wrote:
On Saturday, December 2, 2017 at 10:29:32 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 5:31:45 AM UTC+8, Michael Kilburn wrote:
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:

Why is this a problem for portability in the real world?

In reality, the level of portability of high-level languages (like C++) versus platform-dependent solutions differs by a factor of N, where N is how many radically different categories of platforms have to be supported. If you need portability only across several platforms that look enough alike, and there is sufficient documentation to elide the minor differences, then N ≈ 1, and you won't have too much extra work to do beyond some assumptions common to these platforms; the process is still "as-if" portable after the extra work, and the result stays reasonably reusable.

Sorry, this is really not my case in reality... and I have found nowhere to put a combined description of portable behavior other than the language standards.

OK, consider a program that takes a number as input and will perform recursion based on that number; larger numbers mean more recursion. Is that program portable?

Right now, the answer as far as the standard is concerned is "yes". Sure, for each platform, there is some maximum number that you can provide, beyond which you'll exhaust the available resources. But the program itself is portable.

You're effectively saying that you want the program to be considered non-portable, simply because it could exhaust resources. That is, a program is not "portable" to you unless it cannot overflow the stack.

This is not a reasonable definition of portability. It is certainly not a useful one...


The usefulness depends.

It is easy to define, and it can be made clearly precise. The term strictly conforming program in ISO C captures the intention exactly.

Besides the properties of the definition itself, instances also count. A strictly conforming C program is not so useful, mostly because ISO C requires so little that almost no real program falls into the category. The analogous category in C++ is hopefully not so limited.

This sounds very much like people who want `int` to be required to be exactly 32-bits, 2's complement, and claim that if it isn't, then they cannot write portable programs. Not everything has to be well-defined in order for a program to be portable.
It isn't. If there is no exact 32-bit two's complement integer, correctly using another integer type does not directly incur extra UB. The extra work needed to remain portable is quite limited and trivial: test all the integer representations the standard allows, use conditional inclusion, etc. Note that one reason for adding `std::int32_t` to the standard was to prevent wheels from being reinvented in vain; that is also different from my case, where there are almost no reliable wheels unless the standard requires them.

(BTW, I prefer `std::ptrdiff_t`, `std::intptr_t` and `std::int_leastN_t` over `int` or `intN_t` for the sake of portability, for the significance of their semantic properties.)

A detailed mechanism would be more persuasive, but let that be for std-proposals. First, let's try to disprove its feasibility here, if you think that necessary.

Second, UB affects the whole program. Keeping the other threads working as expected is the practical need. In reality, unless there is only one thread, it is unlikely that all threads have to die in the same manner.

As for costs... at least they are not easily zero. It is probably not as useful for freestanding implementations as for hosted ones, so it may be excluded there. Otherwise, I hope mainstream implementations can afford it.

To what end? What would be accomplished by this which cannot otherwise be done?


Conformance, if you insist the other differences are not important in reality.

"Conformance" to what?

The language specification.

In the worst case, if portability is guaranteed in such a manner but it still does not meet my requirements, then as a vendor I can at worst fork the language specification to derive a dialect, rather than the language + POSIX + ISA documents + other messes...

(Yes, this is about the usability of the spec itself.)



FrankHB1989

unread,
Dec 3, 2017, 12:53:36 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:18:24 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 09:05:32 PST FrankHB1989 wrote:
> I think I've misread your previous words. As of paging, your points are
> most valid. But pages (backed by RAM or not) still do not solve the problem
> magically. Paging is not the right tool to solve the problem in nature
> because it has too coarse granularity in allocation. The original point
> here is that MMU involves in address translation, not necessarily concrete
> management mechanism like paging. Segmentation can actually do something
> further for this problem, but it do incur some other unnecessary complexity
> and inflexibility. I'd expect some other internal state exported by MMU
> available differently from current ISA to aid runtime check of frames of
> activation records (*not necessarily the stack*) in some
> hardware-accelerated way.

Explain a little more. What would you like to see done that can't be done
right now? What purpose would this serve? What could applications do that they
can't right now?
As said previously: to write an implementation of a language with (at least) first-class continuations in direct style (non-trampolined, non-CPS-transformed), without constructing a software stack. Unless proper tail calls are required, that is not implementable in a portable way. Note that manual transformation usually involves a *huge* amount of work.

I had thought this was too difficult to achieve as a conformance requirement, so I weakened the requirements later.

Thiago Macieira

unread,
Dec 3, 2017, 12:54:42 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 09:32:02 PST FrankHB1989 wrote:
> > Unwinding the stack means running code *with* that stack. If the stack is
> > exhausted, the unwinding won't work. Even completely noexcept destructors
> > may
> > call functions and create new automatic-storage variables. That requires
> > stack. If you could add more memory to the stack, you wouldn't have
> > exhausted
> > it in the first place.
> >
> No. Stack unwinding is mandated, but not the stack. It only means the code
> might be run *as-if* on that stack. I don't see why it can't be simulated
> where the runtime has the whole control, besides interops/compatibility
> issues.

The code is not on the stack. The code is in the code pages.

The stack is what the stack pointer register points to. If you point to the actual
stack, you're on that stack. If you point elsewhere, you're not on the actual
stack, but then you can't unwind it either.

Are you suggesting copying some pages off the top of the stack (lowest
addresses) to some altstack place where the destructors could run?

> > Not to mention that if the stack exhausted, then it's corrupt. Trying to
> > unwind a corrupt stack is GIGO - garbage in, garbage out.
> >
> This is the status quo, a true problem to be resolved.

We're saying it's not. Stack overflows do not usually happen in normal
applications. When they happen, they're usually due to bugs in the code in the
first place (unbounded recursion). So if it's due to bugs, there's no
guarantee that providing a recovery mechanism will help and there's also no
need for it as non-buggy code doesn't have the problem in the first place.

> > Ever heard of POSIX? Single UNIX Specification and XPG?
>
> Seriously, do you really rely on the portability *across *platforms in
> reality? Not to mention freestanding implementations, they don't even
> applicable well for Windows...

Yes, I do. Oh, sure, there's a lot of bugs everywhere that I need to work
around. But the vast majority of the Unix code I write is just plainly
cross-platform.

Freestanding doesn't apply to me.

> Another question... are there working drafts of them available conveniently
> as WG21 documents?

Working drafts of what?

Thiago Macieira

unread,
Dec 3, 2017, 12:59:06 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 09:53:36 PST FrankHB1989 wrote:
> > Explain a little more. What would you like to see done that can't be done
> > right now? What purpose would this serve? What could applications do that
> > they
> > can't right now?
>
> As said previously, to write implementation of a language (at least) with
> first-class continuations in direct (non-trampolined, non-CPS transformed)
> style without construction of software stack. Unless proper tail call is
> required, it is not implementable in a portable way. Note manual
> transformation usually involve *huge* amount of work.

Sorry, this went over my head. Can you elaborate on what functionality you
need from the language, OS or processor/ISA to achieve that implementation
that you don't have today?

FrankHB1989

unread,
Dec 3, 2017, 1:14:12 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:54:42 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 09:32:02 PST FrankHB1989 wrote:
> > Unwinding the stack means running code *with* that stack. If the stack is
> > exhausted, the unwinding won't work. Even completely noexcept destructors
> > may
> > call functions and create new automatic-storage variables. That requires
> > stack. If you could add more memory to the stack, you wouldn't have
> > exhausted
> > it in the first place.
> >
> No. Stack unwinding is mandated, but not the stack. It only means the code
> might be run *as-if* on that stack. I don't see why it can't be simulated
> where the runtime has the whole control, besides interops/compatibility
> issues.

The code is not on the stack. The code is in the code pages.

The stack is what the stack pointer register points to. If you point to the actual
stack, you're on that stack. If you point elsewhere, you're not on the actual
stack, but then you can't unwind it either.

Are you suggesting copying some pages off the top of the stack (lowest
addresses) to some altstack place where the destructors could run?

OK, *with* the stack is right. Sorry for my typo.

I did not suggest a concrete way to implement it. I did suggest it can be done outside the current stack, so that can be one choice. Another way is to transform the destructor calls into stackless style.

(As I said previously, if I merely want to keep the other threads alive, no destructor needs to be called. It is certainly dangerous for shared resources, though.)

FrankHB1989

unread,
Dec 3, 2017, 1:19:43 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:59:06 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 09:53:36 PST FrankHB1989 wrote:
> > Explain a little more. What would you like to see done that can't be done
> > right now? What purpose would this serve? What could applications do that
> > they
> > can't right now?
>
> As said previously, to write implementation of a language (at least) with
> first-class continuations in direct (non-trampolined, non-CPS transformed)
> style without construction of software stack. Unless proper tail call is
> required, it is not implementable in a portable way. Note manual
> transformation usually involve *huge* amount of work.

Sorry, this went over my head. Can you elaborate on what functionality you
need from the language, OS or processor/ISA to achieve that implementation
that you don't have today?

If a guarantee of proper tail calls is not such functionality... then, for example, a call/cc variant that meets the zero-overhead principle?

FrankHB1989

unread,
Dec 3, 2017, 1:31:17 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:54:42 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 09:32:02 PST FrankHB1989 wrote:
It is considered unusual in imperative-style code (though it does still occasionally occur). It is actually quite normal in so-called functional code. And when I implement a language deliberately built around such a style, I can't impose the same limitation on users' code that C++ does; that would kill most of its usability. The workarounds are either obviously non-portable or carry obvious overhead. This is better fixed in the language implementing it.

As for my weakened requirements... true, they are not so significant if they merely work around bugs.

> > Ever heard of POSIX? Single UNIX Specification and XPG?
>
> Seriously, do you really rely on the portability *across *platforms in
> reality? Not to mention freestanding implementations, they don't even
> applicable well for Windows...

Yes, I do. Oh, sure, there's a lot of bugs everywhere that I need to work
around. But the vast majority of the Unix code I write is just plainly
cross-platform.

Freestanding doesn't apply to me.
 
Sorry, that is not my case.

> Another question... are there working drafts of them available conveniently
> as WG21 documents?

Working drafts of what?

POSIX/SUS/XPG...

FrankHB1989

unread,
Dec 3, 2017, 1:36:46 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 2:31:17 AM UTC+8, FrankHB1989 wrote:
I should have been talking about the code that leads to stack overflow, not the stack overflow itself. If the stack overflows in the end, the implementation has bugs.

Thiago Macieira

unread,
Dec 3, 2017, 1:37:57 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 10:19:42 PST FrankHB1989 wrote:
> > Sorry, this went over my head. Can you elaborate on what functionality you
> > need from the language, OS or processor/ISA to achieve that implementation
> > that you don't have today?
> >
> If guarantee of proper tail calls is not the functionality... for example,
> a call/cc variant meets zero overhead principle?

What's overhead for you in a function call? Obviously a function call needs to
store somewhere the return address. In some architectures, there's also some
other state that needs to be stored. That's not overhead, that's the normal
part.

Thiago Macieira

unread,
Dec 3, 2017, 1:38:58 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 10:14:12 PST FrankHB1989 wrote:
> (As I said previously, if I merely want to make other threads alive, no
> destructor needs to be called. It is certainly dangerous for shared
> resources, though.)

Then just suspend this thread forever if it overflows its stack.

Any locked mutexes and other contexts will stay locked forever too.

Thiago Macieira

unread,
Dec 3, 2017, 1:41:46 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 10:31:16 PST FrankHB1989 wrote:
> It is considered not usual in imperative style code (though it still
> seldom occurs). It is actually quite normal in so-called functional code.
> And when I am implement a language deliberately with such style, I can't
> enforce the limitation on users' code as same as C++; that would kill most
> of usability. The workarounds are either obviously not portable or with
> obvious overhead. This is better fixed in the language implementing it.

Sorry, any code that recurses on anything besides O(1) (if I'm feeling
generous, O(log n)) on the number of elements is buggy. Redesign it.

Nicol Bolas

unread,
Dec 3, 2017, 2:00:02 PM12/3/17
to ISO C++ Standard - Discussion


On Sunday, December 3, 2017 at 12:53:36 PM UTC-5, FrankHB1989 wrote:


On Monday, December 4, 2017 at 1:18:24 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 09:05:32 PST FrankHB1989 wrote:
> I think I've misread your previous words. As of paging, your points are
> most valid. But pages (backed by RAM or not) still do not solve the problem
> magically. Paging is not the right tool to solve the problem in nature
> because it has too coarse granularity in allocation. The original point
> here is that MMU involves in address translation, not necessarily concrete
> management mechanism like paging. Segmentation can actually do something
> further for this problem, but it do incur some other unnecessary complexity
> and inflexibility. I'd expect some other internal state exported by MMU
> available differently from current ISA to aid runtime check of frames of
> activation records (*not necessarily the stack*) in some
> hardware-accelerated way.

Explain a little more. What would you like to see done that can't be done
right now? What purpose would this serve? What could applications do that they
can't right now?
As said previously: to write an implementation of a language with (at least) first-class continuations in direct style (non-trampolined, non-CPS-transformed), without constructing a software stack.

And why do you need to do that? This sounds very much like defining yourself into a problem. What's wrong with a software stack? Interpreters have been using them for years without problems. What's wrong with trampolined or CPS-transformed style?

Why do you need a way to do this in a non-platform-specific way? After all, the only reason I can see for wanting to do any of this is if you're trying to use JIT or something similar. And JIT has to be platform-specific, by its very nature.

A good motivation for an idea comes from practical problems; your problem here seems like an artificially created one, used to justify this particular solution.

Nicol Bolas

unread,
Dec 3, 2017, 2:22:17 PM12/3/17
to ISO C++ Standard - Discussion
On Sunday, December 3, 2017 at 12:47:09 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 11:42:56 PM UTC+8, Nicol Bolas wrote:
On Saturday, December 2, 2017 at 11:26:41 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 11:47:26 AM UTC+8, Nicol Bolas wrote:
On Saturday, December 2, 2017 at 10:29:32 PM UTC-5, FrankHB1989 wrote:
On Sunday, December 3, 2017 at 5:31:45 AM UTC+8, Michael Kilburn wrote:
On Sat, Dec 2, 2017 at 6:45 AM, FrankHB1989 <frank...@gmail.com> wrote:
On Saturday, December 2, 2017 at 4:25:29 PM UTC+8, Michael Kilburn wrote:

Why is this a problem for portability in the real world?

In reality, the level of portability of high-level languages (like C++) versus platform-dependent solutions differs by a factor of N, where N is how many radically different categories of platforms have to be supported. If you need portability only across several platforms that look enough alike, and there is sufficient documentation to elide the minor differences, then N ≈ 1, and you won't have too much extra work to do beyond some assumptions common to these platforms; the process is still "as-if" portable after the extra work, and the result stays reasonably reusable.

Sorry, this is really not my case in reality... and I have found nowhere to put a combined description of portable behavior other than the language standards.

OK, consider a program that takes a number as input and will perform recursion based on that number; larger numbers mean more recursion. Is that program portable?

Right now, the answer as far as the standard is concerned is "yes". Sure, for each platform, there is some maximum number that you can provide, beyond which you'll exhaust the available resources. But the program itself is portable.

You're effectively saying that you want the program to be considered non-portable, simply because it could exhaust resources. That is, a program is not "portable" to you unless it cannot overflow the stack.

This is not a reasonable definition of portability. It is certainly not a useful one...


The usefulness depends.

It is easy to define.

What is "easy to define"? The limits of the stack? How could you possibly define that?

The behavior of a stack overflow? There's no effective way to define that; if there was, then `std::thread` would have a way to terminate the given thread. It doesn't have such a thing, because the idea that a thread can be terminated by outside causes is not something the C++ object model can handle.

And what of forward progress guarantees under such a system? If a thread can just stop, for whatever reason, C++ just doesn't work anymore.

Nothing of what you're talking about can be deemed "easy to define".

It can be made clearly precise. The term strictly conforming program in ISO C captures the intention exactly.

Besides the properties of the definition itself, instances also count. A strictly conforming C program is not so useful, mostly because ISO C requires so little that almost no real program falls into the category.

How do you figure?

The analogous category in C++ is hopefully not so limited.

This sounds very much like people who want `int` to be required to be exactly 32-bits, 2's complement, and claim that if it isn't, then they cannot write portable programs. Not everything has to be well-defined in order for a program to be portable.
It isn't. If there is no exact 32-bit two's complement integer, correctly using another integer type does not directly incur extra UB. The extra work needed to remain portable is quite limited and trivial: test all the integer representations the standard allows, use conditional inclusion, etc. Note that one reason for adding `std::int32_t` to the standard was to prevent wheels from being reinvented in vain; that is also different from my case, where there are almost no reliable wheels unless the standard requires them.

(BTW, I prefer `std::ptrdiff_t`, `std::intptr_t` and `std::int_leastN_t` over `int` or `intN_t` for the sake of portability, for the significance of their semantic properties.)

A detailed mechanism would be more persuasive, but let that be for std-proposals. First, let's try to disprove its feasibility here, if you think that necessary.

Second, UB affects the whole program. Keeping the other threads working as expected is the practical need. In reality, unless there is only one thread, it is unlikely that all threads have to die in the same manner.

As for costs... at least they are not easily zero. It is probably not as useful for freestanding implementations as for hosted ones, so it may be excluded there. Otherwise, I hope mainstream implementations can afford it.

To what end? What would be accomplished by this which cannot otherwise be done?


Conformance, if you insist the other differences are not important in reality.

"Conformance" to what?

The language specification.

My question was "What would be accomplished by this which cannot otherwise be done?" Your answer was "Conformance [to] the language specification".

That's not an answer.

Compilers already conform to the language specification. Programs already conform to the language specification. You're talking about changing the language specification to have different conformance requirements. Such changes will not increase "conformance".

So again, what is it that you're trying to gain from this?

In the worst case, if portability is guaranteed in such a manner but still does not meet my requirements, then as a vendor I need fork at most the language specification to derive a dialect, not the language + POSIX + ISA documents + other messes...

Nothing is stopping them from creating such a fork right now. If an IHV wanted to, they could fork the spec and specify all kinds of behavior that was implementation-defined or even undefined. The possibility of stack overflows and their behavior could easily be part of such additions.

So again, defining this will not improve things in this way.

FrankHB1989

unread,
Dec 3, 2017, 8:53:12 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 2:37:57 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 10:19:42 PST FrankHB1989 wrote:
> > Sorry, this went over my head. Can you elaborate on what functionality you
> > need from the language, OS or processor/ISA to achieve that implementation
> > that you don't have today?
> >
> If guarantee of proper tail calls is not the functionality... for example,
> a call/cc variant meets zero overhead principle?

What's overhead for you in a function call? Obviously a function call needs to
store somewhere the return address. In some architectures, there's also some
other state that needs to be stored. That's not overhead, that's the normal
part.
You need to study the subject a little more... Traditional call/cc would need to copy the whole stack without something more magical. (And strictly speaking, the arguments to be "called" are not ordinary functions.) In general, I don't need most of the frames it would capture, but I have no way to specify that in the interface. So this directly violates the principle a lot, besides semantic problems like how to interact with RAII. (P0534R0 excludes full continuations for the very same reason.) The overhead seems theoretically unavoidable, so I would only consider a variant that allows me to specify the portion of top frames I need. But even that one is not easy to implement natively (even with platform-dependent code).

BTW, a limited case of the feature involved is called "exception handling" (which restricts the direction of continuation propagation)... I'm glad to see you treat it as a function call to keep the core language clean :)

FrankHB1989

unread,
Dec 3, 2017, 8:58:19 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 2:38:58 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 10:14:12 PST FrankHB1989 wrote:
> (As I said previously, if I merely want to make other threads alive, no
> destructor needs to be called. It is certainly dangerous for shared
> resources, though.)

Then just suspend this thread forever if it overflows its stack.

Any locked mutexes and other contexts will stay locked forever too.

This does not sound sane. It leaks resources that are not shared, and it is problematic for shared resources. The locks it holds would probably block others, so this can be another can of worms...

FrankHB1989

unread,
Dec 3, 2017, 9:02:04 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 2:41:46 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 10:31:16 PST FrankHB1989 wrote:
> It is considered not usual in imperative style code (though it still
> seldom occurs). It is actually quite normal in so-called functional code.
> And when I am implement a language deliberately with such style, I can't
> enforce the limitation on users' code as same as C++; that would kill most
> of usability. The workarounds are either obviously not portable or with
> obvious overhead. This is better fixed in the language implementing it.

Sorry, any code that recurses on anything besides O(1) (if I'm feeling
generous, O(log n)) on the number of elements is buggy. Redesign it.

For example, tree traversal (even with a known, limited depth)?

FrankHB1989

unread,
Dec 3, 2017, 9:24:10 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 3:00:02 AM UTC+8, Nicol Bolas wrote:


On Sunday, December 3, 2017 at 12:53:36 PM UTC-5, FrankHB1989 wrote:


On Monday, December 4, 2017 at 1:18:24 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 09:05:32 PST FrankHB1989 wrote:
> I think I've misread your previous words. As of paging, your points are
> most valid. But pages (backed by RAM or not) still do not solve the problem
> magically. Paging is not the right tool to solve the problem in nature
> because it has too coarse granularity in allocation. The original point
> here is that MMU involves in address translation, not necessarily concrete
> management mechanism like paging. Segmentation can actually do something
> further for this problem, but it do incur some other unnecessary complexity
> and inflexibility. I'd expect some other internal state exported by MMU
> available differently from current ISA to aid runtime check of frames of
> activation records (*not necessarily the stack*) in some
> hardware-accelerated way.

Explain a little more. What would you like to see done that can't be done
right now? What purpose would this serve? What could applications do that they
can't right now?
As said previously: to write an implementation of a language with (at least) first-class continuations in direct (non-trampolined, non-CPS-transformed) style, without constructing a software stack.

And why do you need to do that? This sounds very much like defining yourself into a problem. What's wrong with a software stack? Interpreters have been using them for years without problems. What's wrong with trampolined or CPS-transformed style?
There is nothing wrong with it being merely implementable. But the result is of (relatively) poor quality, in both performance and maintainability of the code, by a quite significant factor. And it's a pity to work hard just to bypass the stack provided by the underlying implementation.


Why do you need a way to do this in a non-platform-specific way? After all, the only reason I can see for wanting to do any of this is if you're trying to use JIT or something similar. And JIT has to be platform-specific, by its very nature.

I don't argue that it should be totally non-platform-specific. If I need to target a native ISA directly, it certainly should be platform-specific; it has to be, to utilize more of the platform's power. But that is only one case. I suggest it should be more portable than the status quo for the other cases, providing a common abstraction fitted to several well-known problems, to save the manpower of duplicating error-prone work like manual CPS transformation. This is also one reason we need coroutines or resumable functions in the language.

A good motivation for an idea comes from practical problems; your problem here seems like an artificially created one, used to justify this particular solution.
Both the need to save that amount of work and the need for abstraction are practical.

And consider the alternatives. I could roll one more IR of my own to compile to, merely to overcome the problem, which is also more or less the status quo in the methodology used by other projects. But the problem is not specific to my case. I could also reuse IRs from other projects, but I would still have to port them myself whenever there is a backend target and/or a feature I need that is not well supported directly, which may involve even more work than a fresh IR designed by myself. Why should it be implemented separately instead of built into a high-level general-purpose language like C++?




Thiago Macieira

unread,
Dec 3, 2017, 9:36:56 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 18:02:03 PST FrankHB1989 wrote:
> > Sorry, any code that recurses on anything besides O(1) (if I'm feeling
> > generous, O(log n)) on the number of elements is buggy. Redesign it.
> >
> For example, tree traversing (even with known limited depth)?

I was talking about unbounded n.

If you know the maximum value of n, then you need to do a platform-specific
check.

Thiago Macieira

unread,
Dec 3, 2017, 9:43:56 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 17:53:12 PST FrankHB1989 wrote:
> You need to study the subject a little more... Traditional call/cc would
> need to copy the *whole *stack without something more magical
> <http://wiki.c2.com/?ContinuationImplementation>.

Function in C and C++ calls DO NOT copy the whole stack. One frame pointer is
enough.

You seem to be talking about some other language than C++. I don't care about
other languages. Everything other than C, C++ and assembly is "algorithm".

> (And strictly speaking,
> the arguments to be "called" are not usual functions
> <http://okmij.org/ftp/continuations/undelimited.html>.) In general, I don't
> need most of the frames it would capture, but I have no way to specify in
> the interface. So this directly violate the principle *a lot*, as well as
> semantic problems like how to interact with RAII. (P0534R0
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0534r0.pdf>
> excludes full continuations for the very same reason.)

Ah, ok, so you seem to be talking about something other than function calls.
You're talking about capturing the entire frame (as opposed to how lambdas do
it) so that the function can be resumed.

Personal opinion, that's a horrible idea. There may be fatal problems with
this solution. Instead, just do like lambdas: capture by value what needs to
be captured and nothing further.

Thiago Macieira

unread,
Dec 3, 2017, 9:47:11 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 17:58:19 PST FrankHB1989 wrote:
> On Monday, December 4, 2017 at 2:38:58 AM UTC+8, Thiago Macieira wrote:
>
> > On Sunday, 3 December 2017 10:14:12 PST FrankHB1989 wrote:
> > > (As I said previously, if I merely want to make other threads alive, no
> > > destructor needs to be called. It is certainly dangerous for shared
> > > resources, though.)
> >
> > Then just suspend this thread forever if it overflows its stack.
> >
> > Any locked mutexes and other contexts will stay locked forever too.
> >
> This does not sound sane. It leaks resources not shared, and it is
> problematic on shared resources. The locks it holds would probably block
> others so this can be another can of worms...

Exactly.

Which is why I said the only possible reaction to a stack overflow is to
immediately terminate the application. There's no recovery.

Even in functionally-safe applications, rebooting is a valid measure. It's
much better to have a controlled way to resume a known valid state than to
persist in an unknown one.

FrankHB1989

unread,
Dec 3, 2017, 10:59:55 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 10:36:56 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 18:02:03 PST FrankHB1989 wrote:
> > Sorry, any code that recurses on anything besides O(1) (if I'm feeling
> > generous, O(log n)) on the number of elements is buggy. Redesign it.
> >
> For example, tree traversing (even with known limited depth)?

I was talking about unbounded n.

If you know the maximum value of n, then you need to do a platform-specific
check.


Effectively tail-call-optimized recursion can, for the normal cases, be guaranteed O(1) space.

And even if not, should the check be exposed in the language rules? Even C++ does not require it... it's up to the user.

Do you suggest that the size of activation records should be fixed for each run of a program? Can't they be part of the free store? I see no benefit here besides compatibility.

As for n... it is difficult (if not impossible) to know n for each case, for both implementations and users. It is slightly easier to determine by trial and error... if the behavior is not unpredictable.

A platform-specific check would help make n bounded only in quite limited cases.

FrankHB1989

unread,
Dec 3, 2017, 11:41:34 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 10:43:56 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 17:53:12 PST FrankHB1989 wrote:
> You need to study the subject a little more... Traditional call/cc would
> need to copy the *whole *stack without something more magical
> <http://wiki.c2.com/?ContinuationImplementation>.

Function in C and C++ calls DO NOT copy the whole stack. One frame pointer is
enough.

It is not required either to copy or not to copy the whole stack. Nothing prevents an insane implementation from copying the stack in vain for every function call while remaining conforming to the language rules.

You seem to be talking about some other language than C++. I don't care about
other languages. Everything other than C, C++ and assembly is "algorithm".

I'm talking about the object language to be implemented using C++. If C++ had the feature, it would be *much* easier.

And function calls are not necessarily the procedure calls provided by the ISA (remember inlining). I have also mentioned that someone has proposed adding such magic under the skin of function calls. All of these rely on algorithms hidden by the implementation (and they often fail to meet your criteria of computational complexity, esp. in an AOT optimizing compiler - even though they can be executed before run time). So this is not as simple as you said.

> (And strictly speaking,
> the arguments to be "called" are not usual functions
> <http://okmij.org/ftp/continuations/undelimited.html>.) In general, I don't
> need most of the frames it would capture, but I have no way to specify in
> the interface. So this directly violate the principle *a lot*, as well as
> semantic problems like how to interact with RAII. (P0534R0
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0534r0.pdf>
> excludes full continuations for the very same reason.)

Ah, ok, so you seem to be talking about something other than function calls.
You're talking about capturing the entire frame (as opposed to how lambdas do
it) so that the function can be resumed.

They can still be categorized as function calls (with proper types and many of the other expected properties of function calls in an algorithmic language), just not implemented as plain old C did.

This is like the situation of lambda expressions, which are the real "functions" of the current era. In this sense, Pascal/C-style purely procedural functions are only a special case that captures nothing, besides not being first-class. Although, for compatibility and some other reasons, C++ still mandates a clear boundary between so-called nominal function types and closure types, so it actually supports only some special defunctionalized forms rather than a uniform abstraction of functions, the two share quite the same spirit in high-level code in most cases, and they can often be interchanged in those cases. This fact shows the abstraction is valid and useful.

The point here is to add (as is already allowed) further extensions of the way the abstraction is implemented. The operand of call/cc (i.e. a full continuation) has been shown to be inconsistent with the uniform abstraction, but first-class entities of some other variants can still be candidates.

(BTW, I need much more, particularly a unified abstraction of macros and "first-class" functions, which can be viewed as differing only in the evaluation strategy of their arguments. I don't expect it to be added to most languages, including C++, because it requires an almost total redesign starting from the phase model. That's the major reason why I eventually have to write my own language.)

Personal opinion, that's a horrible idea. There may be fatal problems with
this solution. Instead, just do like lambdas: capture by value what needs to
be captured and nothing further.


In general I agree with the point of making the captured contents as explicit as possible in the implementation. But it is not always feasible.

Inventing a separate core feature for the remaining cases seems too luxurious given the effort of maintaining the language spec, with little chance of being championed.

Thiago Macieira

unread,
Dec 3, 2017, 11:49:05 PM12/3/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

> As of n... it is difficult (if not impossible) to know n for each case by
> both implementations and users. It is slightly easier to be determined by
> the results of trail-and-error... if the behavior is not unpredictable.

So, like I said, if you don't know what n will be, do not recurse worse than
O(log n). Rewrite your algorithm instead.

FrankHB1989

unread,
Dec 3, 2017, 11:55:10 PM12/3/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 10:47:11 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 17:58:19 PST FrankHB1989 wrote:
> On Monday, December 4, 2017 at 2:38:58 AM UTC+8, Thiago Macieira wrote:
>
> > On Sunday, 3 December 2017 10:14:12 PST FrankHB1989 wrote:
> > > (As I said previously, if I merely want to make other threads alive, no
> > > destructor needs to be called. It is certainly dangerous for shared
> > > resources, though.)
> >
> > Then just suspend this thread forever if it overflows its stack.
> >
> > Any locked mutexes and other contexts will stay locked forever too.
> >
> This does not sound sane. It leaks resources not shared, and it is
> problematic on shared resources. The locks it holds would probably block
> others so this can be another can of worms...

Exactly.

Which is why I said the only possible reaction to a stack overflow is to
immediately terminate the application. There's no recovery.

Even in functionally-safe applications, rebooting is a valid measure. It's
much better to have a controlled way to resume a known valid state then to
persist in an unknown one. 

Basically agreed. Theoretically, in a fully controlled environment it could be resolved by replaying every single operation in the history and inspecting the state only when needed (when the state is valid). That is not practically useful, because there are side effects altering the observable behavior beyond the control of the implementation. That's why I mentioned std::terminate.

But I still can't call it nonsense when the user can actually guarantee there is no shared resource in the dying thread. It is just not so obviously worthwhile, for non-technical reasons.

Nicol Bolas

unread,
Dec 4, 2017, 12:09:00 AM12/4/17
to ISO C++ Standard - Discussion
Why do you say "for non-technical reasons"? Because it might be theoretically possible?

It's still not worthwhile to figure out the proper wording and specification for it. After all, how do you define a "shared resource"? Someone could be waiting on a spinlock that will never be unlocked. It could be a value released by a call to `atomic_thread_fence`.

Why should time and effort be spent on a case that is of vanishing importance?

FrankHB1989

unread,
Dec 4, 2017, 12:14:25 AM12/4/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 12:49:05 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).
This seems to leak implementation details.

Similarly, there may be no inlining in debug mode. Iterator and algorithm implementations may have worse complexity. This does not prevent the standard from requiring them, because the deviations can be tolerated as non-conforming extensions.

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

Being implementation details not required by the language spec, knowledge of the ABI is not a premise that should be exposed to most users before the real functionality of a high-level language, including its capacity for abstraction.


> As of n... it is difficult (if not impossible) to know n for each case by
> both implementations and users. It is slightly easier to be determined by
> the results of trail-and-error... if the behavior is not unpredictable.

So, like I said, if you don't know what n will be, do not recurse worse than
O(log n). Rewrite your algorithm instead.

Is that ever possible without manually and routinely simulating the activation records (e.g. constructing an extra software stack) for *any* case?

I am personally quite reluctant to do such work, as it violates the separation-of-concerns principle.

FrankHB1989

unread,
Dec 4, 2017, 12:23:38 AM12/4/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:09:00 PM UTC+8, Nicol Bolas wrote:
On Sunday, December 3, 2017 at 11:55:10 PM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 10:47:11 AM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 17:58:19 PST FrankHB1989 wrote:
> On Monday, December 4, 2017 at 2:38:58 AM UTC+8, Thiago Macieira wrote:
>
> > On Sunday, 3 December 2017 10:14:12 PST FrankHB1989 wrote:
> > > (As I said previously, if I merely want to make other threads alive, no
> > > destructor needs to be called. It is certainly dangerous for shared
> > > resources, though.)
> >
> > Then just suspend this thread forever if it overflows its stack.
> >
> > Any locked mutexes and other contexts will stay locked forever too.
> >
> This does not sound sane. It leaks resources not shared, and it is
> problematic on shared resources. The locks it holds would probably block
> others so this can be another can of worms...

Exactly.

Which is why I said the only possible reaction to a stack overflow is to
immediately terminate the application. There's no recovery.

Even in functionally-safe applications, rebooting is a valid measure. It's
much better to have a controlled way to resume a known valid state then to
persist in an unknown one. 

Basically agreed. Theoretically, in a fully controlled environment it could be resolved by replaying every single operation in the history and inspecting the state only when needed (when the state is valid). That is not practically useful, because there are side effects altering the observable behavior beyond the control of the implementation. That's why I mentioned std::terminate.

But I still can't call it nonsense when the user can actually guarantee there is no shared resource in the dying thread. It is just not so obviously worthwhile, for non-technical reasons.

Why do you say "for non-technical reasons"? Because it might be theoretically possible?

No. It should be practically feasible. It just benefits too little compared to the costs (maintaining the spec, modifying the implementation, etc.), in my view.

It's still not worthwhile to figure out the proper wording and specification for it. After all, how do you define a "shared resource"? Someone could be waiting on a spinlock that will never be unlocked. It could be a value released by a call to `atomic_thread_fence`.

See, I actually agree with your point here now. But the informal definition should be trivial. For example: "the implementation may assume that cleanup of the resource would effect no observable behavior; otherwise the behavior is unspecified/undefined/implementation-defined". (Termination is already guaranteed by the current rules if there is no other observable behavior.) Quite simple to transfer the responsibility to users who dare to play with it, yeah?

Why should time and effort be spent on a case that is of vanishing importance?
That's why I haven't pushed the topic to std-proposals...


Thiago Macieira

unread,
Dec 4, 2017, 12:24:35 AM12/4/17
to std-dis...@isocpp.org
On Sunday, 3 December 2017 21:14:25 PST FrankHB1989 wrote:
> > There's
> > also no tail-call optimisation in some architectures depending on the
> > difference of parameters between the caller and callee. So assume it
> > didn't
> > happen, therefore it's not O(1).
>
> This seems to leak details of implementations.

No, there's no leakage. Assume there's no such thing as tail-call
optimisation. The standard doesn't talk about it anyway.

> Similarly, there may be no inlining in debug mode. Iterators and algorithms
> implementations may have worse complexity. This does not prevent the
> standard requires them, because they can be *tolerated *as non-conforming
> extensions.

Making a container or an iterator have worse complexity than what the standard
requires makes the implementation non-conforming.

Not inlining, not optimising, not applying tail calls, or doing any of
that, does not change an implementation from conforming to non-conforming or
vice-versa. The visible behaviour of the program does not change.

> > So, like I said, if you don't know what n will be, do not recurse worse
> > than
> > O(log n). Rewrite your algorithm instead.
> >
> Is it ever possible without manually and routinely simulate the activation
> records (like construction of an extra software stack) for *any* case?

Yes, every recursion can be replaced by heap allocation and a loop. Sure, now
you have O(n) heap usage, but the failure modes there are different.

> I am personally quite reluctant to such work violating the *separation of
> concerns* principle.

I don't see how that applies. You have faulty-designed code. You fix it.

Nicol Bolas

unread,
Dec 4, 2017, 12:31:11 AM12/4/17
to ISO C++ Standard - Discussion

On Monday, December 4, 2017 at 12:14:25 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 12:49:05 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).
This seems to leak implementation details.

Then stop relying on tail calls. It is an optimization; by definition, it's not guaranteed by the specification, nor is it intended to be.

Similarly, there may be no inlining in debug mode. Iterator and algorithm implementations may have worse complexity. This does not prevent the standard from requiring them, because the deviations can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

Being implementation details not required by the language spec, knowledge of the ABI is not a premise that should be exposed to most users before the real functionality of a high-level language, including its capacity for abstraction.

If you're caring about these sorts of things, then you're not operating at a "high-level".

> As of n... it is difficult (if not impossible) to know n for each case by
> both implementations and users. It is slightly easier to be determined by
> the results of trail-and-error... if the behavior is not unpredictable.

So, like I said, if you don't know what n will be, do not recurse worse than
O(log n). Rewrite your algorithm instead.

Is that ever possible without manually and routinely simulating the activation records (e.g. constructing an extra software stack) for any case?

I am personally quite reluctant to do such work, as it violates the separation-of-concerns principle.

How is it a violation of separation of concerns? Avoiding resource exhaustion in an algorithm is very much part of the "concerns" for it.



FrankHB1989

unread,
Dec 4, 2017, 3:27:40 AM12/4/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:24:35 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 21:14:25 PST FrankHB1989 wrote:
> > There's
> > also no tail-call optimisation in some architectures depending on the
> > difference of parameters between the caller and callee. So assume it
> > didn't
> > happen, therefore it's not O(1).
>
> This seems to leak details of implementations.

No, there's no leakage. Assume there's no such thing as tail-call
optimisation. The standard doesn't talk about it anyway.

The minimum implementable complexity is determined solely by the properties of the algorithm, not by the implementation. The non-O(1) case is due to the allocation strategy. The existence of a naive implementation is likewise a detail.

On the other hand, the standard doesn't talk about the complexity of many built-in operations either, e.g. indirection through a pointer value. Is it safe to treat that as O(1), or does it depend on something else, or can it simply not be determined?

> Similarly, there may be no inlining in debug mode. Iterators and algorithms
> implementations may have worse complexity. This does not prevent the
> standard requires them, because they can be *tolerated *as non-conforming
> extensions.

Making a container or an iterator have worse complexity than what the standard
requires makes the implementation non-conforming.

Not inlining, not optimising, or not tail applying tail call, or doing any of
that, does not change an implementation from conforming to non-conforming or
vice-versa. The visible behaviour of the program does not change.

So as you see, both conforming and non-conforming implementations can work in reality. How does that bear on whether it should be in the rules or not, if it also does not change the visible behavior?

> > So, like I said, if you don't know what n will be, do not recurse worse
> > than
> > O(log n). Rewrite your algorithm instead.
> >
> Is it ever possible without manually and routinely simulate the activation
> records (like construction of an extra software stack) for *any* case?

Yes, every recursion can be replaced by heap allocation and a loop. Sure, now
you have O(n) heap usage, but the failure modes there are different.
Well, I know it is implementable like this; that is exactly the case I want to avoid.

> I am personally quite reluctant to such work violating the *separation of
> concerns* principle.

I don't see how that applies. You have faulty-designed code. You fix it.

In general, the design of an algorithm does not necessarily require explicit allocation of bound variables (local variables in Algol-like languages, the right-hand terms of let-forms in ML-like languages, the arguments of the anonymous-function-evaluation idiom in JS-like languages, etc.). When it does not, those allocations are details of the algorithm, and they should ideally be trivially encoded in the implementing language. If the language cannot express that need, for special reasons applicable merely to the language rather than to the algorithm, it is the fault of the language, not of the algorithm design. As a workaround, I have to do manually (here, explicitly constructing the activation records that hold the bound variables) work that should have been done by the implementation of the language. This mixture of different layers of abstraction interrupts the flow of work in a strange way, which is more or less annoying to me.

FrankHB1989

unread,
Dec 4, 2017, 3:48:34 AM12/4/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 1:31:11 PM UTC+8, Nicol Bolas wrote:

On Monday, December 4, 2017 at 12:14:25 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 12:49:05 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).
This seems to leak details of implementations.

Then stop relying on tail calls. It is an optimization; by definition, it's not guaranteed by the specification, nor is it intended to be.

By which definition? It is never touched normatively by the C++ specification. And where does the intention come from? I've heard someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not "optimization". (In fact, I have not seen any language specification specify such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional changes to overload resolution of special member functions?

Similarly, there may be no inlining in debug mode. Iterators and algorithms implementations may have worse complexity. This does not prevent the standard from requiring them, because the deviations can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Since it is not shipped as the only implementation from any vendor, I think there is no problem for particular (debugging) use by developers.

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

As parts of the implementation details not required by the language spec, knowledge of the ABI is not a premise to be exposed to most users ahead of the real functionality of a high-level language, including its capability for abstraction.

If you're caring about these sorts of things, then you're not operating at a "high-level".

I'm caring about getting rid of dependencies on these sorts of things. So at least I'm trying to get "high-level".

> As of n... it is difficult (if not impossible) to know n for each case by
> both implementations and users. It is slightly easier to be determined by
> the results of trial-and-error... if the behavior is not unpredictable.

So, like I said, if you don't know what n will be, do not recurse worse than
O(log n). Rewrite your algorithm instead.

Is it ever possible without manually and routinely simulating the activation records (like constructing an extra software stack) for any case?

I am personally quite reluctant to do such work, as it violates the separation of concerns principle.

How is it a violation of separation of concerns? Avoiding resource exhaustion in an algorithm is very much part of the "concerns" for it.

Not every algorithm cares how to report the condition on exceptional paths or how to handle resource exhaustion, because it assumes that is the responsibility of something else (usually the caller). (Yeah, like exception neutrality!) An algorithm can also specify the validity of that assumption throughout its steps as a precondition. So no, "avoiding resource exhaustion" can be out of the scope of such algorithms. An unprincipled mixture of different concerns in the description of an algorithm does break the expected separation in such cases.


Nicol Bolas

unread,
Dec 4, 2017, 10:56:05 AM12/4/17
to ISO C++ Standard - Discussion

On Monday, December 4, 2017 at 3:48:34 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 1:31:11 PM UTC+8, Nicol Bolas wrote:
On Monday, December 4, 2017 at 12:14:25 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 12:49:05 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).
This seems to leak details of implementations.

Then stop relying on tail calls. It is an optimization; by definition, it's not guaranteed by the specification, nor is it intended to be.

By which definition? It is never touched normatively by the C++ specification.

By the definition you just cited: it's not explicitly required, but it is allowed by the "as if" rule. Therefore, it is an optimization.

And where does the intention come from? I've heard someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not "optimization". (In fact, I have not seen any language specification specify such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional changes to overload resolution of special member functions?

Yes, elision's an optimization, but because it has visible side-effects, it is an optimization the standard explicitly makes allowances for. Tail optimization does not need explicit allowances for it, since it does not affect visible behavior as defined by the standard.

Also, C++17 did not remove all instances of elision; just those involving prvalues. So named return values still have to rely on elision rules.
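A small illustration of that distinction (hypothetical types, compiled as C++17): a prvalue return of a non-copyable, non-movable type is well-formed because elision is guaranteed, while the named-return form would still require a usable copy or move constructor even though NRVO may elide it in practice:

```cpp
#include <cassert>

// A type that can be neither copied nor moved.
struct Pinned {
    int v;
    explicit Pinned(int v) : v(v) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

// OK in C++17: the prvalue initializes the caller's object directly;
// no copy/move is even nominally performed (guaranteed elision).
Pinned makePinned() { return Pinned(42); }

// Would NOT compile: returning a *named* local still requires a
// usable copy/move constructor, even though NRVO may elide it.
// Pinned makeNamed() { Pinned p(42); return p; }
```

This is the sense in which guaranteed elision is "mandated behavior" rather than an optimization: it changes which programs are legal.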

Similarly, there may be no inlining in debug mode. Iterators and algorithms implementations may have worse complexity. This does not prevent the standard from requiring them, because the deviations can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Did you read what I said? "But a RandomAccessIterator that checks if it is in range is still O(1)." That means it's still conforming to the standard.

Constant time is constant time, even if it's a big constant.
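As a sketch of what such a checked accessor looks like (the `CheckedSpan` name is hypothetical, not a standard type): the range check is a constant amount of extra work per access, so each access is slower but still O(1), which is all the iterator requirements constrain:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// A bounds-checked random-access view (illustrative only). The range
// check adds a constant cost to every access; complexity is unchanged.
template <typename T>
class CheckedSpan {
    T* data_;
    std::size_t size_;
public:
    CheckedSpan(T* data, std::size_t size) : data_(data), size_(size) {}

    T& operator[](std::size_t i) const {
        if (i >= size_)                       // O(1) check, however slow
            throw std::out_of_range("CheckedSpan: index out of range");
        return data_[i];
    }
};
```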

Since it is not shipped as the only implementation from any vendor, I think there is no problem for particular (debugging) use by developers.

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

As parts of the implementation details not required by the language spec, knowledge of the ABI is not a premise to be exposed to most users ahead of the real functionality of a high-level language, including its capability for abstraction.

If you're caring about these sorts of things, then you're not operating at a "high-level".

I'm caring about getting rid of dependencies on these sorts of things. So at least I'm trying to get "high-level".

OK, name a high-level language that provides guarantees about the stack in the specification. Even functional languages, where recursion is standard procedure, don't provide guarantees about the stack. They often do provide proper tail recursion, but outside of tail calls, the way the stack works is whatever happens, happens.

> As of n... it is difficult (if not impossible) to know n for each case by
> both implementations and users. It is slightly easier to be determined by
> the results of trial-and-error... if the behavior is not unpredictable.

So, like I said, if you don't know what n will be, do not recurse worse than
O(log n). Rewrite your algorithm instead.

Is it ever possible without manually and routinely simulating the activation records (like constructing an extra software stack) for any case?

I am personally quite reluctant to do such work, as it violates the separation of concerns principle.

How is it a violation of separation of concerns? Avoiding resource exhaustion in an algorithm is very much part of the "concerns" for it.

Not every algorithm cares how to report the condition on exceptional paths or how to handle resource exhaustion, because it assumes that is the responsibility of something else (usually the caller).

Then the programmer behind that function has willingly said, "if this runs out of stack space, tough-luck". Which may be perfectly reasonable in some cases. But abrogating one's responsibilities doesn't mean that it's not still their responsibility.

(Yeah, like exception neutrality!) An algorithm can also specify the validity of that assumption throughout its steps as a precondition.

Sure. Giving up a responsibility means washing your hands of what happens if it's not true. You either take something into account or you're willing to suffer the consequences if you don't.

Choosing not to be concerned about something doesn't mean it's not still a concern for your function. Just like allocating some memory means that part of the concerns for your function is that there will be enough memory for you to allocate it.

Think of it like being wealthy. A $1000 may be a rounding error to you, so spending it may not seem like a "concern". But if your bank account, for whatever reason, doesn't have that $1000 when you try to spend it, you will find yourself very "concerned" by that seemingly trivial sum.

FrankHB1989

unread,
Dec 4, 2017, 9:04:43 PM12/4/17
to ISO C++ Standard - Discussion


On Monday, December 4, 2017 at 11:56:05 PM UTC+8, Nicol Bolas wrote:

On Monday, December 4, 2017 at 3:48:34 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 1:31:11 PM UTC+8, Nicol Bolas wrote:
On Monday, December 4, 2017 at 12:14:25 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 12:49:05 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).
This seems to leak details of implementations.

Then stop relying on tail calls. It is an optimization; by definition, it's not guaranteed by the specification, nor is it intended to be.

By which definition? It is never touched normatively by the C++ specification.

By the definition you just cited: it's not explicitly required, but it is allowed by the "as if" rule. Therefore, it is an optimization.
The rule provides freedom to implementations. It does not literally imply optimization; besides, such an "optimization" might actually be a pessimization of performance in several cases (e.g. when implemented directly by trampolines, unconditionally, without distinguishing tail contexts). Proper tail calls provide extra semantic guarantees, so even a pessimization is sometimes acceptable; that's also why every language spec that has this feature (that I've seen) does not treat it merely as an "optimization".
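A minimal sketch of the trampoline shape being referred to (all names hypothetical): every step returns either a final result or the next thunk to run, so the driver loop keeps automatic storage bounded at the cost of an allocation and an indirect call per step, which is the pessimization in question when applied outside tail contexts:

```cpp
#include <cassert>
#include <functional>

// Each bounce either carries a finished value or the next thunk to run.
struct Bounce {
    bool done;
    long long value;                 // valid when done
    std::function<Bounce()> next;    // valid when !done
};

// A tail-recursive summation expressed as trampoline steps: instead of
// calling itself, each step hands back a closure for the next step.
Bounce sumStep(long long n, long long acc) {
    if (n == 0) return {true, acc, {}};
    return {false, 0, [=] { return sumStep(n - 1, acc + n); }};
}

// The driver: O(1) stack depth regardless of n, since each thunk runs
// and returns before the next one starts.
long long trampoline(Bounce b) {
    while (!b.done) b = b.next();
    return b.value;
}
```

With n large enough to overflow a typical stack under direct recursion, the trampoline still completes, illustrating the semantic guarantee bought at a per-step cost.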


And where does the intention come from? I've heard someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not "optimization". (In fact, I have not seen any language specification specify such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional changes to overload resolution of special member functions?

Yes, elision's an optimization, but because it has visible side-effects, it is an optimization the standard explicitly makes allowances for. Tail optimization does not need explicit allowances for it, since it does not affect visible behavior as defined by the standard.

Elision is an optimization of performance as a result of the rules. This is intended, true. But from the spec's point of view, whether it specifies an exception to the as-if rule or not is purely a detail. Users of the language should first check whether the rules meet their requirements on correctness, not the existence of the result. That's probably why G++ does not tie this unspecified behavior to '-O' options.

Also, C++17 did not remove all instances of elision; just those involving prvalues. So named return values still have to rely on elision rules.

Yes. C++17 improves it in several cases, leading to less confusion and more usefulness, but the whole picture is more complicated.

Anyway, these points are all to explain why such features are not well served by simply being categorized as "optimizations" in the first place. If the example of altering behavior specified under the as-if rule in the current language standard does not convince you that semantic properties should come before the result of optimization when making choices, consider another one: inlining, which can also be a pessimization due to the resulting code bloat, bad translation performance, and pessimized output (such as the case warned about by Clang++'s [-Wweak-vtables]).

Similarly, there may be no inlining in debug mode. Iterators and algorithms implementations may have worse complexity. This does not prevent the standard from requiring them, because the deviations can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Did you read what I said? "But a RandomAccessIterator that checks if it is in range is still O(1)." That means it's still conforming to the standard.

I mean that, despite the lack of conformance, something non-O(1) pretending to be a RandomAccessIterator is acceptable in a few cases in reality. Likewise, although tail calls are not specified at all in the current standard, not being as safe as proper tail calls is also acceptable in a few cases.

Constant time is constant time, even if it's a big constant.

Since it is not shipped as the only implementation from any vendor, I think there is no problem for particular (debugging) use by developers.

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

As parts of the implementation details not required by the language spec, knowledge of the ABI is not a premise to be exposed to most users ahead of the real functionality of a high-level language, including its capability for abstraction.

If you're caring about these sorts of things, then you're not operating at a "high-level".

I'm caring about getting rid of dependencies on these sorts of things. So at least I'm trying to get "high-level".

OK, name a high-level language that provides guarantees about the stack in the specification. Even functional languages, where recursion is standard procedure, don't provide guarantees about the stack. They often do provide proper tail recursion, but outside of tail calls, the way the stack works is whatever happens, happens.
 
That's why I carefully differentiate the activation record from the stack in several contexts. A guarantee of the existence of the stack is not needed in general for the abstract machine (when specifying a high-level language rather than a stack machine like the JVM). Even a conceptual LIFO imposes too strong a constraint to describe behavior in the usual "functional languages", which do specify the order of specific observable behavior bound to evaluations of special constructs in the language. Such a need arises only locally, in limited language contexts, like the stack-unwinding case of C++.



> As of n... it is difficult (if not impossible) to know n for each case by
> both implementations and users. It is slightly easier to be determined by
> the results of trial-and-error... if the behavior is not unpredictable.

So, like I said, if you don't know what n will be, do not recurse worse than
O(log n). Rewrite your algorithm instead.

Is it ever possible without manually and routinely simulating the activation records (like constructing an extra software stack) for any case?

I am personally quite reluctant to do such work, as it violates the separation of concerns principle.

How is it a violation of separation of concerns? Avoiding resource exhaustion in an algorithm is very much part of the "concerns" for it.

Not every algorithm cares how to report the condition on exceptional paths or how to handle resource exhaustion, because it assumes that is the responsibility of something else (usually the caller).

Then the programmer behind that function has willingly said, "if this runs out of stack space, tough-luck". Which may be perfectly reasonable in some cases. But abrogating one's responsibilities doesn't mean that it's not still their responsibility.

(Yeah, like exception neutrality!) An algorithm can also specify the validity of that assumption throughout its steps as a precondition.

Sure. Giving up a responsibility means washing your hands of what happens if it's not true. You either take something into account or you're willing to suffer the consequences if you don't.

Choosing not to be concerned about something doesn't mean it's not still a concern for your function. Just like allocating some memory means that part of the concerns for your function is that there will be enough memory for you to allocate it.

Think of it like being wealthy. A $1000 may be a rounding error to you, so spending it may not seem like a "concern". But if your bank account, for whatever reason, doesn't have that $1000 when you try to spend it, you will find yourself very "concerned" by that seemingly trivial sum.

Surely concerns are not responsibilities. The latter are largely affected by how the problem is described. If you want to keep the invariant that sufficient resources are always properly handled, specify it in the requirements. This still does not invalidate the fact that there can be cases without such a requirement.

Nicol Bolas

unread,
Dec 5, 2017, 11:07:24 AM12/5/17
to ISO C++ Standard - Discussion
On Monday, December 4, 2017 at 9:04:43 PM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 11:56:05 PM UTC+8, Nicol Bolas wrote:

On Monday, December 4, 2017 at 3:48:34 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 1:31:11 PM UTC+8, Nicol Bolas wrote:
On Monday, December 4, 2017 at 12:14:25 AM UTC-5, FrankHB1989 wrote:
On Monday, December 4, 2017 at 12:49:05 PM UTC+8, Thiago Macieira wrote:
On Sunday, 3 December 2017 19:59:54 PST FrankHB1989 wrote:
> > If you know the maximum value of n, then you need to do a
> > platform-specific
> > check.
>
> Effectively tail optimized recursions for normal cases can be guaranteed in
> O(1) space.
>
> And even not, should the check be exposed in the language rules? Even C++
> does not require it... it's up to the user.

No, it should not. There's no tail-call optimisation in debug mode. There's
also no tail-call optimisation in some architectures depending on the
difference of parameters between the caller and callee. So assume it didn't
happen, therefore it's not O(1).
This seems to leak details of implementations.

Then stop relying on tail calls. It is an optimization; by definition, it's not guaranteed by the specification, nor is it intended to be.

By which definition? It is never touched normatively by the C++ specification.

By the definition you just cited: it's not explicitly required, but it is allowed by the "as if" rule. Therefore, it is an optimization.
The rule provides freedom to implementations. It does not literally imply optimization; besides, such an "optimization" might actually be a pessimization of performance in several cases (e.g. when implemented directly by trampolines, unconditionally, without distinguishing tail contexts). Proper tail calls provide extra semantic guarantees, so even a pessimization is sometimes acceptable; that's also why every language spec that has this feature (that I've seen) does not treat it merely as an "optimization".

I think you've kind of "lost the plot" here, so to speak.

C++ does not have proper tail calls as a language feature. As such, you aren't allowed to rely on any "extra semantic guarantees" that might come from them. You seem to think this "leaks implementation details". It doesn't, because you're not supposed to rely on it. And then you say that it's unreliable.

Um, yes. Unreliable things are things you cannot rely on.

If tail calls were implemented in C++, they still would not talk about the stack. A C++ tail call feature would declare that the parameters of the tail call would be initialized, then all local variables would be destroyed, then the call happens, with the return value object from the calling function being the return value of this function. Or something to that effect.

There would be no need to discuss the stack or the possibility of stack exhaustion.
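The non-trivial-destructor interaction mentioned earlier in the thread can be sketched like this (hypothetical names): a return expression that is syntactically in tail position still cannot be replaced as a proper tail call would require, because the local's destructor must run after the callee returns:

```cpp
#include <cassert>

// Tracks how many Guard objects are currently alive.
static int liveGuards = 0;

struct Guard {
    Guard()  { ++liveGuards; }
    ~Guard() { --liveGuards; }   // non-trivial destructor
};

int observeGuards() { return liveGuards; }

int caller() {
    Guard g;
    // Syntactically a tail position, but ~Guard() must run *after*
    // observeGuards() returns, so the caller's activation record must
    // survive the call -- it cannot be released before calling.
    return observeGuards();
}
```

Any specification of tail calls in C++ would have to say where in this sequence the local objects die, which is exactly the destructor-ordering problem that kept it out of the semantic rules.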

And where does the intention come from? I've heard someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not "optimization". (In fact, I have not seen any language specification specify such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional changes to overload resolution of special member functions?

Yes, elision's an optimization, but because it has visible side-effects, it is an optimization the standard explicitly makes allowances for. Tail optimization does not need explicit allowances for it, since it does not affect visible behavior as defined by the standard.

Elision is an optimization of performance as a result of the rules. This is intended, true. But from the spec's point of view, whether it specifies an exception to the as-if rule or not is purely a detail.

No, it's not "pure detail". Elision would not be possible without that explicit wording, since constructor and destructor calls are visible effects. Compilers are not allowed to change code in a way that visibly changes its effects.

Users of the language should first check whether the rules meet their requirements on correctness, not the existence of the result. That's probably why G++ does not tie this unspecified behavior to '-O' options.

Also, C++17 did not remove all instances of elision; just those involving prvalues. So named return values still have to rely on elision rules.

Yes. C++17 improves it in several cases, leading to less confusion and more usefulness, but the whole picture is more complicated.

Anyway, these points are all to explain why such features are not well served by simply being categorized as "optimizations" in the first place.

But they are optimizations. Your program would have the same effective behavior; it just does it faster, with less stack usage or fewer copy/moves. That's the result of the standard not providing guarantees about the stack; it doesn't say that infinite recursion is UB or not UB.

Guaranteed elision isn't an optimization anymore because it's mandated behavior. It changes the legality of code, by no longer requiring that a prvalue return type be copyable or moveable.

If the example of altering behavior specified under the as-if rule in the current language standard does not convince you that semantic properties should come before the result of optimization when making choices, consider another one: inlining, which can also be a pessimization due to the resulting code bloat, bad translation performance, and pessimized output (such as the case warned about by Clang++'s [-Wweak-vtables]).

And that's why we rely on the compiler to decide when to inline and when not to. I don't understand what point you're trying to make.

Similarly, there may be no inlining in debug mode. Iterators and algorithms implementations may have worse complexity. This does not prevent the standard from requiring them, because the deviations can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Did you read what I said? "But a RandomAccessIterator that checks if it is in range is still O(1)." That means it's still conforming to the standard.

I mean that, despite the lack of conformance, something non-O(1) pretending to be a RandomAccessIterator is acceptable in a few cases in reality. Likewise, although tail calls are not specified at all in the current standard, not being as safe as proper tail calls is also acceptable in a few cases.

No, it's not acceptable. Any such iterator is non-conforming. I don't know of a debugging RandomAccessIterator that violates the O(1) requirements of the standard. And no, getting the size from the container is not a violation.

Constant time is constant time, even if it's a big constant.

Since it is not shipped as the only implementation from any vendor, I think there is no problem for particular (debugging) use by developers.

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

As parts of the implementation details not required by the language spec, knowledge of the ABI is not a premise to be exposed to most users ahead of the real functionality of a high-level language, including its capability for abstraction.

If you're caring about these sorts of things, then you're not operating at a "high-level".

I'm caring about getting rid of dependencies on these sorts of things. So at least I'm trying to get "high-level".

OK, name a high-level language that provides guarantees about the stack in the specification. Even functional languages, where recursion is standard procedure, don't provide guarantees about the stack. They often do provide proper tail recursion, but outside of tail calls, the way the stack works is whatever happens, happens.
 
That's why I carefully differentiate the activation record from the stack in several contexts. A guarantee of the existence of the stack is not needed in general for the abstract machine (when specifying a high-level language rather than a stack machine like the JVM). Even a conceptual LIFO imposes too strong a constraint to describe behavior in the usual "functional languages", which do specify the order of specific observable behavior bound to evaluations of special constructs in the language. Such a need arises only locally, in limited language contexts, like the stack-unwinding case of C++.

(Yeah, like exception neutrality!) An algorithm can also specify the validity of that assumption throughout its steps as a precondition.

Sure. Giving up a responsibility means washing your hands of what happens if it's not true. You either take something into account or you're willing to suffer the consequences if you don't.

Choosing not to be concerned about something doesn't mean it's not still a concern for your function. Just like allocating some memory means that part of the concerns for your function is that there will be enough memory for you to allocate it.

Think of it like being wealthy. A $1000 may be a rounding error to you, so spending it may not seem like a "concern". But if your bank account, for whatever reason, doesn't have that $1000 when you try to spend it, you will find yourself very "concerned" by that seemingly trivial sum.

Surely concerns are not responsibilities. The latter are largely affected by how the problem is described. If you want to keep the invariant that sufficient resources are always properly handled, specify it in the requirements. This still does not invalidate the fact that there can be cases without such a requirement.

For any function that uses resources, having sufficient resources to do its job is very much a part of that function's concerns. Functions which allocate memory require that there is sufficient memory for them to allocate. So if you're picking which algorithm to use, how much memory the various options require can be the deciding factor.

Stack space is the same way. Given a particular problem domain, is a particular recursive algorithm likely to overflow? If the problem domain includes "N is always less than some relatively small number", an O(n)-depth recursion may be OK. If the problem domain includes an arbitrary N, then it's not OK.

There is no way that resource consumption is not part of a function's concerns or responsibilities. It simply isn't an obvious one in many cases.
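The "do not recurse worse than O(log n)" advice from earlier in the thread can be illustrated with the classic smaller-half trick (a sketch; the function name is hypothetical, and `std::nth_element` is used for a guaranteed balanced split): recurse only into the smaller partition and loop on the larger one, so the recursion depth is bounded by O(log n) regardless of input:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts a[lo, hi). Each iteration splits the range at its midpoint via
// nth_element (so a[mid] lands in its final position, with smaller
// elements to its left and larger to its right), recurses only into the
// smaller half, and loops on the larger half. Since the recursive call
// always gets at most half the range, the depth is O(log n) even in the
// worst case -- the recursion cannot exhaust the stack for large n.
void sortBoundedDepth(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    while (hi - lo > 1) {
        std::size_t mid = lo + (hi - lo) / 2;
        std::nth_element(a.begin() + lo, a.begin() + mid, a.begin() + hi);
        if (mid - lo < hi - mid) {
            sortBoundedDepth(a, lo, mid);  // recurse into the smaller side
            lo = mid;                      // iterate on the larger side
        } else {
            sortBoundedDepth(a, mid, hi);
            hi = mid;
        }
    }
}
```

This is the kind of rewrite that keeps stack consumption within the function's "concerns" without abandoning recursion entirely.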

FrankHB1989

unread,
Dec 6, 2017, 10:07:10 AM12/6/17
to ISO C++ Standard - Discussion


On Wednesday, December 6, 2017 at 12:07:24 AM UTC+8, Nicol Bolas wrote:
There are multiple issues involved. Some are related to the current standard, some are not.

The original topic is about current support. It has been answered by Ville Voutilainen - the rule that behavior is undefined when resources are not available. If that rule did not exist, implementations would have to effectively support proper tail calls to be conforming, which would lead to other problems.

As for the language feature of tail calls... I do not propose it. As I've said, I would have posted in std-proposals if I wanted to. The later discussion was all about what they are and the effect of their nonexistence on a general-purpose language, not how to get them into the current C++ standard.

The discussion has to be based on some consensus. But I think you have some fundamental misconceptions, even though I mostly agree with your other points.

First, the leakage of implementation details I talked about concerns the abstraction that can be leaked. This is uniform in most general cases, not tied to the particular language used to implement something. It is true that I can't rely on the existence of proper tail calls when I am using some specific versions of C++ (or C++ implementations), but that is not intended to be the premise even when I start to consider changing the language. In this sense, having to take care of the nonexistence of proper tail calls is leaky, because the property "implementable with proper tail calls" is mostly a language-neutral property of the design of a solution, rather than of any concrete implementation in any specific language. As said, even if C++ does not support tail calls, it is possible to implement them indirectly, e.g. by adding IRs or other implementations of general-purpose languages on top of the solution stack. This kind of solution is imperfect for many obvious reasons, so I'm a little curious how it would interact with the underlying language specification if it were possible to change.

I will later explain other reasons, along with your words quoted.

If tail calls were implemented in C++, they still would not talk about the stack. A C++ tail call feature would declare that the parameters of the tail call would be initialized, then all local variables would be destroyed, then the call happens, with the return value object from the calling function being the return value of this function. Or something to that effect.

There would be no need to discuss the stack or the possibility of stack exhaustion.

True, but just for the rules themselves.

If we are to propose new rules for the language, we have to consider how they can typically be implemented, as well as how they will interact with current implementations. We don't want another `export`.

And where does the intention come from? I've heard someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not an "optimization". (In fact, I have not seen any language specification describe such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional change to overload resolution of special member functions?

Yes, elision's an optimization, but because it has visible side-effects, it is an optimization the standard explicitly makes allowances for. Tail optimization does not need explicit allowances for it, since it does not affect visible behavior as defined by the standard.

Elision is a performance optimization resulting from the rules. This is intended, true. But from the spec's point of view, whether it is specified as an exception to the as-if rule or not is purely a detail.

No, it's not "pure detail". Elision would not be possible without that explicit wording, since constructor and destructor calls are visible effects. Compilers are not allowed to change code in a way that visibly changes its effects.

Second, you are biased about what counts as an "optimization". But I'd still leave the detailed reasoning for later.

Let me explain why I was talking about elision - it was meant to illustrate an intentional semantic change (to the "as-if" rule) in the language: such rules provide something the language would not have without them. In the case of elision, that is a (hopefully) optimized program.

Note the standard does not specify what an optimization is. It only has rules that make optimizations possible. That probably leads to the illusion that the optimization is guaranteed to exist. For example, the rules of optional elision do not forbid a conforming implementation from (insanely) "eliding" copies by replacing each call to a to-be-elided copy constructor with a random `sleep`. Would you call that an optimization?

Sure, you can still count on it being an optimization (or at least a no-op, not a deliberate pessimization compared with a program following the abstract machine semantics without elision). But that is based on the sanity, or quality, of implementations. It has nothing to do with the normative portion of the standard.

Indeed, whether a feature is considered an "optimization" or not does not directly relate with to be or not to be in the standard.

However, since I realized that in reality almost no implementation would pessimize the code in the case of elision, I knew the example could be misleading. So I complemented the point with other cases, notably inlining. There are two points:

1. Historically, the `inline` keyword is one of the most famously misunderstood features in C++. Many users once treated it purely as an optimization hint, but it is truly not one. (For example, the inlining suggestion has long been ignored by both `g++ -O0` and `g++ -O3`.) The only "standard" meaning of the keyword `inline` is its effect on the ODR.

2. Even when inlining (more specifically, inline substitution) does take effect in generating the object code, the result is not necessarily optimized (e.g. code bloat).


Users of the language should first check whether the rules meet their correctness requirements, not whether a particular result exists. That's probably why G++ does not tune this unspecified behavior via `-O` options.

Also, C++17 did not remove all instances of elision; just those involving prvalues. So named return values still have to rely on elision rules.

Yes. C++17 improves it in several cases, leading to less confusion and more usefulness, but the full set of cases is more complicated.

Anyway, these points are all meant to explain why such features should not simply be categorized as "optimizations" in the first place.

But they are optimizations. Your program would have the same effective behavior; it just does it faster, with less stack usage or fewer copy/moves. That's the result of the standard not providing guarantees about the stack; it doesn't say that infinite recursion is UB or not UB.

No. Proper tail calls are not an optimization. Unlike copy elision, you can't even expect their existence to magically make the program faster in general on a sane implementation. And without the aid of other optimizations, they would make the program almost always slower. (This is similar to inlining, except that the negative effect is easier to observe in typical cases.)

Guaranteed elision isn't an optimization anymore because it's mandated behavior. It changes the legality of code, by no longer requiring that a prvalue return type be copyable or moveable.

True. But even if the criteria of well-formedness would not be changed by the rule, it is still important enough for program semantics, by guaranteeing the identity of the object in various contexts (where copy-initialization is applicable). The point here is: whether a feature is a limitation on well-formedness or not also does not directly relate with to be or not to be in the standard. (Although I'd admit that the lack of workarounds for annoying use cases which prevent objects from being first-class is also a very strong reason to get this feature into the language design.)

If the example of altering the behavior specified by the "as-if" rule in the current language standard does not convince you that semantic properties should come before the results of optimization when making choices, consider another one: inlining, which can also be a pessimization due to the resulting code bloat, poor translation performance, and pessimized output (such as the case warned about by Clang++'s [-Wweak-vtables]).

And that's why we rely on the compiler to decide when to inline and when not to. I don't understand what point you're trying to make.

See above.
Similarly, there may be no inlining in debug mode. Iterator and algorithm implementations may have worse complexity. This does not prevent the standard from requiring them, because they can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Did you read what I said? "But a RandomAccessIterator that checks if it is in range is still O(1)." That means it's still conforming to the standard.

I mean that, conformance aside, something that is non-O(1) while pretending to be a RandomAccessIterator is acceptable in a few cases in reality. Likewise, even though tail calls are not specified at all in the current standard, calls that are not safe as proper tail calls are also acceptable in a few cases.

No, it's not acceptable. Any such iterator is non-conforming. I don't know of a debugging RandomAccessIterator that violates the O(1) requirements of the standard. And no, getting the size from the container is not a violation.

This is not a situation about conformance - it is nodoubtfully nonconforming. This is about how real products are accepted by users. I treat the case as acceptable because it does not replace the conforming ones. And since standard C++ does not support different configurations/profiles satisfactorily, it is a reasonable compromise.

Constant time is constant time, even if it's a big constant.

Since it is not shipped as the only implementation by any vendor, I think there is no problem with its particular (debugging) use by developers.

> Do you suggest that the size of activation records should be fixed for each
> run of program? Can't it be part of the free store? I see no benefits here
> besides compatibility.

That's specified in the psABI document and that's sufficient. There's no need
to specify it elsewhere.

It's also not required to be fixed.

Being part of the implementation details not required by the language spec, knowledge of the ABI is not a premise exposed to most users of a high-level language; real functionality, including the ability to abstract, comes first.

If you're caring about these sorts of things, then you're not operating at a "high-level".

I'm caring about getting rid of dependency of these sorts of things. So at least I'm trying to get "high-level".

OK, name a high-level language that provides guarantees about the stack in the specification. Even functional languages, where recursion is standard procedure, don't provide guarantees about the stack. They often do provide proper tail recursion, but outside of tail calls, the way the stack works is whatever happens, happens.
 
That's why I carefully differentiate activation records from the stack in several contexts. A guarantee of the existence of the stack is not needed in general for the abstract machine (when specifying a high-level language rather than a stack machine like the JVM). Even a conceptual LIFO imposes constraints too strong to describe behavior in typical "functional languages", which do specify the order of specific observable behavior bound to evaluations of particular constructs in the language. Such a need arises only locally, in limited language contexts, like C++'s stack unwinding.

(Yeah, like exception neutrality!) An algorithm can also specify, as a precondition, that an assumption remains valid throughout all of its steps.

Sure. Giving up a responsibility means washing your hands of what happens if it's not true. You either take something into account or you're willing to suffer the consequences if you don't.

Choosing not to be concerned about something doesn't mean it's not still a concern for your function. Just like allocating some memory means that part of the concerns for your function is that there will be enough memory for you to allocate it.

Think of it like being wealthy. A $1000 may be a rounding error to you, so spending it may not seem like a "concern". But if your bank account, for whatever reason, doesn't have that $1000 when you try to spend it, you will find yourself very "concerned" by that seemingly trivial sum.

Surely concerns are not responsibilities. The latter are largely affected by how the problem is described. If you want to keep the invariant that insufficient resources are always properly handled, specify it in the requirements. This still does not rule out cases without such a requirement.

For any function that uses resources, having sufficient resources to do its job is very much a part of that function's concerns. Functions which allocate memory require that there is sufficient memory for them to allocate. So if you're picking which algorithm to use, how much memory the various options require can be the deciding factor.

Stack space is the same way. Given a particular problem domain, is a particular recursive algorithm likely to overflow? If the problem domain includes "N is always less than some relatively small number", an O(N)-depth recursion may be OK. If the problem domain includes an arbitrary N, then it's not OK.

There is no way that resource consumption is not part of a function's concerns or responsibilities. It simply isn't an obvious one in many cases.

When it is already guaranteed by an authority, it is not. On the contrary, duplicated or invalid assumptions about responsibility do harm the concerns, because they distract people from the real focus.

This shares its root with the reason against narrowing the contracts artificially.

FrankHB1989

unread,
Dec 6, 2017, 10:15:11 AM12/6/17
to ISO C++ Standard - Discussion


On Wednesday, December 6, 2017 at 11:07:10 PM UTC+8, FrankHB1989 wrote:
Typo... it should be against "artificially widening narrow contracts".

Nicol Bolas

unread,
Dec 6, 2017, 2:36:13 PM12/6/17
to ISO C++ Standard - Discussion
No, if that rule didn't exist, there would have to be a bunch of other rules explaining how the stack works and behaves across function calls, explicitly describing when and where the stack gets allocated and deallocated. Whether proper tail calls would be part of that or not would be a detail of those rules, and is hardly required for defining such rules.

You can't just take away that statement and magically get proper tail calls.

As for the language feature of tail calls... I am not proposing it. As I've said, I would have posted it in std-proposals if I wanted to. The later discussion was all about what they are and the effect of their absence on a general-purpose language, not how to get them into the current C++ standard.

Then the discussion serves no useful purpose. We know what the effect of not having them in the standard is: you can't rely on any particular syntactic construct to be a proper tail call.

The discussion has to be based on some consensus. But I think you have some fundamental misconceptions, even though I mostly agree with your other points.

First, the leakage of implementation details I talked about concerns the abstraction that can be leaked.

It's a low-level language; leaking is gonna happen. And indeed, this particular "leak" is pretty trivial, all things considered, since every actual computer has limitations.

This is uniform in most general cases, not only when using a particular language to implement something. It is true that I can't rely on the existence of proper tail calls when I am using some specific versions of C++ (or specific C++ implementations), but that is not intended to be the premise when I have even started to consider changing the language.

It should be. The language defines what you can rely on, so if you're thinking about changing it, you first have to see what it currently says.

In this sense, having to take care of the nonexistence of proper tail calls is a leak, because the ability "to be implementable with proper tail calls" is mostly a language-neutral property of the design of a solution, rather than of any concrete implementation in any specific language. As said, even if C++ does not support tail calls, they can be implemented indirectly, e.g. by adding IRs or other kinds of implementations of general-purpose languages on top of the solution stack. That kind of solution is imperfect for many obvious reasons, so I'm rather curious how it would interact with the underlying language specification if the latter could be changed.

I will later explain other reasons, along with your words quoted.

If tail calls were implemented in C++, they still would not talk about the stack. A C++ tail call feature would declare that the parameters of the tail call would be initialized, then all local variables would be destroyed, then the call happens, with the return value object from the calling function being the return value of this function. Or something to that effect.

There would be no need to discuss the stack or the possibility of stack exhaustion.

True, but just for the rules themselves.

If we are to propose new rules for the language, we have to consider how they can typically be implemented, as well as how they will interact with current implementations. We don't want another `export`.

Such a discussion should still take place in the context of finding a way to provide tail calls, not as some nebulous talk about random stack stuff.

And where does the intention come from? I've heard someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not an "optimization". (In fact, I have not seen any language specification describe such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional change to overload resolution of special member functions?

Yes, elision's an optimization, but because it has visible side-effects, it is an optimization the standard explicitly makes allowances for. Tail optimization does not need explicit allowances for it, since it does not affect visible behavior as defined by the standard.

Elision is a performance optimization resulting from the rules. This is intended, true. But from the spec's point of view, whether it is specified as an exception to the as-if rule or not is purely a detail.

No, it's not "pure detail". Elision would not be possible without that explicit wording, since constructor and destructor calls are visible effects. Compilers are not allowed to change code in a way that visibly changes its effects.

Second, you are biased about what counts as an "optimization". But I'd still leave the detailed reasoning for later.

Let me explain why I was talking about elision - it was meant to illustrate an intentional semantic change (to the "as-if" rule) in the language: such rules provide something the language would not have without them.

And my point is that tail calls aren't like that. Tail calls don't have visible effects and therefore need not have rules in the language to allow compilers to implement them.

Now, if you want guaranteed tail calls, then just like guaranteed elision, you would need to add specific rules to make that happen. Though unlike guaranteed elision, it's not clear how that would be possible. After all, the rules of guaranteed elision are based on visible aspects of behavior. And since the stack is not visible behavior, it's not something the standard can talk about.

<snip all conversation related to the word "optimization">

Why are you so hung up on the word "optimize"? We both know what we're talking about: when compilers use the "as if" rule to emit code that is different from what you would expect, but still consistent with the behavior of C++ as defined by the standard.

If you object to the use of the term "optimize" for this, I don't care. It is a term in common usage for such things. We have defined entire terms around it: return value optimizations, tail call optimization, and so forth. Go look at compiler option descriptions and see how often they use "optimization" for things like this, whether they necessarily make the program faster or not. It is a term in common usage for these circumstances, so I see no problem with using it to describe such things.

Indeed, your insistence that "optimize" only applies to making programs run faster isn't even consistent with its accepted general meaning.
There are code size optimizations (most compilers allow you to optimize specifically for code size rather than runtime speed). There are even memory usage optimizations. So clearly, "optimize" means more than you think it does.

And the point of a tail call is that it makes you use less stack space, which is a finite resource. So it would be a stack space optimization; even if it makes the program slower, it's still a performance-vs-memory tradeoff. And BTW, even that can potentially improve performance, due to greater frequency of cache hits and needing fewer cache lines to cover the smaller stack. So even in the most restrictive meaning of the word "optimization", tail calls can still apply.

Overall, there is no need for this pedantry about whether a particular optimization makes code faster.

Guaranteed elision isn't an optimization anymore because it's mandated behavior. It changes the legality of code, by no longer requiring that a prvalue return type be copyable or moveable.

True. But even if the criteria of well-formedness would not be changed by the rule,

There is no way to have guaranteed elision work without affecting well-formedness. Or at least, no logically coherent way to do so. Once you define that the copy initialization no longer takes place, then the need for an accessible copy/move constructor goes away. And therefore, you affect well-formedness.

it is still important enough for program semantics, by guaranteeing the identity of the object in various contexts (where copy-initialization is applicable). The point here is: whether a feature is a limitation on well-formedness or not also does not directly relate with to be or not to be in the standard.

... what? "Whether a feature is a limitation on well-formedness or not" is defined by the standard. How can something defined by the standard not "directly relate with to be or not to be in the standard"?

Also, your repeated use of the phrase "directly relate with to be or not to be in the standard" is confusing.

(Although I'd admit that the lack of workarounds for annoying use cases which prevent objects from being first-class is also a very strong reason to get this feature into the language design.)

Similarly, there may be no inlining in debug mode. Iterator and algorithm implementations may have worse complexity. This does not prevent the standard from requiring them, because they can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Did you read what I said? "But a RandomAccessIterator that checks if it is in range is still O(1)." That means it's still conforming to the standard.

I mean that, conformance aside, something that is non-O(1) while pretending to be a RandomAccessIterator is acceptable in a few cases in reality. Likewise, even though tail calls are not specified at all in the current standard, calls that are not safe as proper tail calls are also acceptable in a few cases.

No, it's not acceptable. Any such iterator is non-conforming. I don't know of a debugging RandomAccessIterator that violates the O(1) requirements of the standard. And no, getting the size from the container is not a violation.

This is not the situation about conformance - it is nodoubtfully nonconforming.

Sorry, but your double-negative usage was confusing. Do you mean that you doubt it is conforming, or that you're certain that it isn't conforming?

I guess it doesn't matter because either way, you're wrong. Fetching the size of the container is an O(1) operation, and therefore it does not violate the restrictions the standard imposes on such an iterator. Therefore, it is conforming.

QED.

This is about how real products are accepted by users. I treat the case as acceptable because it does not replace the conforming ones.

If you're trying to provide an example where compilers emit non-conforming code that users frequently accept, this is not an example. If you disagree, feel free to provide evidence that accessing the size of the container is not an O(1) operation, relative to the container's size.

(Yeah, like exception neutrality!) An algorithm can also specify, as a precondition, that an assumption remains valid throughout all of its steps.

Sure. Giving up a responsibility means washing your hands of what happens if it's not true. You either take something into account or you're willing to suffer the consequences if you don't.

Choosing not to be concerned about something doesn't mean it's not still a concern for your function. Just like allocating some memory means that part of the concerns for your function is that there will be enough memory for you to allocate it.

Think of it like being wealthy. A $1000 may be a rounding error to you, so spending it may not seem like a "concern". But if your bank account, for whatever reason, doesn't have that $1000 when you try to spend it, you will find yourself very "concerned" by that seemingly trivial sum.

Surely concerns are not responsibilities. The latter are largely affected by how the problem is described. If you want to keep the invariant that insufficient resources are always properly handled, specify it in the requirements. This still does not rule out cases without such a requirement.

For any function that uses resources, having sufficient resources to do its job is very much a part of that function's concerns. Functions which allocate memory require that there is sufficient memory for them to allocate. So if you're picking which algorithm to use, how much memory the various options require can be the deciding factor.

Stack space is the same way. Given a particular problem domain, is a particular recursive algorithm likely to overflow? If the problem domain includes "N is always less than some relatively small number", an O(N)-depth recursion may be OK. If the problem domain includes an arbitrary N, then it's not OK.

There is no way that resource consumption is not part of a function's concerns or responsibilities. It simply isn't an obvious one in many cases.

When it is already guaranteed by an authority, it is not.

But that's what a contract is. Your function says, "I don't work if you pass a NULL pointer; I'm not going to bother checking to see if the pointer you pass is not NULL." The outside world that calls your function therefore agrees not to pass your function NULL pointers. If they fail to do so, then that's on them.

`std::lower_bound` says "I don't work if you pass me a range that is not partially ordered; I'm not going to check to see if the range is not partially ordered." The outside world that calls your function therefore agrees not to pass `lower_bound` a range that isn't partially ordered. If they fail to do so, then that's on them.

The same goes with resource usage. Your function (conceptually) says, "I don't work if you ask me to do more work than there is memory available for me to do it in." The outside world that calls your function therefore agrees not to call your function with so much work that it exhausts memory. And this applies just as well to stack memory as dynamic allocation.

Just because it cannot be easily quantified with a number does not mean it's not a contractual requirement. And it's hard to see how a contractual requirement isn't part of the "concerns" for a function.

On the contrary, duplicated or invalid assumptions about responsibility do harm the concerns, because they distract people from the real focus.

That "focus" becomes decidedly more "real" when your recursive algorithm that only expected to get a small N gets a really large N and the stack overflows. Just because you didn't think about some aspect of your contract doesn't mean it can't be violated.

This shares its root with the reason against artificially widening narrow contracts.

Thiago Macieira

unread,
Dec 6, 2017, 2:58:34 PM12/6/17
to std-dis...@isocpp.org
On Wednesday, 6 December 2017 11:36:12 PST Nicol Bolas wrote:
> And the point of a tail call is that it makes you use less stack space,
> which is a finite resource. So it would be a stack space optimization; even
> if it makes the program slower, it's still a performance-vs-memory
> tradeoff.

I've always thought of tail call optimisation as a runtime performance thing,
by avoiding an extra return jump on the way from the nested call. The middle
function can run the epilogue code and uninstall its call frame, but it may
still need to call (not jump) to the target function.

FrankHB1989

unread,
Dec 7, 2017, 8:10:59 AM12/7/17
to ISO C++ Standard - Discussion


On Thursday, December 7, 2017 at 3:36:13 AM UTC+8, Nicol Bolas wrote:
No, it is not a "would have to". No rules about the stack have to be added for that purpose.

You can't just take away that statement and magically get proper tail calls.

The same goes for "effectively" supporting them, even if it is not explicitly specified, or is specified under other names.

As for the language feature of tail calls... I am not proposing it. As I've said, I would have posted it in std-proposals if I wanted to. The later discussion was all about what they are and the effect of their absence on a general-purpose language, not how to get them into the current C++ standard.

Then the discussion serves no useful purpose. We know what the effect of not having them in the standard is: you can't rely on any particular syntactic construct to be a proper tail call.

No. I was talking about the effect "on a language", not only on its specification (though that is also indirectly affected). At least in this thread you don't know much about the effect of proper tail calls on pragmatics, as shown by various misconceptions about the effects of the feature (beyond merely relying on its existence).
 
The discussion has to be based on some consensus. But I think you have some fundamental misconceptions, even though I mostly agree with your other points.

First, the leakage of implementation details I talked about concerns the abstraction that can be leaked.

It's a low-level language; leaking is gonna happen. And indeed, this particular "leak" is pretty trivial, all things considered, since every actual computer has limitations.

I believe many users would not agree with "low-level language" for various reasons.

The leak is not trivial, because it is not always negligible and is difficult to resolve in any single place. I have to modify multiple existing works with different stakeholders to make everyone happy, or simply ignore the problem. But there is no guarantee that I won't be bitten by it (although it should not occur frequently).

"Every actual computer has limitations" is not a good excuse for ignoring it, because different kinds of limitations have different effects. Some of them can conveniently be ignored; some cannot.

For any project large enough, I will likely end up piling on tons of workarounds, with no simple true fix of the root cause. This is one result of the essential difficulties in engineering, starting from compromises like this one.


This is uniform in most general cases, not only when using a particular language to implement something. It is true that I can't rely on the existence of proper tail calls when I am using some specific versions of C++ (or specific C++ implementations), but that is not intended to be the premise when I have even started to consider changing the language.

It should be. The language defines what you can rely on, so if you're thinking about changing it, you first have to see what it currently says.

OK, I started this thread "to see what it currently says" in the first place, and I have seen enough. Once I start thinking about the change, the language defines what I used to rely on, not what I have to rely on. Otherwise no new guarantees could ever be added to a current language in the future. That should be a non-goal.

It can't be the premise before I'm convinced which language can sufficiently serve the particular set of needs.

Note that changing the specification of a language is not the only option; it can just be far less expensive compared to inventing a new one from scratch (when the other options are infeasible).
 
In this sense, having to take care of the nonexistence of proper tail calls is a leak, because the ability "to be implementable with proper tail calls" is mostly a language-neutral property of the design of a solution, rather than of any concrete implementation in any specific language. As said, even if C++ does not support tail calls, they can be implemented indirectly, e.g. by adding IRs or other kinds of implementations of general-purpose languages on top of the solution stack. That kind of solution is imperfect for many obvious reasons, so I'm rather curious how it would interact with the underlying language specification if the latter could be changed.

I will explain the other reasons later, quoting your words.

If tail calls were implemented in C++, they still would not talk about the stack. A C++ tail call feature would declare that the parameters of the tail call would be initialized, then all local variables would be destroyed, then the call happens, with the return value object from the calling function being the return value of this function. Or something to that effect.

There would be no need to discuss the stack or the possibility of stack exhaustion.

True, but just for the rules themselves.

If we are to propose new rules for the language, we have to establish how they can typically be implemented, as well as how they will interact with current implementations. We don't want another `export`.

Such a discussion should still take place in the context of finding a way to provide tail calls, not as some nebulous talk about random stack stuff.

Not so simple, because "stack" has multiple meanings. Remember the "stack" in "stack unwinding". Do you think it is the same as the one provided by an underlying implementation? And why do we still need the notion of a "stack" in the normative rules even though no call stack is defined by the standard?

The abstraction can be helpful for providing features like proper tail calls, but it does not necessarily live at the lower level.

I don't think "stack" is a good name to standardize in contexts other than ordered unwinding. It is biased toward a particular kind of implementation, and it has already led to confusion and ambiguity too many times. "Activation record" may be a better candidate.
 
And where does the intention come from? I've heard that someone in the committee was interested in it years ago. The main reason it was kept out of the semantic rules was that it does not interact well with non-trivial destructor calls.

If it were in the specification, it would likely be "proper tail calls", not an "optimization". (In fact, I have not seen any language specification that specifies such a feature as an "optimization".) Similarly, does copy elision (before C++17) count as an instance of optimization, despite the additional change to overload resolution of special member functions?

Yes, elision's an optimization, but because it has visible side-effects, it is an optimization the standard explicitly makes allowances for. Tail optimization does not need explicit allowances for it, since it does not affect visible behavior as defined by the standard.

Elision is a performance optimization that results from the rules. That is intended, true. But from the specification's point of view, whether it is spelled out as an exception to the as-if rule or not is purely a detail.

No, it's not "pure detail". Elision would not be possible without that explicit wording, since constructor and destructor calls are visible effects. Compilers are not allowed to change code in a way that visibly changes its effects.

Second, you have a biased view of what counts as an "optimization". I'll still leave the detailed reasoning for later.

Let me explain why I brought up elision: it illustrates an intentional semantic change (relative to the "as-if" rule) in the language, made because such rules provide something the language would not have without them.

And my point is that tail calls aren't like that. Tail calls don't have visible effects and therefore need not have rules in the language to allow compilers to implement them.

Right.

Now, if you want guaranteed tail calls, then just like guaranteed elision, you would need to add specific rules to make that happen. Though unlike guaranteed elision, it's not clear how that would be possible. After all, the rules of guaranteed elision are based on visible aspects of behavior. And since the stack is not visible behavior, it's not something the standard can talk about.
True, they are not similar in this aspect.


<snip all conversation related to the word "optimization">

Why are you so hung up on the word "optimize"? We both know what we're talking about: when compilers use the "as if" rule to emit code that is different from what you would expect, but still consistent with the behavior of C++ as defined by the standard.

Because not everything allowed by the "as-if" rule is about optimization. You were talking about equivalence classes of implementations allowed by the "as-if" rule. They are different things. Optimizations have a "direction": the inverse of an optimizing transformation is never itself an optimization, all else being equal.

 
If you object to the use of the term "optimize" for this, I don't care. It is a term in common usage for such things. We have defined entire terms around it: return value optimizations, tail call optimization, and so forth. Go look at compiler option descriptions and see how often they use "optimization" for things like this, whether they necessarily make the program faster or not. It is a term in common usage for these circumstances, so I see no problem with using it to describe such things.

Indeed, your insistence that "optimize" only applies to making programs run faster isn't even consistent with its accepted general meaning.
There are code size optimizations (most compilers allow you to optimize specifically for code size rather than runtime speed). There are even memory usage optimizations. So clearly, "optimize" means more than you think it does.

I don't remember ever making that point. "It just does it faster" were your words, not mine.

Formally defined, an optimization could be an operation based on an irreflexive, asymmetric binary relation reflecting a partial order of "more optimized". The criteria are not so simple in general; not simply "faster".

And the point of a tail call is that it makes you use less stack space, which is a finite resource. So it would be a stack space optimization; even if it makes the program slower, it's still a performance-vs-memory tradeoff. And BTW, even that can potentially improve performance, due to greater frequency of cache hits and needing fewer cache lines to cover the smaller stack. So even in the most restrictive meaning of the word "optimization", tail calls can still apply.

The point is not merely to consume less space; it is much stronger: to guarantee the least space complexity possible (typically O(1) for the applicable cases of proper tail calls in a static language; see below). The minimum complexity is essentially a property of the algorithm being described, independent of which language is used to implement it. Consider variables bound in re-entered block-like structures (in the description of the algorithm): few algorithms make their identities explicit unless they have to live across the whole algorithm, so the space they occupy is free to overlap. This implies asymptotic O(n) space complexity, where n is the number of variables dynamically allocated in the activation record corresponding to the block, and it becomes O(1) for block-scope variables in a static language. When you cannot know whether O(1) is ruled out before knowing which implementation is used, relying on the naive assumption of non-O(1) as the default already leaks implementation details (like the stack). And if it is not O(1), how much should the user expect? Insisting on the least complexity as the default strategy is far saner than accepting arbitrary, so-called non-"optimized" ones.

And you seem to have misunderstood "performance". Both time efficiency (e.g. low latency and high throughput) and space efficiency (e.g. low memory footprint) can be part of performance, depending on the measurement.
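The complexity claim can be made concrete with a pair of sketches (function names invented; whether the second actually runs in O(1) space is a property of the implementation, not something the C++ standard guarantees):

```cpp
// Not a tail call: the addition is still pending when the recursive
// call happens, so a typical implementation needs O(n) activation
// records.
long sum_rec(long n) {
    if (n == 0) return 0;
    return n + sum_rec(n - 1);
}

// The recursive call is in tail position: nothing is pending after it.
// An implementation *may* reuse the activation record, giving O(1)
// space -- many compilers do so at higher optimization levels -- but
// C++ does not require it.
long sum_acc(long n, long acc = 0) {
    if (n == 0) return acc;
    return sum_acc(n - 1, acc + n);
}
```

Both compute the same value; only the shape of the recursion differs, and that shape is what determines whether proper tail calls are even applicable.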

Overall, there is no need for this pedantry about whether a particular optimization makes code faster.

Guaranteed elision isn't an optimization anymore because it's mandated behavior. It changes the legality of code, by no longer requiring that a prvalue return type be copyable or moveable.

True. But even if the criteria for well-formedness were not changed by the rule,

There is no way to have guaranteed elision work without affecting well-formedness. Or at least, no logically coherent way to do so. Once you define that the copy initialization no longer takes place, then the need for an accessible copy/move constructor goes away. And therefore, you affect well-formedness.

it would still matter to program semantics by guaranteeing the identity of the object in various contexts (where copy-initialization is applicable). The point here is: whether a feature constrains well-formedness or not has no direct bearing on whether it belongs in the standard.

... what? "Whether a feature is a limitation on well-formedness or not" is defined by the standard. How can something defined by the standard not "directly relate with to be or not to be in the standard"?

If it is defined by the standard, it is already in the standard. There is no question of "to be or not to be", unless it is to be deprecated and/or removed.
 
Also, your repeated use of the phrase "directly relate with to be or not to be in the standard" is confusing.

(Although I'd admit that the lack of workarounds for annoying use cases, which prevent objects from being first-class, is also a very strong reason to get this feature into the language design.)

Similarly, there may be no inlining in debug mode, and iterator and algorithm implementations may have worse complexity. This does not prevent the standard from requiring them, because such implementations can be tolerated as non-conforming extensions.

No, actually; iterators and algorithms are not allowed to vary in their complexity. They're allowed to run slower. But a RandomAccessIterator that checks if it is in range is still O(1).

True, so they are not conforming. This is still accepted by many users, though.

Did you read what I said? "But a RandomAccessIterator that checks if it is in range is still O(1)." That means it's still conforming to the standard.

I mean that, conformance aside, in reality it is acceptable in a few cases to be non-O(1) while pretending to be a RandomAccessIterator. Likewise, despite tail calls not being specified at all in the current standard, not being as safe as proper tail calls is also acceptable in a few cases.

No, it's not acceptable. Any such iterator is non-conforming. I don't know of a debugging RandomAccessIterator that violates the O(1) requirements of the standard. And no, getting the size from the container is not a violation.
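To illustrate why a bounds check does not change the complexity class, here is a sketch of such an iterator over a contiguous range (a hypothetical type, deliberately incomplete, not any real library's debug iterator):

```cpp
#include <cassert>
#include <cstddef>

template <class T>
class checked_iter {
    T* cur_;
    T* first_;
    T* last_;
public:
    checked_iter(T* cur, T* first, T* last)
        : cur_(cur), first_(first), last_(last) {}

    T& operator*() const {
        assert(first_ <= cur_ && cur_ < last_);  // two pointer compares: O(1)
        return *cur_;
    }
    checked_iter& operator++() { ++cur_; return *this; }                     // O(1)
    checked_iter& operator+=(std::ptrdiff_t n) { cur_ += n; return *this; }  // O(1)
};
```

Every operation, check included, is a constant number of pointer operations; the check worsens the constant factor but leaves every O(1) complexity requirement intact.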

This is not a question of conformance - it is nodoubtfully nonconforming.

Sorry, but your double-negative usage was confusing. Do you mean that you doubt it is conforming, or that you're certain that it isn't conforming?

Reordered and expanded: it is nonconforming if it violates the requirements, and it is conforming if it does not. There should be no doubt.

Whether the time complexity is O(1) is of less concern, because it is rarely relied upon when (already) working with such nonconforming implementations; the constant factor is often more significant in that case. And consider built-in pointers used, or not used, as random access iterators: what guarantees do you get from the standard for operations on them?
 
I guess it doesn't matter because either way, you're wrong. Fetching the size of the container is an O(1) operation, and therefore it does not violate the restrictions the standard imposes on such an iterator. Therefore, it is conforming.

Fetching the size of a container was once not guaranteed to be O(1). BTW, I don't think it sane to require O(1) size() on containers. It has nothing to do with the etymology or metaphor of the abstract "container", and it has already made `std::list` less useful.

QED.

This is about how real products are accepted by users. I treat the case as acceptable because it does not replace the conforming ones.

If you're trying to provide an example where compilers emit non-conforming code that users frequently accept, this is not an example. If you disagree, feel free to provide evidence that accessing the size of the container is not an O(1) operation, relative to the container's size.

I treat a nonconforming implementation as acceptable if it can in general be configured to be conforming. For specific cases, QoI counts. A conforming implementation can also be unacceptable in this sense.

The users here are developers. They have to know to avoid iterator debugging in production environments. But they perhaps don't care about the old-ABI libstdc++ std::list at all. (AFAIK many people are still using GCC 4...)
 
(Yeah, like exception neutrality!) An algorithm can also specify, as a precondition, that an assumption remains valid throughout all of its steps.

Sure. Giving up a responsibility means washing your hands with what happens if it's not true. You either take something into account or you're willing to suffer the consequences if you don't.

Choosing not to be concerned about something doesn't mean it's not still a concern for your function. Just like allocating some memory means that part of the concerns for your function is that there will be enough memory for you to allocate it.

Think of it like being wealthy. $1000 may be a rounding error to you, so spending it may not seem like a "concern". But if your bank account, for whatever reason, doesn't have that $1000 when you try to spend it, you will find yourself very "concerned" by that seemingly trivial sum.

Surely concerns are not responsibilities. The latter are largely affected by how the problem is described. If you want to keep the invariant that such preconditions are always properly handled, specify it in the requirements. That still does not rule out cases without such a requirement.

For any function that uses resources, having sufficient resources to do its job is very much a part of that function's concerns. Functions which allocate memory require that there is sufficient memory for it to allocate. So if you're picking which algorithm to use, how much memory the various options require use can be the deciding factor.

Stack space is the same way. Given a particular problem domain, is a particular recursive algorithm likely to overflow? If the problem domain includes "N is always less than some relatively small number", an O(n) algorithm may be OK. If the problem domain includes an arbitrary N, then it's not OK.

There is no way that resource consumption is not part of a function's concerns or responsibilities. It simply isn't an obvious one in many cases.

When it is already guaranteed by authority, it is not.

But, that's what a contract is. Your function says, "I don't work if you pass a NULL pointer; I'm not going to bother checking to see if the pointer you pass is not NULL.". The outside world that calls your function therefore agrees not to pass your function NULL pointers. If they fail to do so, then that's on them.

`std::lower_bound` says "I don't work if you pass me a range that is not partially ordered; I'm not going to check to see if the range is not partially ordered." The outside world that calls your function therefore agrees not to pass `lower_bound` a range that isn't partially ordered. If they fail to do so, then that's on them.

The same goes with resource usage. Your function (conceptually) says, "I don't work if you ask me to do more work than there is memory available for me to do it in." The outside world that calls your function therefore agrees not to call your function with so much work that it exhausts memory. And this applies just as well to stack memory as dynamic allocation.

True. So I don't claim the UB rule is wrong. It is just not so useful in C++ when I need to work around the missing guarantee as portably (if not necessarily efficiently) as possible, to keep up with the remaining C++ code.

Just because it cannot be easily quantified with a number does not mean it's not a contractual requirement. And it's hard to see how a contractual requirement isn't part of the "concerns" for a function.

On the contrary, duplicated or invalid assumptions about responsibility do harm the concerns, because they distract people from the real focus.

That "focus" becomes decidedly more "real" when your recursive algorithm that only expected to get a small N gets a really large N and the stack overflows. Just because you didn't think about some aspect of your contract doesn't mean it can't be violated.

If this causes trouble, the algorithm is at fault. It should have made the management of the resources explicit, with N as a parameter specified by the interface.

FrankHB1989

unread,
Dec 7, 2017, 8:35:42 AM12/7/17
to ISO C++ Standard - Discussion


On Thursday, December 7, 2017 at 3:58:34 AM UTC+8, Thiago Macieira wrote:
On Wednesday, 6 December 2017 11:36:12 PST Nicol Bolas wrote:
> And the point of a tail call is that it makes you use less stack space,
> which is a finite resource. So it would be a stack space optimization; even
> if it makes the program slower, it's still a performance-vs-memory
> tradeoff.

I've always thought of tail call optimisation as a runtime performance thing,
by avoiding an extra return jump on the way from the nested call. The middle
function can run the epilogue code and uninstall its call frame, but it may
still need to call (not jump) to the target function.

That is valid when you have such concrete knowledge of specific implementations, but in general, no.

Proper tail calls can work wherever there exist "calls" that allow distinguishing tail from non-tail contexts. This does not rely on the existence of jumps.

For example, calls can be implemented as β-reductions of lambda abstractions (the "callees") in lambda calculi. There is simply nothing like a jump to be optimized; the rewriting rules still allow tail calls to remain well-formed. The transformation can still be considered an optimization in some cases, because it can save the memory needed to hold the terms being evaluated; however, the reduction strategy is dominant throughout.