soft and program: Digest for comp.lang.c++@googlegroups.com


Has "stack overflow" specified behavior? - 13 Updates

Paavo Helde <eesnimi@osa.pri.ee>: Dec 13 01:51AM +0200

On 12.12.2021 10:42, wij wrote:
> }

> ---
> Has "stack overflow" specified behavior?

No. Stack overflow is arguably the least specified behavior of them all.
The stack size is extremely limited (a few MB) compared to the RAM
amounts current computers have (tens of GB). There is no
standard-defined way to detect stack overflow, not to speak about
handling it. That's one reason why using stack-allocated things like
std::array needs special care, especially when writing libraries (which
need to execute in a stack of unknown size and fill-up).
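
To make the std::array point concrete, here is a minimal sketch of the
usual workaround: keep large buffers on the free store. The element count
and the function name are assumptions chosen for illustration. An 8 MiB
local std::array of the same size could exceed a typical default stack
limit, while the std::vector version only needs a few words of stack.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical size: 2 Mi ints (8 MiB where int is 4 bytes), which may not
// fit in a default-sized stack if declared as a local std::array.
constexpr std::size_t kElementCount = 2 * 1024 * 1024;

// Sums 0 .. kElementCount-1 using a heap-backed buffer.
long long sum_with_heap_buffer() {
    std::vector<int> buffer(kElementCount);      // storage comes from the free store
    std::iota(buffer.begin(), buffer.end(), 0);  // fill with 0, 1, 2, ...
    return std::accumulate(buffer.begin(), buffer.end(), 0LL);
}
```

Allocation failure here surfaces as std::bad_alloc, which a caller can
catch, unlike running out of stack.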

There are some implementation-defined ways though to survive stack
overflows, but it's not so easy. You cannot continue the program if
there is no more stack space, so the only way is to throw an exception.
Alas, there is no "throw" statement in the code, so this would be an
"asynchronous" exception appearing at a pretty random place in the code,
meaning that the compiler must cope with such exceptions, which may
easily slow down the whole program (witness the /EHa compiler option in
MSVC).

BTW, your example code is not guaranteed to cause stack overflow, it
might go into an infinite loop instead because of tail recursion, or
become a zero op by optimizing the whole t() function away, either as UB
or as a code with no effect.
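
As an illustration of the tail recursion remark, consider the sketch
below. It is a hypothetical stand-in, not the code from the original post.
Because the recursive call is the last thing the function does, an
optimizing compiler may turn it into a jump, so a very deep recursion
might run in constant stack space with optimization enabled and run out of
stack without it. Neither outcome is promised by the standard.

```cpp
// Hypothetical example function (not the original poster's t()).
// Counts n down to zero, accumulating the number of steps taken.
unsigned long long count_down(unsigned long long n, unsigned long long acc) {
    if (n == 0) {
        return acc;
    }
    // Tail call: nothing remains to be done after it returns, so the
    // compiler is allowed (but not required) to reuse the current frame.
    return count_down(n - 1, acc + 1);
}
```

A call such as count_down(1000, 0) is shallow enough to work either way;
the difference only shows up at depths that approach the stack limit.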

Juha Nieminen <nospam@thanks.invalid>: Dec 13 05:48AM

> amounts current computers have (tens of GB). There is no
> standard-defined way to detect stack overflow, not to speak about
> handling it.

That made me think: Why has neither the C nor the C++ standardization
committees ever thought of adding a standard library utility to get
the current amount of free stack space?

Sure, perhaps in some operating systems this isn't something that
programs can get, but the function in question could be optional in
that sense. For example it could return -1 to indicate "this operation
is not supported", else a non-negative value to indicate the amount
of free stack space. This could give programs at least the opportunity
to gracefully do something if running out of stack space. I think this
could be useful especially in programs that need to be as stable and
secure as possible.

(In fact, getting the amount of free (physical) RAM available to the
process could also be useful, for similar reasons. It could behave
in the same way: -1 if the operation is for some reason not supported,
else the amount of free RAM (not counting swap).)

Bo Persson <bo@bo-persson.se>: Dec 13 10:44AM +0100

On 2021-12-13 at 06:48, Juha Nieminen wrote:
> process could also be useful, for similar reasons. It could behave
> in the same way: -1 if the operation is for some reason not supported,
> else the amount of free RAM (not counting swap).)

This would be of very limited use. The result of a call from one thread
wouldn't be valid long, if the other threads allocate and free memory
already while the result is returned.

It is similar to filesystem::exists("path"). If you get a true or false
back, how long is the result valid? Nanoseconds?
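
The staleness problem is why the common advice is to attempt the
operation and handle the failure, rather than to check first. A minimal
sketch, assuming C++17 and a hypothetical helper name:

```cpp
#include <fstream>
#include <optional>
#include <string>

// Hypothetical helper: returns the first line of the file, or std::nullopt
// if the file cannot be opened. There is no separate "exists" check whose
// answer could go stale before the open.
std::optional<std::string> read_first_line(const std::string& path) {
    std::ifstream in(path);  // attempt the operation directly
    if (!in) {
        return std::nullopt; // the failure itself is the answer
    }
    std::string line;
    std::getline(in, line);
    return line;
}
```

A free-stack query has no equivalent of this pattern, since "attempt it
and handle the failure" is exactly what cannot be done portably when the
stack runs out.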

wij <wyniijj@gmail.com>: Dec 13 04:44AM -0800

On Monday, 13 December 2021 at 17:44:58 UTC+8, Bo Persson wrote:
> already while the result is returned.

> It is similar to filesystem::exists("path"). If you get a true or false
> back, how long is the result valid? Nanoseconds?

If such a function is useful, how long the value stays valid is not a problem.
The only useful cases I have are when implementing 'synchronized' thread
(not sure about the name) or the function algorithm uses stack heavily, e.g.
when the size of an object is significantly large.

Richard Damon <Richard@Damon-Family.org>: Dec 13 07:47AM -0500

On 12/13/21 4:44 AM, Bo Persson wrote:
> already while the result is returned.

> It is similar to filesystem::exists("path"). If you get a true or false
> back, how long is the result valid? Nanoseconds?

Actually, most threading libraries create threads with a FIXED sized
stack (specified in the create call or it uses a default), and that
space is all reserved for the thread stack.
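
For example, with POSIX threads the size can be requested through the
thread attributes. This is only a sketch assuming a POSIX system; the
helper and worker names are made up for illustration, and older toolchains
may need -pthread when linking.

```cpp
#include <pthread.h>
#include <cstddef>

namespace {
// Trivial thread body used only for this illustration.
void* worker(void*) { return nullptr; }
}  // namespace

// Hypothetical helper: creates and joins one thread with the requested
// stack size. Returns 0 on success, otherwise a pthread error code.
int run_thread_with_stack(std::size_t stack_bytes) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_attr_setstacksize(&attr, stack_bytes);  // fixed at creation
    if (rc == 0) {
        pthread_t thread;
        rc = pthread_create(&thread, &attr, worker, nullptr);
        if (rc == 0) {
            rc = pthread_join(thread, nullptr);
        }
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling run_thread_with_stack(1024 * 1024) asks for a 1 MiB stack; the
thread cannot grow beyond what was reserved for it.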

Only the main thread tends to have an expandable stack, which often
grows into the heap, so only the main thread would be affected by other
threads activities.

I suspect that part of the issue is that while it would normally be
possible to compute how much address space is available for the stack,
it can be more complicated to figure out if you can map usable ram into
that address space, and with the Linux over-commit issue, the straight
answer might easily be not available, or expensive to compute.

Also, this is something easy for a system to add as a system defined
function, that works for it. The fact that this isn't commonly available
seems to be a sign that it isn't really easy to provide what might be
needed, or it isn't really needed.

Manfred <noname@add.invalid>: Dec 13 05:08PM +0100

On 12/12/2021 9:42 AM, wij wrote:
> }

> ---
> Has "stack overflow" specified behavior?

Setting aside the specific example, the standard describes the behavior
of the abstract machine only, but Appendix B refers to constraints posed
by actual implementations, and that includes the "nesting levels of
compound statements" (which in turn include function bodies).
So, what you call stack overflow (an expression not found in the
standard) is in fact a possible violation of a constraint posed by the
implementation.
As a kind of constraint violation this leads to UB - specifically I'd
consider this under n4860 p4.1 clause (2.3) "If a program contains a
violation of a rule for which no diagnostic is required, this document
places no requirement on implementations with respect to that program".

With respect to the example given, n4860 p6.9.2.2 gives explicit
permission to an implementation to remove the loop, and compile the
whole program as a no-op (ref. "observable behavior").

Manfred <noname@add.invalid>: Dec 13 05:23PM +0100

On 12/13/2021 6:48 AM, Juha Nieminen wrote:

> That made me think: Why has neither the C nor the C++ standardization
> committees ever thought of adding a standard library utility to get
> the current amount of free stack space?

The more general answer is that the standard (both C and C++) describes
the behavior of an "abstract machine", and the requirement for actual
conformant implementations is to produce the same "observable behavior"
as the abstract machine.
The standard does not even mandate a stack [*]; it specifies the
language rules for function calls, scope, local variables etc. all of
which is commonly implemented via a memory stack, but that's the
implementation, not the abstract machine.
The standard then connects back to the real world in Appendix B, where
it says that limitations of finite systems may result in program
constraints that should be documented by the implementation.
In this perspective such constraints fall under the "implementation
defined" category.

> process could also be useful, for similar reasons. It could behave
> in the same way: -1 if the operation is for some reason not supported,
> else the amount of free RAM (not counting swap).)

[*] The standard talks, among other things, about "stack unwinding", thus
referring to the common concept of a stack, but only to describe the
process of destructing objects with automatic storage when leaving their
scope. So, it's a concept related to scoping and lifetime, not to the stack
memory structure.

James Kuyper <jameskuyper@alumni.caltech.edu>: Dec 13 12:13PM -0500

On 12/13/21 4:44 AM, Bo Persson wrote:

>> That made me think: Why has neither the C nor the C++ standardization
>> committees ever thought of adding a standard library utility to get
>> the current amount of free stack space?

A key factor in that decision is the fact that neither the C nor the C++
standard ever talks about the stack space, a fact that allows either
language to be implemented on systems where the concept of "stack" is
meaningless.
On operating systems where the concept is meaningful, there often is a
way to conduct such a query. For instance, on Unix-like systems, there's
getrlimit(RLIMIT_STACK, &rlim).
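
A minimal sketch of such a query, assuming a POSIX system. The wrapper
name and the -1 convention (borrowed from the proposal quoted above) are
assumptions for illustration. Note that this reports the configured soft
limit for the stack, not how much of it is currently free.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_STACK, RLIM_INFINITY (POSIX)

// Hypothetical wrapper: returns the soft stack size limit in bytes, or -1
// if the query fails or the limit is reported as unlimited.
long long stack_soft_limit_bytes() {
    struct rlimit rlim {};
    if (getrlimit(RLIMIT_STACK, &rlim) != 0) {
        return -1;  // query not supported or failed
    }
    if (rlim.rlim_cur == RLIM_INFINITY) {
        return -1;  // no finite limit to report
    }
    return static_cast<long long>(rlim.rlim_cur);
}
```

On many Linux systems this typically reports 8 MiB for the main thread,
but the value depends on the environment the process was started in.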

...
> already while the result is returned.

> It is similar to filesystem::exists("path"). If you get a true or false
> back, how long is the result valid? Nanoseconds?

There's an important difference: as a matter of the policies you use for
managing a filesystem (rather than anything enforced by the operating
system), it is not only possible, but commonplace, to be certain that a
given file, if present, will remain in existence long enough to do
whatever it is you want to do with it.
That's not the case with the amount of free stack space.

"Öö Tiib" <ootiib@hot.ee>: Dec 13 12:05PM -0800

> standard ever talks about the stack space, a fact that allows either
> language to be implemented on systems where the concept of "stack" is
> meaningless.

That feels like quite an odd argument. How can that distract anyone from
the fact that every implementation has to have some kind of storage
that is used for storing objects with automatic storage duration? Neither
standard denies that it has to exist nor that it is potentially limited.
Avoiding naming it with some shorter name does not make it
disappear from abstract machine. It may run out of available space
and the standards avoid providing any ways to estimate that space
or to handle that event.

Tim Rentsch <tr.17687@z991.linuxsc.com>: Dec 13 01:13PM -0800

> }

> ---
> Has "stack overflow" specified behavior?

The expression '++a' tries to read an uninitialized variable.
After correcting for that oversight (for example, by giving a
value to 'a' at its declaration by 'int a = 0;'), the program has
defined behavior. To be more specific, each of the operations
asked for in the program has a well-defined description of what
is to happen in the abstract machine, which means the program as
a whole has defined behavior.

Note that this conclusion is about what will take place in the
/abstract/ machine, and not about what occurs if and when the
program is run in an /actual/ machine. The C++ standard
explicitly lets executing a program in an actual machine off the
hook for running out of any kind of limited resource, including
but not limited to "stack space". Section 4.1 paragraph 2.1 of
n4860 says this:

If a program contains no violations of the rules in this
document, a conforming implementation shall, within its
resource limits, accept and correctly execute that program.

So even though the program has defined behavior, it may very
well fail due to running out of stack space when executed.
Moreover that applies to all programs, for any kind of
resource the implementation might depend on.

Short summary: the program (not counting the uninitialized
access) has defined behavior, but may fail because of stack
overflow during an actual execution.

Tim Rentsch <tr.17687@z991.linuxsc.com>: Dec 13 01:50PM -0800

> violation of a rule for which no diagnostic is required, this document
> places no requirement on implementations with respect to that
> program".

First, I think you mean Annex B, not Appendix B.

Second, Annex B never uses the word 'constraint'.

Third, Annex B is informative, not normative. Nothing it says can
change the rules governing the C++ language. (Side note: Annex B
itself says in the last sentence of paragraph 2:

However, these quantities are only guidelines and do not
determine compliance.

End side note.)

The program shown above (after fixing the problem of reading an
uninitialized variable) has defined behavior, not undefined
behavior. An execution of the program in an actual machine may
fail due to running out of stack space (or any other resource)
per section 4.1 paragraph 2.1. Despite that, what happens in the
abstract machine is well-defined, and so the program has only
defined behavior, and no undefined behavior.

Tim Rentsch <tr.17687@z991.linuxsc.com>: Dec 13 02:16PM -0800

> might go into an infinite loop instead because of tail recursion, or
> become a zero op by optimizing the whole t() function away, either
> as UB or as a code with no effect.

It's important to distinguish the two realms of abstract machine
and actual machine. In the abstract machine, the program shown
above (after fixing the problem of reading an uninitialized
variable) does have a well-defined specification, and the program
as a whole has defined behavior. Whether a program has defined
behavior or undefined behavior is determined solely by what goes
on in the abstract machine (which may depend on values read from
a file or other input device, etc, but still the question is to
be answered considering only what happens in the abstract
machine, without reference to any actual machine). Everything the
program does has a well-defined specification, and so the program
has only defined behavior, and no undefined behavior.

In an actual machine, an implementation is obliged to carry out
the abstract semantics only to the extent that the execution
does not exceed the implementation's "resource limits", which
might be anything at all, including stack space. Once such a
resource limit is exceeded, the implementation has no further
obligations, and may abort, or whatever. But that isn't the
same as undefined behavior, which depends solely on what the
standard says about operations in the abstract machine.

wij <wyniijj@gmail.com>: Dec 13 03:07PM -0800

On Tuesday, 14 December 2021 at 06:16:23 UTC+8, Tim Rentsch wrote:
> obligations, and may abort, or whatever. But that isn't the
> same as undefined behavior, which depends solely on what the
> standard says about operations in the abstract machine.

If the concept of an abstract (ideal) machine is used (this is the first time
I have heard the term), the infinite recursive call should be defined as it is
(never returning, or an infinite loop, unless 'optimized' to different
semantics), and all functions within it should be carried out successfully.
But for this ideal to be anything reasonable, there should be at least one
machine that can execute the program correctly.
If this is accepted, what should this 'actual machine' do with the infinite
recursive call?


Monday, December 13, 2021
