Tuesday, July 25, 2017

Digest for comp.lang.c++@googlegroups.com - 22 updates in 3 topics

Tim Rentsch <txr@alumni.caltech.edu>: Jul 25 10:14AM -0700


>> Point one: the concern here is not one of correctness but
>> one of resource requirements.
 
> When you exceed available machine stack space you get UB.
 
This is a common misconception, but actually it isn't right. The
abstract machine has no notion of stack, and no notion of running
out of stack space. The semantics of function calls is defined,
without regard to how much "stack space" is available, and there
is no excepting statement to make one undefined (assuming of
course the type of the call matches the type of the definition,
which is not at issue here).
 
In practical terms, the consequences of running out of stack
space may be just as severe as undefined behavior. But the
distinction is still important, because the issue is not a
language question but rather is extra-linguistic, ie, it depends
on things outside the language definition, and indeed on things
outside the implementation.
 
If you want to check this out, look at section 1 paragraph 2 in
the C standard (ie, the one that starts "This International
Standard does not specify ..."). AFAICT there is no similar set
of provisions in the C++ standard, but the C++ standard
incorporates the C standard as a normative reference, which I
believe is meant to make the same assertions for C++ as are
made for C in section 1 paragraph 2.
 
> compilers, for a new version or different options, or really
> anything (since there is no guarantee), can foil the crucial
> optimization of tail recursion.
 
Maybe you mean something different by "correctness" than I do.
What I mean is that the semantics of the language gives the
program a meaning that agrees with its desired effect. With this
meaning of correctness, the issues here are not about correctness
but about other factors.
 
Now I grant you, those other concerns are important, and can
matter just as much as having a wrong program or one that has
undefined behavior. I don't mean to dismiss them, just be clear
about where the issues lie. (And we will take up shortly how
they may be addressed.)
 
 
> How much work is it to implement such a check?
 
> It would be nice if you could provide one, to put this argument on
> a more solid foundation.
 
Let me outline a way for the simple case (but probably also the
most common) of self-call. Afterwards there may be some comments
about more elaborate cases.
 
1. When compiling, have the compiler leave generated
assembly around for the checking step.
 
2. Scan the generated assembly, doing this:
2a. When a function starts, remember its name
2b. When a function ends, remember it is over
2c. When a 'call' instruction is encounted, if
the call is to the same function as the
last one started, the call is a self-call,
and should be flagged
2d. At the end, if no self-calls have been
flagged, there are no "bad" functions.
 
I expect you can write one of these yourself, without too much
difficulty. I wrote one in awk that's about 50 lines. (I make
no apologies for my awk code, but I must confess it isn't always
the code I'm most proud of. :) You're a code hound - how about
you try coding that up (in C++ if that can be done easily) and
see how it goes?
 
There are several limitations that are worth mentioning (listed
roughly in order of increasing difficulty).
 
1. Some functions, eg quicksort, are meant to be truly recursive
(ie, have self-calls). If we want to allow that there needs to
be a way to identify those functions so they won't be flagged.
 
2. Scanning the assembly is target specific. If there are
multiple target platforms then there needs to be a scanning
program for each target (or perhaps a single program that
understands many targets).
 
3. If we want to identify more general patterns, eg foo() calls
bas() and bas() calls foo(), the analysis is more complicated.
Code to flag the more general cases certainly isn't trivial,
but I think it isn't too hard either (similar to call graph
analysis).
 
4. The scheme outlined looks at one TU at a time, but not
inter-TU mutual recursion. If it's important to identify
"bad" inter-TU mutual recursion, then the analysis needs
to be done on all the generated assemblies at once (or maybe
produce some sort of summary, which can then be combined at
the end for the inter-TU stuff).
 
5. Indirect function calls (eg, through pointer-to-function
types) don't get flagged (ie, assuming there isn't some sort of
flow analysis that figures out which actual function is called).
This could matter for C++, if we want to identify mutually
recursive virtual functions. Actually I think there is something
useful that can be done here, but for simplicity let's assume
this case is out of bounds (so no large recursion depths for
mutually recursive virtual functions).
 
It takes a lot to do all of these, but there's no reason we have
to do that. Taking just (1) and (2), that's probably another 10
lines plus maybe 15 lines per target. I should probably go off
and do an implementation for (3) rather than speculate about how
much code it would be. But even without (3) we have a very
useful tool, because direct self-call is what will come up most
often, especially as people start exploring functional/recursive
alternatives.
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 25 06:45PM +0100

On 25/07/2017 18:14, Tim Rentsch wrote:
 
> 1. Some functions, eg quicksort, are meant to be truly recursive
> (ie, have self-calls). If we want to allow that there needs to
> be a way to identify those functions so they won't be flagged.
 
Quicksort will only ever consume O(lg N) stack space worst case whilst
recursively traversing a linked list will consume O(N) stack space; an
important difference.
 
Also the ISO C++ Standard *does* recognize the existence of a machine stack.
 
Are there any other programming related issues that you are
fundamentally wrong about?
 
/Flibble
Manfred <noname@invalid.add>: Jul 25 09:36PM +0200

On 7/25/2017 7:14 PM, Tim Rentsch wrote:
> "Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com> writes:
[...]
 
> This is a common misconception, but actually it isn't right. The
> abstract machine has no notion of stack, and no notion of running
> out of stack space.
It is not a misconception. It is correct: running out of stack is the
same as running out of memory (technically it is running out of
automatic storage), which is UB.
 
[...]
 
> program a meaning that agrees with its desired effect. With this
> meaning of correctness, the issues here are not about correctness
> but about other factors.
No, the issue is about correctness: a program that runs out of stack
space (i.e. out of automatic storage memory) is just broken (I think I
already explained that). No matter if the standard does not specify (for
obvious reasons) any requirements for physical or virtual amount of memory.
 
[...]
 
> and should be flagged
> 2d. At the end, if no self-calls have been
> flagged, there are no "bad" functions.
 
[... more complications snipped]
Are you seriously stating that doing this is more clear, more robust,
and better maintainable than writing a correct non-recursive implementation?
David Brown <david.brown@hesbynett.no>: Jul 25 09:42PM +0200

On 25/07/17 19:45, Mr Flibble wrote:
 
> Quicksort will only ever consume O(lg N) stack space worst case whilst
> recursively traversing a linked list will consume O(N) stack space; an
> important difference.
 
Worst case, quicksort is linear - and will thus consume O(N) stack
space. For a simple and obvious implementation of quicksort, a
currently sorted list is the worst case. Usually, implementations are a
bit more sophisticated than that, but that is the worst case.
 
 
> Also the ISO C++ Standard *does* recognize the existence of a machine
> stack.
 
It is quite easy to take a pdf of the C++ standard (I use N3797, C++14)
and search for the word "stack". It turns up about a dozen times in the
context of "stack unwinding", and of course in the container type
"stack". The phrase "machine stack" does not exist, and "stack
unwinding" is just the name given to the process of calling destructors
for automatic objects when an exception is thrown.
 
So no, the C++ Standard does /not/ require a machine stack of any sort.
It requires a way to handle function calls, including recursive ones,
and it requires a way to track destructable automatic objects in order
to destruct them in the correct order when necessary.
 
Pretty much the only sane way to implement this is with a machine stack
(or possibly two - I know of a C implementation with separate return
stacks and data stacks, though it did not support C++). I have never
heard of a C++ implementation that does not have a machine stack. But
the standards don't require it.
 
 
If you think I am wrong, please give me the reference in the C++
standards where the machine stack is described.
 
 
(Note that the C Standard don't mention the word "stack" at all.)
 
 
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 25 09:43PM +0100

On 25/07/2017 20:42, David Brown wrote:
> space. For a simple and obvious implementation of quicksort, a
> currently sorted list is the worst case. Usually, implementations are a
> bit more sophisticated than that, but that is the worst case.
 
Obviously that is only the case for a naive implementation of quicksort;
for a non-naive implementation the space complexity is, as I said, O(lg N):
 
"For Quicksort, the combination of end- recursion removal and a policy
of processing the smaller of the two subfiles first turns out to ensure
that the stack need only contain room for about, lg N entries, since
each entry on the stack after the top one must represent a subfile less
than half the size of the previous entry." -- ROBERT SEDGEWICK
 
> It requires a way to handle function calls, including recursive ones,
> and it requires a way to track destructable automatic objects in order
> to destruct them in the correct order when necessary.
 
Stack unwinding is done on a machine stack ergo the C++ ISO Standard
recognizes the existence of machine stacks.
 
/Flibble
David Brown <david.brown@hesbynett.no>: Jul 25 10:58PM +0200

On 25/07/17 21:36, Manfred wrote:
> It is not a misconception. It is correct: running out of stack is the
> same as running out of memory (technically it is running out of
> automatic storage), which is UB.
 
It is technically a misconception - the standards do not require a stack
or have any notion of it. In practice, of course, all (AFAIK) C++
implementations use a stack for automatic allocation.
 
The standards does not, as far as I can see, make any mention of the
possibility of running out of automatic storage space. By definition,
therefore, that would be undefined behaviour (as you say).
 
> space (i.e. out of automatic storage memory) is just broken (I think I
> already explained that). No matter if the standard does not specify (for
> obvious reasons) any requirements for physical or virtual amount of memory.
 
Exceeding implementation-defined limits will always be undefined
behaviour. It is always very difficult to reason about the correctness
of this sort of thing, because it is difficult to know the sizes in
practice. I would consider it as a correctness issue, but not one that
is handled by the standards. I would say the same thing about speed -
the standards say absolutely nothing about how fast code will be, but if
a program is specified with a time limit and does not meet it on the
required target, then the code is clearly incorrect.
David Brown <david.brown@hesbynett.no>: Jul 25 11:12PM +0200

On 25/07/17 22:43, Mr Flibble wrote:
>> a bit more sophisticated than that, but that is the worst case.
 
> Obviously that is only the case for a naive implementation of quicksort;
> for a non-naive implementation the space complexity is, as I said, O(lg N):
 
Well, yes - I know that. But that is not what you wrote first, so I
think it needed a little more clarification.
 
>> order to destruct them in the correct order when necessary.
 
> Stack unwinding is done on a machine stack ergo the C++ ISO Standard
> recognizes the existence of machine stacks.
 
You snipped the bit where I asked for a reference to the section of the
C++ standard (any version you like) which defines or describes the
machine stack. /I/ certainly could not find any such section, but I am
happy to admit that there are plenty of parts of the C++ standards that
I am not very familiar with. However, section 15.2 (of N3797, C++14)
which discusses "stack unwinding" does not give the slightest indication
of there being a requirement for a "machine stack". The term "stack
unwinding" is used because that was a common term for the process, given
that real implementations /do/ use a stack - but it does /not/ imply
that there actually is a "machine stack".
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 25 11:27PM +0200

On 25.07.2017 21:42, David Brown wrote:
> space. For a simple and obvious implementation of quicksort, a
> currently sorted list is the worst case. Usually, implementations are a
> bit more sophisticated than that, but that is the worst case.
 
As I recall original Quicksort is O(n^2) worst case, which was
demonstrated by an adversarial algorithm (dynamically choosing data to
make any Quicksort bad) by one of the old Unix gurus.
 
A recent improvement being considered as new basic sorting algorithm in
Boost, is O(n) worst case.
 
Sorry I don't have the time to google up references.
 
 
Cheers!,
 
- Alf
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 25 10:29PM +0100

On 25/07/2017 22:27, Alf P. Steinbach wrote:
 
> A recent improvement being considered as new basic sorting algorithm in
> Boost, is O(n) worst case.
 
> Sorry I don't have the time to google up references.
 
One word: noise.
 
/Flibble
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 26 12:03AM +0200

On 25.07.2017 23:29, Mr Flibble wrote:
>> in Boost, is O(n) worst case.
 
>> Sorry I don't have the time to google up references.
 
> One word: noise.
 
But why are you injecting noise now?
Manfred <invalid@invalid.add>: Jul 26 01:26AM +0200

On 07/25/2017 10:58 PM, David Brown wrote:
 
> It is technically a misconception - the standards do not require a stack
> or have any notion of it. In practice, of course, all (AFAIK) C++
> implementations use a stack for automatic allocation.
My point is that the stack is memory, physical resources. It does not
matter if an implementation uses some part of memory organized as a
stack (in the conventional sense) or as anything else; if a program
exceeds the limits of such resources (on which the standard poses no
requirements) obviously it falls into UB. This is not a misconception,
it is basic (like 'abc' basic) computer science.
 
> the standards say absolutely nothing about how fast code will be, but if
> a program is specified with a time limit and does not meet it on the
> required target, then the code is clearly incorrect.
 
You get the point is about sizes, but my argument is more definite than
that:
The original question was about unprotected recursive implementation:
using it to reverse a linked list of arbitrary length - think e.g of the
case where the list comes from some external storage, a database or a
network stream.
Such an implementation, that can result in an unbound recursion depth is
definitely incorrect, even if Tim Rentsch was claiming that compiler
optimizations can fix the problem.
On the other hand, as I wrote earlier, recursion when the recursion
depth (and memory usage) is guaranteed a-priori to be bounded and within
implementation limits, /is/ correct - e.g an FFT implementation where
the sample size is fixed.
David Brown <david.brown@hesbynett.no>: Jul 25 09:33AM +0200

On 25/07/17 00:09, Mr Flibble wrote:
 
>> A class (or other kind of structuring construct) can
>> stand in for a type, but it is not the same as a type.
 
> Again you are wrong. In C++ a class is a type.
 
You seem to have missed Tim's point entirely.
 
He has said that types and classes are not /synonymous/ - they are not
the same thing. This is clearly true. There are many kinds of "type"
in C++ that are not classes - "int", "double", pointers, functions,
lambdas, etc.
 
But you are arguing as though Tim had said "classes are not types" - and
that is /not/ what he said.
 
 
I don't think Tim expressed it very well when he said "a class can
/stand in/ for a type" - I think that adds confusion (the rest of his
post was, IMHO, very good). I think it is better to say that a class
(or other structuring construct) is part of what defines a type. Other
parts include qualifiers like "const" and "volatile" - these are also
needed in determining what you can do with an object of that type.
leigh.v.johnston@googlemail.com: Jul 25 04:11AM -0700

Obviously you can't read. He explicitly stated that a class isn't type which is patently absurd.
 
/Flibble
David Brown <david.brown@hesbynett.no>: Jul 25 03:00PM +0200

> Obviously you can't read. He explicitly stated that a class isn't type which is patently absurd.
 
Instead of snipping everything, please quote from Tim's post /exactly/.
That would make it much easier to see what he did or did not
"explicitly state".
 
And you might also like to try reading his post yourself, where Tim
tries to explain to you the difference between classes and types.
Jerry Stuckle <jstucklex@attglobal.net>: Jul 25 10:21AM -0400

> Obviously you can't read. He explicitly stated that a class isn't type which is patently absurd.
 
> /Flibble
 
In that statement you are perfectly correct - as David has so repeatedly
demonstrated.
 
--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 25 05:30PM +0100

On 25/07/2017 14:00, David Brown wrote:
> "explicitly state".
 
> And you might also like to try reading his post yourself, where Tim
> tries to explain to you the difference between classes and types.
 
You are missing the point: there is no difference between classes and
types because classes are types.
 
As far as Tim's waffle about cv qualifiers are concerned: cv
qualification does not create a type from a non-type; cv qualification
changes a type from one type to another.
 
/Flibble
David Brown <david.brown@hesbynett.no>: Jul 25 09:27PM +0200

On 25/07/17 18:30, Mr Flibble wrote:
>> tries to explain to you the difference between classes and types.
 
> You are missing the point: there is no difference between classes and
> types because classes are types.
 
Are all types classes? If not, then there /is/ a difference. Even if
you disagree with everything else Tim (or I) wrote, surely you can agree
that there are types which are not classes. Thus if nothing else,
classes and types are different in the way that Toyotas and cars are
different.
 
> As far as Tim's waffle about cv qualifiers are concerned: cv
> qualification does not create a type from a non-type; cv qualification
> changes a type from one type to another.
 
I'll let Tim reply to that, if he wants to.
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 25 09:18PM +0100

On 25/07/2017 20:27, David Brown wrote:
> that there are types which are not classes. Thus if nothing else,
> classes and types are different in the way that Toyotas and cars are
> different.
 
What utter codswallop. I have no idea why you are making the absurd
leap from me asserting that classes are types to me asserting that all
types are classes.
 
A class is a type of type just as a pointer is a type of type or a
built-in scalar is type of type.
 
/Flibble
David Brown <david.brown@hesbynett.no>: Jul 25 10:43PM +0200

On 25/07/17 22:18, Mr Flibble wrote:
 
> What utter codswallop. I have no idea why you are making the absurd
> leap from me asserting that classes are types to me asserting that all
> types are classes.
 
"There is no difference between classes and types" - does that ring a bell?
 
Tim wrote "A class ... is not the same as a type", and you replied
"Again you are wrong".
 
 
I don't actually think you think all types are classes in C++. But
that's what you wrote, in your haste and desperation to yell at people
for being wrong rather than actually reading, thinking, and constructing
a useful argument.
 
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 25 09:48PM +0100

On 25/07/2017 21:43, David Brown wrote:
 
> "There is no difference between classes and types" - does that ring a bell?
 
> Tim wrote "A class ... is not the same as a type", and you replied
> "Again you are wrong".
 
Tim also wrote "A class (or other kind of structuring construct) can
stand in for a type, but it is not the same as a type." which is
erroneous: a class is a type.
 
/Flibble
David Brown <david.brown@hesbynett.no>: Jul 25 11:05PM +0200

On 25/07/17 22:48, Mr Flibble wrote:
 
> Tim also wrote "A class (or other kind of structuring construct) can
> stand in for a type, but it is not the same as a type." which is
> erroneous: a class is a type.
 
I agree that - at least in the context of C++ - classes are kinds of
types. That's what the standards say. They don't give all the
information you might need about a type - cv qualifiers can be added.
 
In a wider context, it can perhaps make sense to distinguish between
"type" and the structure of the data and its methods, as Tim said - I am
not convinced one way or the other.
 
My reason for posting is your knee-jerk "you are wrong" replies to an
interesting post covering a range of issues.
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Jul 25 04:28AM -0700

Trust in God and God's ways. Live your life for Him. Use your talents
for Him. Encourage those around you toward Him.
 
He is God Almighty, and His Son paid a heavy price for your free gift
of salvation. Hold Him in proper regard. Hold Him as He is. Ask Him
to guide you through life.
 
He will only lead you rightly. He only wants to prosper you, and those
around you.
 
Thank you,
Rick C. Hodgin
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

No comments: