Tuesday, June 25, 2019

Digest for comp.lang.c++@googlegroups.com - 10 updates in 5 topics

queequeg@trust.no1 (Queequeg): Jun 25 02:32PM


> Does this answer your question?
 
No, I'd prefer the answer from Bonita.
 
--
https://www.youtube.com/watch?v=9lSzL1DqQn0
David Brown <david.brown@hesbynett.no>: Jun 25 09:21PM +0200

On 25/06/2019 14:00, Bart wrote:
>> the stack.
 
> But a hardware stack (even if you define it as one using hardware
> support) is very commonly used. It has to be, to avoid inefficiency.
 
On many RISC architectures, there is no "hardware stack". There is no
dedicated "stack pointer" register, no "push" or "pop" instructions, no
"call" and "return" instructions using a stack. What you have is pre-
and post- increment and decrement addressing modes (equivalent of things
like "x = *p++;" or "*++p = x;"). With that, you can have as many
stacks as you want (limited by your typically 32 registers), growing
upwards or downwards. And you have a "link register" with "branch and
link" and "branch to link" instructions instead of "call" and "return".
This system is in many ways more efficient than a dedicated hardware
stack, but is less efficient in instruction size.
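
A minimal C++ sketch of that idea, assuming nothing beyond the pointer idioms quoted above: two software stacks share one buffer, one growing upwards and one growing downwards, with no dedicated stack pointer involved. The type `TwoStacks` and its member names are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration: two independent stacks in one buffer,
// built only from pre-/post-increment and -decrement pointer accesses.
struct TwoStacks {
    static constexpr std::size_t capacity = 64;

    int  storage[capacity];
    int* up   = storage;             // points at the next free slot, grows upwards
    int* down = storage + capacity;  // points at the last pushed item, grows downwards

    TwoStacks() = default;
    TwoStacks(const TwoStacks&) = delete;             // the pointers refer to our own buffer
    TwoStacks& operator=(const TwoStacks&) = delete;

    bool full() const { return up == down; }

    void push_up(int x)   { assert(!full());  *up++ = x; }           // post-increment store
    int  pop_up()         { assert(up != storage);  return *--up; }  // pre-decrement load

    void push_down(int x) { assert(!full());  *--down = x; }         // pre-decrement store
    int  pop_down()       { assert(down != storage + capacity);  return *down++; }  // post-increment load
};
```

Each push or pop is a single memory access with an address update, which is roughly what the addressing modes mentioned above provide in hardware.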
 
 
> And hardware stacks (like the ones on x86 and ARM) have certain
> characteristics, one of which is not dealing gracefully with overflows.
 
On many architectures it is possible to deal gracefully with stack
overflows using memory management units. On some architectures, there
are also limit registers for the stack pointer (though I would prefer to
see such features on many architectures).
Nathaniel <nathaniel@xor.systems>: Jun 25 03:31PM +0300

On 2019-06-24 22:25, Bonita Montero wrote:
>> but the standard says almost nothing about such issues.
 
> C and C++ allow recursions.
> And recursions aren't possible without a stack.
 
Well, it's not just recursion. Taking for instance an ARM processor or
similar hardware with a link register, without a stack it wouldn't be
possible to call two functions deep and you would need some
compiler-based mechanism for handling local (to the function) variables,
but recursion is the most definitive answer overall.
 
You could get by without a stack, but only in very simple programs
whose code paths were known in full at compile time, and you'd have to
avoid hardware instructions like enter/leave/call/ret and local
variables (requiring a customized compiler). That's not very
practical in most cases.
Nathaniel <nathaniel@xor.systems>: Jun 25 03:39PM +0300

On 2019-06-25 14:19, Bonita Montero wrote:
 
> There's nothing like a "hardware stack", it's just memory, but only
> special instructions which alleviate you pushing and popping onto
> the stack.
 
There kinda sorta is now on x86/amd64 platforms; Intel released what it
calls Control-flow Enforcement Technology (CET) which implements
hardware controls/protections for indirect branches (virtual function
calls/function pointers/et cetera) and includes the concept of a "shadow
stack" that it uses to prevent a variety of memory corruption attacks
and/or exploitation.
scott@slp53.sl.home (Scott Lurndal): Jun 25 07:53PM


>On some architectures, there
>are also limit registers for the stack pointer (though I would prefer to
>see such features on many architectures).
 
In order to make a limit register work, you need a dedicated instruction to adjust
the stack pointer - you can't use the ADD instruction as we do today (unless you
generate code to test against the limit, like Microsoft does).
 
One architecture I worked on had a dedicated instruction to advance the stack
pointer. It was generally the first instruction of a subroutine/function
when the function required stack-local storage.
 
The instruction would increment the stack pointer and fault if the stack
limit was exceeded.
 
http://vseries.lurndal.org/doku.php?id=instructions:asp
 
The stack grows from smaller addresses to larger addresses.
 
The return instruction would restore the stack pointer.
 
http://vseries.lurndal.org/doku.php?id=instructions:ret
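
As a rough model of the idea only (not of the actual semantics of the instructions documented at the links above), a checked stack pointer advance might be sketched in C++ like this. The type `CheckedStack`, its members, and the use of an exception to stand in for the hardware fault are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical model: the stack grows from smaller to larger offsets,
// and every advance is checked against a limit.
struct CheckedStack {
    std::size_t sp;     // current stack pointer, as an offset
    std::size_t limit;  // first offset that must never be reached past

    // Reserve n units of stack-local storage and return the old stack
    // pointer. Throws instead of faulting if the limit would be exceeded.
    std::size_t advance(std::size_t n) {
        assert(sp <= limit);  // invariant kept by advance() and restore()
        if (n > limit - sp) {
            throw std::overflow_error("stack limit exceeded");
        }
        const std::size_t frame = sp;
        sp += n;
        return frame;
    }

    // What a return would do: put the stack pointer back.
    void restore(std::size_t frame) {
        assert(frame <= sp);
        sp = frame;
    }
};
```

A function needing locals would call `advance` once on entry and `restore` on exit. The check happens in one place, which is the point of having a dedicated instruction rather than a plain ADD.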
"Öö Tiib" <ootiib@hot.ee>: Jun 25 12:33PM -0700

On Sunday, 23 June 2019 20:52:32 UTC+3, Tim Rentsch wrote:
> Tiib <ootiib@hot.ee> writes:
 
Snipping a bit.
 
> > For example .NET is defined to throw System.StackOverflowException
> > when it runs into stack limit. So it can't do "anything" unlike C++.
 
> I don't see how that has any bearing on this discussion.
 
It is an example of how exceeding resource limits can have
defined behavior. In C and in C++ we may use non-portable,
inconvenient and "undefined behavior" techniques to achieve
the same effect. Or we may just ignore it and let the stuff crash
and burn. That does not feel responsible in a world where the vast
majority of software in cranes, lifts, vehicles, medical
instruments and so on is written in C and/or C++.
 
> tolerate, but the unsigned addition is defined (that is, has
> defined behavior) regardless of whether that physical limit has
> been reached.
 
Yes, but the resource limits are always there and so most
software is expected to deal with those. Uncontrolled heat
buildup, however, is usually not part of the use-cases. When it
is, then for example a system that has to ventilate a room
can be configured to keep doing that until it melts if it detects
a conflagration in that room.
 
> > misunderstand what aspect of it you have in mind.
 
> Probably the most important is that "undefined behavior" as you
> envision the term would be either meaningless or useless.
 
No, it is not. Undefined behavior means that the standard does not
require anything from the implementation in a certain situation. It
does not mean that something else (for example that
implementation) may not define it. However, the particular case
of resource limits seems badly handled.
 
 
> I understand that that is a consequence of your worldview. I
> don't see the point of adopting a worldview that renders the term
> "undefined behavior" meaningless or useless.
 
For me "undefined behavior" is a very useful term. It means that
the C++ (or C) standard guarantees nothing in the situation.
It is evil "undefined behavior" when the situation cannot be avoided
and other documents and/or standards also don't provide any
guaranteed way out of it, but it is not useless "undefined behavior".
 
> what the standards say is ambiguous. In many or even most cases,
> to resolve these ambiguities we need to reason about what meaning
> was intended. Yes?
 
I merely meant that I do not know the intentions behind why some
often inevitable situations (like exceeding resource limits) are
undefined behavior while some easily stoppable situations (like
throwing from a noexcept function) are defined behavior. So I cannot
comment on those specific intentions.
 
Only when I have read about the intentions behind (some other
half-ambiguous or odd at first glance) things in the standard from
public discussions between committee members can I
comment.
 
 
> > It feels logically orthogonal if machine is abstract or actual
> > and if resources of it are limitless or limited.
 
> I don't know what you mean by this sentence.
 
I meant that "abstract machine" does not somehow imply that
it is "endless" or "limitless" unless specially stated. Is there
some kind of grammatical issue in how I worded it?
 
> limit applies. But if there is a definition for a particular
> behavior, and the defining passages do /not/ state a limit,
> then the definition applies unconditionally.
 
We are again back at the start, I feel. Resource limits are also in the
C++ standard as a general concept, right in [intro]:
 
"If a program contains no violations of the rules in this International
Standard, a conforming implementation shall, *within* *its*
*resource* *limits*, accept and correctly execute that program."
 
And the informative annex [implimits] leaves it quite vague how
implementations should aid us in avoiding exceeding those
limits.
 
So those *are* stated *and* left to be source of "undefined
behavior" and it seems to be intentional.
"Öö Tiib" <ootiib@hot.ee>: Jun 25 09:46AM -0700

On Monday, 24 June 2019 14:28:50 UTC+3, Juha Nieminen wrote:
> probably very rare to need a shared pointer to a compile-time array that's
> being nevertheless being allocated dynamically at runtime. I suppose it's
> not inconceivable, but rare.
 
That likelihood perhaps depends on the problem domain for which we
write applications. In the domains I have written for, the upper
limits are often clear, and shared ownership of virtually unbounded
arrays feels unusual in the extreme.
 
> array of values and pretty much nothing else from it (other than being
> able to index it), why add needless overhead by using a dynamically
> allocated std::vector?
 
Because it would simplify life with a (practically unbounded?) shared
array. If it is a potentially major, multi-megabyte array, then I would
likely make std::vector<Element> a data member of a separate class, and
so what we talk about here is std::shared_ptr<ThatSeparateClass> and
not shared_ptr<Element[]>.
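
A minimal sketch of that arrangement, with every name invented for illustration (`Element`, `ElementTable` standing in for "ThatSeparateClass", and the helper `make_table`):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Element { int value; };  // hypothetical element type

// Hypothetical stand-in for "ThatSeparateClass": it owns the vector and
// exposes only what the sharing code needs.
class ElementTable {
public:
    explicit ElementTable(std::vector<Element> elements)
        : elements_(std::move(elements)) {}

    std::size_t size() const { return elements_.size(); }

    const Element& operator[](std::size_t i) const {
        assert(i < elements_.size());
        return elements_[i];
    }

private:
    std::vector<Element> elements_;
};

// Builds a table of n elements and hands out shared, read-only ownership.
std::shared_ptr<const ElementTable> make_table(std::size_t n) {
    std::vector<Element> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i].value = static_cast<int>(i);
    }
    return std::make_shared<ElementTable>(std::move(v));
}
```

Unlike shared_ptr<Element[]>, the shared object carries its own size, and the class can later grow an interface without touching the code that shares it.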

> std::fread(), for instance, you often need a temporary array to
> read into. If you are reading the *entire* file into the array,
> the size of that array is only known at runtime.
 
The particular use-case (unless I misunderstood) is sequential
std::fread() from potentially large file or from unknown length
stream?
 
My experience is that reading in 4096-byte chunks is close to an
optimal default (so the length does not need to be dynamic) and
the same buffer can be reused for the whole file/stream (so its
allocation does not need to be dynamic either). IOW std::array is
usually splendid for that. The run-time data structure into which
the file is read can be anything but usually has finer
organization than (potentially) large immutable scoped
arrays.
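
A small sketch of that pattern, assuming the file is consumed sequentially and only a running result is kept. The function name `count_newlines` and the choice of counting newlines are made up for the example; only the fixed std::array buffer and the std::fread loop reflect the approach described.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Counts '\n' characters in a stream, reading it in fixed 4096-byte chunks.
// The same buffer is reused for every chunk, so nothing is allocated
// dynamically regardless of how large the stream is.
std::size_t count_newlines(std::FILE* in) {
    assert(in != nullptr);

    std::array<char, 4096> buffer;
    std::size_t lines  = 0;
    std::size_t filled = 0;  // how "full" the buffer is after each read

    while ((filled = std::fread(buffer.data(), 1, buffer.size(), in)) > 0) {
        for (std::size_t i = 0; i < filled; ++i) {
            if (buffer[i] == '\n') {
                ++lines;
            }
        }
    }
    return lines;
}
```

The loop ends on a short read of zero bytes, which covers both end of file and a read error. A caller that needs to tell them apart can check std::ferror afterwards.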
 
> probably don't need the array initialized because you are filling it
> with data anyway), with little to no drawbacks, if you don't need
> anything that std::vector provides.
 
My starting point is one 4-8 KB std::array (+ one integer to
indicate how "full" it is) for sequentially reading one file.
It typically remains that regardless of whether the file is small or
multi-megabyte. In the ~5% of cases where that is not performant
enough I go all the way to the best that is available (like mmap),
and for managing that it is better to have a special class (not to
hack unique_ptr<char[]> somehow). OTOH if we talk about hundreds of
megabytes of data then there are database engines for that.
 
 
> Besides it being non-standard, you'll be burdening the stack with
> a potentially very large amount of data. It would be bad if you
> run out of stack space.
 
We were talking about allocations/deallocations in such tight sequence
that it alters performance. It means there are a lot of relatively small
(not virtually unbounded) buffers involved? Is it uncertain to the
software designer whether what the program is dealing with is small or
potentially huge data?
 
Take projectiles as an analogy. Buckshot cartridges are better for
small game and centre-fire cartridges are better for heavy game,
but for a tank we need mortar shells. "Silver bullets" that are optimal
for all cases do not exist, so if the target is potentially a rabbit but
potentially also a tank then we perhaps have to carry all three weapons
and choose dynamically what we shoot it with. Now there is also the bow,
with which small game is hard to hit, heavy game is dangerous
to wound, and which is hopeless against a tank.
 
Same with software: small std::arrays (possibly on the stack) are better
for small data and medium data is better in containers like
std::unordered_map, but for large data we likely need a database
engine (that deals with memory mapping and indexing internally).
I can imagine how software can pick dynamically between those
three choices, but std::unique_ptr<x[]> feels like a bow there. ;)
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 25 05:39PM +0200

On 25.06.2019 10:05, Ralf Goertz wrote:
> do something more sophisticated than the naïve way?
 
> ¹Clarke, L. E. and Singer, James: On circular permutations, The American
> Mathematical Monthly (65), 1958, 609--610.
 
In Windows 10 with Visual C++ 2019 and g++ 8.2 the following code
consistently produces garbage output:
 
 
#include <cppx-core/all.hpp> // <url: https://github.com/alf-p-steinbach/cppx-core>

using namespace std;

auto main() -> int
{
    string permuter = "Blah12345678901234567890"; // More than short string buffer.
    string_view const tester = permuter + permuter.substr( 0, permuter.size() - 1 );
    cout << tester << endl;
}
 
 
However, since it's UB it's not /guaranteed/ to produce garbage.
 
The `basic_string::find` is just a naïve direct find algorithm. More
advanced text search algorithms were added as freestanding functions in
C++17. See overload (5) of `std::search` at <url:
https://en.cppreference.com/w/cpp/algorithm/search>. But this newfangled
stuff isn't necessarily available with any particular compiler yet.
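
A hedged sketch of how the two points might combine, assuming the task is to test whether one string is a circular permutation (rotation) of another and that a C++17 standard library with `std::boyer_moore_searcher` is available. The function name `is_rotation` is made up for the example. The doubled text is kept in a `std::string` rather than a `string_view`, so it owns its characters and nothing dangles.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>

// True if `candidate` is a rotation of `original`.
bool is_rotation(const std::string& original, const std::string& candidate) {
    if (original.size() != candidate.size()) {
        return false;
    }
    if (original.empty()) {
        return true;  // two empty strings are trivially rotations of each other
    }

    // Every rotation of `original` appears as a substring of
    // original + original minus its last character.
    const std::string doubled =
        original + original.substr(0, original.size() - 1);
    assert(doubled.size() == 2 * original.size() - 1);

    // The searcher overload of std::search, with a Boyer-Moore searcher
    // built from the pattern.
    const auto hit = std::search(
        doubled.begin(), doubled.end(),
        std::boyer_moore_searcher(candidate.begin(), candidate.end()));
    return hit != doubled.end();
}
```

For example, `is_rotation("abcd", "cdab")` holds because "cdab" occurs inside "abcdabc". Whether the searcher actually beats `basic_string::find` depends on the pattern length and the library, so it would need measuring.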
 
 
Cheers!,
 
- Alf
queequeg@trust.no1 (Queequeg): Jun 25 02:39PM


> Will you please stop posting the same thing over and over?
 
It's not the same thing. He corrected. You need to read again.
 
;)
 
--
https://www.youtube.com/watch?v=9lSzL1DqQn0
Bonita Montero <Bonita.Montero@gmail.com>: Jun 25 05:27PM +0200

> Will you please stop posting the same thing over and over?
 
He is manic, and I bet my right hand that he is manic-depressive.
In a manic phase such people are unconvincible.