Wednesday, February 17, 2021

Digest for comp.lang.c++@googlegroups.com - 21 updates in 3 topics

Manfred <noname@add.invalid>: Feb 17 06:35PM +0100

On 2/17/2021 11:57 AM, David Brown wrote:
>>> larger difference but are only rarely applicable). When you start
>>> saying "we'll make this change in the language because people get it
>>> wrong", where do you stop?
That's not the point; it is in fact the other way around. It is the C++
committee who decided to make a change in the language, because they
decided that people were getting something wrong - despite that something
being a very basic language feature, brought up as an example even by
Bjarne himself in his book.
 
>>> Should you also make signed overflow
>>> defined, because some people think it is? Should you add checks for
>>> pointers being non-zero before dereferencing them, because some people
>>> get it wrong and many of the checks can be optimised away?
 
Again, this is the attitude taken by the C++ committee, and it is wrong.
It is they who changed the language to accommodate the opinion of part of
the audience.
 
> specification when it was written. If you add semantics and reduce
> optimisation, code that was fine before now runs less efficiently.
> Neither is good.
 
This is why I believe the std::launder change is an example of bad design:
In the beginning there was type punning, unions, and the "effective
type" rules of C, i.e. the rules of pointer casts.
Then C++ addressed this by adding a number of cast operators designed
for the very purpose of making the semantics of these rules more explicit.
 
The bottom line is that pointer conversion is strictly coupled with dynamic
memory allocation - and I think you agree that dynamic memory is a core
feature of C++. malloc /is/ useful in C++ too.
And C++ did "figure out where it stands on this kind of thing at an
early stage".
The fact is that the C++ committee decided, only 30+ years after this
early stage, to turn the whole thing upside down. Not smart, IMO.
 
 
 
>>> Casting pointers is /dangerous/. It is lying to the compiler - it is
>>> saying that an object has one type, but you want to pretend it is a
>>> different type.
/Careless/ casting is dangerous. Bjarne knew about it and he addressed
the problem.
It's not lying unless you misuse it, and in C++ you need to exercise
some gymnastics to achieve that.
On the other hand, if you want to do anything with a piece of memory
returned by malloc you /need/ to cast the pointer you get (the same is
substantially true for new char[]).
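
E.g. the everyday pattern, as a minimal sketch (make_buffer is just an
illustrative name):

#include <cstdlib>

// malloc hands back void*, so a cast is unavoidable before the memory is usable.
int* make_buffer(std::size_t n) {
    return static_cast<int*>(std::malloc(n * sizeof(int)));
}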
 
>>> Many other programming languages don't allow anything
>>> equivalent to such conversions.
And these programming languages are far less powerful than C++ (and C),
to the point that the most popular of them would just not work without C
or C++, e.g. Java and C#.
 
>>> However, it can be useful on occasion
>>> understand the issues. The same applies in C++ - std::launder is likely
>>> to find use in implementing memory pools and specialist allocators, not
>>> in normal application code.
 
I'm not that convinced by this ivory tower argument.
I do get annoyed by broken code when I see it written by some incompetent
keypusher, and I dislike it when some major software vendor
advertises their new programming language being "easy to use" as its
primary selling point, but I don't think that making C++ more convoluted
or fragmented is a solution to that.
James Kuyper <jameskuyper@alumni.caltech.edu>: Feb 17 01:26PM -0500

> On Wed, 17 Feb 2021 11:22:36 -0500
> James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
...
> {
> char s[5];
 
> uint32_t *i = (uint32_t *)s;
 
That's not guaranteed to be misaligned. It would make a better example
to use
 
_Alignas(_Alignof(uint32_t)) char s[5];
uint32_t *i = (uint32_t*)(s+1);
 
That would guarantee misalignment on any implementation where
_Alignof(uint32_t) > 1.
 
> *i = 123;
> printf("addr = %p, val = %u\n",i,*i);
 
The values corresponding to a format specifier of %p are supposed to
have a type of void*, otherwise the behavior is undefined. On
implementations where all pointers have the same representation, which
probably includes every implementation you've ever used, that's
generally not a problem: the implementation defines the behavior that
the standard leaves undefined, in precisely the way you presumably
thought it was required to be defined.
 
However, on implementations where the problem I described can occur,
sizeof(void*) will be larger than sizeof(uint32_t*), so the %p
specifier will be looking for more bytes than i actually provides, and
will generally misinterpret something else as supplying the missing
bytes.
 
> gondor$ a.out
> addr = 0x7ffe017f3873, val = 123
 
> That int looks nonaligned to me.
 
For the reasons given above, on an implementation which can have the
problem I described, the value printed out is meaningless. However, even
without that problem, the meaning of the string printed out with "%p" is
implementation defined. The way in which you can determine whether or
not the pointer is correctly aligned can differ from one platform to
another. That might seem a purely pedantic issue to worry about, but you
can avoid it completely by using printf("%p : %p\n", (void*)s, (void*)i).
If you're using an implementation where the problem I described can
occur, and if you made the other changes I suggested above, those two
pointers must differ.
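
Putting those suggestions together, a test program might look roughly like
this sketch (written with the C++ spellings alignas/alignof, this being
comp.lang.c++; the dereference is deliberately left out, since that part is
what's undefined):

#include <cstdint>
#include <cstdio>

int main() {
    alignas(std::uint32_t) char s[5] = {0};
    // On a word-addressed machine this conversion may already lose the byte offset.
    std::uint32_t *i = reinterpret_cast<std::uint32_t *>(s + 1);

    // Print everything through void*, as %p requires.
    std::printf("s = %p, s+1 = %p, i = %p\n",
                static_cast<void *>(s),
                static_cast<void *>(s + 1),
                static_cast<void *>(i));
    return 0;
}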
 
Are you claiming that you just compiled this code using such an
implementation of C? If so, which implementation is it, and what is the
target platform? In particular, what are the values of
_Alignof(uint32_t), sizeof(uint32_t*), and sizeof(char*)?
Manfred <noname@add.invalid>: Feb 17 07:43PM +0100

On 2/17/2021 5:30 PM, David Brown wrote:
> movl 1(%rdi), %eax
> ret
 
> Exactly which cache lines are "evicted" by the use of memcpy here?
 
I think this still does not solve the issue at the language level.
The standard does not mandate the behaviour that is shown by your
example, so even if compilers do compensate for inefficiency of the
code, this does not make the language good.
 
Looking at your first and second examples above, there is no reason why
one should prefer getb2 over getb1; in fact getb2, as written, is
less efficient than getb1 because it introduces some unneeded extra
storage and an extra function call - albeit at the level of the abstract
machine, with no benefit in readability or robustness against bugs.
If the language somehow requires the use of getb2 instead of getb1, I see
this as an inefficiency in the language.
The fact that the compiler remedies this by applying some operation that
is hidden from the language specification does not make the language
itself any more efficient.
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Feb 17 11:27AM -0800

>>beginning of a word to a pointer type that can only represent positions
>>at the beginning of a word CANNOT result in a pointer to the same location.
 
> Really?
 
Yes, really.
 
> gondor$ a.out
> addr = 0x7ffe017f3873, val = 123
 
> That int looks nonaligned to me.
 
James was talking about "a pointer type that can only represent
positions at the beginning of a word", something that doesn't exist on
the implementation you're using.
 
Imagine an implementation on which machine-level addresses point to
32-bit words, and byte pointers (CHAR_BIT==8) are constructed in
software by adding bits describing the byte offset within the word. On
such an implementation, a uint32_t* pointer value cannot refer to
anything other than an entire 32-bit word. Your program would behave
differently on such an implementation.
 
I've worked on such implementations (except that the word size was 64
bits).
 
On the implementation you're using, a uint32_t* pointer value *can*
point to an odd address, and such a pointer can be dereferenced
successfully.
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
David Brown <david.brown@hesbynett.no>: Feb 17 09:19PM +0100

On 17/02/2021 18:35, Manfred wrote:
 
> Again, this is the attitude taken by the C++ committee, and it is wrong.
> It is they who changed the language to accommodate the opinion of part of
> the audience.
 
std::launder is not a new feature in the sense of a change to the language -
it is a way to write code that is correct according to the way the C++
memory model has worked since it was defined for C++03. The mistake is not
adding std::launder in C++17 - the mistake was not including it in
C++03, or not finding a memory model that did not need such a feature.
 
So it is /not/ the language that has changed - this is a feature that
has been needed (but only in rare situations) since C++03, that has
finally been added.
 
 
Why is std::launder needed? Let's take an example (assuming I have
understood the details correctly):
 
#include <new>
 
struct X {
    int a = 1;
    virtual int foo() { return a + 2; }
};

void foof(X& x);

int foobar1() {
    X x;
    int f = x.foo();
    foof(x);
    int g = x.foo();
    return f + g;
}

int foobar2() {
    X x;
    int f = x.foo();
    foof(x);
    X& y = *std::launder(&x);
    int g = y.foo();
    return f + g;
}
 
When compiling foobar1(), the compiler knows that the function "foof"
cannot replace "x" with a new object and still access it through "x" -
that would be breaking the C++ memory model. That means it can be sure
that any "const" fields, including the vtable, are unchanged by "foof".
This lets it generate significantly more efficient code by
devirtualizing and then inlining the call. It can compile foobar1() as
though it were:
 
int foobar1() {
    X x;
    foof(x);
    return x.a + 5;
}
 
If foof() used placement new to put a new object in x (such as a type
that inherits from X but is the same size, with a new implementation of
foo), then the programmer would want that new foo() to be called. The
best way to do this would be for "foof" to return the result of the
placement new - which is a pointer to the same memory, but known to
point to a different object. However, that's not always convenient. So
std::launder tells the compiler that the object may have changed. The
compiler thus cannot do the same kind of optimisation. gcc implements
it roughly as though the programmer had written:
 
int foobar2() {
    X x;
    foof(x);
    // Illustrative pseudo-code - the lookup below is not expressible in real C++:
    auto p = dynamic_address_of_foo(&x);   // the foo() the object would dispatch to
    if (p == &X::foo) {
        return x.a + 5;        // devirtualized and inlined path
    } else {
        return (x.*p)() + 3;   // fall back to an indirect virtual call
    }
}
 
 
Devirtualization, fully or partially, is a /big/ optimisation for C++
classes with virtual functions. It only works well because the C++
memory model (from C++03, not C++17) limits what you can do.
std::launder gives you a way to be more flexible for unusual cases.
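
For concreteness, a foof() that triggers this might look roughly like the
following sketch (Y is a made-up type; this reuses the X above, and <new> is
already included there):

struct Y : X {
    int foo() override { return a + 10; }
};

void foof(X& x) {
    static_assert(sizeof(Y) == sizeof(X) && alignof(Y) == alignof(X),
                  "the replacement object must fit x's storage exactly");
    x.~X();
    ::new (static_cast<void*>(&x)) Y{};
    // The caller must now reach the new object through std::launder(&x) - or
    // through the pointer placement new returned - before calling foo() again.
    // (A fully correct example would also have to deal with x's implicit
    // destruction at the end of its scope; that detail is omitted here.)
}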
 
> type" rules of C, i.e. the rules of pointer casts.
> Then C++ addressed this by adding a number of cast operators designed
> for the very purpose of making the semantics of these rules more explicit.
 
No, it did not. The cast operators make casts clearer and make it more
obvious what that cast does or does not do. But they don't make the
type aliasing rules clearer or more explicit, and they don't change them
significantly compared to C. In particular, if you have a "float*" that
points to a float, there is no cast operator that will turn that into an
"int*" that can be used (in a fully defined manner) to read the float
object's memory as though it were an int.
 
 
> The bottom line is that pointer conversion is strictly coupled with dynamic
> memory allocation - and I think you agree that dynamic memory is a core
> feature of C++. malloc /is/ useful in C++ too.
 
malloc can be used in C++ in the same way as in C. And as in C, the
memory returned by malloc has (AFAIUI) no type until you use it. So
there is no problem.
 
> early stage".
> The fact is that the C++ committee decided, only 30+ years after this
> early stage, to turn the whole thing upside down. Not smart, IMO.
 
They have clarified it, and added flexibility that was missing. The
language hasn't changed here since C++03.
 
What /has/ changed, is that compilers have gained optimisations that
take advantage of decisions made 18 years ago (or more), and which
perhaps people have misunderstood in the meantime.
 
The question of whether compilers should continue to do what some people
thought they were supposed to do, or whether they optimise more based on
what the /standards/ say they should do, is a difficult one. But that
is the question to be asking here - not whether the committee should
have added std::launder or not.
 
(You can, of course, ask whether std::launder was the best way to
implement the additional flexibility.)
 
> On the other hand, if you want to do anything with a piece of memory
> returned by malloc you /need/ to cast the pointer you get (the same is
> substantially true for new char[]).
 
You can cast the return value from malloc() and use it - that is an
entirely reasonable (indeed, essential) use of casting pointer types.
You can't do the same with memory returned by "new char[]".
 
You can't do it in C either. I'd like a std::launder equivalent in C to
be able to handle this.
 
(When I say "you can't do this", I mean the standards don't define a
particular behaviour for it. Particular compilers might, perhaps using
extensions, or they might simply give you the code you expect even
though it is not guaranteed by design of the language or specification
of the compiler.)
 
> advertises their new programming language being "easy to use" as its
> primary selling point, but I don't think that making C++ more convoluted
> or fragmented is a solution to that.
 
I'm not going to argue that the C++ memory model here is the best
choice, or that the ideal balance has been found between a compiler's
opportunities for optimisation and the programmer's expectation that
code works the way it looks like it works. I'm always happier if
mistakes in the code lead to compile-time failures, or at least compiler
warnings - having to remember subtle things like std::launder in certain
types of low-level code is not great.
 
But complaints and blame should be appropriate. std::launder was not
added so that previously safe code would now be unsafe without it - it
was added so that previously unsafe code could now be written safely.
mickspud@potatofield.co.uk: Feb 13 04:39PM

On Sat, 13 Feb 2021 14:57:19 +0100
>cache friendly memory layouts. And of course compilers can offer
>options or extensions to give different alignment (and thereby padding)
>arrangements, even if that breaks the platform's ABI.
 
#pragma pack is absolutely essential if you're mapping a structure directly
onto a memory block, or serialising a block into a structure such as a network
packet header, or working in a device driver: it lets you tell the compiler
exactly how you want the padding, if any, handled.
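
A typical use looks something like this (a sketch; #pragma pack(push/pop) is a
compiler extension - MSVC, gcc and clang all take it - not standard C++):

#include <cstdint>

#pragma pack(push, 1)            // no padding between members
struct PacketHeader {
    std::uint8_t  version;
    std::uint16_t length;
    std::uint32_t sequence;
};
#pragma pack(pop)

// Packed: 1+2+4 = 7 bytes; the default layout would be 8, with a padding
// byte inserted before `length`.
static_assert(sizeof(PacketHeader) == 7, "header must match the wire format exactly");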
David Brown <david.brown@hesbynett.no>: Feb 17 09:35PM +0100

On 17/02/2021 19:43, Manfred wrote:
 
>>> Except for any cache line(s) evicted as a result of the memcpy, which
>>> may indeed have an overall performance cost.
 
> I think this still does not solve the issue at the language level.
 
What issue? The "cache lines" Scott referred to don't exist at the
"language level".
 
> The standard does not mandate the behaviour that is shown by your
> example, so even if compilers do compensate for inefficiency of the
> code, this does not make the language good.
 
That is correct. The standard only mandates that they will work with
the same effect (if the non-standard pragma were removed).
 
> less efficient than getb1 because it introduces some unneeded extra
> storage and an extra function call - albeit at the level of the abstract
> machine, with no benefit in readability or robustness against bugs.
 
In these examples, getb1() is the simplest and clearest. Consider
instead that we had:
 
#include <cstdint>
#include <cstring>

int32_t getb1(const void* p) {
    return *(const int32_t *)p;
}

int32_t getb2(const void* p) {
    int32_t x;
    memcpy(&x, p, sizeof x);
    return x;
}
 
(and so on for the others).
 
The results from gcc are the same - a single "mov" instruction. Here
getb2() /does/ have an advantage over getb1() in that it is valid and
fully defined behaviour even if the parameter did not point to a
properly aligned int - perhaps it points into a buffer of unsigned char
of incoming raw data. In that case, it /is/ more robust - it will work
even if changes to the code and build process (such as using LTO) give
the compiler enough information to know that a particular call to
getb1() is undefined behaviour and can lead to unexpected failures. I
don't like code that appears to work in a simple test case, but has
subtle flaws that mean it might not work in all cases.
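
Concretely, the difference shows up in a call like this (a sketch reusing
getb1() and getb2() from above; handle() is just an illustrative name):

void handle(const unsigned char* raw) {  // e.g. incoming network data, arbitrary alignment
    int32_t a = getb2(raw);   // fine: memcpy just copies four bytes out of the buffer
    int32_t b = getb1(raw);   // may be misaligned, and no int32_t object lives there
    (void)a; (void)b;
}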
 
And no, at least for this compiler, getb2() is not less efficient than
getb1(). I really don't care about a couple of microseconds of the
compiler's time - the efficiency I care about is at run-time.
 
(There's no argument about the readability.)
 
 
> The fact that the compiler remedies this by applying some operation that
> is hidden from the language specification does not make the language
> itself any more efficient.
 
This particular branch of the thread was in response to Scott's strange
claim that using memcpy() in the code caused cache lines to be evicted.
David Brown <david.brown@hesbynett.no>: Feb 17 09:41PM +0100

On 17/02/2021 18:28, Chris Vine wrote:
> (In fact, in C++17 it has to be an intrinsic because memcpy cannot be
> implemented as a function using standard C++17 without undefined
> behaviour, but that is a bug in the standard rather than a feature.)
 
That is not quite accurate. In gcc, memcpy with known small sizes
(usually regardless of alignment) is handled using a built-in version
that is going to be as good as it gets - memcpy(&a, &b, sizeof a) is
going to be roughly like "a = b" (but skipping any assignment operator
stuff). That won't necessarily apply to all other compilers, though gcc
is not alone in handling this.
 
I don't think memcpy can be implemented (or duplicated) in pure C or C++
of any standard, with all aspects of the way it copies effective types,
but I am not sure on that. However, that doesn't mean it has to be an
"intrinsic" - it just means it has to be implemented using extensions in
the compiler, or treated specially in some other way by the tool.
Bo Persson <bo@bo-persson.se>: Feb 17 10:05PM +0100

On 2021-02-17 at 19:43, Manfred wrote:
> less efficient than getb1 because it introduces some unneeded extra
> storage and an extra function call - albeit at the level of the abstract
> machine, with no benefit in readability or robustness against bugs.
 
And therefore C++20 packages this into std::bit_cast, for your convenience.
 
https://en.cppreference.com/w/cpp/numeric/bit_cast
 
constexpr double f64v = 19880124.0;
constexpr auto u64v = std::bit_cast<std::uint64_t>(f64v);
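
In the getb() shape from upthread that might look something like this sketch
(getb3 is just an illustrative name; bit_cast needs an object of the source
type, so when reading from a raw buffer you still gather the bytes first, or
simply keep using memcpy):

#include <array>
#include <bit>
#include <cstdint>

std::int32_t getb3(const std::array<unsigned char, 4>& bytes) {
    // Same size, both trivially copyable (assuming, as on every real
    // implementation, that std::array<unsigned char, 4> is exactly 4 bytes).
    return std::bit_cast<std::int32_t>(bytes);
}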
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Feb 17 01:07PM -0800

James Kuyper <jameskuyper@alumni.caltech.edu> writes:
[...]
> generally not a problem: the implementation defines the behavior that
> the standard leaves undefined, in precisely the way you presumably
> thought it was required to be defined.
 
Do most implementations actually *define* (i.e., document) the behavior
of passing a non-void* pointer with a %p format specifier? I haven't
checked, but it's plausible that most of them implement it in the
obvious way but don't bother to mention it. Since the behavior is
undefined, not implementation-defined, implementations are not required
to document it.
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Feb 17 09:17PM

On Wed, 17 Feb 2021 21:19:41 +0100
David Brown <david.brown@hesbynett.no> wrote:
[snip]
> But complaints and blame should be appropriate. std::launder was not
> added so that previously safe code would now be unsafe without it - it
> was added so that previously unsafe code could now be written safely.
 
You probably have already deduced this, but this is to say I disagree.
And I don't think that C++03 was the source of this but if it was, I am
absolutely certain that the authors of C++03 didn't think they were
declaring past (and future) accepted practice to be invalid.
 
In the beginning (C88/89) was the strict-aliasing rule. The competent
C and subsequently C++ programmer promised that (subject to certain
specified exceptions) she would not dereference a pointer to an object
which did not in fact exist at the memory location in question. You
want to construct an object in uninitialized memory and cast your
pointer to the type of that object? Fine, you complied with the
strict-aliasing rule. Examples appeared in all the texts and websites,
including Stroustrup's C++PL 4th edition. No one thought C++03 had the
effect you mention.
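
The kind of long-accepted code I mean looks roughly like this (a sketch, with
a made-up Widget type):

#include <cstdlib>
#include <new>

struct Widget { int id; double value; };

Widget* make_widget() {
    void* raw = std::malloc(sizeof(Widget));   // uninitialized memory
    if (!raw) return nullptr;
    ::new (raw) Widget{42, 1.0};               // construct an object there
    return static_cast<Widget*>(raw);          // cast to the type of the object that now exists
    // (Whether that last cast strictly needs std::launder under the
    // post-C++17 rules is exactly what is being argued about in this thread;
    // returning the result of placement new sidesteps the question.)
}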
 
But if C++03 was the source of the problem, then the mistake in the
standard should have been corrected. You don't correct it by inventing
additional rules which declare past practice (and C practice) to
comprise undefined behaviour. std::launder declares itself in a part
of the standard described as "Pointer optimization barrier". This is a
conceptual error, casting (!) the compiler's job (deciding whether code
can be optimized or not) onto the programmer. Write a better compiler
algorithm or don't do it at all.
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Feb 17 01:55PM -0800

David Brown <david.brown@hesbynett.no> writes:
[...]
> but I am not sure on that. However, that doesn't mean it has to be an
> "intrinsic" - it just means it has to be implemented using extensions in
> the compiler, or treated specially in some other way by the tool.
 
A pure C implementation of memcpy cannot, as I understand it, be
portable to all implementations. But a pure C implementation could work
correctly with a particular implementation, especially if the compiler
doesn't try to optimize based on the effective type rules.
 
Violations of the effective type rules result in undefined behavior. A
compiler *could* treat such violations in a way that's consistent with a
naive memcpy.
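
The sort of pure implementation in question is just something along these
lines (sketched with C++ casts, this being comp.lang.c++):

#include <cstddef>

// A naive byte-at-a-time copy. On a given implementation it moves the same
// bytes memcpy would, but the standard's effective-type wording gives the
// real memcpy a special status that this function does not get.
void* naive_memcpy(void* dst, const void* src, std::size_t n) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i != n; ++i)
        d[i] = s[i];
    return dst;
}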
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Feb 17 10:06PM

On Wed, 17 Feb 2021 13:55:41 -0800
 
> Violations of the effective type rules result in undefined behavior. A
> compiler *could* treat such violations in a way that's consistent with a
> naive memcpy.
 
The effective type of the result of applying memcpy is the type of the
destination. The effective type of a cast is the type of the source.
That is why memcpy works. Where it can be, memcpy is a compiler
illusion.
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Feb 17 02:28PM -0800

Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:
[...]
> In the beginning (C88/89) was the strict-aliasing rule.
[...]
 
What is C88? The first C standard was ANSI C89, which became ISO C90.
Of course work started before 1989, but there is no C88 standard.
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Feb 17 02:29PM -0800

> destination. The effective type of a cast is the type of the source.
> That is why memcpy works. Where it can be, memcpy is a compiler
> illusion.
 
Does that contradict what I wrote?
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Feb 17 10:43PM

On Wed, 17 Feb 2021 14:28:41 -0800
> [...]
 
> What is C88? The first C standard was ANSI C89, which became ISO C90.
> Of course work started before 1989, but there is no C88 standard.
 
Indeed. A memory bus error.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 17 03:18PM -0800

On 2/17/2021 2:43 PM, Chris Vine wrote:
 
>> What is C88? The first C standard was ANSI C89, which became ISO C90.
>> Of course work started before 1989, but there is no C88 standard.
 
> Indeed. A memory bus error.
 
Shit happens!
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 12 04:28PM -0800

On 2/11/2021 11:55 AM, Alf P. Steinbach wrote:
 
>> He is hyper smart... Big time.
 
> And a free book! Great! Except... I'm not so much into shared memory
> parallel programming, but others here may be.
 
Been into it for a long time now. Went through periods of pause, when I
got into fractals. Remember a long time ago when creating externally
assembled functions was safer. Now we have C++11... What a JOY!
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 12 04:31PM -0800

On 2/12/2021 4:28 PM, Chris M. Thomasson wrote:
 
> Been into it for a long time now. Went through periods of pause, when I
> got into fractals. Remember a long time ago when creating externally
> assembled functions was safer. Now we have C++11... What a JOY!
 
Going into the fractal world is akin to entering a large realm of
"embarrassingly" parallel algorithms. Example:
 
https://www.shadertoy.com/view/ltycRz
 
So, really dove into GPU and shaders. However, there can be cases and
fractals that are not so "embarrassingly" parallel...
mickspud@potatofield.co.uk: Feb 12 11:45AM

On Thu, 11 Feb 2021 17:37:37 +0000
 
>> Any? Hows it going with declarative languages such as SQL or Prolog then?
 
>Yes, any. Why do you feel the need to ask about declarative languages
>specifically?
 
Because beyond the lexical parser they don't break down into the same
execution structures as procedural languages. If you knew anything about
parsing you'd know that.
 
>> Or better yet forget threads and go multiprocess. If you're developing
>> on a proper OS anyway, on Windows the pain isn't worth it I imagine.
 
>You serious, bruv? Threads are essential.
 
Essential for what precisely? Specifically, what can they do that multiprocess
can't?
mickspud@potatofield.co.uk: Feb 12 05:12PM

On Fri, 12 Feb 2021 16:33:09 +0000
>On 12/02/2021 16:20, mickspud@potatofield.co.uk wrote:
 
Still waiting for your proof. Take your time.
 
 
>It is "UNIX" not "unix", dear. And as far as UNIX-like is concerned: Linux's
>overcommit/OOM-killer to support fork()ing is a fucking omnishambles; but you
>wouldn't know this of course as you are fucking clueless anachronism.
 
Oh dear, someone tell the child about copy-on-write.
 
If you're going to google stuff and pretend it's your own knowledge you might
want to have a clue first. FWIW overcommit is pretty standard amongst OSes of
all colours and has been for decades.
 
>If you knew anything useful about threads you would know what advantages they
>have over processes; I repeat: go back to school, dear.
 
Thanks for proving my point. Have a good w/e with your boyfriend cupcake.
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.
