- #include'ing .c files considered harmful? - 8 Updates
- "The weirdest compiler bug" by Scott Rasmussen - 1 Update
mickspud@potatofield.co.uk: Feb 16 09:47AM

On Mon, 15 Feb 2021 15:54:36 -0500

>such as memcpy() or memmove(), to copy an entire object over into such
>memory, or if you copied it over as an array of character type, that
>memory acquires the same effective type as the object it was copied from.

Memory is memory, it doesn't have a type. How the compiler sees it is
another matter of course, but unless a C/C++ compiler wants to break a
huge amount of code it's going to have to treat memory in this instance
as void.

>The relevant rule violated by many kinds of type punning is the
>anti-aliasing rule:

Fine, but like it or not type punning has been de facto standard C for a
very long time and any C compiler (and C++ in many cases) breaks it at
its peril.

>attention to footnote 95), but that way only works if the object in
>question is actually of the union type, and only if the declaration of
>that union is in scope at the point where the problem could otherwise
>occur.

Unions are another matter entirely, mainly because endianness issues
tend to occur with them regardless of memory alignment.

>It's not just Windows - compilers that take advantage of the
>anti-aliasing rules to optimize code generation are quite common.

IME most compilers, when pushed to do heavy optimisation, start making
subtle mistakes here and there. Any heavily optimised code should always
be tested much more thoroughly than non-optimised code before it's
released.
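[A minimal sketch - hypothetical function names, not from the thread -
of where "memory is memory" and what the compiler assumes come apart.
Under the strict aliasing rule, gcc and clang at -O2 may assume an int*
and a float* never refer to the same object:]

    #include <cstdio>

    // The compiler may assume pi and pf do not alias (they have
    // incompatible types), so it can cache *pi in a register across
    // the store to *pf and fold this function to "return 1;".
    int update(int *pi, float *pf)
    {
        *pi = 1;
        *pf = 2.0f;   // assumed not to modify *pi
        return *pi;
    }

    int main()
    {
        int n = 0;
        float f = 0.0f;
        std::printf("%d\n", update(&n, &f));   // distinct objects: fine
    }

[Calling update(&n, reinterpret_cast<float *>(&n)) instead would be the
punned call the rule forbids: an optimised build may return the cached
1 even though the float store rewrote the same bytes.]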
mickspud@potatofield.co.uk: Feb 16 09:49AM

On Tue, 16 Feb 2021 00:00:33 +0200

>>> in more complex situations when code is inlined, link-time optimised, or
>> Rubbish. Maybe in Windows but that doesn't concern me.
>FYI, the biggest "culprit" in this area has been gcc in recent years. It

gcc optimisation has always been a bit flaky when you go beyond -O2
anyway. Their optimisation system seems to be a permanent work in
progress, IMO.
David Brown <david.brown@hesbynett.no>: Feb 16 11:20AM +0100

> fields in it just for the sake of ivory tower correctness, you'd have
> to be insane. A structure only has to be correct once in the header,
> memcpys have to be correct everywhere you use them.

Casting pointers is allowed in C and C++ - but there are very tight
limits on what you are actually able to do with them in a portable way
(by that I mean there are limits on which accesses have defined
behaviour in the standards). But you don't need pointer casts to access
structs or fields in structs - you only need them if you are messing
about taking a pointer to one type of object and using it as though it
were a pointer to a different type of object. And it is that kind of
usage that is risky - there are lots of situations where people /think/
it is valid code, and it works when they test it, but it comes without
guarantees and might fail in other circumstances.

I get the impression here that there is a bit of a mismatch between
what I have been trying to say and what you think I have been saying.
I am not sure how - whether I was unclear or you misunderstood. But to
go back to the beginning, you claimed that "packed" structs were
required to handle pre-defined structures such as those for network
packets, and I pointed out that this is not correct - you can, for
example, use memcpy to access the data in a portable and efficient
manner. Do you agree on that point?

> If you don't believe me have a look in any of the /usr/include/linux
> network header files and then go through this and check out the
> casting to structs:
> https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c

Can you give a more specific reference? I'd rather not read through
four thousand lines.

>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or
> Rubbish. Maybe in Windows but that doesn't concern me.

Who has been talking about Windows? I have been talking about C and
C++. If you mess about with pointers and objects in a way that breaks
the "strict aliasing rules", simple test code /usually/ works as you
might expect (as the "obvious" implementation is typically already
optimal). But in more complex situations the compiler might be able to
generate more efficient code by using the knowledge that you cannot use
a pointer-to-int to change a float object, for example.

<https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Optimize-Options.html#index-fstrict-aliasing>

> Wtf are you talking about? You just access them as structure fields.
> There may be a small cost in dereferencing but there's a large gain in
> code readability and correctness. You need to put your ntoh* functions
> somewhere!
> $ grep __attribute__ *.h | grep packed | wc -l
> 239
> But what do they know?

What do they know about writing highly portable and standard C? Not
everything, that's for sure. You can see that in many ways. For
starters, they don't /have/ to write code that is fully portable - they
can assume a number of basic features (32-bit int, 8-bit char,
little-endian or big-endian ordering, and lots of other common features
of any POSIX system). They don't have to write code that relies only
on standard C - they use gcc extensions freely. They (Torvalds in
particular) regularly get into arguments with the gcc development team
when new gcc versions come out and Torvalds says a release "breaks"
Linux, and the gcc team point out that the C code was incorrect.
Sometimes the agreed solution is to fix the Linux code, sometimes it is
to add flags to gcc for finer control. (None of this is criticism, by
the way - using these assumptions lets them write simpler or more
efficient code. Most people, including me, write non-portable code all
the time.)

Oh, and I have several times said that "packed" can be /convenient/.
But it is never /necessary/. There is a difference. And of course a
sample of where someone else uses a particular feature does not show
the code is correct, and certainly does not show that the feature is
necessary. Using the Linux kernel as sample code is particularly
inappropriate, as it is a very unique piece of work with very unique
requirements and history.
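[As an illustration of the memcpy approach described above, a minimal
sketch - hypothetical function name and field offset, not from the
thread - of reading a big-endian 32-bit field out of a received packet
with no packed struct and no pointer cast:]

    #include <cstdint>
    #include <cstring>

    // Read a 32-bit network-order (big-endian) value at a byte offset
    // into a raw packet buffer. memcpy is well-defined regardless of
    // the buffer's alignment or effective type, and compilers compile
    // a fixed-size copy like this down to a load (plus byte swap).
    std::uint32_t read_be32(const unsigned char *packet,
                            std::size_t offset)
    {
        unsigned char b[4];
        std::memcpy(b, packet + offset, sizeof b);
        // Assemble explicitly rather than calling ntohl(), so the
        // sketch is endian-independent and needs no POSIX headers:
        return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16)
             | (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
    }

[E.g. read_be32(pkt, 4) would pull a hypothetical sequence-number field
at offset 4. The same idea in reverse - memcpy from a local struct into
the output buffer - covers writing.]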
David Brown <david.brown@hesbynett.no>: Feb 16 11:47AM +0100

> Memory is memory, it doesn't have a type. How the compiler sees it is
> another matter of course but unless a C/C++ compiler wants to break a
> huge amount of code its going to have to treat memory in this instance
> as void.

I'm sorry, but you are wrong - C and C++ view memory in terms of
objects, which have specific types, and compilers will at times take
advantage of that. It is relatively rare that this makes a difference
to the code, but it happens sometimes. And yes, this results in
mistakes in people's C and C++ code giving buggy results. But it is
not that the compiler "breaks" their code - their code was broken when
they wrote it.

>> anti-aliasing rule:
> Fine, but like it or not type punning has been de facto standard C for
> a very long time and any C compiler (and C++ in many cases) breaks it
> at its peril.

Type punning is possible in a variety of ways. But the standards do
/not/ allow it just by doing pointer casts. Accessing memory by an
incompatible type breaks strong typing. You are not the first person
to misunderstand this - it is unfortunately common amongst C and C++
programmers. (You can well argue that this is a mistake in the way the
languages are defined, and I think you'd find support for that - but
that's the way they are. Most programming languages have similar rules
- they just don't make it as easy to write code that breaks the rules
as C does.)

>> that union is in scope at the point where the problem could otherwise
>> occur.
> Unions are another matter entirely mainly because endianess issues
> tend to occur with them regardless of memory alignment.

Endianness is inherent in pre-defined structures, and is orthogonal to
alignment questions and independent of unions.

>> anti-aliasing rules to optimize code generation are quite common.
> IME most compilers when pushed to do heavy optimisation start making
> subtle mistakes here and there.

That is not my experience with quality compilers (though bugs do occur
in compilers). But it /is/ my experience that heavy optimisation can
reveal bugs in the C or C++ source. Optimisations from type-based
alias analysis are not mistakes in the compiler.

> Any heavily optimised code should always be tested
> much more thoroughly than non optimised before its released.

That much is true. My experience is that code that "works when
optimisation is disabled but fails when optimised" is almost invariably
down to bugs in the code, not in the compiler.
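[A minimal sketch - hypothetical function name - of type punning done
through a supported route rather than a pointer cast. memcpy between
trivially copyable objects of the same size is well-defined in both C
and C++, and compilers optimise the copy away:]

    #include <cstdint>
    #include <cstring>

    std::uint32_t float_bits(float f)
    {
        // Supported: copy the object representation.
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        return u;

        // Equally well-defined since C++20:
        //   return std::bit_cast<std::uint32_t>(f);   // <bit>

        // NOT supported - breaks the aliasing rules, even though it
        // often "works" in simple tests:
        //   return *reinterpret_cast<std::uint32_t *>(&f);
    }

[The union method discussed earlier in the thread is a further
supported route in C, but reading the non-active member of a union is
not a supported route in C++.]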
David Brown <david.brown@hesbynett.no>: Feb 16 12:08PM +0100

On 15/02/2021 23:00, Paavo Helde wrote:

> FYI, the biggest "culprit" in this area has been gcc in recent years. It
> is keen to optimize away things which are formally UB, like infinite
> loops.

Certainly it would often be nice to get more warnings about this kind
of thing - but getting good warnings with few false positives is not
easy. gcc has been getting steadily better at its warnings over the
years.

I can understand why people /want/ their compiler to read their minds
and guess what they meant to write, even though the actual code is in
error. I have a harder time understanding when they /expect/ it to do
so.

It is particularly difficult for me to understand in this particular
case of type punning and type-based alias analysis. It's fair enough
that this is an advanced topic and lots of programmers don't really
know about it. But when you explain to people that the C and C++
standards have rules about how objects can be accessed, and the
compiler assumes you follow those rules, some people get completely
irrational - I have seen people call compiler writers "evil" and
"obsessed with benchmarks at the expense of users".

C and C++ are defined the way they are defined. A C or C++ compiler
implements that language. As a programmer, you are expected to write
code that follows the rules of the language. The standard (plus
additional rules defined by the compiler) forms an agreement between
the programmer and the compiler. If the programmer does not hold up
his/her side of the deal by writing correct code, the compiler can't be
expected to produce correct output from incorrect input.

Having said all that, it is of course important that a compiler does
its best to help the developer find and fix his/her errors, such as by
giving warning messages. It is not in anybody's interest for the
compiler to cover up mistakes by pretending incorrect code means
something different.

Some people don't like certain aspects of the C and C++ standards -
they want a language with additional semantics defined. In particular,
some people want to be able to access data using any pointer types, and
don't want to use the supported methods (memcpy, char access, placement
new, unions, volatile, compiler extensions). gcc helpfully gives you
the option "-fno-strict-aliasing" which does precisely that. So if you
want to program in a language that is mostly like C or C++ but has this
additional feature, that's the way to do it (for gcc and clang,
anyway).

(The other common case like this is that many people believe that
because their signed integers are stored as two's complement, overflow
behaviour is defined as wrapping. This is, of course, nonsense. But to
help people who want this, gcc has a "-fwrapv" flag.)

> tons of crap code produced by hordes of cowboy programmers during last
> decades, only because such code accidentally happened to work at some
> time in the past.

MSVC has experimented with this kind of optimisation in their compiler.
But their problem is that the biggest source of crap code from cowboy
programmers is MS itself - the standard windows.h header relies on it.

(Contrast this with wrapping overflow for signed integers. MSVC
generally gives you wrapping behaviour, simply because it doesn't do as
good a job of optimising this kind of thing as gcc. Many people believe
that MSVC guarantees wrapping behaviour, and rely on it - but it does
not, and sometimes code that assumes wrapping will fail on MSVC. There
is, AFAIK, no "-fwrapv" flag for MSVC.)

> And yes, this is C++, not C, the rules are different.

The details are different, but many of the rules have the same effect
here.
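[A minimal sketch - hypothetical function names - of the signed-overflow
point: standard C and C++ leave signed overflow undefined, so the
compiler may assume it never happens, while gcc's -fwrapv extends the
language to make it wrap:]

    #include <limits>

    // Signed overflow is UB, so the compiler may assume x + 1 never
    // wraps. At -O2, gcc and clang typically fold this function to
    // "return false;". With -fwrapv, overflow wraps (two's complement)
    // and the test is well-defined, returning true for INT_MAX.
    bool increment_overflows(int x)
    {
        return x + 1 < x;   // relies on wrapping - UB for x == INT_MAX
    }

    // The portable version needs no flag at all:
    bool increment_overflows_portable(int x)
    {
        return x == std::numeric_limits<int>::max();
    }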
mickspud@potatofield.co.uk: Feb 16 02:25PM

On Tue, 16 Feb 2021 11:20:20 +0100

>pointed out that this is not correct - you can, for example, use memcpy
>to access the data in a portable and efficient manner. Do you agree on
>that point?

Ok, when I said essential I meant essential for efficient coding.
Obviously you can always use other methods, and for [reasons] you
prefer memcpy. It seems to boil down to personal choice and there's
little point arguing over that.
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Feb 16 09:35PM

On Tue, 16 Feb 2021 11:47:52 +0100
David Brown <david.brown@hesbynett.no> wrote:
[snip]

> support for that - but that's the way they are. Most programming
> languages have similar rules - they just don't make it as easy to
> write code that breaks the rules as C does.)

I don't think there can be many competent C programmers who have not at
least heard of the strict aliasing rule by now, given that it has
existed since the first C89 standard was promulgated.

Possibly there is also the reverse problem - some C programmers don't
properly understand that it is fine to cast from a struct to its first
member, or back the other way again, and dereference the cast at will.
This is commonplace for example in network programming, and is
basically how POSIX's networking API is built up: POSIX does not rely
on undefined behaviour as far as that is concerned.

But although wilful ignorance is no excuse, I do wonder whether there
has been a proper analysis of the speed-up gains of strict aliasing,
given that it does appear to be a problem for some second-rate
programmers. Furthermore, I think the way that C++ has doubled down on
this by requiring the use of std::launder for any case where a pointer
cast is not "pointer interconvertible" is a mistake. Too many obscure
technical rules launched at programmers because a compiler vendor has
asserted that it might make 1% of code 0.5% faster seems to me to be
the wrong balance.
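[A minimal sketch of the first-member cast Chris describes, shaped like
the POSIX sockaddr pattern but with simplified, hypothetical struct
names so it stands alone:]

    #include <cstdint>
    #include <cstdio>

    // Hypothetical stand-ins for sockaddr / sockaddr_in: every address
    // type starts with the same generic part.
    struct generic_addr { std::uint16_t family; };

    struct ipv4_addr {
        generic_addr  base;    // first member: the generic type
        std::uint16_t port;
        std::uint32_t addr;
    };

    // An API that, like bind() or connect(), takes the generic type.
    void describe(const generic_addr *ga)
    {
        std::printf("family %u\n", unsigned(ga->family));
    }

    int main()
    {
        ipv4_addr a{{2 /* AF_INET-like tag */}, 8080, 0x7f000001};

        // A pointer to a standard-layout struct is pointer-
        // interconvertible with a pointer to its first member, so this
        // cast and the access through it are well-defined:
        describe(reinterpret_cast<const generic_addr *>(&a));
    }

[POSIX achieves the same effect with struct sockaddr_in and struct
sockaddr (via compatible leading members rather than literal nesting);
the point is that this cast is defined behaviour, not aliasing abuse.]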
David Brown <david.brown@hesbynett.no>: Feb 16 11:25PM +0100

On 16/02/2021 22:35, Chris Vine wrote:

> obscure technical rules launched at programmers because a compiler
> vendor has asserted that it might make 1% of code 0.5% faster seems to
> me to be the wrong balance.

That is a valid argument. However, efficient code is made from the sum
of many small optimisations (either ones that are often applicable but
only make a small difference, or ones that make a larger difference but
are only rarely applicable). When you start saying "we'll make this
change in the language because people get it wrong", where do you stop?
Should you also make signed overflow defined, because some people think
it is? Should you add checks for pointers being non-zero before
dereferencing them, because some people get it wrong and many of the
checks can be optimised away?

Casting pointers is /dangerous/. It is lying to the compiler - it is
saying that an object has one type, but you want to pretend it is a
different type. Many other programming languages don't allow anything
equivalent to such conversions. However, it can be useful on occasion
in low-level code, which can usually be left to a few programmers who
understand the issues. The same applies in C++ - std::launder is
likely to find use in implementing memory pools and specialist
allocators, not in normal application code. It is also part of the
move towards defining a pointer provenance model for C and C++, to
improve alias tracking (for knowing when apparently different pointers
may alias, and for being sure that pointers of similar types do not
alias).
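[A minimal sketch - C++17, hypothetical names, heavily simplified - of
the memory-pool flavour of std::launder use mentioned above:]

    #include <new>

    // A one-slot "pool": raw storage that objects are created into
    // with placement new.
    struct slot {
        alignas(int) unsigned char storage[sizeof(int)];

        int *create(int value)
        {
            return new (storage) int(value);   // placement new
        }

        // Precondition (assumed): create() has been called.
        int *get()
        {
            // The storage array and the int living inside it are not
            // pointer-interconvertible, so after the reinterpret_cast
            // it is std::launder that makes the pointer valid for
            // reaching the int object placement new created there.
            return std::launder(reinterpret_cast<int *>(storage));
        }
    };

    int main()
    {
        slot s;
        s.create(42);
        return *s.get() == 42 ? 0 : 1;   // well-defined: returns 0
    }

[Here launder blesses the pointer recomputed from the storage address;
ordinary application code that never creates objects inside raw storage
has no reason to reach for it.]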
Juha Nieminen <nospam@thanks.invalid>: Feb 16 02:10PM

>> the code in such a manner that it didn't trigger the bug).
>> (If you are curious, the compiler in question was sdcc.)
> That's a "Heisenbug". Observing the bug affects its behavior.

If you are curious about why modifying the code affected the bug, it
was because the compiler was generating code that wrote to the wrong
part of the stack, which would thus corrupt something that some
function higher up the call stack was using. (I started suspecting
that to be the case when the program was crashing on a 'return'.
Indeed, the return address was being corrupted somewhere deep down the
call stack.)

Rather obviously, if you wrote any code that added (or removed)
anything from the stack, the thing that would get corrupted would
likewise change.
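[For concreteness, a deliberately broken sketch - hypothetical,
undefined behaviour by design, not something to run in earnest - of the
mechanism described: a stray stack write lands on state a caller
depends on, and changing the stack layout moves what gets hit:]

    #include <cstring>

    void broken_callee()
    {
        char buf[8];
        // Writes 32 bytes into an 8-byte stack buffer: the excess
        // lands on whatever sits above buf in the frame - saved
        // registers, the return address, or a caller's locals.
        std::memset(buf, 0xAB, 32);
    }   // a crash here, on 'return', is the classic symptom

    void caller()
    {
        broken_callee();
        // Adding or removing locals here changes the stack layout,
        // which moves what gets corrupted - hence the "Heisenbug"
        // behaviour when the code is modified.
    }

[In Juha's case the bad write came from the compiler's own generated
code rather than from the source, but the corruption mechanism is the
same.]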
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |