- #include'ing .c files considered harmful? - 8 Updates
- "The weirdest compiler bug" by Scott Rasmussen - 1 Update
mickspud@potatofield.co.uk: Feb 16 09:47AM

On Mon, 15 Feb 2021 15:54:36 -0500

>such as memcpy() or memmove(), to copy an entire object over into such
>memory, or if you copied it over as an array of character type, that
>memory acquires the same effective type as the object it was copied from.

Memory is memory, it doesn't have a type. How the compiler sees it is
another matter of course, but unless a C/C++ compiler wants to break a
huge amount of code it's going to have to treat memory in this instance
as void.

>The relevant rule violated by many kinds of type punning is the
>anti-aliasing rule:

Fine, but like it or not type punning has been de facto standard C for a
very long time and any C compiler (and C++ in many cases) breaks it at
its peril.

>attention to footnote 95), but that way only works if the object in
>question is actually of the union type, and only if the declaration of
>that union is in scope at the point where the problem could otherwise
>occur.

Unions are another matter entirely, mainly because endianness issues
tend to occur with them regardless of memory alignment.

>It's not just Windows - compilers that take advantage of the
>anti-aliasing rules to optimize code generation are quite common.

IME most compilers, when pushed to do heavy optimisation, start making
subtle mistakes here and there. Any heavily optimised code should always
be tested much more thoroughly than non-optimised code before it's
released.
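[A minimal sketch - hypothetical function names, not from the thread -
of where "memory is memory" and what the compiler assumes come apart.
Under the strict aliasing rule, gcc and clang at -O2 may assume an int*
and a float* never refer to the same object:]

    #include <cstdio>

    // The compiler may assume pi and pf do not alias (they have
    // incompatible types), so it can cache *pi in a register across
    // the store to *pf and fold this function to "return 1;".
    int update(int *pi, float *pf)
    {
        *pi = 1;
        *pf = 2.0f;   // assumed not to modify *pi
        return *pi;
    }

    int main()
    {
        int n = 0;
        float f = 0.0f;
        std::printf("%d\n", update(&n, &f));   // distinct objects: fine
    }

[Calling update(&n, reinterpret_cast<float *>(&n)) instead would be the
punned call the rule forbids: an optimised build may return the cached
1 even though the float store rewrote the same bytes.]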
mickspud@potatofield.co.uk: Feb 16 09:49AM

On Tue, 16 Feb 2021 00:00:33 +0200

>>> in more complex situations when code is inlined, link-time optimised, or
>> Rubbish. Maybe in Windows but that doesn't concern me.
>FYI, the biggest "culprit" in this area has been gcc in recent years. It

gcc optimisation has always been a bit flaky when you go beyond -O2
anyway. Their optimisation system seems to be a permanent work in
progress, IMO.
David Brown <david.brown@hesbynett.no>: Feb 16 11:20AM +0100

> fields in it just for the sake of ivory tower correctness, you'd have
> to be insane. A structure only has to be correct once in the header,
> memcpys have to be correct everywhere you use them.

Casting pointers is allowed in C and C++ - but there are very tight
limits on what you are actually able to do with them in a portable way
(by that I mean there are limits on which accesses have defined
behaviour in the standards). But you don't need pointer casts to access
structs or fields in structs - you only need them if you are messing
about taking a pointer to one type of object and using it as though it
were a pointer to a different type of object. And it is that kind of
usage that is risky - there are lots of situations where people /think/
it is valid code, and it works when they test it, but it comes without
guarantees and might fail in other circumstances.

I get the impression here that there is a bit of a mismatch between
what I have been trying to say and what you think I have been saying.
I am not sure how - whether I was unclear or you misunderstood. But to
go back to the beginning, you claimed that "packed" structs were
required to handle pre-defined structures such as those for network
packets, and I pointed out that this is not correct - you can, for
example, use memcpy to access the data in a portable and efficient
manner. Do you agree on that point?

> If you don't believe me have a look in any of the /usr/include/linux
> network header files and then go through this and check out the
> casting to structs:
> https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c

Can you give a more specific reference? I'd rather not read through
four thousand lines.

>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or
> Rubbish. Maybe in Windows but that doesn't concern me.

Who has been talking about Windows? I have been talking about C and
C++. If you mess about with pointers and objects in a way that breaks
the "strict aliasing rules", simple test code /usually/ works as you
might expect (as the "obvious" implementation is typically already
optimal). But in more complex situations the compiler might be able to
generate more efficient code by using the knowledge that you cannot use
a pointer-to-int to change a float object, for example.

<https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Optimize-Options.html#index-fstrict-aliasing>

> Wtf are you talking about? You just access them as structure fields.
> There may be a small cost in dereferencing but there's a large gain in
> code readability and correctness. You need to put your ntoh* functions
> somewhere!
> $ grep __attribute__ *.h | grep packed | wc -l
> 239
> But what do they know?

What do they know about writing highly portable and standard C? Not
everything, that's for sure. You can see that in many ways. For
starters, they don't /have/ to write code that is fully portable - they
can assume a number of basic features (32-bit int, 8-bit char,
little-endian or big-endian ordering, and lots of other common features
of any POSIX system). They don't have to write code that relies only
on standard C - they use gcc extensions freely. They (Torvalds in
particular) regularly get into arguments with the gcc development team
when new gcc versions come out and Torvalds says a release "breaks"
Linux, and the gcc team point out that the C code was incorrect.
Sometimes the agreed solution is to fix the Linux code, sometimes it is
to add flags to gcc for finer control. (None of this is criticism, by
the way - using these assumptions lets them write simpler or more
efficient code. Most people, including me, write non-portable code all
the time.)

Oh, and I have several times said that "packed" can be /convenient/.
But it is never /necessary/. There is a difference. And of course a
sample of where someone else uses a particular feature does not show
the code is correct, and certainly does not show that the feature is
necessary. Using the Linux kernel as sample code is particularly
inappropriate, as it is a very unique piece of work with very unique
requirements and history.
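[As an illustration of the memcpy approach described above, a minimal
sketch - hypothetical function name and field offset, not from the
thread - of reading a big-endian 32-bit field out of a received packet
with no packed struct and no pointer cast:]

    #include <cstdint>
    #include <cstring>

    // Read a 32-bit network-order (big-endian) value at a byte offset
    // into a raw packet buffer. memcpy is well-defined regardless of
    // the buffer's alignment or effective type, and compilers compile
    // a fixed-size copy like this down to a load (plus byte swap).
    std::uint32_t read_be32(const unsigned char *packet,
                            std::size_t offset)
    {
        unsigned char b[4];
        std::memcpy(b, packet + offset, sizeof b);
        // Assemble explicitly rather than calling ntohl(), so the
        // sketch is endian-independent and needs no POSIX headers:
        return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16)
             | (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
    }

[E.g. read_be32(pkt, 4) would pull a hypothetical sequence-number field
at offset 4. The same idea in reverse - memcpy from a local struct into
the output buffer - covers writing.]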
David Brown <david.brown@hesbynett.no>: Feb 16 11:47AM +0100

> Memory is memory, it doesn't have a type. How the compiler sees it is
> another matter of course but unless a C/C++ compiler wants to break a
> huge amount of code its going to have to treat memory in this instance
> as void.

I'm sorry, but you are wrong - C and C++ view memory in terms of
objects, which have specific types, and compilers will at times take
advantage of that. It is relatively rare that this makes a difference
to the code, but it happens sometimes. And yes, this results in
mistakes in people's C and C++ code giving buggy results. But it is
not that the compiler "breaks" their code - their code was broken when
they wrote it.

>> anti-aliasing rule:
> Fine, but like it or not type punning has been de facto standard C for
> a very long time and any C compiler (and C++ in many cases) breaks it
> at its peril.

Type punning is possible in a variety of ways. But the standards do
/not/ allow it just by doing pointer casts. Accessing memory by an
incompatible type breaks strong typing. You are not the first person
to misunderstand this - it is unfortunately common amongst C and C++
programmers. (You can well argue that this is a mistake in the way the
languages are defined, and I think you'd find support for that - but
that's the way they are. Most programming languages have similar rules
- they just don't make it as easy to write code that breaks the rules
as C does.)

>> that union is in scope at the point where the problem could otherwise
>> occur.
> Unions are another matter entirely mainly because endianess issues
> tend to occur with them regardless of memory alignment.

Endianness is inherent in pre-defined structures, and is orthogonal to
alignment questions and independent of unions.

>> anti-aliasing rules to optimize code generation are quite common.
> IME most compilers when pushed to do heavy optimisation start making
> subtle mistakes here and there.

That is not my experience with quality compilers (though bugs do occur
in compilers). But it /is/ my experience that heavy optimisation can
reveal bugs in the C or C++ source. Optimisations from type-based
alias analysis are not mistakes in the compiler.

> Any heavily optimised code should always be tested
> much more thoroughly than non optimised before its released.

That much is true. My experience is that code that "works when
optimisation is disabled but fails when optimised" is almost invariably
down to bugs in the code, not in the compiler.
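[A minimal sketch - hypothetical function name - of type punning done
through a supported route rather than a pointer cast. memcpy between
trivially copyable objects of the same size is well-defined in both C
and C++, and compilers optimise the copy away:]

    #include <cstdint>
    #include <cstring>

    std::uint32_t float_bits(float f)
    {
        // Supported: copy the object representation.
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        return u;

        // Equally well-defined since C++20:
        //   return std::bit_cast<std::uint32_t>(f);   // <bit>

        // NOT supported - breaks the aliasing rules, even though it
        // often "works" in simple tests:
        //   return *reinterpret_cast<std::uint32_t *>(&f);
    }

[The union method discussed earlier in the thread is a further
supported route in C, but reading the non-active member of a union is
not a supported route in C++.]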
David Brown <david.brown@hesbynett.no>: Feb 16 12:08PM +0100

On 15/02/2021 23:00, Paavo Helde wrote:

> FYI, the biggest "culprit" in this area has been gcc in recent years. It
> is keen to optimize away things which are formally UB, like infinite
> loops.

Certainly it would often be nice to get more warnings about this kind
of thing - but getting good warnings with few false positives is not
easy. gcc has been getting steadily better at its warnings over the
years.

I can understand why people /want/ their compiler to read their minds
and guess what they meant to write, even though the actual code is in
error. I have a harder time understanding when they /expect/ it to do
so.

It is particularly difficult for me to understand in this particular
case of type punning and type-based alias analysis. It's fair enough
that this is an advanced topic and lots of programmers don't really
know about it. But when you explain to people that the C and C++
standards have rules about how objects can be accessed, and the
compiler assumes you follow those rules, some people get completely
irrational - I have seen people call compiler writers "evil" and
"obsessed with benchmarks at the expense of users".

C and C++ are defined the way they are defined. A C or C++ compiler
implements that language. As a programmer, you are expected to write
code that follows the rules of the language. The standard (plus
additional rules defined by the compiler) forms an agreement between
the programmer and the compiler. If the programmer does not hold up
his/her side of the deal by writing correct code, the compiler can't be
expected to produce correct output from incorrect input.

Having said all that, it is of course important that a compiler does
its best to help the developer find and fix his/her errors, such as by
giving warning messages. It is not in anybody's interest for the
compiler to cover up mistakes by pretending incorrect code means
something different.

Some people don't like certain aspects of the C and C++ standards -
they want a language with additional semantics defined. In particular,
some people want to be able to access data using any pointer types, and
don't want to use the supported methods (memcpy, char access, placement
new, unions, volatile, compiler extensions). gcc helpfully gives you
the option "-fno-strict-aliasing" which does precisely that. So if you
want to program in a language that is mostly like C or C++ but has this
additional feature, that's the way to do it (for gcc and clang,
anyway).

(The other common case like this is that many people believe that
because their signed integers are stored as two's complement, overflow
behaviour is defined as wrapping. This is, of course, nonsense. But to
help people who want this, gcc has a "-fwrapv" flag.)

> tons of crap code produced by hordes of cowboy programmers during last
> decades, only because such code accidentally happened to work at some
> time in the past.

MSVC has experimented with this kind of optimisation in their compiler.
But their problem is that the biggest source of crap code from cowboy
programmers is MS itself - the standard windows.h header relies on it.

(Contrast this with wrapping overflow for signed integers. MSVC
generally gives you wrapping behaviour, simply because it doesn't do as
good a job of optimising this kind of thing as gcc. Many people believe
that MSVC guarantees wrapping behaviour, and rely on it - but it does
not, and sometimes code that assumes wrapping will fail on MSVC. There
is, AFAIK, no "-fwrapv" flag for MSVC.)

> And yes, this is C++, not C, the rules are different.

The details are different, but many of the rules have the same effect
here.
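[A minimal sketch - hypothetical function names - of the signed-overflow
point: standard C and C++ leave signed overflow undefined, so the
compiler may assume it never happens, while gcc's -fwrapv extends the
language to make it wrap:]

    #include <limits>

    // Signed overflow is UB, so the compiler may assume x + 1 never
    // wraps. At -O2, gcc and clang typically fold this function to
    // "return false;". With -fwrapv, overflow wraps (two's complement)
    // and the test is well-defined, returning true for INT_MAX.
    bool increment_overflows(int x)
    {
        return x + 1 < x;   // relies on wrapping - UB for x == INT_MAX
    }

    // The portable version needs no flag at all:
    bool increment_overflows_portable(int x)
    {
        return x == std::numeric_limits<int>::max();
    }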
mickspud@potatofield.co.uk: Feb 16 02:25PM

On Tue, 16 Feb 2021 11:20:20 +0100

>pointed out that this is not correct - you can, for example, use memcpy
>to access the data in a portable and efficient manner. Do you agree on
>that point?

Ok, when I said essential I meant essential for efficient coding.
Obviously you can always use other methods, and for [reasons] you
prefer memcpy. It seems to boil down to personal choice and there's
little point arguing over that.
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Feb 16 09:35PM

On Tue, 16 Feb 2021 11:47:52 +0100
David Brown <david.brown@hesbynett.no> wrote:
[snip]

> support for that - but that's the way they are. Most programming
> languages have similar rules - they just don't make it as easy to
> write code that breaks the rules as C does.)

I don't think there can be many competent C programmers who have not at
least heard of the strict aliasing rule by now, given that it has
existed since the first C89 standard was promulgated.

Possibly there is also the reverse problem - some C programmers don't
properly understand that it is fine to cast from a struct to its first
member, or back the other way again, and dereference the cast at will.
This is commonplace for example in network programming, and is
basically how POSIX's networking API is built up: POSIX does not rely
on undefined behaviour as far as that is concerned.

But although wilful ignorance is no excuse, I do wonder whether there
has been a proper analysis of the speed-up gains of strict aliasing,
given that it does appear to be a problem for some second-rate
programmers. Furthermore, I think the way that C++ has doubled down on
this by requiring the use of std::launder for any case where a pointer
cast is not "pointer interconvertible" is a mistake. Too many obscure
technical rules launched at programmers because a compiler vendor has
asserted that it might make 1% of code 0.5% faster seems to me to be
the wrong balance.
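[A minimal sketch of the first-member cast Chris describes, shaped like
the POSIX sockaddr pattern but with simplified, hypothetical struct
names so it stands alone:]

    #include <cstdint>
    #include <cstdio>

    // Hypothetical stand-ins for sockaddr / sockaddr_in: every address
    // type starts with the same generic part.
    struct generic_addr { std::uint16_t family; };

    struct ipv4_addr {
        generic_addr  base;    // first member: the generic type
        std::uint16_t port;
        std::uint32_t addr;
    };

    // An API that, like bind() or connect(), takes the generic type.
    void describe(const generic_addr *ga)
    {
        std::printf("family %u\n", unsigned(ga->family));
    }

    int main()
    {
        ipv4_addr a{{2 /* AF_INET-like tag */}, 8080, 0x7f000001};

        // A pointer to a standard-layout struct is pointer-
        // interconvertible with a pointer to its first member, so this
        // cast and the access through it are well-defined:
        describe(reinterpret_cast<const generic_addr *>(&a));
    }

[POSIX achieves the same effect with struct sockaddr_in and struct
sockaddr (via compatible leading members rather than literal nesting);
the point is that this cast is defined behaviour, not aliasing abuse.]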
David Brown <david.brown@hesbynett.no>: Feb 16 11:25PM +0100

On 16/02/2021 22:35, Chris Vine wrote:

> obscure technical rules launched at programmers because a compiler
> vendor has asserted that it might make 1% of code 0.5% faster seems to
> me to be the wrong balance.

That is a valid argument. However, efficient code is made from the sum
of many small optimisations (either ones that are often applicable but
only make a small difference, or ones that make a larger difference but
are only rarely applicable). When you start saying "we'll make this
change in the language because people get it wrong", where do you stop?
Should you also make signed overflow defined, because some people think
it is? Should you add checks for pointers being non-zero before
dereferencing them, because some people get it wrong and many of the
checks can be optimised away?

Casting pointers is /dangerous/. It is lying to the compiler - it is
saying that an object has one type, but you want to pretend it is a
different type. Many other programming languages don't allow anything
equivalent to such conversions. However, it can be useful on occasion
in low-level code, which can usually be left to a few programmers who
understand the issues. The same applies in C++ - std::launder is
likely to find use in implementing memory pools and specialist
allocators, not in normal application code. It is also part of the
move towards defining a pointer provenance model for C and C++, to
improve alias tracking (for knowing when apparently different pointers
may alias, and for being sure that pointers of similar types do not
alias).
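[A minimal sketch - C++17, hypothetical names, heavily simplified - of
the memory-pool flavour of std::launder use mentioned above:]

    #include <new>

    // A one-slot "pool": raw storage that objects are created into
    // with placement new.
    struct slot {
        alignas(int) unsigned char storage[sizeof(int)];

        int *create(int value)
        {
            return new (storage) int(value);   // placement new
        }

        // Precondition (assumed): create() has been called.
        int *get()
        {
            // The storage array and the int living inside it are not
            // pointer-interconvertible, so after the reinterpret_cast
            // it is std::launder that makes the pointer valid for
            // reaching the int object placement new created there.
            return std::launder(reinterpret_cast<int *>(storage));
        }
    };

    int main()
    {
        slot s;
        s.create(42);
        return *s.get() == 42 ? 0 : 1;   // well-defined: returns 0
    }

[Here launder blesses the pointer recomputed from the storage address;
ordinary application code that never creates objects inside raw storage
has no reason to reach for it.]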
Juha Nieminen <nospam@thanks.invalid>: Feb 16 02:10PM

>> the code in such a manner that it didn't trigger the bug).
>> (If you are curious, the compiler in question was sdcc.)
> That's a "Heisenbug". Observing the bug affects its behavior.

If you are curious about why modifying the code affected the bug, it
was because the compiler was generating code that wrote to the wrong
part of the stack, which would thus corrupt something that some
function higher up the call stack was using. (I started suspecting
that to be the case when the program was crashing on a 'return'.
Indeed, the return address was being corrupted somewhere deep down the
call stack.)

Rather obviously, if you wrote any code that added (or removed)
anything from the stack, the thing that would get corrupted would
likewise change.
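[For concreteness, a deliberately broken sketch - hypothetical,
undefined behaviour by design, not something to run in earnest - of the
mechanism described: a stray stack write lands on state a caller
depends on, and changing the stack layout moves what gets hit:]

    #include <cstring>

    void broken_callee()
    {
        char buf[8];
        // Writes 32 bytes into an 8-byte stack buffer: the excess
        // lands on whatever sits above buf in the frame - saved
        // registers, the return address, or a caller's locals.
        std::memset(buf, 0xAB, 32);
    }   // a crash here, on 'return', is the classic symptom

    void caller()
    {
        broken_callee();
        // Adding or removing locals here changes the stack layout,
        // which moves what gets corrupted - hence the "Heisenbug"
        // behaviour when the code is modified.
    }

[In Juha's case the bad write came from the compiler's own generated
code rather than from the source, but the corruption mechanism is the
same.]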
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |