- #include'ing .c files considered harmful? - 3 Updates
mickspud@potatofield.co.uk: Feb 15 05:16PM

On Mon, 15 Feb 2021 17:55:58 +0100
>memcpy is going to give you the same code.
>The key difference is that casting pointer types then using them to
>access data is often lying to the compiler - for all but a handful of

Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing network programming for a couple of decades and this method is used all over the place. No one does 50 memcpys if there's a memory structure with 50 fields in it just for the sake of ivory tower correctness, you'd have to be insane. A structure only has to be correct once in the header, memcpys have to be correct everywhere you use them.

If you don't believe me have a look in any of the /usr/include/linux network header files and then go through this and check out the casting to structs:

https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c

>exceptions, it is behaviour undefined by the standard. This means you
>can easily get something that works fine in your simple tests, but fails
>in more complex situations when code is inlined, link-time optimised, or

Rubbish. Maybe in Windows but that doesn't concern me.

>> for numeric values but why bother plus its unlikely to be very efficient.
>The point is that you have to have code for accessing the fields, you
>can't just use them directly. And when you have an accessor function

Wtf are you talking about? You just access them as structure fields. There may be a small cost in dereferencing but there's a large gain in code readability and correctness.

>more portable. People have been writing code to access network-defined
>or file format defined structures since C has been in existence, and
>#pragma pack is neither necessary nor sufficient for the task.

Whether it's pragma pack or attribute packed, it's used a lot in Linux.

    $ pwd
    /usr/include/linux
    $ grep __attribute__ *.h | grep packed | wc -l
    239

But what do they know?
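For illustration, the accessor-style approach mentioned in the quoted text might look like the minimal C sketch below. The struct demo_header layout (a 2-byte port followed by a 4-byte sequence number, both big-endian) and the names read_be16, read_be32 and decode_demo_header are hypothetical, chosen only for this example; they do not come from the thread or from any Linux header.

```c
#include <stdint.h>

/* Hypothetical wire layout, assumed for this sketch only:
 * bytes 0-1: port (big-endian), bytes 2-5: sequence number (big-endian). */
struct demo_header {
    uint16_t port;
    uint32_t seq;
};

/* Assemble a 16-bit big-endian value from individual bytes.
 * Reading through unsigned char is always permitted by the aliasing
 * rules, and the result does not depend on host endianness. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Assemble a 32-bit big-endian value from individual bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the whole (hypothetical) header in one place, so the field
 * offsets are written down once rather than at every use. */
static struct demo_header decode_demo_header(const unsigned char *buf)
{
    struct demo_header h;
    h.port = read_be16(buf);
    h.seq  = read_be32(buf + 2);
    return h;
}
```

Here the byte offsets live in a single decode function, and the rest of the code works with an ordinary, unpacked struct. Whether this is preferable to a packed struct plus a pointer cast is exactly what the thread is arguing about; the sketch only shows what the alternative could look like.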
James Kuyper <jameskuyper@alumni.caltech.edu>: Feb 15 03:54PM -0500

> On Mon, 15 Feb 2021 17:55:58 +0100
> David Brown <david.brown@hesbynett.no> wrote:
...
> Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing
> network programming for a couple of decades and this method is used all over
> the place.

The C++ rules are stricter than the C rules, but it's also a problem in C. Type punning is standard C, but there are restrictions on when it can safely be used. Those restrictions are defined in terms of the "effective type" of a piece of memory.

For objects with a declared type, the effective type is the same as the declared type. For memory with no declared type (which basically means dynamically allocated memory), the effective type is set by the last store into that memory using a non-character type T. If you use an lvalue of type T to store the value, then the memory has an effective type of T. If you use methods such as memcpy() or memmove() to copy an entire object over into such memory, or if you copy it over as an array of character type, that memory acquires the same effective type as the object it was copied from.

The relevant rule violated by many kinds of type punning is the anti-aliasing rule:

"An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type." (C standard, 6.5p7).
Since that "shall" occurs outside of a constraints section, type punning that violates the above rule has undefined behavior. Here's an example that shows what can go wrong as a result of violating that rule. Given:

    U func(T *pt, U *pu)
    {
        *pt = 0;
        return *pu;
    }

then *pt acquires the effective type of T. If U is not one of the types permitted by the anti-aliasing rule, a compiler is not obligated to consider the possibility that pt and pu might point to overlapping blocks of memory. It could, therefore, delay the write to *pt until after it has read the value of *pu. In such a simple piece of code, it's unlikely to do so, but in more complicated code there's a very good chance of such optimizations occurring.

Unions provide a way to avoid this problem (see 6.5.2.3p3, and pay attention to footnote 95), but that way only works if the object in question is actually of the union type, and only if the declaration of that union is in scope at the point where the problem could otherwise occur.

...
>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or

> Rubbish. Maybe in Windows but that doesn't concern me.

It's not just Windows - compilers that take advantage of the anti-aliasing rules to optimize code generation are quite common.
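As a rough sketch of the two routes mentioned in this thread that stay within the rule quoted from 6.5p7, the C fragment below reads the bit pattern of a float first with memcpy() and then through a union. The function names float_bits_memcpy and float_bits_union are made up for this example, and the sketch assumes that float is 32 bits wide. The union version relies on the C rules (6.5.2.3p3 and its footnote); it should not be carried over to C++, where the rules differ.

```c
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Copy the object representation byte by byte. memcpy() accesses the
 * storage as characters, so no lvalue of the wrong type is ever used. */
static uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Read a different member than the one last written. In C this
 * reinterprets the stored bytes; the object really is of union type
 * and the union declaration is in scope at the point of access. */
static uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

On an implementation that uses the IEEE 754 single-precision format (an assumption, not something the C standard requires), both functions would return 0x3f800000 for 1.0f. Mainstream optimizing compilers typically reduce the memcpy() version to a plain register move, which is the point made in the quoted "memcpy is going to give you the same code" remark.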
Paavo Helde <myfirstname@osa.pri.ee>: Feb 16 12:00AM +0200

>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or

> Rubbish. Maybe in Windows but that doesn't concern me.

FYI, the biggest "culprit" in this area has been gcc in recent years. It is keen to optimize away things which are formally UB, like infinite loops. For some pointer conversions it helpfully warns you that it is planning to break your code ("dereferencing type-punned pointer will break strict-aliasing rules"). For some other kinds of UB one might not get so lucky.

MSVC, on the other hand, is generally much more careful to keep alive tons of crap code produced by hordes of cowboy programmers during the last decades, only because such code accidentally happened to work at some time in the past.

And yes, this is C++, not C, the rules are different.