- Performance of unaligned memory-accesses - 9 Updates
- newbie question: exceptions - 2 Updates
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 14 04:53PM -0700 On Wednesday, August 14, 2019 at 2:51:32 PM UTC-4, Bonita Montero wrote: [attributions stripped by Bonita] > > you want it to. > In theory, but in practice it has an expectable behaviour depending > on the platform. Undefined behavior includes, as one possibility "expectable behavior depending upon the platform", so that fact does not, in any way, conflict with his statement. |
Bonita Montero <Bonita.Montero@gmail.com>: Aug 15 07:11AM +0200 > So it may change because of change of compiler, its version, compiling > options, device that it is run on, settings of operating system or > other (possibly unrelated) code in same code-base. There's no reason why a compiler might get incorrect code because of this. It's undefined behaviour only because of the runtime behaviour of the CPU / OS. |
"Öö Tiib" <ootiib@hot.ee>: Aug 14 11:37PM -0700 On Thursday, 15 August 2019 08:11:52 UTC+3, Bonita Montero wrote: > There's no reason why a compiler might get incorrect code because of > this. It's undefined behaviour only because of the runtime behaviour > of the CPU / OS. It is all interrelated. The compiler translates according to its settings and uses the CPU operations and OS calls that those settings allow. It does not change its settings because you used #pragma pack(1) in some other, potentially unrelated translation unit, and it certainly can't recompile the OS because of that. |
Bonita Montero <Bonita.Montero@gmail.com>: Aug 15 09:25AM +0200 No, that's stupid. It simply depends on the CPU / OS at runtime. And not on the compiler. |
David Brown <david.brown@hesbynett.no>: Aug 15 10:15AM +0200

On 15/08/2019 00:04, Keith Thompson wrote:
> behavior for which this document imposes no requirements
> where "this document" is the C++ standard. If something other than
> the standard imposes requirements, it's still undefined behavior.

True. But in this case, nothing else imposes requirements or definitions either, leaving the behaviour undefined. Neither Bonita nor anyone else has given any references, pointers to documentation or manuals, or other indications that their C or C++ compiler supports unaligned accesses via normal pointers.

So this /is/ undefined behaviour - nothing has defined the behaviour in question. But that does not imply that there is unexpected or undesirable behaviour - it seems that in all testing that has been done, compilers treated the unaligned accesses as though they /were/ defined, in the expected way, and that is what Bonita desires.

> If you want to talk about unexpected or undesirable behavior,
> call it that. That's not what "undefined behavior" means.

That distinction is important. However, the behaviour here is undefined, but it is not unexpected and not (to Bonita) undesirable.

Let me try to summarise what we know about unaligned accesses, and I think you will see what I mean.

At the hardware level, we know:

1. Many cpus support unaligned accesses (though perhaps with limitations, such as slower access or lacking features like locks or atomic guarantees). Examples include x86 and Arm-64.

2. Many cpus trap on unaligned accesses. Examples include SPARC, Arm-32.

3. Some cpus have silent corruption on unaligned accesses, such as the MSP430.

4. For some cpu families, there are members that support unaligned accesses, and members that trap, despite having broadly compatible binary code. That includes the 68K family.

At the OS level, we know:

1. Some OS's have trap handlers that emulate the unaligned access for processors that don't support it in hardware. That includes at least some versions of Solaris for SPARC, and some (but not all) 32-bit ARM Linux systems.

2. Other OS's don't have such handlers.

At the language standards level, we know:

1. The C standards clearly state that creating unaligned pointers is undefined behaviour in most cases. (It may be possible, using something like memcpy.) Dereferencing unaligned pointers is explicitly undefined behaviour.

2. The C++ standards are not as clear to me (due to lack of familiarity), but I believe using unaligned pointers is also undefined in C++.

3. In C and C++, the actual alignments are implementation dependent, and are not required to be the same as the sizes of the types or the "natural" alignment preferred by the hardware.

4. The standards rarely say something is "not allowed", but explicitly stating that something is "undefined behaviour" is usually interpreted as a strong suggestion to avoid the behaviour in portable code, and to avoid it in non-portable code unless the compiler explicitly defines the behaviour. For safe and reliable coding standards, this is often a rule rather than a suggestion.

At the compiler level, we know:

1. You can always access unaligned data by char, or by using memcpy. Many compilers will turn common patterns into optimal instructions for the target, while the source code remains safe, portable and fully defined on all toolchains and targets.

2. Many compilers support extensions that can be used to safely and reliably access unaligned data in convenient ways, such as "packed" structs or MSVC's "__unaligned" keyword. These methods are documented, and will work correctly even on targets that don't support unaligned access in hardware.

3. Compilers try to minimise surprises, unexpected code generation, and behaviour changes for code that "worked with an older version".

4. Compilers try to maximise efficiency of correct code. This is sometimes in conflict with point 3.

5. Compilers optimise on the assumption that certain types of undefined behaviour do not occur (or at least, the programmer doesn't care what happens if it does occur). This applies to /all/ compilers. But some compilers are more aggressive about this than others.

6. No tests have yet shown a compiler optimising on the assumption that pointers are aligned (for targets that support unaligned accesses). That may be luck, limitations of tests, or limitations of optimisations. It may also be intentional but undocumented behaviour for the compiler.

7. No references have been found for any compiler that explicitly documents supporting unaligned accesses (other than via specific extensions).

8. Compilers are often poor at issuing warnings or errors about undefined behaviour or potentially undefined behaviour, such as making or using unaligned pointers.

9. Future directions of compiler technology, including link-time optimisations and pointer provenance tracking, make it a serious risk that compilers /will/ optimise code on the assumption that pointers are valid (including being aligned). Hopefully, this will also improve their ability to warn about potential problems.

So we know that general unaligned accesses usually appear to work as expected and as desired (by those that use them). But we know that this is not documented or defined anywhere, not guaranteed, and certainly not portable to all types of processor. We know it is often considered bad, or at least risky, programming practice from the general rule of "don't rely on undefined behaviour". We know that despite working now, it might fail in the future. We know that some programmers are quite happy with "it worked when I tried it" coding, without concern for guarantees or documentation of behaviour.

To me, using unaligned pointers is therefore a fine example of undefined behaviour, and Bonita's blasé attitude is an example of what is often wrong in the programming world. |
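Editorial sketch of the first compiler-level point in the message above (accessing unaligned data by char or memcpy). The helper names load_u32 and store_u32 are hypothetical and do not come from the thread; the sketch assumes the caller guarantees that four bytes are readable or writable at the given address. No misaligned std::uint32_t pointer is ever formed, so the code stays defined regardless of what the target CPU does with unaligned loads.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the thread): read a 32-bit value from any
// byte address, aligned or not. The bytes are copied into a properly
// aligned local object, so no unaligned pointer is dereferenced.
inline std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Hypothetical counterpart: write a 32-bit value to any byte address.
inline void store_u32(unsigned char* p, std::uint32_t value) {
    std::memcpy(p, &value, sizeof value);
}
```

The value is stored and read back in the host's native byte order; a wire format with a fixed byte order would need an explicit conversion on top of this.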
"Öö Tiib" <ootiib@hot.ee>: Aug 15 03:41AM -0700 On Thursday, 15 August 2019 10:25:49 UTC+3, Bonita Montero wrote: > No, that's stupid. It simply depends on the CPU / OS at runtime. > And not on the compiler. Hundreds of millions of programmers have expressed opinions that what C or C++ compilers actually do, have started to do, or have stopped doing is stupid. Yet the only things that matter in such disputes are specifications. Without citations and quotes of specifications (or improvement requests for them), their opinions have remained hollow, muffled and meaningless, for decades. Why would you be different? |
Bonita Montero <Bonita.Montero@gmail.com>: Aug 15 02:25PM +0200 > Yet the only things that matter in such disputes are specifications. No, the thing that matters is the compiler you target. And pointers are internally numbers on all processors; they aren't technically distinguishable. So the compiler usually can't even see that it is compiling an unaligned access. And there's no logical reason that a compiler might not do this. The reason why this might not run depends only on the CPU and / or OS at runtime. |
David Brown <david.brown@hesbynett.no>: Aug 15 04:06PM +0200

On 15/08/2019 14:25, Bonita Montero wrote:
>> Yet the only things that matter in such disputes are specifications.
> No, the thing that matters is the compiler you target.

Is your code only ever compiled with one version of one compiler, with one set of options, one set of libraries (static and dynamic), for one processor and one host (a single specific version and set of libraries), and with one set of source code that does not change?

Sometimes that is appropriate. I have written code like that, where it was acceptable to depend on certain behaviour which was confirmed by examining the generated assembly, and where /everything/ involved in code generation was tied down so tightly that "it gives the code I want when I tried it" is actually good enough. I severely doubt the same thing applies to the code /you/ write - it very rarely does, outside of constrained embedded systems.

For anyone else, what matters is that the toolchain documents the behaviour for the code you write. It doesn't matter if the documentation is in the C or C++ specifications, the host system's ABI, the compiler manual, or any other appropriate documents. Use of normal pointers with unaligned values is not documented anywhere - ergo, it is undefined behaviour. And that /does/ matter. At least, it matters to people who care about quality in their code and their development practices.

> And pointers
> are internally numbers on all processors;

Really? You know this for a fact? On /all/ processors? You don't imagine that on some processors, pointers combine addresses with valid ranges, or security features, or access control, or bits controlling caching or buffering features? Many of these are only found on obscure systems. You would not be extrapolating from your knowledge of Windows and the x86 world and making assumptions about every processor or every system again?

I have personally used at least four compilers and processors where pointers are not merely a "number" - where you can't, for example, assume that you can find the distance between unrelated objects by simply subtracting the pointers.

> they aren't technically distinguishable.

You do know that the compiler can implement pointers in all sorts of ways, in certain circumstances? And it can do odd things with them, or assume things that it cannot when dealing with simple numbers?

    void inc(int *p) {
        (*p)++;
    }

    int addtwo(int x) {
        inc(&x);
        inc(&x);
        return x;
    }

gcc compiles "addtwo" as though it were just:

    int addtwo(int x) {
        return x + 2;
    }

The "just number" pointers have disappeared entirely.

Depending on the circumstances, the compiler can often know that two pointers cannot possibly be equal even though it does not know their values (such as when they are different types).

At the cpu level, on many processors, a pointer is just a number. But C and C++ are not assembly.

> compiling an unaligned access. And there's no logical reason that a compiler
> might not do this. The reason why this might not run depends only on
> the CPU and / or OS at runtime.

If the compiler doesn't know anything about a pointer, it will use it blindly - since using an invalid pointer would be undefined behaviour, it can assume the pointer is valid and can be used directly. But it can still make assumptions about it.

    int foo(int * p) {
        *p = 1;
        return (p == 0);
    }

gcc compiles that to:

    foo(int*):
        movl $1, (%rdi)
        xorl %eax, %eax
        ret

Note that the compiler assumes "p" is not zero. The pointer is not just treated as a number.

And if you write:

    int bar(int * p) {
        *p = 1;
        return ((uintptr_t) p) & 0x03;
    }

the compiler /could/ implement that as:

    bar(int*):
        movl $1, (%rdi)
        xorl %eax, %eax
        ret

No compiler I tested does that, but compilers are allowed to do so. Unless, of course, they document their behaviour with unaligned pointers - which no compiler (that I have seen) does. |
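Editorial sketch related to the "bar" example in the message above: if a compiler may assume that an int* is aligned, an alignment test written against the typed pointer can legally be folded away, so any such test belongs on the raw byte address before it is converted to a typed pointer. The helper name is_aligned_for is hypothetical and not from the thread. The sketch also assumes that converting a pointer to std::uintptr_t preserves the low address bits, which is implementation-defined but holds on mainstream flat-memory targets.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): reports whether the raw
// address p satisfies the alignment requirement of T. It takes
// const void* on purpose, so the compiler has no typed pointer from
// which it could infer that the address is already aligned.
template <typename T>
bool is_aligned_for(const void* p) {
    // Assumption: the integer value mirrors the machine address.
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A caller would typically use a check like this to choose between a direct typed access and a byte-wise copy, rather than casting first and testing afterwards.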
Bonita Montero <Bonita.Montero@gmail.com>: Aug 15 04:21PM +0200 > one set of options, one set of libraries (static and dynamic), for one > processor and one host (a single specific version and set of libraries), > and with one set of source code that does not change? That's plainly idiotic! No compiler will prevent unaligned accesses. That's a pure runtime issue. |
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 14 07:47PM -0400

On 8/14/19 1:39 PM, Alf P. Steinbach wrote:
> On 14.08.2019 14:41, James Kuyper wrote:
>> On 8/14/19 4:25 AM, Alf P. Steinbach wrote:
>>> On 14.08.2019 10:06, Jivanmukta wrote:
...
>>> allocation internally depends on the C++ implementation.
>> True, but only in the same sense that it is equally true of passing an
>> argument to a function
...
> dynamically allocated. Deallocation is a matter of destroying all owning
> smart pointers to that exception object. Which does not guarantee to
> deallocate immediately, but ensures that proper cleanup is done.

It's true that "The referenced object shall remain valid at least as long as there is an exception_ptr object that refers to it." (21.8.6p8). Depending upon where it's stored, the lifetime of the exception_ptr returned by current_exception() might be a lot longer than the end of the block for that exception's handler. Therefore, the referenced object cannot, in general, be an object with automatic storage duration local to that block.

However, other parts of that same clause say:

"Returns: An exception_ptr object that refers to ... or a copy of the currently handled exception ... It is unspecified whether the return values of two successive calls to current_exception refer to the same exception object. [ Note: That is, it is unspecified whether current_exception creates a new copy each time it is called. -- end note ] If the attempt to copy the current exception object throws an exception, the function returns an exception_ptr object that refers to the thrown exception or, if this is not possible, to an instance of bad_exception. [ Note: The copy constructor of the thrown exception may also fail, so the implementation is allowed to substitute a bad_exception object to avoid infinite recursion."

Are those the semantics of a shared pointer to the current exception? As I understand the concept, a shared pointer to something actually points at that thing, rather than at a copy of that thing. Given that the referenced object can be a copy of the currently handled exception, rather than the exception itself, the lifetime of the referenced object doesn't impose any requirements on how the exception itself is handled. It's entirely feasible for dynamic allocation to only occur when needed to store that copy.

>> (18p1), so none of those exceptions apply.
> Yes, but it seems that here you went out on a tangent, into the land of
> irrelevancies.

Those clauses demonstrate that the exception itself must have automatic storage duration; a fact very relevant to the point I was making. Allocating it dynamically is only allowed insofar as covered by the as-if rule (the same is true of function parameters, for similar reasons). |
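Editorial sketch of the lifetime guarantee quoted in the message above: an exception_ptr obtained from std::current_exception() can outlive the handler that produced it, and the object it refers to stays valid while any copy of the exception_ptr exists. The helper names capture and message_of are hypothetical and not from the thread. Whether the exception_ptr refers to the original exception object or to a copy is unspecified, so the sketch relies only on the message text being preserved.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the thread): throws, then captures the
// currently handled exception inside the handler and returns the
// exception_ptr so that it outlives the handler's block.
inline std::exception_ptr capture() {
    std::exception_ptr p;
    try {
        throw std::runtime_error("boom");
    } catch (...) {
        p = std::current_exception();
    }
    return p;
}

// Hypothetical helper: rethrows the referenced exception and returns its
// message. Precondition: p is not null (rethrowing a null exception_ptr
// is undefined behaviour).
inline std::string message_of(const std::exception_ptr& p) {
    std::string msg;
    try {
        std::rethrow_exception(p);
    } catch (const std::exception& e) {
        msg = e.what();
    }
    return msg;
}
```

Calling message_of on the result of capture() long after the original handler has exited still yields the original message, which is the behaviour the quoted requirement is there to guarantee.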
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 15 05:13AM +0200

On 15.08.2019 01:47, James Kuyper wrote:
> also fail, so the implementation is allowed to substitute a
> bad_exception object to avoid infinite recursion."
> Are those the semantics of a shared pointer to the current exception?

It's the semantics of a shared pointer. Pointing to either the current exception or a copy.

> As
> I understand the concept, a shared pointer to something actually points
> at that thing, rather than at a copy of that thing.

Yes.

I have the feeling that in your mind you felt that you were making a point here, arguing against something. You're not, but, provided that my interpretation is correct, you wouldn't have written it if you were aware of that. To convince yourself that you were making a meaningless statement, try to quote the context that you were thinking of.

> referenced object doesn't impose any requirements on how the exception
> itself is handled. It's entirely feasible for dynamic allocation to only
> occur when needed to store that copy.

Yes.

>> irrelevancies.
> Those clauses demonstrate that the exception itself must have automatic
> storage duration;

No, they don't.

The temporary T constructed in the throw statement, if there is one, has automatic storage duration, that's all. Then there is an exception object, let's call it E, created as a copy of T, that conceptually lives somewhere else: it conceptually does not live on the stack, because the stack is unwound, all variables in the abandoned frame destroyed, but it might be copied/moved around. Details.

The first quote is about the scope of the name declared in a `catch` clause. That says nothing about T or E.

The second quote is about block scope variables. T, if there is a T, is a temporary so it's not a variable, and E doesn't have block scope.

One implementation strategy, at the machine code level, could be to just let that temporary continue to live down in the old abandoned stack frame until a handler is found, whence it can be copied up into the exception handling stack frame. But that's just me speculating: it's a technical possibility, not a strategy I've seen by inspecting compiler code. The only C++ compiler code I've looked at was g++, when they used original C function declarations. The idea then was to implement an exception handling control structure I'd argued for, but the code was so ugly I left it.

> a fact very relevant to the point I was making.

Not sure what the point was. :)

> Allocating it dynamically is only allowed insofar as covered by the
> as-if rule (the same is true of function parameters, for similar reasons).

Yes, which means it's allowed.

Cheers!,

- Alf |