- Is this really necessary - 19 Updates
- A way around _HAS_ITERATOR_DEBUGGING - 1 Update
| Bonita Montero <Bonita.Montero@gmail.com>: Jun 05 12:53PM +0200 Consider the following code:
    size_t s( int *begin, int *end ) { return (end - begin) * sizeof(int); }
This is MSVC's disassembly:
    sub rdx, rcx
    and rdx, -4
    mov rax, rdx
    ret 0
This is gcc's disassembly:
    movq %rsi, %rax
    subq %rdi, %rax
    ret
Is the masking really necessary, or is it just an optimizer weakness? It seems to me that MSVC sees that two bits are shifted out and back "in" and replaces this with a mask. I think that the language standard assumes equally aligned data among same types for the above code and that gcc is correct, but I'm not sure. |
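For readers who want to check the arithmetic in the question, here is a minimal sketch. It assumes both pointers point into the same `int` array, so their byte distance is a multiple of `sizeof(int)`; under that assumption, clearing the low bits (which is what `and rdx, -4` does when `sizeof(int)` is 4) cannot change the result. The name `byte_span_masked` is hypothetical, and it only models the masking step, not MSVC's actual code generation.

```cpp
#include <cassert>
#include <cstddef>

// The function from the post: number of bytes between two int pointers.
std::size_t s(int *begin, int *end)
{
    return (end - begin) * sizeof(int);
}

// Hypothetical model of the masked variant: take the raw byte distance
// and clear the low bits. Assumes sizeof(int) is a power of two and that
// both pointers point into the same int array.
std::size_t byte_span_masked(int *begin, int *end)
{
    assert(begin <= end);  // keep the sketch to non-negative distances
    std::ptrdiff_t bytes =
        reinterpret_cast<char *>(end) - reinterpret_cast<char *>(begin);
    return static_cast<std::size_t>(bytes) & ~(sizeof(int) - 1);
}
```

For any two pointers into one `int` array, both functions should return the same value, which is consistent with the replies that call the mask redundant but harmless.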
| David Brown <david.brown@hesbynett.no>: Jun 05 02:19PM +0200 On 05/06/2021 12:53, Bonita Montero wrote: > I think that the language-standard assumes equally aligned data > among same types for the above code and gcc is corcect, but I'm > not sure. I can't help thinking that putting a non-aligned value into an int pointer is undefined behaviour - but I can't find a reference to that in the C standards (and I don't know the C++ standards well enough to look there). However, when you subtract two pointers, the behaviour is only defined if they both point to elements inside the same array (or one past the end). It doesn't matter how they are aligned, as the offset from the standard "int" alignment will be the same for both pointers. (This applies also to 8-bit systems where you might have 16-bit int but 8-bit alignment for int.) Thus gcc's code is fine. MSVC has a tradition of being more conservative in its optimisations, while gcc has a tradition of expecting people to write valid code and optimise on the assumption that undefined behaviour does not occur. This is, I think, because MSDOS and Windows programmers have a tradition of assuming their code runs on one platform and one compiler, and "it worked when I tested it" means "it is correct". (Your own misstatements about undefined behaviour have demonstrated that.) gcc users are more likely to understand that their code could be used on different platforms and different processors, and pay a bit more attention to the rules of the language. |
| Bonita Montero <Bonita.Montero@gmail.com>: Jun 05 02:27PM +0200 > look there). However, when you subtract two pointers, the behaviour > is only defined if they both point to elements inside the same array > (or one past the end). ... And in a struct with two ints ? > MSVC has a tradition of being more conservative in its optimisations, > ... MSVC is not more conservative, but more stupid because it lacks many safe optimizations. |
| David Brown <david.brown@hesbynett.no>: Jun 05 02:33PM +0200 On 05/06/2021 14:27, Bonita Montero wrote: >> is only defined if they both point to elements inside the same array >> (or one past the end). ... > And in a struct with two ints ? No. Subtraction of pointers is defined as the difference in their indexes within a single array. >> ... > MSVC is not more conservative, but more stupid because it lacks many > safe optimizations. I was not being judgemental about what is a good or bad implementation. I personally prefer gcc's philosophy, but I know other people have preferences that are somewhere in between. (You have posted in the past about your beliefs about how compilers handle some kinds of undefined behaviour - and shown why there is a market for such conservative compilers.) |
| "Öö Tiib" <ootiib@hot.ee>: Jun 05 05:38AM -0700 On Saturday, 5 June 2021 at 15:20:07 UTC+3, David Brown wrote: > pointer is undefined behaviour - but I can't find a reference to that in > the C standards (and I don't know the C++ standards well enough to look > there). C and C++ programs that use unaligned pointers are undefined (in the sense of the standard) regardless of the target architecture (which may allow unaligned accesses), but implementations can define it as an extension. > from the standard "int" alignment will be the same for both pointers. > (This applies also to 8-bit systems where you might have 16-bit int but > 8-bit alignment for int.) Thus gcc's code is fine. The MS code is also fine, as that masking should not have any ill effects on conforming code. > likely to understand that their code could be used on different > platforms and different processors, and pay a bit more attention to the > rules of the language. I think that the most important requirement of MSVC is that it should build good binaries out of Microsoft's own code base regardless of how tricky that code base is. gcc as a whole does not have that sort of obligation. Perhaps some people working on the gcc code base do, but they come from a wide variety of companies. |
| Bonita Montero <Bonita.Montero@gmail.com>: Jun 05 03:06PM +0200 > No. Subtraction of pointers is defined as the difference in their > indexes within a single array. That couldn't be true because you can cast any pointer pair to char *, subtract them and use the difference for memcpy(). |
| David Brown <david.brown@hesbynett.no>: Jun 05 03:29PM +0200 On 05/06/2021 15:06, Bonita Montero wrote: >> indexes within a single array. > That coudn't be true because you can cast any pointer-pair > to char *, subtract them and use the difference for memcpy(). Look it up. You can do lots of things in C and C++ that are syntactically correct, but might have undefined behaviour. |
| "Öö Tiib" <ootiib@hot.ee>: Jun 05 06:34AM -0700 On Saturday, 5 June 2021 at 16:06:41 UTC+3, Bonita Montero wrote: > > indexes within a single array. > That couldn't be true because you can cast any pointer pair > to char *, subtract them and use the difference for memcpy(). So it couldn't be true that the standard specifies it so: | If the expressions P and Q point to, respectively, elements | x[i] and x[j] of the same array object x, the expression P - Q has | the value i − j; otherwise, the behavior is undefined. I don't understand what supposedly stops it? |
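The quoted rule can be illustrated with a small sketch that stays entirely inside the defined case: both operands point to elements of one array, or one past its end. The names `index_distance` and `check_same_array` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: P - Q for two pointers into the same array.
// The result is the difference of the element indexes, not of the bytes.
std::ptrdiff_t index_distance(const int *p, const int *q)
{
    return p - q;
}

// Every pointer used here points into x or one past its end,
// so each subtraction falls under the quoted wording.
void check_same_array()
{
    int x[5] = {10, 20, 30, 40, 50};
    const int *first = &x[1];
    const int *last = x + 5;  // one past the end: fine to form, not to dereference

    assert(index_distance(last, first) == 4);   // i - j = 5 - 1
    assert(index_distance(first, last) == -4);  // the sign follows operand order
    assert(index_distance(&x[3], &x[3]) == 0);
}
```

The struct-with-two-ints case raised earlier in the thread is deliberately left out, since the two members are not elements of one `int` array and the quoted wording places that subtraction in the "otherwise" branch.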
| David Brown <david.brown@hesbynett.no>: Jun 05 03:41PM +0200 On 05/06/2021 14:38, Öö Tiib wrote: > The C and C++ programs that use unaligned pointers are undefined (in > sense of standard) regardless of the target architecture (that may allow > unaligned accesses) but the implementations can extend. Of course implementations can add whatever definitions they want beyond the requirements of the standard. And while dereferencing unaligned pointers is undefined behaviour (by the standards), I haven't found anything that says that merely assigning an unaligned value to a pointer is undefined behaviour. But that could easily be something I missed - hopefully someone can then give the reference (in the C or C++ standards). >> 8-bit alignment for int.) Thus gcc's code is fine. > Also MS code is fine as that masking should not have any ill effects > to conforming code. Sure. Suboptimal, but correct. > I think that most important requirement of MSVC is that it should build > good binaries out of Microsoft's own code base regardless how tricky > that code base is. That is a reasonable requirement! > The gcc as whole does not have that sort of obligations. gcc needs to be able to compile gcc and all its dependencies, libraries, etc. That in itself is a rather massive and complex code base, full of all kinds of weird stuff for historic reasons (including garbage collection, mixes of C and C++, and code that dates back 30+ years that no one really understands). They also work with the Linux kernel folk and distributions like Debian to test on a huge variety of existing software. I'm not sure whether you could call that an "obligation" or a "requirement" for gcc as a whole or, as you say, just for some people working on gcc. But it is certainly something they do in the process of testing and preparing releases. |
| "Öö Tiib" <ootiib@hot.ee>: Jun 05 08:18AM -0700 On Saturday, 5 June 2021 at 16:41:35 UTC+3, David Brown wrote: > an unaligned value to a pointer is undefined behaviour. But that could > easily be something I missed - hopefully someone can then give the > reference (in the C or C++ standards). When you attempt to make a pointer that is unaligned, the C++ standard says that the "resulting pointer value is unspecified", or uses equivalent wording, in a couple of places. I meant that usage such as dereferencing such an unspecified pointer value is undefined (unless the implementation gives some better guarantees). > all kinds of weird stuff for historic reasons (including garbage > collection, mixes of C and C++, and code that dates back 30+ years that > no one really understands). I agree. Still, at MS, if code that ran with compiler version A does not run with compiler version B, then it comes down to what is cheaper for the business: (1) to fix the undefined behavior in that code or (2) to adjust compiler B. Option (2) is more common with msvc than with gcc, although gcc also compiles undefined behaviors in popular benchmarks "correctly" (as an example of (2) with gcc). > whole or, as you say, just for some people working on gcc. But it is > certainly something they do in the process of testing and preparing > releases. That code base can't be declared sacred by a business or by being a popular benchmark. So there we see Linus being vulgar but complying and fixing such legacy code. |
| Bonita Montero <Bonita.Montero@gmail.com>: Jun 05 05:38PM +0200 > Look it up. > You can do lots of things in C and C++ that are syntactically correct, > but might have undefined behaviour. I don't believe that memcpy()ing this way is UB. |
| David Brown <david.brown@hesbynett.no>: Jun 05 05:51PM +0200 On 05/06/2021 17:38, Bonita Montero wrote: >> You can do lots of things in C and C++ that are syntactically correct, >> but might have undefined behaviour. > I don't believe that memcpy()ing this way is UB. Can you give an example of what you are thinking about? |
| Richard Damon <Richard@Damon-Family.org>: Jun 05 11:56AM -0400 On 6/5/21 11:38 AM, Bonita Montero wrote: >> You can do lots of things in C and C++ that are syntactically correct, >> but might have undefined behaviour. > I don't believe that memcpy()ing this way is UB. The memcpy might not be, but subtracting two pointers that don't point to elements of the same array is. |
| Richard Damon <Richard@Damon-Family.org>: Jun 05 11:58AM -0400 On 6/5/21 11:18 AM, Öö Tiib wrote: > C++ standard. I did mean usage like dereferencing of such unspecified > pointer value is undefined (unless implementation gives some better > guarantees). My understanding is that the unaligned pointer has an unspecified value that might be a trap value, so any operation that uses that value can cause Undefined Behavior. |
| MrSpook_rs7x@4hhtozmpj299zx.tv: Jun 05 04:00PM On Sat, 5 Jun 2021 14:19:52 +0200 >> not sure. >I can't help thinking that putting a non-aligned value into an int >pointer is undefined behaviour - but I can't find a reference to that in Alignment only matters on certain architectures, and even then, not always. E.g. this compiles and runs fine on x86 MacOS using clang, setting non-aligned ints on both the stack and the heap:
    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    int main()
    {
        uint32_t i;
        uint32_t j;
        uint32_t *p1;
        uint32_t *p2;

        /* Stack: start from the lower-addressed of the two locals. */
        p1 = (&j < &i ? &j : &i);
        p2 = (uint32_t *)((char *)p1 + 1);  /* deliberately misaligned by one byte */
        printf("p1 = %p, p2 = %p\n",(void *)p1,(void *)p2);
        *p2 = (uint32_t)-1;
        printf("*p2 = %u\n",*p2);

        /* Heap: the same experiment on malloc'd storage. */
        p1 = (uint32_t *)malloc(sizeof(uint32_t) * 2);
        p2 = (uint32_t *)((char *)p1 + 1);
        printf("p1 = %p, p2 = %p\n",(void *)p1,(void *)p2);
        *p2 = (uint32_t)-1;
        printf("*p2 = %u\n",*p2);
        free(p1);
        return 0;
    } |
| David Brown <david.brown@hesbynett.no>: Jun 05 06:16PM +0200 >> I can't help thinking that putting a non-aligned value into an int >> pointer is undefined behaviour - but I can't find a reference to that in > Alignment only matters on certain architectures, and even then , not always. The discussion is about behaviour that is not defined by the C and C++ standards. If you can point to /documentation/ that says clang on x86 MacOS defines the behaviour of unaligned access, that would be interesting. But other than that, a sample of "this example happens to work on this compiler with these flags on this target" is irrelevant. We all know that on many - but not all - cpu targets, unaligned accesses work as expected, albeit usually at a performance cost. But we are talking about the standards definition of C++ here (and perhaps C, in that C++ inherits such things from C), not cpus. |
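As a point of comparison with the experiment above, a commonly used way to read or write a 32-bit value at an arbitrary byte address, without ever forming a misaligned `uint32_t *`, is to copy the bytes with `memcpy`. The sketch below is not something proposed in the thread; `load_u32`, `store_u32` and `round_trip_at_offset` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a uint32_t from any byte address.
// memcpy copies bytes, so no alignment requirement applies to src.
std::uint32_t load_u32(const unsigned char *src)
{
    std::uint32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}

// Hypothetical helper: write a uint32_t to any byte address.
void store_u32(unsigned char *dst, std::uint32_t value)
{
    std::memcpy(dst, &value, sizeof value);
}

// Store and reload all-ones at a chosen byte offset inside a small buffer.
std::uint32_t round_trip_at_offset(std::size_t offset)
{
    unsigned char buffer[2 * sizeof(std::uint32_t)] = {};
    assert(offset <= sizeof(std::uint32_t));  // keep the 4-byte access inside the buffer
    store_u32(buffer + offset, 0xFFFFFFFFu);
    return load_u32(buffer + offset);
}
```

Calling `round_trip_at_offset(1)` mirrors the "+ 1" offset used in the posted program, but only `unsigned char` pointers are ever offset, so the question of what a misaligned `uint32_t *` means does not arise.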
| Bo Persson <bo@bo-persson.se>: Jun 05 07:50PM +0200 > printf("*p2 = %u\n",*p2); > return 0; > } You did notice (right?) that the optimiser transformed
    printf("*p2 = %u\n",*p2);
into
    printf("*p2 = %u\n", (uint32_t)-1);
resulting in the code
    00007FF71990106D mov rcx,rsi
    00007FF719901070 mov edx,0FFFFFFFFh
    00007FF719901075 call printf (07FF719901090h) |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Jun 05 05:42PM -0400 On 6/5/21 8:19 AM, David Brown wrote: ... > pointer is undefined behaviour - but I can't find a reference to that in > the C standards (and I don't know the C++ standards well enough to look > there). It's not something either standard says explicitly. Rather, it's something that needs to be derived from what it says about other things. If you start from a pointer that is correctly aligned for its type, most pointer operations give a result that still points to the same type, and is still correctly aligned for that type. This includes conversion to an integer type and back to the original pointer type. The only operations that could result in a mis-aligned pointer all have undefined behavior for one reason or another. For instance, conversion to a pointer to a more strictly aligned type has undefined behavior if the original pointer doesn't meet the alignment requirements of the new type. Converting a pointer to an integer, performing any kind of arithmetic on that integer to produce a different integer value, and converting back again, has undefined behavior due to the omission of any explicit definition of the behavior. So what about starting with a mis-aligned pointer? If you have a packed struct, the members of that struct might not be correctly aligned for their type. But packing a struct is not a core language feature - it's only available as an extension. On platforms with strong alignment requirements, implementations that allow struct packing will generally provide warnings about ways you should not use normal pointers to access objects that might be misaligned. You could also take an integer that represents a memory location that is not correctly aligned for a given type, and convert it into a pointer to that type - but the behavior of such a conversion is undefined. 
I'm not sure that the above argument covers every possibility, but I do believe that every possible way of getting a misaligned pointer is covered, in some fashion, by both standards. |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Jun 05 07:08PM -0400 Apologies to David, who has already received two versions of this message as e-mail, because I keep hitting the Thunderbird "Reply" button instead of their new "Followup" button. On 6/5/21 9:29 AM, David Brown wrote: >> That coudn't be true because you can cast any pointer-pair >> to char *, subtract them and use the difference for memcpy(). > Look it up. That works in C because every C object can be accessed as an array of char. C++ allows more complicated possibilities, including objects that are not contiguous. But it works in C++ if the relevant objects are required to be contiguous. If two such objects are both sub-objects of the same larger object, the difference between those pointers satisfies that requirement, otherwise the subtraction is undefined. |
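The sub-object case described above can be sketched for the struct-with-two-ints example raised earlier in the thread. This follows the reading in the message (two members of one contiguous object, viewed through `char` pointers); whether the C++ standard's wording fully covers it is left open here, and `TwoInts`, `member_byte_distance` and `check_against_offsetof` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type: two int members of one contiguous object.
struct TwoInts
{
    int a;
    int b;
};

// Byte distance from t.a to t.b, taken through char pointers into the
// same enclosing object rather than by subtracting the int pointers.
std::ptrdiff_t member_byte_distance(const TwoInts &t)
{
    const char *pa = reinterpret_cast<const char *>(&t.a);
    const char *pb = reinterpret_cast<const char *>(&t.b);
    return pb - pa;
}

// Cross-check against offsetof, which is defined for standard-layout types.
void check_against_offsetof()
{
    TwoInts t{1, 2};
    std::ptrdiff_t expected =
        static_cast<std::ptrdiff_t>(offsetof(TwoInts, b)) -
        static_cast<std::ptrdiff_t>(offsetof(TwoInts, a));
    assert(member_byte_distance(t) == expected);
}
```

Where only the distance is needed, the `offsetof` difference by itself avoids the pointer subtraction altogether.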
| MrSpook_6qpp@dggw8.org: Jun 05 09:28AM On Fri, 4 Jun 2021 13:30:16 -0700 >would be fun. The teacher says something like: We are going to implement >several std containers under the namespace std_course. Imvvho, it would >be an interesting and worth while exercise. Writing a basic implementation of a doubly linked list used to be a fairly standard interview test for C programmers back in the day before C++ became popular. You'd be amazed (or maybe not) how many of them didn't have a clue what such a construct even was never mind how to implement it. |