- Is this undefined behavior? - 2 Updates
- who's at fault, me or compiler? - 7 Updates
- Observable end padding in arrays - 3 Updates
- Link with library of exact filename (i.e. exact version) - 4 Updates
- Fixing some undefined behavior - 3 Updates
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jul 08 08:23AM -0700 > converting iterations into a recursive function, or is this > "permission" merely based on the lack of prohibition under the > umbrella of the "as if" rule? To a certain extent it is both. There is no specific statement in the C++ standard (or the C standard either) that an iterative function may be implemented using recursive object code, or vice versa. But the C++ standard does say (in intro.execution, p1) "[Conforming implementations] need not copy or emulate the structure of the abstract machine." Any such mapping does fall under the "as if" rule, so in that sense the freedom is implicit rather than explicit. At the same time there is an explicit freedom granted to disregard what the abstract machine would do (provided of course the "as if" requirements are met). A C++ implementation can be conforming even if the source code is "compiled" by translating it to pure Lisp and then running the Lisp code. Pure Lisp doesn't have any way of iterating; all it has is recursion. As long as the Lisp code produces the same output that the abstract machine would, the implementation is conforming. (Note: there are some other aspects of what is called "observable behavior" that I've left out, but that doesn't change the key point that compiling to a recursive-only environment such as pure Lisp can still be conforming.) > time) than iterative ones, so the source code may have well defined > behavior in the iterative form, but run into UB in the recursive form > due to resource exhaustion. I know some people think that running out of stack space (as easily might happen in a deeply recursive function) is necessarily undefined behavior. That's not right. Running out of stack space, whether because of recursion or otherwise, is a property of the execution environment, and the implementation has no control over that. Indeed the implementation might not even be able to discover if it's about to happen. 
These things do not affect whether a program has undefined behavior, which is determined solely by what is specified (or not) to happen in the abstract machine. I explained this stuff in more detail, sometime last year, in several postings in this newsgroup. If you would like more explanation, I can dig around and see if I can find some of that commentary, to help answer further questions. Of course I reserve the right to answer further questions directly, without making any reference to my previous comments. ;) |
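A minimal sketch of the "as if" point made above, assuming two hypothetical functions (`sum_iterative` and `sum_recursive` are illustration names, not anything from the thread). Both compute the same value in the abstract machine, so an implementation that emitted recursive object code for the loop would produce the same result. How much stack the recursive form needs is a property of the execution environment and is not visible in the source.

```cpp
#include <cassert>
#include <cstdint>

// Iterative form: sums 1..n with a loop.
std::uint64_t sum_iterative(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Recursive form: same result in the abstract machine,
// but each level of recursion may consume stack on a real machine.
std::uint64_t sum_recursive(std::uint64_t n) {
    return n == 0 ? 0 : n + sum_recursive(n - 1);
}
```

Calling both with a small argument (say 1000) from `main` and comparing the results is enough to see that they agree. Whether a very large argument exhausts the stack depends on the platform and the optimizer, which is exactly the part the posts above disagree about.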
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 09 12:59AM +0200 On 08.07.2020 17:23, Tim Rentsch wrote: > These things [like stack usage] do not affect whether a > program has undefined behavior, which is determined solely by what > is specified (or not) to happen in the abstract machine. No. UB is not solely a static property of a program. It's also a dynamic property, such as exceeding a buffer size, or an implementation limit. I seem to have conceded your POV that a C or C++ compiler /can/ translate to recursive function implementation. But only by then supposing that a compiler /can/ introduce likely dynamic UB, or at the very least remove a conditional guarantee of well defined operation. So this is now the point where we differ. In my view having standards that allow that is very ungood. But it's a thorny problem. The "conditional" I mention is because there are a zillion possible dynamic UB sources, such as a `bool` variable: the compiler is free to willy-nilly decide that in this particular compilation `sizeof(bool)` is, say, 2M, and furthermore to know that its implementation limit of stack size is less, then prove to itself that hence the `main`, which here happens to have a `bool` temporary, would incur UB, hence that any behavior added to `main` would be fine. It's the anything goes. And again, that happens because an informal, practically oriented standard is treated as a precise formal work, which it clearly isn't. - Alf |
boltar@nowhere.co.uk: Jul 08 09:05AM On Tue, 7 Jul 2020 15:15:41 +0000 (UTC) >integers and do effectively a memset() on that memory block >to zero it. Any code that accesses the members just use >offsets from the beginning of that memory block. I was using "stored" in a rather liberal sense. I didn't mean it had to be some kind of lookup table but if you have struct mystruct { int i; char c; short s[5]; int j; }; then the binary needs to be aware that (assuming no padding) s is 5 bytes away from i in memory and j is a further 10 bytes beyond s, however that awareness is stored internally. |
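The offsets in the post above assume no padding. As a rough sketch, the offsets an implementation actually chooses can be inspected with `offsetof`. The helper `describe_layout` is a hypothetical name used only for this illustration. On many common platforms (4-byte `int`, 2-byte `short`) a padding byte is inserted after `c`, so the real numbers differ from the no-padding ones, but that is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The struct from the post, unchanged.
struct mystruct {
    int   i;
    char  c;
    short s[5];
    int   j;
};

// Builds a summary of the member offsets the compiler chose.
// These are compile-time constants baked into the generated code;
// nothing is looked up at run time.
std::string describe_layout() {
    return "i@" + std::to_string(offsetof(mystruct, i)) +
           " c@" + std::to_string(offsetof(mystruct, c)) +
           " s@" + std::to_string(offsetof(mystruct, s)) +
           " j@" + std::to_string(offsetof(mystruct, j)) +
           " size=" + std::to_string(sizeof(mystruct));
}
```

Printing `describe_layout()` shows where each member landed for the compiler and target in use.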
boltar@nowhere.co.uk: Jul 08 09:05AM On Tue, 7 Jul 2020 20:38:20 +0200 >> The class memory layout has to be stored in some form somewhere in the binary >> otherwise on the fly objects could never be created. >Yes, they can. It is called construction. Whoooosh.... |
boltar@nowhere.co.uk: Jul 08 09:07AM On Wed, 8 Jul 2020 00:16:21 +0300 >time and to create objects from them "on the fly". The compiled code >accesses object members at fixed offsets which are hardcoded in the code >and not looked up anywhere at run time. And the difference between memory layout being stored and hardcoded offsets is.... what exactly? |
Juha Nieminen <nospam@thanks.invalid>: Jul 08 09:25AM >>and not looked up anywhere at run time. > And the difference between memory layout being stored and hardcoded offsets > is.... what exactly? "otherwise on the fly objects could never be created" would imply that classes can only be instantiated if their exact internal structure is known, else it's impossible. As mentioned, that's not necessarily the case. For example, if the class consists only of integer variables, for instance, which are all zero-initialized or default-initialized, it can be instantiated by knowing the size of the class, with no knowledge of its internal structure. (If it's completely zero-initialized, then what amounts to a single memset() call can be used to initialize it.) No code in the program might access every single member variable of that class, which means that only some of the offsets will be stored in code, not all of them. This means that even by the loosest possible definition of "storing the memory layout of the class", it would only have been "stored" partially, not fully, yet that doesn't make it impossible to instantiate. For example, maybe the class has 20 integer member variables, but this particular program only accesses the first one of them, ignoring the rest. This means that, effectively, the program only accesses the very first value in the class. None of the rest of the internal structure of the class is stored anywhere in the code, in any way. Yet that doesn't make it impossible to instantiate the class. |
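A small sketch of the scenario described above, under the assumption of a trivially copyable class made only of `int` members. The names `Twenty`, `zero_instantiate` and `read_first` are hypothetical. The object is brought to an all-zero state knowing only its size, and the only member offset the code ever uses is that of the first member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Twenty integer members; this code only ever touches the first.
struct Twenty {
    int first;
    int others[19];
};

static_assert(std::is_trivially_copyable<Twenty>::value,
              "memset-style zeroing is only sensible for trivially copyable types");

// Zero-fills `size` bytes at `storage`. It needs the size of the object,
// not its internal structure.
void zero_instantiate(void* storage, std::size_t size) {
    std::memset(storage, 0, size);
}

// Creates an object, zeroes it by size alone, and reads only `first`.
int read_first() {
    Twenty obj;
    zero_instantiate(&obj, sizeof obj);
    return obj.first;
}
```

In a program like this, the offsets of the other nineteen members never need to appear in the generated code at all.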
Paavo Helde <eesnimi@osa.pri.ee>: Jul 08 03:28PM +0300 >> and not looked up anywhere at run time. > And the difference between memory layout being stored and hardcoded offsets > is.... what exactly? The difference is in how easy it is to access that information. For starters, write a function which will take a pointer to any class object and print out its class definition. This can be done quite easily in some other languages. |
boltar@nowhere.co.uk: Jul 08 03:01PM On Wed, 8 Jul 2020 09:25:19 +0000 (UTC) >"otherwise on the fly objects could never be created" >would imply that classes can only be instantiated if their exact internal >structure is known, else it's impossible. It doesn't imply anything of the sort. However the binary needs some representation of the class and if not stripped it'll also contain the function and attribute names too otherwise debuggers wouldn't work. |
boltar@nowhere.co.uk: Jul 08 03:03PM On Wed, 8 Jul 2020 15:28:46 +0300 >starters, write a function which will take a pointer to any class object >and print out its class definition. This can be done quite easily in >some other languages. In interpreted languages like Python anything is possible because their object format isn't constrained by the OS executable format, but if the language is compiled down to a binary there are limitations on what can be stored in the binary unless the OS supports it. E.g. macOS supports a large amount of exe metadata. |
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Jul 07 04:53PM -0700 >> language. I'm not aware of anything in the C++ core language that's >> defined by the C standard. > Minimum ranges of integer types. Fair enough -- though that's also an example of inheriting the contents of <limits.h> from the C library. > For the C++ standard it's explicitly stated in normative text, but > implicitly referring to some other unspecified text via "this > implies". Right, and I'm trying to figure out what the "this implies" refers to. > fix this in C++11 but as I recall they didn't manage to clear it up > completely, although that would not be hard to do -- which IMHO on its > own brings the competence of the committee into question. Hmm. C++17 8.3.1 [expr.unary.op] : The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. C++11 has similar or identical wording. This defines the behavior when there is such an object or function. By failing to define the behavior when there is no such object or function, it leaves the behavior undefined by omission. A note under the definition of undefined behavior (3.27 [defns.undefined]): Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. I wouldn't mind if the standard were a bit more explicit about dereferencing a null pointer (or any other pointer that doesn't point to an object or function), but it seems unambiguous as it is. My only complaint might be that the wording assumes that the object or function exists rather than saying what happens *if* it exists. [snip] My guess is that the authors of the standard thought it was so obvious that arrays can't have padding at the end that they didn't bother to state it. 
-- Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com Working, but not speaking, for Philips Healthcare void Void(void) { Void(); } /* The recursive call of the void */ |
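As a small illustration of the indirection wording quoted above: the unary `*` result is only defined when the pointer actually points to an object, so code that may receive a null pointer has to test before applying it. The helper `value_or` below is a hypothetical name for this sketch, not a standard facility.

```cpp
#include <cassert>

// Returns *p when p points to an int, otherwise `fallback`.
// The null check comes first, so indirection is applied only
// when there is an object for the resulting lvalue to refer to.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}
```

A null check only covers the null case. A dangling or otherwise invalid non-null pointer cannot be detected this way, and it falls under the same "undefined by omission" reasoning.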
Manfred <noname@add.invalid>: Jul 08 01:43PM +0200 On 7/7/2020 9:29 PM, James Kuyper wrote: > unnamed padding within a structure object, but not at its beginning." > (6.7.2.1p15) and "There may be unnamed padding at the end of a structure > or union." (6.7.2.1p17). There are no such statements for arrays. Which is "some other statement in the C standard" like I mentioned in my previous post. > padding is permitted between the members of a struct and at the end > implies that such padding is not permitted for arrays, for which no such > exceptions have been specified. I understand your point, however this would assume a very high level of self-consistency of the standard, a level that I am not 100% confident I can acknowledge. That said, I tend to consider the C standard somewhat more solid than the C++ one. Somehow related to your position (an exception implies a rule even if the rule is missing), the approach of the C++ standard according to which "Undefined behavior may be expected when this document omits any explicit definition of behavior..." puts a lot of responsibility on the standard committee, probably too much given how controversial the UB topic has become. Especially given the meaning of UB when it comes to compilers: it is a very well-defined Bad Thing™. |
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 08 10:37AM -0400 On 7/8/20 7:43 AM, Manfred wrote: >>> On 7/7/2020 3:57 PM, James Kuyper wrote: >>>> On 7/6/20 11:14 PM, Alf P. Steinbach wrote: >>>>> On 07.07.2020 04:21, Tim Rentsch wrote: ... >> or union." (6.7.2.1p17). There are no such statements for arrays. > Which is "some other statement in the C standard" like I mentioned in my > previous post. Yes, but your wording "this may well follow from" implied uncertainty about the existence of that other statement, uncertainty which I hope I've removed. > more solid than the C++ one. > Somehow related to your position (an exception implies a rule even if > the rule is missing), ... While that is true in general, the rule should not actually be missing in this case - if it were, that would constitute a defect in the standard. I argue that the rule is actually present as an implication of the "An array type describes ..." clause, for precisely the reasons I gave in my original post. It's merely inobvious and correspondingly debatable. The existence of an exception provides evidence in support of my side in that debate; but if I were wrong in my interpretation of the "describes" clause, then the exception would not make me right. |
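The property under discussion can be written down as a compile-time check. This is a sketch under the reading defended above, and it also matches the "this implies" sentence in the C++ description of `sizeof`: an array of N elements occupies exactly N times the element size, even when the element type itself has trailing padding. `Padded`, `array_is_tightly_packed` and `element_count` are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// A struct that typically has padding after `c`; that padding is part
// of sizeof(Padded), not something added by the array.
struct Padded {
    int  i;
    char c;
};

// True when an array of N objects of type T has no bytes beyond its elements.
template <typename T, std::size_t N>
constexpr bool array_is_tightly_packed() {
    return sizeof(T[N]) == N * sizeof(T);
}

static_assert(array_is_tightly_packed<Padded, 7>(), "array adds no end padding");
static_assert(array_is_tightly_packed<char, 3>(), "array adds no end padding");

// The familiar element-count idiom relies on the same property.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return sizeof(T[N]) / sizeof(T);
}
```

If an implementation did add padding at the end of arrays, the `static_assert`s would fail to compile and the `sizeof(a) / sizeof(a[0])` idiom would overcount.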
Frederick Gotham <cauldwell.thomas@gmail.com>: Jul 08 05:32AM -0700 On Tuesday, July 7, 2020 at 6:57:16 PM UTC+1, Manfred wrote: > Usually the GNU linker puts the SONAME of the shared library in the > executable, and at runtime the dynamic linker loads the shared library > by SO_NAME rather than by filename (see ld -soname). I haven't investigated into this fully but I think you're right, Manfred. I used the program "patchelf" to change the "soname" inside my ".so" file, and now when I link with it, the resultant executable file is dependent upon the full filename. Alternatively, and in similar fashion to "patchelf", I could have just altered 5 - 10 bytes in the resultant binary to get it to look for a file with a longer filename. By the way, does anyone know if these two lines do exactly the same thing on Linux? g++ -o prog main.cpp -L./ -l:libmonkey.so g++ -o prog main.cpp ./libmonkey.so From what I can see so far, these two commands are identical in effect. |
Manfred <noname@add.invalid>: Jul 08 03:02PM +0200 On 7/8/2020 2:32 PM, Frederick Gotham wrote: >> executable, and at runtime the dynamic linker loads the shared library >> by SO_NAME rather than by filename (see ld -soname). > I haven't investigated into this fully but I think you're right, Manfred. I used the program "patchelf" to change the "soname" inside my ".so" file, and now when I link with it, the resultant executable file is dependent upon the full filename. If you are compiling and linking the .so yourself, then you'd probably better use the "-soname" ld option rather than patching the binary afterwards - if it is a 3rd party project, it is probably a good idea to suggest this to the maintainer. > g++ -o prog main.cpp -L./ -l:libmonkey.so > g++ -o prog main.cpp ./libmonkey.so > From what I can see so far, these two commands are identical in effect. I would guess so, anyway the documentation of both gcc and ld is usually pretty accurate, so if you want to be sure, better check "info gcc" and "info ld" (or man gcc, man ld). If you find some inconsistency, usually bug reports do get consideration on these projects - note that ld is part of binutils. (BTW I think there is one between the manpages of ld and ld.so on this very topic) |
Frederick Gotham <cauldwell.thomas@gmail.com>: Jul 08 06:37AM -0700 On Wednesday, July 8, 2020 at 2:03:11 PM UTC+1, Manfred wrote: > better use the "-soname" ld option rather than patching the binary > afterwards - if it is a 3rd party project, it is probably a good idea to > suggest this to the maintainer. We contact the maintainer at most once a week and only when absolutely necessary. Furthermore the maintainer didn't intend for us to use multiple versions of the same library, so they probably like the "soname" that they currently are using. As "patchelf" seems to work fine for my needs, this is the best option. |
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 08 10:24AM -0400 On 7/8/20 9:02 AM, Manfred wrote: > On 7/8/2020 2:32 PM, Frederick Gotham wrote: ... > on these projects - note that ld is part of binutils. > (BTW I think there is one between the manpages of ld and ld.so on this > very topic) -L./ adds ./ to the list of locations that are searched for libraries. -l:libmonkey.so tells it to look in the current list of locations for a library named libmonkey.so. Using ./libmonkey.so searches only for ./libmonkey.so. These two commands can do different things (depending upon the context) for a couple of reasons: A. The -L option affects all subsequent library searches, not just this one. B. The -l option causes other places to be searched for the library, not just ./ |
woodbrian77@gmail.com: Jul 07 05:55PM -0700 On Tuesday, July 7, 2020 at 12:36:31 PM UTC-5, Mr Flibble wrote: > On 07/07/2020 07:37, Keith Thompson wrote: > > Any of them. > There is only one homophobic misogynistic religious bigot in this thread. He is probably a racist Trump supporter too. Not me. And I'd be a Trump supporter if I had to choose between him and Biden. Thankfully there are other options. Brian |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jul 07 05:56PM -0700 > Not me. And I'd be a Trump supporter if I had to > choose between him and Biden. Thankfully there are > other options. Holy Moly! One vs the Other. Well, thats fair. ;^) |
David Brown <david.brown@hesbynett.no>: Jul 08 09:41AM +0200 On 08/07/2020 02:56, Chris M. Thomasson wrote: >> choose between him and Biden. Thankfully there are >> other options. > Holy Moly! One vs the Other. Well, thats fair. ;^) Again - /please/ stop feeding the trolls - both of them. (And you are quickly joining that category.) Both Brian and Mr. Flibble are capable of making sensible, on-topic posts and contributing to the group. When they do that, by all means join in. But when they get together, they both write things that are clearly and intentionally provocative, offensive, and non-productive. When they do that, they are trolling. Do not encourage it. I am, more than some regulars, quite happy with the occasional off-topic thread in a technical group. But it must be /occasional/, interesting, informative, enjoyed by many, and in its own thread that does not spoil a technical thread. This thread does not fit on any count. If Brian and Mr. Flibble want to fight, let them do it in private - both their email addresses are accessible. And if you want to respond to this (other than by simply not posting more encouragement to such threads), you have my email address. |