- two's complement idea - 20 Updates
- Thread must sleep forever (Try to double-lock mutex?) - 2 Updates
- alloca-alternative - 3 Updates
| "Öö Tiib" <ootiib@hot.ee>: Nov 05 03:49PM -0800 On Tuesday, 5 November 2019 23:48:46 UTC+2, Manfred wrote: > programming error: even if this is true I think that unsigned overflow > should have defined behavior (and wrap) rather than being handled as an > error by the compiler. I just listed facts that a lot of people agree with, like "in the majority of cases overflow (even unsigned) is a programming error". I did not say what to conclude from these facts here. > > Physically damaged, disconnected or short-circuited temperature sensor > > can no way repair or reconnect itself. > Undoubtedly, but that's not what I wrote. Ok. > My point is that rather than using NaNs the hardware or driver should > raise specific error signals (like some error code on the control I/O > port, or at the API level) instead. The device has to operate on incomplete data, and a saturating quiet NaN works perfectly as such a missing part of the data. A driver that panics, throws up and signals too much has to be killed to reduce the disturbance. Panic solves nothing, regardless of whether you are Schwarzenegger or not. ;) |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 07:08AM +0100 >> And That's not how computers work. > That is utterly irrelevant. You can rely on p0907r0 being included in an upcoming standard, and all implementations will have std::numeric_limits<signed...>::is_modulo set to true; so g++ must drop the shown optimization. There are so many language properties that represent how a CPU logically works, why not this property? |
| David Brown <david.brown@hesbynett.no>: Nov 06 09:30AM +0100 On 06/11/2019 07:08, Bonita Montero wrote: >> That is utterly irrelevant. > You can rely on that p0907r0 will be included in an upcoming standard > and all implementations will have std::numeric_limits<signed...>:: Have you actually /read/ the paper, and its subsequent revisions (we are now on p0907r4)? <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r4.html> Signed integer overflow remains undefined behaviour. This is what the majority of the committee, the majority of compiler vendors, and the majority of users want. > is_modulo to be set to true; so g++ must drop thé shown optimization. "is_modulo" can be (but doesn't need to be) set to true if the implementation gives signed integer arithmetic wrapping semantics. /If/ an implementation has is_modulo set true for signed types, then you are correct that it can't do the kind of optimisations I showed (or many other optimisations). gcc, clang and MSVC currently have is_modulo false for signed integer types, and do not guarantee wrapping behaviour. This is fine, and the way it should be. (gcc and clang leave it false even under "-fwrapv", which is also fine.) > There are so many language properties that represent how a CPU > logically works, why not this property? C and C++ are high-level languages, abstracted from the underlying CPU. And it has already been explained to you why undefined signed integer overflow is a good idea. |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 09:57AM +0100 >> There are so many language properties that represent how a CPU >> logically works, why not this property? > C and C++ are high-level languages, abstracted from the underlying CPU. C isn't high-level, and C++ is high-level as well as low-level. And the issue we're talking about is low-level. |
| Manfred <noname@add.invalid>: Nov 06 01:33PM +0100 On 11/5/2019 8:45 PM, Paavo Helde wrote: >> With the same reasoning you could say that unsigneds might never >> wrap; but in fact they're specified to wrap. > In retrospect, this (wrapping unsigneds) looks like a major design mistake. No, it isn't. > IMO, wrapping integers (signed or unsigned) are an example of > "optimization which nobody asked for", and they are there basically only > because the hardware happened to support such operations. Look at the following code and see for yourself how efficient it is to check for integer overflow if unsigned integers do wrap. Achieving the same would be much more verbose (and less efficient) if unsigned overflow were not defined behavior. (taken from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm)

char* make_pathname (const char *dir, const char *fname, const char *ext)
{
    size_t dirlen = strlen (dir);
    size_t filelen = strlen (fname);
    size_t extlen = strlen (ext);
    size_t pathlen = dirlen;

    // detect and handle integer wrapping
    if (   (pathlen += filelen) < filelen
        || (pathlen += extlen) < extlen
        || (pathlen += 3) < 3)
        return 0;

    char *p, *path = malloc (pathlen);
    if (!path)
        return 0;

    p = memcpy (path, dir, dirlen);
    p [dirlen] = '/';
    p = memcpy (p + dirlen + 1, fname, filelen);
    p [filelen] = '.';
    memcpy (p + filelen + 1, ext, extlen + 1);
    return path;
}
|
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 01:51PM +0100 > || (pathlen += extlen) < extlen > || (pathlen += 3) < 3) > return 0; Sorry, but when are paths longer than what a size_t can hold? |
| "Öö Tiib" <ootiib@hot.ee>: Nov 06 05:26AM -0800 On Wednesday, 6 November 2019 00:00:16 UTC+2, David Brown wrote: > Agreed (where "trap" could mean any kind of notification, exception, > error log, etc.). But this is something you might only want during > debugging - it is of significant efficiency cost. Indeed, the majority of programming errors should be found during debugging. > I like that in debugging or finding problems - with tools like > sanitizers. But I would not want that in normal code. With this kind > of semantics, the compiler can't even simplify "x + 1 - 1" to "x". Maybe it can or maybe it cannot; that depends on the wording. I have not really thought through how to word the semantics precisely. The major purpose is to get errors when the program is actually storing a value into a type where it does not fit (IOW really overflows). An analogous argument is that automatic storage overflow may not be trapped in principle, since that would disallow optimizing recursions (that exhaust the stack) into loops (that don't). The rules can still likely be worded in a way that the implementation is allowed not to trap when it manages to get the job done without exhausting automatic storage somehow. > turn it into the most efficient results. I intentionally use an > optimising compiler for C and C++ programming - when efficiency doesn't > matter, I'll program in Python where integers grow to avoid overflow. I do almost the same, but I think of some of it slightly differently. The exact formula in programming is unfortunately more important than its clarity and intuitiveness for the reader. For example, we need to calculate the average of two values A and B. Mathematically there are a lot of ways to calculate it, and which is most intuitive may depend on the meaning of A and B. Like:

    1) (A + B) / 2
    2) A / 2 + B / 2
    3) A + (B - A) / 2
    4) B + (A - B) / 2

etc. But in software these can be very different expressions, because they have different potential overflows and/or losses of accuracy. Until something helps to reduce that issue, it is all about exactly that formula, period. As for efficiency, it is anyway often uncertain until it is shown where the bottlenecks are. Also, it can often only be shown by profiling products with realistic worst-case loads of data, and then it is usually a small subset of the code that can change overall efficiency. Python I use less not because of its bad performance but because I have failed to use it scalably. A lot of little script programs is great, but when any of those starts to grow bigger, my productivity with them drops. For C++ the same size feels nonsensically unimportant. Somehow in C++ I have learned to separate different concerns and to abstract details away, but not in Python. > the operations that have the behaviour, not the types. However, I can't > see a convenient way to specify overflow behaviour on operations - using > types is the best balance between flexibility and legible code. I mean totally new "advanced" operators like (A +% B) or (C +^ D). Yes, there will be precedence (and maybe associativity etc.) to define, but that is business as usual and not some show-stopper issue. In some languages (like Swift) it is done and it seems to work fine. |
| David Brown <david.brown@hesbynett.no>: Nov 06 02:37PM +0100 On 06/11/2019 13:33, Manfred wrote: > || (pathlen += extlen) < extlen > || (pathlen += 3) < 3) > return 0; That is just silly, in all sorts of ways. First, decide if the function is an "internal" function where you can trust the parameters, and have undefined behaviour if assumptions don't hold, or an "external" function where you have to check the validity of the parameters. If it is internal, you know the lengths of the passed strings will not sum to more than 4G - or you don't care if someone does something ridiculous. (And on most modern systems, size_t is 64-bit - overflowing here would require 16 EB of RAM for storing the strings.) If it is external, the checking is too little - if you have char* pointers from an unknown source, you should be wary about running strlen() on them because you don't know if they will actually end with a 0 within a reasonable limit. You only need to check for overflow if it is possible for the calculations to overflow. If the operands are too small to cause an overflow, there will not be an overflow. And until you are talking about large integers for cryptography or that sort of thing, adding up realistic numbers will not overflow a 64-bit type. So /if/ you have an old 32-bit size_t system, and /if/ you have maliciously crafted parameters that point to huge strings (and you'll have to make them point within the same string - you don't get over 4 GB of user memory address space with 32-bit size_t), then you can do your adding up using 64-bit types and you get zero risk of overflow.

    uint_least64_t dirlen = strlen (dir);
    uint_least64_t filelen = strlen (fname);
    uint_least64_t extlen = strlen (ext);
    uint_least64_t pathlen = dirlen + filelen + extlen;
    if (pathlen > MAX_SANE_PATHLENGTH) return 0;

There are times when unsigned wrapping overflow is useful. This is not one of them. |
| Manfred <noname@add.invalid>: Nov 06 03:38PM +0100 On 11/6/2019 2:37 PM, David Brown wrote: >> || (pathlen += 3) < 3) >> return 0; > That is just silly, in all sorts of ways. You realize that this comes from the glibc maintainers, don't you? You can say they wrote silly code for this example (I don't), but I doubt there are many people more knowledgeable about this kind of matter than them. Moreover, I took this as an example of detection of integer overflow. The fact that it happens to be about pathname strings is irrelevant to this discussion. > trust the parameters, and have undefined behaviour if assumptions don't > hold, or an "external" function where you have to check the validity of > the parameters. This example was written about code safety, so yes, I believe it is pretty clear the assumption is that the strings come from an external source. Obviously this applies to the string /contents/; the pointers themselves can only be internal to the program (can't they?), so there is no need to check for null pointers. On the other hand, the contents of the strings are checked by ensuring that the results of strlen and their combination are valid. This is ensured /exactly/ by making use of unsigned wrapping behavior. > pointers from an unknown source, you should be wary about running > strlen() on them because you don't know if it will actually end with a 0 > in a reasonable limit. This code handles C strings, so there is no way to check their length other than running strlen. The fact that you seem to miss is that it is exactly thanks to the check that you call "silly" that it is ensured that they "actually end with a 0 in a reasonable limit". We could argue about what happens with /read/ access to a non-0-terminated string, but I would simply assume that the strings are 0-terminated, since the function is going to be called by some other part of the program that can take care that there is a 0 at the end of the buffer. What is not guaranteed is that the strings actually contain pathnames and don't contain very long malicious text instead (e.g. they could come from stdin). That risk is avoided by the code you call silly. So, no, there is not too little checking. > overflow, there will not be an overflow. > And until you are talking about large integers for cryptography or that > sort of thing, adding up realistic numbers will not overflow a 64-bit type. In fact, cryptography is another example where unsigned wrap is useful, but it would be much more complex (and off topic) to draw an example of this (not that I claim to be an expert in this area). And no, just assuming that "adding up realistic numbers will not overflow a 64-bit type" is not what safe code is about. > uint_least64_t extlen = strlen (ext); > uint_least64_t pathlen = dirlen + filelen + extlen; > if (pathlen > MAX_SANE_PATHLENGTH) return 0; You realize that this code is less efficient than the original one, don't you? And what would be the correct value for MAX_SANE_PATHLENGTH? Are you aware of the trouble that has been caused by Windows MAX_PATH? The example I posted achieves the same level of safety, using fewer resources, and allowing for the maximum string length that the system can /safely/ handle (don't miss the check after malloc). What more do you want? > There are times when unsigned wrapping overflow is useful. This is not > one of them. I suggest you read the code again (and its source - it is instructive). |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 03:53PM +0100 > You can say they wrote silly code for this example (I don't), but I > doubt there are many more knowledgeable people about this kind of > matter than them. Yes, this is useless code. |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Nov 06 10:06AM -0500 On 11/6/19 9:38 AM, Manfred wrote: ... > And no, just assuming that "adding up realistic numbers will not > overflow a 64-bit type" is not what safe code is about. Assuming it: no. Verifying it: yes. If you validate your inputs, you can often place upper and lower limits on the value of an expression calculated from those inputs. If those limits fall within the range that is guaranteed to be representable in the expression's type, it is perfectly legitimate not to bother including an overflow check. |
| Paavo Helde <myfirstname@osa.pri.ee>: Nov 06 05:54PM +0200 On 6.11.2019 14:33, Manfred wrote: > memcpy (p + filelen + 1, ext, extlen + 1); > return path; > } Seriously?

std::string make_pathname(const std::string& dir, const std::string& fname, const std::string& ext)
{
    return dir + "/" + fname + "." + ext;
}

No need to check for any overflows. Not to mention that there cannot be overflow in the first place, because if pathlen overflowed, the three strings dir, fname and ext would not fit in the process memory anyway. Not to mention that the time lost on a more explicit overflow check would be zero or unmeasurable compared to any file access itself, or even compared to the malloc() call in the same function. |
| "Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 06 05:00PM +0100 On 06.11.2019 15:38, Manfred wrote: > You can say they wrote silly code for this example (I don't), but I > doubt there are many more knowledgeable people about this kind of matter > than them. David has a point that with 32-bit `size_t` there's no way to have separate strings whose lengths sum to >= 4G. So some of the arguments have to point within the same superlong string in order for the checking to end up at `return 0;`. Whether it's silly to try to give well-defined behavior also for such an unlikely case: maybe silly when one just codes up something for limited use and with limited time, but probably not silly when one's crafting widely used library code. I.e. the context, what it's made for, "glibc", is important. However I think the appeal to authority, "glibc /maintainers/", is a fallacious argument. - Alf |
| Manfred <noname@add.invalid>: Nov 06 05:47PM +0100 On 11/6/2019 5:00 PM, Alf P. Steinbach wrote: >> matter than them. > David has a point that with 32-bit `size_t` there's no way to have > separate strings whose lengths sum to >= 4G. I should check the details (if I had the time and will to do it), but even if this is true for the physical memory address space, if I remember correctly the 386 has a way larger virtual memory addressing space: it does have segmenting capability, even if most OSs never used it. I don't remember if it is possible for the 386 to address more than 4G within a single process, though. Theoretically it is nonetheless possible, using segments, to have a 32-bit architecture wherein the lengths sum up to more than 4G. More practically, the example was about code safety, and so the possibility of malicious usage has to be assumed, hence the need for the check (at least for the +3 part). > I.e. the context, what it's made for, "glibc", is important. > However I think the appeal to authority, "glibc /maintainers/", is a > fallacious argument. It would be if it were only an appeal to authority. After giving context (and yes, pointing out that this example was not just rubbish taken from some dump on the internet), in the followup my argument went into the substance of the matter. |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 05:54PM +0100 > I remember correctly the 386 has way larger virtual memory addressing > space: it does have segmenting capability, even if most OSs never used > it. And is there an operating system using glibc in a segmented environment? > 4G within a single process, though. Theoretically it is nonetheless > possible, using segments, to have a 32-bit architecture wherein the > lengths sum up to more than 4G. I think it would be rather stupid to continue the segmented behaviour of the 286 protected mode in the 386 protected mode, although it is hypothetically possible. Also because 32-bit machines almost never had more memory than 4GB. |
| Manfred <noname@add.invalid>: Nov 06 06:15PM +0100 On 11/6/2019 4:54 PM, Paavo Helde wrote: > return dir + "/" + fname + "." + ext; > } > No need to check for any overflows. How do you think that overflow check is done inside std::string? > Not to speak about that the time lost for a more explicit check for > overflow would be zero or unmeasurable, compared to any file access > itself, or even when compared to the malloc() call in the same function. What do you mean by "more explicit check for overflow"? Assuming you know the variables are 32-bit unsigned, I suppose you could do

    if (pathlen < 0xFFFFFFFF - filelen) {
        pathlen += filelen;
    } else {
        return 0;
    }

and then repeat, but honestly I don't see the benefit of it compared to the above (for a start, you are introducing a dependency on the specific integer size). Or you can cast to a wider type, but then you are not solving the problem, you are only moving it forward, and still I wouldn't see the benefit. Besides, this is about /integer/ overflow checks, so it could apply to computations other than memory sizes. |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 06:33PM +0100 > What do you mean with "more explicit check for overflow"? Concatenating strings in C++ with the + operator is reliable. |
| "Öö Tiib" <ootiib@hot.ee>: Nov 06 10:55AM -0800 On Wednesday, 6 November 2019 19:16:08 UTC+2, Manfred wrote: > > } > > No need to check for any overflows. > How do you think that overflow check is done inside std::string? All standard library writers are rather good programmers. Obviously they have something that is easy to see at a glance can in no way overflow. Likely it is some short inline member, called whenever the size is supposed to grow, that does the check:

    if (max_size() - size() < size_to_add)
        throw std::length_error(text_to_throw);

Why don't you look into any of the implementations on your computer? |
| Bo Persson <bo@bo-persson.se>: Nov 06 08:00PM +0100 On 2019-11-06 at 17:54, Bonita Montero wrote: > of the 286 protected mode with the 386 protected mode, although it is > hypothetically possible. Also because the 32-bit-machnise almost never > had more memory than 4GB. The original problem wasn't only about memory. The designers of Windows NT *did* briefly consider adding support for more than one 4GB segment in a program. However, to load a new segment you first have to swap the old 4GB segment out to disk. And they couldn't see PC hard disks ever becoming that large. :-) Bo Persson |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 08:13PM +0100 > The designers of Windows NT *did* briefly consider adding support for > more than one 4GB segment in a program. They actually have it today for a small segment in which the thread information block resides: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block > However, to load a new segment you first have to swap the old 4GB > segment out to disk. That's not necessarily true. |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Nov 06 12:03AM -0800 On 10/24/2019 9:02 PM, Ian Collins wrote: >> wait() family of functions to detect when a child dies. > Or simply use the platform's service management framework rather than > reinventing it! That works. There is usually a basic start/resume/pause/shutdown protocol wrt services. I am just fond of creating a little webserver for each main service. |
| queequeg@trust.no1 (Queequeg): Nov 06 01:16PM >>will the process be stopped before raise() returns? My test shows that >>yes, but I don't know if it's guaranteed or only a coincidence. > Yes, it is guaranteed. Ok, thanks. -- https://www.youtube.com/watch?v=9lSzL1DqQn0 |
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 11:43AM +0100 I just found that alloca() with MSVC isn't as fast as it could be. alloca() calls a function called __chkstk which touches the pages down the stack to trigger Windows' overcommitting of stacks. That's because Windows is only able to allocate new pages to a stack when the pages are touched down the stack, i.e. you'll get an exception if you skip a page. So I came to the conclusion to write a little class that has a static internal buffer with two template parameters: first the type of the internal static array and second the size of the array. The constructor takes a parameter which will be the final size of the container; if it is larger than the second template parameter, an external array will be allocated via new T[N]. The allocation will have more overhead than an alloca(), but my idea is that if there is a larger number of entries, the processing time on the entries will outweigh the allocation. I'm asking myself if there's a class in boost or another well-known class library that implements the same pattern. 
So here's the code:

#pragma once
#include <cstddef>
#include <utility>
#include <stdexcept>
#include <algorithm>

template<typename T, std::size_t N>
struct overflow_array
{
    overflow_array( std::size_t n );
    ~overflow_array();
    T &operator []( std::size_t i );
    T &front();
    T &back();
    T *data();
    T *begin();
    T *end();
    T const *cbegin() const;
    T const *cend() const;
    void resize( std::size_t n );
private:
    T m_array[N];
    T *m_external;
    T *m_begin, *m_end;
};

template<typename T, std::size_t N>
inline overflow_array<T, N>::overflow_array( std::size_t n )
{
    if( n <= N )   // requested size fits into the internal array
    {
        m_external = nullptr;
        m_begin = m_array;
        return;
    }
    m_external = new T[n];
    m_begin = m_external;
    m_end = m_external + n;
}

template<typename T, std::size_t N>
inline overflow_array<T, N>::~overflow_array()
{
    if( m_external )
        delete []m_external;
}

template<typename T, std::size_t N>
inline T &overflow_array<T, N>::operator []( std::size_t i )
{
    return m_begin[i];
}

template<typename T, std::size_t N>
inline T &overflow_array<T, N>::front()
{
    return *m_begin;
}

template<typename T, std::size_t N>
inline T &overflow_array<T, N>::back()
{
    return m_end[-1];
}

template<typename T, std::size_t N>
inline T *overflow_array<T, N>::data()
{
    return m_begin;
}

template<typename T, std::size_t N>
inline T *overflow_array<T, N>::begin()
{
    return m_begin;
}

template<typename T, std::size_t N>
inline T *overflow_array<T, N>::end()
{
    return m_end;
}

template<typename T, std::size_t N>
inline T const *overflow_array<T, N>::cbegin() const
{
    return m_begin;
}

template<typename T, std::size_t N>
inline T const *overflow_array<T, N>::cend() const
{
    return m_end;
}

template<typename T, std::size_t N>
inline void overflow_array<T, N>::resize( std::size_t n )
{
    if( n <= N )
        return;
    T *newExternal = new T[n];
    std::copy( m_begin, m_end, newExternal );
    delete []m_external;
    m_external = newExternal;
    m_begin = newExternal;
    m_end = newExternal + n;
}
|
| Bonita Montero <Bonita.Montero@gmail.com>: Nov 06 11:46AM +0100 > { > m_external = nullptr; > m_begin = m_array; m_end = m_array + n; |
| Paavo Helde <myfirstname@osa.pri.ee>: Nov 06 01:18PM +0200 On 6.11.2019 12:43, Bonita Montero wrote: > on the entires will outweigh the allocation. > I'm asking myself if there's a class in boost or another well-known > classlib that implements the same pattern. Looks like boost::container::small_vector: "https://www.boost.org/doc/libs/1_71_0/doc/html/boost/container/small_vector.html" |
| You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |