- We've won! - 20 Updates
- constexpr issues - 2 Updates
- Undefined Behaviour - 1 Update
- experimental multi-mutex algorithm... - 2 Updates
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 25 11:58PM

On Sun, 25 Nov 2018 22:55:27 +0100
> On 25/11/2018 18:29, Chris Vine wrote:
> > On Sun, 25 Nov 2018 16:28:28 GMT
> > Melzzzzz <Melzzzzz@zzzzz.com> wrote:
[snip]
> same order without additional fences, barriers, or synchronisation
> instructions), and volatile accesses can be significantly cheaper than
> atomic accesses.

Concurrent reads and writes on a volatile int will work in practice with single processor systems (although it would still be undefined behaviour according to the standard), and would also work on SMP systems if you don't need synchronization, but may or may not work for other scalars on some platforms/hardware: in particular, volatile will only work in this way for scalars which are also atomic at the hardware level.

std::atomic with relaxed memory ordering is standards compliant if you don't need synchronization, so why not use it instead of volatile? If volatile is capable of working for the scalar in question then std::atomic with relaxed ordering has no additional overhead and will behave the same.

> multi-threading primitives or atomics (unless these are also volatile),
> and their order is not guaranteed to match when viewed from different
> threads.

I was not saying that volatile has no use. It clearly has with respect to signals/interrupts and mmaping, for example. I was making the point that it has no use with respect to variables accessed by more than one thread, and has no use in solving the reported "bug" because it has no defined behaviour in the standard for that case. If it solves the compiler "bug" it is a fluke and not a fix.

[snip]
> }
> return res;
> }

Two of the cases "work" in the sense that two compiler bugs would be required for the wrong code to be emitted instead of just one, but those two in consequence suffer from double-synchronization. The other suffers from possible volatile pessimisation, and if it suppresses the original "bug" it does so as a fluke.
The point is that the original code that you are trying to "correct" also works and is C11/C++11 compliant (assuming the C/C++ memory model). In fact the better view is that it was also required to work in a conforming POSIX implementation: David Butenhof (author of the authoritative "Programming With POSIX Threads", which most people have on their shelves) was of the opinion that POSIX required the code to work and that the reported issue was a gcc optimization bug. He was a member (and I think also chairman) of the relevant POSIX committee.

The overarching point is that any compiler which reorders memory operations so as to advance them from a position after an acquire operation on a C/C++ mutex to ahead of that acquire operation with respect to another synchronizing thread is buggy and non-compliant (it can only do it if it can prove that the reordering has no observable effect, which it cannot do here).

> Without that, other threads will read the atomic data with either the
> old value, or the new value - but not necessarily with the same ordering
> amongst other data.

Which is why having operations on a variable protected by both a mutex and by making the variable atomic is a bad idea. It is easier to get it wrong. Having a variable protected by a mutex which also has to be atomic is a big warning that there is something wrong with the code. The original "bug" code was correct, and obviously so when viewed from the C11/C++11 perspective. There is no need to chase shadows.

> unnecessary "doubling up". But too much protection is better than too
> little protection - it is better to be a bit inefficient, than to have
> the risk of race conditions.

It does have unnecessary doubling. It uses a memory barrier/fence within code already protected by a mutex. And I repeat that the original code in question, with a plain count variable, has no risk of a race condition with C/C++ mutexes in the C/C++ memory model. |
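[Editor's note: a minimal sketch of the std::atomic-with-relaxed-ordering alternative discussed above, using a hypothetical counter not taken from the thread. Relaxed operations give standards-compliant atomicity with no ordering guarantees, which is what the "volatile int" pattern was being used for.]

```cpp
#include <atomic>

// Hypothetical shared counter (illustrative name).
std::atomic<int> counter{0};

void bump() {
    // Relaxed ordering: atomic, but imposes no synchronisation or
    // inter-thread ordering -- the compliant analogue of a volatile int
    // used without synchronisation.
    counter.fetch_add(1, std::memory_order_relaxed);
}

int read_counter() {
    return counter.load(std::memory_order_relaxed);
}
```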
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 04:18PM -0800 On 11/25/2018 2:18 PM, Chris M. Thomasson wrote: >>>>> On 2018-11-23, David Brown <david.brown@hesbynett.no> wrote: >>>>>> On 24/11/2018 00:12, Chris M. Thomasson wrote: >>>> <snip> [...] > this out, please try to read all, it _is_ an interesting thread: > https://groups.google.com/d/topic/comp.programming.threads/KepRbFWBJA4/discussion > Fwiw, I was SenderX. ;^) A most relevant post, a response to Scott: https://groups.google.com/forum/#!original/comp.programming.threads/KepRbFWBJA4/pg83oJTzPUIJ Notice the externally assembled functions? Ahhh, back in the day. Fwiw, the Scott Meyers in that thread can be found here: https://www.aristeia.com ;^) |
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 26 12:20AM On Sun, 25 Nov 2018 16:21:05 -0500 > assumption should be made about compiler memory access reordering across > POSIX mutexes; instead, correspondent gcc or gcc asm primitives should > be used. There is ambiguity about what section 4.12 of SUS requires with its "The following functions synchronize memory with respect to other threads". David Butenhof was of the view that the code in question was required to work in a conforming POSIX implementation, which imposed requirements going beyond C90/99 itself. But that is academic because we now have C11 and C++11 mutexes and memory model. In my view it is entirely clear that the code must work in C++11. In fact I cannot see how anyone could argue otherwise since that would undermine an ordinary and basic use case of mutexes. It is axiomatic that where a mutex makes a series of expressions such as a test and an increment mutually atomic by placing them between an acquire and a release of the mutex (and so with mutual exclusion), the compiler must not make the expressions mutually non-atomic for another synchronizing thread by moving one or more of the memory operations ahead of the acquire operation unless that would have no observable effect. I think you agree with this. I refer to my earlier posting in which I said amongst other things: 'Non-normatively, for mutexes this is offered by §1.10/5 of C++11: "Note: For example, a call that acquires a mutex will perform an acquire operation on the locations comprising the mutex. Correspondingly, a call that releases the same mutex will perform a release operation on those same locations. Informally, performing a release operation on A forces prior side effects on other memory locations to become visible to other threads that later perform a consume or an acquire operation on A." 
The normative (and more hard-to-read) requirement for mutexes is in §30.4.1.2/11 and §30.4.1.2/25 ("synchronizes with") read with §1.10/11 and §1.10/12 ("happens before") and §1.10/13 ("visible side effect") of C++11.' |
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 26 12:38AM

On Mon, 26 Nov 2018 00:13:48 +0100
David Brown <david.brown@hesbynett.no> wrote:
[snip]
> acquires_count = tmp;
> That optimisation does not in any way break the requirements of
> mtx_trylock() being an acquire operation.

David,

I don't want to appear to be piling on here because I think you are in a minority position, but I must disagree with your conclusion. In the code in question:

int trylock()
{
    int res;
    res = pthread_mutex_trylock(&mutex);
    if (res == 0)
        ++acquires_count;
    return res;
}

'res' is a local variable and only has an inter-thread ordering relationship with respect to the increment of acquires_count if the trylock succeeds. In the case where the trylock does not succeed, the compiler cannot substitute a bogus value for 'res' in the test of 'res'. That would break C++11 requirements for sequential ordering _within_ any one thread. Issues of inter-thread ordering don't arise.

If the trylock succeeds the lock is taken, and so with C/C++11 mutexes no inter-thread reordering can occur either: the necessary "happens before" relationship exists for that case. |
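[Editor's note: a sketch of the C++11 analogue of the pthread trylock code discussed above, using std::mutex. Names mirror the original post; the unlock() call is added only so the sketch is self-contained. On a successful try_lock(), the acquire operation orders the increment after lock acquisition.]

```cpp
#include <mutex>

std::mutex mtx;           // stands in for the pthread mutex
int acquires_count = 0;   // protected by mtx

// Returns 0 on success, like pthread_mutex_trylock.
int trylock() {
    if (mtx.try_lock()) {
        ++acquires_count;  // only ever written while the lock is held
        mtx.unlock();      // added so the sketch releases the lock
        return 0;
    }
    return 1;
}
```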
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Nov 25 04:48PM -0800 On Sunday, November 25, 2018 at 5:31:53 PM UTC-5, Mr Flibble wrote: > then you do NOT need to also use atomics. Atomics are for LOCK-FREE designs. > Read the .. standard Mr Brown. > /Flibble There's a way to help people without demeaning them, or yourself, in the process. You can help people AND lift them up / encourage them at the same time. I hope you don't have children, Leigh, or if you do you don't address them with such discourse. -- Rick C. Hodgin |
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 25 07:52PM -0500

Jorgen Grahn wrote:
> Any compiler/library combination which claims to support pthreads
> makes sure to put a suitable fence/barrier/whatever-it's-called
> at a mutex lock, because anything else would be suicide.

I am unfamiliar with this convention specifically, but my guess is that it may partly come from the fact that old POSIX mutex lock implementations would always result in an SVC call, and in general compilers are not supposed to optimize across a function call if they don't have access to its code -- due to potential side effects of the call; but they can optimize across inline/intrinsic/otherwise known code. More modern implementations of the POSIX mutex (e.g. in NPTL, via futexes) can lock an uncontested mutex without a system call, just at library level, so the code can be inlined and be accessible to the compiler (disclaimer: I did not check whether this is the case for the original issue).

>> be used.
> Just to clarify, are you saying most software which uses POSIX
> multithreading is broken,

To be more specific: if some code relies on successful locking of a POSIX mutex for ordering access to a non-volatile shared variable, such code *may* be broken (specifically, it is broken if the compiler optimized access to that variable in some unfortunate manner -- which I believe it has the right to do; and prohibiting such optimization globally sounds like a serious pessimization).

> since it (in my experience) rarely inserts
> its own fences?

I have usually inserted explicit memory clobbering using a GCC __asm__ statement in my Linux multithreaded code. I never did it on Solaris or AIX, though, as I did not use gcc there (but that was ages ago and Solaris threading was badly broken regardless...)

> theoretical problem, triggered by a "suicidal" compiler/library
> combination. Partly because multithreaded software tends to be
> subtly broken anyway.)

I agree. |
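[Editor's note: the "explicit memory clobbering using a GCC __asm__" that Pavel mentions typically looks like the sketch below. It is GCC/Clang-specific; the function and variable names are illustrative. The empty asm with a "memory" clobber forbids the compiler from caching or reordering memory accesses across it, but emits no CPU fence instruction.]

```cpp
// Compiler-only barrier: no machine instruction is emitted, but the
// compiler must assume all memory may have changed at this point.
inline void compiler_barrier() {
    __asm__ __volatile__("" ::: "memory");
}

int shared_flag = 0;  // hypothetical shared variable

void publish() {
    shared_flag = 1;
    compiler_barrier();  // the store may not be sunk below this point
}
```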
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 25 08:04PM -0500

Chris Vine wrote:
>> be used.
> There is ambiguity about what section 4.12 of SUS requires with its "The
> following functions synchronize memory with respect to other threads".

I am of the opinion that synchronizing "memory" was a requirement on actual computer memory, but not on the C compiler's variables. IMHO it is a serious pessimization to globally prohibit optimization in such cases, especially as the explicit tool to disable access order optimization in the interesting cases (e.g. volatile) is readily available.

> memory model. In my view it is entirely clear that the code must
> work in C++11. In fact I cannot see how anyone could argue otherwise
> since that would undermine an ordinary and basic use case of mutexes.

I don't think it would (see above: one has all the tools to write correct code and not disable optimization at the same time). The code with pthread_mutex_lock still does not have to work, as it's not in scope of the C++11 mutex. Further, a C++11 mutex can be easily implemented via a pthread mutex by adding an explicit compiler fence in the implementation.

> synchronizing thread by moving one or more of the memory operations
> ahead of the acquire operation unless that would have no observable
> effect. I think you agree with this.

Yes -- and for that to be true, the C++11 standard had to say this explicitly somewhere between 1.8 and 1.10 |
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 05:17PM -0800 On 11/22/2018 12:48 PM, Chris M. Thomasson wrote: > Well, there was a problem with a GCC optimization that broke a POSIX > mutex. It was an old post on comp.programming.threads: > https://groups.google.com/d/topic/comp.programming.threads/Y_Y2DZOWErM/discussion Fwiw, iirc, Greg Herlihy in that thread is the brother of Maurice Herlihy, the author of the following book: https://www.amazon.com/Art-Multiprocessor-Programming-Revised-Reprint/dp/0123973376 There is a post on comp.programming.threads where Greg says Maurice is his brother. Will try to find it. Dave Butenhof is in there as well. This thread is bringing back many old memories: Thanks to David Brown. :^D |
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 26 01:26AM

On Mon, 26 Nov 2018 00:38:31 +0000
> 'res' in that test. That would break C++11 requirements for sequential
> ordering _within_ any one thread. Issues of inter-thread ordering
> don't arise. [snip]

And to more completely answer your point for the case of a trylock not succeeding (in case I wasn't clear enough), I don't think in optimizing for that case the compiler can create a side effect by writing to the global variable otherwise protected by a mutex (in this case, acquires_count), even if it thinks it is the "right" value. That's close to saying that in a multi-threaded program any thread of execution can use a global variable it can see but wouldn't otherwise use as a temporary scratchpad for its own purposes (restoring the "right" value on finishing with the variable), on the grounds that it is entitled to ignore the fact that the global variable may be in use by other threads. Even your memory barrier solution wouldn't be a defence against such an "optimization".

That would make writing any kind of successful multi-threaded program impossible. |
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 07:41PM -0800 On 11/18/2018 7:47 PM, Pavel wrote: > people and hurt performance :-( . >> It is very minimalist, but it works for what I need it to work for. >> :) Big time: Programs can definitely can modify the vtable during runtime. The vtable pointer member in struct device goes to a pointer to a const struct device_vtable: ___________________ struct device_vtable { struct object_prv_vtable const object; struct device_prv_vtable const device; }; struct device { struct device_vtable const* vtable; }; ___________________ This can be mutated without complaint from the compiler, indeed. Very tasty... Try not to drink the poison, but the other bottles taste great! ;^) |
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 07:44PM -0800 On 11/20/2018 10:16 AM, Jorgen Grahn wrote: > const char* const b = a + v.size(); > while(a != b) ... > // work on [a, b) Looks Kosher to me Jorgen. :^) [...] |
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 26 12:31AM -0500 Chris Vine wrote: > (amongst other similar operations) that "The following functions > synchronize memory with respect to other threads". In practice posix > mutexes behave identically to C/C++ mutexes. Note that none of the above mentions variables. Difference between synchronizing memory locations and variables is akin to the difference between hard and soft memory fences, respectively. C++11 addresses these whereas POSIX has no notion of a soft memory fence or equivalent. |
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 26 12:50AM -0500 Chris M. Thomasson wrote: >>> It is very minimalist, but it works for what I need it to work for. >>> :) > Big time: Programs can definitely can modify the vtable during runtime. That's what I said. C-style virtual tables (or function tables in objects themselves) can be modified at will. C++ "native" virtual tables or pointers to them cannot (except that the pointers are modified when it's not useful -- namely between constructor and destructor calls in class hierarchies) The vtable pointer member in struct device goes to a pointer to a const |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 26 06:35AM On Sun, 2018-11-25, Paavo Helde wrote: > Just curious: do you think it is just so because the multithreaded > software tends to be more complex, or do you have any deeper insights > about multithreading to back up this claim? Sorry. It was just the old "writing and maintaining threaded software is hard, and a lot of people get it wrong". /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 10:42PM -0800 On 11/25/2018 10:35 PM, Jorgen Grahn wrote: >> about multithreading to back up this claim? > Sorry. It was just the old "writing and maintaining threaded software > is hard, and a lot of people get it wrong". Imho, it would be down right foolish to disagree with that line of thought Jorgen. It can be a royal pita, indeed: Ouch. Ever tried to juggle a shi%load of chainsaws with fully cycling chains? ;^o |
David Brown <david.brown@hesbynett.no>: Nov 26 08:25AM +0100 On 26/11/2018 00:58, Chris Vine wrote: > other scalars with some platforms/hardware: in particular volatile will > only work in this way for scalars which are also atomic at the hardware > level. Yes. Contrary to popular ideas, volatile /can/ be useful in multi-threaded systems. But it has severe limitations, and can only be useful in certain circumstances. There are large classes of application - especially in the embedded world - where you know without doubt that you are dealing with a single processor core, and you know what size of data can be handled atomically. And here, careful use of "volatile" can do the same job "std::atomic" would do - but much more efficiently. > volatile? If volatile is capable of working for the scalar is question > then std::atomic with relaxed ordering has no additional overhead and > will behave the same. That is certainly a possibility. A feature that "volatile" has, which atomics do not, is that you can have volatile accesses to non-volatile data. This gives added flexibility. > thread, and has no use in solving the reported "bug" because it has no > defined behaviour in the standard for that case. If it solves the > compiler "bug" it is a fluke and not a fix. Volatile can also be used to "limit optimisation". In this particular example, it can be used to ensure there are no speculative writes to acquires_count. > The point is that the original code that you are trying to "correct" > also works and is C11/C++11 compliant (assuming the C/C++ memory > model). The original code was not C11/C++11, so did not follow those memory models. But if C11/C++11 memory models disallow speculative writes like this, then I agree that the original code was correct. > respect to another synchronizing thread is buggy and non-compliant (it > can only do it if it can prove that the reordering has no observable > effect, which it cannot do here). 
Certainly additional standards, such as POSIX, can have additional requirements beyond those of the C or C++ standards. Not all C and C++ programming is POSIX, however. > atomic is a big warning that there is some wrong with the code. The > original "bug" code was correct and obviously so when viewed from the > C11/C++11 perspective. There is no need to chase shadows. The original code was wrong when viewed from a pre-C11/C++11 perspective - even if it is correct under C11/C++11, it certainly is not /obvious/ that it is correct. |
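[Editor's note: the "volatile accesses to non-volatile data" feature David mentions can be sketched as below. The object is an ordinary int, but individual accesses are forced through a volatile glvalue, so the compiler must actually perform each one. Names are illustrative, not from the thread.]

```cpp
int plain_counter = 0;  // ordinary, non-volatile object

void forced_store(int v) {
    // The *access* is volatile even though the object is not,
    // so the compiler may not elide or reorder it with other
    // volatile accesses.
    *static_cast<volatile int*>(&plain_counter) = v;
}

int forced_load() {
    return *static_cast<volatile int*>(&plain_counter);
}
```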
David Brown <david.brown@hesbynett.no>: Nov 26 10:19AM +0100 On 26/11/18 02:26, Chris Vine wrote: >> David, >> I don't want to appear to be piling on here because I think you are in >> a minority position, but I must disagree with your conclusion. Chris, I appreciate your posts here. You are explaining the issues here - I have learned a good deal since you entered this discussion, certainly more than from all of Chris T.'s "It works. Period." posts. And I think I have now figured out what I was missing in my understanding, and thus got things wrong in my conclusions. It is a little note in the description of multi-threaded execution in the standards (I'm taking it from C11, but there is an almost identical note in C++11): """ NOTE 13 Compiler transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this standard, since such an assignment might overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race. """ This says that under C11/C++11, but /not/ under previous standards, the compiler is not allowed to turn: if (res == 0) { ++acquires_count; } into acquires_count += (res == 0); The issue is not about how the mutex works or how it affects ordering of other operations - it is about the validity of such speculative writes. They were allowed before, and some compilers took advantage of that on some targets, in order to avoid branches - but they are no longer allowed. A different point that has made this a longer learning process than might otherwise have been the case, is that the standards (I've used C11 and C++14) are, IMHO, quite unhelpful here. 
The sections on mutexes in the library descriptions don't give any indication that their operations are fences, and the small notes under "multi-threaded execution" do not make it at all clear whether they are general fences, or merely acquire/release operations on the memory used to store the mutex itself. I am not willing to accept the argument "obviously acquiring a mutex is an acquire fence" - I want to see what the standards require and guarantee. > for that case the compiler can create a side effect by writing to the > global variable otherwise protected by a mutex (in this case, > acquires_count), even if it thinks it is the "right" value. I can't see any concept of "variable protected by a mutex" as part of the standard. What I see is: """ For example, a call that acquires a mutex will perform an acquire operation on the locations composing the mutex. Correspondingly, a call that releases the same mutex will perform a release operation on those same locations. Informally, performing a release operation on A forces prior side effects on other memory locations to become visible to other threads that later perform an acquire or consume operation on A. """ The later note (which I quoted earlier) is what stops the speculative write, not any "protection by a mutex". > ignore the fact that the global variable may be in use by other > threads. Even your memory barrier solution wouldn't be a defence > against such an "optimization". Agreed. My memory barrier solution protects against some movement of the write, but not against such "non-observable behaviour in C99" writes. > That would make writing any kind of successful multi-threaded program > impossible. I think up to C11/C++11, multi-threaded programming has relied on implementations being helpful rather than because the C standards guarantees how they work. (It may be that other standards, such as POSIX, /do/ provide the guarantees - I don't work with POSIX programming, and am not familiar with its details.) 
I am not convinced that C11/C++11 provide all the guarantees needed either - but I am now convinced that they are a good step along the way, and that C11/C++11 implementations work as programmers would expect. |
Paavo Helde <myfirstname@osa.pri.ee>: Nov 26 12:02PM +0200

On 26.11.2018 8:35, Jorgen Grahn wrote:
>> about multithreading to back up this claim?
> Sorry. It was just the old "writing and maintaining threaded software
> is hard, and a lot of people get it wrong".

That's true, but there have been definite improvements in the multi-threading interfaces over time (first in Boost, later adopted into std) encouraging a safe way to use these features correctly. I mean things like the condition_variable wait taking a reference to the lock, as well as the predicate to wait for. This is a huge step forward compared to pthread_cond_wait, whose documentation just says that calling it on an unlocked mutex is UB, and contains a lengthy discussion about checking the predicate instead of having it in the interface.

Also, using something like std::shared_ptr for holding the shared data objects makes life much easier, as a std::shared_ptr does not exist before the completion of the constructor or after the start of the destructor of the most derived object. This eliminates a whole class of subtle bugs related to accessing half-constructed or half-destroyed objects. |
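[Editor's note: a minimal sketch of the interface improvement Paavo describes. std::condition_variable::wait takes the lock *and* the predicate, so the locking requirement and the spurious-wakeup re-check are encoded in the API rather than left to caller discipline. The flag and function names are illustrative.]

```cpp
#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool ready = false;  // the condition being waited on

void consumer() {
    std::unique_lock<std::mutex> lk(m);
    // wait() re-checks the predicate on every wakeup, so spurious
    // wakeups are handled inside the interface.
    cv.wait(lk, [] { return ready; });
}

void producer() {
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true;
    }
    cv.notify_one();
}
```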
scott@slp53.sl.home (Scott Lurndal): Nov 26 02:23PM >"The following functions synchronize memory with respect to other threads." >Define synchronize? At least C++11/C11 defines it as an acquire-release >relationship. ;^) Although POSIX 1003.4 predated C++11/C11 by over a quarter century, do keep that in mind. It was certainly well understood by the 1003.4a working group exactly what that text meant at that time (weak memory models and out-of-order execution were just being introduced at the time). |
scott@slp53.sl.home (Scott Lurndal): Nov 26 02:24PM >> about multithreading to back up this claim? >Sorry. It was just the old "writing and maintaining threaded software >is hard, and a lot of people get it wrong". And far more of them get it right. |
wyniijj@gmail.com: Nov 25 09:01PM -0800

Recently, I am learning to use the constexpr keyword. Some questions arose in a test program t.cpp:

1: The impl. of cslen(const char*) uses ::strlen; g++ seems to accept it as 'constexpr' (std::strlen is not). Is the impl. of cslen correct?

2: As the diagnostics show, "Token::Pi.begin()" seemed not to be recognized as a constexpr. Why, and how to correct it?

//---------------------------------------------------
#include <iostream>
#include <cstring>

#define ENDL "\n"

constexpr size_t cslen(const char* s) { return ::strlen(s); };

class Str {
    const char *m_str;
    size_t m_len;
public:
    constexpr Str(const char* s) : m_str(s), m_len(cslen(s)) {};
    constexpr size_t len() const { return m_len; };
    constexpr const char* begin() const { return m_str; };
};

static constexpr char S1[cslen("ad")]={'s',0};
static constexpr Str S2{"S2..."};

struct Token {
    static constexpr Str Pi{"Pi"};
};

int main()
{
    std::cout << S1 << ENDL;
    std::cout << S2.len() << ", " << S2.begin() << ENDL;
    std::cout << Token::Pi.begin() << ENDL;
    return 0;
}
//---------------------------------------------------

>g++ --version
g++ (GCC) 8.2.1 20181011 (Red Hat 8.2.1-4)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

>g++ t.cpp
/usr/bin/ld: /tmp/ccFBOjNS.o: in function `main':
t2.cpp:(.text+0x6f): undefined reference to `Token::Pi'
collect2: error: ld returned 1 exit status |
"Öö Tiib" <ootiib@hot.ee>: Nov 26 05:00AM -0800 > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > >g++ t.cpp You need to tell g++ that you want to use C++17 features (-std=c++17). g++ -std=c++17 t.cpp Seems to work. Also it is usually worth to ask it to perform a bit more diagnostics than it does by default. At least -Wall and -Wextra and perhaps -pedantic. So: g++ -std=c++17 -Wall -Wextra -pedantic t.cpp You will get warning about redundant ; |
"Öö Tiib" <ootiib@hot.ee>: Nov 26 04:31AM -0800 On Friday, 23 November 2018 18:14:25 UTC+2, Tim Rentsch wrote: > what you mean - not just what your conclusions are but how > you got to them. Otherwise I'm not sure what you're trying > to say. I have already said all my arguments upstream and so are apparently you so I suggested to agree to disagree couple posts ago. > you seem to be saying that the standard imposes a requirement > if it imposes a requirement. The question is does the standard > impose a requirement or not? I indeed do not understand how what I wrote can be perceived as some sort of gibberish. > As you pointed out Annex B is informative, not normative. Also > the second paragrah says "[T]hese quantities are only guidelines > and do not determine compliance." The very subject of our conversation were limitations of executive environment, particularly size of call depth and automatic storage and if exceeding these limitations is undefined behavior or not. My position is that standard says that exceeding these limitations causes undefined behavior. I deduce it from two things 1) it does not impose requirements to behavior program that exceeds limitations and 2) behavior to what standard does not impose requirements is defined as undefined behavior. > parentheses, does that mean a statement like > return ((((1)))); > has undefined behavior? I would except such implementation to handle it as ill-formed since the condition is easy to diagnose. But that is just what I feel like common sense. > behavior an implementation-dependent property? Note the footnote > to section 4.1 p2.1 > "Correct execution" can include undefined behavior, [...]" "depending on the data being processed" > being undefined are orthogonal conditions, not that one is a > subset of the other. Do you agree with that implication? If > not can you say why? 
I interpret it so that executed program with some input data may have undefined behavior but with other input data its behavior may be defined. For example if the limits of automatic storage of executing environment are exceeded or not can depend on data that is processed. |
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 10:19PM -0800 This code allows a thread to dynamically lock multiple addresses in a totally atomic fashion, and free of deadlock. It is based on hashing pointers into a table of locks. A thread can create a little array of hashed pointers, sort them, remove duplicates, then take all of the locks in the table. The sorting and uniqueness factor prevents deadlock. _________________ // pseudo-code mtx_table lock_table; // some vars int a = 3; short b = 6; double x = 1.234; // a thread { locker locks(lock_table); // our hashed locks locks.push(&a); locks.push(&x); locks.push(&b); locks.lock(); // a, x, and b are all locked! locks.unlock(); } _________________ Here is my code, without using threads for now. Take a look at the main function for a real example on how to use it. The code is crude, please try to take some pity... _____________________________ // Experimental Multi-Mutex Algorithm By Chris. M. Thomasson #include <iostream> #include <functional> #include <algorithm> #include <mutex> #include <thread> #include <cassert> #include <cstddef> #include <cstdint> #include <vector> // Allows one to lock multiple addresses // at once, and never hit a deadlock. // It is an experiment. namespace ct_multex { // A table of locks struct mutex_table { std::vector<std::mutex> m_locks; mutex_table(std::size_t size) : m_locks(size) { assert(size > 0); } // totally contrived simple hash... std::size_t hash(void const* ptr) { std::uintptr_t ptr_uint = (std::uintptr_t)ptr; return (std::size_t)(((ptr_uint << 9) * 103) % m_locks.size()); } }; // A threads local indices into a table of locks struct local_locks { std::vector<std::size_t> m_lock_idxs; mutex_table& m_mtx_tbl; local_locks(mutex_table& mtx_tbl) : m_lock_idxs(), m_mtx_tbl(mtx_tbl) {} // Add an address to be locked void push_ptr(void const* ptr) { std::size_t idx = m_mtx_tbl.hash(ptr); m_lock_idxs.push_back(idx); } // Deadlock free baby! 
void ensure_locking_order() { // sort and remove duplicates std::sort(m_lock_idxs.begin(), m_lock_idxs.end()); m_lock_idxs.erase(std::unique(m_lock_idxs.begin(), m_lock_idxs.end()), m_lock_idxs.end()); } // Take all of the locks void lock() { // there can be a flag to minimize this... ensure_locking_order(); std::size_t n = m_lock_idxs.size(); for (std::size_t i = 0; i < n; ++i) { m_mtx_tbl.m_locks[m_lock_idxs[i]].lock(); } } // Unlock everything void unlock() { std::size_t n = m_lock_idxs.size(); for (std::size_t i = 0; i < n; ++i) { m_mtx_tbl.m_locks[m_lock_idxs[n - i - 1]].unlock(); } } }; // RAII scoped lock: Allows a thread to actualy take the locks // It locks a threads local lock indices struct scoped_lock { local_locks& m_locks; scoped_lock(local_locks& locks) : m_locks(locks) { m_locks.lock(); } ~scoped_lock() throw() { m_locks.unlock(); } }; } int main(void) { { // Our lock table ct_multex::mutex_table mtx_tbl(42); int data_0 = 123; long data_1 = 456; short data_2 = 789; // Usage... { // Our threads local locks ct_multex::local_locks locks(mtx_tbl); // Add some addresses locks.push_ptr(&data_1); locks.push_ptr(&data_0); { ct_multex::scoped_lock slock(locks); // data_1 and data_0 are locked! } // unlocked... Add data_2 to the locks... locks.push_ptr(&data_2); { ct_multex::scoped_lock slock(locks); // Now, data_1, data_0 and data_1 are locked! } } } return 0; } _____________________________ Will post more info sometime tomorrow. |
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 25 10:36PM -0800 On 11/25/2018 10:19 PM, Chris M. Thomasson wrote: > pointers into a table of locks. A thread can create a little array of > hashed pointers, sort them, remove duplicates, then take all of the > locks in the table. The sorting and uniqueness factor prevents deadlock. [...] > try to take some pity... > _____________________________ > // Experimental Multi-Mutex Algorithm By Chris. M. Thomasson [...] > { > ct_multex::scoped_lock slock(locks); > // Now, data_1, data_0 and data_1 are locked! ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Error in a comment. Funny. I meant: // Now, data_1, data_0 and data_2 are locked! Sorry about that. Fwiw, there are three addresses to be locked here, data_[0...2]. However, there might be less locks in a threads local lock vector because of hash collisions; everything is sorted and duplicates are always removed. Think if data_0 hashed to index 41, data_1 went to 3, and data_2 went back to 41. Humm... We have 41, 3, 41 After a call to my ct_multex::local_locks::ensure_locking_order... It looks like: 3, 41 So, only two locks for three addresses. The hash function can be a lot better. ;^o [...] |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |