- std::atomic<std::shared_ptr<T>>... - 9 Updates
- Onwards and upwards - 7 Updates
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 04 08:22PM -0800 On 2/3/2021 12:00 AM, Marcel Mueller wrote: > I am not 100% sure about the UB. std::atomic<uintptr_t> might solve the > UB problem with stolen bits. I did not test this, because with a defined > set of target platforms there was no UB. std::atomic<uintptr_t> just might do it, and also allow me to give it a go in Relacy. I should have some free time in a day or two. I need to _really_ dig into your: /// Strongly thread safe read T* acquire() volatile const; function instead of skimming. That's where a lot of the "magic" happens. If I am reading this correctly, there is a place where you need strong CAS. Wrt C++ and its weak and strong CAS variants: InterlockedCxc in line 281: if (InterlockedCxc((volatile unsigned*)&Data, old_outer, new_outer) != old_outer && new_outer) // Someone else does the job already => undo. InterlockedAdd(&((T*)new_outer)->access_counter(), outer_count); // The global count cannot return to zero here, because we have an active reference. Afaict, this will not work with weak CAS. Need to dig into it because its interesting. Fwiw, in my experimental proxy collector, I was very happy that I eliminated the need for any CAS. https://pastebin.com/raw/p1E9WN5i >>> boost::intrusive_ptr prevented strong thread safety. A reference >>> counter need to be exposed as atomic<some_integer> somehow for strong >>> thread safety. Basically, the inner count needs to be somehow transferred to the outer count. Using a loopless wait-free method tends to perform better than a CAS. > programming significantly. Less race conditions, no deadlocks, no > priority inversion. Basically it is required for anything pointer like > object to fit into a std:atomic<T>. Well, its basically required for a thread/process to be able to take a reference to something that it did not previously have a reference to. > the more effectively deduplication compresses it. > Without strong safety any read access needs to be protected by a lock. > This is almost impossible without introducing a bottleneck. Have you ever messed around with RCU? Pretty damn nice. A proxy collector can provide a sort of "poor mans" rcu. A proxy collector can be created from any strong-thread safe method. Ideally lockfree, even better when its loopless, and waitfree. > with an in memory DB anyway.) > The efficiency was so high that we decided to keep the full change > history of ten years. It just takes only little additional resources. Excellent! > :-) > More likely they simply discard google groups in the future. There is no > profit in this service. Iirc, that the reason why they totally murdered Google+. Damn. |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 04 09:09PM -0800 On 2/3/2021 12:00 AM, Marcel Mueller wrote: >>> performs even better on 64 bit because you can safely use 3 bits. I >>> am unsure if it can be implemented without introducing UB - although >>> it is likely to work on any real hardware. [...] Humm... I did a another proxy collector, old and nowhere on the net now since Comcast killed their webhosting, that used stolen bits for the collectors. Was worried about overflowing them. Wrt, Line 273 in your code: const unsigned old_outer = InterlockedXad((volatile unsigned*)&Data, 1) + 1; const unsigned outer_count = old_outer & INT_PTR_COUNTER_MASK; ASSERT((old_outer & INT_PTR_COUNTER_MASK) != 0); // overflow condition The assert is here for that right? What am I missing? The fun part about a proxy collector is that it is not a general purpose smart pointer. Instead its more of a poor mans rcu where we can steal a lot of bits by aligning a collector on a large boundary. If the number of threads that take concurrent access to a strong reference exceeds the stolen bits at any one time, then overflow can and will occur. This condition will intrude on the actual pointer part, corrupting things. Well, that's the way it was with my old code using stolen bits as a reference count. Please correct me if I am missing something from your acquire function. Thanks. |
Marcel Mueller <news.5.maazl@spamgourmet.org>: Feb 05 07:43AM +0100 Am 05.02.21 um 05:22 schrieb Chris M. Thomasson: > function instead of skimming. That's where a lot of the "magic" happens. > If I am reading this correctly, there is a place where you need strong > CAS. Wrt C++ and its weak and strong CAS variants: I never heard about a weak CAS primitive before. But yes the code relies on strong CAS. This is essential to be loop free. And you should know that the code was exclusively intended for the x86 platform. x86 has very little (or no?) ability of memory reordering. So there are no further memory barriers required when doing simple atomic reads and writes. std::atomic should be more sophisticated. > Afaict, this will not work with weak CAS. Need to dig into it because > its interesting. Fwiw, in my experimental proxy collector, I was very > happy that I eliminated the need for any CAS. CAS is not bad unless it is a CAS /loop/. Of course, if read-modify-write memory cycles of the hardware can be used this is even better. > Basically, the inner count needs to be somehow transferred to the outer > count. Using a loopless wait-free method tends to perform better than a > CAS. It is a long time ago since I wrote this code, but AFAIR it is loopless. The number of atomic operations per access was limited to exactly 3. Other approaches may need only 2, but this will not work with only 2 stolen bits. [strong thread safety] > Well, its basically required for a thread/process to be able to take a > reference to something that it did not previously have a reference to. And this is likely to happen almost every time when you deal with an atomic instance of some referencing object. > Have you ever messed around with RCU? Pretty damn nice. I shortly heard the first time of RCU and only read a few articles. AFAICS it just delays deletions a bit to keep memory access valid. This sound a bit like Thread.Sleep or setTimeout solutions. From my experience this kind of solutions mainly move the bug from the developer to the customer. Grace periods are always too short to prevent corruption under some special cases with extreme load peaks, e.g. with swapping and I/O delays involved or maybe some DDOS attack (or just children trying to connect to their school :-) Well, at some point we need to note that computers also just happen to work from the quantum mechanics point of view. So probability /is/ a valid approach. But probabilities of concurrency issues mainly scale with the number of coincidences required to trigger them not with the time taken for the step. A single thread may always block for an unknown reason at any point. But it is rather unlikely that this happens to many of them at the same instruction at the same time. If you reduce RCU by just versioning then I have probably used it for decades. Atomic pointer updates and immutable data structures were the core of the in memory database application I mentioned. But I did not use any grace period. The PM123 application where the code link came from also uses this pattern for all internal strings. They are immutable and deduplicated. > collector can provide a sort of "poor mans" rcu. A proxy collector can > be created from any strong-thread safe method. Ideally lockfree, even > better when its loopless, and waitfree. I think I did not really catch the intended use case of your proxy collector for now. Marcel |
Marcel Mueller <news.5.maazl@spamgourmet.org>: Feb 05 08:25AM +0100 Am 05.02.21 um 06:09 schrieb Chris M. Thomasson: > const unsigned outer_count = old_outer & INT_PTR_COUNTER_MASK; > ASSERT((old_outer & INT_PTR_COUNTER_MASK) != 0); // overflow condition > The assert is here for that right? What am I missing? :-) Since OS/2 is a 32 bit platform and I did not want to introduce further alignment I have only two bits for the outer count. So only 3 threads can safely acquire the pointer at the same time. The 4th one will cause the counter to return to 0 - an overflow. But it is not that bad. All I have to do is to clear the counter as fast as possible. So the transfer of the counter value is done immediately after the pointer is acquired. The main reference counter is incremented by the snapshot value of the small counter first. And if resetting the counter to zero fails another thread already did the job an the increment of the main ref count is undone. The assertion never triggered. I tested really hard with ~100 threads each hammering the same instance in an infinite loop. In fact the counter never reached the value 3. 2 was sufficient in real life. This is the price for the stolen bits hack. The platform did not support 64 bit CAS because some (supported) CPUs did not implement the hardware instruction cmpxchg8b. The performance impact of the immediate transfer is small. Most likely any additional atomic access to the same counters are L1 cache hits. By the way, I could bet 3 stolen bits on 64 bit architectures will also be sufficient for modern hardware with many cores. Well, I never tested on SPARC T3 with hundreds of hardware threads or something like that. But the cache line hopping effectively prevents too many threads to be likely suspended after the first increments. Clearing the counter is amortized faster that incrementing it. That's enough to get no cumulative effect. The number of threads has no significant impact on the maximum counter value. > will occur. This condition will intrude on the actual pointer part, > corrupting things. Well, that's the way it was with my old code using > stolen bits as a reference count. The latter point I avoided by clearing the counter fast enough. > Please correct me if I am missing something from your acquire function. > Thanks. Absolutely not. It just did not happen ;-) Marcel |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 05 11:52AM -0800 On 2/4/2021 10:43 PM, Marcel Mueller wrote: >> need strong CAS. Wrt C++ and its weak and strong CAS variants: > I never heard about a weak CAS primitive before. But yes the code relies > on strong CAS. This is essential to be loop free. Indeed. The "problem" we both have with loopfree is an architecture that uses LL/SC. x86 has pessimistic atomic RMW's where a CAS failure, actually means that it failed because the destination was not equal to the comparand. So one can use a CAS for loopfree algorithms. However... On a system with optimistic RMW's means that a compare_exchange_weak can spuriously fail. This also means that compare_exchange_strong is going to use a loop, internally. It's simply not our fault. :^) > platform. x86 has very little (or no?) ability of memory reordering. So > there are no further memory barriers required when doing simple atomic > reads and writes. std::atomic should be more sophisticated. Completely understood. Well, actually... x86 does require an explicit membar in the form of MFENCE or a LOCK'ed RMW to implement certain algorithms. One example is the hazard pointer algorihtm. It depends on a store followed by a load to another location. This can be reordered on an x86. Are you familiar with hazard pointers? > CAS is not bad unless it is a CAS /loop/. > Of course, if read-modify-write memory cycles of the hardware can be > used this is even better. Yup. I have wrote about this in the past about how loopless algorithms can tend to lose their "luster" when implemented on an arch that uses optimistic RMW's, like when CAS is implemented in terms of LL/SC. > The number of atomic operations per access was limited to exactly 3. > Other approaches may need only 2, but this will not work with only 2 > stolen bits. Your algorihtm is loopless on x86, and SPARC, and on some others. However, on a system with LL/SC, not so much. This is just the way it is. Shit happens. >> reference to something that it did not previously have a reference to. > And this is likely to happen almost every time when you deal with an > atomic instance of some referencing object. It basically depends on the program, and how it accesses things. Some programs simply do not require strong thread safety. > sound a bit like Thread.Sleep or setTimeout solutions. From my > experience this kind of solutions mainly move the bug from the developer > to the customer. What bug? RCU uses nothing like a sleep or timeout. > special cases with extreme load peaks, e.g. with swapping and I/O delays > involved or maybe some DDOS attack (or just children trying to connect > to their school :-) RCU's not really "ideal" for data-structures with high amounts of updates. However, its not going to break because of them. > time taken for the step. A single thread may always block for an unknown > reason at any point. But it is rather unlikely that this happens to many > of them at the same instruction at the same time. You just might like this: https://youtu.be/DZJPqTTt7MA >> lockfree, even better when its loopless, and waitfree. > I think I did not really catch the intended use case of your proxy > collector for now. My proxy collector use case is basically the same use case as RCU. Readers can read, while writers are mutating the data structure, even deleting things. Actually, it might be better than RCU in high writer use cases. Humm... |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 05 12:07PM -0800 On 2/4/2021 11:25 PM, Marcel Mueller wrote: > alignment I have only two bits for the outer count. So only 3 threads > can safely acquire the pointer at the same time. The 4th one will cause > the counter to return to 0 - an overflow. Indeed. > But it is not that bad. All I have to do is to clear the counter as fast > as possible. I will read it carefully. Noticed something like this but I am not sure yet. Actually, porting your code over to C++17 would help me out here. > by the snapshot value of the small counter first. And if resetting the > counter to zero fails another thread already did the job an the > increment of the main ref count is undone. I need to dig into this a lot more. > The assertion never triggered. I tested really hard with ~100 threads > each hammering the same instance in an infinite loop. In fact the > counter never reached the value 3. 2 was sufficient in real life. This sounds a bit odd to me, but then again, I need to understand your code better. And, I need to understand your use cases. In my proxy collector here: https://pastebin.com/raw/p1E9WN5i It does not use stolen bits, but another technique. Think of wrt 32-bit: 0xRRRRRRRC Where R is for the reference count, and C is for the collector index. Millions of threads can increment the outer counter at the same time. No problem. > This is the price for the stolen bits hack. Indeed. > something like that. But the cache line hopping effectively prevents too > many threads to be likely suspended after the first increments. Clearing > the counter is amortized faster that incrementing it. Is that _totally_ safe? I need to examine it further. >> part, corrupting things. Well, that's the way it was with my old code >> using stolen bits as a reference count. > The latter point I avoided by clearing the counter fast enough. fast enough makes we wonder. >> Please correct me if I am missing something from your acquire >> function. Thanks. > Absolutely not. It just did not happen ;-) Cool! |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 05 12:09PM -0800 On 2/2/2021 11:13 AM, Öö Tiib wrote: > The concept is quite handy as same with intrusive pointers > takes to manually implement some kind of "shut down" state > and then to check and handle it explicitly. To be quite honest, I still have never actually used weak pointers wrt shared_ptr in real code. Need to study up. They seem a bit odd to me, oh well. |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 05 01:00PM -0800 On 2/5/2021 11:52 AM, Chris M. Thomasson wrote: > Yup. I have wrote about this in the past about how loopless algorithms > can tend to lose their "luster" when implemented on an arch that uses > optimistic RMW's, like when CAS is implemented in terms of LL/SC. [...] For some reason, I remember writing about how a spin on an adaptive mutex, or spinlock can be used to do other useful work. Instead of spinning on a contended lock, use this moment in time to try to do some real work. Iirc, the overall high-level pseudo-code was something like this: _______________________________ void ct_special_try_lock_high_level(...) { while (! ct_try_lock(...)) { if (! ct_try_to_do_some_other_work(...)) { ct_wait_lock(...); break; } } return; } _______________________________ Without a "limit" on the number of times try_lock can be invoked, it kind of turned into a little "message pump" type of scenario. Keep in mind that each "spin" means other meaningful work was actually accomplished! ;^) |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 05 02:33PM -0800 On 2/5/2021 11:52 AM, Chris M. Thomasson wrote: >> happens to many of them at the same instruction at the same time. > You just might like this: > https://youtu.be/DZJPqTTt7MA [...] Basically, calling a blocking operation while holding a mutex is bad! Blocking while inside a RCU region is bad! |
Brian Wood <woodbrian77@gmail.com>: Feb 04 04:01PM -0800 On Thursday, February 4, 2021 at 4:54:11 PM UTC-6, Mr Flibble wrote: > > resources into it. It's important to me. Anyway, I don't expect you > > to suddenly appreciate it. Maybe in a few years. > You mean you have invested thousands of hours and a large percentage of your resources into spamming this newsgroup about it? I've invested much more in the software and company than in posting here. But I'm used to your attempted smears. They were more effective years ago in my opinion. |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 04 07:55PM -0800 On 2/4/2021 8:41 AM, Rick C. Hodgin wrote: >>> vast. I map out very large and complex projects that I can envision >>> and see all of the parts for, but I am able to do them in terms of >>> labor hours due to life things. [...] Put your CAlive on a bootable CD/DVD, provide an iso... Therefore, people can install your OS, tools, and everything. all in one go! Nice and easy. calive.iso Where do I download it? Iirc, you said you would have one? |
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Feb 05 12:15AM -0500 On Thu, 4 Feb 2021 20:53:52 +0000 > The reason I write "And Satan invented fossils? Yes?" is because it > is a sarcastic, satirical and rhetorical question that dismisses your > childlike creationist beliefs with little effort on my part. That is exactly what that enemy spirit wants you to believe, Leigh. But here's what's really happening: Jesus is seated upon His throne. The world really is moving according to the Biblical plan. Christians have come to the cross and met Jesus and asked forgiveness, and they really are passing from death to life. And the warnings they give people like you really are warnings from God through them. That is truth. That is reality. What the lying anti-Christ spirit wants you to believe is that you have a handle on things when your not only really do not, but are so far away from the truth that, unless you come to faith in this world, one day when you stand before God in judgment you will be in awe at how lost in sin you were, and how twisted your thinking, feelings, emotions, beliefs, were by the influence of that lying spirit. ----- You're facing a calamity, Leigh. The way out is the very thing that enemy spirit entices you to not only dismiss, but also attack outright. I care about you, Leigh. I care about all people. It's why I teach you the truth. Now consider your sign-off to me: "Now .. off." Here's my sign-off to you: "You are loved, Leigh. Loved by God as part of His creation. You have sin, and Jesus came to this world to forgive YOU personally, to take on YOUR sin and set you free from the judgment you're facing today. God loves you enough to look past your hatred of Him for all these years. He loves you enough to still call out to you, still desires you to be a part of His eternal Kingdom of glory and power." You are beautiful and precious, Leigh. And there's a calling to a better way for you, your life here, and in eternity. Peace. -- Rick C. Hodgin |
David Brown <david.brown@hesbynett.no>: Feb 05 11:28AM +0100 On 05/02/2021 04:55, Chris M. Thomasson wrote: > and easy. > calive.iso > Where do I download it? Iirc, you said you would have one? If you want to encourage Rick with this, please do it by email or in his dedicated CAlive google group. (And I mean that literally - please do support him if you think there is a future for his projects, but please do it in appropriate ways.) We've had more than enough about it here and in c.l.c., where it is equally off-topic. And we have had more than enough of the pointless repetitive shooting matches between Rick and Mr. Flibble. Neither of them seems to recognize a lost cause when it is so clear to everyone else. |
David Brown <david.brown@hesbynett.no>: Feb 05 11:43AM +0100 On 04/02/2021 23:09, Brian Wood wrote: >>> in using? > Survival of the fittest -- to demonstrate their talents and strengths to > the world? So you think people will do your work for the bragging rights of finding mistakes or poor choices in your code? I think you overestimate the reputation you have as a programmer. Finding issues in well-developed, well-reviewed code that has been inspected by large numbers of programmers - /that/ gives bragging rights. Finding possible improvements in ordinary code written by one ordinary programmer and checked by no one is merely part of the daily grind for a coder. > Or to the victor go the spoils: Are you offering money for the work? That would be entirely reasonable, and much more likely to succeed. Or are you merely offering Biblical quotations and the promise of Brownie points in the next life? That's a harder sell for most potential code reviewers. > Maybe because I'm persistent? You think that if you bug people enough, someone is going to review your code to make you shut up? That's /really/ stretching it. > Or some other reason that I haven't considered. Personally, I think the main reason is that you have not stopped and taken the time to consider how you come across to others. Don't get me wrong here - I appreciate that you've put a lot of effort into your code and you want it to succeed. I'm trying to give suggestions to how you could improve your chances. And the most obvious suggestion is that when you are doing the same things again and again for years, with /zero/ results, it is time to change your tactics. Get off that "holier-than-thou" high horse of yours and try to understand what could possibly motivate someone to use your project, or to join you in its development. People need /concrete/ reasons, not delusions of grandeur, and not mindless optimism about how you think everyone will use it in the future. |
Mr Flibble <flibble@i42.REMOVETHISBIT.co.uk>: Feb 05 03:24PM On 05/02/2021 05:15, Rick C. Hodgin wrote: > really are passing from death to life. And the warnings they give > people like you really are warnings from God through them. > That is truth. That is reality. And Satan invented fossils, yes? Spammer? /Flibble -- 😎 |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 05 12:18PM -0800 On 2/5/2021 2:28 AM, David Brown wrote: > equally off-topic. And we have had more than enough of the pointless > repetitive shooting matches between Rick and Mr. Flibble. Neither of > them seems to recognize a lost cause when it is so clear to everyone else. Apparently he banned me. However, he did promise me around a year ago to have an iso. I am willing to install it on an older x86. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |
No comments:
Post a Comment