- Onwards and upwards - 2 Updates
- Poor Mans RCU... - 4 Updates
- std::atomic<std::shared_ptr<T>>... - 3 Updates
- gcc libatomic "sample library" - 2 Updates
- #include'ing .c files considered harmful? - 1 Update
Brian Wood <woodbrian77@gmail.com>: Feb 14 02:56PM -0800

On Friday, February 5, 2021 at 4:43:26 AM UTC-6, David Brown wrote:
> programmers - /that/ gives bragging rights. Finding possible
> improvements in ordinary code written by one ordinary programmer and
> checked by no one is merely part of the daily grind for a coder.

Perhaps we can at least agree that services are the most important form
of software today and that C++ is the most important language for
services.

> and much more likely to succeed. Or are you merely offering Biblical
> quotations and the promise of Brownie points in the next life? That's a
> harder sell for most potential code reviewers.

A lot of code review is done for free:
https://www.reddit.com/r/codereview

"Furthermore, the Israelites acted on Moses' word and asked the
Egyptians for articles of silver and gold, and for clothing. And the
L-RD gave the people such favor in the sight of the Egyptians that
they granted their request. In this way they plundered the Egyptians."
Exodus 12:35,36

The Israelites didn't pay for the items of gold and silver. G-d was
saving them from their oppressors. Unfortunately, some of the regulars
here are oppressors.

Brian
Ebenezer Enterprises
https://github.com/Ebenezer-group/onwards
Mr Flibble <flibble@i42.REMOVETHISBIT.co.uk>: Feb 14 10:58PM

On 14/02/2021 22:56, Brian Wood wrote:
> Brian
> Ebenezer Enterprises
> https://github.com/Ebenezer-group/onwards

You might as well be a bot with that reply, fucktard.

/Flibble

--
😎
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 13 10:39PM -0800 On 2/13/2021 4:54 AM, Jorgen Grahn wrote: > I.e. std::cout should be in line buffered mode. > At least on Unix (where the whole output/error stream thing comes > from). I don't know if that's guaranteed. |
Manfred <noname@add.invalid>: Feb 14 06:01PM +0100

On 2/14/2021 7:39 AM, Chris M. Thomasson wrote:
>> At least on Unix (where the whole output/error stream thing comes
>> from).
> I don't know if that's guaranteed.

As Öö Tiib pointed out, the difference between std::endl and '\n' in
C++ is exactly that the former executes basic_ostream::flush() while
the latter doesn't.

From the example in https://en.cppreference.com/w/cpp/io/manip/endl :
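A minimal illustration of that difference (not the cppreference
example, just a sketch):

    // Both lines end with a newline, but only std::endl also forces a
    // flush of the stream buffer.
    #include <iostream>

    int main() {
        std::cout << "flushed immediately" << std::endl; // '\n' + flush()
        std::cout << "flushed when the buffer fills, at exit, "
                     "or on an explicit flush\n";        // '\n' only
        std::cout << std::flush;                         // explicit flush
    }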
Jorgen Grahn <grahn+nntp@snipabacken.se>: Feb 14 08:12PM

On Sun, 2021-02-14, Chris M. Thomasson wrote:
> On 2/13/2021 4:54 AM, Jorgen Grahn wrote:
>> On Tue, 2021-02-09, Chris M. Thomasson wrote:
...
>> At least on Unix (where the whole output/error stream thing comes
>> from).
> I don't know if that's guaranteed.

I think it is, but it would be nice to have it confirmed. I think I
can quote W R Stevens, but he only writes about Unix.

If the people with the problems e.g. ran the code in an IDE, that
would explain it. Or piped the output through less(1).

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
scott@slp53.sl.home (Scott Lurndal): Feb 14 10:22PM

>> I don't know if that's guaranteed.
>I think it is, but it would be nice to have it confirmed. I think I
>can quote W R Stevens, but he only writes about Unix

POSIX requires stdout to be line buffered if and only if the
underlying file descriptor refers to a terminal, serial port, console
or pseudoterminal device (isatty() == true). Otherwise it will be
fully buffered. stderr is unbuffered by default.

The application controls the buffering using setvbuf(3) or setbuf(3),
and it is often useful for an application to explicitly set the
buffering mode to line-buffered so that when redirected to a file,
the output is available to other tools like the tail(1) command line
by line.
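A minimal example of that setvbuf(3) usage:

    // Force line buffering on stdout regardless of isatty(), so that
    // "prog > log &" together with "tail -f log" sees each line as it
    // is written. Must be called before the first write to the stream.
    #include <cstdio>

    int main() {
        std::setvbuf(stdout, nullptr, _IOLBF, BUFSIZ);
        for (int i = 0; i < 3; ++i)
            std::printf("progress line %d\n", i); // flushed at each '\n'
    }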
Marcel Mueller <news.5.maazl@spamgourmet.org>: Feb 14 02:59PM +0100

Am 05.02.21 um 21:07 schrieb Chris M. Thomasson:
>> fast as possible.
> I will read it carefully. Noticed something like this but I am not sure
> yet. Actually, porting your code over to C++17 would help me out here.

I did a rough port to C++17 using atomic<uintptr_t>. Unfortunately
times have changed. The code is no longer reliable in general. :-(

It works under OS/2 (the original, x86). It works in a Linux VM (x64).
It does /not/ work on the host of the VM (same hardware, AM4). The
stolen bits counter overflows soon. It works on my local PC (AM3+). It
does not work on a Ryzen 16 core, nor on a Xeon, nor on an ARMv7 quad
core.

The maximum stolen bits count scales nonlinearly with the number of
CPU cores. It is less on Xeon and ARM than on AMD. It is very
interesting that the maximum count is at least 30% less if the code is
executed in a VM on the same hardware. The scheduler seems to have
some influence. In fact the code runs twice as fast inside the VM!

I tested with 300 threads hammering on the same atomic instance in an
infinite loop. The duration of the test has almost no effect. The
number of threads also has no significant effect as long as there are
enough to reach the maximum.

>> counter never reached the value 3. 2 was sufficient in real life.
> This sounds a bit odd to me, but then again, I need to understand your
> code better.

It /is/ odd. My tests were quite long ago. And my VM usually used for
development was one of them that worked. (max count 4 of 7 allowed on
x64)

There seems to be no alternative to DWCAS for the atomic version. :-/
An intrusive reference counted smart pointer is still useful. But it
is no longer wait free if the platform does not support DWCAS.

> Where R is for the reference count, and C is for the collector index.
> Millions of threads can increment the outer counter at the same time.
> No problem.

The collector index?

Marcel
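For reference, a minimal sketch of the stolen-bits idea being
discussed (illustrative only; Marcel's actual port is not shown in the
thread):

    // An 8-byte-aligned object leaves the low 3 bits of its address
    // free, so they can hold a tiny in-flight acquire count.
    #include <atomic>
    #include <cstdint>

    struct alignas(8) Node {
        std::atomic<long> refs{1}; // full-width reference count
    };

    constexpr std::uintptr_t kMask = 7; // 3 stolen bits: counts 0..7

    std::atomic<std::uintptr_t> anchor{0}; // pointer | in-flight count

    Node* acquire() {
        // A single wait-free RMW bumps the stolen count _and_ reads
        // the pointer, so there is no window in which the object can
        // be freed out from under us.
        std::uintptr_t v = anchor.fetch_add(1, std::memory_order_acquire);
        // If 8 or more threads sit here concurrently, the count
        // carries into the pointer bits - the overflow Marcel observes.
        return reinterpret_cast<Node*>(v & ~kMask);
    }
    // Elided: transferring the stolen count into n->refs and swapping
    // the anchor, which needs CAS loops (or DWCAS for a wider count).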
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 14 12:46PM -0800 On 2/14/2021 5:59 AM, Marcel Mueller wrote: > There seems no alternative to DWCAS for the atomic version. :-/ > An intrusive reference counted smart pointer is still useful. But it is > no longer wait free if the platform does not support DWCAS. Yeah. Looking at your code, I was just worrying about a shi% load of threads all taking a reference to the strong pointer at the same time. That would overflow it rather quickly. Now, I have some old proxy collector code that steals enough bits to hold an 8-bit counter, so that's 256 threads. However, if more than 256 threads take the strong count at the same time, the it will overflow. I need to find it on archive.org. Luckily, it just might have it. https://web.archive.org/web/2017*/http://webpages.charter.net/appcore >> Millions of threads can increment the outer counter at the same time. >> No problem. > The collector index? The collector index is embedded within the counter so I can increment the reference count and grab the collector index in a single atomic RMW, fetch_add in this case. Then I decode the it from the return value and use it as an index into collector objects, there are two collectors in this case. Take a careful look at the following code in my proxy collector: https://pastebin.com/raw/CYZ78gVj ____________________________ collector& acquire() { // increment the master count _and_ obtain current collector. std::uint32_t current = m_current.fetch_add(ct_ref_inc, std::memory_order_acquire); // decode the collector index. return m_collectors[current & ct_proxy_mask]; } ____________________________ It returns a reference to the indexed collector. |
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 14 02:05PM -0800 On 2/7/2021 11:35 PM, Öö Tiib wrote: >> Do weak_ptrs adjust the reference count at all? Please try to excuse my >> ignorance here. ;^o > The weak references are simply counted too (as weak references). This quote is interesting to me: https://en.cppreference.com/w/cpp/memory/weak_ptr ___________________ std::weak_ptr models temporary ownership: when an object needs to be accessed only if it exists, and it may be deleted at any time by someone else, std::weak_ptr is used to track the object, and it is converted to std::shared_ptr to assume temporary ownership. If the original std::shared_ptr is destroyed at this time, the object's lifetime is extended until the temporary std::shared_ptr is destroyed as well. ___________________ Does shared_ptr have a "separate" reference count to weak_ptr's? |
David Brown <david.brown@hesbynett.no>: Feb 14 11:46AM +0100

On 13/02/2021 20:42, Marcel Mueller wrote:
> holds the current value, one the next value. When you synchronize only
> writers they can safely swap the storage pointer. No need to synchronize
> readers. Often that's enough.

No, that won't work. It was the first thing that came to my mind too,
but it is not sufficient.

Let's model the object we want to access atomically as:

    typedef struct T { uint32_t lo; uint32_t hi; } T;

You store two copies:

    volatile T d[2];

and a pointer:

    volatile T* volatile p = &d[0];

Updating will be something like:

    void update(T x) {
        get_writer_lock();
        volatile T* q = &d[1 - (p - d)];
        *q = x;
        p = q;
        release_lock();
    }

You are using extra synchronisation for writing, which is not ideal,
but it is not uncommon to have only a single writer. You are then
suggesting that this is safe for reading:

    T read(void) {
        return *p;
    }

Let's break this down. The implementation will be something like:

    T read(void) {
        T x;
        volatile T* q = p;  // point 1
        x.lo = q->lo;       // point 2
        x.hi = q->hi;
        return x;
    }

If the reader thread is pre-empted (or interrupted) at point 1 by the
writer, that's okay - the reader doesn't see the new data, but it gets
the consistent old one, as the writer has modified the new copy. The
same happens if it is pre-empted at point 2. Since the pointer is read
atomically, the data is consistent.

Except... what if the writer does two updates? Or two writer threads
run while the reader thread is paused? Then a writer is stomping all
over the data that the reader thread has partially read.

So it is not nearly as simple as you imply. It is a step towards a
solution, but requires work. A "store/load exclusive" loop can make
reading safe, but you still need a synchronisation mechanism for the
writers that requires locking (and thus fails to be a generic
lock-free mechanism).

In the common situation of a single writer and a single reader, you
can use this kind of arrangement. But you use three copies, not two,
and you have tracking of which buffer is used by the reader and
writer. Even then it's a bit fiddly, and the most efficient solutions
need knowledge of how the threads can interact and interrupt each
other.

> increment the master pointer atomically before using the storage it
> points to. Now writers know that they should not discard or modify this
> storage.

Your mention of "high update rate" is perhaps why I am not happy with
your solutions. You are talking about things that will likely work
well in most cases - I am trying to look at things that are guaranteed
to work in /every/ case.

When you have specific cases, you can pick solutions that are
efficient and work given the assumptions that are valid in that case.
Maybe you have only one writer, maybe you know about the
synchronisation and which thread can pre-empt the other - and so on.
But a library that comes with a toolchain that implements atomic read
and write of any sized data needs to work in /all/ cases. "Usually
sufficient" is not good enough, nor is "assuming a low update rate" or
any other assumption. It needs to work /every/ time, with no
additional assumptions (except perhaps assumptions that can be checked
at compile or link time, such as specific OS support).

>>> with different priorities it is not suitable for this case.
>> All RTOS systems are sensitive to priority inversion,
> Sure, but lock free algorithms are not. ;-)

Agreed. But atomics are not exclusively about being lock-free.
Lock-free atomics let you build lock-free algorithms that can scale
well across multiple cores. However, atomics are fundamentally about
making particular operations appear unbreakable - and that applies
even if the atomic operation in question is not lock-free. Beyond a
certain size or complexity, operations invariably require some kind of
lock (or special hardware support) to be atomic - and then you have
locks, and you have sensitivity to priority inversion.

> Is it possible to raise the priority of all mutex users for the time of
> the critical section? This will still abuse a time slice if the spin
> lock does not explicitly call sleep. But at least it will not deadlock.

Doing something like that would negate all the benefits of trying to
use atomics rather than simply using mutexes in the first place. It is
far better (again, in the single core case) to simply disable
interrupts for the short code section needed to do the atomic access.
This has the effect of raising the priority of the current thread to
the maximum - but it does so for the shortest possible time.
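For the record, one way the three-copy arrangement can look (a minimal
single-writer/single-reader sketch; the class name, bit layout and
memory orders are illustrative, not from the thread):

    // Triple buffer: the writer and the reader each own one slot
    // outright, and the third "latest" slot is handed back and forth
    // with a single atomic exchange, so neither side ever touches a
    // slot the other is using.
    #include <atomic>

    template <typename T>
    class TripleBuffer {
        static constexpr unsigned kDirty = 4;    // "new data" flag
        static constexpr unsigned kIndexMask = 3;

        T buf_[3] = {};
        unsigned write_ = 0;              // slot owned by the writer
        unsigned read_ = 1;               // slot owned by the reader
        std::atomic<unsigned> latest_{2}; // published slot (+ dirty bit)

    public:
        // Writer: fill our slot, then swap it with the latest slot.
        void write(const T& v) {
            buf_[write_] = v;
            unsigned old = latest_.exchange(write_ | kDirty,
                                            std::memory_order_acq_rel);
            write_ = old & kIndexMask; // recycle the old latest slot
        }

        // Reader: returns false if nothing new has been published.
        bool read(T& out) {
            if (!(latest_.load(std::memory_order_relaxed) & kDirty))
                return false;
            unsigned old = latest_.exchange(read_,
                                            std::memory_order_acq_rel);
            read_ = old & kIndexMask;  // take the freshest slot
            out = buf_[read_];
            return true;
        }
    };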
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 14 12:55PM -0800 On 2/14/2021 2:46 AM, David Brown wrote: > but it is not sufficient. > Let's model the object we want to access atomically as: > typedef struct T { uint32_t lo; uint32_t hi; } T; For some reason this reminds me of Joe Seighs 63-bit atomic counter: https://groups.google.com/g/comp.lang.asm.x86/c/FScbTaQEYLc/m/X0gAskwQW44J ;^) [...] |
Brian Wood <woodbrian77@gmail.com>: Feb 13 06:42PM -0800

On Friday, February 12, 2021 at 2:25:27 AM UTC-6, David Brown wrote:
> > size_t stream_counter;
> > };
> Both are horrible and unreadable.

What I want to be sure of is that the second form is a benign
refactoring.

> This is C++ - try making a template
> or inherited structures, perhaps with a /single/ conditional compilation
> part at the end to give an alias to the struct you want.

This is a C library that I'm using.

Brian
Ebenezer Enterprises - Enjoying programming again.
https://webEbenezer.net
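For illustration, the shape David is suggesting might look like this
(field and macro names other than stream_counter are hypothetical, and
of course a C library cannot use this directly):

    // Common fields go in a base struct, the optional field in a
    // derived one, and a single conditional alias at the end replaces
    // the #ifdefs inside the struct body.
    #include <cstddef>

    struct counters_base {
        std::size_t byte_counter;   // fields present in every build
    };

    struct counters_ext : counters_base {
        std::size_t stream_counter; // field in some builds only
    };

    #ifdef ENABLE_STREAM_COUNTER    // hypothetical config macro
    using counters = counters_ext;
    #else
    using counters = counters_base;
    #endif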