- "Microsoft Azure CTO Mark Russinovich: C/C++ should be deprecated" - 4 Updates
- Never use strncpy! - 3 Updates
| Kaz Kylheku <864-117-4973@kylheku.com>: Sep 28 04:56PM
> were pushed in reverse order and the target architectures of the day
> had limited registers, it follows that compiler writers in the day
> would process arguments in reverse order when generating code.
Not just limited registers, but limited analysis: you can reorder the evaluation to go in the stack-convenient order in cases where you can confirm that it makes no difference. |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Sep 28 01:15PM -0400
On 9/28/22 04:26, Kaz Kylheku wrote:
>> You only do it when the order actually matters, not automatically for
>> all function calls.
> OK; I still need the compiler to tell me where those places are;
You've got the communications direction wrong. Writing code that way is how you tell the compiler that the order matters. Writing it as a simple function call without explicit temporaries is how you tell the compiler that you think the order doesn't matter. A good compiler should inform you if it realizes that the assumption is incorrect. |
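A minimal sketch of the explicit-temporaries idiom described above. The names `call_log`, `next_id`, and `sum3` are hypothetical and exist only for illustration. In the plain call the order in which the three argument expressions are evaluated is unspecified; in the second form each declaration is its own full-expression, so the calls happen in source order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper state: records the order in which next_id() is called.
inline std::vector<int>& call_log() {
    static std::vector<int> log;
    return log;
}

// Hypothetical function with a visible side effect.
inline int next_id(int tag) {
    call_log().push_back(tag);
    return tag;
}

inline int sum3(int a, int b, int c) { return a + b + c; }

// Plain call: the argument expressions may be evaluated in any order,
// so the contents of call_log() afterwards are not portable.
inline int call_unordered() {
    call_log().clear();
    return sum3(next_id(1), next_id(2), next_id(3));
}

// Explicit temporaries: each initialization completes before the next begins,
// so the log is always 1, 2, 3.
inline int call_ordered() {
    call_log().clear();
    const int a = next_id(1);
    const int b = next_id(2);
    const int c = next_id(3);
    assert(call_log().size() == 3);
    return sum3(a, b, c);
}
```

Both functions return the same sum; only `call_ordered` guarantees the order of the side effects.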
| Bo Persson <bo@bo-persson.se>: Sep 28 08:31PM +0200
On 2022-09-28 at 18:56, Kaz Kylheku wrote:
> Not just limited registers; but limited analysis. Because you can
> reorder the evaluation to go in the stack-convenient order, in cases
> where you can confirm that it makes no difference.
The "stack convenient" order is very important for some common functions, like printf. Evaluating that function call left-to-right and pushing the parameters so that the format string is at the bottom of an unknown-sized parameter pack is Extremely Inconvenient(TM).
So the evaluation is ordered so that the format string is pushed last and easy to find for printf. Kind of important for it to figure out the types and numbers of the other parameters. |
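To illustrate why a printf-style callee has to find its format string first, here is a small sketch of a variadic function that can only decode its remaining arguments by walking the leading specification string. `sum_by_spec` and its one-letter codes are hypothetical, not part of any library.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical mini-decoder: each 'i' in spec consumes one int argument and
// each 'd' consumes one double. Without spec, the callee would know neither
// how many arguments follow nor what types they have.
inline double sum_by_spec(const char* spec, ...) {
    va_list args;
    va_start(args, spec);
    double total = 0.0;
    for (const char* p = spec; *p != '\0'; ++p) {
        if (*p == 'i') {
            total += va_arg(args, int);
        } else if (*p == 'd') {
            total += va_arg(args, double);
        }
    }
    va_end(args);
    return total;
}
```

A call such as `sum_by_spec("iid", 1, 2, 0.5)` reads two ints and one double, in that order, purely on the strength of the string.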
| doctor@doctor.nl2k.ab.ca (The Doctor): Sep 28 09:45PM
In article <jpji7nFi2gbU1@mid.individual.net>,
>So the evaluation is ordered so that the format string is pushed last
>and easy to find for printf. Kind of important for it to figure out the
>types and numbers of the other parameters.
C/C++ being deprecated? How would computing work?
--
Member - Liberal International This is doctor@nk.ca Ici doctor@nk.ca
Yahweh, King & country! Never Satan President Republic! Beware AntiChrist rising!
Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b
Quebec, forget the extremes and vote PLQ Beware https://mindspring.com |
| David Brown <david.brown@hesbynett.no>: Sep 28 09:32AM +0200
On 27/09/2022 21:42, Chris M. Thomasson wrote:
> Yes. Using an address based hashed locking scheme works just in case the
> arch does not support the direct CPU instruction(s) (think CAS vs LL/SC)
> for an atomic RMW operation.
LL/SC /is/ a locking scheme - using a hardware lock. And neither CAS nor LL/SC works for RMW or even plain write operations that are bigger than the processor can handle in a single write action.
> However, the locking emulation is most
> definitely, not ideal.
Not lock-free, indeed.
Processors can generally handle lock-free atomic access of a single object of limited size - usually the natural width for the processor. Some processors have instructions for double-width atomic accesses (such as a double compare-and-swap). And sometimes instruction sequences, such as LL/SC with loops, are needed - especially for RMW.
Lock-free algorithms beyond that are for specific data structures. You can't make lock-free atomic access to a 32-byte object. You either have to use locks (as will be done with a std::atomic<> for the type, or using the C11 _Atomic qualifier) or, if you want lock-free access, wrap it all up in a more advanced structure, using something like a lock-free atomic pointer to the "current" version of the data allocated on a heap.
> Sorry about that non-sense David: Wrt the dangling comma. Forgot to
> introduce the addend for the fetch-add RMW operation.
> Shit happens. :^)
That's just a minor detail, so not a problem at all. |
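A rough sketch of the "atomic pointer to the current version" approach mentioned above, under loud simplifying assumptions: there is exactly one writer thread, and old versions are never freed until the holder is destroyed, which sidesteps the hard problem of safe memory reclamation. `Config` and `VersionedConfig` are hypothetical names, not from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical 32-byte payload, too wide for a single atomic access.
struct Config {
    std::uint64_t w[4];
};

// Assumption: a single writer calls publish(); any number of readers call read().
// Every published version stays alive until this object is destroyed.
class VersionedConfig {
public:
    explicit VersionedConfig(const Config& initial) {
        versions_.push_back(std::make_unique<Config>(initial));
        current_.store(versions_.back().get(), std::memory_order_release);
    }

    // Reader side: one atomic pointer load, no lock taken.
    const Config* read() const {
        const Config* p = current_.load(std::memory_order_acquire);
        assert(p != nullptr);
        return p;
    }

    // Writer side (single thread only): build the new version on the heap,
    // then publish it with one atomic pointer store.
    void publish(const Config& next) {
        versions_.push_back(std::make_unique<Config>(next));
        current_.store(versions_.back().get(), std::memory_order_release);
    }

private:
    std::atomic<const Config*> current_{nullptr};
    std::vector<std::unique_ptr<Config>> versions_;  // touched by the writer only
};
```

A reader always sees either the old or the new 32-byte value in full, never a mix, because only the pointer changes atomically. A production design would need a reclamation scheme for old versions, which this sketch deliberately leaves out.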
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 28 01:34PM -0700
On 9/28/2022 12:32 AM, David Brown wrote:
> LL/SC /is/ a locking scheme - using a hardware lock. And neither CAS
> nor LL/SC work for RMW or even plain write operations that are bigger
> than the processor can handle in a single write action.
Correct. Imvho, the hardware itself is a _lot_ more efficient at these types of things... Agreed in a sense? I actually prefer pessimistic CAS over optimistic primitives like LL/SC. Iirc, an LL/SC can fail just by reading from the reservation granule. Let alone writing to it... PPC had a special section in its docs that explains the possible issue of a livelock. Iirc, even CAS has some special logic in the processor that can actually assert a bus lock.
> Some processors have instructions for double-width atomic accesses (such
> as a double compare-and-swap). And sometimes instruction sequences,
> such as LL/SC with loops, are needed - especially for RMW.
Afaict, DWCAS is there to help get around the ABA problem à la IBM sysv appendix, oh shit, I forgot the appendix number. I used to know it, decades ago. I will try to find it.
> have to wrap it all up in a more advanced structure, using something
> like a lock-free atomic pointer to the "current" version of the data
> allocated on a heap.
Agreed. Although, I have created lock-free allocators that never used dynamic memory, believe it or not. Everything exists on thread stacks. And memory from thread A could be "freed" by another thread. I remember a project I had to do for a Quadros-based system. Completely based on stacks. Wow, what a time.
>> introduce the addend for the fetch-add RMW operation.
>> Shit happens. :^)
> That's just minor detail, so not a problem at all.
Thanks. :^) |
| scott@slp53.sl.home (Scott Lurndal): Sep 28 09:11PM
>a special section in its docs that explain the possible issue of a live
>lock. Iirc, even CAS has some special logic in the processor that can
>actually assert a bus lock.
When ARM was designing their 64-bit architecture (ARMv8) circa 2011/2, they only provided an LL/SC equivalent (load-exclusive/store-exclusive). Their architecture partners at the time quickly requested support for real RMW atomics, which were added as part of the LSE (Large System ISA Extensions): LDADD, LDCLR (and with complement), LDSET (or), LDEOR (xor), LDSMAX (signed maximum), LDUMAX (unsigned maximum), LDSMIN, LDUMIN.
The processor fabric forwards the operation to the point of coherency (e.g. the L2/LLC) for cacheable memory locations and to the endpoint for uncacheable memory locations (e.g. a PCI Express or CXL endpoint). |
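For readers following along in C++, the distinction above can be sketched at the source level: a single read-modify-write call versus a compare-and-swap retry loop that computes the same result. The function names are hypothetical. Whether a compiler lowers `fetch_add` to one instruction such as LDADD or to an exclusive-load/store loop depends on the target and flags, so treat that mapping as an assumption about the toolchain rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>

// One RMW operation expressed directly; returns the previous value.
inline int add_rmw(std::atomic<int>& counter, int addend) {
    return counter.fetch_add(addend, std::memory_order_relaxed);
}

// The same addition built from a CAS retry loop; returns the previous value.
// On failure, compare_exchange_weak reloads the current value into expected.
inline int add_cas_loop(std::atomic<int>& counter, int addend) {
    int expected = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(expected, expected + addend,
                                          std::memory_order_relaxed)) {
        // Retry with the refreshed expected value.
    }
    return expected;
}

// Atomic signed maximum via a CAS loop (std::atomic in C++20 has no fetch_max
// member); returns the value observed before any update.
inline int max_cas_loop(std::atomic<int>& value, int candidate) {
    int observed = value.load(std::memory_order_relaxed);
    while (observed < candidate &&
           !value.compare_exchange_weak(observed, candidate,
                                        std::memory_order_relaxed)) {
        // Retry while the stored value is still smaller than candidate.
    }
    return observed;
}
```

Relaxed ordering is used only to keep the sketch short; real code would pick the memory order its algorithm requires.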