soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

"Microsoft Azure CTO Mark Russinovich: C/C++ should be deprecated" - 4 Updates
Never use strncpy! - 3 Updates

"Microsoft Azure CTO Mark Russinovich: C/C++ should be deprecated"

Kaz Kylheku <864-117-4973@kylheku.com>: Sep 28 04:56PM

> were pushed in reverse order and the target architectures of the day
> had limited registers, it follows that compiler writers in the day
> would process arguments in reverse order when generating code.

Not just limited registers; but limited analysis. Because you can
reorder the evaluation to go in the stack-convenient order, in cases
where you can confirm that it makes no difference.

James Kuyper <jameskuyper@alumni.caltech.edu>: Sep 28 01:15PM -0400

On 9/28/22 04:26, Kaz Kylheku wrote:

>> You only do it when the order actually matters, not automatically for
>> all function calls.

> OK; I still need the compiler to tell me where those places are;

You've got the communications direction wrong. Writing code that way is
how you tell the compiler that the order matters. Writing it as a simple
function call without explicit temporaries is how you tell the compiler
that you think the order doesn't matter. A good compiler should inform
you if it realizes that the assumption is incorrect.
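
A minimal sketch of the two styles described above. The helper names (next_id, combine, unordered_call, ordered_call) are hypothetical and exist only for illustration. In the plain call the two argument expressions may be evaluated in either order; with named temporaries each initialization completes before the next one starts.

```cpp
#include <vector>

// Hypothetical helper: appends its id to a log so the call order is visible.
int next_id(std::vector<int>& log, int id) {
    log.push_back(id);
    return id;
}

// Hypothetical consumer of two values.
int combine(int a, int b) { return a * 10 + b; }

// Plain call: the order of the two next_id calls is unspecified, so the
// log may end up as {1, 2} or {2, 1} depending on the compiler.
int unordered_call(std::vector<int>& log) {
    return combine(next_id(log, 1), next_id(log, 2));
}

// Explicit temporaries: each declaration is fully evaluated before the
// next begins, so the log is always {1, 2}.
int ordered_call(std::vector<int>& log) {
    const int first = next_id(log, 1);
    const int second = next_id(log, 2);
    return combine(first, second);
}
```

Both functions return the same value, because the arguments are still bound by position; only the order of the side effects on the log can differ.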

Bo Persson <bo@bo-persson.se>: Sep 28 08:31PM +0200

On 2022-09-28 at 18:56, Kaz Kylheku wrote:

> Not just limited registers; but limited analysis. Because you can
> reorder the evaluation to go in the stack-convenient order, in cases
> where you can confirm that it makes no difference.

The "stack convenient" order is very important for some common
functions, like printf. Evaluating that function call left-to-right and
pushing the parameters so that the format string is at the bottom of an
unknown-sized parameter pack is - Extremely Inconvenient(TM).

So the evaluation is ordered so that the format string is pushed last
and easy to find for printf. Kind of important for it to figure out the
types and numbers of the other parameters.
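
To illustrate why the callee needs to find the format string first, here is a rough sketch of a variadic function in the printf style. The function mini_format and its tiny spec language ('d' for int, 's' for a C string) are invented for this example and are not part of any library. The callee knows nothing about the trailing arguments until it has read the one named parameter. On many current ABIs the first few arguments travel in registers, so the stack picture above is mostly a historical one.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical mini-formatter: each 'd' in spec consumes an int, each 's'
// consumes a const char*, and any other character is copied through.
std::string mini_format(const char* spec, ...) {
    std::string out;
    va_list args;
    va_start(args, spec);  // start reading after the last named parameter
    for (const char* p = spec; *p != '\0'; ++p) {
        if (*p == 'd') {
            out += std::to_string(va_arg(args, int));
        } else if (*p == 's') {
            out += va_arg(args, const char*);
        } else {
            out += *p;
        }
    }
    va_end(args);
    return out;
}
```

A call such as mini_format("d-s", 42, "x") reads one int and then one string, purely because the spec says so; passing arguments that do not match the spec is undefined behaviour, just as with printf.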

doctor@doctor.nl2k.ab.ca (The Doctor): Sep 28 09:45PM

In article <jpji7nFi2gbU1@mid.individual.net>,

>So the evaluation is ordered so that the format string is pushed last
>and easy to find for printf. Kind of important for it to figure out the
>types and numbers of the other parameters.

C/C++ being deprecated? How would computing work?
--
Member - Liberal International This is doctor@nk.ca Ici doctor@nk.ca
Yahweh, King & country!Never Satan President Republic!Beware AntiChrist rising!
Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b
Quebec oubliez les extremes et votez PLQ Beware https://mindspring.com

Never use strncpy!

David Brown <david.brown@hesbynett.no>: Sep 28 09:32AM +0200

On 27/09/2022 21:42, Chris M. Thomasson wrote:

> Yes. Using an address based hashed locking scheme works just in case the
> arch does not support the direct CPU instruction(s) (think CAS vs LL/SC)
> for an atomic RMW operation.

LL/SC /is/ a locking scheme - using a hardware lock. And neither CAS
nor LL/SC work for RMW or even plain write operations that are bigger
than the processor can handle in a single write action.

> However, the locking emulation is most
> definitely, not ideal. Not lock-free, indeed.

Processors can generally handle lock-free atomic access of a single
object of limited size - usually the natural width for the processor.
Some processors have instructions for double-width atomic accesses (such
as a double compare-and-swap). And sometimes instruction sequences,
such as LL/SC with loops, are needed - especially for RMW.
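
As a sketch of such a loop in portable C++: std::atomic exposes compare_exchange_weak, which is allowed to fail spuriously (as an LL/SC pair can), so an RMW operation without a dedicated instruction is usually written as a retry loop. The function name fetch_max_cas is made up for this example.

```cpp
#include <atomic>

// Atomically replaces the stored value with max(stored, candidate) and
// returns the value that was stored before the call.
int fetch_max_cas(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak writes the current value back into
    // 'observed', so the loop re-checks the condition with fresh data.
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        // retry
    }
    return observed;
}
```

If the stored value is already at least as large as the candidate, the loop exits without writing anything.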

Lock-free algorithms beyond that are for specific data structures. You
can't make lock-free atomic access to a 32 byte object. You either have
to use locks (as will be done with a std::atomic<> for the type, or
using the C11 _Atomic qualifier), or, if you want lock-free access, you
have to wrap it all up in a more advanced structure, using something
like a lock-free atomic pointer to the "current" version of the data
allocated on a heap.
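
A rough sketch of that last idea, assuming that an atomic pointer is lock-free on the target (typical on mainstream platforms, but not guaranteed by the standard). The types Snapshot and VersionedSnapshot are hypothetical. Safe reclamation of old versions (hazard pointers, RCU, reference counting and so on) is deliberately left out; the caller is simply handed the old pointer.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

// A 32-byte payload, wider than a single atomic access on typical CPUs.
struct Snapshot {
    std::array<std::uint64_t, 4> words;
};

class VersionedSnapshot {
public:
    explicit VersionedSnapshot(const Snapshot& initial)
        : current_(new Snapshot(initial)) {}
    ~VersionedSnapshot() { delete current_.load(std::memory_order_relaxed); }
    VersionedSnapshot(const VersionedSnapshot&) = delete;
    VersionedSnapshot& operator=(const VersionedSnapshot&) = delete;

    // Readers load the pointer once and then see one immutable version.
    const Snapshot* read() const {
        return current_.load(std::memory_order_acquire);
    }

    // The writer builds the new version off to the side and publishes it
    // with a single pointer-sized exchange. The previous version is
    // returned; it must not be freed while any reader may still use it.
    const Snapshot* publish(const Snapshot& next) {
        return current_.exchange(new Snapshot(next),
                                 std::memory_order_acq_rel);
    }

private:
    std::atomic<const Snapshot*> current_;
};
```

The hard part in real code is deciding when the pointer returned by publish can be deleted, which is exactly what this sketch does not solve.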

> Sorry about that non-sense David: Wrt the dangling comma. Forgot to
> introduce the addend for the fetch-add RMW operation.

> Shit happens. :^)

That's just minor detail, so not a problem at all.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 28 01:34PM -0700

On 9/28/2022 12:32 AM, David Brown wrote:

> LL/SC /is/ a locking scheme - using a hardware lock. And neither CAS
> nor LL/SC work for RMW or even plain write operations that are bigger
> than the processor can handle in a single write action.

Correct. Imvho, the hardware itself is a _lot_ more efficient at these
types of things... Agreed in a sense? I actually prefer pessimistic CAS
over optimistic primitives like LL/SC. Iirc, a LL/SC can fail just by
reading from the reservation granule. Let alone writing to it... PPC had
a special section in its docs that explains the possible issue of a
livelock. Iirc, even CAS has some special logic in the processor that can
actually assert a bus lock.

> Some processors have instructions for double-width atomic accesses (such
> as a double compare-and-swap). And sometimes instruction sequences,
> such as LL/SC with loops, are needed - especially for RMW.

Afaict, DWCAS is there to help get around the ABA problem ala IBM sysv
appendix, oh shit, I forgot the appendix number. I used to know it,
decades ago. I will try to find it.

> have to wrap it all up in a more advanced structure, using something
> like a lock-free atomic pointer to the "current" version of the data
> allocated on a heap.

Agreed. Although, I have created lock-free allocators that never used
dynamic memory, believe it or not. Everything exists on threads' stacks.
And memory from thread A could be "freed" by another thread. I remember
a project I had to do for a Quadros based system. Completely based on
stacks. Wow, what a time.

>> introduce the addend for the fetch-add RMW operation.

>> Shit happens. :^)

> That's just minor detail, so not a problem at all.

Thanks. :^)

scott@slp53.sl.home (Scott Lurndal): Sep 28 09:11PM

>a special section in its docs that explains the possible issue of a
>livelock. Iirc, even CAS has some special logic in the processor that can
>actually assert a bus lock.

When ARM was designing their 64-bit architecture (ARMv8) circa 2011/2, they only
provided a LL/SC equivalent (load-exclusive/store-exclusive). Their
architecture partners at the time quickly requested support for
real RMW atomics, which were added as part of the LSE (Large System
Extensions): LDADD, LDCLR (and with complement), LDSET (or), LDEOR (xor),
LDSMAX (signed maximum), LDUMAX (unsigned maximum), LDSMIN, LDUMIN.

The processor fabric forwards the operation to the point of coherency
(e.g. the L2/LLC) for cachable memory locations and to the endpoint for
uncachable memory locations (e.g. a PCIexpress or CXL endpoint).
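
From the C++ side these operations are reached through the std::atomic fetch_* members. The sketch below uses only standard calls; the Counters struct and the wrapper function names are hypothetical. Whether a compiler lowers them to single LSE instructions or to a load-exclusive/store-exclusive loop depends on the target options, so treat the instruction names in the comments as a possible mapping, not a guarantee.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared state: an event counter and a bit-flag word.
struct Counters {
    std::atomic<std::uint32_t> hits{0};
    std::atomic<std::uint32_t> flags{0};
};

// Atomic add; may become LDADD when LSE is available.
std::uint32_t record_hit(Counters& c) {
    return c.hits.fetch_add(1, std::memory_order_relaxed);
}

// Atomic OR; may become LDSET.
std::uint32_t set_flags(Counters& c, std::uint32_t mask) {
    return c.flags.fetch_or(mask, std::memory_order_relaxed);
}

// Atomic AND with the complement of mask; a natural fit for LDCLR.
std::uint32_t clear_flags(Counters& c, std::uint32_t mask) {
    return c.flags.fetch_and(~mask, std::memory_order_relaxed);
}

// Atomic XOR; may become LDEOR.
std::uint32_t toggle_flags(Counters& c, std::uint32_t mask) {
    return c.flags.fetch_xor(mask, std::memory_order_relaxed);
}
```

Each wrapper returns the value held before the operation, matching the "load" half of the LD* instruction names.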


Wednesday, September 28, 2022

Digest for comp.lang.c++@googlegroups.com - 7 updates in 2 topics
