Tuesday, September 26, 2023

Digest for comp.lang.c++@googlegroups.com - 25 updates in 1 topic

scott@slp53.sl.home (Scott Lurndal): Sep 26 03:25PM

>> "defective" method can be used.
 
>Pthread mutexes can be initialized statically, but then you have the
>same kind of delayed initialization - which may fail.
 
No, there is no delayed initialization with pthread mutexes.
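
For readers following along, here is a minimal sketch of the static initialization being argued about, assuming a POSIX system. The names `g_lock`, `g_counter`, `increment_counter` and `read_counter` are made up for illustration. The pthread calls return their error number directly, so a caller who worries about implementation-defined failures can still check the result of every lock operation.

```cpp
#include <cassert>
#include <pthread.h>

// Statically initialized: there is no pthread_mutex_init() call, hence no
// initialization step whose return value could be checked.
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static long g_counter = 0;  // hypothetical shared state guarded by g_lock

// Returns 0 on success, otherwise the error number reported by the
// pthread calls (they return the code directly; they do not set errno).
int increment_counter() {
    int rc = pthread_mutex_lock(&g_lock);
    if (rc != 0) return rc;
    ++g_counter;
    return pthread_mutex_unlock(&g_lock);
}

// Reads the counter under the lock; treats a lock failure as a bug.
long read_counter() {
    int rc = pthread_mutex_lock(&g_lock);
    assert(rc == 0);
    (void)rc;
    long value = g_counter;
    pthread_mutex_unlock(&g_lock);
    return value;
}
```

Whether the lock call can ever report a resource failure is exactly the point in dispute; the sketch only shows where such a code would surface if an implementation chose to return one.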
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 05:30PM +0200

Am 26.09.2023 um 17:25 schrieb Scott Lurndal:
 
> No, there is no delayed initialization with pthread mutexes.
 
There must be since you can't create a semaphore by pure assignment.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 05:31PM +0200

Am 26.09.2023 um 17:23 schrieb Scott Lurndal:
 
> No, it is not.
 
What else ?
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 05:33PM +0200

Am 26.09.2023 um 17:24 schrieb Scott Lurndal:
 
> a non-runnable queue in the dispatcher until a call to 'unlock'
> determines that there is a waiter and tells the kernel to
> awaken it.
 
The userspace code must tell the kernel what it is waiting for.
Mutexes are always a combination of an atomic for the fast path
and a binary semaphore for the slow path.
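
The design described in this post can be sketched as follows. This is one possible mutex construction, not the only one, and the class name `SemaphoreBackedMutex` is hypothetical. It assumes POSIX unnamed semaphores (`sem_init`, `sem_wait`, `sem_post`) as a stand-in for a kernel semaphore; on Linux with glibc these are, as far as I know, themselves built on futexes. Creating the semaphore in the constructor means a resource failure shows up at construction time rather than inside `lock()`.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <semaphore.h>
#include <system_error>

// Hypothetical illustration class; not taken from any real library.
class SemaphoreBackedMutex {
public:
    SemaphoreBackedMutex() {
        // The semaphore is created up front, so a failure is reported
        // here and never in the middle of a lock operation.
        if (sem_init(&sem_, 0, 0) != 0)
            throw std::system_error(errno, std::generic_category(), "sem_init");
    }
    ~SemaphoreBackedMutex() { sem_destroy(&sem_); }
    SemaphoreBackedMutex(const SemaphoreBackedMutex&) = delete;
    SemaphoreBackedMutex& operator=(const SemaphoreBackedMutex&) = delete;

    void lock() {
        // Fast path: the first arrival sees 0 and owns the lock.
        if (contenders_.fetch_add(1, std::memory_order_acquire) > 0) {
            // Slow path: sleep until unlock() hands the lock over.
            // Only EINTR is retried; other errors are ignored in this sketch.
            while (sem_wait(&sem_) != 0 && errno == EINTR) {}
        }
    }

    void unlock() {
        int prev = contenders_.fetch_sub(1, std::memory_order_release);
        assert(prev >= 1);  // unlocking an unlocked mutex is a caller bug
        // If anyone besides the owner was counted, wake exactly one waiter.
        if (prev > 1) sem_post(&sem_);
    }

    // Owner plus waiters; exposed only for demonstration.
    int contenders() const { return contenders_.load(std::memory_order_relaxed); }

private:
    std::atomic<int> contenders_{0};
    sem_t sem_;
};
```

A lazily created semaphore would instead move the `sem_init` call into the slow path of `lock()`, which is the "delayed initialization" variant the thread is arguing over.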
Michael S <already5chosen@yahoo.com>: Sep 26 08:52AM -0700

On Tuesday, September 26, 2023 at 6:31:13 PM UTC+3, Bonita Montero wrote:
> Am 26.09.2023 um 17:25 schrieb Scott Lurndal:
 
> > No, there is no delayed initialization with pthread mutexes.
> There must be since you can't create a semaphore by pure assignment.
 
I don't know how it works for pthread mutexes on any possible OS
that supports pthreads.
But I do know that Windows SRW Lock can be initialized statically.
Also I do know that it does not have to be destroyed. As long
as no threads are locked on it, it can be simply abandoned.
 
https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initializesrwlock
"An unlocked SRW lock with no waiting threads is in its initial state and can be copied,
moved, and forgotten without being explicitly destroyed."
 
Which is a very strong hint that there is no hidden kernel object per
lock behind the scene. And very likely no hidden kernel object per any
fixed number of SRW locks.
Michael S <already5chosen@yahoo.com>: Sep 26 09:05AM -0700

On Tuesday, September 26, 2023 at 8:53:13 AM UTC+3, Chris M. Thomasson wrote:
> performance for certain workloads. Actually, there is a way to use it to
> get a RCU in userspace and/or hazard pointers without any explicit
> #StoreLoad membars.
 
I would expect FlushProcessWriteBuffers() to be 2 orders of magnitude
slower than #StoreLoad membar. I could be wrong, of course, the function
is close to undocumented.
 
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 06:08PM +0200

Am 26.09.2023 um 17:52 schrieb Michael S:
 
> I don't know how it works for pthread mutexes ...
 
LOL
 
> But I do know that Windows SRW Lock can be initialized statically.
 
Maybe, but this is also possible with delayed initialization.
Kaz Kylheku <864-117-4973@kylheku.com>: Sep 26 04:46PM

>> as documented.
 
> I meant to say that this is something to be expected on any platform,
> not that it actually happens.
 
If you're betting your business on writing some multithreading
application for a platform, you should know what actually happens on
the platform.
 
You probably don't choose a platform where mutexes require resources
that are only allocated on lock contention, and which can then blow up.
 
If you do choose that platform, you have some plan. You identify under
which conditions such a problem happens. Perhaps the system can be
configured to minimize the risk. Or perhaps there are warning signs,
like low memory, so that evasive action can be taken: clean shutdown and
restart.
 
Memory overcommit is enabled on most GNU/Linux system installations.
This means that memory allocations can fail not when you allocate
the memory (e.g. mmap() returning MAP_FAILED or malloc returning null)
but when you try to access the memory.
 
You don't necessarily combat this in the application code. Ways to deal
with it range from reducing or disabling overcommit, to just having
enough swap space for a soft landing: evasive action taken when the
system shows signs of thrashing.
 
Most programs at most check for up-front allocator failure; most are not
prepared to handle an access violation upon using the result of a
successful-looking allocation.
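
A minimal sketch of the two stages described above, assuming a Linux or other POSIX system with anonymous `mmap`. The helper names `map_anonymous` and `touch_pages` are made up for illustration. The first function is where an up-front failure can be detected; the second is where an overcommitted system would run into trouble, with no return value to check.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: map `bytes` of anonymous, private memory.
// Returns nullptr if the kernel refuses the mapping up front.
void* map_anonymous(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: write one byte per page so that physical pages are
// actually committed. With overcommit enabled, this is where memory
// exhaustion would show up, not in the mmap() return value.
void touch_pages(void* base, std::size_t bytes, std::size_t page_size) {
    assert(page_size > 0);
    volatile unsigned char* p = static_cast<unsigned char*>(base);
    for (std::size_t off = 0; off < bytes; off += page_size)
        p[off] = 1;
}
```

Touching the pages early, at a point where the program can still shut down cleanly, is one way to turn a late surprise into an early one; it does not make the failure itself catchable.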
 
Assuming you have some mutex type which is "overcommitted" similarly,
similar reasoning applies.
 
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 06:54PM +0200

Am 26.09.2023 um 18:46 schrieb Kaz Kylheku:
 
> You probably don't choose a platform where mutexes require resources
> that are only allocated on lock contention, and which can then blow up.
 
Contention is always possible and always expensive.
 
> This means that memory allocations can fail not when you allocate
> the memory (e.g. mmap() returning MAP_FAILED or malloc returning null)
> but when you try to access the memory.
 
Memory overcommit is partially based on swappable memory. Semaphores live
in non-pageable memory inside the kernel. But I accept your point: if
you accept that an application can suddenly terminate because of the OOM
killer, you might also accept that it is terminated because a single
kernel semaphore couldn't be created. That may be acceptable for most
users, but if there's an error code from the kernel, it should still be
forwarded through your runtime for those who can't accept that.
Michael S <already5chosen@yahoo.com>: Sep 26 10:00AM -0700

On Tuesday, September 26, 2023 at 7:08:43 PM UTC+3, Bonita Montero wrote:
 
> LOL
> > But I do know that Windows SRW Lock can be initialized statically.
> Maybe, but this is also possible with delayed initialization.
 
If you had read my post to the end, you would have understood that
such a possibility is extremely low.
Kaz Kylheku <864-117-4973@kylheku.com>: Sep 26 05:01PM

>> "defective" method can be used.
 
> Pthread mutexes can be initialized statically, but then you have the
> same kind of delayed initialization - which may fail.
 
It has already been explained that none of the error codes given in
POSIX for the pthread_mutex_lock function have anything to do with
resource acquisition.
 
Michael S <already5chosen@yahoo.com>: Sep 26 10:04AM -0700

On Tuesday, September 26, 2023 at 8:01:00 PM UTC+3, Michael S wrote:
> > Maybe, but this is also possible with delayed initialization.
> If you had read my post to the end, you would have understood that
> such a possibility is extremely low.
 
I meant to write 'extremely unlikely'.
Kaz Kylheku <864-117-4973@kylheku.com>: Sep 26 05:09PM

> Am 26.09.2023 um 17:25 schrieb Scott Lurndal:
 
>> No, there is no delayed initialization with pthread mutexes.
 
> There must be since you can't create a semaphore by pure assignment.
 
It has already been explained that there are multiple ways to implement
mutexes.
 
The approach which requires a kernel semaphore handle to be allocated
is a possibility. It would be difficult under POSIX because the
static initializer will not do that, and the lock operation is not
described as returning a resource-related error code.
 
In Linux, the original threads implementation in Glibc, based on
Xavier Leroy's LinuxThreads didn't use one semaphore per mutex.
It used POSIX signals to implement internal functions suspend()
and resume(). There was no kernel object to allocate.
 
The NPTL rewrite of the threading library was accompanied by futexes
in the kernel. Futexes also don't require allocation. Any memory
location in your virtual address space can be used as a futex without any
resources being registered. Futexes were designed that way specifically
in order not to have mutex initialization fail on resource acquisition,
or waste time in the kernel, so that mutexes can be created cheaply and safely.
 
Futexes use a fixed pool of wait queues which is allocated upfront. The
memory location is hashed to a wait queue, which it possibly shares with
unrelated futexes due to hash collisions.
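
To make the "any memory location can be a futex" point concrete, here is a minimal Linux-only sketch of a futex-based mutex. It follows the three-state scheme from Ulrich Drepper's paper "Futexes Are Tricky" rather than glibc's actual code, and the class name `FutexMutex` is hypothetical. Note that nothing is registered or allocated: the kernel only sees the address of `state_` when a thread has to sleep or wake another.

```cpp
#include <atomic>
#include <cassert>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical illustration class (Linux only).
// state_: 0 = unlocked, 1 = locked, 2 = locked with possible waiters.
class FutexMutex {
public:
    bool try_lock() {
        int expected = 0;
        return state_.compare_exchange_strong(expected, 1);
    }

    void lock() {
        int c = 0;
        if (state_.compare_exchange_strong(c, 1)) return;  // fast path, no syscall
        if (c != 2) c = state_.exchange(2);                 // announce a waiter
        while (c != 0) {
            futex(FUTEX_WAIT_PRIVATE, 2);  // sleeps only while state_ == 2
            c = state_.exchange(2);
        }
    }

    void unlock() {
        int prev = state_.fetch_sub(1);
        assert(prev != 0);  // unlocking an unlocked mutex is a caller bug
        if (prev != 1) {    // was 2: someone may be asleep
            state_.store(0);
            futex(FUTEX_WAKE_PRIVATE, 1);  // wake one waiter
        }
    }

private:
    long futex(int op, int val) {
        // Assumption: std::atomic<int> has the same layout as int. That
        // holds on mainstream Linux toolchains but is not guaranteed.
        static_assert(sizeof(std::atomic<int>) == sizeof(int), "layout");
        return syscall(SYS_futex, reinterpret_cast<int*>(&state_), op, val,
                       nullptr, nullptr, 0);
    }

    std::atomic<int> state_{0};
};
```

An uncontended lock/unlock pair never enters the kernel, and a default-constructed `FutexMutex` is just an integer holding zero, which is why this kind of mutex has no initialization step that could fail.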
 
There is no reason for any commercial, proprietary platform to do worse
than open source did twenty years ago, or for anyone to use it if it
does.
 
scott@slp53.sl.home (Scott Lurndal): Sep 26 05:13PM


>It has already been explained that none of the error codes given in
>POSIX for the pthread_mutex_lock function have anything to do with
>resource acquisition.
 
Although it is fair to point out that POSIX allows additional
implementation-defined error codes for POSIX interfaces, so long
as the listed conditions return the listed error codes.
 
That's not meant to imply that posix mutexes will ever fail, however.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 07:30PM +0200

Am 26.09.2023 um 19:00 schrieb Michael S:
 
> If you had read my post to the end, you would have understood that
> such a possibility is extremely low.
 
I think that's a common optimization since in many cases mutexes aren't
contended at all, thereby not needing any semaphore, or contended very
late.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 26 07:33PM +0200

Am 26.09.2023 um 19:09 schrieb Kaz Kylheku:
 
> Xavier Leroy's LinuxThreads didn't use one semaphore per mutex.
> It used POSIX signals to implement internal functions suspend()
> and resume(). There was no kernel object to allocate.
 
Maybe, but the C++ standard shouldn't mandate that. If you implement
a mutex yourself under Windows you'd need a kernel event or semaphore.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 12:58PM -0700

On 9/26/2023 9:05 AM, Michael S wrote:
 
> I would expect FlushProcessWriteBuffers() to be 2 orders of magnitude
> slower than #StoreLoad membar. I could be wrong, of course, the function
> is close to undocumented.
 
FlushProcessWriteBuffers() can be used to gain a quiescent state wrt
memory visibility for some exotic async algorithms, it is called in
certain places and is RARE. However, on the other hand, having to use a
#StoreLoad membar on the fast path is not good. Keep in mind that this
is meant to handle a store followed by a load to another location
scenario. We need to honor the #StoreLoad relationship for this type of
setup for these types of algorithms. For instance, hazard pointers
require a #StoreLoad, however there is a way to eliminate them using
some special magic wrt RCU...
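
As a rough illustration of where the #StoreLoad requirement comes from, here is a sketch of the usual hazard-pointer acquisition loop in standard C++. The function name `protect` and its parameters are hypothetical, and this shows only the plain variant with the barrier on the fast path, not the SMR+RCU technique that removes it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: publish a hazard pointer for the object currently
// referenced by `src`, and return that object once the publication is
// ordered before the re-check.
template <class T>
T* protect(const std::atomic<T*>& src, std::atomic<T*>& hazard_slot) {
    T* p = src.load(std::memory_order_relaxed);
    for (;;) {
        hazard_slot.store(p, std::memory_order_relaxed);  // the store
        // Full fence: orders the store above before the load below.
        // This plays the role of the #StoreLoad barrier.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        T* q = src.load(std::memory_order_acquire);       // the load
        if (q == p) return p;  // still current, so the hazard is valid
        p = q;                 // source changed; retry with the new value
    }
}
```

Without the fence, the store to `hazard_slot` could become visible after the re-read of `src`, and a reclaiming thread scanning the hazard slots might free the object while the reader still believes it is protected.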
 
My friend Joe Seigh created SMR+RCU to gain this. SMR is hazard
pointers. Now, keep in mind that these are very "specialized" algorithms
that try to make certain workloads MUCH faster and way more efficient.
They work for that.
 
Also, keep in mind that a normal mutex does not require #StoreLoad. Only
acquire and release:
 
acquire = #LoadStore | #LoadLoad
release = #LoadStore | #StoreStore
 
In SPARC parlance. For instance, an atomic store on x86 automatically
implies release. Btw, have you ever worked with SPARC in RMO mode? It's
fun because it's so weak.
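
In C++ terms, the acquire/release pairing above maps onto `std::memory_order_acquire` and `std::memory_order_release`. A minimal sketch, using a hypothetical `SpinLock` class rather than a real blocking mutex, showing that no sequentially consistent (#StoreLoad) ordering is requested anywhere:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration class: a spinlock that uses only acquire and
// release ordering.
class SpinLock {
public:
    bool try_lock() {
        // acquire: later loads and stores cannot move above this point
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        while (!try_lock()) { /* spin */ }
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed));  // must be held
        // release: earlier loads and stores cannot move below this point
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

On x86 the release store compiles to a plain store, which matches the remark that an atomic store there already implies release.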
 
I still need to find that old paper called QPI, if I remember correctly.
It should still be out there. I will try to find it sometime today.
 
I think it was from Dice wrt a Java VM. David Dice? Damn I cannot
remember it right now. Sorry. ;^o
 
 
[...]
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 01:02PM -0700

On 9/26/2023 8:31 AM, Bonita Montero wrote:
> Am 26.09.2023 um 17:23 schrieb Scott Lurndal:
 
>> No, it is not.
 
> What else ?
 
Sigh.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 01:04PM -0700

On 9/26/2023 4:23 AM, Richard Damon wrote:
>> initialization can fail.
 
> Nope, if you do it right, a suitable synchronization object can be
> created without possibility of error.
 
Bonita needs to learn that there are ways to do it right. For some damn
reason, it reminds me of the following crazy song from Daft Punk.
Actually, Daft Punk reminds me of Bonita for some strange reason... Robots?
 
https://youtu.be/LL-gyhZVvx0
 
Everybody will be dancing when we are doing it right...
 
;^)
 
 
 
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 01:06PM -0700

On 9/26/2023 5:37 AM, Bonita Montero wrote:
>> a bad implementation.
 
> The slow path always is backed by a binary semaphore of the kernel.
> You can't create an arbitrary number of that.
 
Barf! God damn it Bonita! I might have to try to refrain from reading
your posts. PUKE!
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 01:07PM -0700

On 9/26/2023 8:33 AM, Bonita Montero wrote:
 
> The userspace code must tell the kernel what it is waiting for.
> Mutexes are always a combination of an atomic for the fast path
> and a binary semaphore for the slow path.
 
Word salad that you do not even understand. Sorry buddy.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 01:21PM -0700

On 9/26/2023 10:13 AM, Scott Lurndal wrote:
> implementation-defined error codes for POSIX interfaces, so long
> as the listed conditions return the listed error codes.
 
> That's not meant to imply that posix mutexes will ever fail, however.
 
Good point. :^)
Michael S <already5chosen@yahoo.com>: Sep 26 01:58PM -0700

On Tuesday, September 26, 2023 at 10:59:06 PM UTC+3, Chris M. Thomasson wrote:
 
> In SPARC parlance. For instance, an atomic store on x86 automatically
> implies release. Btw, have you ever worked with SPARC in RMO mode? It's
> fun because it's so weak.
 
We discussed it already. I don't think that SPARCv9 RMO hardware
ever existed and you failed to convince me otherwise.
 
As to weakness, I don't think that by specification it is weaker than IPF,
maybe not even weaker than POWER. Certainly stronger than Alpha.
But it does not matter since it does not exist.
 
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 02:06PM -0700

On 9/26/2023 1:58 PM, Michael S wrote:
>> fun because it's so weak.
 
> We discussed it already. I don't think that SPARCv9 RMO hardware
> ever existed and you failed to convince me otherwise.
 
Iirc, SPARC64 V9 can be put into RMO mode.
 
 
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 26 02:07PM -0700

On 9/26/2023 1:58 PM, Michael S wrote:
>> fun because it's so weak.
 
> We discussed it already. I don't think that SPARCv9 RMO hardware
> ever existed and you failed to convince me otherwise.
 
Have you ever used SPARC before? Solaris?
 
 