Saturday, January 16, 2021

Digest for comp.programming.threads@googlegroups.com - 5 updates in 5 topics

Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 04:17PM -0800

Hello,
 
 
About smartness and about MCS Lock and more..
 
I am a white Arab, and I think I am smart, since I have invented many scalable algorithms and other algorithms..
 
I have just read the following article from ACM:
 
Scalability Techniques for Practical Synchronization Primitives
 
https://queue.acm.org/detail.cfm?id=2698990
 
Notice how it discusses one of the best scalable locks, the one we call the MCS lock. But I think the CLH and MCS locks are not smart, because they are intrusive: each caller has to pass a queue node as a parameter, or the implementation has to hide that required parameter somehow. This is why I think I am smart: I have invented a scalable lock that is better than the MCS lock, since my scalable lock doesn't require any parameter to be passed; you just call the Enter() and Leave() methods and that's all. Read carefully about it on my website here:
 
https://sites.google.com/site/scalable68/scalable-mlock
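To make the point about the required parameter concrete, here is a minimal sketch of a textbook MCS lock in C++. It is not the scalable lock from the website above and not the code from the ACM article; the names McsNode, McsLock and run_mcs_demo are hypothetical, chosen only for illustration. Notice that both lock() and unlock() need the caller's queue node:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Per-thread queue node: the extra parameter every MCS caller must supply.
struct McsNode {
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> locked{false};
};

class McsLock {
    std::atomic<McsNode*> tail_{nullptr};  // last waiter in the queue, or nullptr if free

public:
    void lock(McsNode& me) {
        me.next.store(nullptr, std::memory_order_relaxed);
        me.locked.store(true, std::memory_order_relaxed);
        // Append our node to the queue; the previous tail is our predecessor.
        McsNode* prev = tail_.exchange(&me, std::memory_order_acq_rel);
        if (prev != nullptr) {
            prev->next.store(&me, std::memory_order_release);
            // Spin only on our own node, which is what makes the lock scalable.
            while (me.locked.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
        }
    }

    void unlock(McsNode& me) {
        McsNode* succ = me.next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            // No visible successor: try to mark the lock as free.
            McsNode* expected = &me;
            if (tail_.compare_exchange_strong(expected, nullptr,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
                return;
            }
            // A successor is in the middle of enqueueing; wait for its link.
            while ((succ = me.next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        succ->locked.store(false, std::memory_order_release);  // hand the lock over
    }
};

// Hypothetical demo helper: several threads increment a shared counter.
long run_mcs_demo(int threads, int iters) {
    McsLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            McsNode node;  // each thread owns the node it passes in
            for (int i = 0; i < iters; ++i) {
                lock.lock(node);
                ++counter;
                lock.unlock(node);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling run_mcs_demo(4, 10000) should return 40000 if mutual exclusion holds. A lock with a parameterless Enter()/Leave() interface has to keep such per-thread state somewhere internally instead of asking the caller for it.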
 
 
I have also just enhanced it further, and I will post it soon.

I have also invented many other scalable algorithms and other algorithms..
 
 
 
Thank you,
Amine Moulay Ramdane.
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 12:59PM -0800

Hello,
 
 
More precision about the Microsoft Windows futex..
 
 
Bonita Montero has just posted a simple benchmark that shows the difference in speed between a Windows futex and a spinlock under contention between two threads, and here is my conclusion:

I think I can logically infer the following from the benchmark:

Since the benchmark shows the spinlock running 6x faster than the Windows futex, I think the Windows futex still makes a system call directly, without first spinning in user space to reduce the number of expensive system calls and to avoid convoying. This is why I think the Windows futex is still slow and not good.

And since the Windows futex does not reduce the number of system calls, here is another problem with its system calls:
 
System calls have become more expensive with Meltdown
 
Read more here:
 
https://hackernoon.com/system-calls-have-been-more-expensive-with-meltdown-how-to-avoid-them-af4b0026d35a
 
Note: Futexes have been implemented in Microsoft Windows since Windows 8 and Windows Server 2012, under the name WaitOnAddress.
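The idea of spinning first and only then blocking can be sketched as follows in portable C++. This is only an illustration under stated assumptions: std::condition_variable stands in for a futex-style wait such as WaitOnAddress, and SpinThenBlockLock, kSpinLimit and run_spin_block_demo are hypothetical names, not part of any Windows API or of the benchmark.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class SpinThenBlockLock {
    static constexpr int kSpinLimit = 1000;  // hypothetical tuning value
    std::atomic<bool> locked_{false};
    std::atomic<int> waiters_{0};            // number of threads in the slow path
    std::mutex m_;
    std::condition_variable cv_;             // stand-in for a futex-style wait

    // Returns true if this call took the lock (sequentially consistent by default).
    bool try_acquire() { return !locked_.exchange(true); }

public:
    void lock() {
        // Fast path: spin in user space, no blocking call involved.
        for (int i = 0; i < kSpinLimit; ++i) {
            if (try_acquire()) return;
        }
        // Slow path: register as a waiter, then block until the lock can be taken.
        std::unique_lock<std::mutex> guard(m_);
        waiters_.fetch_add(1);
        cv_.wait(guard, [this] { return try_acquire(); });
        waiters_.fetch_sub(1);
    }

    void unlock() {
        locked_.store(false);
        // Only pay for a wake-up when somebody is actually blocked.
        if (waiters_.load() > 0) {
            std::lock_guard<std::mutex> guard(m_);
            cv_.notify_one();
        }
    }
};

// Hypothetical demo helper: several threads increment a shared counter.
long run_spin_block_demo(int threads, int iters) {
    SpinThenBlockLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

With short critical sections most acquisitions should finish in the spinning loop, so the blocking path (and whatever system call it implies) is only taken under longer contention. The atomics use the default sequentially consistent ordering so that an unlocking thread cannot miss a waiter that has just registered.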
 
 
Thank you,
Amine Moulay Ramdane.
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 11:54AM -0800

Hello,
 
 
About x86 and ARM and M1 processors..
 
I have just noticed that Apple has released the M1 processor, and it is
ARM-based. Here it is:
 
https://www.theverge.com/2020/11/19/21574057/apple-m1-chips-laptop-performance-intel-qualcomm-competition

 
But I think this ARM-based processor uses a weak memory ordering (WMO)
hardware memory model, and I think the WMO hardware memory model is dangerous to
program with (for example, when writing synchronization algorithms). Also look at the RISC-V architecture, which supports both the TSO and WMO hardware memory models; the following document says that the RISC-V TSO hardware memory model is as fast as the RISC-V WMO memory model. Read here:
 
https://riscv.org/wp-content/uploads/2019/06/16.15-Stefanos-Kaxiras.pdf
 
 
So I think that ARM processors are not smart, since they should support both the TSO and WMO memory models, like RISC-V does.
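Here is a small C++ sketch of why weak ordering needs care in synchronization code: the classic message-passing pattern. The names Mailbox and run_message_passing are hypothetical. The release store and the acquire load are what guarantee that the consumer sees the payload; with relaxed ordering on the flag, the program would have a data race and a weakly ordered CPU could let the consumer read a stale payload.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Mailbox {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the data
};

// Hypothetical helper: returns the payload value observed by the consumer.
int run_message_passing() {
    Mailbox box;
    int seen = -1;

    std::thread consumer([&] {
        // Acquire: everything written before the matching release is visible.
        while (!box.ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = box.payload;
    });

    std::thread producer([&] {
        box.payload = 42;
        // Release: the payload write cannot be reordered after this store.
        box.ready.store(true, std::memory_order_release);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

On TSO hardware such as x86, the processor itself keeps the two stores and the two loads in order, so only the compiler has to be restrained. On a WMO machine the release/acquire pair also has to produce ordering at the hardware level.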
 
 
About SC and TSO and RMO hardware memory models..
 
I have just read the following slides about the performance difference
between the SC, TSO, and RMO hardware memory models.

I think TSO is better: it gives only around 3% to 6% less performance
than RMO, and it is a simpler programming model than RMO. So I think ARM
must support TSO to be compatible with x86, which is TSO.
 
Read more here to notice it:
 
https://infoscience.epfl.ch/record/201695/files/CS471_proj_slides_Tao_Marc_2011_1222_1.pdf
 
About memory models and sequential consistency:
 
As you have noticed, I am working with the x86 architecture..
 
Even though x86 gives up on sequential consistency, it's among the most
well-behaved architectures in terms of the crazy behaviors it allows.
Most other architectures implement even weaker memory models.
 
The ARM memory model is notoriously underspecified, but is essentially a
form of weak ordering, which provides very few guarantees. Weak ordering
allows almost any operation to be reordered, which enables a variety of
hardware optimizations but is also a nightmare to program at the lowest
levels.
 
Read more here:
 
https://homes.cs.washington.edu/~bornholt/post/memory-models.html
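The remark above that x86 gives up on sequential consistency can be illustrated with the classic store-buffering litmus test. The following C++ sketch uses hypothetical names (SbResult, run_store_buffering_once, count_both_zero). Under TSO a store can wait in the store buffer while a later load from a different address completes, so with weaker orderings both threads may read 0. The sequentially consistent operations used here forbid that outcome.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct SbResult {
    int r1;
    int r2;
};

// One run of the store-buffering test with sequentially consistent atomics.
SbResult run_store_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });

    a.join();
    b.join();
    return {r1, r2};
}

// Counts how many runs ended with both threads reading 0.
int count_both_zero(int runs) {
    int both_zero = 0;
    for (int i = 0; i < runs; ++i) {
        SbResult r = run_store_buffering_once();
        if (r.r1 == 0 && r.r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

With memory_order_seq_cst the count has to stay at 0, because at least one thread must see the other's store. Replacing the orderings with memory_order_relaxed makes the both-zero outcome legal, and it is the one reordering that TSO hardware allows.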
 
 
Memory Models: x86 is TSO, TSO is Good
 
Essentially, the conclusion is that x86 in practice implements the old
SPARC TSO memory model.
 
The big take-away from the talk for me is that it confirms the
observation made many times before that SPARC TSO seems to be the optimal
memory model. It is sufficiently understandable that programmers can
write correct code without having barriers everywhere. It is
sufficiently weak that you can build fast hardware implementations that
can scale to big machines.
 
Read more here:
 
https://jakob.engbloms.se/archives/1435
 
 
 
Thank you,
Amine Moulay Ramdane.