- About smartness and about MCS Lock and more.. - 1 Update
- More precision about the Microsoft Windows futex.. - 1 Update
- Read again: About the Microsoft Windows futex.. - 1 Update
- About the Microsoft Windows futex.. - 1 Update
- About x86 and ARM and M1 processors.. - 1 Update
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 04:17PM -0800

Hello,

About smartness and about MCS Lock and more..

I am a white Arab, and I think I am smart, since I have invented many scalable algorithms and other algorithms.

I have just read the following article from ACM:

Scalability Techniques for Practical Synchronization Primitives
https://queue.acm.org/detail.cfm?id=2698990

Notice how they speak about one of the best scalable locks, the MCS lock. But I think that the CLH and MCS locks are not smart, since they are intrusive: each acquisition needs a queue node, so the caller either has to pass it as a parameter or the implementation has to hide it. This is why I think I am smart: I have invented a scalable lock that is better than the MCS lock, since it doesn't require any parameter to be passed. You just call the Enter() and Leave() methods and that's all. Read carefully about it on my website here:

https://sites.google.com/site/scalable68/scalable-mlock

I have also just enhanced it further and I will post it soon.

I have also invented many other scalable algorithms and other algorithms.

Thank you,
Amine Moulay Ramdane.
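To make the "required parameter" point concrete, here is a minimal sketch of a textbook MCS lock in C++. It is not the scalable-mlock from the website above, and the names `McsNode`, `McsLock` and `mcs_counter_demo` are hypothetical, chosen only for illustration. Note how both `lock()` and `unlock()` take the caller's queue node:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Queue node that the CALLER must supply for each acquisition (hypothetical name).
struct McsNode {
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> locked{false};
};

// Textbook MCS queue lock: each waiter spins on its own node.
class McsLock {
public:
    void lock(McsNode& me) {
        me.next.store(nullptr, std::memory_order_relaxed);
        me.locked.store(true, std::memory_order_relaxed);
        // Append our node to the tail of the queue.
        McsNode* prev = tail_.exchange(&me, std::memory_order_acq_rel);
        if (prev != nullptr) {
            // Link behind the predecessor, then spin on our own flag.
            prev->next.store(&me, std::memory_order_release);
            while (me.locked.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
        }
    }

    void unlock(McsNode& me) {
        McsNode* succ = me.next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            // No visible successor: try to reset the queue to empty.
            McsNode* expected = &me;
            if (tail_.compare_exchange_strong(expected, nullptr,
                                              std::memory_order_acq_rel)) {
                return;
            }
            // A successor is enqueuing; wait until it has linked itself.
            while ((succ = me.next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        // Hand the lock to the successor.
        succ->locked.store(false, std::memory_order_release);
    }

private:
    std::atomic<McsNode*> tail_{nullptr};
};

// Demo helper: several threads increment a shared counter under the lock.
long mcs_counter_demo(int num_threads, int iterations) {
    assert(num_threads > 0 && iterations >= 0);
    McsLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lock, &counter, iterations] {
            McsNode node;  // the extra parameter every caller has to carry
            for (int i = 0; i < iterations; ++i) {
                lock.lock(node);
                ++counter;
                lock.unlock(node);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling `mcs_counter_demo(4, 2000)` runs four threads over one lock. The node could be hidden in thread-local storage instead, but that is a separate design choice with its own limits (for example when one thread holds several locks), and it is an assumption here, not a description of the scalable-mlock.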
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 12:59PM -0800

Hello,

More precision about the Microsoft Windows futex..

Bonita Montero has just posted a simple benchmark that shows the difference in speed between a Windows futex and a spinlock under contention with two threads, and here is my conclusion:

I think I can logically infer the following from the benchmark. Since the benchmark shows the spinlock to be about 6x faster than the Windows futex, I think that the Windows futex still makes a system call directly, without first spinning in user space. Spinning first would reduce the number of expensive system calls and would avoid lock convoying. This is why I think the Windows futex is still slow and not good.

And since the Windows futex does not reduce the system calls, here is another problem with its system calls:

System calls have become more expensive with Meltdown

Read more here:

https://hackernoon.com/system-calls-have-been-more-expensive-with-meltdown-how-to-avoid-them-af4b0026d35a

Note: Futexes have been implemented in Microsoft Windows since Windows 8 and Windows Server 2012 under the name WaitOnAddress.

Thank you,
Amine Moulay Ramdane.
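The idea of spinning before blocking can be sketched as follows. This is a portable illustration only: it does not call WaitOnAddress and it is not the benchmark mentioned above. The blocking phase uses a standard `std::condition_variable` as a stand-in for a futex-style wait, and the names `SpinThenBlockLock`, `spin_limit` and `spin_block_counter_demo` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Lock that spins a bounded number of times before it blocks.
class SpinThenBlockLock {
public:
    explicit SpinThenBlockLock(int spin_limit = 1000) : spin_limit_(spin_limit) {}

    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire);
    }

    void lock() {
        // Phase 1: bounded spinning in user space, no kernel involvement.
        for (int i = 0; i < spin_limit_; ++i) {
            if (try_lock()) return;
        }
        // Phase 2: give up spinning and block until an unlock wakes us.
        std::unique_lock<std::mutex> guard(wait_mutex_);
        ++waiters_;
        wait_cv_.wait(guard, [this] { return try_lock(); });
        --waiters_;
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed));
        locked_.store(false, std::memory_order_release);
        // Wake one blocked thread, if any. Taking wait_mutex_ here ensures a
        // waiter that has just failed try_lock() is already waiting.
        std::lock_guard<std::mutex> guard(wait_mutex_);
        if (waiters_ > 0) wait_cv_.notify_one();
    }

private:
    std::atomic<bool> locked_{false};
    std::mutex wait_mutex_;
    std::condition_variable wait_cv_;
    int waiters_ = 0;  // protected by wait_mutex_
    int spin_limit_;
};

// Demo helper: several threads increment a shared counter under the lock.
long spin_block_counter_demo(int num_threads, int iterations) {
    SpinThenBlockLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

With short critical sections most acquisitions should succeed in phase 1, so the blocking path is taken rarely. The right value for `spin_limit` depends on the workload and the machine; 1000 is an arbitrary placeholder, not a measured number.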
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 12:51PM -0800

Hello,

Read again: About the Microsoft Windows futex..

Bonita Montero has just posted a simple benchmark that shows the difference in speed between a Windows futex and a spinlock, and here is my conclusion:

I think I can logically infer the following from the benchmark. Since the benchmark shows the spinlock to be about 6x faster than the Windows futex, I think that the Windows futex still makes a system call directly, without first spinning in user space. Spinning first would reduce the number of expensive system calls and would avoid lock convoying. This is why I think the Windows futex is still slow and not good.

And since the Windows futex does not reduce the system calls, here is another problem with its system calls:

System calls have become more expensive with Meltdown

Read more here:

https://hackernoon.com/system-calls-have-been-more-expensive-with-meltdown-how-to-avoid-them-af4b0026d35a

Note: Futexes have been implemented in Microsoft Windows since Windows 8 and Windows Server 2012 under the name WaitOnAddress.

Thank you,
Amine Moulay Ramdane.
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 12:50PM -0800

Hello,

About the Microsoft Windows futex..

Bonita Montero has just posted a simple benchmark that shows the difference in speed between a Windows futex and a spinlock, and here is my conclusion:

I think I can logically infer the following from the benchmark. Since the benchmark shows the spinlock to be about 6x faster than the Windows futex, I think that the Windows futex still makes a system call directly, without first spinning in user space. Spinning first would reduce the number of expensive system calls and would avoid lock convoying. This is why I think the Windows futex is still slow and not good.

And since the Windows futex does not reduce the system calls, here is another problem with its system calls:

System calls have become more expensive with Meltdown

Read more here:

https://hackernoon.com/system-calls-have-been-more-expensive-with-meltdown-how-to-avoid-them-af4b0026d35a

Note: Futexes have been implemented in Microsoft Windows since Windows 8 and Windows Server 2012 under the name WaitOnAddress.

Thank you,
Amine Moulay Ramdane.
Amine Moulay Ramdane <aminer68@gmail.com>: Jan 15 11:54AM -0800

Hello,

About x86 and ARM and M1 processors..

I have just noticed that Apple has released the M1 processor, and it is ARM-based. Here it is:

https://www.theverge.com/2020/11/19/21574057/apple-m1-chips-laptop-performance-intel-qualcomm-competition

But I think this ARM-based processor uses the WMO hardware memory model, and I think the WMO hardware memory model is dangerous to program with (for example when writing synchronization algorithms). Also look at the RISC-V processor, which supports both the TSO and WMO hardware memory models; the following says that the RISC-V TSO hardware memory model is as fast as the RISC-V WMO memory model. Read here:

https://riscv.org/wp-content/uploads/2019/06/16.15-Stefanos-Kaxiras.pdf

So I think that ARM processors are not smart, since they should support both the TSO and WMO memory models, as RISC-V does.

About SC and TSO and RMO hardware memory models..

I have just read the following webpage about the performance difference between the SC, TSO and RMO hardware memory models.

I think TSO is better: it gives only around 3% to 6% less performance than RMO, and it is a simpler programming model than RMO. So I think ARM must support TSO to be compatible with x86, which is TSO.

Read more here to notice it:

https://infoscience.epfl.ch/record/201695/files/CS471_proj_slides_Tao_Marc_2011_1222_1.pdf

About memory models and sequential consistency:

As you have noticed, I am working with the x86 architecture.

Even though x86 gives up on sequential consistency, it's among the most well-behaved architectures in terms of the crazy behaviors it allows. Most other architectures implement even weaker memory models.

ARM memory model is notoriously underspecified, but is essentially a form of weak ordering, which provides very few guarantees. Weak ordering allows almost any operation to be reordered, which enables a variety of hardware optimizations but is also a nightmare to program at the lowest levels.

Read more here:

https://homes.cs.washington.edu/~bornholt/post/memory-models.html

Memory Models: x86 is TSO, TSO is Good

Essentially, the conclusion is that x86 in practice implements the old SPARC TSO memory model.

The big take-away from the talk for me is that it confirms the observation made many times before that SPARC TSO seems to be the optimal memory model. It is sufficiently understandable that programmers can write correct code without having barriers everywhere. It is sufficiently weak that you can build fast hardware implementation that can scale to big machines.

Read more here:

https://jakob.engbloms.se/archives/1435

Thank you,
Amine Moulay Ramdane.
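As a small illustration of why weak ordering is harder to program against, here is a sketch of the classic message-passing pattern in C++. The function name `message_passing_demo` is hypothetical. On x86-TSO the hardware does not reorder two stores with each other, but on a weakly ordered machine such as ARM the write to `ready` could become visible before the write to `payload` unless the ordering is requested explicitly. In portable C++ the release/acquire pair is needed on every architecture, since the compiler may also reorder plain accesses.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Message passing: publish a payload, then raise a flag.
int message_passing_demo() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that publishes the data
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release: the payload write cannot be reordered after this store.
        ready.store(true, std::memory_order_release);
    });

    std::thread consumer([&] {
        // Acquire: once the flag is seen, the payload write is visible too.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(ready.load(std::memory_order_relaxed));
    return seen;
}
```

If both orderings were replaced with `std::memory_order_relaxed`, the program would have a data race on `payload`, and on weakly ordered hardware the consumer could read a stale value.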