- You have to be more convinced by my writing about weak memory models - 1 Update
- [JNTP Moderation] Cancellation of <q31rgr$oqm$12@dont-email.me> - 8 Updates
- I have just taken a look at the Memory ordering of ARM CPU architecture - 1 Update
- About my scalable algorithms.. - 1 Update
- About memory models... - 1 Update
- I have just taken a look at the following algorithm invented by Dmitry Vyukov - 1 Update
- My Scalable RWLocks for Delphi and Freepascal were updated to version 4.18 - 1 Update
- My C++ synchronization objects library for Windows and Linux was again updated .. - 1 Update
- My Scalable RWLocks for Delphi and Freepascal were updated to version 4.17 - 1 Update
- My C++ synchronization objects library for Windows and Linux was updated.. - 1 Update
Horizon68 <horizon@horizon.com>: Feb 01 03:15PM -0800
Hello,
You should be more convinced by what I wrote about weak memory models such as that of the ARM CPU architecture. Read what Herb Sutter says about "strong" and "weak" hardware memory models here: https://herbsutter.com/2012/08/02/strong-and-weak-hardware-memory-models/
As you have noticed, I have to wait for ARM to provide a TSO memory model. Note that ARMv8 is not TSO: it adds load-acquire and store-release instructions but remains weakly ordered. A TSO ARM would let us very easily port algorithms from x86 TSO to ARM TSO.
My scalable algorithms now work on x86, so I think they will be very easily portable to the ARM CPU architecture as soon as ARM switches to a TSO memory model.
Thank you, Amine Moulay Ramdane. |
Elephant Man <conanospamic@gmail.com>: Feb 01 05:42PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 05:42PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 09:13PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 09:13PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 09:13PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 09:41PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 10:22PM Cancellation article issued by a JNTP moderator via Nemo. |
Elephant Man <conanospamic@gmail.com>: Feb 01 10:42PM Cancellation article issued by a JNTP moderator via Nemo. |
Horizon68 <horizon@horizon.com>: Feb 01 02:35PM -0800
Hello,
I have just taken a look at the memory ordering of the ARM CPU architecture here: https://en.wikipedia.org/wiki/Memory_ordering
I think that weak memory models like ARM's cause a real problem. Take for example my scalable MLock and my scalable AMLock that I have invented, here: https://sites.google.com/site/scalable68/scalable-mlock and here: https://sites.google.com/site/scalable68/scalable-amlock
They now work with the x86 memory model, which is a TSO memory model like SPARC TSO. To port them to ARM I would have to use more memory barriers, which costs more than on x86, and it is also easier to make memory-ordering errors on ARM's weaker memory model.
So I think I will stay with the x86 CPU architecture and not port my scalable algorithms to ARM, because ARM would have to provide a TSO memory model for them to be efficient.
About memory models and sequential consistency:
As you have noticed, I am working with the x86 architecture.
Even though x86 gives up on sequential consistency, it's among the most well-behaved architectures in terms of the crazy behaviors it allows. Most other architectures implement even weaker memory models. The ARM memory model is notoriously underspecified, but is essentially a form of weak ordering, which provides very few guarantees. Weak ordering allows almost any operation to be reordered, which enables a variety of hardware optimizations but is also a nightmare to program at the lowest levels. Read more here: https://homes.cs.washington.edu/~bornholt/post/memory-models.html
Memory Models: x86 is TSO, TSO is Good
Essentially, the conclusion is that x86 in practice implements the old SPARC TSO memory model. The big take-away from the talk for me is that it confirms the observation made many times before that SPARC TSO seems to be the optimal memory model. It is sufficiently understandable that programmers can write correct code without having barriers everywhere. It is sufficiently weak that you can build fast hardware implementations that can scale to big machines. Read more here: https://jakob.engbloms.se/archives/1435
Thank you, Amine Moulay Ramdane. |
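To make the barrier cost mentioned in the message above concrete, here is a minimal C++ sketch of publishing data from one thread to another with release/acquire ordering. It is not taken from MLock or AMLock; the names `Mailbox`, `publish`, `consume` and `run_message_passing` are hypothetical. On x86 (TSO) a release store and an acquire load typically compile to ordinary moves, while on ARM the compiler typically has to emit barrier or load-acquire/store-release instructions to get the same guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example type: a plain payload guarded by an atomic flag.
struct Mailbox {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // publication flag
};

// Producer: write the payload, then publish it with a release store.
// The release ordering keeps the payload write from moving after the flag write.
void publish(Mailbox& box, int value) {
    box.payload = value;
    box.ready.store(true, std::memory_order_release);
}

// Consumer: spin until the flag is set. The acquire load keeps the
// payload read from moving before the flag read.
int consume(Mailbox& box) {
    while (!box.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return box.payload;
}

// Runs one producer and one consumer and returns the value the consumer saw.
int run_message_passing(int value) {
    Mailbox box;
    int seen = -1;
    std::thread consumer([&] { seen = consume(box); });
    std::thread producer([&] { publish(box, value); });
    producer.join();
    consumer.join();
    return seen;
}
```

Calling `run_message_passing(42)` returns the published value on any conforming implementation; what differs between x86 and ARM is only the machine code needed to guarantee it.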
Horizon68 <horizon@horizon.com>: Feb 01 02:02PM -0800
Hello,
About my scalable algorithms:
The scalable algorithms that I have invented now work on x86, but I will soon port them carefully to other CPU architectures such as ARM.
I think I will port just my scalable MLock and my scalable AMLock for Delphi and FreePascal, so that my scalable algorithms work on the ARM CPU architecture, which has a weak memory model. Here is the memory ordering of ARM: https://en.wikipedia.org/wiki/Memory_ordering
Here are my scalable MLock and my scalable AMLock that I have invented: https://sites.google.com/site/scalable68/scalable-mlock and here: https://sites.google.com/site/scalable68/scalable-amlock
I think x86 is TSO and is the same as SPARC TSO.
About memory models and sequential consistency:
As you have noticed, I am working with the x86 architecture.
Even though x86 gives up on sequential consistency, it's among the most well-behaved architectures in terms of the crazy behaviors it allows. Most other architectures implement even weaker memory models. The ARM memory model is notoriously underspecified, but is essentially a form of weak ordering, which provides very few guarantees. Weak ordering allows almost any operation to be reordered, which enables a variety of hardware optimizations but is also a nightmare to program at the lowest levels. Read more here: https://homes.cs.washington.edu/~bornholt/post/memory-models.html
Memory Models: x86 is TSO, TSO is Good
Essentially, the conclusion is that x86 in practice implements the old SPARC TSO memory model. The big take-away from the talk for me is that it confirms the observation made many times before that SPARC TSO seems to be the optimal memory model. It is sufficiently understandable that programmers can write correct code without having barriers everywhere. It is sufficiently weak that you can build fast hardware implementations that can scale to big machines. Read more here: https://jakob.engbloms.se/archives/1435
Thank you, Amine Moulay Ramdane. |
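The one reordering that TSO does allow is a store followed by a load from a different location, because the store can wait in the store buffer while the load completes. The C++ sketch below is the classic store-buffering (Dekker-style) litmus test with a full fence in each thread. The function name `count_both_zero` is hypothetical, and the code only illustrates the point about TSO; it is not part of any algorithm mentioned in the message above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering litmus test `iterations` times and returns how
// often both threads read 0. With the seq_cst fences in place, the C++
// memory model guarantees that at least one thread sees the other's store,
// so the returned count is 0.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);
            // Full fence: orders the store above before the load below.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_relaxed);
        });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

Without the two fences, the outcome where both threads read 0 is permitted even on x86, which is why code ported between x86 and ARM still needs an explicit full barrier at this kind of point on both architectures.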
Horizon68 <horizon@horizon.com>: Feb 01 01:36PM -0800
Hello,
About memory models:
I wrote about memory models, and I think that Delphi and FreePascal have what is needed to make this easier even though they have no memory model. Here are the functions that you need:
For Delphi there is a function called System.MemoryBarrier, documented here: http://docwiki.embarcadero.com/Libraries/Tokyo/en/System.MemoryBarrier
For FreePascal there are three functions:
- ReadWriteBarrier
- WriteBarrier
- ReadBarrier
They are documented here: https://www.freepascal.org/docs-html/rtl/system/readwritebarrier.html
So, as you have noticed, my scalable algorithms work on x86, but with the above functions I will make them more easily portable to ARM and to other CPU architectures.
I have also just taken a look at the following algorithm invented by Dmitry Vyukov: https://groups.google.com/forum/#!topic/lock-free/Hv3GUlccYTc
Notice that it uses FlushProcessWriteBuffers(), and notice what Chris Thomasson said about FlushProcessWriteBuffers():
==
Well, the thing with FlushProcessWriteBuffers() is that it will generate a lot of traffic in the sense of sending the interrupts to all the CPUS in the processes affinity mask. This is an "active" form of quiescent state auto-detection. As of now, vZOOM uses "passive" detection technique on Windows; It does not need to interrupt CPU activity. AFAICT, that is the only advantage I can see to passive epoch detection, rather than active. Also, for PDR, the epochs should be detected on a frequent enough basis to keep the deferred object lists from backing up too much. The frequency of epochs in an active system will be creating a lot of IPI traffic, while the passive system will be creating none.
Read more here: https://groups.google.com/forum/#!topic/comp.programming.threads/E0gGTkg46HE
==
But I have just invented a new scalable RWLock algorithm that is "better" than the above because it doesn't need FlushProcessWriteBuffers(), it doesn't use any membar or lock on the readers' side, it is starvation-free, and it is FIFO fair on both the readers' side and the writers' side. I will implement my new scalable algorithm in C++, Delphi and FreePascal.
Thank you, Amine Moulay Ramdane. |
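As a rough illustration of how stand-alone barrier functions like the ones listed in the message above are used, here is a C++ sketch built on `std::atomic_thread_fence`. The helper names `read_barrier`, `write_barrier` and `read_write_barrier` are hypothetical, and mapping them to acquire, release and sequentially consistent fences is an assumption made for this example; the exact guarantees of the Delphi and FreePascal functions should be taken from their own documentation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helpers: approximate C++ analogues only, not the
// Delphi/FreePascal semantics.
inline void read_barrier()       { std::atomic_thread_fence(std::memory_order_acquire); }
inline void write_barrier()      { std::atomic_thread_fence(std::memory_order_release); }
inline void read_write_barrier() { std::atomic_thread_fence(std::memory_order_seq_cst); }

// Hands `value` from a writer thread to a reader thread using relaxed
// flag accesses plus explicit fences, and returns what the reader saw.
int fenced_handoff(int value) {
    int data = 0;
    std::atomic<int> flag{0};
    int seen = -1;
    std::thread reader([&] {
        while (flag.load(std::memory_order_relaxed) == 0) {
            std::this_thread::yield();
        }
        read_barrier();   // the read of `data` cannot move before the flag read
        seen = data;
    });
    std::thread writer([&] {
        data = value;
        write_barrier();  // the write of `data` cannot move after the flag write
        flag.store(1, std::memory_order_relaxed);
    });
    writer.join();
    reader.join();
    return seen;
}
```

The sketch uses only the read and write barriers; a full `read_write_barrier()` would be needed where a store must be ordered before a later load.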
Horizon68 <horizon@horizon.com>: Feb 01 12:40PM -0800
Hello,
I have just taken a look at the following algorithm invented by Dmitry Vyukov: https://groups.google.com/forum/#!topic/lock-free/Hv3GUlccYTc
Notice that it uses FlushProcessWriteBuffers(), and notice what Chris Thomasson said about FlushProcessWriteBuffers():
==
Well, the thing with FlushProcessWriteBuffers() is that it will generate a lot of traffic in the sense of sending the interrupts to all the CPUS in the processes affinity mask. This is an "active" form of quiescent state auto-detection. As of now, vZOOM uses "passive" detection technique on Windows; It does not need to interrupt CPU activity. AFAICT, that is the only advantage I can see to passive epoch detection, rather than active. Also, for PDR, the epochs should be detected on a frequent enough basis to keep the deferred object lists from backing up too much. The frequency of epochs in an active system will be creating a lot of IPI traffic, while the passive system will be creating none.
Read more here: https://groups.google.com/forum/#!topic/comp.programming.threads/E0gGTkg46HE
==
But I have just invented a new scalable RWLock algorithm that is "better" than the above because it doesn't need FlushProcessWriteBuffers(), it doesn't use any membar or lock on the readers' side, it is starvation-free, and it is FIFO fair on both the readers' side and the writers' side. I will implement my new scalable algorithm in C++, Delphi and FreePascal.
Thank you, Amine Moulay Ramdane. |
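The "passive" detection described in the quote above can be sketched in C++ as follows: each reader bumps its own counter whenever it passes a quiescent point, and the writer simply polls those counters instead of interrupting the readers. This is a simplified, hypothetical illustration and not the vZOOM algorithm, Dmitry Vyukov's algorithm, or the new RWLock. The names `EpochDomain`, `ReaderSlot`, `kReaders` and `run_grace_period_demo` are invented for the example. It assumes a fixed set of readers that all keep passing quiescent points, and it uses default sequentially consistent atomics for simplicity, so it does pay a reader-side barrier cost.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kReaders = 4;  // hypothetical fixed number of reader threads

// One counter per reader, aligned to reduce false sharing between readers.
struct alignas(64) ReaderSlot {
    std::atomic<unsigned long> passes{0};
};

class EpochDomain {
public:
    // Reader side: announce "I hold no shared references right now".
    void quiescent(std::size_t id) { slots_[id].passes.fetch_add(1); }

    // Writer side: snapshot every counter, then poll until each one has
    // changed. No reader is interrupted; the writer only waits.
    void wait_for_grace_period() {
        std::array<unsigned long, kReaders> snapshot{};
        for (std::size_t i = 0; i < kReaders; ++i) {
            snapshot[i] = slots_[i].passes.load();
        }
        for (std::size_t i = 0; i < kReaders; ++i) {
            while (slots_[i].passes.load() == snapshot[i]) {
                std::this_thread::yield();
            }
        }
    }

    unsigned long passes(std::size_t id) const { return slots_[id].passes.load(); }

private:
    std::array<ReaderSlot, kReaders> slots_;
};

// Starts kReaders reader threads, waits for one grace period, and returns
// how many readers were observed to have passed a quiescent point.
std::size_t run_grace_period_demo() {
    EpochDomain domain;
    std::atomic<bool> stop{false};
    std::vector<std::thread> readers;
    for (std::size_t id = 0; id < kReaders; ++id) {
        readers.emplace_back([&domain, &stop, id] {
            while (!stop.load()) {
                // Read-side work on shared data would go here.
                domain.quiescent(id);
                std::this_thread::yield();
            }
        });
    }
    domain.wait_for_grace_period();
    stop.store(true);
    for (auto& t : readers) {
        t.join();
    }
    std::size_t advanced = 0;
    for (std::size_t id = 0; id < kReaders; ++id) {
        if (domain.passes(id) > 0) {
            ++advanced;
        }
    }
    return advanced;
}
```

After `wait_for_grace_period()` returns, objects retired before the call could be reclaimed in a real deferred-reclamation scheme. Removing the reader-side barrier cost from a scheme like this is the hard part, and it is what an "active" call such as FlushProcessWriteBuffers() is used for.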
Horizon68 <horizon@horizon.com>: Feb 01 10:41AM -0800 Hello, My Scalable RWLocks for Delphi and FreePascal were updated to version 4.18. I think they are now fast and working correctly. You can download them from: https://sites.google.com/site/scalable68/scalable-rwlock Thank you, Amine Moulay Ramdane. |
Horizon68 <horizon@horizon.com>: Feb 01 10:38AM -0800 Hello, My C++ synchronization objects library for Windows and Linux was updated again. I think it is now fast and working correctly. You can download it from: https://sites.google.com/site/scalable68/c-synchronization-objects-library Thank you, Amine Moulay Ramdane. |
Horizon68 <horizon@horizon.com>: Feb 01 08:18AM -0800 Hello, My Scalable RWLocks for Delphi and FreePascal were updated to version 4.17. You can download them from: https://sites.google.com/site/scalable68/scalable-rwlock Thank you, Amine Moulay Ramdane. |
Horizon68 <horizon@horizon.com>: Feb 01 08:14AM -0800 Hello, My C++ synchronization objects library for Windows and Linux was updated. You can download it from: https://sites.google.com/site/scalable68/c-synchronization-objects-library Thank you, Amine Moulay Ramdane. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com. |