soft and program: Digest for comp.programming.threads@googlegroups.com

More about my inventions and about Locks.. - 1 Update
Here is my new invention.. - 1 Update
About Lockfree and Waitfree and Locks.. - 1 Update
You cannot scale creativity.. - 1 Update
More of my thoughts on parallel computing and computing.. - 1 Update

More about my inventions and about Locks..

aminer68@gmail.com: Jun 29 01:47PM -0700

Hello..

More about my inventions and about Locks..

I have just read the following thoughts of a PhD researcher, and he says the following:

"4) using locks is prone to convoying effects;"

Read more here:

http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html

I am a white Arab and I am smart like a genius, and I think this PhD
researcher is not so smart. Notice that he says:

"4) using locks is prone to convoying effects;"

I think he is not right, because I have invented the Holy Grail
of Locks, and it is not prone to convoying. Read my following writing
about it:

----------------------------------------------------------------------

You have to understand deeply what it means to invent my scalable algorithms
and their implementations in order to understand that they are powerful.
I give you an example: I have invented a scalable algorithm, a scalable Mutex, that is remarkable and that is the Holy Grail of scalable Locks. It has the following characteristics; read my following thoughts
to understand:

About fair and unfair locking..

I have just read the following post by a lead engineer at Amazon:

Highly contended and fair locking in Java

https://brooker.co.za/blog/2012/09/10/locking.html

So as you will notice, you can use unfair locking, which can have starvation, or fair locking, which is slower than unfair locking.

I think that Microsoft synchronization objects such as the Windows critical section use unfair locking, so they can still have starvation.
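
To make the fair versus unfair distinction concrete, here is a minimal sketch of a FIFO-fair ticket lock. It is written in Python for readability and is not the Fast Mutex described in this post; the class name TicketLock and its fields are assumptions made for illustration. Each thread takes a ticket and enters only when its number is being served, so no thread can be overtaken, at the cost of waking waiters that then have to go back to sleep.

```python
import threading


class TicketLock:
    """FIFO-fair lock: threads enter in the order they took a ticket."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # next ticket number to hand out
        self._now_serving = 0   # ticket number currently allowed to enter

    def acquire(self):
        with self._cond:
            my_ticket = self._next_ticket
            self._next_ticket += 1
            # Wait until every earlier ticket holder has released.
            while self._now_serving != my_ticket:
                self._cond.wait()

    def release(self):
        with self._cond:
            self._now_serving += 1
            # Wake all waiters; only the holder of the next ticket proceeds.
            self._cond.notify_all()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()
```

An unfair lock skips the ticket check and lets whichever thread arrives first take the lock, which is usually faster but allows starvation.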

But I think this is not the right way to do it. I am an inventor, and I have invented a scalable Fast Mutex that is much more powerful, because with my Fast Mutex you can tune the "fairness" of the lock, and my Fast Mutex is capable of more than that. Read about it in my following thoughts:

More about research and software development..

I have just looked at the following new video:

Why is coding so hard...

https://www.youtube.com/watch?v=TAAXwrgd1U8

I understand this video, but I have to explain my work:

I am not like the techlead in the video above, because I am also an "inventor" who has invented many scalable algorithms and their implementations, and I am also inventing effective abstractions. I give you an example:

Read the following post by the senior research scientist Dave Dice:

Preemption tolerant MCS locks

https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks

As you will notice, he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics. This is why I have just invented a new Fast Mutex that is adaptive and much better. I think mine is the "best", and I think you will not find it anywhere. My new Fast Mutex has the following characteristics:

1- Starvation-free
2- Tunable fairness
3- It keeps its cache-coherence traffic very low
4- Very good fast path performance
5- Good preemption tolerance
6- It is faster than the scalable MCS lock
7- Not prone to convoying
------------------------------------------------------------------------------

He also says the following:

"1) if we use more than one lock, we're subject to having deadlock"

But you have to look here at our DelphiConcurrent and FreepascalConcurrent:

https://sites.google.com/site/scalable68/delphiconcurrent-and-freepascalconcurrent
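
One common way to avoid the deadlock that quote 1) describes is to always acquire locks in a single global order. The sketch below illustrates that idea in Python; it is an assumption for illustration and not necessarily how DelphiConcurrent and FreepascalConcurrent work. The helper names acquire_in_order, release_all, and transfer are hypothetical.

```python
import threading


def acquire_in_order(*locks):
    """Acquire every distinct lock in one global order (here: by id)."""
    # Deduplicate so that passing the same lock twice cannot self-deadlock.
    unique = {id(lock): lock for lock in locks}
    ordered = [unique[key] for key in sorted(unique)]
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    """Release the locks in the reverse of the order they were acquired."""
    for lock in reversed(ordered):
        lock.release()


def transfer(accounts, locks, src, dst, amount):
    """Move `amount` between two accounts while holding both account locks."""
    held = acquire_in_order(locks[src], locks[dst])
    try:
        accounts[src] -= amount
        accounts[dst] += amount
    finally:
        release_all(held)
```

Because two threads that need the same pair of locks always take them in the same order, neither can hold one lock while waiting for the other in the opposite order.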

And here is my new invention..

I think a Seqlock is a high-performance but restricted use of software Transactional Memory.

So I have just read about Seqlocks here on Wikipedia:

https://en.wikipedia.org/wiki/Seqlock

And it says about Seqlock:

"The drawback is that if there is too much write activity or the reader is too slow, they might livelock (and the readers may starve)."

I am a white Arab, and I think I am smart, so I have just invented a variant of Seqlock that has no livelock (even when there is too much write activity or the reader is too slow) and that is starvation-free.

So I think my new invention, a variant of Seqlock, is powerful.
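
The livelock described in the Wikipedia quote comes from the reader's retry loop. Here is a minimal sketch of a classic seqlock in Python, plus one well-known mitigation: after a bounded number of failed attempts the reader takes the writers' lock instead of spinning. This is not the author's variant, whose details are not given in this post. The class SeqLock and the max_retries parameter are assumptions for illustration, and the sketch relies on CPython executing these simple attribute reads and writes in program order.

```python
import threading


class SeqLock:
    """Seqlock protecting a pair (x, y); readers never block writers."""

    def __init__(self, x=0, y=0):
        self._seq = 0                        # even: stable, odd: write in progress
        self._write_lock = threading.Lock()  # serializes writers
        self._x = x
        self._y = y

    def write(self, x, y):
        with self._write_lock:
            self._seq += 1   # now odd: concurrent readers will retry
            self._x = x
            self._y = y
            self._seq += 1   # even again: the new pair is consistent

    def read(self, max_retries=None):
        """Return a consistent (x, y).

        With max_retries=None this is the classic unbounded retry loop,
        which is where a reader can starve under heavy write activity.
        """
        retries = 0
        while max_retries is None or retries < max_retries:
            start = self._seq
            if start % 2 == 0:
                x, y = self._x, self._y
                if self._seq == start:   # no writer ran during the read
                    return x, y
            retries += 1
        # Fallback (assumption, not the author's method): stop spinning
        # and read while holding the writers' lock.
        with self._write_lock:
            return self._x, self._y
```

The fallback bounds the spinning, but it makes a slow reader block writers for a moment, which the pure seqlock never does.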

And More now about Lockfree and Waitfree and Locks..

I have just read the following thoughts of a PhD researcher, and he says the following:

"5) mutual exclusion locks don't scale for read-only operations, it takes a reader-writer lock to have some scalability for read-only operations and even then, we either execute read-only operations or one write, but never both at the same time. Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"

Read more here:

http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html

But I think that he is not right in saying the following:

"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees"

I say this because I am an inventor of many scalable algorithms and their implementations, and I have invented scalable and efficient
starvation-free reader-writer locks. Read my following thoughts below
to see it..

Also look at his following webpage:

OneFile - The world's first wait-free Software Transactional Memory

http://concurrencyfreaks.blogspot.com/2019/04/onefile-worlds-first-wait-free-software.html

But I think he is not right. Read the following thoughts that I have just posted, which apply to wait-free and lock-free algorithms:

https://groups.google.com/forum/#!topic/comp.programming.threads/F_cF4ft1Qic

And read all my following thoughts to understand:

About Lock elision and Transactional memory..

I have just read the following:

Lock elision in the GNU C library

https://lwn.net/Articles/534758/

So it says the following:

"Lock elision uses the same programming model as normal locks, so it can be directly applied to existing programs. The programmer keeps using locks, but the locks are faster as they can use hardware transactional memory internally for more parallelism. Lock elision uses memory transactions as a fast path, while the slow path is still a normal lock. Deadlocks and other classic locking problems are still possible, because the transactions may fall back to a real lock at any time."

So I think this is not good, because one of the benefits of Transactional memory is that it solves the deadlock problem, but
with Lock elision you bring back the deadlock problem.

More about Locks and Transactional memory..

I have just looked at the following slides about understanding Transactional memory performance:

https://www.cs.utexas.edu/users/witchel/pubs/porter10ispass-tm-slides.pdf

And as you will notice, they say that in practice Transactional memory
is worse than Locks at high contention: 40% worse than Locks at 100% contention.

This is why I have invented scalable Locks and scalable RWLocks. Read
my following thoughts to see it:

About beating Moore's Law with software..

bmoore has responded to me here:

https://groups.google.com/forum/#!topic/soc.culture.china/Uu15FIknU0s

As you will notice, he is asking me the following:

"Are you talking about beating Moore's Law with software?"

But I think that there are some constraints, such as the following:

"Modern programing environments contribute to the problem of software bloat by placing ease of development and portable code above speed or memory usage. While this is a sound business model in a commercial environment, it does not make sense where IT resources are constrained. Languages such as Java, C-Sharp, and Python have opted for code portability and software development speed above execution speed and memory usage, while modern data storage and transfer standards such as XML and JSON place flexibility and readability above efficiency."

Read the following:

https://smallwarsjournal.com/jrnl/art/overcoming-death-moores-law-role-software-advances-and-non-semiconductor-technologies

There also remains the following way to beat Moore's Law:

"Improved Algorithms

Hardware improvements mean little if software cannot effectively use the resources available to it. The Army should shape future software algorithms by funding basic research on improved software algorithms to meet its specific needs. The Army should also search for new algorithms and techniques which can be applied to meet specific needs and develop a learning culture within its software community to disseminate this information."

And about scalable algorithms: as you know, I am a white Arab
who is an inventor of many scalable algorithms and their implementations. Read my following thoughts to see it:

About my new invention that is a scalable algorithm..

I am a white Arab, and I think I am smart,
and I think I am like a genius, because I have again just invented
a new scalable algorithm. But first I will briefly talk about the following best scalable reader-writer lock inventions. The first one is the following:

Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks

https://www.usenix.org/system/files/conference/atc14/atc14-paper-liu.pdf

You will notice that its first weakness is that it is designed for the TSO hardware memory model, and its second weakness is that the writers' latency is very expensive when there are few readers.

And here is the other best scalable reader-writer lock invention, from Facebook:

SharedMutex is a reader-writer lock. It is small, very fast, scalable
on multi-core

Read here:

https://github.com/facebook/folly/blob/master/folly/SharedMutex.h

But you will notice that the weakness of this scalable reader-writer lock is that its priority can only be configured as follows:

SharedMutexReadPriority gives priority to readers,
SharedMutexWritePriority gives priority to writers.

So the weakness of this scalable reader-writer lock is that
you can have starvation with it.

This is why I have just invented a scalable algorithm, a scalable reader-writer lock, that is better than the above: it is starvation-free, it is fair, and it has a small writers' latency.
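
For comparison, here is a textbook way to keep writers from being starved by a steady stream of readers: a turnstile that both readers and writers pass through. It is a minimal Python sketch and not the author's algorithm; it uses one shared reader counter, so it is not scalable, and its overall fairness depends on the fairness of the underlying turnstile lock. The class name FairRWLock is an assumption for illustration.

```python
import threading


class FairRWLock:
    """Reader-writer lock in which a waiting writer blocks new readers."""

    def __init__(self):
        self._turnstile = threading.Lock()         # readers and writers queue here
        self._room_empty = threading.Semaphore(1)  # held while anyone is inside
        self._count_lock = threading.Lock()        # protects _readers
        self._readers = 0

    def acquire_read(self):
        # A waiting writer holds the turnstile, so new readers queue
        # behind that writer instead of overtaking it.
        with self._turnstile:
            pass
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                self._room_empty.acquire()   # first reader locks writers out

    def release_read(self):
        with self._count_lock:
            self._readers -= 1
            if self._readers == 0:
                self._room_empty.release()   # last reader lets a writer in

    def acquire_write(self):
        self._turnstile.acquire()    # stop new readers from entering
        self._room_empty.acquire()   # wait for current readers to leave

    def release_write(self):
        self._room_empty.release()
        self._turnstile.release()
```

With a reader-priority or writer-priority configuration, one side can be postponed indefinitely; the turnstile removes that particular case for writers.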

So I think mine is the best, and I will sell many of my scalable algorithms to software companies such as Microsoft or Google or Embarcadero..

What is it to be an inventor of many scalable algorithms ?

The Holy Grail of parallel programming is to provide good speedup while
hiding or avoiding the pitfalls of concurrency. You have to understand this to be able to understand what I am doing. I am an inventor of
many scalable algorithms and their implementations, but how can we define the kind of inventor that I am? I think there are the following kinds of inventors: the ones who are PhD researchers and inventors, like Albert Einstein, and the ones who are engineers and inventors, like Nikola Tesla. I think that I am an inventor of the Nikola Tesla kind: I am not a PhD researcher like Albert Einstein, I am like an engineer who has invented many scalable algorithms and their implementations. So I am like the following inventor, Nikola Tesla:

https://en.wikipedia.org/wiki/Nikola_Tesla

But I think that both the PhD researchers who are inventors and the engineers who are inventors are powerful.

This is how I am an "inventor". I have also invented other scalable algorithms, such as scalable reference counting with efficient support for weak references, a fully scalable Threadpool, and a fully scalable FIFO queue, and I have
also invented other scalable algorithms and their implementations. I think I will sell some of them to Microsoft or Google or Embarcadero or similar software companies.

And here is my other previous invention of a scalable algorithm:

I have just read the following paper about the invention called counting networks, which are better than software combining trees:

Counting Networks

http://people.csail.mit.edu/shanir/publications/AHS.pdf

And I have read the following paper:

http://people.csail.mit.edu/shanir/publications/HLS.pdf

As you will notice, they say in the conclusion that:

"Software combining trees and counting networks which are the only techniques we observed to be truly scalable"

But I have just found that this counting networks algorithm is not generally scalable, and I have a logical proof of it. This is why I have just come up with a new invention that enhances the counting networks algorithm to be generally scalable. And I think I will sell my new algorithm
of generally scalable counting networks to Microsoft or Google or Embarcadero or similar software companies.

So you have to be careful with the current counting networks algorithm, which is not generally scalable.
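
For readers who have not met counting networks, here is a minimal sketch in Python of the smallest one: a single balancer of width 2 feeding two per-wire counters. It only illustrates the structure from the Counting Networks paper and is not the author's enhanced algorithm; the class names Balancer and Width2Counter are assumptions. The balancer and each wire have their own lock, so threads contend on smaller pieces than they would on one shared counter.

```python
import threading


class Balancer:
    """Toggle that sends tokens alternately to output wire 0 and wire 1."""

    def __init__(self):
        self._lock = threading.Lock()
        self._toggle = 0

    def traverse(self):
        with self._lock:
            wire = self._toggle
            self._toggle ^= 1
            return wire


class Width2Counter:
    """Shared counter built from one balancer and two per-wire counters."""

    WIDTH = 2

    def __init__(self):
        self._balancer = Balancer()
        self._locks = [threading.Lock() for _ in range(self.WIDTH)]
        # Wire i hands out i, i + WIDTH, i + 2 * WIDTH, ...
        self._next = list(range(self.WIDTH))

    def get_and_increment(self):
        wire = self._balancer.traverse()
        with self._locks[wire]:
            value = self._next[wire]
            self._next[wire] += self.WIDTH
            return value
```

Wider networks chain several balancers so that tokens spread over more wires; this sketch does not attempt that.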

My other new invention is my scalable reference counting, and here it is:

https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references

And here is my newest invention of a scalable algorithm:

My Scalable RWLock that works across processes and threads was updated to version 4.62

Now I think it is working correctly on both Windows and Linux..

You can download it from my website here:

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

More about me as an inventor of many scalable algorithms..

I am a white Arab and I think I am like a genius, because I have

Here is my new invention..

aminer68@gmail.com: Jun 29 12:13PM -0700

Hello,

Here is my new invention..

I think a Seqlock is a high-performance but restricted use of software Transactional Memory.

So I have just read about Seqlocks here on Wikipedia:

https://en.wikipedia.org/wiki/Seqlock

And it says about Seqlock:

"The drawback is that if there is too much write activity or the reader is too slow, they might livelock (and the readers may starve)."

I am a white Arab, and I think I am smart, so I have just invented a variant of Seqlock that has no livelock (even when there is too much write activity or the reader is too slow) and that is starvation-free.

So I think my new invention, a variant of Seqlock, is powerful.

And now about Lockfree and Waitfree and Locks..

I have just read the following thoughts of a PhD researcher, and he says the following:

"Lock-based concurrency mechanism (locks) have several difficulties in practice:
1) if we use more than one lock, we're subject to having deadlock issues;
2) if we use priority locks, we're subject to having priority inversion issues;
3) if we use a lock without starvation-freedom guarantees (such as a spinlock), we're subject to starvation and live-lock;
4) using locks is prone to convoying effects;
5) mutual exclusion locks don't scale for read-only operations, it takes a reader-writer lock to have some scalability for read-only operations and even then, we either execute read-only operations or one write, but never both at the same time. Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"

Read more here:

http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html

But I think that he is not right in saying the following:

"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees"

I say this because I am an inventor of many scalable algorithms and their implementations, and I have invented scalable and efficient
starvation-free reader-writer locks. Read my following thoughts below
to see it..

Also look at his following webpage:

OneFile - The world's first wait-free Software Transactional Memory

http://concurrencyfreaks.blogspot.com/2019/04/onefile-worlds-first-wait-free-software.html

But I think he is not right. Read the following thoughts that I have just posted, which apply to wait-free and lock-free algorithms:

https://groups.google.com/forum/#!topic/comp.programming.threads/F_cF4ft1Qic

More about me as an inventor of many scalable algorithms..

I am a white Arab and I think I am like a genius, because I have invented many scalable algorithms and their implementations. Look, for example, at my newest invention of a scalable algorithm here:

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

As you have noticed, you have to be like a genius to be able to invent
my scalable RWLock above, because it has the following characteristics:

1- It is Scalable
2- It is Starvation-free
3- It is fair
4- It can be used across processes and threads
5- It can be used as a scalable Lock across processes and threads by using my scalable AMLock, which is FIFO fair on the writers side, or it can be used as a scalable RWLock.

On the writers side I am using my scalable FIFO-fair Lock, which is called scalable AMLock.

Here is why scalable Locks are really important:

https://queue.acm.org/detail.cfm?id=2698990

So all in all it is a really good invention of mine.

Read my previous thoughts:

Here is how to use my new invention, my scalable RWLock,
across processes:

Just create a scalable rwlock object in one process by calling the constructor with a name, like this:

scalable_rwlock.create('amine');

And you can use the scalable rwlock object from another process by calling the constructor with the same name, like this:

scalable_rwlock.create('amine');

So as you will notice, I have abstracted it efficiently..

Read the rest of my previous thoughts:

My new invention of a Scalable RWLock that works across processes and threads is here, and it now works on both Windows and Linux..

Please download my source code and take a look at how I make it work across processes by using the FNV1a hash on both the process ID and the thread ID. FNV1a has good dispersion, and the FNV1a hash also permits my RWLock to be scalable.

You can download it from my website here:

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

Description:

This is my invention of a fast, scalable, starvation-free, fair and lightweight Multiple-Readers-Exclusive-Writer Lock called LW_RWLockX. It works across processes and threads.

The parameters of the constructor are as follows. The first parameter is the name of the scalable RWLock to be used across processes; if the name is empty, the lock will only be used across threads. The second parameter is the size of the readers array: if the size of the array is equal to the number of parallel readers, the lock will be scalable, but if the number of readers is greater than the size of the array, you will start to have contention. The third parameter is the size of the array of my scalable Lock that is called AMLock; the number of threads can go beyond the size of the AMLock array. Please look at the source code of my scalable algorithms to understand.
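
To illustrate why the size of the readers array matters, here is a minimal Python sketch of reader counts spread over several slots. It shows only the reader-side bookkeeping, not a complete lock, and it is not taken from the author's source code. The class name StripedReaderCounter is an assumption, and the slot choice uses the thread identifier modulo the array size as a stand-in for the hash the post describes.

```python
import threading


class StripedReaderCounter:
    """Reader counts spread over `size` slots to reduce contention."""

    def __init__(self, size):
        self._size = size
        self._locks = [threading.Lock() for _ in range(size)]
        self._counts = [0] * size

    def _slot(self):
        # Stand-in (assumption) for hashing the process ID and thread ID.
        return threading.get_ident() % self._size

    def enter(self):
        """Register the calling thread as a reader and return its slot."""
        slot = self._slot()
        with self._locks[slot]:
            self._counts[slot] += 1
        return slot

    def leave(self, slot):
        with self._locks[slot]:
            self._counts[slot] -= 1

    def active_readers(self):
        # A writer has to visit every slot, so this scan costs O(size).
        # The slots are read one after another, not as an atomic snapshot.
        total = 0
        for i in range(self._size):
            with self._locks[i]:
                total += self._counts[i]
        return total
```

When there are at least as many slots as parallel readers and the slot choice spreads them well, each reader mostly touches its own slot; with more readers than slots, several readers share a slot and contend on it.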

I have also used my following implementation of the FNV1a hash function to make my new variants of RWLocks scalable (since FNV1a is a hash algorithm that has good dispersion):

function FNV1aHash(key:int64):
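
The digest cuts the function off after its signature, so the author's FNV1aHash body is not shown. For reference, here is a minimal sketch of the standard 64-bit FNV-1a hash in Python, using the published offset basis and prime. Feeding the key as 8 little-endian bytes, and the helper reader_slot that maps a process ID and a thread ID to an array index, are assumptions for illustration and are not taken from the author's source code.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


def fnv1a_int64(key: int) -> int:
    """Hash an integer key through its 8 little-endian bytes (assumption)."""
    return fnv1a_64((key & MASK64).to_bytes(8, "little"))


def reader_slot(process_id: int, thread_id: int, size: int) -> int:
    """Hypothetical helper: map (process ID, thread ID) to a slot index."""
    data = (process_id & MASK64).to_bytes(8, "little")
    data += (thread_id & MASK64).to_bytes(8, "little")
    return fnv1a_64(data) % size
```

A call such as reader_slot(os.getpid(), threading.get_ident(), 16) would then pick one of 16 slots for the calling thread.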

About Lockfree and Waitfree and Locks..

aminer68@gmail.com: Jun 29 10:57AM -0700

Hello,

I am a white Arab, and now about Lockfree and Waitfree and Locks..

I have just read the following thoughts of a PhD researcher, and he says the following:

"Lock-based concurrency mechanism (locks) have several difficulties in practice:
1) if we use more than one lock, we're subject to having deadlock issues;
2) if we use priority locks, we're subject to having priority inversion issues;
3) if we use a lock without starvation-freedom guarantees (such as a spinlock), we're subject to starvation and live-lock;
4) using locks is prone to convoying effects;
5) mutual exclusion locks don't scale for read-only operations, it takes a reader-writer lock to have some scalability for read-only operations and even then, we either execute read-only operations or one write, but never both at the same time. Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"

Read more here:

http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html

But I think he is wrong when he says the following:

"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees"

I am an inventor of many scalable algorithms and their implementations, and I have invented scalable and efficient starvation-free reader-writer locks. Read my thoughts below to see this.

Also look at his following webpage:

OneFile - The world's first wait-free Software Transactional Memory

http://concurrencyfreaks.blogspot.com/2019/04/onefile-worlds-first-wait-free-software.html

But I think he is wrong; read the following thoughts that I have just posted, which apply to wait-free and lock-free algorithms:

https://groups.google.com/forum/#!topic/comp.programming.threads/F_cF4ft1Qic

And read all my following thoughts to understand:

About Lock elision and Transactional memory..

I have just read the following:

Lock elision in the GNU C library

https://lwn.net/Articles/534758/

So it says the following:

"Lock elision uses the same programming model as normal locks, so it can be directly applied to existing programs. The programmer keeps using locks, but the locks are faster as they can use hardware transactional memory internally for more parallelism. Lock elision uses memory transactions as a fast path, while the slow path is still a normal lock. Deadlocks and other classic locking problems are still possible, because the transactions may fall back to a real lock at any time."

So I think this is not good, because one of the benefits of Transactional memory is that it solves the deadlock problem, but with Lock elision you bring the deadlock problem back.

More about Locks and Transactional memory..

I have just looked at the following webpage about understanding Transactional memory performance:

https://www.cs.utexas.edu/users/witchel/pubs/porter10ispass-tm-slides.pdf

As you can see, it says that in practice Transactional memory is worse than Locks at high contention: it is 40% worse than Locks at 100% contention.

This is why I have invented scalable Locks and scalable RWLocks. Read my following thoughts to see this:

About beating Moore's Law with software..

bmoore responded to me with the following:

https://groups.google.com/forum/#!topic/soc.culture.china/Uu15FIknU0s

As you can see, he is asking me the following:

"Are you talking about beating Moore's Law with software?"

But I think there are constraints such as the following:

"Modern programing environments contribute to the problem of software bloat by placing ease of development and portable code above speed or memory usage. While this is a sound business model in a commercial environment, it does not make sense where IT resources are constrained. Languages such as Java, C-Sharp, and Python have opted for code portability and software development speed above execution speed and memory usage, while modern data storage and transfer standards such as XML and JSON place flexibility and readability above efficiency."

Read the following:

https://smallwarsjournal.com/jrnl/art/overcoming-death-moores-law-role-software-advances-and-non-semiconductor-technologies

There also remains the following way to beat Moore's Law:

"Improved Algorithms

Hardware improvements mean little if software cannot effectively use the resources available to it. The Army should shape future software algorithms by funding basic research on improved software algorithms to meet its specific needs. The Army should also search for new algorithms and techniques which can be applied to meet specific needs and develop a learning culture within its software community to disseminate this information."

And about scalable algorithms: as you know, I am a white Arab who is an inventor of many scalable algorithms and their implementations. Read my following thoughts to see this:

About my new invention that is a scalable algorithm..

I am a white Arab, and I think I am smart, like a genius, because I have again just invented a new scalable algorithm. But first I will briefly talk about the best existing scalable reader-writer lock inventions. The first one is the following:

Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks

https://www.usenix.org/system/files/conference/atc14/atc14-paper-liu.pdf

You will notice that its first weakness is that it targets the TSO hardware memory model, and its second weakness is that writer latency is very high when there are few readers.

And here is the other best scalable reader-writer lock invention, from Facebook:

SharedMutex is a reader-writer lock. It is small, very fast, scalable on multi-core

Read here:

https://github.com/facebook/folly/blob/master/folly/SharedMutex.h

But you will notice that the weakness of this scalable reader-writer lock is that its priority can only be configured in one of the following ways:

SharedMutexReadPriority gives priority to readers,
SharedMutexWritePriority gives priority to writers.

So with this scalable reader-writer lock you can have starvation: of the writers under read priority, or of the readers under write priority.

This is why I have just invented a scalable algorithm: a scalable reader-writer lock that is better than the above, starvation-free and fair, and that has a small writer latency.

So I think mine is the best, and I will sell many of my scalable algorithms to software companies such as Microsoft, Google or Embarcadero.
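To make the starvation issue with a fixed reader or writer priority more concrete, here is a minimal Python sketch of a well-known textbook alternative, the "turnstile" reader-writer lock. It is not my algorithm and it is not scalable; the class name TurnstileRWLock and its method names are hypothetical. A waiting writer closes the turnstile, so newly arriving readers queue up behind it instead of overtaking it. Note that it only stops readers from starving a writer: full starvation-freedom would also need the underlying locks to wake waiters in FIFO order, which Python does not promise.

```python
import threading


class TurnstileRWLock:
    """Textbook reader-writer lock in which readers cannot starve a writer."""

    def __init__(self):
        self._turnstile = threading.Lock()   # held by a waiting or active writer
        self._room_empty = threading.Lock()  # held while readers or a writer are inside
        self._count_lock = threading.Lock()  # protects the reader count
        self._readers = 0

    def acquire_read(self):
        # Pass through the turnstile; this blocks while a writer holds it.
        with self._turnstile:
            pass
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                self._room_empty.acquire()  # first reader locks writers out

    def release_read(self):
        with self._count_lock:
            self._readers -= 1
            if self._readers == 0:
                self._room_empty.release()  # last reader lets writers in

    def acquire_write(self):
        self._turnstile.acquire()   # stop new readers from entering
        self._room_empty.acquire()  # wait for the current readers to leave

    def release_write(self):
        self._turnstile.release()
        self._room_empty.release()
```

Every reader touches the same turnstile and the same counter, so this sketch has exactly the kind of shared contention point that a scalable reader-writer lock is designed to avoid.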

What is it to be an inventor of many scalable algorithms?

The Holy Grail of parallel programming is to provide good speedup while hiding or avoiding the pitfalls of concurrency. You have to understand this to be able to understand what I am doing. I am an inventor of many scalable algorithms and their implementations, but how can we define the kind of inventor that I am? I think there are the following kinds of inventors: those who are PhD researchers, like Albert Einstein, and those who are engineers, like Nikola Tesla. I think that I am an inventor of the Nikola Tesla kind: I am not a PhD researcher like Albert Einstein, I am like an engineer who has invented many scalable algorithms and their implementations. So I am like the following inventor, Nikola Tesla:

https://en.wikipedia.org/wiki/Nikola_Tesla

But I think that both kinds of inventors, the PhD researchers and the engineers, are powerful.

You have to understand deeply what it takes to invent my scalable algorithms and their implementations in order to understand that they are powerful. I give you an example: I have invented a scalable algorithm that is a remarkable scalable Mutex, the Holy Grail of scalable Locks. It has the following characteristics; read my following thoughts to understand:

About fair and unfair locking..

I have just read the following from a lead engineer at Amazon:

Highly contended and fair locking in Java

https://brooker.co.za/blog/2012/09/10/locking.html

So, as you can see, you can use either unfair locking, which can have starvation, or fair locking, which is slower than unfair locking.

I think that Microsoft synchronization objects like the Windows critical section use unfair locking, and so they can still have starvation.

But I think that this is not the right way to do it. I am an inventor and I have invented a scalable Fast Mutex that is much more powerful, because with my Fast Mutex you can tune the "fairness" of the lock, and my Fast Mutex is capable of more than that. Read about it in my following thoughts:

More about research and software development..

I have just looked at the following new video:

Why is coding so hard...

https://www.youtube.com/watch?v=TAAXwrgd1U8

I understand this video, but I have to explain my work:

I am not like the tech lead in the video above, because I am also an "inventor" who has invented many scalable algorithms and their implementations, and I am also inventing effective abstractions. I give you an example:

Read the following from the senior research scientist Dave Dice:

Preemption tolerant MCS locks

https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks

As you can see, he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics. This is why I have just invented a new Fast Mutex that is adaptive and much better. I think mine is the "best", and I think you will not find it anywhere. My new Fast Mutex has the following characteristics:

1- Starvation-free
2- Tunable fairness
3- It keeps its cache-coherence traffic very low
4- Very good fast path performance
5- Good preemption tolerance
6- It is faster than the scalable MCS lock

This is how I am an "inventor". I have also invented other scalable algorithms, such as a scalable reference counting with efficient support for weak references, a fully scalable Threadpool and a fully scalable FIFO queue, along with other scalable algorithms and their implementations, and I think I will sell some of them to Microsoft, Google, Embarcadero or similar software companies.

And here is my other previous new invention of a scalable algorithm:

I have just read the following PhD paper about the invention called counting networks, which are better than software combining trees:

Counting Networks

http://people.csail.mit.edu/shanir/publications/AHS.pdf

And i have read the following PhD paper:

http://people.csail.mit.edu/shanir/publications/HLS.pdf

As you can see, they say in the conclusion that:

"Software combining trees and counting networks which are the only techniques we observed to be truly scalable"

But I have just found that this counting networks algorithm is not generally scalable, and I have the logical proof of it. This is why I have just come up with a new invention that enhances the counting networks algorithm to be generally scalable. And I think I will sell my new algorithm of generally scalable counting networks to Microsoft, Google, Embarcadero or similar software companies.

So you have to be careful with the current counting networks algorithm, which is not generally scalable.
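For readers who do not know counting networks, here is a minimal Python sketch of the basic structure described in the paper above: a width-4 bitonic network built from balancers, simulated one token at a time. It shows neither my enhancement nor the scalability question, and the names Balancer, Bitonic4 and get_and_increment are hypothetical. A real implementation would let many threads traverse the network concurrently with atomic toggles.

```python
class Balancer:
    """Toggle that sends tokens alternately to its top and bottom output wire."""

    def __init__(self, top, bottom):
        self.top = top
        self.bottom = bottom
        self._toggle = False  # False means the next token leaves on the top wire

    def traverse(self):
        out = self.bottom if self._toggle else self.top
        self._toggle = not self._toggle
        return out


class Bitonic4:
    """Width-4 bitonic counting network used as a shared counter (sequential sketch)."""

    WIDTH = 4
    # Three layers of balancers, each given as (top wire, bottom wire).
    LAYOUT = [[(0, 1), (2, 3)], [(0, 3), (1, 2)], [(0, 1), (2, 3)]]

    def __init__(self):
        self._layers = []
        for layer in self.LAYOUT:
            by_wire = {}
            for top, bottom in layer:
                balancer = Balancer(top, bottom)
                by_wire[top] = balancer
                by_wire[bottom] = balancer
            self._layers.append(by_wire)
        # Output wire i hands out the values i, i + 4, i + 8, ...
        self._next_value = list(range(self.WIDTH))

    def get_and_increment(self, input_wire):
        wire = input_wire
        for by_wire in self._layers:
            wire = by_wire[wire].traverse()
        value = self._next_value[wire]
        self._next_value[wire] += self.WIDTH
        return value
```

Whichever input wires the tokens arrive on, in this sequential simulation the k-th token leaves on output wire k mod 4, so the values come out as 0, 1, 2, 3 and so on without any single shared counter.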

My other new invention is my scalable reference counting and here it is:

https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references

And here is my newest invention of a scalable algorithm:

My Scalable RWLock that works across processes and threads was updated to version 4.62

I think it now works correctly on both Windows and Linux.

You can download it from my website here:

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

More about me as an inventor of many scalable algorithms..

I am a white Arab and I think I am like a genius, because I have invented many scalable algorithms and their implementations. Look, for example, at my newest invention of a scalable algorithm here:

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

As you have noticed, you have to be like a genius to be able to invent my above scalable algorithm of a scalable RWLock, because it has the following characteristics:

1- It is scalable
2- It is starvation-free
3- It is fair
4- It can be used across processes and threads
5- It can be used as a scalable Lock across processes and threads by using my scalable AMLock, which is FIFO fair, on the writers side, or it can be used as a scalable RWLock.

On the writers side I am using my scalable Lock called scalable AMLock, which is FIFO fair.
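To show what "FIFO fair" means in practice, here is a minimal Python sketch of a classic ticket lock. It is not the AMLock algorithm, and the TicketLock class is hypothetical. Each thread takes a ticket and is served strictly in ticket order. A real ticket lock would use an atomic fetch-and-add and spin on the serving counter; this sketch uses a condition variable instead, as an assumption, to keep it short.

```python
import threading


class TicketLock:
    """FIFO lock: threads enter strictly in the order they took a ticket."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0  # next ticket number to hand out
        self._now_serving = 0  # ticket number currently allowed in

    def acquire(self):
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            # Wait until every earlier ticket has been served.
            while ticket != self._now_serving:
                self._cond.wait()
            return ticket

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()  # wake waiters; only the next ticket proceeds
```

Because no thread can overtake an earlier ticket, no waiter can starve, but all threads contend on the same two counters, which is what scalable queue locks such as MCS are designed to avoid.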

Here is why scalable Locks are really important:

https://queue.acm.org/detail.cfm?id=2698990

So all in all it is a really good invention of mine.

Read my previous thoughts:

Here is how to use my new invention, my scalable RWLock, across processes:

Just create a scalable RWLock object with a name in one process by calling the constructor like this:

scalable_rwlock.create('amine');

You can then use the scalable RWLock object from another process by calling the constructor with the same name, like this:

scalable_rwlock.create('amine');

So, as you can see, I have abstracted it efficiently.

Read the rest of my previous thoughts:

My new invention of a Scalable RWLock that works across processes and threads is here, and now it works on both Windows and Linux..

Please download my source code and take a look at how I make it work across processes by applying the FNV1a hash to both the process ID and the thread ID. FNV1a has good dispersion, and this is also what permits my RWLock to be scalable.

You can download it from my website here:

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

Description:

This is my invention of a fast, scalable, starvation-free, fair and lightweight Multiple-Readers-Exclusive-Writer Lock called LW_RWLockX. It works across processes and threads.

The constructor takes three parameters. The first is the name of the scalable RWLock to be used across processes; if the name is empty, the lock is only used across threads. The second is the size of the array of the readers: if the size of the array is equal to the number of parallel readers, the lock will be scalable, but if the number of readers is greater than the size of the array, you will start to have contention. The third is the size of the array of my scalable Lock called AMLock; the number of threads can go beyond the size of this array. Please look at the source code of my scalable algorithms to understand.

I have also used the following implementation of the FNV1a hash function to make my new variants of RWLocks scalable (since FNV1a is a hash algorithm with good dispersion):

function FNV1aHash(key: int64): UInt64;
var
  i: Integer;
  key1: UInt64;
const
  FNV_offset_basis: UInt64 = 14695981039346656037;
  FNV_prime: UInt64 = 1099511628211;
begin
  // FNV-1a hash over the 8 bytes of key, lowest byte first
  Result := FNV_offset_basis;
  for i := 1 to 8 do
  begin
    // Extract byte number i of the key
    key1 := (key shr ((i - 1) * 8)) and $00000000000000ff;
    // FNV-1a step: xor the byte in first, then multiply by the prime
    Result := (Result xor key1) * FNV_prime;
  end;
end;
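As a rough illustration of how such a hash can spread readers over the readers array, here is a minimal Python sketch. The fnv1a_hash_key function mirrors the FNV1aHash function above. The way reader_slot packs the process ID and thread ID into one 64-bit key, and the modulo by the array size, are assumptions made for this example and are not taken from the source code.

```python
FNV_OFFSET_BASIS = 14695981039346656037  # 0xcbf29ce484222325
FNV_PRIME = 1099511628211                # 0x100000001b3
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes) -> int:
    """Standard 64-bit FNV-1a over a byte string."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        # FNV-1a: xor the byte in first, then multiply, keeping 64 bits.
        h = ((h ^ byte) * FNV_PRIME) & MASK64
    return h


def fnv1a_hash_key(key: int) -> int:
    """Hash the 8 bytes of a 64-bit key, lowest byte first, like FNV1aHash."""
    return fnv1a_64((key & MASK64).to_bytes(8, "little"))


def reader_slot(pid: int, tid: int, n_slots: int) -> int:
    """Hypothetical mapping of (process ID, thread ID) to a readers array index."""
    key = ((pid & 0xFFFFFFFF) << 32) | (tid & 0xFFFFFFFF)  # assumed packing
    return fnv1a_hash_key(key) % n_slots
```

For example, reader_slot(os.getpid(), threading.get_ident(), 8) would give the calling thread one of 8 slots. Two readers that hash to the same slot share it, which is where the contention starts once there are more readers than slots.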

- Platform: Windows, Unix and Linux on x86

Required FPC switches: -O3 -Sd

-Sd for Delphi mode.

Required Delphi switches: -$H+ -DDelphi

For Delphi XE-XE7 and Delphi Tokyo use the -DXE switch

You can configure it as follows from inside defines.inc file:

{$DEFINE CPU32} and {$DEFINE Windows32} for 32

You cannot scale creativity..

aminer68@gmail.com: Jun 29 10:45AM -0700

Hello,

You cannot scale creativity..

I am a white Arab and I think I am smart, and I invite you to read the following thoughts about "You cannot scale creativity". They are related to the following powerful product that I have designed and implemented, because, as you will read below, "The solution is to lessen the need for coordination: have different people work on different things, use smaller teams, and employ fewer managers." Here is my powerful product (which can also be applied to organizations):

https://sites.google.com/site/scalable68/universal-scalability-law-for-delphi-and-freepascal

Please read the following about Applying the Universal Scalability Law to organisations:

https://blog.acolyer.org/2015/04/29/applying-the-universal-scalability-law-to-organisations/
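For reference, the Universal Scalability Law is usually written as C(N) = N / (1 + alpha*(N - 1) + beta*N*(N - 1)), where N is the number of workers (processors, threads or people), alpha is the contention cost and beta is the coordination (coherency) cost. Here is a minimal Python sketch of that formula. The function names and the parameter values in the usage note are only illustrative assumptions, and they are not taken from my product.

```python
import math


def usl_speedup(n: float, alpha: float, beta: float) -> float:
    """Relative capacity C(N) predicted by the Universal Scalability Law."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak(alpha: float, beta: float) -> float:
    """Number of workers N at which C(N) is largest (requires beta > 0)."""
    return math.sqrt((1.0 - alpha) / beta)
```

With alpha = 0.05 and beta = 0.001, for example, usl_peak gives about 30.8, so adding workers beyond roughly 31 makes the total output go down. With beta = 0 the formula reduces to Amdahl's law, and with alpha = beta = 0 it gives perfect linear scaling.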

To understand, read the following from this PhD computer scientist:

https://lemire.me/blog/about-me/

You cannot scale creativity

As a teenager, I was genuinely impressed by communism. The way I saw it, the West could never compete. The USSR offered a centralized and efficient system that could eliminate waste and ensure optimal efficiency. If a scientific problem appeared, the USSR could throw 10, 100 or 1000 scientists at it without having to cajole anyone.

I could not quite understand why the communist countries always appeared to be technologically so backward. Weren't their coordinated engineers and scientists out-innovating our scientists and engineers?

I was making a reasoning error. I had misunderstood the concept of economy of scale best exemplified by Ford. To me, communism was more or less a massive application of the Fordian approach. It ought to make everything better and cheaper!

The industrial revolution was made possible by economies of scale: it costs far less per car to produce 10,000 cars than to make just one. Bill Gates became the richest man in the world because software offers an optimal economy of scale: it costs the same to produce one copy of Windows or 100 million copies.

Trade and employment can also scale: the transaction costs go down if you sell 10,000 objects a day, or hire 10,000 people a year. Accordingly, people living in cities are typically better off and more productive.

This has lead to the belief that if you regroup more people and you organize them, you get better productivity. I want to stress how different this statement is from the previous observations. We can scale products, services, trade and interaction. Scaling comes from the fact that we need reproduce many copies of the essentially the same object or service. But merely regrouping people only involves scaling in accounting and human ressources: if these are the costs holding you back, you are probably not doing anything important. To get ten people together to have much more than ten times the output is only possible if you are producing an uniform product or service.

Yet, somehow, people conclude that regroup people and getting them to work on a common goal, by itself, will improve productivity. Fred Brooks put a dent in this theory with his Brook's law:

Adding manpower to a late software project makes it later.

While it is true that almost all my work is collaborative, I consistently found it counterproductive to work in large groups. Of course, as an introvert, this goes against all my instincts. But I also fail to see the productivity gains in practice whereas I do notice the more frequent meetings.

Abramo et al. (2012) looked seriously at this issue and found that you get no more than linear scaling. That is, a group of two researchers will produce twice as much as one researcher. Period. There is no economy of scale when coordinating human brains. Their finding contradicts decades of science policy where we have tried to organize people into larger and better coordinated groups (a concept eerily reminiscent of communism).

We can make an analogy with computers. Your quad-core processor will not run Microsoft Word four times as fast. It probably won't even run it twice as fast. In fact, poorly written software may even run slower when there are more than one core. Coordination is expensive.

The solution is to lessen the need for coordination: have different people work on different things, use smaller teams, and employ fewer managers.

Read more here:

https://lemire.me/blog/2012/10/15/you-cannot-scale-creativity/

Thank you,
Amine Moulay Ramdane.

More of my thoughts on parallel computing and computing..

aminer68@gmail.com: Jun 29 10:16AM -0700

Hello,

More of my thoughts on parallel computing and computing..

I am an inventor of many scalable algorithms and their implementations, and I am a serious software developer specialized in parallel computing and synchronization algorithms, so I invite you to read my thoughts about parallel computing and computing at the following web links:

https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/71464/about-turing-completeness-and-parallel-programming

https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/71555/here-is-more-of-my-thoughts-on-programming-languages

https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/72018/about-the-threadpool

https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/70188/today-i-will-talk-about-data-dependency-and-parallel-loops

And here are "some" of my inventions that I have put on my website:

https://sites.google.com/site/scalable68/scalable-mlock

https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references

https://sites.google.com/site/scalable68/scalable-rwlock

https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads

https://groups.google.com/forum/#!topic/comp.programming.threads/VaOo1WVACgs

https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well

Thank you,
Amine Moulay Ramdane.


Tuesday, June 30, 2020

Digest for comp.programming.threads@googlegroups.com - 5 updates in 5 topics
