Tuesday, June 30, 2020

Re:


It's gorgeous to be CHINESE !!!!!!!!!!!!!!!!
JCH


James C. Hsiung, Ph.D.
Professor of Politics & Int'l Law
New York University
19 West 4th St.
New York, N.Y.
(212) 998-8523


On Tue, Jun 30, 2020 at 10:35 PM tina Soong <tsoongtotherim@aol.com> wrote:



From: Dolores Kuo <doloresmkuo@gmail.com>


*I noticed an interesting phenomenon*:

Clever
Honest
Intelligent
Noble
Excellent
Smart
Elegant

Put the first letters of these English words together and they spell: CHINESE --- "Chinese".

     If you are "Chinese", share this once in every one of your groups. —— 😉😊 Quite interesting 👍


Sent from my iPhone


Digest for comp.lang.c++@googlegroups.com - 10 updates in 4 topics

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 30 03:34PM -0700

On 6/29/2020 1:19 AM, Chris M. Thomasson wrote:
>> The original message for this can be found here:
 
>> https://groups.google.com/forum/#!original/comp.arch/k1Qr520dcDk/4vl_jiVRBQAJ
 
>> within the following thread:
[...]
>             return  m_head.exchange(nullptr, std::memory_order_acquire);
>         }
>     };
[...]
 
FWIW, there is a way to get rid of the acquire barrier in the
stack::flush function. The only difference is that some relaxed
operations would become consume. This is fine on most architectures
because consume is a no-op there with respect to memory barriers, except on DEC Alpha.
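
For illustration, here is a minimal sketch (my own names and node layout, not the code from the linked thread) of a Treiber-style stack whose flush uses memory_order_consume: the consumer only reaches the nodes through the pointer chain returned by the exchange, so dependency ordering is enough on most targets. Note that current compilers typically promote consume to acquire anyway, and DEC Alpha would still need a real barrier.

#include <atomic>

struct node {
    int value;
    node* next;
};

struct stack {
    std::atomic<node*> m_head{nullptr};

    void push(node* n) {
        // Release on the successful CAS publishes n->value to any thread
        // that later observes n through a dependency-ordered load.
        node* head = m_head.load(std::memory_order_relaxed);
        do {
            n->next = head;
        } while (!m_head.compare_exchange_weak(head, n,
                     std::memory_order_release,
                     std::memory_order_relaxed));
    }

    node* flush() {
        // Detach the whole list; the traversal follows the returned
        // pointer chain, so consume (data-dependency ordering) suffices here.
        return m_head.exchange(nullptr, std::memory_order_consume);
    }
};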
Juha Nieminen <nospam@thanks.invalid>: Jun 30 06:20AM

> I have seen lots of code which assumes contiguity over the whole of a
> multi-dimensional array, and I am pretty certain this requirement
> exists, but can anyone point me to where I can find it?
 
Not an answer to your question, but it reminded me of a conversation I had
recently about why there is no multidimensional version of std::array.
 
Why should there be? Because, AFAIK, std::array<std::array<T, X>, Y> is not
guaranteed to be contiguous.
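
A small illustration of that point (type names are mine): flat indexing over a nested std::array silently assumes that the inner array type has no trailing padding, which the standard does not promise.

#include <array>

using Inner = std::array<char, 3>;
using Grid  = std::array<Inner, 4>;   // 4 rows of 3 chars

// Guaranteed: at least the payload bytes are there.
static_assert(sizeof(Grid) >= 4 * 3 * sizeof(char), "payload bytes");

// NOT guaranteed by the standard, although it holds on every mainstream
// implementation; it is exactly what treating &grid[0][0] as a flat
// array of 12 chars relies on:
// static_assert(sizeof(Grid) == 4 * 3 * sizeof(char), "no padding");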
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Jun 30 11:39AM +0100

On Mon, 29 Jun 2020 14:41:07 -0700
> wording seems to indicate that it's a consequence of some normative
> requirement elsewhere in the standard. (Nevertheless, I accept
> that it's required.)
 
It is odd, which is maybe why I missed it.
 
The first sentence of the text extracted by Alf ("When applied to an
array, the result is the total number of bytes in the array") is a
truism, arising from the specification for the sizeof operator itself
in §8.3.3/1: "The sizeof operator yields the number of bytes in the
object representation of its operand".
 
So the "this implies ..." is not implied at all. It does however appear
to be normative, which is good enough for me. (At least, until the C++
standard committee notice this infelicity in the standard and decide to
resolve it by removing the requirement for no end padding. Removing
that requirement would break some pre-existing code concerning the
accessing of elements of multi-dimensional arrays, but that has not
dissuaded the committee before with their changes to object lifetime
rules.)
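
For built-in arrays the arithmetic does hold, and the kind of pre-existing code referred to above looks roughly like this (my example; the legality of the flat pointer arithmetic is itself a separately debated question, but the no-end-padding layout is what makes it work in practice):

#include <cstddef>

// sizeof of a built-in array is exactly element count times element size,
// i.e. no end padding per element.
static_assert(sizeof(int[3][4]) == 3 * 4 * sizeof(int), "no end padding");

int sum_flat(const int (&m)[3][4]) {
    const int* p = &m[0][0];
    int s = 0;
    for (std::size_t i = 0; i < 3 * 4; ++i)
        s += p[i];   // relies on the rows being laid out back to back
    return s;
}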
Vir Campestris <vir.campestris@invalid.invalid>: Jun 30 09:44PM +0100

On 29/06/2020 22:08, Alf P. Steinbach wrote:
 
>> I think if there was such a requirement every compiler I've ever used
>> would disobey it.
 
> Surely you mean the opposite?
 
The bit I meant is the part you snipped: "may not have observable end padding".
 
Andy
Juha Nieminen <nospam@thanks.invalid>: Jun 30 06:02AM

>>and then you say that it's never instantiated.
 
> Instantiation implies an action at runtime, not binary load time which is
> what happens in this case.
 
Says who? Instantiation means that the object physically exists at runtime.
If the object never exists, it has never been instantiated.
Ian Collins <ian-news@hotmail.com>: Jun 30 06:11PM +1200

On 30/06/2020 18:02, Juha Nieminen wrote:
>> what happens in this case.
 
> Says who? Instantiation means that the object physically exists at runtime.
> If the object never exists, it has never been instantiated.
 
Says the troll who can't count code changes or compile code!
 
--
Ian.
boltar@nowhere.co.uk: Jun 30 11:02AM

On Tue, 30 Jun 2020 06:02:23 +0000 (UTC)
>> what happens in this case.
 
>Says who? Instantiation means that the object physically exists at runtime.
>If the object never exists, it has never been instantiated.
 
So is the machine code in the binary instantiated as well then? Instantiation
means run time creation, not simply what the loader copied into memory before
a single CPU instruction of the program has been run.
Juha Nieminen <nospam@thanks.invalid>: Jun 30 04:16PM

> So is the machine code in the binary instantiated as well then?
 
In this sense, yes. As opposed to the compiler evaluating the
program at compile time and simply using the end result of the
calculations (which is quite common especially nowadays with
constexpr functions).
 
If a constexpr function is evaluated at compile time and only
its end result ends up being used, then you could say that the
function was never instantiated. If the function gets compiled
into the binary and thus actually exists there at runtime, then
it got instantiated.
 
Data or code, it works very similarly.
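
For instance (whether the function body actually ends up in the emitted binary is of course up to the implementation and its optimizer):

constexpr int square(int x) { return x * x; }

// Forced compile-time evaluation: only the value 49 needs to exist in the
// binary; no run-time call to square() is required.
constexpr int folded = square(7);
static_assert(folded == 49, "evaluated by the compiler");

// A run-time argument may instead require square() to be emitted into the
// object file and called at run time ("instantiated" in the above sense),
// unless the optimizer folds it too.
int at_runtime(int x) { return square(x); }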
 
> Instantiation
> means run time creation, not simply what the loader copied into memory before
> a single CPU instruction of the program has been run.
 
And where are you getting this weird definition from?
Juha Nieminen <nospam@thanks.invalid>: Jun 30 06:15AM

> It seems that an automated optimiser realises that "Func" will never be called and so it doesn't require its definition.
 
I doubt that the standard guarantees linker errors for code that might
or might not be optimized away by the linker.
 
Btw, this is one of the reasons why one should *always* test compiling
one's projects both *with* and *without* optimizations. My habit of
always compiling with optimizations, and never trying what happens if I
leave them out, has bitten me in the behind more than once: when I
finally compiled without optimizations I got linker errors, because
something was not defined properly and the compiler had been hiding the
error behind optimizations.
 
(Of course even then you can't really rely on finding this out simply by
leaving the optimization options out, as there's no guarantee that the
compiler will not keep optimizing some things out.)
Frederick Gotham <cauldwell.thomas@gmail.com>: Jun 30 01:08AM -0700

On Tuesday, June 30, 2020 at 7:15:52 AM UTC+1, Juha Nieminen wrote:
 
> (Of course even then you can really rely on finding this out by
> simply leaving optimizations options out, as there's no guarantee that
> the compiler will not keep optimizing some things out.)
 
 
I had one executable program that I decided to split into one executable program and two libraries.
 
So I did the split, and then I tried to link everything together. Well, the executable program linked successfully even when I didn't link it with one of the new libraries I had created. I spent maybe an hour copy-pasting function declarations and moving ".cpp" files before I decided to look at where the function is called in the code, and I saw that the call sat inside an if-statement with a compile-time-constant boolean.
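
A condensed version of that situation (hypothetical names; the outcome is toolchain-dependent): whether this links without the missing library depends on whether the compiler throws away the dead call, which it typically does with optimizations enabled and may or may not do without them.

// Declared here, but defined only in the library that was not linked in.
void func_from_missing_library();

int main() {
    const bool use_library = false;        // compile-time-constant boolean
    if (use_library)
        func_from_missing_library();       // dead call: may never be referenced
}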
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

Digest for comp.programming.threads@googlegroups.com - 5 updates in 5 topics

aminer68@gmail.com: Jun 29 01:47PM -0700

Hello..
 
 
More about my inventions and about Locks..
 
I have just read the following thoughts of a PhD researcher, and he says the following:
 
"4) using locks is prone to convoying effects;"
 
Read more here:
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html
 
 
I am a white arab and i am smart like a genius, and this PhD
researcher is not so smart, notice that he is saying:
 
"4) using locks is prone to convoying effects;"
 
And i think he is not right, because i have invented the Holy Grail
of Locks, and it is not prone to convoying, read my following writing
about it:
 
----------------------------------------------------------------------
 
You have to understand deeply what it is to invent my scalable algorithms
and their implementations in order to understand that it is powerful.
I will give you an example: I have invented a scalable algorithm that is a scalable Mutex that is remarkable and that is the Holy Grail of scalable Locks; it has the following characteristics, read my following thoughts
to understand:
 
About fair and unfair locking..
 
I have just read the following from a lead engineer at Amazon:
 
Highly contended and fair locking in Java
 
https://brooker.co.za/blog/2012/09/10/locking.html
 
So as you are noticing, you can use unfair locking, which can suffer from starvation, or fair locking, which is slower than unfair locking.
 
I think that Microsoft synchronization objects like the Windows critical section use unfair locking, so they can still have starvation.
 
But I think that this is not the right way to do it, because I am an inventor and I have invented a scalable Fast Mutex that is much more powerful, because with my Fast Mutex you are able to tune the "fairness" of the lock, and my Fast Mutex is capable of more than that; read about it in my following thoughts:
 
More about research and software development..
 
I have just looked at the following new video:
 
Why is coding so hard...
 
https://www.youtube.com/watch?v=TAAXwrgd1U8
 
I am understanding this video, but i have to explain my work:
 
I am not like this techlead in the video above, because I am also an "inventor" who has invented many scalable algorithms and their implementations; I am also inventing effective abstractions. I will give you an example:
 
Read the following from the senior research scientist Dave Dice:
 
Preemption tolerant MCS locks
 
https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks
 
As you are noticing, he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics; this is why I have just invented a new Fast Mutex that is adaptive and that is much, much better, and I think mine is the "best", and I think you will not find it anywhere. My new Fast Mutex has the following characteristics:
 
1- Starvation-free
2- Tunable fairness
3- It keeps its cache-coherence traffic very low
4- Very good fast path performance
5- And it has a good preemption tolerance.
6- It is faster than a scalable MCS lock
7- Not prone to convoying.
------------------------------------------------------------------------------
 
 
Also he is saying the following:
 
"1) if we use more than one lock, we're subject to having deadlock"
 
 
But you have to look here at our DelphiConcurrent and FreepascalConcurrent:
 
https://sites.google.com/site/scalable68/delphiconcurrent-and-freepascalconcurrent
 
 
And here is my new invention..
 
 
I think a Seqlock is a high-performance but restricted use of software Transactional Memory.
 
So i have just read about Seqlocks here on wikipedia:
 
https://en.wikipedia.org/wiki/Seqlock
 
And it says about Seqlock:
 
"The drawback is that if there is too much write activity or the reader is too slow, they might livelock (and the readers may starve)."
 
 
I am a white arab, and I think I am smart, so I have just invented a variant of Seqlock that has no livelock (even when there is too much write activity or the reader is too slow) and that is starvation-free.
 
So i think my new invention that is a variant of Seqlock is powerful.
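
For context, here is a minimal single-writer seqlock sketch (the textbook algorithm, not the poster's variant; the default seq_cst ordering is used for clarity, where real implementations use weaker orderings plus fences). The reader's retry loop at the bottom is where the quoted livelock/starvation drawback comes from.

#include <atomic>
#include <cstdint>

struct SeqLock {
    std::atomic<uint32_t> seq{0};     // odd while a write is in progress
    std::atomic<int> a{0}, b{0};      // payload (atomic to avoid data races)

    void write(int x, int y) {        // assumes writers are serialized
        seq.fetch_add(1);             // becomes odd: readers will retry
        a.store(x);
        b.store(y);
        seq.fetch_add(1);             // even again: snapshot published
    }

    void read(int& x, int& y) const {
        for (;;) {
            uint32_t s1 = seq.load();
            if (s1 & 1u) continue;    // writer in progress
            x = a.load();
            y = b.load();
            if (seq.load() == s1) return;  // nothing changed: consistent
            // Otherwise a writer intervened; retry. With enough write
            // activity a slow reader can loop here indefinitely.
        }
    }
};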
 
And More now about Lockfree and Waitfree and Locks..
 
I have just read the following thoughts of a PhD researcher, and he says the following:
 
"5) mutual exclusion locks don't scale for read-only operations, it takes a reader-writer lock to have some scalability for read-only operations and even then, we either execute read-only operations or one write, but never both at the same time. Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"
 
Read more here:
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html
 
 
But i think that he is not right by saying the following:
 
"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees"
 
Because I am an inventor of many scalable algorithms and their implementations, and I have invented scalable and efficient
starvation-free reader-writer locks; read my following thoughts below
to notice it..
 
Also look at his following webpage:
 
OneFile - The world's first wait-free Software Transactional Memory
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-worlds-first-wait-free-software.html
 
But I think he is not right; read the following thoughts that I have just posted, which apply to waitfree and lockfree:
 
https://groups.google.com/forum/#!topic/comp.programming.threads/F_cF4ft1Qic
 
 
And read all my following thoughts to understand:
 
About Lock elision and Transactional memory..
 
I have just read the following:
 
Lock elision in the GNU C library
 
https://lwn.net/Articles/534758/
 
So it says the following:
 
"Lock elision uses the same programming model as normal locks, so it can be directly applied to existing programs. The programmer keeps using locks, but the locks are faster as they can use hardware transactional memory internally for more parallelism. Lock elision uses memory transactions as a fast path, while the slow path is still a normal lock. Deadlocks and other classic locking problems are still possible, because the transactions may fall back to a real lock at any time."
 
So i think this is not good, because one of the benefits of Transactional memory is that it solves the deadlock problem, but
with Lock elision you bring back the deadlock problem.
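
To make the quoted description concrete, here is a rough sketch of the elision pattern using Intel RTM intrinsics (illustrative only, not glibc's actual code; it needs TSX hardware and, with GCC/Clang, -mrtm):

#include <immintrin.h>   // _xbegin / _xend / _xabort / _XBEGIN_STARTED
#include <atomic>

struct elided_spinlock {
    std::atomic<int> locked{0};

    void lock() {
        // Fast path: run the critical section as a hardware transaction,
        // only *reading* the lock word so it stays uncontended.
        if (_xbegin() == _XBEGIN_STARTED) {
            if (locked.load(std::memory_order_relaxed) == 0)
                return;               // elided: we are inside a transaction
            _xabort(0xff);            // someone really holds the lock
        }
        // Slow path: take the real lock. Deadlock and the other classic
        // locking problems are possible again, as noted above.
        int expected = 0;
        while (!locked.compare_exchange_weak(expected, 1,
                                             std::memory_order_acquire))
            expected = 0;
    }

    void unlock() {
        if (locked.load(std::memory_order_relaxed) == 0)
            _xend();                              // commit the transaction
        else
            locked.store(0, std::memory_order_release);
    }
};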
 
More about Locks and Transactional memory..
 
I have just looked at the following webpage about understanding Transactional memory performance:
 
https://www.cs.utexas.edu/users/witchel/pubs/porter10ispass-tm-slides.pdf
 
And as you are noticing, it says that in practice Transactional memory
is worse than Locks at high contention, and it says that in practice Transactional memory is 40% worse than Locks at 100% contention.
 
This is why i have invented scalable Locks and scalable RWLocks, read
my following thoughts to notice it:
 
 
About beating Moore's Law with software..
 
bmoore has responded to me the following:
 
https://groups.google.com/forum/#!topic/soc.culture.china/Uu15FIknU0s
 
So as you are noticing he is asking me the following:
 
"Are you talking about beating Moore's Law with software?"
 
But I think that there are the following constraints:
 
"Modern programing environments contribute to the problem of software bloat by placing ease of development and portable code above speed or memory usage. While this is a sound business model in a commercial environment, it does not make sense where IT resources are constrained. Languages such as Java, C-Sharp, and Python have opted for code portability and software development speed above execution speed and memory usage, while modern data storage and transfer standards such as XML and JSON place flexibility and readability above efficiency."
 
Read the following:
 
https://smallwarsjournal.com/jrnl/art/overcoming-death-moores-law-role-software-advances-and-non-semiconductor-technologies
 
Also, there remains the following way to beat Moore's Law:
 
"Improved Algorithms
 
Hardware improvements mean little if software cannot effectively use the resources available to it. The Army should shape future software algorithms by funding basic research on improved software algorithms to meet its specific needs. The Army should also search for new algorithms and techniques which can be applied to meet specific needs and develop a learning culture within its software community to disseminate this information."
 
 
And about scalable algorithms, as you know I am a white arab
who is an inventor of many scalable algorithms and their implementations; read my following thoughts to notice it:
 
About my new invention that is a scalable algorithm..
 
I am a white arab, and I think I am smarter,
and I think I am like a genius, because I have again just invented
a new scalable algorithm, but first I will briefly talk about the following best scalable reader-writer lock inventions; the first one is the following:
 
Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks
 
https://www.usenix.org/system/files/conference/atc14/atc14-paper-liu.pdf
 
You will notice that its first weakness is that it is for the TSO hardware memory model, and its second weakness is that the writers' latency is very expensive when there are few readers.
 
And here is the other best scalable reader-writer lock invention of Facebook:
 
SharedMutex is a reader-writer lock. It is small, very fast, scalable
on multi-core
 
Read here:
 
https://github.com/facebook/folly/blob/master/folly/SharedMutex.h
 
 
But you will notice that the weakness of this scalable reader-writer lock is that the priority can only be configured as the following:
 
SharedMutexReadPriority gives priority to readers,
SharedMutexWritePriority gives priority to writers.
 
 
So the weakness of this scalable reader-writer lock is that
you can have starvation with it.
 
So this is why i have just invented a scalable algorithm that is
a scalable reader-writer lock that is better than the above and that is starvation-free and that is fair and that has a small writers latency.
 
So I think mine is the best, and I will sell many of my scalable algorithms to software companies such as Microsoft or Google or Embarcadero..
 
 
What is it to be an inventor of many scalable algorithms ?
 
The Holy Grail of parallel programming is to provide good speedup while
hiding or avoiding the pitfalls of concurrency. You have to understand it to be able to understand what I am doing: I am an inventor of
many scalable algorithms and their implementations. But how can we define the kind of inventor like me? I think there are the following kinds of inventors: the ones who are PhD researchers and inventors, like Albert Einstein, and the ones who are engineers and inventors, like Nikola Tesla. I think that I am the kind of inventor Nikola Tesla was; I am not a PhD researcher like Albert Einstein, I am like an engineer who invented many scalable algorithms and their implementations, so I am like the following inventor that we call Nikola Tesla:
 
https://en.wikipedia.org/wiki/Nikola_Tesla
 
But i think that both those PhD researchers that are inventors and those Engineers that are inventors are powerful.
 
You have to understand deeply what is to invent my scalable algorithms
and there implementations so that to understand that it is powerful,
i give you an example: So i have invented a scalable algorithm that is a scalable Mutex that is remarkable and that is the Holy Grail of scalable Locks, it has the following characteristics, read my following thoughts
to understand:
 
About fair and unfair locking..
 
I have just read the following lead engineer at Amazon:
 
Highly contended and fair locking in Java
 
https://brooker.co.za/blog/2012/09/10/locking.html
 
So as you are noticing that you can use unfair locking that can have starvation or fair locking that is slower than unfair locking.
 
I think that Microsoft synchronization objects like the Windows critical section uses unfair locking, but they still can have starvation.
 
But i think that this not the good way to do, because i am an inventor and i have invented a scalable Fast Mutex that is much more powerful , because with my Fast Mutex you are capable to tune the "fairness" of the lock, and my Fast Mutex is capable of more than that, read about it on my following thoughts:
 
More about research and software development..
 
I have just looked at the following new video:
 
Why is coding so hard...
 
https://www.youtube.com/watch?v=TAAXwrgd1U8
 
I am understanding this video, but i have to explain my work:
 
I am not like this techlead in the video above, because i am also an "inventor" that has invented many scalable algorithms and there implementions, i am also inventing effective abstractions, i give you an example:
 
Read the following of the senior research scientist that is called Dave Dice:
 
Preemption tolerant MCS locks
 
https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks
 
As you are noticing he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics, this is why i have just invented a new Fast Mutex that is adaptative and that is much much better and i think mine is the "best", and i think you will not find it anywhere, my new Fast Mutex has the following characteristics:
 
1- Starvation-free
2- Tunable fairness
3- It keeps efficiently and very low its cache coherence traffic
4- Very good fast path performance
5- And it has a good preemption tolerance.
6- It is faster than scalable MCS lock
7- It is not prone to convoying
 
This is how I am an "inventor", and I have also invented other scalable algorithms such as a scalable reference counting with efficient support for weak references, a fully scalable Threadpool, and a fully scalable FIFO queue, and I have
also invented other scalable algorithms and their implementations, and I think I will sell some of them to Microsoft or Google or Embarcadero or such software companies.
 
And here is my other previous new invention of a scalable algorithm:
 
I have just read the following PhD paper about the invention that we call counting networks and they are better than Software combining trees:
 
Counting Networks
 
http://people.csail.mit.edu/shanir/publications/AHS.pdf
 
And i have read the following PhD paper:
 
http://people.csail.mit.edu/shanir/publications/HLS.pdf
 
So as you are noticing they are saying in the conclusion that:
 
"Software combining trees and counting networks which are the only techniques we observed to be truly scalable"
 
But I just found that this counting networks algorithm is not generally scalable, and I have the logical proof here; this is why I have just come up with a new invention that enhances the counting networks algorithm to be generally scalable. And I think I will sell my new algorithm
of generally scalable counting networks to Microsoft or Google or Embarcadero or such software companies.
 
So you have to be careful with the actual counting networks algorithm that is not generally scalable.
 
My other new invention is my scalable reference counting and here it is:
 
https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references
 
And here is my just new invention of a scalable algorithm:
 
My Scalable RWLock that works across processes and threads was updated to version 4.62
 
Now i think it is working correctly in both Windows and Linux..
 
You can download it from my website here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
More about me as an inventor of many scalable algorithms..
 
I am a white arab and i think i am like a genius, because i have
aminer68@gmail.com: Jun 29 12:13PM -0700

Hello,
 
 
Here is my new invention..
 
 
I think a Seqlock is a high-performance but restricted use of software Transactional Memory.
 
So i have just read about Seqlocks here on wikipedia:
 
https://en.wikipedia.org/wiki/Seqlock
 
And it says about Seqlock:
 
"The drawback is that if there is too much write activity or the reader is too slow, they might livelock (and the readers may starve)."
 
 
I am a white arab, and i think i am smart, so i have just invented a variant of Seqlock that has no livelock (when also there is too much write activity or the reader is too slow) and it is starvation-free.
 
So i think my new invention that is a variant of Seqlock is powerful.
 
And now about Lockfree and Waitfree and Locks..
 
I have just read the following thoughts of a PhD researcher, and he says the following:
 
"Lock-based concurrency mechanism (locks) have several difficulties in practice:
1) if we use more than one lock, we're subject to having deadlock issues;
2) if we use priority locks, we're subject to having priority inversion issues;
3) if we use a lock without starvation-freedom guarantees (such as a spinlock), we're subject to starvation and live-lock;
4) using locks is prone to convoying effects;
5) mutual exclusion locks don't scale for read-only operations, it takes a reader-writer lock to have some scalability for read-only operations and even then, we either execute read-only operations or one write, but never both at the same time. Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"
 
Read more here:
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html
 
 
But i think that he is not right by saying the following:
 
"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees"
 
Because i am an inventor of many scalable algorithms and there implementations, and i have invented scalable and efficient
starvation-free reader-writer locks, read my following thoughts below
to notice it..
 
Also look at his following webpage:
 
OneFile - The world's first wait-free Software Transactional Memory
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-worlds-first-wait-free-software.html
 
But i think he is not right, because read the following thoughts that i have just posted that applies to waitfree and lockfree:
 
https://groups.google.com/forum/#!topic/comp.programming.threads/F_cF4ft1Qic
 
 
And read all my following thoughts to understand:
 
About Lock elision and Transactional memory..
 
I have just read the following:
 
Lock elision in the GNU C library
 
https://lwn.net/Articles/534758/
 
So it says the following:
 
"Lock elision uses the same programming model as normal locks, so it can be directly applied to existing programs. The programmer keeps using locks, but the locks are faster as they can use hardware transactional memory internally for more parallelism. Lock elision uses memory transactions as a fast path, while the slow path is still a normal lock. Deadlocks and other classic locking problems are still possible, because the transactions may fall back to a real lock at any time."
 
So i think this is not good, because one of the benefits of Transactional memory is that it solves the deadlock problem, but
with Lock elision you bring back the deadlock problem.
 
More about Locks and Transactional memory..
 
I have just looked at the following webpage about understanding Transactional memory performance:
 
https://www.cs.utexas.edu/users/witchel/pubs/porter10ispass-tm-slides.pdf
 
And as you are noticing, it says that in practice Transactional memory
is worse than Locks at high contention, and it says that in practice Transactional memory is 40% worse than Locks at 100% contention.
 
This is why i have invented scalable Locks and scalable RWLocks, read
my following thoughts to notice it:
 
 
About beating Moore's Law with software..
 
bmoore has responded to me the following:
 
https://groups.google.com/forum/#!topic/soc.culture.china/Uu15FIknU0s
 
So as you are noticing he is asking me the following:
 
"Are you talking about beating Moore's Law with software?"
 
But i think that there is some of the following constraints:
 
"Modern programing environments contribute to the problem of software bloat by placing ease of development and portable code above speed or memory usage. While this is a sound business model in a commercial environment, it does not make sense where IT resources are constrained. Languages such as Java, C-Sharp, and Python have opted for code portability and software development speed above execution speed and memory usage, while modern data storage and transfer standards such as XML and JSON place flexibility and readability above efficiency."
 
Read the following:
 
https://smallwarsjournal.com/jrnl/art/overcoming-death-moores-law-role-software-advances-and-non-semiconductor-technologies
 
Also there remains the following to also beat Moores's Law:
 
"Improved Algorithms
 
Hardware improvements mean little if software cannot effectively use the resources available to it. The Army should shape future software algorithms by funding basic research on improved software algorithms to meet its specific needs. The Army should also search for new algorithms and techniques which can be applied to meet specific needs and develop a learning culture within its software community to disseminate this information."
 
 
And about scalable algorithms, as you know i am a white arab
that is an inventor of many scalable algorithms and there implementations, read my following thoughts to notice it:
 
About my new invention that is a scalable algorithm..
 
I am a white arab, and i think i am more smart,
and i think i am like a genius, because i have again just invented
a new scalable algorithm, but i will briefly talk about the following best scalable reader-writer lock inventions, the first one is the following:
 
Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks
 
https://www.usenix.org/system/files/conference/atc14/atc14-paper-liu.pdf
 
You will notice that it has a first weakness that it is for TSO hardware memory model and the second weakness is that the writers latency is very expensive when there is few readers.
 
And here is the other best scalable reader-writer lock invention of Facebook:
 
SharedMutex is a reader-writer lock. It is small, very fast, scalable
on multi-core
 
Read here:
 
https://github.com/facebook/folly/blob/master/folly/SharedMutex.h
 
 
But you will notice that the weakness of this scalable reader-writer lock is that the priority can only be configured as the following:
 
SharedMutexReadPriority gives priority to readers,
SharedMutexWritePriority gives priority to writers.
 
 
So the weakness of this scalable reader-writer lock is that
you can have starvation with it.
 
So this is why i have just invented a scalable algorithm that is
a scalable reader-writer lock that is better than the above and that is starvation-free and that is fair and that has a small writers latency.
 
So I think mine is the best, and I will sell many of my scalable algorithms to software companies such as Microsoft or Google or Embarcadero..
 
 
What is it to be an inventor of many scalable algorithms ?
 
The Holy Grail of parallel programming is to provide good speedup while
hiding or avoiding the pitfalls of concurrency. You have to understand it to be able to understand what i am doing, i am an inventor of
many scalable algorithms and there implementations, but how can we define the kind of inventor like me? i think there is the following kinds of inventors, the ones that are PhD researchers and inventors like Albert Einstein, and the ones that are engineers and inventors like Nikola Tesla, and i think that i am of the kind of inventor of Nikola Tesla, i am not a PhD researcher like Albert Einstein, i am like an engineer who invented many scalable algorithms and there implementations, so i am like the following inventor that we call Nikola Tesla:
 
https://en.wikipedia.org/wiki/Nikola_Tesla
 
But i think that both those PhD researchers that are inventors and those Engineers that are inventors are powerful.
 
You have to understand deeply what is to invent my scalable algorithms
and there implementations so that to understand that it is powerful,
i give you an example: So i have invented a scalable algorithm that is a scalable Mutex that is remarkable and that is the Holy Grail of scalable Locks, it has the following characteristics, read my following thoughts
to understand:
 
About fair and unfair locking..
 
I have just read the following lead engineer at Amazon:
 
Highly contended and fair locking in Java
 
https://brooker.co.za/blog/2012/09/10/locking.html
 
So as you are noticing that you can use unfair locking that can have starvation or fair locking that is slower than unfair locking.
 
I think that Microsoft synchronization objects like the Windows critical section uses unfair locking, but they still can have starvation.
 
But i think that this not the good way to do, because i am an inventor and i have invented a scalable Fast Mutex that is much more powerful , because with my Fast Mutex you are capable to tune the "fairness" of the lock, and my Fast Mutex is capable of more than that, read about it on my following thoughts:
 
More about research and software development..
 
I have just looked at the following new video:
 
Why is coding so hard...
 
https://www.youtube.com/watch?v=TAAXwrgd1U8
 
I am understanding this video, but i have to explain my work:
 
I am not like this techlead in the video above, because i am also an "inventor" that has invented many scalable algorithms and there implementions, i am also inventing effective abstractions, i give you an example:
 
Read the following of the senior research scientist that is called Dave Dice:
 
Preemption tolerant MCS locks
 
https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks
 
As you are noticing he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics, this is why i have just invented a new Fast Mutex that is adaptative and that is much much better and i think mine is the "best", and i think you will not find it anywhere, my new Fast Mutex has the following characteristics:
 
1- Starvation-free
2- Tunable fairness
3- It keeps efficiently and very low its cache coherence traffic
4- Very good fast path performance
5- And it has a good preemption tolerance.
6- It is faster than scalable MCS lock
 
this is how i am an "inventor", and i have also invented other scalable algorithms such as a scalable reference counting with efficient support for weak references, and i have invented a fully scalable Threadpool, and i have also invented a Fully scalable FIFO queue, and i have
also invented other scalable algorithms and there implementations, and i think i will sell some of them to Microsoft or to Google or Embarcadero or such software companies.
 
And here is my other previous new invention of a scalable algorithm:
 
I have just read the following PhD paper about the invention that we call counting networks and they are better than Software combining trees:
 
Counting Networks
 
http://people.csail.mit.edu/shanir/publications/AHS.pdf
 
And i have read the following PhD paper:
 
http://people.csail.mit.edu/shanir/publications/HLS.pdf
 
So as you are noticing they are saying in the conclusion that:
 
"Software combining trees and counting networks which are the only techniques we observed to be truly scalable"
 
But i just found that this counting networks algorithm is not generally scalable, and i have the logical proof here, this is why i have just come with a new invention that enhance the counting networks algorithm to be generally scalable. And i think i will sell my new algorithm
of a generally scalable counting networks to Microsoft or Google or Embarcadero or such software companies.
 
So you have to be careful with the actual counting networks algorithm that is not generally scalable.
 
My other new invention is my scalable reference counting and here it is:
 
https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references
 
And here is my just new invention of a scalable algorithm:
 
My Scalable RWLock that works across processes and threads was updated to version 4.62
 
Now i think it is working correctly in both Windows and Linux..
 
You can download it from my website here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
More about me as an inventor of many scalable algorithms..
 
I am a white arab and i think i am like a genius, because i have invented many scalable algorithms and there implementations, and look for example at my just new invention of a scalable algorithm here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
As you have noticed, you have to be like a genius to be able to invent
my above scalable algorithm of a scalable RWLock, because it has the following characteristics:
 
1- It is Scalable
2- It is Starvation-free
3- It is fair
4- It can be used across processes and threads
5- It can be used as a scalable Lock across processes and threads
by using my scalable AMLock that is FIFO fair on the writers side, or it can be
used as a scalable RWLock.
 
I am using my scalable Lock that is FIFO fair that is called scalable AMLock on the writers side.
 
Here is why scalable Locks are really important:
 
https://queue.acm.org/detail.cfm?id=2698990
 
So all in all it is a really good invention of mine.
 
Read my previous thoughts:
 
Here is how to use my new invention that is my scalable RWLock
across processes:
 
Just create a scalable rwlock object by giving it a name in one process, calling the constructor like this:
 
scalable_rwlock.create('amine');
 
 
And you can use the scalable rwlock object from another process by calling the constructor by using the name like this:
 
scalable_rwlock.create('amine');
 
 
So as you are noticing i have abstracted it efficiently..
 
 
Read the rest of my previous thoughts:
 
My new invention of a Scalable RWLock that works across processes and threads is here, and now it works on both Windows and Linux..
 
Please download my source code and take a look at how i am making it work across processes by using FNV1a hash on both process ID and thread ID, FNV1a has a good dispersion, and FNV1a hash permits also my RWLock to be scalable.
 
 
You can download it from my website here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
Description:
 
This is my invention of a fast, and scalable and starvation-free and fair and lightweight Multiple-Readers-Exclusive-Writer Lock called LW_RWLockX, it works across processes and threads.
 
The parameters of the constructor are: the first parameter is the name of the scalable RWLock to be used across processes; if the name is empty, it will only be used across threads. The second parameter is the size of the array of readers: if the size of the array is equal to the number of parallel readers, it will be scalable, but if the number of readers is greater than the size of the array, you will start to have contention. The third parameter is the size of the array of my scalable Lock that is called AMLock; the number of threads can go beyond the size of the array of the scalable AMLock. Please look at the source code of my scalable algorithms to understand.
 
I have also used my following implementation of FNV1a hash function to make my new variants of RWLocks scalable (since FNV1a is a hash algorithm that has good dispersion):
 
function FNV1aHash(key:int64):
aminer68@gmail.com: Jun 29 10:57AM -0700

Hello,
 
 
I am a white arab, and now about Lockfree and Waitfree and Locks..
 
I have just read the following thoughts of a PhD researcher, and he says the following:
 
"Lock-based concurrency mechanism (locks) have several difficulties in practice:
1) if we use more than one lock, we're subject to having deadlock issues;
2) if we use priority locks, we're subject to having priority inversion issues;
3) if we use a lock without starvation-freedom guarantees (such as a spinlock), we're subject to starvation and live-lock;
4) using locks is prone to convoying effects;
5) mutual exclusion locks don't scale for read-only operations, it takes a reader-writer lock to have some scalability for read-only operations and even then, we either execute read-only operations or one write, but never both at the same time. Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"
 
Read more here:
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html
 
 
But i think that he is not right by saying the following:
 
"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees"
 
Because i am an inventor of many scalable algorithms and there implementations, and i have invented scalable and efficient
starvation-free reader-writer locks, read my following thoughts below
to notice it..
 
Also look at his following webpage:
 
OneFile - The world's first wait-free Software Transactional Memory
 
http://concurrencyfreaks.blogspot.com/2019/04/onefile-worlds-first-wait-free-software.html
 
But i think he is not right, because read the following thoughts that i have just posted that applies to waitfree and lockfree:
 
https://groups.google.com/forum/#!topic/comp.programming.threads/F_cF4ft1Qic
 
 
And read all my following thoughts to understand:
 
About Lock elision and Transactional memory..
 
I have just read the following:
 
Lock elision in the GNU C library
 
https://lwn.net/Articles/534758/
 
So it says the following:
 
"Lock elision uses the same programming model as normal locks, so it can be directly applied to existing programs. The programmer keeps using locks, but the locks are faster as they can use hardware transactional memory internally for more parallelism. Lock elision uses memory transactions as a fast path, while the slow path is still a normal lock. Deadlocks and other classic locking problems are still possible, because the transactions may fall back to a real lock at any time."
 
So i think this is not good, because one of the benefits of Transactional memory is that it solves the deadlock problem, but
with Lock elision you bring back the deadlock problem.
 
More about Locks and Transactional memory..
 
I have just looked at the following webpage about understanding Transactional memory performance:
 
https://www.cs.utexas.edu/users/witchel/pubs/porter10ispass-tm-slides.pdf
 
And as you are noticing, it says that in practice Transactional memory
is worse than Locks at high contention, and it says that in practice Transactional memory is 40% worse than Locks at 100% contention.
 
This is why i have invented scalable Locks and scalable RWLocks, read
my following thoughts to notice it:
 
 
About beating Moore's Law with software..
 
bmoore has responded to me the following:
 
https://groups.google.com/forum/#!topic/soc.culture.china/Uu15FIknU0s
 
So as you are noticing he is asking me the following:
 
"Are you talking about beating Moore's Law with software?"
 
But i think that there is some of the following constraints:
 
"Modern programing environments contribute to the problem of software bloat by placing ease of development and portable code above speed or memory usage. While this is a sound business model in a commercial environment, it does not make sense where IT resources are constrained. Languages such as Java, C-Sharp, and Python have opted for code portability and software development speed above execution speed and memory usage, while modern data storage and transfer standards such as XML and JSON place flexibility and readability above efficiency."
 
Read the following:
 
https://smallwarsjournal.com/jrnl/art/overcoming-death-moores-law-role-software-advances-and-non-semiconductor-technologies
 
Also there remains the following to also beat Moores's Law:
 
"Improved Algorithms
 
Hardware improvements mean little if software cannot effectively use the resources available to it. The Army should shape future software algorithms by funding basic research on improved software algorithms to meet its specific needs. The Army should also search for new algorithms and techniques which can be applied to meet specific needs and develop a learning culture within its software community to disseminate this information."
 
 
And about scalable algorithms, as you know i am a white arab
that is an inventor of many scalable algorithms and there implementations, read my following thoughts to notice it:
 
About my new invention that is a scalable algorithm..
 
I am a white arab, and i think i am more smart,
and i think i am like a genius, because i have again just invented
a new scalable algorithm, but i will briefly talk about the following best scalable reader-writer lock inventions, the first one is the following:
 
Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks
 
https://www.usenix.org/system/files/conference/atc14/atc14-paper-liu.pdf
 
You will notice that it has a first weakness that it is for TSO hardware memory model and the second weakness is that the writers latency is very expensive when there is few readers.
 
And here is the other best scalable reader-writer lock invention of Facebook:
 
SharedMutex is a reader-writer lock. It is small, very fast, scalable
on multi-core
 
Read here:
 
https://github.com/facebook/folly/blob/master/folly/SharedMutex.h
 
 
But you will notice that the weakness of this scalable reader-writer lock is that the priority can only be configured as the following:
 
SharedMutexReadPriority gives priority to readers,
SharedMutexWritePriority gives priority to writers.
 
 
So the weakness of this scalable reader-writer lock is that
you can have starvation with it.
 
So this is why i have just invented a scalable algorithm that is
a scalable reader-writer lock that is better than the above and that is starvation-free and that is fair and that has a small writers latency.
 
So I think mine is the best, and I will sell many of my scalable algorithms to software companies such as Microsoft or Google or Embarcadero..
 
 
What is it to be an inventor of many scalable algorithms ?
 
The Holy Grail of parallel programming is to provide good speedup while
hiding or avoiding the pitfalls of concurrency. You have to understand it to be able to understand what i am doing, i am an inventor of
many scalable algorithms and there implementations, but how can we define the kind of inventor like me? i think there is the following kinds of inventors, the ones that are PhD researchers and inventors like Albert Einstein, and the ones that are engineers and inventors like Nikola Tesla, and i think that i am of the kind of inventor of Nikola Tesla, i am not a PhD researcher like Albert Einstein, i am like an engineer who invented many scalable algorithms and there implementations, so i am like the following inventor that we call Nikola Tesla:
 
https://en.wikipedia.org/wiki/Nikola_Tesla
 
But i think that both those PhD researchers that are inventors and those Engineers that are inventors are powerful.
 
You have to understand deeply what is to invent my scalable algorithms
and there implementations so that to understand that it is powerful,
i give you an example: So i have invented a scalable algorithm that is a scalable Mutex that is remarkable and that is the Holy Grail of scalable Locks, it has the following characteristics, read my following thoughts
to understand:
 
About fair and unfair locking..
 
I have just read the following lead engineer at Amazon:
 
Highly contended and fair locking in Java
 
https://brooker.co.za/blog/2012/09/10/locking.html
 
So as you are noticing that you can use unfair locking that can have starvation or fair locking that is slower than unfair locking.
 
I think that Microsoft synchronization objects like the Windows critical section uses unfair locking, but they still can have starvation.
 
But i think that this not the good way to do, because i am an inventor and i have invented a scalable Fast Mutex that is much more powerful , because with my Fast Mutex you are capable to tune the "fairness" of the lock, and my Fast Mutex is capable of more than that, read about it on my following thoughts:
 
More about research and software development..
 
I have just looked at the following new video:
 
Why is coding so hard...
 
https://www.youtube.com/watch?v=TAAXwrgd1U8
 
I am understanding this video, but i have to explain my work:
 
I am not like this techlead in the video above, because i am also an "inventor" that has invented many scalable algorithms and there implementions, i am also inventing effective abstractions, i give you an example:
 
Read the following of the senior research scientist that is called Dave Dice:
 
Preemption tolerant MCS locks
 
https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks
 
As you are noticing he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics, this is why i have just invented a new Fast Mutex that is adaptative and that is much much better and i think mine is the "best", and i think you will not find it anywhere, my new Fast Mutex has the following characteristics:
 
1- Starvation-free
2- Tunable fairness
3- It keeps efficiently and very low its cache coherence traffic
4- Very good fast path performance
5- And it has a good preemption tolerance.
6- It is faster than scalable MCS lock
 
this is how i am an "inventor", and i have also invented other scalable algorithms such as a scalable reference counting with efficient support for weak references, and i have invented a fully scalable Threadpool, and i have also invented a Fully scalable FIFO queue, and i have
also invented other scalable algorithms and there implementations, and i think i will sell some of them to Microsoft or to Google or Embarcadero or such software companies.
 
And here is my other previous new invention of a scalable algorithm:
 
I have just read the following PhD paper about the invention that we call counting networks and they are better than Software combining trees:
 
Counting Networks
 
http://people.csail.mit.edu/shanir/publications/AHS.pdf
 
And i have read the following PhD paper:
 
http://people.csail.mit.edu/shanir/publications/HLS.pdf
 
So as you are noticing they are saying in the conclusion that:
 
"Software combining trees and counting networks which are the only techniques we observed to be truly scalable"
 
But i just found that this counting networks algorithm is not generally scalable, and i have the logical proof here, this is why i have just come with a new invention that enhance the counting networks algorithm to be generally scalable. And i think i will sell my new algorithm
of a generally scalable counting networks to Microsoft or Google or Embarcadero or such software companies.
 
So you have to be careful with the actual counting networks algorithm that is not generally scalable.
 
My other new invention is my scalable reference counting and here it is:
 
https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references
 
And here is my just new invention of a scalable algorithm:
 
My Scalable RWLock that works across processes and threads was updated to version 4.62
 
Now i think it is working correctly in both Windows and Linux..
 
You can download it from my website here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
More about me as an inventor of many scalable algorithms..
 
I am a white arab and i think i am like a genius, because i have invented many scalable algorithms and there implementations, and look for example at my just new invention of a scalable algorithm here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
As you have noticed, you have to be like a genius to be able to invent
my above scalable algorithm of a scalable RWLock, because it has the following characteristics:
 
1- It is Scalable
2- It is Starvation-free
3- It is fair
4- It can be used across processes and threads
5- It can be used as a scalable Lock across processes and threads
by using my scalable AMLock that is FIFO fair on the writers side, or it can be
used as a scalable RWLock.
 
I am using my scalable Lock that is FIFO fair that is called scalable AMLock on the writers side.
 
Here is why scalable Locks are really important:
 
https://queue.acm.org/detail.cfm?id=2698990
 
So all in all it is a really good invention of mine.
 
Read my previous thoughts:
 
Here is how to use my new invention that is my scalable RWLock
across processes:
 
Just create a scalable rwlock object by giving it a name in one process, calling the constructor like this:
 
scalable_rwlock.create('amine');
 
 
And you can use the scalable rwlock object from another process by calling the constructor by using the name like this:
 
scalable_rwlock.create('amine');
 
 
So as you are noticing i have abstracted it efficiently..
 
 
Read the rest of my previous thoughts:
 
My new invention of a Scalable RWLock that works across processes and threads is here, and now it works on both Windows and Linux..
 
Please download my source code and take a look at how i am making it work across processes by using FNV1a hash on both process ID and thread ID, FNV1a has a good dispersion, and FNV1a hash permits also my RWLock to be scalable.
 
 
You can download it from my website here:
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
Description:
 
This is my invention of a fast, and scalable and starvation-free and fair and lightweight Multiple-Readers-Exclusive-Writer Lock called LW_RWLockX, it works across processes and threads.
 
The parameters of the constructor are: the first parameter is the name of the scalable RWLock to be used across processes; if the name is empty, it will only be used across threads. The second parameter is the size of the array of readers: if the size of the array is equal to the number of parallel readers, it will be scalable, but if the number of readers is greater than the size of the array, you will start to have contention. The third parameter is the size of the array of my scalable Lock that is called AMLock; the number of threads can go beyond the size of the array of the scalable AMLock. Please look at the source code of my scalable algorithms to understand.
 
I have also used my following implementation of FNV1a hash function to make my new variants of RWLocks scalable (since FNV1a is a hash algorithm that has good dispersion):
 
function FNV1aHash(key: int64): UInt64;
var
  i: Integer;
  key1: uint64;
const
  // 64-bit FNV-1a constants: offset basis = $CBF29CE484222325,
  // prime = $100000001B3
  FNV_offset_basis: UInt64 = 14695981039346656037;
  FNV_prime: UInt64 = 1099511628211;
begin
  // FNV-1a: walk the 8 bytes of the key, XOR each byte into the
  // running hash, then multiply by the FNV prime.
  Result := FNV_offset_basis;
  for i := 1 to 8 do
  begin
    key1 := (key shr ((i-1)*8)) and $00000000000000ff; // i-th byte of the key
    Result := (Result xor key1) * FNV_prime;
  end;
end;
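
As a side note, the general pattern that makes the reader side scale is to give each reader its own slot, chosen by hashing its thread id, so readers on different slots never write to the same cache line. Here is a rough, generic sketch of just that slot-selection idea (my own code, not LW_RWLockX; std::hash stands in for FNV1a, and the writer side, which must scan all slots, is omitted):

#include <atomic>
#include <thread>
#include <functional>
#include <cstddef>

constexpr std::size_t kSlots = 64;        // "size of the array of the readers"

struct alignas(64) ReaderSlot {           // one cache line per slot
    std::atomic<int> count{0};
};

ReaderSlot g_readers[kSlots];

ReaderSlot& my_slot() {
    // Hash collisions put two readers on the same slot, which is where
    // the contention mentioned in the description comes from.
    std::size_t h = std::hash<std::thread::id>{}(std::this_thread::get_id());
    return g_readers[h % kSlots];
}

void reader_enter() { my_slot().count.fetch_add(1, std::memory_order_acquire); }
void reader_exit()  { my_slot().count.fetch_sub(1, std::memory_order_release); }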
 
- Platform: Windows, Unix and Linux on x86
 
Required FPC switches: -O3 -Sd
 
-Sd for delphi mode....
 
Required Delphi switches: -$H+ -DDelphi
 
For Delphi XE-XE7 and Delphi tokyo use the -DXE switch
 
You can configure it as follows from inside defines.inc file:
 
{$DEFINE CPU32} and {$DEFINE Windows32} for 32
aminer68@gmail.com: Jun 29 10:45AM -0700

Hello,
 
 
You cannot scale creativity..
 
I am a white arab and I think I am smart, and I invite you to read the
following thoughts about "You cannot scale creativity". It is related to my following powerful product that I have designed and implemented, because, as you will read below, "The solution is to lessen the need for coordination: have different people work on different things, use smaller teams, and employ fewer managers." Here is my powerful product (which can also be applied to organizations):
 
https://sites.google.com/site/scalable68/universal-scalability-law-for-delphi-and-freepascal
 
Please read the following about Applying the Universal Scalability Law to organisations:
 
https://blog.acolyer.org/2015/04/29/applying-the-universal-scalability-law-to-organisations/
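
For reference, the law being applied in that post is Gunther's Universal Scalability Law: the relative capacity C(N) of N workers given a contention fraction alpha and a coherency (crosstalk) cost beta. A tiny sketch, with made-up parameter values purely for illustration:

#include <cstdio>

// C(N) = N / (1 + alpha*(N-1) + beta*N*(N-1))
double usl_capacity(double n, double alpha, double beta) {
    return n / (1.0 + alpha * (n - 1.0) + beta * n * (n - 1.0));
}

int main() {
    const int ns[] = {1, 2, 4, 8, 16, 32, 64};
    for (int n : ns)
        std::printf("N=%2d  C(N)=%.2f\n", n, usl_capacity(n, 0.05, 0.001));
}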
 
 
So read the following to understand:
 
 
Read the following from the following PhD computer scientist:
 
https://lemire.me/blog/about-me/
 
 
You cannot scale creativity
 
As a teenager, I was genuinely impressed by communism. The way I saw it, the West could never compete. The USSR offered a centralized and efficient system that could eliminate waste and ensure optimal efficiency. If a scientific problem appeared, the USSR could throw 10, 100 or 1000 scientists at it without having to cajole anyone.
 
I could not quite understand why the communist countries always appeared to be technologically so backward. Weren't their coordinated engineers and scientists out-innovating our scientists and engineers?
 
I was making a reasoning error. I had misunderstood the concept of economy of scale best exemplified by Ford. To me, communism was more or less a massive application of the Fordian approach. It ought to make everything better and cheaper!
 
The industrial revolution was made possible by economies of scale: it costs far less per car to produce 10,000 cars than to make just one. Bill Gates became the richest man in the world because software offers an optimal economy of scale: it costs the same to produce one copy of Windows or 100 million copies.
 
Trade and employment can also scale: the transaction costs go down if you sell 10,000 objects a day, or hire 10,000 people a year. Accordingly, people living in cities are typically better off and more productive.
 
This has led to the belief that if you regroup more people and organize them, you get better productivity. I want to stress how different this statement is from the previous observations. We can scale products, services, trade and interaction. Scaling comes from the fact that we need to reproduce many copies of essentially the same object or service. But merely regrouping people only involves scaling in accounting and human resources: if those are the costs holding you back, you are probably not doing anything important. Getting ten people together to produce much more than ten times the output is only possible if you are producing a uniform product or service.
 
Yet, somehow, people conclude that regrouping people and getting them to work on a common goal will, by itself, improve productivity. Fred Brooks put a dent in this theory with Brooks's law:
 
Adding manpower to a late software project makes it later.
 
While it is true that almost all my work is collaborative, I have consistently found it counterproductive to work in large groups. Of course, as an introvert, this goes against all my instincts. But I also fail to see the productivity gains in practice, whereas I do notice the more frequent meetings.
 
Abramo et al. (2012) looked seriously at this issue and found that you get no more than linear scaling. That is, a group of two researchers will produce twice as much as one researcher. Period. There is no economy of scale when coordinating human brains. Their finding contradicts decades of science policy where we have tried to organize people into larger and better coordinated groups (a concept eerily reminiscent of communism).
 
We can make an analogy with computers. Your quad-core processor will not run Microsoft Word four times as fast. It probably won't even run it twice as fast. In fact, poorly written software may even run slower when there is more than one core. Coordination is expensive.
 
The solution is to lessen the need for coordination: have different people work on different things, use smaller teams, and employ fewer managers.
 
Read more here:
 
https://lemire.me/blog/2012/10/15/you-cannot-scale-creativity/
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Jun 29 10:16AM -0700

Hello,
 
 
More of my thoughts on parallel computing and computing..
 
I am an inventor of many scalable algorithms and their implementations, and I am a more serious software developer specialized in parallel computing and synchronization algorithms, so I invite you to read my following thoughts, in the web links below, about parallel computing and computing:
 
https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/71464/about-turing-completeness-and-parallel-programming
 
https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/71555/here-is-more-of-my-thoughts-on-programming-languages
 
https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/72018/about-the-threadpool
 
https://community.idera.com/developer-tools/general-development/f/getit-and-third-party/70188/today-i-will-talk-about-data-dependency-and-parallel-loops
 
 
And here are some of my inventions that I have put on my website:
 
 
https://sites.google.com/site/scalable68/scalable-mlock
 
https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references
 
https://sites.google.com/site/scalable68/scalable-rwlock
 
https://sites.google.com/site/scalable68/scalable-rwlock-that-works-accross-processes-and-threads
 
https://groups.google.com/forum/#!topic/comp.programming.threads/VaOo1WVACgs
 
https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well
 
 
Thank you,
Amine Moulay Ramdane.

Monday, June 29, 2020

Digest for comp.lang.c++@googlegroups.com - 11 updates in 3 topics

raltbos@xs4all.nl (Richard Bos): Jun 29 09:58PM

> @Bqv~~o?gx|\<wI!SXOV:^]q+sIjeyi(<H{K-KWm"x/t
> a?<lk?hp?+xtTB,JdG\pU9O'vvFw=d(*7P?i4cW{AYYB
> 1,DCxI8=kU:db\F;*r^9!6hpo,"0#Ja(XK&|+4G(X@N#
 
You can't fool me. That's not Befunge. This is:
 
"'.niagA kcaB d"v> v
v$<v"r There an"< >$v
>#| :,: |#<
@>"o ,egnufeB'"^@
 
Richard
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Jun 29 09:49PM +0100

Hi,
 
I have been trying to prove to myself that, under the C++ standard,
one-dimensional arrays may not have observable end padding. It is clear
that the elements of an array must be contiguous, in the sense that
for an array a of T with more than one element the address of a[1]
must be the address of a[0] plus sizeof(T) bytes, and so on, but I
have not yet managed to find a
provision which unambiguously forbids array end padding.
 
In other words I have not found a requirement that, for any type T,
sizeof(T[sz]) must be the same as sizeof(T)*sz.
 
Where you might care about this is in relation to matrices (and other
multi-dimensional arrays), where the issue translates into the extent to
which you are allowed to use pointer arithmetic to access individual
elements in matrices instead of using the subscript operators.
 
I have seen lots of code which assumes contiguity over the whole of a
multi-dimensional array, and I am pretty certain this requirement
exists, but can anyone point me to where I can find it?
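The property in question can at least be spot-checked at compile time on a particular implementation. A minimal C++17 check; of course this only shows what one compiler does, not what the standard requires:

#include <cstddef>

// Does sizeof(T[N]) == N * sizeof(T), i.e. no end padding inside the
// array type, on this implementation?
template<class T, std::size_t N>
constexpr bool no_array_end_padding = (sizeof(T[N]) == N * sizeof(T));

static_assert(no_array_end_padding<char, 5>);
static_assert(no_array_end_padding<int, 7>);

struct S { char c; double d; };    // a struct with internal padding
static_assert(no_array_end_padding<S, 3>);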
Vir Campestris <vir.campestris@invalid.invalid>: Jun 29 09:53PM +0100

On 29/06/2020 21:49, Chris Vine wrote:
 
> I have seen lots of code which assumes contiguity over the whole of a
> multi-dimensional array, and I am pretty certain this requirement
> exists, but can anyone point me to where I can find it?
 
I think if there was such a requirement every compiler I've ever used
would disobey it.
 
Just the way all the libraries pad malloc (etc) requests up to some
suitable block size.
 
If I have an on-stack int64_t, then an array char[5], then another
int64_t I'd be astonished to find there wasn't a gap at the end of the
array.
 
Andy
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 29 01:55PM -0700

On 6/29/2020 1:49 PM, Chris Vine wrote:
 
> I have seen lots of code which assumes contiguity over the whole of a
> multi-dimensional array, and I am pretty certain this requirement
> exists, but can anyone point me to where I can find it?
 
Fwiw, I always liked using a one dimensional array, then partitioning it
using some math. Think of representing a 2d plane in a 1d array.
 
Using the array form of new seems to create an unseen header.
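A minimal C++ sketch of that approach: a 2-D grid stored in one contiguous buffer with row-major index arithmetic. The names are illustrative, and using std::vector instead of the array form of new sidesteps the hidden bookkeeping header mentioned above:

#include <cstddef>
#include <vector>

template<class T>
class grid2d {
    std::size_t width_ = 0, height_ = 0;
    std::vector<T> cells_;                 // one contiguous allocation
public:
    grid2d(std::size_t w, std::size_t h)
        : width_(w), height_(h), cells_(w * h) {}

    // Row-major mapping: element (x, y) lives at index y * width_ + x.
    T&       operator()(std::size_t x, std::size_t y)       { return cells_[y * width_ + x]; }
    const T& operator()(std::size_t x, std::size_t y) const { return cells_[y * width_ + x]; }

    std::size_t width()  const { return width_;  }
    std::size_t height() const { return height_; }
};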
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Jun 29 10:04PM +0100

On Mon, 29 Jun 2020 21:53:16 +0100
 
> If I have an on-stack int64_t, then an array char[5], then another
> int64_t I'd be astonished to find there wasn't a gap at the end of the
> array.
 
Yes, I formulated my question poorly, as that was not my point about
"observable padding". My "observable" was in the context of
multi-dimensional arrays and the size of component sub-arrays. I would
bet that sizeof(char[5]) is indeed 5 with your compiler, and I would
like to prove that that is required.
 
Can you point me to something that says that, for any type T,
sizeof(T[sz]) must be the same as sizeof(T)*sz?
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 29 11:08PM +0200

On 29.06.2020 22:53, Vir Campestris wrote:
>> exists, but can anyone point me to where I can find it?
 
> I think if there was such a requirement every compiler I've ever used
> would disobey it.
 
Surely you mean the opposite?
 
 
 
> If I have an on-stack int64_t, then an array char[5], then another
> int64_t I'd be astonished to find there wasn't a gap at the end of the
> array.
 
That doesn't appear to be what Chris is talking about.
 
As I see it he's talking about
 
 
>> a requirement that, for any type T, sizeof(T[sz]) must be the same as
>> sizeof(T)*sz
 
In C++17 this is specified by §8.3.3/2,
 
❝When applied to an array, the result is the total number of bytes in
the array. This implies that the size of an array of /n/ elements is /n/
times the size of an element.❞
 
... where the ❝This implies❞ is normative text, not a note.
 
 
- Alf
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Jun 29 10:15PM +0100

On Mon, 29 Jun 2020 23:08:13 +0200
> the array. This implies that the size of an array of /n/ elements is /n/
> times the size of an element.❞
 
> ... where the ❝This implies❞ is normative text, not a note.
 
That is indeed what I was looking for. Many thanks.
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Jun 29 02:18PM -0700


> If I have an on-stack int64_t, then an array char[5], then another
> int64_t I'd be astonished to find there wasn't a gap at the end of the
> array.
 
The question isn't about padding between distinct objects.
 
Given:
 
int64_t a;
char b[5];
int64_t c;
 
the standard says nothing about the order in which they're allocated.
For certain orders, some padding may be required to satisfy alignment
requirements. But even if they're allocated in the order in which
they're defined, any padding following b is not part of b; sizeof
(b) will (presumably) be 5, and the following 3(?) bytes will not
be part of any named object. (Or the compiler might allocate some
other object in that space.)
 
Every compiler I've seen has sizeof (b) == 5 (disclaimer: I haven't
really tested this). The question is whether the standard requires
this, or allows padding past 5 bytes to be *part of* the array object.
 
In a very quick look at (a draft of) the C++17 standard, I haven't found
an explicit requirement. Either such a requirement is there and I
haven't found it (always possible), or the authors of the standard felt
it was too obvious to state. (Or they intended to allow padding at the
end of an array, but I personally don't think that's likely.)
 
Note that we can't necessarily use multidimensional arrays to argue that
padding is forbidden. Given:
 
int arr[5][5];
 
the expression arr[0][7] has undefined behavior; it's not necessarily
equivalent to arr[1][2]. (Typical compilers may generate the same code
for both expressions, which is of course valid for undefined behavior.)
 
However, arr[0][5] is a valid pointer value (dereferencing it has
undefined behavior), so another way to state the question is whether
(&arr[0][5] == &arr[1][0]) must be true. Again, it typically is, but
that doesn't prove that the standard requires it.
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
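That last comparison is easy to try on a particular implementation. Written without forming arr[0][5] at all, it typically prints true, which, as noted above, demonstrates common practice rather than a requirement of the standard:

#include <iostream>

int main()
{
    int arr[5][5] = {};
    // arr[0] + 5 is a valid one-past-the-end pointer into arr[0];
    // the question is whether it must equal &arr[1][0].
    std::cout << std::boolalpha
              << (arr[0] + 5 == &arr[1][0]) << '\n';
}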
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Jun 29 10:38PM +0100

On Mon, 29 Jun 2020 13:55:46 -0700
 
> Fwiw, I always liked using a one dimensional array, then partitioning it
> using some math. Think of representing a 2d plane in a 1d array.
 
> Using the array form of new seems to create an unseen header.
 
Yes, when using languages in which arrays are boxed types where you are
provided with what amounts to a pointer to garbage collected memory,
and the element type in use is unboxed, such as an integer, then for
multi-dimensional arrays you are almost driven to constructing a single
array and layering arithmetic on the indices over it in order to secure
contiguity and localization and so encourage vectorized instructions.
 
Much as I appreciate garbage collection, this is the kind of low level
stuff where C and C++ and their array syntax come out well.
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Jun 29 02:41PM -0700

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com> writes:
[...]
>> On 29/06/2020 21:49, Chris Vine wrote:
[...]
> the array. This implies that the size of an array of /n/ elements is
> /n/ times the size of an element.❞
 
> ... where the ❝This implies❞ is normative text, not a note.
 
Yes, I had somehow managed to miss that.
 
I'm still curious *how* it implies it. The "This implies that ..."
wording seems to indicate that it's a consequence of some normative
requirement elsewhere in the standard. (Nevertheless, I accept
that it's required.)
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 29 10:02PM +0200

On 29.06.2020 19:22, Paavo Helde wrote:
> }
 
> Now the Func() call gets officially discarded and Func can legally
> remain undefined.
 
Gah, where's the upvote button on this Usenet thing?
 
- Alf
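The quoted context is trimmed in the digest, but the point is presumably about a call sitting in a discarded if constexpr branch inside a template. A minimal sketch of that pattern, with Func purely illustrative:

void Func();   // declared but never defined

template<class T>
void g(T)
{
    // The condition is value-dependent, so for types where it is false
    // the branch is a discarded statement and is never instantiated:
    // Func() is not odr-used and needs no definition.
    if constexpr (sizeof(T) == 0)
        Func();
}

int main()
{
    g(42);     // links even though Func has no definition
}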