Tuesday, December 10, 2019

Digest for comp.programming.threads@googlegroups.com - 7 updates in 7 topics

aminer68@gmail.com: Dec 09 04:46PM -0800

Hello,
 
 
About Hardware Transactional Memory and my invention, my powerful Fast Mutex:
 
 
"As someone who has used TSX to optimize synchronization primitives, you can expect to see a ~15-20% performance increase, if (big if) your program is heavy on disjoint data access, i.e. a lock is needed for correctness, but conflicts are rare in practice. If you have a lot of threads frequently writing the same cache lines, you are probably going to see worse performance with TSX as opposed to traditional locking. It helps to think about TSX as transparently performing optimistic concurrency control, which is actually pretty much how it is implemented under the hood."
 
Read more here:
 
https://news.ycombinator.com/item?id=8169697
 
 
So as you are noticing, HTM (hardware transactional memory) and TM cannot replace locks when doing IO or for highly contended critical sections; this is why I have invented my following powerful Fast Mutex:
 
 
More about research and software development..
 
I have just looked at the following new video:
 
Why is coding so hard...
 
https://www.youtube.com/watch?v=TAAXwrgd1U8
 
 
I understand this video, but I have to explain my work:
 
I am not like the techlead in the video above, because I am also an "inventor" who has invented many scalable algorithms and their implementations, and I am also inventing effective abstractions. Let me give you an example:
 
Read the following from the senior research scientist Dave Dice:
 
Preemption tolerant MCS locks
 
https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks
 
As you are noticing, he is trying to invent a new lock that is preemption tolerant, but his lock lacks some important characteristics. This is why I have just invented a new Fast Mutex that is adaptive and, I think, much better; I think mine is the "best", and I think you will not find it anywhere. My new Fast Mutex has the following characteristics:
 
1- Starvation-free
2- Good fairness
3- It keeps the cache-coherence traffic efficiently low
4- Very good fast-path performance (and under contention it performs like the scalable MCS lock)
5- And it has decent preemption tolerance.
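 
To make the idea of an adaptive (spin-then-block) fast path concrete, here is a minimal illustrative sketch in Delphi/Free Pascal style. It is not my Fast Mutex above: it is only a plain test-and-test-and-set lock with bounded spinning and a yield, so it has the cheap uncontended path and low coherence traffic, but none of the fairness, starvation-freedom or MCS-style queueing listed above; the names and the spin limit are assumptions for illustration only.

---

uses
  SysUtils; // Sleep; the Interlocked* routines are in the System unit (FPC) or Winapi.Windows (Delphi)

const
  SPIN_LIMIT = 100; // illustrative tuning constant, not a measured value

var
  LockWord: LongInt = 0; // 0 = free, 1 = held

procedure AdaptiveAcquire;
var
  spins: Integer;
begin
  spins := 0;
  // Fast path: a single atomic compare-and-swap when the lock is free.
  while InterlockedCompareExchange(LockWord, 1, 0) <> 0 do
  begin
    // Test-and-test-and-set: spin on a plain read so the cache line stays
    // shared and coherence traffic stays low while the lock is held.
    while LockWord <> 0 do
    begin
      Inc(spins);
      if spins >= SPIN_LIMIT then
      begin
        Sleep(1); // adaptive part: after bounded spinning, yield the CPU instead of burning it
        spins := 0;
      end;
    end;
  end;
end;

procedure AdaptiveRelease;
begin
  InterlockedExchange(LockWord, 0); // release: full-barrier store of 0
end;

---

A real implementation with the characteristics above would park waiters in a queue (MCS-style) instead of letting them all race on LockWord.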
 
 
This is how I am an "inventor": I have also invented other scalable algorithms, such as a scalable reference counting with efficient support for weak references, a fully scalable Threadpool, and a fully scalable FIFO queue, and I have also invented other scalable algorithms and their implementations, and I think I will sell some of them to Microsoft or Google or Embarcadero or such software companies.
 
 
And about composability of lock-based systems now:
 
Design your systems to be composable. Among the more galling claims of the detractors of lock-based systems is the notion that they are somehow uncomposable:
 
"Locks and condition variables do not support modular programming," reads one typically brazen claim, "building large programs by gluing together smaller programs[:] locks make this impossible."9 The claim, of course, is incorrect. For evidence one need only point at the composition of lock-based systems such as databases and operating systems into larger systems that remain entirely unaware of lower-level locking.
 
There are two ways to make lock-based systems completely composable, and each has its own place. First (and most obviously), one can make locking entirely internal to the subsystem. For example, in concurrent operating systems, control never returns to user level with in-kernel locks held; the locks used to implement the system itself are entirely behind the system call interface that constitutes the interface to the system. More generally, this model can work whenever a crisp interface exists between software components: as long as control flow is never returned to the caller with locks held, the subsystem will remain composable.
 
Second (and perhaps counterintuitively), one can achieve concurrency and
composability by having no locks whatsoever. In this case, there must be
no global subsystem state—subsystem state must be captured in per-instance state, and it must be up to consumers of the subsystem to assure that they do not access their instance in parallel. By leaving locking up to the client of the subsystem, the subsystem itself can be used concurrently by different subsystems and in different contexts. A concrete example of this is the AVL tree implementation used extensively in the Solaris kernel. As with any balanced binary tree, the implementation is sufficiently complex to merit componentization, but by not having any global state, the implementation may be used concurrently by disjoint subsystems—the only constraint is that manipulation of a single AVL tree instance must be serialized.
 
Read more here:
 
https://queue.acm.org/detail.cfm?id=1454462
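 
As a small illustration of the first approach (locking kept entirely internal to the subsystem), here is a hedged sketch in Delphi/Free Pascal style; the TCounter class and its names are hypothetical and are not taken from the article or from my libraries.

---

uses
  SyncObjs; // TCriticalSection

type
  // A subsystem that keeps its locking entirely internal: no method ever
  // returns to the caller with the lock held, so callers can compose it freely.
  TCounter = class
  private
    FLock: TCriticalSection;
    FValue: Int64;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Increment;
    function Get: Int64;
  end;

constructor TCounter.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
end;

destructor TCounter.Destroy;
begin
  FLock.Free;
  inherited Destroy;
end;

procedure TCounter.Increment;
begin
  FLock.Acquire;
  try
    Inc(FValue);
  finally
    FLock.Release; // the lock never escapes the subsystem
  end;
end;

function TCounter.Get: Int64;
begin
  FLock.Acquire;
  try
    Result := FValue;
  finally
    FLock.Release;
  end;
end;

---

Because no method ever returns to the caller with FLock held, callers can freely compose TCounter with their own locking. The second approach would instead drop FLock entirely, keep only per-instance state, and leave it to each consumer to serialize access to its own instance.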
 
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Dec 09 04:42PM -0800

Hello,
 
 
Here are my new inventions, my new powerful variants of Scalable RWLocks..
 
 
Author: Amine Moulay Ramdane
 
Description:
 
A fast, scalable, starvation-free, fair and lightweight Multiple-Readers-Exclusive-Writer Lock called LW_RWLockX (the scalable LW_RWLockX does spin-wait), and also a fast, scalable, starvation-free and fair Multiple-Readers-Exclusive-Writer Lock called RWLockX (the scalable RWLockX doesn't spin-wait but uses my portable SemaMonitor and portable event objects, so it is energy efficient).
 
The parameter of the constructors is the size of the array of readers: if the size of the array is equal to the number of parallel readers, it will be scalable, but if the number of readers is greater than the size of the array, you will start to have contention. Please look at the source code of my scalable algorithms to understand.
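 
To show why such an array of reader slots lowers contention, here is a rough, hypothetical sketch of the general "distributed reader counter" technique in Delphi/Free Pascal style. It is not the code of my LW_RWLockX or RWLockX and it omits the fairness and starvation-freedom machinery; how a thread can pick its slot with my hash function is sketched after the hash function below.

---

uses
  SysUtils; // Sleep; the Interlocked* routines are in the System unit (FPC) or Winapi.Windows (Delphi)

const
  NUM_SLOTS = 16; // illustrative: the size of the readers array

type
  TReaderSlot = record
    Count: LongInt;            // readers currently inside, counted in this slot only
    Pad: array[0..59] of Byte; // pad the record towards a cache line to limit false sharing
  end;

var
  Slots: array[0..NUM_SLOTS - 1] of TReaderSlot;
  WriterFlag: LongInt = 0;

procedure ReadLock(slot: Integer);
begin
  repeat
    InterlockedIncrement(Slots[slot].Count); // announce this reader in its own slot
    if WriterFlag = 0 then
      Exit;                                  // no writer: the read lock is held
    InterlockedDecrement(Slots[slot].Count); // a writer is active: back off
    while WriterFlag <> 0 do
      Sleep(0);
  until False;
end;

procedure ReadUnlock(slot: Integer);
begin
  InterlockedDecrement(Slots[slot].Count);
end;

procedure WriteLock;
var
  i: Integer;
begin
  // Take the writer flag, then wait for every reader slot to drain.
  while InterlockedCompareExchange(WriterFlag, 1, 0) <> 0 do
    Sleep(0);
  for i := 0 to NUM_SLOTS - 1 do
    while Slots[i].Count <> 0 do
      Sleep(0);
end;

procedure WriteUnlock;
begin
  InterlockedExchange(WriterFlag, 0);
end;

---

Readers that land on different slots increment different cache lines, so they do not invalidate each other's lines; the writer pays the price of scanning all the slots, which is the usual trade-off of this technique.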
 
 
I have used my following hash function to make my new variants of RWLocks scalable:
 
---
 
function DJB2aHash(key: int64): uint64;
var
  i: integer;
  key1: uint64;
begin
  Result := 5381;
  // Mix the 8 bytes of the 64-bit key into the hash, one byte per round.
  for i := 1 to 8 do
  begin
    key1 := (key shr ((i - 1) * 8)) and $00000000000000ff;
    Result := ((Result shl 5) xor Result) xor key1;
  end;
end;
 
---
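 
For illustration, here is one plausible way such a hash could map the calling thread to a reader slot; the exact mapping used by my RWLocks is in the downloadable source code, and GetCurrentThreadId (from the Windows unit) and the NumberOfSlots parameter below are assumptions of this sketch.

---

uses
  Windows; // GetCurrentThreadId

function ReaderSlot(NumberOfSlots: Integer): Integer;
begin
  // Spread thread ids uniformly over the readers array, so parallel readers
  // usually end up in different slots (and on different cache lines).
  Result := Integer(DJB2aHash(Int64(GetCurrentThreadId)) mod UInt64(NumberOfSlots));
end;

---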
 
You can download them from:
 
https://sites.google.com/site/scalable68/new-variants-of-scalable-rwlocks
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Dec 09 04:38PM -0800

Hello,
 
 
My inventions, my SemaMonitor and my SemaCondvar, were updated to version 2.3; they have become efficient and powerful, so please read the readme file to know more about the changes. I have also implemented an efficient Monitor over my SemaCondvar. Here is the description of my efficient Monitor inside the Monitor.pas file that you will find inside the zip file:
 
 
Description:
 
This is my implementation of a Monitor over my SemaCondvar.
 
You will find the Monitor class inside the Monitor.pas file inside the zip file.
 
When you set the first parameter of the constructor to true, the signal will not be lost if the threads are not waiting with the wait() method; but when you set the first parameter of the constructor to false, the signal will be lost if the threads are not waiting with the wait() method..
 
The second parameter of the constructor is the kind of lock: you can set it to ctMLock to use my scalable node-based lock called MLock, to ctMutex to use a Mutex, or to ctCriticalSection to use the TCriticalSection.
 
Here are the methods of my efficient Monitor that I have implemented:
 
TMonitor = class
private
  cache0: typecache0;
  lock1: TSyncLock;
  obj: TSemaCondvar;
  cache1: typecache0;

public

  constructor Create(bool: boolean = true; lock: TMyLocks = ctMLock);
  destructor Destroy; override;
  procedure Enter();
  procedure Leave();
  function Signal(): boolean; overload;
  function Signal(nbr: long; var remains: long): boolean; overload;
  procedure Signal_All();
  function Wait(const AMilliseconds: longword = INFINITE): boolean;
  function WaitersBlocked(): long;

end;
 
 
The wait() method is for the threads to wait on the Monitor object for the signal to be signaled. If wait() fails, it can be because the number of waiters is greater than high(longword).
 
And the signal() method will signal one waiting thread on the Monitor object; if signal() fails, the returned value is false.
 
The signal_all() method will signal all the threads waiting on the Monitor object.
 
The signal(nbr:long;var remains:long) method will signal nbr waiting threads; if signal() fails, the number of signals that were not delivered is returned in the remains variable.
 
And WaitersBlocked() will return the number of threads waiting on the Monitor object.
 
And the Enter() and Leave() methods enter and leave the monitor's lock.
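 
Here is a small, hypothetical usage sketch. It assumes that the Monitor.pas unit from the zip file is on the unit path and that it exports ctMLock; please check Monitor.pas for the exact interplay between Wait() and the monitor's lock before copying this.

---

program MonitorSketch;

uses
  Monitor; // the Monitor.pas unit from the zip file (an assumption of this sketch)

var
  mon: TMonitor;
  item: Integer = 0;

// Producer side: update the shared state under the monitor's lock, then signal.
procedure Produce;
begin
  mon.Enter();
  try
    item := item + 1;
  finally
    mon.Leave();
  end;
  if not mon.Signal() then
    WriteLn('signal() failed');
end;

// Consumer side: wait for the signal, then read the shared state under the lock.
procedure Consume;
begin
  if mon.Wait() then // blocks until signaled; the default timeout is INFINITE
  begin
    mon.Enter();
    try
      WriteLn('consumed item ', item);
    finally
      mon.Leave();
    end;
  end;
end;

begin
  // true = the signal is not lost if no thread is waiting yet; ctMLock = use the MLock.
  mon := TMonitor.Create(true, ctMLock);
  // ... start producer and consumer threads that call Produce and Consume ...
  mon.Free;
end.

---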
 
 
You can download the zip files from:
 
https://sites.google.com/site/scalable68/semacondvar-semamonitor
 
and the lightweight version is here:
 
https://sites.google.com/site/scalable68/light-weight-semacondvar-semamonitor
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Dec 09 03:28PM -0800

Hello,
 
 
RISC-V has its drawbacks. It has a short history compared to Intel and ARM, and the architecture is not yet as sophisticated. Since Intel and ARM have dominated the CPU market for decades, most existing products are optimized for these CPUs. In addition, Intel and ARM are responsible for everything from CPU development to testing and manufacturing, and can provide highly reliable CPUs to customers. And we must not forget about the innovation of companies such as Intel, as you can see in the following video:
 
2D vs 3D Stacking: Intel's plan to beat Zen 2
 
https://www.youtube.com/watch?v=26EzYDBNwLU
 
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Dec 09 02:39PM -0800

Hello,
 
 
Look at this interesting video:
 
2D vs 3D Stacking: Intel's plan to beat Zen 2
 
https://www.youtube.com/watch?v=26EzYDBNwLU
 
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Dec 09 10:16AM -0800

Hello,
 
 
 
Don't worry, that was my last post about the economy.
 
 
 
Thank you,
Amine Moulay Ramdane.
aminer68@gmail.com: Dec 09 10:13AM -0800

Hello,
 
 
 
Don't worry, that was my last post about politics.
 
 
 
Thank you,
Amine Moulay Ramdane.
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com.
