Monday, July 9, 2018

Digest for comp.programming.threads@googlegroups.com - 2 updates in 2 topics

Sky89 <Sky89@sky68.com>: Jul 08 08:32PM -0400

Hello..
 
 
Here is my other invention..
 
You know me that i am an "inventor" of many scalable algorithms and
there implementations, now here is my other "invention" that is an
efficient Threadpool with priorities that scales very well, it is as it
is fully scalable, because work stealing is "rare", it is much more
scalable than the one of Microsoft, here it is read about it and
download it(it comes with a FreePascal versions for Windows and Linux
and with a Delphi version for Windows) and it comes with an HTML
tutorial inside the zip file:
 
An efficient Threadpool engine with priorities that scales very well
 
version 3.2
 
Author: Amine Moulay Ramdane
 
Description:
 
Efficient Thread Pool Engine with priorities that scales very well. I
have updated my efficient Threadpool engine with priorities and my
Threadpool engine to version 3.2, i have come up with a new algorithm
that is more optimized and that scales very well.
 
The following have been added to my efficient Threadpool engine:
 
- I have used scalable counting networks to make my Threadpool engine
scales very well.
 
- You can give the following priorities to jobs:
 
LOW_PRIORITY
 
NORMAL_PRIORITY
 
HIGH_PRIORITY
 
- The worker threads enter in a wait state when there is no job in the
concurrent FIFO queues - for more efficiency -
 
- You can distribute your jobs to the worker threads and call any method
with the threadpool's execute() method.
 
- It uses work-stealing to be more efficient.
 
- You can configure it to use stacks or FIFO queues , when you use
stacks it will be cache efficient.
 
- Now it can use processor groups on windows, so that it can use more
than 64 logical processors and it scales well.
 
- Now it distributes the jobs on multiple FIFO queues or stacks so that
it scales well.
 
- You can wait for the jobs to finish with the wait() method.
 
- It's NUMA-aware and NUMA efficient.
 
- And it is portable to many operating systems.
 
- Uses O(1) complexity on enqueue and O(3) worst case complexity on dequeue.
 
And look at the ThreadPoolExecutor Class of Java, look for example at
the awaitTermination() method, it says:
 
---
 
boolean awaitTermination(long timeout, TimeUnit unit)
 
Blocks until all tasks have completed execution after a shutdown
request, or the timeout occurs, or the current thread is interrupted,
whichever happens first.
 
--
 
read more here:
 
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html#method.summary
 
Did you notice ?
 
In Java when you wait for the tasks you have to wait for "ALL" the
tasks, and that's not efficient , and if you want to use the object from
multiple threads i think it will have the same effect, you can avoid
some of the problems by using many objects of the ThreadPoolExecutor
class but this will take ressources and this will cause more and more
context switches and that's bad, i think C# has the same problem, other
than that Java and C# don't support priorities, it means that you can
not give priorities to tasks/jobs, like high or normal or low, and
that's not good for games and other applications where you have to use
priorities even if the system is not a realtime system, this is why i
have decided to implement my efficient Threadpool engine version 3.2
that supports those characteristics and that scales well, so that you
can create a child object of the Threadpool class that will use the same
worker threads and that will wait only for the tasks that you will add
with the execute() method , and also my efficient Threadpool engine
supports 3 priorities, High and normal and low, that's where my
efficient Threadpool engine comes in hand and that's where it's
efficient. Hope you will like it.
 
Please read the HTML tutorial inside the zip.
 
More precision about my efficient Threadpool that scales very well, my
Threadpool is much more scalable than the one of Microsoft, in the
workers side i am using scalable counting networks to distribute on the
many queues or stacks, so it is scalable on the workers side, on the
consumers side i am also using lock striping to be able to scale very
well, so it is scalable on those parts, on the other part that is work
stealing, i am using scalable counting networks, so globally it scales
very well, and since work stealing is "rare" so i think that my
efficient Threadpool that scales very well is really powerful, and it is
much more optimized and the scalable counting networks eliminate false
sharing, and it works with Windows and Linux.
 
You have to know that to enlarge the stack of the worker threads of the
Threadpool that use TThread, you have to set the stack size for the
executable.
 
 
You can dowload it from:
 
https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well
 
Look into defines.inc there is many options:
 
{$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems
 
{$DEFINE CPU64} and {$DEFINE Windows64} for 64 bit systems
 
Look at test.pas demo inside the zip file...
 
Language: FPC Pascal v2.2.0+ / Delphi 5+: http://www.freepascal.org/
 
Operating Systems: Win , Linux and Mac (x86).
 
Required FPC switches: -O3 -Sd
 
-Sd for delphi mode....
 
Required Delphi switches: -DDelphi -DMSWINDOWS -$H+
 
For Delphi XE-XE7 use the -DXE switch
 
{$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems
 
{$DEFINE CPU64} and {$DEFINE Windows64} for 64 bit systems
 
 
 
 
Thank you,
Amine Moulay Ramdane.
Sky89 <Sky89@sky68.com>: Jul 08 08:07PM -0400

Hello,
 
About Embarcadero and Delphi and FreePascal..
 
As you have noticed i am an "inventor" of many scalable algorithms and
there implementations, and as you have noticed Embarcadero company that
actually sells its Delphi product has pushed to adopt the ARC memory
manager despite it being not the right choice for highly parallel
applications, so i have decided to invent a "scalable" reference
counting with efficient support for weak references that is suited also
for "highly" parallel applications and i have implemented it in Delphi
and FreePascal, and you will not find it on C++ or Rust or Delphi, here
is my "invention" and read about it and download it from bellow:
 
Scalable reference counting with efficient support for weak references
version 1.12
 
Author: Amine Moulay Ramdane
 
Description:
 
I have enhanced my scalable algorithm and now it is much powerful, now
my scalable algorithm implementation works also as a "scalable" counter
that supports both "increment" and "decrement" using two scalable
counting networks, please take a look at my new scalable algorithm
implementation inside the source code..
 
This is my scalable reference counting with efficient support for weak
references, and since problems that cannot be solved without weak
references are rare, so this library does scale very well, this scalable
reference counting is implemented using scalable counting networks that
eliminate completely false sharing , so it is fully scalable on
multicore processors and manycore processors and this scalable algorithm
is optimized, and this library does work on both Windows and Linux
(x86), and it is easy to port to Mac OS X.
 
I have modified my scalable algorithm, now as you will notice i am not
using decrement with support for antitokens in the balancers of the
scalable counting networks, i am only using an "increment", please look
at my new scalable algorithm inside the zip file, i think it is working
correctly. Also notice that the returned value of _Release() method will
be valid if it is equal to 0.
 
I have optimized it more, now i am using only tokens and no antitokens
in the balancers of the scalable counting networks, so i am only
supporting increment, not decrement, so you have to be smart to invent
it correctly, this is what i have done, so look at the
AMInterfacedObject.pas file inside my zip file, you will notice that it
uses counting_network_next_value() function,
counting_network_next_value() increments the scalable counting network
by 1, the _AddRef() method is simple, it increment by 1 to increment the
reference to the object, but look inside the _Release() method it calls
counting_network_next_value() three times, and my invention is calling
counting_network_next_value(cn1) first inside the _Release() method to
be able to make my scalable algorithm works, so just debug it more and
you will notice that my scalable algorithm is smart and it is working
correctly, i have debugged it and i think it is working correctly.
 
I have to prove my scalable reference counting algorithm, like with
mathematical proof, so i will use logic to prove like in PhD papers:
 
You will find the code of my scalable reference counting inside
AMInterfacedObject.pas inside the zip file here:
 
If you look inside the code there is two methods, _AddRef() and
_Release() methods, i am using two scalable counting networks,
think about them like counters, so in the _AddRef() method i am
executing the following:
 
v1 := counting_network_next_value(cn1);
 
cn1 is the scalable counting network, and counting_network_next_value()
is a function that increment the scalable counting network by 1.
 
In the _Release() method i am executing the following:
 
v2 := counting_network_next_value(cn1);
v1 := counting_network_next_value(cn2);
v1 := counting_network_next_value(cn2);
 
So my scalable algorithm is "smart", because the logical proof is
that i am calling counting_network_next_value(cn1) first in the
above, so this allows my scalable algorithm to work correctly,
because we are advancing cn1 by 1 to obtain the value of cn1,
so the other threads are advancing also cn1 by one inside
_Release() , it is the last thread that is advancing cn1 by 1 that will
make the reference counter equal to 0 , and _AddRef() method is the same
and it is easy to reason about, so this scalable algorithm is working.
Please look more carefully at my algorithm and you will notice that it
is working as i have just logically proved it.
 
Please read also the following to understand better:
 
Here is the parameters of the constructor:
 
First parameter is: The width of the scalable counting networks that
permits my scalable refererence counting algorithm to be scalable, this
parameter must be 1 to 31, it is now at 4 , this is the power, so it is
equal to 2 power 4 , that means 2^4=16, and you have to pass this
counting networks width to the n of following formula:
 
(n*log(n)*(1+log(n)))/4
 
The log of the formula is in base 2
 
This formula gives the number of gates of the scalable counting
networks, and if we replace n by 16, this will equal 80 gates, that
means you can scale the scalable counting networks to 80 cores, and
beyond 80 cores you will start to have contention.
 
Second parameter is: a boolean that tells if reference counting is used
or not, it is by default to true, that means that reference counting is
used.
 
About the weak references support: the Weak<T> type supports assignment
from and to T and makes it usable as if you had a variable of T. It has
the IsAlive property to check if the reference is still valid and not a
dangling pointer. The Target property can be used if you want access to
members of the reference.
 
Note: the use of the IsAlive property on our weak reference, this tells
us whether the referenced object is still available, and provides a safe
way to get a concrete reference to the parent.
 
I have ported efficient weak references support to Linux by implementing
efficient code hooking, look at my DSharp.Core.Detour.pas file for Linux
that i have written to see how i have implemented it in the Linux
library. Please look at the example.dpr and test.pas demos to see how
weak references work etc.
 
Call _AddRef() and _Release() methods to manually increment or decrement
the number of references to the object.
 
Weak references support is done by hooking the TObject.FreeInstance
method so every object destruction is noticed and if a weak reference
for that object exists it gets removed from the internal dictionary
where all weak references are stored. While it works I am aware that
this is hacky approach and it might not work if someone overrides the
FreeInstance method and does not call inherited.
 
 
You can download it from:
 
https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references
 
- Platform: Windows and Linux(x86)
 
Language: FPC Pascal v3.1.x+ / Delphi 2007+:
 
http://www.freepascal.org/
 
Required FPC switches: -O3 -Sd
 
-Sd for delphi mode....
 
Required Delphi switches: -$H+ -DDelphi
 
For Delphi XE versions and Delphi Tokyo use the -DXE switch
 
The defines options inside defines.inc are:
 
{$DEFINE CPU32} for 32 bit systems
 
{$DEFINE CPU64} for 64 bit systems
 
 
 
Thank you,
Amine Moulay Ramdane.
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com.

No comments: