Tuesday, November 17, 2015

Digest for comp.programming.threads@googlegroups.com - 4 updates in 3 topics

bleachbot <bleachbot@httrack.com>: Nov 16 11:11PM +0100

bleachbot <bleachbot@httrack.com>: Nov 16 11:41PM +0100

Ramine <ramine@1.1>: Nov 16 05:42PM -0800

Hello,
 
 
In my previous post about the scalability of FastFlow, TBB and TPL,
I said that they must have a serial part of around 350 CPU cycles,
because a cache-line transfer is needed for every worker thread or
task that you wait on to finish. So my advice is this: my Threadpool
engine has a serial part of 3 locks, for a total of around
350 CPU cycles * 3 = 1050 CPU cycles, and by Amdahl's law this
gives you about 3 times less scalability than FastFlow. But
3 times less scalability is not so important, so my advice is to
stay with my Threadpool engine, because it is good and still useful.
My second piece of advice, to make my Threadpool engine scale more,
is to increase the P (parallel) part by doing more of the same:
increase the volume of data processed by the P part (and therefore
the percentage p of time spent computing). This will give you more
scalability. This is Gustafson's Law.
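
To make the Amdahl's law comparison above concrete, here is a minimal Python sketch. It is not part of the Threadpool engine. The 350 and 1050 cycle figures come from the post; the function names and the 1,000,000 cycles of parallel work per job are assumptions chosen only for illustration.

```python
# Hypothetical amount of parallelizable work per job, in CPU cycles.
# This number is an assumption for illustration, not a measurement.
PARALLEL_CYCLES = 1_000_000


def amdahl_speedup(serial_cycles: float, parallel_cycles: float, workers: int) -> float:
    """Amdahl's law: the serial part stays fixed, the parallel part is split across workers."""
    total = serial_cycles + parallel_cycles
    return total / (serial_cycles + parallel_cycles / workers)


def amdahl_limit(serial_cycles: float, parallel_cycles: float) -> float:
    """Upper bound on the speedup as the number of workers grows without limit."""
    return (serial_cycles + parallel_cycles) / serial_cycles


if __name__ == "__main__":
    for workers in (8, 64, 1024):
        one_lock = amdahl_speedup(350, PARALLEL_CYCLES, workers)
        three_locks = amdahl_speedup(1050, PARALLEL_CYCLES, workers)
        print(f"{workers:5d} workers: 350-cycle serial part {one_lock:8.2f}x, "
              f"1050-cycle serial part {three_locks:8.2f}x")

    ratio = amdahl_limit(350, PARALLEL_CYCLES) / amdahl_limit(1050, PARALLEL_CYCLES)
    print(f"ratio of the two speedup limits: {ratio:.3f}")
```

Under these assumptions the ratio of the two speedup limits is close to 3, but that gap only shows up with a very large number of workers. At small worker counts the two speedups are nearly the same.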
 
 
I hope you have understood this important post!
 
You can download my Threadpool engine from the following links:
 
https://sites.google.com/site/aminer68/threadpool-with-priorities
 
and from here:
 
https://sites.google.com/site/aminer68/threadpool
 
 
 
Thank you,
Amine Moulay Ramdane.

Ramine <ramine@1.1>: Nov 16 05:11PM -0800

Hello,
 
 
As you may have noticed, I have implemented a Threadpool engine
that is really cool, and its interface is very easy to use. Read about it
here:
 
https://sites.google.com/site/aminer68/threadpool-with-priorities
 
 
But if you are an expert in parallel programming, you will notice
that I am using 3 locks, each protecting a very small portion of the code
on the worker-thread side of the Threadpool engine: 2 locks for the pop()
side of the concurrent FIFO queue, plus a third lock. An expert in
parallel programming will say that each of those locks costs around
350 CPU cycles on an x86 architecture, so they will lower the
scalability. My answer is this: look at FastFlow,
for example, here: http://calvados.di.unipi.it/
If you need to wait in FastFlow for the worker threads to
finish, this constitutes a serial part equal to that of
one lock, because a cache-line transfer costs around
the same number of CPU cycles as a lock. So this is
a serial part of around 350 CPU cycles on x86, and it
has the same effect as in my Threadpool engine.
I think OpenMP, TBB and Microsoft TPL are the same.

This is why I think my Threadpool engine is still useful. The idea
for scaling more is to increase the P (parallel) part by doing more of the
same: increase the volume of data processed by the P part (and therefore
the percentage p of time spent computing). This will give you
more scalability. This is Gustafson's Law.
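
The Gustafson's Law argument above can be sketched in a few lines of Python. This is only an illustration, not code from the Threadpool engine. The 1050-cycle serial part (3 locks at around 350 cycles each) comes from the post; the function name, the 16 workers and the per-worker amounts of parallel work are assumptions.

```python
def gustafson_speedup(serial_cycles: float, parallel_cycles_per_worker: float,
                      workers: int) -> float:
    """Gustafson's law: scaled speedup S = N - alpha * (N - 1).

    alpha is the serial fraction of the run time measured on the parallel
    machine, where each worker spends parallel_cycles_per_worker on its share.
    """
    alpha = serial_cycles / (serial_cycles + parallel_cycles_per_worker)
    return workers - alpha * (workers - 1)


if __name__ == "__main__":
    SERIAL_CYCLES = 1050   # 3 locks at around 350 cycles each, as in the post
    WORKERS = 16           # hypothetical number of worker threads

    # Hypothetical amounts of work handed to each worker per job, in cycles.
    for work in (10_000, 100_000, 1_000_000):
        speedup = gustafson_speedup(SERIAL_CYCLES, work, WORKERS)
        print(f"{work:9d} parallel cycles per worker -> scaled speedup {speedup:.2f}x")
```

As each worker is given more data to process, the fixed serial part becomes a smaller fraction of the run time and the scaled speedup moves closer to the number of workers.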
 
 
You can download my Threadpool engine from the following links:
 
https://sites.google.com/site/aminer68/threadpool-with-priorities
 
and from here:
 
https://sites.google.com/site/aminer68/threadpool
 
 
 
Thank you,
Amine Moulay Ramdane.