soft and program: Digest for comp.programming.threads@googlegroups.com

Sunday, January 4, 2015

Digest for comp.programming.threads@googlegroups.com - 1 update in 1 topic

comp.programming.threads@googlegroups.com

Google Groups

Cilk-style scheduler - 1 Update

Ramine <ramine@1.1>: Jan 03 02:36PM -0800

Hello...

I have finally arrived to an important subject...it's still
on the subject of parallel optimization, but this time i will
talk more about the Cilk-style scheduler.. as you have seen me in my
previous post i have talked about the Dmitry Vyukov's Cilk-style
scheduler that supports system-topology awareness, hyper-threading
awareness, affinity-awareness, batch-spawn capability and manual
task-depth control, here it is:

http://www.1024cores.net/home/parallel-computing/radix-sort

and as you have seen me talking about my parallel quicksort, i have said
that it's not so good cause since it dispatches the indexes of the
arrays to be partitionned each time to different threads of my
classical threadpool, since the my threadpool don't look like Dmitry
Vyukov's scheduler , so this will render the parallel quicksort
algorithm less cache-aware, so this will not scale much than 3X for
strings and it will scale less for integers and doubles... but if we are
using the Dmitry Vyukov's scheduler, as we recursively dispatch the
array's indexes to the Scheduler's threads that will call the partition
function, the Dmitry Vyukov's Cilk-style scheduler will dispatch those
array's indexes to threads that will reuse "efficiently" the data of the
caches , so this will reuse efficiently the L2-cache's data and
L3-cache's data efficiently , but even though it will reuse the data of
caches efficiently, this optimizations of the Dmitry Vyukov's scheduler
will not bring more than a 2X improvement , in fact you will get less
and less improvement when using "bigger" and bigger data, so this is not
so important improvement that will bring the Dmitry Vyukov's scheduler,
so what you have to do is to render your parallel sort algorithm into a
NUMA-aware algorithm this way you will render your parallel algorithm
into a scalable algorithm, this is why i have told you in my previous
post that the Dmitry's Sheduler will not bring an important improvement,
and if you take a look at my new parallel sort algorithm of my parallel
sort library version 3.3, its sorting part doesn't contain parallel
recursive calls and its merging part doesn't contain parallel recursive
calls, so the Dmitry Vyukov's scheduler will not bring improvement to my
parallel algorithm, so for all those reasons this why i have decided to
not implement a scheduler that look like the Dmitry Vyukov's scheduler
and this why i have decided to use my classical threadpool engine instead.

Thank you for your time.

Thank you,
Amine Moulay Ramdane.

You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com.

soft and program

Sunday, January 4, 2015

Digest for comp.programming.threads@googlegroups.com - 1 update in 1 topic

No comments:

Blog Archive

About Me