- Cilk-style scheduler - 1 Update
Ramine <ramine@1.1>: Jan 03 02:36PM -0800 Hello... I have finally arrived to an important subject...it's still on the subject of parallel optimization, but this time i will talk more about the Cilk-style scheduler.. as you have seen me in my previous post i have talked about the Dmitry Vyukov's Cilk-style scheduler that supports system-topology awareness, hyper-threading awareness, affinity-awareness, batch-spawn capability and manual task-depth control, here it is: http://www.1024cores.net/home/parallel-computing/radix-sort and as you have seen me talking about my parallel quicksort, i have said that it's not so good cause since it dispatches the indexes of the arrays to be partitionned each time to different threads of my classical threadpool, since the my threadpool don't look like Dmitry Vyukov's scheduler , so this will render the parallel quicksort algorithm less cache-aware, so this will not scale much than 3X for strings and it will scale less for integers and doubles... but if we are using the Dmitry Vyukov's scheduler, as we recursively dispatch the array's indexes to the Scheduler's threads that will call the partition function, the Dmitry Vyukov's Cilk-style scheduler will dispatch those array's indexes to threads that will reuse "efficiently" the data of the caches , so this will reuse efficiently the L2-cache's data and L3-cache's data efficiently , but even though it will reuse the data of caches efficiently, this optimizations of the Dmitry Vyukov's scheduler will not bring more than a 2X improvement , in fact you will get less and less improvement when using "bigger" and bigger data, so this is not so important improvement that will bring the Dmitry Vyukov's scheduler, so what you have to do is to render your parallel sort algorithm into a NUMA-aware algorithm this way you will render your parallel algorithm into a scalable algorithm, this is why i have told you in my previous post that the Dmitry's Sheduler will not bring an important improvement, and if you take a look at my new parallel sort algorithm of my parallel sort library version 3.3, its sorting part doesn't contain parallel recursive calls and its merging part doesn't contain parallel recursive calls, so the Dmitry Vyukov's scheduler will not bring improvement to my parallel algorithm, so for all those reasons this why i have decided to not implement a scheduler that look like the Dmitry Vyukov's scheduler and this why i have decided to use my classical threadpool engine instead. Thank you for your time. Thank you, Amine Moulay Ramdane. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com. |
No comments:
Post a Comment