- cmsg cancel <n5498m$osu$2@dont-email.me> - 4 Updates
- About my new algorithm... - 1 Update
- A new algorithm of Parallel implementation of Conjugate Gradient Sparse Linear System Solver library - 2 Updates
- Scalable Parallel implementation of Conjugate Gradient Linear System solver library version 1.22 - 1 Update
- What the Hell is going on here? - 1 Update
- How do I look inside an .exe file to view the programming - 1 Update
bleachbot <bleachbot@httrack.com>: Dec 19 07:58PM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:15AM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:41AM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:42AM +0100 |
Ramine <ramine@1.1>: Dec 19 11:43PM -0800

Hello,

Read here: https://en.wikipedia.org/wiki/Sparse_matrix

As you will notice, it says:

"When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeroes. Sparse data is by nature more easily compressed and thus requires significantly less storage. Some very large sparse matrices are infeasible to manipulate using standard dense-matrix algorithms."

I have taken care of that in my new algorithm: I have used my ParallelHashList data structure to store the sparse matrices of the linear system, so that it becomes very fast and wastes nothing on the zeros; in fact, my new algorithm does not store the zeros of the sparse matrix of the linear system at all. And my new parallel algorithm that I have just implemented today is designed for sparse matrices of linear equations arising from industrial finite element problems and such.

Here is my new library of my new parallel algorithm:

https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver

Thank you,
Amine Moulay Ramdane. |
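The ParallelHashList data structure itself is not shown in the post above. Purely as a rough single-threaded sketch of the storage idea it describes (keep only the nonzeros; an absent entry reads as zero), something like the following FreePascal fragment would do, using the stock TDictionary from Generics.Collections (FPC 3.2+ or a modern Delphi). The TSparseMatrix class and its methods are invented for this illustration and are not the library's actual API:

---
program sparsesketch;
{$mode delphi}

uses
  SysUtils, Generics.Collections;

type
  { Only nonzero entries are stored, keyed by a 64-bit index
    computed from (row, col); everything absent reads as zero. }
  TSparseMatrix = class
  private
    FEntries: TDictionary<Int64, Double>;
    FCols: Integer;
    function Key(Row, Col: Integer): Int64;
  public
    constructor Create(Cols: Integer);
    destructor Destroy; override;
    procedure SetValue(Row, Col: Integer; V: Double);
    function GetValue(Row, Col: Integer): Double;
  end;

constructor TSparseMatrix.Create(Cols: Integer);
begin
  inherited Create;
  FCols := Cols;
  FEntries := TDictionary<Int64, Double>.Create;
end;

destructor TSparseMatrix.Destroy;
begin
  FEntries.Free;
  inherited;
end;

function TSparseMatrix.Key(Row, Col: Integer): Int64;
begin
  Result := Int64(Row) * FCols + Col;
end;

procedure TSparseMatrix.SetValue(Row, Col: Integer; V: Double);
begin
  if V = 0.0 then
    FEntries.Remove(Key(Row, Col))   { zeros are simply not kept }
  else
    FEntries.AddOrSetValue(Key(Row, Col), V);
end;

function TSparseMatrix.GetValue(Row, Col: Integer): Double;
begin
  if not FEntries.TryGetValue(Key(Row, Col), Result) then
    Result := 0.0;                   { absent entry means zero }
end;

var
  M: TSparseMatrix;
begin
  M := TSparseMatrix.Create(100000);
  M.SetValue(2, 3, 5.0);
  writeln(M.GetValue(2, 3):0:1);  { 5.0 }
  writeln(M.GetValue(7, 7):0:1);  { 0.0, nothing was stored for it }
  M.Free;
end.
---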
A new algorithm of Parallel implementation of Conjugate Gradient Sparse Linear System Solver library
Ramine <ramine@1.1>: Dec 19 11:16PM -0800

Hello,

I have just implemented today a new parallel algorithm for my Parallel implementation of Conjugate Gradient Sparse Linear System Solver library. This library is designed for sparse matrices of linear equations arising from industrial finite element problems and such, and my new parallel algorithm is cache-aware and very fast.

So as you have noticed, I have now implemented two parallel algorithms. One is cache-aware and NUMA-aware and is scalable on NUMA architecture; that scalable parallel algorithm is designed for the dense matrices that you find in linear equations arising from integral equation formulations. Here it is:

https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware

And my new parallel algorithm that I have just implemented today is designed for sparse matrices of linear equations arising from industrial finite element problems and such. Here is my new library of my new parallel algorithm:

https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver

Author: Amine Moulay Ramdane

Description:

I have come up with a new algorithm for my Parallel Conjugate Gradient sparse solver library; it has now become cache-aware. Note that this new cache-aware algorithm is more efficient on multicores: I have benchmarked it against my previous algorithm, and it has given a scalability of 5X on a quadcore over the single-threaded run of my previous algorithm; that's really a big improvement!

This parallel library is especially designed for large-scale industrial engineering problems such as those found in industrial finite element problems. This scalable parallel library was ported to FreePascal and all the Delphi XE versions and even to Delphi 7; hope you will find it really good.

The Parallel implementation of Conjugate Gradient Sparse Linear System Solver that I programmed here is designed to solve large sparse systems of linear equations where direct methods can exceed available machine memory and/or be extremely time-consuming. For example, the direct Gauss method takes O(n^2) operations in the back-substitution process and is dominated by the O(n^3) forward-elimination process. That means that if, for example, an operation takes 10^-9 second and we have 1000 equations, the elimination process of the Gauss algorithm (about (2/3)*n^3 operations) will take about 0.7 second; but if we have 10000 equations in the system, the elimination process will take about 11 minutes! This is why I have developed for you the Parallel implementation of Conjugate Gradient Sparse Linear System Solver in Object Pascal, which is very fast.

You have only one method to use, which is Solve():

function TParallelConjugateGradient.Solve(var A: arrarrext; var B, X: VECT; var RSQ: DOUBLE; nbr_iter: integer; show_iter: boolean): boolean;

The system: A*x = b

The important parameters of the Solve() method are: A is the matrix, B is the b vector, X the initial vector x, nbr_iter is the number of iterations that you want, and show_iter shows the iteration number on the screen. RSQ is the sum of the squares of the components of the residual vector A.x - b.

I have got over 5X scalability on a quad core.
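To make that calling convention concrete, a hypothetical usage fragment follows. TParallelConjugateGradient, arrarrext and VECT are the library's own types exactly as named above; the construction of the solver and the filling of A and B are only sketched, the iteration count 1000 is an arbitrary example value, and the meaning of the boolean result is an assumption, since the post does not spell it out:

---
var
  cg: TParallelConjugateGradient;
  A: arrarrext;    { the sparse system matrix, filled elsewhere }
  B, X: VECT;      { right-hand side b and initial guess x }
  RSQ: DOUBLE;     { on return: sum of squares of the residual A*x - b }
begin
  { ... create cg and fill A and B; set X to an initial guess ... }
  if cg.Solve(A, B, X, RSQ, 1000, true) then
    writeln('Converged; residual sum of squares = ', RSQ)
  else
    writeln('Stopped without reaching the requested accuracy');
end;
---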
The Conjugate Gradient Method is the most prominent iterative method for solving sparse systems of linear equations. Unfortunately, many textbook treatments of the topic are written with neither illustrations nor intuition, and their victims can be found to this day babbling senselessly in the corners of dusty libraries. For this reason, a deep, geometric understanding of the method has been reserved for the elite brilliant few who have painstakingly decoded the mumblings of their forebears.

Conjugate Gradient is the most popular iterative method for solving large systems of linear equations. CG is effective for systems of the form A.x = b, where x is an unknown vector, b is a known vector, and A is a known square, symmetric, positive-definite (or positive-indefinite) matrix. These systems arise in many important settings, such as finite difference and finite element methods for solving partial differential equations, structural analysis, circuit analysis, and math homework.

The Conjugate Gradient method can also be applied to non-linear problems, but with much less success, since non-linear functions can have multiple minima. The Conjugate Gradient method will indeed find a minimum of such a nonlinear function, but it is in no way guaranteed to be a global minimum, or the minimum that is desired. But the Conjugate Gradient method is a great iterative method for solving large, sparse linear systems with a symmetric positive-definite matrix.

In the method of conjugate gradients the residuals are not used directly as search directions, as in the steepest descent method, because that kind of search can require a large number of iterations: the residuals zigzag towards the minimum value for ill-conditioned matrices. Instead, the conjugate gradient method uses the residuals as a basis to form conjugate search directions. In this manner, the conjugated gradients (residuals) form a basis of search directions to minimize the quadratic function f(x) = (1/2)*Transpose(x)*A*x - Transpose(b)*x, whose minimizer solves A*x = b, and to achieve faster convergence: in exact arithmetic, at most N iterations, where N is the dimension of the system.

Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/
Operating systems: Windows, Mac OSX, Linux...
Required FPC switches: -O3 -Sd -dFPC -dFreePascal (-Sd for Delphi mode)
Required Delphi switches: -$H+ -DDelphi
{$DEFINE CPU32} and {$DEFINE Windows32} for 32-bit systems
{$DEFINE CPU64} and {$DEFINE Windows64} for 64-bit systems

Thank you,
Amine Moulay Ramdane. |
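The parallel library itself is not reproduced in this digest, but the method described above is compact enough to show in full. Here is a minimal single-threaded textbook conjugate gradient in FreePascal on a small symmetric positive-definite test system; it is an illustrative reference sketch only, separate from the parallel library:

---
program cgdemo;
{$mode delphi}

const
  N = 3;

type
  TVec = array[1..N] of double;
  TMat = array[1..N, 1..N] of double;

{ y := A*x }
procedure MatVec(const A: TMat; const x: TVec; var y: TVec);
var
  i, j: integer;
begin
  for i := 1 to N do
  begin
    y[i] := 0.0;
    for j := 1 to N do
      y[i] := y[i] + A[i, j] * x[j];
  end;
end;

function Dot(const a, b: TVec): double;
var
  i: integer;
begin
  Result := 0.0;
  for i := 1 to N do
    Result := Result + a[i] * b[i];
end;

var
  A: TMat;
  b, x, r, p, Ap: TVec;
  alpha, beta, rsold, rsnew: double;
  i, iter: integer;
begin
  { A small symmetric positive-definite test system. }
  A[1,1] := 4; A[1,2] := 1; A[1,3] := 0;
  A[2,1] := 1; A[2,2] := 3; A[2,3] := 1;
  A[3,1] := 0; A[3,2] := 1; A[3,3] := 2;
  b[1] := 1; b[2] := 2; b[3] := 3;

  { Start from x = 0, so the initial residual r = b - A*x equals b,
    and the first search direction is the residual itself. }
  for i := 1 to N do
  begin
    x[i] := 0.0;
    r[i] := b[i];
    p[i] := r[i];
  end;
  rsold := Dot(r, r);

  { In exact arithmetic CG converges in at most N iterations. }
  for iter := 1 to N do
  begin
    MatVec(A, p, Ap);
    alpha := rsold / Dot(p, Ap);   { step length along p }
    for i := 1 to N do
    begin
      x[i] := x[i] + alpha * p[i];
      r[i] := r[i] - alpha * Ap[i];
    end;
    rsnew := Dot(r, r);
    if rsnew < 1e-20 then
      break;
    beta := rsnew / rsold;         { makes the new p A-conjugate to the old ones }
    for i := 1 to N do
      p[i] := r[i] + beta * p[i];  { new direction: residual plus correction }
    rsold := rsnew;
  end;

  for i := 1 to N do
    writeln('x[', i, '] = ', x[i]:0:6);
end.
---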
Ramine <ramine@1.1>: Dec 19 02:00PM -0800

Hello,

I have updated my Scalable Parallel implementation of Conjugate Gradient Linear System solver library, which is NUMA-aware and cache-aware, to version 1.22. In the previous version, the FIFO queues of the threadpools that I was using were not allocated on different NUMA nodes; I have enhanced that, and now each FIFO queue of a threadpool is allocated on a different NUMA node. The rest of my algorithm was already NUMA-aware and cache-aware, so now the whole algorithm is fully NUMA-aware and cache-aware, and it is scalable on multicores and on NUMA architectures of the x86 architecture.

Today, ladies and gentlemen, I will talk a little bit about my scalable conjugate gradient system solver library.

The important thing to understand is that it is NUMA-aware and scalable on NUMA architecture: since I am using two functions that multiply a matrix by a vector, I have used a mechanism to distribute the memory allocation of the rows of the matrix equally across the NUMA nodes, and I have made my algorithm cache-aware. Other than that, I have used a probabilistic mechanism to make it scalable on NUMA architecture; this probabilistic mechanism minimizes the contention points as much as possible and renders my algorithm fully scalable on NUMA architecture.

Hope you will be happy with my new scalable algorithm and my scalable parallel library. Frankly, I think I would have to write something like a PhD paper to explain my new scalable algorithm more fully, but I will leave it as it is for the moment... perhaps I will do that in the near future.

This scalable parallel library is especially designed for large-scale industrial engineering problems such as those found in industrial finite element problems. This scalable parallel library was ported to FreePascal and all the Delphi XE versions and even to Delphi 7; hope you will find it really good.

Below is a simulation program that uses the probabilistic mechanism I have talked about and that shows that my algorithm is scalable. If you look at my scalable parallel algorithm, it divides each array of the matrix into parts of 250 elements, and if you look carefully, the two functions that consume the greater part of the CPU time are atsub() and asub(); inside those functions I am using a probabilistic mechanism to render my algorithm scalable on NUMA architecture. What I am doing is scrambling the array parts using a probabilistic function, and I have noticed that this probabilistic mechanism is very efficient. To show you what I mean, please look at the following simulation that I have done, using a variable that contains the number of NUMA nodes; it gives almost perfect scalability on NUMA architecture. For example, give the "NUMA_nodes" variable a value of 4 and our array a size of 250: the simulation below will then report a number of contention points of about a quarter of the array. So if I am using 16 cores, in the worst case it will scale to about 4X throughput on NUMA architecture: with about a quarter of the array as contention points, roughly three quarters of the work is contention-free, and Amdahl's law gives a speedup of about 1/(0.25 + 0.75/16), which is roughly 3.4X on 16 cores, that is, almost 4X throughput on four NUMA nodes; and this will give almost perfect scalability on more and more NUMA nodes, so my parallel algorithm is scalable on NUMA architecture.
Here is the simulation that I have done; please run it and you will notice yourself that my parallel algorithm is scalable on NUMA architecture. Here it is:

---
program test;

uses
  math;

var
  tab, tab1, tab2: array of integer;
  a, n1, k, i, n2, tmp, j, numa_nodes: integer;

begin
  a := 250;
  numa_nodes := 4;
  j := 0;
  randomize;
  { Map each of the 250 array parts to a NUMA node, round-robin. }
  setlength(tab2, a);
  for i := 0 to a - 1 do
    tab2[i] := i mod numa_nodes;
  { First random scrambling of the part indices. }
  setlength(tab, a);
  for k := 0 to a - 1 do
    tab[k] := k;
  n2 := a - 1;
  for k := 0 to a - 1 do
  begin
    n1 := random(n2);
    tmp := tab[k];
    tab[k] := tab[n1];
    tab[n1] := tmp;
  end;
  { Second random scrambling of the part indices. }
  setlength(tab1, a);
  for k := 0 to a - 1 do
    tab1[k] := k;
  n2 := a - 1;
  for k := 0 to a - 1 do
  begin
    n1 := random(n2);
    tmp := tab1[k];
    tab1[k] := tab1[n1];
    tab1[n1] := tmp;
  end;
  { Count the positions where both scramblings land on the same NUMA node. }
  for i := 0 to a - 1 do
    if tab2[tab[i]] = tab2[tab1[i]] then
    begin
      inc(j);
      writeln('A contention at: ', i);
    end;
  writeln('Number of contention points: ', j);
  setlength(tab, 0);
  setlength(tab1, 0);
  setlength(tab2, 0);
end.
---

You can download my new Scalable Parallel implementation of Conjugate Gradient Linear System solver library version 1.22 from:

https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware

Thank you,
Amine Moulay Ramdane. |
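The post above says each threadpool FIFO queue is now allocated on a different NUMA node, but does not show the mechanism. On Windows, one standard way to express such a per-node preference is the Win32 call VirtualAllocExNuma (available since Windows Vista / Server 2008). The sketch below is illustrative only; the helper name AllocOnNode is made up for this example and is not part of the library:

---
program numasketch;
{$mode delphi}

uses
  Windows;

{ VirtualAllocExNuma commits memory with a preferred NUMA node
  for its physical pages; the node is a preference, not a guarantee. }
function VirtualAllocExNuma(hProcess: THandle; lpAddress: Pointer;
  dwSize: SIZE_T; flAllocationType, flProtect: DWORD;
  nndPreferred: DWORD): Pointer; stdcall;
  external kernel32 name 'VirtualAllocExNuma';

{ Hypothetical helper: allocate Size bytes preferring the given node. }
function AllocOnNode(Size: SIZE_T; Node: DWORD): Pointer;
begin
  Result := VirtualAllocExNuma(GetCurrentProcess, nil, Size,
    MEM_RESERVE or MEM_COMMIT, PAGE_READWRITE, Node);
end;

var
  q0, q1: Pointer;
begin
  { For example, one queue buffer on node 0 and one on node 1. }
  q0 := AllocOnNode(65536, 0);
  q1 := AllocOnNode(65536, 1);
  if (q0 <> nil) and (q1 <> nil) then
    writeln('Per-node buffers allocated');
  if q0 <> nil then VirtualFree(q0, 0, MEM_RELEASE);
  if q1 <> nil then VirtualFree(q1, 0, MEM_RELEASE);
end.
---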
Marcel Mueller <news.5.maazl@spamgourmet.org>: Dec 19 07:43PM +0100
> Ramine, Ramine, Ramine...
> Is this what freedom of speech really produces when applied with no limits?
Adjust your killfile.
Marcel |
yirgaatmail@gmail.com: Dec 19 10:33AM -0800 Some of you should shut your mouths. You can do it using a decompiler; search Google. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com. |