- cmsg cancel <n5498m$osu$2@dont-email.me> - 4 Updates
- About my new algorithm... - 1 Update
- A new algorithm of Parallel implementation of Conjugate Gradient Sparse Linear System Solver library - 2 Updates
- Scalable Parallel implementation of Conjugate Gradient Linear System solver library version 1.22 - 1 Update
- What the Hell is going on here? - 1 Update
- How do I look inside an .exe file to view the programming - 1 Update
bleachbot <bleachbot@httrack.com>: Dec 19 07:58PM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:15AM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:41AM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:42AM +0100 |
Ramine <ramine@1.1>: Dec 19 11:43PM -0800

Hello,

Read here: https://en.wikipedia.org/wiki/Sparse_matrix

As you will notice, it says:

"When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeroes. Sparse data is by nature more easily compressed and thus requires significantly less storage. Some very large sparse matrices are infeasible to manipulate using standard dense-matrix algorithms."

I have taken care of that in my new algorithm: I have used my ParallelHashList data structure to store the sparse matrices of the linear system, so that it becomes very fast and wastes nothing on the zeros; in fact, my new algorithm does not store the zeros of the sparse matrix of the linear system at all. And my new parallel algorithm that I have just implemented today is designed for sparse matrices of linear equations arising from industrial finite element problems and such.

Here is my new library of my new parallel algorithm:

https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver

Thank you,
Amine Moulay Ramdane. |
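The ParallelHashList data structure itself is not shown in the post above. Purely as a rough single-threaded sketch of the storage idea it describes (keep only the nonzeros; an absent entry reads as zero), something like the following FreePascal fragment would do, using the stock TDictionary from Generics.Collections (FPC 3.2+ or a modern Delphi). The TSparseMatrix class and its methods are invented for this illustration and are not the library's actual API:

---
program sparsesketch;
{$mode delphi}

uses
  SysUtils, Generics.Collections;

type
  { Only nonzero entries are stored, keyed by a 64-bit index
    computed from (row, col); everything absent reads as zero. }
  TSparseMatrix = class
  private
    FEntries: TDictionary<Int64, Double>;
    FCols: Integer;
    function Key(Row, Col: Integer): Int64;
  public
    constructor Create(Cols: Integer);
    destructor Destroy; override;
    procedure SetValue(Row, Col: Integer; V: Double);
    function GetValue(Row, Col: Integer): Double;
  end;

constructor TSparseMatrix.Create(Cols: Integer);
begin
  inherited Create;
  FCols := Cols;
  FEntries := TDictionary<Int64, Double>.Create;
end;

destructor TSparseMatrix.Destroy;
begin
  FEntries.Free;
  inherited;
end;

function TSparseMatrix.Key(Row, Col: Integer): Int64;
begin
  Result := Int64(Row) * FCols + Col;
end;

procedure TSparseMatrix.SetValue(Row, Col: Integer; V: Double);
begin
  if V = 0.0 then
    FEntries.Remove(Key(Row, Col))   { zeros are simply not kept }
  else
    FEntries.AddOrSetValue(Key(Row, Col), V);
end;

function TSparseMatrix.GetValue(Row, Col: Integer): Double;
begin
  if not FEntries.TryGetValue(Key(Row, Col), Result) then
    Result := 0.0;                   { absent entry means zero }
end;

var
  M: TSparseMatrix;
begin
  M := TSparseMatrix.Create(100000);
  M.SetValue(2, 3, 5.0);
  writeln(M.GetValue(2, 3):0:1);  { 5.0 }
  writeln(M.GetValue(7, 7):0:1);  { 0.0, nothing was stored for it }
  M.Free;
end.
---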
A new algorithm of Parallel implementation of Conjugate Gradient Sparse Linear System Solver library
Ramine <ramine@1.1>: Dec 19 11:16PM -0800

Hello,

I have just implemented today a new parallel algorithm for my Parallel implementation of Conjugate Gradient Sparse Linear System Solver library. This library is designed for sparse matrices of linear equations arising from industrial finite element problems and such, and my new parallel algorithm is cache-aware and very fast.

So as you have noticed, I have now implemented two parallel algorithms. One is cache-aware and NUMA-aware and is scalable on NUMA architecture; that scalable parallel algorithm is designed for the dense matrices that you find in linear equations arising from integral equation formulations. Here it is:

https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware

And my new parallel algorithm that I have just implemented today is designed for sparse matrices of linear equations arising from industrial finite element problems and such. Here is my new library of my new parallel algorithm:

https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver

Author: Amine Moulay Ramdane

Description:

I have come up with a new algorithm for my Parallel Conjugate Gradient sparse solver library; it has now become cache-aware. Note that this new cache-aware algorithm is more efficient on multicores: I have benchmarked it against my previous algorithm, and it has given a scalability of 5X on a quadcore over the single-threaded run of my previous algorithm; that's really a big improvement!

This parallel library is especially designed for large-scale industrial engineering problems such as those found in industrial finite element problems. This scalable parallel library was ported to FreePascal and all the Delphi XE versions and even to Delphi 7; hope you will find it really good.

The Parallel implementation of Conjugate Gradient Sparse Linear System Solver that I programmed here is designed to solve large sparse systems of linear equations where direct methods can exceed available machine memory and/or be extremely time-consuming. For example, the direct Gauss method takes O(n^2) operations in the back-substitution process and is dominated by the O(n^3) forward-elimination process. That means that if, for example, an operation takes 10^-9 second and we have 1000 equations, the elimination process of the Gauss algorithm (about (2/3)*n^3 operations) will take about 0.7 second; but if we have 10000 equations in the system, the elimination process will take about 11 minutes! This is why I have developed for you the Parallel implementation of Conjugate Gradient Sparse Linear System Solver in Object Pascal, which is very fast.

You have only one method to use, which is Solve():

function TParallelConjugateGradient.Solve(var A: arrarrext; var B, X: VECT; var RSQ: DOUBLE; nbr_iter: integer; show_iter: boolean): boolean;

The system: A*x = b

The important parameters of the Solve() method are: A is the matrix, B is the b vector, X the initial vector x, nbr_iter is the number of iterations that you want, and show_iter shows the iteration number on the screen. RSQ is the sum of the squares of the components of the residual vector A.x - b.

I have got over 5X scalability on a quad core.
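To make that calling convention concrete, a hypothetical usage fragment follows. TParallelConjugateGradient, arrarrext and VECT are the library's own types exactly as named above; the construction of the solver and the filling of A and B are only sketched, the iteration count 1000 is an arbitrary example value, and the meaning of the boolean result is an assumption, since the post does not spell it out:

---
var
  cg: TParallelConjugateGradient;
  A: arrarrext;    { the sparse system matrix, filled elsewhere }
  B, X: VECT;      { right-hand side b and initial guess x }
  RSQ: DOUBLE;     { on return: sum of squares of the residual A*x - b }
begin
  { ... create cg and fill A and B; set X to an initial guess ... }
  if cg.Solve(A, B, X, RSQ, 1000, true) then
    writeln('Converged; residual sum of squares = ', RSQ)
  else
    writeln('Stopped without reaching the requested accuracy');
end;
---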
The Conjugate Gradient Method is the most prominent iterative method for solving sparse systems of linear equations. Unfortunately, many textbook treatments of the topic are written with neither illustrations nor intuition, and their victims can be found to this day babbling senselessly in the corners of dusty libraries. For this reason, a deep, geometric understanding of the method has been reserved for the elite brilliant few who have painstakingly decoded the mumblings of their forebears.

Conjugate Gradient is the most popular iterative method for solving large systems of linear equations. CG is effective for systems of the form A.x = b, where x is an unknown vector, b is a known vector, and A is a known square, symmetric, positive-definite (or positive-indefinite) matrix. These systems arise in many important settings, such as finite difference and finite element methods for solving partial differential equations, structural analysis, circuit analysis, and math homework.

The Conjugate Gradient method can also be applied to non-linear problems, but with much less success, since non-linear functions can have multiple minima. The Conjugate Gradient method will indeed find a minimum of such a nonlinear function, but it is in no way guaranteed to be a global minimum, or the minimum that is desired. But the Conjugate Gradient method is a great iterative method for solving large, sparse linear systems with a symmetric positive-definite matrix.

In the method of conjugate gradients the residuals are not used directly as search directions, as in the steepest descent method, because that kind of search can require a large number of iterations: the residuals zigzag towards the minimum value for ill-conditioned matrices. Instead, the conjugate gradient method uses the residuals as a basis to form conjugate search directions. In this manner, the conjugated gradients (residuals) form a basis of search directions to minimize the quadratic function f(x) = (1/2)*Transpose(x)*A*x - Transpose(b)*x, whose minimizer solves A*x = b, and to achieve faster convergence: in exact arithmetic, at most N iterations, where N is the dimension of the system.

Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/
Operating systems: Windows, Mac OSX, Linux...
Required FPC switches: -O3 -Sd -dFPC -dFreePascal (-Sd for Delphi mode)
Required Delphi switches: -$H+ -DDelphi
{$DEFINE CPU32} and {$DEFINE Windows32} for 32-bit systems
{$DEFINE CPU64} and {$DEFINE Windows64} for 64-bit systems

Thank you,
Amine Moulay Ramdane. |
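The parallel library itself is not reproduced in this digest, but the method described above is compact enough to show in full. Here is a minimal single-threaded textbook conjugate gradient in FreePascal on a small symmetric positive-definite test system; it is an illustrative reference sketch only, separate from the parallel library:

---
program cgdemo;
{$mode delphi}

const
  N = 3;

type
  TVec = array[1..N] of double;
  TMat = array[1..N, 1..N] of double;

{ y := A*x }
procedure MatVec(const A: TMat; const x: TVec; var y: TVec);
var
  i, j: integer;
begin
  for i := 1 to N do
  begin
    y[i] := 0.0;
    for j := 1 to N do
      y[i] := y[i] + A[i, j] * x[j];
  end;
end;

function Dot(const a, b: TVec): double;
var
  i: integer;
begin
  Result := 0.0;
  for i := 1 to N do
    Result := Result + a[i] * b[i];
end;

var
  A: TMat;
  b, x, r, p, Ap: TVec;
  alpha, beta, rsold, rsnew: double;
  i, iter: integer;
begin
  { A small symmetric positive-definite test system. }
  A[1,1] := 4; A[1,2] := 1; A[1,3] := 0;
  A[2,1] := 1; A[2,2] := 3; A[2,3] := 1;
  A[3,1] := 0; A[3,2] := 1; A[3,3] := 2;
  b[1] := 1; b[2] := 2; b[3] := 3;

  { Start from x = 0, so the initial residual r = b - A*x equals b,
    and the first search direction is the residual itself. }
  for i := 1 to N do
  begin
    x[i] := 0.0;
    r[i] := b[i];
    p[i] := r[i];
  end;
  rsold := Dot(r, r);

  { In exact arithmetic CG converges in at most N iterations. }
  for iter := 1 to N do
  begin
    MatVec(A, p, Ap);
    alpha := rsold / Dot(p, Ap);   { step length along p }
    for i := 1 to N do
    begin
      x[i] := x[i] + alpha * p[i];
      r[i] := r[i] - alpha * Ap[i];
    end;
    rsnew := Dot(r, r);
    if rsnew < 1e-20 then
      break;
    beta := rsnew / rsold;         { makes the new p A-conjugate to the old ones }
    for i := 1 to N do
      p[i] := r[i] + beta * p[i];  { new direction: residual plus correction }
    rsold := rsnew;
  end;

  for i := 1 to N do
    writeln('x[', i, '] = ', x[i]:0:6);
end.
---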
Ramine <ramine@1.1>: Dec 19 02:00PM -0800

Hello,

I have updated my Scalable Parallel implementation of Conjugate Gradient Linear System solver library, which is NUMA-aware and cache-aware, to version 1.22. In the previous version, the FIFO queues of the threadpools that I was using were not allocated on different NUMA nodes; I have enhanced that, and now each FIFO queue of a threadpool is allocated on a different NUMA node. The rest of my algorithm was already NUMA-aware and cache-aware, so now the whole algorithm is fully NUMA-aware and cache-aware, and it is scalable on multicores and on NUMA architectures of the x86 architecture.

Today, ladies and gentlemen, I will talk a little bit about my scalable conjugate gradient system solver library.

The important thing to understand is that it is NUMA-aware and scalable on NUMA architecture: since I am using two functions that multiply a matrix by a vector, I have used a mechanism to distribute the memory allocation of the rows of the matrix equally across the NUMA nodes, and I have made my algorithm cache-aware. Other than that, I have used a probabilistic mechanism to make it scalable on NUMA architecture; this probabilistic mechanism minimizes the contention points as much as possible and renders my algorithm fully scalable on NUMA architecture.

Hope you will be happy with my new scalable algorithm and my scalable parallel library. Frankly, I think I would have to write something like a PhD paper to explain my new scalable algorithm more fully, but I will leave it as it is for the moment... perhaps I will do that in the near future.

This scalable parallel library is especially designed for large-scale industrial engineering problems such as those found in industrial finite element problems. This scalable parallel library was ported to FreePascal and all the Delphi XE versions and even to Delphi 7; hope you will find it really good.

Below is a simulation program that uses the probabilistic mechanism I have talked about and that shows that my algorithm is scalable. If you look at my scalable parallel algorithm, it divides each array of the matrix into parts of 250 elements, and if you look carefully, the two functions that consume the greater part of the CPU time are atsub() and asub(); inside those functions I am using a probabilistic mechanism to render my algorithm scalable on NUMA architecture. What I am doing is scrambling the array parts using a probabilistic function, and I have noticed that this probabilistic mechanism is very efficient. To show you what I mean, please look at the following simulation that I have done, using a variable that contains the number of NUMA nodes; it gives almost perfect scalability on NUMA architecture. For example, give the "NUMA_nodes" variable a value of 4 and our array a size of 250: the simulation below will then report a number of contention points of about a quarter of the array. So if I am using 16 cores, in the worst case it will scale to about 4X throughput on NUMA architecture: with about a quarter of the array as contention points, roughly three quarters of the work is contention-free, and Amdahl's law gives a speedup of about 1/(0.25 + 0.75/16), which is roughly 3.4X on 16 cores, that is, almost 4X throughput on four NUMA nodes; and this will give almost perfect scalability on more and more NUMA nodes, so my parallel algorithm is scalable on NUMA architecture.
Here is the simulation that I have done; please run it and you will notice yourself that my parallel algorithm is scalable on NUMA architecture. Here it is:

---
program test;

uses
  math;

var
  tab, tab1, tab2: array of integer;
  a, n1, k, i, n2, tmp, j, numa_nodes: integer;

begin
  a := 250;
  numa_nodes := 4;
  j := 0;
  randomize;
  { Map each of the 250 array parts to a NUMA node, round-robin. }
  setlength(tab2, a);
  for i := 0 to a - 1 do
    tab2[i] := i mod numa_nodes;
  { First random scrambling of the part indices. }
  setlength(tab, a);
  for k := 0 to a - 1 do
    tab[k] := k;
  n2 := a - 1;
  for k := 0 to a - 1 do
  begin
    n1 := random(n2);
    tmp := tab[k];
    tab[k] := tab[n1];
    tab[n1] := tmp;
  end;
  { Second random scrambling of the part indices. }
  setlength(tab1, a);
  for k := 0 to a - 1 do
    tab1[k] := k;
  n2 := a - 1;
  for k := 0 to a - 1 do
  begin
    n1 := random(n2);
    tmp := tab1[k];
    tab1[k] := tab1[n1];
    tab1[n1] := tmp;
  end;
  { Count the positions where both scramblings land on the same NUMA node. }
  for i := 0 to a - 1 do
    if tab2[tab[i]] = tab2[tab1[i]] then
    begin
      inc(j);
      writeln('A contention at: ', i);
    end;
  writeln('Number of contention points: ', j);
  setlength(tab, 0);
  setlength(tab1, 0);
  setlength(tab2, 0);
end.
---

You can download my new Scalable Parallel implementation of Conjugate Gradient Linear System solver library version 1.22 from:

https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware

Thank you,
Amine Moulay Ramdane. |
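The post above says each threadpool FIFO queue is now allocated on a different NUMA node, but does not show the mechanism. On Windows, one standard way to express such a per-node preference is the Win32 call VirtualAllocExNuma (available since Windows Vista / Server 2008). The sketch below is illustrative only; the helper name AllocOnNode is made up for this example and is not part of the library:

---
program numasketch;
{$mode delphi}

uses
  Windows;

{ VirtualAllocExNuma commits memory with a preferred NUMA node
  for its physical pages; the node is a preference, not a guarantee. }
function VirtualAllocExNuma(hProcess: THandle; lpAddress: Pointer;
  dwSize: SIZE_T; flAllocationType, flProtect: DWORD;
  nndPreferred: DWORD): Pointer; stdcall;
  external kernel32 name 'VirtualAllocExNuma';

{ Hypothetical helper: allocate Size bytes preferring the given node. }
function AllocOnNode(Size: SIZE_T; Node: DWORD): Pointer;
begin
  Result := VirtualAllocExNuma(GetCurrentProcess, nil, Size,
    MEM_RESERVE or MEM_COMMIT, PAGE_READWRITE, Node);
end;

var
  q0, q1: Pointer;
begin
  { For example, one queue buffer on node 0 and one on node 1. }
  q0 := AllocOnNode(65536, 0);
  q1 := AllocOnNode(65536, 1);
  if (q0 <> nil) and (q1 <> nil) then
    writeln('Per-node buffers allocated');
  if q0 <> nil then VirtualFree(q0, 0, MEM_RELEASE);
  if q1 <> nil then VirtualFree(q1, 0, MEM_RELEASE);
end.
---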
Marcel Mueller <news.5.maazl@spamgourmet.org>: Dec 19 07:43PM +0100
> Ramine, Ramine, Ramine...
> Is this what freedom of speech really produces when applied with no limits?
Adjust your killfile.
Marcel |
yirgaatmail@gmail.com: Dec 19 10:33AM -0800 Some of you should shut your mouths. You can do it using a decompiler; search Google. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com. |