Thursday, December 8, 2022

Digest for comp.programming.threads@googlegroups.com - 3 updates in 3 topics

Amine Moulay Ramdane <aminer68@gmail.com>: Dec 07 03:13PM -0800

Hello,
 
 
 
 
More of my philosophy about how the entrepreneurial spirit is alive and well in the United States and more of my thoughts..
 
I am a white Arab from Morocco, and I think I am smart since I have also
invented many scalable algorithms and other algorithms.
 
 
I invite you to read the following interesting article. It shows what economists call 'creative destruction,' wherein new innovation springs up because of the failure of particular industries or businesses, as in the time of Covid-19. It also shows that despite the risks of opening a business during a global pandemic, new data from the U.S. Census Bureau reveals that the entrepreneurial spirit is alive and well in the United States.
 
Visualizing America's Entrepreneurial Spirit During COVID-19
 
Read more here:
 
https://www.visualcapitalist.com/visualizing-americas-entrepreneurial-spirit-during-covid-19/
 
 
More of my philosophy about who will win the World Cup quarterfinal and more of my thoughts..
 
I think I am highly smart: I have passed two certified IQ tests and scored above 115 IQ, and I mean that it is "above" 115 IQ. I have just looked at Portugal's results, so look for example at the following World Cup results:
 
Portugal 3-2 Ghana
 
Korea Republic 2-1 Portugal
 
So I am noticing a pattern with my fluid intelligence: from the scores below, Portugal is not good in defense, since it ranks ninth in defense. Notice how Ghana and Korea Republic scored against Portugal even though they are much weaker in offense than Morocco. Notice also in the scores below that Portugal is almost as good in offense as Morocco, but Morocco ranks second in defense. So I say that Morocco has to be more offensive while staying good in defense in order to win, and I think that Morocco will win the World Cup quarterfinal.
 
Read here (and translate it into English):
 
https://olympics.com/fr/infos/maroc-portugal-quarts-de-finale-coupe-du-monde
 
So look at the following scores to see more of the pattern that I am talking about:
 
https://www.sportytrader.com/favoris/coupe-monde-football/
 
And read my previous thoughts:
 
More of my philosophy about Morocco football team nicknamed "the Lions of the Atlas"..
 
Morocco beat Spain on penalties to reach the World Cup quarterfinals. But I think that Morocco can also win the World Cup quarterfinal, since it ranks second in defense and also has one of the best goalkeepers in the world, so it is very good in defense, and it ranks sixth in offense. You can read carefully about those scores on the following webpage (and you can translate the webpage into English):
 
https://www.sportytrader.com/favoris/coupe-monde-football/
 
 
And I invite you to listen to the following beautiful Moroccan Arabic song about the Morocco football team, nicknamed "the Lions of the Atlas":
 
https://www.youtube.com/watch?v=rY0OYsKoTBA
 
 
 
Thank you,
Amine Moulay Ramdane.
Amine Moulay Ramdane <aminer68@gmail.com>: Dec 07 11:52AM -0800

Hello,
 
 
 
 
More of my philosophy about who will win the World Cup quarterfinal and more of my thoughts..
 
I am a white Arab from Morocco, and I think I am smart since I have also
invented many scalable algorithms and other algorithms.
 
 
I think I am highly smart: I have passed two certified IQ tests and scored above 115 IQ, and I mean that it is "above" 115 IQ. I have just looked at Portugal's results, so look for example at the following World Cup results:
 
Portugal 3-2 Ghana
 
Korea Republic 2-1 Portugal
 
So I am noticing a pattern with my fluid intelligence: from the scores below, Portugal is not good in defense, since it ranks ninth in defense. Notice how Ghana and Korea Republic scored against Portugal even though they are much weaker in offense than Morocco. Notice also in the scores below that Portugal is almost as good in offense as Morocco, but Morocco ranks second in defense. So I say that Morocco has to be more offensive while staying good in defense in order to win, and I think that Morocco will win the World Cup quarterfinal.
 
Read here (and translate it into English):
 
https://olympics.com/fr/infos/maroc-portugal-quarts-de-finale-coupe-du-monde
 
So look at the following scores to see more of the pattern that I am talking about:
 
https://www.sportytrader.com/favoris/coupe-monde-football/
 
And read my previous thoughts:
 
More of my philosophy about Morocco football team nicknamed "the Lions of the Atlas"..
 
Morocco beat Spain on penalties to reach the World Cup quarterfinals. But I think that Morocco can also win the World Cup quarterfinal, since it ranks second in defense and also has one of the best goalkeepers in the world, so it is very good in defense, and it ranks sixth in offense. You can read carefully about those scores on the following webpage (and you can translate the webpage into English):
 
https://www.sportytrader.com/favoris/coupe-monde-football/
 
 
And I invite you to listen to the following beautiful Moroccan Arabic song about the Morocco football team, nicknamed "the Lions of the Atlas":
 
https://www.youtube.com/watch?v=rY0OYsKoTBA
 
 
 
Thank you,
Amine Moulay Ramdane.
Amine Moulay Ramdane <aminer68@gmail.com>: Dec 07 10:27AM -0800

Hello,
 
 
 
More of my philosophy about how bad CXL memory latency is and about technology and more of my thoughts..
 
I am a white Arab from Morocco, and I think I am smart since I have also
invented many scalable algorithms and other algorithms.
 
 
HOW BAD IS THE CXL MEMORY LATENCY REALLY?
 
"If the folks at Astera are to be believed, the latency isn't as bad as you might think. The company's Leo CXL memory controllers are designed to accept standard DDR5 memory DIMMs up to 5600 MT/sec. They claim customers can expect latencies roughly on par with accessing memory on a second CPU, one NUMA hop away. This puts it in the neighborhood of 170 nanoseconds to 250 nanoseconds. In fact, as far as the system is concerned, that's exactly how these memory modules show up to the operating system."
 
Read more here:
 
https://www.nextplatform.com/2022/12/05/just-how-bad-is-cxl-memory-latency/
 
 
CXL memory pools: Just how big can they be?
 
 
Read more here:
 
https://blocksandfiles.com/2022/07/07/cxl-memory-pools-size/
 
 
 
More of my philosophy about technology and about Intel technology and more of my thoughts..
 
 
Intel says it will squeeze 1 trillion transistors onto a chip package by 2030
 
"Intel Corp. researchers this weekend revealed a number of technological innovations and concepts, including packaging improvements that could result in computer chips that are 10 times as powerful as today's most advanced silicon."
 
 
Read more here:
 
https://siliconangle.com/2022/12/04/intel-says-will-squeeze-1-trillion-transistors-onto-chip-package-2030/
 
 
More of my philosophy about the 12 memory channels of
the new AMD Epyc Genoa CPU and more of my thoughts..
 
I am a white Arab from Morocco, and I think I am smart since I have also
invented many scalable algorithms and other algorithms.
 
 
So as I am saying below, I think that in order to use in parallel the 12 memory channels that the new AMD Genoa CPU supports, the GMI-Wide mode must be widened so that it connects each CCD with more GMI links. I think that this is what AMD is doing in its new 4 CCDs configuration, even with the cost-optimized Epyc Genoa 9124 (16 cores, 64 MB of L3 cache, 4 Core Complex Dies (CCDs)), which costs around $1000 (look at it here: https://www.tomshardware.com/reviews/amd-4th-gen-epyc-genoa-9654-9554-and-9374f-review-96-cores-zen-4-and-5nm-disrupt-the-data-center ).

As I explain in more detail below, the Core Complex Dies (CCDs) connect to memory, I/O, and each other through the I/O Die (IOD). Each CCD connects to the IOD via a dedicated high-speed Global Memory Interconnect (GMI) link. The IOD also contains the memory channels, PCIe Gen5 lanes, and Infinity Fabric links, and all dies, or chiplets, interconnect with each other via AMD's Infinity Fabric Technology. Of course this will permit my new software project, the Parallel C++ Conjugate Gradient Linear System Solver Library that scales very well, to scale on the 12 memory channels. Read my following thoughts to understand more about it:
 
More of my philosophy about the new Zen 4 AMD Ryzen™ 9 7950X and more of my thoughts..
 
 
So I have just looked at the new Zen 4 AMD Ryzen™ 9 7950X CPU, and I invite you to look at it here:
 
https://www.amd.com/en/products/cpu/amd-ryzen-9-7950x
 
But notice carefully that the problem is the number of supported memory channels: it supports just two memory channels, which is not good. For example, my open source software project, the Parallel C++ Conjugate Gradient Linear System Solver Library, scales around 8X on my 16-core Intel Xeon with 2 NUMA nodes and 8 memory channels, but it will not scale correctly on the new Zen 4 AMD Ryzen™ 9 7950X CPU with just 2 memory channels, since it is memory-bound. Here is my powerful open source Parallel C++ Conjugate Gradient Linear System Solver Library that scales very well, and I invite you to take a careful look at it:
 
https://sites.google.com/site/scalable68/scalable-parallel-c-conjugate-gradient-linear-system-solver-library
 
So I advise you to buy an AMD Epyc CPU or an Intel Xeon CPU that supports 8 memory channels.
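
The memory-bound argument above can be illustrated with a rough bandwidth-ceiling model. This is only a sketch under stated assumptions: the function name and the per-channel and per-core bandwidth figures are hypothetical placeholders, not measurements of the library or of any CPU.

```python
def bandwidth_bound_speedup(cores, channels, gb_s_per_channel, gb_s_per_core):
    """Upper bound on speedup for a memory-bound kernel.

    Each core is assumed to need gb_s_per_core of memory bandwidth to run
    at full speed, so the speedup is capped both by the core count and by
    how many such cores the memory channels can feed.
    """
    cores_fed_by_memory = channels * gb_s_per_channel / gb_s_per_core
    return min(float(cores), cores_fed_by_memory)


# Hypothetical figures, chosen only for illustration.
PER_CHANNEL = 20.0  # GB/s delivered by one memory channel (assumed)
PER_CORE = 20.0     # GB/s one core needs to avoid stalling (assumed)

for channels in (2, 8, 12):
    bound = bandwidth_bound_speedup(16, channels, PER_CHANNEL, PER_CORE)
    print(f"16 cores, {channels} channels: at most {bound}X")
```

With these placeholder values the bound on 16 cores is 2X with 2 channels, 8X with 8 channels and 12X with 12 channels. Real figures depend on the memory type, the access pattern and the caches.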
 
---
 
 
And of course you can use the upcoming twelve DDR5 memory channels of the Zen 4 AMD EPYC CPUs to scale my above algorithm further; read about it here:
 
https://www.tomshardware.com/news/amd-confirms-12-ddr5-memory-channels-on-genoa
 
 
And here is the simulation program that uses the probabilistic mechanism that I have talked about and that shows that the algorithm of my Parallel C++ Conjugate Gradient Linear System Solver Library is scalable:
 
If you look at my scalable parallel algorithm, it divides each array of the matrix by 250 elements. If you look carefully, two functions consume the greater part of the CPU time, atsub() and asub(), and inside those functions I use a probabilistic mechanism to make my algorithm scalable on NUMA architecture, and it also makes it scale on the memory channels. What I am doing is scrambling the array parts using a probabilistic function, and I have noticed that this probabilistic mechanism is very efficient.

To prove what I am saying, please look at the following simulation, which uses a variable that contains the number of NUMA nodes; I have noticed that my simulation gives almost perfect scalability on NUMA architecture. For example, let us give the "NUMA_nodes" variable a value of 4 and our array a size of 250: the simulation below will give a number of contention points of about a quarter of the array. So if I am using 16 cores, in the worst case it will scale to 4X throughput on NUMA architecture: since I am using an array of 250 and a quarter of the array are contention points, Amdahl's law gives a scalability of almost 4X throughput on four NUMA nodes, and this will give almost perfect scalability on more and more NUMA nodes. So my parallel algorithm is scalable on NUMA architecture and it also scales well on the memory channels.
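
The Amdahl's law step above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the contended quarter of the array is treated as the serial fraction (0.25); that reading, and the function name, are assumptions made for illustration.

```python
def amdahl_speedup(serial_fraction, workers):
    """Amdahl's law: speedup when only (1 - serial_fraction) of the work scales."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# Assumed: the contended quarter of the array behaves as the serial part.
serial_fraction = 0.25

for workers in (4, 16, 64, 1024):
    print(workers, round(amdahl_speedup(serial_fraction, workers), 2))
```

Under this reading the speedup with 16 workers is about 3.37X, and it approaches but never exceeds 1 / 0.25 = 4X as the number of workers grows.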
 
Here is the simulation that I have done; please run it and you will see for yourself that my parallel algorithm is scalable on NUMA architecture.
 
Here it is:
 
---
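program test;

{ Simulates how often two independently scrambled orderings of the
  array parts land on the same NUMA node at the same position. }

uses math;

var tab,tab1,tab2:array of integer;
a,n1,k,i,n2,tmp,j,numa_nodes:integer;
begin

a:=250;
numa_nodes:=4;
j:=0; { contention counter }

{ tab2[i] is the NUMA node that holds part i (round-robin placement) }
setlength(tab2,a);

for i:=0 to a-1
do
begin

tab2[i]:=i mod numa_nodes;

end;

{ tab: first scrambled ordering of the parts }
setlength(tab,a);

randomize; { seed once; reseeding later could repeat the same shuffle }

for k:=0 to a-1
do tab[k]:=k;

n2:=a-1;

for k:=0 to a-1
do
begin
n1:=random(n2);
tmp:=tab[k];
tab[k]:=tab[n1];
tab[n1]:=tmp;
end;

{ tab1: second, independent scrambled ordering }
setlength(tab1,a);

for k:=0 to a-1
do tab1[k]:=k;

n2:=a-1;

for k:=0 to a-1
do
begin
n1:=random(n2);
tmp:=tab1[k];
tab1[k]:=tab1[n1];
tab1[n1]:=tmp;
end;

{ a contention is a position where both orderings hit the same node }
for i:=0 to a-1
do
if tab2[tab[i]]=tab2[tab1[i]] then
begin
inc(j);
writeln('A contention at: ',i);

end;

writeln('Number of contention points: ',j);
setlength(tab,0);
setlength(tab1,0);
setlength(tab2,0);
end.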
---
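
For readers without a Pascal compiler, the same experiment can be sketched in Python and averaged over many runs. This is an illustrative port, not the library's code: the function name is hypothetical, and it swaps the hand-written swap loop for Python's `random.shuffle` (a uniform shuffle), which is an assumption about the intent of the scrambling step.

```python
import random


def count_contentions(parts=250, numa_nodes=4, rng=random):
    """Count positions where two independently shuffled part orders
    land on the same NUMA node (node = part index mod numa_nodes)."""
    node_of = [p % numa_nodes for p in range(parts)]
    first = list(range(parts))
    second = list(range(parts))
    rng.shuffle(first)   # uniform shuffle of the first ordering
    rng.shuffle(second)  # independent shuffle of the second ordering
    return sum(node_of[x] == node_of[y] for x, y in zip(first, second))


rng = random.Random(12345)  # fixed seed so runs are repeatable
trials = 2000
average = sum(count_contentions(rng=rng) for _ in range(trials)) / trials
print(f"average contention points over {trials} trials: {average:.1f}")
```

The average should land near 250 / 4 = 62.5, since two independent positions fall on the same one of 4 nodes about a quarter of the time.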
 
 
More of my philosophy about 4 CCDs configuration of AMD Epyc Genoa CPU and more of my thoughts..
 
 
I have just read the following new article about the AMD 4th Gen EPYC 9004 Series, so I invite you to read it carefully:
 
https://hothardware.com/reviews/amd-genoa-data-center-cpu-launch
 
 
So read carefully about the 4 CCDs configuration; here is what I understand
from it:
 
 
The I/O die is what is connected to the memory channels externally, and the article says that SKUs north of 4 CCDs (e.g. 32 cores) use the GMI3-Narrow configuration with a single GMI link per CCD. With 4 CCD and lower SKUs, AMD can implement GMI-Wide mode which joins each CCD to the IOD with two GMI links. In this case, one link of each CCD populates GMI0 to GMI3 while the other link of each CCD populates GMI8 to GMI11, as diagrammed in the article. This helps these parts better balance against I/O demands.
 
So I think that AMD implemented the GMI-Wide mode in its new 4 CCDs configuration, joining each CCD to the IOD with two GMI links, so that the CCDs can be connected to the 12 memory channels externally and use them in parallel. So I think that the problem is solved, since I think that the cost-optimized Epyc Genoa 9124 (16 cores, 64 MB of L3 cache, 4 Core Complex Dies (CCDs)), which costs around $1000 (look at it here: https://www.tomshardware.com/reviews/amd-4th-gen-epyc-genoa-9654-9554-and-9374f-review-96-cores-zen-4-and-5nm-disrupt-the-data-center ), can fully use the 12 memory channels in parallel, so it is a good Epyc Genoa processor to buy.
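
The GMI-Wide layout described above can be written down as a small table. This is a sketch for illustration only: the port ranges (GMI0 to GMI3 and GMI8 to GMI11) come from the quoted article, while the function name and the pairing of CCD i with ports i and 8 + i are assumptions.

```python
def gmi_wide_links(num_ccds=4):
    """Map each CCD to its two GMI ports in GMI-Wide mode.

    Pairing CCD i with GMI(i) and GMI(8 + i) is an assumed layout; the
    article only states which port ranges the two links populate.
    """
    if not 1 <= num_ccds <= 4:
        raise ValueError("GMI-Wide is described only for SKUs with 4 CCDs or fewer")
    return {ccd: (f"GMI{ccd}", f"GMI{8 + ccd}") for ccd in range(num_ccds)}


links = gmi_wide_links(4)
for ccd, ports in links.items():
    print(f"CCD{ccd}: {ports[0]} + {ports[1]}")
print("total GMI links:", sum(len(ports) for ports in links.values()))
```

For a 4 CCD part this gives 8 GMI links in total, two per CCD, compared with one link per CCD in the GMI3-Narrow configuration.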
 
And of course I invite you to read the following:
 
More of my philosophy about the new Epyc Genoa and about Core Complex Die (CCD) and Core-complex(CCX) and more of my thoughts..
 
I have just looked at the following paper from AMD and I invite
you to look at it:
 
https://developer.amd.com/wp-content/resources/56827-1-0.pdf
 
As you can see from it, you have to look at how many
Core Complex Dies (CCDs) you have, since that tells you more
about how many Infinity Fabric connections you have. This is
important information, so look at the following article
about the new AMD Epyc Genoa:
 
https://wccftech.com/amd-epyc-genoa-cpu-lineup-specs-benchmarks-leak-up-to-2-6x-faster-than-intel-xeon/
 
 
And read my philosophy about quantum computers and about technology in my thoughts at the following web link:
 
https://groups.google.com/g/alt.culture.morocco/c/tQ0Cs6Nw1yc
 
 
And you can read much more of my thoughts about technology at the following web links:
 
 
https://groups.google.com/g/alt.culture.morocco/c/MosH5fY4g_Y
 
And here:
 
https://groups.google.com/g/soc.culture.usa/c/N_UxX3OECX4
 
 
 
Thank you,
Amine Moulay Ramdane.