- More of my philosophy about Asynchronous programming and about the futures and about the ActiveObject and about technology and more of my thoughts.. - 1 Update
- More of my philosophy about my new updated implementation of a future and about the ActiveObject and about technology and more of my thoughts.. - 1 Update
- More of my philosophy about future and ActiveObject and about technology and more of my thoughts.. - 1 Update
- More of my philosophy about technology and about my implementation of a future and more of my thoughts.. - 1 Update
Amine Moulay Ramdane <aminer68@gmail.com>: Nov 02 05:02PM -0700

Hello,

More of my philosophy about Asynchronous programming and about the futures and about the ActiveObject and about technology and more of my thoughts..

I am a white arab, and I think I am smart since I have also invented many scalable algorithms and other algorithms.. I think I am highly smart since I have passed two certified IQ tests and I have scored "above" 115 IQ. I think that from my new implementation of a future below you can notice that asynchronous programming is not a simple task, since it can get very complicated: you can notice in my implementation below that if I moved the starting of the future's thread out of the constructor, and if I moved the passing of the parameter (as a pointer to the future) out of the constructor, it would become harder to get the automaton of how to use and call the methods right and safe. So I think that there is still a problem with asynchronous programming, and it is that when you have many asynchronous tasks or threads it can get really complex, and I think that this is the weakness of asynchronous programming, and of course I am also speaking of the implementation of a sophisticated ActiveObject or of a future or of complex asynchronous programming in general.
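Only to illustrate what an ActiveObject means in this discussion, here is a small generic sketch in FreePascal of an active object that executes posted method calls one after the other on its own internal thread. It is not the code of my Threadpool engine with priorities; the names TSimpleActiveObject and TJobProc are invented just for this sketch, and the polling back-off is only there to keep the sketch short:

--

{$mode objfpc}{$H+}
program ActiveObjectSketch;

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils, SyncObjs;

type
  TJobProc = procedure of object;  // a parameterless method run on the active object's thread

  { Minimal active object: Post() enqueues a call, a private thread runs the calls in FIFO order. }
  TSimpleActiveObject = class(TThread)
  private
    FLock: TCriticalSection;
    FQueue: array of TJobProc;
  protected
    procedure Execute; override;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Post(AJob: TJobProc);  // returns immediately; the call runs later on the internal thread
  end;

constructor TSimpleActiveObject.Create;
begin
  FLock := TCriticalSection.Create;
  inherited Create(False);   // start the internal thread right away
end;

destructor TSimpleActiveObject.Destroy;
begin
  Terminate;
  WaitFor;                   // join the internal thread before freeing anything
  FLock.Free;
  inherited Destroy;
end;

procedure TSimpleActiveObject.Post(AJob: TJobProc);
begin
  FLock.Acquire;
  try
    SetLength(FQueue, Length(FQueue) + 1);
    FQueue[High(FQueue)] := AJob;
  finally
    FLock.Release;
  end;
end;

procedure TSimpleActiveObject.Execute;
var
  Job: TJobProc;
  HasJob: Boolean;
  i: Integer;
begin
  while not Terminated do
  begin
    HasJob := False;
    FLock.Acquire;
    try
      if Length(FQueue) > 0 then
      begin
        Job := FQueue[0];
        for i := 0 to High(FQueue) - 1 do
          FQueue[i] := FQueue[i + 1];
        SetLength(FQueue, Length(FQueue) - 1);
        HasJob := True;
      end;
    finally
      FLock.Release;
    end;
    if HasJob then
      Job()       // execute the posted call serially, on this thread
    else
      Sleep(1);   // idle back-off; a real engine would block on an event instead of polling
  end;
end;

type
  TDemo = class
    procedure SayHello;
  end;

procedure TDemo.SayHello;
begin
  writeln('running inside the active object thread');
end;

var
  ao: TSimpleActiveObject;
  demo: TDemo;
begin
  demo := TDemo.Create;
  ao := TSimpleActiveObject.Create;
  ao.Post(@demo.SayHello);  // the call is serialized on the active object's own thread
  Sleep(200);               // crude wait so the job has time to run (illustration only)
  ao.Free;
  demo.Free;
end.

---

A real engine, like my Threadpool engine with priorities below, of course does much more: it blocks on events instead of polling, it supports priorities, and it lets you pass the methods or functions and their parameters.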
Amine Moulay Ramdane <aminer68@gmail.com>: Nov 02 04:23PM -0700

Hello,

More of my philosophy about my new updated implementation of a future and about the ActiveObject and about technology and more of my thoughts..

I am a white arab, and I think I am smart since I have also invented many scalable algorithms and other algorithms.. I think I am highly smart since I have passed two certified IQ tests and I have scored "above" 115 IQ. I have just updated my implementation of a future, and now both the starting of the future's thread and the passing of the parameter (as a pointer to the future) are done from the constructor, so that the automaton of how to use and call the methods is safe. I have also just added support for exceptions: you have to know that programming with futures is asynchronous programming, but to be robust a future implementation has to deal correctly with "exceptions", so in my implementation of a future, when an exception is raised inside the future you will receive the exception. I have implemented two things for that: the HasException() method, so that you can detect that an exception was raised from inside the future, and the ExceptionStr property, in which the exception and its address are returned as a string. My implementation of a future does of course support passing a parameter as a pointer to the future, and it works on both Windows and Linux. And of course you can also use my following Threadpool engine with priorities as a sophisticated ActiveObject or such, and pass your methods or functions and their parameters to it, here it is:

Threadpool engine with priorities

https://sites.google.com/site/scalable68/threadpool-engine-with-priorities

And stay tuned since I will enhance my above Threadpool engine with priorities more.

So you can download my new updated portable and efficient implementation of a future in Delphi and FreePascal, version 1.32, from my website here:

https://sites.google.com/site/scalable68/a-portable-and-efficient-implementation-of-a-future-in-delphi-and-freepascal

And here is a new example program of how to use my implementation of a future in Delphi and FreePascal; notice that the interface has changed a little bit:

--

program TestFuture;

uses
  system.SysUtils, system.Classes, Futures;

type
  TTestFuture1 = class(TFuture)
  public
    function Compute(ptr: pointer): Variant; override;
  end;

  TTestFuture2 = class(TFuture)
  public
    function Compute(ptr: pointer): Variant; override;
  end;

var
  obj1: TTestFuture1;
  obj2: TTestFuture2;
  a: variant;

function TTestFuture1.Compute(ptr: pointer): Variant;
begin
  raise Exception.Create('I raised an exception');
end;

function TTestFuture2.Compute(ptr: pointer): Variant;
begin
  writeln(nativeint(ptr));
  result := 'Hello world !';
end;

begin
  writeln;
  obj1 := TTestFuture1.create(pointer(12));
  if obj1.GetValue(a) then
    writeln(a)
  else if obj1.HasException then
    writeln(obj1.ExceptionStr);
  obj1.free;

  writeln;
  obj2 := TTestFuture2.create(pointer(12));
  if obj2.GetValue(a) then
    writeln(a);
  obj2.free;
end.

---
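So that you can see roughly what happens behind this kind of interface, here is a small simplified sketch in FreePascal of how a future class with this interface could be structured: the constructor stores the pointer parameter and starts an internal thread, the thread runs Compute() inside a try/except, and GetValue() waits and returns False when an exception was captured. It is only an illustrative sketch, it is not the source code of my Futures unit, and the names FutureSketch and TSketchFuture are invented for the example:

--

{$mode delphi}
unit FutureSketch; // hypothetical unit name, for illustration only

interface

uses
  Classes, SysUtils, Variants;

type
  { A simplified future: the constructor receives the parameter and starts the thread. }
  TSketchFuture = class(TThread)
  private
    FParam: Pointer;
    FValue: Variant;
    FHasException: Boolean;
    FExceptionStr: string;
  protected
    procedure Execute; override;
  public
    constructor Create(AParam: Pointer);
    function Compute(ptr: Pointer): Variant; virtual; abstract;
    function GetValue(out AValue: Variant): Boolean; // waits; False if an exception was raised
    function HasException: Boolean;
    property ExceptionStr: string read FExceptionStr;
  end;

implementation

constructor TSketchFuture.Create(AParam: Pointer);
begin
  FParam := AParam;
  FHasException := False;
  inherited Create(False); // the parameter is stored and the thread is started by the constructor
end;

procedure TSketchFuture.Execute;
begin
  try
    FValue := Compute(FParam);  // the user's code runs here, on the future's thread
  except
    on E: Exception do
    begin
      FHasException := True;
      // a real implementation would also report the exception address; here only the message
      FExceptionStr := E.ClassName + ': ' + E.Message;
    end;
  end;
end;

function TSketchFuture.GetValue(out AValue: Variant): Boolean;
begin
  WaitFor;                      // block until Compute() has finished
  AValue := FValue;
  Result := not FHasException;
end;

function TSketchFuture.HasException: Boolean;
begin
  WaitFor;
  Result := FHasException;
end;

end.

---

A real implementation has more to take care of, but the usage automaton is the same as in the example above: construct with the pointer parameter, call GetValue(), and check HasException() and ExceptionStr when GetValue() returns False.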
Amine Moulay Ramdane <aminer68@gmail.com>: Nov 02 12:24PM -0700

Hello,

More of my philosophy about future and ActiveObject and about technology and more of my thoughts..

I am a white arab, and I think I am smart since I have also invented many scalable algorithms and other algorithms.. I think I am highly smart since I have passed two certified IQ tests and I have scored "above" 115 IQ. Notice that my implementation of a future below does not start the future's thread and does not pass the parameter (as a pointer to the future) from the constructor, which would make the system of the automaton of how to use and call the methods safer; I have left it as it is, since using the future is not so difficult. But of course you can use my following Threadpool engine with priorities as a more sophisticated ActiveObject or such, and pass your methods or functions and their parameters to it, here it is:

Threadpool engine with priorities

https://sites.google.com/site/scalable68/threadpool-engine-with-priorities

And stay tuned since I will enhance my above Threadpool engine with priorities more.
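Only to make the idea of the automaton of how to use and call the methods more concrete, here is a small sketch of the kind of state checking that a future with separate SetParameter() and Start() steps could add, so that a wrong call order raises a clear error instead of silently doing the wrong thing. TStateCheckedFuture, the state names and the checks are invented for this illustration; they are not part of my implementation, which, as I said above, does not need them since using it is not so difficult:

--

{$mode delphi}
unit FutureStateSketch; // hypothetical unit, for illustration only

interface

uses
  SysUtils, Variants;

type
  // The legal call order is: Create -> SetParameter -> Start -> GetValue -> Free.
  TFutureState = (fsCreated, fsParameterSet, fsStarted, fsFinished);

  TStateCheckedFuture = class
  private
    FState: TFutureState;
    FParam: Pointer;
  public
    constructor Create;
    procedure SetParameter(AParam: Pointer);
    procedure Start;
    function GetValue(out AValue: Variant): Boolean;
  end;

implementation

constructor TStateCheckedFuture.Create;
begin
  FState := fsCreated;
end;

procedure TStateCheckedFuture.SetParameter(AParam: Pointer);
begin
  if FState <> fsCreated then
    raise Exception.Create('SetParameter must be called once, before Start');
  FParam := AParam;
  FState := fsParameterSet;
end;

procedure TStateCheckedFuture.Start;
begin
  if FState <> fsParameterSet then
    raise Exception.Create('Start must be called after SetParameter');
  // ... a real future would start here the thread that runs Compute(FParam) ...
  FState := fsStarted;
end;

function TStateCheckedFuture.GetValue(out AValue: Variant): Boolean;
begin
  if FState <> fsStarted then
    raise Exception.Create('GetValue must be called after Start');
  // ... a real future would wait here for the computed result ...
  AValue := Null;   // placeholder: this sketch does no real computation
  Result := True;
  FState := fsFinished;
end;

end.

---

Moving the parameter and the thread start into the constructor, as in version 1.32 below and above, removes most of these states, which is exactly why it makes the automaton safer.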
Amine Moulay Ramdane <aminer68@gmail.com>: Nov 02 09:23AM -0700

Hello,

More of my philosophy about technology and about my implementation of a future and more of my thoughts..

I am a white arab, and I think I am smart since I have also invented many scalable algorithms and other algorithms..

My portable and efficient implementation of a future in Delphi and FreePascal was updated to version 1.31.

I have just added support for exceptions: you have to know that programming with futures is asynchronous programming, but to be robust a future implementation has to deal correctly with "exceptions", so in my implementation of a future, when an exception is raised inside the future you will receive the exception. I have implemented two things for that: the HasException() method, so that you can detect that an exception was raised from inside the future, and the ExceptionStr property, in which the exception and its address are returned as a string. My implementation of a future does of course support passing a parameter as a pointer to the future, and it works on both Windows and Linux.

You can download my portable and efficient implementation of a future in Delphi and FreePascal, version 1.31, from my website here:

https://sites.google.com/site/scalable68/a-portable-and-efficient-implementation-of-a-future-in-delphi-and-freepascal

And here is an example program of how to use my implementation of a future in Delphi and FreePascal:

--

program TestFuture;

uses
  system.SysUtils, system.Classes, Futures;

type
  TTestFuture1 = class(TFuture)
  public
    function Compute(ptr: pointer): Variant; override;
  end;

  TTestFuture2 = class(TFuture)
  public
    function Compute(ptr: pointer): Variant; override;
  end;

var
  obj1: TTestFuture1;
  obj2: TTestFuture2;
  a: variant;

function TTestFuture1.Compute(ptr: pointer): Variant;
begin
  raise Exception.Create('I raised an exception');
end;

function TTestFuture2.Compute(ptr: pointer): Variant;
begin
  writeln(nativeint(ptr));
  result := 'Hello world !';
end;

begin
  writeln;
  obj1 := TTestFuture1.create();
  obj1.SetParameter(pointer(12));
  obj1.Start;
  if obj1.GetValue(a) then
    writeln(a)
  else if obj1.HasException then
    writeln(obj1.ExceptionStr);
  obj1.free;

  writeln;
  obj2 := TTestFuture2.create();
  obj2.SetParameter(pointer(12));
  obj2.Start;
  if obj2.GetValue(a) then
    writeln(a);
  obj2.free;
end.

---

More of my philosophy about quantum computing and about matrix operations and about scalability and more of my thoughts..

I think I am highly smart since I have passed two certified IQ tests and I have scored "above" 115 IQ. I have just looked at the following video about the powerful parallel quantum computer of IBM from the USA that will soon be available in the cloud, and I invite you to look at it:

Quantum Computing: Now Widely Available!
https://www.youtube.com/watch?v=laqpfQ8-jFI

But I have just read the following paper, and it says that powerful quantum algorithms for matrix operations and linear systems of equations are available; as you can notice in the paper, many matrix operations and also a solver for linear systems of equations can be run on a quantum computer. Read about it here:

Quantum algorithms for matrix operations and linear systems of equations

Read more here: https://arxiv.org/pdf/2202.04888.pdf

So I think that IBM will do the same for their powerful parallel quantum computer that will be available in the cloud, but I think that you will have to pay for it, of course, since I think it will be commercial. But I think that there is a weakness with this kind of configuration of the powerful parallel quantum computer from IBM: the cost of internet bandwidth is decreasing exponentially, but the latency of accessing the internet is not. That is why I think that people will still use classical computers for many mathematical applications that use operations such as matrix operations and linear systems of equations and that need a much lower latency. Other than that, Moore's law will still be effective in classical computers, since it will permit us to have really powerful classical computers at a low cost, and that will be really practical, since a quantum computer is big in size and not so practical; read about the two inventions below that would make logic gates thousands of times, or even a million times, faster than those in existing computers. So I think that the business of classical computers will still be great in the future, even with the coming of the powerful parallel quantum computer of IBM. As you notice, this kind of business does not depend only on Moore's law and Bezos' Law, it also depends on the latency of accessing the internet, so read my following thoughts about Moore's law and about Bezos' Law:

More of my philosophy about Moore's law and about Bezos' Law..

For RAM chips and flash memory, Moore's Law means that in eighteen months you'll pay the same price as today for twice as much storage. But other computing components are also seeing their price-versus-performance curves improve exponentially. Data storage doubles every twelve months.

More about Moore's law and about Bezos' Law..

"Parallel code is the recipe for unlocking Moore's Law"

And:

"BEZOS' LAW: The Cost of Cloud Computing will be cut in half every 18 months - Bezos' Law

Like Moore's law, Bezos' Law is about exponential improvement over time. If you look at AWS history, they drop prices constantly. In 2013 alone they've already had 9 price drops. The difference, however, between Bezos' and Moore's law is this: Bezos' law is the first law that isn't anchored in technical innovation. Rather, Bezos' law is anchored in confidence and market dynamics, and will only hold true so long as Amazon is not the aggregate dominant force in Cloud Computing (50%+ market share). Monopolies don't cut prices."

More of my philosophy about latency and contention and concurrency and parallelism and more of my thoughts..
I think I am highly smart, and I have just posted (read it below) about the two new inventions that would make logic gates thousands of times, or even a million times, faster than those in existing computers, and I think that there is still a problem with those new inventions, and it is about latency and concurrency. You need concurrency, and you need preemptive or non-preemptive scheduling of the coroutines; and since HBM is about 106.7 ns in latency, DDR4 is about 73.3 ns in latency, and the AMD 3D V-Cache has almost the same cost in latency, you can notice that this kind of latency is still costly. There is also the latency of the time slice that a coroutine takes to execute, and it is costly too, since this kind of latency and time slice is a waiting time that looks like the time wasted in contention in parallelism; so, by logical analogy, this kind of latency and time slice creates something like the contention in parallelism that reduces scalability, and I think that is why those new inventions have this kind of limit or constraint in a "concurrency" environment. And I invite you to read my following smart thoughts about preemptive and non-preemptive timesharing:

https://groups.google.com/g/alt.culture.morocco/c/JuC4jar661w

More of my philosophy about Fastest-ever logic gates and more of my thoughts..

"Logic gates are the fundamental building blocks of computers, and researchers at the University of Rochester have now developed the fastest ones ever created. By zapping graphene and gold with laser pulses, the new logic gates are a million times faster than those in existing computers, demonstrating the viability of "lightwave electronics." If these kinds of lightwave electronic devices ever do make it to market, they could be millions of times faster than today's computers. Currently we measure processing speeds in Gigahertz (GHz), but these new logic gates function on the scale of Petahertz (PHz). Previous studies have set that as the absolute quantum limit of how fast light-based computer systems could possibly get."

Read more here: https://newatlas.com/electronics/fastest-ever-logic-gates-computers-million-times-faster-petahertz/

Read my following news:

And with the following new discovery computers and phones could run thousands of times faster..

Prof Alan Dalton, in the School of Mathematical and Physics Sciences at the University of Sussex, said:

"We're mechanically creating kinks in a layer of graphene. It's a bit like nano-origami. Using these nanomaterials will make our computer chips smaller and faster. It is absolutely critical that this happens as computer manufacturers are now at the limit of what they can do with traditional semiconducting technology. Ultimately, this will make our computers and phones thousands of times faster in the future. This kind of technology -- "straintronics" using nanomaterials as opposed to electronics -- allows space for more chips inside any device. Everything we want to do with computers -- to speed them up -- can be done by crinkling graphene like this."

Dr Manoj Tripathi, Research Fellow in Nano-structured Materials at the University of Sussex and lead author on the paper, said:

"Instead of having to add foreign materials into a device, we've shown we can create structures from graphene and other 2D materials simply by adding deliberate kinks into the structure. By making this sort of corrugation we can create a smart electronic component, like a transistor, or a logic gate."
The development is a greener, more sustainable technology. Because no additional materials need to be added, and because this process works at room temperature rather than at high temperature, it uses less energy to create.

Read more here: https://www.sciencedaily.com/releases/2021/02/210216100141.htm

But I think that mass production of graphene still hasn't quite begun, so I think the two inventions above, the fastest-ever logic gates that use graphene and the nanomaterial one that also uses graphene, will not be fully commercialized until perhaps around the year 2035 or 2040 or so; read the following so that you understand why:

"Because large-scale mass production of graphene still hasn't quite begun, the market is a bit limited. However, that leaves a lot of room open for investors to get in before it reaches commercialization. The market was worth $78.7 million in 2019 and, according to Grand View Research, is expected to rise drastically to $1.08 billion by 2027. North America currently has the bulk of market share, but the Asia-Pacific area is expected to have the quickest growth in adoption of graphene uses in coming years. North America and Europe are also expected to have above-market average growth. The biggest driver of all this growth is expected to be the push for cleaner, more efficient energy sources and the global reduction of emissions in the air."

Read more here: https://www.energyandcapital.com/report/the-worlds-next-rare-earth-metal/1600

And of course you can read my thoughts about technology in the following web link:

https://groups.google.com/g/soc.culture.usa/c/N_UxX3OECX4

More of my philosophy about matrix-matrix multiplication and about scalability and more of my thoughts..

I think that the time complexity of the Strassen algorithm for matrix-matrix multiplication is around O(N^2.8074), and the time complexity of the naive algorithm is O(N^3), so it is not a significant difference. So I think I will soon implement the parallel blocked matrix-matrix multiplication, and I will implement it with a new algorithm that also uses Intel AVX-512 and fused multiply-add, and of course it will use the assembler instructions below of prefetching into caches so as to gain a 22% speed-up, so I think that overall it will have around the same speed as parallel BLAS. And I say that pipelining greatly increases throughput in modern CPUs such as x86 CPUs, and another common pipelining scenario is the FMA, or fused multiply-add, which is a fundamental part of the instruction set for some processors. The basic load-operate-store sequence simply lengthens by one step to become load-multiply-add-store. The FMA is possible only if the hardware supports it, as it does in the case of the Intel Xeon Phi, for example, as well as in Skylake, etc.

More of my philosophy about matrix-vector multiplication of large matrices and about scalability and more of my thoughts..

The matrix-vector multiplication of large matrices is completely limited by the memory bandwidth, as I have just said (read it below), so vector extensions like SSE or AVX are usually not necessary for matrix-vector multiplication of large matrices. It is interesting that matrix-matrix multiplications don't have this kind of problem with memory bandwidth.
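To illustrate why the matrix-vector product is memory-bandwidth bound, here is a small sketch in FreePascal of the plain double-precision matrix-vector multiplication y = A*x: every matrix element is loaded exactly once and used for only one multiply-add, so the ratio of arithmetic to memory traffic is tiny and, for large matrices, the memory system rather than SSE or AVX sets the speed limit. It is only an illustration, not code from my libraries, and MatVec and the small driver program are names invented for the sketch:

--

{$mode objfpc}{$H+}
program MatVecSketch;

uses
  SysUtils;

type
  TVec = array of Double;
  TMat = array of TVec;   // A[i] is row i

// y := A * x for an n x n matrix: about 2*n*n flops, but also n*n matrix
// elements read from memory, so roughly 0.25 flop per byte in double precision.
procedure MatVec(const A: TMat; const x: TVec; var y: TVec);
var
  i, j, n: Integer;
  s: Double;
begin
  n := Length(A);
  SetLength(y, n);
  for i := 0 to n - 1 do
  begin
    s := 0.0;
    for j := 0 to n - 1 do
      s := s + A[i][j] * x[j];   // A[i][j] is touched once and never reused
    y[i] := s;
  end;
end;

var
  A: TMat;
  x, y: TVec;
  i, j, n: Integer;
begin
  n := 1000;
  SetLength(A, n);
  SetLength(x, n);
  for i := 0 to n - 1 do
  begin
    SetLength(A[i], n);
    x[i] := 1.0;
    for j := 0 to n - 1 do
      A[i][j] := 1.0;
  end;
  MatVec(A, x, y);
  writeln(y[0]:0:1);   // prints 1000.0
end.

---

A blocked matrix-matrix multiplication, by contrast, can reuse each loaded block many times from the cache, which is why it is not limited in the same way by the memory bandwidth.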
Companies like Intel or AMD typically show benchmarks of matrix-matrix multiplications, and they show how nicely those scale on many more cores, but they never show matrix-vector multiplications. And notice that my powerful open source software project, the Parallel C++ Conjugate Gradient Linear System Solver Library that scales very well, is also memory-bound, and the matrices for it are usually big, but my new algorithm for it is efficiently cache-aware and efficiently NUMA-aware, and I have implemented it for both dense and sparse matrices.

More of my philosophy about the efficient matrix-vector multiplication algorithm in MPI and about scalability and more of my thoughts..

Matrix-vector multiplication is an absolutely fundamental operation, with countless applications in computer science and scientific computing, so efficient algorithms for matrix-vector multiplication are of paramount importance. Notice that for matrix-vector multiplication, n^2 time is certainly required for an n × n dense matrix, but you have to be smart, since MPI computing, also on the exascale supercomputer systems, doesn't only take this n^2 time into account: the algorithm also has to be efficiently cache-aware, and it has to have a good complexity for how much memory is used by the parallel processes in MPI. Notice carefully with me that you must not send both a whole row of the matrix and the whole vector to each parallel process of MPI; you have to know how to reduce this complexity efficiently, for example by dividing each row of the matrix and by dividing the vector, and sending a part of the row of the matrix and a part of the vector to each parallel process of MPI. And I think that in an efficient algorithm for matrix-vector multiplication, the time for the additions is dominated by the communication time. And of course my implementation of my powerful open source software, the Parallel C++ Conjugate Gradient Linear System Solver Library that scales very well, is also smart, since it is efficiently cache-aware and efficiently NUMA-aware, and it implements both the dense and the sparse cases. And of course, as I am showing below, it scales well on the memory channels, so it scales well on my 16-core dual Xeon with 8 memory channels, as I am showing below, and it will scale well on the 16-socket HPE NONSTOP X SYSTEMS or the 16-socket HPE Integrity Superdome X with above 512 cores and with 64 memory channels. So I invite you to read carefully and to download my open source software project of a Parallel C++ Conjugate Gradient Linear System Solver Library that scales very well from my website here:

https://sites.google.com/site/scalable68/scalable-parallel-c-conjugate-gradient-linear-system-solver-library

MPI will continue to be a viable programming model on exascale supercomputer systems, so I will soon implement |
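To make the partitioning idea above a little more concrete, here is a small sketch in FreePascal that only computes the block decomposition: it splits the n entries of each matrix row and the n entries of the vector into contiguous chunks, one chunk per parallel process, so that each process receives only a part of the row and the matching part of the vector, computes its partial sums, and the partial results are then added across processes, which is where the communication and addition time mentioned above comes in. The sketch does not call any real MPI binding (in MPI you would combine such a decomposition with calls like MPI_Scatterv and MPI_Reduce), and RowPartitionSketch, FirstIndex and ChunkSize are names invented for the illustration:

--

{$mode objfpc}{$H+}
program RowPartitionSketch;

uses
  SysUtils;

// First index owned by process "rank" when n items are split into nprocs
// contiguous chunks, spreading the remainder over the first (n mod nprocs) ranks.
function FirstIndex(n, nprocs, rank: Integer): Integer;
begin
  Result := (n div nprocs) * rank;
  if rank < (n mod nprocs) then
    Result := Result + rank
  else
    Result := Result + (n mod nprocs);
end;

function ChunkSize(n, nprocs, rank: Integer): Integer;
begin
  Result := FirstIndex(n, nprocs, rank + 1) - FirstIndex(n, nprocs, rank);
end;

var
  n, nprocs, rank: Integer;
begin
  n := 10;        // matrix dimension (n x n) and vector length
  nprocs := 4;    // number of parallel processes
  // Each process gets ChunkSize(...) entries of each matrix row and the matching
  // ChunkSize(...) entries of the vector, instead of a whole row plus the whole vector.
  for rank := 0 to nprocs - 1 do
    writeln('process ', rank, ': columns and vector entries ',
            FirstIndex(n, nprocs, rank), ' .. ',
            FirstIndex(n, nprocs, rank) + ChunkSize(n, nprocs, rank) - 1);
end.

---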
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.programming.threads+unsubscribe@googlegroups.com. |