- About the Active object pattern.. - 1 Update
- What about garbage collection? - 1 Update
- Disadvantages of Actor model - 2 Updates
- More about Hardware transactional memory - 1 Update
- Disadvantages of Transactional memory - 1 Update
- #define - 2 Updates
Horizon68 <horizon@horizon.com>: Jul 04 02:44PM -0700

Hello.. About the Active object pattern..

I think the proxy and scheduler of the Active object pattern are embellishments, not essential. The core of the idea is simply a queue of closures executed on a different thread (or threads) from the client's, and you will notice that you can do the same thing as the Active object pattern, and more, by using my powerful "invention": an efficient threadpool engine with priorities that scales very well, which you can download from here:

https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well

This threadpool of mine is really powerful because it scales very well on multicore and NUMA systems, and it also comes with a ParallelFor() that scales very well on multicore and NUMA systems. Here is the explanation of my ParallelFor():

procedure ParallelFor(nMin, nMax:integer; aProc: TParallelProc; GrainSize:integer=1; Ptr:pointer=nil; pmode:TParallelMode=pmBlocking; Priority:TPriorities=NORMAL_PRIORITY);

The nMin and nMax parameters of ParallelFor() are the minimum and maximum integer values of the loop variable, aProc is the procedure to call, and the GrainSize integer parameter works as follows: the grainsize sets a minimum threshold for parallelization. A rule of thumb is that grainsize iterations should take at least 100,000 clock cycles to execute. For example, if a single iteration takes 100 clocks, then the grainsize needs to be at least 1000 iterations. When in doubt, do the following experiment:

1- Set the grainsize parameter higher than necessary. The grainsize is specified in units of loop iterations. If you have no idea how many clock cycles an iteration might take, start with grainsize=100,000. The rationale is that each iteration normally requires at least one clock. In most cases, step 3 will guide you to a much smaller value.

2- Run your algorithm.

3- Iteratively halve the grainsize parameter and see how much the algorithm slows down or speeds up as the value decreases.

A drawback of setting a grainsize too high is that it can reduce parallelism. For example, if the grainsize is 1000 and the loop has 2000 iterations, the ParallelFor() method distributes the loop across only two processors, even if more are available (a sketch of this behavior follows below).

You can pass a parameter in Ptr as a pointer to ParallelFor(), and you can set the pmode parameter to pmBlocking so that ParallelFor() is blocking, or to pmNonBlocking so that it is non-blocking; the Priority parameter is the priority of ParallelFor(). Look inside the test.pas example to see how to use it.

Thank you, Amine Moulay Ramdane. |
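As a rough illustration of the grainsize rule described above, here is a minimal sketch in portable C++ of how a grainsize threshold caps the number of workers. This is not the poster's Delphi/FreePascal Threadpool code: the function parallel_for below, its blocking behavior, and its chunking scheme are assumptions made purely for illustration.

#include <algorithm>
#include <atomic>
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>

// Minimal grainsize-based parallel for: a sketch, not the poster's code.
void parallel_for(int n_min, int n_max, int grain_size,
                  const std::function<void(int)>& body) {
    const int total = n_max - n_min + 1;
    // The grainsize caps the number of chunks: with grainsize 1000 and
    // 2000 iterations there are only two chunks, whatever the core count.
    const int max_chunks = std::max(1, total / grain_size);
    const int hw = static_cast<int>(
        std::max(1u, std::thread::hardware_concurrency()));
    const int workers = std::min(max_chunks, hw);
    const int chunk = (total + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        const int lo = n_min + w * chunk;
        const int hi = std::min(n_max, lo + chunk - 1);
        pool.emplace_back([lo, hi, &body] {
            for (int i = lo; i <= hi; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();  // blocking, like the pmBlocking mode
}

int main() {
    std::atomic<long> sum{0};
    parallel_for(1, 2000, 1000, [&](int i) { sum += i; });
    std::printf("sum = %ld\n", sum.load());  // prints sum = 2001000
}

With nMin=1, nMax=2000 and a grainsize of 1000, the sketch spawns only two workers regardless of the core count, which is exactly the drawback described above.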
Horizon68 <horizon@horizon.com>: Jul 04 02:41PM -0700

Hello, read this:

What about garbage collection?

Read what this serious specialist called Chris Lattner said:

"One thing that I don't think is debatable is that the heap compaction behavior of a GC (which is what provides the heap fragmentation win) is incredibly hostile for cache (because it cycles the entire memory space of the process) and performance predictability."

"Not relying on GC enables Swift to be used in domains that don't want it - think boot loaders, kernels, real time systems like audio processing, etc."

"GC also has several *huge* disadvantages that are usually glossed over: while it is true that modern GC's can provide high performance, they can only do that when they are granted *much* more memory than the process is actually using. Generally, unless you give the GC 3-4x more memory than is needed, you'll get thrashing and incredibly poor performance. Additionally, since the sweep pass touches almost all RAM in the process, they tend to be very power inefficient (leading to reduced battery life)."

Read more here: https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160208/009422.html

Here is Chris Lattner's homepage: http://nondot.org/sabre/

And here is Chris Lattner's resume: http://nondot.org/sabre/Resume.html#Tesla

This is why I have invented the following scalable algorithm and its implementation, which makes Delphi and FreePascal more powerful: my scalable reference counting with efficient support for weak references, version 1.35.

I have just updated my scalable reference counting with efficient support for weak references to version 1.35: I have added a TAMInterfacedPersistent that is a scalable reference counted version, and now I think I have made it complete and powerful. I did this because I have just read the following web page:

https://www.codeproject.com/Articles/1252175/Fixing-Delphis-Interface-Limitations

But I don't agree with the writeup on that web page, because I think you have to understand the "spirit" of Delphi. Here is why: a component is supposed to be owned and destroyed by something else, "typically" a form (and "typically" means: in "most" cases, and this is the most important thing to understand). In that scenario, reference counting is not used. If you passed a component as an interface reference, it would be very unfortunate if it were destroyed when the method returns. Therefore, reference counting in TComponent has been removed. That is also why I have added TAMInterfacedPersistent to my invention.

To use scalable reference counting with Delphi and FreePascal, just replace TInterfacedObject with my TAMInterfacedObject, the scalable reference counted version, and replace TInterfacedPersistent with my TAMInterfacedPersistent, the scalable reference counted version. You will find both my TAMInterfacedObject and my TAMInterfacedPersistent inside the AMInterfacedObject.pas file. To learn how to use weak references, please take a look at the demo called example.dpr that I have included, and at the tutorial about weak references inside my zip file. To learn how to use delegation, take a look at the demo called test_delegation.pas that I have included, and at the tutorial inside my zip file that teaches you how to use delegation.

I think my scalable reference counting with efficient support for weak references is stable and fast. It works on both Windows and Linux, and it scales on multicore and NUMA systems. You will not find it in C++ or Rust, and I don't think you will find it anywhere. You have to know that this invention of mine solves the problem of dangling pointers and the problem of memory leaks, and my scalable reference counting is "scalable". And please read the readme file inside the zip file, which I have just extended to make you understand more.

You can download my new scalable reference counting with efficient support for weak references version 1.35 from:

https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references

Thank you, Amine Moulay Ramdane. |
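For readers who want the C++ comparison: the standard library does ship plain (single shared counter, non-distributed) reference counting with weak references, as std::shared_ptr and std::weak_ptr. The sketch below shows only that standard mechanism, namely how a weak reference avoids dangling access and breaks cycles; it says nothing about the scalable per-core counting the poster describes.

#include <iostream>
#include <memory>

// Plain C++ reference counting with a weak back-reference, for comparison.
// std::shared_ptr keeps one shared atomic counter per object; it is not
// the distributed, NUMA-friendly counting the poster calls "scalable".
struct Node {
    std::weak_ptr<Node> parent;   // weak: does not keep the parent alive
    std::shared_ptr<Node> child;  // strong: keeps the child alive
};

int main() {
    auto parent = std::make_shared<Node>();
    parent->child = std::make_shared<Node>();
    parent->child->parent = parent;  // weak back-edge: no reference cycle

    // A weak reference must be promoted before use; lock() returns an
    // empty pointer once the object is gone, instead of dangling.
    if (auto p = parent->child->parent.lock())
        std::cout << "parent is still alive\n";

    parent.reset();  // destroys both nodes: no leak despite the back-edge
}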
Horizon68 <horizon@horizon.com>: Jul 04 01:45PM -0700

Hello...

Disadvantages of the Actor model:

1- Not all languages easily enforce immutability. Erlang, the language that first popularized actors, has immutability at its core, but Java and Scala (or rather the JVM) do not enforce immutability.

2- Still pretty complex. Actors are based on an asynchronous model of programming, which is not so straightforward and easy to model in all scenarios; it is particularly difficult to handle errors and failure scenarios.

3- Does not prevent deadlock or starvation. Two actors can each be waiting for a message from the other; then you have a deadlock, just as with locks, although it is much easier to debug. With transactional memory, however, you are guaranteed to be deadlock-free. (A small sketch of this circular wait follows below.)

4- Not so efficient. Because of enforced immutability, and because many actors have to take turns running on the same thread, actors won't be as efficient as lock-based concurrency.

Conclusion: lock-based concurrency is the most efficient.

More about the Message Passing and Shared Memory process communication models: an advantage of the shared memory model is that memory communication is faster than in the message passing model on the same machine. However, the shared memory model may create problems such as synchronization and memory protection that need to be addressed. Message passing's major flaw is the inversion of control: it is a moral equivalent of gotos in unstructured programming (it's about time somebody said that message passing is considered harmful). Also, some research shows that the total effort to write an MPI application is significantly higher than that required to write a shared-memory version of it.

Thank you, Amine Moulay Ramdane. |
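To make point 3 concrete, here is a small self-contained C++ sketch, an assumed model of two actors with blocking mailboxes rather than code from any particular actor library, in which each actor waits for a message from the other before replying. A timeout stands in for a real deadlock detector so the demo terminates.

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>

// A minimal blocking mailbox, standing in for an actor's message queue.
class Mailbox {
    std::queue<std::string> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void send(std::string msg) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(msg)); }
        cv_.notify_one();
    }
    // Returns nothing if no message arrives in time: our "deadlock" signal.
    std::optional<std::string> receive(std::chrono::seconds timeout) {
        std::unique_lock<std::mutex> lk(m_);
        if (!cv_.wait_for(lk, timeout, [&] { return !q_.empty(); }))
            return std::nullopt;
        std::string msg = std::move(q_.front());
        q_.pop();
        return msg;
    }
};

int main() {
    Mailbox a, b;
    // Each actor receives first, then replies: a circular wait, so
    // neither mailbox ever gets a message and both time out.
    std::thread actorA([&] {
        if (auto m = a.receive(std::chrono::seconds(1))) b.send("reply");
        else std::cout << "actor A: gave up waiting (deadlock)\n";
    });
    std::thread actorB([&] {
        if (auto m = b.receive(std::chrono::seconds(1))) a.send("reply");
        else std::cout << "actor B: gave up waiting (deadlock)\n";
    });
    actorA.join();
    actorB.join();
}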
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jul 04 10:29PM +0100

On 04/07/2019 21:45, Horizon68 wrote:
> Message passing's major flaw is the inversion of control: it is a moral
> equivalent of gotos in unstructured programming (it's about time somebody
> said that message passing is considered harmful).

Inversion of control is a good thing, not a bad thing, and when combined with message passing, complex systems can be easily reduced to a collection of simpler sub-systems. Such systems can be considered orthogonal to concurrent or data oriented systems, i.e. the two approaches can be used together if the design of the system (and the system designer) isn't fucktarded. So, no, message passing is NOT considered harmful; only a fucktard would think that.

/Flibble

--
"Snakes didn't evolve, instead talking snakes with legs changed into snakes." - Rick C. Hodgin

"You won't burn in hell. But be nice anyway." - Ricky Gervais

"I see Atheists are fighting and killing each other again, over who doesn't believe in any God the most. Oh, no..wait.. that never happens." - Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are confronted by God," Bryne asked on his show The Meaning of Life. "What will Stephen Fry say to him, her, or it?" "I'd say, bone cancer in children? What's that about?" Fry replied. "How dare you? How dare you create a world to which there is such misery that is not our fault. It's not right, it's utterly, utterly evil." "Why should I respect a capricious, mean-minded, stupid God who creates a world that is so full of injustice and pain. That's what I would say." |
Horizon68 <horizon@horizon.com>: Jul 04 02:23PM -0700

Hello,

More about hardware transactional memory, and now about the disadvantages of Intel TSX. Here is something interesting to read about this hardware transactional memory:

TSX does not guarantee forward progress, so there must always be a fallback non-TSX pathway. (Complex transactions might always abort, even without any contention, because they overflow the speculation buffer. Even transactions that could run in theory might livelock forever if you don't have the right pauses to allow forward progress, so the fallback path is needed then too.)

TSX works by keeping a speculative set of registers and processor state. It tracks all reads done in the speculation block, and enqueues all writes to be delayed until the transaction ends.

The memory tracking of the transaction is currently done using the L1 cache and the standard cache line protocols. This means contention is only detected at cache line granularity, so you have the standard "false sharing" issue. If your transaction reads a cache line, then any write to that cache line by another core causes the transaction to abort (reads by other cores do not cause an abort). If your transaction writes a cache line, then any read or write by another core causes the transaction to abort. If your transaction aborts, then any cache lines written are evicted from L1. If any of the cache lines involved in the transaction are evicted during the transaction (e.g. if you touch too much memory, or another core locks that line), the transaction is aborted. TSX seems to allow quite a large working set (up to the size of L1?). Obviously, the more memory you touch, the more likely you are to abort due to contention.

Obviously you will also get aborts from anything "funny" that's not just plain code and memory access: context switches, IO, kernel calls, etc. will abort transactions.

At the moment, TSX is quite slow: even if there's no contention and you don't do anything in the block, there's a lot of overhead. Using TSX naively may slow down even threaded code. Getting significant performance gains from it is non-trivial. (A minimal fallback-path sketch follows below.)

Read more here: http://cbloomrants.blogspot.ca/2014/11/11-12-14-intel-tsx-notes.html

Thank you, Amine Moulay Ramdane. |
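The mandatory fallback path can be made concrete with the RTM intrinsics from <immintrin.h> (_xbegin, _xend, _xabort). The sketch below is the common lock-elision pattern, not code from the linked article; the retry count and the spinlock are arbitrary choices, and it needs a TSX-capable CPU plus a compiler flag such as -mrtm.

#include <immintrin.h>  // _xbegin/_xend/_xabort/_mm_pause (build with -mrtm)
#include <atomic>
#include <thread>

std::atomic<bool> fallback_taken{false};  // the non-TSX pathway's lock
long counter = 0;

void lock_fallback() {
    while (fallback_taken.exchange(true, std::memory_order_acquire))
        _mm_pause();
}
void unlock_fallback() {
    fallback_taken.store(false, std::memory_order_release);
}

void increment() {
    for (int attempt = 0; attempt < 3; ++attempt) {  // retry count: arbitrary
        unsigned status = _xbegin();
        if (status == _XBEGIN_STARTED) {
            // Read the lock inside the transaction so it joins our read
            // set: if another thread takes the fallback path, we abort
            // instead of racing with its non-transactional writes.
            if (fallback_taken.load(std::memory_order_relaxed))
                _xabort(0xff);
            ++counter;  // tracked at cache line granularity
            _xend();    // commit: buffered writes become visible atomically
            return;
        }
        // Aborted: conflict, capacity overflow, interrupt, etc.
    }
    // TSX does not guarantee forward progress, hence the non-TSX pathway.
    lock_fallback();
    ++counter;
    unlock_fallback();
}

int main() {
    std::thread t1([] { for (int i = 0; i < 1000; ++i) increment(); });
    std::thread t2([] { for (int i = 0; i < 1000; ++i) increment(); });
    t1.join(); t2.join();
    return counter == 2000 ? 0 : 1;
}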
Horizon68 <horizon@horizon.com>: Jul 04 02:15PM -0700

Hello..

Read the following paper about the disadvantages of transactional memory:

"Hardware-only (HTM) suffers from two major impediments: high implementation and verification costs lead to design risks too large to justify on a niche programming model; hardware capacity constraints lead to significant performance degradation when overflow occurs, and proposals for managing overflows (for example, signatures) incur false positives that add complexity to the programming model. Therefore, from an industrial perspective, HTM designs have to provide more benefits for the cost, on a more diverse set of workloads (with varying transactional characteristics) for hardware designers to consider implementation."

etc.

"We observed that the TM programming model itself, whether implemented in hardware or software, introduces complexities that limit the expected productivity gains, thus reducing the current incentive for migration to transactional programming, and the justification at present for anything more than a small amount of hardware support."

Read more here: http://pages.cs.wisc.edu/~cain/pubs/cascaval_cacm08.pdf

Thank you, Amine Moulay Ramdane. |
G G <gdotone@gmail.com>: Jul 03 09:21PM -0700

#define _PROTOTYPE(function, params) function params

how does this work? |
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 04 06:38AM +0200 On 04.07.2019 06:21, G G wrote: > #define _PROTOTYPE(function, params) function params > how does this work? It defines a pure text substitution rule, called a macro. `params` is intended to be a parenthesized list of arguments. Note that names starting with underscore followed by uppercase are reserved to the implementation, so if this is not code from the C++ implementation it's UB. Cheers!, - Alf |