- Valgrind finds memory leak on ImVector<char[4096]>::resize(). Why? - 12 Updates
- uninitialized built-in types - 3 Updates
- Onwards and upwards - 3 Updates
- cmsg cancel <n55a0o$4s0$6@dont-email.me> - 3 Updates
- A new algorithm of Parallel implementation of Conjugate Gradient Sparse Linear System Solver library - 3 Updates
omarcornut@gmail.com: Dec 20 12:31AM -0800 On Saturday, 19 December 2015 21:08:34 UTC+1, Mr Flibble wrote: > the case of std::vector, you provide a custom allocator that doesn't > perform value initialization if that is your use-case; there is no need > to write your own container to do a similar job. There is nothing wrong for you. It is crazy that you can't even have the IMAGINATION to consider that someone somewhere may not want a vector where size and capacity are each stored as an 8-byte size_t. What if I want them to be 2 bytes or 4 bytes? What if I don't want to allow cases such as the reference parameter to push_back pointing mid-vector, etc.? Suggesting that every C++ programmer needs to use the exact same code in every situation is crazy. You do realise that writing those few dozen lines doing exactly what I need in that simple context is easier than writing and maintaining an allocator, passing it around, and dealing with horrible, horrible error messages? |
omarcornut@gmail.com: Dec 20 12:56AM -0800 I genuinely wonder if you are trolling me or it is just lack of experience and imagination. You are both essentially saying "everybody should use the same code". Do I really need to point out how wrong that statement is? On Saturday, 19 December 2015 21:16:37 UTC+1, Öö Tiib wrote: > > frame-rate, when more so a complex game (imagine your typical console > > title) having a debug build that runs at 5 fps is an absolute deal-breaker. > Do not use debug build then. So you are suggesting we don't use debuggers? You are suggesting we don't aim to improve our working conditions by making a debugging session run faster, to increase the odds of being able to work under those conditions? > > day while minimizing your performance cost. > The debuggers work excellently with optimized builds. The only > thing that I avoid defining is NDEBUG (because that erases asserts). They sometimes do an OK job; they can't possibly do everything. Also, I suspect you haven't worked with the debuggers shipped for various console platforms. > > if using the library in a debug build took 4 ms of your time every frame. > I use debug versions only in unit-tests and automated tests. It does > not matter for me how long those run on test farm. Great for you. I want users to be able to use debug versions on a daily basis when possible, e.g. when working on feature X one is going to set up their data in a way that maximizes debugging capabilities if possible. > element into a 'vector' and forgot that it *may* invalidate iterators. > Such logic errors are all caught by unit tests quickly so I do not need > those checks in versions that are debugged manually. Great that your code is all unit tested. I guess that you must write very little code and work with simple systems. It's hard and unusual to automatically test all parts of a game. You can test many parts; you can't test all of it, certainly not with unit tests. 
Indie game developers in particular generally don't have the infrastructure to do so. It is not my job as a library developer to push more burden on every indie developer when I want to help them. > What? Most people who use C++ for anything use it because they need > its excellent performance. Optimized (by decent compiler) code of > 'std::vector' is about as good as same thing written in assembler. I agree, most people. Not most high-end game developers. First, we're talking about debug builds here. Secondly, there are hundreds of possible algorithm and implementation variants while the STL provides a few dozen, and they vary by implementation. It's *unacceptable* for a game of very high calibre to rely on implementation-dependent automatic growth of vector capacity. Personally I don't want my vectors to have 24 bytes of overhead when they could be 16 bytes. Shitty error messages when you start dealing with more complex stuff are also unacceptable. Slower compilation across the board is also not something I fancy. I do use a lot of STL. I am merely saying there are cases where it is OK not to use it, and for that library it makes perfect sense and is the best thing to do. > > an issue with porting. > Fear, Uncertainty and Doubt. There was maybe case or two 20 years ago and > old women of bus-station gossip of it. World has changed. Knowledge, Curiosity, Imagination. (Lack of, in your case.) Essentially many of the answers in this thread are "we don't like that YOU write a dozen lines of your own code, we'd rather make the access bar for ALL YOUR USERS higher and harder because it is the right way to do things". TL;DR: - I am saying there are different ways of doing things for different scenarios. I am not saying your way is wrong. - You are saying there is a single way of doing things. You are saying that my way of doing things, within a context that's not yours and that you poorly understand, is wrong. |
Gareth Owen <gwowen@gmail.com>: Dec 20 10:01AM > I genuinely wonder if you are trolling me or it is just lack of > experience and imagination. Not experience, but definitely imagination. He knows the single right answer to every question and no alternative view, regardless how well argued, will be entertained. You're best just ignoring him. > You are both essentially saying "everybody should use the same > code". Do I really need to point out how wrong that statement is? You could try. He'll just ignore/denigrate you. |
Nobody <nobody@nowhere.invalid>: Dec 20 11:25AM On Sat, 19 Dec 2015 08:06:59 -0800, Öö Tiib wrote: > What is the point of using debug build and then write hand-optimized code > in it? Real-time code often adapts its behaviour to performance. E.g. games often update the simulation using numerical integration where "dt" is the time interval between previous frames, and/or adjust the level of graphical detail in order to obtain a reasonable frame rate. In this situation, it's often impossible to actually debug issues using a debug build because the debug build runs so much slower than a release build that it substantially changes the behaviour of the program, resulting in Heisenbugs. |
Flix <writeme@newsgroup.com>: Dec 20 01:25PM +0100 > It's hard to tell what your issue is given the lack of Valgrind report and/or code sample. I'd imagine the issue is probably > that you have a ImVector whose destructor wasn't called (perhaps you have a ImVector on the heap somewhere). > There's no ImVector<char[4096]> in the ImGui codebase so it doesn't seem like code in the default library? Actually I just ran Valgrind on a test case that was using my (old) "imguifilesystem" control (it was meant to replace std::string). I'll make further tests on it, and if I find something, I'll use the ImGui Issue Forum (I thought that C arrays could have some kind of ctor/dtor "overhead"). However it was not my intention to trigger such a discussion. I'm sorry. Maybe a library that doesn't use the STL is more portable (fewer dependencies = more portability) and less dependent on the particular performance of a specific STL implementation. But I'm sure people would have something to say about that... |
Flix <writeme@newsgroup.com>: Dec 20 02:43PM +0100 On 20/12/2015 13:25, Flix wrote: > On 19/12/2015 20:37, omarcornut@gmail.com wrote: > Actually I just run Valgrind on a test case that was using my (old) > "imguifilesystem" control (it was meant to replace std:string). Ooops, I meant to replace a vector of char[PATH_MAX]. |
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Dec 20 06:15PM >> perform value initialization if that is your use-case; there is no need >> to write your own container to do a similar job. > There is nothing wrong for you. It is crazy that you can't even have the IMAGINATION to consider that someone somewhere may not want a vector where size and capacity are each stored as an 8-byte size_t. What if I want them to be 2 bytes or 4 bytes? What if I don't want to allow cases such as the reference parameter to push_back pointing mid-vector, etc.? Suggesting that every C++ programmer needs to use the exact same code in every situation is crazy. It is nothing to do with imagination and everything to do with psychosis. If you aren't using std::vector because sizeof(std::vector<T>) is 24 bytes instead of 12 then you are seriously in need of medication, as psychosis can be the only explanation for someone doing something totally wrong. YOU ARE DOING IT WRONG! (tm) Oh, and BTW, most std::vector implementations will be storing three pointers rather than std::size_t for size and capacity. I am not suggesting that every C++ programmer use the same code for different situations, but you have yet to provide a sane reason for using your container over std::vector (I have no idea what you mean by pushing back reference parameters mid-vector). > You do realise that writing those few dozen lines doing exactly what I need in that simple context is easier than writing and maintaining an allocator, passing it around, and dealing with horrible, horrible error messages? It is very rare that one needs an uninitialised buffer, and when you do, new[] usually suffices, so you still have yet to provide sufficient rationale for using your container over std::vector sausages. /Flibble |
Flix <writeme@newsgroup.com>: Dec 20 08:05PM +0100 On 18/12/2015 22:19, Flix wrote: > replacement in your code! > I thought that char arrays had no C++ constructor/destructor. > Why does Valgrind complain when resizing the array? I fixed it! And it was my fault (even if the Valgrind output was not very useful). Basically I had a class allocated on the heap that contained an ImVector<char[4096]>. When I deallocated it, I simply released the memory without calling the destructor on my class: thus ~ImVector<char[4096]>() was never called, even though my class's heap space was correctly deallocated. Since, "going down the chain", ImVector::resize() -> ImVector::reserve() are the only places where allocators are used inside ImVector, Valgrind reported these methods. Basically Valgrind can't know where I should have freed the memory: that's why leaks are very difficult to fix. |
omarcornut@gmail.com: Dec 20 12:08PM -0800 On Sunday, 20 December 2015 20:05:50 UTC+1, Flix wrote: > > Why does Valgrind complain when resizing the array? > I fixed it! > And it was my fault (even if the Valgrind output was not very useful). Glad that you found your problem Flix. I'd say Valgrind was reasonably useful there. On Sunday, 20 December 2015 19:15:54 UTC+1, Mr Flibble wrote: > (I have no idea what you mean by pushing back reference parameters mid-vector push_back() takes a const T& and a typical implementation needs to cater for the case where that reference points within the vector data itself at the time of calling the function, i.e. take a temporary copy when reallocating. I have already provided enough reasons, which you decided to ignore; and I'm probably missing some (many STL types play terribly with edit & continue/live recompilation techniques; ease of portability to old or non-conformant architectures; duplicated code bloat polluting the instruction cache; more painful visibility and stepping in debuggers; implementation-dependent behavior; horrible error messages; increased compilation times - consider the various situations where pre-compiled headers are unavailable; constant harassment of 32/64 warnings because of a size_t that I don't care about; ease of extending a library you wrote yourself to create more specialised containers (e.g. holding a local buffer to avoid heap allocations); etc.). I am going to opt out of the conversation and go get my medication for psychosis, because some people here haven't written software where memory or performance or portability (hint: portability doesn't mean unix+mac+windows) matters, with dozens/hundreds of thousands of things going on. With sloppy programming practices being so common, no wonder my Windows keyboard driver nowadays takes 55 MB of RAM and software is generally slower and less snappy for the end user than it was ten years ago. 
Step back a bit and compare to my prime example above: GTA5 runs with 256 MB of RAM on the PS3, i.e. about five times the amount of RAM that my keyboard driver, written by some arguably incompetent programmers, uses. I don't need those 12 bytes for that library, and I still recommend using the STL in the majority of cases, but I have needed those 12 bytes on several occasions in the past. |
omarcornut@gmail.com: Dec 20 12:17PM -0800 > > (I have no idea what you mean by pushing back reference parameters mid-vector > push_back() takes a const T& and a typical implementation needs to cater for the case where that reference points within the vector data itself at the time of calling the function, i.e. take a temporary copy when reallocating. > I have already provided enough reasons, which you decided to ignore; and I'm probably missing some (many STL types play terribly with edit & continue/live recompilation techniques; ease of portability to old or non-conformant architectures; duplicated code bloat polluting the instruction cache; more painful visibility and stepping in debuggers; implementation-dependent behavior; horrible error messages; increased compilation times - consider the various situations where pre-compiled headers are unavailable; constant harassment of 32/64 warnings because of a size_t that I don't care about; ease of extending a library you wrote yourself to create more specialised containers (e.g. holding a local buffer to avoid heap allocations); etc.). And I have to add that those reasons make even more sense when shipping a library. When I'm writing my own code I have fewer issues with using std::map etc. for convenience, because I know it works and it'll cover 90% of my needs. When I'm writing a library that I expect game programmers to use, it's a very different thing to use those. Even if you argue and debate my points, my library would simply lose half of its intended audience if it dragged in <vector> and <map>. |
"Öö Tiib" <ootiib@hot.ee>: Dec 20 12:55PM -0800 > experience and imagination. You are both essentially saying "everybody > should use the same code". Do I really need to point out how wrong > that statement is? Looked at your code ... bah. It won't pass review by any decent C++ developer. Go read some book about C++. For a single example: every class that has a destructor (including 'ImVector') violates the rule of three. So there will be mundane memory management issues with objects of your library's classes. The last time a leak in my code reached the repository was 12 years ago. I am trolling? Why? It is exactly as bad as I suspected; there is no point even testing the thing. |
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Dec 20 09:27PM > On Sunday, 20 December 2015 19:15:54 UTC+1, Mr Flibble wrote: >> (I have no idea what you mean by pushing back reference parameters mid-vector > push_back() takes a const T& and a typical implementation needs to cater for the case where that reference points within the vector data itself at the time of calling the function, i.e. take a temporary copy when reallocating. A decent STL implementation will already have a check for reallocation, so I don't see a need for an extra check if the element being inserted is a reference to an existing element in the same container, assuming the previous memory is deallocated AFTER elements are inserted into the newly allocated memory, which should be the case for a decent implementation. I suspect the less than perfect Microsoft VC++ STL implementation is informing your erroneous views? Am I right? :D > I have already provided enough reasons, which you decided to ignore; and I'm probably missing some (many STL types play terribly with edit & continue/live recompilation techniques; ease of portability to old or non-conformant architectures; duplicated code bloat polluting the instruction cache; more painful visibility and stepping in debuggers; implementation-dependent behavior; horrible error messages; increased compilation times - consider the various situations where pre-compiled headers are unavailable; constant harassment of 32/64 warnings because of a size_t that I don't care about; ease of extending a library you wrote yourself to create more specialised containers (e.g. holding a local buffer to avoid heap allocations); etc.). > I am going to opt out of the conversation and go get my medication for psychosis, because some people here haven't written software where memory or performance or portability (hint: portability doesn't mean unix+mac+windows) matters, with dozens/hundreds of thousands of things going on. 
With sloppy programming practices being so common, no wonder my Windows keyboard driver nowadays takes 55 MB of RAM and software is generally slower and less snappy for the end user than it was ten years ago. Step back a bit and compare to my prime example above: GTA5 runs with 256 MB of RAM on the PS3, i.e. about five times the amount of RAM that my keyboard driver, written by some arguably incompetent programmers, uses. I don't need those 12 bytes for that library, and I still recommend using the STL in the majority of cases, but I have needed those 12 bytes on several occasions in the past. Sorry, but you still haven't provided any sane rationale for your noddy std::vector rip-off sausages. /Flibble |
"Öö Tiib" <ootiib@hot.ee>: Dec 19 03:33PM -0800 On Sunday, 20 December 2015 00:06:13 UTC+2, David Brown wrote: > > at ugly side. > Yes, but "int x = 0;" is pretty easy to type, and usually a better > choice of initialiser than INT_MIN. 0 is the value most often needed, but that also makes it the most incorrect choice when it is as yet unclear what is needed. > good warnings (if your compiler supports them) will mean that your code > "blows up" during compilation rather than when running, and that's the > best you can get. That is what I usually do. Sometimes the compiler does not catch a defect, for example when an uninitialized variable is passed by reference. Then I can use valgrind and/or clang's -fsanitize=memory, which blow it up at run time, and everything is still good. However I see that a lot of people do not like undetermined state. Therefore I was trying to imagine an odd case where I do not know the correct value but for whatever odd requirement may not leave the variable uninitialized either. Perhaps it is then best to wrap it in a class (one that keeps track of whether its value has been determined) or to use something like 'boost::optional' in all such cases that I can imagine. |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Dec 20 05:51PM On Wed, 2015-12-16, David Brown wrote: > On 17/12/15 00:31, flimflim1172@gmail.com wrote: ... > have a part to play, as one would normally expect, then you need to > initialise it correctly. Of course that means editing the constructor - > adding the member to the class definition is only half the job. Sometimes if you have problems with this, it means you're using types that are too primitive. I once had a bunch of classes which kept statistics counters for their work, up to a dozen or so each. The right thing to do there was to create a counter<T> class and use it instead of unsigned, unsigned long, etc. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Dec 20 07:58PM On Sat, 19 Dec 2015 13:39:03 -0800 (PST) Öö Tiib <ootiib@hot.ee> wrote: [snip] > Nah, discard that. Seems that there is 'INT_FAST16_MIN' macro > available. In addition, C++11 requires a specialization of std::numeric_limits<T>::min() for all arithmetic types (§3.9.1/8 and §18.3.2.1/2). int_fast16_t is required to be a typedef for a signed integer type, which is an arithmetic type. Chris |
woodbrian77@gmail.com: Dec 20 08:59AM -0800 On Tuesday, October 27, 2015 at 6:09:23 PM UTC-5, Mr Flibble wrote: > > container and to make many insertions and deletions throughout the > > container. But you would need convincing evidence with profiling first. > There are plenty of reasons for using std::list. These days I only use std::list as a last resort. Brian Ebenezer Enterprises - In G-d we trust. http://webEbenezer.net |
woodbrian77@gmail.com: Dec 20 09:07AM -0800 Is there a growing appreciation for on line code generation here? Economies around the world are declining so I think this is helping to spur more interest. Here in the US we have the 20-trillion-dollar man. Brian Ebenezer Enterprises http://webEbenezer.net |
Daniel <danielaparker@gmail.com>: Dec 20 10:25AM -0800 > Is there a growing appreciation for on line code generation > here? We're all waiting on the excellent Mr Flibble, as soon as he gives the word, we're all on board. Daniel |
bleachbot <bleachbot@httrack.com>: Dec 20 05:17AM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 05:51AM +0100 |
bleachbot <bleachbot@httrack.com>: Dec 20 03:45PM +0100 |
A new algorithm of Parallel implementation of Conjugate Gradient Sparse Linear System Solver library
Ramine <ramine@1.1>: Dec 19 11:18PM -0800 Hello, I have just implemented today a new parallel algorithm for my Parallel implementation of Conjugate Gradient Sparse Linear System Solver library. This library is designed for sparse matrices of linear equations arising from industrial finite element problems and such, and my new parallel algorithm is cache-aware and very fast. So, as you have noticed, I have now implemented two parallel algorithms. One is cache-aware and NUMA-aware and is scalable on NUMA architectures; this scalable parallel algorithm is designed for the dense matrices that you find in linear equations arising from integral equation formulations, here: https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware And my new parallel algorithm that I have just implemented today is designed for sparse matrices of linear equations arising from industrial finite element problems and such. Here is my new library with my new parallel algorithm: https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver Feel free to port them to C++... Author: Amine Moulay Ramdane Description: I have come up with a new algorithm for my Parallel Conjugate Gradient sparse solver library; it has now become cache-aware. Note that this new cache-aware algorithm is more efficient on multicores: I have benchmarked it against my previous algorithm and it has given a scalability of 5X on a quad-core over the single thread of my previous algorithm - that's really a big improvement! This parallel library is especially designed for the large-scale industrial engineering problems that you find in industrial finite element problems and such. This scalable parallel library was ported to FreePascal and all the Delphi XE versions and even to Delphi 7; I hope you will find it really good. 
The Parallel implementation of Conjugate Gradient Sparse Linear System Solver that I programmed here is designed to be used to solve large sparse systems of linear equations where direct methods can exceed available machine memory and/or be extremely time-consuming. For example, the direct method of the Gauss algorithm takes O(n^2) in the back substitution process and is dominated by the O(n^3) forward elimination process. That means that if, for example, an operation takes 10^-9 seconds and we have 1000 equations, the elimination process in the Gauss algorithm will take 0.7 seconds; but if we have 10000 equations in the system, the elimination process will take 11 minutes! This is why I have developed for you the Parallel implementation of Conjugate Gradient Sparse Linear System Solver in Object Pascal, which is very fast. You have only one method to use, Solve(): function TParallelConjugateGradient.Solve(var A: arrarrext;var B,X:VECT;var RSQ:DOUBLE;nbr_iter:integer;show_iter:boolean):boolean; The system: A*x = b The important parameters in the Solve() method are: A is the matrix, B is the b vector, X the initial vector x, nbr_iter is the number of iterations that you want, and show_iter shows the iteration number on the screen. RSQ is the sum of the squares of the components of the residual vector A.x - b. I have got over 5X scalability on a quad core. The Conjugate Gradient Method is the most prominent iterative method for solving sparse systems of linear equations. Unfortunately, many textbook treatments of the topic are written with neither illustrations nor intuition, and their victims can be found to this day babbling senselessly in the corners of dusty libraries. For this reason, a deep, geometric understanding of the method has been reserved for the elite brilliant few who have painstakingly decoded the mumblings of their forebears. 
Conjugate gradient is the most popular iterative method for solving large systems of linear equations. CG is effective for systems of the form A.x = b where x is an unknown vector, b is a known vector, and A is a known square, symmetric, positive-definite matrix. These systems arise in many important settings, such as finite difference and finite element methods for solving partial differential equations, structural analysis, circuit analysis, and math homework. The conjugate gradient method can also be applied to non-linear problems, but with much less success, since non-linear functions may have multiple minima. The conjugate gradient method will indeed find a minimum of such a nonlinear function, but it is in no way guaranteed to be a global minimum, or the minimum that is desired. But the conjugate gradient method is a great iterative method for solving large, sparse linear systems with a symmetric, positive-definite matrix. In the method of conjugate gradients the residuals are not used directly as search directions, as in the steepest descent method, because such searching can require a large number of iterations as the residuals zig-zag towards the minimum value for ill-conditioned matrices. Instead, the conjugate gradient method uses the residuals as a basis to form conjugate search directions. In this manner, the conjugated gradients (residuals) form a basis of search directions to minimize the quadratic function f(x) = 1/2*Transpose(x)*A*x - Transpose(b)*x and to achieve faster convergence, in at most dim(N) steps. Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/ Operating Systems: Windows, Mac OSX, Linux... Required FPC switches: -O3 -Sd -dFPC -dFreePascal -Sd for delphi mode.... Required Delphi switches: -$H+ -DDelphi {$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems {$DEFINE CPU64} and {$DEFINE Windows64} for 64 bit systems Thank you, Amine Moulay Ramdane. |
Ramine <ramine@1.1>: Dec 19 11:52PM -0800 Hello, Read here: https://en.wikipedia.org/wiki/Sparse_matrix As you have noticed, it says: "When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeroes. Sparse data is by nature more easily compressed and thus require significantly less storage. Some very large sparse matrices are infeasible to manipulate using standard dense-matrix algorithms." I have taken care of that in my new algorithm: I have used my ParallelHashList data structure to store the sparse matrices of the linear system so that it becomes very fast and doesn't waste work on the zeros; in fact my new algorithm doesn't store the zeros of the sparse matrix of the linear system at all. And my new parallel algorithm that I have just implemented today is designed for sparse matrices of linear equations arising from industrial finite element problems and such. Here is my new library with my new parallel algorithm: https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver Thank you, Amine Moulay Ramdane. |
Ramine <ramine@1.1>: Dec 20 09:46AM -0800 Hello, I have updated my new Parallel implementation of Conjugate Gradient Linear Sparse System Solver library to version 1.24: I have corrected a bug, and I have thoroughly tested it, and I think it's stable and very fast now. You can download it from here: https://sites.google.com/site/aminer68/parallel-implementation-of-conjugate-gradient-sparse-linear-system-solver Thank you, Amine Moulay Ramdane. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |