- neoGFX C++ ECS - 2 Updates
- Parallel version of popcount - 3 Updates
- About Single Responsibility Principle - 3 Updates
- Using #define in the construction of an array - 2 Updates
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 09 08:08PM -0800 On 11/9/2018 6:15 AM, Mr Flibble wrote: >>> https://github.com/i42output/neoGFX >> Can one introduce their own custom GLSL shader code into the mix? > Of course. That is very nice. I am assuming you have a nice abstraction around uniform variables passed to the shaders. Do you provide for any built in uniforms, something like iTime in ShaderToy.com? |
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Nov 10 04:38PM

On 10/11/2018 04:08, Chris M. Thomasson wrote:
> That is very nice. I am assuming you have a nice abstraction around
> uniform variables passed to the shaders. Do you provide for any built in
> uniforms, something like iTime in ShaderToy.com?

Yes there are shader classes and such but they will probably go through at least one more design/refactor iteration before release. Currently standard uniforms are the transformation matrix and projection matrix.

/Flibble

--
"You won't burn in hell. But be nice anyway." – Ricky Gervais

"I see Atheists are fighting and killing each other again, over who doesn't believe in any God the most. Oh, no..wait.. that never happens." – Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are confronted by God," Bryne asked on his show The Meaning of Life. "What will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied. "How dare you? How dare you create a world to which there is such misery that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a world that is so full of injustice and pain. That's what I would say."

Paul <pepstein5@gmail.com>: Nov 10 03:54AM -0800

Googling for fast ways of doing popcount, I read about this for doing it "in parallel". It doesn't seem parallelized to me. There are no threads or anything. I would have thought a parallel approach would involve different threads tackling different bits?
Thanks for your advice about C++ parallel versions of popcount.

Kind Regards,
Paul

unsigned int countBits(unsigned int x)
{
    // count bits of each 2-bit chunk
    x = x - ((x >> 1) & 0x55555555);
    // count bits of each 4-bit chunk
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    // count bits of each 8-bit chunk
    x = x + (x >> 4);
    // mask out junk
    x &= 0xF0F0F0F;
    // add all four 8-bit chunks
    return (x * 0x01010101) >> 24;
}

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 10 04:59PM +0100 On 10.11.2018 12:54, Paul wrote: > // add all four 8-bit chunks > return (x * 0x01010101) >> 24; > } This code is well commented about the parallelism. E.g. "count bits of each 2-bit chunk", that's about doing all the 2-bit chunks in parallel. Up to the width of the architecture's `int`, which you can infer from the code is 32. However, how it works is a different matter: there's no comment about that. Studying the hand-optimized code resulting from some idea is like reverse-engineering machine code, or (slight exaggeration) trying to deduce the shape of a human of a given age from a DNA sequencing. Instead I would look for a description of the basic idea, and how that idea was expressed in code. And instead of reinventing this particular wheel, why not use one of the umpteen existing solutions? In C++ you have `std::bitset::count`, portably. And in both C and C++ there are somewhat less portable C level compiler intrinsics, e.g. as listed at <url: https://en.wikichip.org/wiki/population_count>. Cheers & hth., - Alf |
Melzzzzz <Melzzzzz@zzzzz.com>: Nov 10 04:03PM

> Googling for fast ways of doing popcount, I read about this for doing it "in parallel".

Depending on the architecture, you can do it pretty fast, but the native popcount instruction on x86 is as fast as an AVX2 version using pshufb.

> It doesn't seem parallelized to me. There are no threads or anything.
> I would have thought a parallel approach would involve different threads tackling
> different bits?

Here "parallel" means that multiple chunks in the same register are processed in one operation.
Like SIMD ;)

--
press any key to continue or any other to quit...

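Editor's sketch of the "multiple chunks in one register" idea: the same arithmetic as Paul's `countBits`, annotated with what each field holds after every step. It assumes `std::uint32_t` in place of `unsigned int`, and the names `swar_popcount`, `naive_popcount` and `check_swar_popcount` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same steps as countBits from the thread, with the state after each
// step spelled out. (Hypothetical name; 32-bit input assumed.)
constexpr std::uint32_t swar_popcount(std::uint32_t x) {
    // All sixteen 2-bit fields are updated at once: each now holds the
    // number of set bits it originally contained (0..2).
    x = x - ((x >> 1) & 0x55555555u);
    // Add neighbouring 2-bit counts: each of the eight 4-bit fields
    // now holds a count in 0..4.
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    // Add neighbouring 4-bit counts and mask: each byte holds 0..8.
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    // The multiplication accumulates all four byte counts in the top
    // byte; the shift moves that sum down to the low bits.
    return (x * 0x01010101u) >> 24;
}

// Reference: one bit per iteration, no field-level parallelism.
constexpr std::uint32_t naive_popcount(std::uint32_t x) {
    std::uint32_t n = 0;
    while (x != 0) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

// 0xB6 is 1011 0110 in binary: five set bits.
static_assert(swar_popcount(0xB6u) == 5, "worked example");
static_assert(swar_popcount(0xFFFFFFFFu) == 32, "all bits set");

// Compare both versions on a few sample values at run time.
inline void check_swar_popcount() {
    const std::uint32_t samples[] = {0u, 1u, 0xB6u, 0x80000000u,
                                     0x12345678u, 0xFFFFFFFFu};
    for (std::uint32_t s : samples) {
        assert(swar_popcount(s) == naive_popcount(s));
    }
}
```

No threads are involved: the parallelism is that a single subtract, add or multiply acts on every field of the 32-bit word in the same operation.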
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 09 09:43PM -0500

JiiPee wrote:
>> that in
>> CDialog.
> 1) Does the opengl data

I am not sure what you mean by "openGL data". I referred to the data that openGL drawing may require -- but these would probably be your domain-specific data with lots of stuff relevant and irrelevant to drawing (of course it could be openGL models exclusively, too).

> need to be defined in CDocument?

Of course not, but it's convenient as the wiring is there.

> Can we do this:
> And thus CMyDocument would only have non-opengl data.
> This way Renderer and OpenglDAta are both isolated from the MFC
> view/document/frame system

You can compose your document from other objects. Or you could rename CMyDocument to CMyOpenGLData and lose the extra object. Whichever is better depends on how else you plan to use MyOpenGLData.

> different way there.
> I am not sure if CDocument can be used with CDialog/CWnd. So it cannot
> be used with them?

You can and most probably will add your custom CView to some kind of top-level or higher-level CWnd as a child anyway.

Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 09 10:48PM -0500

JiiPee wrote:
> You mean that all future Views/Windows would have totally different
> drawing needs, thus renderer would not be a good way to handle them all
> as the windows have totally different drawing needs?

No. I meant that you create some CMyShapeDrawingView with lots of useful customizable properties, add it to whatever parent CWnd you want, and use MFC event wiring to handle interesting events coming from parent windows and to access their properties.

> The drawing will be simple OpenGL drawing modelling 3D objects (like
> tools, table, etc... like the 3D modelling programs.). rotating them,
> zooming etc... all basic functionalities.

Well, it sounds like you need to handle lots of user input events that are bound to specific locations on your rendered objects in order to select them. You need to decide how you are going to map your gestures to locations on your 3D objects. There is more than one way. I have never done it myself. You might need to research it. See e.g. http://www.opengl-tutorial.org/miscellaneous/clicking-on-objects/ or ask in some OpenGL group.

On the other hand, if you meant to display only a single object at a time, you can do a "non-OO UI", controlling your object with a bunch of buttons, scrollers etc. outside of your view and/or a set of keyboard accelerators.

In any case you will need to update your scene on these actions. This may be done, e.g., by changing your CDocument and notifying your view, but you are of course free to use entirely custom domain data structures.

Frankly, the more I think about it, the more I like the idea of using the CDocument/CView framework for this kind of job. It was put together for a reason and is largely based on a sound theory developed by really smart people before Microsoft came to be.

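Editor's sketch of the "change the document, notify the view" flow Pavel describes, written in plain standard C++ so that it compiles without MFC. Every class and function name here (`SceneData`, `SceneDocument`, `SceneView`, `on_update`, `demo_document_view`) is a hypothetical stand-in; in MFC itself the corresponding mechanism is `CDocument::UpdateAllViews` calling `CView::OnUpdate`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical domain data that a drawing routine would read.
struct SceneData {
    float rotation_deg = 0.0f;
    float zoom = 1.0f;
};

class SceneView;  // forward declaration

// Stand-in for the document role: owns the data and notifies views.
class SceneDocument {
public:
    void attach(SceneView* view) { views_.push_back(view); }
    const SceneData& data() const { return data_; }
    void rotate(float delta_deg);  // changes the data, then notifies
private:
    void notify_views();
    SceneData data_;
    std::vector<SceneView*> views_;
};

// Stand-in for the view role: re-reads the document when notified.
// A real view would issue OpenGL calls in on_update(); this one only
// records what it would have drawn.
class SceneView {
public:
    explicit SceneView(const SceneDocument& doc) : doc_(doc) {}
    void on_update() {
        last_drawn_rotation_ = doc_.data().rotation_deg;
        ++redraws_;
    }
    int redraws() const { return redraws_; }
    float last_drawn_rotation() const { return last_drawn_rotation_; }
private:
    const SceneDocument& doc_;
    int redraws_ = 0;
    float last_drawn_rotation_ = 0.0f;
};

inline void SceneDocument::rotate(float delta_deg) {
    data_.rotation_deg += delta_deg;
    notify_views();
}

inline void SceneDocument::notify_views() {
    for (SceneView* view : views_) {
        view->on_update();
    }
}

// A user action (for example a "rotate" button) edits the document;
// the attached view is told to redraw.
inline void demo_document_view() {
    SceneDocument doc;
    SceneView view(doc);
    doc.attach(&view);
    doc.rotate(15.0f);
    assert(view.redraws() == 1);
    assert(view.last_drawn_rotation() == 15.0f);
}
```

The sketch takes no position on whether a separate renderer class is better or worse; it only shows the update path that the document/view split provides.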
JiiPee <no@notvalid.com>: Nov 10 10:18AM

On 10/11/2018 03:48, Pavel wrote:
> idea of using CDocument/CView framework for this kind of job. It was put
> together for a reason and is largely based on a sound theory developed
> by really smart people before Microsoft came to be.

Yes, I am planning to use that framework. But I am still struggling to understand your point about why isolating the OpenGL code from the View, and possibly the document code from CDocument, would be bad. I can read your messages, but is it possible to make a very simple example, a couple of lines of code, illustrating this issue (why a Renderer would be bad or worse)?

Paavo Helde <myfirstname@osa.pri.ee>: Nov 10 09:54AM +0200

On 10.11.2018 1:22, Jorgen Grahn wrote:
> FWIW, I wouldn't do it like that: it's cryptic, and I suspect it's
> slow. Lookup tables made more sense in the 1990s, before CPUs became
> much faster than RAM.

To be honest, for a 256-byte lookup table one should talk about CPU cache speeds, not RAM speed. Also, if the lookup table is small enough it can fit directly into CPU registers.

Google found a link which claims that the fastest way to count raised bits is the built-in hardware instruction (POPCNT in newer Intel/AMD), followed closely by a table lookup for each 4 bits, all inside the CPU registers:

"https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer"

It also provides a high level function, but in my eyes this is not less cryptic than the macros ;-)

int numberOfSetBits(int i)
{
    // Java: use >>> instead of >>
    // C or C++: use uint32_t
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}

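Editor's sketch of a "table lookup for each 4 bits": a plain scalar version with a 16-entry table, assuming a 32-bit input. The names `kNibbleBits`, `table_popcount` and `check_table_popcount` are hypothetical. This is not the register-resident variant the linked answer refers to; whether the table ends up in cache or registers here is up to the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits in each possible 4-bit value, indexed 0..15.
// (Hypothetical name, used only for this sketch.)
constexpr std::uint8_t kNibbleBits[16] = {0, 1, 1, 2, 1, 2, 2, 3,
                                          1, 2, 2, 3, 2, 3, 3, 4};

// Sum the table entry for each of the eight nibbles of x.
constexpr unsigned table_popcount(std::uint32_t x) {
    unsigned n = 0;
    for (int i = 0; i < 8; ++i) {
        n += kNibbleBits[x & 0xFu];  // look up the low nibble
        x >>= 4;                     // move to the next nibble
    }
    return n;
}

// Spot checks against values whose bit counts are easy to verify by hand.
inline void check_table_popcount() {
    assert(table_popcount(0u) == 0);
    assert(table_popcount(0xFu) == 4);
    assert(table_popcount(0xF0F0F0F0u) == 16);
    assert(table_popcount(0xFFFFFFFFu) == 32);
}
```

A 16-entry table is small enough that the cache-versus-RAM concern raised for larger tables is unlikely to matter, though only measurement on the target machine can confirm that.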
Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 10 08:15AM

On Sat, 2018-11-10, Paavo Helde wrote:
>> much faster than RAM.

> To be honest, for a 256-byte lookup table one should talk about CPU
> cache speeds, not RAM speed.

Yes, but the difference between CPU and RAM speeds is the reason caches exist. I didn't want to go too deep into all that, because it's complex and I don't know it well.

/Jorgen

--
  // Jorgen Grahn <grahn@  Oo  o.   .     .
\X/     snipabacken.se>   O  o   .

You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.