Saturday, November 10, 2018

Digest for comp.lang.c++@googlegroups.com - 10 updates in 4 topics

"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Nov 09 08:08PM -0800

On 11/9/2018 6:15 AM, Mr Flibble wrote:
 
>>> https://github.com/i42output/neoGFX
 
>> Can one introduce their own custom GLSL shader code into the mix?
 
> Of course.
 
That is very nice. I am assuming you have a nice abstraction around
uniform variables passed to the shaders. Do you provide for any built in
uniforms, something like iTime in ShaderToy.com?
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Nov 10 04:38PM

On 10/11/2018 04:08, Chris M. Thomasson wrote:
 
> That is very nice. I am assuming you have a nice abstraction around
> uniform variables passed to the shaders. Do you provide for any built in
> uniforms, something like iTime in ShaderToy.com?
 
Yes there are shader classes and such but they will probably go through at
least one more design/refactor iteration before release. Currently
standard uniforms are the transformation matrix and projection matrix.
 
/Flibble
 
--
"You won't burn in hell. But be nice anyway." – Ricky Gervais
 
"I see Atheists are fighting and killing each other again, over who
doesn't believe in any God the most. Oh, no..wait.. that never happens." –
Ricky Gervais
 
"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Byrne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."
Paul <pepstein5@gmail.com>: Nov 10 03:54AM -0800

Googling for fast ways of doing popcount, I read about this for doing it "in parallel".
It doesn't seem parallelized to me. There are no threads or anything.
I would have thought a parallel approach would involve different threads tackling
different bits?
Thanks for your advice about C++ parallel versions of popcount.
 
Kind Regards,
Paul
 
unsigned int countBits(unsigned int x)
{
    // count bits of each 2-bit chunk
    x = x - ((x >> 1) & 0x55555555);
    // count bits of each 4-bit chunk
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    // count bits of each 8-bit chunk
    x = x + (x >> 4);
    // mask out junk
    x &= 0x0F0F0F0F;
    // add all four 8-bit chunks
    return (x * 0x01010101) >> 24;
}
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 10 04:59PM +0100

On 10.11.2018 12:54, Paul wrote:
> // add all four 8-bit chunks
> return (x * 0x01010101) >> 24;
> }
 
This code is well commented about the parallelism. E.g. "count bits of
each 2-bit chunk", that's about doing all the 2-bit chunks in parallel.
Up to the width of the architecture's `int`, which you can infer from
the code is 32.
 
However, how it works is a different matter: there's no comment about
that. Studying the hand-optimized code resulting from some idea is like
reverse-engineering machine code, or (slight exaggeration) trying to
deduce the shape of a human of a given age from a DNA sequence.
Instead I would look for a description of the basic idea, and how that
idea was expressed in code.
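
As a rough illustration of the basic idea, here is a sketch of the posted function with each stage spelled out, next to a one-bit-at-a-time reference. The names popcount_naive and popcount_swar are made up for this example, and a 32-bit unsigned type is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Reference version: look at one bit per loop iteration.
inline unsigned popcount_naive(std::uint32_t x)
{
    unsigned n = 0;
    for (; x != 0; x >>= 1) n += x & 1u;
    return n;
}

// Same steps as the posted countBits, with the reasoning for each stage.
inline unsigned popcount_swar(std::uint32_t x)
{
    // Stage 1: a 2-bit field with bits "ab" has value 2a+b.
    // Subtracting a (the field shifted right by one) leaves a+b,
    // so every 2-bit field now holds the count of its own two bits.
    x = x - ((x >> 1) & 0x55555555u);
    // Stage 2: add neighbouring 2-bit counts; each 4-bit field holds 0..4.
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    // Stage 3: add neighbouring 4-bit counts; after masking, each byte holds 0..8.
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    // Stage 4: multiplying by 0x01010101 accumulates all four bytes
    // into the top byte, which the shift then extracts.
    return (x * 0x01010101u) >> 24;
}
```

Every stage is a single operation on the whole word, yet it updates 16, 8 or 4 independent fields at once.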
 
And instead of reinventing this particular wheel, why not use one of the
umpteen existing solutions? In C++ you have `std::bitset::count`,
portably. And in both C and C++ there are somewhat less portable C level
compiler intrinsics, e.g. as listed at <url:
https://en.wikichip.org/wiki/population_count>.
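
A minimal sketch of both options follows. The wrapper names are invented for this example. The intrinsic shown is `__builtin_popcount`, which GCC and Clang provide; on other compilers the sketch simply falls back to the bitset version rather than guessing at their intrinsics.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable: std::bitset<N>::count() returns the number of set bits.
inline std::size_t popcount_bitset(std::uint32_t x)
{
    return std::bitset<32>(x).count();
}

// Less portable: compiler intrinsic where available (assumption: GCC or Clang).
inline std::size_t popcount_builtin(std::uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return static_cast<std::size_t>(__builtin_popcount(x));
#else
    return popcount_bitset(x);  // fallback keeps the sketch compilable elsewhere
#endif
}
```

Whether the intrinsic compiles to a single hardware instruction depends on the target and compiler flags.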
 
 
Cheers & hth.,
 
- Alf
Melzzzzz <Melzzzzz@zzzzz.com>: Nov 10 04:03PM

> Googling for fast ways of doing popcount, I read about this for doing it "in parallel".
 
Depending on architecture, you can do it pretty fast, but the native
popcount instruction on x86 is as fast as an AVX2 version with pshufb.
 
> It doesn't seem parallelized to me. There are no threads or anything.
> I would have thought a parallel approach would involve different threads tackling
> different bits?
 
Here "parallel" means processing multiple chunks in the same register in
one operation. Like SIMD ;)
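
To make "multiple chunks in one register" concrete, here is a small sketch in which one ordinary 32-bit addition adds four byte-sized lanes at once. The helper names pack4, lane and add_lanes are hypothetical, and the trick only holds while no lane sum exceeds 255.

```cpp
#include <cassert>
#include <cstdint>

// Pack four small values (each 0..255) into one 32-bit word, one per byte.
inline std::uint32_t pack4(std::uint32_t a, std::uint32_t b,
                           std::uint32_t c, std::uint32_t d)
{
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Extract byte lane i (0 = lowest byte, 3 = highest byte).
inline std::uint32_t lane(std::uint32_t word, unsigned i)
{
    return (word >> (8 * i)) & 0xFFu;
}

// A single addition processes all four lanes together.
// If a lane sum exceeded 255, the carry would leak into the next lane.
inline std::uint32_t add_lanes(std::uint32_t x, std::uint32_t y)
{
    return x + y;
}
```

The popcount code relies on the same effect, with masks keeping the fields from spilling into each other.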
 
--
press any key to continue or any other to quit...
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 09 09:43PM -0500

JiiPee wrote:
>> that in
>> CDialog.
 
> 1) Does the opengl data
I am not sure what you mean by "openGL data". I referred to the data
that openGL drawing may require -- but these would probably be your
domain-specific data with lots of stuff relevant and irrelevant to
drawing (of course it could be openGL models exclusively, too).
 
> need to be defined in CDocument?
Of course not but it's convenient as wiring is there.
 
> Can we do this:
 
> And thus CMyDocument would only have non-opengl data.
 
> This way Renderer and OpenglDAta are both isolated from the MFC
> view/document/frame system
You can compose your document from other objects. Or you could rename
CMyDocument to CMyOpenGLData and lose the extra object. Whichever is
better depends on how else you plan to use MyOpenGLData.
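
The first option, composing the document from other objects, might look roughly like the sketch below. It is plain C++ with no MFC types: SceneData, Renderer and MyDocument are hypothetical names, MyDocument only stands in for a CDocument-derived class, and the draw function is a placeholder for real OpenGL calls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical drawing data, independent of any GUI framework.
struct Vertex { float x, y, z; };

struct SceneData {
    std::vector<Vertex> vertices;
};

// Hypothetical renderer: depends only on SceneData, not on the document class.
class Renderer {
public:
    // Placeholder for OpenGL calls: reports how many vertices it would draw.
    std::size_t draw(const SceneData& scene) const { return scene.vertices.size(); }
};

// Stand-in for a CDocument-derived class composed from other objects.
class MyDocument {
public:
    SceneData& scene() { return scene_; }
    const SceneData& scene() const { return scene_; }
    const std::string& title() const { return title_; }
private:
    SceneData scene_;                 // drawing data, usable without the document
    std::string title_ = "untitled";  // example of non-drawing document state
};
```

With this layout the renderer and the scene data can be reused from a dialog or any other window, because neither knows about the document.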
> different way there.
 
> I am not sure if CDocument can be used with CDialog/CWnd. So it cannot
> be used with them?
You can and most probably will add your custom CView to some kind of
top-level or higher-level CWnd as a child anyway.
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Nov 09 10:48PM -0500

JiiPee wrote:
 
> You mean that all future Views/Windows would have totally different
> drawing needs, thus renderer would not be a good way to handle them all
> as the windows have totally different drawing needs?
No. I meant that you create some CMyShapeDrawingView with lots of useful
customizable properties and add it to whatever parent CWnd you want and
use MFC event wiring to handle interesting events coming from parent
windows and access their properties.
 
> The drawing will be simple OpenGL drawing modelling 3D objects (like
> tools, table, etc... like the 3D modelling programs.). rotating them,
> zooming etc... all basic functionalities.
Well, it sounds like you need to handle lots of user input events that
are bound to specific locations on your rendered objects in order to
select them. You need to decide how you are going to map your gestures
to locations on your 3D objects. There is more than one way. I have
never done it
myself. You might need to research it. See e.g.
http://www.opengl-tutorial.org/miscellaneous/clicking-on-objects/ or ask
in some OpenGL group.
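
As one illustration of such a mapping, here is a sketch of picking by ray casting against bounding spheres. This is only one of several possible approaches and is not taken from the post above. All names are hypothetical, and turning a mouse position into a ray (unprojecting through the camera matrices) is assumed to have been done already.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Each selectable object is approximated by a bounding sphere.
struct Sphere { Vec3 center; double radius; };

// Returns the index of the nearest sphere the ray passes through, or -1 if none.
// 'dir' is assumed to be normalized. "Nearest" is judged by the point of
// closest approach to the sphere centre, not by the exact entry point.
inline int pick(Vec3 origin, Vec3 dir, const std::vector<Sphere>& objects)
{
    int best = -1;
    double best_t = 0.0;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        const Vec3 oc = sub(objects[i].center, origin);
        const double t = dot(oc, dir);            // ray parameter at closest approach
        if (t < 0.0) continue;                    // centre lies behind the ray origin
        const double d2 = dot(oc, oc) - t * t;    // squared distance from centre to ray
        if (d2 > objects[i].radius * objects[i].radius) continue;  // ray misses
        if (best < 0 || t < best_t) {
            best = static_cast<int>(i);
            best_t = t;
        }
    }
    return best;
}
```

Bounding spheres keep the test cheap. A real modelling program would probably refine a hit against the actual mesh.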
 
On the other hand, if you meant to only display a single object at a
time, you can do a "non-OO UI", controlling your object with a bunch of
buttons, scrollers etc. outside of your view and/or a set of keyboard
accelerators. In any case you will need to update your scene on these
actions. This may be done, e.g., by changing your CDocument and notifying
your view, but you are of course free to use entirely custom domain
data structures. Frankly, the more I think about it, the more I like the
idea of using CDocument/CView framework for this kind of job. It was put
together for a reason and is largely based on a sound theory developed
by really smart people before Microsoft came to be.
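
The "change the document, then notify the view" flow can be sketched in plain C++ as below. None of these are MFC types: Document, View and SceneView are hypothetical stand-ins, and update_all_views / on_update are only loosely modelled on MFC's CDocument::UpdateAllViews and CView::OnUpdate.

```cpp
#include <cassert>
#include <vector>

class Document;

// Stand-in for a CView-like class; not an MFC type.
class View {
public:
    virtual ~View() = default;
    virtual void on_update(const Document& doc) = 0;
};

// Stand-in for a CDocument-like class holding the scene state.
class Document {
public:
    void add_view(View* v) { views_.push_back(v); }
    double rotation_deg() const { return rotation_deg_; }

    // Would be called from a button, scroller or keyboard accelerator handler.
    void rotate_by(double deg)
    {
        rotation_deg_ += deg;
        update_all_views();
    }

private:
    void update_all_views()
    {
        for (View* v : views_) v->on_update(*this);
    }

    std::vector<View*> views_;   // non-owning pointers to registered views
    double rotation_deg_ = 0.0;
};

// A view that would redraw the scene; here it only records what it was told.
class SceneView : public View {
public:
    void on_update(const Document& doc) override
    {
        last_rotation_deg = doc.rotation_deg();
        ++redraw_count;
    }
    double last_rotation_deg = 0.0;
    int redraw_count = 0;
};
```

The input handlers never touch the view directly. They change the document, and every registered view is told to refresh.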
JiiPee <no@notvalid.com>: Nov 10 10:18AM

On 10/11/2018 03:48, Pavel wrote:
> idea of using CDocument/CView framework for this kind of job. It was put
> together for a reason and is largely based on a sound theory developed
> by really smart people before Microsoft came to be.
 
 
Yes, I am planning to use that framework. But I am still struggling to
understand your point about why isolating the OpenGL code from the view,
and possibly the document code from CDocument, would be bad.


I can read your messages, but is it possible to make a very simple,
couple-of-lines code example illustrating this issue (why a Renderer
would be bad/worse)?
Paavo Helde <myfirstname@osa.pri.ee>: Nov 10 09:54AM +0200

On 10.11.2018 1:22, Jorgen Grahn wrote:
 
> FWIW, I wouldn't do it like that: it's cryptic, and I suspect it's
> slow. Lookup tables made more sense in the 1990s, before CPUs became
> much faster than RAM.
 
To be honest, for a 256-byte lookup table one should talk about CPU
cache speeds, not RAM speed. Also, if the lookup table is small enough
it can fit directly into CPU registers. Google found a link which claims
that the fastest way to count raised bits is the built-in hardware
instruction (POPCNT in newer Intel/AMD), followed closely by a table
lookup for each 4 bits, all inside the CPU registers:
 
"https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer"
 
 
It also provides a high level function, but in my eyes this is not less
cryptic than the macros ;-)
 
int numberOfSetBits(int i)
{
    // Java: use >>> instead of >>
    // C or C++: use uint32_t
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
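
For comparison, the 4-bit table lookup mentioned above can be sketched as a plain 16-entry array. This is an ordinary in-memory table, not the in-register variant the linked answer is said to describe. The function name is made up, and an unsigned 32-bit type is used, as the comment in the quoted code advises for C++.

```cpp
#include <cassert>
#include <cstdint>

// Count set bits by looking up each 4-bit group in a 16-entry table.
inline unsigned popcount_nibble_table(std::uint32_t x)
{
    // table[v] = number of set bits in the 4-bit value v (0..15)
    static const unsigned char table[16] = {
        0, 1, 1, 2,  1, 2, 2, 3,
        1, 2, 2, 3,  2, 3, 3, 4
    };
    unsigned n = 0;
    for (int k = 0; k < 8; ++k) {   // a 32-bit word has eight 4-bit groups
        n += table[x & 0xFu];
        x >>= 4;
    }
    return n;
}
```

Which variant is faster on a given machine is something to measure rather than assume.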
Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 10 08:15AM

On Sat, 2018-11-10, Paavo Helde wrote:
>> much faster than RAM.
 
> To be honest, for a 256-byte lookup table one should talk about CPU
> cache speeds, not RAM speed.
 
Yes, but the difference between CPU and RAM speeds is the reason
caches exist. I didn't want to go too deep into all that, because
it's complex and I don't know it well.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .