Tuesday, August 22, 2023

Digest for comp.lang.c++@googlegroups.com - 25 updates in 2 topics

Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 05:27AM +0200


>>> I'm fairly sure I could speed it up if I could be bothered.
 
>> For sure not.
 
> And why's that then? ...
 
Because you can only write simple code.
 
"Öö Tiib" <ootiib@hot.ee>: Aug 21 11:25PM -0700

On Tuesday, 22 August 2023 at 06:27:51 UTC+3, Bonita Montero wrote:
 
> >> For sure not.
 
> > And why's that then? ...
 
> Because you can only write simple code.
 
Do not hijack a nice thread about finding primes with your optimising
of textual output.
Store the primes however you want to and do not time text streaming
at all. The problem was to find primes.
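
A minimal sketch of what "do not time text streaming" could look like in practice: wrap only the computation in a timer and do any printing afterwards. The helper name elapsedMilliseconds is hypothetical and is not taken from any program posted in this thread.

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns the wall-clock time it took, in milliseconds.
// Only the code inside `work` is measured, so output can be kept out of it.
template <typename Work>
double elapsedMilliseconds(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

In a sieve program the lambda passed in would contain only the prime search; writing the results would happen after the call returns.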
Muttley@dastardlyhq.com: Aug 22 06:56AM

On Tue, 22 Aug 2023 05:27:36 +0200
 
>>> For sure not.
 
>> And why's that then? ...
 
>Because you can only write simple code.
 
Simpler code = more efficient, less bugs and easier to maintain all other
things being equal. Complexity in code isn't a goal, it's something to be
kept to a minimum. If you were even half the genius coder you seem to think
you are you'd already be aware of this.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 11:23AM +0200

Am 22.08.2023 um 08:25 schrieb Öö Tiib:
 
>> Because you can only write simple code.
 
> Do not hijack a nice thread about finding primes with your optimising
> of textual output.
 
The speedup over cout << p << endl is massive, at least with MSVC;
I've measured that, you didn't.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 11:27AM +0200


>>> And why's that then? ...
 
>> Because you can only write simple code.
 
> Simpler code = more efficient, ...
 
I've shown that my code is 2.3 times more efficient.
In most cases more complex code is faster.
 
> ... less bugs ...
 
Show me the bugs. I can manage such code, you can't. I used
iterators as much as possible to have iterator debugging to find
bugs if any had occurred.
 
> Complexity in code isn't a goal, ...
 
It's inevitable if the code should be more efficient.
 
> If you were even half the genius coder you seem to think
> you are you'd already be aware of this.
 
I'm not a genius, but I have extreme calm in my head that
I can program something like this.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 11:34AM +0200

Am 22.08.2023 um 11:27 schrieb Bonita Montero:
 
>> Simpler code = more efficient, ...
 
> I've shown that my code is 2.3 times more efficient.
> In most cases more complex code is faster.
 
On my secondary Linux PC (Threadripper 3990X) my code is 61% faster.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 11:59AM +0200

Am 22.08.2023 um 11:23 schrieb Bonita Montero:
>> textual output.
 
> The speedup over cout << p << endl is massive, at least with MSVC;
> I've measured that, you didn't.
 
C:\Users\Boni\Documents\Source\bitmapSieve>bg --times --affinity 1 --high x64\Release\bitmapSieve 100000000
real 9675.36ms
user 2875.00ms
sys 6718.75ms
cycles 43.369.468.839
 
C:\Users\Boni\Documents\Source\bitmapSieve>bg --times --affinity 1 --high x64\Release\bitmapSieve 100000000
real 370.28ms
user 281.25ms
sys 62.50ms
cycles 1.653.010.785
 
26 times faster on Windows (Zen4).
 
boni@EliteDesk:~$ g++ -march=native -std=c++20 -O2 bitmapSieve.cpp
boni@EliteDesk:~$ time ./a.out 100000000
 
real 0m17,552s
user 0m5,406s
sys 0m12,113s
boni@EliteDesk:~$ g++ -march=native -std=c++20 -O2 bitmapSieve.cpp
boni@EliteDesk:~$ time ./a.out 100000000
 
real 0m1,808s
user 0m1,680s
sys 0m0,085s
 
9.7 times faster on Linux (Skylake).
"Öö Tiib" <ootiib@hot.ee>: Aug 22 05:15AM -0700

On Tuesday, 22 August 2023 at 12:23:53 UTC+3, Bonita Montero wrote:
> > textual output.
 
> The speedup over cout << p << endl is massive, at least with MSVC;
> I've measured that, you didn't.
 
We already know that you are especially daft but I repeat: it does not
matter to the efficiency of finding primes. Open another thread if you want
to discuss text output optimisations rather than finding primes.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 02:23PM +0200

Am 22.08.2023 um 14:15 schrieb Öö Tiib:
 
> We already know that you are especially daft but I repeat: it does not
> matter to the efficiency of finding primes. Open another thread if you want
> to discuss text output optimisations rather than finding primes.
 
I optimized both inside the _same_ program. You and Muttley
think that simple coding is enough; I've shown multiple times
that I can do it much better.
Unfortunately printf() and cout << are nowhere near as fast
as they could be, even if the output is redirected to a file.
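
To illustrate the kind of output optimisation being argued about (this is not the code that was measured, which is not shown in this thread), here is a minimal sketch of one common approach: format all numbers into a single buffer with std::to_chars and write that buffer once, instead of streaming and flushing per line with endl. The names formatLines and writeLines are hypothetical.

```cpp
#include <charconv>
#include <cstddef>
#include <ostream>
#include <string>
#include <vector>

// Formats every value on its own line into one contiguous buffer.
std::string formatLines(const std::vector<std::size_t>& values) {
    std::string out;
    char digits[24];  // a 64-bit value needs at most 20 decimal digits
    for (std::size_t v : values) {
        // The buffer is large enough, so the conversion cannot fail here.
        const std::to_chars_result r =
            std::to_chars(digits, digits + sizeof digits, v);
        out.append(digits, static_cast<std::size_t>(r.ptr - digits));
        out.push_back('\n');
    }
    return out;
}

// Writes the whole buffer with a single call and no per-line flush.
void writeLines(std::ostream& os, const std::vector<std::size_t>& values) {
    const std::string text = formatLines(values);
    os.write(text.data(), static_cast<std::streamsize>(text.size()));
}
```

How much this gains over cout << p << endl depends on the standard library and platform, so it would have to be measured for each toolchain.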
Muttley@dastardlyhq.com: Aug 22 02:13PM

On Tue, 22 Aug 2023 11:27:29 +0200
>> Simpler code = more efficient, ...
 
>I've shown that my code is 2.3 times more efficient.
>In most cases more complex code is faster.
 
No, not in most cases as endless bloated frameworks with their huge memory
and CPU requirements demonstrate on a daily basis to me. There's a big
difference between code that's fully featured and code that's needlessly
complicated because the developer(s) were unable to think clearly or see the
obvious solution.
 
>> Complexity in code isn't a goal, ...
 
>It's inevitable if the code should be more efficient.
 
Bollocks.
 
>> you are you'd already be aware of this.
 
>I'm not a genius, but I have extreme calm in my head that
>I can program something like this.
 
Oh right, you're a Zen dev are you? :) Whatever you say!
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 04:17PM +0200


> No, not in most cases as endless bloated frameworks ...
 
I'm using standard C++ and no bloated frameworks. The memory
consumption is eight times less than with your code because I store
each bool as a bit. Your code has bloat, not mine in that sense.
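
A minimal sketch of the "one bit per number" idea, assuming a plain sieve of Eratosthenes over 64-bit words. The class name BitSieve and its members are hypothetical; the actual bitmapSieve source is not shown in this thread and may differ substantially.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sieve of Eratosthenes that stores one "composite" flag per bit.
class BitSieve {
public:
    explicit BitSieve(std::size_t limit)
        : limit_(limit), words_(limit / 64 + 1, 0) {
        // i <= limit_ / i is an overflow-safe way of writing i * i <= limit_.
        for (std::size_t i = 2; i <= limit_ / i; ++i) {
            if (isComposite(i)) continue;
            for (std::size_t j = i * i; j <= limit_; j += i) markComposite(j);
        }
    }

    bool isPrime(std::size_t n) const {
        return n >= 2 && n <= limit_ && !isComposite(n);
    }

    // Storage used for the flags: roughly limit / 8 bytes.
    std::size_t bytesUsed() const {
        return words_.size() * sizeof(std::uint64_t);
    }

private:
    bool isComposite(std::size_t n) const {
        return (words_[n / 64] >> (n % 64)) & 1u;
    }
    void markComposite(std::size_t n) {
        words_[n / 64] |= std::uint64_t{1} << (n % 64);
    }

    std::size_t limit_;
    std::vector<std::uint64_t> words_;
};
```

Each access costs an extra shift and mask compared with one byte per flag; the trade-off is the eightfold smaller flag array.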
 
> There's a big difference between code that's fully featured and code
> that's needlessly complicated because the developer(s) were unable to
> think clearly or see the obvious solution.
 
It overburdens you, not every developer.
Muttley@dastardlyhq.com: Aug 22 02:27PM

On Tue, 22 Aug 2023 16:17:21 +0200
 
>I'm using standard C++ and no bloated frameworks. The memory
>consumption is eight times less than with your code because I store
>each bool as a bit. Your code has bloat, not mine in that sense.
 
Accessing bits in a word is normally slower than accessing the whole word
on most CPUs. However due to the huge memory this sort of algo can use this
access time is probably more than offset by paging delays.
 
Using bits doesn't make you a genius.
 
>> that's needlessly complicated because the developer(s) were unable to
>> think clearly or see the obvious solution.
 
>It overburdens you, not every developer.
 
Whoosh...
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 04:48PM +0200


> Accessing bits in a word is normally slower than accessing the whole word
> on most CPUs. ...
 
I have shown that when the amount of data is large enough, access time
has the greatest weight. Furthermore, most of the calculations for the
bits within the byte in other arithmetic units take place in parallel.
If I choose the data set small enough, your algorithm is faster, but
then performance doesn't count anyway.
 
> Using bits doesn't make you a genius.
 
Did I claim that for myself ?
Muttley@dastardlyhq.com: Aug 22 02:57PM

On Tue, 22 Aug 2023 16:48:45 +0200
 
>I have shown that when the amount of data is large enough, access time
>has the greatest weight. Furthermore, most of the calculations for the
>bits within the byte in other arithmetic units take place in parallel.
 
CPUs only have 1 data bus no matter how many cores they have so the optimum
number of threads for this sort of algorithm is very much a suck it and
see approach depending on hardware and on a single core CPU your approach
would suck badly.
 
>> Using bits doesn't make you a genius.
 
>Did I claim that for myself ?
 
By denigrating anyone who doesn't like your coding style the implication is
there for all to see. You think yourself a better programmer than anyone
else on this group.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 05:00PM +0200

> number of threads for this sort of algorithm is very much a suck it and
> see approach depending on hardware and on a single core CPU your approach
> would suck badly.
 
Data is fetched in quantities of a cacheline size. And with my code
an eighth of that quantity is fetched from memory. That's the reason
why my code is much faster than yours.
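
The arithmetic behind the "an eighth" claim can be sketched as follows, assuming a 64-byte cache line (typical for current x86-64 parts, but an assumption here) and the 100000000-element run from the timings earlier in the thread. The function names are hypothetical.

```cpp
#include <cstddef>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLineBytes = 64;

// Cache lines touched when every flag occupies a whole byte.
constexpr std::size_t cacheLinesForByteFlags(std::size_t flagCount) {
    return (flagCount + kCacheLineBytes - 1) / kCacheLineBytes;
}

// Cache lines touched when eight flags are packed into each byte.
constexpr std::size_t cacheLinesForBitFlags(std::size_t flagCount) {
    const std::size_t bytes = (flagCount + 7) / 8;
    return (bytes + kCacheLineBytes - 1) / kCacheLineBytes;
}

// 100 MB of byte flags versus 12.5 MB of bit flags.
static_assert(cacheLinesForByteFlags(100'000'000) == 1'562'500);
static_assert(cacheLinesForBitFlags(100'000'000) == 195'313);
```

This only counts the footprint of one full pass over the flags; it says nothing about how the extra shift-and-mask work per access compares on a given CPU.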
 
> By denigrating anyone who doesn't like your coding style the implication
> is there for all to see. You think yourself a better programmer than anyone
> else on this group.
 
That's obvious. I have been programming in C++ since 1992 and
programming at all since I was 10.
scott@slp53.sl.home (Scott Lurndal): Aug 22 03:49PM

>>has the greatest weight. Furthermore, most of the calculations for the
>>bits within the byte in other arithmetic units take place in parallel.
 
>CPUs only have 1 data bus no matter how many cores
 
That is not an accurate statement for any processor built in the last
decade or more. Most of them use a mesh, crossbar or ring structure, not
a shared bus structure, to interface the cores to the memory subsystem on
the chip. Most have multiple DRAM controllers (from one at the real low
end to twenty at the high end) and stripe the DRAM address space across
the controllers. All CPUs have private L1 and L2 caches, and large shared
LLC caches (the LLC is distributed across mesh elements), which
significantly reduce DRAM controller accesses.
 
Likewise, on the I/O side, there are multiple bridges from the
mesh/ring structures into the I/O side (e.g. one host bridge per
PCI Express root port, one or more for on-chip accelerators, etc).
Muttley@dastardlyhq.com: Aug 22 05:56PM

On Tue, 22 Aug 2023 17:00:42 +0200
>> see approach depending on hardware and on a single core CPU your approach
>> would suck badly.
 
>Data is fetched in quantities of a cacheline size. And with my code
 
If it has to be paged from disk in the first place the size of the cache is a
total irrelevance as far as time taken is concerned.
 
>an eighth of that quantity is fetched from memory. That's the reason
>why my code is much faster than yours.
 
As I said, it depends on the hardware. Try both on a Z80 and see what happens.
 
>> is there for all to see. You think yourself a better programmer than anyone
>> else on this group.
 
>That's obvious.
 
Thanks for proving you're as arrogant as I thought.
 
>I have been programming in C++ since 1992 and
>programming at all since I was 10.
 
If you want to get into a pissing contest about who's been programming longer
I suspect a lot of people on this group have been doing it a lot longer than
you.
Muttley@dastardlyhq.com: Aug 22 05:58PM

On Tue, 22 Aug 2023 15:49:09 GMT
 
>>CPUs only have 1 data bus no matter how many cores
 
>That is not an accurate statement for any processor built in
>the last decade or more. Most of them use a mesh, crossbar or ring
 
Given millions of 8 and 16 bit processors are still designed and produced for
embedded systems that statement is nonsense and ARM based systems can use
whatever parts they want in the CPU.
 
 
>Likewise, on the I/O side, there are multiple bridges from the
>mesh/ring structures into the I/O side (e.g. one host bridge per
>PCI Express root port, one or more for on-chip accelerators, etc).
 
That's nice, but CPU > x86.
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 08:13PM +0200


> If it has to be paged from disk in the first place the size of the
> cache is a total irrelevance as far as time taken is concerned.
 
No one uses software that actually pages to disk.
Paging is there just so that the software doesn't crash.
 
> As I said, it depends on the hardware. Try both on a Z80 and see what happens.
 
Your code wasn't intended for a Z-80.
 
> Thanks for proving you're as arrogant as I thought.
 
When you're good at something, you often appear arrogant.
 
> If you want to get into a pissing contest about who's been programming
> longer I suspect a lot of people on this group have been doing it a lot
> longer than you.
 
Maybe, but at this level.
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Aug 21 10:55PM -0400

Öö Tiib wrote:
>> compute its type and/or call a virtual method".
 
> IOW you do huge switch typeid ... case static_cast or have the virtual
> method (over what we were "gaining advantage") already there?
Even a switch with 30 cases is no worse than 30 methods although switch
is not always necessary (try to read the above instead of imagining).
 
The advantage is in having this logic in one place. If your visitor does
one coherent thing (which it is supposed to do; otherwise why is it a
single visitor?), there could be, and usually would be, common code for
all or many of the visited classes. The most pliant cases (exactly those
that gain most from applying the visitor pattern) may get by with the
object's virtual methods.
Simple example:
 
class BorderingVisitor: public AbstractVisitor {
public:
    void doOp(const Shape& s) const override {
        if (s.area() > 25) {
            dc.drawRedRectangle(s.boundingRectangle().increaseBy(0.1));
        }
    }
private:
    DrawingContext dc;
};
 
If we wanted a green border around a shape of any of 10 classes derived
from convex polygon (ConvexTriangle, ConvexQuadrilateral etc), red
border around a shape of any of 10 classes derived from Oval and no
border around any of 30 other zero-area shape classes, all we need to do
is to change doOp to:
 
void BorderingVisitor::doOp(const Shape& s) const {
    if (s.area() > 25) {
        const Rectangle br{s.boundingRectangle().increaseBy(0.1)};
        if (const ConvexPolygon* cp = s.castToConvexPolygon()) {
            dc.drawGreenRectangle(br);
        }
        else {
            assert(!!s.castToOval());
            dc.drawRedRectangle(br);
        }
    }
}
 
And this is the complete code for adding a new operation (virtual cast
is to be only defined once for all operations)
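
The "virtual cast" mentioned here is not shown in the post. One way it could be defined once in the hierarchy is sketched below. Shape, ConvexPolygon, Oval, ConvexTriangle, castToConvexPolygon and castToOval come from the post; Circle, Point, BorderColour and borderColourFor are hypothetical names added for illustration.

```cpp
class ConvexPolygon;
class Oval;

// Base class: every cast defaults to "not that kind of shape".
class Shape {
public:
    virtual ~Shape() = default;
    virtual const ConvexPolygon* castToConvexPolygon() const { return nullptr; }
    virtual const Oval* castToOval() const { return nullptr; }
};

// Each intermediate base overrides exactly one cast, once, for all operations.
class ConvexPolygon : public Shape {
public:
    const ConvexPolygon* castToConvexPolygon() const override { return this; }
};

class Oval : public Shape {
public:
    const Oval* castToOval() const override { return this; }
};

class ConvexTriangle : public ConvexPolygon {};
class Circle : public Oval {};   // hypothetical Oval-derived shape
class Point : public Shape {};   // hypothetical zero-area shape

enum class BorderColour { None, Green, Red };

// Mirrors the branch structure of the second doOp, without the drawing.
BorderColour borderColourFor(const Shape& s) {
    if (s.castToConvexPolygon()) return BorderColour::Green;
    if (s.castToOval()) return BorderColour::Red;
    return BorderColour::None;
}
```

Concrete classes such as ConvexTriangle inherit the override from their intermediate base, so adding a new operation needs no change in any shape class.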
 
> But we need no bloat from that pattern to do neither. These were the
> things we wanted to get rid of.
 
Care to show us the code for your "non bloated" "one-place" visitor
class for the above 2 problems? (like I did, you can omit the code that
is common for all operations).
 
 
> Of course huge switch or calling virtual method (or now even multiple)
> work without the whole pattern. Read up on "double dispatch".
> It is tricky like I already said.
 
Surely you are the most competent software engineer understanding
multiple dispatch. Care to point out a "huge switch" in either of the
above 2 examples?
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Aug 21 11:28PM -0400

Bonita Montero wrote:
> You can better do this
I suspect we are talking about different "this"es. A design pattern is a
recipe for solving a problem of a defined class. What is the class of
problems that your code intends to demonstrate how to solve?
 
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 05:47AM +0200

Am 22.08.2023 um 05:28 schrieb Pavel:
> I suspect we are talking about different "this"es. A design pattern is
> a recipe for solving a problem of a defined class. What is the class of
> problems that your code intends to demonstrate how to solve?
 
Visitor pattern and inversion of control is basically the same.
 
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Aug 22 12:11AM -0400

Bonita Montero wrote:
>> a recipe for solving a problem of a defined class. What is the class
>> of problems that your code intends to demonstrate how to solve?
 
> Visitor pattern and inversion of control is basically the same.
What represents the abstract element and concrete elements of different
types derived from the abstract element -- all of which are the
mandatory participants of a Visitor problem -- in your code?
 
"Öö Tiib" <ootiib@hot.ee>: Aug 21 09:56PM -0700

On Tuesday, 22 August 2023 at 05:55:32 UTC+3, Pavel wrote:
> > method (over what we were "gaining advantage") already there?
 
> Even a switch with 30 cases is no worse than 30 methods although switch
> is not always necessary (try to read the above instead of imagining).
 
A visitor is typically used to search, filter, draw or print a whole data
object hierarchy, or to convert it to JSON, to XML or to a tree in a GUI.
If the whole data hierarchy is small then you get 30 cases. A switch-case
is worse than virtual methods. Think about why virtual methods were added:
to get rid of switch-cases over type. The visitor is not meant as an excuse
to add those back; otherwise just use the switch-case in a function, and do
not manufacture visitors that do not use double dispatch, which only causes
confusion.
 
> border around a shape of any of 10 classes derived from Oval and no
> border around any of 30 other zero-area shape classes, all we need to do
> is to change doOp to:
 
Unused cp in your code?
Can't post sane code from gg ... but I also don't want to install newsreaders
on all the devices I use. It draws a border around any shape, so it is unclear
why you need a visitor. The logical single place is a non-virtual method of
either DrawingContext or Shape, or a free function:
 
void Shape::draw_border(DrawingContext &dc) const {
    if (area() <= 25) return;
    const Rectangle br{boundingRectangle().increaseBy(0.1)};
    if (isConvexPolygon()) { dc.drawGreenRectangle(br); return; }
    assert(isOval());
    dc.drawRedRectangle(br);
}
 
 
> And this is the complete code for adding a new operation (virtual cast
> is to be only defined once for all operations)
 
You forget that you have a BorderingVisitor class of unclear lifetime, have
to inject a DrawingContext into it (is it a copy?), etc. The result is
 
BorderingVisitor bv(dc);
s.acceptVisit(bv);

instead of one of:
 
dc.draw_border(s);
 
s.draw_border(dc);
 
draw_border(dc, s);
 
> Care to show us the code for your "non bloated" "one-place" visitor
> class for the above 2 problems? (like I did, you can omit the code that
> is common for all operations).
 
Yes, sorry for gg breaking all formatting.
 
> Surely you are the most competent software engineer understanding
> multiple dispatch. Care to point out a "huge switch" in either of the
> above 2 examples?
 
There is no double dispatch in your code, so the whole acceptVisit and doOp
machinery adds nothing. These were added entirely to do double dispatch, that
is, to add virtual methods to large, potentially unrelated data hierarchies
without changing any of the classes in those hierarchies. What you do is not
even a use case for the visitor pattern.
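
For comparison, the classic double-dispatch form of the pattern can be sketched as below. Only the name acceptVisit is taken from the thread; ShapeVisitor, visit, Circle, Square, NameVisitor and nameOf are hypothetical names chosen for illustration.

```cpp
#include <string>

class Circle;
class Square;

// One overload per concrete element type.
class ShapeVisitor {
public:
    virtual ~ShapeVisitor() = default;
    virtual void visit(const Circle&) = 0;
    virtual void visit(const Square&) = 0;
};

class Shape {
public:
    virtual ~Shape() = default;
    // First dispatch: the virtual call selects the concrete shape type.
    virtual void acceptVisit(ShapeVisitor& v) const = 0;
};

class Circle : public Shape {
public:
    // Second dispatch: overload resolution on *this picks visit(const Circle&),
    // and the virtual call selects the concrete visitor.
    void acceptVisit(ShapeVisitor& v) const override { v.visit(*this); }
};

class Square : public Shape {
public:
    void acceptVisit(ShapeVisitor& v) const override { v.visit(*this); }
};

// A new operation is added as a new visitor; no shape class changes.
class NameVisitor : public ShapeVisitor {
public:
    void visit(const Circle&) override { name = "circle"; }
    void visit(const Square&) override { name = "square"; }
    std::string name;
};

std::string nameOf(const Shape& s) {
    NameVisitor v;
    s.acceptVisit(v);
    return v.name;
}
```

The operation never tests the element type itself; the two virtual calls together select the right visit overload.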
Bonita Montero <Bonita.Montero@gmail.com>: Aug 22 02:30PM +0200

Am 22.08.2023 um 06:11 schrieb Pavel:
> What represents the abstract element and concrete elements of different
> types derived from the abstract element -- all of which are the
> mandatory participants of a Visitor problem -- in your code?
 
My code and the visitor-pattern have in common that
there's some callback with its own state, i.e. both
have some inversion of control.
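
A minimal sketch of "a callback with its own state", assuming a simple tree walk: the traversal owns the control flow and calls back into caller-supplied code that accumulates a result. Node, walk and SumCallback are hypothetical names; this is not the code referred to in the post, which is not shown in this digest.

```cpp
#include <vector>

struct Node {
    int value;
    std::vector<Node> children;
};

// The traversal decides when the callback runs (inversion of control).
template <typename Callback>
void walk(const Node& node, Callback& callback) {
    callback(node.value);
    for (const Node& child : node.children) walk(child, callback);
}

// The callback carries its own state between invocations.
struct SumCallback {
    long total = 0;
    void operator()(int value) { total += value; }
};
```

Unlike the double-dispatch visitor, nothing here selects behaviour by element type; the callback sees every node through the same interface.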
 
 