Tuesday, January 20, 2015

Digest for comp.lang.c++@googlegroups.com - 10 updates in 2 topics

legalize+jeeves@mail.xmission.com (Richard): Jan 19 11:38PM

[Please do not mail me a copy of your followup]
 
Martijn Lievaart <m@rtij.nl.invlalid> spake the secret code
>older (extinct) microprocessors had rather slow indirect function calls,
>so that may also spark those 'no virtual functions' rants. In the end,
>profile, profile, profile.
 
In this case, I am talking about a particular discussion I had with
some game developers where they yelled out how virtual functions were
bad and I drilled down to find out the real problem which was keeping
your cache hot. Virtual functions and hot caches are not
incompatible, but if you blindly use virtual functions all over the
place without regard to how they affect your cache, then you can have
problems. They may think that they "solved" the problem by banishing
virtual functions, but when you banish virtual functions you're forced
to organize your code differently and it was the different
organization that gave them hot caches, not the banishment of virtual
functions.
 
Again, it comes down to measurement and understanding system performance
as a whole and not simplistically avoiding things like virtual functions,
std::vector or C++ for that matter. But hey, it's more "exciting" to
screech against virtual functions than it is to repeat the
time-honored advice of keeping your caches hot.
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
"Öö Tiib" <ootiib@hot.ee>: Jan 19 04:48PM -0800

On Tuesday, 20 January 2015 01:39:11 UTC+2, Richard wrote:
> std::vector or C++ for that matter. But hey, it's more "exciting" to
> screech against virtual functions than it is to repeat the
> time-honored advice of keeping your caches hot.
 
The problem is when people use 'virtual' where run-time polymorphism isn't
needed at all. If run-time polymorphism is needed then virtual functions are
commonly more efficient than the typical alternatives. Typical alternatives
are done with "type" or "kind" member and then switch-case or if-else or
lookup-in-table to find out the correct behaviour. Such "polymorphism"
is harder to read and slower than the single extra level of indirection
of a virtual call.
Martijn Lievaart <m@rtij.nl.invlalid>: Jan 20 12:43PM +0100

On Mon, 19 Jan 2015 16:48:32 -0800, Öö Tiib wrote:
 
> behaviour. Such "polymorphism"
> is worse to read and slower than one level of additional indirection
> from virtual call.
 
True, except on one point: such "polymorphism" MAY be slower. I have seen
plenty of cases where it wasn't.
 
Profile, profile, profile.
 
M4
"Öö Tiib" <ootiib@hot.ee>: Jan 20 08:59AM -0800

On Tuesday, 20 January 2015 13:46:00 UTC+2, Martijn Lievaart wrote:
> > from virtual call.
 
> True, but on one point. Such "polymorphism" MAY be slower. I have seen
> plenty of cases where it wasn't.
 
Dinosaur switch-cases have always been much slower.
 
> Profile, profile, profile.
 
Profiling typically reveals that the higher-level algorithms are naive.
People waste their time micro-optimising the functions that are called
most often and forget to look for the obvious opportunity to reduce the
call count twenty-fold.
Martijn Lievaart <m@rtij.nl.invlalid>: Jan 20 08:29PM +0100

On Tue, 20 Jan 2015 08:59:28 -0800, Öö Tiib wrote:
 
 
>> True, but on one point. Such "polymorphism" MAY be slower. I have seen
>> plenty of cases where it wasn't.
 
> Dinosaur switch-cases have always been much slower.
 
Nope. This is cargo cult programming. IF it is really important (that
should always be the first question) AND there are no more algorithmic
gains (that should be the second question) then, and only then, don't
assume: measure.
 
> People have wasted their time to micro-optimise functions that are
> called most often and forgot to search for the obvious opportunity to
> reduce the count of calls 20 times.
 
True, but besides the point.
 
M4
"Öö Tiib" <ootiib@hot.ee>: Jan 20 12:18PM -0800

On Tuesday, 20 January 2015 21:31:10 UTC+2, Martijn Lievaart wrote:
> should always be the first question) AND there are no more algorithmic
> gains (should be second question) then, and only then, don't assume,
> measure.
 
Done for decades. The result was told to you: "Dinosaur switch-cases have
always been much slower than virtual functions."
It is also logical. Otherwise the compiler would generate such switch-cases
under the hood for (at least some) virtual calls.
scott@slp53.sl.home (Scott Lurndal): Jan 20 08:52PM

>It is also logical. Otherwise compiler would generate for (at least some of)
>virtual calls such switch-cases under the hood.
 
I'm afraid it is difficult to take your word for this without
any data to support it.
 
It's clearly dependent upon each program. Considering that a
case statement (where the case index is non-sequential or sparse) is
generally a sequence of compares and branches, a good branch
predictor will keep the instruction pipeline full. A virtual
function call, being a non-predictable indirect branch, will not only
result in a pipeline flush, but will also often, even likely
for large objects, hit a completely different cacheline to access
the vtbl for the object which, depending on residency in the LLC,
may result in a delay of between 80 and 400 instructions to fill.
 
Of course, a case statement where the indexes are relatively
sequential will often be generated as a simple table lookup
followed by an indirect branch within the current instruction
stream. Two or three instructions, likely icache resident and
no LLC fill required.
 
The "otherwise compiler would generate such switch-cases under
the hood" statement is ridiculous, as that would not be
optimal in any case (pun intended) and would clearly be incompatible
with the relevant ABIs.
Melzzzzz <mel@zzzzz.com>: Jan 20 10:05PM +0100

On Tue, 20 Jan 2015 20:52:36 GMT
> for large objects, hit a completely different cacheline to access
> the vtbl for the object which, depending on residency in the LLC,
> may result in a delay of between 80 and 400 instructions to fill.
There is a compiler that uses a tree of tests rather than vtbls:
http://smarteiffel.loria.fr/
It follows the same idea you describe.
"Öö Tiib" <ootiib@hot.ee>: Jan 20 01:23PM -0800

On Tuesday, 20 January 2015 22:52:49 UTC+2, Scott Lurndal wrote:
> for large objects, hit a completely different cacheline to access
> the vtbl for the object which, depending on residency in the LLC,
> may result in a delay of between 80 and 400 instructions to fill.
 
If we are talking about virtual calls made in some inner
loops, then the vtables of the classes involved are hot in cache.
If we are talking about rare virtual calls, then those do not
affect performance.
 
> the hood" statement is ridiculous, as that would not be
> optimal in any case (pun intended) and clearly incompatible
> with the relevent ABI's.
 
What ABIs? A C++ compiler does pretty much whatever it wants as
long as the externally observable behaviour stays the same. If it is
certain about an object's type then it calls the virtual functions
non-virtually. It (or the linker) may even inline them.
ram@zedat.fu-berlin.de (Stefan Ram): Jan 19 11:54PM

>I am learning c++ after many years of modula-2. To my eye, assignment
>operator is := and equality comparison is =.
 
From the POV of mathematics, »:=« is as "wrong" as »=«, because
a mathematical definition like »x := 2« is not an assignment
(it is more like an initialization of a const value).
An assignment is not a definition; it is a write operation.
 
Early languages already got this right and chose another symbol,
an arrow »<-«. This arrow, IIRC, was even part of ASCII 1963.
The language might have been some version of Algol, which itself
is sometimes given as a predecessor of C.
 
However, »<-« had disappeared from ASCII by 1968. Of course,
today, we have Unicode.
 
In some cases, it might work to write one's C code as Unicode,
using special symbols for assignment and comparison. A tiny
preprocessor can then run before compilation. Another approach
is used in literate programming, where, IIRC, C code is rendered
with arrows for assignments (while still written with equal signs).
 
Me, I just like to write code like »if( ok = «, e.g., as in:
 
#include <iostream>
#include <ostream>
#include <limits>
 
int main()
{ double x;
  bool ok;
  do
  { ::std::cout << "Number? ";
    if( ok = ::std::cin >> x )::std::cout << x << '\n';
    else
    { ::std::cout << "?REDO FROM START\n";
      ::std::cin.clear();
      ::std::cin.ignore
      ( ::std::numeric_limits< std::streamsize >::max(), '\n' ); }}
  while( !ok ); }
 
I deem »if( ok = « to be idiomatic and do not want to do without it,
therefore I use:
 
-Wno-parentheses
 
.
