Wednesday, September 7, 2022

Digest for comp.lang.c++@googlegroups.com - 13 updates in 1 topic

Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:46AM +0200

Am 06.09.2022 um 20:49 schrieb Lynn McGuire:
 
> You need to unroll the loop for the common cases of e == 1, e == 2, e ==
> 3, etc.
 
> Lynn
 
LOL.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:48AM +0200

Am 06.09.2022 um 19:08 schrieb Bonita Montero:
>             v *= sq;
>     return e >= 0 ? v : 1.0 / v;
> }
 
I've got it ! I've optimized away one jump and the performance is
now the same:
 
double powPow( double d, int e )
{
auto run = [&]<bool Neg>( bool_constant<Neg> ) -> double
{
unsigned eAbs;
if constexpr( !Neg )
eAbs = e;
else
eAbs = -e;
double v = 1.0, sq = d;
for( ; eAbs; eAbs >>= 1, sq *= sq )
if( eAbs & 1 )
v *= sq;
if constexpr( Neg )
v = 1.0 / v;
return v;
};
if( e >= 0 )
return run( false_type() );
else
return run( true_type() );
}
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 07:17AM +0200

Now with loop-unrolling, but not the one Lynn suggested:
 
double powPow( double d, int e )
{
auto run = [&]<bool Neg>( bool_constant<Neg> ) -> double
{
auto unroll_n = [&]<typename Fn, size_t ... Indices>( Fn fn,
std::index_sequence<Indices ...> )
{
return (fn.template operator ()<Indices>() && ...);
};
unsigned eAbs;
if constexpr( !Neg )
eAbs = e;
else
eAbs = -e;
double v = 1.0, sq = d;
while( unroll_n(
[&]<size_t I>() -> bool
{
if( !eAbs )
return false;
if( eAbs & 1 )
v *= sq;
sq *= sq;
eAbs >>= 1;
return true;
}, std::make_index_sequence<5>() ) );
if constexpr( Neg )
v = 1.0 / v;
return v;
};
if( e >= 0 )
return run( false_type() );
else
return run( true_type() );
}
 
Now it's 13% faster than MSVCRT.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 09:42AM +0200

This is +110% faster than MSVCRT's implementation:
 
ouble powPow( double d, int e )
{
constexpr size_t UNROLL_FACTOR = 1;
auto run = [&]<bool Neg>( bool_constant<Neg> ) -> double
{
unsigned eAbs = !Neg ? e : -e;
double v = 1.0, sq = d;
for( ; eAbs; eAbs >>= 1, sq *= sq )
v *= (int)(eAbs & 1) * sq + (int)(eAbs & 1 ^ 1);
v = !Neg ? v : 1.0 / v;
return v;
};
if( e >= 0 )
return run( false_type() );
else
return run( true_type() );
}
 
There are no conditional moves for floating-point operations,
so I did the decision whether to add the squared value with
that: (int)(eAbs & 1) * sq + (int)(eAbs & 1 ^ 1).
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Sep 07 10:37AM +0200

On 7 Sep 2022 06:48, Bonita Montero wrote:
>     else
>         return run( true_type() );
> }
 
Maybe you could measure also my old code with slightly different
approach (no `constexpr`), at <url:
https://github.com/progrock-libraries/kickstart/blob/master/source/library/kickstart/main_library/core/ns%E2%96%B8language/operations/intpow.hpp>:
 
 
#include <assert.h>
 
#include <type_traits> // is_floating_point_v
 
// Important to not introduce possible future name conflicts with
<math.h>, hence
// the "lx" (short for "language extension") ns▸
namespace kickstart::language::lx::_definitions {
using std::is_floating_point_v;
 
namespace impl
{
// Essentially this is Horner's rule adapted to calculating a
power, so that the
// number of floating point multiplications is at worst O(log₂n).
template< class Number_type >
constexpr inline auto intpow_( const Number_type base, const
int exponent )
-> Number_type
{
Number_type result = 1;
Number_type weight = base;
for( int n = exponent; n != 0; weight *= weight ) {
if( n % 2 != 0 ) {
result *= weight;
}
n /= 2;
}
return result;
}
} // namespace impl
 
template< class Number_type >
constexpr inline auto intpow( const Number_type base, const int
exponent )
-> Number_type
{
// TODO: proper failure handling
if( exponent < 0 and not is_floating_point_v<Number_type> ) {
assert( "For negative powers the base must be floating
point" and false );
}
return (0?0
: exponent > 0? impl::intpow_<Number_type>( base,
exponent )
: exponent == 0? 1
: Number_type(
1.0/impl::intpow_<Number_type>( base, -exponent ) )
);
}
 
 
//-----------------------------------------------------------
@exported:
namespace d = _definitions;
namespace exported_names { using
d::intpow;
} // namespace exported names
} // namespace kickstart::language::lx::_definitions
 
namespace kickstart::language::lx { using namespace
_definitions::exported_names; }
 
 
In the above copy+paste I removed an include of a header that defines a
`bool` replacement type called `Truth`, b/c I can't see that it's used.
 
The main difference from your code is that I discriminate the different
cases in a public function that just calls the internal implementation
with appropriate adjustments of arguments and result. Also, I use IMO
more clear notation like `/= 2` instead of `>>= 1`. Should yield same
machine code though.
 
I never measured this so it's interesting how it would stack up to your
code, and MSVC's.
 
 
- Alf
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 01:38PM +0200

I think that it's very rare that someone uses an integer type for
pow( x, intType ). And with a double you have more precision within
the calculcaton itself and you'd just to cast back the result. So
ther's no need to have that generic.
Juha Nieminen <nospam@thanks.invalid>: Sep 07 01:02PM


> There are no conditional moves for floating-point operations,
> so I did the decision whether to add the squared value with
> that: (int)(eAbs & 1) * sq + (int)(eAbs & 1 ^ 1).
 
test.cc:3:18: error: 'size_t' does not name a type
test.cc:4:34: error: 'bool_constant' has not been declared
test.cc:14:28: error: 'false_type' was not declared in this scope
test.cc:16:28: error: 'true_type' was not declared in this scope
scott@slp53.sl.home (Scott Lurndal): Sep 07 02:13PM

>> 3, etc.
 
>> Lynn
 
>LOL.
 
Instead of insults, how about you simply disassemble the MS runtime
version of the function and figure out for yourself how it works?
 
A competent programmer would have no problem doing that.
scott@slp53.sl.home (Scott Lurndal): Sep 07 02:14PM

>test.cc:4:34: error: 'bool_constant' has not been declared
>test.cc:14:28: error: 'false_type' was not declared in this scope
>test.cc:16:28: error: 'true_type' was not declared in this scope
 
Typical unreadable code from Christoph/Bonnie.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 05:18PM +0200

Am 07.09.2022 um 16:14 schrieb Scott Lurndal:
>> test.cc:14:28: error: 'false_type' was not declared in this scope
>> test.cc:16:28: error: 'true_type' was not declared in this scope
 
> Typical unreadable code from Christoph/Bonnie.
 
There's nothing unreadable with that.
It's simply that you're both too stupid to adapt my style.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 05:19PM +0200

Am 07.09.2022 um 16:13 schrieb Scott Lurndal:
 
>> LOL.
 
> Instead of insults, how about you simply disassemble the MS runtime
> version of the function and figure out for yourself how it works?
 
That's not necessary any more since I've developed some code
that is more than twice as fast as the MSVC implementation.
 
Muttley@dastardlyhq.com: Sep 07 03:45PM

On Wed, 7 Sep 2022 17:18:32 +0200
 
>> Typical unreadable code from Christoph/Bonnie.
 
>There's nothing unreadable with that.
>It's simply that you're both too stupid to adapt my style.
 
Its quite clear you love complexity.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:07PM +0200


>> There's nothing unreadable with that.
>> It's simply that you're both too stupid to adapt my style.
 
> Its quite clear you love complexity.
 
I've written complex code, but this absolutely isn't complex.
You're just overstrained.
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

No comments: