Saturday, January 24, 2015

Digest for comp.lang.c++@googlegroups.com - 25 updates in 5 topics

DSF <notavalid@address.here>: Jan 24 04:36PM -0500

Hello, group.
 
I though of this style for getters and putters, but I assume there
must be something about it that's inelegant, bad, or dare I use the
word of overkill...Evil! (Because I haven't seen it in any source.)
 
class foo
{
public:
 
void LeftPosition(int lpos) {leftposition = lpos;}
int LeftPosition(void) {return leftposition;}
 
void RightPosition(int rpos) {rightposition = rpos;}
int RightPosition(void) {return rightposition;}
 
private:
 
int leftposition;
int rightposition;
};
 
If I were to guess, I would say that people have an aversion to
identical names, save for case.
 
What do you think?
 
DSF
"'Later' is the beginning of what's not to be."
D.S. Fiscus
Ian Collins <ian-news@hotmail.com>: Jan 25 10:39AM +1300

DSF wrote:
> int LeftPosition(void) {return leftposition;}
 
> void RightPosition(int rpos) {rightposition = rpos;}
> int RightPosition(void) {return rightposition;}
 
Inverted capitalisation aside, it's fairly common in code I've seen.
 
--
Ian Collins
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jan 24 10:01PM

On 24/01/2015 21:36, DSF wrote:
 
> DSF
> "'Later' is the beginning of what's not to be."
> D.S. Fiscus
 
Hi.
 
1) In C++ (void) is superfluous, () will suffice.
2) The getter functions should by cv-qualified as 'const'.
3) Personal taste but I prefer different name for setter function namely
'SetLeftPosition'.
 
/Flibble
Jorgen Grahn <grahn+nntp@snipabacken.se>: Jan 24 10:13PM

On Sat, 2015-01-24, Ian Collins wrote:
>> int RightPosition(void) {return rightposition;}
 
> Inverted capitalisation aside, it's fairly common in code
> I've seen.
 
Yes, and I have used the style now and then (although I try to avoid
thinking in terms of getters and setters).
 
The consts are missing, though -- the get functions should be const.
(Apart from other benefits from const, it's also a strong hint what
they do.)
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Paavo Helde <myfirstname@osa.pri.ee>: Jan 24 04:27PM -0600

DSF <notavalid@address.here> wrote in
 
> I though of this style for getters and putters, but I assume there
> must be something about it that's inelegant, bad, or dare I use the
> word of overkill...Evil! (Because I haven't seen it in any source.)
[...]
> int LeftPosition(void) {return leftposition;}
 
Yes, get rid of void and add const. Then it's ok.
 
Cheers
Paavo
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Jan 24 12:42AM -0500


> A possible speed bottleneck, is the masking of the following: (bs & mask_long)
 
> Why does std::bitset not have a way of chopping off unneeded higher bits, and just returning a u_long of the lower bits (without considering any higher bits that are set)? A function like: chop_ulong()
 
> That would allow one to leverage more performance, wouldn't it ?
The performance of such an operation is too dependent on predominant usage and
data distribution to have a generic solution. To illustrated the point about
different data distributions (for simplicity, the below assumes the "particular
bit state" your are searching for is true and you will see that often the best
representation of the bit set is not std::bitset at all):
 
- the probability true and false is 0.5 at every position. The best algorithm is
probably the naive iteration (bitset representation should do)
- the first 'true' can appear at any position with equal probability; the
probability of true and false in any subsequent position is 0.5. You are
probably best off with never storing the useless beginning, as you hinted to
above. You could use some deque<bitset<sizeof(unsigned long) * CHAR_BIT> > and
store the initial position of 1 (which must be in the first bitset) with it.
- there are relatively very few "trues" and all you need is to iterate over
their indices in the bitset in order. You could even use std::set<size_t> that
stores indices of trues instead of the bitset. If you do not need to toggle bits
after creating bitset, you could also a sorted vector of indices of "true" in
this case.
- if your bitset fits in a single value of an integral type, but trues are
relatively rare there, there are a few binary-log algorithms you could use, some
of order of O(log2(sizeof(your-integral-type) * CHAR_BIT), see for example
http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious. Or, you
could use a single platform-dependent instruction, e.g. for Intel:
http://web.itu.edu.tr/kesgin/mul06/intel/instr/bsf.html
 
etc.
 
On a side note, bitset is rarely good for efficiency as you do not have access
to the underlying memory; usually you are better off with your own
representation tailored for your needs.
 
std::vector<bool>, even though it has non-compile-time size, has IMHO a better
theoretical potential for out-of-the box optimization in the Library as the
Library could (I think) specialize algorithm such as std::find for its
const_iterator type using platform-specific tricks in implementation including
those mentioned above and your requested find (reformulated for vector<bool>)
can be implemented in a straightforward manner over std::find. I am not sure if
any implementation provides so optimized std::find() though.
 
Hope this will help,
-Pavel
Paavo Helde <myfirstname@osa.pri.ee>: Jan 24 01:02AM -0600

jononanon@googlemail.com wrote in
 
>> > how would you implement an efficient search for the first bit
>> > conformin
> g to a particular bit_state (true or false) in a std::bitset<N> ?
 
The straightforward solution for this is to use a simple loop and
std::bitset::test(). The implementation of std::bitset is all inline so
the result could actually be pretty fast with a good optimizing compiler.
In the following I presume that you have profiled this and determined the
speed is not sufficient. If you are not done this, you should do it now.
 
A side note: if you have somehow obtained raw bits as an integer value
then if you need extreme speed you might consider to use the relevant CPU
instruction directly (exposed e.g. by gcc as __builtin_clz()).
 
> exception overflow_error is hence called); and if not shoots out that
> ulong as fast as possible. In other words: to_ulong() should not waste
> any processor cycles with checking bits!
 
This assumption probably does not hold. Maintaining such a bool flag
would mean extra unneeded overhead for all programs which do not need
this functionality.
 
> considering any higher bits that are set)? A function like:
> chop_ulong()
 
> That would allow one to leverage more performance, wouldn't it ?
 
And it would lose information. If you are certain that all bits fit in an
unsigned long long, why not just use std::bitset<N> with small enough N?
N is a compile-time constant, so the optimizer would probably optimize
to_ullong() to a single instruction in this case.
 
If I were to implement this as a general library function and the general
solution appears to be too slow, I would create a template function which
branches to two implementations, one a fast one for small N-s using
to_ullong(), and a general one using a loop and test() for larger N-s.
Then it would be up to the caller to use small enough N if it wants the
best speed.
 
Cheers
Paavo
jononanon@googlemail.com: Jan 24 02:21AM -0800


> A possible speed bottleneck, is the masking of the following: (bs & mask_long)
 
> Why does std::bitset not have a way of chopping off unneeded higher bits, and just returning a u_long of the lower bits (without considering any higher bits that are set)? A function like: chop_ulong()
 
> That would allow one to leverage more performance, wouldn't it ?
 
For a flexible top-speed implementation, bitset would need a function get_ulong(i) which returns the i'th ulong of the inner bitset representation.
 
Why? Because then I can extract whatever part of the bitset I need, without messing with shifts.
 
e.g.
bitset<CHAR_BIT * sizeof(unsigned long) * m>
would return the appropriate ulong for get_ulong(0), get_ulong(1), ..., get_ulong(m-1); but throw an exception when calling get_ulong(idx) if idx >= m
 
In general for bitset<N>, I can then call get_ulong() for indices 0 to (N-1)/CHAR_BIT/sizeof(unsigned long)
jononanon@googlemail.com: Jan 24 02:26AM -0800

> bitset<CHAR_BIT * sizeof(unsigned long) * m>
> would return the appropriate ulong for get_ulong(0), get_ulong(1), ..., get_ulong(m-1); but throw an exception when calling get_ulong(idx) if idx >= m
 
> In general for bitset<N>, I can then call get_ulong() for indices 0 to (N-1)/CHAR_BIT/sizeof(unsigned long)
 
So the standard should enlarge the bitset class for a mechanism like that! ;)
 
Oh for those people who like to copy-and-paste code:
WORD OF WARNING ABOUT THE CODE IN THE 2ND POST:
It will fail (i.e. return wrong result), if searching for bit-state false, if N is not a multiple of CHAR_BIT*sizeof(unsigned long).
So use that technique only when searching for a bit-state true.
jononanon@googlemail.com: Jan 24 10:03AM -0800

Here's some highly illegal code, which is BLAZING FAST - this is the kind of stuff I want the C++ standard std::bitset to do for me... (because it's complex and should be kept firmly out of sight).
 
It even works perfectly on linux using gcc.
 
It is highly illegal because it uses this:
 
bitset<33> y(7);
cout << *reinterpret_cast<const ulong *>(&y) << '\n';
 
(So the libstdc++ implementation of bitset has the the internal bit representation at the right spot! Nice. https://github.com/gcc-mirror/gcc/blob/d353bf189d2bbaf4059f402ee4d2a5ea074c349f/libstdc%2B%2B-v3/include/std/bitset#L76 )
 
 
 
 
 
 
////////////////////////////////////////////////////////
#include <bitset>
#include <stdexcept>
#include <limits>
#include <climits>
#include <iostream>
#include <cassert>
#include <climits>
 
using namespace std;
 
namespace Bitset {
size_t npos = numeric_limits<size_t>::max();
};
 
 
 
/// some lookup tables
 
#if (ULONG_MAX == 4294967295UL)
 
// unsigned long has 32 bits
static_assert(sizeof(unsigned long) == 4, "unsigned long must have 4 bytes");
 
static const int MultiplyDeBruijnBitPosition[32] =
{
0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
 
#else
#if (ULONG_MAX == 18446744073709551615UL)
 
// unsigned long has 64 bits
static_assert(sizeof(unsigned long) == 8, "unsigned long must have 8 bytes");
 
static const int MultiplyDeBruijnBitPosition[64] =
{
0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
};
 
#else
 
static_assert(0, "strange machine architecture");
 

No comments: