soft and program: Digest for comp.lang.c++@googlegroups.com

reinterpret_cast question (with correction!) - 8 Updates
Data structure for sparse mapping of values - 11 Updates
Formatting Ouput with Punctuation (Commas) - 4 Updates
Comparing Protocol Buffers and the C++ Middleware Writer - 2 Updates

reinterpret_cast question (with correction!)

"Chris M. Thomasson" <invalid@invalid.invalid>: Aug 08 09:58PM -0700

On 8/8/2016 1:33 PM, Daniel wrote:
> On Monday, August 8, 2016 at 3:38:55 PM UTC-4, Mr Flibble wrote:

>> Formally it is undefined behaviour so you shouldn't do it at all.

> Sausages also have undefined behaviour.

lol. :^)

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 09 07:45AM +0200

On 08.08.2016 21:38, Mr Flibble wrote:
>> here.

>> http://stackoverflow.com/questions/14272141/is-casting-stdpairt1-t2-const-to-stdpairt1-const-t2-const-safe

> Who cares? Formally it is undefined behaviour

No.

But this isn't super-obvious.

C++11 outlines most of the valid portable reinterpretations via
§3.10/10, the paragraph that the g++ folks call the "strict aliasing
rule" because it restricts how one can portably alias an object as
having different types (i.e., reinterpretation).

The view that this is a *formal* rule with an *exhaustive* listing of
possibilities, is shot down by the 6th dash, explaining that one can
reinterpret something as

• §3.10/10-6 "an aggregate or union type that includes one of the
aforementioned types among its elements or non-static data members
(including, recursively, an element or non-static data member of a
subaggregate or contained union),"

First, regarding whether the list is exhaustive, this dash permits only
one-way reinterpretation from type A to aggregate or union type B, while
§9.2/20 supports two-way reinterpretation for the special case where A
is first member of POD class type B,

• §9.2/20 "A pointer to a standard-layout struct object, suitably
converted using a reinterpret_cast, points to its initial member (or if
that member is a bit-field, then to the unit in which it resides) and
vice versa"

So, it's not exhaustive.
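
To make the §9.2/20 rule concrete, here is a minimal sketch of the two-way conversion between a standard-layout struct and its initial member. The types `Header` and `Packet` and both helper functions are hypothetical names invented for this illustration; they do not come from the thread.

```cpp
#include <type_traits>

// Hypothetical types, named only for this illustration.
struct Header { int id; };
struct Packet { Header header; double payload; };

// The first-member rule only applies to standard-layout classes.
static_assert(std::is_standard_layout<Packet>::value,
              "Packet must be standard-layout");

// Packet* -> pointer to its initial member.
inline int id_of(Packet& p) {
    Header* h = reinterpret_cast<Header*>(&p);
    return h->id;
}

// ...and vice versa. Only valid when h really points at the
// initial member of a live Packet object.
inline double payload_of(Header* h) {
    Packet* p = reinterpret_cast<Packet*>(h);
    return p->payload;
}
```

For example, given `Packet p{{7}, 2.5};`, `id_of(p)` yields 7 and `payload_of(&p.header)` yields 2.5. The reverse direction is only meaningful when the `Header` really is embedded as the first member of a `Packet`.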

Secondly, the vague language about *any* aggregate or union that just
includes, somewhere, an element with compatible type, is decidedly not
formal. One has to bring practical sound judgement to the table in order
to not conclude that reinterpreting a `double` as a `struct { int a;
double b; };` is valid. For that's what the 6th dash literally says as a
formal statement.

So, it's not formal either: it's just a vague, broad outline.

But it's what we have.

Then, using it, with the above caveats (not exhaustive, not formal) in
mind, the second dash says that an object can be reinterpreted as

• §3.10/10-2 "a cv-qualified version of the dynamic type of the object"

And this means that via §9.2/20 the `pair` with the non-`const` first
member can be reinterpreted as a `string` (the first member), which via
§3.10/10-2 in turn can be reinterpreted as a `string const`, which via
§9.2/20 or via §3.10/10-6 can be reinterpreted as a
`pair<string const, string>`.

Then, still at issue is the second item of the pair. But here 3.10/10-6
applies /directly/. So also this part is OK.

Of course it's much easier to just reason about whether the
reinterpretation makes sense :-), as I did earlier in the thread,
because that's the goal that the standard's rules are designed to allow.

> so you shouldn't do it at all.

On the contrary, it can be argued that one should do this
reinterpretation as a matter of course, so as not to incur needless
copying and very inefficient dynamic allocations.

Cheers & hth.,

- Alf

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Aug 09 11:45AM +0100

On Tue, 9 Aug 2016 07:45:59 +0200
> if that member is a bit-field, then to the unit in which it resides)
> and vice versa"

> So, it's not exhaustive.

§9.2/20 would only make §3.10/10 non-exhaustive if it permits
aliasing that §3.10/10 does not. I do not believe that is the case.

You give the example of a standard layout struct of type B having a
first member of type A, and make the point that the sixth bullet of
§3.10/10 allows one way conversion but §9.2/20 allows two way
conversion.

However, to cast from the B-type object to address its A-type member
object you are covered (so far as concerns §3.10/10) by the first
bullet, because both objects in fact exist. If the struct of type B is
fully constructed then _all_ its members, including the member of type
A, must also be fully constructed. If you cast a pointer to the B-type
object to a pointer of type A*, and then access the first member by
dereferencing that pointer of type A*, you are in fact accessing a
_real_ object (the first member) via a pointer to its own dynamic
type. Ditto if you make a cast in the opposite direction. There is no
type punning involved.

The sixth bullet point of §3.10/10 covers real type punning,
particularly (but obviously not exclusively) through unions, to pretend
that an object of one type is actually, for limited purposes, of another
type.

Chris

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Aug 09 05:33PM +0100

On 09/08/2016 06:45, Alf P. Steinbach wrote:

>> Who cares? Formally it is undefined behaviour

> No.

> But this isn't super-obvious.

Wrong.

std::pair<const std::string, T> and std::pair<std::string, T> are two
UNRELATED non-POD types. One type is NOT a cv-qualified version of the
other. What you are trying to do is undefined behaviour: don't do it.
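
The "unrelated types" point can be checked at compile time. The sketch below does not settle whether the cast is undefined behaviour; it only shows that the two specializations are distinct types, that neither is a cv-qualified version of the other, and what the uncontroversial (but copying) conversion looks like. The aliases and the helper `to_const_key` are hypothetical names, and `int` stands in for `T`.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical aliases; int stands in for the thread's T.
using MutablePair  = std::pair<std::string, int>;
using ConstKeyPair = std::pair<const std::string, int>;

// Two distinct specializations of std::pair...
static_assert(!std::is_same<MutablePair, ConstKeyPair>::value,
              "distinct types");
// ...and adding const to the whole pair does not give the other one.
static_assert(!std::is_same<const MutablePair, ConstKeyPair>::value,
              "not a cv-qualified version");

// The conversion nobody disputes: pair's converting constructor,
// at the cost of copying the string.
inline ConstKeyPair to_const_key(const MutablePair& p) {
    return ConstKeyPair(p);
}
```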

[snip]

/Flibble

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 09 07:39PM +0200

On 09.08.2016 18:33, Mr Flibble wrote:

> Wrong.

> std::pair<const std::string, T> and std::pair<std::string, T> are two
> UNRELATED non-POD types.

No, the standard specifies the members, i.e. the layout.

That's all that was used in the derivation of correctness of the cast.

> One type is NOT a cv-qualified version of the
> other.

That's pretty irrelevant, considering the derivation that you snipped
did not rely on that fact.

> What you are trying to do is undefined behaviour:

No, it's not me, and second, as you have seen proved, but snipped, it's
not UB.

> don't do it.

Well, that's your original advice, and you can't back down from that,
can you?

Have you considered that standard library implementations necessarily do
this?

Cheers & hth.,

- Alf

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Aug 09 06:49PM +0100

On 09/08/2016 18:39, Alf P. Steinbach wrote:
[snip]

> Have you considered that standard library implementations necessarily do
> this?

No I haven't because they don't.

/Flibble

Daniel <danielaparker@gmail.com>: Aug 09 11:19AM -0700

On Tuesday, August 9, 2016 at 1:49:53 PM UTC-4, Mr Flibble wrote:

> > Have you considered that standard library implementations necessarily do
> > this?

> No I haven't because they don't.

Do a global search for reinterpret_cast in, say, Microsoft Visual Studio 14.0\VC\include. You'll find them all over the place, e.g.

typedef typename const _Value_type value_type;

const _Value_type& operator() (const index<_Rank>& _Index) const __GPU
{
void * _Ptr = _Access(_Read_access, _Index);
return *reinterpret_cast<value_type*>(_Ptr);
}

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Aug 09 08:32PM +0100

On 09/08/2016 19:19, Daniel wrote:
> void * _Ptr = _Access(_Read_access, _Index);
> return *reinterpret_cast<value_type*>(_Ptr);
> }

That is something else entirely. I didn't say don't use
reinterpret_cast; I said this particular use of reinterpret_cast was
undefined behaviour.

/Flibble

Data structure for sparse mapping of values

bitrex <bitrex@de.lete.earthlink.net>: Aug 09 11:44AM -0400

I'm working on an embedded project that has to drive a multi-numeral 7
segment LED display. At the moment the logic is working pretty well; the
user can treat the display as a stream in the fashion of "cout" and
write lines of text to it, and the class will automatically handle the
scanning, scrolling (if the amount of text is larger than the width of
the display), etc.

The class has an internal std::string buffer that's written to, and then
a "frame buffer" that holds the substring which is currently being
scanned to the physical digits. So when the display updates the code has
to look at each character in the buffer and determine the appropriate
output lines to drive on the current numeral.

If I were implementing the entire printable ASCII character set it would
be easy: just have an array of values where each printable char input
mapped directly one-to-one to a bitfield. But I'm only implementing a
restricted subset of it, so if the values are all contiguous in a plain
old array to facilitate quick lookup there would be quite a few
unimplemented gaps taking up space. An array holding bitfields in say
0xFF hex format is also clunky for the user to modify.

At the moment I just have a naive implementation where I have a std::map
from a std::string to a pointer to an array of bools, stored in program
memory. It works, but it's not ideal: std::map has a pretty huge
memory footprint, and it seems goofy to store pointers in memory when
the size of the data structure you're actually interested in is smaller
than the width of a pointer type...

Right now it's something like this, and I'm hoping for something with
similar convenience but less memory footprint:

//.h

namespace DisplayConstants
{
extern const bool char_0[7] PROGMEM;
extern const bool char_1[7] PROGMEM;
extern const bool char_2[7] PROGMEM;
extern const bool char_3[7] PROGMEM;
extern const bool char_4[7] PROGMEM;
extern const bool char_5[7] PROGMEM;
extern const bool char_6[7] PROGMEM;
extern const bool char_7[7] PROGMEM;
extern const bool char_8[7] PROGMEM;
extern const bool char_9[7] PROGMEM;
extern const bool char_A[7] PROGMEM;
//etc
extern const bool char_space[7] PROGMEM;
extern const std::map<std::string, const bool*> character_map;
}

//.cpp

namespace DisplayConstants
{
static std::map<std::string, const bool*> create_map()
{
    std::map<std::string, const bool*> m;

    m["0"] = &char_0[0];
    m["1"] = &char_1[0];
    m["2"] = &char_2[0];
    m["3"] = &char_3[0];
    m["4"] = &char_4[0];
    m["5"] = &char_5[0];
    m["6"] = &char_6[0];
    m["7"] = &char_7[0];
    m["8"] = &char_8[0];
    m["9"] = &char_9[0];
    m["A"] = &char_A[0];
    m["a"] = &char_A[0];
    m[" "] = &char_space[0];
    //etc
    return m;
}

const bool char_0[7] PROGMEM = { 1, 1, 1, 1, 1, 1, 0 };
const bool char_1[7] PROGMEM = { 0, 1, 1, 0, 0, 0, 0 };
const bool char_2[7] PROGMEM = { 1, 1, 0, 1, 1, 0, 1 };
const bool char_3[7] PROGMEM = { 1, 1, 1, 1, 0, 0, 1 };
const bool char_4[7] PROGMEM = { 0, 1, 1, 0, 0, 1, 1 };
const bool char_5[7] PROGMEM = { 1, 0, 1, 1, 0, 1, 1 };
const bool char_6[7] PROGMEM = { 1, 0, 1, 1, 1, 1, 1 };
const bool char_7[7] PROGMEM = { 1, 1, 1, 0, 0, 0, 0 };
const bool char_8[7] PROGMEM = { 1, 1, 1, 1, 1, 1, 1 };
const bool char_9[7] PROGMEM = { 1, 1, 1, 1, 0, 1, 1 };
const bool char_A[7] PROGMEM = { 1, 1, 1, 0, 1, 1, 1 };
const bool char_space[7] PROGMEM = {0, 0, 0, 0, 0, 0, 0};
//etc

const std::map<std::string, const bool*> character_map = create_map();
}

Victor Bazarov <v.bazarov@comcast.invalid>: Aug 09 12:19PM -0400

On 8/9/2016 11:44 AM, bitrex wrote:

> At the moment I just have a naive implementation where I have a std::map
> from a std::string to a pointer to an array of bools, stored in program
> memory, [..]

Keep in mind that saving memory will likely cost you time. If speed is
not what you're after, instead of an array of bools, store a single char
value. Seven times smaller... And if you go for a static array that
maps your symbol to a char representation, you will likely only use 128
bytes (basic character set). You can only uniquely represent 128 values
with a 7-segment display anyway, no?
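
As a rough sketch of the "single char instead of seven bools" idea: the helper names below are hypothetical, and the bit order (array index 0 in the highest of the seven bits) is an assumption chosen so that the packed values read the same way as the bool arrays in the original post.

```cpp
#include <cstdint>

// Pack seven on/off flags into the low seven bits of one byte.
// Assumed order: segments[0] ends up in bit 6, segments[6] in bit 0.
inline std::uint8_t pack_segments(const bool (&segments)[7]) {
    std::uint8_t bits = 0;
    for (bool on : segments) {
        bits = static_cast<std::uint8_t>((bits << 1) | (on ? 1 : 0));
    }
    return bits;
}

// Test one segment of a packed pattern; index uses the same
// 0..6 numbering as the original bool arrays.
inline bool segment_on(std::uint8_t bits, int index) {
    return ((bits >> (6 - index)) & 1) != 0;
}
```

With the pattern for "0" from the original post, `{1, 1, 1, 1, 1, 1, 0}`, this packs to 0x7E. On a target that keeps the table in program memory, the read would still have to go through whatever flash-access mechanism the platform requires; that part is not shown here.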

V
--
I do not respond to top-posted replies, please don't ask

Ben Bacarisse <ben.usenet@bsb.me.uk>: Aug 09 05:37PM +0100

> I'm working on an embedded project that has to drive a multi-numeral 7
> segment LED display.
<snip>
> be quite a few unimplemented gaps taking up space. An array holding
> bitfields in say 0xFF hex format is also clunky for the user to
> modify.

I'd try an array of pairs. Not sure if using std::pair and std::vector
is worth it, but the idea is just that you search the array
(exhaustively) for the character you are looking to display. For the
segments, I'd use C++'s binary literals:

struct segment_list {
unsigned char character; unsigned char segments;
} slist[] = {
{ '0', 0b1'11'1'11'0 },
{ '1', 0b0'11'0'00'0 },
...
{ 0, 0 }
};

(I'm using the optional 's to show something about the segment groups.)

Obviously this is not 'efficient', but that is unlikely to be much of a
problem in this sort of situation.
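
The exhaustive search itself might look something like the sketch below. It is only an illustration: `segment_entry`, `kSegments` and `find_segments` are hypothetical names, the table holds just a few of the patterns from the original post, and it iterates over the array bounds rather than relying on a `{ 0, 0 }` terminator.

```cpp
// One table row: the character and its packed segment pattern.
struct segment_entry {
    unsigned char character;
    unsigned char segments;
};

// A deliberately short table; a real one would list every
// supported character.
const segment_entry kSegments[] = {
    {'0', 0b1111110},
    {'1', 0b0110000},
    {'2', 0b1101101},
    {' ', 0b0000000},
};

// Linear search. Returns true and writes the pattern to `out`
// if the character is in the table, false otherwise.
inline bool find_segments(unsigned char c, unsigned char& out) {
    for (const segment_entry& e : kSegments) {
        if (e.character == c) {
            out = e.segments;
            return true;
        }
    }
    return false;
}
```

Returning a flag separately from the pattern keeps an all-segments-off result (a space) distinguishable from "character not supported".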

<snip>
--
Ben.

Christian Gollwitzer <auriocus@gmx.de>: Aug 09 06:46PM +0200

Am 09.08.16 um 17:44 schrieb bitrex:
> write lines of text to it, and the class will automatically handle the
> scanning, scrolling (if the amount of text is larger than the width of
> the display,etc.)

How many different letters do you have? The old display drivers for 7
segment displays used hard-wired logic, some Boolean functions on the
input to transform 16 digits into codes for hex display. You could do
something similar. Quine McCluskey is an algorithm to compute a minimal
expression for a single-valued Boolean. Not sure if something exists for
multiple Boolean outputs, where several McCluskeys obviously are not
minimal.
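
To give a flavour of the logic approach, here is a sketch for a single segment only. It assumes the input is a decimal digit 0 to 9 and that the first element of each bool array in the original post is the top segment; the expression was derived by hand from those digit patterns (the segment is off only for 1 and 4), not produced by running Quine-McCluskey, and `segment_a` is a hypothetical name.

```cpp
// Top segment as a Boolean function of the four bits of a
// decimal digit (valid for 0..9 only; other inputs are
// treated as don't-cares).
inline bool segment_a(unsigned digit) {
    const bool b3 = (digit & 8u) != 0;
    const bool b2 = (digit & 4u) != 0;
    const bool b1 = (digit & 2u) != 0;
    const bool b0 = (digit & 1u) != 0;
    // On for 0, 2, 3, 5, 6, 7, 8, 9; off for 1 and 4.
    return b3 || b1 || (b2 && b0) || (!b2 && !b0);
}
```

A full driver would need one such function per segment, and extending the input beyond digits to letters changes every expression, which is where a table starts to look simpler again.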

Christian

Christian Gollwitzer <auriocus@gmx.de>: Aug 09 06:50PM +0200

Am 09.08.16 um 18:46 schrieb Christian Gollwitzer:
> expression for a single-valued Boolean. Not sure if something exists for
> multiple Boolean outputs, where several McCluskeys obviously are not
> minimal.

Yet another one is the perfect hash. Usually it maps strings to indices.
gperf is a utility which can do that (create code for a list of
strings). It'll not work well for single letters, but if tweaked, then
maybe.

Christian

scott@slp53.sl.home (Scott Lurndal): Aug 09 05:08PM

>old array to facilitate quick lookup there would be quite a few
>unimplemented gaps taking up space. An array holding bitfields in say
>0xFF hex format is also clunky for the user to modify.

Given that ASCII is 7-bit, you'll need an array of 128 bitfields. If
the bitfield is 8 bits, then you're consuming 128 bytes. That's almost
in the noise. "Mind the gap" shouldn't apply in this case :-)

bitrex <bitrex@de.lete.earthlink.net>: Aug 09 01:33PM -0400

On 08/09/2016 12:46 PM, Christian Gollwitzer wrote:
> multiple Boolean outputs, where several McCluskeys obviously are not
> minimal.

> Christian

It's probably at most around 30, the ten numerals and the upper and
lowercase letters (which for a non-alphanumeric 7 segment have to be
approximated rather crudely; some uppercase and lowercase letters map to
the same value, for example "M" and "m" are both simply a box without
the lower segment, with a bar across the top.)

bitrex <bitrex@de.lete.earthlink.net>: Aug 09 01:39PM -0400

On 08/09/2016 12:37 PM, Ben Bacarisse wrote:

> Obviously this is not 'efficient', but that is unlikely to be much of a
> problem in this sort of situation.

> <snip>

Thanks. For such a small data set, I'm guessing that simply linear or
binary searching a data structure for the required value won't be
significantly slower than using an associative container like std::map.
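
A binary-search variant could be sketched with `std::lower_bound` over a table kept sorted by character code. The names `glyph`, `kGlyphs` and `find_glyph` are hypothetical, only a handful of entries are shown, and the ordering assumes an ASCII execution character set.

```cpp
#include <algorithm>
#include <iterator>

struct glyph {
    char character;
    unsigned char segments;
};

// Must stay sorted by character for std::lower_bound to work
// (ASCII order assumed: ' ' < '0'..'9' < 'A' < 'a').
const glyph kGlyphs[] = {
    {' ', 0b0000000},
    {'0', 0b1111110},
    {'1', 0b0110000},
    {'2', 0b1101101},
    {'A', 0b1110111},
    {'a', 0b1110111},
};

// Returns a pointer to the matching entry, or nullptr if the
// character has no pattern.
inline const glyph* find_glyph(char c) {
    const glyph* first = std::begin(kGlyphs);
    const glyph* last = std::end(kGlyphs);
    const glyph* it = std::lower_bound(
        first, last, c,
        [](const glyph& g, char key) { return g.character < key; });
    return (it != last && it->character == c) ? it : nullptr;
}
```

For a table of a few dozen entries the difference from a linear scan is likely to be small either way; the main cost of this version is having to keep the table sorted.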

bitrex <bitrex@de.lete.earthlink.net>: Aug 09 01:44PM -0400

On 08/09/2016 12:46 PM, Christian Gollwitzer wrote:
> multiple Boolean outputs, where several McCluskeys obviously are not
> minimal.

> Christian

That's a neat idea...implement it in "virtual logic" rather than a
lookup table.

scott@slp53.sl.home (Scott Lurndal): Aug 09 06:36PM

>approximated rather crudely, some uppercase and lowercase letters map to
>the same value, as for example "M" and "m" is simply like a box without
>the lower segment, with a bar across the top.)

That's 62 characters, store the bitmasks in an array of bytes
indexed starting at 0x30 through 0x7a. The number of gaps will
amount to a dozen bytes or so.

if ((character >= ASCII_ZERO)
&& (character <= ASCII_z)) {
bitmask = bitmasks[character - ASCII_ZERO];
} else {
return NO_BITMAP_FOR_CHARACTER;
}
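
Filled out into something compilable, the offset-indexed table might look like the sketch below. It assumes ASCII, uses 0x80 as a hypothetical "no pattern" marker (the eighth bit is unused by a seven-segment pattern), and populates only a few entries; `glyph_table`, `lookup` and the constants are names invented for this illustration.

```cpp
#include <array>
#include <cstdint>

constexpr unsigned char kFirst = '0';     // 0x30 in ASCII
constexpr unsigned char kLast  = 'z';     // 0x7a in ASCII
constexpr std::uint8_t kNoGlyph = 0x80;   // hypothetical marker value

using glyph_array = std::array<std::uint8_t, kLast - kFirst + 1>;

// Table covering '0' through 'z'; gaps hold kNoGlyph.
inline const glyph_array& glyph_table() {
    static const glyph_array table = [] {
        glyph_array t;
        t.fill(kNoGlyph);
        t['0' - kFirst] = 0b1111110;
        t['1' - kFirst] = 0b0110000;
        t['A' - kFirst] = 0b1110111;
        t['a' - kFirst] = 0b1110111;
        return t;
    }();
    return table;
}

// Range check first, then a single indexed read.
inline std::uint8_t lookup(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    if (u < kFirst || u > kLast) {
        return kNoGlyph;
    }
    return glyph_table()[u - kFirst];
}
```

Note that a space (0x20) falls outside the '0' to 'z' range, so it would need either a special case or a table that starts lower. On a small embedded target the table would more likely be a constant array placed in program memory than a function-local static built at run time.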

bitrex <bitrex@de.lete.earthlink.net>: Aug 09 02:58PM -0400

On 08/09/2016 02:36 PM, Scott Lurndal wrote:
> } else {
> return NO_BITMAP_FOR_CHARACTER;
> }

One likes to feel like one is clever, but it does seem like that's the
most straightforward solution... ; )

Formatting Ouput with Punctuation (Commas)

red floyd <no.spam@its.invalid>: Aug 08 04:54PM -0700

On 8/8/2016 2:21 PM, Good Guy wrote:
[red floyd wrote] WTF do you have against plain text, dude?

> Actually there is a wonderful tool called Google that has most of the
> answers - some good and some not so good but it requires some effort to
> make them work.

Do you even READ the posts you are replying to?

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 09 06:06AM +0200

On 09.08.2016 01:54, red floyd wrote:
>> answers - some good and some not so good but it requires some effort to
>> make them work.

> Do you even READ the posts you are replying to?

It's an ad bot.

Cheers!,

- Alf

Geoff <geoff@invalid.invalid>: Aug 08 11:33PM -0700

>support such formatting, but I can't find anything that seems to show
>how to do it.
> Please advise. TIA

I don't know if this is a "good" way and it's Microsoft C++ and CRT dependent
but I wrote this quite a while ago and have not had to touch it since:

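#include <windows.h>
#include <stdlib.h>
#include <string>

//
//============================================================================
// Turns an unsigned long long into a string with localized separators
//============================================================================
//
std::string LocalizeNumber(unsigned long long number)
{
    // Narrow (ANSI) buffers and API throughout, so the code does not depend
    // on whether UNICODE is defined; _ui64toa_s always writes char.
    char temp[64], num[64];

    _ui64toa_s(number, temp, sizeof temp, 10);
    // With char buffers, sizeof equals the character count the API expects.
    // GetNumberFormatA returns 0 on failure; return an empty string then.
    if (GetNumberFormatA(LOCALE_USER_DEFAULT, 0, temp, NULL, num, sizeof num) == 0)
        return std::string();
    return std::string(num);
}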
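
For comparison, a portable alternative that relies only on the standard library is to imbue a stream with a custom `std::numpunct` facet. This is a sketch rather than a drop-in replacement: it hard-codes a comma and groups of three instead of consulting the user's locale, and `comma_grouping` and `with_commas` are hypothetical names.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Facet that forces ',' as the separator and groups of three digits.
struct comma_grouping : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }
};

inline std::string with_commas(unsigned long long number) {
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it.
    out.imbue(std::locale(out.getloc(), new comma_grouping));
    out << number;
    return out.str();
}
```

Here `with_commas(1234567)` gives "1,234,567". To follow the user's regional settings instead, the stream could be imbued with `std::locale("")`, although what that produces depends on the platform and the environment.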

red floyd <no.spam@its.invalid>: Aug 09 11:23AM -0700

On 8/8/2016 9:06 PM, Alf P. Steinbach wrote:

> It's an ad bot.

Doh!!!!

How could I be so stupid to miss that! Thanks, Alf!

Comparing Protocol Buffers and the C++ Middleware Writer

Ian Collins <ian-news@hotmail.com>: Aug 09 05:35PM +1200

>> working with.

> If I understand correctly, you mean a public cloud
> wouldn't be acceptable for your projects.

Pretty much.

The build servers all run inside the firewall with no internet access.

Then there's the issue of versioning the tools (protobuf for example)
with the code so the same tools are used to build legacy, supported, builds.

--
Ian

scott@slp53.sl.home (Scott Lurndal): Aug 09 12:51PM

>> working with.

>If I understand correctly, you mean a public cloud
>wouldn't be acceptable for your projects.

To build our code, we must have _all_ tools required available
internally - we could _never_ farm our code out to some third-party
service - it might not be there next week.


Tuesday, August 9, 2016

Digest for comp.lang.c++@googlegroups.com - 25 updates in 4 topics
