soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

[QT creator on "nix"] - getting a strict 8 bit (1 byte) array with no padding - 18 Updates
Initialization of std::vector<std::string> throws - 5 Updates
How to write a helper function - 2 Updates

[QT creator on "nix"] - getting a strict 8 bit (1 byte) array with no padding

Christian Gollwitzer <auriocus@gmx.de>: Nov 22 07:26AM +0100

On 21.11.19 at 17:20, Paavo Helde wrote:
> than char. For example, long is 32 bits in 64-bit MSVC and 64 bits with
> 64-bit gcc on Linux.

> However, for your purposes std::uint32_t should work fine.

Why not

#pragma pack(1)
struct pixel {
    uint8_t red, green, blue, alpha;
};

? That would make more sense to me than uint32_t and bit-twiddling to
get the channel values.

Christian

Keith Thompson <kst-u@mib.org>: Nov 22 12:37AM -0800

> Am 21.11.19 um 17:20 schrieb Paavo Helde:
[...]
> }

> ? That would make more sense to me than uint32_t and bit-twiddling to
> get the channel values.

Because #pragma pack is non-standard?

If you only care about implementations that support it as an extension,
that's fine, but the OP seemed very concerned about portability.

An array of uint8_t could also work:

typedef uint8_t pixel[4];
enum { red, green, blue, alpha };

Or an array wrapped in a struct to allow assignment. (An array of 4
uint8_t elements is guaranteed to be 32 bits, but a struct containing
such an array is not -- but it's a nearly safe assumption that there
will be no padding between the elements or at the end.)
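
A minimal sketch of the "array wrapped in a struct" idea, assuming a C++11 compiler. The names Pixel, Channel and with_alpha are invented for illustration. Since the standard does not promise the struct is free of padding, the sketch checks the size at compile time instead of relying on it.

```cpp
#include <cassert>
#include <cstdint>

// Channel indices, as in the enum suggested above.
enum Channel { red, green, blue, alpha };

// Hypothetical wrapper: the array lives inside a struct so that
// whole pixels can be copied and assigned like any other value.
struct Pixel {
    std::uint8_t channel[4];
};

// Not guaranteed by the standard, so verify it when compiling.
static_assert(sizeof(Pixel) == 4, "unexpected padding in Pixel");

// Takes a pixel by value, changes one channel and returns the copy.
inline Pixel with_alpha(Pixel p, std::uint8_t a) {
    p.channel[alpha] = a;
    return p;
}
```

If the static_assert ever fails on some platform, the build stops there, which is usually preferable to silently writing files with a different layout.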

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

David Brown <david.brown@hesbynett.no>: Nov 22 09:37AM +0100

On 22/11/2019 07:26, Christian Gollwitzer wrote:

> ? That would make more sense to me than uint32_t and bit-twiddling to
> get the channel values.

> Christian

If you are using "pragma pack" (or gcc "packed" attribute), you are
probably doing something wrong. It can have some uses when dealing with
file formats or network packets with unaligned fields. But most of the
cases I see - such as this one (and the OP's first suggestion) are
misunderstandings and misuses.

When you use "pragma pack", you are making code that is non-standard,
non-portable, and can cause trouble interacting with other code. (What
happens when you mix pointers to a packed version of your struct, and a
non-packed version? Are they compatible? Will your compiler warn you
of mixups? My guess is you don't know, and that is not a good thing.)

They can lead to inefficiencies - on some compilers and targets, packed
fields are accessed by byte. Even when the fields are accessed by their
full size, packed means you can get misalignments, which can have a very
significant impact on efficiency in some situations.

They can lead to restrictions in the code. For some tools, you can't
take the address of a packed field, or you might see problems using it
because the compiler assumes pointers are properly aligned and that
might not be the case for packed fields.

In short, "packed" can lead to a lot of extra complications and
muck-ups. Perhaps the most insidious aspect is that /usually/ you don't
see problems with them. Things will work while you are writing /this/
code, and testing it with /this/ compiler - the problems will turn up
later when someone else gets trouble using the code in a different
project, and won't know why.

So use "packed" only when you /really/ need to - when it is clearly the
best solution to the problem. And only when you /really/ understand
what it does, and why it might be useful.

And clearly, in this case, it is utterly pointless. Just write the
struct directly - it is impossible for "pack" to be of any help here.
And if you don't understand that, you certainly should not be using
"pack". (And you can ask here for more help in understanding, if this
post is not sufficient.)
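
To illustrate why "pack" cannot help for this struct, here is a small sketch, assuming a C++11 compiler on a mainstream platform. Every member is a single byte, so there are no alignment gaps for the compiler to insert and nothing for packing to remove. The alignof(pixel) check is an expectation that holds on common implementations rather than a guarantee from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The plain struct, with no packing pragma or attribute.
struct pixel {
    std::uint8_t red, green, blue, alpha;
};

// All members are single bytes, so no padding is needed between them.
static_assert(sizeof(pixel) == 4, "pixel is expected to be 4 bytes");
static_assert(alignof(pixel) == 1, "pixel is expected to be byte-aligned");
static_assert(offsetof(pixel, green) == 1, "green follows red directly");
static_assert(offsetof(pixel, alpha) == 3, "alpha is the last byte");

// Number of bytes needed for a buffer of n pixels.
inline std::size_t buffer_bytes(std::size_t n) {
    return n * sizeof(pixel);
}
```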

David Brown <david.brown@hesbynett.no>: Nov 22 10:10AM +0100

On 22/11/2019 09:37, Keith Thompson wrote:
> uint8_t elements is guaranteed to be 32 bits, but a struct containing
> such an array is not -- but it's a nearly safe assumption that there
> will be no padding between the elements or at the end.)

While it is true in theory that a struct can have extra padding at the
end (beyond what is needed for an array of the struct to have proper
alignment for all members), are there any compilers that have done so in
practice?

Another option is to cover all bases with a union:

typedef union {
    uint32_t combined;
    uint8_t rgba[4];
    struct {
        uint8_t red, green, blue, alpha;
    };
} pixel;

_Static_assert(sizeof(pixel) == 4, "This only works on real systems!");

(Watch out for endian issues giving different orders within these parts
- that's always a concern for portable code when matching external file
formats.)

"Öö Tiib" <ootiib@hot.ee>: Nov 22 01:37AM -0800

On Friday, 22 November 2019 10:37:59 UTC+2, David Brown wrote:
> happens when you mix pointers to a packed version of your struct, and a
> non-packed version? Are they compatible? Will your compiler warn you
> of mixups? My guess is you don't know, and that is not a good thing.)

Both gcc and clang have -Waddress-of-packed-member, but AFAIK you
won't get that warning when compiling for a platform on which it does
not matter.

> fields are accessed by byte. Even when the fields are accessed by their
> full size, packed means you can get misalignments, which can have a very
> significant impact on efficiency in some situations.

Huh? Packed structs usually improve storage efficiency.
Say there is a struct containing a char and a double. Normally it is 16 bytes,
but packed it is 9 bytes. 43.75% of storage saved!
Where we have to process a large array of those, it is often impossible
to predict, without profiling on the target platform, whether that will cause
a performance gain or loss.
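
The size arithmetic in this example can be checked with a short sketch. It assumes a compiler that accepts the non-standard "#pragma pack(push, 1)" form (gcc, clang and MSVC do) and a target where double is 8 bytes with 8-byte alignment, as on typical 64-bit systems. The struct and function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Ordinary layout: the compiler pads after 'tag' so 'value' is aligned.
struct Plain {
    char   tag;
    double value;
};

// Packed layout: a compiler extension, not standard C++.
#pragma pack(push, 1)
struct Packed {
    char   tag;
    double value;
};
#pragma pack(pop)

// Percentage of storage saved by the packed layout.
inline double percent_saved() {
    return 100.0 * (sizeof(Plain) - sizeof(Packed)) / sizeof(Plain);
}
```

On a target with the assumed sizes this gives 16 bytes against 9 bytes, which is where the 43.75% figure comes from. It says nothing about access speed, which still has to be measured.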

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 22 11:20AM +0100

On 22/11/19 09:37, Keith Thompson wrote:
> that's fine, but the OP seemed very concerned about portability.

> An array of uint8_t could also work:

> typedef uint8_t pixel[4];

interesting idea !!!

> enum { red, green, blue, alpha };

btw since here #pragma pack seems supported, I find the
"dot" member selector not only clearer (more readable) but
more amenable to "intellisense" when typing (type less, no
typos as autocompletion pops up the members).

but your solution is elegant in that uses more intrinsic
features

> Or an array wrapped in a struct to allow assignment. (An array of 4
> uint8_t elements is guaranteed to be 32 bits, but a struct containing

actually I have the struct with individual "channels"
combined in a UNION with a 32 bit uint32_t value, which I
use when I have to do a fast memberwise copy (operator =,
copy-constructor) or apply bitwise operators to all
members at once (like NOT, XOR etc).
I dunno whether being aligned to the same initial address
changes anything ... after switching to uint8_t (four
distinct members), even without the #pragma directive the
compiler does not report further PADDING issues (something
it has just proved to be well aware of, although I had
ignored that before)

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

"Öö Tiib" <ootiib@hot.ee>: Nov 22 02:23AM -0800

On Friday, 22 November 2019 11:10:26 UTC+2, David Brown wrote:

> (Watch out for endian issues giving different orders within these parts
> - that's always a concern for portable code when matching external file
> formats.)

It usually works in practice but is undefined behavior in C++, so Bjarne
Stroustrup and Herb Sutter suggest against it in C++ Core Guidelines.

http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ru-pun

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 22 11:31AM +0100

On 22/11/19 09:37, David Brown wrote:

> If you are using "pragma pack" (or gcc "packed" attribute), you are
> probably doing something wrong. It can have some uses when dealing with
> file formats

it is exactly this second case : two programs have to
communicate and store picture files in a compact way (and if
I'll manage to, even exchange such pictures loaded in
memory) upon a single big binary "fread" from disk or save
with a single raw "fwrite" to disk.
I can't find it acceptable that the DISK layout and RAM
layout may differ as to packing aspects.
Not only for the waste of space (a large waste, as pictures
could be some tens of megapixels), but because in that case I would
have to read/write single elements one at a time.
Unacceptable (*).
I must be able to load and save fast and in a single
shot, so 1:1 mapping RAM<=>DISK.
This requirement is non-negotiable, even if I agree with all
the other remarks you make afterwards ...

(*) well actually I do not make strictly a single load/save
as stated. First I save/load a very small descriptor with
WIDTH/HEIGHT stored as 16 bit integers, then, aware of the
array size of the buffer, alloc and fread or fwrite down.

It's a very "stupid" non standard format, but I don't need
to export this directly. When in gambas I save using its
native functions. The C++ engine is just supposed to
transform pictures and save in raw format.

David Brown <david.brown@hesbynett.no>: Nov 22 11:40AM +0100

On 22/11/2019 10:37, Öö Tiib wrote:

> Both gcc and clang have -Waddress-of-packed-member but AFAIK you
> won't get that warning when compiling for platform on what it does not
> matter.

I wasn't saying that /I/ don't know the answer to these questions, or
that other more experienced developers don't know. I was saying that
folks suggesting "pack" here don't know the answers. And that is
what matters - they are recommending a technique that they do not fully
understand. (If they did understand it fully, they would understand
that it is completely unnecessary in this situation.)

For the gcc warnings:

"-Waddress-of-packed-member" warns when you take the address of a packed
member, regardless of alignment and regardless of whether the target
supports unaligned data. It is enabled by default.

"-Wpacked" warns if you have made a struct "packed" but it does not
affect the size or layout of the struct. ("packed" attribute will still
affect the alignment of the struct.) It would trigger for code like the
packed "pixel" struct, since the "packed" attribute does not affect
the structure.

Another useful warning here is "-Wpadded", which will tell you if a
struct has padding for alignment of members or the struct itself. If
"-Wpadded" is not triggered for your structs, making them "packed" is
not going to be at all useful (unless you need to change their alignment).

>> full size, packed means you can get misalignments, which can have a very
>> significant impact on efficiency in some situations.

> Huh? Packed structs usually lead to raise of storage efficiency.

No, they don't - not usually, not significantly, and not helpfully.
/Occasionally/ they can help, but more often you can get close to the
same efficiency by simply re-arranging the members more sensibly. (And
if you don't have enough members to make this work, you don't have
enough data for the memory efficiency to be relevant.)

> Where we have to process large array of those there it is often impossible
> to predict without profiling on target platform if that will cause performance
> gain or loss.

If memory efficiency or performance here is important, you would be
better off having two arrays - one of doubles, one of chars. Sure, that
would involve a little extra code - but C++ excels in letting you write
wrappers so that you have a good clean interface while keeping the
internal structures intact. And now you have all your doubles aligned,
giving significantly better cache usage and access speeds, and allowing
the possibility of SIMD vector instructions for massive improvements.
And since a structure like this is typically for a "valid, value" pair,
you now have 8 times the efficiency in dealing with invalid data over
the array since you only need to access the big doubles when the data is
valid. And if you only actually needed a single bit (bool) of the char,
you could potentially get another factor of 8 here (depending on the
ratio of valid to invalid values).
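
The two-array layout could be wrapped along these lines. This is only a sketch, assuming a C++11 compiler. The class name Samples and its member functions are invented for illustration, and the validity flag is stored as one byte per entry rather than one bit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: stores (valid, value) pairs as two parallel
// arrays, so the doubles stay naturally aligned and contiguous.
class Samples {
public:
    void push_back(bool valid, double value) {
        valid_.push_back(valid ? 1 : 0);
        values_.push_back(value);
    }

    std::size_t size() const { return values_.size(); }
    bool valid(std::size_t i) const { return valid_[i] != 0; }
    double value(std::size_t i) const { return values_[i]; }

    // Scans the small flag array and reads a double only when the
    // corresponding entry is marked valid.
    double sum_valid() const {
        double total = 0.0;
        for (std::size_t i = 0; i < values_.size(); ++i) {
            if (valid_[i]) total += values_[i];
        }
        return total;
    }

private:
    std::vector<unsigned char> valid_;   // one flag byte per entry
    std::vector<double>        values_;  // aligned doubles, no padding
};
```

Callers see a single container, while the storage costs 9 bytes per entry without any packing extension. Whether it is actually faster than an array of structs depends on the access pattern and should be measured.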

Packed /can/ be useful, but very often it is used when it is unnecessary
or directly detrimental, and often there are better ways to deal with
the situation.

David Brown <david.brown@hesbynett.no>: Nov 22 12:23PM +0100

On 22/11/2019 11:31, Soviet_Mario wrote:
> mapping RAM<=>DISK.
> This requirement is underogatable, even if I agree with all other
> remarks you make afterwards ...

You have a struct with byte data - making it packed is /useless/. There
is /never/ any advantage in that.

If the structure contains data of different sizes, then first check if
there really are padding or alignment issues. Most structures used for
file formats are "naturally aligned" - the people who designed the
structures are smart enough to make sure that every 2-byte data item is
at an even address, every 4-byte data item is at a multiple of 4
address, and so on. There are exceptions, especially for some older
formats from the days of 16-bit cpus, but it is often the case.

And if it is the case here, then making it "packed" is useless at best,
and can easily be a significant disadvantage.

All of what you have written here is "premature optimisation" - it is
knee-jerk thinking "I've got a file structure, I must use packed" rather
than looking at what is actually relevant or helpful. At the moment,
you haven't the faintest clue as to whether using a packed struct makes
a real and measurable difference that is worth the cost - even if we
assume the file format details are not aligned. You haven't done any
timings of real-world usage of your system to see if this is a
bottleneck. You haven't tested alternatives like "mmap", or existing
libraries. You haven't compared the time it takes to load or save the
file with the time it takes to do your processing. You haven't compared
the time it takes to pack or unpack the format to the time it takes to
read or write it from the disk. You haven't compared the time the
processing takes for a "packed" struct to how it would run with an
unpacked struct.

(Yes, I realise I have made a lot of claims here about what you have and
haven't done. From your posts, I think they are almost certainly true.
If I am wrong there, let me know.)

If you want to tell me that using "pack" makes your development process
simpler and the code easier, that's fine - I'll believe that, and it is
a perfectly good reason for using it. But if you try to tell me you are
doing it for run-time speed reasons, then you are wrong. Measure first,
find out if it really is too slow, find your /real/ bottlenecks, /then/
start looking for ways to improve the throughput.
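
As a starting point for "measure first", a timing helper might look like the sketch below. It assumes a C++11 compiler. The helper name seconds_taken is invented, and the idea is to time loading and processing separately and compare the two numbers.

```cpp
#include <cassert>
#include <chrono>

// Runs 'work' once and returns the elapsed wall-clock time in seconds.
// steady_clock is used because it never jumps backwards.
template <typename F>
double seconds_taken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

For example, wrapping the load step and the transform step in two separate calls shows which one dominates. A single run is noisy, so repeating the measurement on realistic picture sizes is advisable.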

> It's a very "stupid" non standard format, but I don't need to export
> this directly. When in gambas I save using its native functions. The C++
> engine is just supposed to transform pictures and save in raw format.

So don't make stupid formats. Make /simple/ formats, and non-standard
formats, if that is what is best for the job. But don't make a stupid
format. If the fields in the struct are not properly ("naturally")
aligned, change the format so that they are. That will help the gambas
side, help the C++ side, and mean you don't have to faff around with
"pack" extensions /or/ with extra packing and unpacking stages.

David Brown <david.brown@hesbynett.no>: Nov 22 12:38PM +0100

On 22/11/2019 11:23, Öö Tiib wrote:

> It usually works in practice but is undefined behavior in C++, so Bjarne
> Stroustrup and Herb Sutter suggest against it in C++ Core Guidelines.

> http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ru-pun

You are right - I was thinking in C terms, where this /is/ defined
behaviour. For C++, you'd need to put it all in a class and make
appropriate accessor functions. On the plus side, that would let you
make an endian-independent solution, and if it is done well the
compiler's optimiser will eliminate all overhead.

On some compilers, like gcc, the union type-punning is guaranteed to
work in C++ too. But then the code is not portable.
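
A class with accessor functions could be sketched as follows, assuming a C++11 compiler. The class name Pixel and the choice of putting red in the lowest byte are assumptions made for this example. Because the channels are extracted with shifts rather than by type-punning, the behaviour is defined and does not depend on endianness.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pixel class: one 32-bit value, channels read by shifting.
// Channel order (red in the lowest byte) is an assumption of this sketch.
class Pixel {
public:
    constexpr Pixel() : bits_(0) {}
    constexpr explicit Pixel(std::uint32_t bits) : bits_(bits) {}
    constexpr Pixel(std::uint8_t r, std::uint8_t g,
                    std::uint8_t b, std::uint8_t a)
        : bits_(std::uint32_t(r) | (std::uint32_t(g) << 8) |
                (std::uint32_t(b) << 16) | (std::uint32_t(a) << 24)) {}

    constexpr std::uint8_t red()   const { return static_cast<std::uint8_t>(bits_); }
    constexpr std::uint8_t green() const { return static_cast<std::uint8_t>(bits_ >> 8); }
    constexpr std::uint8_t blue()  const { return static_cast<std::uint8_t>(bits_ >> 16); }
    constexpr std::uint8_t alpha() const { return static_cast<std::uint8_t>(bits_ >> 24); }

    // The combined value allows whole-pixel bitwise operations.
    constexpr std::uint32_t combined() const { return bits_; }
    constexpr Pixel operator~() const { return Pixel(~bits_); }
    constexpr Pixel operator^(Pixel other) const { return Pixel(bits_ ^ other.bits_); }

private:
    std::uint32_t bits_;
};

static_assert(sizeof(Pixel) == 4, "Pixel should be exactly one 32-bit value");
```

With optimisation enabled, compilers typically reduce these accessors to a shift and a byte move, so the abstraction should cost little or nothing at run time.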

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 22 01:14PM +0100

On 22/11/19 12:23, David Brown wrote:
> than looking at what is actually relevant or helpful. At the moment,
> you haven't the faintest clue as to whether using a packed struct makes
> a real and measurable difference that is worth the cost -

sure I have no idea, I put the pragma there just for
"precaution"
what cost is that supposed to have ?

> timings of real-world usage of your system to see if this is a
> bottleneck. You haven't tested alternatives like "mmap", or existing
> libraries.

I'm writing from scratch almost, but It will be a very
simple library

> You haven't compared the time it takes to load or save the
> file with the time it takes to do your processing.

sure, but I'll have to do both in any case :)
It's a nonsensical comparison though.
I was only comparing
read/write all-at-once vs pixel by pixel (and chose the former)

no use in knowing if I spend more time applying
transformations on the loaded picture vs saving or loading
it. They are sequential mandatory operations

> You haven't compared
> the time it takes to pack or unpack the format to the time it takes to

chosen not to pack/unpack anything actually, as I'm hoping
the RAM / disk layout are the very same raw binary.
I was asking help in making sure this was (not if it was
useful or not)

> read or write it from the disk. You haven't compared the time the
> processing takes for a "packed" struct to how it would run with an
> unpacked struct.

forgive me : how to read an all packed file in an array of
UNpacked data of different layout, in a single read
operation ? I can't figure out this

> (Yes, I realise I have made a lot of claims here about what you have and
> haven't done. From your posts, I think they are almost certainly true.
> If I am wrong there, let me know.)

no problem, most are right.
Some are irrelevant imho

> If you want to tell me that using "pack" makes your development process
> simpler and the code easier, that's fine

I don't know. Just searching a way to do binary read/write
in one single shot directly to/from RAM without any conversions.
And have a RAM layout simple enough not to require bitwise
masking to locate fields (the drawback of a pure "bit
fields" solution)

> doing it for run-time speed reasons, then you are wrong. Measure first,
> find out if it really is too slow, find your /real/ bottlenecks, /then/
> start looking for ways to improve the throughput.

I don't have any running stuff by now, as I'm still in the
design time.
But I could not redesign the BASIC disk and data structure
later, so i MUST make some premature optimization now as it
affects the very base of the pillar.

>> this directly. When in gambas I save using its native functions. The C++
>> engine is just supposed to transform pictures and save in raw format.

> So don't make stupid formats. Make /simple/ formats,

I meant the same actually.
I store 16 bit width, 16 bit height, and then a compact
rectangular array of 32 bit pixels (8+8+8+8 for 3 color
channels + transparency), one per pixel.

it is not completely "naturally aligned", but the header
and the "body" are placed apart, in memory, and read/written
in distinct file operations.

So I ask you, when I call NEW operator for an array of
"pixel" structs, is it naturally aligned or not ?

> and non-standard
> formats, if that is what is best for the job. But don't make a stupid
> format.

I called it stupid in place of the simplest as possible.
There are no "tags" included as this is only an "internal"
(and personal) format for intermediate raw data.
It does not seem to be more sophisticated than this.

"Öö Tiib" <ootiib@hot.ee>: Nov 22 04:55AM -0800

On Friday, 22 November 2019 12:40:21 UTC+2, David Brown wrote:
> same efficiency by simply re-arranging the members more sensibly. (And
> if you don't have enough members to make this work, you don't have
> enough data for the memory efficiency to be relevant.)

But nowhere did I say that the effect of raised storage efficiency is
guaranteed to be significant or helpful. I only said that
there usually is that effect. Period. Lack of that effect is less usual.
Careful design and arrangement of data so that padding is
minimal takes effort, and so it is rarely invested. But what you say about
needing a lot of members clearly contradicts the example that I gave in
the next sentence.

> > gain or loss.

> If memory efficiency or performance here is important, you would be
> better off having two arrays - one of doubles, one of chars.

That is a third option, which saves as much memory as packing in this example.
It is worth trying out as well, but unfortunately it is also not a silver
bullet.
It can sometimes even perform worse than packed structs, depending
on the platform it runs on, on the context, and on what kind of processing is
performed on that data. Trying that alternative out may involve far
more significant refactoring than adding two pragmas.

> internal structures intact. And now you have all your doubles aligned,
> giving significantly better cache usage and access speeds, and allowing
> the possibility of SIMD vector instructions for massive improvements.

Those are also available only on some (I agree that common) platforms
and usefulness of those assumes that by "processing" we did mean
(I also agree that those are common) sequential for loops.

> valid. And if you only actually needed a single bit (bool) of the char,
> you could potentially get another factor of 8 here (depending on the
> ratio of valid to invalid values).

Validity is more commonly encoded into values of doubles (in typical
64-bit IEEE 754 doubles there is huge supply of NaNs plus two Infs)
so I assumed that the char under question carries more than one bit
of information in it.
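
Encoding validity in the double itself could look like this sketch, assuming IEEE 754 doubles and a C++11 compiler. The function names missing and sum_valid are invented for illustration. Note that aggressive floating-point options such as gcc's -ffast-math can make NaN checks unreliable.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Sentinel for "no value": a quiet NaN, so no separate flag is stored.
inline double missing() {
    return std::numeric_limits<double>::quiet_NaN();
}

// Adds up every entry that is not marked as missing.
inline double sum_valid(const std::vector<double>& values) {
    double total = 0.0;
    for (double x : values) {
        if (!std::isnan(x)) total += x;
    }
    return total;
}
```

This keeps each entry at 8 bytes with natural alignment, at the price of reserving NaN as a special value.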

> Packed /can/ be useful, but very often it is used when it is unnecessary
> or directly detrimental, and often there are better ways to deal with
> the situation.

Correct. Nowhere did I write anything that contradicts it.
I only said that it is often impossible to predict, without profiling on
the target platform, whether the effect of those is positive or negative, in
answer to your claim that these can lead only to significant
negative effects. Also, adding the two pragmas to the code and testing
what they cause is often the cheapest alternative to try.

David Brown <david.brown@hesbynett.no>: Nov 22 03:04PM +0100

On 22/11/2019 13:14, Soviet_Mario wrote:
>> a real and measurable difference that is worth the cost -

> sure I have no idea, I put the pragma there just for "precaution"
> what cost is that supposed to have ?

I've told you in several posts what the costs are. "Packed" is /not/ a
"precaution", nor is using a struct with a bitfield instead of uint8_t.
If you don't know what you are doing, fine - if I understand you
correctly, you are experienced with gambas programming but new to C++.
So /ask/ (and please read the answers). Don't guess, or grab the first
thing that you find from a google search. And when the more experienced
people in a group like this all say "don't use packed" and "use
uint8_t", then that is what you should do. Ask /why/, and try to learn
- and once you do understand "pack", when it can be useful, when it is
useless, when it is worse than useless, what it does, and what it costs
- /then/ you can decide that because you know the full situation better
than we do, you still want to use it against general advice.

"Packed" is not a precaution - it is a last choice solution when you
have nothing better and know you can't write the code as well without
it. (That might mean you can't write the code as quickly or clearly
without it, which is fair enough - /if/ you are sure it is necessary.)

> no use in knowing if I spend more time applying transformations on the
> loaded picture vs saving or loading it. They are sequential mandatory
> operations

Of course there is use in knowing this, if speed is an issue. If the
processing takes 1000 times as long as the loading and saving, then you
know that putting the slightest effort into speed optimisation for file
access is a waste of effort - your time would be better spent trying to
shave 0.1% off the processing time than increasing the speed of the file
access by orders of magnitude.

And if your file format contains non-aligned members, then the simplest
and easiest way to improve and simplify the code, and improve the speed
of several parts, is to fix the file format. Make sure the members are
aligned (add explicit dummy members if necessary) - then forget you ever
heard of "pragma pack".

> chosen not to pack/unpack anything actually, as I'm hoping the RAM /
> disk layout are the very same raw binary.
> I was asking help in making sure this was (not if it was useful or not)

No one can tell you if it is the same without knowing the file format.

There is nothing wrong with having the ram and disk layout the same -
often that is a very convenient and efficient choice. But if the disk
layout is not directly usable in ram, then it often is /not/ a
convenient and efficient choice - and if it /is/ usable directly in ram,
you don't need or want "pack".

However you look at it, "pack" is usually (but not always) the wrong
choice. You should only consider it once you have eliminated better
alternatives.

As for asking for help here, you are getting the help you need even if
it is not exactly what you asked for. It's as though you said "I need a
new toothbrush. Can you tell me how to get to the toy shop?" It is
conceivable that a toy shop might have toothbrushes, but it is very
unlikely to be a good choice. Would you not rather have people say "You
don't want to go to a toy shop. You want to go to a chemist - this is
how you get there" ?

>> unpacked struct.

> forgive me : how to read an all packed file in an array of UNpacked data
> of different layout, in a single read operation ? I can't figure out this

First, you take a step back. You look to see if the packed file really
is "packed", in the sense of having unaligned data that would have
padding in a normal struct. I have seen no answer to this (unless it is
later in this post), despite asking this /crucial/ question.

If the format is already good, you are done. One memcpy, or read from
the file directly to a suitable buffer, or memmap to the file directly
(no reading or writing needed).

If the format is bad, the next step is to consider changing the format.
This would be good for all parts of the system. When the format is
changed, you are done. Again, I have had no answer to whether this is
possible.

If the format is bad, and cannot be fixed, then you have to write reader
and writer functions that unpack and pack the data into the right
formats. Reader and writer functions are common for all sorts of
reasons - you already have something to handle starting with the file
header and then allocating memory depending on the size. It is very
normal to have a function that reads in the data format and uses it to
populate an internal format. Perhaps your file format uses 8-bit
unsigned data for colour channels, but your internal format uses
floating point because that suits your processing needs. Perhaps the
file format uses run-length encoding for smaller files, and you want to
expand it to a raw array. There are all sorts of things that go on in
typical reader functions. Making them a "single operation", whatever
that might mean, is rarely relevant - code that uses these functions has
a "single operation" of calling the reader function.
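
A reader function for the simple format discussed in this thread might be sketched as below. The assumptions are loud ones: a header of 16-bit width and 16-bit height stored little-endian, followed by width*height pixels of 4 bytes each. The names Picture and read_picture are invented. The header is decoded byte by byte so the code does not depend on the host's endianness, while the pixel block is read in one call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct pixel {
    std::uint8_t red, green, blue, alpha;
};
static_assert(sizeof(pixel) == 4, "pixel must match the 4-byte file layout");

// Hypothetical in-memory picture.
struct Picture {
    std::uint16_t width = 0;
    std::uint16_t height = 0;
    std::vector<pixel> pixels;
};

// Reads the assumed format from a binary stream; returns false on a
// short or failed read.
inline bool read_picture(std::istream& in, Picture& out) {
    unsigned char header[4];
    if (!in.read(reinterpret_cast<char*>(header), sizeof header)) return false;

    // Decode the little-endian 16-bit fields explicitly.
    out.width  = static_cast<std::uint16_t>(header[0] | (header[1] << 8));
    out.height = static_cast<std::uint16_t>(header[2] | (header[3] << 8));

    out.pixels.resize(static_cast<std::size_t>(out.width) * out.height);
    if (out.pixels.empty()) return true;

    // The whole pixel block is read with a single call.
    const auto bytes =
        static_cast<std::streamsize>(out.pixels.size() * sizeof(pixel));
    return static_cast<bool>(
        in.read(reinterpret_cast<char*>(out.pixels.data()), bytes));
}
```

Calling code sees one operation, read_picture, regardless of how many reads happen inside it. A file stream should be opened with std::ios::binary.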

>> If I am wrong there, let me know.)

> no problem, most are right.
> Some are irrelevant imho

Perhaps - but not necessarily the ones you think are irrelevant. IMHO,
of course.

(And to be clear - I did not write this to be rude, or insulting, or
demoralising, and I hope it did not come across that way. I am writing
to push you to think about things that are important here, but which I
feel you are skipping. The aim is that you learn more, and that you can
write better code that is clear, correct and efficient without wasting
effort on poorer techniques.)

>> simpler and the code easier, that's fine

> I don't know. Just searching a way to do binary read/write in one single
> shot directly to/from RAM without any conversions.

Doing this all "in one single shot" should not be an aim. Doing it in a
way that makes the code simple and clear, easy to develop, easy to read
and easily correct should be your aim. Quite possibly your real goal
would be easiest achieved by using a standard format (like PNG) and
existing libraries for gambas and C++. Call the one function - simple
and clear.

> And have a RAM layout simple enough not to require bitwise masking to
> locate fields (the drawback of a pure "bit fields" solution)

Why would you want to bring bitfields into this?

> But I could not redesign the BASIC disk and data structure later, so i
> MUST make some premature optimization now as it affects the very base of
> the pillar.

Fair enough. So get it /right/ by designing the right data structure
/now/, rather than picking a bad structure from the start.

> it is is not completely "naturally aligned", but the header and the
> "body" are placed apart, in memory, and read/written in distinct file
> operations.

So make it "naturally aligned". What will be the cost? A couple of
extra unused padding bytes, added manually?

struct pixel {
    uint8_t red, green, blue, alpha;
};

struct picture_format {
    uint64_t key;   // Always 0x0065727574636950ull ("Picture" as little-endian bytes)
    uint8_t file_type_version;
    uint8_t padding1;
    uint16_t height;
    uint16_t width;
    uint16_t padding2;

    pixel pixeldata[];  // flexible array member: standard C, a compiler extension in C++
};

Pick an alignment for the structure - 32-bit or 64-bit are good choices.
Start with a member of that size for convenience - I like a constant
marker to make it easy to tell that you are using the right file type.
(The value here is the string "Picture".) Make sure everything is
aligned sensibly. Add padding members manually and explicitly to keep
it clear - then there is never any doubt. (These may later be re-used
for extra information in later file format versions.) Make sure
separate blocks, such as the header and the data section, each are
aligned to the main large alignment for efficiency.
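
The layout rules above can be checked at compile time. The sketch below assumes a C++11 compiler and covers only the fixed-size header, leaving out the trailing pixel array. The names picture_header and make_key are invented, and make_key assumes the marker text is stored with its first character in the lowest byte, which matches the in-memory order on a little-endian machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Header part of the format, with explicit padding members.
struct picture_header {
    std::uint64_t key;
    std::uint8_t  file_type_version;
    std::uint8_t  padding1;
    std::uint16_t height;
    std::uint16_t width;
    std::uint16_t padding2;
};

// If the compiler had to add hidden padding, these would fail.
static_assert(sizeof(picture_header) == 16, "header should be 16 bytes");
static_assert(offsetof(picture_header, file_type_version) == 8, "version at 8");
static_assert(offsetof(picture_header, height) == 10, "height at 10");
static_assert(offsetof(picture_header, width) == 12, "width at 12");

// Builds the marker from text: first character in the lowest byte,
// at most eight characters.
constexpr std::uint64_t make_key(const char* s, unsigned shift = 0) {
    return (*s == '\0' || shift >= 64)
        ? 0
        : (std::uint64_t(static_cast<unsigned char>(*s)) << shift) |
              make_key(s + 1, shift + 8);
}
```

Deriving the constant from the text with make_key("Picture") avoids transcription slips in a hand-written hex literal.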

> So I ask you, when I call NEW operator for an array of "pixel" structs,
> is it naturally aligned or not ?

Yes, as long as you don't faff about with "packed" stuff. Then you
don't know.

> There are no "tags" included as this is only an "internal" (and
> personal) format for intermediate raw data.
> It does not seem to be more sophisticated than this.

Then it is inexcusable to have a format that needs "packed". That just
means totally pointless complications, inefficiencies, and non-portable
code.

If you have learned nothing else from this conversation, hopefully you
have learned that much.

David Brown <david.brown@hesbynett.no>: Nov 22 03:18PM +0100

On 22/11/2019 13:55, Öö Tiib wrote:

> But I did nowhere say that the effect of raised storage efficiency is
> guaranteed to be significant or helpful anyhow? I only said that
> there usually is that effect. Period. Lack of that effect is less usual.

Fair enough.

> Careful design and arrangement of data in way that padding is
> minimal is effort and so is rarely invested.

I would expect anyone who cares about the storage efficiency to put in
that effort.

> But what you say about
> lot of members clearly contradicts with example that I brought in
> next sentence.

Yes, but see below about that one.

> It is third option that will save memory like packing on this example
> It is worth to try out as well but unfortunately it is also not a silver
> bullet.

If programming was full of silver bullet solutions, it would not be
nearly as much fun :-)

> Those are also available only on some (I agree that common) platforms
> and usefulness of those assumes that by "processing" we did mean
> (I also agree that those are common) sequential for loops.

Of course. It is only a possibility, not something that is always useful.

> 64-bit IEEE 754 doubles there is huge supply of NaNs plus two Infs)
> so I assumed that the char under question carries more than one bit
> of information in it.

I was thinking of something roughly like a std::optional<double>, but
where you want greater storage efficiency.

Of course there are endless use-cases for structures similar to this,
and many ways to handle it. I am just showing some ideas that can often
be better than throwing "packed" into the mix (while still accepting
that /sometimes/ "packed" is the best choice).
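
One possible shape for such an "optional double with better storage efficiency" is sketched below. It is only an assumption about what could be meant: the class name optional_doubles is invented, and NaN is used as the "no value" marker, so a genuine NaN cannot be stored and the check may be unreliable under options such as -ffast-math.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical container: 8 bytes per element instead of the 16 that a
// std::optional<double> typically needs, with NaN meaning "empty".
class optional_doubles {
public:
    void push_value(double d) { data_.push_back(d); }
    void push_empty() {
        data_.push_back(std::numeric_limits<double>::quiet_NaN());
    }
    bool has_value(std::size_t i) const { return !std::isnan(data_[i]); }
    double value(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};
```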

> answer to your claim that these can lead only to significant
> negative effects. Also it is often cheapest alternative to try
> and test what the two pragmas added to code cause.

It might be that there has been a little mixup about which claims
referred to which uses of "packed" (without saying who has mixed things
up, as it was likely to be me). In general, I agree that "packed" may
have positive effects, negative effects, or no effects, and any effects
may be significant or insignificant. I say there are usually, but not
always, alternatives that could do better. But I agree that "packed" is
often an easy choice from the development viewpoint.

However, I would say it is almost always a bad choice to start off with
a design that /requires/ "packed" - the OP has no reason at all for
doing that.

The case where I said there could never be any positive effects, but
might be negative effects, was a poster who suggested "packing" a struct
consisting of 4 uint8_t members. There, I think we can agree, it could
never lead to increased storage or runtime efficiency on any real
platform. (There is, I believe, a very hypothetical possibility of the
struct having extra padding.) And it is certainly possible that on some
platforms with some compilers, there could be negative effects.

scott@slp53.sl.home (Scott Lurndal): Nov 22 03:16PM

>Huh? Packed structs usually lead to raise of storage efficiency.
>Say there is struct containing a char and double. Normally it is 16 bytes
>but packed it is 9 bytes. 43.75% of storage saved!

Of course, accesses become less efficient (particularly if the
unaligned accesses trap to the kernel for resolution).
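
The numbers in the quoted example can be checked with a small sketch. It assumes a typical 64-bit target and the non-standard #pragma pack that GCC, Clang and MSVC accept; the struct and function names are invented for illustration.

```cpp
struct plain_item {       // natural alignment
    char   tag;
    double value;         // usually preceded by 7 bytes of padding
};

#pragma pack(push, 1)     // non-standard extension
struct packed_item {
    char   tag;
    double value;         // now at offset 1, so misaligned
};
#pragma pack(pop)

// Typical for mainstream 64-bit targets, not guaranteed by the standard.
static_assert(sizeof(plain_item) == 16, "expected padding after tag");
static_assert(sizeof(packed_item) == 9, "expected no padding when packed");

// Reading the member by value is fine: the compiler knows it is misaligned
// and emits suitable code. Binding a pointer or reference to it is not safe.
inline double read_value(const packed_item& p) {
    return p.value;
}
```

The saving is 7 bytes out of 16, which is where the 43.75% comes from; the cost of each access depends on the target.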

"Öö Tiib" <ootiib@hot.ee>: Nov 22 07:23AM -0800

On Friday, 22 November 2019 16:18:15 UTC+2, David Brown wrote:
> > of information in it.

> I was thinking of something roughly like a std::optional<double>, but
> where you want greater storage efficiency.

There we slightly mixed up in our thinkings ... I myself imagined
something like array of "quantities" where double is value and char
selects its magnitude (or physical unit) from limited set of such.

> and many ways to handle it. I am just showing some ideas that can often
> be better than throwing "packed" into the mix (while still accepting
> that /sometimes/ "packed" is the best choice).

Yes, and splitting struct between different arrays can be OK for
storage optimization. But it can also increase cache misses
because locality of members of object is now gone ... needs to
be profiled.

> However, I would say it is almost always a bad choice to start off with
> a design that /requires/ "packed" - the OP has no reason at all for
> doing that.

I also replied to OP that his unsigned bitfields in anonymous unions in
packed structs can only harm compared to good old uint8_t.

> platform. (There is, I believe, a very hypothetical possibility of the
> struct having extra padding.) And it is certainly possible that on some
> platforms with some compilers, there could be negative effects.

True. The benefits are only fictional, but real bad effects
can emerge in practice. Such code looks confused, so an
experienced reader becomes careful ... hmm, I bet there
will be major crocodiles somewhere here. ;)

David Brown <david.brown@hesbynett.no>: Nov 22 04:34PM +0100

On 22/11/2019 16:23, Öö Tiib wrote:

> There we slightly mixed up in our thinkings ... I myself imagined
> something like array of "quantities" where double is value and char
> selects its magnitude (or physical unit) from limited set of such.

OK. The gains from having these as separate arrays (rather than packed
structs) might not be as much, but they could still be relevant.

> storage optimization. But it can also increase cache misses
> because locality of members of object is now gone ... needs to
> be profiled.

You still have as much locality, especially when iterating over the
arrays, just split into two blocks. Random access might be worse - you
will always have to pull in two cache lines, rather than only sometimes
needing to do so. (But if you are doing mainly random access and speed
is important, unpacked structs will be best.) Misalignment penalties
for the packed struct can vary from nothing to a great deal, depending
on the target platform.

Yes - you need to profile and test.
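
For the "quantities" example, the separate-arrays layout might look roughly like this. All names are invented for illustration, and whether it beats an unpacked or packed struct on a given machine is exactly what profiling has to show.

```cpp
#include <cstddef>
#include <vector>

// Struct-of-arrays: 9 payload bytes per element, and every double
// stays naturally aligned without any "packed" attribute.
struct quantities {
    std::vector<double> values;
    std::vector<char>   units;   // selects a unit from a small fixed set

    void add(double value, char unit) {
        values.push_back(value);
        units.push_back(unit);
    }
    std::size_t size() const { return values.size(); }

    // Payload bytes for the stored elements (ignores vector bookkeeping).
    std::size_t payload_bytes() const {
        return values.size() * sizeof(double) + units.size() * sizeof(char);
    }
};

// A sequential pass over the values touches only one of the two blocks.
inline double sum(const quantities& q) {
    double total = 0.0;
    for (double v : q.values) total += v;
    return total;
}
```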

Initialization of std::vector<std::string> throws

Ralf Goertz <me@myprovider.invalid>: Nov 22 12:12PM +0100

Am Wed, 4 Sep 2019 14:33:39 +0200

> > { { /* something */ } } introduces a braced initialization list with
> > one value.

> But didn't we need to use two braces for container initialization?

Since nobody has explained that satisfactorily yet (also see Tim's
answer to Pavel) and I just stumbled over it again, I'd like to follow
up. In the following situation double braces are necessary. But why?
This kind of problem led to my original belief that for containers they
were always needed.

#include <vector>
#include <iostream>

struct Foo {
    std::vector<int> v;
    Foo(std::vector<int> v_=std::vector<int>()) : v(v_){}
};

int main() {
    Foo f{{4,5}}; //double braces needed
    for (auto r:f.v)
        std::cout<<r<<std::endl;
    return 0;
}

With single braces I get:
problem.cc: In function 'int main()':
problem.cc:10:14: error: no matching function for call to 'Foo::Foo(<brace-enclosed initializer list>)'
10 | Foo f{4,5};
| ^
problem.cc:6:14: note: candidate: 'Foo::Foo(std::vector<int>)'
6 | Foo(std::vector<int> v_=std::vector<int>()) : v(v_){}
| ^~~
problem.cc:6:14: note: candidate expects 1 argument, 2 provided
problem.cc:4:8: note: candidate: 'Foo::Foo(const Foo&)'
4 | struct Foo {
| ^~~
problem.cc:4:8: note: candidate expects 1 argument, 2 provided
problem.cc:4:8: note: candidate: 'Foo::Foo(Foo&&)'
problem.cc:4:8: note: candidate expects 1 argument, 2 provided

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 22 01:44PM

On Fri, 22 Nov 2019 12:12:43 +0100
> problem.cc:4:8: note: candidate expects 1 argument, 2 provided
> problem.cc:4:8: note: candidate: 'Foo::Foo(Foo&&)'
> problem.cc:4:8: note: candidate expects 1 argument, 2 provided

I do not have all the previous exchanges available so I may be
repeating what others have said, but the reason why single braces fail
is because you are passing two arguments to a conversion constructor of
Foo taking one (or no) argument. As to why double braces work, my guess
is that (ignoring elision) the inner braces perform an implicit
conversion to std::vector<int> via std::vector's constructor taking an
initializer_list, constructing a temporary vector with two elements in
the vector, and that this then initializes Foo, using std::vector's move
constructor.

But as I say, that is a guess.
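
The reading above can be illustrated with a small sketch. Foo is the struct from the question (with a std::move added); Bar and the three helper functions are hypothetical additions. Bar shows that single braces start to work once the class itself has an initializer_list constructor, because {4,5} can then bind to a single parameter.

```cpp
#include <initializer_list>
#include <utility>
#include <vector>

struct Foo {
    std::vector<int> v;
    Foo(std::vector<int> v_ = std::vector<int>()) : v(std::move(v_)) {}
};

// Hypothetical variant with an extra initializer_list constructor.
struct Bar {
    std::vector<int> v;
    Bar(std::vector<int> v_ = std::vector<int>()) : v(std::move(v_)) {}
    Bar(std::initializer_list<int> il) : v(il) {}
};

// Outer braces: Foo's argument list. Inner braces: the vector argument.
inline std::vector<int> foo_double_braces() { Foo f{{4, 5}}; return f.v; }

// The same thing with the vector spelled out.
inline std::vector<int> foo_explicit() { Foo f{std::vector<int>{4, 5}}; return f.v; }

// Single braces select Bar's initializer_list constructor.
inline std::vector<int> bar_single_braces() { Bar b{4, 5}; return b.v; }
```

With Foo alone, f{4,5} is read as two constructor arguments, which matches the "candidate expects 1 argument, 2 provided" notes in the error output.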

Ralf Goertz <me@myprovider.invalid>: Nov 22 04:03PM +0100

Am Fri, 22 Nov 2019 13:44:09 +0000
> initializer_list, constructing a temporary vector with two elements in
> the vector, and that this then initializes Foo, using std::vector's
> move constructor.

Okay I guessed that much, but why can't the compiler do the latter when
there is only one pair of braces? The error message clearly indicates
that an initializer list is considered.

By the way my original question was why a double brace initializer list
doesn't work for

std::vector<std::string> pt({{"1","2"}});

(it compiles but throws a std::length_error because the arguments are
taken as pointers) but works perfectly for

std::vector<int> pt({{1,2}});

On the other hand

std::vector<std::string> pt({{"1","2","3"}});

again works as expected (with or without the parentheses) which I just
figured out. So the two string element case fails only because the
compiler picks the wrong constructor. I find that odd because the
initializer list seems to be the more obvious choice…

Bo Persson <bo@bo-persson.se>: Nov 22 04:22PM +0100

On 2019-11-22 at 16:03, Ralf Goertz wrote:
> figured out. So the two string element case fails only because the
> compiler picks the wrong constructor. I find that odd because the
> initializer list seems to be the more obvious choice…

It is not *that* obvious to the compiler, as it tries to match an
initializer_list<char> and you supply two char*. Unfortunately that is a
better match for std::string(iterator, iterator), but then of course
fails at runtime as it is not a valid iterator pair.

For *any* number of strings other than 2, the std::string(iterator,
iterator) constructor is immediately ruled out, so it works much better.

You would have a better chance to match the vector's initializer_list
constructor if you actually provided some std::string literals:

std::vector<std::string> pt({{"1"s,"2"s}});

Bo Persson
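
A sketch of that suggestion as compilable code, assuming C++14 or later. The "s" suffix is only available after a using-directive for std::string_literals; the function names here are invented for illustration.

```cpp
#include <string>
#include <vector>

inline std::vector<std::string> make_with_literals() {
    using namespace std::string_literals;   // enables the "..."s suffix
    // Both elements are already std::string objects, so they can no longer
    // be mistaken for a pair of const char* iterators.
    std::vector<std::string> pt({{"1"s, "2"s}});
    return pt;
}

// Alternative without the suffix: one brace level, which selects the
// vector's initializer_list constructor directly.
inline std::vector<std::string> make_with_single_braces() {
    return std::vector<std::string>{"1", "2"};
}
```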

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 22 04:30PM +0100

On 04.09.2019 14:20, Sam wrote:

>> terminate called after throwing an instance of 'std::length_error'
>> what(): basic_string::_M_create

> Because this is undefined behavior.

That's the short explanation, but it would be even more informative to
mention why.

Like,

#include <iostream>
#include <string>
#include <vector>
using namespace std;

auto main()
    -> int
{
    const auto& data = "Hello";
    std::vector<std::string> v( {{&data[0], &data[4]}} );
    std::cout << v.front() << "!" << endl;
}

Result:

Hell!

I.e. it uses the `string` constructor that takes two iterators.

>> whereas neither

>> std::vector<std::string> pt({{"One"}});

This looks like it would use the `string` constructor that takes a
pointer to C string.

> If you actually examine the end result the last declaration, you will
> discover, to your surprise, a std::vector<int> with just one value, and
> that would hopefully be a big honking clue as to what this is doing.

No, this one is more tricky.

Apparently it uses the move constructor of `vector` to take a temporary
`vector` constructed from the inner braces as an `initializer_list`.

Disclaimer: the results are consistent with this explanation, and there
doesn't seem to be any other explanation, but I haven't debugged.

> yourself in a foot. It would be nice if it did, and in many cases it
> does; but you can't rely on your compiler to keep you from shooting
> yourself in a foot.

The problem is not the compiler or programmer. The problem IMO is some
political shenanigans in the C++ standardization committee, where they
chose to sabotage Bjarne's vision of uniform initialization by making
initializer lists take precedence in overload resolution. I prefer this
explanation of academics' childish evilness, to sheer incompetence at
that level, because the explanation of incompetence is too frightening.

The political view has so far, to my knowledge, yielded accurate
predictions and postdictions.

For example, the political view of childish sabotage indicates that the
filesystem::path UTF-8 functionality /will/ be sabotaged for Windows in
C++20, along with use of `u` prefix string literals. The `u` literals
don't make much sense in pure *nix-specific code, because literals are
UTF-8 encoded there anyway, and so the change in type doesn't matter
much there. But in Windows, those unfortunate souls who have tried to
embrace the UTF-8 world via `u` literals and filesystem::path, will have
some work to do, including implementing their own UTF-16 to UTF-8
conversion for getting a UTF-8 representation of a path.

- Alf

How to write a helper function

"Öö Tiib" <ootiib@hot.ee>: Nov 21 04:02PM -0800

On Thursday, 21 November 2019 20:48:37 UTC+2, Adnan Rashid wrote:
> sixLbrApps.Add (sixLbr);
> sixLbrApps.Start(Seconds(0));

> How can I write a helper function for the above code? So I can reuse the above functionality by calling the function.

If you need help about concrete library (I guess) like
<https://www.nsnam.org/> then please contact directly the
manufacturer or forum or fan-club of that library. For me
(if to try to be straightly honest) the whole product feels like "dirt"
or "digital waste", sorry. Usenet group comp.lang.c++ is meant to
discuss standard C++ and so such odd libraries are not topical
here.

Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 22 06:56AM

On Fri, 2019-11-22, Öö Tiib wrote:
> manufacturer or forum or fan-club of that library. For me (if to try
> to be straightly honest) the whole product feels like "dirt" or
> "digital waste", sorry.

The API looks ugly, but maybe the actual simulation is good?

> Usenet group comp.lang.c++ is meant to
> discuss standard C++ and so such odd libraries are not topical
> here.

I think the OP is really asking about basic C++ programming, not so
much the library. It's still difficult to come up with a good reply
though ... he can wrap the code in

void helper()
{
    // the code above
}

but that will just create and start sixLbrApps and then (I guess)
immediately destroy it.
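
One way around that, sketched with a stand-in type because the real library classes are not shown here: let the helper return the object so that the caller owns it. AppContainer and make_started_apps are hypothetical names, not part of the OP's library.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the library's container type; purely hypothetical.
class AppContainer {
public:
    void Add(std::string app) { apps_.push_back(std::move(app)); }
    void Start(double seconds) { start_time_ = seconds; started_ = true; }
    bool started() const { return started_; }
    std::size_t size() const { return apps_.size(); }

private:
    std::vector<std::string> apps_;
    double start_time_ = 0.0;
    bool started_ = false;
};

// Returning by value hands the container to the caller, so it is not
// destroyed when the helper returns.
AppContainer make_started_apps(const std::string& app) {
    AppContainer apps;
    apps.Add(app);
    apps.Start(0.0);
    return apps;
}
```

The caller then writes something like AppContainer apps = make_started_apps("sixLbr"); and keeps apps alive for as long as it is needed.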

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .


soft and program

Friday, November 22, 2019

Digest for comp.lang.c++@googlegroups.com - 25 updates in 3 topics
