Tuesday, September 29, 2020

Digest for comp.lang.c++@googlegroups.com - 25 updates in 1 topic

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 28 05:45PM -0700

On 9/28/2020 11:07 AM, Keith Thompson wrote:
> with the resulting pointer doesn't know that it's misaligned. That's
> not much of an issue in x86, where misaligned accesses (usually?) just
> impose a speed penalty,
 
Iirc, misaligned access with a LOCK prefix issues a full bus lock
instead of locking a cache line. Check this out:
 
https://blogs.oracle.com/dave/qpi-quiescence
 
 
 
Richard Damon <Richard@Damon-Family.org>: Sep 28 10:35PM -0400

On 9/28/20 2:31 PM, olcott wrote:
> smallest. Certainly every compiler could compare the need for padding of
> the specified version with the sorted version and then know whether or
> not sorting reduces padding requirements.
 
I believe the standard requires that the order of the members in the
struct has to match the order they are declared in the struct, at least
as long as an access specifier doesn't exist between them.
 
That rule comes from C, without the exception since C doesn't have
access specifiers. As you alluded to elsewhere, allowing rearrangement
can cause all sorts of issues with code that makes some otherwise fairly
safe assumptions. I suspect that the rule came out because the early
compilers were smart enough to rearrange, then a lot of code was built
with that assumption, and it became effectively impossible to safely
rearrange so it was defined that it couldn't.
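
The declaration-order rule described above can be checked at compile time. This is a minimal sketch; the struct `Sample` and its member names are hypothetical, and the amount of padding is implementation-defined.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical example type; the names are not from the thread.
struct Sample {
    char  tag;     // declared first
    int   value;   // declared second
    short flags;   // declared third
};

// Members with the same access control are laid out in declaration order,
// so their offsets must strictly increase.
static_assert(offsetof(Sample, tag) == 0, "first member sits at offset 0");
static_assert(offsetof(Sample, tag) < offsetof(Sample, value), "tag precedes value");
static_assert(offsetof(Sample, value) < offsetof(Sample, flags), "value precedes flags");

// Padding the compiler had to insert to keep that order (implementation-defined).
constexpr std::size_t sample_padding =
    sizeof(Sample) - (sizeof(char) + sizeof(int) + sizeof(short));
```

On an ABI with a 4-byte, 4-aligned int, `sample_padding` would typically be 5: three bytes after `tag` and two at the end.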
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Sep 28 10:04PM -0700

> can cause all sorts of issues with code that makes some otherwise fairly
> safe assumptions. I suspect that the rule came out because the early
> compilers were smart enough to rearrange, then a lot of code was built
 
Did you mean *weren't* smart enough to rearrange?
 
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
olcott <NoOne@NoWhere.com>: Sep 29 12:34AM -0500

On 9/29/2020 12:04 AM, Keith Thompson wrote:
>> safe assumptions. I suspect that the rule came out because the early
>> compilers were smart enough to rearrange, then a lot of code was built
 
> Did you mean *weren't* smart enough to rearrange?
 
No he meant "were" smart enough, yet this caused problems so they had to
make a standard. It doesn't take much intelligence to sort fields by
size and add a few padding bytes at the end. They (apparently) made it a
standard to not change the specified order to have cross compiler
consistency.
 
 
--
Copyright 2020 Pete Olcott
Jorgen Grahn <grahn+nntp@snipabacken.se>: Sep 29 05:52AM

On Tue, 2020-09-29, Keith Thompson wrote:
>> safe assumptions. I suspect that the rule came out because the early
>> compilers were smart enough to rearrange, then a lot of code was built
 
> Did you mean *weren't* smart enough to rearrange?
 
He must have.
 
>> with that assumption, and it became effectively impossible to safely
>> rearrange so it was defined that it couldn't.
 
If that's what happened, it's a really bad combination: letting bad
code prevent important optimizations. I can see no reason for
portable code to know the struct layout (except perhaps that you can
cast between &foo and &foo.first_element).
 
One could easily imagine each ABI defining struct layout, with
rearrangements, in such a way that waste is minimized.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
olcott <NoOne@NoWhere.com>: Sep 29 12:59AM -0500

On 9/29/2020 12:52 AM, Jorgen Grahn wrote:
 
> One could easily imagine each ABI defining struct layout, with
> rearrangements, in such a way that waste is minimized.
 
> /Jorgen
 
It makes sense to have the compilers conform to a consistent standard
across compilers.
 
 
--
Copyright 2020 Pete Olcott
"Öö Tiib" <ootiib@hot.ee>: Sep 28 11:23PM -0700

On Tuesday, 29 September 2020 08:53:04 UTC+3, Jorgen Grahn wrote:
> cast between &foo and &foo.first_element).
 
> One could easily imagine each ABI defining struct layout, with
> rearrangements, in such a way that waste is minimized.
 
Clang's -Wpadded warns that there is padding but for some reason fails
to distinguish whether reordering would actually help or would just
move the same padding to the end of the struct. So even a very modern
compiler can easily be relatively dim-witted about the issue.
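
The two situations that the warning does not tell apart can be sketched as follows. The structs are hypothetical, and the sizes in the comments assume a typical ABI with a 4-byte, 4-aligned int.

```cpp
#include <cstddef>

// Case 1: reordering helps.
struct Interleaved {   // padding after 'a' and after 'c': typically 12 bytes
    char a;
    int  b;
    char c;
};

struct Reordered {     // same members, largest first: typically 8 bytes
    int  b;
    char a;
    char c;
};

// Case 2: reordering cannot help, the padding only moves.
struct TailOnly {         // 3 bytes of tail padding: typically 8 bytes
    int  b;
    char a;
};

struct TailOnlySwapped {  // 3 bytes of padding after 'a': typically 8 bytes
    char a;
    int  b;
};

// These hold on mainstream ABIs; the standard itself does not promise them.
static_assert(sizeof(Reordered) <= sizeof(Interleaved), "reordering does not hurt");
static_assert(sizeof(TailOnly) == sizeof(TailOnlySwapped), "padding only moves");
```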
Tim Woodall <news001@woodall.me.uk>: Sep 29 06:33AM

> enough to know that it needs to do some bitwise operations to isolate
> the required field when two or more fields are loaded by one aligned
> memory access.
 
What about a struct of array members?
 
struct {
short[3]
short
short[3]
short
char[3]
char
char[3]
char
};
 
The Bin Packing Problem is NP-hard. ISTM that the alignment constraints
effectively make optimizing the arrangement of members in the struct an
equivalent problem.
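
For reference, a compilable version of the struct above, as a sketch with hypothetical member names. One detail that matters for the packing argument: an array has the alignment of its element type, not of its total size, so a short[3] only needs the alignment of a short.

```cpp
#include <cstddef>

// The struct from the post, with hypothetical member names added.
struct Arrays {
    short s3a[3];
    short s1a;
    short s3b[3];
    short s1b;
    char  c3a[3];
    char  c1a;
    char  c3b[3];
    char  c1b;
};

// An array is aligned like its element type.
static_assert(alignof(short[3]) == alignof(short), "array alignment is element alignment");

// Bytes occupied by the members themselves.
constexpr std::size_t member_bytes =
    2 * (sizeof(short[3]) + sizeof(short)) + 2 * (sizeof(char[3]) + sizeof(char));

// Bytes of padding in the declared order (implementation-defined).
constexpr std::size_t padding_bytes = sizeof(Arrays) - member_bytes;
```

With a 2-byte, 2-aligned short, `padding_bytes` comes out as 0 for this particular order.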
David Brown <david.brown@hesbynett.no>: Sep 29 09:57AM +0200

On 28/09/2020 19:05, olcott wrote:
> really good idea.
 
> Although compilers could be smart enough to do this, they must refrain
> just in case the order of the fields must correspond to disk storage.
 
Compilers /can/ re-arrange struct fields - as long as it does not affect
the observable behaviour of the code. (Writing the struct directly out
to a file would be observable behaviour.) For a while, gcc had an
optimisation that did such re-arrangements, but it got removed as it was
overly complex and unmaintainable in the face of link-time optimisation.
For local structs within a function, compilers certainly can and do
break up and re-arrange fields.
"Öö Tiib" <ootiib@hot.ee>: Sep 29 04:49AM -0700

On Tuesday, 29 September 2020 05:35:36 UTC+3, Richard Damon wrote:
> compilers were smart enough to rearrange, then a lot of code was built
> with that assumption, and it became effectively impossible to safely
> rearrange so it was defined that it couldn't.
 
The problem is that there are two orthogonal considerations: 1) member
order in memory and 2) member initialisation order. Both are expressed
with declaration order in C++. When what is optimal for one differs from
what is optimal for the other, we do not have semantics to express it.
 
Another nit: the C++ standard says that members with the same access
control are ordered by declaration order (access specifiers do not
necessarily change access control).
Richard Damon <Richard@Damon-Family.org>: Sep 29 08:14AM -0400

On 9/29/20 1:04 AM, Keith Thompson wrote:
>> safe assumptions. I suspect that the rule came out because the early
>> compilers were smart enough to rearrange, then a lot of code was built
 
> Did you mean *weren't* smart enough to rearrange?
Yes.
Richard Damon <Richard@Damon-Family.org>: Sep 29 08:28AM -0400

On 9/29/20 1:52 AM, Jorgen Grahn wrote:
 
> One could easily imagine each ABI defining struct layout, with
> rearrangements, in such a way that waste is minimized.
 
> /Jorgen
 
The first big problem is that if you reorder, the reordering MUST be
consistent in every compilation unit, and ideally you want different
compilers to come up with the same result for the same machine.
 
Second, remember that most C code, especially early on want designed to
be totally portable, just fairly portable. It was very common that pieces
of most programs would have a lot of assumptions on the machine that was
targeted. If you were doing it well, you tried to isolate most of these
assumptions to just a few files marked with portability comments.
Structs WERE often used to map to hardware registers or file formats,
and the programmer would maybe need to adjust things to map to their
hardware. (Remember, fixed width types are sort of new). Sometimes,
particularly for file (or other communication) formats, you would
realize that your hardware didn't map well to what the format was based
on, and you needed to add a layer to convert to raw bytes (endian issues
or misaligned fields being the biggest example). If compilers could
re-arrange structs, then the odds of being able to match a struct to an
externally defined format drops significantly, forcing going to the low
level techniques.
olcott <NoOne@NoWhere.com>: Sep 29 10:51AM -0500

On 9/29/2020 1:23 AM, Öö Tiib wrote:
> to make difference if reordering would help with something or
> would just put same padding at end of struct. So even very modern
> compiler can easily be relatively dim-witted about the issue.
 
No, apparently not at all. Standards complying compilers are not allowed
to reorder the fields.
 
--
Copyright 2020 Pete Olcott
olcott <NoOne@NoWhere.com>: Sep 29 10:56AM -0500

On 9/29/2020 1:33 AM, Tim Woodall wrote:
> char[3]
> char
> };
 
If we simply assume that the compiler always loads a 32-bit int and then
does bitwise operations to separate the required field the above struct
is already fully packed.
 
> The Bin Packing Problem is NP-hard. ISTM that the alignment constraints
> effectively make optimizing the arrangement of members in the struct an
> equivalent problem.
 
Try and find a concrete example where simply sorting by size and padding
at the end won't work.
 
--
Copyright 2020 Pete Olcott
olcott <NoOne@NoWhere.com>: Sep 29 11:00AM -0500

On 9/29/2020 6:49 AM, Öö Tiib wrote:
> order in memory and 2) member initialisation order. Both are expressed
> with declaration order in C++. When optimal of one differs from optimal
> of other then we do not have semantics to express it.
 
Sure we do. The initialization order can be specified in the constructor.
 
 
--
Copyright 2020 Pete Olcott
Juha Nieminen <nospam@thanks.invalid>: Sep 29 04:05PM

> smallest. Certainly every compiler could compare the need for padding of
> the specified version with the sorted version and then know whether or
> not sorting reduces padding requirements.
 
Btw, this is the reason why I was so disappointed that C++20 designated
initializers have to be specified in the same order as the members have
been declared.
 
In some situations you may want to declare the members in an order that
optimizes space, but specify them in the initialization list in a more
logical order (especially if it's some kind of struct containing all
kinds of settings or other parameters).
 
Also, if you ever, e.g., change the size of a member and thus need to
change its placement inside the struct, all the initializations will
break (even though that's supposed to be one of the biggest advantages
of designated initializers: The initialization doesn't break if the
order of the members changes.) I suppose getting error messages is
better than the wrong elements being silently initialized, but still,
it would be nice if it just worked.
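
The restriction can be illustrated with a small sketch. The `Settings` struct and `make_settings` function are hypothetical, and the code needs a C++20 compiler (some compilers accept it earlier as an extension).

```cpp
// Hypothetical settings struct; members are ordered large-to-small to limit padding.
struct Settings {
    double scale;
    int    width;
    int    height;
    bool   verbose;
};

// OK in C++20: the designators follow declaration order. Members that are
// left out (here 'scale') are value-initialized, so scale == 0.0.
inline Settings make_settings() {
    return Settings{.width = 640, .height = 480, .verbose = true};
}

// Ill-formed in C++20, because the designators are out of declaration order:
// Settings bad{.verbose = true, .width = 640};
```

Skipping members is allowed; only the relative order of the designators that are present is constrained.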
Juha Nieminen <nospam@thanks.invalid>: Sep 29 04:21PM


> The Bin Packing Problem is NP-hard. ISTM that the alignment constraints
> effectively make optimizing the arrangement of members in the struct an
> equivalent problem.
 
AFAIK the one-dimensional version (which this situation is) is not NP-hard.
You would need at least two structs, and the problem being to distribute
the given elements optimally among them.
 
Alignment constraints effectively make it into a "multiple containers"
situation, because you need to distribute all the elements among these
containing ranges without them spilling over... except it's still not
NP-hard because the elements are always exactly a half, a quarter and
possibly an eighth of the container size.
 
In this situation you can simply always take the largest remaining
element that will fit in the current free space of the current slot.
(There's no optimization conundrum because of that restriction above.)
Paavo Helde <myfirstname@osa.pri.ee>: Sep 29 10:08PM +0300

29.09.2020 19:00 olcott kirjutas:
>> with declaration order in C++. When optimal of one differs from optimal
>> of other then we do not have semantics to express it.
 
> Sure we do. The initialization order can be specified in the constructor.
 
Sorry, but no. The standard says in [class.base.init]: "Then, non-static
data members are initialized in the order they were declared in the
class definition (again regardless of the order of the mem-initializers)."
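
The quoted rule can be observed directly. In this sketch, `Tracked`, `Widget` and `init_log` are hypothetical helpers that record the order in which members are actually initialized.

```cpp
#include <string>
#include <vector>

// Records the order in which members are initialized (illustrative helper).
inline std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical member type that logs its own construction.
struct Tracked {
    explicit Tracked(const char* name) { init_log().push_back(name); }
};

struct Widget {
    Tracked first;    // declared first
    Tracked second;   // declared second

    // The mem-initializers are deliberately written in the opposite order.
    // GCC and Clang warn about this with -Wreorder (part of -Wall).
    Widget() : second("second"), first("first") {}
};
```

After a `Widget` is constructed, the log holds "first" and then "second": declaration order wins over the order written in the mem-initializer list.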
James Kuyper <jameskuyper@alumni.caltech.edu>: Sep 29 03:28PM -0400

On 9/29/20 3:08 PM, Paavo Helde wrote:
 
> Sorry, but no. The standard says in [class.base.init]: "Then, non-static
> data members are initialized in the order they were declared in the
> class definition (again regardless of the order of the mem-initializers)."
 
Which is immediately followed by "Finally, the compound-statement of the
constructor body is executed." - I presume that he was referring to
initialization occurring inside that compound-statement, which can, in
fact, initialize the members in any desired order.
olcott <NoOne@NoWhere.com>: Sep 29 02:44PM -0500

On 9/29/2020 2:08 PM, Paavo Helde wrote:
 
> Sorry, but no. The standard says in [class.base.init]: "Then, non-static
> data members are initialized in the order they were declared in the
> class definition (again regardless of the order of the mem-initializers)."
 
There is a semantic difference between initializers and constructors.
 
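#include <cstdio>

struct Construct
{
  int X;
  int Y;
  int Z;
  Construct();
};

Construct::Construct()
{
  // Members are set in the constructor body, in an order that differs
  // from their declaration order.
  std::printf("Construct::Construct()\n");
  Y = 55;
  Z = 66;
  X = 77;
}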
 
--
Copyright 2020 Pete Olcott
Jorgen Grahn <grahn+nntp@snipabacken.se>: Sep 29 08:39PM

On Tue, 2020-09-29, Richard Damon wrote:
 
> The first big problem is that if you reorder, the reordering MUST be
> consistent in every compilation unit, and ideally you want different
> compilers to come up with the same result for the same machine.
 
I thought that was what I addressed in my last paragraph above:
 
>> One could easily imagine each ABI defining struct layout, with
>> rearrangements, in such a way that waste is minimized.
 
E.g. a stable sort of the struct members by size would avoid a lot of
padding, while still being predictable.
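
A rough model of such a rule, as a sketch only. The `Member` type and both functions are hypothetical and do not describe any real ABI. The sort key here is alignment rather than size, which is an assumption made so that arrays (whose size is larger than their alignment) are handled as well.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Size and alignment of one member, in bytes (illustrative model, not an ABI).
struct Member {
    std::size_t size;
    std::size_t align;
};

// Size of a struct whose members are laid out in the given order, each at the
// next offset that satisfies its alignment, with tail padding up to the
// strictest member alignment.
inline std::size_t layout_size(const std::vector<Member>& members) {
    std::size_t offset = 0;
    std::size_t max_align = 1;
    for (const Member& m : members) {
        offset = (offset + m.align - 1) / m.align * m.align;  // round up
        offset += m.size;
        max_align = std::max(max_align, m.align);
    }
    return (offset + max_align - 1) / max_align * max_align;
}

// Stable sort by decreasing alignment: ties keep declaration order, so every
// compiler applying the same rule would arrive at the same layout.
inline std::vector<Member> sorted_for_packing(std::vector<Member> members) {
    std::stable_sort(members.begin(), members.end(),
                     [](const Member& a, const Member& b) { return a.align > b.align; });
    return members;
}
```

For members with (size, alignment) of (1,1), (4,4), (1,1), (8,8), (2,2) in that declared order, `layout_size` gives 32 bytes; after `sorted_for_packing` it gives 16.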
 
> Second, remember that most C code, especially early on want designed to
^^^^
I assume that should read ", wasn't".
 
> realize that your hardware didn't map well to what the format was based
> on, and you needed to add a layer to convert to raw bytes (endian issues
> or misaligned fields being the biggest example).
 
I do remember all that, and that's what I feel is "letting bad code
prevent important optimizations". I would have preferred if the
language allowed an ABI to optimize struct layout. People who wanted
to use structs as a convenient way to access bytes in memory could
either (a) know the ABI or (b) use some kind of "packed structs"
extension, like they mostly do today.
 
It's not clear to me as I write this whether the language or the
popular ABIs prevent this ... but I don't really have to know which
one.
 
> re-arrange structs, then the odds of being able to match a struct to an
> externally defined format drops significantly, forcing going to the low
> level techniques.
 
I should probably add that I feel mapping structs onto raw memory is
the /real/ low level technique. I always try to do such things with
portable code, e.g. parsing binary file formats one octet at a time.
That may be a bit slower (I have never tried measuring) but it's portable
and I think it invites fewer bugs too. Including fewer security bugs.
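
A minimal sketch of the octet-at-a-time approach, assuming a made-up format with two 32-bit little-endian fields. `Header`, `read_u32_le` and `parse_header` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Reads a 32-bit little-endian unsigned integer from a byte buffer, one octet
// at a time. The caller must guarantee that at least 4 bytes are readable.
inline std::uint32_t read_u32_le(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical record header, filled field by field instead of overlaying a
// struct on the raw bytes.
struct Header {
    std::uint32_t magic;
    std::uint32_t length;
};

// Parses a header; returns false if the buffer is too short.
inline bool parse_header(const unsigned char* data, std::size_t size, Header& out) {
    if (size < 8) return false;
    out.magic  = read_u32_le(data);
    out.length = read_u32_le(data + 4);
    return true;
}
```

The result does not depend on the host's endianness, alignment rules or struct padding, and the length check sits in one obvious place.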
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
David Brown <david.brown@hesbynett.no>: Sep 29 10:48PM +0200

On 29/09/2020 17:51, olcott wrote:
>> compiler can easily be relatively dim-witted about the issue.
 
> No, apparently not at all. Standards complying compilers are not allowed
> to reorder the fields.
 
Compilers are not allowed to make /visible/ reordering of the fields.
But compilers are always allowed to do whatever they want in the way of
optimisations as long as the code appears "as if" it followed the
language rules literally.
David Brown <david.brown@hesbynett.no>: Sep 29 10:51PM +0200

On 29/09/2020 21:28, James Kuyper wrote:
> constructor body is executed." - I presume that he was referring to
> initialization occurring inside that compound-statement, which can, in
> fact, initialize the members in any desired order.
 
And as always, the compiler can re-arrange that into any order it wants
if the difference can't be seen by code. (That applies to both the
member initialisation part, and the constructor body.)
David Brown <david.brown@hesbynett.no>: Sep 29 10:54PM +0200

On 29/09/2020 21:44, olcott wrote:
>   Z = 66;
>   X = 77;
> }
 
The compiler can implement that as though you had written:
 
Construct::Construct()
{
  X = 77;
  Y = 55;
  Z = 66;
  printf("Construct::Construct()\n");
}
 
In this case, there is no semantic difference because there is no
observable difference in the behaviour of the program for different
orderings of the initialisers or even the printf() call.
Paavo Helde <myfirstname@osa.pri.ee>: Sep 30 12:17AM +0300

29.09.2020 22:28 James Kuyper kirjutas:
> constructor body is executed." - I presume that he was referring to
> initialization occurring inside that compound-statement, which can, in
> fact, initialize the members in any desired order.
 
This would technically be assignment, not initialization, with the known
drawbacks of "delayed init" (having uninitialized data around for a
while or need to invent dummy placeholder values; also, cannot have
const members).
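
The const-member drawback can be shown with a short sketch; the `Account` class is hypothetical.

```cpp
#include <string>
#include <utility>

// Hypothetical class for illustration.
class Account {
public:
    // 'id_' is const and can only be set by a mem-initializer. 'owner_' is
    // initialized here too, so it is never default-constructed and then
    // assigned.
    Account(int id, std::string owner) : id_(id), owner_(std::move(owner)) {}

    // Would not compile: a const member cannot be assigned in the body.
    // Account(int id) { id_ = id; }

    int id() const { return id_; }
    const std::string& owner() const { return owner_; }

private:
    const int   id_;
    std::string owner_;
};
```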