soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

alignment and endian issues - 23 Updates
Can one initialise std::string with pre-allocated memory? - 1 Update
parameterized constructor with missing fields? - 1 Update

jameskuyper@verizon.net: Apr 18 09:56AM -0700

On Wednesday, April 18, 2018 at 11:28:05 AM UTC-4, Rick C. Hodgin wrote:
> >> would not be a factor.

> > Garbage. It's undefined behaviour.

> Only if you use architectures which do not allow such things.

Whether or not code has undefined behavior is specified by the C++
standard, not the architecture. It can indirectly depend upon
implementation-specific issues, such as the range of representable values
of a given type, but only insofar as what the C++ standard says about the
code covers those issues.

> ... A
> simple unit test would confirm if it works or not on any
> architecture.

"Undefined behavior" cannot be disproved by a unit test. Undefined
behavior means that there are no restrictions on what the program can do.
Therefore, no matter what result you expect from your unit test, that
result is compatible with "undefined behavior". In particular,
"undefined behavior" permits, as one possibility, that your code will do
precisely whatever it was you incorrectly thought it was required to do.

> Compilers often allow things the standard does not.

The standard imposes no requirements on the behavior when it is
undefined, so it's impossible for there to be any behavior allowed by
the compiler that is not also allowed by the standard.

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 01:22PM -0400

> implementation-specific issues, such as the range of representable values
> of a given type, but only insofar as what the C++ standard says about the
> code covers those issues.

I understand.

> result is compatible with "undefined behavior". In particular,
> "undefined behavior" permits, as one possibility, that your code will do
> precisely whatever it was you incorrectly thought it was required to do.

I disagree. You can send it test values and see if it produces what
you expect from a positive and a negative, for example. If it gives
you the correct results, it is working correctly.

> The standard imposes no requirements on the behavior when it is
> undefined, so it's impossible for there to be any behavior allowed by
> the compiler that is not also allowed by the standard.

The compiler will generate code which works based on the operation,
even if the standard says it's undefined behavior. It will generate
the appropriate code sequence which would be expected from what the
source code asks for, even if it is not defined behavior by the
compiler.

Here is what MSVC++ produces for that bit of code in a 64-bit compile:

is_valid2 PROC ; is_valid2, COMDAT
; 36 : {
; 37 : return *reinterpret_cast<const uint64_t*>(data) & 0x8080808080808080L;
mov rax, QWORD PTR data$[rbp]
mov rcx, 8080808080808080H
mov rax, QWORD PTR [rax]
and rax, rcx
test rax, rax
je SHORT $LN3@is_valid2
mov BYTE PTR tv66[rbp], 1
jmp SHORT $LN4@is_valid2
$LN3@is_valid2:
mov BYTE PTR tv66[rbp], 0
$LN4@is_valid2:
movzx eax, BYTE PTR tv66[rbp]
; 38 : };
ret 0
is_valid2 ENDP ; is_valid2

In short, it generates the expected operation. MSVC++ allows that
particular operation to go through correctly.

In optimized code, it produces a far simplified version that's brought
inline so no function call is made:

mov rdi, 8080808080808080H
test QWORD PTR data, rdi

But still the same operation, and exactly the one is_valid2() was
designed to produce ... exactly.

It's UB in the C++ standard, but it works in the compiler. And a
unit test case at startup passing in a data with all 0x80 values,
and one without, will tell if it works.

--
Rick C. Hodgin

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Apr 18 06:56PM +0100

On Wed, 18 Apr 2018 12:21:35 -0400
> > the point: behaviour doesn't become defined by being doubly undefined.)

> I understand that. That's why I say it's easily tested with a unit
> test case.

If you think so then you do not understand it.

> > ceases to work? What a load of horse manure.

> No. Use the code. Compile it. If it fails the test case, update
> a #define setting and recompile to use the alternative mechanism.

See above.

> trivial on a 32-bit machine.

> It comes down to looking at what's happening at the machine level, as
> opposed to looking at what the language allows.
[more of the same snipped]

There is no option 3. See above.

> solely the direct result of the data, and not the code being changed.

> I think it's the better philosophy regarding data. It is after all...
> data.

It is not about philosophy, whether about data or anything else. That
your code breaches the C++ language specification is the end of it.
That there is no strict aliasing rule in CAlive or some other language
(vapourware or otherwise) is irrelevant to the issue, as is whether you
think there should be a strict aliasing rule in C and C++. This is a
C++ newsgroup, there is a strict aliasing rule in C++ and if he is
writing C++ code with a C++ compiler he needs to know it.

[snip]
> The iterative loop method you propose is slower and very likely
> completely unnecessary given the nature of data computing in
> assembly / machine code.

I wasn't proposing an iterative loop. You seem to be clueless. What
makes your attitude even more ridiculous is that there is a
zero-overhead way of doing it right.

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 02:17PM -0400

On 4/18/2018 1:56 PM, Chris Vine wrote:
>> data.

> It is not about philosophy, whether about data or anything else. That
> your code breaches the C++ language specification is the end of it.

I agree it is the end of it with regards to the C++ language standard.
My position is it's not the end of it with regards to the compiler. The
compiler is free to do what it wants in the cases of UB, including do
the correct operation.

> think there should be a strict aliasing rule in C and C++. This is a
> C++ newsgroup, there is a strict aliasing rule in C++ and if he is
> writing C++ code with a C++ compiler he needs to know it.

Agreed. If the C++ compiler adheres explicitly to the standard it
may produce unusable code. If, however, it goes ahead and performs
the operation, as we just saw MSVC++ does, then it is working in that
compiler.

>> completely unnecessary given the nature of data computing in
>> assembly / machine code.

> I wasn't proposing an iterative loop.

I apologize. I mistook this post from Paavo Helde for being from you:

bool is_valid(const char* data) {
    return std::find_if(data, data+8,
        [](char c) {return c&'\x80';})==data+8;
}

My mistake.

> You seem to be clueless. What
> makes your attitude even more ridiculous is that there is a
> zero-overhead way of doing it right.

It's not zero-overhead. Your proposed memcpy() is iterative, and
operates on the data on a byte-by-byte basis. It is slower than
the proposal by the OP, and while yours may be conforming ... who
cares if his faster method works? If his goal is to be expressly
conforming, then it matters. But if he's targeting a range of
tools where it will work using is_valid2() ... then honestly, who
cares? Every C++ compiler is different and these things can be
wrangled into tests and validated at startup with the simple load
of a test case library that calls some functions included in the
main executable.

--
Rick C. Hodgin

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Apr 18 07:40PM +0100

On Wed, 18 Apr 2018 14:17:06 -0400
> On 4/18/2018 1:56 PM, Chris Vine wrote:
[snip]
> wrangled into tests and validated at startup with the simple load
> of a test case library that calls some functions included in the
> main executable.

It is not iterative in the sense of a loop. It is an intrinsic/built-in
which will do a block byte transfer at worst, and if your
reinterpret_cast does not fail on alignment grounds will be elided
entirely.

So you propose to recommend undefined behaviour on the ground that the
one compiler you have tested your reinterpret_cast with (VS) gives the
correct results, but not to adopt defined behaviour even though (i)
every compiler you test it with will elide the memcpy away if your
reinterpret_cast can possibly work, to produce optimal code, and (ii)
every update to VS or change to the program compile parameters may
break your version.

The correct version will be just as fast as your incorrect version, if
compiled with -O or higher and your version actually works.

Your view is deranged.

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 02:56PM -0400

On 4/18/2018 2:40 PM, Chris Vine wrote:
> which will do a block byte transfer at worst, and if your
> reinterpret_cast does not fail on alignment grounds will be elided
> entirely.

What is your code for this explicit example? How would you write it
using memcpy() to have it be as fast as the OP's is_valid2()?

--
Rick C. Hodgin

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Apr 18 08:20PM +0100

On Wed, 18 Apr 2018 14:56:09 -0400
> > entirely.

> What is your code for this explicit example? How would you write it
> using memcpy() to have it be as fast as the OP's is_valid2()?

You replace your assignment-with-reinterpret_cast to 'd' with a
memcpy() to 'd', and compile with optimization enabled. That's it.

This might help you (the memcpy() version which compiles to the same
code as the cast version is at the end):

https://blog.regehr.org/archives/959

Also, since you appear to like authority figures, the author may
persuade you where I have failed.

jameskuyper@verizon.net: Apr 18 12:21PM -0700

On Wednesday, April 18, 2018 at 1:22:15 PM UTC-4, Rick C. Hodgin wrote:

> I disagree. You can send it test values and see if it produces what
> you expect from a positive and a negative, for example. If it gives
> you the correct results, it is working correctly.

I'm not talking about determining whether it's working correctly. I'm
talking about determining whether or not the behavior is undefined.
Since "undefined behavior" imposes no restrictions on the behavior of
your code, the fact that your code might happen to work correctly does
not count as disproving that your code has undefined behavior.

Why should you care whether the behavior is undefined as long as the
code works?

If the behavior is defined by the standard, you can reasonably expect a
compiler that claims to be conforming to generate an executable that
provides that behavior, and will continue to provide that behavior. If
the behavior is undefined, the fact that the executable passed one unit
test provides no justification for assuming that it will pass the same
unit test the next time that it is run with the same inputs (not even if
it's exactly the same executable, with no recompilation). You have even
less justification for assuming it will continue passing that test if
compiled a second time, particularly if it's compiled by a different
standard-conforming compiler.

> > the compiler that is not also allowed by the standard.

> The compiler will generate code which works based on the operation,
> even if the standard says it's undefined behavior.

Which is perfectly allowable, according to the standard.

> ... It will generate
> the appropriate code sequence which would be expected from what the
> source code asks for,

Well, that depends upon what you expect of the source code. If you have
any particular expectations for the behavior when the behavior is
undefined, those expectations don't come from a proper reading of the C
standard. A proper understanding of the C standard will give you no
justification for being surprised by any particular behavior exhibited
by the code - so in one sense, that means the code's behavior has met
expectations, regardless of what that behavior is.

> In short, it generates the expected operation.

And a fully conforming implementation of C is also permitted to generate
an executable that violates your unjustified expectations about the
behavior of this code.

> It's UB in the C++ standard, but it works in the compiler. And a

That's bad wording: the word "but" implies a conflict between the two
statements. UB is fully compatible with the code working exactly as you
incorrectly thought it was required to work.

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 03:33PM -0400

On 4/18/2018 3:20 PM, Chris Vine wrote:
>> using memcpy() to have it be as fast as the OP's is_valid2()?

> You replace your assignment-with-reinterpret_cast to 'd' with a
> memcpy() to 'd', and compile with optimization enabled. That's it.

I do not understand how this code would appear or work. Can you
please provide an example in source code that compiles?

As I read it I see this:

bool is_valid2(const char* data)
{
char d[8];

memcpy(d, data, 8);

// How do I test each byte's 0x80 bit?
// return(d & 0x8080808080808080L);
};

??

--
Rick C. Hodgin

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 03:44PM -0400

> less justification for assuming it will continue passing that test if
> compiled a second time, particularly if it's compiled by a different
> standard-conforming compiler.

There's no argument to that. UB can produce code doing anything it
wants so the discussion ends there.

My point is look at what your compiler generates and if it generates
proper code ... use it.

--
Rick C. Hodgin

jameskuyper@verizon.net: Apr 18 12:49PM -0700

On Wednesday, April 18, 2018 at 3:33:54 PM UTC-4, Rick C. Hodgin wrote:

> bool is_valid2(const char* data)
> {
> char d[8];

Why did you change the type of 'd'? He said nothing about doing that in his suggestion. It's supposed to be

uint64_t d;

> memcpy(d, data, 8);

memcpy(&d, data, sizeof d);

> // How do I test each byte's 0x80 bit?
> // return(d & 0x8080808080808080L);

That's not permitted with your definition of d, since it's an lvalue of array type that gets implicitly converted into a pointer to the first element of that array. As such, it violates 6.5.10p2. With d corrected to an integer type, there's still the issue that the OP apparently had his logic reversed for is_valid2(). I believe that it should be:

return !(d & 0x8080808080808080L);

David Brown <david.brown@hesbynett.no>: Apr 18 09:51PM +0200

On 18/04/18 17:27, Rick C. Hodgin wrote:

> Only if you use architectures which do not allow such things. A
> simple unit test would confirm if it works or not on any
> architecture.

No, it will not.

There are two potential problems here. One is the alignment issue. On
some processors (such as the x86), you don't have to have correct
alignment for loading a 64-bit value - on other processors, you /do/
need it. Even on the x86, there are instructions that can't work
unaligned - if the compiler decides to use an SIMD load here (maybe the
function is inlined, and is part of a loop) then the unaligned load will
fail for some SIMD instructions.

The other problem is that reading "const char" data through a pointer to
a "const uint64_t" is undefined behaviour as they are incompatible pointers.

It /might/ work as you expect. It might work with some compilers, and
not with others. It might work with some flags, and not with others.
It might work when the code is a separate function, but not when it is
inlined. It might work on Tuesdays but not Wednesdays. (Okay, that is
unlikely - but you get the point.)

Unit testing is great for showing that valid code with well-defined
behaviour has the behaviour you want. It is useless for determining if
the behaviour is well-defined in the first place.

> Compilers often allow things the standard does not.

True.

> It's not a
> guarantee it will work, but it's easy enough to validate.

Not true.

It is easy to validate if you have a compiler that documents the
behaviour (possibly with a flag - such as -fno-strict-aliasing, or with
an extension such as the "may_alias" type attribute). If the compiler
manual does not document this as an extension, it is dangerous to rely
on it.
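
To illustrate the alignment half of the problem, here is a minimal sketch that is not from the thread. The helper names `is_aligned_for_u64` and `load_u64` are made up for illustration, and the address check assumes `std::uintptr_t` exists and maps addresses to integers in the usual linear way, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p is suitably aligned for a std::uint64_t.
// Assumes the usual linear mapping from pointers to std::uintptr_t.
bool is_aligned_for_u64(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint64_t) == 0;
}

// Hypothetical helper: reads 8 bytes starting at p into a uint64_t.
// std::memcpy has defined behaviour whatever the alignment of p, so this
// avoids both the alignment problem and the aliasing problem.
std::uint64_t load_u64(const char* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

A `const char*` that points one byte into an aligned buffer is a valid argument for `load_u64`, while dereferencing the same address through a `const uint64_t*` is exactly the case that may fault on stricter hardware.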

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Apr 18 08:58PM +0100

On Wed, 18 Apr 2018 15:33:42 -0400

> // How do I test each byte's 0x80 bit?
> // return(d & 0x8080808080808080L);
> };

Are you serious? I even gave you a link to a similar case.

bool is_valid2(const char* data) {
    std::uint64_t d;
    std::memcpy(&d, data, 8);
    return(d & 0x8080808080808080L);
}

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 03:58PM -0400

>> {
>> char d[8];

> Why did you change the type of 'd'? He said nothing about doing that in his suggestion. It's supposed to be

Because I didn't understand what he was saying. I couldn't visualize
it. And also because he wouldn't answer me with an explicit source
code example to clarify.

> uint64_t d;
> memcpy(&d, data, sizeof d);
> return !(d & 0x8080808080808080L);

The single memcpy over d makes it conforming, eh? And the explicit
non-memcpy() copy through the cast pointer isn't conforming, eh?

CAlive here I come. :-) I will never be limited by such things again.

--
Rick C. Hodgin

jameskuyper@verizon.net: Apr 18 12:59PM -0700

On Wednesday, April 18, 2018 at 3:45:07 PM UTC-4, Rick C. Hodgin wrote:
> wants so the discussion ends there.

> My point is look at what your compiler generates and if it generates
> proper code ... use it.

I barely have time to design, write, and test my code. Checking whether the generated code is "proper" would add weeks of delay to every delivery I made. I wonder, just how small are the programs that you write that you can afford the time needed to review the generated code?

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 04:02PM -0400

On 4/18/2018 3:58 PM, Chris Vine wrote:
> std::memcpy(&d, data, 8);
> return(d & 0x8080808080808080L);
> }

I couldn't visualize it, Chris. And in all honesty, after seeing
what James posted, and what you post here ... I am baffled at how
that explicit memcpy() into d would work, and a cast pointer
dereference copy, which will move the same 8 bytes, would not work.

It seems a severe limitation to the compiler and/or C++ language.

You say multiple times that I am clueless and I'm deranged and
that I have failed ... I would argue that these limitations in how
the language works are significant, and inappropriate, and are to
be replaced.

Data is data. It should be viewed as such. I believe that is the
correct philosophy to look at things like this, and any standard
which does not correlate something like a cast pointer copying 8
bytes, compared to a memcpy() which copies 8 bytes, is the insane
component of that discussion.

Regardless of my position on this, I respect your knowledge and
expertise in C++, and I thank you for your multiple replies.

--
Rick C. Hodgin

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 18 04:13PM -0400

>> My point is look at what your compiler generates and if it generates
>> proper code ... use it.

> I barely have time to design, write, and test my code. Checking whether the generated code is "proper" would add weeks of delay to every delivery I made. I wonder, just how small are the programs that you write that you can afford the time needed to review the generated code?

Pretty small. 10s of thousands of lines per app max.

It's not that I review the generated code, it's more when I encounter
something and I'm not sure how it will work in the compiler, I test it.
I typically will test things in MSVC++ and GCC (MinGW).

In addition, I am fluent in x86 assembly, and I know how things can
(and should) work at the assembly level to manipulate data without
the constraints of a language protocol... so I draw on that knowledge
as well.

--
Rick C. Hodgin

David Brown <david.brown@hesbynett.no>: Apr 18 10:22PM +0200

On 18/04/18 20:17, Rick C. Hodgin wrote:
>> On Wed, 18 Apr 2018 12:21:35 -0400
>> "Rick C. Hodgin" <rick.c.hodgin@gmail.com> wrote:
>>> On 4/18/2018 12:03 PM, Chris Vine wrote:
<snip>
> My position is it's not the end of it with regards to the compiler. The
> compiler is free to do what it wants in the cases of UB, including do
> the correct operation.

There is no "correct operation" as far as C++ is concerned. The code
has undefined behaviour - that means it does not make sense in C++ or to
the compiler, even if you feel the intention of the code is clear to a
human reader. (Which it is, in this particular case.)

Imagine it as though someone had written the sentence "I went for a
drive in my bar". This is grammatically correct, and has correct
spelling - a computer spell-checker cannot spot the problem. Most
people would realise "bar" was a typo for "car" - your "unit test" on
the sentence would pass. But some people - perhaps someone with a
different native language, would get confused. The sentence has
"undefined behaviour".

> may produce unusable code. If, however, it goes ahead and performs
> the operation, as we just saw MSVC++ does, then it is working in that
> compiler.

Code like this certainly /can/ have defined behaviour for particular
compilers and/or flags. But you can only rely on it if it is
documented, and if you are sure the code will only be used on such a
compiler. Otherwise it is a very subtle error waiting to creep up on
people.

(It's fine to write code that is specific for a particular compiler -
but you should do so only if you have good reason for it. And you
should document it, and ideally cause compile-time failures if the
assumptions about the tools are broken.)
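
A rough sketch of what such compile-time failures could look like follows. It is not from the thread: the macro `IS_VALID_USE_CAST` and the constant `kHighBits` are hypothetical names, and `_MSC_VER` is used only as an example of "the one compiler the fast path was checked against". Nothing here establishes that any compiler documents the cast as supported.

```cpp
#include <climits>
#include <cstdint>

// Assumptions that a word-at-a-time high-bit check relies on,
// verified when the code is compiled rather than at startup.
static_assert(CHAR_BIT == 8, "the 0x80-per-byte mask assumes 8-bit bytes");
static_assert(sizeof(std::uint64_t) == 8, "expects uint64_t to span 8 chars");

// Mask with the top bit of every byte set (hypothetical name).
constexpr std::uint64_t kHighBits = 0x8080808080808080ULL;

// Hypothetical opt-in macro for a compiler-specific fast path. If a build
// enables it on a compiler it was never checked against, the build stops.
#if defined(IS_VALID_USE_CAST) && !defined(_MSC_VER)
#error "IS_VALID_USE_CAST was only checked on one compiler; use the portable path"
#endif
```

The point of the design is that a broken assumption stops the build with a message instead of surfacing later as a subtle runtime fault.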

> > zero-overhead way of doing it right.

> It's not zero-overhead. Your proposed memcpy() is iterative, and
> operates on the data on a byte-by-byte basis.

Logically, yes, memcpy() is byte for byte. In practice, good compilers
will optimise memcpy() very nicely when the operands are appropriate.
In a case like this, a good compiler (with optimisation enabled,
obviously) will do a single 64-bit load on an architecture that supports
unaligned loads. It will do its best in other cases - using
byte-for-byte loads if needed, or bigger loads if the compiler has some
information about the alignment.

So the memcpy() solution will be as fast as your version on any target
that allows unaligned loads, and /correct/ on all targets regardless of
optimisations, flags, compiler variations, etc.
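
As a sketch of how the two well-defined approaches can be compared (function names are assumptions for illustration, not from the thread), the following puts a byte-by-byte check next to the memcpy()-based word check and confirms that they agree on a few inputs. The byte-wise version uses std::none_of rather than the std::find_if form quoted earlier, and both return true when no byte has its 0x80 bit set. Because both functions have defined behaviour, a test like this is meaningful in a way that a test of the cast version is not.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Byte-by-byte check: true if no byte in data[0..8) has bit 0x80 set.
bool is_valid_bytes(const char* data) {
    return std::none_of(data, data + 8, [](char c) {
        return (static_cast<unsigned char>(c) & 0x80u) != 0;
    });
}

// Word-at-a-time check: copy the 8 bytes into an integer, then mask.
// The mask has the same bit set in every byte, so byte order is irrelevant.
bool is_valid_word(const char* data) {
    std::uint64_t d;
    std::memcpy(&d, data, sizeof d);
    return (d & 0x8080808080808080ULL) == 0;
}

// Compares both checks on an all-clear buffer and on eight buffers that
// each have the high bit set in exactly one position.
bool checks_agree() {
    char buf[8];
    std::memset(buf, 'a', sizeof buf);
    if (is_valid_bytes(buf) != is_valid_word(buf)) return false;
    for (int i = 0; i < 8; ++i) {
        std::memset(buf, 'a', sizeof buf);
        buf[i] = '\x80';
        if (is_valid_bytes(buf) != is_valid_word(buf)) return false;
    }
    return true;
}
```

Whether the compiler turns the memcpy() into a single load depends on the target and the optimisation level, so the generated code is worth inspecting rather than assuming.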

> wrangled into tests and validated at startup with the simple load
> of a test case library that calls some functions included in the
> main executable.

There are certainly cases where implementation-specific code is fine.
If the code is full of calls to WinAPI functions, then relying on x86
features is perfectly reasonable. If it is full of MSVC extensions,
then relying on MSVC behaviour is also fine. (I don't know if MSVC
documents that it allows such pointer casts in this way.)

It is also fine to do:

#if __COMPILER_XXX

// Fast implementation known to work on XXX
bool is_valid(...

#else

// Possibly slow, but definitely correct fall-back version
bool is_valid(...

Wednesday, April 18, 2018

Digest for comp.lang.c++@googlegroups.com - 25 updates in 3 topics
