soft and program: Digest for comp.lang.c++@googlegroups.com

std::hexfloat - 25 Updates

Bonita Montero <Bonita.Montero@gmail.com>: May 21 07:45AM +0200

> That's the wrong way around. What is a lot more frightening is
> incompetent programmers writing code which depends on pointer casts
> without realising that their code (a) gives undefined behaviour, ...

... in theory.

"Fred.Zwarts" <F.Zwarts@KVI.nl>: May 21 09:25AM +0200

"Chris Vine" schreef in bericht
news:20190520174928.c77ab71214afbd6375a5bbce@cvine--nospam--.freeserve.co.uk...
>it will be optimized out where a cast would (but for strict aliasing)
>work, and will still work where casting wouldn't (such as when casting
>would result in misalignment).

I need type punning often when accessing device registers. E.g., for a given
VME address it makes a difference whether it is accessed in D8 mode (byte
access), D16 (16-bit) mode or D32 (32-bit) mode. memcpy for device registers
is a bad idea, because it is not defined what bit-size will be used for the
copy.
I have the feeling that only type punning, in combination with a volatile
declaration is a good method for this purpose. But I am not sure that it is
well defined in the C++ standard and that it is portable across different
platforms. Fortunately, it works on the platforms that I use.
Or is there a method that is clearly supported by the C++ standard?

"Öö Tiib" <ootiib@hot.ee>: May 21 01:16AM -0700

On Monday, 20 May 2019 20:26:32 UTC+3, Ben Bacarisse wrote:

> >> >>>> C++ does not.

> I've cut the quoted text because your reply appears to be about the
> above, not what I wrote (though I may have misunderstood).

If something was misunderstood (in this case, what exactly the puzzle
is), then it was most likely by me. I have some knowledge but bad
communication skills.
I know that C++ does not allow type punning through a union. I have
read that in each version of the standard and also in public
communications and discussions by committee members. Which compiler
optimizations that rule is meant to enable is unclear, but so it is.

> > that placement new.

> Thanks, yes, I read that part, but I could not find where a plain write
> sets the active member.

Sorry, I am using an online version of the draft ... since I don't have
the books at hand right now. There it is not a plain write but an
assignment operator, either built-in or trivial. That is
http://eel.is/c++draft/class.union#5
of [class.union] http://eel.is/c++draft/class.union
Otherwise, when there is no trivial way to start the lifetime of a member
(to make it active), we have to use placement new.
http://eel.is/c++draft/class.union#6
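
A minimal sketch of the two cases described above, assuming C++11 or later. The union names `Number` and `Text` and both helper functions are made up for illustration; they are not taken from the draft text.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical union with only trivial members: a built-in assignment to a
// member starts its lifetime and makes it the active member.
union Number {
    int   i;
    float f;
};

inline float switch_by_assignment() {
    Number n;
    n.i = 42;     // i is the active member
    n.f = 1.5f;   // built-in assignment: f is now the active member
    return n.f;   // reading the active member is fine
}

// Hypothetical union with a non-trivial member: std::string needs placement
// new to start its lifetime and an explicit destructor call to end it.
union Text {
    int         code;
    std::string str;
    Text() : code(0) {}
    ~Text() {}    // the user must destroy str by hand while it is active
};

inline std::size_t switch_by_placement_new() {
    using String = std::string;
    Text t;                              // code is the active member
    new (&t.str) String("hexfloat");     // str becomes the active member
    std::size_t length = t.str.size();
    t.str.~String();                     // end the lifetime of str again
    return length;
}
```

Neither function ever reads a member that is not active, so nothing here relies on the union for type punning.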

> It may be wrapped up in other more general text
> about assignment, placement new, or some such.

Indeed. They always try to say each thing in only one place, so it is
sometimes painful to find where that one place is.

> > Reading from not active member
> > is undefined ...

> Do you know where this is stated?

In [basic.life]
http://eel.is/c++draft/basic.life#7
The special guarantee that overrides [basic.life] concerns the common
initial sequence: http://eel.is/c++draft/class.union#1

blt_14rUt9Szv@2mu00w.co.uk: May 21 08:26AM

On Mon, 20 May 2019 19:04:59 +0100
>> 1.234000

>> Sorry, what was that you were saying?

>Sorry, what exactly do you think you were proving?

Are you having a slow brain day or something? You said it wouldn't work on
new compilers with optimisation. I just proved it did just as I've said it
works on every compiler I've ever tried it on.

>Your crap code with undefined behaviour looks as if it is too
>inconsequential for g++ to optimize against it. gcc/g++ will however

LOL, oh please, give it up before you make a complete fool of yourself :)

>warn that your code is not fit for purpose - it tells you that it
>breaks strict aliasing rules.

Clang doesn't and clang is a better compiler all round IMO.

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: May 21 11:49AM +0100

On Tue, 21 May 2019 08:26:45 +0000 (UTC)

> Are you having a slow brain day or something? You said it wouldn't work on
> new compilers with optimisation. I just proved it did just as I've said it
> works on every compiler I've ever tried it on.

The "it" which doesn't work is type punning through casting pointers.
Your silly toy code with undefined behaviour proves absolutely
nothing. An example has already been given up-thread of the
differences which can arise in code emitted, depending on whether the
-fno-strict-aliasing switch is applied or not.

> >Your crap code with undefined behaviour looks as if it is too
> >inconsequential for g++ to optimize against it. gcc/g++ will however

> LOL, oh please, give it up before you make a complete fool of yourself :)

I think you are the only one doing that.

> >warn that your code is not fit for purpose - it tells you that it
> >breaks strict aliasing rules.

> Clang doesn't and clang is a better compiler all round IMO.

So the standard says clearly that it is undefined behaviour but you say
"ignore that because although gcc warns that it breaks strict aliasing,
clang doesn't". If that is your approach to programming then "crap
code" seems like too mild a description.

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: May 21 11:47AM +0100

On Tue, 21 May 2019 09:25:19 +0200
> well defined in the C++ standard and that it is portable across different
> platforms. Fortunately, it works on the platforms that I use.
> Or is there a method that is clearly supported by the C++ standard?

The problem with aliasing arising from type punning concerns
dereferencing pointers which do not represent the "dynamic type" (in C
the "effective type") of the object being pointed to. The compiler is
entitled to assume that the object obtained by dereferencing, say, an
int* is actually an int and not a float. Does your case fall foul of
this?

A memcpy() is just another form of assignment: it so happens that
with a cast the dynamic type of the result of the cast remains the
source type, but with memcpy() it becomes the type of the destination.
However I can see that memcpy() might be problematic with device
registers because I don't think it has any atomicity guarantees. I
guess in that case using a union with volatile members might be the
answer: it is probably supported by your compiler. I don't actually
know what the standard says about unions with volatile members - when I
get a chance I must look it up.

Chris
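
A minimal sketch of the memcpy() approach for ordinary (non-device) memory, assuming a 32-bit float; the helper names `float_bits` and `float_from_bits` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: copy the object representation of a float into an
// integer of the same size. No pointer of the wrong type is dereferenced.
inline std::uint32_t float_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// The reverse direction: build a float from a 32-bit pattern.
inline float float_from_bits(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

On an implementation with IEEE 754 floats, `float_bits(1.0f)` would be expected to give 0x3F800000. Compilers commonly reduce such a fixed-size memcpy() to a register move, though that is an optimisation and not a guarantee.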

Bart <bc@freeuk.com>: May 21 12:35PM +0100

On 21/05/2019 11:49, Chris Vine wrote:

> The "it" which doesn't work is type punning through casting pointers.
> Your silly toy code with undefined behaviour proves absolutely
> nothing.

Undefined behaviour because the language says so.

>> Clang doesn't and clang is a better compiler all round IMO.

> So the standard says clearly that it is undefined behaviour

Only because the language says so.

> but you say
> "ignore that because although gcc warns that it breaks strict aliasing,
> clang doesn't". If that is your approach to programming then "crap
> code" seems like too mild a description.

It's only 'crap code' because the language says so.

My opinion is that such code can be valid, and it can be well-defined
(within a range of machines that might be the only ones you're
interested in). Or the behaviour might be specific to that range of
machines. But that's OK because we're low-level programmers and we know
what we're doing, right?

My approach is not to use C or C++, partly because all their silly rules
on UB (which seem to only exist to enable extreme optimisations) make
life harder.

And to use alternative languages. But if those other languages can
successfully run the same code on the same machines without UB, then
what are C and C++ playing at?

This is a recent bit of code I used (expressed here as C, and inside a
struct definition):

    ...
    union {                  // anonymous union
        int32_t modelist[4];
        int32_t mode;
    };

I want to be able to access (read or write) the first 4 bytes of that
union interchangeably as either .mode or .modelist[0], including writing
as .modelist[0] then reading immediately as .mode.

Isn't that technically UB in C or C++? I don't know, but the important
thing is that I don't need to care!

As for type-punning, in the alternate language I use it is an official
feature! int->float type-punning, in C-like syntax, might be written as
(float@)a, where a is an int, and it can also work as (float@)(a+b).

(The C/C++ idiom would be *(float*)&a which only works on lvalues.)

I don't see it as being anything different from this:

    a:  dd 0                # 32-bit location
        mov [a],eax         # write 32-bit int
        movd xmm0,[a]       # read as 32-bit float

What does the code mean? Well if eax contained 0x3F800000, then it's
writing the binary representation of the IEEE float32 value 1.0.

And the [a] could be [esi] where esi contains a pointer that is
interpreted as int32_t* then float32* on successive lines.

All perfectly reasonable things that you might want to do.

"Fred.Zwarts" <F.Zwarts@KVI.nl>: May 21 01:43PM +0200

"Chris Vine" schreef in bericht
news:20190521114749.d756ce67e9563bfa8c3fd2da@cvine--nospam--.freeserve.co.uk...
>entitled to assume that the object obtained by dereferencing, say, an
>int* is actually an int and not a float. Does your case fall foul of
>this?

Device registers normally do not use floating point types. They usually
contain integer values, or bit patterns. But even if they contain a floating
point type, it may not match the format of the host system, so one has to
separate the mantissa and the exponent and construct a floating point value
from it.
I always use pointers to uint8_t, uint16_t, uint32_t or uint64_t to access
such registers. I don't think that will be a problem.

Jorgen Grahn <grahn+nntp@snipabacken.se>: May 21 12:29PM

On Mon, 2019-05-20, Chris Vine wrote:

> The number of people who don't trouble themselves to understand the
> strict aliasing rules of C and C++ is surprising.

"Don't trouble themselves" is a good way of putting it, because it's
not /hard/ to grasp. If you think of memory as a store for /typed/
objects, plus the extra accommodations for char* and unions, you have
the rough picture.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

David Brown <david.brown@hesbynett.no>: May 21 03:36PM +0200

On 21/05/2019 13:35, Bart wrote:
>> Your silly toy code with undefined behaviour proves absolutely
>> nothing.

> Undefined behaviour because the language says so.

Yes, exactly.

>>> Clang doesn't and clang is a better compiler all round IMO.

>> So the standard says clearly that it is undefined behaviour

> Only because the language says so.

Yes.

It's fine to say you don't like this behaviour, and think that it should
be possible to access data via pointers of any kind. It's fine to say
that you choose to use compilers that implement such semantics - whether
it is always the case in the compiler you use, or whether you need a
switch like "-fno-strict-aliasing" to get that behaviour. You would be
far from alone here.

What is /not/ fine is to say that because /you/ think type-based alias
analysis should not be used for optimisation, it's okay to write code
that messes around with accessing data via incompatible types.

If someone wants to write C code, they should write C code. If you want
to write code in a variant of C with additional semantics, then you need
to take appropriate steps to ensure that you use it with tools that
guarantee those semantics. (And by that, I mean a tool that documents
the behaviour - not a tool that "worked on your test code".) There is
nothing wrong with writing code that is restricted like this, but you
should be clear about it.

None of this is new. The C language standards have been clear since C90
that accessing data via incompatible types is undefined behaviour. gcc
introduced optimisations on type-based alias analysis around 20 years
ago, and they did so because that's what many top-range C and C++
compilers did.

>> clang doesn't". If that is your approach to programming then "crap
>> code" seems like too mild a description.

> It's only 'crap code' because the language says so.

Yes - but since the discussion is about C or C++, it is those languages
and their standards that matter. The standards say accessing a "float"
object through an "int*" pointer is undefined behaviour - therefore it
is undefined behaviour in C and C++, and if you write such code, it is
crap C or C++ code. Particular implementations are free to give a
definition for it - code can be crap C++ code while being valid "MSVC"
code or valid "gcc -fno-strict-aliasing" code.

> interested in). Or the behaviour might be specific to that range of
> machines. But that's OK because we're low-level programmers and we know
> what we're doing, right?

If you use compilers that define the semantics of this kind of
cross-type access, that's fine. If you are writing more general C or
C++, or using other compilers, it is not fine.

> My approach is not to use C or C++, partly because all their silly rules
> on UB (which seem to only exist to enable extreme optimisations) make
> life harder.

I fail to see how type-based alias analysis makes anything harder. In
general, optimisations mean that you can write code in a clearer,
simpler and more maintainable manner and let the compiler handle the
efficiency details.

I mean, how often do you actually want to access float data via an int*
pointer? In real code, it's very rare that this sort of thing crops up.
Being rare, there is no problem using valid alternatives like unions or
memcpy - both of which can be tightly optimised by a compiler.

> And to use alternative languages. But if those other languages can
> successfully run the same code on the same machines without UB, then
> what are C and C++ playing at?

As always, you miss the point of undefined behaviour. And as always,
you grossly overstate its importance. In the great majority of cases,
things that are undefined behaviour in C or C++ are things that would
not turn up in correct, sensible code in the first place.

> as .modelist[0] then reading immediately as .mode.

> Isn't that technically UB in C or C++? I don't know, but the important
> thing is that I don't need to care!

That will be valid in C, but (AFAIUI) invalid in C++. However, as
others have noted, practical C++ compilers will define this behaviour to
have the same meaning as in C. (This is not type punning, since mode
and modelist[0] have the same types.)

> As for type-punning, in the alternate language I use it is an official
> feature! int->float type-punning, in C-like syntax, might be written as
> (float@)a, where a is an int, and it can also work as (float@)(a+b).

And how often is this actually relevant in real code? Or is this just
another one of your "features" whose only purpose is to let you pretend
your language is "better" than C?

Just because something is easy to specify and implement does not mean it
is useful.

David Brown <david.brown@hesbynett.no>: May 21 03:45PM +0200

On 21/05/2019 09:25, Fred.Zwarts wrote:
>> work, and will still work where casting wouldn't (such as when casting
>> would result in misalignment).

> I need type punning often when accessing device registers.

That is strange. I work with device registers all the time, and I
rarely use type punning.

The usual way to access hardware registers is via "volatile uint32_t *"
or similar pointers, with the size you want to use. Often you use
structs rather than individual pointers, but it boils down to the same
kind of volatile accesses.
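
A minimal sketch of that style of access. The helper names and the address in the comment are made up for illustration, and the sketch assumes, as is usual in practice, that the compiler turns each volatile access into one access of the stated width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: each call performs one volatile read or write of
// the stated width, which the compiler must not optimise away.
inline std::uint32_t read_reg32(const volatile std::uint32_t* reg) {
    return *reg;
}

inline void write_reg32(volatile std::uint32_t* reg, std::uint32_t value) {
    *reg = value;
}

inline std::uint16_t read_reg16(const volatile std::uint16_t* reg) {
    return *reg;
}

// On real hardware the pointer would come from an address in the data
// sheet, for example (made-up address):
//   auto* status = reinterpret_cast<volatile std::uint32_t*>(0x40001000u);
```

No type punning is involved: the register is only ever accessed as the type named in the pointer. For testing on a desktop machine, an ordinary `volatile std::uint32_t` variable can stand in for the register.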

"Type punning" is when you have told the compiler that object A is of
type T, and you now want to access it while pretending it is type U.
You are not doing that, as far as I can tell.

> across different platforms. Fortunately, it works on the platforms that
> I use.
> Or is there a method that is clearly supported by the C++ standard?

Exactly how "volatile" works, and in particular for accesses via an
absolute address cast to a pointer-to-volatile, is not as clearly
defined as it could be in C and C++. C17 clarifies it - maybe newer C++
standards inherit this improvement. But all compilers have implemented
it in the same obvious manner.

Juha Nieminen <nospam@thanks.invalid>: May 21 01:52PM

> I didn't even know hexfloat existed. It seems a spectacularly useless
> manipulator. What on earth is the point of it?

If you save a floating point value in ASCII using the normal decimal
representation, in many (perhaps even most) cases there's a high
chance of losing accuracy when it's read back, for the simple
reason that the base-10 representation cannot accurately represent
every single base-2 floating point value.

Base-16 representation, however, can. It exactly represents the
original floating point value, to the last bit, and nothing is
lost in the conversion to either direction.

Its advantage is that it's agnostic to the actual floating point
value binary representation in the hardware (eg. it doesn't
assume that it's an IEEE floating point value of a given size).
Thus exact floating point values can be transferred between
computers that may use different native floating point
formats.

If you save the floating point bits as raw data, you'll at the
very least run into the problem of endianness, and of course you'll
be assuming that both the source and target architectures use the
exact same internal floating point bit representation.
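
A minimal sketch of such a text round trip, assuming C++11 or later. The helper names are made up for illustration, and reading is done with std::strtod, which accepts hexadecimal floating-point input.

```cpp
#include <cassert>
#include <cstdlib>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: write a double as hexadecimal text. The exact
// spelling (letter case, trailing digits) is up to the implementation.
inline std::string to_hex_text(double value) {
    std::ostringstream out;
    out << std::hexfloat << value;
    return out.str();
}

// Hypothetical helper: read the text back with strtod.
inline double from_hex_text(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

With a binary floating-point format on both sides, `from_hex_text(to_hex_text(x))` should compare equal to `x` for any finite `x`, because every hexadecimal digit maps onto exactly four bits of the value.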

Bart <bc@freeuk.com>: May 21 03:45PM +0100

On 21/05/2019 14:36, David Brown wrote:

>> It's only 'crap code' because the language says so.

> Yes - but since the discussion is about C or C++, it is those languages
> and their standards that matter.

But how can code that expresses exactly the same thing be fine in one
language and not in another?

> The standards say accessing a "float"
> crap C or C++ code. Particular implementations are free to give a
> definition for it - code can be crap C++ code while being valid "MSVC"
> code or valid "gcc -fno-strict-aliasing" code.

And here apparently the same code can also be fine in a particular
dialect of C or C++, or even using a particular set of compiler options,
but be 'crap' when someone changes the compiler or option.

> pointer? In real code, it's very rare that this sort of thing crops up.
> Being rare, there is no problem using valid alternatives like unions or
> memcpy - both of which can be tightly optimised by a compiler.

In my own codebase I seem to use cast-based type-punning 5-10 times per
application, but union-based type-punning is used all the time.
Cast-based type-punning is convenient when there is no struct or union
involved.

It is very frequent that I'm interested in interpreting the bytes of a
float as some integer value, or vice versa.

(Here's an example using that non-C language; this prints the underlying
binary bits of 0.1 which is a 64-bit float:

print int64@(0.1):"b"

Output is:

11111110111001100110011001100110011001100110011001100110011010

No faffing about with unions or *(int64_t*)&x casts (which won't work on
0.1), which apparently have UB anyway, or memcpy.)
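
For comparison, a minimal sketch of how the same bits can be obtained in standard C++ with memcpy, assuming a 64-bit double. The helper names are made up for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: the object representation of a double as an integer.
// It takes its argument by value, so it also works on rvalues such as 0.1.
inline std::uint64_t double_bits(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// All 64 bits as text, most significant bit first, leading zeros included.
inline std::string double_bits_text(double value) {
    return std::bitset<64>(double_bits(value)).to_string();
}
```

On an IEEE 754 implementation, `double_bits(0.1)` would be expected to give 0x3FB999999999999A. The text form always has 64 characters, so it carries two leading zeros that the output shown above omits.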

> And how often is this actually relevant in real code? Or is this just
> another one of your "features" whose only purpose is to let you pretend
> your language is "better" than C?

It's 'better' in that it allows this sort of obvious stuff that people
want to write, while C doesn't.

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: May 21 04:04PM +0100

On Tue, 21 May 2019 12:35:02 +0100
> > clang doesn't". If that is your approach to programming then "crap
> > code" seems like too mild a description.

> It's only 'crap code' because the language says so.

Indeed so. It's crap because the standard says it is crap.

> interested in). Or the behaviour might be specific to that range of
> machines. But that's OK because we're low-level programmers and we know
> what we're doing, right?

It is apparent from this thread that at least one person does not know
what he is doing.

You are entitled to your _opinion_ if you are going to write your own
language. I think Alf also shares your view, but recognises (sadly)
that it's not how C and C++ in fact are. Our religious spammer, whilst
initially being ignorant of the strict aliasing rules, has said he has
the same view as you about type punning via pointers and now permits it
in his language (at present mainly vapourware I think, but I may be
wrong). Presumably he is only intending to have his language used on a
platform where alignment is not an issue or will have all objects
padded to the most generic alignment. But we have been talking in this
thread about C and C++. The standards for those languages determine
whether the code is valid C or C++, and the standards say it isn't.

Also, your _opinion_ might be valid if you are writing your own
compiler: you could decide to provide type punning extensions, as
indeed gcc has via unions and/or its -fno-strict-aliasing switch. But
then you cannot call the code valid C or C++. Instead it is code
conforming to your opinion.

> My approach is not to use C or C++, partly because all their silly rules
> on UB (which seem to only exist to enable extreme optimisations) make
> life harder.

What makes you say that memcpy() is harder than making a
reinterpret_cast? I don't see it. It seems to me that the main reason
people don't do it correctly is not the difficulty but that they just
can't be bothered to understand the rules.

Incidentally the standards permit any object pointer to be cast to
char* and unsigned char* and dereferenced, if you want byte level
access to an object.
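
A minimal sketch of that byte-level access; the helper name `object_bytes` is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy out the bytes of any object. Reading an object
// through unsigned char* is one of the accesses the aliasing rules permit.
template <typename T>
std::vector<unsigned char> object_bytes(const T& object) {
    const unsigned char* first =
        reinterpret_cast<const unsigned char*>(&object);
    return std::vector<unsigned char>(first, first + sizeof(T));
}
```

The order of the bytes returned for a multi-byte object depends on the platform's endianness, so only the count is portable.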

> And the [a] could be [esi] where esi contains a pointer that is
> interpreted as int32_t* then float32* on successive lines.

> All perfectly reasonable things that you might want to do.

Possibly, but that is your language. It's not C or C++. You are
converting a discussion about whether particular code is valid C or C++
into one about what some new language of yours should permit.

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: May 21 04:28PM +0100

On Tue, 21 May 2019 15:45:48 +0100
> > and their standards that matter.

> But how can code that expresses exactly the same thing be fine in one
> language and not in another?

What a weird thing to say. Code that expresses exactly the same thing
can be fine in one language but not in another because the respective
standards for those languages say so.

In setting a standard, it is a matter of choice for the language
designer involving trade-offs between amongst other things optimization
opportunities, convenience to the programmer, language complexity and
safety.

Paavo Helde <myfirstname@osa.pri.ee>: May 21 07:20PM +0300

On 21.05.2019 15:29, Jorgen Grahn wrote:
> not /hard/ to grasp. If you think of memory as a store for /typed/
> objects, plus the extra accommodations for char* and unions, you have
> the rough picture.

This page

https://gist.github.com/shafik/848ae25ee209f698763cffee272a58f8

claims that the following is UB in C++.

void *p = malloc(sizeof(float));
float *fp = p;
*fp = 1.0f;

and one should use placement new instead:

new (p) float {1.0f} ;

Now that I find hard to grasp. If this is true, how is it even possible
to write e.g. a custom memory allocator?

A low-level memory allocator typically does not know anything about
float, how is it possible to convert a memory block pointer to float*
which has to be returned from allocate() (and later e.g. from
std::vector<float>::data())? Do I really need to perform a dummy
placement new in the beginning of the memory block, to obtain a valid
float* pointer?
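
For reference, a minimal sketch of the placement-new form that the page recommends, written so that it compiles as C++. The helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical helper: obtain raw storage, then start the lifetime of a
// float in it. The new-expression itself yields the typed float* pointer.
inline float* make_float(float initial) {
    void* raw = std::malloc(sizeof(float));
    if (raw == nullptr) throw std::bad_alloc();
    return new (raw) float{initial};
}

// float has a trivial destructor, so releasing the storage is enough.
inline void destroy_float(float* p) {
    std::free(p);
}
```

Here the low-level allocation step (malloc) still knows nothing about float; only the typed layer on top of it uses placement new.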

Bonita Montero <Bonita.Montero@gmail.com>: May 21 06:23PM +0200

> void *p = malloc(sizeof(float));
> float *fp = p;
> *fp = 1.0f;

LOL.

Paavo Helde <myfirstname@osa.pri.ee>: May 21 07:50PM +0300

On 21.05.2019 16:52, Juha Nieminen wrote:
> chance of losing accuracy when it's read back, for the simple
> reason that the base-10 representation cannot accurately represent
> every single base-2 floating point value.

With enough digits, a decimal (as well as any other) representation can
get arbitrarily close to any real value, so it can also get arbitrarily
close to any value represented exactly in base-2. It is not necessary to
represent the base-2 value exactly; it is enough to provide any
base-10 value which is rounded to the correct base-2 value.

Historically there were indeed some round-trip bugs when serializing
floating-point values, but AFAIK these bugs got fixed in the C runtime
libraries about 10-20 years ago or so. Plus there are libraries which
ensure the minimum number of decimal digits for perfect round-trip.

Maybe you wanted to say that ensuring a proper round-trip is trickier in
base-10 than in base-16 and it may easily waste more bytes than strictly
necessary?
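
A minimal sketch of a decimal round trip, assuming C++11 or later and a standard library whose conversions round correctly. It prints std::numeric_limits<double>::max_digits10 significant digits; the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: write a double with enough decimal digits that the
// nearest double to the printed text is the original value.
inline std::string to_decimal_text(double value) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10)
        << value;
    return out.str();
}

// Hypothetical helper: read the text back with strtod.
inline double from_decimal_text(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

This always prints max_digits10 digits (17 for an IEEE 754 double), which is often more than the shortest text that would round-trip, so it illustrates the "wasted bytes" point as well.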

Bart <bc@freeuk.com>: May 21 06:24PM +0100

On 21/05/2019 16:28, Chris Vine wrote:
> designer involving trade-offs between amongst other things optimization
> opportunities, convenience to the programmer, language complexity and
> safety.

Yes, but in this case what someone may want to do could be completely
reasonable in itself, and could be well-defined on the processors they
know their program will run on.

When you look at the reasons why C and also C++ have made certain things
UB, you could well find they don't apply in your case.

Now, I've frequently written code in one language, which works perfectly
well as native code, but it hits UB if I auto-translate to C. There are
actually more problems generating C source as a target, than generating ASM.

(And actually, when I briefly tried to target C++, there were even more
problems.)

jameskuyper@alumni.caltech.edu: May 21 10:34AM -0700

On Saturday, May 18, 2019 at 6:14:42 AM UTC-4,
...
> I didn't even know hexfloat existed. It seems a spectacularly useless
> manipulator. What on earth is the point of it?

The C++ standard defines the behavior of hexfloat in terms of the
behavior of std::printf() with a "%a" or "%A" format specifier. The C++
standard does not provide a detailed description of the behavior of
printf(), cross-referencing the C standard for that definition. The C
standard defines that behavior in part by recommending that conversions
between strings and floating point values performed by standard library
functions should be the same as those described for floating point
constants. The less important advantage described below therefore traces
back to things explained only in the C standard's description of
floating point constants. The main advantage of hexfloat described below
is far less subtle. It's an inherent consequence of using hexadecimal
rather than decimal notation.

I'll use pi as an example to demonstrate the issue. The first 17
significant digits of pi are

3.1415926535897932

Like most decimal floating point constants, 3.1415926535897932 cannot be
represented exactly (which is a truth distinct from the fact that pi
itself also cannot be represented exactly). On the system I'm currently
using, three consecutive floating point values that can be represented
exactly are:

3.141592653589792671908753618481568992137908935546875
3.141592653589793115997963468544185161590576171875000
3.141592653589793560087173318606801331043243408203125

I've shown 52 significant digits for those numbers, which is the minimum
needed to display those values exactly. 17 digits are sufficient,
however, to uniquely identify the middle value as the closest
representable value to the exact value of pi. However the C standard
permits 3.1415926535897932 to have any one of those three values. That
would be true even if the constant used all 52 significant digits of the
middle representation shown above.
The corresponding hexadecimal floating point constants for those same
values are

0X1.921FB54442D17P+1
0X1.921FB54442D18P+1
0X1.921FB54442D19P+1

The main advantage of hexadecimal floating point is simply that it takes
only 14 significant hexadecimal "digits" to represent those values
exactly, compared with the 17 significant decimal digits needed to
specify a value that's closer to the best value than to either of the
others, and compared with the 52 digits needed to represent those values
exactly.

However, there's another less important advantage. If FLT_RADIX is a
power of 2 (as it is on almost, but not quite, every implementation of C
or C++ currently in use), then 0X1.921FB54442D18P+1 is only allowed to
be represented by that exact value, the other two values are not
permitted as they would be for a decimal floating point constant. Also,
0X1.921FB54442D187P+1 is required to be rounded correctly to
0X1.921FB54442D18P+1; it's not allowed to round to 0X1.921FB54442D19P+1.
That would not be the case for decimal floating point constants.
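
A minimal sketch that checks the three hexadecimal constants quoted above, assuming an IEEE 754 binary64 double. It parses them with std::strtod and uses std::nextafter to confirm that they are consecutive representable values; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// The three hexadecimal constants from the text, parsed at run time.
inline double below()  { return std::strtod("0X1.921FB54442D17P+1", nullptr); }
inline double middle() { return std::strtod("0X1.921FB54442D18P+1", nullptr); }
inline double above()  { return std::strtod("0X1.921FB54442D19P+1", nullptr); }

// True when b is the next representable double above a.
inline bool is_next(double a, double b) {
    return a < b && std::nextafter(a, b) == b;
}
```

Under that assumption, `is_next(below(), middle())` and `is_next(middle(), above())` should both hold, and `middle()` should compare equal to the decimal constant 3.141592653589793 wherever decimal conversion rounds to nearest.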

These issues are important only for values that have so many significant
digits that they push the limits on what values can be represented
exactly. This almost never comes up with numbers that describe real-
world measured quantities unless you're using "float" for your
calculations, in which case the right solution is to use "double". It's
only in high precision scientific calculations or pure mathematics that
you're likely to run into situations where hexfloat actually becomes
important. Therefore, if you aren't doing that kind of work, hexfloat
probably seems pointless to you. However, it can be of critical
importance to people who are doing that kind of work.

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: May 21 08:01PM +0200

On 21.05.2019 18:20, Paavo Helde wrote:
> *fp = 1.0f;

> and one should use placement new instead:

> new (p) float {1.0f} ;

The initialization of `fp` wouldn't compile as C++, but let's assume a
`static_cast` or `reinterpret_cast` there. I favor the latter since it
communicates better to the reader, but I believe that, for reasons having
to do with a shortcoming of C++03, Herb Sutter and Andrei Alexandrescu
recommended using `static_cast` in their old coding guidelines book.

> Now that I find hard to grasp. If this is true, how is it even possible
> to write e.g. a custom memory allocator?

Magic is indeed performed in a `new` expression: it transforms a `void*`
produced by an allocator function, to a typed pointer.

Another place this magic occurs, is in the member functions of a
`std::allocator`. At least in C++03. I'm not as up-to-date as I should
be to participate in C++ discussions.

Anyway, even placement `new` doesn't save one from UB when there is an
object other than byte in that memory chunk, and one obtains a pointer
to it of an unrelated pointee type. Wham bang, you're formally dead.

On the other hand, when there is no object of type other than bytes,
then `reinterpret_cast` is technically good and so is placement `new`.

> std::vector<float>::data())? Do I really need to perform a dummy
> placement new in the beginning of the memory block, to obtain a valid
> float* pointer?

Nah, just FUD.

Cheers!,

- Alf

Paavo Helde <myfirstname@osa.pri.ee>: May 21 09:02PM +0300

On 21.05.2019 20:24, Bart wrote:

> Now, I've frequently written code in one language, which works perfectly
> well as native code, but it hits UB if I auto-translate to C.

Seems like a bug in the auto-translator.

> There are
> actually more problems generating C source as a target, than generating
> ASM.

Which ASM? All the 72 architectures covered by gcc?

> (And actually, when I briefly tried to target C++, there were even more
> problems.)

I bet.

Bonita Montero <Bonita.Montero@gmail.com>: May 21 08:47PM +0200

> communicates better to the reader, but I believe for reasons having to
> do with a shortcoming of C++03 Herb Sutter and Andrei Alexandrescu
> recommended using `static_cast` in their old coding guidelines book.

static_cast, reinterpret_cast or C-style-cast - pure syntactic sugar.

Bart <bc@freeuk.com>: May 21 07:56PM +0100

On 21/05/2019 19:02, Paavo Helde wrote:

>> Now, I've frequently written code in one language, which works perfectly
>> well as native code, but it hits UB if I auto-translate to C.

> Seems like a bug in the auto-translator.

Just a mismatch of languages, even though C in this case superficially
works the same way.

C as an intermediate language, even though it is very frequently used
for that purpose, leaves a lot to be desired.

>> actually more problems generating C source as a target, than generating
>> ASM.

> Which ASM? All the 72 architectures covered by gcc?

The ones I had in mind were x64 and ARM64. I think I decided it would be
simpler to target those two than to try and generate C code which would
always compile warning-free and UB-free.

Are there any others I'm likely to be able to program in consumer
equipment?

I think it is quite common for applications to only need to run on a
small number of architectures, but not want to be inconvenienced by a
language designed to work with every conceivable architecture, past,
present and future, and which therefore has to designate as UB any
behaviour which cannot be guaranteed to work across all of them.

>> (And actually, when I briefly tried to target C++, there were even more
>> problems.)

> I bet.

I haven't tried it for a while. If I try it now on a smallish 3200-line
generated-C program, I get the following number of lines of errors and
warnings:

       No options (just -c)   Lots of -W options

gcc    0 lines                2900 lines
g++    1150 lines             2850 lines

Typical error from the 1150-line output is:

jpeg.c:420:5: error: invalid conversion from 'int64 (*)(jpeg_stream*)'
{aka 'long long int (*)(jpeg_stream*)'} to 'void*' [-fpermissive]

(Actually, this line has nothing to do with the application, but this
language generates some metadata which includes an array of pointers to
all functions used in the program. Since every function would have its
own pointer type which depends on its signature, what should be the
array element type?

As far as I'm concerned, any function pointer can be stored within the
same space as a void* pointer on all targets I want this to run on. It
should be a non-issue.)
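
One possible answer to the element-type question, sketched under assumptions: the functions `decode_block` and `checksum` and the alias `generic_fn` are invented stand-ins, not names from the generated program. C++ does not convert a function pointer to `void*` implicitly, which is what the g++ error above complains about. Converting between function pointer types with `reinterpret_cast`, and back to the original type before calling, is allowed, so a generic function pointer type can serve as the table element:

```cpp
#include <cassert>

// Hypothetical stand-ins for generated functions with different signatures.
inline long long decode_block(int x) { return x * 2LL; }
inline int checksum(const char* s) { return s[0]; }

// Generic function-pointer type used as the table element (an assumed name).
using generic_fn = void (*)();

// Metadata table: every entry is converted explicitly to the generic type.
inline const generic_fn* function_table() {
    static const generic_fn table[] = {
        reinterpret_cast<generic_fn>(&decode_block),
        reinterpret_cast<generic_fn>(&checksum),
    };
    return table;
}

// An entry has to be converted back to its original type before it is called.
inline long long call_decode_block(int x) {
    auto fn = reinterpret_cast<long long (*)(int)>(function_table()[0]);
    return fn(x);
}
```

Calling an entry through `generic_fn` directly, without converting it back, would still be undefined, so the table only works as storage plus whatever information records the real signatures.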

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: May 21 08:04PM +0100

On Tue, 21 May 2019 19:20:46 +0300
> std::vector<float>::data())? Do I really need to perform a dummy
> placement new in the beginning of the memory block, to obtain a valid
> float* pointer?

Looking at operator new() rather than malloc() (both of which do pretty
much the same thing), the C++14 standard (§3.7.4.1/2) says

"The pointer returned shall be suitably aligned so that it can be
converted to a pointer of any complete object type with a fundamental
alignment requirement (3.11) and then used to access the object or
array in the storage allocated (until the storage is explicitly
deallocated by a call to a corresponding deallocation function)."

So you can use the storage provided by operator new() to access an
object constructed in that storage. The dynamic type of the allocated
storage is in effect the type of the first object constructed in it
(here, a float). So for trivial types (which do not need to execute a
constructor and destructor) I don't think it is necessary to use
placement new - I think you can just assign or memcpy() into the memory.
That is a feature of trivial types.

It would be a ridiculous interpretation of the standard that malloc()
operates differently. Footnote 36 itself says:

"The intent is to have operator new() implementable by calling
std::malloc() or std::calloc(), so the rules are substantially the
same. C++ differs from C in requiring a zero request to return a
non-null pointer."

The dynamic type of malloc()'ed memory must surely, as in the case of
the effective type in C, arise upon first construction of an object in
that memory, either by placement new (C++) or assignment or memcpy()
(C, and C++ trivial types).
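
A rough sketch of that pattern, with made-up helper names (`clone_float`, `destroy_float`, `demo_round_trip`). It deliberately keeps a placement new for the `float` before the memcpy(), since whether assignment or memcpy() alone is enough is exactly the point being argued here. For a trivial type the placement new should cost nothing at run time:

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical helper: gets raw storage from operator new(), starts the
// lifetime of a float there, then copies the object representation in.
inline float* clone_float(const float& source) {
    void* raw = ::operator new(sizeof(float));
    float* p = ::new (raw) float;            // trivial default-initialization
    std::memcpy(p, &source, sizeof(float));  // fills in the value bytes
    return p;
}

// float is trivially destructible, so only the storage needs releasing.
inline void destroy_float(float* p) {
    ::operator delete(p);
}

// Hypothetical demo: the copy compares equal to the original, byte for byte.
inline float demo_round_trip() {
    const float original = 3.5f;
    float* copy = clone_float(original);
    assert(std::memcmp(copy, &original, sizeof(float)) == 0);
    const float result = *copy;
    destroy_float(copy);
    return result;
}
```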

Chris

Tuesday, May 21, 2019