- std::hexfloat - 25 Updates
Hans Bos <hans.bos@xelion.nl>: May 22 07:24PM +0200 Op 22-5-2019 om 14:23 schreef Paavo Helde: >> *always* outputs the exact amount of digits to represent the value >> accurately. > Yes, hexfloat is defined as an exact representation. But there is no guarantee that the radix is 2. Suppose my system has doubles with radix 10. What, in that case, is the exact hex representation of 0.1? |
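A minimal sketch of the radix point raised above: a hexadecimal expansion is finite for every value only when the floating-point radix is a power of two. The helper names are hypothetical and not part of the thread; `std::numeric_limits<double>::radix` is the standard way to query the radix. With radix 10, a value such as 0.1 has no finite hexadecimal expansion.

```cpp
#include <limits>

// Hypothetical helper: true when `radix` is a power of two (2, 4, 8, 16, ...).
constexpr bool radix_is_power_of_two(int radix) {
    return radix > 0 && (radix & (radix - 1)) == 0;
}

// Hypothetical helper: true when every finite double can be written
// exactly with a finite number of hexadecimal digits.
constexpr bool hexfloat_is_exact_for_double() {
    return radix_is_power_of_two(std::numeric_limits<double>::radix);
}
```

On a system with decimal doubles, `hexfloat_is_exact_for_double()` would return false, which is the case the question is about.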
James Kuyper <jameskuyper@alumni.caltech.edu>: May 22 08:40AM -0400 On 5/22/19 2:40 AM, Juha Nieminen wrote: ... > How would you know, using standard C/C++, how many digits do you need to > output in order to ensure no loss of bits when reading the value back? #include <float.h>, and look at the value of FLT_DECIMAL_DIG, DBL_DECIMAL_DIG, or LDBL_DECIMAL_DIG, as appropriate. Paavo has already given you the modern C++ equivalent, but this will also work with C and with older versions of C++. ... > It is my understanding that hexadecimal floating point representation > *always* outputs the exact amount of digits to represent the value > accurately. It would be more accurate to say that this is true by default. If you specify a particular length, it will obey your specification, whether or not you specify enough digits to meet that requirement. |
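The decimal round trip described above can be sketched as follows. This assumes `std::numeric_limits<double>::max_digits10` (the C++ value corresponding to `DBL_DECIMAL_DIG`) and a library that rounds correctly in both directions; the function names are hypothetical.

```cpp
#include <cstdlib>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: formats a double with max_digits10 significant
// decimal digits, which is enough to identify the value uniquely.
std::string to_roundtrip_decimal(double value) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
    return out.str();
}

// Hypothetical helper: parses the text back with std::strtod.
double from_text(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

If the library conversions are correctly rounded, `from_text(to_roundtrip_decimal(x))` should compare equal to `x` for any finite `x`.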
Paavo Helde <myfirstname@osa.pri.ee>: May 22 04:42PM +0300 On 22.05.2019 16:10, Bart wrote: > Here's one very basic example: if A is signed, and B unsigned, then my > language says that A+B is performed as signed, with overflow > well-defined, and at at least 64 bits. I don't question your design, I'm just curious: what would be the use case of 64-bit signed wrapover? I.e. in what situation is it useful to have 9223372036854775807 + 1 == -9223372036854775808 For unsigned wrapover in C and C++ at least there is a use case for emulating hardware bit registers, or to have an automatic reset for some generated ID numbers. > extra ones. > With + it doesn't matter, but what about * or /? And we've only just > tried to translate A+B! Looks like this should be translated to C++, not to C, with appropriate C++ number-like classes and custom arithmetic operators. > So it is easy to see that it can be a considerably bigger pain to > generate perfectly correct C, than to generate ASM. I personally have found almost *everything* a bigger pain in C than, say, C++. |
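As a rough illustration of the "number-like class with custom arithmetic operators" idea, here is a sketch of a 64-bit signed type whose addition wraps. The type name `WrappingInt64` is hypothetical. The sum is computed in unsigned arithmetic, which is defined to wrap, and the bit pattern is copied back, relying on `std::int64_t` being two's complement.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical wrapper type: a signed 64-bit value with wrapping addition.
struct WrappingInt64 {
    std::int64_t value;
};

inline WrappingInt64 operator+(WrappingInt64 a, WrappingInt64 b) {
    // Unsigned addition is defined to wrap modulo 2^64.
    const std::uint64_t sum =
        static_cast<std::uint64_t>(a.value) + static_cast<std::uint64_t>(b.value);
    // Copy the bit pattern back; std::int64_t is two's complement by definition.
    WrappingInt64 result;
    std::memcpy(&result.value, &sum, sizeof result.value);
    return result;
}
```

A code generator could emit `WrappingInt64` operations instead of raw `+`, keeping the source language's overflow rule in one place.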
Ben Bacarisse <ben.usenet@bsb.me.uk>: May 20 09:17PM +0100 > (7.1) -- the glvalue is used to access the object, [...] > Taken together I think these passages make the case pretty > airtight. Ah, thank you so much. I hope this did not take too long. I thought it would be the result of a number of passages but it never occurred to me to look at lifetime. -- Ben. |
scott@slp53.sl.home (Scott Lurndal): May 22 06:19PM >> Actually at least one major processor vendor has been thinking about >> changing this in the future..... >Details, please? Unfortunately they're not public. However, there are public projects that propose new pointer formats, for example: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/cheri-faq.html |
Paavo Helde <myfirstname@osa.pri.ee>: May 22 08:52PM +0300 On 22.05.2019 20:35, Bart wrote: > common sense compilers. > Is /that/ is supposed to be preferable? > How on earth could an clear 0 to 9 loop turn into an endless loop? Just to be clear: I do not advocate having UB. I advocate producing an error if the operation cannot be completed as intended. The ideal output would be: 2147483644 2147483645 2147483646 2147483647 Program terminated because of uncaught exception: numeric overflow in 'a++;' with a=2147483647. |
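The "error instead of UB" behaviour asked for above can be approximated in user code. This is only a sketch with a hypothetical function name; it checks before incrementing, so the signed overflow never actually happens.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical helper: increments `a`, or throws if the result would not fit.
int checked_increment(int a) {
    if (a == std::numeric_limits<int>::max()) {
        throw std::overflow_error("numeric overflow in increment");
    }
    return a + 1;
}
```

Left uncaught, the exception terminates the program, which is close to the ideal output described in the post.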
David Brown <david.brown@hesbynett.no>: May 22 12:18AM +0200 On 21/05/2019 21:53, Scott Lurndal wrote: > Actually at least one major processor vendor has been thinking about > changing this in the future..... > And it certainly wasn't true in the past. It is not true at the moment either. There are more processors around than just x86 and ARM. (I know you, Scott, know this - I am expanding on your post, not correcting it.) And of course, the size of pointers has absolutely /nothing/ to do with the undefined nature of trying to access an object through a pointer to a different type. |
Ben Bacarisse <ben.usenet@bsb.me.uk>: May 20 09:21PM +0100 >> I consider that to be allowed. There's no way that C can't specify the >> result, so this is as "defined" a construct as it can. > Did you mean "there's no way that C /can/ specify the result" ? <sigh> Yes I did. Far too many of my typos negate my meaning. > If so, then that is not quite true - C99 (and C11) specify it better, > even though the final result is still implementation dependent. Sure, they are clearer in a footnote (I take it you refer to the "reinterpret the bits" footnote). Given the context (the late 80s) I don't think anyone was ever in much doubt about what the implementation defined result would be. Those were simpler times! <cut> -- Ben. |
Keith Thompson <kst-u@mib.org>: May 22 10:41AM -0700 >>every type and always will be. > Actually at least one major processor vendor has been thinking about > changing this in the future..... Details, please? > And it certainly wasn't true in the past. True. -- Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> Will write code for food. void Void(void) { Void(); } /* The recursive call of the void */ |
scott@slp53.sl.home (Scott Lurndal): May 22 06:09PM >>It is not true at the moment either. There are more processors around >So which architectures have a variable number of memory addressing bits >depending on what C type is stored at the address then? More often than not, it is function pointers that are a different size than pointers to other objects; but it depends entirely on the architecture. Some (now basically extinct) Burroughs systems, for example, have 80-bit function pointers, but 32-bit data pointers. Not uncommon for segmented architectures (a la 80286). I can't address future processor vendor work in this area without violating non-disclosure agreements, unfortunately. |
Juha Nieminen <nospam@thanks.invalid>: May 22 06:40AM > floating-point values, but AFAIK these bugs got fixed in the C runtime > libraries about 10-20 years ago or so. Plus there are libraries which > ensure the minimum number of decimal digits for perfect round-trip. How would you know, using standard C/C++, how many digits do you need to output in order to ensure no loss of bits when reading the value back? (And this is assuming that the C or C++ standard library being used has been implemented such that given enough decimal digits, they will be rounded in the correct direction so as to restore the original value exactly.) It is my understanding that hexadecimal floating point representation *always* outputs the exact amount of digits to represent the value accurately. |
Bonita Montero <Bonita.Montero@gmail.com>: May 22 07:20AM +0200 > That depends on what you mean by "syntactic sugar". static_cast > will carry out pointer adjustment so it can be used to navigate > an inheritance graph correctly. "syntactic sugar" was related to the case above. |
Ian Collins <ian-news@hotmail.com>: May 22 04:33PM +1200 On 22/05/2019 12:07, Bart wrote: >> bugs nonetheless. > You can call a unwillingness to expend a huge, disproportionate effort > in overcoming C's many shortcomings for this purpose a bug if you like. That's never stopped you expending a huge, disproportionate effort in whinging about C. That time would have easily been enough to fix your code. > g++ (no options): > t.c:66:5: error: invalid conversion from 'void (*)()' to 'void*' > [-fpermissive] The conversion is "conditionally-supported" in C++>=11 which makes it a "program construct that an implementation is not required to support". Thus: $ clang++ -std=c++98 -Wall -Werror -Wextra -pedantic /tmp/x.cc /tmp/x.cc:6:14: error: cast between pointer-to-function and pointer-to-object is an extension [-Werror,-Wpedantic] t_fnptr = (void*)(&puts); ^~~~~~~~~~~~~~ 1 error generated. $ clang++ -std=c++11 -Wall -Werror -Wextra -pedantic /tmp/x.cc $ -- Ian. |
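One portable alternative to storing a function pointer in a `void*` is to store it in a generic function pointer type instead. C++ allows a function pointer to be converted to another function pointer type and back with `reinterpret_cast`, yielding the original value. The sketch below uses hypothetical names throughout, and `sample_length` is only a stand-in for a function such as `puts`.

```cpp
#include <cstring>

using text_fn = int (*)(const char*);  // the "real" signature
using generic_fn = void (*)();         // storage type for any function pointer

// Hypothetical stand-in function with the text_fn signature.
int sample_length(const char* s) {
    return static_cast<int>(std::strlen(s));
}

// Store a typed function pointer as a generic one.
generic_fn erase_type(text_fn fn) {
    return reinterpret_cast<generic_fn>(fn);
}

// Recover the original type before calling; calling through generic_fn
// directly would be undefined behaviour.
text_fn restore_type(generic_fn fn) {
    return reinterpret_cast<text_fn>(fn);
}
```

The stored pointer must be converted back to its original type before it is called.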
Ian Collins <ian-news@hotmail.com>: May 23 07:40AM +1200 On 23/05/2019 01:10, Bart wrote: > Here's one very basic example: if A is signed, and B unsigned, then my > language says that A+B is performed as signed, with overflow > well-defined, and at at least 64 bits. So use specialised C++ number classes. There are many things that can't easily be expressed in C that are simple to do in C++. -- Ian. |
David Brown <david.brown@hesbynett.no>: May 22 12:13AM +0200 On 21/05/2019 20:56, Bart wrote: >> Seems like a bug in the auto-translator. > Just a mismatch of languages, even though C in this case superficially > works the same way. In other words, bugs in the auto-translator. The flaws are in the design and specification, rather than the implementation, but they are bugs nonetheless. In order to translate code from one language to another, you need to understand both languages. And you need to generate correct and valid code - not something that looks a bit like what you would have liked the target language to be. > C as an intermediate language, even though it is very frequently used > for that purpose, leaves a lot to be desired. It is fine as a target language, but you need to generate correct C code. (Alternatively, you need to generate "C for this compiler, this target and these options" code - and be honest about it. That is a perfectly reasonable solution, and the one used by most code generators.) Other people who write code generators or translators that produce C manage it. And when their generated code has flaws, they blame their generators - not the language. > The ones I had in mind were x64 and ARM64. I think I decided it would be > simpler to target those two than to try and generate C code which would > always compile warning-free and UB-free. You understand how assembly works. You are willing to use features of assemblers. You don't understand how C works. You are unwilling to use many features of the language. It is not surprising that you find generating assembly easier than generating C code. > Are there are any others I'm likely to be able to program in consumer > equipment? Since your languages and tools are for you alone, it is up to you to answer that one. 
> language designed to work with every conceivable architecture, past, > present and future, and which therefore have to designate as UB, > behaviour which cannot be guaranteed to work across all of them. I agree that code rarely has to be very portable. Of course, I disagree about your characterisation of UB - in particular, it does not make sense to suggest that code with undefined behaviour could work at all. By the meaning of the words, code with undefined behaviour does not have any definition of what it is supposed to do, and therefore cannot be considered to "work". At best, you mean the code should do what it looks like you think it should do. That might be okay to a human reader, but computers are fussy about definitions. > As far as I'm concerned, any function pointer can be stored within the > same space as a void* pointer on all targets I want this to run on. It > should be a non-issue.) C and C++ do not share your opinion - and you are asking the compiler to treat your code as (approximately) standard C or C++. However, gcc (and all other serious compilers) give you a lot of flexibility about choosing warnings and other options, precisely to let you tune the details of the language you want. If you want to generate code that only works on platforms where you can store a function pointer in a void* pointer (though I can't imagine why it would be useful), you can tune your options to suit. Perhaps try with "-fpermissive" ? |
Juha Nieminen <nospam@thanks.invalid>: May 21 01:52PM > I didn't even know hexfloat existed. It seems a spectacularly useless > manipulator. What on earth is the point of it? If you save a floating point value in ASCII using the normal decimal representation, in many (perhaps even most) cases there's a high chance of losing accuracy when it's read back, for the simple reason that the base-10 representation cannot accurately represent every single base-2 floating point value. Base-16 representation, however, can. It exactly represents the original floating point value, to the last bit, and nothing is lost in the conversion in either direction. Its advantage is that it's agnostic to the actual floating point value binary representation in the hardware (eg. it doesn't assume that it's an IEEE floating point value of a given size). Thus exact floating point values can be transferred between computers that may use different native floating point formats. If you save the floating point bits as raw data, you'll at the very least run into the problem of endianness, and of course you'll be assuming that both the source and target architectures use the exact same internal floating point bit representation. |
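A small sketch of the hexadecimal round trip described above, with hypothetical function names. Writing uses `std::hexfloat`; reading uses `std::strtod`, which accepts hexadecimal floating-point input. The sketch assumes a binary floating-point radix and a library that ignores the stream precision for `hexfloat` output.

```cpp
#include <cstdlib>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: writes the value as hexadecimal floating-point text.
std::string to_hexfloat(double value) {
    std::ostringstream out;
    out << std::hexfloat << value;
    return out.str();
}

// Hypothetical helper: reads hexadecimal floating-point text back.
double from_hexfloat(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

Under those assumptions the text form carries the value exactly, independent of byte order.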
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: May 21 08:47PM +0100 On Tue, 21 May 2019 19:28:17 +0000 (UTC) > >-fno-strict-aliasing switch is applied or not. > And someone else explained what happened which was nothing to do with type > punning. Nonsense - see below. > every type and always will be. > However, feel free to provide a proper example where it fails or you can just > keep on farting out indignant hot air. You choice. No, still wrong I am afraid. I repeat: "The 'it' which doesn't work is type punning through casting pointers." It doesn't work because it doesn't work, end of story. Your bluster that when the standard says incorrect aliasing gives rise to undefined behaviour it really means that it works fine in your toy programs, is crap. And the example to which I referred was corrected (20 May 2019 21:49:30 +0200) and did demonstrate the strict aliasing issue. If you want another example, there is one in the posting of another person: int foo( float *f, int *i ) { *i = 1; *f = 0.f; return *i; } int main() { int x = 0; std::cout << x << "\n"; // Expect 0 x = foo(reinterpret_cast<float*>(&x), &x); std::cout << x << "\n"; // Expect 0? } You will find many similar examples in articles on the internet about strict aliasing. |
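For contrast with the pointer-cast example above, the usual well-defined way to look at a float's bits is to copy them with `std::memcpy`. This is a sketch with hypothetical function names, and it assumes `float` is 32 bits wide, which the `static_assert` checks.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: returns the bit pattern of `value` without aliasing.
std::uint32_t float_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Hypothetical helper: rebuilds a float from a bit pattern.
float float_from_bits(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

No `float` object is ever accessed through an `int` lvalue here, so the strict aliasing rule is not involved.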
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: May 21 08:01PM +0200 On 21.05.2019 18:20, Paavo Helde wrote: > *fp = 1.0f; > and one should use placement new instead: > new (p) float {1.0f} ; The initialization of `fp` wouldn't compile as C++, but let's assume a `static_cast` or `reinterpret_cast` there. I favor the latter since it communicates better to the reader, but I believe for reasons having to do with a shortcoming of C++03 Herb Sutter and Andrei Alexandrescu recommended used `static_cast` in their old coding guidelines book. > > Now that I find hard to crasp. If this is true, how is it even possible > to write e.g. a custom memory allocator? Magic is indeed performed in a `new` expression: it transforms a `void*` produced by an allocator function, to a typed pointer. Another place this magic occurs, is in the member functions of a `std::allocator`. At least in C++03. I'm not as up-to-date as I should be to participate in C++ discussions. Anyway, even placement `new` doesn't save one from UB when there is an object other than byte in that memory chunk, and one obtains a pointer to it of an unrelated pointee type. Wham bang, you're formally dead. On the other hand, when there is no object of type other than bytes, then `reinterpret_cast` is technically good and so is placement `new`. > std::vector<float>::data())? Do I really need to perform a dummy > placement new in the beginning of the memory block, to obtain a valid > float* pointer? Nah, just FUD. Cheers!, - Alf |
David Brown <david.brown@hesbynett.no>: May 20 09:57PM +0200 On 20/05/2019 20:20, Tim Rentsch wrote: > changed in C90, or changed between C90 and C99, such a change > surely would have been mentioned in the Rationale documents. > AFAICT there isn't any. I don't think that all the changes in C99 are covered in the rationale documents (at least, not that I have seen). However, I am happy to believe that the intended behaviour for unions has not changed between C90 and C99, and it is merely the wording that has been made clearer. |
Keith Thompson <kst-u@mib.org>: May 22 10:15AM -0700 > You can already do that with standard hex: > float f = 1.234; > cout << hex << *((long *)&f) << endl; That assumes that long and float have the same size (which is not guaranteed) and that &f is correctly aligned for a long* (which is also not guaranteed). If it happens to work, it prints the *representation* of f, not its value (if long has no padding bits or trap representations). If you want a safe way to print its representation, you can reinterpret it as an array of unsigned char -- but the result is still not directly usable on a system with a different representation for float. std::hexfloat gives you an unambiguous character sequence representing the *value* of f. float f = 1.234; std::cout << std::hexfloat << f << std::endl; The output I get is 0x1.3be76cp+0 which also happens to be a valid literal. It's exact if the floating-point radix is two (or a power of two). An exact decimal representation of that value is 1.2339999675750732421875 (on a typical system). You could print "1.234" and it would *probably* yield the same value if converted back from a string to a float. Hexadecimal floating-point isn't as human-readable as decimal, of course (for most humans). But it's unambiguous, and it means you don't have to think about representation or about loss of precision when converting back and forth between binary and decimal. -- Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> Will write code for food. void Void(void) { Void(); } /* The recursive call of the void */ |
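The "array of unsigned char" approach mentioned above might look like the following sketch. The function name is hypothetical, and the output assumes 8-bit bytes. The bytes are listed in memory order, so the text differs between little-endian and big-endian machines, which is exactly why it is not portable as a data format.

```cpp
#include <cstddef>
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: dumps the object representation of a float as
// two hex digits per byte, in memory order.
std::string representation_hex(float value) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);

    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < sizeof value; ++i) {
        out << std::setw(2) << static_cast<unsigned>(bytes[i]);
    }
    return out.str();
}
```

Unlike the `long*` cast, this makes no assumption about the size or alignment of any integer type.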
David Brown <david.brown@hesbynett.no>: May 22 01:22PM +0200 On 22/05/2019 02:07, Bart wrote: >> bugs nonetheless. > You can call a unwillingness to expend a huge, disproportionate effort > in overcoming C's many shortcomings for this purpose a bug if you like. You are happy to classify your wilful and determined ignorance of C as a bug in yourself? Okay, I suppose. Certainly the idea that this is all a "huge, disproportionate effort" is your own personal problem. Undefined behaviours in C are mostly quite clear and obvious, you rarely meet them in practice, and they are mostly straightforward to handle. For a language generator, they are peanuts to deal with. These have been explained to you countless times. Of course, dealing with them nicely and efficiently involves macros and the C preprocessor. But it is apparently far better to whine and moan about deficiencies in C than to use the features of C to get what you need. > not-quite-so-unorthogonal type system, which is 32-bit-based even when > the final target is 64-bit, with its million and one quirks, and which > doesn't quite match that of the target language. C is not based on any hardware model - it is more abstract. Yes, putting that in between the two layers that have matching models will cause complications, and you will have to be careful to get it right. But as abstract models go, C's is not difficult to comprehend. >> It is fine as a target language, but you need to generate correct C code. > Which means what? So that there are 0 errors and 0 warnings no matter > what options somebody will apply? No. It means that there are no errors in the code, based on whatever restrictions you might want to place on how it is used. If you want to generate fully portable C code (matching a particular standard), then do so. If you want to generate code that has limitations on the compiler or flags needed, then do so - but make sure that you document the restrictions. 
Far and away the best choice here is to use conditional compilation and compiler detection. For example, if you want to allow casting between different pointer types to work for punning, and you want wrapping overflow behaviour to match your source language, then try something like this:

#ifdef __GNUC__
/* Set options needed by gcc and clang for desired C variant */
#pragma GCC optimize "-fno-strict-aliasing"
#pragma GCC optimize "-fwrapv"
#pragma GCC diagnostic ignored "-Wformat"
#elif defined(_MSC_VER)
/* Set options needed by MSVC for desired C variant */
#elif defined(_BART_C)
/* Bart's C compiler already supports Bart C */
#else
#error Untested compiler - remove this and compile at your own risk
#endif