soft and program: Digest for comp.lang.c++@googlegroups.com

CHAR_BIT is not eight - 8 Updates
Type conversion for hundreds of lines - 11 Updates
Why doesn't my simple custom exception's what() return any text? - 1 Update
creating a new local variable after a goto - 4 Updates
I will cut off cross-posting - 1 Update

CHAR_BIT is not eight

Keith Thompson <Keith.S.Thompson+u@gmail.com>: Oct 12 05:16PM -0700

> __mov_byte() intrinsics described in Section 7.6"

> So there you go: There are C++ compilers in use today in the year 2022
> with CHAR_BIT something other than 8.

Apparently the TMS320C28x is a DSP (Digital Signal Processor). I've
heard that it's common for CHAR_BIT to be 16 or 32 in implementations
for DSPs.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Juha Nieminen <nospam@thanks.invalid>: Oct 13 08:02AM

> So there you go: There are C++ compilers in use today in the year 2022 with CHAR_BIT something other than 8.

The vast, *vast* majority of C and C++ programmers assume that 'char' is
always 8-bit, and sometimes write code making that assumption. It's
*extremely* rare to see any code out there that uses CHAR_BIT at all,
but it's quite common to see code that assumes that it's 8. Most C and
C++ programmers just assume that it's a de-facto universal standard,
and that the actual C and C++ standards are just antiquated in this
regard, by keeping "backwards compatibility" with some more exotic
CPUs from the 1970's that have been completely obsolete for decades.

I suppose there's a good reason why the language standards don't assume
that 'char' is 8 bits, after all.

Muttley@dastardlyhq.com: Oct 13 08:08AM

On Thu, 13 Oct 2022 08:02:59 -0000 (UTC)
>CPUs from the 1970's that have been completely obsolete for decades.

>I suppose there's a good reason why the language standards don't assume
>that 'char' is 8 bits, after all.

A number of current Texas Instruments DSPs have 16-bit chars.

Personally I always use int8_t or uint8_t if I'm writing portable code just
to be 100% sure.

Frederick Virchanza Gotham <cauldwell.thomas@gmail.com>: Oct 13 02:06AM -0700

On Thursday, October 13, 2022, Juha Nieminen wrote:
> CPUs from the 1970's that have been completely obsolete for decades.

> I suppose there's a good reason why the language standards don't assume
> that 'char' is 8 bits, after all.

I totally agree. I've been using C++ compilers for about 20 years now, and yesterday was the first time I encountered CHAR_BIT != 8. Lots of my code uses "char unsigned" and "uint8_t" interchangeably.

On Thursday, October 13, 2022, Mut...@... wrote:
> A number of current Texas Instruments DSPs have 16 bit chars.

> Personally I always use int8_t or uint8_t if I'm writing portable
> code just to be 100% sure.

I'm writing a 'universal header file' today that will be used in a program on:
1) x86_64 desktop PC
2) microcontroller Texas Instruments F2809 (with 16-Bit bytes)
3) microcontroller Arduino sam3x8e

If I include "cstdint", then it doesn't have "uint8_t" on the Texas Instruments compiler. So I'm using "uint_least8_t" in the code.
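
A minimal sketch of that approach, assuming a C++11 compiler on every target. The names octet_t, to_octet and add_octets are made up for illustration and do not come from the thread. uint_least8_t is required to exist everywhere, but on a 16-bit-char target it holds more than 8 bits, so the code masks explicitly wherever octet behaviour matters:

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical names, for illustration only.
// uint_least8_t always exists; uint8_t is optional and is missing
// when CHAR_BIT != 8 (for example with 16-bit chars).
typedef std::uint_least8_t octet_t;

static_assert(CHAR_BIT >= 8, "the standard guarantees at least 8 bits per char");

// Keep only the low 8 bits, so the result behaves like an octet even
// if octet_t is physically wider than 8 bits.
inline octet_t to_octet(unsigned long value)
{
    return static_cast<octet_t>(value & 0xFFul);
}

// Add two octets with wraparound at 256, independent of CHAR_BIT.
inline octet_t add_octets(octet_t a, octet_t b)
{
    return to_octet(static_cast<unsigned long>(a) + b);
}
```

With 8-bit chars the mask is redundant; with 16-bit chars it is what keeps add_octets(200, 100) at 44 instead of 300.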

Bo Persson <bo@bo-persson.se>: Oct 13 11:10AM +0200

On 2022-10-13 at 10:02, Juha Nieminen wrote:
> CPUs from the 1970's that have been completely obsolete for decades.

> I suppose there's a good reason why the language standards don't assume
> that 'char' is 8 bits, after all.

Often you *can* make the assumption that you have 8-bit chars, and
really not care about any other options.

A lot of code will not run on a signal processor anyway, because it uses
resources that are not available there, like a database or a desktop
user interface.

David Brown <david.brown@hesbynett.no>: Oct 13 11:38AM +0200

On 13/10/2022 02:16, Keith Thompson wrote:

> Apparently the TMS320C28x is a DSP (Digital Signal Processor). I've
> heard that it's common for CHAR_BIT to be 16 or 32 in implementations
> for DSPs.

Yes, that's correct. The TMS320 family are widely used in industrial
electronics and high reliability or rough environment systems. They
have 16-bit char.

Most DSPs these days are very specialised devices, only programmed by a
very few people and with virtually no overlap with "normal" programming.
It's much easier to use a "normal" processor with SIMD and vector
instructions to do the same job that previously needed a DSP for speed.
But a DSP might have instructions that can handle multiple memory
accesses from different ram banks, multiply-accumulate-saturate
operations, pointer update with circular buffer wrapping or FFT
bit-twiddling, loop counter decrement and checking, all within a single
one-clock instruction. You don't really program these in C (much less
C++) - it's really assembly, wrapped in a C intrinsic.

However, these are usually "hidden" coprocessors, using
manufacturer-written binary blobs, rather than programmed by "normal"
programmers. So your mobile phone chip might have a bunch of normal ARM
cores and a hidden DSP for software-defined radio that you never access
directly.

There are also some DSP cores with odder char sizes than 16 or 32,
including soft cores where the size of a "byte" is determined when you
add the core to your ASIC or FPGA. 24-bit used to be very common for
audio and visual systems.

The TMS320 and related devices from Texas Instruments are one of the few
types of DSP that are still used by a wide variety of developers. Just
be careful of TI's development tools - they have a tradition of being a
little loose on standards conformance. A particularly common feature of
many of TI's tools is that they don't bother zero-initialising
statically allocated implicitly initialised data (the ".bss" segment).
That one caused me a lot of "fun" on a couple of occasions, once with a
TMS320F28x device and once with a very different kind of microcontroller.

David Brown <david.brown@hesbynett.no>: Oct 13 11:42AM +0200

On 13/10/2022 11:06, Frederick Virchanza Gotham wrote:
> 2) microcontroller Texas Instruments F2809 (with 16-Bit bytes)
> 3) microcontroller Arduino sam3x8e

> If I include "cstdint", then it doesn't have "uint8_t" on the Texas Instruments compiler. So I'm using "uint_least8_t" in the code.

That's one solution, yes.

Another is to avoid 8-bit types entirely, and use uint16_t as your
smallest type. That makes it easier to be sure that types and structs are
the same size on every target. (This depends on what you need to do in
the header, of course.) Static assertions are your friend to ensure
that everything (such as struct size) is as you expect on all targets.
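
As a rough sketch of such static assertions, assuming C++11 and a made-up struct (SensorFrame and its fields are hypothetical, not from the thread). Because sizeof and offsetof count in units of char, multiplying by CHAR_BIT gives a figure that can be compared across 8-bit-char and 16-bit-char targets:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical layout shared between targets; uint16_t is the
// smallest type used, as suggested above.
struct SensorFrame {
    std::uint16_t id;
    std::uint16_t flags;
    std::uint32_t timestamp;
};

// sizeof counts chars, not octets, so compare sizes in bits.
static_assert(sizeof(SensorFrame) * CHAR_BIT == 64,
              "SensorFrame must occupy exactly 64 bits");
static_assert(offsetof(SensorFrame, timestamp) * CHAR_BIT == 32,
              "timestamp must start at bit 32");
```

If a target pads the struct differently, the build fails there instead of the mismatch showing up at run time.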

It is not common to program DSPs like the TMS320 using C++ - usually C
is the language of choice. But maybe TI's C++ compilers have improved
since I last looked (which was a long time ago). It's generally not
hard to make a common header that is suitable for C and C++, for
flexibility.

Juha Nieminen <nospam@thanks.invalid>: Oct 13 10:49AM

> A lot of code will not run on a signal processor anyway, because it uses
> resources that are not available there, like a database or a desktop
> user interface.

OTOH if you are making e.g. a library that's supposed to be as
standard-conforming and as portable as possible, you should start
caring about CHAR_BIT, if your library somehow cares about bits,
accesses bits, and cares about the sizes of types in bits.

(Thinking about it, I have myself made some such libraries. I'll have
to go and review them from this perspective, and make sure they don't
make that assumption.)
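
One way such a review might go, sketched under the assumption of C++11: replace hard-coded 8s and 32s with values derived from CHAR_BIT or std::numeric_limits. The helper names storage_bits, value_bits and test_bit are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical helpers, for illustration only.

// Bits of storage occupied by T (including any padding bits).
template <typename T>
constexpr std::size_t storage_bits()
{
    return sizeof(T) * CHAR_BIT;
}

// Number of value bits in an unsigned integer type.
template <typename T>
constexpr int value_bits()
{
    static_assert(std::numeric_limits<T>::is_integer &&
                  !std::numeric_limits<T>::is_signed,
                  "value_bits expects an unsigned integer type");
    return std::numeric_limits<T>::digits;
}

// Test one bit without assuming any particular width for T.
template <typename T>
constexpr bool test_bit(T value, int index)
{
    return ((value >> index) & T(1)) != 0;
}
```

A loop over all bits of an unsigned value would then run up to value_bits<T>() rather than up to a literal such as 8 or 32.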

Type conversion for hundreds of lines

JiiPee <kerrttuPoistaTama11@gmail.com>: Oct 13 07:24AM +0300

On 13/10/2022 00:27, Paavo Helde wrote:

> Why on earth are you having local variables of type short?

> I would fix this ASAP either by s/short/auto/ or s/short/size_t/.

But if it's old code and there are hundreds of places, that would require a lot of
testing, right? Would you still change it even though it might cause
other issues?

JiiPee <kerrttuPoistaTama11@gmail.com>: Oct 13 07:25AM +0300

On 13/10/2022 00:27, Scott Lurndal wrote:
>> I know this can be fixed:
>> a = static_cast<short>(v.size());

> so declare 'a' as size_t. Problem fixed.

But if it's in hundreds of places, that would cause other problems... and
need a lot of testing? Would you still do it? The program would then have to be
tested to check that the change does not cause side issues.

JiiPee <kerrttuPoistaTama11@gmail.com>: Oct 13 07:28AM +0300

On 13/10/2022 00:30, Keith Thompson wrote:
> Why is `a` defined as a short and not as a size_t?

> If there's a very good reason that `a`*needs* to be a short, it makes
> sense to consider some kind of cast. If not, just make it a size_t.

Let's assume it's some old code... for example from the '80s... and you have
it now. Would you make this change in hundreds of places, changing a to
size_t? But that might cause side issues, mightn't it? What if the other
code relies on the short, and for example takes sizeof() of the
short when storing to a file?

JiiPee <kerrttuPoistaTama11@gmail.com>: Oct 13 07:30AM +0300

On 13/10/2022 01:26, Ben Bacarisse wrote:

> And why run the indexes backwards? It might be needed for some vectors
> of floating-point numbers, but not for int.

> It seems that a lot of peculiar choices have been made in the code base.

The example was only created to illustrate the copy problem... it's not
an example from real code.

The point is: if I copy vector::size() to an integer in a for-loop,
do you always cast it? Because there are many places where that
can happen...

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 12 09:30PM -0700

On 10/12/2022 9:28 PM, JiiPee wrote:
> size_t? But that might cause side issues, mightn't it? What if the other
> code is relying on the short, and for example takes sizeof() of the
> short when storing to a file.

Is the code busted as-is using a short? Do you actually _need_ to change
short to size_t? Warnings aside for a moment...

JiiPee <kerrttuPoistaTama11@gmail.com>: Oct 13 07:34AM +0300

On 13/10/2022 00:37, Paavo Helde wrote:
> wouldn't use static_cast here, but e.g. boost::numeric_cast,

OK, interesting option.

OK, but you would always cast it somehow rather than, for example, creating some
conversion function?

I mean, if this conversion is needed in 500 places, you would just add
that boost::numeric_cast (or some other cast) into these places?

It makes the code quite "long" and a little difficult to read if adding
a lot of casts... but maybe it's the only way.

JiiPee <kerrttuPoistaTama11@gmail.com>: Oct 13 07:37AM +0300

On 13/10/2022 07:30, Chris M. Thomasson wrote:
> Do you actually _need_ to change short to size_t? Warnings aside for a
> moment...

Good question. No, not necessarily... it's only a compiler warning.
But... obviously if the compiler is warning a lot, then it's better to check all
those warnings, isn't it? At least check all of them... but are you
saying I do not need to change anything, and can just leave the warnings there?
Just check the code, and if it's OK, live with the warnings
rather than change hundreds of places?

Juha Nieminen <nospam@thanks.invalid>: Oct 13 07:52AM

> a = static_cast<short>(v.size());

> but if we have hundreds of those lines, how would you fix this? Place a
> static cast in all of them? Or create some helper function to do this?

Explicitly saying "static_cast<short>(...)" kind of indicates that you
are saying "yes, I'm fully aware that the value inside the parentheses
technically speaking may be larger than fits in a short, but I know
that it never will in this particular code, so this assignment is done
intentionally and it's not just an oversight (and yes, I am doing this
even knowing the risk that it could potentially cause a bug in the
future)".

So, in a manner of speaking, it's self-documenting code. Thus, I would
use that.

However, if you want to safeguard against that possible future bug,
you could use a function instead, and add a check in the function
(for example an assert(), or possibly a throw), unless this is
very time-critical code (which I assume it isn't).
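
A minimal sketch of such a function, using the throw variant; the name to_short is hypothetical and an assert() could be substituted where exceptions are unwanted:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper, for illustration only.
// Converts a size to short, throwing instead of silently truncating
// when the value does not fit.
inline short to_short(std::size_t n)
{
    if (n > static_cast<std::size_t>(std::numeric_limits<short>::max())) {
        throw std::out_of_range("to_short: value does not fit in a short");
    }
    return static_cast<short>(n);
}
```

The assignment from the original question would then read a = to_short(v.size()); which stays greppable and keeps the range check in one place.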

Paavo Helde <eesnimi@osa.pri.ee>: Oct 13 11:43AM +0300

13.10.2022 07:34 JiiPee kirjutas:
> conversion function etc?

> I mean, if this conversion is needed in 500 places, you would just add
> that boost::numeric_cast (or some other cast) into these places?

boost::numeric_cast is technically a function, not a cast.

> It makes the code quite "long" and a little difficult to read if adding
> a lot of casts... but maybe it's the only way.

Right. In real code I'm using my own checked_cast and debug_cast,
names that I consider reasonably readable and greppable. These are a bit
involved because boost::numeric_cast did not always do exactly what we
wanted, but for brevity I leave those details out here.

#include <boost/numeric/conversion/cast.hpp>

template<typename T, typename U>
inline constexpr T checked_cast(U x) {
    // Some code to avoid false alarms on MS "smaller type check".
    // ...
    // Some code to refuse converting NaN to uint64
    // ...
    return boost::numeric_cast<T, U>(x);
}

// A specialization for double->float conversion
// to avoid false alarm on -inf, and to allow converting
// large finite double values to float inf.
template<>
inline constexpr float checked_cast<float, double>(double x) {
    return static_cast<float>(x);
}

// Checked in debug builds, plain static_cast when NDEBUG is defined.
template<typename T, typename U>
inline constexpr T debug_cast(U x) {
#ifdef NDEBUG
    return static_cast<T>(x);
#else
    return checked_cast<T,U>(x);
#endif
}

Thursday, October 13, 2022

Digest for comp.lang.c++@googlegroups.com - 25 updates in 5 topics
