soft and program: Digest for comp.lang.c++@googlegroups.com

[QT creator on "nix"] - getting a strict 8 bit (1 byte) array with no padding - 14 Updates
[QT creator free, on "nix"], about NULL, nullptr etc - 1 Update
Substitution failure can be an error - 3 Updates
Improving runtime by preventing a hashtable from growing - 7 Updates

[QT creator on "nix"] - getting a strict 8 bit (1 byte) array with no padding

David Brown <david.brown@hesbynett.no>: Nov 24 08:53PM +0100

On 22/11/2019 19:35, Soviet_Mario wrote:
> On 22/11/19 15:04, David Brown wrote:
>> On 22/11/2019 13:14, Soviet_Mario wrote:
>>> On 22/11/19 12:23, David Brown wrote:

<snipping quite a bit to cut down on the size of the post>

> In that case either the pragma is USELESS (and hence produces no
> appreciable effect, not even a slowdown), or it is NECESSARY (and so the
> slowdown is an unavoidable price that I MUST pay anyway)

A "pack" attribute is unlikely to have any effect when you have only
uint8_t types in a struct or array. There is a hypothetical possibility
of it affecting padding or alignments on the struct as a whole, but that
will definitely not apply for any platform "normal" enough to support
QT. But having no positive or useful effects does not guarantee that you
will get no negative effects in terms of efficiency or what the compiler
allows for the code, and certainly you will still have the negative
effects of non-portable code and confusing source code.

>> "Packed" is /not/ a
>> "precaution", nor is using a struct with a bitfield instead of uint8_t.

> the second part has been abandoned early

OK.

> 1995-2000 roughly, then less and less afterwards.
> But strangely enough, I find it easier to think in C++ terms when I
> run into problems in BASIC.

OK.

>> So /ask/ (and please read the answers). Don't guess, or grab the first
>> thing that you find from a google search.

> but I DID follow the answers :)

I saw a lot of posts indicating that you either didn't read the answers,
didn't understand them, didn't agree with them, or at least did not rate
them as reliable sources of information. There have been so many posts
in this thread that it is hard to have an overview, but I believe you
made several "I'm keeping pragma pack as a precaution" posts after
having been told how it is worse than useless. But maybe I have a false
impression here, based on the order of reading posts or details of what
was written.

>> people in a group like this all say "don't use packed" and "use
>> uint8_t",

> I didn't read or deduce that the two things should be mutually exclusive.

I can see how that might not have been clear at the start.

> 1) the added pragma (on the four u_int_8 members in a struct) is USELESS,
> as the struct was already packed natively (and hence produces no
> appreciable effect, not even a slowdown),

Yes, it is useless here.

> 2) it is NECESSARY, as the struct would not have been natively packed
> otherwise (and so the slowdown is an unavoidable price that I MUST
> pay anyway)

When you have understood what "packed" does, and how compilers handle
structs, you will realise that this will not be the case. (Any
compiler/target combinations sufficiently odd that uint8_t structs are
not tightly packed anyway, are unlikely to support "pragma pack", almost
guaranteed not to support QT, and are definitely not platforms you will
be using anyway.)

> or
> 3) it is NOT SUFFICIENT (in this case, luckily unlikely, nothing would
> have worked at all)

Correct.

But you have missed the fact that there are potential disadvantages in
using "pack" even when it offers no advantages.

> The packed (natively or forcefully) struct is 32 bits wide, so allocating
> an array should allow clean addressing of any item even on a 64-bit OS,
> and no item would cross a critical addressing boundary

The struct will be 32 bit in size, but the alignment will be one byte.
Whether you consider that an advantage or disadvantage compared to four
byte alignment depends on how it will be used.
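
A minimal sketch of this point, assuming an ordinary target (for example
x86-64 or ARM with gcc or clang). The Pixel type and kCount are
hypothetical names, not taken from the thread. The static_assert lines
turn the size and alignment claims into compile-time checks, so no pragma
is involved:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical pixel type: four 8-bit channels, no pragma of any kind.
struct Pixel {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
    std::uint8_t a;
};

// On ordinary targets these hold without any "pack" extension.
static_assert(sizeof(Pixel) == 4, "Pixel is expected to occupy exactly 4 bytes");
static_assert(alignof(Pixel) == 1, "Pixel only needs byte alignment");
static_assert(offsetof(Pixel, a) == 3, "the channels are adjacent");

// An array of N pixels therefore occupies exactly 4 * N bytes.
constexpr std::size_t kCount = 16;
static_assert(sizeof(Pixel[kCount]) == 4 * kCount, "no padding between elements");
```

If any of these assumptions were false on some unusual platform, the
build would fail instead of silently producing a different layout.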

>> useless, when it is worse than useless,

> If you have some further patience, tell me if the 3 scenarios fit the
> present situation (not a more general one, but just this)

Hopefully my reply above covers this.

>> "Packed" is not a precaution -

> I call it a precaution as it reduces the chances of a non packed data
> layout.

It does not do that - the chances of the uint8_t data not being packed
are 0, and "packed" does not reduce those chances. Therefore it is not
a "precaution".

> without PRAGMA, do I have any explicit or implicit guarantee that my
> union is 32 bits long ? If so, PRAGMA is useless. If not, it seems it
> might be useful.

If you keep using non-standard types, then we can't be entirely sure. I
know what the standard types "uint8_t" and "uint32_t" are, but I have no
definition for "u_int_8" and "u_int_32". I /assume/ they are defined
like the standard types, but you might have made a mistake. I strongly
recommend you don't use home-made types like this when there are
perfectly good standard types available.

The C and C++ standards give implementations a lot of freedom about
packing, padding and alignments in structs and unions. But practical
implementations do not do anything weird unless they have good reason
for it. When you are talking about mainframes, supercomputers, DSPs, or
particularly odd niche processors, you need to be careful and check the
details. For ordinary computers, and code that would never be used on
anything else, there is no problem. (As a guide, for anything targeting a
POSIX or Windows system, or any embedded system supported by gcc or
clang, there will be no problems.) On such "ordinary" targets, all
basic types are 8-bit, 16-bit, 32-bit, 64-bit or 128-bit, and the
alignment of those types is at most equal to their size. (It can be
smaller, however, if the "bitness" of the cpu is smaller than that
size.) No padding is added except to make sure each struct member is
aligned, and that the struct itself is aligned by the biggest alignment
of any member.
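
The padding rule above can be sketched as follows. The struct names are
hypothetical and the numbers in the comments assume a typical 32-bit or
64-bit target where uint32_t has 4-byte alignment and uint16_t has 2-byte
alignment:

```cpp
#include <cstddef>
#include <cstdint>

// Members in an awkward order: padding is inserted to align each member.
struct Mixed {
    std::uint8_t  tag;    // offset 0
    std::uint32_t value;  // offset 4, after 3 bytes of padding
    std::uint16_t count;  // offset 8
};                        // 2 bytes of tail padding, 12 bytes in total

// Same members, largest first: far less padding is needed.
struct Reordered {
    std::uint32_t value;  // offset 0
    std::uint16_t count;  // offset 4
    std::uint8_t  tag;    // offset 6
};                        // 1 byte of tail padding, 8 bytes in total

static_assert(offsetof(Mixed, value) == 4, "value is aligned to 4 bytes");
static_assert(sizeof(Mixed) == 12, "struct size is a multiple of its alignment");
static_assert(alignof(Mixed) == alignof(std::uint32_t), "biggest member alignment wins");
static_assert(sizeof(Reordered) == 8, "largest-first ordering needs less padding");
```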

> what damages could PRAGMA PACK produce ?

I've told you already.

> If the union would be just packed natively, I guess none.

No.

It /might/ be the case that the only disadvantages of "pragma pack" are
lack of portability to other tools on the same or similar platforms, and
confusion for the reader as to why you have added this useless pragma.

But it is also possible that it will restrict what you can do with the
struct (like taking the address of the members), it can lead to errors
that are not diagnosed (like letting you take the address of the
members, but those addresses not being aligned and this leading to
subtle problems when the compiler assumes they /are/ aligned), it can
lead to inefficiencies (like compilers generating byte-by-byte access to
larger fields on some targets) or compilers being pessimistic about
optimisations.
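
To make the difference concrete, here is a small sketch comparing a
plain struct with the same struct under "#pragma pack(push, 1)", a
non-standard extension that gcc, clang and MSVC accept. The record types
are hypothetical, and the offsets assume an ordinary target:

```cpp
#include <cstddef>
#include <cstdint>

struct PlainRecord {
    std::uint8_t  tag;
    std::uint32_t value;  // naturally aligned at offset 4
};

#pragma pack(push, 1)     // non-standard extension
struct PackedRecord {
    std::uint8_t  tag;
    std::uint32_t value;  // forced to offset 1, so it is misaligned
};
#pragma pack(pop)

static_assert(offsetof(PlainRecord, value) == 4, "natural alignment");
static_assert(offsetof(PackedRecord, value) == 1, "packed: no padding");
static_assert(sizeof(PackedRecord) == 5, "packed size");
static_assert(alignof(PackedRecord) == 1, "packed structs drop to byte alignment");

// Reading the member by value is fine: the compiler knows it is packed and
// emits whatever unaligned access sequence the target needs.
inline std::uint32_t read_value(const PackedRecord& rec) {
    return rec.value;
}
// The trap is forming a std::uint32_t* (or a reference) to rec.value and
// handing it to code that assumes an aligned object.
```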

> If it wasn't, at best it would pack it (and maybe slow it, but this is
> irrelevant, I need precise disk layout).

If the union was not packed natively, you'd be working on a very odd
system and "pragma pack" would not apply anyway.

> In this last worst case, are there any other means of getting the thing
> to work ?
> What are they ?

If you are dealing with a system where you can't be sure your simple
arrays and structs are packed sensibly (at most padded for "natural"
alignment), you can't handle it this way anyway - you are going to have
to write code that manually unpacks the raw stream of byte data.
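
A minimal sketch of such manual unpacking, assuming the stream stores
multi-byte values in little-endian order (the thread does not state the
byte order, so that is an assumption). The helper names load_le16 and
load_le32 are hypothetical:

```cpp
#include <cstdint>

// Decode a 16-bit little-endian value from two raw bytes.
inline std::uint16_t load_le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Decode a 32-bit little-endian value from four raw bytes.
inline std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Because the code only ever reads single bytes, it does not depend on the
struct layout, the alignment of the buffer, or the byte order of the host.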

<snip>

> I said :
> 16 bit (width) - 16 bit (height) - N x 32 bit (R,G,B,alpha)
> where N = width x height = number of "pixels"

That does not tell us anything about alignment. Are your fields
properly aligned or not?

> every pixel sits at an address in the file that is a multiple of 32 (but
> this seems irrelevant to me).

I presume you mean multiple of 32 bits, i.e., 4 bytes. And yes, this is
/highly/ relevant. It means your data is well aligned, and you can use
a union with a uint32_t for your pixels (if you want). Are you also
sure that your 16-bit values (height and width) are 16-bit aligned? As
long as all your elements are naturally aligned, then the struct is
properly aligned and you can use a plain, simple struct with no "packed"
complications.

This is /critical/ to being able to support your structure simply,
cleanly, efficiently and portably. That is why I have repeatedly asked
you if your format is properly aligned.
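
For the format described above (16-bit width, 16-bit height, then
width x height 32-bit pixels), a plain-struct reader and writer might
look like this sketch. The names ImageHeader, Pixel, Image, save_image
and load_image are hypothetical, the channel order is an assumption, and
the file is assumed to use the byte order of the machine that reads it:

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

struct ImageHeader { std::uint16_t width; std::uint16_t height; };
struct Pixel       { std::uint8_t r, g, b, a; };

static_assert(sizeof(ImageHeader) == 4, "header is two adjacent 16-bit fields");
static_assert(sizeof(Pixel) == 4, "pixel is four adjacent bytes");

struct Image {
    ImageHeader header{};
    std::vector<Pixel> pixels;
};

// Write the header, then all pixels in one block.
inline bool save_image(const std::string& path, const Image& img) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(&img.header), sizeof img.header);
    out.write(reinterpret_cast<const char*>(img.pixels.data()),
              static_cast<std::streamsize>(img.pixels.size() * sizeof(Pixel)));
    return static_cast<bool>(out);
}

// Read the header, size the pixel array from it, then read one block.
inline bool load_image(const std::string& path, Image& img) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.read(reinterpret_cast<char*>(&img.header), sizeof img.header);
    if (!in) return false;
    img.pixels.resize(static_cast<std::size_t>(img.header.width) * img.header.height);
    in.read(reinterpret_cast<char*>(img.pixels.data()),
            static_cast<std::streamsize>(img.pixels.size() * sizeof(Pixel)));
    return static_cast<bool>(in);
}
```

No "pack" extension appears anywhere: the two static_assert lines are
what guard the layout.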

> transparency is more than needed ... but taking less is useless or
> impossible, and 1 byte is natural for the otherwise 3,sth bytes long
> struct)

I haven't suggested using smaller types for your data - I assume that
these are appropriate types.

> oriented internally (and the OS does caching independently and
> transparently, basing on disk cluster sizes but this is not sth I should
> care of)

It does not affect the file reading and writing - it affects the struct
in memory that is accessed by the code. C and C++ only define access to
correctly aligned objects and subobjects. Access to unaligned data is
implementation dependent, may require extra extensions (like pragma
pack), may be subject to various restrictions, may be inefficient
(either extra code, or inefficiencies in the cpu processing), and may
cause problems if the data is used with third-party libraries or code.
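
When data really is unaligned, the portable way to read it in standard
C++ is to copy the bytes into a properly aligned object. A minimal
sketch, with the hypothetical helper name load_u32 and the value taken in
the host's own byte order:

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value from any byte address, aligned or not.
// memcpy has no alignment requirement on its arguments; compilers
// usually turn this into a single load where the target allows it.
inline std::uint32_t load_u32(const unsigned char* bytes) {
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

By contrast, casting the byte pointer to std::uint32_t* and dereferencing
it is not defined by the language when the address is misaligned.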

>> alternatives.

> for a starting, I tried to use a 32 bit total size, hoping it would have
> been packed regardless of any "hints" (pragma).

That is a good plan. When you do that, you avoid any need for "pack"
extensions.

It appears that you have designed a /sensible/ file format. It is
/simple/, but certainly not "stupid". It has taken a while to get you
to give enough details to establish this, but it seems to have been a
sensible format all along.

What you have got wrong is your determination to distrust your own
sensible decisions in the file layout and add worse than useless extra
"pack" junk. Forget "pack" - forget you ever heard of it. It is not
part of the C++ language, it is rarely useful, and it is certainly not
useful here. Just use your file format in the clear, sensible layout
you have designed.

> it is, as both Gambas supports BYTE format (And sizeof()) and
> BYTE-oriented streams. No overspace is generated. Then I have to worry
> about the C++ back end, to comply with this fact.

My knowledge of Gambas is very limited, but according to the
documentation I found its structs are /not/ packed:

<http://gambaswiki.org/wiki/lang/structdecl>

Padding /is/ generated - if needed to get the natural alignment of the
fields. You can be entirely confident that on any platform supported by
Gambas, the padding, alignment and layout of a Gambas struct and a C++
struct will be the same.

> I think the designer of Gambas relies very strongly upon C libraries and
> mind-shape, apart from strings (where he had to welcome BASIC users who
> want strings the basic way).

Of course. And while the details of C struct layouts are implementation
dependent, it is invariably the simplest and most obvious but efficient
layout for the target.

>> padding in a normal struct. I have seen no answer to this (unless it is
>> later in this post), despite asking this /crucial/ question.

> the files are packed.

The information you have given so far in this post suggests they are
properly aligned. They are without padding, but that is because of the
field sizes used and their arrangement, not because you have
specifically asked for non-standard packing.

> such FAST temporaries to be shared, but with portable stream syntax).
> Using /shm would imply no specialized load/store code, just the normal
> file access => delegating the ram usage to the OS.

If your files are in a ram file system (/shm, tmpfs, etc.) then memmap
would not be faster anyway.

>> If the format is bad, the next step is to consider changing the format.

> just observation.
> So, the format is wrong ?

/You/ tell /me/. Only /you/ know the full format. But from what you
have said here, the format is probably fine.

>> formats. Reader and writer functions are common for all sorts of

> well, I'd be considering these for small size, complex structured input,
> not for a 4-12 MB size of simple pixels. It would be a bloodbath

You are basing that on speculation with no evidence.

I am not at all saying that it is a bad idea to use a direct mapping
from file format to struct - indeed, it is a sensible idea whenever you
can define the format.

As far as I understand your project, you are using Gambas for a gui but
for some kind of image processing, you are saving the pictures from
Gambas as files, then starting an external C++ program that reads these
files, processes them in some way, and saves them again, and then the
Gambas program will read them in again.

I have no idea if this is a good method - either from the viewpoint of
development ease, or run-time efficiency. I can think of many
alternative ways to structure this task, which may or may not be better
- without knowing details of the task, or why you might prefer a
particular setup.

But it does seem to me that you are making your decisions based on gut
feeling rather than research or measurement. That does not mean your
decisions are wrong, but it might mean you should not be so confident in
them.

>> /now/, rather than picking a bad structure from the start.

> please give your judgement about the struct I have described before,
> which drawbacks seems to have or don't have.

It sounds like it is probably fine - but as I have said before, you
haven't given all the important details clearly enough.

I want you to understand what is important when making such structures -
that everything is aligned properly (the best being "natural"
alignment), and I strongly recommend adding any padding manually as
unused "padding" fields. I want you to understand /why/ this is
important. And once you understand that, you will realise that "pragma
pack" and similar extensions are useless here and not only are you able
to avoid them, but you /should/ avoid them.
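
As a sketch of what manual padding looks like, here is a hypothetical
record type (not the format from this thread) where every gap is spelled
out as an explicit field and the layout is checked at compile time. The
offsets assume an ordinary target with naturally aligned 16-bit and
32-bit types:

```cpp
#include <cstddef>
#include <cstdint>

struct Record {
    std::uint8_t  kind;
    std::uint8_t  padding1[3];  // explicit filler so that length is 4-byte aligned
    std::uint32_t length;
    std::uint16_t flags;
    std::uint16_t padding2;     // explicit filler up to a multiple of 4 bytes
};

// With the padding written out, the compiler has nothing left to insert.
static_assert(offsetof(Record, length) == 4, "length is naturally aligned");
static_assert(offsetof(Record, flags) == 8, "flags is naturally aligned");
static_assert(sizeof(Record) == 12, "no hidden padding");
```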

> oh I completely disagree with this point ! The format has to be packed
> as the C++ library receives the file from an other source of data that
> offers them in that form, so it's mandatory.

No, it is not. Not remotely. It would be a sad state for C and C++ if
they required compiler-specific extensions to deal with data from
external sources.

I think perhaps you are mixing up the concepts of "a structure where the
fields are adjacent with no padding" and "uses pragma packed". I am
recommending using structs that fit the former, without ever using the latter.

Or perhaps you think compilers add random padding and re-arrange structs
in odd and unpredictable ways. The details here are usually specified
by the platform's ABI, precisely so that programs can exchange data
safely (by files, pipes, streams, shared memory, shared libraries, etc.).

> nope, I have constraints to fulfil. Maybe inefficient, maybe non
> portable (who cares ?), but surely not pointless, as otherwise data
> exchange would not work at all

"pragma pack" is pointless here.

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 24 09:53PM +0100

On 24/11/2019 20:53, David Brown wrote:
>> where N = width x height = number of "pixels"

> That does not tell us anything about alignment. Are your
> fields properly aligned or not?

these fields start at offset 0 of the file.
And they are loaded (fread) not into a generic void * buffer
but into an array of typed structs of the kind we are discussing.
So I think they are ...

> aligned, and you can use a union with a uint32_t for your
> pixels (if you want). Are you also sure that your 16-bit
> values (height and width) are 16-bit aligned?

they are in the FILE only, but are read separately (fread).
The struct itself does not contain these fields.

> all your elements are naturally aligned, then the struct is
> properly aligned and you can use a plain, simple struct with
> no "packed" complications.

okay.

> (either extra code, or inefficiencies in the cpu
> processing), and may cause problems if the data is used with
> third-party libraries or code.

we agree that this will not happen here
CUT

> rarely useful, and it is certainly not useful here. Just
> use your file format in the clear, sensible layout you have
> designed.

LOL ... okay ! Surrendered :)

>> this fact.

> My knowledge of Gambas is very limited, but according to the
> documentation I found its structs are /not/ packed:

no, they aren't, but the internal format it uses for 2D
graphics is. Also my very simple structure layout (4 bytes)
was easy enough for even gambas to produce a file with the
expected size (and consistent content, as I reloaded it).

> on any platform supported by Gambas, the padding, alignment
> and layout of a Gambas struct and a C++ struct will be the
> same.

I had the impression gambas is little more than a wrapper
around C, with some syntactic sugar.

> Of course. And while the details of C struct layouts are
> implementation dependent, it is invariably the simplest and
> most obvious but efficient layout for the target.

in fact it was working OK in gambas.
I would have stayed there, the performance was very good,
but I was starting not to be able to manage the code.
C++ helps so much to organize the structure and to think
correctly.
I am also changing the huge switch () case: blocks into arrays of
pointers to functions for single bit manipulation variants.

Uh ... btw, I go Off Topic.

I was declaring some 3-line functions as inline.
But, as I started to take their addresses and populate an array
of pointers, I removed the inline suggestion.

just for, lol :) precaution.
What's the effect of inline linkage when one takes the
address of a function (and stores it) ? I assumed, without
reading documentation supporting this, that it would just ignore
the inline suggestion.
Or is it undefined behaviour ?
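
For reference, taking the address of an inline function is well-defined
in C++: the keyword mainly relaxes the one-definition rule, the function
still has an address, and the compiler emits an out-of-line copy when one
is needed. A sketch of a dispatch table built from inline functions (all
names here are hypothetical):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Three tiny single-bit operations, all declared inline.
inline std::uint8_t set_bit0(std::uint8_t v)    { return static_cast<std::uint8_t>(v | 0x01u); }
inline std::uint8_t clear_bit0(std::uint8_t v)  { return static_cast<std::uint8_t>(v & 0xFEu); }
inline std::uint8_t toggle_bit0(std::uint8_t v) { return static_cast<std::uint8_t>(v ^ 0x01u); }

using BitOp = std::uint8_t (*)(std::uint8_t);

// The table stores ordinary function pointers; inline does not prevent this.
const std::array<BitOp, 3> kOps = {{set_bit0, clear_bit0, toggle_bit0}};

// Replaces a switch over the operation index with an indexed call.
// The caller is responsible for passing op < kOps.size().
inline std::uint8_t apply(std::size_t op, std::uint8_t v) {
    return kOps[op](v);
}
```

Calls made through the table are unlikely to be inlined, but direct
calls to the same functions elsewhere still can be.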

>> the OS.

> If your files are in a ram file system (/shm, tmpfs, etc.)
> then memmap would not be faster anyway.

I am now reading (The GNU C library reference manual)
that there are functions that seem to directly and
specifically support the /shm shared virtual folder

shm_open, shm_unlink =>
both support NAMED memory streams
and they seem a very good fit for sharing big pictures, as
exchanging ascii names will be simpler and more readable;
if needed even "hardwired" in constants : there won't be many
temporaries at a time, and they will be recycled over and over
again, so not a true limitation. Once the results are
passed, they will be saved in regular files (.PNG or .JPG).

also memfd_create seems to have this same advantage (use
names instead of file descriptors).

But this will be the homework for when I have finished the
other parts.

> As far as I understand your project, you are using Gambas
> for a gui but for some kind of image processing, you are
> saving the pictures from Gambas as files,

originally I was doing everything in gambas, saving / loading included
(as it exports in standard formats, allows for desktop
grabbing and so on), but then I also converted colors to my
modified format for greater control and for effects that I did
not find the way I wanted them.

Then it grew and I started to no longer be able to "read the
code", find things, know the impact of modifications.

the fact is that in C++ layers are far more isolated and
when code is written well, modifications are often just
local and do not alter the way other layers use the underlying ones.
In fact I'm slowly regaining control of what happens :)

but the GUI with QT is beyond my skills. I just roughly
understand its typical signal/slot lazy binding.

> external C++ program that reads these files, processes them
> in some way, and saves them again, and then the Gambas
> program will read them in again.

possibly using /shm files, not to slow down too much

> I have no idea if this is a good method -

no, in general it is not. But I'm not a "pro" in either gambas
or C++, so I am trying to mix things to use the easier
part of each in its own natural context.
The GUI in gambas is really simple to manage, compared to
QT. Almost as easy as Visual Studio.

But I was losing control over the gambas code, it neither
enforces nor allows for tidiness and sealed objects, its
project management is a nightmare (one file => one class) to
me, so I simply spent a lot of time re-reading my own code.
I was obliged to return to C++ to sweep the dirt :)

> can think of many alternative ways to structure this task,
> which may or may not be better - without knowing details of
> the task, or why you might prefer a particular setup.

the bottleneck is by far my own skills with these two tools.
With the old Visual Basic 2012 or C# I would have finished
it all in days.
But I left windows, so I have to use other tools I am not
very familiar with
I cannot stand windows, but I must admit Visual Studio was
really powerful, complete, cozy and all, and above all
always worked out of the box, preconfigured consistently. I
really miss it ! QT creator is hard to configure :\

> But it does seem to me that you are making your decisions
> based on gut feeling rather than research or measurement.

absolutely true. But I AM the bottleneck :) of the whole chain.

> That does not mean your decisions are wrong, but it might
> mean you should not be so confident in them.

oh I am not confident in the solution : I simply failed when
using gambas alone for both the gui and the more complex
processing (struggling with the computations), and also when
using QT alone for both (struggling with the GUI designer).

Then I decided to try to split the interface from the
engine, at the price of data exchange (possibly with memory
mapped files). I did not have much else left to do.

Maybe I'll fail again, and resign myself to reinstalling a virtual
machine with Visual Studio inside ! LOL.
No ... I'm not so willing to give up actually.

> understand that, you will realise that "pragma pack" and
> similar extensions are useless here and not only are you
> able to avoid them, but you /should/ avoid them.

all true, but as I said : the file is packed. And I don't
want accessors like iomanip and that stuff.
I need to binary write in one block

CUT

> structure where the fields are adjacent with no padding" and
> "uses pragma packed". I am recommending using structs that fit
> the former, without ever using the latter.

okay. Some day I'll test if it loads well or not.

> Or perhaps you think compilers add random padding and
> re-arrange structs in odd and unpredictable ways.

:) well, not randomly. I think they just try to keep int_32 and
float at addresses that are multiples of 4, and int_64 and double
at multiples of 8, and so on.

I'm not sure of the degree of freedom they have in
"reordering" members though.
Sometimes in constructors I get a warning that the
members will be initialized in a different order than
declared.
But I did not care too much, as they did not depend on
each other and so the actual order was not important.
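
On those two doubts: data members with the same access level are laid
out in declaration order, and members are always initialized in
declaration order, whatever order the constructor's initializer list
uses (that mismatch is what the warning reports). A small sketch with a
hypothetical Size type, where the order does matter:

```cpp
#include <cstddef>

struct Size {
    int width;
    int height;
    int area;   // declared last, so it is initialized last

    // Safe: width and height are declared before area, so they already
    // hold their values when area is computed from them.
    Size(int w, int h) : width(w), height(h), area(width * height) {}
};

// Declaration order is also the layout order in memory.
static_assert(offsetof(Size, width) < offsetof(Size, height), "layout follows declaration");
static_assert(offsetof(Size, height) < offsetof(Size, area), "layout follows declaration");
```

If area were declared before width and height, the same initializer
list would compute it from members that had not been initialized yet.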

>>> If you have learned nothing else from this conversation,
>>> hopefully you
>>> have learned that much.

ok ... discussion closed, apart from some OT issues

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

Keith Thompson <kst-u@mib.org>: Nov 24 01:05PM -0800

> On 24/11/2019 02:06, Keith Thompson wrote:
[...]

>> bits, there is almost no good reason not to use uint8_t. There is no
>> good reason to use some system-specific (or even undocumented)
>> equivalent.

Yes, I meant exactly. I have no idea why I typed "at least".

[...]

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Paavo Helde <myfirstname@osa.pri.ee>: Nov 20 11:28PM +0200

On 20.11.2019 20:55, Soviet_Mario wrote:
> I would not want to include many QT features for now
> so, what is the STANDARD C++ way to get
> unsigned 8 bits integer type ?

Why do you need an exactly 8-bit type? And how this is related to QT or
static linking, or to an IDE?

In general, if you need an unsigned 8-bit type, use std::uint8_t.

Manfred <invalid@add.invalid>: Nov 21 01:48AM +0100

On 11/21/19 12:54 AM, Soviet_Mario wrote:
> unsigned char.
> I write delusional as I still have not found a strict guarantee that a
> char is really 8 bits wide

It's not delusional, it's how it works.
You opened some standard header of your implementation, and since in
your implementation a byte is an unsigned char, this is the correct typedef.
If you happen to port the code to an implementation (if any) where a
byte has a different representation, then the standard header of that
implementation would have the corresponding definition.

This way of working is most evident with stuff like uint32_t and
uint64_t (as many have said, most of the time uint8_t is an unsigned char).

If porting to an implementation where no byte is available, then uint8_t
would not be defined (also said by others).

Gist of the story is, yes, uint8_t is guaranteed to be an 8-bit unsigned
integer, as its name suggests.
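
That guarantee can be checked directly. A short sketch (wrap_add is a
hypothetical helper name) that only compiles where std::uint8_t exists:

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// If std::uint8_t exists at all, these all hold by definition.
static_assert(CHAR_BIT == 8, "uint8_t can only exist when bytes are 8 bits");
static_assert(std::numeric_limits<std::uint8_t>::digits == 8, "exactly 8 value bits");
static_assert(std::numeric_limits<std::uint8_t>::max() == 255, "range is 0..255");
static_assert(sizeof(std::uint8_t[100]) == 100, "arrays have no padding between elements");

// Arithmetic happens in int; converting back to uint8_t wraps modulo 256.
inline std::uint8_t wrap_add(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```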

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 21 12:54AM +0100

Il 20/11/19 20:34, Jorgen Grahn ha scritto:
>> so, what is the STANDARD C++ way to get
>> unsigned 8 bits integer type ?

> uint8_t, just like in C. Or, I guess it's really std::uint8_t.

I opened the headers, e.g. sys/types.h and stdint.h
and they finally stem from a typedef that turns out to be a
"delusional" unsigned char.
I write delusional as I still have not found a strict
guarantee that a char is really 8 bits wide

>> platform dependent, and more often "a word" wide.

> No; a char is rarely more than 8 bits nowadays. You may be thinking
> of wchar_t, but that's something different.

ok, note taken. But is it a guarantee or a very common guess ?

>> QT has its explicitly sized integers (even if no 8-bit sized)

> Qt has its own versions of most things ... are you sure you need it?

Yes it has many types redecorated its own way.

But as I wrote I'm not very confident with them on one side
(not in the types themselves, in my own skills obv.) and I often cannot
even guess what they could be like from "outside".
With gambas I just got to work with integral types, even
"strings" are still very difficult for me
So I'd feel better the fewer "wrappers" are put around types.

Now I'm exploring the solution with a struct containing only
an 8 bit field, and the #pragma pack

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

Reinhardt Behm <rbehm@hushmail.com>: Nov 21 10:33AM +0800

On 11/21/19 2:55 AM, Soviet_Mario wrote:

> long ago CHAR was 8 bit wide, but now it seems to be platform dependent,
> and more often "a word" wide.

> QT has its explicitly sized integers (even if no 8-bit sized)

Of course it has: Q_UINT8

The standard way would be (u)int8_t, if it is defined in your compiler.

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 21 01:00AM +0100

Il 20/11/19 22:03, David Brown ha scritto:
>> so, what is the STANDARD C++ way to get
>> unsigned 8 bits integer type ?

> uint8_t from <stdint.h>, or std::uint8_t from <cstdint>.

I looked into it, and they expand to a typedef related to unsigned
char

> exactly what you ask for. (If the platform doesn't support
> 8-bit char, the type won't exist and the code won't
> compile. But very little code needs to be /that/ portable.)

uhm ... maybe. But I am trying to mirror a byte-wise mask of
opacity/transparency from images, so I have to get two
different pieces of programs (a front end in gambas and this
would-be static library) to speak the same language and have
the same data layout.

To be very very tight fisted, I could even have devoted a
SINGLE BIT to the transparency mask, but addressing would
have become far slower. So I waste 7 bits ! but no more.

The pictures themselves are also made of bit fields, but they are
four fields of 8 bits each, so exactly 32 bits / pixel, and the
packing was not necessary there (and alignment issues did
not show up either).

the gambas frontend has a native type BYTE (0-255 range)
for the transparency mask so I have to replicate it in the
library

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 21 04:14AM +0100

Il 21/11/19 02:44, James Kuyper ha scritto:
> permitted for uint8_t to be a typedef for a type that isn't exactly 8
> bits. The relevant guarantee is in the C standard, and incorporated into
> the C++ standard by reference, but it is still a guarantee.

very clear (and very good for me!)
tnx

> If CHAR_BIT != 8, or #ifndef UINT8_MAX, then there will be no way to
> declare an 8-bit type for that implementation (except for bit-fields,
> and you can't make arrays of bit-fields).

actually I made an array of the enclosing struct of the bit
field (which was the only member but this may not be relevant).
I noticed that the "padding warning disappeared" with the
#pragma pack(1)
which might be a proof that the system is capable of
addressing single bytes regardless of their alignment in memory

but all this now is unnecessary, after what you all have stated

> The key question the OP needs to ask himself is "What do I want my
> program to do when compiled on a platform where there is no way to
> declare an 8-bit type?".

none, I won't possibly port it anywhere, just run here locally

>> a 8 bit field, and the #pragma pack.

> On a platform where CHAR_BIT==8, unsigned char will work. On a platform
> where CHAR_BIT != 8, that approach won't work, nor will any other.

understood. So the workaround worked only because the system
was just able to address single bytes

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

Soviet_Mario <SovietMario@CCCP.MIR>: Nov 21 04:09AM +0100

Il 21/11/19 01:48, Manfred ha scritto:
> then uint8_t would not be defined (also said by others).

> Gist of the story is, yes, uint8_t is guaranteed to be an
> 8-bit unsigned integer, as its name suggests.

mmm, that changes everything then, as it was the guarantee I was
looking for !

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

Keith Thompson <kst-u@mib.org>: Nov 20 10:32PM -0800

> I would not want to include many QT features for now
> so, what is the STANDARD C++ way to get
> unsigned 8 bits integer type ?

If uint8_t exists, then both it and unsigned char are unsigned 8-bit
integer types (and very likely uint8_t is an alias for unsigned char).
Using uint8_t makes it more explicit that you want an unsigned 8-bit
integer type.

If uint8_t doesn't exist, then CHAR_BIT > 8 and there is no 8-bit
integer type. You're unlikely to encounter such an implementation
(I've only heard of very old systems and DSPs) -- and I'd be
astonished if such a system were able to support Qt.

(There's probably some wiggle room there. An implementation with
CHAR_BIT==8 with signed char using a representation other than
2's-complement won't define int8_t and, IIRC, therefore shouldn't define
uint8_t -- but such implementations are, I think, even rarer.)

> I fear that unsigned char is system dependent (and even if
> there are "wider" explicitly supported types, no plain char
> type seems to be guaranteed to be 8 bits wide).

Correct. The three char types are CHAR_BIT bits, or 1 byte.

Your best bet is probably to use uint8_t, which guarantees that your
code won't compile if the implementation doesn't meet its requirements.
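
A sketch of how that compile-time failure can be made explicit, applied
to the byte-per-pixel opacity mask mentioned earlier in the thread. The
names Byte and make_opaque_mask are hypothetical, and treating 255 as
"fully opaque" is an assumption:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// UINT8_MAX is only defined when the exact-width type exists, so this
// gives a readable diagnostic instead of an "unknown type" error.
#ifndef UINT8_MAX
#error "This code needs an exact 8-bit unsigned type (std::uint8_t)"
#endif

using Byte = std::uint8_t;

// One mask byte per pixel, initialized to 255 (assumed to mean opaque).
inline std::vector<Byte> make_opaque_mask(std::size_t width, std::size_t height) {
    return std::vector<Byte>(width * height, Byte{255});
}
```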

[...]

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson <kst-u@mib.org>: Nov 20 10:33PM -0800

> On 20/11/19 19:55, Soviet_Mario wrote:
> Uhm, from some searching, it seems that a "pragma pack" exists.
> I still have to read about it, though

pragma pack is a common extension, but it's non-standard. If you're
worried enough about portability that you're unwilling to assume
CHAR_BIT==8, you probably don't want to depend on extensions.
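
One portable alternative, sketched here under assumptions: state the layout you expect and let the compiler check it, with no packing extension involved. The Rgb type is hypothetical. The standard does not promise the absence of padding in a struct, so the asserts are there to stop the build on an unusual ABI rather than to prove anything.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

// Hypothetical pixel type, for illustration only.
struct Rgb {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Not guaranteed by the standard, but true on mainstream ABIs: members with
// alignment 1 need no padding. If a platform disagrees, the build stops here
// instead of silently changing the layout.
static_assert(sizeof(Rgb) == 3, "unexpected padding in Rgb");
static_assert(alignof(Rgb) == 1, "unexpected alignment for Rgb");
static_assert(offsetof(Rgb, b) == 2, "unexpected member offset in Rgb");
```

A plain array of uint8_t needs no such check at all, since array elements are always contiguous.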

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson <kst-u@mib.org>: Nov 20 10:43PM -0800

> integer type. You're unlikely to encounter such an implementation
> (I've only heard of very old systems and DSPs) -- and I'd be
> astonished if such a system were able to support Qt.
[...]

I'm curious about something. Do you have a realistic expectation that
your code might need to work under an implementation with CHAR_BIT > 8?
Do you have such a target system in mind?

C++ with the assumption that CHAR_BIT==8 is realistically at least 99%
as portable as C++ without that assumption -- and far more portable than
C++ with the assumption that Qt is available.
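
If you do decide to rely on that assumption, it can be written down once so that an exotic platform fails at compile time. A small sketch, assuming C++11; to_big_endian is a hypothetical helper chosen only because it depends on bytes being octets.

```cpp
#include <array>
#include <climits>
#include <cstdint>

static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Hypothetical helper: split a 32-bit value into four octets,
// most significant octet first.
inline std::array<unsigned char, 4> to_big_endian(std::uint32_t v) {
    return {{ static_cast<unsigned char>(v >> 24),
              static_cast<unsigned char>(v >> 16),
              static_cast<unsigned char>(v >> 8),
              static_cast<unsigned char>(v) }};
}
```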

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson <kst-u@mib.org>: Nov 21 11:30AM -0800

David Brown <david.brown@hesbynett.no> writes:
[...]
> uint8_t is always 8 bits. No more, no less.
[...]

Yes. Expanding on that, uint8_t is always 8 bits *if it exists*.

(And in the real world, it almost always exists. You're far more
likely to encounter a very old implementation that doesn't support
<stdint.h> or <cstdint> than to encounter one that does have those
headers but has CHAR_BIT>8.)
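
For code that wants to detect the "if it exists" case rather than simply fail, the UINT8_MAX macro is defined exactly when the uint8_t typedef is. A hedged sketch, assuming a C++11 <cstdint>; the names octet and kHasExactOctet are invented for the example.

```cpp
#include <cstdint>

#if defined(UINT8_MAX)
using octet = std::uint8_t;           // exact-width type is available
constexpr bool kHasExactOctet = true;
#else
using octet = std::uint_least8_t;     // always provided, may be wider than 8 bits
constexpr bool kHasExactOctet = false;
#endif
```

On the fallback path the code has to mask values to 8 bits itself, which is rarely worth the effort outside of DSP work.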

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

[QT creator free, on "nix"], about NULL, nullptr etc

scott@slp53.sl.home (Scott Lurndal): Nov 21 03:44PM

>Note also that even if "E" is an expression with an integer type and a
>value of 0, (T*)E is not guaranteed to produce a null pointer value.
>That guarantee requires an integer literal.

I'm sorry to have inadvertently confused you. The architecture in question
used the 24-bit value 0xeeeeee as the sentinel value that indicated
end of a linked list of entries. That is one purpose of C's null/NULL
pointer. And of course languages other than C also have a concept of
a NULL pointer by whatever name you want to use.

The machine was a BCD machine without fixed operand sizes[*] (pointers
were originally six BCD digits because the machine only supported
one million digits (500kB) of memory). Later, a multi-level segmenting
scheme was added to the architecture which extended pointers to 8 BCD
digits (coincidentally 32 bits). The high-order 8 bits of the pointer
were a sign digit and a base selector digit; the remaining 6 digits
were the offset from the specified base register. Those 6 digits were
set to the value 0xEEEEEE (an invalid BCD address) to indicate end
of list (and for a null pointer). The hardware had instructions to walk
linked lists (SLT instruction), and the hardware reinstate list (the list
of ready-to-run thread/task/process contexts) was also organized as a linked
list and was dispatched using the privileged BRV (Branch Reinstate Virtual) instruction.

[*] The arithmetic instruction operands could be from one to 100 digits
(or bytes with the zone digit ignored) in length.
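
The idea of a non-zero end-of-list marker can be modelled in a few lines. This is a hypothetical model, not the real machine: nodes live in a pool and link to each other by offset, and kEndOfList stands in for the 0xEEEEEE sentinel. All names are invented for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: kEndOfList plays the role of the 0xEEEEEE sentinel.
constexpr std::uint32_t kEndOfList = 0xEEEEEE;

struct Node {
    int value;
    std::uint32_t next;   // offset of the next node in the pool, or kEndOfList
};

// Walk the list starting at `head` and count its entries.
inline std::size_t list_length(const std::vector<Node>& pool, std::uint32_t head) {
    std::size_t n = 0;
    for (std::uint32_t i = head; i != kEndOfList; i = pool[i].next) {
        ++n;
    }
    return n;
}
```

Offset 0 is an ordinary node in this model, which is why zero could not serve as the terminator.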

While we tried a couple times to make a useful C compiler for the architecture,
the lack of any form of bit shifting and/or rotation instructions made it
too inefficient for the architecture, which frankly was designed specifically
for COBOL.

Substitution failure can be an error

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 21 06:02AM +0100

On 20.11.2019 18:46, Manfred wrote:
> Bjarne's book on SFINAE

Uh, which book?

I remember introducing Bjarne to the term in a mail exchange, but it's
apparently not in my GMail so it must have been before July 2006.

- Alf

Manfred <noname@add.invalid>: Nov 21 03:50PM +0100

On 11/21/2019 6:02 AM, Alf P. Steinbach wrote:

> I remember introducing Bjarne to the term in a mail exchange, but it's
> apparently not in my GMail so it must have been before July 2006.

> - Alf

Read: Bjarne's book says on SFINAE ...
The C++ Programming Language (4th Edition), p. 692

However, it is curious that the following still fails:

template< class Int, class UInt = make_unsigned_t<Int> >
struct S
{
    // using UInt = make_unsigned_t<Int>;
    UInt m;

    S( Int x, Int ): m( UInt( x ) ) {}
};

The error is still about "invalid use of incomplete type", which sounds
like a type error to me.

while

template< class Int, typename = typename
enable_if<is_integral<Int>::value>::type >
struct S
{
    using UInt = make_unsigned_t<Int>;
    UInt m;

    S( Int x, Int ): m( UInt( x ) ) {}
};

seems to work fine.
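
For anyone who wants to try the working variant, here is a self-contained sketch with the header and std:: qualification filled in. It uses the C++11 spellings and is one plausible reading of the snippet above, not necessarily the exact code that was compiled.

```cpp
#include <type_traits>

// The unnamed second parameter exists only to reject non-integral Int
// when the template argument list is formed, before the class body is used.
template <class Int,
          typename = typename std::enable_if<std::is_integral<Int>::value>::type>
struct S {
    using UInt = typename std::make_unsigned<Int>::type;
    UInt m;

    S(Int x, Int) : m(UInt(x)) {}
};
```

S<int> s(-1, 0) stores the largest unsigned value in s.m. Note that bool passes is_integral but is not a valid argument for make_unsigned, so S<bool> still fails inside the class body.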

David Brown <david.brown@hesbynett.no>: Nov 21 03:58PM +0100

On 21/11/2019 14:29, Öö Tiib wrote:
>> them.)

> I sometimes try programming languages (for fun puzzle with new tool)
> so I'm 100% sure that all of those (plus also Kotlin) use "type right" syntax.

I too like to try new languages - when I get the time!

> Note that it is actually matter of taste (again) so your mileage may vary
> from strongly disagree to fully agree but the argumentation felt relatively
> solid there.

I'll read it - thanks for the link.

> My own opinion is that C's declarations are simpler for parsers to parse
> but Go/Rust/Swift/Kotlin declarations feel easier for humans to type
> and to read.

Many of the humans here find C's declaration easy to parse from long
habit - it's always difficult to judge such inherently subjective issues.

Improving runtime by preventing a hashtable from growing

Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 24 04:29PM

On Sun, 2019-11-24, Jorgen Grahn wrote:
> min(seq)".

> Anyway: I [now] understand the problem and see why it's fun/interesting to
> solve.

But reading the other postings, I realize I don't understand it after all.

To me a subsequence is obviously ... well, like this:

A sequence [c, d) is a subsequence of [a, b) if it's a valid sequence
and if both c and d are in [a, b].

So the subsequences of ABC are A, B, C, AB, BC, ABC and "".
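
That reading (contiguous ranges only) can be sketched as follows; contiguous_ranges is a hypothetical helper, and the empty range is listed once.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: every contiguous range [c, d) of s, plus one empty range.
inline std::vector<std::string> contiguous_ranges(const std::string& s) {
    std::vector<std::string> out;
    out.emplace_back();                               // the empty range ""
    for (std::size_t c = 0; c < s.size(); ++c)
        for (std::size_t d = c + 1; d <= s.size(); ++d)
            out.push_back(s.substr(c, d - c));        // characters c .. d-1
    return out;
}
```

For "ABC" this produces the seven strings listed above.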

You seem to have a definition of subsequence where order doesn't
matter; your implementation took a vector<int> as input, but you
immediately discarded the ordering, as if the function had taken a
multiset<int>.

Apologies for forking the thread and not discussing your code, but I
believe in clearly stating problems. It almost always pays off.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Paul <pepstein5@gmail.com>: Nov 24 08:55AM -0800

On Sunday, November 24, 2019 at 4:29:28 PM UTC, Jorgen Grahn wrote:
> multiset<int>.

> Apologies for forking the thread and not discussing your code, but I
> believe in clearly stating problems. It almost always pays off.

No apologies needed. I agree that the problem needs to be understood
first. "Subsequence" is more of a pure maths concept than a computer
science concept.
Suppose X is a given sequence.
A subsequence of X means a sequence that can be obtained as follows.
The first term s_0 can be any term of X.
The second term s_1 can be any term of X which occurs later in
X (larger index) than s_0.
The third term s_2 can be any term with a larger index in X than s_1 etc.

The subsequences of ABC are therefore:
""
A
B
C
AB
AC
BC
ABC

This is indeed different to your definition.
You seem to have in mind a subarray rather than a subsequence.
It is correct that, for this particular problem, the result is
always the same for any reordering of the vector.

However, it is not at all true that the subsequences of X are always
the same as the subsequences of a rearrangement of X.

In the ABC example above, AC is a subsequence. However, AC is
not a subsequence of BCA.
It's because we are only interested in the length that we can reorder
if we want.
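
The definition can be checked with a short sketch. Both helpers are hypothetical names written for this illustration; all_subsequences enumerates index selections with a bitmask, so it is only meant for short inputs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if `sub` can be obtained from `seq` by deleting zero or more terms.
inline bool is_subsequence(const std::string& sub, const std::string& seq) {
    std::size_t i = 0;                        // next term of sub still to match
    for (std::size_t j = 0; j < seq.size() && i < sub.size(); ++j)
        if (seq[j] == sub[i]) ++i;
    return i == sub.size();
}

// All 2^n index selections of s, each kept in increasing index order.
inline std::vector<std::string> all_subsequences(const std::string& s) {
    std::vector<std::string> out;
    const std::size_t n = s.size();           // assumed small (well under 64)
    for (std::size_t mask = 0; mask < (std::size_t{1} << n); ++mask) {
        std::string t;
        for (std::size_t k = 0; k < n; ++k)
            if (mask & (std::size_t{1} << k)) t += s[k];
        out.push_back(t);
    }
    return out;
}
```

For "ABC" the enumeration yields eight strings, and is_subsequence("AC", "ABC") holds while is_subsequence("AC", "BCA") does not. With repeated letters the same string can appear more than once, because the selections are by index.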

Paul Epstein

"Öö Tiib" <ootiib@hot.ee>: Nov 24 10:10AM -0800

On Sunday, 24 November 2019 17:52:07 UTC+2, Paul wrote:
> C, after observing B taking ibuprofen pills with water:
> "B said 'I only ever drink red wine', but I've just seen B drinking water.
> B is a liar."

Bonita is just trolling and so is best ignored.

I don't know how what I wrote can be interpreted as "count unique
elements in unordered set in rather nonsensical way". AFAIK
unordered_set::size is already required to return that in O(1) and
it was in no way under discussion.
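
To make that point concrete, a minimal sketch: count_distinct is a hypothetical helper, and the reserve call is one way to keep the table from growing while it is filled.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: number of distinct values in v.
inline std::size_t count_distinct(const std::vector<int>& v) {
    std::unordered_set<int> seen;
    seen.reserve(v.size());             // size the bucket array once, so inserts never rehash
    seen.insert(v.begin(), v.end());    // duplicates are dropped on insertion
    return seen.size();                 // constant time
}
```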

Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 24 06:30PM

On Sun, 2019-11-24, Paul wrote:
> On Sunday, November 24, 2019 at 4:29:28 PM UTC, Jorgen Grahn wrote:
...

> BC
> ABC

> This is indeed different to your definition.

I don't think it's /my/ definition, but the one that most programmers
would take for granted, at least in C++. Although it may be more
common to talk in terms of "range" and "subrange".

> You seem to have in mind a subarray rather than a subsequence.

I don't think I've heard the term "subarray" before.

Anyway, the bottom line is: you needed to define (or avoid) that term.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

"Öö Tiib" <ootiib@hot.ee>: Nov 24 10:41AM -0800

On Sunday, 24 November 2019 20:30:37 UTC+2, Jorgen Grahn wrote:

> > You seem to have in mind a subarray rather than a subsequence.

> I don't think I've heard the term "subarray" before.

> Anyway, the bottom line is: you needed to define (or avoid) that term.

Wikipedia seems to agree with Paul:
https://en.wikipedia.org/wiki/Subsequence
The subsequence should not be confused with substring ⟨A, B, C, D⟩
which can be derived from the above string ⟨A, B, C, D, E, F⟩
by deleting substring ⟨E, F⟩. The substring is a refinement of
the subsequence.

Bonita Montero <Bonita.Montero@gmail.com>: Nov 24 08:12PM +0100

> elements in unordered set in rather nonsensical way". AFAIK
> unordered_set::size already is required to return that in O(1) and
> it was no way under discussion.

Sorry, I've confused set and multiset.
But even after changing the code I can falsify your claim:

#include <iostream>
#include <vector>
#include <unordered_set>
#include <limits>
#include <algorithm>
#include <random>
#include <chrono>

using namespace std;
using namespace chrono;

int main()
{
    size_t const N_ELEMENTS = 10'000'000;
    random_device rd;
    uniform_int_distribution<int> uid( numeric_limits<int>::min(),
                                       numeric_limits<int>::max() );
    vector<int> v;
    unordered_multiset<int> s( N_ELEMENTS ); // pre-size the bucket array
    time_point<high_resolution_clock> start;
    double seconds;
    v.resize( N_ELEMENTS );
    // fill the vector and the multiset with the same random values
    for( int &e : v )
    {
        e = uid( rd );
        s.insert( e );
    }
    // count unique values: equal elements are adjacent in iteration order
    start = high_resolution_clock::now();
    size_t uniq = 0;
    for( unordered_multiset<int>::iterator it = s.begin(); it != s.end(); )
    {
        ++uniq;
        int e = *it++;
        for( ; it != s.end() && *it == e; ++it );
    }
    seconds = duration_cast<nanoseconds>( high_resolution_clock::now() -
                                          start ).count() / 1.0E9;
    cout << "unique elements: " << uniq << endl;
    cout << "time to count: " << seconds << "s" << endl;
    // time sorting the same values for comparison
    start = high_resolution_clock::now();
    sort( v.begin(), v.end() );
    seconds = duration_cast<nanoseconds>( high_resolution_clock::now() -
                                          start ).count() / 1.0E9;
    cout << "time to sort: " << seconds << "s" << endl;
}

The difference between sorting and counting remains almost the same.
I think the multiset doesn't really maintain linked buckets but just
has a counter for duplicate values on each bucket. So that's a good
explanation why the runtime of the modified code is so close to the
old code.

Bonita Montero <Bonita.Montero@gmail.com>: Nov 24 08:25PM +0100

Sorry, I've drunk half a bottle of cherry liqueur.
Better to write again tomorrow.


Sunday, November 24, 2019

Digest for comp.lang.c++@googlegroups.com - 25 updates in 4 topics
