Tuesday, August 10, 2021

Digest for comp.lang.c++@googlegroups.com - 25 updates in 1 topic

Richard Damon <Richard@Damon-Family.org>: Aug 09 08:57PM -0400

On 8/9/21 4:50 PM, Vir Campestris wrote:
> bit system with more than 2GB RAM, and don't expect the 64 bit limit to
> be a problem any time soon.
 
> Andy
 
Since for 16 bit size_t ptrdiff_t has a requirement to be at least 17
bits, I think the committee decided that it was unlikely for an
application with a 32-bit address space to have over half of its memory
dedicated to a single char array (the only real case where you can have
this issue).
 
Machines with 16 bit size_t are going to need to be segmented systems,
so having over 32k in an array is plausible, so ptrdiff_t needs to be big
enough for that.
 
From what I remember of minimum limits, a machine with only 64k of
address space can't really be fully conforming, so if that machine makes
ptrdiff_t be only 16 bits to save space, that isn't the only non-conformity.
 
Maybe if 32-bit environments that actually used segments to increase
memory space, as was done in the 16-bit segmented world, were common,
then the standard might have required a 33-bit ptrdiff_t for those systems.
scott@slp53.sl.home (Scott Lurndal): Aug 10 01:12AM

>application with a 32-bit address space to have over half of its memory
>dedicated to a single char array (the only real case where you can have
>this issue).
 
In all the years that I've been using ptrdiff_t, every single case
has involved subtracting a base pointer from an element pointer;
the result is always positive and always within positive value range
of ptrdiff_t.
 
And given the split virtual address space of most modern operating
environments, the maximum unsigned difference for a user application will
generally be representable in 31 bits anyway absent buggy code.
 
It's not like programmers are willy-nilly subtracting random
pointers and expecting a meaningful result.
Richard Damon <Richard@Damon-Family.org>: Aug 09 09:21PM -0400

On 8/9/21 9:12 PM, Scott Lurndal wrote:
> generally be representable in 31 bits anyway absent buggy code.
 
> It's not like programmers are willy-nilly subtracting random
> pointers and expecting a meaningful result.
 
I have had cases where I subtract pointers to elements in the array
where the pointers might be in the reversed order, and thus I get a
negative difference.
 
I will admit, that it is less common than the case you describe, but it
does happen.
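
As an illustration of the reversed-order case, here is a minimal C++
sketch. The helper name signed_distance is hypothetical, and it assumes
both pointers refer to elements of the same array, which is the only
situation in which the subtraction is defined.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: signed element distance from `from` to `to`.
// Precondition: both point into (or one past the end of) the same array;
// otherwise the subtraction has undefined behaviour.
inline std::ptrdiff_t signed_distance(const int* from, const int* to) {
    return to - from;
}

// Usage: the sign of the result follows the order of the operands.
inline void signed_distance_example() {
    int a[8] = {};
    assert(signed_distance(a + 2, a + 6) == 4);   // pointers in forward order
    assert(signed_distance(a + 6, a + 2) == -4);  // pointers in reversed order
}
```

This is why ptrdiff_t has to be a signed type, and why it needs one more
bit than the largest element count it is expected to cover.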
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 09 11:20PM -0400

>> have any idea how bad the average CS major's education might have been.
 
> Ooo look at you, supercilious and patronising all in one go. Well done, have
> a scooby snack.
 
I'm sorry - people who confess to being unfamiliar with fairly basic
aspects of C tend to produce feelings of superiority in me.
 
>> pointers, and know enough about the calculation to at least roughly
 
> Any parser beyond the most basic needs to do pointer arithmetic and I've
> written a LOT of them.
 
Unless your parsers were successfully ported to the kinds of platforms I
was talking about, that's not particularly relevant to the point I was
making.
 
>> used as ptrdiff_t.
 
> I should have clarified in my original post that I was referring to programming
> in grown up OS's on grown up CPUs. Not on DOS with 1970s x86 segmentation.
 
As I said, I'm not personally familiar with such platforms, but the
impression I get is that they tend to be small embedded CPUs - which
would explain my total lack of familiarity with them. Embedded
programming is a large and rapidly growing part of the C/C++ programming
world, but one that has never played any part in any job I've ever held.
 
 
>> Yes, but when these issues come into play, the relevant mathematical
>> operations cannot be done on pointer types. The result of a pointer
 
> They can in *nix which is good enough for me.
 
Yes, but the intended scope of the C++ standard is considerably broader
than what's good enough for you.
Juha Nieminen <nospam@thanks.invalid>: Aug 10 05:22AM

> `<` order. That implies that internally these functions compute some
> absolute representation of pointers, mapping segment selectors to lower
> level addresses as necessary.
 
Since pointers, like any object, by necessity have a bit representation,
they can be compared and strictly ordered. However, that doesn't mean
that their difference is meaningful (or that ptrdiff_t needs to be
unambiguous for every single pair of pointers you subtract from
each other).
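
A minimal C++ sketch of the distinction between ordering and
subtracting. In C++ the built-in < on pointers to unrelated objects
gives an unspecified result, while std::less for pointer types is
guaranteed to yield a strict total order. The helper name
order_pointers is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: orders pointers to possibly unrelated objects.
// std::less<T*> is guaranteed to give a strict total order even where
// the built-in < would be unspecified.
inline std::vector<const int*> order_pointers(std::vector<const int*> ptrs) {
    std::sort(ptrs.begin(), ptrs.end(), std::less<const int*>());
    return ptrs;
}

inline void order_pointers_example() {
    int x = 0, y = 0, z = 0;  // three separate objects, not one array
    std::vector<const int*> sorted = order_pointers({&z, &x, &y});
    assert(std::is_sorted(sorted.begin(), sorted.end(),
                          std::less<const int*>()));
    // Subtracting sorted[1] - sorted[0] would still be undefined here:
    // a usable ordering does not imply a meaningful difference.
}
```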
Juha Nieminen <nospam@thanks.invalid>: Aug 10 05:29AM


> You're probably not aware then that the CS DS ES SS registers from that
> 1970s x86 are still alive and well under the hood, even though they
> often point to the same address space, and the offsets are 32 or 64 bit.
 
I might be completely wrong on this, but I remember reading somewhere that
thread-local variables are often implemented (in x86 systems) by using
one of the segment registers, and having it different for each thread.
This way all the code can address these thread-local variables with the
exact same memory address, but they will still be pointing to different
variables.
 
If that's the case then it means that the segment registers are still
actually useful.
MrSpud_pb9bpp7Ej@6urvt16b9fax.gov.uk: Aug 10 07:23AM

On Mon, 9 Aug 2021 21:50:20 +0100
>> in grown up OS's on grown up CPUs. Not on DOS with 1970s x86 segmentation.
 
>You're probably not aware then that the CS DS ES SS registers from that
>1970s x86 are still alive and well under the hood, even though they
 
And invisible in 32 bit protected mode and not used at all in 64 bit so
irrelevant to this discussion.
MrSpud_m7k7yilq_8@u6w.biz: Aug 10 07:26AM

On Mon, 9 Aug 2021 23:20:27 -0400
>> a scooby snack.
 
>I'm sorry - people who confess to being unfamiliar with fairly basic
>aspects of C tend to produce feelings of superiority in me.
 
In 25 years I've never ever seen that type used so spare me your BS. The only
one with misplaced feelings of superiority is you.
 
 
>Unless your parsers were successfully ported to the kinds of platforms I
>was talking about, that's not particularly relevant to the point I was
>making.
 
They've been used on x86 and ARM on various OS's.
 
>would explain my total lack of familiarity with them. Embedded
>programming is a large and rapidly growing part of the C/C++ programming
>world, but one that has never played any part in any job I've ever held.
 
Unless you're using some prehistoric 16 bit (or less) PIC then all pointers in
embedded C will be 32 bit linear.
MrSpud_12tus_Atff@bya886olr5o8dhu9.ac.uk: Aug 10 07:28AM

On Tue, 10 Aug 2021 05:29:15 -0000 (UTC)
>This way all the code can address these thread-local variables with the
>exact same memory address, but they will still be pointing to different
>variables.
 
I don't see how that'll work in 32 or particularly 64 bit.
Juha Nieminen <nospam@thanks.invalid>: Aug 10 07:51AM

> In 25 years I've never ever seen that type used so spare me your BS. The only
> ones with misplaced feelings of superiority is you.
 
I think that you know perfectly well that with that kind of hostile language
and attitude you are not going to persuade anybody, nor are you going to get
much support, neither from the person you are talking to, nor pretty much
anybody else. Thus, I think you know perfectly well that by using that kind
of language and attitude, you are making a pariah of yourself here.
 
You could express your statements in a neutral way, but instead you
willingly choose to denigrate people and be very antagonistic.
 
So I have to wonder why. What psychological issue makes you want to become
a hated pariah? Why do you willingly antagonize people? Why do you want
them to find you disgusting and unlikeable? Is it to get some kind of
sense of being a victim, a martyr?
 
Perhaps some self-reflection could do you some good. In the long run,
being nicer will make also you yourself happier.
MrSpud_ggBp1lq0@lxz.tv: Aug 10 07:59AM

On Tue, 10 Aug 2021 07:51:58 -0000 (UTC)
>much support, neither from the person you are talking to, nor pretty much
>anybody else. Thus, I think you know perfectly well that by using that kind
>of language and attitude, you are making a pariah of yourself here.
 
Ah, a nice bit of early morning irony to go with my coffee :)
 
>You could express your statements in a neutral way, but instead you
>willingly choose to denigrate people and be very antagonistic.
 
I suggest you read what he wrote. I'm simply replying in kind.
 
>a hated pariah? Why do you willingly antagonize people? Why do you want
>them to find you disgusting and unlikeable? Is it to get some kind of
>sense of being a victim, a martyr?
 
If you wish to try out your cod psychology I would suggest you get at least
a vague clue first.
 
>Perhaps some self-reflection could do you some good. In the long run,
>being nicer will make also you yourself happier.
 
Aww, bless you :)
David Brown <david.brown@hesbynett.no>: Aug 10 11:20AM +0200

>> world, but one that has never played any part in any job I've ever held.
 
> Unless you're using some prehistoric 16 bit (or less) PIC then all pointers in
> embedded C will be 32 bit linear.
 
That makes it clear that you are so ignorant about the world outside of
*nix that you have no idea how ignorant you are. If all your
programming world is within the specific segment of *nix systems, that's
fine - lucky you, some might say. But please understand there is a
world outside of that, where C and C++ are heavily used but many of the
assumptions you make do not hold.
 
You'd do well to learn from James here - he knows little about the world
of small-system embedded programming, but he /knows/ he knows little
about it - he knows it is important, and knows it can be different from
the systems he usually works with, and knows it is one of the reasons
for some of the flexibilities in the C and C++ standards.
Juha Nieminen <nospam@thanks.invalid>: Aug 10 12:24PM


>>Perhaps some self-reflection could do you some good. In the long run,
>>being nicer will make also you yourself happier.
 
> Aww, bless you :)
 
It's not surprising that you would struggle against this kind of advice,
but perhaps some time in the next years you will think about it more
seriously.
 
Being nice and polite to people is genuinely more rewarding and gives
yourself more happiness in the long run than being rude, aggressive and
confrontational, which will just make you miserable in the long run.
Mockery might give you immediate satisfaction, but in the long run it's
just going to destroy your own happiness. You might not believe it now,
but you will believe it eventually.
 
Just think about it. It's never too late to learn and change.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Aug 10 06:11AM -0700

Vir Campestris <vir.campestris@invalid.invalid> writes:
 
[..stuff about 64 bit systems removed..]
 
> :(
 
> As it happens it's never been a practical problem as I've never had a
> 32 bit system with more than 2GB RAM,
 
It isn't necessary for there to be more than 2GB of RAM for
problems with ptrdiff_t to manifest (in 32-bit linux). A call to
malloc() will gladly return a memory area larger than all of RAM
if there is swap space to hold it.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Aug 10 06:41AM -0700


> On Mon, 9 Aug 2021 12:16:46 -0400
> James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
 
>> On 8/9/21 4:19 AM, MrSpud_ifhov@nldls6_1kg3nl2qnwv.biz wrote:
 
[...]
 
>> mathematical operations cannot be done on pointer types.
>> The result of a pointer
 
> They can in *nix which is good enough for me.
 
A few comments...
 
One, in many cases C pointers are represented internally as what
are basically integers, but the C standard says pointer types are
distinct from integer types, and the rules for operations on
pointer types, in particular subtraction of pointer values, are
specified in terms of pointers and arrays and not in terms of
integer values. The rules for pointer subtraction depend on the
range of the implementation-chosen type ptrdiff_t.
 
Two, the specific values used for things like ptrdiff_t are
determined by the particular C implementation being used, not the
operating system. The target OS may influence some choices made
by the implementation, but it is still the implementation's
choice whether to observe those influences.
 
Three, I can tell you from first-hand experience that problems
related to the range of ptrdiff_t can and do occur in ordinary C
code running on a 32-bit linux system, using gcc to compile.
Bo Persson <bo@bo-persson.se>: Aug 10 04:13PM +0200

On 2021-08-10 at 15:11, Tim Rentsch wrote:
> problems with ptrdiff_t to manifest (in 32-bit linux). A call to
> malloc() will gladly return a memory area larger than all of RAM
> if there is swap space to hold it.
 
There still has to be a contiguous address space that is free of already
allocated memory blocks and not in a range reserved by the operating
system.
 
I'm no Linux expert, but on Windows configured for LargeAddressAware
programs, with 3 GB user + 1 GB OS, it is extremely unlikely to find a
2.x GB hole for the heap. Or that you have a program that needs exactly
1 such block for byte sized operations.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Aug 10 07:13AM -0700


> On 8/9/21 4:50 PM, Vir Campestris wrote:
 
[...]
 
> application with a 32-bit address space to have over half of its memory
> dedicated to a single char array (the only real case where you can have
> this issue).
 
A few weeks ago I happened to write a small C program that does
part of its work in a single large dynamically allocated memory
area. In that memory area there are character arrays, character
pointers, and various offset values (unsigned integers). The
program runs just fine on a 64-bit linux system (in one case the
memory area allocated was about 180 GB).
 
Prompted by this discussion, I took the program and tried
compiling and running it on a 32-bit linux system. The memory
area allocated was a little over 2 GB, and the program fell down
miserably. Limiting the size of the memory area allocated to
PTRDIFF_MAX, with no other changes, got it working again. So
problems due to the limited range of ptrdiff_t definitely can
occur in practical programs.
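
The workaround described above, capping the allocation at PTRDIFF_MAX,
might be sketched like this in C++. This is not the program in
question: fits_ptrdiff and checked_malloc are hypothetical names, and
the sketch only assumes that the block is addressed as bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: true if any two byte pointers into a block of
// n_bytes can be subtracted without overflowing ptrdiff_t.
constexpr bool fits_ptrdiff(std::size_t n_bytes) {
    return static_cast<std::uintmax_t>(n_bytes) <=
           static_cast<std::uintmax_t>(PTRDIFF_MAX);
}

// Hypothetical wrapper: refuses requests larger than PTRDIFF_MAX bytes
// instead of handing back a block that pointer subtraction cannot span.
inline void* checked_malloc(std::size_t n_bytes) {
    return fits_ptrdiff(n_bytes) ? std::malloc(n_bytes) : nullptr;
}

inline void checked_malloc_example() {
    void* p = checked_malloc(64);
    std::free(p);  // free(nullptr) is harmless if the allocation failed
    // On mainstream platforms SIZE_MAX exceeds PTRDIFF_MAX, so this
    // request is rejected before malloc is ever called.
    assert(checked_malloc(SIZE_MAX) == nullptr);
}
```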
 
> [...]
 
> From what I remember of minimum limits, a machine with only 64k
> of address space can't really be fully conforming, [...]
 
The rule for being able to have a 64 KB object applies only to
hosted implementations. It's easy to make a fully conforming
freestanding implementation for a machine with limited address
space, even one much less than 64 KB.
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 10 10:27AM -0400

> On Mon, 9 Aug 2021 23:20:27 -0400
> James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
>> On 8/9/21 12:23 PM, MrSpud_HG@_0b772d8ha3yjo0xb.edu wrote:
...
>> was talking about, that's not particularly relevant to the point I was
>> making.
 
> They've been used on x86 and ARM on various OS's.
 
So, not the kinds of platforms I was talking about.
Keep in mind that you'd only run into trouble doing pointer arithmetic
on pointers into arrays with more than PTRDIFF_MAX elements. Did your
parsers ever need to parse something that big? PTRDIFF_MAX is required
to be at least 65535, but it's the actual value on the platform you're
compiling for that matters.
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 10 10:27AM -0400

On 8/10/21 5:20 AM, David Brown wrote:
> about it - he knows it is important, and knows it can be different from
> the systems he usually works with, and knows it is one of the reasons
> for some of the flexibilities in the C and C++ standards.
 
I was hoping someone with more relevant experience would respond.
However, I was, in particular, hoping someone would respond with an
example. Do you know of any particular modern system, preferably as
widely used as possible, where ptrdiff_t was not big enough to store all
possible pointer differences, or where [u]intptr_t is not supported?
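
For reference, whether [u]intptr_t is supported can be detected at
compile time, because the UINTPTR_MAX macro is defined only when the
optional type exists. A minimal C++ sketch, with has_uintptr and
round_trips as hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

#ifdef UINTPTR_MAX
// The optional type is available on this implementation.
constexpr bool has_uintptr = true;

// A void pointer converted to uintptr_t and back compares equal
// to the original pointer.
inline bool round_trips(const void* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<const void*>(bits) == p;
}
#else
// No integer type is promised to hold a pointer on this implementation.
constexpr bool has_uintptr = false;
inline bool round_trips(const void*) { return false; }
#endif

inline void uintptr_example() {
    int x = 0;
    assert(!has_uintptr || round_trips(&x));
}
```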
Mike Terry <news.dead.person.stones@darjeeling.plus.com>: Aug 10 03:34PM +0100

>> exact same memory address, but they will still be pointing to different
>> variables.
 
> I don't see how that'll work in 32 or particularly 64 bit.
 
One of the segment registers is configured by Windows to address the
thread environment block (TEB) which is thread specific. I'm pretty
sure that it was FS register on 32-bit windows, but maybe that's changed
for 64-bit. When the scheduler context-switches between threads it saves
and restores the segment registers, so the TEB will always be correctly
addressed as threads switch.
 
For example, for 32-bit Windows, you may notice as part of the function
prolog code, the pointer at FS:0 is updated, then restored on return -
that's the structured exception handling frame pointer, which is
per-thread as you'd expect. (Similarly, the last error stored by many
APIs and retrieved through Get/SetLastError() is somewhere in the TEB.)
 
 
Regards,
Mike.
scott@slp53.sl.home (Scott Lurndal): Aug 10 02:54PM

>> level addresses as necessary.
 
>Since pointers, like any object, by necessity have a bit representation,
>they can be compared and strictly ordered.
 
I would argue against the latter part of your
statement by referring to various extant
and future architectures where your statement is not true, some of
which even have C compilers.
 
I'm familiar with one extant architecture (Clearpath) where pointers
are not as you describe, and one potential future architecture
(which is currently under NDA, but similar to the CHERI research
project) where the pointers are not simple
offsets from the start of memory.
scott@slp53.sl.home (Scott Lurndal): Aug 10 02:57PM


>I might be completely wrong on this, but I remember reading somewhere that
>thread-local variables are often implemented (in x86 systems) by using
>one of the segment registers, and having it different for each thread.
 
That is the case for linux on x86: %fs is used for the user-mode
thread-local storage base and %gs is used for the kernel-mode
per-cpu storage.
 
 
 
>If that's the case then it means that the segment registers are still
>actually useful.
 
The modern architectures (e.g. x86_64) treat them as general purpose
registers for all intents and purposes - the descriptor tables (gdt/ldt)
don't come into play (see, for example, the SWAPGS instruction).
scott@slp53.sl.home (Scott Lurndal): Aug 10 02:59PM

>>exact same memory address, but they will still be pointing to different
>>variables.
 
>I don't see how that'll work in 32 or particularly 64 bit.
 
If you download a copy of the Intel (or AMD) processor manual set,
you'll find sufficient information within to educate you on this
particular topic. It works and is widely used.
 
Or, in the usenet vernacular, RTFM.
scott@slp53.sl.home (Scott Lurndal): Aug 10 03:02PM

>>1970s x86 are still alive and well under the hood, even though they
 
>And invisible in 32 bit protected mode and not used at all in 64 bit so
>irrelevant to this discussion.
 
Funny, I recall a processor feature addition from 2005 which
honored the DS limit register in long mode. Added specifically
to support XEN (recall that AMD added long mode and Intel adopted it
later) before SVM was introduced to the Opterons.
scott@slp53.sl.home (Scott Lurndal): Aug 10 03:03PM

>problems with ptrdiff_t to manifest (in 32-bit linux). A call to
>malloc() will gladly return a memory area larger than all of RAM
>if there is swap space to hold it.
 
Do recall the split address space in 32-bit intel/amd systems, which
by default, limit the application to 2GB of virtual address space.
 
Malloc can't return more than the available user-mode VA space
allows.