Monday, October 25, 2021

Digest for comp.lang.c++@googlegroups.com - 22 updates in 1 topic

Bart <bc@freeuk.com>: Oct 25 12:58AM +0100

On 25/10/2021 00:18, Öö Tiib wrote:
>> syntax like 'let' and 'in'), I just know how I'd like it to work.
 
> Can it be that you haven't implemented it because it is what you would
> like to want but do not always want?
 
Partly because I've classed it as low priority; this would not allow me
to do anything new, just apply extra restrictions!
 
However it is something interesting to explore.
 
 
> In most software I have seen pointers in object often point at other
> objects that are not logically components of said object. So the
> pointers often do not go to "down deep" but entirely elsewhere.
 
Determining the boundaries of a data structure, beyond which
write-protection shouldn't apply or can't be applied, would be one of
the problems to look at.
Juha Nieminen <nospam@thanks.invalid>: Oct 25 04:51AM

> I say this in case it is used to put forward the incorrect notion that
> "const" means "thread safe", which I have occasionally seen propagated
> by the ill-informed.
 
"const means thread-safe" is not said in the context of const references,
but in the context of const member functions, which is a completely
different thing.
 
(And, in this case, the idea is "const member functions *should be*
re-entrant", rather than "const member functions are thread-safe".)
 
And when I said "const can make the program more efficient" I was
referring to compile-time literals. Especially ones in a const
array. (When the compiler sees the definition of a const array
full of compile-time literals, it can assume that the contents
of the array will never change, and can start taking values from
it at compile time if it's able to. It doesn't need to assume
that the values may change.)
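
A minimal sketch of the const-array point above, in C. The table name and the function are made up for illustration, and whether the folding actually happens depends on the compiler and its optimization settings:

```c
/* Hypothetical lookup table: every element is a compile-time literal. */
static const int days_in_month[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

/* Because the array is const and its definition is visible, a compiler
   is allowed (not required) to reduce this whole function to a constant
   instead of loading the three values from memory at run time. */
int days_in_first_quarter(void)
{
    return days_in_month[0] + days_in_month[1] + days_in_month[2];
}
```

Without the const, the compiler would generally have to allow for some other code having changed the array before the call.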
Juha Nieminen <nospam@thanks.invalid>: Oct 25 05:00AM

> Neither does the syntax make it that obvious which bit of the type is
> referred to, as in:
 
> const int * const * x;
 
Actually the syntax *does* make it obvious. You are just reading the type
declaration in the wrong direction. Pointer variable declarations should
be read from right to left (this is a simple but non-obvious trick that
surprisingly few programmers know.) In your example, when we read the
declaration from right to left, it becomes:
 
"x is a pointer to a const pointer that points to an int that's const".
 
Or, if you want to be a bit clearer:
 
"x is a pointer to a (const pointer) that points to an int, the int
itself being const".
 
(In other words, x itself is not const and can be modified, but it
points to a const pointer, ie. *x cannot be modified, and this
const pointer is pointing to a const int, ie. **x cannot be modified
either.)
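
The right-to-left reading can be checked with a small C sketch. All the names here are hypothetical, and the commented-out lines are the ones a conforming compiler should reject:

```c
static const int first_value = 10;
static const int second_value = 20;

/* Const pointers to const ints: neither the pointers nor the ints may change. */
static const int *const first_ptr = &first_value;
static const int *const second_ptr = &second_value;

int reseat_and_read(void)
{
    /* x is a pointer to a (const pointer) to a const int */
    const int *const *x = &first_ptr;

    x = &second_ptr;        /* fine: x itself is not const */
    /* *x = &first_value;      rejected: *x is a const pointer */
    /* **x = 30;               rejected: **x is a const int */

    return **x;             /* reads second_value */
}
```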
RacingRabbit@watershipdown.co.uk: Oct 25 08:21AM

On Sat, 23 Oct 2021 18:45:22 +0200
>typedef constant_pointer_to_constant_integer *
>pointer_to_constant_pointer_to_constant_integer;
 
>pointer_to_constant_pointer_to_constant_integer x;
 
Consts in C are pointless because it doesn't have references and it's rather
difficult to "accidentally" dereference a pointer to update the value it's
pointing to.
Juha Nieminen <nospam@thanks.invalid>: Oct 25 09:47AM

> Consts in C are pointless because it doesn't have references and it's rather
> difficult to "accidentally" dereference a pointer to update the value it's
> pointing to.
 
Actually it isn't. Very old-style C code, mostly prior to the C89 standard,
but you can see even modern examples sometimes (for some old-school C coders
habits die hard), often used non-const pointers to char as "strings".
In fact, I think even the famous K&R book has examples with non-const char*'s
being initialized to point to string literals.
 
The problem with this is that it can be too easy to accidentally try to
modify the contents of the "string" through that pointer. If you are
accustomed to never using 'const' when dealing with char*'s, you'll
probably pay little attention to the fact that some function somewhere
is taking a non-const char* as parameter, and you might at some point
call it with a pointer that's pointing to a string literal. If said
function does modify the "string" it's getting as parameter, that's UB.
 
Most modern C compilers will give you a warning if you try to assign
a const char* (eg. a string literal) to a non-const char* (or give
one to a function taking a non-const char*), but if you were
determined to never use 'const' and turn off such warnings, such
mistakes are not extraordinarily unlikely.
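
To make the scenario concrete, here is a rough C sketch with made-up function names. The point is only that the parameter types say which function writes to its argument, so a pointer declared as const char * cannot be handed to the writing one without a diagnostic:

```c
#include <ctype.h>

/* Non-const parameter: the signature announces that the buffer is written to. */
void uppercase_in_place(char *text)
{
    for (; *text != '\0'; ++text)
        *text = (char)toupper((unsigned char)*text);
}

/* Pointer-to-const parameter: the function promises only to read. */
int count_spaces(const char *text)
{
    int count = 0;
    for (; *text != '\0'; ++text)
        if (*text == ' ')
            ++count;
    return count;
}

char first_letter_after_uppercase(void)
{
    const char *label = "hello world";  /* points at a string literal */
    char buffer[] = "hello world";      /* modifiable copy of the same text */
    int spaces = count_spaces(label);   /* reading through label is fine */

    /* uppercase_in_place(label);   diagnosed: would discard the const qualifier */
    uppercase_in_place(buffer);     /* fine: buffer is an ordinary char array */

    return spaces == 1 ? buffer[0] : '?';
}
```

Had label been declared as a plain char *, the commented-out call would compile silently and then modify a string literal, which is the undefined behavior described above.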
Bo Persson <bo@bo-persson.se>: Oct 25 12:33PM +0200

On 2021-10-25 at 11:47, Juha Nieminen wrote:
> habits die hard), often used non-const pointers to char as "strings".
> In fact, I think even the famous K&R book has examples with non-const char*'s
> being initialized to point to string literals.
 
In defense of K&R. :-)
 
They didn't have const in original C. It was Bjarne who first added it
to C++, and only later did C also adopt the keyword.
Bart <bc@freeuk.com>: Oct 25 12:13PM +0100

On 25/10/2021 06:00, Juha Nieminen wrote:
> points to a const pointer, ie. *x cannot be modified, and this
> const pointer is pointing to a const int, ie. **x cannot be modified
> either.)
 
If only it was that simple to read declarations!
 
Ones such as int** can work by going from right to left, but in general
it is inside out.
 
I noticed you deftly bypassed the fact that 'const' for 'int' can be
written either side of 'int', or both!
 
At least this example helps highlight which of those ** comes first.
RacingRabbit@watershipdown.co.uk: Oct 25 02:19PM

On Mon, 25 Oct 2021 09:47:50 -0000 (UTC)
 
>Actually it isn't. Very old-style C code, mostly prior to the C89 standard,
>but you can see even modern examples sometimes (for some old-school C coders
>habits die hard), often used non-const pointers to char as "strings".
 
And? How do you accidentally write *str or str[0] for example?
 
>In fact, I think even the famous K&R book has examples with non-const char*'s
>being initialized to point to string literals.
 
The concept of const didn't exist in K&R C so what would be your alternative?
 
>is taking a non-const char* as parameter, and you might at some point
>call it with a pointer that's pointing to a string literal. If said
>function does modify the "string" it's getting as parameter, that's UB.
 
No idea what UB means, but what'll happen is it'll crash immediately so you'll
soon find out.
James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 25 10:30AM -0400

On 10/25/21 4:21 AM, RacingRabbit@watershipdown.co.uk wrote:
...
> Consts in C are pointless because it doesn't have references and it's rather
> difficult to "accidentally" dereference a pointer to update the value it's
> pointing to.
 
Actually, it isn't. All it takes is unfamiliarity with the functions
you're using. I remember, in particular, I've seen messages from several
people expressing surprise that strtok() writes to the string that you
pass it as its first argument. If the pointers they had tried to pass
to strtok() had been const char * rather than char*, they would have
been reminded of the problem. Of course, they might not have understood
the reminder, if they weren't familiar with functions whose declarations
use "const" appropriately. All of the standard library functions do so;
many other libraries don't.
 
However, that's only a part of the problem that "const" is intended to
help avoid. The other part is intentionally dereferencing a pointer to
update the value it's pointing at, due to being unaware of the fact
that what it's pointing at is something that shouldn't be written to. In
code which doesn't make proper use of "const", that's a fairly common
mistake, at least in my experience (which is admittedly limited, since
my own code does make proper use of "const").
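
As an illustration of the strtok() case, a small C sketch follows. The two function names and the 64-byte buffer size are arbitrary choices for the example, not anything from the standard library:

```c
#include <string.h>

/* Counts comma-separated fields. strtok() overwrites each delimiter it
   finds with '\0', so the parameter is deliberately a non-const char *. */
int count_fields(char *text)
{
    int count = 0;
    for (char *field = strtok(text, ","); field != NULL; field = strtok(NULL, ","))
        ++count;
    return count;
}

/* Wrapper that leaves the caller's string untouched by tokenizing a copy.
   The 64-byte limit is an arbitrary choice for this sketch. */
int count_fields_readonly(const char *text)
{
    char copy[64];
    if (strlen(text) >= sizeof copy)
        return -1;              /* too long for the local buffer */
    strcpy(copy, text);
    return count_fields(copy);
}
```

Passing a const char * directly to count_fields() draws a diagnostic, which is exactly the reminder described above; the wrapper is one way to act on it.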
James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 25 10:39AM -0400

On 10/25/21 5:47 AM, Juha Nieminen wrote:
...
> Most modern C compilers will give you a warning if you try to assign
> a const char* (eg. a string literal) to a non-const char* (or give
> one to a function taking a non-const char*),
 
They must do so on assignment; 6.5.16p2 occurs in a "Constraints" section:
"An assignment operator shall have a modifiable lvalue as its left operand."
 
And if a function prototype is in scope "... the arguments are
implicitly converted, as if by assignment, to the types of the
corresponding parameters ..." (6.5.2.2p7), so the same constraints apply
there, too. For the same reason, they also apply to return statements.
RacingRabbit@watershipdown.co.uk: Oct 25 02:39PM

On Mon, 25 Oct 2021 10:30:04 -0400
>you're using. I remember, in particular, I've seen messages from several
>people expressing surprise that strtok() writes to the string that you
>pass it as its first argument. If the pointers they had tried to pass
 
Those are the sorts of people who should stick to python or javascript.
 
>help avoid. The other part is intentionally dereferencing a pointer to
>update the value it's pointing at, due to being unaware of the fact
>that what it's pointing at is something that shouldn't be written to. In
 
They'll soon find out if they try to write to it.
Manfred <noname@add.invalid>: Oct 25 04:58PM +0200

On 10/25/2021 7:00 AM, Juha Nieminen wrote:
> points to a const pointer, ie. *x cannot be modified, and this
> const pointer is pointing to a const int, ie. **x cannot be modified
> either.)
 
There's still the point that the C standard describes "type qualifiers"
both in the context of "declaration-specifiers" and "declarators", and,
in the first case, it says that "type specifiers" (e.g. 'int') and "type
qualifiers" (like 'const') may appear "in any order".
This flexibility is handy in simple declarations, but may be seen as
less consistent in case of multiple levels of indirection.
In the case of pointer "declarators", on the other hand, "type
qualifiers", if any, always occur /after/ their respective '*'.
 
All of this makes sense, after you pay the necessary attention, and it
allows one to specify the desired qualifiers for each level of indirection,
which is a valuable feature. I'd say this is one of the cases where
flexibility comes at a price, which, in this case, is worth its value.
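
The "any order" rule for specifiers versus the "after the '*'" rule for declarators can be checked with a C11 _Generic selection. This is only a sketch; the macro and function names are invented for the example:

```c
/* Evaluates to 1 when expr has type "pointer to const int", else 0 (C11). */
#define IS_PTR_TO_CONST_INT(expr) _Generic((expr), const int *: 1, default: 0)

/* const before int in the declaration specifiers */
int const_before_int(void)
{
    const int *p = 0;
    return IS_PTR_TO_CONST_INT(p);
}

/* const after int: same specifiers in a different order, same type */
int const_after_int(void)
{
    int const *p = 0;
    return IS_PTR_TO_CONST_INT(p);
}

/* const after the '*': this qualifies the pointer, not the int */
int const_after_star(void)
{
    int *const p = 0;
    return IS_PTR_TO_CONST_INT(p);
}
```

The first two functions select the same association, while the third falls through to the default because its const belongs to the pointer level.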
James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 25 10:59AM -0400

>> but you can see even modern examples sometimes (for some old-school C coders
>> habits die hard), often used non-const pointers to char as "strings".
 
> And? How do you accidentally write *str or str[0] for example?
 
It's not the *str that's accidental, it's the call to a function that
contains *str, with an argument that points to a string that shouldn't
be written to. That is in fact a fairly easy mistake to make, and when
people didn't use "const" properly, it's actually a fairly common one.
 
...
>> In fact, I think even the famous K&R book has examples with non-const char*'s
>> being initialized to point to string literals.
 
> The concept of const didn't exist in K&R C so what would be your alternative?
 
There was no alternative, which is why that was the case. After "const"
was added to the language, K&R 2nd edition was updated accordingly.
 
...
>> function does modify the "string" it's getting as parameter, that's UB.
 
> No idea what UB means, but what'll happen is it'll crash immediately so you'll
> soon find out.
 
UB means "Undefined Behavior", a technical term from the C standard
which does NOT mean "behavior for which there is no definition". It
means "behavior, upon use of a nonportable or erroneous program
construct or of erroneous data, for which this document imposes no
requirements" (3.4.3). Note that "this document" refers to the C
standard; other documents (such as compiler documentation or ABI
standards) might define the behavior, without changing the fact that it
qualifies as "undefined behavior" as far as the C standard is concerned.
 
A lot of people have trouble understanding how breath-takingly wide the
scope of "imposes no requirements" is. The standard tries to make that
clear with the following examples: "Possible undefined behavior ranges
from ignoring the situation completely with unpredictable results, to
behaving during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of a
diagnostic message), to terminating a translation or execution (with the
issuance of a diagnostic message)."
 
Note, in particular, that the most insidious form of undefined behavior
is that your program can behave exactly the way you incorrectly thought
it was required to behave. The reason that's dangerous is that it leaves
you with no warning that the behavior might change when you recompile
with a different compiler, or with different compiler options, or even
with the same compiler options, or even if you simply run the program a
second time, even if you give it the same inputs as the previous time.
That's how comprehensive the phrase "no requirements" is - the undefined
behavior is NOT required to be the same each time you execute the
offending program.
 
Getting back to your comment - it's not required to crash immediately -
that would constitute a requirement. And it's actually possible, as a
result of optimizations performed by the compiler, that it might
actually do something quite different. In particular, one possibility is
that the attempt to write to the object might become a no-op (as indicated by
the phrase "ignoring the situation completely").
James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 25 11:05AM -0400

> On Mon, 25 Oct 2021 10:30:04 -0400
> James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
...
>> update the value it's pointing at, due to being unaware of the fact
>> that what it's pointing at is something that shouldn't be written to. In
 
> They'll soon find out if they try to write to it.
 
Not necessarily - the fact that the behavior is undefined gives
implementations the freedom to implement such code any way they want,
including ways that can be quite hard to recognize as errors - even
though they are.
Back when I was first converting a lot of other people's K&R C code to
make use of the new features of C90, I frequently found errors like that
which had been masked for years - the errors were quite capable of
causing serious problems, but for one reason or the other, they had
failed to do so frequently enough for the problem to be successfully
tracked down. Most of that code ran much more reliably after I finished
converting it.
Manfred <noname@add.invalid>: Oct 25 05:56PM +0200

On 10/25/2021 4:59 PM, James Kuyper wrote:
> On 10/25/21 10:19 AM, RacingRabbit@watershipdown.co.uk wrote:
<snip>
> standard; other documents (such as compiler documentation or ABI
> standards) might define the behavior, without changing the fact that it
> qualifies as "undefined behavior" as far as the C standard is concerned.
 
Thanks for the quote, it made me compare it with the definition of UB in
the C++ standard, which simply states "behavior for which this
International Standard imposes no requirements".
 
The lack of the sentence "upon use of a nonportable or erroneous program
construct or of erroneous data" actually leaves the language at the
mercy of language lawyers, and led to the UB bloat that affects C++
nowadays.
 
RacingRabbit@watershipdown.co.uk: Oct 25 04:14PM

On Mon, 25 Oct 2021 10:59:15 -0400
>standard; other documents (such as compiler documentation or ABI
>standards) might define the behavior, without changing the fact that it
>qualifies as "undefined behavior" as far as the C standard is concerned.
 
Any attempt to write to a read only program text area will result in a crash
regardless of the language. It is implicit that it's read only in C because
C also provides the following initialisation which places the string
(presumably) on the heap:
 
char str[] = "hello world";
James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 25 01:14PM -0400

On 10/25/21 12:14 PM, RacingRabbit@watershipdown.co.uk wrote:
...
> Any attempt to write to a read only program text area will result in a crash
> regardless of the language.
 
Perhaps that is true at the hardware level, on some processors. However,
there are also some processors which don't even have the concept of
read-only memory, and there are fully conforming C implementations that
can target some of those processors.
 
However, I'm talking about the level of C code, not hardware. The
translation from C code to machine code is defined only in terms of the
required behavior, and when there is NO required behavior, that
translation can get distinctly weird if you believe the mistaken idea
that C is a "portable assembler".
 
> C also provides the following initialisation which places the string
> (presumably) on the heap:
 
> char str[] = "hello world";
 
Such code cannot result in the string being placed in read-only memory,
because it's perfectly legal to modify str. On the other hand, both of
the following C declarations do allow strings to be placed in read-only
memory, even if they occur at block scope:
 
const char str[] = "Hello world!";
char *strptr = "Good bye!";
 
The first one is allowed to be placed in read-only memory because the
object str is declared "const". The second is allowed to be placed in
read-only memory because it's undefined behavior to write to the memory
pointed at by strptr, despite the fact that, in C, the string literal
does NOT have the type const char[10], as it would in C++.
 
However, just because it would be permissible for an implementation to
place those objects in read-only memory, it's not actually required that
they be placed there. Many implementations won't do so, especially if
those declarations occur at block scope.
 
And even if they were placed in read-only memory, writing C code that
attempts to modify that memory need not result in machine language
instructions being executed to attempt such a write. Because the behavior
of such code is undefined, an implementation is free to translate such
source code into machine code that does nothing of the kind - and this
is, in fact, the natural result, in some contexts, of certain optimizations.
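
The declarations discussed in this post can be contrasted in a short C sketch. The function names are hypothetical, and only the well-defined operations are actually performed:

```c
#include <stddef.h>

/* In C the literal "Good bye!" has type char[10]: nine characters plus '\0'. */
size_t literal_size(void)
{
    return sizeof "Good bye!";
}

/* An array initialised from a literal is an ordinary modifiable copy. */
char modify_array_copy(void)
{
    char str[] = "hello world";
    str[0] = 'H';               /* well defined: str is not const */
    return str[0];
}

/* Reading through a pointer to a literal is fine. Writing through it would
   be undefined behavior, so declaring the pointer as const char * lets the
   compiler catch the attempt instead. */
char read_literal(void)
{
    const char *strptr = "Good bye!";
    /* strptr[0] = 'g';   rejected by the compiler thanks to const */
    return strptr[0];
}
```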
Bart <bc@freeuk.com>: Oct 25 06:19PM +0100

>> qualifies as "undefined behavior" as far as the C standard is concerned.
 
> Any attempt to write to a read only program text area will result in a crash
> regardless of the language.
 
Data is only put into readonly, write-protected memory when the data
values are already known before the program starts.
 
Lots of uses of 'const' are for data not known until the program starts
execution, and many of these will be reinitialised many times as they
are declared inside blocks.
 
Other uses will take normally mutable data and make it readonly
when passed to a function.
 
So using write-protected memory is not that much help.
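
Both kinds of use can be seen in a small C sketch (function names invented for the example): a block-scope const that is re-initialised on every pass through a loop, and mutable data that is merely viewed as const by the function it is passed to.

```c
/* values is mutable in the caller; this function only gets a read-only view. */
int sum_of_squares(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i) {
        /* Re-initialised on every iteration with a value known only at
           run time, so it cannot live in write-protected memory. */
        const int square = values[i] * values[i];
        total += square;
    }
    return total;
}

int sum_after_update(void)
{
    int data[3] = {1, 2, 3};
    data[2] = 4;                      /* the caller may still modify its own array */
    return sum_of_squares(data, 3);   /* 1 + 4 + 16 */
}
```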
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Oct 25 10:48AM -0700

> C also provides the following initialisation which places the string
> (presumably) on the heap:
 
> char str[] = "hello world";
 
I suggest that you would benefit more here from asking questions than
from making assertions.
 
That declaration does not place anything on the heap. The contents of
str is placed on the stack if it appears within a function definition,
or in the static data area if it appears outside a function definition.
 
Others have addressed your errors regarding "const".
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Oct 25 10:56AM -0700

> implicitly converted, as if by assignment, to the types of the
> corresponding parameters ..." (6.5.2.2p7), so the same constraints apply
> there, too. For the same reason, they also apply to return statements.
 
I think the example being referred to was something like:
char *s;
s = "hello";
which does not require a diagnostic in C (because C string literals are
not const). The following is recommended in C and required in C++:
const char *s;
s = "hello";
but here s is still a modifiable lvalue because the "const" applies to
what s points to, not to s itself.
 
A case that would invoke the constraint in 6.5.16p2 is:
char *const s;
s = "hello";
because s itself is read-only; you can't assign *anything* to it.
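
Put together as compilable C, the distinction might look like the sketch below. The function names are made up, and the one line that violates the constraint is left commented out:

```c
/* Pointer to const char: the pointer may be reseated, the characters may not change. */
const char *pick_greeting(int formal)
{
    const char *s;
    s = "hello";
    if (formal)
        s = "good day";     /* fine: s is a modifiable lvalue */
    return s;
}

/* Const pointer to char: the characters may change, the pointer may not. */
char capitalise_first(void)
{
    char buffer[] = "hello";
    char *const s = buffer;  /* must be initialised here; it cannot be assigned later */
    s[0] = 'H';              /* fine: what s points to is not const */
    /* s = buffer + 1;          constraint violation: s is not a modifiable lvalue */
    return s[0];
}
```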
 
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Oct 25 11:11AM -0700

> program construct or of erroneous data" actually leaves the
> language at the mercy of language lawyers, and led to the UB bloat
> that affects C++ nowadays.
[...]
 
I don't see how the omission of "upon use of a nonportable or erroneous
program construct or of erroneous data" in the C++ standard makes any
real difference.
 
C definition, all standard editions:
behavior, upon use of a nonportable or erroneous program construct
or of erroneous data, for which this International Standard imposes
no requirements
 
C++ definition, before C++11:
behavior, such as might arise upon use of an erroneous program
construct or erroneous data, for which this International Standard
imposes no requirement
 
C++ definition, C++11 and later:
behavior for which this International Standard imposes no requirements
 
In all cases, "undefined behavior" is determined either by an explicit
statement or by the omission of any definition of the behavior (or, in
C, by violation of a "shall" outside a constraint).
 
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 25 12:45PM -0700


> Consts in C are pointless because it doesn't have references and it's rather
> difficult to "accidentally" dereference a pointer to update the value it's
> pointing to.
 
Fwiw, when writing code in C, I tend to use the following pattern:
 
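struct foo
{
    unsigned int a;
};


void
foo_init(
    struct foo* const self,
    unsigned int a
){
    self->a = a;
}


int
foo_compute(
    struct foo const* const self,
    unsigned int a
){
    /* self points to a const struct foo, so it is only read here;
       a compound assignment such as *= would not compile */
    return self->a * (a + 123);
}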
 
 
I like to use a const pointer to self so that if I accidentally modify
self, I will get a nice warning. It's basically a habit of mine. 'self'
is akin to the this pointer in C++.
 
Oh well... ;^)