Sunday, November 7, 2021

Digest for comp.lang.c++@googlegroups.com - 6 updates in 2 topics

Tim Rentsch <tr.17687@z991.linuxsc.com>: Nov 07 04:51AM -0800


> A problem here is that C++ introduced the const keyword first.
 
> When C got it later, the C committee decided to do it slightly
> differently.
 
These statements leave out some important parts of the story.
 
No question that const appeared first in early versions of C++.
 
Not long after that, however, const was assimilated into working C
compilers (and years before ANSI C was ratified). So a basis for
comparison was evident pretty early.
 
The question here though is not about const but about changing
the C linkage model. It was obvious from day one that using
const without a storage class to mean internal linkage is a
departure from the C linkage model. And an unnecessary one: to
the best of my knowledge 'static const <type> whatever;' has
always worked in C++ the same as without the 'static'. Allowing
a redundant form gives rise to a gratuitous incompatibility.
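
As a minimal sketch of the equivalence described above (all names here are hypothetical, chosen only for illustration): in C++ a namespace-scope const object declared without a storage class has internal linkage, so the first two declarations below have the same linkage, and external linkage has to be asked for with 'extern'.

```cpp
#include <cassert>

// Hypothetical names, for illustration only.

// Namespace-scope const without a storage class: internal linkage in C++
// (the same declaration in C would have external linkage).
const int kLimit = 10;

// With 'static' spelled out: internal linkage in both languages.
static const int kStaticLimit = 10;

// External linkage in C++ has to be requested explicitly.
extern const int kSharedLimit = 10;

int sum_limits()
{
    // Both internal-linkage constants are usable in constant expressions.
    static_assert(kLimit == kStaticLimit, "same value either way");
    int table[kLimit + kStaticLimit] = {};
    assert(kSharedLimit == kLimit);
    return static_cast<int>(sizeof table / sizeof table[0]);
}
```

A single translation unit cannot show the linkage difference directly; that only becomes visible when a second file tries to refer to kLimit by name.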
 
In the late 1980s not many people were using C++, and C++ was
itself in a state of flux. It would have been easy at that time
to abandon the rule that const-without-static had internal
linkage, getting rid of the oddball linkage exception, and
restore compatibility with C rules (by then the ANSI C efforts
were far enough along that one could see what those rules would
be post-standardization).
 
I must admit it irks me more than a little bit that C++'s idea of
staying compatible with C is always for C to change to make C
more like C++, and never the other way around.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Nov 07 05:08AM -0800


> class x {
> static const unsigned long FRED = 0xabcdef00ul;
 
> };
 
Yes, it's true that leaving off 'static' here means something
significantly different. Note, however: (a) neither of these
forms is allowed in C so there is no question of incompatibility;
(b) the 'static'-less form wasn't allowed until C++11; (c) only
the 'static' form works for the purpose of being usable in
constant expressions, so the question here is moot.
 
> requires an additional declaration in a compilation unit, e.g.
 
> const unsigned long x::FRED;
 
In my tests such a declaration was needed only if the address of
the static member FRED was taken. (I confess I didn't even try
to consult the C++ standard to see if that result is officially
okay or is merely a consequence of undefined behavior.)
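
For reference, a sketch of the case being discussed, assuming the usual reading that a static const member initialized in the class still needs a namespace-scope definition once it is odr-used (for example, when its address is taken). The helper functions fred_value and address_of_fred are hypothetical, added only to show the two kinds of use.

```cpp
#include <cassert>

class x {
public:
    // Declaration with an in-class initializer.
    static const unsigned long FRED = 0xabcdef00ul;
};

// Out-of-class definition (no initializer here). It is needed once FRED
// is odr-used, for example when its address is taken.
const unsigned long x::FRED;

// Hypothetical helper: reads only the value, which by itself does not
// require the definition above.
unsigned long fred_value()
{
    return x::FRED;
}

// Hypothetical helper: taking the address odr-uses FRED, so the
// definition above must exist for this to link reliably.
const unsigned long* address_of_fred()
{
    const unsigned long* p = &x::FRED;
    assert(*p == x::FRED);
    return p;
}
```

Removing the out-of-class definition and keeping address_of_fred is the situation where a missing-symbol error from the linker becomes possible.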
 
> const unsigned long FRED = 0xabcdef00ul;
> };
 
> doesn't require an additional declaration.
 
AFAICT declarations inside namespaces behave the same way as
declarations at file scope, that is, const-without-static
behaves just the same way as const-with-static (and the same
rules for declarations/definitions, etc). So I don't think
this scenario is an exception to what I said earlier.
scott@slp53.sl.home (Scott Lurndal): Nov 07 02:21PM

>the static member FRED was taken. (I confess I didn't even try
>to consult the C++ standard to see if that result is officially
>okay or is merely a consequence of undefined behavior.)
 
Actually, it depends more on context and the compilation options.
 
With gcc (on linux), if you compile with -O2 or -O3 (and do
not use the address operator), you don't need to add the
declaration. If you don't use -O, then you will likely find
the linker complaining about a missing symbol. I see this
quite regularly, to the point that we generally use enum for
constants (or const definitions in a namespace), even though
enum (pre C++xx) doesn't lend itself to type safety.
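
A rough sketch of the two workarounds mentioned above, with hypothetical names (class y, namespace constants, and the helper functions are not from the thread). Neither form depends on a separate out-of-class definition being present, whatever the optimization level.

```cpp
#include <cassert>

// Hypothetical names throughout.

class y {
public:
    // An enumerator is a value, not an object, so there is no symbol
    // for the linker to look for.
    enum { FRED = 0xabcdef00ul };
};

namespace constants {
    // Namespace-scope const: internal linkage, and this is the definition.
    const unsigned long FRED = 0xabcdef00ul;
}

// Binding to a const reference is the kind of use that tends to need a
// definition when the constant is a static data member.
unsigned long through_reference(const unsigned long& v)
{
    return v;
}

unsigned long enum_fred()      { return through_reference(y::FRED); }
unsigned long namespace_fred() { return through_reference(constants::FRED); }

void check_constants()
{
    assert(enum_fred() == namespace_fred());
}
```

The enumerator loses the explicit 'unsigned long' type of the original member, which is the type-safety cost noted above.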
Manfred <noname@add.invalid>: Nov 07 12:38AM +0100

On 11/6/2021 9:27 PM, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
[...]
 
> (x == y ? a : b) = c;
 
> is knowing that a conditional expression can appear on the LHS of an
> assignment, i.e., that it can be an lvalue.
 
More precisely, the C++ standard says the "target type" of the operator
is (simplifying) "If E2 is an lvalue, the target type is 'lvalue
reference to T2'", the point being it's a reference.
 
> Any C or C++ programmer
> code would be clearer.
 
> If I saw that line presented as C, my reaction would be that it's
> illegal, but *if it were legal* the meaning would be obvious.
 
The problem in C is not that it's illegal per se. It can't work because
C does not have references, which makes it somewhat hard to guess what
the construct would mean.
(unless by *if it were legal* you mean if C had references, but that's
a much wider scope than the conditional itself)
 
It's obvious with function return values:
 
/* C++ */
int& foo()
{
    static int n = 0;
    return n;
}
 
foo() = 42; // OK
 
/* C */
int bar()
{
    static int n = 0;
    return n;
}
 
bar() = 42; /* ?!? error: lvalue required as left operand of assignment */
Ben Bacarisse <ben.usenet@bsb.me.uk>: Nov 07 12:33AM


> On 11/6/2021 9:27 PM, Keith Thompson wrote:
<snip>
 
> The problem in C is not that it's illegal per se. It can't work
> because C does not have references, which makes it somewhat hard to
> guess what the construct would mean.
 
You don't need references for it to work, or even make sense, in C. In
C, the LHS of an assignment must be an lvalue expression of which there
are many (for example a[42] and *p). If the standard declared
conditional expressions to be lvalues, the construct would be legal.
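
To make the two spellings concrete, here is a small sketch (the function names are hypothetical; x, y, a, b and c follow the thread). The first form relies on the C++ rule that the conditional expression is an lvalue when both operands are lvalues of the same type. The second selects an address and dereferences it, which is the spelling that works in C as well.

```cpp
#include <cassert>

// Hypothetical helper: C++ only. Both a and b are lvalues of the same
// type, so the conditional expression is itself an lvalue.
void assign_cpp(int x, int y, int& a, int& b, int c)
{
    (x == y ? a : b) = c;
}

// Hypothetical helper: the body is valid in C and C++. The conditional
// yields a pointer value, and the dereference produces the lvalue.
void assign_portable(int x, int y, int* a, int* b, int c)
{
    assert(a != nullptr && b != nullptr);
    *(x == y ? a : b) = c;
}
```

With local variables the pointer form is usually written as *(x == y ? &a : &b) = c;.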
 
<snip>
--
Ben.
David Brown <david.brown@hesbynett.no>: Nov 07 03:10PM +0100

On 06/11/2021 21:27, Keith Thompson wrote:
>> }
 
> To be clear, I respect your opinion. I just don't share it in this
> case.
 
Fair enough. And discussing different opinions like this is a good way
to learn - whether we learn new ideas, or learn more about how other
people might see things, or learn something about ourselves when we are
pushed into thinking about our own opinions.
 
 
> I don't see your replacement code as a "solution", because I don't see
> the shorter version as a problem. Expanding one line to six can make
> sense in some cases. I just don't see this as one of those cases.
 
I have a preference towards spacing things out rather than having a
compact representation. That's a preference, not an absolute - there
can be many overriding factors, and the "best" choice at any given time
can vary significantly. (I think we can ignore the choice of brace
style for now - that's a different matter.)
 
Some of the reasons for usually preferring a layout with an "if" and
separate assignments, rather than a combined assignment, include:
 
1. It is clear when a given variable is assigned to. I believe it is
important to know what can be changed by a piece of code, and where each
piece of data can be changed. Along with this goes a bias towards
having multiple return value functions return structs (or tuples) rather
than taking a pointer to a variable to change. In my opinion it makes
it easier to see the data flow.
 
2. It is easier to change code if you later need to do other things in
connection with the conditional (i.e., if you want to do something else
when "x == y" other than assign "c" to "a").
 
3. A code layout with separate statements and avoiding doing too many
things in one line works much better with tools for version control or
comparison (diff tools) - it is easier to see what has changed and what
has not changed.
 
4. In my line of programming, C is dominant - the proportion of C++
coding is increasing, but still much less than C. And a sizeable
proportion of small-systems embedded programmers have a background from
hardware and electronics, rather than some kind of computer science
degree. This gives a different viewpoint and different experiences,
which can be a good thing and a bad thing (teams with mixed backgrounds
are useful). But it means that you sometimes have to be careful with
coding techniques that are relatively uncommon - there is a balance to
be found between confusing other people and having an opportunity to
teach new ideas. The kind of coding you use can be very different for a
dedicated team of higher-level, experienced C++ coders and when you are
working with one or two people who cover a broad range of tasks from
electronics design through to simple programming tasks. That does not
mean that "lowest common denominator" programming is the right tactic,
but you may have to have more justification before using a technique
that has a high chance of being unfamiliar.
 
5. Code needs to be tested and debugged. When dealing with
small-systems embedded work, techniques like simulation or unit testing
are often impractical or impossible - much of the close-to-the-metal
coding can only be tested in-system. Spacing out the code more makes it
far easier to add breakpoints, logging, volatile variables, and other
debugging aids.
 
6. Sometimes code must be commented. I'm a great fan of expressing
things in code rather than in comments, by use of good names, clear
code, static asserts, etc., but comments are useful. With compact code,
it is rarely possible to put comments close to the action.
 
7. Sometimes the results from a more compact form are significantly less
efficient. A quick test suggests that gcc on x86 will give the same
object code for an "if" with separate assignments as "(x == y ? a : b) =
c;". But the C equivalent, "*(x == y ? &a : &b) = c;", is massively
inefficient for local variables that would otherwise reside in
registers. (gcc is usually quite good at keeping local variables in
registers even if you take their address.) In the embedded world, some
compilers are not very good at optimising - I have seen the results of
conditional operator expressions produce very poor code even in the
right-hand side of expressions.
 
8. Coding standards often limit such expressions, whether or not a
particular programmer might think it is a good idea. If an embedded
programmer is working to the MISRA standards (which is common in the
industry), then even "c = (x == y ? a : b);" is not allowed - it must be
written "c = ((x == y) ? a : b);" or "c = (x == y) ? a : b;", and there
are limits to the complexities of the expressions "a" and "b".
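
For comparison, a sketch of the layouts discussed in the points above. The function names are hypothetical and the expanded form is only a guess at the kind of replacement code meant, since the original snippet is not quoted in this digest.

```cpp
#include <cassert>

// Expanded form: each assignment target sits on its own line, and each
// branch can take a breakpoint, a comment, or extra statements.
void assign_expanded(int x, int y, int& a, int& b, int c)
{
    if (x == y) {
        a = c;
    } else {
        b = c;
    }
}

// Compact C++ form under discussion.
void assign_compact(int x, int y, int& a, int& b, int c)
{
    (x == y ? a : b) = c;
}

// Right-hand-side use, written with the parenthesised condition that
// point 8 gives as the accepted spelling.
int select_value(int x, int y, int a, int b)
{
    int c = ((x == y) ? a : b);
    return c;
}

// Hypothetical check that both layouts have the same effect.
void check_forms_agree(int x, int y, int c)
{
    int a1 = 1, b1 = 2;
    int a2 = 1, b2 = 2;
    assign_expanded(x, y, a1, b1, c);
    assign_compact(x, y, a2, b2, c);
    assert(a1 == a2 && b1 == b2);
}
```

The two assignment functions behave identically; the difference is only in how easily each one can be read, diffed, commented and stepped through.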
 
 
 
> that's evaluated to determine what object is to be assigned to. Once
> you realize that a conditional expression can be an lvalue, the meaning
> **IMHO** becomes obvious.
 
I think it is fairly obvious what the expression does - though not
obvious that it is allowed by the language (particularly because it is
allowed in C++ but not in C). However, just because it could only mean
one thing does not make it easy to follow - and if the reader is using
more effort to understand the mechanics of what the code is doing, they
have less brain power available for the important part - /why/ the code
is doing what it does.
 
 
> Of course in real code you very probably wouldn't call the variables
> x, y, a, b, and c. With meaningful names, I suspect the intent of the
> code would be clearer.
 
Indeed - and I think we all agree that context and names are vital, and
that the clearest way to express something in code depends on wider
circumstances and cannot be covered by simple fixed rules.
 