Tuesday, January 18, 2022

Digest for comp.lang.c++@googlegroups.com - 13 updates in 3 topics

Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 17 03:29PM -0800

> inline` is always redundant. It is 100% equivalent to plain
> `static`. There's never any tangible reason to use `static` and
> `inline` together. [...]
 
In C++ I expect that's right. In C though there is a key
difference that may provide a reason to use 'inline'. In C
declaring or defining a function 'inline' can provide additional
guarantees beyond just using 'static'. In the semantics section
of 6.7.4 of the C standard, there is this excerpt:
 
Making a function an inline function suggests that calls to
the function be as fast as possible. The extent to which
such suggestions are effective is implementation-defined.
 
Note the second sentence. C implementations must document
what happens with 'inline', but there is no such requirement
for 'static'.
 
(I should add that I'm assuming that C++ does not impose a
similar requirement. I have not tried looking in the C++
standard to see if that is the case.)
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 17 03:34PM -0800

> should be performed. In C, it's an ignorable hint that "that calls to
> the function be as fast as possible.", with the method where by that
> might be achieved being unspecified.
 
Note the sentence in the semantics portion of 6.7.4 of the C
standard that says
 
The extent to which such suggestions are effective is
implementation-defined.
 
So I think "implementation-defined" is more accurate than
"unspecified".
"james...@alumni.caltech.edu" <jameskuyper@alumni.caltech.edu>: Jan 17 03:50PM -0800

On Monday, January 17, 2022 at 6:35:03 PM UTC-5, Tim Rentsch wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
...
> implementation-defined.
> So I think "implementation-defined" is more accurate than
> "unspecified".
 
"implementation-defined" behavior is unspecified behavior that an
implementation is required to document, so both terms are correct,
but "implementation-defined" is more specific. However, the point I
was making was about what the standard failed to specify, so
"unspecified" was relevant - that an implementation is required
to document the behavior wasn't.
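
To make the two terms concrete, here is a minimal C++ sketch (the function names are hypothetical, chosen only for illustration). It assumes the usual reading of the terms: the order of evaluation of function arguments is unspecified, so the implementation may pick either order and need not say which, while the signedness of plain `char` is implementation-defined, so the choice must be documented.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Unspecified: the order in which the two arguments are evaluated.
// Either order is allowed and the implementation need not document it.
inline std::string argument_evaluation_order()
{
    std::string log;
    auto note = [&log](char c) { log += c; return 0; };
    auto take_two = [](int, int) {};
    take_two(note('a'), note('b'));
    return log;  // either "ab" or "ba"
}

// Implementation-defined: whether plain char is signed.
// The implementation chooses, and is required to document the choice.
inline bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// Portable code can only rely on the set of permitted outcomes.
inline void check_behaviour_categories()
{
    const std::string order = argument_evaluation_order();
    assert(order == "ab" || order == "ba");
    assert(plain_char_is_signed() == (std::numeric_limits<char>::min() < 0));
}
```

In both cases a portable program has to accept every permitted outcome; the only difference is whether the compiler manual is obliged to say which one you get.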
Andrey Tarasevich <andreytarasevich@hotmail.com>: Jan 17 04:28PM -0800

On 1/17/2022 1:15 PM, James Kuyper wrote:
 
> You might disapprove of such a feature, but being a non-mandatory hint
> is the primary purpose of this feature. It seems odd to me to mention
> the primary purpose of a feature only in a footnote.
 
That is false. The primary purpose of this feature is, again,
consistent between functions and variables: to facilitate support for
multiple definitions of the same entity with external linkage
(variable or function) across multiple TUs (see below).
 
 
> Could you show me how you would use inline for the purpose of violating
> the ODR rules, where there's no more appropriate way to achieve the same
> objective? Allowing you to do so was certainly not the purpose of 'inline'.
 
Um... This is some rather strange wording: "use inline for the purpose
of violating the ODR rules". Where did you get this? ODR rules,
obviously, are aware of `inline` and inclusive of `inline`. Nobody's
talking about "violating" them here in any formal way.
 
What I'm referring to is rather obvious from the above discussion.
 
The primary purpose of the feature is this: the whole problem, the whole
objective is to provide us with a feature that would let us write
 
unsigned foo = 42;
 
void bar()
{
}
 
in a header file and then include this file into multiple TUs.
 
A naive attempt to do this will result in ODR violations: multiple
definitions of entities with external linkage.
 
(`static` might be seen as workaround for the function, but we don't
want that. External linkage is the point here. For whatever reason, we
want a function with unique address identity for the whole program.)
 
So, we need a feature that will
1) suppress the ODR violations, and
2) ensure the unique address/storage identity for both the variable and
the function.
 
And (drum roll) that is achieved through `inline`
 
inline unsigned foo = 42;
 
inline void bar()
{
}
 
Done. End of story. And at no point do any "hints" or "embedding of
function call sites" come into this picture. They are completely irrelevant.
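
A minimal sketch of the header described above, assuming C++17 (needed for inline variables). The header name and the helper functions are hypothetical and exist only so the identity can be checked. A single file cannot demonstrate several TUs, so the comments state what each including TU would observe.

```cpp
// Contents of a hypothetical header, say "shared.h", included by several TUs.
#include <cassert>

inline unsigned foo = 42;  // C++17 inline variable: one object for the whole program

inline void bar()          // inline function with external linkage
{
}

// Hypothetical helpers: report the addresses as seen by the calling TU.
using bar_pointer = void (*)();
inline unsigned* address_of_foo() { return &foo; }
inline bar_pointer address_of_bar() { return &bar; }

// Every TU that includes the header could run this check, and each one
// would be comparing against the same object and the same function.
inline void check_shared_identity()
{
    assert(address_of_foo() == &foo);
    assert(address_of_bar() == &bar);
    assert(foo == 42);
}
```

Replacing `inline` with `static` in the two definitions would still compile and link, but each TU would then get its own `foo` and its own `bar`, so addresses taken in different TUs would no longer compare equal.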
 
As a side note, of course, the reason we want this is to expose full
function body to the compiler in all TUs, thus helping the compiler to
fully analyze the function, optimize its usage and optimize surrounding
code based on its knowledge of function's internal behavior. The actual
"embedding of function call sites" is just one [minor] part of this
process.
 
How the compiler will implement the required spec is its own business,
but as we all know the most popular approach today is to just push the
problem over to the linker and let it silently eliminate the unnecessary
copies.
 
This implementational approach is what justifies referring to `inline`
as an "ODR defeater" informally. That's basically what `inline` does: it
tells the linker that instead of complaining about multiple definitions
it has to shut up and just mop things up quietly.
 
That is the prime purpose of `inline`. Has always been. All these
stories about the "hint" are just mumbo-jumbo, which probably only
persists in standard text out of respect to someone who originally
introduced it.
 
--
Best regards,
Andrey Tarasevich
James Kuyper <jameskuyper@alumni.caltech.edu>: Jan 17 10:55PM -0500

On 1/17/22 7:28 PM, Andrey Tarasevich wrote:
...
> of violating the ODR rules". Where did you get this? ODR rules,
> obviously, are aware of `inline` and inclusive of `inline`. Nobody's
> talking about "violating" them here in any formal way.
 
You're talking about 'inline' as a way of getting around those rules. If
'inline' didn't exist, then those rules obviously couldn't cite it as an
exception, and what you're claiming is the primary purpose of 'inline'
would be a violation of those rules. That purpose is expressed by
inserting an exception for 'inline' into those rules.
 
> {
> }
 
> in a header file and then include this file into multiple TUs.
 
Declaring them with internal linkage would achieve the same benefit.
 
...
> (`static` might be seen as workaround for the function, but we don't
> want that. External linkage is the point here. For whatever reason, we
> want a function with unique address identity for the whole program.)
 
Why? I can imagine unusual circumstances where that might be needed, but
I wouldn't expect it to be a common need. Note that an inline function
is only required to have a unique address if it has external linkage or
module linkage (9.2.7p6). If having a single unique address was the main
purpose of 'inline', why would it even be permitted to declare an inline
function with internal linkage? You could have functions with internal
linkage if you don't need a unique address, and 'inline' functions, with
inherently external linkage, if you do need a unique address. If that's
the primary purpose, why didn't they do it that way?
 
...
> stories about the "hint" is just a mumbo-jumbo, which probably only
> persists in standard text out of respect to someone who originally
> introduced it.
 
That's a ridiculous suggestion. That's not how standards get written. If
getting around the ODR rules was even an important secondary purpose for
'inline', there would have been some mention of it somewhere in 9.2.7.
 
That ridiculous suggestion isn't even consistent with the immediately
preceding sentence. If the person who originally introduced it had a
different conception of the purpose, one that the standard pays only lip
service to, then by definition the purpose you refer to has not "always
been" the prime purpose. There had to have been at least a short period
of time (as I understand it, that "short" period of time has been
decades long) during which the purpose it originally was introduced for
remained the primary purpose.
Bonita Montero <Bonita.Montero@gmail.com>: Jan 18 09:20AM +0100

On 17.01.2022 at 19:13, Andrey Tarasevich wrote:
> of the keyword.
 
> Currently, this is essentially a defect in the standard, which only
> serves to confuse people.
 
You're simply an idiot who can't accept the world like it is.
Andrey Tarasevich <andreytarasevich@hotmail.com>: Jan 18 11:32AM -0800

On 1/17/2022 7:55 PM, James Kuyper wrote:
> exception, and what you're claiming is the primary purpose of 'inline'
> would be a violation of those rules. That purpose is expressed by
> inserting an exception for 'inline' into those rules.
 
When I'm talking about `inline` as an "ODR suppressor" or "defeater",
I'm not talking about the exact formal ODR as it is defined in C++
standard. I'm referring to the general/basic/primitive linker-level idea
of ODR: "if you have two identical symbols in your object files, you end
up with a multiple definition error from the linker".
 
>> }
 
>> in a header file and then include this file into multiple TUs.
 
> Declaring them with internal linkage would achieve the same benefit.
 
No it won't. In that case they won't have external linkage, meaning that
they will not refer to the same entity everywhere in the program. I
think I made it clear that this is part of the objective as well.
 
>> want a function with unique address identity for the whole program.)
 
> Why? I can imagine unusual circumstances where that might be needed, but
> I wouldn't expect it to be a common need.
 
For variables the common need is obvious. (I'm surprised I have to even
mention it.)
 
At the same time, I agree that for functions it is much less important
and common.
 
But if this were just about inline functions, chances are nobody would
even bother. However, in C++ the primary driver for this functionality
is not really inline functions at all. It is templates. 99.9% of the
pressing need for this "ODR suppressing" functionality comes from
templates. Implicit instantiation of templates faces the very same
issues: external linkage and multiple definitions, conflicting with the
"basic" idea of ODR I described above. Templates are the primary driver
behind that "gratuitously-overgenerate-then-discard" approach relied
upon by modern implementations.
 
(There were alternative implementations, intended to avoid
"overgeneration" and to play nice with "basic ODR", e.g. Sun Solaris C++
compiler, but AFAIK none of them survived.)
 
(C++ also provides you with facilities for explicit "manual"
instantiation of templates, which are, curiously, quite similar to C's
approach to `inline`. But I hope it is clear to everyone that C++
templates would never take off without their _implicit_ instantiation
mechanics.)
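
A minimal sketch of the two instantiation styles mentioned above. The template names are hypothetical. The comments describe the common "emit in every TU, let the linker keep one" implementation strategy, which is an assumption about the toolchain and not something the language requires.

```cpp
#include <cassert>

// Implicit instantiation: counter<int> is generated wherever it is used.
// With the usual strategy every TU that uses it emits a definition and
// the linker keeps one, so the function-local static is shared program-wide.
template <class T>
T& counter()
{
    static T value{};
    return value;
}

template <class T>
T twice(T x)
{
    return x + x;
}

// Explicit instantiation definition: this TU is told to emit twice<long>.
// Other TUs could declare `extern template long twice<long>(long);`
// and rely on this one copy instead of generating their own.
template long twice<long>(long);

inline void check_instantiations()
{
    int& first = counter<int>();
    int& second = counter<int>();
    assert(&first == &second);   // same entity on every call
    assert(twice(21L) == 42L);
}
```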
 
So, whether you consider inline functions with external linkage
necessary or not is not really that important. The
"gratuitously-overgenerate-then-discard" approach would still be there
anyway - for templates. And once it's there, you basically get inline
functions with external linkage for free, just for the sake of formal
completeness.
 
> module linkage (9.2.7p6). If having a single unique address was the main
> purpose of 'inline', why would it even be permitted to declare an inline
> function with internal linkage?
 
"Having a single unique out-of-line body" is probably better wording.
Having a single unique address naturally comes with it.
 
As for why inline functions with internal linkage are permitted...
There's simply no reason to prohibit it. `static inline` is redundant,
but there's no reason to spend effort to introduce a rule that would
prohibit it. There are lots and lots of such redundant yet legal constructs
in C++.
 
> linkage if you don't need a unique address, and 'inline' functions, with
> inherently external linkage, if you do need a unique address. If that's
> the primary purpose, why didn't they do it that way?
 
Again, that would lead to a more complicated specification of this
specifier without any tangible benefits. No need to do that.
 
Here's an immediately available example for you: variables again. Note
that `static inline` is _obviously_ redundant with namespace-scope
variables. There's no need to involve any "woulds", "whatifs" or
contrived scenarios here. `static inline` is obviously and
unconditionally redundant with variables today, right now, everywhere.
Yet, this is perfectly legal
 
/* Namespace scope */
static inline unsigned n = 42;
 
You can start asking your "why?" questions now. "Why do they allow
this?" "Why is it legal?" To me the answer is natural and obvious: why
not? No harm done, no need to overcomplicate things. This has always
been one of the cornerstone principles in this language's design.
 
--
Best regards,
Andrey Tarasevich
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 17 04:16PM -0800


>> return r;
>> }
 
> That's very ugly.
 
Beauty is in the eye of the beholder. Personally I think it's
rather pretty.
 
>> written using the standard C preprocessor, and left as an
>> exercise for any ambitious readers.)
 
> Exercise for you: test that unrevealed macro with Visual C++.
 
I don't have access to a Visual C++ system. The macro I wrote
compiles under gcc and g++ as C99, C11, C++11, C++14, and C++17,
using a -pedantic-errors option. (Using clang and clang++ gives
the same results.)
 
> (Making it work also with Visual C++ is doable.)
 
Does this need more than just writing the definition in
conforming C11 or C++11?
 
>> find, maybe someone else can do better.
 
> The `for(;;)`, `continue;` and `break;` constructs come to mind, so
> you're right, someone else can do better.
 
It is of course possible to write the function without using any
'goto's. Whether such a writing is an improvement is another
matter. To my eye the goto version expresses the structure more
directly and in a form that is easier to comprehend. I did try
actually writing a version with a continue-able loop around the
switch(), but it looked clumsy and not as easy to follow as the
goto version. If you have an example to post I definitely would
like to see it.
 
> At a guess the code counts the number of Unicode code points?
 
Yes, sorry, I thought that was apparent from the earlier
context.
 
> Does it return -1 also for "too long" UTF-8 sequence?
 
The -1 return value is for character sequences that are not
syntactically well-formed for UTF-8.
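
For readers without the earlier context, here is a rough goto-free sketch of a function with that contract, using `for(;;)`, `continue` and `break`. It is not the function under discussion, which is not reproduced in this digest. The name is hypothetical, and "syntactically well-formed" is assumed to mean only that every lead byte is followed by the right number of continuation bytes; overlong encodings, surrogates and values above U+10FFFF are not rejected.

```cpp
#include <cassert>

// Returns the number of code points in the NUL-terminated string s,
// or -1 if the byte sequence is not syntactically well-formed UTF-8
// (in the limited sense described in the text above).
inline long count_code_points(const char* s)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s);
    long count = 0;
    for (;;) {
        const unsigned char c = *p++;
        if (c == 0) break;                       // terminating null character
        if (c < 0x80) { ++count; continue; }     // plain ASCII

        int trailing;                            // continuation bytes expected
        if      ((c & 0xE0) == 0xC0) trailing = 1;   // 110xxxxx
        else if ((c & 0xF0) == 0xE0) trailing = 2;   // 1110xxxx
        else if ((c & 0xF8) == 0xF0) trailing = 3;   // 11110xxx
        else return -1;          // stray continuation byte or invalid lead byte

        for (; trailing > 0; --trailing) {
            if ((*p & 0xC0) != 0x80) return -1;  // not 10xxxxxx (includes NUL)
            ++p;
        }
        ++count;
    }
    return count;
}

inline void check_count_code_points()
{
    assert(count_code_points("abc") == 3);
    assert(count_code_points("\xC3\xA9") == 1);  // U+00E9 as two bytes
    assert(count_code_points("\xC3") == -1);     // truncated sequence
}
```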
 
 
> Smart compiler removed all the `if`'s?
 
> I guess you're flame-baiting a little just for fun, but I chose to
> take it seriously.
 
I'm not flame-baiting at all. By "main code path" I mean that
part of the code that deals with normal ASCII characters (and not
counting the terminating null character). The switch() statement
generates an unconditional indirect branch, indexing a table of
32 targets, and the 15 targets for normal ASCII characters simply
go around the loop again, with no tests. I expect you can see
that based on the line with 15 cases, which just does a 'goto'
to start the loop again.
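
One way such a 32-way dispatch could look, as a sketch only: it is not the code from the earlier post, and the function name and return convention are hypothetical. It switches on the top five bits of the byte (`c >> 3`), which yields 32 cases. Cases 1 to 15 cover ASCII bytes that cannot be the terminating null; case 0 is grouped with them here, but a counting loop would have to treat it separately because it contains the null character.

```cpp
#include <cassert>

// Classifies a byte by its top five bits.
// Returns the sequence length announced by a lead byte (1 to 4),
// 0 for a continuation byte, and -1 for 0xF8..0xFF, which can neither
// start nor continue a sequence.
inline int classify_utf8_byte(unsigned char c)
{
    switch (c >> 3) {
    case  0:                                     // 0x00..0x07, includes NUL
    case  1: case  2: case  3: case  4: case  5:
    case  6: case  7: case  8: case  9: case 10:
    case 11: case 12: case 13: case 14: case 15:
        return 1;                                // 0x08..0x7F: ASCII
    case 16: case 17: case 18: case 19:
    case 20: case 21: case 22: case 23:
        return 0;                                // 0x80..0xBF: continuation
    case 24: case 25: case 26: case 27:
        return 2;                                // 0xC0..0xDF
    case 28: case 29:
        return 3;                                // 0xE0..0xEF
    case 30:
        return 4;                                // 0xF0..0xF7
    default:
        return -1;                               // 0xF8..0xFF
    }
}

inline void check_classify_utf8_byte()
{
    assert(classify_utf8_byte('A') == 1);
    assert(classify_utf8_byte(0xC3) == 2);
    assert(classify_utf8_byte(0x80) == 0);
}
```

Because the case labels are dense (0 to 31), a compiler is free to turn the switch into a jump table, though whether it does so depends on the compiler and optimization settings.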
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 17 10:43PM -0800

>> written using the standard C preprocessor, and left as an
>> exercise for any ambitious readers.)
 
> Exercise for you: test that unrevealed macro with Visual C++.
 
Update after my previous message. The cases() macro I wrote has
now been tested with Visual C++, and it compiles there without
complaint.
Bonita Montero <Bonita.Montero@gmail.com>: Jan 18 09:18AM +0100

On 17.01.2022 at 19:28, Tim Rentsch wrote:
> }
 
> return r;
> }
 
That's not C or C++ and it does have a lot of branch-mispredictions.
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Jan 18 12:09PM +0100

On 18 Jan 2022 07:43, Tim Rentsch wrote:
 
> Update after my previous message. The cases() macro I wrote has
> now been tested with Visual C++, and it compiles there without
> complaint.
 
Good to hear. Visual C++ has or had somewhat non-standard treatment of
token pasting and variadic macros, and either it's been fixed, or your
code avoided running into it. A comment in some old code of mine says that
 
❞ [Visual C++ 2017] is unable to count `__VA_ARGS__` as *n* arguments,
and instead counts it as 1. Mostly.
 
Counting the number of arguments of a variadic macro was AFAIK invented
by Laurent Deniau in 2006, and it can go like this:
 
 
#define N_ARGUMENTS( ... ) \
    INVOKE_MACRO( \
        ARGUMENT_64, \
        ( \
            __VA_ARGS__, \
            63, 62, 61, 60, \
            59, 58, 57, 56, 55, 54, 53, 52, 51, 50, \
            49, 48, 47, 46, 45, 44, 43, 42, 41, 40, \
            39, 38, 37, 36, 35, 34, 33, 32, 31, 30, \
            29, 28, 27, 26, 25, 24, 23, 22, 21, 20, \
            19, 18, 17, 16, 15, 14, 13, 12, 11, 10, \
            9, 8, 7, 6, 5, 4, 3, 2, 1, 0 \
        ) \
    )

#define ARGUMENT_64( \
    a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, \
    a11, a12, a13, a14, a15, a16, a17, a18, a19, a20, \
    a21, a22, a23, a24, a25, a26, a27, a28, a29, a30, \
    a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, \
    a41, a42, a43, a44, a45, a46, a47, a48, a49, a50, \
    a51, a52, a53, a54, a55, a56, a57, a58, a59, a60, \
    a61, a62, a63, a64, ... ) \
    a64
 
 
... where `INVOKE_MACRO` tackles the Visual C++ quirks:
 
 
/// \brief Invokes the specified macro with the specified arguments list.
/// \hideinitializer
/// \param m The name of a macro to invoke.
/// \param arglist A parenthesized list of arguments. Can be empty.
///
/// The only difference between `INVOKE_MACRO` and `INVOKE_MACRO_B` is that they're
/// *different* macros. One may have to use both in order to guarantee macro expansion in
/// certain (not very well defined) situations.

#define INVOKE_MACRO( m, arglist ) \
    m arglist

#define INVOKE_MACRO_B( m, arglist ) \
    m arglist
 
 
I've removed macro name prefixes in the above. For reusable code there
better be name prefixes, to reduce or avoid name collisions.
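
To see why the trick works, here is a scaled-down sketch that counts up to 7 arguments instead of 63. The names with the `SMALL_` prefix are hypothetical and are not part of the code above. The caller's arguments push the descending number list to the right, so the token that lands in the eighth position is the argument count.

```cpp
#include <cassert>

#define SMALL_INVOKE_MACRO( m, arglist ) \
    m arglist

// Selects its eighth argument and ignores everything else.
#define SMALL_ARGUMENT_8( a1, a2, a3, a4, a5, a6, a7, a8, ... ) \
    a8

// With n arguments (1 <= n <= 7) the list passed on has n + 8 entries,
// and the eighth entry is the number n.
#define SMALL_N_ARGUMENTS( ... ) \
    SMALL_INVOKE_MACRO( \
        SMALL_ARGUMENT_8, \
        ( __VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0 ) \
    )

// The arguments themselves are discarded by the expansion, so they do not
// need to be declared names.
static_assert( SMALL_N_ARGUMENTS( x ) == 1, "one argument" );
static_assert( SMALL_N_ARGUMENTS( x, y, z ) == 3, "three arguments" );
static_assert( SMALL_N_ARGUMENTS( a, b, c, d, e, f, g ) == 7, "seven arguments" );

inline void check_small_n_arguments()
{
    assert( SMALL_N_ARGUMENTS( p, q ) == 2 );
}
```

A known limitation of this scheme is that an empty argument list is still counted as one (empty) argument.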
 
- Alf
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 17 03:48PM -0800


> I was thinking of the Smalltalk workstations from the 80s like those
> made by Tektronix. I don't think they used a VM, but ran on the bare
> metal.
 
As I recall Tektronix was one of four companies licensed by Xerox
to port a Smalltalk-80 VM to other systems. One motivation for
offering these licenses was to help debug the writing in
"Smalltalk-80: The language and its implementation", where most
of the implementation part was about how the VM works and how to
write one. Given that, it would be strange if the Tektronix
effort did not use a VM but instead ran a standard Smalltalk
image on bare hardware.
legalize+jeeves@mail.xmission.com (Richard): Jan 18 06:47AM

[Please do not mail me a copy of your followup]
 
Tim Rentsch <tr.17687@z991.linuxsc.com> spake the secret code
>write one. Given that, it would be strange if the Tektronix
>effort did not use a VM but instead ran a standard Smalltalk
>image on bare hardware.
 
Interesting! I didn't know that! I found a PDF of the book on
archive.org, so I'm going to take a look at that! (If others are
interested: <https://archive.org/details/smalltalk80langu00gold>)
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>