Friday, September 23, 2022

Digest for comp.lang.c++@googlegroups.com - 25 updates in 3 topics

Andrey Tarasevich <andreytarasevich@hotmail.com>: Sep 22 07:00PM -0700

On 9/21/2022 1:49 PM, Ben Bacarisse wrote:
 
> Except for the quibble that a null in the source string is respected --
> i.e. the destination is considered to be a fixed-width field but not the
> source.
 
Yes, by spec, it is a _conversion_ function. Its specific purpose is to
convert a zero-terminated source string to a fixed-width target string.
 
In modern usage (under the assumption that nobody needs fixed-width
strings anymore), this function might still be usable for secure
initialization of sensitive data fields, when one wants to make sure
that the old string content of the data field is fully erased when the
new string is copied in.
 
--
Best regards,
Andrey
Lynn McGuire <lynnmcguire5@gmail.com>: Sep 22 09:03PM -0500

On 9/21/2022 5:04 AM, Juha Nieminen wrote:
> resulting character array is not null-terminated."
 
> Perhaps return to the caller some value telling if the string was
> truncated.
 
Thanks for helping me to better understand strncpy. Of course, we use
it extensively, although we use the "safe" version strncpy_s about
half the time: 62 out of 125 uses.
 
Lynn
Andrey Tarasevich <andreytarasevich@hotmail.com>: Sep 22 07:04PM -0700

On 9/22/2022 7:00 PM, Andrey Tarasevich wrote:
>> source.
 
> Yes, by spec, it is a _conversion_ function. Its specific purpose is to
> convert a zero-terminated source string to a fixed-width target string.
 
... and yes, Keith Thompson makes a good point that the source does not
have to be a zero-terminated string. I.e. it can also be used for
fixed-width string copying.
 
One can probably argue that it might be more efficient than plain
`memcpy`, since `memcpy` would copy the original zeros with a
memory-to-memory operation, while `strncpy` would fill the tail portion
of the string with "its own" zeros instead.
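
As a sketch of the two variants side by side (the width and the
function names here are made up):

#include <cstring>

constexpr std::size_t W = 32;  // fixed field width

void copy_memcpy(char (&dst)[W], const char (&src)[W])
{
    std::memcpy(dst, src, W);   // copies src's tail zeros byte for byte
}

void copy_strncpy(char (&dst)[W], const char (&src)[W])
{
    // stops reading src at its first '\0', then writes the tail zeros itself
    std::strncpy(dst, src, W);
}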
 
--
Best regards,
Andrey
Richard Damon <Richard@Damon-Family.org>: Sep 22 11:33PM -0400

On 9/22/22 2:09 AM, Juha Nieminen wrote:
> it gets extremely easily confused with strcpy(), as if it were a "safer"
> variant of it, in the same was as strncat() is a "safer" variant of
> strcat().
 
Maybe, but it was named LONG ago, in the infancy of the language, so that
is water under the bridge.
 
One thing to remember: the n versions weren't so much designed as
"safer" versions, but as versions for a special purpose.
 
They may work as safer versions, but I don't think that was the major goal.
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Sep 22 10:49PM -0700

Andrey Tarasevich <andreytarasevich@hotmail.com> writes:
[...]
> ... and yes, Keith Thompson makes a good point that the source does
> not have to be a zero-terminated string. I.e. it can also be used for
> fixed-width string copying.
 
To be fair, it was Ben Bacarisse who pointed this out -- *after* I had
incorrectly stated (based on a faulty man page) that the source has to be
a pointer to a null-terminated string.
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */
David Brown <david.brown@hesbynett.no>: Sep 23 01:23PM +0200

On 22/09/2022 12:02, Bonita Montero wrote:
 
> A volatile read or write is usually at least the same as a read or
> write with memory_order_relaxed. So there are some guarantees
> you can rely on.
 
The key point of an atomic access, above all else, is that it is an
all-or-nothing access. If one thread writes to an atomic object, and
another thread reads it, then the reading thread will see either the
complete old data or the complete new data.
 
"Volatile" does not give you that guarantee.
 
The key points regarding volatile accesses are that the compiler cannot
assume it knows how the target memory is used - it may be read or
written independently of the program code - and that the accesses are
"observable behaviour". Thus every volatile access must be done exactly
as it is in the "abstract machine" that defines the language - with the
same values, the same number of accesses, and the same order of accesses
with respect to other volatile accesses.
 
"Atomic" does not have those semantics. The compiler can combine two
relaxed atomic writes to one. It can do some re-ordering regarding
atomics and normal accesses, and even across volatile accesses. (For
atomic memory access stricter than "relaxed", there are more
restrictions on ordering.)
 
This is why the C11 "atomic_store" and "atomic_load" functions take a
pointer to volatile atomic as their parameter, not a pointer to atomic.
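
As an illustration of the all-or-nothing point, a sketch with a made-up
64-bit counter (on a 32-bit target the volatile store may be split into
two instructions, while the atomic store stays indivisible, via a lock
if necessary):

#include <atomic>
#include <cstdint>

volatile std::uint64_t v_counter = 0;     // no atomicity guarantee: a
                                          // reader in another thread may
                                          // see a half-updated value

std::atomic<std::uint64_t> a_counter{0};  // all-or-nothing by definition

void writer()
{
    v_counter = 0x1111111122222222ULL;    // may be two 32-bit stores
    a_counter.store(0x1111111122222222ULL,
                    std::memory_order_relaxed);  // always indivisible
}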
 
 
> Fences and barriers mean the same. A fence or barrier makes that changes
> from a foreign thread become visible to another thread or changes from a
> thread become visible for other threads.
 
The C standards (I refer to them as they are simpler and clearer than
the C++ standards, but the memory model is the same) say that an
"atomic_thread_fence(memory_order_relaxed)" has no effect. This is
rather different from a memory barrier, which requires compiler-specific
extensions and which can be viewed roughly as a kind of "cache flush" in
which the "cache" is the processor registers along with any information
the compiler knows about any objects.
 
Again, the non-relaxed fences will likely have a memory barrier effect
in practice - but they do so at a significantly higher cost than a plain
compiler memory barrier.
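
To illustrate the distinction, a sketch in gcc/clang terms (the inline
asm is the usual compiler-specific barrier idiom, not standard C or C++):

#include <atomic>

void fences()
{
    // Has no effect at all, per the standard:
    std::atomic_thread_fence(std::memory_order_relaxed);

    // A pure compiler memory barrier (gcc/clang extension): emits no
    // instruction, but forces the compiler to "forget" what it knows
    // about memory - the "cache flush" described above.
    asm volatile("" ::: "memory");

    // A real fence: constrains the ordering of surrounding atomic
    // accesses, and in practice usually acts as a compiler barrier too.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}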
Bonita Montero <Bonita.Montero@gmail.com>: Sep 23 01:49PM +0200

On 23.09.2022 13:23, David Brown wrote:
 
> all-or-nothing access.  If one thread writes to an atomic object, and
> another thread reads it, then the reading thread will see either the
> complete old data or the complete new data.
 
Although that is possible, no one actually uses atomics for non-native
types. And for native types, what I said holds true compared with volatiles.
 
> "Atomic" does not have those semantics.  The compiler can combine two
> relaxed atomic writes to one.
 
Cite the standard.
 
> extensions and which can be viewed roughly as a kind of "cache flush" in
> which the "cache" is the processor registers along with any information
> the compiler knows about any objects.
 
That's pettifogging, since no one uses fences that actually won't work.
David Brown <david.brown@hesbynett.no>: Sep 23 03:25PM +0200

On 23/09/2022 13:49, Bonita Montero wrote:
>> complete old data or the complete new data.
 
> Although that is possible, no one actually uses atomics for non-native
> types. And for native types, what I said holds true compared with volatiles.
 
No.
 
>> "Atomic" does not have those semantics.  The compiler can combine two
>> relaxed atomic writes to one.
 
> Cite the standard.
 
"As if" rule.
 
>> of "cache flush" in which the "cache" is the processor registers along
>> with any information the compiler knows about any objects.
 
> That's pettifogging, since no one uses fences that actually won't work.
 
They /do/ work - they just do what they are supposed to do, not what you
think they should do.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 23 04:03PM +0200

On 23.09.2022 15:25, David Brown wrote:
 
>> Although that is possible, no one actually uses atomics for non-native
>> types. And for native types, what I said holds true compared with volatiles.
 
> No.
 
If you use non-native types with atomics, they use STM, and that's
really slow. That's why no one uses atomics for non-native types.
 
 
>> That's pettifogging, since no one uses fences that actually won't work.
 
> They /do/ work - they just do what they are supposed to do, not what you
> think they should do.
 
No, they actually have no effect.
Kaz Kylheku <864-117-4973@kylheku.com>: Sep 23 03:17AM

["Followup-To:" header set to comp.lang.c.]
> https://devclass.com/2022/09/20/microsoft-azure-cto-on-c-c/
 
> ""It's time to halt starting any new projects in C/C++ and use Rust for
> those scenarios where a non-GC language is required.
 
Those scenarios are almost never, though.
 
The niche is small, and nothing needs replacing in it.
 
> For the sake of
> security and reliability
 
... write in an easy-to-use GC language that doesn't have a
fit if you get objects in a cycle, and in which any reference
to any object can go anywhere in the program you want
without a hassle.
 
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Juha Nieminen <nospam@thanks.invalid>: Sep 23 07:26AM

> has evolved over the decades, and pick a better starting point. So
> saying C is as bad, or worse, is not an excuse for poor design choices
> in Rust.
 
That's exactly what I meant. C is something like 45 years old, Rust is
something like 10 years old and is supposed to be a more modern and
better designed language, taking advantage of 50+ years of experience
of what a good programming language should be like.
 
Answering the criticism "this 10yo language has this design problem" with
"oh yeah? Well, this 45yo language also has the same problem!" makes
no sense. Newer "better" languages are not supposed to copy the mistakes
of the past. They are supposed to fix/improve on them.
 
I'm not saying that e.g. C++ is significantly better. The
"brevity-over-clarity" mentality can oftentimes be seen throughout
the standard library.
 
For example, there's literally no reason why it should be called
'shared_ptr' instead of 'shared_pointer'. I can't think of any advantage
in shortening "pointer" like that.
 
But not to be bested, Rust makes it better: 'Rc'.
 
Because why not.
Juha Nieminen <nospam@thanks.invalid>: Sep 23 07:45AM

> relevant statistical evidence that one style of keyword length is better
> than the other in terms of fewer mistakes in code, then it will always
> be a matter of personal preference and familiarity.
 
It's not so much about making mistakes, but about readability and
understandability of the code (from the perspective of others than the
author).
 
In general, programmers are blind to the illegibility of their own code.
After all, they are thinking about what to write, and writing it, so
rather obviously they understand perfectly what the code is doing.
However, someone else reading the code doesn't know in advance what the
code is doing and thus has to decipher that from reading the source code.
 
The problem with many programmers is that they get some kind of fuzzy
feeling when they create code that does a lot with as little code as
possible. Advocates of some programming languages (such as Haskell) even
sometimes boast about how their pet language can do so much with a
one-liner! Things that require a dozen lines in other languages can be
done with a one-liner in their language!
 
As if brevity were somehow a desirable goal.
 
Of course the other extreme isn't good either: Excessive verbosity can
also make the code less readable and understandable.
 
In general, the more condensed or the more sparse the code, the less
readable it is. The perfect middle should be the goal.
 
This is where even the length of keywords steps in: The shorter the
keywords, the more condensed the code becomes. The more conceptual units
are packed into a small space, the more effort it takes the reader to
dig out the meaning. This isn't exactly helped if the code does not
consist of full English words, but some obscure abbreviations.
 
Imagine if this post consisted of nothing but very short abbreviations
of every word. It would become illegible.
Juha Nieminen <nospam@thanks.invalid>: Sep 23 07:50AM

>> those scenarios where a non-GC language is required.
 
> Those scenarios are almost never, though.
 
> The niche is small, and nothing needs replacing in it.
 
I wouldn't call embedded programming to be a "small niche".
 
(Well, the subset of embedded programming that happens on processors
so small that they can't run Linux, at least.)
Paavo Helde <eesnimi@osa.pri.ee>: Sep 23 12:13PM +0300

On 23.09.2022 10:45, Juha Nieminen wrote:
 
> Imagine if this post consisted of nothing but very short abbreviations
> of every word. It would become illegible.
 
In general, the length of a word ought to be inversely correlated with
the frequency of its use. Imagine what English would look like if the
words "is", "it", "me", "you", etc. were all 12 letters long.
Juha Nieminen <nospam@thanks.invalid>: Sep 23 10:01AM


> In general, the length of a word ought to be inversely correlated with
> the frequency of its use. Imagine what English would look like if the
> words "is", "it", "me", "you", etc. were all 12 letters long.
 
I think it's more important that names are entire words, not contracted
(with very few exceptions).
 
Secondly, the names should express as clearly as possible what the thing
is representing. Using an entire English word is not good enough if it
still doesn't express clearly what the thing is representing. In general
one shouldn't be afraid of creating even quite long names that express
clearly what the thing is.
 
This applies especially if you can't give a rational argument for why
the shorter version has some concrete benefit (other than "it makes code
lines shorter").
 
For example, I think it's quite hard to argue why "ret" is a better
variable name than "returnValue" (or "return_value", depending on
your coding style/guideline). If you can't give a rational argument
for using the shorter name, just use the longer one.
 
As another example: "convert(...)" might feel more convenient to
write than something like "convert_to_utf8_from_utf16(...)", but
you have to think about the code from the perspective of someone
who is reading it and doesn't already know what it's doing.
The latter may be significantly longer but it expresses infinitely
more clearly what it's doing (even without seeing what the parameters
are). Someone seeing just "convert(...)" in the code can't have any
idea what it's doing.
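
For instance (signatures invented just to show the contrast; the bodies
are omitted):

#include <string>

// Tells the reader nothing at the call site:
std::string convert(const std::u16string& s);

// Tells the reader what it does, even without seeing the parameters:
std::string convert_to_utf8_from_utf16(const std::u16string& s);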
Michael S <already5chosen@yahoo.com>: Sep 23 03:14AM -0700

On Friday, September 23, 2022 at 1:02:17 PM UTC+3, Juha Nieminen wrote:
> more clearly what it's doing (even without seeing what the parameters
> are). Someone seeing just "convert(...)" in the code can't have any
> idea what it's doing.
 
It sounds like an argument against C++-style static polymorphism.
Personally, I have been against static polymorphism since I started to
think independently about that sort of matter (which didn't happen until
I was 35+). But I was not expecting to hear it from an aficionado of
"modern C++". Maturing?
Ben Bacarisse <ben.usenet@bsb.me.uk>: Sep 23 11:15AM +0100

> "oh yeah? Well, this 45yo language also has the same problem!" makes
> no sense. Newer "better" language are not supposed to copy the mistakes
> of the past. It's supposed to fix/improve on them.
 
I took too much from the context. The context was Rust suggested as a
replacement for C (and maybe C++, I don't recall), but your remarks
seemed to be all about the things you thought bad about Rust that C also
had. So in that context, all you seemed to be saying was that Rust is no
worse than C. Maybe everyone, even the Rust champions, agrees, but they
will then point to the things that /are/ better in Rust.
 
--
Ben.
Ben Bacarisse <ben.usenet@bsb.me.uk>: Sep 23 11:58AM +0100

> write than something like "convert_to_utf8_from_utf16(...)", but
> you have to think about the code from the perspective of someone
> who is reading it and doesn't already know what it's doing.
 
Is convert(...) a Rust function that converts strings or are you just
giving an example of someone who chose a bad name in a particular
program? (I know about std::convert in Rust, but that does not seem to
be what you are talking about.)
 
--
Ben.
David Brown <david.brown@hesbynett.no>: Sep 23 01:32PM +0200

On 23/09/2022 09:45, Juha Nieminen wrote:
 
> It's not so much about making mistakes, but about readability and
> understandability of the code (from the perspective of others than the
> author).
 
The prime purpose of making code readable is to avoid mistakes. Either
you see the mistakes in your own code, or someone else sees them.
(Readable code is also easier to use and re-use, and less effort to read
- but I think the avoidance of errors is key in this discussion.
Certainly it is foremost in the minds of almost everyone who promotes
Rust - every pro-Rust article starts by saying how many memory-related
errors in C could have been prevented by using Rust.)
 
> also make the code less readable and understandable.
 
> In general, the more condensed or the more sparse the code, the less
> readable it is. The perfect middle should be the goal.
 
You should aim to be like a successful psychic - a happy medium :-)
 
> consist of full English words, but some obscure abbreviations.
 
> Imagine if this post consisted of nothing but very short abbreviations
> of every word. It would become illegible.
 
I agree that there is a balance to be struck. But I think the "ideal"
balance has a lot of variation, and thus is always going to be somewhat
subjective.
David Brown <david.brown@hesbynett.no>: Sep 23 01:37PM +0200

On 23/09/2022 09:50, Juha Nieminen wrote:
 
> I wouldn't call embedded programming to be a "small niche".
 
> (Well, the subset of embedded programming that happens on processors
> so small that they can't run Linux, at least.)
 
And it will be a /long/ time before Rust has significant market share in
these devices. Despite wanting the newest microcontrollers, embedded
programmers are a conservative bunch - C90 is, I think, the most popular
choice of language. (Yes, C90 - not C99.) C++ is gaining, assembly is
dwindling (but not gone), and there is a small subset that like Ada,
Forth, MicroPython and a few others.
 
It may well be that the use of Rust in such systems would be a good idea
- but that does not mean it will happen to any serious extent.
Ralf Goertz <me@myprovider.invalid>: Sep 23 01:44PM +0200

On Thu, 22 Sep 2022 23:15:44 +0100
 
> I've done a bit more than you then in porting a 70-line benchmark to
> it:
 
> https://github.com/sal55/langs/blob/master/fannkuch.txt
 
I just implemented the benchmark myself (it should really be called
„Pfannkuchen“) quite straightforwardly (see below). There is a 1995
article about the benchmark where the authors listed execution times
of (among others) C versions compiled with "gcc -O2" and just "gcc". The
latter needed almost 4 times as long as the former. I wondered how "-O6"
would compare. So I compiled my program with that (using n=12 instead of
10 as in the article). To my big surprise "-O6":
 
~/c> time ./fannkuch
fannkuch(12)=65 equals 65 from oeis
 
real 0m36.325s
user 0m36.242s
sys 0m0.045s
 
was significantly slower than "-O2":
 
~/c> time ./fannkuch
fannkuch(12)=65 equals 65 from oeis
 
real 0m30.976s
user 0m30.913s
sys 0m0.021s
 
What's the reason for that?
 
 
 
 
#include <iostream>
#include <algorithm>
#include <array>
#include <numeric>
 
// from oeis.org
int results[] = {0, 0, 1, 2, 4, 7, 10, 16, 22, 30, 38, 51, 65, 80, 101,
                 113, 139, 159, 191, 221};
 
using namespace std; // I know…
 
const int n = 12;
 
int flip(array<int, n> t) {
    int res = 0;
    do {
        ++res;
        int top = t[0];
        reverse(t.begin(), t.begin() + top);
    } while (t[0] != 1);
    return res;
}
 
int main() {
    array<int, n> a;
    iota(a.begin(), a.end(), 1);
    if (n > 1) swap(a[0], a[1]); // don't need the perms with 1 at the beginning
    int m = 0; // max
    do {
        int i = flip(a);
        if (i > m) m = i;
    } while (next_permutation(a.begin(), a.end()));
    cout << "fannkuch(" << n << ")=" << m
         << " equals " << results[n] << " from oeis" << endl;
    return 0;
}
Bart <bc@freeuk.com>: Sep 23 01:53PM +0100

On 23/09/2022 12:44, Ralf Goertz wrote:
> latter needed almost 4 times as long as the former. I wondered how "-O6"
> would compare. So I compiled my program with that (using n=12 instead of
> 10 as in the article). To my big surprise "-O6":
 
I didn't know it went up to -O6, I thought that -O3 was the highest level.
 
My own tests show -O3 was a bit slower than -O2, but -O4 and up were
about the same as -O3.
 
Apparently there's no upper limit to optimisation level, as -O1000000
also works (with the same result as -O3), as does -O1000000000000.
 
So the real mystery is what the deal is with those silly optimisation
numbers.
 
> I just implemented the benchmark myself (it should really be called
 
(Actually, the reason for all those different (p)fannkuch versions was
to use them as the basis of a bigger benchmark for comparing compilation
speeds for large input files:
 
https://github.com/sal55/langs/blob/master/Compilertest3.md)
David Brown <david.brown@hesbynett.no>: Sep 23 03:31PM +0200

On 23/09/2022 14:53, Bart wrote:
>> would compare. So I compiled my program with that (using n=12 instead of
>> 10 as in the article). To my big surprise "-O6":
 
> I didn't know it went up to -O6, I thought that -O3 was the highest level.
 
It is. But gcc (in common with many compilers) accepts higher numbers
and treats them all as the highest level.
 
> also works (with the same result as -O3), as does -O1000000000000.
 
> So the real mystery is what the deal is with those silly optimisation
> numbers.
 
At a guess, someone (long, long ago) thought it was convenient for
compatibility with some other compiler that differentiated higher
numbers - thus people could move from "some_other_compiler -O6" to "gcc
-O6" without changing command line options. (Many command line options
in gcc are compatible with other *nix style compilers, new and old.)
 
David Brown <david.brown@hesbynett.no>: Sep 23 03:53PM +0200

On 23/09/2022 13:44, Ralf Goertz wrote:
> latter needed almost 4 times as long as the former. I wondered how "-O6"
> would compare. So I compiled my program with that (using n=12 instead of
> 10 as in the article). To my big surprise "-O6":
 
Anything higher than -O3 is the same as -O3 in gcc.
 
> user 0m30.913s
> sys 0m0.021s
 
> What's the reason for that?
 
gcc -O3 uses a lot of effort trying to get the last few fractions of a
percent out of the code - it is rarely used, as it is rarely worth the
effort. And sometimes it backfires. Typically, this is caused by
enthusiastic loop unrolling or inlining that gives code that is
/sometimes/ faster, but also sometimes slower due to more misses for
instruction caches, return address buffers, branch prediction tables,
etc. Trying to squeeze the last drops of speed out of a compiler is an
art - you should expect to work hard at benchmarking, profiling, and
testing different compiler options in different parts of the code.
Compilers are not omniscient, and don't know everything about your code
and how it is used, your processor, or what else might be running on the
same system.
 
So generally, gcc -O2 is the normal choice until you are ready to work hard.
 
There are, however, a few flags that can make a very significant
difference, depending on source code.
 
"-march" tells the compiler details of the processor you are using,
rather than picking the lowest common denominator for the processor
family. "-march=native" tells it that the target is the same as the
compiler is running on. The "-march" flag gives the compiler access to
more SIMD and advanced instructions, as well as a more accurate model
for scheduling, pipelining, and other processor-specific fine tuning
options.
 
"-fast-math" enables a lot of floating point optimisations that are not
allowed by strict IEEE rules, but are often fine for real-world floating
point code. If your source has a lot of floating point calculations and
you are okay with the kinds of re-arrangements enabled by this flag, it
can speed up the floating point code significantly.
olcott <polcott2@gmail.com>: Sep 22 08:45PM -0500

On 9/22/2022 8:09 PM, Richard Damon wrote:
>> Zero elements of Hx/Px pairs correctly simulated by Hx reach their
>> final state thus zero elements of the Hx/Px pairs halt.
 
> SO?
 
void Px(ptr x)
{
    int Halt_Status = Hx(x, x);
    if (Halt_Status)
        HERE: goto HERE;
    return;
}
 
Zero Px elements of Hx/Px pairs correctly simulated by Hx reach their
final state, thus zero Px elements of the Hx/Px pairs halt.
 
Thus the conventional "impossible" input is correctly determined to be
non-halting, and is thus not a proof of HP undecidability.
 
 
--
Copyright 2022 Pete Olcott "Talent hits a target no one else can hit;
Genius hits a target no one else can see." Arthur Schopenhauer