Tuesday, December 20, 2022

Digest for comp.lang.c++@googlegroups.com - 25 updates in 5 topics

Juha Nieminen <nospam@thanks.invalid>: Dec 20 01:52PM

> Programming Studio (as a front-end to gdb on embedded systems) long
> before "Global Positioning System" was commonly known.
 
> You don't get to decide what is "universally understood". No one does.
 
I find this hilarious. I made a *concession* to using acronyms in some
situations. "Yes, these situations I think it's ok to use an acronym
because it's not detrimental to readability." You are now arguing
*against* this concession I made.
 
It just goes to show that you are just arguing for the sake of arguing.
When I argue in one direction you argue in the opposite direction, no
matter which direction it is.
 
But by all means! If you want to write global_positioning_system in your
variable and function names, go right ahead! I would not oppose that.
The exceptions I listed are fully optional! If you don't want to use those
acronyms, then by all means don't. I'm not even being facetious.
Juha Nieminen <nospam@thanks.invalid>: Dec 20 01:54PM

>> You don't have to remember. It says it directly. It makes it so
>> much easier to understand.
 
> You are right - I wouldn't believe it.
 
Of course you don't. You are just arguing for the sake of arguing.
 
But you won't be convincing me any time soon. I know what I am seeing,
when I read code. You cannot gaslight me into seeing otherwise.
Juha Nieminen <nospam@thanks.invalid>: Dec 20 02:28PM

> with these ideas. These ideas are pretty much as old as computer
> programming itself. I am merely recounting my own experiences on this
> and confirming the validity of these ideas.
 
Out of curiosity, I searched for the coding guidelines of large companies
out there in order to check what they say about variable and function
naming, and most of them seem to broadly agree with what I have been
saying here (with some exceptions).
 
For example, Microsoft's coding guideline says:
 
"DO choose easily readable identifier names.
For example, a property named HorizontalAlignment is more English-readable
than AlignmentHorizontal."
 
"DO favor readability over brevity.
The property name CanScrollHorizontally is better than ScrollableX (an
obscure reference to the X-axis)."
 
"DO NOT use abbreviations or contractions as part of identifier names.
For example, use GetWindow rather than GetWin."
 
"DO NOT use any acronyms that are not widely accepted, and even if they
are, only when necessary."
 
I could have written that myself!
 
Google's coding guidelines also have several points I agree with (and a
few that I don't really):
 
"Use names that describe the purpose or intent of the object. Do not
worry about saving horizontal space as it is far more important to make
your code immediately understandable by a new reader. Minimize the use
of abbreviations that would likely be unknown to someone outside your
project (especially acronyms and initialisms). Do not abbreviate by
deleting letters within a word. As a rule of thumb, an abbreviation
is probably OK if it's listed in Wikipedia."
 
Again, I could have written that myself!
 
So no, I do not think that I "have found a unique way to make
code understandable". I'm merely confirming existing wisdom.
Ben Bacarisse <ben.usenet@bsb.me.uk>: Dec 20 02:45PM

> reading it months/later, and noticing how much easier it becomes
> to read when, for example, loop variables express what they are
> indexing or counting.
 
I suspect that part of the trouble is that people "read" code in
different ways. From your descriptions it really seems like you do it
differently to how I do it. If you think this might be a productive
avenue (i.e. you don't think I am just being stubborn) I'd be happy to
say more.
 
--
Ben.
scott@slp53.sl.home (Scott Lurndal): Dec 20 02:51PM


>> And what does 'c_' mean? Apparently another "universally understood
>> abbreviation" which I would not know without context.
 
>Ala c_str? not sure.
 
A convention used by a large project back around 1990. A visual indication
that the type is a class at a time before IDEs.
"daniel...@gmail.com" <danielaparker@gmail.com>: Dec 20 07:10AM -0800

On Tuesday, December 20, 2022 at 7:46:54 AM UTC-5, David Brown wrote:
 
> Actually, you /do/ know it is the correct type for its use in /that/
> line. If "a * b" returns a proxy, then "x" should be that proxy - "auto
> x" means you have the correct type.
 
The fact that you believe that makes the point that C++ coders should
be especially judicious in their use of auto. Authors of numerical libraries
don't believe that; they post to StackOverflow asking whether there are
ways to avoid the "auto value = copy of proxy" problem (there aren't),
because it's causing bugs in client code. Or they contribute to a proposal
for a language change, where the auto in 'auto v = foo()' or 'auto& v = foo()'
can be deduced as the value type, even if foo() returns a proxy value; see
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0672r0.pdf.
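
The "auto value = copy of proxy" effect can be reproduced with the standard library alone. The following is a minimal sketch, not taken from the thread: it uses std::vector<bool>, whose operator[] returns a proxy object, as a stand-in for the expression-template proxies of the numerical libraries. The function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Returns true if a variable declared with `auto` kept tracking the
// container element after that element was changed, i.e. it was a
// proxy and not an independent copy of the value.
inline bool auto_variable_tracks_element() {
    std::vector<bool> flags{true, false};

    auto deduced = flags[0];  // std::vector<bool>::reference, a proxy class
    bool value = flags[0];    // converted to a real bool, an independent copy

    static_assert(!std::is_same_v<decltype(deduced), bool>,
                  "auto deduced the proxy type, not bool");

    flags[0] = false;         // modify the element behind the proxy

    assert(value == true);    // the explicit bool kept the old value
    return static_cast<bool>(deduced) == false;  // the proxy sees the change
}
```

Here "auto" does give the type of the expression, as David Brown says, but that type is not the value type a reader might expect, which is the point being argued.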
 
But you believe that. A team lead could be forgiven for being reluctant to
have you on a project where the Eigen, Armadillo or blaze libraries were
being used.
 
Daniel
David Brown <david.brown@hesbynett.no>: Dec 20 04:19PM +0100

On 20/12/2022 14:49, Juha Nieminen wrote:
> programming books and papers all the way back to the 70's and 60's.
 
> What I would actually find a bit surprising if there were books and
> papers arguing for the opposite, ie. what you are saying.
 
/Everybody/ agrees with the title of this thread!
 
/Please/ get that through your thick skull. (I know you don't take that
as an insult, and it is not meant as one.)
 
/Everybody/ thinks /overuse/ of "auto" is bad.
 
/Everybody/ thinks poor choices of identifiers are bad.
 
/Everybody/ thinks code clarity and readability are vital.
 
/Everybody/ thinks good choices of identifiers are important for clarity
and readability.
 
/Everybody/ thinks that there are too many programmers that don't code
with clarity in mind.
 
 
But only /one/ person has said that short identifiers are always bad.
Only /one/ person has said that making identifiers shorter always makes
them harder to understand. Only /one/ person has said that identifiers
should always be made from full descriptive words - except for
"universally understood" abbreviations, as determined solely by that one
person. Only /one/ person thinks it should be possible to understand
code line by line, or that it is a good thing to do. Only /one/ person
thinks that using "i" for a loop variable is a bad idea. Only /one/
person thinks that every identifier should always be written out in
full, without use of "using namespace" or other abbreviation techniques.
Only /one/ person thinks that types should always be given manually
and explicitly in full, resorting to "auto" only for unnameable types
such as lambdas.
 
Only /one/ person does not understand that short and simple code is
quicker to read and understand, given that the names used are
unambiguous and clear in the context of the surrounding code and the
task at hand.
 
Only /one/ person has so utterly and completely misinterpreted
everything said in this thread, and thinks other people use short
identifiers with a disregard for readability and code clarity, when in
fact good programmers use short identifiers precisely because they are
/easier/ to understand when used appropriately.
 
Only /one/ person would rather accuse people of arguing in bad faith, or
arguing just for the sake of it, rather than attempt to understand why
everyone in the thread has a different opinion.
 
 
Pick /good/ identifiers for the task at hand. Short-lived short-scoped
identifiers get short names. Long-lived ones get long names.
Identifiers you use a lot in the code get short names because it makes
the code easier to read, and their usage should be clear from the
context. Rarely used identifiers need longer names because you don't
see them so often.
 
Really, this is not rocket science - it's common sense.
David Brown <david.brown@hesbynett.no>: Dec 20 04:20PM +0100

On 20/12/2022 14:52, Juha Nieminen wrote:
> variable and function names, go right ahead! I would not oppose that.
> The exceptions I listed are fully optional! If you don't want to use those
> acronyms, then by all means don't. I'm not even being facetious.
 
Listen carefully - do you hear that whooshing sound? It's the sound of
the point flying /way/ over your head.
Muttley@dastardlyhq.com: Dec 20 03:49PM

On Tue, 20 Dec 2022 14:28:31 -0000 (UTC)
>For example, Microsoft's coding guideline says:
 
I doubt many people would take much notice of MS's opinion on coding.
 
>"DO NOT use abbreviations or contractions as part of identifier names.
>For example, use GetWindow rather than GetWin."
 
Yet oddly at least 3 generations of unix devs have coped with abbreviated
posix function names such as ioctl(), getuid(), uname() etc without struggling
to understand the code. Not to mention standard C functions such as fopen().
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Dec 20 05:52PM +0100

On 19 Dec 2022 21:38, Scott Lurndal wrote:
 
> c_utf8 string;
 
> string += c_utf8::from_utf16(utf16arg);
 
> But only if I ever programmed on windows, which will never happen.
 
Current choice is a namespace called `u8` for brevity, like
 
namespace u8 {
template< class Unit_iterator >
constexpr auto to_sequence_at( const Unit_iterator it, const
char32_t code )
-> int
{
if( code < 0x80 ) { // 7 bits as 7
*(it + 0) = Byte( code );
return 1;
} else if( code < 0x800 ) { // 11 bits as 5 + 6
char32_t bits = code;
*(it + 1) = bits & continuation_bytes::value_bits_mask;
// 6
bits >>= continuation_bytes::n_value_bits;
*(it + 0) = Byte( 0b1100'0000 | bits );
// 5
return 2;
} else if( code < 0x10000 ) { // 16 bits as 4 + 6 + 6
char32_t bits = code;
*(it + 2) = bits & continuation_bytes::value_bits_mask;
// 6
bits >>= continuation_bytes::n_value_bits;
*(it + 1) = bits & continuation_bytes::value_bits_mask;
// 6
bits >>= continuation_bytes::n_value_bits;
*(it + 0) = Byte( 0b1110'0000 | bits );
// 4
return 3;
} else if( code < 0x110000 ) { // 21 bits as 3 + 6 + 6 + 6
char32_t bits = code;
*(it + 3) = bits & continuation_bytes::value_bits_mask;
// 6
bits >>= continuation_bytes::n_value_bits;
*(it + 2) = bits & continuation_bytes::value_bits_mask;
// 6
bits >>= continuation_bytes::n_value_bits;
*(it + 1) = bits & continuation_bytes::value_bits_mask;
// 6
bits >>= continuation_bytes::n_value_bits;
*(it + 0) = Byte( 0b1111'0000 | bits );
// 3
return 4;
} else {
FSM_FAIL( "Invalid Unicode code point (≥ 0x110000)." );
}
for( ;; ) {} // Should never get here.
}
}
 
I don't think I've /tested/ that code. Possibly it doesn't work. But
that's just a silly detail; the main question is whether there's really
any point in such manual loop unrolling?
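
For comparison, here is a minimal sketch of the same encoding written with a loop instead of one branch per length. It is not Alf's code: it appends to a std::string rather than writing through an iterator, returns 0 instead of calling FSM_FAIL, and the name append_utf8 is hypothetical. Like the code above, it does not reject surrogate code points.

```cpp
#include <string>

// Appends the UTF-8 encoding of `code` to `out`.
// Returns the number of bytes appended, or 0 if code >= 0x110000.
inline int append_utf8(std::string& out, char32_t code) {
    if (code < 0x80) {                  // 7 bits, single byte
        out += static_cast<char>(code);
        return 1;
    }
    if (code >= 0x110000) {
        return 0;                       // not a Unicode code point
    }
    const int n_bytes = code < 0x800 ? 2 : code < 0x10000 ? 3 : 4;

    // Lead byte markers for 2-, 3- and 4-byte sequences.
    static constexpr unsigned char lead_marker[] = {0xC0, 0xE0, 0xF0};

    // The lead byte carries the bits above the 6-bit continuation groups.
    const int shift = 6 * (n_bytes - 1);
    out += static_cast<char>(lead_marker[n_bytes - 2] | (code >> shift));

    // Each continuation byte is 10xxxxxx with the next 6 value bits.
    for (int i = n_bytes - 2; i >= 0; --i) {
        out += static_cast<char>(0x80 | ((code >> (6 * i)) & 0x3F));
    }
    return n_bytes;
}
```

Whether the unrolled or the looped form is faster would have to be measured; the sketch only shows that the four branches share one pattern.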
 
 
- Alf
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Dec 20 05:56PM +0100

On 20 Dec 2022 17:52, Alf P. Steinbach wrote:
> [code]
 
The code was nicely formatted, including comments line-up, before
posting. It's evidently Thunderbird fouling up things. Maintained by
script kiddies, no doubt, because WE are too lazy to do such things.
 
- Alf
"Öö Tiib" <ootiib@hot.ee>: Dec 20 08:58AM -0800

On Tuesday, 20 December 2022 at 14:09:31 UTC+2, Juha Nieminen wrote:
> words feels inconvenient and completely unnecessary. Heck, I myself
> still succumb to this from time to time, just because it feels so
> much more convenient.
 
I already addressed it. Identifier length makes no difference to
writing convenience. Code editors offer autocompletion candidates
after you type the first few letters. You keep citing typing
difficulty, file size in kilobytes, or the compiler's capacity to
process long names as factors. Those may have been considerations in the seventies.
> having to interpret what a single-letter variable means, from
> among a bunch of single-character symbols, because the variable
> is directly telling you.
 
If there are a lot of single-letter variables then it can indeed get rather
cryptic, but no one has advocated that.
> loop variables clearly. My rough estimation/rule of thumb is that
> this importance grows about linearly with the length of the loop
> body, and exponentially with each nested loop.
 
I already said that nested loops smell of non-scalable
complexity, and rather formidable bodies mean that the code is
approaching a non-testable level of cyclomatic complexity too.
Unfortunately that can't be cured with variable naming. I would
be super happy if just more explicit naming helped there. It
just does not.
> to discern, that's indicative that a refactoring of the entire thing
> could be in place. Sure. However, using the clearer loop variable
> names doesn't exactly hurt in either case.
 
Overly long and comprehensive names indeed hurt far less than the
function itself being so long and complex.
> see how I could get convinced of it, when I see with my own eyes the
> difference between it and using a variable that actually says what
> it's for.
 
Overuse of "i" has bothered me too, but naming it "index" is even
more pointless than "i". If it is the index of a row then "r" is better than
"i"; however, "row_index" or "row_number" does feel uselessly long,
especially if it appears several times in an expression and is clear
from context.
> is really confusing. ('n' and 'm' is not significantly better. 'x'
> and 'y' is passable and in some situations it's actually ok, when
> we are talking about actual (x, y) cartesian coordinates.)
 
I still feel that you blame the naming of variables for the complexity of
the algorithm itself or of whatever science is involved. If a function deals
with all of temperatures, times and tickets, then naming any of those "t" is
ambiguous. But if it deals with only one of those, then "t" is quite an
obvious local shorthand.
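
To make the two naming styles under discussion concrete, here is a small sketch that writes the same short nested loop both ways. The Grid alias and both function names are made up for illustration; neither version comes from the thread.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Short names: "r" for row and "c" for column, clear from the two-line context.
inline long sum_short_names(const Grid& grid) {
    long total = 0;
    for (std::size_t r = 0; r < grid.size(); ++r)
        for (std::size_t c = 0; c < grid[r].size(); ++c)
            total += grid[r][c];
    return total;
}

// Long names: the same loop with fully spelled-out loop variables.
inline long sum_long_names(const Grid& grid) {
    long total = 0;
    for (std::size_t row_index = 0; row_index < grid.size(); ++row_index)
        for (std::size_t column_index = 0;
             column_index < grid[row_index].size(); ++column_index)
            total += grid[row_index][column_index];
    return total;
}
```

Both functions compute the same sum; which one reads better in a body this short is exactly what the posters disagree about.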
David Brown <david.brown@hesbynett.no>: Dec 20 06:33PM +0100


> Yet oddly at least 3 generations of unix devs have coped with abbreviated
> posix function names such as ioctl(), getuid(), uname() etc without struggling
> to understand the code. Not to mention standard C functions such as fopen().
 
Despite my comments in this thread, even I would say that they could
have used a /few/ more letters in some POSIX names!
 
On the other hand, descriptive names with full words are arguably worse
if the names don't fit. I remember from old Windows programming (it may
have been early WinNT or even Win32s), there was a Windows API function
called "OpenFile" that could be used to access a device, open a pipe, or
all kinds of different things - pretty much /everything/ except open a file.
David Brown <david.brown@hesbynett.no>: Dec 20 06:39PM +0100


> But you believe that. A team lead could be forgiven for being reluctant to
> have you on a project where the Eigen, Armadillo or blaze libraries were
> being used.
 
I haven't used any of these, so I would want to see how they are
generally used before trying to write code with them. As I said, "auto"
gives you the correct type because it is the type returned by the
multiplication - that does not mean it is the type you really wanted for
the code, or that following code uses it correctly.
 
Surely if making a copy of a proxy is inappropriate, the proxy class
should disallow copying?
 
And isn't the idea of these kinds of expression template libraries that
you get proxies, and can manipulate them as though they were "real"
objects, and only do the actual calculations at the end?
"Öö Tiib" <ootiib@hot.ee>: Dec 20 09:45AM -0800

On Tuesday, 20 December 2022 at 19:34:04 UTC+2, David Brown wrote:
> have been early WinNT or even Win32s), there was a Windows API function
> called "OpenFile" that could be used to access a device, open a pipe, or
> all kinds of different things - pretty much /everything/ except open a file.
 
We all use standard library containers whose method empty() does
something unrelated to emptying the container. Possibly we gained the
requirement that size() always be O(1) thanks to that bad naming.
The if(c.empty()) reads like checking that emptying was successful,
so programmers kept comparing size() with 0 for readability.
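
A short sketch of the distinction, using std::vector: empty() only asks a question, and clear() is the member that actually empties the container. The function name is hypothetical.

```cpp
#include <vector>

// Returns true if the container reports empty only after clear() was called.
inline bool empty_is_a_query_not_a_command() {
    std::vector<int> c{1, 2, 3};

    const bool was_empty_before = c.empty();     // query: does not modify c
    const bool size_unchanged = (c.size() == 3); // still three elements

    c.clear();                                   // command: removes all elements

    // After clear(), the member query and the size comparison agree.
    return !was_empty_before && size_unchanged && c.empty() && c.size() == 0;
}
```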
Paavo Helde <eesnimi@osa.pri.ee>: Dec 20 08:29PM +0200


> Yet oddly at least 3 generations of unix devs have coped with abbreviated
> posix function names such as ioctl(), getuid(), uname() etc without struggling
> to understand the code. Not to mention standard C functions such as fopen().
 
I recall the authors of those names are generally happy with them,
except for one abbreviation: creat().
Frederick Virchanza Gotham <cauldwell.thomas@gmail.com>: Dec 20 10:01AM -0800

I've written a GUI program for desktop PCs that acts as a 'man in the middle': it intercepts traffic and modifies it before forwarding it.
 
I was thinking it would be cool to allow the user to write some C++ code in a text box in my GUI application to describe a text filter, so for example they could write in the text box:
 
[begin text box]
cmd.erase(0u,2u);
cmd.insert(0u,"invert_");
reply.erase(reply.find(':'));
return true;
[end text box]
 
I would then take this code and surround it in a function, like this:
 
[begin code]
bool ProcessExchange(string &cmd, string &reply)
{
    cmd.erase(0u,2u);
    cmd.insert(0u,"invert_");
    reply.erase(reply.find(':'));
    return true;
}
[end code]
 
The next thing I would do is include every C++ header file, from <any> to <bitset> to <chrono> all the way to <utility> <variant> <version>. I count 107 of them when you include the C ones like <cstdlib> and <cstring>. So then it would look like this:
 
[begin code]
#include <any>
#include <bitset>
#include <chrono>
...
#include <utility>
#include <variant>
#include <version>
 
bool ProcessExchange(string &cmd, string &reply)
{
    cmd.erase(0u,2u);
    cmd.insert(0u,"invert_");
    reply.erase(reply.find(':'));
    return true;
}
[end code]
 
So then I would compile this translation unit to a dynamic shared library, e.g. "custom_filter.dll" on MS-Windows or "libcustom_filter.so" on Linux/Mac. Then I would load this library into my program using "LoadLibrary" or "dlopen". I would compile the library with "-fsanitize" to make sure it dies as soon as there's a memory access violation.
 
So the question is how can I make it as safe as possible? First thing to watch out for would be the user closing the body of the function and then opening a new function, like this:
 
[begin text box]
} // This closes the 'Process' function
 
bool MyFunc(void);
 
bool const my_global_var = MyFunc();
 
bool MyFunc(void)
{
// Do something else in here
[end text box]
 
So I would have to make sure that all the curly brackets are paired up properly. Another thing is of course that I'd have to watch out for:
 
[begin text box]
std::system("format d: /y");
[end text box]
 
To prevent this, I think I'd use macros, something like:
 
[begin code]
#include <cstdlib>
#define system /* nothing */
[end code]
 
I'd have to make a finite list of the 'dangerous' functions, stuff like std::remove, or using an std::ofstream to bulldoze a file.
 
If I code this then it would mean that in the future, anyone could use my man in the middle program to do very complex processing on the traffic.
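
The curly-bracket pairing check mentioned above could be sketched as follows. This is a deliberately naive version under stated assumptions: it looks only at the characters '{' and '}', knows nothing about string literals or comments, and the name braces_balanced is hypothetical.

```cpp
#include <string_view>

// Naive check: true if every '}' closes an earlier '{' and nothing is
// left open at the end. Only the literal characters '{' and '}' count.
inline bool braces_balanced(std::string_view text) {
    int depth = 0;
    for (char ch : text) {
        if (ch == '{') {
            ++depth;
        } else if (ch == '}' && --depth < 0) {
            return false;  // a '}' that would close the wrapper function
        }
    }
    return depth == 0;
}
```

The sketch rejects the "} ... {" input shown earlier, but it also rejects harmless text such as cmd.insert(0u, "}"), and it accepts the same attack written with the digraphs "%>" and "<%", which C++ treats as alternative spellings of '}' and '{'. A character-level filter of this kind is therefore not a security boundary.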
"Öö Tiib" <ootiib@hot.ee>: Dec 20 10:13AM -0800

On Tuesday, 20 December 2022 at 20:01:24 UTC+2, Frederick Virchanza Gotham wrote:
> [end code]
 
> I'd have to make a finite list of the 'dangerous' functions, stuff like std::remove, or using an std::ofstream to bulldoze a file.
 
> If I code this then it would mean that in the future, anyone could use my man in the middle program to do very complex processing on the traffic.
 
If you mechanically compose executable code, a script, or a request from
whatever text the user entered, then you have made your software a deliberate
target for code injection attacks. That can be viewed as sabotage by whatever
organisation you code for. So if you do it for yourself, then you should be as
harsh with yourself as you can. ;)
Joseph Hesse <joeh@gmail.com>: Dec 20 11:00AM -0600

I want to sum an array of ints in a lambda function.
 
In the following code, function f1 does this with no problem.
 
In function f2, I am able to sum an int array using a range based
for loop. That this works surprises me since the array name is not
converted to a pointer and the for loop "looks around" to find the
size of int x[].
 
The commented out code in f2 was my attempt, as in f1, to
put the code to sum the array in a lambda. It does not compile.
 
Is it possible to make this work?
 
Thank you,
Joe
=======================================================
#include <iostream>
#include <vector>
using namespace std;
 
void f1(){
    vector<int> v = {1, 2, 3, 4};

    auto fp = [] (vector<int> vi)
    {
        int sum = 0;
        for(const int &i : vi)
            sum += i;
        return sum;
    };

    cout << "sum = " << fp(v) << '\n';
}

void f2(){
    int x[4] = {1, 2, 3, 4};

    int sum = 0;
    for(const int &i : x)
        sum += i;
    cout << "sum = " << sum << '\n';

    /*
    auto fp = [] (int x[])
    {
        int sum = 0;
        for(const int &i : x)
            sum += i;
        return sum;
    };

    cout << "sum = " << fp(x) << '\n';
    */
}

int main(){
    f1();
    f2();
    return 0;
}
"Öö Tiib" <ootiib@hot.ee>: Dec 20 09:29AM -0800

On Tuesday, 20 December 2022 at 19:00:49 UTC+2, Joseph Hesse wrote:
> size of int x[].
 
> The commented out code in f2 was my attempt, as in f1, to
> put the code to sum the array in a lambda. It does not compile.
 
The ...
 
auto fp = [] (int x[])
 
... is by language rules equivalent to ...
 
auto fp = [] (int *x)
 
... so array length information is lost and range
based for has no idea what range you mean.
 
 
> Is it possible to make this work?
 
Sure, you should either use a template ...
 
auto fp = []<size_t N>(int (&x)[N])
 
... or you should take a reference to a fixed-size array ...
 
auto fp = [](int (&x)[4])
 
... then the range based for is happy with it.
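
Filled out into complete functions, the two signatures above might look like this. It is a minimal sketch: the function names and the parameter name "values" are made up, and the templated lambda requires C++20.

```cpp
#include <cstddef>

// Fixed-size array reference: the lambda accepts only int[4].
inline int sum_with_fixed_reference() {
    int x[4] = {1, 2, 3, 4};
    auto fp = [](int (&values)[4]) {
        int sum = 0;
        for (const int& i : values)  // the length is part of the parameter type
            sum += i;
        return sum;
    };
    return fp(x);
}

// Templated lambda (C++20): N is deduced from the argument's array type.
inline int sum_with_template_lambda() {
    int x[3] = {10, 20, 30};
    auto fp = []<std::size_t N>(int (&values)[N]) {
        int sum = 0;
        for (const int& i : values)
            sum += i;
        return sum;
    };
    return fp(x);
}
```

In both cases the parameter is a reference to an array, so no array-to-pointer conversion takes place and the range-based for still knows the bounds.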
Paavo Helde <eesnimi@osa.pri.ee>: Dec 20 08:03PM +0200

20.12.2022 19:00 Joseph Hesse kirjutas:
 
>   cout << "sum = " << fp(x) << '\n';
> */
> }
 
You can fix it easily by over-using auto:
 
void f2() {
    int x[4] = { 1, 2, 3, 4 };

    auto fp = [] (const auto& x)
    {
        int sum = 0;
        for(const int &i : x)
            sum += i;
        return sum;
    };

    std::cout << "sum = " << fp(x) << '\n';

}
 
 
However, using C arrays seems fragile in general as they decay to
pointers too easily. This seems better:
 
void f2() {
    std::array<int, 4> x = { 1, 2, 3, 4 };

    auto fp = [] (const auto& range)
    {
        int sum = 0;
        for(const int &i : range)
            sum += i;
        return sum;
    };

    std::cout << "sum = " << fp(x) << '\n';

}
scott@slp53.sl.home (Scott Lurndal): Dec 20 02:49PM


>> https://www.recordcourier.com/news/2021/nov/15/bail-remains-high-driver-fatal-carson-city-wreck/
 
>> An example of person that should not be towing anything!
 
>I cannot remember the model of the boat, a Cobalt perhaps?
 
The article indicated it was a 37,000 pound yacht. 8 tons beyond the
towing capacity of the F350. Driver got 3 to 10.
Kaz Kylheku <864-117-4973@kylheku.com>: Dec 20 01:49PM

> 6) What are the three parts of a for statement and which of them are
> required?
 
Check your fingers; I count nine:
 
for ( init ; test ; step ) stmt
 1  2  3   4  5   6  7   8  9
 
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
Ben Bacarisse <ben.usenet@bsb.me.uk>: Dec 20 02:33PM


> Check your fingers; I count nine:
 
> for ( init ; test ; step ) stmt
>  1  2  3   4  5   6  7   8  9
 
or possibly 8:
 
for ( declaration test ; step ) stmt
 1  2      3       4   5  6   7  8
 
depending on what you include in "init".
 
--
Ben.
Lew Pitcher <lew.pitcher@digitalfreehold.ca>: Dec 20 02:44PM

On Tue, 20 Dec 2022 14:33:36 +0000, Ben Bacarisse wrote:
 
 
> for ( declaration test ; step ) stmt
>  1  2      3       4   5  6   7  8
 
> depending on what you include in "init".
 
The C11 draft standard defines the for statement as
for ( clause-1 ; expression-2 ; expression-3 ) statement
where
clause-1 may be a declaration or an expression.
 
The notable point is that the semicolon between clause-1
and expression-2 is /not/ part of clause-1, but is, instead
part of the for statement itself.
 
Because of this, I don't believe that your count can be correct.
But, I'm willing to be educated on this :-)
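
As a concrete illustration of the forms being counted, the sketch below writes the same three-iteration loop with a declaration as the first part, with an expression as the first part, and with all three parts omitted. The function names are hypothetical. Note that the C++ grammar, unlike the C11 wording quoted above, describes the first part as an init-statement that includes its own semicolon.

```cpp
// First part is a declaration.
inline int count_with_declaration() {
    int n = 0;
    for (int i = 0; i < 3; ++i)
        ++n;
    return n;
}

// First part is an expression that assigns to an existing variable.
inline int count_with_expression() {
    int i;
    int n = 0;
    for (i = 0; i < 3; ++i)
        ++n;
    return n;
}

// All three parts omitted: only the two semicolons remain, and the
// loop has to be left with break.
inline int count_with_all_parts_omitted() {
    int n = 0;
    for (;;) {
        if (++n == 3)
            break;
    }
    return n;
}
```

The last function also answers the quoted quiz question: none of the three parts is required.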
 
 
--
Lew Pitcher
"In Skills, We Trust"