Wednesday, September 14, 2016

Digest for comp.lang.c++@googlegroups.com - 7 updates in 4 topics

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Sep 14 12:19PM +0200

On 13.09.2016 20:00, Stefan Ram wrote:
> |The first one gives 7 and the second one 3.
> |
> |This is also a problem with "*" and "/".
 
In most cases the two constructor call notations give the same result.
 
It's usually unexpected when they don't mean the same, because it isn't
clear from the calling code, and it depends on the concrete classes.
 
That's a problem.
 
And in the standard library it's mainly a problem with
`std::basic_string` and `std::vector`.
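
A minimal sketch of the difference, not taken from the thread: the helper names below are made up for illustration, and the string case uses 'A' (65 in ASCII) for the second argument.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Parentheses select the (count, value) constructor.
inline std::vector<int> vector_with_parens() { return std::vector<int>(2, 65); }

// Braces prefer the std::initializer_list constructor.
inline std::vector<int> vector_with_braces() { return std::vector<int>{2, 65}; }

inline std::string string_with_parens() { return std::string(2, 'A'); }
inline std::string string_with_braces() { return std::string{2, 'A'}; }

// Self-check: same arguments, different notation, different contents.
inline void check_constructor_notations()
{
    const std::vector<int> p = vector_with_parens();
    assert(p.size() == 2 && p[0] == 65 && p[1] == 65);  // two elements, both 65

    const std::vector<int> b = vector_with_braces();
    assert(b.size() == 2 && b[0] == 2 && b[1] == 65);   // the elements 2 and 65

    assert(string_with_parens() == "AA");               // two 'A'

    const std::string s = string_with_braces();
    assert(s.size() == 2 && s[0] == char(2) && s[1] == 'A');  // (char)2, then 'A'
}
```

Nothing at the call site says which constructor is chosen; it depends on whether the class happens to have an initializer-list constructor.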
 
The fix direction I proposed addressed the lack of clarity in the
calling code.
 
 
> { stringfromlist s{ 2, 65 }; ::std::cout << s << '\n'; }}
 
> Above, »string ... 2, 65« always means »two 'A'«, and
> »stringfromlist ... 2, 65« always means »(char)2« and »'A'«.
 
To my eyes all these prefixes are visual noise that make the code pretty
unreadable, but I have talked with Norwegian programmers who claim that
on the contrary, when they get used to it the prefixes increase
readability for them. I can't fathom how. But.
 
Ditto for the indentation convention, I can't see how you can get used
to that at all. Do tools such as AStyle support that?
 
Anyway, one technical problem with this approach is that the derived
classes can affect overload resolution so that e.g. operators are not
found. I was surprised to find that unqualified `swap` worked with
arguments of the two different derived classes. Someone will no doubt
find an explanation for that apparent weirdness. It's both about the ADL
and about the template argument deduction. But as an example that works,
in the sense that it fails to compile, consider:
 
template< class Type >
void foo( Type&, Type& ) {}
 
int main ()
{ {string a( 1, 'A' ); stringfromlist b( 2, 'A' ); foo( a, b ); }}
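
A sketch of why that call is rejected, under the assumption that both classes derive publicly from std::string. The names counted_string, listed_string, foo_accepts and foo_via_base are hypothetical stand-ins, not the classes from the quoted post.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the two derived classes discussed in the thread.
struct counted_string : std::string {
    counted_string(std::size_t n, char c) : std::string(n, c) {}
};
struct listed_string : std::string {
    listed_string(std::initializer_list<char> chars) : std::string(chars) {}
};

template <class Type>
void foo(Type&, Type&) {}

// Detects whether foo(a, b) compiles for lvalues of types A and B.
template <class A, class B, class = void>
struct foo_accepts : std::false_type {};

template <class A, class B>
struct foo_accepts<A, B, decltype(foo(std::declval<A&>(), std::declval<B&>()))>
    : std::true_type {};

// Naming the base class explicitly sidesteps deduction altogether.
inline void foo_via_base(counted_string& a, listed_string& b) { foo<std::string>(a, b); }

inline void check_deduction()
{
    static_assert(foo_accepts<std::string, std::string>::value,
                  "identical argument types deduce one Type");
    static_assert(!foo_accepts<counted_string, listed_string>::value,
                  "two different derived types give conflicting deductions");
    counted_string a(1, 'A');
    listed_string b{2, 'A'};
    foo_via_base(a, b);  // compiles: both arguments bind to std::string&
    assert(a.size() == 1 && b.size() == 2);
}
```

Each argument deduces a different Type, and deduction does not fall back to the common base class on its own, so the unqualified call has no viable candidate.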
 
Cheers & hth.,
 
- Alf
Bo Persson <bop@gmb.dk>: Sep 14 05:20PM +0200

On 2016-09-14 14:06, Stefan Ram wrote:
> makes it clear that the first name comes from the
> standard library and the second one from a custom
> library.
 
For the rest of us the noise is that if std::example isn't the same as
::std::example, the code base is so fucked up that you should just
discard it and start over.
 
Anyone adding a sky::std::string deserves a public flogging, so trying
to defend against that is a wasted effort. And makes the code unreadable.
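
For illustration only, a sketch of the one situation in which the two spellings differ. The nested namespace sky::std and the alias names are hypothetical and exist purely to trigger the effect.

```cpp
#include <string>
#include <type_traits>

namespace sky {

namespace std { struct string {}; }      // the hypothetical offender

// Inside sky, unqualified lookup of "std" finds sky::std first.
using unqualified_string = std::string;

// A leading :: always starts at the global namespace.
using qualified_string = ::std::string;

}  // namespace sky

static_assert(std::is_same<sky::unqualified_string, sky::std::string>::value,
              "std::string inside sky names sky::std::string");
static_assert(std::is_same<sky::qualified_string, std::string>::value,
              "::std::string still names the standard library type");
```

Without a nested namespace of that name, std::string and ::std::string name the same type everywhere.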
 
 
Bo Persson
Ian Collins <ian-news@hotmail.com>: Sep 15 08:08AM +1200

On 09/15/16 03:20 AM, Bo Persson wrote:
> discard it and start over.
 
> Anyone adding a sky::std::string deserves a public flogging, so trying
> to defend against that is a wasted effort. And makes the code unreadable.
 
It's a bit like the Swiss defending against a seaborne invasion...
 
--
Ian
ram@zedat.fu-berlin.de (Stefan Ram): Sep 14 12:06PM

>unreadable, but I have talked with Norwegian programmers who claim that
>on the contrary, when they get used to it the prefixes increase
>readability for them. I can't fathom how. But.
 
The prefixes explain where the names come from.
 
For example »if( ::std::example() > ::sky::example() )«
makes it clear that the first name comes from the
standard library and the second one from a custom
library.
 
>Ditto for the indentation convention, I can't see how you can get used
>to that at all. Do tools such as AStyle support that?
 
If someone is paying me to write code, he can get any format
from me that he likes. But on the Usenet I enjoy the freedom
to use the indentation style that I deem best, as explained below:
 
One Way to Format Parentheses
 
There are several different ways to format texts with braces
and parentheses. One of them is being described here.
 
Indentation within Braces
 
An indentation of just one space often is too small to be seen
clearly, because the natural width and form of characters
often varies by an amount that is not very much smaller than a
space. Therefore, the indentation should amount to at least
two positions. In order not to waste horizontal space, an
indentation of exactly two positions is chosen. This means
that the left position of the next level is two greater than
the position of the directly enclosing level.
 
Indentation by two positions within a block
 
{ ++x;
  ++x; }
^ ^
0 2
 
Bad: A small indentation by one position is not always visible
clearly
 
{++x;
 ++x; }
 
Good: The indentation by two positions is visible clearly
 
{ ++x;
  ++x; }
 
Bad: A large indentation by more than two positions wastes
horizontal space with no additional benefit
 
{       ++x;
        ++x; }
 
Spaces within braces
 
In mathematics, there are often no spaces at the inner side of
parentheses or braces in expressions, but spaces are used
indeed at the inner side of braces in set notation, when the
braces contain a description (not when they contain a list).
 
Spaces in set notation
 
{ x | x > 2 }
 
This style is adopted here: One space is written at the inner
side of braces.
 
Spaces at the inner side of parentheses within a block
 
{ ++x; }
 
This style is consistent with the indentation by two
positions, because only using this style, corresponding parts
of two lines have the same position.
 
Bad: No space after the first brace, the two statements are
misaligned
 
{++x;
  ++x; }
 
Good: One space after the first brace, the two statements are
properly aligned
 
{ ++x;
  ++x; }
 
Bad: Two spaces after the first brace, the two statements are
misaligned
 
{  ++x;
  ++x; }
 
There are some exceptions to this rule: No spaces are used
within empty braces "{}" and between two or more closing
braces of the same direction "}}", except, when the first one
of them is part of an empty pair "{} }" (an empty pair of
braces is treated like a single non-braces character).
 
Unified rules for all Brackets
 
For simplicity and uniformity, the rules from above apply to
all kinds of brackets, including parentheses, braces (curly
brackets), square brackets, and angle brackets.
 
Spaces within parentheses and square brackets
 
{ y = f( x )+ g() + a[ 2 ]; }
 
Binary operators are surrounded by a space, but the space is
omitted, when there already is a space on the other side of a
sequence of brackets directly beside the operator: By this rule,
" )+" is written instead of " ) +".
 
Representation of the Syntactical Structure
 
A method or function definition consists of a head and a body.
The following representation shows this structure:
 
Good formatting according to the structure
 
void alpha() // head
{ beta(); } // body
 
The following formatting is misleading, because the line break
does not match the structural break:
 
Bad line break within the body
 
void alpha() { // head and the beginning of the body
  beta(); } // the rest of the body
 
This formatting also would make no sense for blocks within
blocks. So it is often not used for such blocks. Therefore
even the adopters of this style can not use it uniformly.
 
Opening Braces Look Like "bullets"
 
There is a well known style to publish lists in typography
using bullets sticking out on the left, looking like this:
 
Common list representation with bullets in typography
 
o This is the first point
  of this list, it is written
  here just as an example.

o Here is another entry

o This is another example given
  just as an example to show
  an example
 
The braces of the beginnings of blocks stand out on the left
just the same, when the formatting being described here is
used, so they look quite naturally as beginning-of-a-block
markers, when one is used to the typographical list notation:
 
Left braces look like bullets to mark blocks
 
{ printf(); printf();
  printf(); printf(); printf();
  printf(); printf(); }

{ printf(); printf(); }

{ printf(); printf(); printf();
  printf(); printf();
  printf(); }
 
Neutrality
 
Someone wrote this C code:
 
while( fgets( eingabe, sizeof eingabe, stdin ))
  if( sscanf( eingabe, "%d", &wert )!= 1 )
    fprintf( stderr, "Please enter a number!\n" );
  else
    summe += wert;
 
It amazes me that I can add braces by my style conventions
(not changing the meaning of the code)
without the need to change the position of any character of
the given code or need to change the overall number of lines:
 
The code from above plus braces
 
while( fgets( eingabe, sizeof eingabe, stdin ))
{ if( sscanf( eingabe, "%d", &wert )!= 1 )
  { fprintf( stderr, "Please enter a number!\n" ); }
  else
  { summe += wert; }}
 
In this respect, my bracing style might be considered non-obtrusive.
 
Lines per Contents
 
Lines containing only a single brace waste vertical space, so
less contents fits on the same screen space. Therefore, I usually
avoid them, but sometimes I do use them, when this helps to
increase readability. I also might temporarily use them when editing
a section of code. Of course, they would help programmers paid or
being judged by the lines-of-code productivity.
 
The Formatting of The If Statement
 
I want my code to express the structure of the if statement.
 
What is the structure of an if statement?
 
According to ISO/IEC 9899:1999 (E)
 
selection-statement:
if ( expression )
statement
 
So, there is a head clause
 
if ( expression )
 
and a body statement
 
statement
 
Thus,
 
if( expression ) /* head clause */
{ exam(); ple(); } /* body statement */
 
The separation between head and body in the deep structure
beautifully aligns with the separation between the first
and the second line in the visible surface structure. Thus,
the surface structure exposes the deep structure in a
not-misleading way. (Which to me is the generalized meaning
of »structured programming«: expressing as much of the deep
structure as possible already in the surface structure,
which makes the code more readable.)
 
Moreover, why would anyone sane in his mind compose a line
of the head of a structure /and the first character of its
body/ with the rest of the body following in another line?
 
if( expression ) { /* head clause AND first character
                      of the body statement */
  exam(); ple();   /* rest of body, except for */
}                  /* a line wasted on a brace */
 
As a comparison, in writing, we have headings and text body:
 
The Creation
 
In the beginning God created the heaven and the earth.
 
Now, why would someone write the heading /and the first word of
its body/ on a line, as in the following example?
 
The Creation In
 
the beginning God created the heaven and the earth.
 
I write:
 
if
( sscanf( input_buffer, "%d %d %d", &length, &width, &height )== 3 &&
  sscanf( other_buffer, "%d %d %d", &color, &price, &weight )== 3 &&
  needs_processing( color ))
{ compute_volume( length, width, height );
  compute_something_else( price, weight ); }
 
^
'---- What I like about this is:
 
This is an if statement with two variable constituents:
an expression and a statement.
 
This is the macrostructure.
 
The major questions one encounters when reading are:
 
- What kind of entity is this at all?
 
And then, having learned that it is an if statement:
 
- Where is its expression?
 
- Where is its statement?
 
And it is exactly the answers to these three questions
that are each marked with a character in the leftmost column!
 
So the type of the entity and its two constituents
clearly stand out graphically (visually). Albeit in a
manner not too obtrusive and not wasting lines.
 
Now we can compare this with (IIRC, this is code someone else wrote):
 
if (sscanf(input_buffer, "%d %d %d", &length, &width, &height) == 3 &&
    sscanf(other_buffer, "%d %d %d", &color, &price, &weight) == 3 &&
    needs_processing(color)) {
  compute_volume(length, width, height);
  compute_something_else(price, weight);
}
 
Here the start and end of the statement are marked in the
leftmost column. This is also not bad, but the important
separation between the expression and the statement of the
if is »hidden« in the »center« of this character structure.
 
from Kernighan's B tutorial from 1972:
 
if(---)
{ -----
  ----- }
else if(---)
{ -----
  ----- }
else if(----)
{ -----
  ----- }
 
from en.wikipedia.org/wiki/Indent_style (under pico):
 
stuff(n):
{ x: 3 * n;
  y: doStuff(x);
  y + x }
 
>void foo( Type&, Type& ) {}
>int main ()
>{ {string a( 1, 'A' ); stringfromlist b( 2, 'A' ); foo( a, b ); }}
 
Ok, so in the end, factory functions might be better, and
- one might say - with »make_pair«, »make_unique« and so
on, there already are models in the standard library today.
 
A good name might start with »make_X_from_«, e.g.,
»make_string_from_list« or »make_string_by_product«
(where »"bbbbb"« would be the "product" »5×'b'«).
 
(Or, one might say the opposite: We should reserve »make_«
for the standard library and use »create_«.)
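
A minimal sketch of such factory functions, using the names suggested above. The signatures and the check function are assumptions made for illustration, not an existing library interface.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>

// Builds a string whose characters are exactly the listed ones.
inline std::string make_string_from_list(std::initializer_list<char> chars)
{
    return std::string(chars);
}

// Builds a string of `count` copies of `ch`, the "product" count x ch.
inline std::string make_string_by_product(std::size_t count, char ch)
{
    return std::string(count, ch);
}

// Usage: the function name, not the bracket type, states the intent.
inline void check_factories()
{
    assert(make_string_by_product(5, 'b') == "bbbbb");

    const std::string s = make_string_from_list({2, 'A'});
    assert(s.size() == 2 && s[0] == char(2) && s[1] == 'A');
}
```

Unlike the derived classes, both factories return a plain std::string, so overload resolution and template argument deduction see only one type.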
Paavo Helde <myfirstname@osa.pri.ee>: Sep 14 01:03PM +0300

On 14.09.2016 1:32, Tim Rentsch wrote:
> As it stands the above statement begs the question.
 
> Can anyone quote chapter and verse to provide a compelling
> answer to this question?
 
A C++ implementation can be hosted or free-standing. I believe the
question here is if a hosted C++ implementation must support all the
locales and character sets supported by its host.
 
In the C++ standard the difference between hosted and free-standing is
that a free-standing implementation may have fewer standard headers and
may not support multithreading. I do not find any mention of
supporting "host locales" or something like that. So I guess the MSVC++
implementation might be standard-conforming in this area if they declared
that they are just supporting half of UCS-2 as the maximum character set.
 
However, this is not what they claim. In all documentation they give an
impression that they support all Unicode. E.g. from
https://msdn.microsoft.com/en-us/library/2dax2h36.aspx:
 
[quote]
A wide character is a 2-byte multilingual character code. Most
characters used in modern computing worldwide, including technical
symbols and special publishing characters, can be represented according
to the Unicode specification as a wide character. Characters that cannot
be represented in just one wide character can be represented in a
Unicode pair by using the Unicode surrogate feature. Because every wide
character is represented in a fixed size of 16 bits, using wide
characters simplifies programming with international character sets.
[/quote]
 
It appears they have redefined UTF-16 as "surrogate pairs", and the last
sentence does not make any sense whatsoever (probably it is a left-over
from UCS-2 times where "fixed size" == 1).
 
From https://msdn.microsoft.com/en-us/library/69ze775t.aspx
 
[quote]
Universal character names cannot encode values in the surrogate code
point range D800-DFFF. For Unicode surrogate pairs, specify the
universal character name by using \UNNNNNNNN, where NNNNNNNN is the
eight-digit code point for the character. The compiler generates a
surrogate pair if required.
[/quote]
 
So it appears they have more or less transparent support for
Unicode in string literals, but not in single wchar_t characters.
 
Indeed, an experiment with old Coptic zero:
 
#include <iostream>
#include <stdint.h>
int main() {
    const wchar_t a[] = L"\U000102E0";  // one code point above U+FFFF
    std::cout << sizeof(a)/sizeof(a[0]) << "\n";  // element count, including the terminator
    std::cout << std::hex << uint32_t(a[0]) << " " << uint32_t(a[1]) << "\n";

    wchar_t x = L'\U000102E0';  // the same code point as a single wide character
    std::cout << std::hex << (int) x << "\n";
}
 
Produces:
main.cpp(8): warning C4066: characters beyond first in wide-character
constant ignored
 
3
d800 dee0
d800
 
The string literal seems to be proper UTF-16, but the value of wchar_t
is obviously wrong.
 
For comparison, gcc output for the same program:
 
2
102e0 0
102e0
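
The d800 dee0 pair in the MSVC output can be checked by hand with the standard UTF-16 encoding rule. A small sketch follows; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Splits a code point above U+FFFF into a UTF-16 (high, low) surrogate pair.
inline std::pair<std::uint16_t, std::uint16_t> to_surrogate_pair(std::uint32_t code_point)
{
    assert(code_point > 0xFFFF && code_point <= 0x10FFFF);
    const std::uint32_t offset = code_point - 0x10000;  // 20 significant bits
    const std::uint16_t high = static_cast<std::uint16_t>(0xD800 + (offset >> 10));
    const std::uint16_t low = static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF));
    return {high, low};
}

inline void check_surrogates()
{
    // U+102E0, the code point used in the experiment above.
    const std::pair<std::uint16_t, std::uint16_t> p = to_surrogate_pair(0x102E0);
    assert(p.first == 0xD800 && p.second == 0xDEE0);
}
```

With a 16-bit wchar_t the character literal keeps only the first unit, 0xD800, which on its own is an unpaired surrogate and not a character.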
"Chris M. Thomasson" <invalid@invalid.invalid>: Sep 13 04:42PM -0700

On 9/3/2016 5:24 AM, Öö Tiib wrote:
 
> Note that people may miss the fun part and consider something
> like that seriously. The '/dev/random' on Unix or 'CryptGenRandom' on
> Windows likely work orders of magnitude faster.
 
So far, AFAICT, there just may be a rather interesting property wrt this
"scheme". I am having some trouble understanding how there can be any
sort of "concrete set period" where the race-condition based "random"
numbers will "exactly repeat themselves", as in PRNGs. It sure seems to
want to be irrational.
 
Humm...
 
 
Any thoughts?
"Öö Tiib" <ootiib@hot.ee>: Sep 14 12:45AM -0700

On Wednesday, 14 September 2016 02:42:16 UTC+3, Chris M. Thomasson wrote:
> want to be, irrational.
 
> Humm...
 
> Any thoughts?
 
A computer is specifically designed to make its work repeatable, so
your program produces seemingly random results only because other
I/O or programs consume the computer's resources while it runs
(you move the mouse, press keys, packets are exchanged with the
network, another program decodes multimedia, etc.).