Saturday, January 11, 2020

Digest for comp.lang.c++@googlegroups.com - 25 updates in 5 topics

Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 10 02:59PM -0800

> text is? I know this is comp.lang.c++, but I suggest that the
> presence of such text in the C standard, and its absence in the C++
> standard, is relevant to C++.
 
I didn't have time for a longer posting. Also I don't think it's
relevant for C++ because there is additional text in the C++
standard that explicitly makes such accesses be undefined behavior.
 
>> being omitted).
 
> Please consider telling us *how* it can be established constructively
> rather than just being vague.
 
My comments were consciously incomplete, but I don't think it's
fair to call them vague, just incomplete. In any case I will try
to post a more complete explanation sometime soon, in comp.lang.c.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 10 09:12PM -0800


> Also, don't you find it a bit odd that what you consider to be
> "the whole point of a union" has always had undefined behavior in
> C++, and had undefined behavior in C all the way up until C2011?
 
Note that James's opinion on when the behavior was defined in C
doesn't match the C standard committee's opinion on when the
behavior was defined in C. See for example DR 283 (dated 2002)
 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm
 
> identifying that the behavior of "write as one type and read as
> another" is to reinterpret the associated memory as having the
> type of the lvalue used to read it.
 
False. See N1256 if you don't believe me.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 10 09:26PM -0800

> track of what type is currently stored in a union; the results are
> machine dependent if something is stored as one type and extracted
> as another."
 
As far as how unions behave goes, what this says is basically the
same as what C89/C90 says, which in turn has the same intended
meaning as the more complete descriptions in later C standards.
 
Note that I'm not saying anything about the main or intended purpose
of unions. Clearly though K&R recognized that unions could be used
for type punning (as it is called today).
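For the C++ side of this question, where reading an inactive union member is undefined behavior as noted earlier in the thread, a commonly used alternative is to copy the bytes instead. The following is only a sketch: the helper names `float_bits`, `bits_to_float` and `check_round_trip` are hypothetical, and it assumes `float` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the object representation of a float as a
// 32-bit unsigned integer without reading an inactive union member.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // byte copy between trivially copyable types
    return bits;
}

// Hypothetical inverse of float_bits.
float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Checks that converting to bits and back preserves the representation.
void check_round_trip(float f) {
    assert(float_bits(bits_to_float(float_bits(f))) == float_bits(f));
}
```

Compilers typically turn such a fixed-size `memcpy` into a plain register move, so the copy need not cost anything at run time.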
David Brown <david.brown@hesbynett.no>: Jan 11 01:03PM +0100

On 10/01/2020 17:10, Tim Rentsch wrote:
> warning message saying the function is unused is the success
> indicator for testing the function in question. It is only
> functions that are called that might contain program errors.
 
The people who maintain your code must really love you!
 
If you have written code that is not used anywhere, then presumably it
was either used in the past, or it is expected that it might be used in
the future. If it could be used in the future, then it should either be
tested, or it should somehow be marked that it is not tested. And if it
is not going to be used in the future, remove it.
 
(Of course there will often be stages in development where it is fine to
check in untested code.)
Ian Collins <ian-news@hotmail.com>: Jan 12 07:50AM +1300

On 12/01/2020 01:03, David Brown wrote:
> is not going to be used in the future, remove it.
 
> (Of course there will often be stages in development where it is fine to
> check in untested code.)
 
Are there?
 
--
Ian.
"Öö Tiib" <ootiib@hot.ee>: Jan 11 11:55AM -0800

On Saturday, 11 January 2020 20:50:44 UTC+2, Ian Collins wrote:
 
> > (Of course there will often be stages in development where it is fine to
> > check in untested code.)
 
> Are there?
 
It depends on what you mean by untested. Merging untested pull requests
is kind of a crime, but an untested (or even non-compiling) check-in does
not matter.
Daniel <danielaparker@gmail.com>: Jan 10 03:41PM -0800

On Friday, January 10, 2020 at 1:18:43 PM UTC-5, David Brown wrote:
 
> No, it is not plausible - you can't have ASCII 10 in the JSON string.
> It must be escaped as "\n".
 
ASCII 10 in JSON quoted string values must be escaped as "\n", but
the JSON text may otherwise contain unescaped white space characters, including
ASCII 10.
 
Daniel
Jorgen Grahn <grahn+nntp@snipabacken.se>: Jan 11 10:48AM

> On Fri, 10 Jan 2020 14:02:38 +0100
> David Brown <david.brown@hesbynett.no> wrote:
...
>>I think it would be nice to see more use of SCTP, which combines the
 
> Unfortunately SCTP is de facto dead. It was a good attempt but didn't get the
> traction.
 
I believe SCTP has its entrenched niche uses, in at least telecom,
replacing older horrors.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
David Brown <david.brown@hesbynett.no>: Jan 11 12:01PM +0100

On 11/01/2020 00:41, Daniel wrote:
 
> ASCII 10 in JSON quoted string values must be escaped as "\n", but
> the JSON text may otherwise contain unescaped white space characters, including
> ASCII 10.
 
I checked again, and you are entirely correct - any of the four standard
white space characters are allowed (and ignored) outside of strings in JSON.
 
However, in newline-delimited JSON, which I was talking about, you can't
have ASCII 10 or ASCII 13 in the encoding. (Use \n and \r in strings.)
This is precisely so that it is suitable for streamed packets, and it
lets you find the end of each message simply by scanning for ASCII 10.
(Strings are in UTF-8, and byte 10 would not be a valid part of a UTF-8
string except as a LF character.)
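The scanning described here can be sketched in C++ roughly as follows. The function names `frame_line` and `extract_lines` are hypothetical, and the sketch assumes the sender has already escaped every line break inside JSON strings as \n or \r, as newline-delimited JSON requires.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sender-side helper: appends the LF delimiter to one JSON text.
// The JSON text itself must not contain a raw LF or CR.
std::string frame_line(const std::string& json) {
    assert(json.find('\n') == std::string::npos);
    assert(json.find('\r') == std::string::npos);
    return json + '\n';
}

// Hypothetical receiver-side helper: removes every complete LF-terminated
// message from `buffer` and returns them without the delimiter. An
// incomplete trailing message stays in `buffer` until more bytes arrive.
std::vector<std::string> extract_lines(std::string& buffer) {
    std::vector<std::string> messages;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = buffer.find('\n', start);
        if (end == std::string::npos) break;
        messages.push_back(buffer.substr(start, end - start));
        start = end + 1;
    }
    buffer.erase(0, start);
    return messages;
}
```

A receiver would append whatever each read returns to `buffer` and call `extract_lines` afterwards; no JSON parsing is needed to find the message boundaries.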
 
So I would have been correct if I had stuck to referring to
line-delimited or newline-delimited JSON. But I was wrong when I
referred to general JSON. Sorry for the mix-up, and thank you for the
correction.
Jorgen Grahn <grahn+nntp@snipabacken.se>: Jan 11 11:05AM

> On Thu, 9 Jan 2020 12:59:29 -0800 (PST)
> "Öö Tiib" <ootiib@hot.ee> wrote:
...
> the expected packet size in a stream packet (usually at the beginning) instead
> of doing something stupid like sending raw XML or json on its own where
> constant parsing is required to see if the end of the data has been reached yet
 
I kind of disagree. Having an end marker is a perfectly good way to
split a stream into messages. SMTP is one of many traditional
protocols which does it this way.
 
I agree it's not good to have a weakly defined end-of-message, or
one that's expensive to find.
 
Sending the message size before the message typically means you have
to generate the message and buffer it locally before you can begin to
transmit it -- which is inefficient for large messages, or messages
which are expensive to generate.
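To make the trade-off concrete, here is a sketch of length-prefixed framing. The 4-byte big-endian header and the names `frame_with_length` and `try_extract_frame` are assumptions chosen for illustration, not taken from any particular protocol. Note that the sender needs the complete payload in hand before it can write the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical framing: a 4-byte big-endian length followed by the payload.
// The whole payload must exist before the first byte can be sent.
std::string frame_with_length(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Tries to remove one complete frame from the front of `buffer`.
// Returns true and fills `payload` on success. Returns false and leaves
// `buffer` untouched if more bytes are needed.
bool try_extract_frame(std::string& buffer, std::string& payload) {
    if (buffer.size() < 4) return false;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(buffer[i]);
    if (buffer.size() - 4 < n) return false;
    payload = buffer.substr(4, n);
    buffer.erase(0, 4 + static_cast<std::size_t>(n));
    return true;
}
```

With this scheme the payload may contain any byte, including ASCII 10, at the cost of buffering each message before transmission.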
 
> and may get it wrong if the formatting is bad and continue waiting for data
> beyond the end.
 
That's IMO part of normal robustness, which any "genuine" network
programmer needs to implement. (Part of it involves giving up after a
timeout: if a message has been partly received and nothing more
happens after a few seconds, it's probably never going to happen.)
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Mister2U@honorific.org: Jan 11 11:53AM

On Fri, 10 Jan 2020 19:18:31 +0100
>> layer has to worry about.
 
>No, it is not plausible - you can't have ASCII 10 in the JSON string.
>It must be escaped as "\n".
 
I'll take your word for that, but I've definitely seen json strings with raw
newlines in. Obviously not following the json spec.
Mister2U@honorific.org: Jan 11 11:55AM

On 11 Jan 2020 11:05:49 GMT
>programmer needs to implement. (Part of it involves giving up after a
>timeout: if a message has been partly received and nothing more
>happens after a few seconds, it's probably never going to happen.)
 
Partial reception is the least of your worries - reading part of the next
message by accident is far more of an issue. All this goes away with a length
field which is why all sane network and application layer protocols use one.
David Brown <david.brown@hesbynett.no>: Jan 11 01:20PM +0100

>> It must be escaped as "\n".
 
> I'll take your word for that, but I've definitely seen json strings with raw
> newlines in. Obviously not following the json spec.
 
See Daniel's post, and my reply to that. ASCII 10 (and 13) are allowed
in JSON as white space, which is ignored outside strings. But
specifically for line-delimited or newline-delimited JSON, where you
have a newline character after the JSON string, you are /not/ allowed
ASCII 10 or ASCII 13 inside the JSON string object.
Mister2U@honorific.org: Jan 11 12:33PM

On Sat, 11 Jan 2020 13:20:56 +0100
>specifically for line-delimited or newline-delimited JSON, where you
>have a newline character after the JSON string, you are /not/ allowed
>ASCII 10 or ASCII 13 inside the JSON string object.
 
A rule which will almost certainly be forgotten about or not tested for by
someone at some point. For network messages the data should never be its own
delimiter. IMO anyway.
David Brown <david.brown@hesbynett.no>: Jan 11 02:28PM +0100


> A rule which will almost certainly be forgotten about or not tested for by
> someone at some point. For network messages the data should never be its own
> delimiter. IMO anyway.
 
In line-delimited JSON, the delimiter is not part of the data - so it
fulfils your requirement.
 
Most JSON libraries generate strings without any extra white space
unless you specifically ask for it (which can be useful for human
readability).
 
And for any communication with any protocol, you have to be sure you
implement the protocol correctly and test appropriately. I can't see
why line-delimited JSON would be any different.
Daniel <danielaparker@gmail.com>: Jan 11 07:05AM -0800

On Saturday, January 11, 2020 at 6:01:35 AM UTC-5, David Brown wrote:
 
> So I would have been correct if I had stuck to referring to
> line-delimited or newline-delimited JSON.
 
But what would you have been referring to? This one?
 
https://github.com/ndjson/ndjson-spec
 
For interoperability it's preferred to refer to an Internet Standards
Document, but it doesn't seem an RFC exists yet for this one.
 
Daniel
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 11 07:10AM -0800

Öö Tiib writes:
 
>> int p0;
>> };
 
> Standard layout with common initial sequence with base.
 
Unless I'm missing something the two struct types base and object_0
don't have a common initial sequence (or we might say the common
initial sequence is the empty set). The first member of base is of
type int. The first member of object_0 is of type base. These two
types, int and base, are not layout-compatible: they are not the
same type; they are not layout-compatible enumerations; and they
are not layout-compatible standard-layout class types. Hence there
is no common initial sequence.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 11 07:28AM -0800

> ___________________________
 
> Wrt POD and a common base object on each sub-object in a union, well
> does it bite the dust with UB?
 
I believe this runs afoul of the rules for common initial
sequences, because type 'base' has as its first member a type
that is not layout compatible with the type of the first members
of object_0 and object_1. Either just use an unadorned int for
the first members of object_0 and object_1:
 
    struct object_0 {
        int p0;
        ...
    };
 
    struct object_1 {
        int p0;
        ...
    };
 
or wrap the 'base' member of the union in a struct
 
    union object {
        struct { base b; } tag;
        object_0 o0;
        object_1 o1;
    };
 
with corresponding changes to the accessing code.
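A compilable sketch of the first suggestion (plain int first members) might look like this. The members `a` and `b` and the functions `read_tag` and `make_object_1` are hypothetical; the thread elides the real members.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Member names other than p0 are hypothetical; the thread elides them.
struct object_0 { int p0; int a; };
struct object_1 { int p0; double b; };

union object {
    object_0 o0;
    object_1 o1;
};

static_assert(std::is_standard_layout<object_0>::value &&
              std::is_standard_layout<object_1>::value,
              "the common-initial-sequence rule needs standard-layout structs");

// p0 belongs to the common initial sequence of object_0 and object_1, so it
// may be read through o0 even when o1 is the active member.
int read_tag(const object& o) { return o.o0.p0; }

// Makes o1 the active member via placement new, then checks the tag.
object make_object_1(int tag, double b) {
    object o;
    new (&o.o1) object_1{tag, b};
    assert(read_tag(o) == tag);
    return o;
}
```

Only `p0` is covered here: the members after it have different types in the two structs, so they fall outside the common initial sequence.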
 
I haven't looked carefully at the updating part of the question.
My guess is if you want to be really sure the code isn't doing
anything "bad" it should be coded more carefully. But that is
just a guess, so take it for what it's worth.
"Öö Tiib" <ootiib@hot.ee>: Jan 11 10:23AM -0800

On Saturday, 11 January 2020 17:10:43 UTC+2, Tim Rentsch wrote:
> same type; they are not layout-compatible enumerations; and they
> are not layout-compatible standard-layout class types. Hence there
> is no common initial sequence.
 
The address of a standard-layout object and that of its first member are
required to be the same, so a standard-layout struct x and another struct
that has x as its first member have the whole of x as a common initial
sequence.
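The address guarantee mentioned here can be checked directly. This is a minimal sketch in which the member names `id` and `b` and the function `as_base` are hypothetical, since the thread does not show the full definitions. It demonstrates only that the first member shares the enclosing object's address; whether that yields a common initial sequence is the point under dispute in this thread.

```cpp
#include <cassert>
#include <type_traits>

struct base { int id; };              // `id` is a hypothetical member name
struct object_0 { base b; int p0; };  // first member is of type base

static_assert(std::is_standard_layout<object_0>::value,
              "the address guarantee applies to standard-layout classes");

// Converts a pointer to the struct into a pointer to its first member.
const base* as_base(const object_0* o) {
    const base* b = reinterpret_cast<const base*>(o);
    assert(b == &o->b);  // same address as the named first member
    return b;
}
```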
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 10 10:48PM -0800

Öö Tiib writes:
 
 
> It is nonsensical to assert that the standard says something special about
> properties of objects of complete types when these are the only kind
> of objects that may exist in a well-formed program.
 
I think you misunderstand the English word nonsensical. A
statement can be unlikely, silly, or ridiculous, but none
of those make it nonsensical. In fact to say a statement
is unlikely more or less automatically implies it is sensical.
 
> If you mean (leaving the nonsense aside) that "all objects must have
> proper alignment" is said by standard in some other wording then feel
> free to quote; I have not found it.
 
I mean I find the statement consistent with what the standard
does say.
 
 
> I used "bald assertion" in sense of "a statement used without proof
> or evidence of truth". Wikipedia claims it is used in that sense
> about marketing statements ... [...]
 
Despite what Wikipedia may say, that isn't what the word means.
I recommend, especially for people for whom English is not their
first language, consulting a dictionary (or better, more than one)
whenever there is any question about what a word means or whether
a particular word is a good choice in a particular situation. I
routinely consult dictionaries when questions like this come up
(yes, even though English is my first language).
"Öö Tiib" <ootiib@hot.ee>: Jan 11 12:24AM -0800

On Saturday, 11 January 2020 08:48:50 UTC+2, Tim Rentsch wrote:
> a particular word is a good choice in a particular situation. I
> routinely consult dictionaries when questions like this come up
> (yes, even though English is my first language).
 
I can communicate in 4 languages and so am convinced that words
mean nothing in essence. A word may be used with one meaning in one
language, with a totally different meaning in another language, and
not at all in a third. Meaning is what people wanted to express
with those words.
An online English encyclopedia just documents, with citations, that
certain words are used by certain people to express something, and so
these words have that meaning. In the C++ standard these words are not
used to mean anything and so cannot cause confusion with the slang of C++.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 11 06:41AM -0800

Öö Tiib writes:
 
> "The alignment required for a type might be different when it is used
> as the type of a complete object and when it is used as the type of
> a subobject."
 
So it does. Interesting.
 
(It happens that in the document I consulted, these two paragraphs
were split by a page boundary, which makes the second paragraph
easy to miss. It would be better to have the two conflicting
statements be in a single paragraph.)
 
I note that the example relies on virtual base classes. This
explains why the special case is there. ISTM that having the two
cases be different is poorly thought out, as it makes the notion
of alignment be almost meaningless. But, that's C++ for ya.
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 10 08:44PM -0800


> Sure, if you dismiss all the problems (that we have tired of facing in
> practice) with the claim that they do not exist, then answering your
> question of why we avoid macros is impossible.
 
Your comments are not really about macros but about how various
people misuse macros. Any language feature can be misused.
My question is about macros, not about people who misuse them.
"Öö Tiib" <ootiib@hot.ee>: Jan 10 11:44PM -0800

On Saturday, 11 January 2020 06:44:28 UTC+2, Tim Rentsch wrote:
> > <http://coliru.stacked-crooked.com/a/65fc657ef88c80fa>
> > What do you suggest doing about these annoying warnings? Silence?
> > Branch library code? Get the authors to cooperate?
 
Crickets chirping? No answers?
 
 
> Your comments are not really about macros but about how various
> people misuse macros. Any language feature can be misused.
> My question is about macros, not about people who misuse them.
 
A programmer who is ordered to write software for a platform where he
cannot avoid wasting workdays dealing with misuse of the feature by
platform vendors still cannot avoid disliking the feature that allows
such annoying misuses. That holds especially in places where inline
functions, templates or constants would be just as simple but would
not cause the issues whose existence you dismiss.
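One familiar instance of the contrast between a macro and a function template is double evaluation of arguments. The sketch below is illustrative only: `VENDOR_MAX`, `checked_max` and `Counter` are hypothetical names, not taken from the code linked in the thread.

```cpp
#include <cassert>

// Hypothetical macro of the kind a vendor header might define.
#define VENDOR_MAX(a, b) ((a) > (b) ? (a) : (b))

// Function template alternative: each argument is evaluated exactly once.
template <typename T>
constexpr const T& checked_max(const T& a, const T& b) {
    return a > b ? a : b;
}

// Counts how often next() is called.
struct Counter {
    int calls = 0;
    int next() { return ++calls; }
};

// The macro expands c.next() twice: once in the comparison, once as the result.
int calls_made_by_macro() {
    Counter c;
    int r = VENDOR_MAX(c.next(), 0);
    assert(r == 2);
    (void)r;
    return c.calls;
}

// The function template evaluates c.next() once.
int calls_made_by_function() {
    Counter c;
    int r = checked_max(c.next(), 0);
    assert(r == 1);
    (void)r;
    return c.calls;
}
```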
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 10 08:56PM -0800

> circumstances). It's actually the end of that sentence that defines what
> a null pointer is: a pointer guaranteed to compare unequal to a pointer to
> any object or function.
 
The first part of the sentence is the definition. Other ways of
obtaining a null pointer value therefore must have derived that
value, at some point in the program's history, by converting a
null pointer to some pointer type. The second part of the
sentence gives a property that null pointers must satisfy, but is
not the definition of null pointer. It is perfectly possible to
have a pointer value that compares unequal to a pointer to any
object or function, but is not a null pointer.