Saturday, July 3, 2021

Digest for comp.lang.c++@googlegroups.com - 25 updates in 3 topics

olcott <NoOne@NoWhere.com>: Jul 03 10:19AM -0500

*Halting problem undecidability and infinitely nested simulation*
When the halt decider bases its halt status decision on simulating its
input, the conventional halting problem undecidability proof's
counter-example templates can be correctly decided as inputs that never
halt. They will never halt because they specify infinitely nested
simulation to any simulating halt decider.
 
Because a simulating halt decider must always abort the simulation of
every input that never halts, its halt-deciding criteria must be adapted.
[ Does the input halt on its input? ] must become [ Does the input halt
without having its simulation aborted? ] This change is required because
every input to a simulating halt decider either halts on its own or
halts because its simulation has been aborted.
 
The standard pseudo-code halting problem template "proved" that the
halting problem could never be solved on the basis that neither the value
true (halting) nor the value false (not halting) could be correctly
returned to the confounding input.
 
// Simplified Linz Ĥ (Linz:1990:319)
// u32, H, and Output are provided by the x86utm environment; the
// declarations below only sketch their assumed signatures.
typedef unsigned int u32;
u32 H(u32 P, u32 I);                // simulating halt decider under discussion
void Output(const char* s, u32 v);  // prints a label and a value

void P(u32 x)
{
    u32 Input_Halts = H(x, x);
    if (Input_Halts)
        HERE: goto HERE;
}

int main()
{
    u32 Input_Halts = H((u32)P, (u32)P);
    Output("Input_Halts = ", Input_Halts);
}
 
This problem is overcome on the basis that a simulating halt decider
would abort the simulation of its input before ever returning any value
to this input. It aborts the simulation of its input on the basis that
its input specifies what is essentially infinite recursion (infinitely
nested simulation) to any simulating halt decider.
 
The x86utm operating system was created so that the halting problem
could be examined concretely in the high-level language C and in x86 code.
When the halting problem is examined this way, every detail can be
explicitly specified. UTM tape elements are 32-bit unsigned integers.
 
H analyzes the (currently updated) stored execution trace of its x86
emulation of P(P) after it simulates each instruction of input (P, P).
As soon as a non-halting behavior pattern is matched, H aborts the
simulation of its input and decides that its input does not halt.
 
*This is the sound deductive inference (proof) that H(P,P)==0 is correct*
 
Premise(1) (axiom) Every computation that never halts unless its
simulation is aborted is a computation that never halts. This is verified
as true on the basis of the meaning of its words.
 
Premise(2) (verified fact) That the simulation of the input to H(P,P)
never halts without being aborted is a verified fact on the basis of its
x86 execution trace (shown below).
 
When the simulator determines whether or not it must abort the
simulation of its input based on the behavior of that input, the simulator
only acts as an x86 emulator and thus has no effect on the behavior of its
input. This allows the simulator to always ignore its own behavior.
 
Conclusion(3) From the above true premises it necessarily follows that
simulating halt decider H correctly reports that its input (P,P) never
halts.
 
 
 
 
Halting problem undecidability and infinitely nested simulation
https://www.researchgate.net/publication/351947980_Halting_problem_undecidability_and_infinitely_nested_simulation
 
 
--
Copyright 2021 Pete Olcott
 
"Great spirits have always encountered violent opposition from mediocre
minds." Einstein
olcott <NoOne@NoWhere.com>: Jul 03 11:19AM -0500

On 7/3/2021 10:56 AM, Richard Damon wrote:
>> to this input.
 
> But what does aborting the simulation have to do with returning the
> answer to the caller. THEY ARE DIFFERENT.
 
void P(u32 x)
{
    u32 Input_Halts = H(x, x);
    if (Input_Halts)
        HERE: goto HERE;
}

int main()
{
    u32 Input_Halts = H((u32)P, (u32)P);
    Output("Input_Halts = ", Input_Halts);
}
 
When the simulated P calls H in infinitely nested simulation:

IN THE ABOVE C COMPUTATION, NOT ANY OTHER COMPUTATION

the simulating halt decider H must abort the simulation of its input in
the computation H(P,P) before returning any value to P.

IN THE ABOVE C COMPUTATION, NOT ANY OTHER COMPUTATION
 
> Engineering concept of execution context.
 
> After H aborts the simulation, it needs to do something. By its
> requriements, it needs to return that answer to its caller.
 
H does return a value of 0 to main(). H does not return any value to the
simulated P that calls H in infinitely nested simulation.
 
> When it does, it tells that caller that it thinks that P(P) is
> non-Halting, but when it does that to the caller P, that P then halts,
> showing that the answer it gave was wrong.
 
Because a simulating halt decider must always abort the simulation of
every input that never halts, its halt-deciding criteria must be adapted.
 
Does the input halt on its input?
must become
Does the input halt without having its simulation aborted?
 
This change is required because every input to a simulating halt decider
either halts on its own or halts because its simulation has been aborted.
 
> The FUNDAMENTAL definition of Halting is what the machine does when run
> as a machine with the given input. It is shown that P(P) does Halt (and
> you admit this at times) thus P(P) IS a HALTING computation, by definition.
 
That definition simply ignores simulating halt deciders. The fatal
mistake of the halting problem proofs is that they simply ignore
simulating halt deciders.
 
> The non-Halting answer is wrong, by definition.
 
The non-halting answer is correct; the definition must be adapted.
 
> Any 'logic' that 'proves' otherwise is by definition wrong or at least
> shows the logic is inconsistent.
 
According to your incorrect reasoning a simulating halt decider must
always report true because all of its inputs either halt on their own
or are aborted.
 
Because people who are not dumber than a box of rocks do understand
that computations that only halt because they were aborted are
computations that never halt, there is agreement that your reasoning is
incorrect.
 
Ben and Kaz both agree that computations that only halt because their
simulation was aborted are non-halting computations.
 
 
--
Copyright 2021 Pete Olcott
 
"Great spirits have always encountered violent opposition from mediocre
minds." Einstein
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 06:28PM +0200

You are off-topic in comp.lang.c/c++.
Nothing you say is especially related to C or C++.
Stop posting here.
olcott <NoOne@NoWhere.com>: Jul 03 11:51AM -0500

On 7/3/2021 11:28 AM, Bonita Montero wrote:
> You are off-topic in comp.lang.c/c++.
> Nothing you say is especially related to C or C++.
> Stop posting here.
 
It is 100% totally related to software engineering in C.
The first 8 pages of my paper are only about software engineering in C.
 
Halting problem undecidability and infinitely nested simulation
 
https://www.researchgate.net/publication/351947980_Halting_problem_undecidability_and_infinitely_nested_simulation
 
 
--
Copyright 2021 Pete Olcott
 
"Great spirits have always encountered violent opposition from mediocre
minds." Einstein
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 07:18PM +0200

> It is 100% totally related to software engineering in C.
 
It's related to programming in general.
So you shouldn't post in groups related to specific languages.
olcott <NoOne@NoWhere.com>: Jul 03 12:19PM -0500

On 7/3/2021 12:18 PM, Bonita Montero wrote:
>> It is 100% totally related to software engineering in C.
 
> It's related to programming in general.
> So you shouldn't post in groups related to specific languages.
 
It is a specific software engineering problem in the C programming
language. The C source code is provided.
 
--
Copyright 2021 Pete Olcott
 
"Great spirits have always encountered violent opposition from mediocre
minds." Einstein
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 07:33PM +0200

>> So you shouldn't post in groups related to specific languages.
 
> It is a specific software engineering problem in the C programming
> language. The C source code is provided.
 
You don't discuss any C/C++-specific issues.
comp.theory is the only NG that fits.
Stop posting in comp.lang.c/c++.
You're an off-topic terrorist.
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Jul 03 03:30AM +0200

On 2 Jul 2021 22:15, James Kuyper wrote:
> the U prefix.
 
> I have a feeling that there's a misunderstanding somewhere in this
> conversation, but I'm not sure yet what it is.
 
Juha is concerned about the compiler assuming some other source code
encoding than the actual one.
 
The implementation-defined encoding of `wchar_t`, where in practice the
possibilities as of 2021 are either UTF-16 or UTF-32, doesn't matter.
 
A correct source code encoding assumption can be guaranteed by simply
statically asserting that the basic execution character set is UTF-8, as
I showed in my original answer in this thread.
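
(The exact assert isn't quoted in this digest; the following is a minimal
sketch of that kind of compile-time check, assuming the file containing it
is itself saved as UTF-8. 0xC3 0xA6 is the UTF-8 encoding of 'æ'.)

// Sketch: first check that the narrow execution character set is UTF-8 ...
static_assert( sizeof( "\u00E6" ) == 3
    && "\u00E6"[0] == '\xC3' && "\u00E6"[1] == '\xA6',
    "Compile with a UTF-8 execution character set (e.g. MSVC /utf-8)." );
// ... then that this UTF-8 source file was decoded correctly.
static_assert( sizeof( "æ" ) == sizeof( "\u00E6" )
    && "æ"[0] == "\u00E6"[0] && "æ"[1] == "\u00E6"[1],
    "This source file must be read as UTF-8 (e.g. MSVC /source-charset:utf-8)." );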
 
 
 
>> the stuff between the quotation marks in the source code will use
>> whatever encoding (ostensibly but not assuredly UTF-8), which the
 
> I don't understand why you think that the source code encoding matters.
 
It matters because if the compiler assumes wrong, and Visual C++
defaults to assuming Windows ANSI when no other indication is present
and it's not forced by options, then one gets incorrect literals.
 
Which may or may not be caught by unit testing.
 
 
> those encodings are for the characters {'u', '"', '\\', 'x', 'C', '2',
> '\\', 'x', 'A', '9', '"'}, any fully conforming implementation must give
> you the standard defined behavior for u"\xC2\xA9".
 
Consider:
 
#include <iostream>
using std::cout, std::hex, std::endl;

auto main() -> int
{
    // s16 holds the code units 0x00C2 and 0x00A9 plus a terminating u'\0',
    // which the if( code ) test skips.
    const char16_t s16[] = u"\xC2\xA9";
    for( const int code: s16 ) {
        if( code ) { cout << hex << code << " "; }
    }
    cout << endl;
}
 
The output of this program, i.e. the UTF-16 encoding values in `s16`, is
 
c2 a9
 
Since Unicode is an extension of Latin-1, the UTF-16 interpretation of
`\xC2` and `\xA9` is as Latin-1 characters, respectively "Â" and (not a
coincidence) "©", according to my Windows 10 console in codepage 1252.
 
Which is not the single "©" that an UTF-8 interpretation gives.
 
 
[snip]
 
 
- Alf
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 03 12:44AM -0400

On 7/2/21 9:30 PM, Alf P. Steinbach wrote:
>> the U prefix.
 
>> I have a feeling that there's a misunderstanding somewhere in this
>> conversation, but I'm not sure yet what it is.
 
I now have a much better idea what the misunderstanding is. See below.
 
> A correct source code encoding assumption can be guaranteed by simply
> statically asserting that the basic execution character set is UTF-8, as
> I showed in my original answer in this thread.
 
The encoding of the basic execution character set is irrelevant if the
string literals are prefixed with u8, u, or U, and use only valid escape
sequences to specify members of the extended character set. The encoding
for such literals is explicitly mandated by the standard. Are you (or
he) worrying about a failure to conform to those mandates?
 
...
 
> It matters because if the compiler assumes wrong, and Visual C++
> defaults to assuming Windows ANSI when no other indication is present
> and it's not forced by options, then one gets incorrect literals.
 
Even when u8, u or U prefixes are specified?
 
...
> }
 
> The output of this program, i.e. the UTF-16 encoding values in `s16`, is
 
> c2 a9
 
 
Yes, that's precisely what the C++ standard mandates, regardless of the
encoding of the source character set. Which is why I mistakenly thought
that's what he was trying to do.
 
> `\xC2` and `xA9` is as Latin-1 characters, respectively "Â" and (not a
> coincidence) "©" according to my Windows 10 console in codepage 1252.
 
> Which is not the single "©" that an UTF-8 interpretation gives.
 
OK - it had not occurred to me that he was trying to encode "©", since
that is not the right way to do so. In a sense, I suppose that's the
point you're making.
My point is that all ten of the following escape sequences should be
perfectly portable ways of specifying that same code point in each of
three Unicode encodings:
 
UTF-8: u8"\u00A9\U000000A9"
UTF-16: u"\251\xA9\u00A9\U000000A9"
UTF-32: U"\251\xA9\u00A9\U000000A9"
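
For instance (an illustrative check, not part of the original list), a
conforming compiler must accept the following compile-time assertions no
matter what encoding it assumed for the source file, because escape
sequences use only characters of the basic source character set:

static_assert( u"\251"[0] == u'\u00A9', "octal escape in a UTF-16 literal" );
static_assert( u"\xA9"[0] == u'\u00A9', "hex escape in a UTF-16 literal" );
static_assert( U"\U000000A9"[0] == U'\u00A9', "UCN in a UTF-32 literal" );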
 
Do you know of any implementation which is non-conforming because it
misinterprets any of those escape sequences?
Juha Nieminen <nospam@thanks.invalid>: Jul 03 06:59AM


>> the stuff between the quotation marks in the source code will use
>> whatever encoding (ostensibly but not assuredly UTF-8), which the
 
> I don't understand why you think that the source code encoding matters.
 
Because the source file will most often be a text file using 8-bit
characters and, in these situations most likely (although not assuredly)
using UTF-8 encoding for non-ascii characters.
 
However, when you write:
 
const char16_t *str = u"something";
 
if that "something" contains non-ascii characters, which in this case will
be (usually) UTF-8 encoded in this source code, the compiler will have to
interpret that UTF-8 string and convert it to UTF-16 for the output
binary.
 
So the problem is the same as with wchar_t: How does the compiler know
which encoding is being used in this source file? It needs to know that
since it has to generate an UTF-16 string literal into the output binary
from those characters appearing in the source code.
 
> When using the u8 prefix, UTF-8 encoding is guaranteed, for which every
> codepoint from U+0000 to U+007F is represented by a single character
> with a numerical value matching the code point.
 
UTF-8 encoding is guaranteed *for the result*, ie. what the compiler writes
to the output binary. Is it guaranteed to *read* the characters between the
quotation marks in the source code and interpret them as UTF-8?
 
> codepoint from U+0000 to U+D7FF, and from U+E000 to U+FFFF, is
> represented by a single character with a numerical value matching the
> codepoint.
 
Same issue, even more relevantly here.
 
> Do you know of any implementation of C++ that claims to be fully
> conforming, for which that is not the case? If so, how do they justify
> that claim?
 
Visual Studio will, by default (ie. with default project settings after
having created a new project) interpret the source files as Windows-1252
(which is very similar to ISO-Latin-1).
 
This means that when you write L"something" or u"something", if there
are any non-ascii characters between the quotation marks, UTF-8 encoded,
then the result will be incorrect. (In order to make Visual Studio do
the correct conversion, you need to specify that the file is UTF-8 encoded
in the project settings).
 
> according to an implementation-defined character encoding. But so long
> as they encode {'\\', 'x', 'C', '2', '\\', 'x', 'A', '9'}, you should
> get the standard-defined behavior for u"\xC2\xA9".
 
Yes, but that's not the correct desired character in UTF-16, only in UTF-8.
You'll get garbage as your UTF-16 string literal.
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Jul 03 01:31PM +0200

On 3 Jul 2021 06:44, James Kuyper wrote:
 
> The encoding of the basic execution character set is irrelevant if the
> string literals are prefixed with u8, u, or U, and use only valid escape
> sequences to specify members of the extended character set.
 
Having a handy simple way to guarantee a correct source code encoding
assumption doesn't seem irrelevant to me.
 
On the contrary it's directly a solution to the OP's problem, which to
me appears to be maximally relevant.
 
 
> The encoding
> for such literals is explicitly mandated by the standard. Are you (or
> he) worrying about a failure to conform to those mandates?
 
No, Juha is worrying about the compiler's source code encoding assumption.
 
 
>> defaults to assuming Windows ANSI when no other indication is present
>> and it's not forced by options, then one gets incorrect literals.
 
> Even when u8, u or U prefixes are specified?
 
Yes. As an example, consider
 
const auto& s = u"Blåbær, Mr. Watson.";
 
If the source is UTF-8 encoded, without a BOM or other encoding marker,
and if the Visual C++ compiler is not told to assume UTF-8 source code,
then it will incorrectly assume that this is Windows ANSI encoded.
 
The UTF-8 bytes in the source code will then be interpreted as Windows
ANSI character codes, e.g. as Windows ANSI Western, codepage 1252.
 
The compiler will then see this source code:
 
const auto& s = u"BlÃ¥bÃ¦r, Mr. Watson.";
 
And it will proceed to encode /that/ string as UTF-16 in the resulting
string value.
 
 
> UTF-32: U"\251\xA9\u00A9\U000000A9"
 
> Do you know of any implementation which is non-conforming because it
> misinterprets any of those escape sequences?
No, they should work. These escapes are an alternative solution to
Juha's problem. However, they lack readability and involve much more
work than necessary, so IMO the thing to do is to assert UTF-8.
 
- Alf
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 03 08:44AM -0400

On 7/3/21 7:31 AM, Alf P. Steinbach wrote:
> On 3 Jul 2021 06:44, James Kuyper wrote:
>> On 7/2/21 9:30 PM, Alf P. Steinbach wrote:
...
 
>> Even when u8, u or U prefixes are specified?
 
> Yes. As an example, consider
 
> const auto& s = u"Blåbær, Mr. Watson.";
 
The comment that led to this sub-thread was specifically about the
usability of escape sequences to specify members of the extended
character set, and that's the only thing I was talking about. While that
string does contain such members, it contains not a single escape sequence.
 
...
>> misinterprets any of those escape sequences?
> No, they should work. These escapes are an alternative solution to
> Juha's problem. ...
 
They are the only solution that this sub-thread has been about.
 
> ... However, they lack readability and involve much more
> work than necessary, so IMO the thing to do is to assert UTF-8.
 
Those are reasonable concerns. That the system's assumptions about the
source character set would prevent those escapes from working is not.
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 03 09:03AM -0400

On 7/3/21 2:59 AM, Juha Nieminen wrote:
 
> Because the source file will most often be a text file using 8-bit
> characters and, in these situations most likely (although not assuredly)
> using UTF-8 encoding for non-ascii characters.
 
Every comment I made on this sub-thread was predicated on the absence of
any actual members of the extended character set - I was talking only
about the feasibility of using escape sequences to specify such members.
 
...
 
> Visual Studio will, by default (ie. with default project settings after
> having created a new project) interpret the source files as Windows-1252
> (which is very similar to ISO-Latin-1).
 
So, that shouldn't cause a problem for escape sequences, which, as a
matter of deliberate design, consist entirely of characters from the
basic source character set.
 
>> get the standard-defined behavior for u"\xC2\xA9".
 
> Yes, but that's not the correct desired character in UTF-16, only in UTF-8.
> You'll get garbage as your UTF-16 string literal.
 
The 'u' mandates UTF-16, which is the only thing that's relevant to the
interpretation of that string literal. Those are the correct pair of
characters, given that UTF-16 has been mandated. Whether or not it's the
intended character depends upon how well your code expresses your
intentions. Alf says that the character that was intended was U+00A9, so
that code does not correctly express that intention. The correct way to
specify it doesn't depend upon the source character set, it only depends
upon the desired encoding of the string. Each of the following ten
escape sequences is a portably correct way of expressing that
intention:
 
UTF-8: u8"\u00A9\U000000A9"
UTF-16: u"\251\xA9\u00A9\U000000A9"
UTF-32: U"\251\xA9\u00A9\U000000A9"
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Jul 03 02:23PM +0100

On Sat, 3 Jul 2021 06:59:59 +0000 (UTC)
Juha Nieminen <nospam@thanks.invalid> wrote:
[snip]
> which encoding is being used in this source file? It needs to know that
> since it has to generate an UTF-16 string literal into the output binary
> from those characters appearing in the source code.
 
For encodings other than the 96 characters of the basic source character
set (which map onto ASCII) that the C++ standard requires, this is
implementation-defined and the compiler should document it. In the
case of gcc, it documents that the source character set is UTF-8 unless
a different source file encoding is indicated by the -finput-charset
option.
 
With gcc you can also set the narrow execution character set with the
-fexec-charset option. Presumably for any one string literal this can
be overridden by prefixing it with u8, or it wouldn't be consistent
with the standard, but I have never checked whether that is in fact the
case.
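
For example (an illustrative invocation; the file name is hypothetical), a
Latin-1 encoded source file could be compiled with UTF-8 narrow literals as:

g++ -finput-charset=ISO-8859-1 -fexec-charset=UTF-8 -c prog.cpp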
 
This is what gcc says about character sets, which is somewhat divergent
from the C and C++ standards:
http://gcc.gnu.org/onlinedocs/cpp/Character-sets.html
 
I doubt this is often relevant. What most multi-lingual programs do is
have source strings in English using the ASCII subset of UTF-8 and
translate to UTF-8 dynamically by reference to the locale. Gnu's
gettext is a quite commonly used implementation of this approach.
"Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Jul 03 04:28PM +0200

On 3 Jul 2021 14:44, James Kuyper wrote:
> On 7/3/21 7:31 AM, Alf P. Steinbach wrote:
>> On 3 Jul 2021 06:44, James Kuyper wrote:
[snippety]
>> work than necessary, so IMO the thing to do is to assert UTF-8.
 
> Those are reasonable concerns. That the system's assumptions about the
> source character set would prevent those escapes from working is not.
 
As far as I know nobody's argued that the source encoding assumption
would prevent any escapes from working.
 
If I understand you correctly, your "this sub-thread" about escapes and
universal character designators -- let's just call them all escapes
-- started when you responded to my response to Richard Damon, who had
responded to Juha Nieminen, who wrote:
 
 
[>>]
Does that work for wide string literals? Because I don't think it does.
In other words:
 
std::wstring s = L"Copyright \xC2\xA9 2001-2020";
[<<]
 
 
Richard responded to that:
 
 
[>>]
\x works in wide string literal too, and puts in a character with that
value. The difference is that if the wide string type isn't unicode
encoded then it might get the wrong character in the string.
[<<]
 
 
I responded to Richard:
 
 
[>>]
It gets the wrong characters in the wide string literal, period.
[<<]
 
 
Which it decidedly does.
 
It's trivial to just try it out and see; QED.
 
You responded to that where you snipped Juha's example, indicating some
misunderstanding on your part:
 
 
[>>]
The value of a wide character is determined by the current encoding. For
wide character literals using the u or U prefixes, that encoding is
UTF-16 and UTF-32, respectively, making octal escapes redundant with and
less convenient than the use of UCNs. But as he said, they do work for
such strings.
[<<]
 
 
So, in your mind this sub-thread may have been about whether escape
sequences (including universal character designators) are affected by
the source encoding, but to me it has been about whether Juha's example
yields the desired string, as he correctly surmised that it didn't.
 
And the outer context from the top thread, is about the source
encoding's effect on string literals, which hopefully is now clear.
 
 
- Alf
Richard Damon <Richard@Damon-Family.org>: Jul 03 11:48AM -0400

On 7/3/21 10:28 AM, Alf P. Steinbach wrote:
> It gets the wrong characters in the wide string literal, period.
> [<<]
 
> Which it decidedly does.
 
It puts into the string exactly the characters that you specified, the
character of value 0x00C2 and then the character of value 0x00A9. THAT
is what it says to do. If you meant the \x escapes to say this is a UTF-8
encoded string, why are you expecting that to work?
 
The one issue with \x is that it puts in the characters in whatever
encoding wide strings use, so you can't just assume Unicode values unless
you are willing to assume the wide string encoding is Unicode.
 
 
Juha Nieminen <nospam@thanks.invalid>: Jul 03 04:59PM

> usability of escape sequences to specify members of the extended
> character set, and that's the only thing I was talking about. While that
> string does contain such members, it contains not a single escape sequence.
 
The problem is that the "\xC2\xA9" was presented as a solution to
the compiler wrongly assuming some source file encoding other than UTF-8.
Those two bytes are the UTF-8 encoding of a non-ascii character.
 
In other words, it's explicitly entering the UTF-8 encoding of that
non-ascii character. This works if we are specifying a narrow string
literal (and we want it to be UTF-8 encoded).
 
My point is that it doesn't work for a wide string literal. If you
say L"\xC2\xA9" you will *not* get that non-ascii character you
wanted. Instead, you get two UTF-16 (or UTF-32, depending on
how large wchar_t is) characters which are completely different
from the one you wanted. You essentially get garbage.
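
For instance (an illustrative comparison, assuming a Unicode-encoded
wchar_t, as on both Windows and Linux):

const wchar_t w1[] = L"\xC2\xA9"; // two code units, 0x00C2 and 0x00A9: "Â©"
const wchar_t w2[] = L"\u00A9";   // one code unit, 0x00A9: "©"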
Juha Nieminen <nospam@thanks.invalid>: Jul 03 05:06PM

> have source strings in English using the ASCII subset of UTF-8 and
> translate to UTF-8 dynamically by reference to the locale. Gnu's
> gettext is a quite commonly used implementation of this approach.
 
It's quite relevant. For example, if you are writing unit tests for some
library dealing with wide strings (or UTF-16 strings), it's quite common
to write string literals in your tests, so you need to be aware of this
problem: What will work just fine with gcc might not work with Visual
Studio, and your unit test will succeed in one but not the other.
 
The solution offered elsewhere in this thread is the correct way to go,
ie. using the "\uXXXX" escape codes for such string literals, as they
will always be interpreted correctly by the compiler (even if the
readability of the source code suffers as a consequence).
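
For example (adapting the literal from earlier in this thread; the
variable name is illustrative):

const char16_t copyright[] = u"Copyright \u00A9 2001-2020"; // \u00A9 is ©,
                                                            // regardless of
                                                            // source encoding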
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 02:45PM +0200

> I just did a line count on our codebase; we have about 1 dynamic
> cast for every 10k lines. Compare with static_cast - 1 in 300.
 
Using dynamic_cast instead of a virtual function call is bad coding.
Richard Damon <Richard@Damon-Family.org>: Jul 03 09:39AM -0400

On 7/3/21 8:45 AM, Bonita Montero wrote:
>> I just did a line count on our codebase; we have about 1 dynamic
>> cast for every 10k lines. Compare with static_cast - 1 in 300.
 
> Using dynamic_cast instead of a virtual function call is bad coding.
 
Incorrect; adding the virtual function call to get around the need
for the dynamic_cast often requires injecting into a class concepts that
it should not be dealing with.
 
As an example, say you have a collection of animals, and you want to
extract all the dogs out of it. You could add to Animal and Dog an isDog
function (and then need to change Animal again for each new type of
thing you might want to get - lousy encapsulation), or you can use a
dynamic_cast to check whether this animal is a Dog.
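
A minimal sketch of that example (class and function names are
illustrative, not from any real codebase):

#include <iostream>
#include <memory>
#include <vector>

struct Animal { virtual ~Animal() = default; }; // polymorphic base, no isDog()
struct Dog : Animal { void bark() const { std::cout << "woof\n"; } };
struct Cat : Animal { };

int main()
{
    std::vector<std::unique_ptr<Animal>> zoo;
    zoo.push_back(std::make_unique<Dog>());
    zoo.push_back(std::make_unique<Cat>());

    for (const auto& a : zoo)
        if (const Dog* d = dynamic_cast<const Dog*>(a.get())) // null unless *a is a Dog
            d->bark(); // only the Dog barks
}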
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 03:53PM +0200


> Incorrect, often to add the virtual function call to get around the need
> for the dynamic_cast requires injecting into a class concepts that it
> should not be dealing with.
 
If you're doing a dynamic_cast you're switching on the type of the
class. This switching is better done inside the class with
class-dependent code - with a virtual function call.
 
> function, (and then need to change Animal again for each new type of
> thing you might want get, lousy encapsulation) or you can use a
> dynamic_cast to check if this animal is a Dog.
 
The isDog function would be faster.
Richard Damon <Richard@Damon-Family.org>: Jul 03 09:59AM -0400

On 7/3/21 9:53 AM, Bonita Montero wrote:
 
> If you're doing a dynamic_cast you're switching upon the type of the
> class. This switching should be better done inside the class with class
> -dependent code - with a virtual function call.
 
Wrong. Application level logic should stay in the application level
code. That is the only scalable option.
 
>> thing you might want get, lousy encapsulation) or you can use a
>> dynamic_cast to check if this animal is a Dog.
 
> the isDog-function would be faster.
 
But then it also needs isCat, isHorse, isSnake, isMammal, isReptile,
and so on.
 
The virtual function method says that you need to change Animal EVERY
TIME you define a new type of animal for the system. This breaks
encapsulation, and binary compatibility, as adding a virtual function to
the base class changes the definition of EVERY subclass of that class.
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 05:01PM +0200

>> -dependent code - with a virtual function call.
 
> Wrong. Application level logic should stay in the application level
> code. That is the only scalable option.
 
Virtual function calls are more scalable.
 
 
>> the isDog-function would be faster.
 
> But then it also needs isCat, isHorse, isSnake, isMammel, isReptile,
> and so on.
 
Maybe, but that's much faster. Dynamic downcasts are slower.
 
> The virtual function method says that you need to change Animal EVERY
> TIME you define a new type of animal for the system. This breaks
> encapsulation, ...
 
No, it is encapsulation.
Richard Damon <Richard@Damon-Family.org>: Jul 03 11:42AM -0400

On 7/3/21 11:01 AM, Bonita Montero wrote:
 
>> Wrong. Application level logic should stay in the application level
>> code. That is the only scalable option.
 
> Virtual function calls are more scalable.
 
Wrong, the NUMBER of functions needed grows way too fast to be scalable.
It also requires GLOBAL changes in API to implement local algorithms.

The act of adding the function could well require recompiling millions
of lines of code over many applications if the base class is commonly used.

Imagine what would happen if someone decided that string needed a new
function that made ALL code that uses it need to be recompiled. This
is an X.0.0 type of change.
 
>> TIME you define a new type of animal for the system. This breaks
>> encapsulation, ...
 
> No, it is encapsulation.
 
No, it BREAKS encapsulation. If adding a new type of animal requires a
change in the animal base, the base is not properly encapsulated.
 
Note, the RTTI interface built into C++ provides that encapsulated
interface, which allows you to detect this without needing to make these
changes.
Bonita Montero <Bonita.Montero@gmail.com>: Jul 03 06:04PM +0200

> Wrong, the NUMBER of functions needed grows way to fast to be scalable.
 
Use an enum and return an enum value in _one_ function.
That's faster than a dynamic downcast.
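
A sketch of that alternative (names are illustrative): one virtual
function returns a kind tag, so callers can test the type without the cost
of a dynamic_cast, at the price of the base class knowing the list of kinds.

enum class Kind { Dog, Cat, Other };

struct Animal
{
    virtual ~Animal() = default;
    virtual Kind kind() const { return Kind::Other; }
};

struct Dog : Animal
{
    Kind kind() const override { return Kind::Dog; }
    void bark() const { }
};

// Caller side: the tag check replaces the dynamic_cast, and the static_cast
// is safe because the kind has already been verified.
void bark_if_dog(const Animal& a)
{
    if (a.kind() == Kind::Dog)
        static_cast<const Dog&>(a).bark();
}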
