soft and program: Digest for comp.lang.c++@googlegroups.com


MinGW g++ encoding `u8"literal"` with Windows ANSI Western, not UTF-8 - 6 Updates
RAII design question - 15 Updates
Why is the memory allocated on the heap NOT freed? - 2 Updates
C++ struct to hold GLSL shader string - 2 Updates

MinGW g++ encoding `u8"literal"` with Windows ANSI Western, not UTF-8

Manfred <noname@invalid.add>: Jun 13 02:29PM +0200

On 6/13/2017 12:13 AM, Alf P. Steinbach wrote:
>> This is a problem only if the standard headers use non-ascii chars.

> No, it's a general problem. Consider UTF-16. ASCII text interpreted as
> UTF-16 = a lot of gobbledygook, and possibly even invalid sequences.

You are right, that would be a bad combination. That said, I wouldn't
say -finput-charset is broken per se, one could think of that option to
handle differently <> and "" included headers, or be dependent on source
tree location, but still it could not be 100% safe.

If one really wants to use different encodings for sources, I think they
should be disjoint in different compilation units.
In fact if you need a non US-ascii source this is typically (always?)
due to localized strings, and those would be good practice to be defined
in dedicated sources, that could need no standard headers at all (e.g.
by defining them as plain const char*).
Moreover, this would be practically needed in case of multiple
translations (this would end up into something similar to Windows
resource files)

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 13 02:48PM +0200

On 13-Jun-17 2:29 PM, Manfred wrote:
> Moreover, this would be practically needed in case of multiple
> translations (this would end up into something similar to Windows
> resource files)

Consider that many people prefer to use national characters in identifiers.

C++ formally supports the common set of identifier characters in
Unicode, it's rather large. I don't personally do that, because as I see
it English is the /lingua franca/ (hah!) of programming, and I think
source code should be accessible regardless of one's nationality, and
unlike Visual C++, g++ doesn't support more than ASCII. *But* I remember
one French guy in this group who argued for the national language
identifiers on the grounds that their programmers felt it was more easy.

As it happens Visual C++ has no problem with mixed encodings in a single
translation unit, proving that there is no inherent technical
show-stopper problem – that's why I felt safe characterizing the g++
scheme as broken.

Cheers!,

- Alf

David Brown <david.brown@hesbynett.no>: Jun 13 03:12PM +0200

On 13/06/17 14:48, Alf P. Steinbach wrote:
> unlike Visual C++, g++ doesn't support more than ASCII. *But* I remember
> one French guy in this group who argued for the national language
> identifiers on the grounds that their programmers felt it was more easy.

Non-ASCII identifiers /could/ be a serious problem - their usage would
be a disaster for interoperability. Most people with English language
keyboards have enough trouble with English words like naïve and café,
because most of them use Windows and have a UK or US keyboard layout
without accents, dead keys, or a *nix compose key. If these turned up
in identifiers in someone else's code, they would be lost.

But would it be any worse than people writing identifiers in their own
language, just using ASCII-only identifiers? Is it worse for English
speakers to deal with:

enum kompassretninger { nor, øst, sør, vest };

or

enum kompassretninger { nor, ost, sor, vest };

?

I am curious if any studies have been done - perhaps with languages like
Python that have had support for non-ASCII identifiers for a long time.

Some kinds of additional letters would be nice, even when sticking to
English, such as π, µs, or kΩ - but they might be hard to read, and for
many people they would be hard to type. And without UTF-8 symbols and
flexible operators, we can't write things like

y = a₂·x² + a₁·x + a₀

or

if (A ⊆ ℝ) ...

(I admit I had to resort to a character map accessory to type that last
one...)

> translation unit, proving that there is no inherent technical
> show-stopper problem – that's why I felt safe characterizing the g++
> scheme as broken.

Yes, the gcc extended identifier support is currently incomplete (it's
complete in theory - you can write extended identifiers with UTF-8. But
it's broken in practice, because you have to write them by giving the
code points!).

Manfred <noname@invalid.add>: Jun 13 03:24PM +0200

On 6/13/2017 2:48 PM, Alf P. Steinbach wrote:
> it English is the /lingua franca/ (hah!) of programming, and I think
> source code should be accessible regardless of one's nationality, and
> unlike Visual C++, g++ doesn't support more than ASCII.
Correction: gcc (including g++) uses UTF-8 as default encoding...

*But* I remember
> one French guy in this group who argued for the national language
> identifiers on the grounds that their programmers felt it was more easy.
...so French identifiers would be fine too as long as they are valid
according to /language/ rules.
The problem is when you mix different encodings that are not compatible
with each other, not about ascii-only.

> translation unit, proving that there is no inherent technical
> show-stopper problem – that's why I felt safe characterizing the g++
> scheme as broken.
One difference is that Visual C++ is an IDE, which includes the editor.
gcc is merely a compiler, which means you have to use something else as
an editor, and this opens for trouble.
Anyway, msvc++ may be better suited for this task, but IMVHO I think
mixing encodings is not a very great idea. Besides, I /think/ (*) MSVC++
uses BOMs, which I /personally/ dislike, although I have seen others do
like them.

(* actually I have seen MSVC++ adding a BOM to UTF-8 XML, where I would
not want it - and IIRC it would be deprecated by the IETF too)

Manfred <noname@invalid.add>: Jun 13 04:24PM +0200

On 6/13/2017 3:24 PM, Manfred wrote:
>> identifiers on the grounds that their programmers felt it was more easy.
> ...so French identifiers would be fine too as long as they are valid
> according to /language/ rules.
I was wrong here: indeed gcc only allows for ascii identifiers (other
characters must be '\u' escaped in identifiers, as Bavid Brown correctly
pointed out)

David Brown <david.brown@hesbynett.no>: Jun 13 08:34PM +0200

On 13/06/17 16:24, Manfred wrote:
> I was wrong here: indeed gcc only allows for ascii identifiers (other
> characters must be '\u' escaped in identifiers, as Bavid Brown correctly
> pointed out)

No, you were more correct than you thought. gcc can use a variety of
character sets for the source character set and the execution character
set. The default input character set is taken from the host's locale, or
UTF-8 if it cannot be determined (on Linux, UTF-8 is the norm), or it
can be overridden on the command line. The execution character set is
UTF-8 by default, but can be overridden on the command line.

However, gcc requires the character set for /identifiers/ to be ASCII -
so if you enable "extended identifiers", you have \uNNNN or \UNNNNNNNN
formats to make the Unicode characters in the identifiers. Basically, that
means you need an extra layer of pre-processor (or a smart editor) to
use Unicode characters in identifiers.

But you can happily use UTF-8 characters in strings, character
constants, and comments.

mvh.,

David
(or Bavid, if you really insist)
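
The point about \u escapes in identifiers can be illustrated with a minimal sketch. It assumes a compiler that accepts universal character names in identifiers (recent g++ and clang++ do) and the default UTF-8 execution character set; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// \u00f8 is the code point for 'ø', so these identifiers are the ones a
// UTF-8 editor would display as "øst" and "sør".
enum kompassretninger { nor, \u00f8st, s\u00f8r, vest };

// Hypothetical helper: returns the enumerator spelled with an escape.
int value_of_east() { return \u00f8st; }

// Non-ASCII text in a string literal needs no special treatment in the
// identifier sense; with a UTF-8 execution character set (an assumption
// here) the 'ø' occupies two bytes.
const char* east_name() { return "\u00f8st"; }
```

With a different -fexec-charset the byte length of the string would change, but the identifiers would be unaffected.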

RAII design question

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jun 13 01:16AM +0100

On 12/06/2017 23:06, Alf P. Steinbach wrote:
> Well, the Liskov substitution applies to instances of types, not to
> types themselves. Or in other words, the LSP is about the behavior of

Nonsense; LSP is all about types: the behaviour of a derived type when
accessed via base class reference must be the same as base class
behaviour. Types define behaviour not objects.

/Flibble

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 13 04:08AM +0200

On 13-Jun-17 2:16 AM, Mr Flibble wrote:

> Nonsense; LSP is all about types: the behaviour of a derived type when
> accessed via base class reference must be the same as base class
> behaviour. Types define behaviour not objects.

I think you're trolling, what with the snipping of a quote in the middle.

It may, however, be that you actually don't understand what I wrote.

In that case, study it some more, in particular the parts that you snipped.

Cheers!,

- Alf

Gareth Owen <gwowen@gmail.com>: Jun 13 06:14AM +0100

> duplicating the CancellableTimer functionality up in base class Timer,
> which would be meaningless except as a way to restrict the interface
> of Timer.

That was the conclusion I reached (and as I had access to the base class
that's what I did). But it felt like something smarter should've been
possible.

Marcel Mueller <news.5.maazl@spamgourmet.org>: Jun 13 07:47AM +0200

On 12.06.17 22.59, Gareth Owen wrote:
> I had a timer class that measure the duration of its own existence.

> In Semi-Pseudo code it looked like this.
[...]
> CancellableTimer IS A Timer, so it'd be nice if I could extend Timer in
> this way.

> What am I missing?

There is a restriction in C++ inheritance. You can never override a
base class constructor nor destructor because at the time they are
called the object is not yet or no longer of the derived type. It is the
base type that gets con/destructed this way.

So any subclass of Timer can never /remove/ the behavior of Timer at
destruction. That's the point where the LSP is broken.

Marcel
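
A small sketch of the restriction Marcel describes: the derived destructor can add behaviour, but the base destructor still runs afterwards. The class names and the log string are hypothetical and only record the order of calls.

```cpp
#include <cassert>
#include <string>

struct Base {
    explicit Base(std::string& log) : log_(log) {}
    virtual ~Base() { log_ += "~Base;"; }      // always runs, last
protected:
    std::string& log_;
};

struct Derived : Base {
    using Base::Base;
    // This adds to the destruction behaviour; it cannot replace ~Base().
    ~Derived() override { log_ += "~Derived;"; }
};

// Destroys a Derived through a Base pointer and returns the call order.
std::string destroy_via_base_pointer() {
    std::string log;
    Base* p = new Derived(log);
    delete p;
    return log;
}
```

Whatever ~Base() does (such as updating a total) cannot be suppressed from Derived unless Base itself offers a way to opt out.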

Tim Rentsch <txr@alumni.caltech.edu>: Jun 12 11:58PM -0700

> CancellableTimer IS A Timer, so it'd be nice if I could extend Timer in
> this way.

> What am I missing?

Let me see if I can clear up some things.

First: The Liskov Substitution Principle is not a property that
automatically holds for C++ classes/subclasses. Rather it is a
design principle for defining classes and subclasses. It is
perfectly possible for a subclass in C++ not to satisfy the LSP
with respect to one (or several) of its superclasses (and in fact
is desirable in some cases but that is a topic for another day).
The point of the LSP is to give guidance for a relationship that
is nice (in certain ways) to impose between classes and their
subclasses. (Forgive me for not using the base/derived class
terminology used in C++, which still sounds funny to me after
using the original superclass/subclass terminology for so long
previously.)

Second: The Timer class you show has the property that it always
updates the total_time reference when an instance finishes. So
if we defined a subclass that somehow did /not/ do that, in fact
that subclass would violate the LSP for that behavioral property
(which seems to me like an essential property of Timer, but of
course I don't know which properties of Timer you think are the
important ones).

Third: What you want (or at least what I think you want) with
the CancellableTimer class is something that behaves like a Timer
except if the cancel() method is called, in which case it does
something different. There are two ways of thinking about this.
One is that the cancel() method is outside the Timer interface,
so whatever it does is okay, and LSP is irrelevant. The other is
that the point of having CancellableTimer is to change some aspect
of Timer behavior, ie, to deliberately break the LSP.

Fourth: Finally getting back to your main question, which I
believe is How does one accomplish what is described in the last
paragraph? As you observe, given the implementation of Timer,
there is no way to avoid that final update. In other words, what
you want is to violate the LSP, but because of how references and
destructors work, you can't! To say this another way, you are
(over-)constrained by the implementation of Timer. If Timer were
implemented differently, it would be easy to write a subclass
that does what you want. But it isn't, so you can't.

What one might do about this depends on the particulars of
classes, etc, involved. But I don't know what those are
so I don't have any suggestions or advice to offer.

Does this help clear things up?

Tim Rentsch <txr@alumni.caltech.edu>: Jun 13 12:06AM -0700

> I think you're trolling, what with the snipping of a quote in the middle.

> It may, however, be that you actually don't understand what I wrote.

> In that case, study it some more, in particular the parts that you snipped.

You may not like how the statement is phrased, but what he is
saying is essentially correct. The Liskov Substitution Principle
is about a relationship between classes and their subclasses
(usually described in terms of types and subtypes, but for C++
it's classes and subclasses). That relationship is defined in
terms of the behaviors of instances of those classes and
subclasses, but the relationship proper is one between classes,
not between instances.

Tim Rentsch <txr@alumni.caltech.edu>: Jun 13 12:21AM -0700

> objects between their creation and destruction, not including what
> happens before the object's constructor has finished, or after its
> destructor has begun execution. [...]

The question of when "construction" finishes and "objectness" begins
is rather murky, but I believe the statement about destructors is
just wrong. What happens during destructor execution is just as
much a part of object behavior as method calls are. Suppose for
example we have a 'Foo *foo', which might actually point to a
subclass (aka derived class) of Foo. The action

delete foo;

must (under LSP) satisfy the behavioral guarantees of Foo, even
if 'foo' points to an instance of a derived class.

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 13 09:50AM +0200

On 13-Jun-17 9:06 AM, Tim Rentsch wrote:

>> It may, however, be that you actually don't understand what I wrote.

>> In that case, study it some more, in particular the parts that you snipped.

> You may not like how the statement is phrased,

The absence of mention of sausages, as Leigh habitually posts about,
doesn't bother me in the slightest. I don't need that kind of obvious
marker to see that the text fails to present a coherent argument.

> but what he is saying is essentially correct.

No, the "nonsense" claim is incorrect, pure nonsense.

The following paragraph of his, after the semicolon, is essentially
correct but is irrelevant to the nonsense nonsense claim.

Your claim that his nonsense nonsense claim is essentially correct, is
nonsense.

> terms of the behaviors of instances of those classes and
> subclasses, but the relationship proper is one between classes,
> not between instances

Well, DEMONSTRATE how you think Liskov substitution applies for
constructors.

I'm not saying that what you write here is more relevant to your (in
context) claim about constructors than what Leigh wrote was relevant to
his claim, so when I ask for a demonstration it is only /assuming/ that
you were thinking of some connection between your claim and alleged
argument in favor of the claim, a connection that I just don't see.

I'm not even saying that we can't come up with something like the LSP
that involves constructors, e.g. for use in template code, but that
would be something not-quite-LSP, and not part of Barbara's work, and so
we'd better call it something else.

I'm saying that your claim here doesn't make sense to me, and so, please
demonstrate it.

Thank you.

Cheers!,

- Alf

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 13 10:05AM +0200

On 13-Jun-17 9:21 AM, Tim Rentsch wrote:

> delete foo;

> must (under LSP) satisfy the behavioral guarantees of Foo, even
> if 'foo' points to an instance of a derived class.

The present case is one where the destructor has behavior in addition to
the language-mandated one of reversing the effect of the constructor.

So you could be right but I think it's impractical to think that way.

E.g. the lack of symmetry in that viewpoint bothers me, and I blithely
(perhaps wrongly) just /assumed/ symmetry, but then, there is some
asymmetry already in the degree to which constructors and destructors
have names, the former not at all, and the latter sort-of, and there is
asymmetry in the number of constructors and destructors for a class.

Cheers!,

- Alf

Alain Ketterlin <alain@universite-de-strasbourg.fr.invalid>: Jun 13 10:28AM +0200

Gareth Owen <gwowen@gmail.com> writes:

[...]

> But it also seems that by Liskov Substitution Principle, a
> CancellableTimer IS A Timer, so it'd be nice if I could extend Timer in
> this way.

You got it reversed: a timer is a cancellable timer (the kind that you
can't cancel, or, at least where cancel has a particular meaning).

You have an instance of the Circle-Ellipse problem. See
https://en.wikipedia.org/wiki/Circle-ellipse_problem.

-- Alain.

Tim Rentsch <txr@alumni.caltech.edu>: Jun 13 03:05AM -0700

> about, doesn't bother me in the slightest. I don't need that kind
> of obvious marker to see that the text fails to present a coherent
> argument.

He isn't giving an argument; he is simply making a statement.
The statement is about the definition of a term (ie, the Liskov
Substitution Principle). To know what that definition is we
can refer back to the papers where Barbara Liskov, and Barbara
Liskov with Jeannette Wing, introduced and formalized the
underlying ideas. Those papers are

Liskov, B. (May 1988). "Keynote address - data abstraction
and hierarchy". ACM SIGPLAN Notices. 23 (5): 17-34

Liskov, B. H.; Wing, J. M. (November 1994). A behavioral
notion of subtyping. ACM Trans. Program. Lang. Syst. 16
(6). pp. 1811-1841

I confess I haven't read these papers. What I did read is an
entry on Wikipedia, and also a couple of references it pointed
to. The Wikipedia page is here:

https://en.wikipedia.org/wiki/Liskov_substitution_principle

> claim.

> Your claim that his nonsense nonsense claim is essentially
> correct, is nonsense.

His statement agrees with how Wikipedia describes the term.
Since I have no reason to think the Wikipedia description
is wrong, I thought it appropriate to characterize his
statement as essentially correct. But please look at the
Wikipedia page and see what you think about that.

> work, and so we'd better call it something else.

> I'm saying that your claim here doesn't make sense to me, and so,
> please demonstrate it.

I don't know why you hooked into constructors. Mr Flibble didn't
mention constructors. My followup didn't mention constructors.
Did you somehow think he was talking about something that he
wasn't? In any case my comment was only about the definition of
the LSP, and wasn't meant to say anything about constructors.

Tim Rentsch <txr@alumni.caltech.edu>: Jun 13 03:21AM -0700

> have names, the former not at all, and the latter sort-of, and there
> is asymmetry in the number of constructors and destructors for a
> class.

The biggest asymmetry between constructors and destructors is
that with constructors /the name of the class must be known/.
That property does not hold for other (non-static) member
functions, including destructors. If, instead of constructors,
factory functions are used (ie, via a parameter that is a
pointer to a factory function, which may construct a base object
or may construct a derived object), then we get back the lost
symmetry, and both construction and destruction participate in
the substitution behavioral guarantees.
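
The factory-function idea can be sketched as follows. This is only an illustration under assumed names (Foo, DerivedFoo, FooFactory); the client function never names a concrete class, so construction and destruction are both reached through the base interface.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Foo {
    virtual ~Foo() = default;
    virtual std::string name() const { return "Foo"; }
};

struct DerivedFoo : Foo {
    std::string name() const override { return "DerivedFoo"; }
};

// A factory is passed as a plain function pointer.
using FooFactory = std::unique_ptr<Foo> (*)();

std::unique_ptr<Foo> make_foo() { return std::make_unique<Foo>(); }
std::unique_ptr<Foo> make_derived_foo() { return std::make_unique<DerivedFoo>(); }

// Client code: it constructs, uses and destroys an object without
// knowing whether it is a Foo or something derived from Foo.
std::string use_and_destroy(FooFactory make) {
    std::unique_ptr<Foo> foo = make();
    return foo->name();   // foo is destroyed via the virtual destructor
}
```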

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 13 12:49PM +0200

On 13-Jun-17 12:05 PM, Tim Rentsch wrote:
>> of obvious marker to see that the text fails to present a coherent
>> argument.

> He isn't giving an argument;

He certainly gives that impression: an assertion followed by apparent
(but irrelevant) argument.

> he is simply making a statement.

That too.

> notion of subtyping. ACM Trans. Program. Lang. Syst. 16
> (6). pp. 1811-1841

> I confess I haven't read these papers.

I have, because in 2012 I started on a blog article series on the Liskov
Substitution Principle, <url:
https://alfps.wordpress.com/2012/03/11/liskovs-substitution-principle-in-c/>.

Unfortunately that's where my illnesses caught up with me. I started on
an experimental horse's pill cure (that was once used for tuberculosis)
that didn't work, plus a drawn-out series of surgery over a year or two,
so I never posted more than that first of three articles. :( It also
interfered with doing things to get reawarded my Microsoft Most Valued
Professional award. It's reawarded each year: I only got the first. :(

> to. The Wikipedia page is here:

> https://en.wikipedia.org/wiki/Liskov_substitution_principle

>>> but what he is saying is essentially correct.

Yes, as you will note I told you that what he wrote after the semicolon,
was essentially correct, and irrelevant.

No contest about the correctness.

But repeating that here smacks of the very same kind of irrelevancy as in
Leigh's statement, the art of appearing to give an argument that defeats
some position, when that position has never been argued by anyone else.
It's a known, named, fallacy. It's called a Straw Man argument.

>> Your claim that his nonsense nonsense claim is essentially
>> correct, is nonsense.

> His statement agrees with how Wikipedia describes the term.

I rather doubt that the Wikipedia article on nonsense agrees with
Leigh's position.

> is wrong, I thought it appropriate to characterize his
> statement as essentially correct. But please look at the
> Wikipedia page and see what you think about that.

Nah, this is just bollocks. You're either not reading, or you're
pretending that you don't. I don't know which is worse, I'm sorry.

>> please demonstrate it.

> I don't know why you hooked into constructors. Mr Flibble didn't
> mention constructors.

Exactly, and I mentioned that too: that he snipped what he quoted.

I think intentionally, to mislead readers. Such as you.

And I stated that.

> My followup didn't mention constructors.

Neither did Leigh.

> Did you somehow think he was talking about something that he
> wasn't?

No, rather the opposite: I think in your first response you genuinely
thought I had been talking about something I wasn't talking about.

But then you replied to my posting that Leigh commented on, commenting
on that very paragraph he quoted some of, showing that at that time you
did understand what I wrote.

So, if at this point you don't, again, then that's a kind of yo-yo effect.

> In any case my comment was only about the definition of
> the LSP, and wasn't meant to say anything about constructors.

Yes, I believe that. :-)

Cheers!,

- Alf

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jun 13 05:34PM +0100

On 13/06/2017 08:50, Alf P. Steinbach wrote:
> marker to see that the text fails to present a coherent argument.

>> but what he is saying is essentially correct.

> No, the "nonsense" claim is incorrect, pure nonsense.

You explicitly stated that LSP was about objects not types which is
patently nonsense, as I correctly indicated.

/Flibble

Gareth Owen <gwowen@gmail.com>: Jun 13 07:34PM +0100

> (which seems to me like an essential property of Timer, but of
> course I don't know which properties of Timer you think are the
> important ones).

The important ones are the ones I need today, and the less important
ones are the ones I needed yesterday and will need tomorrow.

> The other is that the point of having CancellableTimer is to change some
> aspect of Timer behavior, ie, to deliberately break the LSP.

The peculiarity though, and I think it is, is that I can change many
aspects of a class's behaviour, except those in its constructor.

> What one might do about this depends on the particulars of
> classes, etc, involved. But I don't know what those are
> so I don't have any suggestions or advice to offer.

Oh, the solution was easy - I made the baseclass cancellable.
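
The original Timer code is not shown in this digest, so the following is only a guess at what a cancellable base class might look like: a scope timer that adds its lifetime to a referenced total on destruction unless cancel() was called. All names and the use of std::chrono are assumptions.

```cpp
#include <cassert>
#include <chrono>

class Timer {
public:
    using clock = std::chrono::steady_clock;

    explicit Timer(clock::duration& total)
        : total_(total), start_(clock::now()) {}
    Timer(const Timer&) = delete;
    Timer& operator=(const Timer&) = delete;

    // After cancel(), the destructor leaves the total untouched.
    void cancel() { cancelled_ = true; }

    ~Timer() {
        if (!cancelled_) total_ += clock::now() - start_;
    }

private:
    clock::duration& total_;
    clock::time_point start_;
    bool cancelled_ = false;
};

// Helper for demonstration: time an empty scope, optionally cancelling.
Timer::clock::duration time_empty_scope(bool cancel) {
    Timer::clock::duration total{};
    {
        Timer t(total);
        if (cancel) t.cancel();
    }
    return total;
}
```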

One alternative was to replace the reference with a pointer and cause
the cancel function to reseat the pointer to a private static garbage
variable that can never be looked at.

That seems kind of overkill though.

Another alternative was to delegate destruction to a virtual function
destroy() and a "destroyed" flag to the base class, so each class's
destructor looked like this

    ~SubClass() {
        if(!destroyed) this->destroy();
        destroyed = true;
    }

Every subclass's destructor gets called, but only the most-derived one
does anything...
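
Filled out into a compilable sketch, the destroy()/flag scheme might look like this. The class names and the log are hypothetical; the point is that the first destructor to run (the most-derived one) calls its own destroy() and sets the flag, so the base destructor then skips its call.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    explicit Base(std::string& log) : log_(log) {}
    virtual ~Base() {
        // Inside ~Base() the call resolves to Base::destroy().
        if (!destroyed) this->destroy();
        destroyed = true;
    }
protected:
    virtual void destroy() { log_ += "Base::destroy;"; }
    bool destroyed = false;
    std::string& log_;
};

class SubClass : public Base {
public:
    using Base::Base;
    ~SubClass() override {
        // Runs first, so SubClass::destroy() wins and Base skips its own.
        if (!destroyed) this->destroy();
        destroyed = true;
    }
protected:
    void destroy() override { log_ += "SubClass::destroy;"; }
};

// Helpers that destroy one object of each type and return the log.
std::string log_of_base() { std::string log; { Base b(log); } return log; }
std::string log_of_subclass() { std::string log; { SubClass s(log); } return log; }
```

Every class in the hierarchy has to repeat the same destructor body, which is part of why it "seemed a bit much" for a throwaway timing class.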

But this is a fire-and-forget class that I use for testing code-speed,
so all that seemed a bit much.

> Does this help clear things up?

It does. Though I'm not sure I agree 100% with your interpretation of
LSP. My CancellableTimer can be used anywhere a Timer can be used, but
the fact that I used a reference not a pointer added constraints that
should have been a mere implementation detail.

Why is the memory allocated on the heap NOT freed?

Juha Nieminen <nospam@thanks.invalid>: Jun 13 07:23AM

> Yes it did. E.g. section §20.6.4 called "Pointer safety" is all about
> support for garbage collection. In the standard's own words, in §C.2.10,
> it's a "Minimal support for garbage-collected regions".

So you are saying that if I use a C++11 standard-compliant compiler,
like clang or C++, I don't have to delete what I allocate with new?

Oh wait, I do.

So I suppose C++11 did *not* add garbage collection to the language.

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 13 10:09AM +0200

On 13-Jun-17 9:23 AM, Juha Nieminen wrote:
> like clang or C++, I don't have to delete what I allocate with new?

> Oh wait, I do.

> So I suppose C++11 did *not* add garbage collection to the language.

You can find a discussion of your argument here: <url:
https://en.wikipedia.org/wiki/Straw_man>.

Cheers & hth.,

- Alf

C++ struct to hold GLSL shader string

bitrex <bitrex@de.lete.earthlink.net>: Jun 12 08:36PM -0400

I'm interfacing with a straight-C99 API that makes some OpenGL calls. I
need to pass the source for a GLSL pixel shader in the form of an array
of char* to one of the functions. Basically in C code the shader source
is stored like this:

    const char *shader_src[] =
    {
        "uniform sampler2D backBuffer;",
        "uniform float r;",
        "uniform float g;",
        "uniform float b;",
        "uniform float ratio;",
        "void main() {",
        " vec4 color;",

        etc.

        " gl_FragColor = color;",
        "}"
    };

const int source_len = number_of_lines_above;

I was hoping there was a good RAII-respectful way to pass
arbitrary-length source in this form to a class constructor and have it
held in some kind of structure, that can return a char** and length to
the API's compiler call when required.

Christian Gollwitzer <auriocus@gmx.de>: Jun 13 10:05AM +0200

Am 13.06.17 um 02:36 schrieb bitrex:
> arbitrary-length source in this form to a class constructor and have it
> held in some kind of structure, that can return a char** and length to
> the API's compiler call when required.

Why not std::vector<const char *> ?

&buffer[0] and buffer.size() should give you what's needed. OTOH why do
you pass individual lines and not a string for the shader?

Christian
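
One way the std::vector<const char *> suggestion could be wrapped in a class is sketched below. ShaderSource and example_shader are hypothetical names; the class owns the lines as std::string and keeps a parallel pointer array to hand to a C API that expects an array of C strings plus a count. No OpenGL call is made here.

```cpp
#include <cassert>
#include <cstring>
#include <initializer_list>
#include <string>
#include <vector>

class ShaderSource {
public:
    ShaderSource(std::initializer_list<std::string> lines) : lines_(lines) {
        pointers_.reserve(lines_.size());
        for (const std::string& line : lines_)
            pointers_.push_back(line.c_str());
    }
    // Copying would leave pointers_ aimed at another object's strings,
    // so this sketch simply forbids it.
    ShaderSource(const ShaderSource&) = delete;
    ShaderSource& operator=(const ShaderSource&) = delete;

    // Array of C strings and its length, valid while *this is alive.
    const char* const* data() const { return pointers_.data(); }
    int size() const { return static_cast<int>(pointers_.size()); }

private:
    std::vector<std::string> lines_;
    std::vector<const char*> pointers_;
};

// A tiny example shader used for demonstration only.
const ShaderSource& example_shader() {
    static const ShaderSource src{
        "void main() {",
        "    gl_FragColor = vec4(1.0);",
        "}"
    };
    return src;
}
```

If the C API wants a non-const char**, a cast at the call site would be needed; whether that is safe depends on the API not modifying the strings.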


Tuesday, June 13, 2017

Digest for comp.lang.c++@googlegroups.com - 25 updates in 4 topics
