soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

Google Groups

Structured binding considered harmful - 7 Updates
Bind const L-value reference to function return value - 13 Updates

Sam <sam@email-scan.com>: Mar 16 08:06PM -0400

Bonita Montero writes:

> If you're not the designer of the code or a visionary you've to find
> the container-type for which you're getting the auto variable(s). And
> that's more work than reading the concrete type.

I know that if I have a container, I can call begin(), and/or end(), drop
the result into an auto, and have something to work with. I would think this
would save me quite a bit of time for not having to do what you think I
would have to do.

> Having the concrete type maintains readability - that's more important
> than having less typing. As someone said in another thread he got tired
> of overly use of auto in a source he didn't write himself.

I suppose that it's a matter of personal preference.

I am aware of the fact that iterator and algorithm semantics are remarkably
consistent. People's Exhibit A: my previous message in this thread.

I'm not sure how clearly identifying that something is a list iterator
translates into "readability". If all I need to do is passed something into
std::find, find_if, or some other algorithm that correctly adapts itself to
any compliant iterator, I'm not sure the value added in explicitly
designating the inputs as list or vector iterators. The end result, the
concept, in the end, is the same.

And I find the concept of readability to be quite the opposite of yours'. To
me, a compact designation that something is merely a beginning or an ending
iterator of a sequence is all I need to make it readable. See a spelled out
type take up a quarter of a width of a line, for some reason, doesn't appeal
as very readable to me.

Sam <sam@email-scan.com>: Mar 16 08:13PM -0400

jacobnavia writes:

> Structured binding goes against that principle head on: you expose
> everything and any change to the implementation of the base class needs a
> rewrite of user code!

So?

Sometimes that's exactly what you want. You may very well want a particular
change to be incompatible, by design, and force all existing callers to
adapt.

And for changes that should not break existing functionality, you have other
ways of skinning that cat. Like adding another enum value, or something. And
all structured bindings that employ that particular enum type continue to
work exactly as they do now. Amazing, isn't it?

And wasn't someone around here just kvetching, recently, how using
structured bindings can end up miscompiling code because it went undetected
and did /not/ require the "rewrite of user code".

Oh, that was you:

# jacobnavia writes:
#
# > As you see the "get" template receives an index that IS HARDCODED TO THE INDEX
# > OF THE FIELD, and any changes by INSERTING A NEW FIELD will make those indexes
# > silently access the WRONG FIELD,

But, you're not saying that /any/ change mandates a rewrite, since it's
going to get miscompiled?

Sir/ma'am/whatever: can you kindly make up your mind exactly what's the
problem with structured bindings?

Bonita Montero <Bonita.Montero@gmail.com>: Mar 17 06:37AM +0100

> I suppose that it's a matter of personal preference.

If I have to figure out of which type a container is I doing an
auto in a for, that's not a matter of preference. If I f.e. have
a pair<const key, value>, I know immediately that I'm using a map
or unordered_map.

Sam <sam@email-scan.com>: Mar 17 07:17AM -0400

Bonita Montero writes:

> auto in a for, that's not a matter of preference. If I f.e. have
> a pair<const key, value>, I know immediately that I'm using a map
> or unordered_map.

Well, the trick is to arrange things so that it doesn't matter.

for (auto &[key,value]:container)

will work for that map. But replacing that container with a std::vector of
tuples will also work, if it becomes necessary. You'll be surprised to learn
that the above line of code will remain unchanged.

Something quite similar happened a year or so, ago. I used an unordered_set
of rectangle objects (x/y/width/height) that I used to implement GUI
algorithms. I initially relied on the unordered_set itself to eliminate the
duplicates.

Through profiling I determined a little bit of a bottleneck in set
operations that will benefit from replacing the unordered_set with a vector,
to take advantage of modern CPU's optimizations on linear memory access.

So that's what I did, and I just added a few bits to all the rectangle
container oeprators to dedupe the rectangles after wrangling with them. And,…
that was it. Apart from simply redefining the container as such, and a few
tweaks to a few functions, that was it. 99% of the existing code used auto
and required no changes whatsoever.

So, if someone likes C++ so much that they prefer to do a lot of unnecessary
work, because they think the resulting code is more readable, more power to
them. If they think that

for (std::unordered_set<rectangle>::iterator rb=rs.begin(), re=rs.end();
rb != re; rb++)
{
rectangle &r=*rb;

// …
}

makes their resulting code so beautiful and purty, they can knock themselves
out. As for me,

for (auto &r:rs)

will also work, and I'll just move on to bigger and better things. I think
I'll survive not having to experience the beauty of the resulting code if I
was forced to rewrite it due to the container change.

I think I'll cope with not appreciating the "readability" of knowing that
the underlying iterators are std::unordered_set<rectangle>::iterator or
std::vector<rectangle>::iterator. I'll just have to deal with the fact that
it will never be clear to me how that adds to "readability". For me, it
appears that the exact task at hand, for which I need to chew through the
rectangles, is what's more important. Where those rectangles come from, from
which container, and its semantics, who cares.

Bonita Montero <Bonita.Montero@gmail.com>: Mar 17 06:37PM +0100

> will work for that map. But replacing that container with a std::vector
> of tuples will also work, if it becomes necessary. You'll be surprised
> to learn that the above line of code will remain unchanged.

This ease doesn't justify the lesser readability.

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Mar 17 06:50PM +0100

On 17.03.2020 18:37, Bonita Montero wrote:
>> std::vector of tuples will also work, if it becomes necessary. You'll
>> be surprised to learn that the above line of code will remain unchanged.

> This ease doesn't justify the lesser readability.

Point.

IMO they could/should just have opened up the $ namespace for keywords.

Instead of all this grotesquely convoluted symbol based syntax.

- Alf

Melzzzzz <Melzzzzz@zzzzz.com>: Mar 17 07:46PM

> A new useless construct has been added to the language that serves no
> purpose other than saving some seconds when typing C++.

> GREAT!

It's called destructuring/pattern match in other languages. C++ wants that badly.
Rust => is that way, and promisses.

--
press any key to continue or any other to quit...
U ničemu ja ne uživam kao u svom statusu INVALIDA -- Zli Zec
Svi smo svedoci - oko 3 godine intenzivne propagande je dovoljno da jedan narod poludi -- Zli Zec
Na divljem zapadu i nije bilo tako puno nasilja, upravo zato jer su svi
bili naoruzani. -- Mladen Gogala

Bind const L-value reference to function return value

Frederick Gotham <cauldwell.thomas@gmail.com>: Mar 17 01:04AM -0700

I remember 15 years ago I frequently wrote code like this:

extern string Func(void); /* Defined elsewhere */

int main(void)
{
string const &str = Func();
}

The was back when we had the 1998 C++ Standard, (i.e. before R-value references), and you could bind a const L-value reference to the return value of a function (in the hope that the compiler would optimise it by prolonging the lifetime of the temporary object).

Well anyway I had some problematic code yesterday on an embedded Linux 32-Bit ARM device. I was using Boost::Filesystem::directory_iterator to go through a list of all the files in a directory, and I had this code:

namespace bfs = boost::filesystem;

bfs::directory_iterator const end_itr; // Default ctor yields one-past-the-end

for ( bfs::directory_iterator i("/home/frederick"); i != end_itr; ++i )
{
// Skip if a directory (but allow regular file, symbolic link, etc.)
if( bfs::is_directory(i->status()) )
{
continue;
}

string const &filename = i->path().filename().string();

if( false == boost::regex_match(filename, "\d\d\d.jpg") )
{
continue;
}

// File matches
return filename;
}

return string();

If you look at that line in the middle of the loop:

string const &filename = i->path().filename().string();

For some reason this was giving me a garbage filename, and I had to change it to:

string const filename( i->path().filename().string() );

This fixed the problem.

I don't think there should have been a problem with the original code though.

Frederick Gotham <cauldwell.thomas@gmail.com>: Mar 17 01:06AM -0700

On Tuesday, March 17, 2020 at 8:05:02 AM UTC, Frederick Gotham wrote:

> if( false == boost::regex_match(filename, "\d\d\d.jpg") )

That string should actually be "\\d\\d\\d.jpg" but this is unrelated to the original problem.

Jorgen Grahn <grahn+nntp@snipabacken.se>: Mar 17 09:44AM

On Tue, 2020-03-17, Frederick Gotham wrote:
> references), and you could bind a const L-value reference to the
> return value of a function (in the hope that the compiler would
> optimise it by prolonging the lifetime of the temporary object).

I had to read up on that in TC++PL. Don't think I've ever written
code like that -- is there really a performance benefit to it?

> continue;
> }

> string const &filename = i->path().filename().string();

I don't know Boost::Filesystem, and I got lost in their documentation,
as usual ...

But if string() returns a /reference/ to a std::string, you have a
completely different scenario: no temporary object gets created, and
you'd have to look elsewhere to figure out the lifetime of the object
behind 'filename'. I.e. in the Boost.Filesystem documentation.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Bonita Montero <Bonita.Montero@gmail.com>: Mar 17 11:19AM +0100

> string const &filename = i->path().filename().string();
> For some reason this was giving me a garbage filename, and I had to change it to:
> string const filename( i->path().filename().string() );

Check this:

#include <iostream>

using namespace std;

struct S
{
S() { cout << "S::S()" << endl; }
~S() { cout << "S::~S()" << endl; }
};

S fS()
{
S s;
return s;
}

int main()
{
cout << "before" << endl;
S &s = fS();
cout << "after" << endl;
}

This outputs this:

before
S::S()
S::~S()
after
S::~S()

So the temporary object is destroyed after the definition of s.
And when s goes out of scope the compiler tries to destroy the
referenced "temporary" again although it doesn't exist anymore.

Frederick Gotham <cauldwell.thomas@gmail.com>: Mar 17 04:07AM -0700

On Tuesday, March 17, 2020 at 10:19:12 AM UTC, Bonita Montero wrote:

> So the temporary object is destroyed after the definition of s.
> And when s goes out of scope the compiler tries to destroy the
> referenced "temporary" again although it doesn't exist anymore.

The rules are different if it's a "reference to const".

Frederick Gotham <cauldwell.thomas@gmail.com>: Mar 17 04:08AM -0700

On Tuesday, March 17, 2020 at 9:44:26 AM UTC, Jorgen Grahn wrote:

> But if string() returns a /reference/ to a std::string

This would be the most likely explanation.... Now if I could only find the relevant info in the Boost documentation. I might look in the header files but they can be a bit hairy with macros and templates and so forth.

Bonita Montero <Bonita.Montero@gmail.com>: Mar 17 12:54PM +0100

>> And when s goes out of scope the compiler tries to destroy the
>> referenced "temporary" again although it doesn't exist anymore.

> The rules are different if it's a "reference to const".

You're right, but not only with const:

#include <iostream>

using namespace std;

struct S
{
S( S & ) { cout << "S::S( S & )" << endl; }
S() { cout << "S::S()" << endl; }
~S() { cout << "S::~S()" << endl; }
};

S fS()
{
S s;
return s;
}

int main()
{
cout << "before" << endl;
S &s = fS();
cout << "after" << endl;
}

before
S::S()
S::S( S & )
S::~S()
after
S::~S()

Seems there's no real reference but a copy.

felix@palmen-it.de (Felix Palmen): Mar 17 01:49PM +0100

>> optimise it by prolonging the lifetime of the temporary object).

> I had to read up on that in TC++PL. Don't think I've ever written
> code like that -- is there really a performance benefit to it?

I think this is a red herring. The code is well-defined, but as
automatic lifetime sill applies, a copy must be taken for the caller so
it can reference it. This could only be optimized away if copying is
guaranteed to have no side effects. Even then, how would a typical
implementation, using a stack frame for automatic storage of objects,
handle this kind of optimization?

> But if string() returns a /reference/ to a std::string, you have a
> completely different scenario: no temporary object gets created

Which is almost certainly the case here. It makes sense that a path
holds a filename and an accessor just returns a reference.

--
Dipl.-Inform. Felix Palmen <felix@palmen-it.de> ,.//..........
{web} http://palmen-it.de {jabber} [see email] ,//palmen-it.de
{pgp public key} http://palmen-it.de/pub.txt // """""""""""
{pgp fingerprint} A891 3D55 5F2E 3A74 3965 B997 3EF2 8B0A BC02 DA2A

Paavo Helde <myfirstname@osa.pri.ee>: Mar 17 04:18PM +0200

On 17.03.2020 13:08, Frederick Gotham wrote:
> On Tuesday, March 17, 2020 at 9:44:26 AM UTC, Jorgen Grahn wrote:

>> But if string() returns a /reference/ to a std::string

> This would be the most likely explanation.... Now if I could only find the relevant info in the Boost documentation. I might look in the header files but they can be a bit hairy with macros and templates and so forth.

I copied your code into a .cpp file and went to the definition of
string() via a single context menu click:

# ifdef BOOST_WINDOWS_API
const std::string string() const { return string(codecvt()); }
// ...
# else // BOOST_POSIX_API
// string_type is std::string, so there is no conversion
const std::string& string() const { return m_pathname; }

It appears the return type is a reference in Linux and an object in
Windows. Most devious!

Bonita Montero <Bonita.Montero@gmail.com>: Mar 17 03:21PM +0100

> after
> S::~S()

> Seems there's no real reference but a copy.

Oh, the above code only works with g++ if I define S as const.

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Mar 17 05:47PM +0100

On 17.03.2020 15:18, Paavo Helde wrote:
> const std::string& string() const { return m_pathname; }

> It appears the return type is a reference in Linux and an object in
> Windows. Most devious!

That's crazy.

Additionally, at least for the standard library's filesystem, the
encoding is system specific, UTF-8 in Linux and Windows ANSI in Windows.

Happily since May last year it's possible to set UTF-8 as the process'
Windows ANSI encoding, but.

Until C++20 you could avoid this silliness by using .u8string() instead.

In C++20 the return type of that was/will be changed to incompatible.

- Alf

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Mar 17 05:58PM +0100

On 17.03.2020 09:04, Frederick Gotham wrote:

> I remember 15 years ago I frequently wrote code like this:

> extern string Func(void); /* Defined elsewhere */

> int main(void)

Don't do that. It's a C-ism.

> {
> string const &str = Func();

Don't do that.

It has no performance effect and opens the door to subtle bug of only
keeping a returned reference. The only reasonable use of it that I know
was in Petru Marginean's ScopeGuard. And in C++11 and later there is no
need for that particular trick.

> namespace bfs = boost::filesystem;

> bfs::directory_iterator const end_itr; // Default ctor yields one-past-the-end

> for ( bfs::directory_iterator i("/home/frederick"); i != end_itr; ++i )

I would re-express that with a range based `for`. Depending on the
particulars of the filesystem library and C++ version that may require a
little support class. Here's an example support class: <url:
https://github.com/alf-p-steinbach/cppx-core-language/blob/master/source/cppx-core-language/syntax/collection-util/Span_.hpp>.

> continue;
> }

> string const &filename = i->path().filename().string();

Again, don't do that.

> if( false == boost::regex_match(filename, "\d\d\d.jpg") )

Better expressed with `not`.

> {
> continue;

As I see it, preferably use `continue` in the same way as early
`return`, only at start of block to get rid or errors or trivial cases.

> return filename;
> }

> return string();

Consider writing just `return "";`, I think that's more clear.

Or alternatively the general "return default constructed
instance"-idiom, namely `return {};`.

> string const filename( i->path().filename().string() );

> This fixed the problem.

> I don't think there should have been a problem with the original code though.

Binding to reference function result.

- Alf

felix@palmen-it.de (Felix Palmen): Mar 17 07:07PM +0100

> Additionally, at least for the standard library's filesystem, the
> encoding is system specific, UTF-8 in Linux and Windows ANSI in Windows.

Without knowing this particular API: This can't be entirely true. It
seems to be something handling filenames. The thing is: A filename on a
Unix system ist just an array of octets, with no encoding information
attached or implied. So it's UTF-8 IFF the file in question was named
while an UTF-8 locale was in effect.

On Windows, it's a different thing: Filenames are 16bit wide characters,
encoded in UTF-16. That seems to be the reason why the Windows
implementation returns a newly created object: It must convert the
native Windows filename to an 8-bit string. This whole thing is a
recurring PITA, as Windows API calls based on `char` (and also e.g. the
Standard C library functions like `fopen()`) indeed use a Windows ANSI
encoding, so information is lost. IMHO the only sane thing to do on
Windows when you need 8bit strings is to enforce using UTF-8, and
convert it to/from UTF-16 whenever needed to use exclusively the
wide-string APIs.

--
Dipl.-Inform. Felix Palmen <felix@palmen-it.de> ,.//..........
{web} http://palmen-it.de {jabber} [see email] ,//palmen-it.de
{pgp public key} http://palmen-it.de/pub.txt // """""""""""
{pgp fingerprint} A891 3D55 5F2E 3A74 3965 B997 3EF2 8B0A BC02 DA2A

You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

soft and program

Tuesday, March 17, 2020

Digest for comp.lang.c++@googlegroups.com - 20 updates in 2 topics

No comments:

Blog Archive

About Me