Friday, December 11, 2015

Digest for comp.lang.c++@googlegroups.com - 25 updates in 6 topics

Ayrosa <jaayrosa@gmail.com>: Dec 11 08:53AM -0800

int& foo(int&& i){ return i; }
 
I know that this other function is OK
 
int& f(int& i){ return i; }
 
But what about the function on the top above, which has an rvalue ref argument?
Ayrosa <jaayrosa@gmail.com>: Dec 11 09:11AM -0800

Maybe I should rewrite my question to be more precise: does the code below show undefined behavior?
 
int& foo(int&& i){ return i; }
 
int main()
{
int i = foo(2);
}
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Dec 11 06:19PM +0100

On 12/11/2015 5:53 PM, Ayrosa wrote:
 
> I know that this other function is OK
 
> int& f(int& i){ return i; }
 
> But what about the function on the top above, which has an rvalue ref argument?
 
It's OK in itself.
 
But you need to keep in mind that when you call it with a temporary as
argument, that temporary ceases to exist at the end of the full-expression.
 
Cheers & hth.,
 
- Alf
Ayrosa <jaayrosa@gmail.com>: Dec 11 09:37AM -0800

On Friday, December 11, 2015 at 3:19:41 PM UTC-2, Alf P. Steinbach wrote:
> argument, that temporary ceases to exist at the end of the full-expression.
 
> Cheers & hth.,
 
> - Alf
 
But isn't the argument i an lvalue inside foo? If that is the case, the reference returned by foo is already dangling when the function returns. That is, the value assigned to i in main() is undefined.
 
Assume for the moment that foo has an argument A&&, where A is a class type, with a move constructor.
 
A& foo(A&& a) { return a; }
 
int main()
{
A aa = foo(A());
}
 
Inside the function foo, a is an lvalue, onto which the temporary A() is moved. By the time the function returns, the object a of type A is destroyed and the value assigned to aa is undefined.
Martin Shobe <martin.shobe@yahoo.com>: Dec 11 12:40PM -0600

On 12/11/2015 11:37 AM, Ayrosa wrote:
 
>> Cheers & hth.,
 
>> - Alf
 
> But isn't the argument i an lvalue inside foo.
 
Yes.
 
> If this is the case, the reference returned by the function f is already undefined when the function returns. That is, the value assigned to i in main() is undefined.
 
No. The reference returned is for the object passed in. It will remain
valid until that object's lifetime ends.
 
> A aa = foo(A());
> }
 
> Inside the function foo, a is an lvalue, onto which the temporary A() is moved. By the time the function returns, the object a of type A is destroyed and the value assigned to aa is undefined.
 
No. The reference remains valid until the end of the full expression, so
it's fine unless the move constructor tries to keep a reference to its
argument or one of its sub-objects.
 
Martin Shobe
Ayrosa <jaayrosa@gmail.com>: Dec 11 10:53AM -0800

On Friday, December 11, 2015 at 4:40:27 PM UTC-2, Martin Shobe wrote:
 
> No. The reference remains valid until the end of the full expression, so
> it's fine unless the move constructor tries to keep a reference to it's
> argument or one of it's sub-objects.
 
I have to disagree. The stack is unwound just after the execution of the return statement inside foo, and before the assignment is made to aa in main(). So, when the assignment is made, the reference returned by the function is already undefined.
 
See these two answers on Stack Overflow about this:
 
http://stackoverflow.com/a/26321330/411165
 
and
 
http://stackoverflow.com/a/8808865/411165
Paavo Helde <myfirstname@osa.pri.ee>: Dec 11 01:03PM -0600

Ayrosa <jaayrosa@gmail.com> wrote in
 
> http://stackoverflow.com/a/26321330/411165
 
> and
 
> http://stackoverflow.com/a/8808865/411165
 
The stackoverflow examples are about a local variable defined inside the
function. In your example the variable is created outside of the function
and thus lives longer than the function call.
 
Cheers
Paavo
Ayrosa <jaayrosa@gmail.com>: Dec 11 11:23AM -0800

On Friday, December 11, 2015 at 5:03:25 PM UTC-2, Paavo Helde wrote:
> The stackoverflow examples are about a local variable defined inside the
> function. In your example the variable is created outside of the function
> and thus lives longer than the function call.
 
We have the same situation here. Don't forget that the temporary `A()` is moved onto the local variable `a` inside `foo`, which is an lvalue. So the two questions on SO say exactly the same thing, i.e., the return value of `foo` is a reference to a local variable, which has already been destroyed by the time the assignment is made to the variable `aa` in main().
 
The word move is important here and should be understood as something similar to a copy (more efficient than a copy of course).
 
The same thing happens in the first example: the prvalue 2 is "moved" into the local variable `i` inside `foo`, and by the time the function returns, the reference returned by the function is already dangling, before the assignment is made to the local variable `i` in `main()`.
ram@zedat.fu-berlin.de (Stefan Ram): Dec 11 07:15PM

>int& foo(int&& i){ return i; }
 
Fundamental types have no move constructors,
so the argument value should be /copied/ into »i« AFAIK.
 
But I am not able to find the chapter and verse for this.
Christopher Pisz <nospam@notanaddress.com>: Dec 10 06:33PM -0600

The concept of dependency injection is becoming a thing where I work for
all the .NET guys. I barely understand it having never been exposed in
my C++ bubble of 1 programmer on the team for some time.
 
If I understand the programming pattern itself, it simply means to pass
dependencies at construction time as interfaces to a class. This way,
the class isn't aware of what concrete implementation it is using. Seems
like I've been doing that all along anyway...Is there more to it?
 
At least in .NET land, there seems to be some hinting at "configuring
which concrete implementation to use" that goes along with this. I guess
they do it in one of the many configuration files visual studio
generates with a project through some magic library, but I don't know.
 
Do we have something similar we can do in C++?
It seems like it would be very useful to configure a particular
implementation to use without having to recompile, especially for unit
tests when there is a desire to use a mock/proxy interface.
 
Are there any good articles to read or examples to go over that I should
be aware of? Is this a thing for us too?
 
 
--
I have chosen to troll filter/ignore all subthreads containing the
words: "Rick C. Hodgins", "Flibble", and "Islam"
So, I won't be able to see or respond to any such messages
---
Victor Bazarov <v.bazarov@comcast.invalid>: Dec 10 09:41PM -0500

On 12/10/2015 7:33 PM, Christopher Pisz wrote:
> tests when there is a desire to use a mock/proxy interface.
 
> Are there any good articles to read or examples to go over that I should
> be aware of? Is this a thing for us too?
 
Sounds like a marriage between a pimpl idiom and a factory pattern.
 
V
--
I do not respond to top-posted replies, please don't ask
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Dec 11 11:23AM

On Thu, 10 Dec 2015 18:33:08 -0600
> unit tests when there is a desire to use a mock/proxy interface.
 
> Are there any good articles to read or examples to go over that I
> should be aware of? Is this a thing for us too?
 
If you are talking about dependency injection at the implementation/
coding level, then using a polymorphic function object such as
std::function is the easiest way to do it: have them as data members of
the class in which dependencies are to be injected. It is similar but
not identical to the age-old strategy pattern - in the 1990s you would
have implemented it by inheritance and virtual functions, or in the
simplest cases by using function pointers.
 
I think your question may be directed more at build-level frameworks
for deploying dependency injection, about which I cannot help.
 
Chris
legalize+jeeves@mail.xmission.com (Richard): Dec 11 05:54PM

[Please do not mail me a copy of your followup]
 
Christopher Pisz <nospam@notanaddress.com> spake the secret code
>dependencies at construction time as interfaces to a class. This way,
>the class isn't aware of what concrete implementation it is using. Seems
>like I've been doing that all along anyway...Is there more to it?
 
That's basically it. This is the mechanism that allows a class to
follow DIP <https://en.wikipedia.org/wiki/Dependency_inversion_principle>.
In C#/Java it is typically done with runtime polymorphism and
interfaces. In C++ you can use static polymorphism and template
arguments to express the dependency. Andre Alexandrescu's "Modern C++
Design" is the poster-boy for static polymorphism used to express
dependencies. I don't think C#/Java generics are strong enough to
completely do everything you can do with C++ templates to express
static polymorphism.
 
>which concrete implementation to use" that goes along with this. I guess
>they do it in one of the many configuration files visual studio
>generates with a project through some magic library, but I don't know.
 
Once you start using dependency injection into the constructor for classes,
obtaining an instance of a class means you have to know the transitive
closure of all the dependencies for a class. If it's just one or two
things, it's no big deal to do this by hand. Once you have complex
chains of dependencies, it becomes too tedious to do this by hand.
 
To solve that problem, people have created dependency injection frameworks
(DIF). This is most likely what your coworkers are talking about when
they discuss configuring the system to say which concrete implementation
of an interface to use. Typically the application configures the DIF
to use all the production implementations and the unit tests will
configure the DIF to use mocks for everything except the system under
test.
 
>Do we have something similar we can do in C++?
 
There are a number of DIFs for C++. Here are some I've heard about
but have not yet evaluated:
 
* Boost.DI (proposed) <https://github.com/krzysztof-jusiak/di>
* Fruit <https://github.com/google/fruit>
* Sauce <https://github.com/phs/sauce>
* Wallaroo <http://wallaroolib.sourceforge.net/>
 
>It seems like it would be very useful to configure a particular
>implementation to use without having to recompile, especially for unit
>tests when there is a desire to use a mock/proxy interface.
 
As I said, I haven't investigated these frameworks in detail, but
generally from what I have seen so far the configuration is done at
compile time and not at runtime. However, you should investigate for
yourself and draw your own conclusions as my investigation has been
cursory at best.
 
>Are there any good articles to read or examples to go over that I should
>be aware of? Is this a thing for us too?
 
For the former, I think the wikipedia pages on dependency injection
are good. For the latter, I think yes.
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Dec 11 06:38PM

On 11/12/2015 17:54, Richard wrote:
> dependencies. I don't think C#/Java generics are strong enough to
> completely do everything you can do with C++ templates to express
> static polymorphism.
 
Runtime polymorphism and interfaces are just as important in C++ as they
are in Java. I only rarely use the static polymorphism of which you
speak and I am not convinced of its appropriateness as far as dependency
inversion is concerned sausages.
 
/Flibble
"Tobias Müller" <troplin@bluewin.ch>: Dec 11 07:02PM

> are in Java. I only rarely use the static polymorphism of which you
> speak and I am not convinced of its appropriateness as far as dependency
> inversion is concerned sausages.
 
That's exactly how allocators work for the standard containers.
Just imagine you had to pass an allocator instance to every container you
use.
 
The nice thing about using templates for this is that it has zero overhead
and you can use default arguments.
You don't pay for it if you don't use it (explicitly). Both syntactically
and performance wise.
 
Tobi
woodbrian77@gmail.com: Dec 10 06:19PM -0800

On Thursday, December 10, 2015 at 1:24:38 PM UTC-6, Paavo Helde wrote:
> > Manipulation (http://dx.doi.org/10.1109/SCAM.2015.7335398).
 
> I find it ironic that a study about open-source software is not openly
> available.
 
I encourage people to say "what's yours is yours and
what's mine is yours."
 
 
Brian
Ebenezer Enterprises - In G-d we trust.
http://webEbenezer.net
"Öö Tiib" <ootiib@hot.ee>: Dec 11 10:59AM -0800

> > available.
 
> I encourage people to say "what's yours is yours and
> what's mine is yours."
 
Why do you want people to say that to you? Do you really need their stuff?
michael.podolsky.rrr@gmail.com: Dec 11 07:54AM -0800

Hi Everyone,
 
This post is related to the thread library of C++11.
 
There is a somewhat special function in C++
 
void std::notify_all_at_thread_exit( std::condition_variable& cond,
std::unique_lock<std::mutex> lk );
 
which is defined as to 'schedule' an execution of the following code fragment:
 
line 1: lk.unlock();
line 2: cond.notify_all();
 
at the very end of the thread when all the thread-local variables are already destructed.
 
I think that this particular order (first releasing the mutex and only then signalling the cond-var) may be dangerous in some particular scenarios and lead to accessing memory which has either been freed or had its destructor run (the very memory previously occupied by the cond-var).
 
To demonstrate that, I'll consider a promise-future class pair, in which the promise class implements
 
std::promise::set_value_at_thread_exit(const T& val) function.
 
The scenario is simple:
 
Thread B calls promise.set_value_at_thread_exit(val);
while Thread A calls val=future.get();
and then immediately destroys the 'future' instance (say, it is going out of the scope).
 
I will now follow the Microsoft C++ implementation, which uses notify_all_at_thread_exit() in rather obvious and straightforward way.
 
The promise and future class instances share a common state which can store a value (which promise will pass to the future) and with a mutex+cond_var to control synchronization. So, when we call set_value_at_thread_exit() on the promise in the thread B, the mutex will be locked, then the value will be assigned, some readiness boolean flag will be set and then the releasing of the mutex and signalling of the cond-var will be scheduled with a call to notify_all_at_thread_exit(). While the thread B is running, the mutex will be locked and so other threads will not yet "see" the respective 'future' ready.
 
The thread A which possesses the future instance will do its call to
 
val = future.get();
 
which awaits the result from the thread B. This wait is implemented as locking the same mutex, verifying the readiness flag and, if it is not yet set, waiting on the cond-var. Finally, when the readiness flag is seen to be set, reading the value and releasing the mutex. A pretty standard arrangement.
 
Now suppose that the thread A started its wait just at the moment when thread B is finishing, precisely after the thread B calls lk.unlock() (line 1) and BEFORE it calls cond.notify_all(). Then the wait will immediately succeed, the future.get() call will see the value ready and future.get() will finish. Then thread A will destruct its 'future' instance (remember, we made it going out of scope) which will destroy the shared (between promise and future) state as well (the 'promise' instance has been presumably destructed by the time thread B finishes). So, the shared state is destructed, the condition variable is destructed and yet we still have thread B which is now going to call cond.notify_all() on the object 'cond' (line 2) which has been already destructed.
 
This problem is not a feature of the promise-future implementation; it may happen with any custom call to notify_all_at_thread_exit() unless the provided condition variable resides in global (or otherwise sufficiently long-lived) memory.
I suppose that if notify_all_at_thread_exit() performed its actions in the opposite order, i.e. first signalled the condvar and then released the mutex, this problem would be gone.
 
Regards, Michael
michael.podolsky.rrr@gmail.com: Dec 11 08:03AM -0800

> Now suppose that the thread A started its wait just at the moment when thread B is finishing, precisely after the thread B calls lk.unlock() (line 1) and BEFORE it calls cond.notify_all(). Then the wait will immediately succeed,
 
Just to remove possible ambiguity, I meant the wait on the mutex and not on the condvar. That is, the thread A locks the mutex, sees that the readiness flag is set, reads the value and unlocks the mutex. No actual wait on the condvar happens.
 
- Michael
Nobody <nobody@nowhere.invalid>: Dec 11 04:24AM

On Thu, 10 Dec 2015 14:49:57 +0100, Alf P. Steinbach wrote:
 
> • No system dependent Unicode character type (like int is for integers).
 
What does this actually mean? Or rather, why doesn't wchar_t count?
 
Admittedly, wchar_t isn't guaranteed to be Unicode. But if it isn't,
that's usually because the platform itself doesn't support Unicode.
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Dec 11 05:59AM +0100

On 12/11/2015 5:24 AM, Nobody wrote:
 
> What does this actually mean? Or rather, why doesn't wchar_t count?
 
> Admittedly, wchar_t isn't guaranteed to be Unicode. But if it isn't,
> that's usually because the platform itself doesn't support Unicode.
 
In practice wchar_t is guaranteed to be Unicode (I don't know of any
exception), but Unix-land OS APIs don't take wide string arguments.
E.g., the Unix-land "open" function takes a narrow string,
 
// http://pubs.opengroup.org/onlinepubs/009695399/functions/open.html
int open(const char *path, int oflag, ... )
 
The corresponding function in Windows is the appropriate mode of the
all-purpose CreateFileW function, which takes wide strings or (via a
wrapper called CreateFileA) Windows ANSI-encoded narrow strings.
 
This matters both for the case where such functions are used directly
(C++ is not only for writing portable code), and for the case where one
desires to write a portable interface with system-dependent
implementation that should not need to convert encodings and do dynamic
allocation and such... An example of such an interface is the Boost
filesystem library, which will be part of C++17. It kludge-solves the
issue by requiring wide string based stream constructors in Windows, but
I think that singling out a specific OS in the specification of a
standard library component for C++, is a very very bad approach, and so
needless – it would not be a problem with core support for a syschar.
 
Cheers & hth.,
 
- Alf
David Brown <david.brown@hesbynett.no>: Dec 11 10:44AM +0100

On 10/12/15 17:41, Alf P. Steinbach wrote:
> unsigned type, as very unnecessary: just by using signed types for
> numbers, and reserving unsigned types for bitlevel stuff, it's avoided.
 
> Well, mostly. ;-)
 
OK. I know that mixing signed and unsigned can introduce subtle errors,
so I can understand if you want to avoid it. (My own use is a little
different, and I use unsigned types a lot - but then, I do low-level
programming on small embedded systems, and therefore much more "bitlevel
stuff".)
 
 
> In the single case (that I know of) where the C++ standard library
> represents a size as signed, namely "count" and "count_if", it uses the
> iterator difference type, which for pointers is "ptrdiff_t".
 
I would then say that C++ itself does not need a standard signed size
type - if you are using Posix, then you've got "ssize_t", and if you are
using "count" you have ptrdiff_t. But if you want a nicely named signed
size type for your own use, that seems fair enough.
 
 
> C++ now supports Posix, at least to some degree. In particular with
> C++11 conversions between function and data pointers, required by Posix,
> were allowed. It's up to the implementation whether it's allowed.
 
Posix is not supposed to work on /all/ C++ implementations. It makes
certain requirements of the C or C++ implementation beyond the
standards. For example, it requires two's complement signed integers,
8-bit chars, and 32-bit int. So Posix has always relied on certain
implementation-dependent features of C and C++ - but they are ones that
are valid on all systems for which Posix is realistic.
 
>> dependent, in that they may not actually be 8-bit, 16-bit or 32-bit on
>> all systems, but otherwise they are standardised.
 
> No, that's not usefully system dependent.
 
I don't see system dependent as a good thing here - in fact, I see it as
a bad thing.
 
 
> That's the kind of portability that "int" offers: it's right for the
> system at hand, and yields one common source code that adapts
> automatically to the system it's compiled on.
 
I am actually much happier that you /don't/ have that. You can write
u8"Blah" and have the string as a utf8 string, or U"Blah" for a utf32
string. Plain old strings give you the default system-dependent 8-bit
character encoding, which is suitable for plain ASCII and little else.
 
If you want to write code that is tied directly to a particular OS and
its API, you can use the types that suit that OS - utf8 for *nix, utf16
for Windows (and hope that you don't fall foul of the UCS16/utf16 mess).
The types there are clear.
 
If you want to write code that is independent of the OS, you use a
cross-platform library in between so that your code can stick to a
single format (usually utf8 or utf32) and the library handles the
OS-specific part.
 
 
> Yes, but it's in the wrong direction.
 
> I was talking about casting (back) to signed, without giving the
> compiler Unsound Optimization Ideas™ based on formal UB here and there.
 
Sorry, I misread you here. Converting from signed to unsigned is
well-defined, while converting from unsigned to signed has
implementation-defined behaviour when the value cannot be represented.
It is not undefined behaviour, so there is no "compiler problem". I
don't know for sure about other compilers, but gcc (and therefore llvm,
which follows gcc in these matters) implements modulo behaviour here.
Since all signed integers in gcc are two's complement, that means the
compiler will simply re-interpret the same bit pattern as a signed
value. It would surprise me if the implementation-defined behaviour was
any different on other compilers, at least for "nice" target architectures.
 
> unsigned, exactly as you showed here, as a first step, to get
> well-defined modulo wrapping). I guess it can be done in a more elegant
> way. And not sure if it /really/ avoids the compiler problem.
 
There is no compiler problem here as far as I can see (see above). The
only issue could be implementation-defined (but not undefined) behaviour
being different on some compilers.
 
 
>> Would it make sense to put "constexpr" in the code in some cases?
 
> Not sure. Do you have something in particular in mind? I'm still at the
> stage where I only add "constexpr" where it's directly needed.
 
I was thinking for cases like "is_ascii" (though you need C++14 for the
loop - in C++11 you'd need to use recursion), and wrap_to. Basically,
using "constexpr" restricts what you can do in the function (less so in
C++14), but means that the compiler can pre-calculate it (if the
parameters are constant) and use the results in more contexts. So when
the restrictions on the features needed in the function are not a
limitation, it adds flexibility that could be useful in this sort of code.
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Dec 11 01:32PM +0100

On 12/11/2015 10:44 AM, David Brown wrote:
> cross-platform library in between so that your code can stick to a
> single format (usually utf8 or utf32) and the library handles the
> OS-specific part.
 
Consider if that was so for integers, that one needed some 3rd party
library to interface the integers with the OS API, converting back and
forth.
 
It's possible but it's just needlessly complex and inefficient.
 
 
> Sorry, I misread you here. Converting from signed to unsigned is
> well-defined, while converting from unsigned to signed has
> implementation-defined behaviour when the value cannot be represented.
 
Hm, you're right.
 
All that the function does technically then is to avoid a sillywarning
with Visual C++ 2015 update 1, but that sillywarning can instead be
turned off.
 
Grumble grumble...
 
 
> It is not undefined behaviour, so there is no "compiler problem".
 
Well, not so fast. But right, the conversion is implementation defined
behavior. Thanks, it slipped my mind!
 
 
> parameters are constant) and use the results in more contexts. So when
> the restrictions on the features needed in the function are not a
> limitation, it adds flexibility that could be useful in this sort of code.
 
C++14 sounds good, after all we're in 2015. But I'll have to experiment
to find out what Visual C++ 2015 supports. E.g. it doesn't yet (as of
update 1) support variable templates.
 
 
Cheers, & thanks,
 
- Alf
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Dec 11 02:57PM +0100

On 12/11/2015 1:32 PM, Alf P. Steinbach wrote:
 
> C++14 sounds good, after all we're in 2015. But I'll have to experiment
> to find out what Visual C++ 2015 supports. E.g. it doesn't yet (as of
> update 1) support variable templates.
 
Unfortunately the following code does not compile with MSVC 2015 update
1 (the latest version of Visual C++) when NEWFANGLED is defined,
although it does compile with MinGW g++ 5.1.0:
 
 
<code>
#ifdef NEWFANGLED
 
template< class... Args >
constexpr
auto exactly_one_of( const Args&... args )
-> bool
{
const bool values[] = {!!args...};
int sum = 0;
for( bool const b : values ) { sum += b; }
return (sum == 1);
}
 
#else
 
inline constexpr
auto n_truths()
-> int
{ return 0; }
 
template< class... Args >
constexpr
auto n_truths( const bool first, Args const&... rest )
-> int
{ return first + n_truths( rest... ); }
 
template< class... Args >
constexpr
auto exactly_one_of( const Args&... args )
-> bool
{ return (n_truths( !!args... ) == 1); }

#endif

</code>