soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

How does C++ implementation cast the pointer returned by a virtual function? - 9 Updates
C++ 2017 -- win, lose and draw - 2 Updates
Whitespace and (borked) comments. - 1 Update
Using of Macros - 7 Updates
Beginner example of concurrency in c++ - 3 Updates
Comment on stackoverflow about value initialization seems to contradict my gcc compiler. - 1 Update
Parameter type deduction with constructors - 2 Updates

How does C++ implementation cast the pointer returned by a virtual function?

Shiyao Ma <i@introo.me>: Jan 21 07:53PM -0800

Hi.

A virtual function can return a pointer to the derived class.

When it comes to multiple inheritance, the pointer might need an offset added by the cast, in order to point to the right base.

E.g.,

Base *ptr = someobjptr->some_func_return_pointer_to_derived();

The problem is that virtual function binding happens at run time, while the "Base*" type is fixed statically.

So how is the offset calculated to adjust the result of "someobj->some_func_return_pointer_to_derived()" to be "Base*" ?

Though it's highly implementation-specific, is there any concrete example, e.g. for gcc?

The following is the code snippet; we can see that the addresses of the two pointers are different.

http://ideone.com/6DmlT3

jt@toerring.de (Jens Thoms Toerring): Jan 22 05:34AM

> A virtual function can return a pointer to the derived class.

Here's your program; it's short enough to post.

> // the outputed two addresses are different.
> // how is the pointer cast (adding some offset) achieved?
> }

Your assumption that func() would return a pointer to a
'Derived' class instance is simply wrong. It returns a
pointer to a newly created instance of 'C'. So it has
nothing to do with the address of the 'Derived' class
instance and they must be different since they are
completely different objects. The address stored in 'pb' is
exactly the same as that of the instance of 'Derived'
from which it was assigned. Try instead

Derived *dp = new Derived;
Base *bp = dp;
cout << dp << ' ' << bp << endl;

If your assumption were correct you could do

Base *bp2 = pb->func();

But the compiler won't let you do that because the result
of 'pb->func()' is neither a pointer to an instance of
'Derived' nor 'Base' but to a new instance of 'C'. And 'C'
isn't derived from 'Base' and thus the assignment isn't
possible (unless you force the compiler via a cast to
let you do that anyway).
Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de

Stuart Redmann <DerTopper@web.de>: Jan 22 03:46PM +0100

> The following is the code snippet, we can see the addresses of the two
> pointers is different.

> http://ideone.com/6DmlT3

If you add a covariant implementation you actually add two virtual methods.
One is the one that you have explicitly provided, the other is supplied by
the compiler automatically, and it looks like this:

B* func ()
{
    C* derived = func();  // invokes your version, but this code
                          // would not compile in reality because
                          // the compiler could not figure out which
                          // version of func should be called.
    B* base = static_cast<B*>(derived);
    return base;
}

Note that the above snippet illustrates an implementation detail called vtable
thunking. However, since most C++ compilers use vtables they will most
likely also use this technique.

One implication of this technique is that each covariant method declaration
adds another entry to the vtable of a class. So if you create a really large
inheritance tree that consists only of a single branch and each level adds
a covariant method, your most derived class will end up with a large vtable
as well, even if it contains only a single virtual method!

Regards,
Stuart

Shiyao Ma <i@introo.me>: Jan 22 07:06AM -0800

Thanks Stuart,

Very enlightening to have a read.

On Sunday, 22 January 2017 22:46:44 UTC+8, Stuart Redmann wrote:

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jan 22 04:42PM +0100

On 22.01.2017 04:53, Shiyao Ma wrote:

> A virtual function can return a pointer to the derived class.

> When it comes to multiple inheritance, the pointer might be casted to
> add some offset, in order to point to the right base.

Address adjustment can happen also for single inheritance.

> Base *ptr = someobjptr->some_func_return_pointer_to_derived();

> The problem is, virtual function binding is a runtime work. While,
> "Base*" type is a static work.

The declared return type is known at compile time.

> So how is the offset calculated to adjust the result of
> "someobj->some_func_return_pointer_to_derived()" to be "Base*" ?

There are a number of different cases.

> Though it's high impl. specific, any concrete example, like gcc?

The under-the-hood details are implementation specific, yes.

> The following is the code snippet, we can see the addresses of the
> two pointers is different.

> http://ideone.com/6DmlT3

Usenet is not a web forum. Messages are archived and can be read for
many decades. Your URL will probably not be valid in a year or two.

As it happens Jens Thoms Toerring has already discussed your example
else-thread: your code creates two objects of two unrelated types, and
compares their addresses. It's an irrelevant example. :)

Here is about the simplest example where there is likely to be a pointer
adjustment when converting from `Derived*` to `Base*`:

struct Base { int x; };
struct Derived: Base { virtual ~Derived() {} };

#include <iostream>
using namespace std;
auto main()
    -> int
{
    Derived o;
    Base& b = o;
    cout << "Base at " << &b << ", Derived at " << &o << endl;
}

Since `b` is (apparently) a reference to `o`, an alias for `o`, one
might naively expect that they should be at the same address. But in
more detail `b` is a reference to /the `Base` sub-object/ of `o`. I.e.
to something inside `o`, and since `Derived` is not a POD class that
sub-object is not guaranteed to be at the very start of `o`.

OTOH it's not guaranteed that there will be an adjustment, either: it
depends on the implementation. But it's likely. With MinGW g++ I get

Base at 0x22fd58, Derived at 0x22fd50

The Derived object logically contains a Base sub-object. And here the
g++ compiler placed a vtable pointer before the Base sub-object, at the
very start of Derived. Hence the Base sub-object is at a slightly higher
address than Derived, just sufficiently to make room for that vtable
pointer, which is 8 bytes with this 64-bit compiler.

So what happens if, in `Base`, you add this method:

virtual auto p() -> Base* = 0;

and in `Derived` you implement it as

auto p() -> Base* override { return this; }

Well that case is easy, in two ways! First, in `Derived` the type of
`this` is known to be `Derived*`, and the conversion to `Base*` can be
determined at compile time. And secondly, by adding a virtual method up
in `Base` we have introduced a vtable pointer there, so there's likely
no address adjustment at all, i.e., `Base` is at offset 0 in `Derived`.

To get an address adjustment sort of within the call of a virtual
method, one apparently needs multiple inheritance. At least in
practice.

Let's first construct an example with sub-objects of the same type but
at different offsets within the containing derived class object:

struct Base { int x; virtual ~Base(){} };
struct Intermediate1: Base {};
struct Intermediate2: Base {};
struct Derived: Intermediate1, Intermediate2 { };

#include <iostream>
using namespace std;
auto main()
    -> int
{
    Derived o;
    Base& b1 = static_cast<Intermediate1&>( o );
    Base& b2 = static_cast<Intermediate2&>( o );
    cout << "Base 1 at " << &b1 << ", Base 2 at " << &b2
        << ", Derived at " << &o << endl;
}

With MinGW g++ I get

Base 1 at 0x22fd40, Base 2 at 0x22fd50, Derived at 0x22fd40

The problem with this example is that it's still irrelevant for the
virtual function call question, for it's not the case that a `Derived`
is-a single `Base`. A `Derived` here is two `Base`s, a `Base` plus a
`Base`. And since they are on equal footing a `Derived*` doesn't convert
implicitly to a single `Base*`: the conversion is ambiguous!

And so a virtual function implementation down in `Derived` cannot simply
return `this` in order to return a `Base*`: it must disambiguate, e.g.
via a `static_cast` as shown above, exactly which of the two `Base`
sub-objects it should return the address of.

The relevant conversion `Derived*` → `Base*` will therefore be known at
compile time, and it's also known at compile time for a virtual function
implementation in `Intermediate1` or `Intermediate2`.

To make things more complex & interesting we can sort of merge the two
`Base` sub-objects into a single shared one, by using `virtual`
inheritance from `Base`. All lines of `virtual` (direct) inheritance of
a class T go to the same single T sub-object, and so in this code:

struct Base { int x; virtual ~Base(){} };
struct Derived_v: virtual Base {};
struct Derived1: Derived_v {};
struct Derived2: Derived_v {};
struct Most_derived: Derived1, Derived2 { };

#include <iostream>
using namespace std;
auto main()
    -> int
{
    Most_derived o;

    Derived1& d1 = o;
    Derived_v& d1v = d1;

    Derived2& d2 = o;
    Derived_v& d2v = d2;

    Base& b = o;

    cout << "b at " << &b << ", d1v at " << &d1v << ", d2v at " << &d2v
        << ", " << "o at " << &o << endl;
}

… the single `Base` sub-object must necessarily be at different offsets
in the two `Derived_v` sub-objects, and indeed, I get e.g. this output:

b at 0x22fd30, d1v at 0x22fd20, d2v at 0x22fd28, o at 0x22fd20

So now we're in position to ATTEMPT to create a virtual function with
covariant result, where that result must be adjusted in different ways
depending on through which sub-object that function is called.

That is, we attempt to force an adjustment of the function result that
depends on information only known at run-time:

struct Base
{
    int x;
    virtual auto p() -> Base* { return this; }
    virtual ~Base(){}
};

struct Derived_v
    : virtual Base
{
    auto p() -> Derived_v* override { return this; }
};

struct Derived1
    : Derived_v
{
    auto p() -> Derived1* override { return this; }
};

struct Derived2
    : Derived_v
{
    auto p() -> Derived2* override { return this; }
};

struct Most_derived: Derived1, Derived2 { };

#include <iostream>
using namespace std;
auto main()
    -> int
{
    Most_derived o;

    Derived1& d1 = o;
    Derived_v& d1v = d1;

    Derived2& d2 = o;
    Derived_v& d2v = d2;

    Base& b = o;

    cout << "b at " << b.p() << ", d1v at " << d1v.p() << ", d2v at "
        << d2v.p() << ", " << "o at " << &o << endl;
}

But this is just not allowed by the C++ rules: this code will not
compile with a standard-conforming compiler.

[C:\my\forums\clc++\050]
> g++ d.cpp
d.cpp:26:8: error: no unique final overrider for 'virtual Base*
Base::p()' in 'Most_derived'
struct Most_derived: Derived1, Derived2 { };
^~~~~~~~~~~~

[C:\my\forums\clc++\050]
> cl d.cpp
d.cpp
d.cpp(26): error C2250: 'Most_derived': ambiguous inheritance of
'Derived1 *Base::p(void)'

[C:\my\forums\clc++\050]
> _

So, the short answer is that the C++ rules ensure that any adjustment of
a function result is completely known at compile time, when the function
implementation is compiled. And the slightly longer answer is that the
rules are quite complex, but they add up to reliable simple behavior. Or
at least, it's been that way up till and including C++14.

Cheers & hth.,

- Alf

Manfred <noname@invalid.add>: Jan 22 05:30PM +0100

On 1/22/2017 6:34 AM, Jens Thoms Toerring wrote:
>> }

> Your assumption that func() would return a pointer to a
> 'Derived' class instance is simply wrong.
This does not appear to be the assumption of the program.
It first prints (in Derived::func()) the address of a new object of type
C, and then prints (in main()) the address of the same object converted
to a B*.
The confusing part is that it is using Base and Derived as well as A, B
and C.
The difference in address is, though, not due to func() being virtual,
nor does it have anything to do with polymorphism. It is simply due to the
fact that C has two bases A and B which obviously cannot both share the same
address, so the B (the second base) subobject has a different address
than the C object (which probably has the same address as the A subobject).
Conversions between A, B and C pointers are performed by the compiler
given that their relative layout is known at compile time.

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jan 22 10:15PM +0100

On 22.01.2017 17:30, Manfred wrote:
> It first prints (in Derived::func()) the address of a new object of type
> C, and then prints (in main() the address of the same object converted
> to a B*.

No, it prints the result of calling `pb->func()`, not `pb`.

Cheers & hth.,

- Alf

Manfred <noname@invalid.add>: Jan 22 10:53PM +0100

On 1/22/2017 10:15 PM, Alf P. Steinbach wrote:
>> C, and then prints (in main() the address of the same object converted
>> to a B*.

> No, it prints the result of calling `pb->func()`, not `pb`.

Exactly, pb is never printed nor assumed to be compared to anything.
The other printout (meant to be compared with the result of
`pb->func()`) is in the following, where the thing being printed is the
result of `new C`:

struct Derived: Base {
    C* func() {
        auto p = new C;
        cout << p << endl;
        return p;
    }
};

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jan 22 11:23PM +0100

On 22.01.2017 22:53, Manfred wrote:
> return p;
> }
> };

Oh, a function with side effect, and covariant with respect to a
parallel class hierarchy.

Well then the analysis given by Jens up-thread is incorrect. And I
failed to see that. But who would expect such code, huh.

Now if I were the compiler I would implement it as follows:

auto func()
    -> B* override      // Known to be C* for this implementation.
{
    auto p = new C;
    cout << p << endl;
    return p;
}

auto _non_virtual_func()
    -> C*
{ return static_cast<C*>( func() ); }

... and translate every call of the source code's `func() -> C*`, to a
call of `_non_virtual_func`.

This keeps all the pointer type conversion (with possible address
adjustment) using only compile time information.

The direction of delegation is important because `func` can be further
overridden in a more derived class, and then one wants that reflected
also in calls of `_non_virtual_func`.

By the way, this is the usual pattern for creating covariant functions
returning smart pointers, since C++ supports covariance only for raw
pointer and raw reference result.

Cheers!, & thanks,

- Alf

C++ 2017 -- win, lose and draw

woodbrian77@gmail.com: Jan 22 01:24PM -0800

std::string_view == win

But there should be support for appending a string_view to
std::string. That seems to still be missing from gcc 7.0
and clang 3.9.1. That's kind of frustrating.

std::variant == win
std::any == lose
std::optional == draw

That's my take on these additions. I tried replacing a use of
std::unique_ptr with std::optional. The resulting executable
was 588 bytes bigger (~1.5%). It avoids the allocation/
deallocation, but not without its own cost. In a single
threaded program perhaps unique_ptr would be better.

Brian
Ebenezer Enterprises - In G-d we trust.
http://webEbenezer.net

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jan 22 10:11PM

> was 588 bytes bigger (~1.5%). It avoids the allocation/
> deallocation, but not without its own cost. In a single
> threaded program perhaps unique_ptr would be better.

std::optional is not meant to be an alternative to std::unique_ptr and
shouldn't be considered such.

/Flibble

Whitespace and (borked) comments.

Vir Campestris <vir.campestris@invalid.invalid>: Jan 22 05:58PM

On 21/01/2017 14:08, jmfbahciv wrote:
> figure out what the original developer intended.

> I really enjoy the ones which have comments indicating
> "Thar be dragons"

In one sense the only accurate documentation is the code.

But in another sense- it's not documentation at all.

How can the code tell you "It might look as if you can optimise this
out, but if you do XXXX will happen"?

Andy

Using of Macros

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jan 21 05:53PM

On 21/01/2017 10:12, JiiPee wrote:

> Using macro seems to be the only way to do this trace so that the
> release version ignores that line and does not compile it. How would
> "dont-use-macros" person do this other way?

As long as the macro doesn't affect program behaviour (as is the case
here) then it is perfectly fine and is little different to using assert.

/Flibble

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jan 21 07:51PM +0100

On 21.01.2017 19:03, David Brown wrote:

> static inline void Trace(int a) {};

>

Sunday, January 22, 2017

Digest for comp.lang.c++@googlegroups.com - 25 updates in 7 topics
