soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

Google Groups

Tutorial on threaded binary tree part 3: AVL-like updating of height differences - 1 Update
Those dang window message dispatcher macros again, too unclean? - 1 Update
Lisp interpreter - 3 Updates
Mommy's Little Idiot. - 1 Update
question about "const" before multiple commas and objects - 2 Updates
#1NEWBIEHERE question about "const" before multiple commas and objects - 2 Updates

Tutorial on threaded binary tree part 3: AVL-like updating of height differences

Tim Rentsch <txr@alumni.caltech.edu>: Dec 31 01:12PM -0800

> can be encoded in just 2 bits, e.g., for joy-joy hacking delight, in
> the least significant bits of the pointers ? but I'm not going to do
> that. :)

The code still doesn't discard (or simply count) duplicate values.
I really recommend that you do that, especially before you get to
rebalancing. (I have no other specific comments on that version
of the code so I snipped it.)

> ----------------------------------------------------------------------

> An implementation of Tim's suggested O(1) memory updating, which as I
> understand it is a common implementation:

It corresponds to what is done in the AVL algorithm given in
Knuth. I don't know how common it is, only that I thought it
worth pointing out.

> node = (go_left? node->left : node->right);
> }
> }

Another way the termination test could be done is to pass in the
address of the newly created leaf node, and compare for it. That
would make the while() loop be something like

while( node != new_leaf ) ...

and it would allow the updating of node->height_diff to be lifted
up a level rather than being inside an if().

> I haven't measured these but the O(1) memory scheme avoids any dynamic
> allocation even for the most naive direct implementation, so for the
> case of non-optimized code it should be faster.

A downside of this approach is that the value comparisons (or some
of them anyway) are done a second time. An intermediate scheme,
which I kind of like, is to have a "stack" of bits reflecting the
left/right choices made past the point of the tipping point node.
The "stack" of bits can be stored in 2 or 3 'unsigned long' words,
which is more than enough to handle any feasible tree.

Incidentally, kind of a side comment, I find the code composition
style shown to be somewhat cumbersome. I don't know if you have
changed your normal style for tutorial reasons, and it's really
tangential to the main discussion anyway, but I thought it might
be good to mention it.

> It brings more complexity to the `add` member function, and simplifies
> the `update_height_differences`... function.

It turns out some sort of tracking like this will be important
when we get to rebalancing, so it's probably better to introduce
it now.

Also, you may find it helpful to split off the special case of
adding a value to an empty tree. The general case code has an
easier time of it if that code doesn't have to deal with the tree
being completely empty.

> ----------------------------------------------------------------------

> My thinking about balancing.

> I've yet to read up on this in Wikipedia. [...]

For those who are interested I will try to walk through a general
outline. If anyone wants to figure it out on their own then stop
reading here.

Here is a canonical tree that may need rebalancing after a new
value is inserted (and obviously there are the corresponding
mirror image cases):

20
/ \
/ \
10 40
/ \
/ \
30 50

Note: the tree is shown as complete, with '20' at the top, but
that isn't actually necessary for what follows. The '20' node
could be in the middle of the tree, and there could be additional
subtrees below the '10', '30', and '50' nodes, as long as:

one: the height of those three leaf nodes is all the same; and
two: all nodes after '20' along the insertion path are "level",
ie, their left and right subtrees have equal heights.

So, if we're going to insert a new value, there are three cases:
somewhere under the 10 (eg, 5 or 15), somewhere under the 30 (eg,
25 or 35), or somewhere under the 50 (eg 45 or 55). Let's look at
the "under 10" case first. The new tree looks like this:

20
/ \
/ \
10 40
.... / \
/ \
30 50

where the .... means something got added somewhere in the left or
right subtree of the 10. The "balance" factor of the 10 node will
have been adjusted already, by virtue of being downstream from the
"tipping point" node, which is the 20 node in this case. The
height of 20's left side has grown by one, so the 20 node needs to
be marked as "level", but nothing else needs to happen.

Now let's look at the "under 50" case. The new tree looks like
this:

20
/ \
/ \
10 40
/ \
/ \
30 50
....

For sure this tree needs to be rebalanced. After inserting a new
value, rebalancing an AVL tree is a local operation, involving
just a few nodes in the interior of the tree. Here we have the
easier of the two cases where rebalancing needs to happen.
Typically the transformation done is described as a "rotation".
Conceptually it could be described thusly:

one: move the 20 node down and to the left;
two: move the 40 node up to where the 20 node used to be;
three: change the left link in the 40 node to point to the
20 node, and set the right link of the 20 node (which
used to point to the 40 node) to point to the 30 node.

Here is the post-rotation picture:

40
/ \
/ \
20 50
/ \ ....
/ \
10 30

To adjust the balance factors, the 20 node and the 40 node both
need to be set to "level". (The balance on the 50 node will have
been adjusted already, according to whether the insertion happened
on the left or the right.) Whatever parent pointer used to point
to the 20 node needs to be set to point to the 40 node. The
parent pointer could be a left- or right- pointer in what was 20's
parent node, or it could the root_ pointer if the 20 node was at
the top of the tree.

Now let's look at the "under 30" case. Here is what the tree
looks like after the new value is inserted:

20
/ \
/ \
10 40
/ \
/ \
30 50
....

Suppose we try the same rotation we did for the "under 50" case.
The resulting tree would be:

40
/ \
/ \
20 50
/ \
/ \
10 30
....

Obviously that didn't help - the tree shown has the same out-of-
balance problem as the original, except on the left instead of on
the right. We need to do something different, but what?

It's common to see the transformation needed here described as a
"double rotation", but I think it might be easier to understand if
described like this:

one: reach down, grab the 30 node, pull it up to the top;
two: set 30's left link to point to the 20 node, 30's right
link to point to the 40 node;
three: set 20's right link to point to the original value of
30's left link, and 40's left link to point to the
original value of 30's right link

Here is the resulting picture:

30
/ \
/ \
20 40
/ .? ?. \
/ \
10 50

Note the .... that was under the 30 got split, with half of it put
under the 20, and half of it put under the 40. If the 30 node was
previously a leaf, only one of those two pointers will be
non-null.

For balance factors: the 30 node should be set level; the 20
node should be set level if the insertion happened down 30's left
side, and "leftish" otherwise; the 40 node should be set level if
the insertion happened down 30's right side, and "rightish"
otherwise.

And of course whatever parent node was previously pointing to 20
needs to be set to point to the 30 node.

NOTE: The discussion above assumes all nodes along the insertion
path and downstream of the 20 node have had their balance factors
adjusted already. Nodes upstream of the 20 node all keep the same
balance factors they had before the insertion.

Those dang window message dispatcher macros again, too unclean?

Manfred <noname@invalid.add>: Dec 31 06:34PM +0100

On 12/23/2016 04:30 PM, Stefan Ram wrote:
>> dispatch method either.

> What is bothering you most about the dispatch methods that
> you have been using so far?

I still had to answer to this.
MFC uses message maps, which substantially work, but they are buried
within MFC, so they come with all of its burden (which Alf summarized well).
Moreover, from the language point of view, I think they substantially
are not C++. They are C with classes at best, IMHO - no object
connection, no polymorphism.
It results in the oddity (also IMHO), as per their functionality, that
in the era of C++17 they still use two map lookup: the window object and
the handler function. Microsoft has gone to considerable extent in
optimizing such lookup, but still every single message has to go through
them.
From the coding point of view, it is a minor issue but I find it
somewhat suboptimal that a handler for a single message has to be
declared twice: in the class and in the map.

Lisp interpreter

Ian Darroy <iandarroy@gmail.com>: Dec 30 09:46PM -0800

Hello guys! Happy New Year comming!

I've got a design question about how to construct my application the right way.

I'm writing a small lisp interpreter and it consists of few classes: Reader, Evaluator, Environment, Symbol and Function. My design is bad because hardly all classes depend on each other this way:

Reader needs Symbol to parse input and wrap strings into Symbol's name.
Evaluator fulfills these Symbols with values (each Symbol has all types in it like Number, Function, String and List (I use vector of Symbols)).
Environment keeps map<Name, Symbol> to hold values so Evauator can ask it for them.
Functions devide on builtins and user-defined, all accept vector of Symbols and I've got inheritance for them.

All this is binded with Evaluator, it's like a hypervisor and knows about everything, what makes me sad.

Please, help to make a better design decision. If you know some recourses, I will be thankful!

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Dec 31 07:04AM +0100

On 31.12.2016 06:46, Ian Darroy wrote:
> about everything, what makes me sad.

> Please, help to make a better design decision. If you know some
> recourses, I will be thankful!

General advice is to differentiate between interface and implementation
of something.

Two modules can depend on each other's interfaces without depending on
each other's implementations.

This is harder to do with header-only modules, so when you know there's
going to be little mess, separate compilation is probably a good idea.

Cheers!,

- Alf

Ian Darroy <iandarroy@gmail.com>: Dec 31 12:08AM -0800

Thank you

Mommy's Little Idiot.

Jeff-Relf.Me <@.>: Dec 30 08:39PM -0800

question about "const" before multiple commas and objects

ram@zedat.fu-berlin.de (Stefan Ram): Dec 31 12:49AM

>//CONST int ci = 0, &cj = ci;
>so does the const entitle every object after its created?

A simple-declaration is produced thus:

simple-declaration:
decl-specifier-seq init-declarator-list[opt] ;

. Above, »const int« is the decl-specifier-seq, and »ci = 0,
&cj = ci« is the init-declarator-list[opt].

So the »const«, as a part of the decl-specifier-seq, applies
to every of the following init-declarators.

>now that the const is in the middle, does it ignore the first
>object and create a const for everything in front of it?
>including the next object?

Here, the »const« only refers to the init-declarator-list
»int &k = i«.

The semicolon »;« terminates a full simple-declaration, and
the »const« cannot refer to other simple-declarations.

ram@zedat.fu-berlin.de (Stefan Ram): Dec 31 01:22AM

>start of the declaration. I think it's unfortunate, but all the examples
>in the Holy Standard use prefix `const`. The more general notation is
>needed for pointer and function declarations.

I repost a post of mine here:

|Newsgroups: comp.lang.c++
|Subject: const T vs T const
|From: ram@zedat.fu-berlin.de (Stefan Ram)
|Message-ID: <const-20160926184659@ram.dialup.fu-berlin.de>
|
| It was possibly in this newsgroup where the topic
| "const T vs T const" was discussed some months ago.
| (Or it might have been "comp.lang.c".)
|
| Dan Sacks famously wrote about it and recommended
| "T const" IIRC.
|
| I now found something new about this topic in the
| "Library Design Guidlines" for C++: quote
|
|const goes in the wrong place, i.e., to the left
|
| unquote. Thus, they acknowledge what Dan Sacks
| wrote by using "wrong", only to then recommend
| the opposite, i.e., "left".
|
| Why? Maybe because tradition dictates this. (No
| rationale is given there.)

The C++ Core Guidelines do not seem to discuss
this, but always seem to use const on the left.

The article by Dan Sacks is called »const T vs. T
const« and was or is at www.ultimeth.com/Feb1999.pdf.

(It is the same Dan Sacks who asked for a definition
of »type« recently.)

#1NEWBIEHERE question about "const" before multiple commas and objects

Momo N <ninetynineknights@gmail.com>: Dec 30 04:27PM -0800

so the question is this:

//CONST int ci = 0, &cj = ci;

so does the const entitle every object after its created? does this read "ci is a const int" and "cj is a reference to a const int" as well? or is &cj not a const and just means "a reference to ci"?

if the const DOES encompass the entire line, what about this...

//int j = i; CONST int &k = i; int *p = &i;

now that the const is in the middle, does it ignore the first object and create a const for everything in front of it? including the next object?

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Dec 31 01:51AM +0100

On 31.12.2016 01:27, Momo N wrote:

> //CONST int ci = 0, &cj = ci;

> so does the const entitle every object after its created? does this
> read "ci is a const int" and "cj is a reference to a const int"

Yes.

> as well? or is &cj not a const and just means "a reference to ci"?

It couldn't be: a reference to non-const and be bound to a const object,
as that would allow you to modify a const object.

> now that the const is in the middle, does it ignore the first object
> and create a const for everything in front of it? including the next
> object?

First off, C++ is case-sensitive. It has to be `const`. Unless you
defined `CONST` as a macro, to be replaced with `const` by the preprocessor.

Semicolons are much stronger delimiters than commas.

The first example is a single declaration statement, with multiple
declarators.

The second example is three declaration statements.

`const T x;` is a shorthand for `T const x`, permitted only at the very
start of the declaration. I think it's unfortunate, but all the examples
in the Holy Standard use prefix `const`. The more general notation is
needed for pointer and function declarations.

For the general notation pointer declaration, just read it backwards. :)

E.g.

char const* p;

declares `p` as a mutable pointer to `const` char, while

char c;
char* const p = &c;

declares p as a constant pointer to mutable char.

Cheers & hth.,

- Alf

You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

soft and program

Saturday, December 31, 2016

Digest for comp.lang.c++@googlegroups.com - 10 updates in 6 topics

No comments:

Blog Archive

About Me