5.4.3 Step 3: Update Balance Factors |

When we add a new node *n* to an AVL tree, the balance factor of *n*'s
parent must change, because the new node increases the height of one
of the parent's subtrees. The balance factor of *n*'s parent's parent
may need to change, too, depending on the parent's balance factor, and
in fact the change can propagate all the way up the tree to its root.

At each stage of updating balance factors, we are in a similar
situation. First, we are examining a particular node *p* that is one
of *n*'s direct ancestors. The first time around, *p* is *n*'s
parent, the next time, if necessary, *p* is *n*'s grandparent, and so
on. Second, the height of one of *p*'s subtrees has increased, and
which one can be determined using *da*[].

In general, if the height of *p*'s left subtree increases, *p*'s
balance factor decreases. On the other hand, if the right subtree's
height increases, *p*'s balance factor increases. If we account for
the three possible starting balance factors and the two possible
sides, there are six possibilities. The three of these corresponding
to an increase in one subtree's height are symmetric with the others
that go along with an increase in the other subtree's height. We
treat these three cases below.

If *p* had balance factor 0, its new balance factor is - or +,
depending on the side of the root to which the node was added. After
that, the change in height propagates up the tree to *p*'s parent
(unless *p* is the tree's root) because the height of the subtree rooted
at *p*'s parent has also increased.

The example below shows a new node *n* inserted as the left child of a
node with balance factor 0. On the far left is the original tree before
insertion; in the middle left is the tree after insertion but before any
balance factors are adjusted; in the middle right is the tree after the
first adjustment, with *p* as *n*'s parent; on the far right is the tree
after the second adjustment, with *p* as *n*'s grandparent. Only in the
trees on the far left and far right are all of the balance factors
correct.

If the new node was added to *p*'s shorter subtree, then the subtree has
become more balanced and its balance factor becomes 0. If *p* started
out with balance factor +, this means the new node is in *p*'s left
subtree. If *p* had a - balance factor, this means the new node is in
the right subtree. Since tree *p* has the same height as it did before,
the change does not propagate up the tree any farther, and we are done.
Here's an example that shows pre-insertion and post-balance factor
updating views:

If the new node was added on the taller side of a subtree with nonzero
balance factor, the balance factor becomes +2 or -2. This is a
problem, because balance factors in AVL trees must be between -1 and
+1. We have to rebalance the tree in this case. We will cover
rebalancing later. For now, take it on faith that rebalancing does
not increase the height of subtree *p* as a whole, so there is no need
to propagate changes any farther up the tree.

Here's an example of an insertion that leads to rebalancing. On the left is the tree before insertion; in the middle is the tree after insertion and updating balance factors; on the right is the tree after rebalancing to. The -2 balance factor is shown as two minus signs (--). The rebalanced tree is the same height as the original tree before insertion.

As another demonstration that the height of a rebalanced subtree does not change after insertion, here's a similar example that has one more layer of nodes. The trees below follow the same pattern as the ones above, but the rebalanced subtree has a parent. Even though the tree's root has the wrong balance factor in the middle diagram, it turns out to be correct after rebalancing.

Looking at the rules above, we can see that only in case 1, where *p*'s
balance factor is 0, do changes to balance factors continue to propagate
upward in the tree. So we can start from *n*'s parent and move upward
in the tree, handling case 1 each time, until we hit a nonzero balance
factor, handle case 2 or case 3 at that node, and we're done (except for
possible rebalancing afterward).

Wait a second—there is no efficient way to move upward in a binary
search tree!^{1} Fortunately,
there is another approach we can use. Remember the extra code we put
into <Step 1: Search AVL tree for insertion point 150>? This code kept
track of the last node we'd passed through that had a nonzero balance
factor as *y*. We can use *y* to move downward, instead of upward,
through the nodes whose balance factors are to be updated.

Node *y* itself is the topmost node to be updated; when we arrive at
node *n*, we know we're done. We also kept track of the directions we
moved downward in *da*[]. Suppose that we've got a node *p* whose
balance factor is to be updated and a direction *d* that we moved from
it. We know that if we moved down to the left (*d* == 0) then the
balance factor must be decreased, and that if we moved down to the right
(*d* == 1) then the balance factor must be increased.

Now we have enough knowledge to write the code to update balance factors. The results are almost embarrassingly short:

152. <Step 3: Update balance factors after AVL insertion 152> =for(p=y,k= 0;p!=n;p=p->avl_link[da[k]],k++)if(da[k] == 0)p->avl_balance--;else

p->avl_balance++;

This code is included in 148, 303, and 421.

Now *p* points to the new node as a consequence of the loop's exit
condition. Variable *p* will not be modified again in this function, so
it is used in the function's final **return** statement to take the
address of the new node's *avl_data* member (see <AVL item insertion function 148> above).

**Exercises:**

**1.** Can case 3 be applied to the parent of the newly inserted node?
[answer]

**2.** For each of the AVL trees below, add a new node with a value smaller
than any already in the tree and update the balance factors of the
existing nodes. For each balance factor that changes, indicate the
numbered case above that applies. Which of the trees require
rebalancing after the insertion?

[answer]

**3.** Earlier versions of libavl used **char**s, not **unsigned** **char**s, to
cache the results of comparisons, as the elements of *da*[] are used
here. At some warning levels, this caused the GNU C compiler to emit
the warning “array subscript has type `char'” when it
encountered expressions like *q*->*avl_link*[*da*[*k*]]. Explain why this
can be a useful warning message.
[answer]

**4.** If our AVL trees won't ever have a height greater than 32, then we can
portably use the bits in a single **unsigned** **long** to compactly store
what the entire *da*[] array does. Write a new version of step 3 to
use this form, along with any necessary modifications to other steps
and *avl_probe*()'s local variables.
[answer]

[1] We could make a list of the nodes as we move down the tree and reuse it on the way back up. We'll do that for deletion, but there's a simpler way for insertion, so keep reading.