|5.4.3 Step 3: Update Balance Factors|
When we add a new node n to an AVL tree, the balance factor of n's parent must change, because the new node increases the height of one of the parent's subtrees. The balance factor of n's parent's parent may need to change, too, depending on the parent's balance factor, and in fact the change can propagate all the way up the tree to its root.
At each stage of updating balance factors, we are in a similar situation. First, we are examining a particular node p that is one of n's direct ancestors. The first time around, p is n's parent, the next time, if necessary, p is n's grandparent, and so on. Second, the height of one of p's subtrees has increased, and which one can be determined using da.
In general, if the height of p's left subtree increases, p's balance factor decreases. On the other hand, if the right subtree's height increases, p's balance factor increases. If we account for the three possible starting balance factors and the two possible sides, there are six possibilities. The three of these corresponding to an increase in one subtree's height are symmetric with the others that go along with an increase in the other subtree's height. We treat these three cases below.
If p had balance factor 0, its new balance factor is - or +, depending on the side of the root to which the node was added. After that, the change in height propagates up the tree to p's parent (unless p is the tree's root) because the height of the subtree rooted at p's parent has also increased.
The example below shows a new node n inserted as the left child of a node with balance factor 0. On the far left is the original tree before insertion; in the middle left is the tree after insertion but before any balance factors are adjusted; in the middle right is the tree after the first adjustment, with p as n's parent; on the far right is the tree after the second adjustment, with p as n's grandparent. Only in the trees on the far left and far right are all of the balance factors correct.
If the new node was added to p's shorter subtree, then the subtree has become more balanced and its balance factor becomes 0. If p started out with balance factor +, this means the new node is in p's left subtree. If p had a - balance factor, this means the new node is in the right subtree. Since tree p has the same height as it did before, the change does not propagate up the tree any farther, and we are done. Here's an example that shows pre-insertion and post-balance factor updating views:
If the new node was added on the taller side of a subtree with nonzero balance factor, the balance factor becomes +2 or -2. This is a problem, because balance factors in AVL trees must be between -1 and +1. We have to rebalance the tree in this case. We will cover rebalancing later. For now, take it on faith that rebalancing does not increase the height of subtree p as a whole, so there is no need to propagate changes any farther up the tree.
Here's an example of an insertion that leads to rebalancing. On the left is the tree before insertion; in the middle is the tree after insertion and updating balance factors; on the right is the tree after rebalancing to. The -2 balance factor is shown as two minus signs (--). The rebalanced tree is the same height as the original tree before insertion.
As another demonstration that the height of a rebalanced subtree does not change after insertion, here's a similar example that has one more layer of nodes. The trees below follow the same pattern as the ones above, but the rebalanced subtree has a parent. Even though the tree's root has the wrong balance factor in the middle diagram, it turns out to be correct after rebalancing.
Looking at the rules above, we can see that only in case 1, where p's balance factor is 0, do changes to balance factors continue to propagate upward in the tree. So we can start from n's parent and move upward in the tree, handling case 1 each time, until we hit a nonzero balance factor, handle case 2 or case 3 at that node, and we're done (except for possible rebalancing afterward).
Wait a second—there is no efficient way to move upward in a binary search tree!1 Fortunately, there is another approach we can use. Remember the extra code we put into <Step 1: Search AVL tree for insertion point 148>? This code kept track of the last node we'd passed through that had a nonzero balance factor as s. We can use s to move downward, instead of upward, through the nodes whose balance factors are to be updated.
Node s itself is the topmost node to be updated; when we arrive at node n, we know we're done. We also kept track of the directions we moved downward in da. Suppose that we've got a node p whose balance factor is to be updated and a direction d that we moved from it. We know that if we moved down to the left (d == 0) then the balance factor must be decreased, and that if we moved down to the right (d == 1) then the balance factor must be increased.
Now we have enough knowledge to write the code to update balance factors. The results are almost embarrassingly short:
150. <Step 3: Update balance factors after AVL insertion 150> = for (p = y, k = 0; p != n; p = p->avl_link[da[k]], k++) if (da[k] == 0) p->avl_balance–; else
This code is included in 146, 301, and 419.
Now p points to the new node as a consequence of the loop's exit condition. Variable p will not be modified again in this function, so it is used in the function's final return statement to take the address of the new node's avl_data member (see <AVL item insertion function 146> above).
1. Can case 3 be applied to the parent of the newly inserted node? [answer]
2. For each of the AVL trees below, add a new node with a value smaller than any already in the tree and update the balance factors of the existing nodes. For each balance factor that changes, indicate the numbered case above that applies. Which of the trees require rebalancing after the insertion?
3. Earlier versions of libavl used chars, not unsigned chars, to cache the results of comparisons, as the elements of da are used here. At some warning levels, this caused the GNU C compiler to emit the warning “array subscript has type `char'” when it encountered expressions like q->avl_link[da[k]]. Explain why this can be a useful warning message. [answer]
4. If our AVL trees won't ever have a height greater than 32, then we can portably use the bits in a single unsigned long to compactly store what the entire da array does. Write a new version of step 3 to use this form, along with any necessary modifications to other steps and avl_probe()'s local variables. [answer]
 We could make a list of the nodes as we move down the tree and reuse it on the way back up. We'll do that for deletion, but there's a simpler way for insertion, so keep reading.