5.4 Insertion |
The insertion function for unbalanced BSTs does not maintain the AVL balancing rule, so we have to write a new insertion function. But before we get into the nitty-gritty details, let's talk in generalities. This is time well spent because we will be able to apply many of the same insights to AVL deletion and insertion and deletion in red-black trees.
Conceptually, there are two stages to any insertion or deletion operation in a balanced tree. The first stage may lead to violation of the tree's balancing rule. If so, we fix it in the second stage. The insertion or deletion itself is done in the first stage, in much the same way as in an unbalanced BST, and we may also do a bit of additional bookkeeping work, such as updating balance factors in an AVL tree, or swapping node “colors” in red-black trees.
If the first stage of the operation does not lead to a violation of the tree's balancing rule, nothing further needs to be done. But if it does, the second stage rearranges nodes and modifies their attributes to restore the tree's balance. This process is said to rebalance (see rebalance) the tree. The kinds of rebalancing that might be necessary depend on the way the operation is performed and the tree's balancing rule. A well-chosen balancing rule helps to minimize the necessity for rebalancing.
When rebalancing does become necessary in an AVL or red-black tree, its effects are limited to the nodes along or near the direct path from the inserted or deleted node up to the root of the tree. Usually, only one or two of these nodes are affected, but, at most, one simple manipulation is performed at each of the nodes along this path. This property ensures that balanced tree operations are efficient (see Exercise 1 for details).
That's enough theory for now. Let's return to discussing the details of AVL insertion. There are four steps in libavl's implementation of AVL insertion:
Steps 1 and 2 are the same as for insertion into a BST. Step 3 performs the additional bookkeeping alluded to above in the general description of balanced tree operations. Finally, step 4 rebalances the tree, if necessary, to restore the AVL balancing rule.
The following sections will cover all the details of AVL insertion. For now, here's an outline of avl_probe():
148. <AVL item insertion function 148> = void **
avl_probe (struct avl_table *tree, void *item)
{ <avl_probe() local variables 149> assert (tree != NULL && item != NULL); <Step 1: Search AVL tree for insertion point 150> <Step 2: Insert AVL node 151> <Step 3: Update balance factors after AVL insertion 152> <Step 4: Rebalance after AVL insertion 153> }
This code is included in 147.
149. <avl_probe() local variables 149> = struct avl_node *y, *z; /* Top node to update balance factor, and parent. */ struct avl_node *p, *q; /* Iterator, and parent. */ struct avl_node *n; /* Newly inserted node. */ struct avl_node *w; /* New root of rebalanced subtree. */ int dir; /* Direction to descend. */ unsigned char da[AVL_MAX_HEIGHT]; /* Cached comparison results. */ int k = 0; /* Number of cached results. */
This code is included in 148, 303, and 421.
See also: [Knuth 1998b], algorithm 6.2.3A.
Exercises:
*1. When rebalancing manipulations are performed on the chain of nodes from the inserted or deleted node to the root, no manipulation takes more than a fixed amount of time. In other words, individual manipulations do not involve any kind of iteration or loop. What can you conclude about the speed of an individual insertion or deletion in a large balanced tree, compared to the best-case speed of an operation for unbalanced BSTs? [answer]