Chapter 5       #### Section 5.4

1. In a BST, the time for an insertion or deletion is the time required to visit each node from the root down to the node of interest, plus some time to perform the operation itself. Functions bst_probe() and bst_delete() contain only a single loop each, which iterates once for each node examined. As the tree grows, the time for the actual operation loses significance and the total time for the operation becomes essentially proportional to the height of the tree, which is approximately log2 (n) in the best case (see Analysis of AVL Balancing Rule).

We were given that the additional work for rebalancing an AVL or red-black tree is at most a constant amount multiplied by the height of the tree. Furthermore, the maximum height of an AVL tree is 1.44 times the maximum height for the corresponding perfectly balanced binary tree, and a red-black tree has a similar bound on its height. Therefore, for trees with many nodes, the worst-case time required to insert or delete an item in a balanced tree is a constant multiple of the time required for the same operation on an unbalanced BST in the best case. In the formal terms of computer science, insertion and deletion in a balanced tree are O(log n) operations, where n is the number of nodes in the tree.

In practice, operations on balanced trees of reasonable size are, at worst, not much slower than operations on unbalanced binary trees and, at best, much faster.

#### Section 5.4.2

1. Variable y is only modified within <Step 1: Search AVL tree for insertion point 148>. If y is set during the loop, it is set to p, which is always a non-null pointer within the loop. So y can only be NULL if it is last set before the loop begins. If that is true, it will be NULL only if tree->avl_root == NULL. So, variable y can only be NULL if the AVL tree was empty before the insertion.

A NULL value for y is a special case because later code assumes that y points to a node.

#### Section 5.4.3

1. No. Suppose that n is the new node, that p is its parent, and that p has a - balance factor before n's insertion (a similar argument applies if p's balance factor is +). Then, for n's insertion to decrease p's balance factor to -2, n would have to be the left child of p. But if p had a - balance factor before the insertion, it already had a left child, so n cannot be the new left of p. This is a contradiction, so case 3 will never be applied to the parent of a newly inserted node.

2. In the leftmost tree, case 2 applies to the root's left child and the root's balance factor does not change. In the middle tree, case 1 applies to the root's left child and case 2 applies to the root. In the rightmost tree, case 1 applies to the root's left child and case 3 applies to the root. The tree on the right requires rebalancing, and the others do not.

3. Type char may be signed or unsigned, depending on the C compiler and/or how the C compiler is run. Also, a common use for subscripting an array with a character type is to translate an arbitrary character to another character or a set of properties. For example, this is a common way to implement the standard C functions from ctype.h. This means that subscripting such an array with a char value can have different behavior when char changes between signed and unsigned with different compilers (or with the same compiler invoked with different options).

4. Here is one possibility:

```650. <Step 3: Update balance factors after AVL insertion, with bitmasks 650> =
for (p = y; p != n; p = p->avl_link[cache & 1], cache >>= 1)
if ((cache & 1) == 0)
p->avl_balance–;
else     p->avl_balance++;
```

Also, replace the declarations of da[] and k by these:

```unsigned long cache = 0; /* Cached comparison results. */
int k = 0;              /* Number of cached comparison results. */
```

and replace the second paragraph of code within the loop in step 1 by this:

```if (p->avl_balance != 0)
z = q, y = p, cache = 0, k = 0;

dir = cmp > 0;
if (dir)
cache |= 1ul << k;
k++;
```

It is interesting to note that the speed difference between this version and the standard version was found to be negligible, when compiled with full optimization under GCC (both 2.95.4 and 3.0.3) on x86.

#### Section 5.4.4

1. Because then y's right subtree would have height 1, so there's no way that y could have a +2 balance factor.

2. The value of y is set during the search for item to point to the closest node above the insertion point that has a nonzero balance factor, so any node below y along this search path, including x, must have had a 0 balance factor originally. All such nodes are updated to have a nonzero balance factor later, during step 3. So x must have either a - or + balance factor at the time of rebalancing.

3.1. 3.2. 3.3. 4. w should replace y as the left or right child of z. y != z->avl_link has the value 1 if y is the right child of z, or 0 if y is the left child. So the overall expression replaces y with w as a child of z.

The suggested substitution is a poor choice because if z == (struct avl_node *) &tree->root, z->avl_link is undefined.

5. Yes.

#### Section 5.5.2

1. This approach cannot be used in libavl (see Exercise 4.8-3).

```651. <Case 3 in AVL deletion, alternate version 651> =
struct avl_node *s;

da[k] = 1;
pa[k++] = p;
for (;;)   {
da[k] = 0;
pa[k++] = r;
break;

r = s;
}
p->avl_data = s->avl_data;
p = s;
```

2. We could, if we use the standard libavl code for deletion case 3. The alternate version in Exercise 1 modifies item data, which would cause the wrong value to be returned later.

#### Section 5.5.4

1. Tree y started out with a + balance factor, meaning that its right subtree is taller than its left. So, even if y's left subtree had height 0, its right subtree has at least height 1, meaning that y must have at least one right child.

2. Rebalancing is required at each level if, at every level of the tree, the deletion causes a +2 or -2 balance factor at a node p while there is a +1 or -1 balance factor at p's child opposite the deletion.

For example, consider the AVL tree below: Deletion of node 32 in this tree leads to a -2 balance factor on the left side of node 31, causing a right rotation at node 31. This shortens the right subtree of node 28, causing it to have a -2 balance factor, leading to a right rotation there. This shortens the right subtree of node 20, causing it to have a -2 balance factor, forcing a right rotation there, too. Here is the final tree: Incidentally, our original tree was an example of a “Fibonacci tree”, a kind of binary tree whose form is defined recursively, as follows. A Fibonacci tree of order 0 is an empty tree and a Fibonacci tree of order 1 is a single node. A Fibonacci tree of order n >= 2 is a node whose left subtree is a Fibonacci tree of order n - 1 and whose right subtree is a Fibonacci tree of order n - 2. Our example is a Fibonacci tree of order 7. Any big-enough Fibonacci tree will exhibit this pathological behavior upon AVL deletion of its maximum node.

#### Section 5.6

1. At this point in the code, p points to the avl_data member of an struct avl_node. We want a pointer to the struct avl_node itself. To do this, we just subtract the offset of the avl_data member within the structure. A cast to char * is necessary before the subtraction, because offsetof returns a count of bytes, and a cast to struct avl_node * afterward, to make the result the right type.