Chapter 4 |

**1.**
This construct makes <`bst.h` 25> idempotent (see idempotent), that is, including it
many times has the same effect as including it once. This is important
because some C constructs, such as type definitions with **typedef**, are
erroneous if included in a program multiple times.

Of course, <Table assertion function control directives 595> is included
outside the #**ifndef**-protected part of <`bst.h` 25>. This is
intentional (see Exercise 2.9-1 for details).

**1.**
Under many circumstances we often want to know how many items are in a
binary tree. In these cases it's cheaper to keep track of item counts
as we go instead of counting them each time, which requires a full
binary tree traversal.

It would be better to omit it if we never needed to know how many items were in the tree, or if we only needed to know very seldom.

**1.**
The purpose for conditional definition of `BST_MAX_HEIGHT` is not to
keep it from being redefined if the header file is included multiple
times. There's a higher-level “include guard” for that (see
Exercise 4-1), and, besides, identical definitions of a macro
are okay in C. Instead, it is to allow the user to set the maximum
height of binary trees by defining that macro before <`bst.h` 25> is
#**include**d. The limit can be adjusted upward for larger computers or
downward for smaller ones.

The main pitfall is that a user program will use different values of
`BST_MAX_HEIGHT` in different source files. This leads to undefined
behavior. Less of a problem are definitions to invalid values, which
will be caught at compile time by the compiler.

**2.**
The functions need to adjust the pointer from the rotated subtree's
parent, so they take a double-pointer **struct** **bst_node** **. An
alternative would be to accept two parameters: the rotated subtree's
parent node and the *bst_link*[] index of the subtree.

/* Rotates right at *yp. */staticvoidrotate_right(structbst_node**yp)

{structbst_node*y= *yp;structbst_node*x=y->bst_link[0];y->bst_link[0] =x->bst_link[1];x->bst_link[1] =y; *yp=x; }

/* Rotates left at *xp. */staticvoidrotate_left(structbst_node**xp)

{structbst_node*x= *xp;structbst_node*y=x->bst_link[1];x->bst_link[1] =y->bst_link[0];y->bst_link[0] =x; *xp=y; }

**1.**
This is a dirty trick. The *bst_root* member of **struct** **bst_table** is
not a **struct** **bst_node**, but we are pretending that it is by casting its
address to **struct** **bst_node** *. We can get away with this only because
the first member of **struct** **bst_node** * is *bst_link*, whose first
element *bst_link*[0] is a **struct** **bst_node** *, the same type as
*bst_root*. ANSI C guarantees that a pointer to a structure is a
pointer to the structure's first member, so this is fine as long as we
never try to access any member of **p* except *bst_link*[0]. Trying to
access other members would result in undefined behavior.

The reason that we want to do this at all is that it means that the
tree's root is not a special case. Otherwise, we have to deal with the
root separately from the rest of the nodes in the tree, because of its
special status as the only node in the tree not pointed to by the
*bst_link*[] member of a **struct** **bst_node**.

It is a good idea to get used to these kinds of pointer cast, because they are common in libavl.

As an alternative, we can declare an actual instance of **struct** **bst_node**, store the tree's *bst_root* into its *bst_link*[0], and copy
its possibly updated value back into *bst_root* when done. This isn't
very elegant, but it works. This technique is used much later in this
book, in <TBST main copy function 281>. A different kind of alternative
approach is used in Exercise 2.

**2.**
Here, pointer-to-pointer *q* traverses the tree, starting with a pointer
to the root, comparing each node found against *item* while looking for
a null pointer. If an item equal to *item* is found, it returns a
pointer to the item's data. Otherwise, *q* receives the address of the
`NULL` pointer that becomes the new node, the new node is created, and a
pointer to its data is returned.

622. <BST item insertion function, alternate version 622> =void**bst_probe(structbst_table*tree,void*item)

{structbst_node**q;intcmp;assert(tree!=NULL&&item!=NULL);for(q= &tree->bst_root; *q!=NULL;q= &(*q)->bst_link[cmp> 0])

{cmp=tree->bst_compare(item, (*q)->bst_data,tree->bst_param);if(cmp== 0)return&(*q)->bst_data; } *q=tree->bst_alloc->libavl_malloc(tree->bst_alloc,sizeof**q);if(*q==NULL)returnNULL; (*q)->bst_link[0] = (*q)->bst_link[1] =NULL; (*q)->bst_data=item;tree->bst_count++;return&(*q)->bst_data; }

**3.**
The first item to be inserted have the value of the original tree's
root. After that, at each step, we can insert either an item with the
value of either child *x* of any node in the original tree
corresponding to a node *y* already in the copy tree, as long as *x*'s
value is not already in the copy tree.

**4.**
The function below traverses *tree* in “level order”. That is, it
visits the root, then the root's children, then the children of the
root's children, and so on, so that all the nodes at a particular
level in the tree are visited in sequence.

**See also:**
[Sedgewick 1998], Program 5.16.

623. <Level-order traversal 623> = /* Callsvisitfor each of the nodes intreein level order. Returns nonzero if successful, zero if out of memory. */staticintbst_traverse_level_order(structbst_table*tree,bst_item_func*visit)

{structbst_node**queue;size_thead,tail;if(tree->bst_count== 0)return1;queue=tree->bst_alloc->libavl_malloc(tree->bst_alloc,

sizeof*queue*tree->bst_count);if(queue==NULL)return0;head=tail= 0;queue[head++] =tree->bst_root;while(head!=tail)

{structbst_node*cur=queue[tail++];visit(cur->bst_data,tree->bst_param);if(cur->bst_link[0] !=NULL)queue[head++] =cur->bst_link[0];if(cur->bst_link[1] !=NULL)queue[head++] =cur->bst_link[1]; }tree->bst_alloc->libavl_free(tree->bst_alloc,queue);return1; }

624. <Root insertion of existing node in arbitrary subtree 624> = /* Performs root insertion ofnatrootwithintree. Subtreerootmust not contain a node matchingn. Returns nonzero only if successful. */staticintroot_insert(structbst_table*tree,structbst_node**root,structbst_node*n)

{structbst_node*pa[BST_MAX_HEIGHT]; /* Nodes on stack. */unsignedcharda[BST_MAX_HEIGHT]; /* Directions moved from stack nodes. */intk; /* Stack height. */structbst_node*p; /* Traverses tree looking for insertion point. */assert(tree!=NULL&&n!=NULL); <Step 1: Search for insertion point in arbitrary subtree 625> <Step 2: Insertninto arbitrary subtree 626> <Step 3: Move BST node to root 37>return1; }

625. <Step 1: Search for insertion point in arbitrary subtree 625> =pa[0] = (structbst_node*)root;da[0] = 0;k= 1;for(p= *root;p!=NULL;p=p->bst_link[da[k- 1]])

{intcmp=tree->bst_compare(n->bst_data,p->bst_data,tree->bst_param);assert(cmp!= 0);if(k>=BST_MAX_HEIGHT)return0;pa[k] =p;da[k++] =cmp> 0; }

This code is included in 624.

626. <Step 2: Insertninto arbitrary subtree 626> =pa[k- 1]->bst_link[da[k- 1]] =n;

This code is included in 624 and 627.

**2.**
The idea is to optimize for the common case but allow for fallback to
a slower algorithm that doesn't require a stack when necessary.

627. <Robust root insertion of existing node in arbitrary subtree 627> = /* Performs root insertion ofnatrootwithintree. Subtreerootmust not contain a node matchingn. Never fails and will not rebalancetree. */staticvoidroot_insert(structbst_table*tree,structbst_node**root,structbst_node*n)

{structbst_node*pa[BST_MAX_HEIGHT]; /* Nodes on stack. */unsignedcharda[BST_MAX_HEIGHT]; /* Directions moved from stack nodes. */intk; /* Stack height. */intoverflow= 0; /* Set nonzero if stack overflowed. */structbst_node*p; /* Traverses tree looking for insertion point. */assert(tree!=NULL&&n!=NULL); <Step 1: Robustly search for insertion point in arbitrary subtree 628> <Step 2: Insertninto arbitrary subtree 626> <Step 3: Robustly move BST node to root 629> }

If the stack overflows while we're searching for the insertion point,
we stop keeping track of any nodes but the last one and set *overflow*
so that later we know that overflow occurred:

628. <Step 1: Robustly search for insertion point in arbitrary subtree 628> =pa[0] = (structbst_node*)root;da[0] = 0;k= 1;for(p= *root;p!=NULL;p=p->bst_link[da[k- 1]])

{intcmp=tree->bst_compare(n->bst_data,p->bst_data,tree->bst_param);assert(cmp!= 0);if(k>=BST_MAX_HEIGHT)

{overflow= 1;k--; }pa[k] =p;da[k++] =cmp> 0; }

This code is included in 627.

Once we've inserted the node, we deal with the rotation in the same
way as before if there was no overflow. If overflow occurred, we
instead do the rotations one by one, with a full traversal from
**root* every time:

629. <Step 3: Robustly move BST node to root 629> =if(!overflow) {

<Step 3: Move BST node to root 37>

}else

{while(*root!=n)

{structbst_node**r; /* Link to node to rotate. */structbst_node*q; /* Node to rotate. */intdir;for(r=root; ;r= &q->bst_link[dir])

{q= *r;dir= 0 <tree->bst_compare(n->bst_data,q->bst_data,

tree->bst_param);if(q->bst_link[dir] ==n)break; }if(dir== 0)

{q->bst_link[0] =n->bst_link[1];n->bst_link[1] =q; }

else

{q->bst_link[1] =n->bst_link[0];n->bst_link[0] =q; } *r=n; } }

This code is included in 627.

**3.**
One insertion order that does *not* require much stack is
ascending order. If we insert 1...4 at the root in ascending
order, for instance, we get a BST that looks like this:

If we then insert node 5, it will immediately be inserted as the right child of 4, and then a left rotation will make it the root, and we're back where we started without ever using more than one stack entry. Other obvious pathological orders such as descending order and “zig-zag” order behave similarly.

One insertion order that does require an arbitrary amount of stack
space is to first insert 1...*n* in ascending order, then the
single item 0. Each of the first group of insertions requires only
one stack entry (except the first, which does not use any), but the
final insertion uses *n* - 1.

If we're interested in high average consumption of stack space, the
pattern consisting of a series of ascending insertions (*n* / 2 + 1)...*n* followed by a second ascending series 1...(*n* / 2),
for even *n*, is most effective. For instance, each insertion for
insertion order 6, 7, 8, 9, 10, 1, 2, 3, 4, 5 requires 0, 1, 1, 1, 1,
5, 6, 6, 6, 6 stack entries, respectively, for a total of 33.

These are, incidentally, the best possible results in each category, as determined by exhaustive search over the 10! == 3,628,800 possible root insertion orders for trees of 10 nodes. (Thanks to Richard Heathfield for suggesting exhaustive search.)

**1.**
Add this before the top-level **else** clause in <Step 2: Delete BST node 40>:

630. <Case 1.5 in BST deletion 630> =elseif(p->bst_link[0] ==NULL)

q->bst_link[dir] =p->bst_link[1];

**2.**
Be sure to look at Exercise 3 before actually making this
change.

631. <Case 3 in BST deletion, alternate version 631> =structbst_node*s=r->bst_link[0];while(s->bst_link[0] !=NULL)

{r=s;s=r->bst_link[0]; }p->bst_data=s->bst_data;r->bst_link[0] =s->bst_link[1];p=s;

We could, indeed, make similar changes to the other cases, but for these cases the code would become more complicated, not simpler.

**3.**
The semantics for libavl traversers only invalidate traversers with
the deleted item selected, but the revised code would actually free the
node of the successor to that item. Because **struct** **bst_traverser**
keeps a pointer to the **struct** **bst_node** of the current item, attempts
to use a traverser that had selected the successor of the deleted item
would result in undefined behavior.

Some other binary tree libraries have looser semantics on their traversers, so they can afford to use this technique.

**1.**
It would probably be faster to check before each call rather than after,
because this way many calls would be avoided. However, it might be more
difficult to maintain the code, because we would have to remember to
check for a null pointer before every call. For instance, the call to
*traverse_recursive*() within *walk*() might easily be overlooked.
Which is “better” is therefore a toss-up, dependent on a program's
goals and the programmer's esthetic sense.

632. <Recursive traversal of BST, using nested function 632> =voidwalk(structbst_table*tree,bst_item_func*action,void*param)

{void

traverse_recursive(structbst_node*node)

{if(node!=NULL)

{traverse_recursive(node->bst_link[0]);action(node->bst_data,param);traverse_recursive(node->bst_link[1]); } }assert(tree!=NULL&&action!=NULL);traverse_recursive(tree->bst_root); }

**1a.**
First of all, a minimal-height binary tree of *n* nodes has a
height (see height) of about
log2(n),
that is, starting from the root and moving only downward, you can visit
at most *n* nodes (including the root) without running out of nodes.
Examination of the code should reveal to you that only moving down to
the left pushes nodes on the stack and only moving upward pops nodes
off. What's more, the first thing the code does is move as far down to
the left as it can. So, the maximum height of the stack in a
minimum-height binary tree of *n* nodes is the binary tree's height, or,
again, about
log2(n).

**1b.**
If a binary tree has only left children, as does the BST on the left
below, the stack will grow as tall as the tree, to a height of *n*.
Conversely, if a binary tree has only right children, as does the BST on
the right below, no nodes will be pushed onto the stack at all.

**1c.**
It's only acceptable if it's known that the stack will not exceed the
fixed maximum height (or if the program aborting with an error is itself
acceptable). Otherwise, you should use a recursive method (but see part
(e) below), or a dynamically extended stack, or a balanced binary tree
library.

**1d.**
Keep in mind this is not the only way or necessarily the best way to
handle stack overflow. Our final code for tree traversal will rebalance
the tree when it grows too tall.

633. <Iterative traversal of BST, with dynamically allocated stack 633> =staticvoidtraverse_iterative(structbst_node*node,bst_item_func*action,void*param)

{structbst_node**stack=NULL;size_theight= 0;size_tmax_height= 0;for(;;)

{while(node!=NULL)

{if(height>=max_height)

{max_height=max_height* 2 + 8;stack=realloc(stack,sizeof*stack*max_height);if(stack==NULL)

{fprintf(stderr,"out of memory\n");exit(EXIT_FAILURE); } }stack[height++] =node;node=node->bst_link[0]; }if(height== 0)break;node=stack[--height];action(node->bst_data,param);node=node->bst_link[1]; }free(stack); }

**1e.**
Yes, *traverse_recursive*() can run out of memory, because its arguments
must be stored somewhere by the compiler. Given typical compilers, it
will consume more memory per call than *traverse_iterative*() will per
item on the stack, because each call includes two arguments not pushed
on *traverse_iterative*()'s stack, plus any needed compiler-specific
bookkeeping information.

**1.**
After calling *bst_balance*(), the structure of the binary tree may have
changed completely, so we need to “find our place” again by setting up
the traverser structure as if the traversal had been done on the
rebalanced tree all along. Specifically, members *node*,
*stack*[], and *height* of **struct** **traverser** need to be
updated.

It is easy to set up **struct** **traverser** in this way, given the
previous node in inorder traversal, which we'll call *prev*. Simply
search the tree from the new root to find this node. Along the way,
because the stack is used to record nodes whose left subtree we are
examining, push nodes onto the stack as we move left down the tree.
Member *node* receives *prev*->*bst_link*[1], just as it would have if
no overflow had occurred.

A small problem with this approach is that it requires knowing the
previous node in inorder, which is neither explicitly noted in **struct** **traverser** nor easy to find out. But it *is* easy to find out
the next node: it is the smallest-valued node in the binary tree rooted
at the node we were considering when the stack overflowed. (If you need
convincing, refer to the code for *next_item*() above: the **while** loop
descends to the left, pushing nodes as it goes, until it hits a `NULL`
pointer, then the node pushed last is popped and returned.) So we can
return this as the next node in inorder while setting up the traverser
to return the nodes after it.

Here's the code:

634. <Handle stack overflow during BST traversal 634> =structbst_node*prev, *iter;prev=node;while(prev->bst_link[0] !=NULL)prev=prev->bst_link[0];bst_balance(trav->table);trav->height= 0;for(iter=trav->table->bst_root;iter!=prev; )if(trav->table->bst_compare(prev->bst_data,iter->bst_data,

trav->table->bst_param) < 0)

{trav->stack[trav->height++] =iter;iter=iter->bst_link[0]; }else

iter=iter->bst_link[1];trav->node=iter->bst_link[1];returnprev->bst_data;

Without this code, it is not necessary to have member *table* in **struct** **traverser**.

**2.**
It is possible to write *prev_item*() given our current *next_item*(), but
the result is not very efficient, for two reasons, both related to the
way that **struct** **traverser** is used. First, the structure doesn't
contain a pointer to the current item. Second, its stack doesn't
contain pointers to trees that must be descended to the left to find a
predecessor node, only those that must be descended to the right to find
a successor node.

The next section will develop an alternate, more general method for traversal that avoids these problems.

**1.**
The *bst_probe*() function can't disturb any traversals. A change in
the tree is only problematic for a traverser if it deletes the currently
selected node (which is explicitly undefined: see Traversers) or if
it shuffles around any of the nodes that are on the traverser's stack.
An insertion into a tree only creates new leaves, so it can't cause
either of those problems, and there's no need to increment the
generation number.

The same logic applies to *bst_t_insert*(), presented later.

On the other hand, an insertion into the AVL and red-black trees
discussed in the next two chapters can cause restructuring of the tree
and thus potentially disturb ongoing traversals. For this reason, the
insertion functions for AVL and red-black trees *will* increment
the tree's generation number.

**2.**
First, *trav_refresh*() is only called from *bst_t_next*() and
*bst_t_prev*(), and these functions are mirrors of each other, so we
need only show it for one of them.

Second, all of the traverser functions check the stack height, so these
will not cause an item to be initialized at too high a height, nor will
*bst_t_next*() or *bst_t_prev*() increase the stack height above its
limit.

Since the traverser functions won't force a too-tall stack directly, this leaves the other functions. Only functions that modify the tree could cause problems, by pushing an item farther down in the tree.

There are only four functions that modify a tree. The insertion
functions *bst_probe*() and *bst_t_insert*() can't cause problems,
because they add leaves but never move around nodes. The deletion
function *bst_delete*() does move around nodes in case 3, but it always
moves them higher in the tree, never lower. Finally, *bst_balance*()
always ensures that all nodes in the resultant tree are within the
tree's height limit.

**3.**
This won't work because the stack may contain pointers to nodes that
have been deleted and whose memory have been freed. In ANSI C89 and
C99, any use of a pointer to an object after the end of its lifetime
results in undefined behavior, even seemingly innocuous uses such as
pointer comparisons. What's worse, the memory for the node may already
have been recycled for use for another, different node elsewhere in the
tree.

This approach does work if there are never any deletions in the tree, or if we use some kind of generation number for each node that we store along with each stack entry. The latter would be overkill unless comparisons are very expensive and the traversals in changing trees are common. Another possibility would be to somehow only select this behavior if there have been no deletions in the binary tree since the traverser was last used. This could be done, for instance, with a second generation number in the binary tree incremented only on deletions, with a corresponding number kept in the traverser.

The following reimplements *trav_refresh*() to include this
optimization. As noted, it will not work if there are any deletions in
the tree. It does work for traversers that must be refreshed due to,
e.g., rebalancing.

635. <BST traverser refresher, with caching 635> = /* Refreshes the stack of parent pointers intravand updates its generation number. Will *not* work if any deletions have occurred in the tree. */staticvoidtrav_refresh(structbst_traverser*trav)

{assert(trav!=NULL);trav->bst_generation=trav->bst_table->bst_generation;if(trav->bst_node!=NULL)

{bst_comparison_func*cmp=trav->bst_table->bst_compare;void*param=trav->bst_table->bst_param;structbst_node*node=trav->bst_node;structbst_node*i=trav->bst_table->bst_root;size_theight= 0;if(trav->bst_height> 0 &&i==trav->bst_stack[0])for(;height<trav->bst_height;height++)

{structbst_node*next=trav->bst_stack[height+ 1];if(i->bst_link[0] !=next&&i->bst_link[1] !=next)break;i=next; }while(i!=node)

{assert(height<BST_MAX_HEIGHT);assert(i!=NULL);trav->bst_stack[height++] =i;i=i->bst_link[cmp(node->bst_data,i->bst_data,param) > 0]; }trav->bst_height=height; } }

**1.**
It only calls itself if it runs out of stack space. Its call to
*bst_balance*() right before the recursive call ensures that the tree is
short enough to fit within the stack, so the recursive call cannot
overflow.

**1.**
The assignment statements are harmless, but *memcpy*() of overlapping
regions produces undefined behavior.

**1a.**
Notice the use of & instead of && below. This ensures that both
link fields get initialized, so that deallocation can be done in a
simple way. If && were used instead then we wouldn't have any way to
tell whether (**y*)->*bst_link*[1] had been initialized.

636. <Robust recursive copy of BST, take 1 636> = /* Stores in *ya new copy of tree rooted atx. Returns nonzero if successful, or zero if memory was exhausted.*/staticintbst_robust_copy_recursive_1(structbst_node*x,structbst_node**y)

{if(x!=NULL)

{ *y=malloc(sizeof**y);if(*y==NULL)return0; (*y)->bst_data=x->bst_data;if(!(bst_robust_copy_recursive_1(x->bst_link[0], &(*y)->bst_link[0]) &bst_robust_copy_recursive_1(x->bst_link[1],

&(*y)->bst_link[1])))

{bst_deallocate_recursive(*y); *y=NULL;return0; } }else

*y=NULL;return1; }

Here's a needed auxiliary function:

637. <Recursive deallocation function 637> =staticvoidbst_deallocate_recursive(structbst_node*node)

{if(node==NULL)return;bst_deallocate_recursive(node->bst_link[0]);bst_deallocate_recursive(node->bst_link[1]);free(node); }

638. <Robust recursive copy of BST, take 2 638> =staticstructbst_nodeerror_node; /* Makes and returns a new copy of tree rooted atx. If an allocation error occurs, returns &error_node. */staticstructbst_node*bst_robust_copy_recursive_2(structbst_node*x)

{structbst_node*y;if(x==NULL)returnNULL;y=malloc(sizeof*y);if(y==NULL)return&error_node;y->bst_data=x->bst_data;y->bst_link[0] =bst_robust_copy_recursive_2(x->bst_link[0]);y->bst_link[1] =bst_robust_copy_recursive_2(x->bst_link[1]);if(y->bst_link[0] == &error_node||y->bst_link[1] == &error_node)

{bst_deallocate_recursive(y);return&error_node; }returny; }

**2.**
Here's one way to do it, which is simple but perhaps not the fastest
possible.

639. <Robust recursive copy of BST, take 3 639> = /* Copies tree rooted atxtoy, which latter is allocated but not

yet initialized. Returns one if successful, zero if memory was exhausted. In the latter caseyis not freed but any partially allocated subtrees are. */staticintbst_robust_copy_recursive_3(structbst_node*x,structbst_node*y)

{y->bst_data=x->bst_data;if(x->bst_link[0] !=NULL)

{y->bst_link[0] =malloc(sizeof*y->bst_link[0]);if(y->bst_link[0] ==NULL)return0;if(!bst_robust_copy_recursive_3(x->bst_link[0],y->bst_link[0]))

{free(y->bst_link[0]);return0; } }else

y->bst_link[0] =NULL;if(x->bst_link[1] !=NULL)

{y->bst_link[1] =malloc(sizeof*y->bst_link[1]);if(y->bst_link[1] ==NULL)return0;if(!bst_robust_copy_recursive_3(x->bst_link[1],y->bst_link[1]))

{bst_deallocate_recursive(y->bst_link[0]);free(y->bst_link[1]);return0; } }else

y->bst_link[1] =NULL;return1; }

640. <Intermediate step betweenbst_copy_recursive_2() andbst_copy_iterative() 640> = /* Copiesorgto a newly created tree, which is returned. */structbst_table*bst_copy_iterative(conststructbst_table*org)

{structbst_node*stack[2 * (BST_MAX_HEIGHT+ 1)];intheight= 0;structbst_table*new;conststructbst_node*x;structbst_node*y;new=bst_create(org->bst_compare,org->bst_param,org->bst_alloc);new->bst_count=org->bst_count;if(new->bst_count== 0)returnnew;x= (conststructbst_node*) &org->bst_root;y= (structbst_node*) &new->bst_root;for(;;)

{while(x->bst_link[0] !=NULL)

{y->bst_link[0] =

org->bst_alloc->libavl_malloc(org->bst_alloc,sizeof*y->bst_link[0]);stack[height++] = (structbst_node*)x;stack[height++] =y;x=x->bst_link[0];y=y->bst_link[0]; }y->bst_link[0] =NULL;for(;;)

{y->bst_data=x->bst_data;if(x->bst_link[1] !=NULL)

{y->bst_link[1] =

org->bst_alloc->libavl_malloc(org->bst_alloc,sizeof*y->bst_link[1]);x=x->bst_link[1];y=y->bst_link[1];break; }else

y->bst_link[1] =NULL;if(height<= 2)returnnew;y=stack[--height];x=stack[--height]; } } }

**1.**
*bst_copy*() can set *bst_data* to `NULL` when memory allocation fails.

**1.**
Factoring out recursion is troublesome in this case. Writing the loop
with an explicit stack exposes more explicitly the issue of stack
overflow. Failure on stack overflow is not acceptable, because it
would leave both trees in disarray, so we handle it by dropping back
to a slower algorithm that does not require a stack.

This code also makes use of *root_insert*() from <Robust root insertion of existing node in arbitrary subtree 627>.

641. <BST join function, iterative version 641> = /* Adds totreeall the nodes in the tree rooted atp. */staticvoidfallback_join(structbst_table*tree,structbst_node*p)

{structbst_node*q;for(;p!=NULL;p=q)if(p->bst_link[0] ==NULL)

{q=p->bst_link[1];p->bst_link[0] =p->bst_link[1] =NULL;root_insert(tree, &tree->bst_root,p); }else

{q=p->bst_link[0];p->bst_link[0] =q->bst_link[1];q->bst_link[1] =p; } } /* Joinsaandb, which must be disjoint and have compatible

comparison functions.bis destroyed in the process. */voidbst_join(structbst_table*ta,structbst_table*tb)

{size_tcount=ta->bst_count+tb->bst_count;if(ta->bst_root==NULL)ta->bst_root=tb->bst_root;elseif(tb->bst_root!=NULL)

{structbst_node**pa[BST_MAX_HEIGHT];structbst_node*qa[BST_MAX_HEIGHT];intk= 0;pa[k] = &ta->bst_root;qa[k++] =tb->bst_root;while(k> 0)

{structbst_node**a=pa[--k];structbst_node*b=qa[k];for(;;)

{structbst_node*b0=b->bst_link[0];structbst_node*b1=b->bst_link[1];b->bst_link[0] =b->bst_link[1] =NULL;root_insert(ta,a,b);if(b1!=NULL)

{if(k<BST_MAX_HEIGHT)

{pa[k] = &(*a)->bst_link[1];qa[k] =b1;if(*pa[k] !=NULL)k++;else

*pa[k] =qa[k]; }

else

{intj;fallback_join(ta,b0);fallback_join(ta,b1);for(j= 0;j<k;j++)fallback_join(ta,qa[j]);ta->bst_count=count;free(tb);bst_balance(ta);return; } }a= &(*a)->bst_link[0];b=b0;if(*a==NULL)

{ *a=b;break; }

elseif(b==NULL)break; } } }ta->bst_count=count;free(tb); }

**1.**
Functions not used at all are *bst_insert*(), *bst_replace*(),
*bst_t_replace*(), *bst_malloc*(), and *bst_free*().

Functions used explicitly within *test*() or functions that it calls are
*bst_create*(), *bst_find*(), *bst_probe*(), *bst_delete*(),
*bst_t_init*(), *bst_t_first*(), *bst_t_last*(),
*bst_t_insert*(), *bst_t_find*(), *bst_t_copy*(), *bst_t_next*(), *bst_t_prev*(),
*bst_t_cur*(), *bst_copy*(), and *bst_destroy*().

The *trav_refresh*() function is called indirectly by modifying the tree
during traversal.

The *copy_error_recovery*() function is called if a memory allocation
error occurs during *bst_copy*(). The *bst_balance*() function, and
therefore also *tree_to_vine*(), *vine_to_tree*(), and *compress*(), are
called if a stack overflow occurs. It is possible to force both these
behaviors with command-line options to the test program.

**2.**
Some kinds of errors mean that we can keep going and test other parts
of the code. Other kinds of errors mean that something is deeply
wrong, and returning without cleanup is the safest action short of
terminating the program entirely. The third category is memory
allocation errors. In our test program these are always caused
intentionally in order to test out the BST functions' error recovery
abilities, so a memory allocation error is not really an error at all,
and we clean up and return successfully. (A real memory allocation
error will cause the program to abort in the memory allocator. See
the definition of *mt_allocate*() within <Memory tracker 127>.)

**1.**
The definition of **size_t** differs from one compiler to the next. All
we know about it for sure is that it's an unsigned type appropriate for
representing the size of an object. So we must convert it to some known
type in order to pass it to *printf*(), because *printf*(), having a
variable number of arguments, does not know what type to convert it
into.

Incidentally, C99 solves this problem by providing a ``z`' modifier
for *printf*() conversions, so that we could use `"%zu"` to print out
**size_t** values without the need for a cast.

**See also:**
[ISO 1999], section 7.19.6.1.

642. <Generate random permutation of integers 642> = /* Fills thenelements ofarray[] with a random permutation of the integers between 0 andn- 1. */staticvoidpermuted_integers(intarray[],size_tn)

{size_ti;for(i= 0;i<n;i++)array[i] =i;for(i= 0;i<n;i++)

{size_tj=i+ (unsigned)rand() / (RAND_MAX/ (n-i) + 1);intt=array[j];array[j] =array[i];array[i] =t; } }

This code is included in 644.

**2.**
All it takes is a preorder traversal. If the code below is confusing,
try looking back at <Initialize *smaller* and *larger* within binary search tree 618>.

643. <Generate permutation for balanced tree 643> = /* Generates a list of integers that produce a balanced tree when inserted in order into a binary tree in the usual way.minandmaxinclusively bound the values to be inserted. Output is deposited starting at *array. */staticvoidgen_balanced_tree(intmin,intmax,int**array)

{inti;if(min>max)return;i= (min+max+ 1) / 2; *(*array)++ =i;gen_balanced_tree(min,i- 1,array);gen_balanced_tree(i+ 1,max,array); }

This code is included in 644.

644. <Insertion and deletion order generation 644> = <Generate random permutation of integers 642> <Generate permutation for balanced tree 643> /* Generates a permutation of the integers 0 ton- 1 intoinsert[] according toinsert_order. */staticvoidgen_insertions(size_tn,enuminsert_orderinsert_order,intinsert[])

{size_ti;switch(insert_order)

{caseINS_RANDOM:permuted_integers(insert,n);break;caseINS_ASCENDING:for(i= 0;i<n;i++)insert[i] =i;break;caseINS_DESCENDING:for(i= 0;i<n;i++)insert[i] =n-i- 1;break;caseINS_BALANCED:gen_balanced_tree(0,n- 1, &insert);break;caseINS_ZIGZAG:for(i= 0;i<n;i++)if(i% 2 == 0)

insert[i] =i/ 2;else

insert[i] =n-i/ 2 - 1;break;caseINS_ASCENDING_SHIFTED:for(i= 0;i<n;i++)

{insert[i] =i+n/ 2;if((size_t)insert[i] >=n)insert[i] -=n; }break;caseINS_CUSTOM:for(i= 0;i<n;i++)if(scanf("%d", &insert[i]) == 0)fail("error reading insertion order from stdin");break;default:assert(0); } } /* Generates a permutation of the integers 0 ton- 1 intodelete[] according todelete_orderandinsert[]. */staticvoidgen_deletions(size_tn,enumdelete_orderdelete_order,constint*insert,int*delete)

{size_ti;switch(delete_order)

{caseDEL_RANDOM:permuted_integers(delete,n);break;caseDEL_REVERSE:for(i= 0;i<n;i++)delete[i] =insert[n-i- 1];break;caseDEL_SAME:for(i= 0;i<n;i++)delete[i] =insert[i];break;caseDEL_CUSTOM:for(i= 0;i<n;i++)if(scanf("%d", &delete[i]) == 0)fail("error reading deletion order from stdin");break;default:assert(0); } }

This code is included in 98.

**4.**
The function below is carefully designed. It uses *time*() to obtain
the current time. The alternative *clock*() is a poor choice because
it measures CPU time used, which is often more or less constant among
runs. The actual value of a **time_t** is not portable, so it computes
a “hash” of the bytes in it using a multiply-and-add technique. The
factor used for multiplication normally comes out as 257, a prime and
therefore a good candidate.

**See also:**
[Knuth 1998a], section 3.2.1;
[Aho 1986], section 7.6.

645. <Random number seeding 645> = /* Choose and return an initial random seed based on the current time. Based on code by Lawrence Kirby <fred@genesis.demon.co.uk>. */unsignedtime_seed(void)

{time_ttimeval; /* Current time. */unsignedchar*ptr; /* Type punned pointed into timeval. */unsignedseed; /* Generated seed. */size_ti;timeval=time(NULL);ptr= (unsignedchar*) &timeval;seed= 0;for(i= 0;i<sizeoftimeval;i++)seed=seed* (UCHAR_MAX+ 2u) +ptr[i];returnseed; }

This code is included in 98.

646. <Overflow testers 125> +=staticinttest_bst_t_last(structbst_table*tree,intn)

{structbst_traversertrav;int*last;last=bst_t_last(&trav,tree);if(last==NULL|| *last!=n- 1)

{printf(" Last item test failed: expected %d, got %d\n",n- 1,last!=NULL? *last: -1);return0; }return1; }staticinttest_bst_t_find(structbst_table*tree,intn)

{inti;for(i= 0;i<n;i++)

{structbst_traversertrav;int*iter;iter=bst_t_find(&trav,tree, &i);if(iter==NULL|| *iter!=i)

{printf(" Find item test failed: looked for %d, got %d\n",i,iter!=NULL? *iter: -1);return0; } }return1; }staticinttest_bst_t_insert(structbst_table*tree,intn)

{inti;for(i= 0;i<n;i++)

{structbst_traversertrav;int*iter;iter=bst_t_insert(&trav,tree, &i);if(iter==NULL||iter== &i|| *iter!=i)

{printf(" Insert item test failed: inserted dup %d, got %d\n",i,iter!=NULL? *iter: -1);return0; } }return1; }staticinttest_bst_t_next(structbst_table*tree,intn)

{structbst_traversertrav;inti;bst_t_init(&trav,tree);for(i= 0;i<n;i++)

{int*iter=bst_t_next(&trav);if(iter==NULL|| *iter!=i)

{printf(" Next item test failed: expected %d, got %d\n",i,iter!=NULL? *iter: -1);return0; } }return1; }staticinttest_bst_t_prev(structbst_table*tree,intn)

{structbst_traversertrav;inti;bst_t_init(&trav,tree);for(i=n- 1;i>= 0;i--)

{int*iter=bst_t_prev(&trav);if(iter==NULL|| *iter!=i)

{printf(" Previous item test failed: expected %d, got %d\n",i,iter!=NULL? *iter: -1);return0; } }return1; }staticinttest_bst_copy(structbst_table*tree,intn)

{structbst_table*copy=bst_copy(tree,NULL,NULL,NULL);intokay=compare_trees(tree->bst_root,copy->bst_root);bst_destroy(copy,NULL);returnokay; }

**1.**
Attempting to apply an allocation policy to allocations of zero-byte
blocks is silly. How could a failure be indicated, given that one of
the successful results for an allocation of 0 bytes is `NULL`? At any
rate, libavl never calls *bst_allocate*() with a *size* argument of
0.

**See also:**
[ISO 1990], section 7.10.3.

**1.**
We'll use *bsts_*, short for “binary search tree with sentinel”, as
the prefix for these functions. First, we need node and tree
structures:

647. <BSTS structures 647> = /* Node for binary search tree with sentinel. */structbsts_node

{structbsts_node*link[2];intdata; }; /* Binary search tree with sentinel. */structbsts_tree

{structbsts_node*root;structbsts_nodesentinel;structlibavl_allocator*alloc; };

This code is included in 651.

Searching is simple:

648. <BSTS functions 648> = /* Returns nonzero only ifitemis intree. */intbsts_find(structbsts_tree*tree,intitem)

{conststructbsts_node*node;tree->sentinel.data=item;node=tree->root;while(item!=node->data)if(item<node->data)

node=node->link[0];else

node=node->link[1];returnnode!= &tree->sentinel; }

See also 649.

This code is included in 651.

Insertion is just a little more complex, because we have to keep track of the link that we just came from (alternately, we could divide the function into multiple cases):

649. <BSTS functions 648> += /* Insertsitemintotree, if it is not already present. */voidbsts_insert(structbsts_tree*tree,intitem)

{structbsts_node**q= &tree->root;structbsts_node*p=tree->root;tree->sentinel.data=item;while(item!=p->data)

{intdir=item>p->data;q= &p->link[dir];p=p->link[dir]; }if(p== &tree->sentinel)

{ *q=tree->alloc->libavl_malloc(tree->alloc,sizeof**q);if(*q==NULL)

{fprintf(stderr,"out of memory\n");exit(EXIT_FAILURE); } (*q)->link[0] = (*q)->link[1] = &tree->sentinel; (*q)->data=item; } }

Our test function will just insert a collection of integers, then make sure that all of them are in the resulting tree. This is not as thorough as it could be, and it doesn't bother to free what it allocates, but it is good enough for now:

650. <BSTS test 650> = /* Tests BSTS functions.insertanddeletemust contain some permutation of values 0...n- 1. */inttest_correctness(structlibavl_allocator*alloc,int*insert,int*delete,intn,intverbosity)

{structbsts_treetree;intokay= 1;inti;tree.root= &tree.sentinel;tree.alloc=alloc;for(i= 0;i<n;i++)bsts_insert(&tree,insert[i]);for(i= 0;i<n;i++)if(!bsts_find(&tree,i))

{printf("%d should be in tree, but isn't\n",i);okay= 0; }returnokay; } /* Not supported. */inttest_overflow(structlibavl_allocator*alloc,intorder[],intn,

intverbosity)

{return0; }

This code is included in 651.

Function *test*() doesn't free allocated nodes, resulting in a memory
leak. You should fix this if you are concerned about it.

Here's the whole program:

651. <bsts.c651> = <Library License 1> #include<assert.h> #include<stdio.h> #include<stdlib.h> #include"test.h" <BSTS structures 647> <Memory allocator; tbl => bsts 6> <Default memory allocator header; tbl => bsts 8> <Default memory allocation functions; tbl => bsts 7> <BSTS functions 648> <BSTS test 650>

**See also:**
[Bentley 2000], exercise 7 in chapter 13.