Finding a Node

Finding a node with a specific key is the simplest of the major tree operations. It’s also the most important because it is essential to the binary search tree’s purpose.

The Visualization tool shows only the key for each node and a color for its data. Keep in mind that the purpose of the data structure is to store a collection of records, not just the key or a simple color. The keys can be more than simple integers; any data type that can be ordered could be used. The Visualization tool and the examples shown here use integers for brevity. After a node is discovered by its key, it’s the data that’s returned to the caller, not the node itself.

Using the Visualization Tool to Find a Node

Look at the Visualization tool and pick a node, preferably one near the bottom of the tree (as far from the root as possible). The number shown in this node is its key value. We’re going to demonstrate how the Visualization tool finds the node, given the key value.

For purposes of this discussion, we choose to find the node holding the item with key value 50, as shown in Figure 8-8. Of course, when you run the Visualization tool, you may get a different tree and may need to pick a different key value.

FIGURE 8-8 Finding the node with key 50

Enter the key value in the text entry box, hold down the Shift key and select the Search button, and then select the Step button. By repeatedly pressing the Step button, you can see all the individual steps taken to find key 50. On the second press, the current pointer shows up at the root of the tree, as seen in Figure 8-8. On the next click, a parent pointer shows up that will follow the current pointer. Ignore that pointer and the code display for a moment; we describe them in detail shortly.

As the Visualization tool looks for the specified node, it makes a decision at the current node. It compares the desired key with the one found at the current node. If it’s the same, it’s found the desired node and can quit. If not, it must decide where to look next.

In Figure 8-8 the current arrow starts at the root. The program compares the goal key value 50 with the value at the root, which is 77. The goal key is less, so the program knows the desired node must be on the left side of the tree—either the root’s left child or one of that child’s descendants. The left child of the root has the value 59, so the comparison of 50 and 59 will show that the desired node is in the left subtree of 59. The current arrow goes to 46, the root of that subtree. This time, 50 is greater than the 46 node, so it goes to the right, to node 56, as shown in Figure 8-9. A few steps later, comparing 50 with 56 leads the program to the left child. The comparison at that leaf node shows that 50 equals the node’s key value, so it has found the node we sought.

FIGURE 8-9 The second to last step in finding key 50
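The walk from the root down to key 50 can be traced in a few lines of Python. This is only a sketch: the children dictionary is a hypothetical stand-in for the tree in Figure 8-8 that records just the nodes on the search path, so a None entry here does not mean the figure has no child at that position. The name trace_path is likewise invented for the illustration.

```python
# Hypothetical stand-in for the tree in Figure 8-8: each key maps to
# (left child key, right child key). Only nodes on the search path are
# listed; children off the path are omitted and shown as None.
children = {
    77: (59, None),
    59: (46, None),
    46: (None, 56),
    56: (50, None),
    50: (None, None),
}

def trace_path(root_key, goal):
    """Return the list of keys visited while searching for goal."""
    path = []
    current = root_key
    while current is not None:
        path.append(current)
        if goal == current:            # Found the goal, stop here
            break
        left, right = children[current]
        current = left if goal < current else right
    return path

print(trace_path(77, 50))              # [77, 59, 46, 56, 50]
```

Each key in the result corresponds to one comparison made at the current arrow's position.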

The Visualization tool changes a little after it finds the desired node. The current arrow changes into the node arrow (and parent changes into p). That’s because of the way variables are named in the code, which we show in the next section. The tool doesn’t do anything with the node after finding it, except to encircle it and display a message saying it has been found. A serious program would perform some operation on the found node, such as displaying its contents or changing one of its fields.

Python Code for Finding a Node

Listing 8-3 shows the code for the __find() and search() methods. The __find() method is private because it can return a node object. Callers of the BinarySearchTree class use the search() method to get the data stored in a node.

LISTING 8-3 The Methods to Find a Binary Search Tree Node Based on Its Key
class BinarySearchTree(object):            # A binary search tree class

   def __find(self, goal):                 # Find an internal node whose key
      current = self.__root                # matches goal and its parent. Start at
      parent = self                        # root and track parent of current node
      while (current and                   # While there is a tree left to explore
             goal != current.key):         # and current key isn't the goal
          parent = current                 # Prepare to move one level down
          current = (                      # Advance current to left subtree when
             current.leftChild if goal < current.key else # goal is
             current.rightChild)           # less than current key, else right

      # If the loop ended on a node, it must have the goal key
      return (current, parent)            # Return the node or None and parent

   def search(self, goal):                # Public method to get data associated
      node, p = self.__find(goal)         # with a goal key. First, find node
      return node.data if node else None  # w/ goal & return any data

The only argument to __find() is goal, the key value to be found. This routine creates the variable current to hold the node currently being examined. The routine starts at the root – the only node it can access directly. That is, it sets current to the root. It also sets a parent variable to self, which is the tree object. In the Visualization tool, parent starts off pointing at the tree object. Because parent links are not stored in the nodes, the __find() method tracks the parent node of current so that it can return it to the caller along with the goal node. This capability will be very useful in other methods. The parent variable is always either the BinarySearchTree being searched or one of its __Node objects.

In the while loop, __find() first confirms that current is not None, that is, that it references some existing node. If current is None, the search has gone beyond a leaf node (or started with an empty tree), and the goal node isn’t in the tree. The second part of the while test compares the value to be found, goal, with the value of the current node’s key field. If the key matches, then the loop is done. If it doesn’t, then current needs to advance to the appropriate subtree. First, the loop body updates parent to be the current node, and then it updates current. If goal is less than current’s key, current advances to its left child. Otherwise, goal must be greater than current’s key (equality was ruled out by the loop test), and current advances to its right child.
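Listing 8-3 is an excerpt, so it does not run on its own. The following is a minimal sketch of a complete class built around it, assuming a node object with key, data, leftChild, and rightChild attributes. The __Node layout, the constructor, and the insert() helper are assumptions added for illustration; they are not taken from the listing.

```python
class BinarySearchTree(object):

    class __Node(object):              # Assumed node layout for this sketch
        def __init__(self, key, data):
            self.key = key
            self.data = data
            self.leftChild = None
            self.rightChild = None

    def __init__(self):                # Assumed constructor: start empty
        self.__root = None

    def __find(self, goal):            # Same logic as Listing 8-3
        current = self.__root
        parent = self
        while current and goal != current.key:
            parent = current
            current = (current.leftChild if goal < current.key
                       else current.rightChild)
        return (current, parent)

    def search(self, goal):            # Same logic as Listing 8-3
        node, p = self.__find(goal)
        return node.data if node else None

    def insert(self, key, data):       # Hypothetical helper, illustration only
        node, parent = self.__find(key)
        if node:                       # Key already present: replace its data
            node.data = data
            return False
        new_node = self.__Node(key, data)
        if parent is self:             # Tree was empty: new node becomes root
            self.__root = new_node
        elif key < parent.key:
            parent.leftChild = new_node
        else:
            parent.rightChild = new_node
        return True

tree = BinarySearchTree()
for key in (77, 59, 46, 56, 50):       # Rebuilds the search path from Figure 8-8
    tree.insert(key, "data for %d" % key)

print(tree.search(50))                 # data for 50
print(tree.search(48))                 # None
```

Note how the hypothetical insert() relies on the parent returned by __find(): when the key is absent, parent is exactly the place where a new node would attach.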

Can't Find the Node

If current becomes equal to None, you’ve reached the end of the line without finding the node you were looking for, so it can’t be in the tree. That could happen if the root node was None or if following the child links led to a node without a child (on the side where the goal key would go). Both the current node (None) and its parent are returned to the caller to indicate the result. In the Visualization tool, try entering a key that doesn’t appear in the tree and select Search. You then see the current pointer descend through the existing nodes and land on a spot where the key should be found but no node exists. Pointing to “empty space” indicates that the variable’s value is None.
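A small sketch of the unsuccessful case follows. It assumes a standalone find() function and a bare-bones Node class, both hypothetical simplifications of the book's methods; in particular, parent starts as None here rather than as the tree object.

```python
class Node(object):                    # Hypothetical minimal node, keys only
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.leftChild = left
        self.rightChild = right

def find(root, goal):
    """Return (node, parent); node is None when goal is not in the tree."""
    current, parent = root, None       # None stands in for the tree object
    while current and goal != current.key:
        parent = current
        current = current.leftChild if goal < current.key else current.rightChild
    return current, parent

# The nodes on the search path described for Figure 8-8
root = Node(77, left=Node(59, left=Node(46, right=Node(56, left=Node(50)))))

node, parent = find(root, 48)          # 48 is not in the tree
print(node, parent.key)                # None 50

node, parent = find(None, 48)          # Empty tree: the loop never runs
print(node, parent)                    # None None
```

In the first call, the search falls off the left side of node 50, so the returned parent marks where key 48 would belong.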

Found the Node

If the condition of the while loop is not satisfied while current references some node in the tree, then the loop exits, and the current key must be the goal. That is, it has found the node being sought and current references it. It returns the node reference along with the parent reference so that the routine that called __find() can access any of the node’s (or its parent’s) data. Note that it returns the value of current for both success and failure of finding the key; it is None when the goal isn’t found.

The search() method calls the __find() method to set its node and parent (p) variables. That’s what you see in the Visualization tool after the __find() method returns. If a non-None reference was found, search() returns the data for that node. In this case, the method assumes that data items stored in the nodes can never be None; otherwise, the caller could not distinguish a node whose data is None from a key that is not in the tree.

Tree Efficiency

As you can see, the time required to find a node depends on its depth in the tree, the number of levels below the root. If the tree is balanced, this is O(log N) time, or more specifically O(log₂ N) time, the logarithm to base 2, where N is the number of nodes. It’s just like the binary search done in arrays, where half of the remaining items are eliminated after each comparison. A fully balanced tree is the best case. In the worst case, the tree is completely unbalanced, like the examples shown in Figure 8-6, and the time required is O(N). We discuss the efficiency of __find() and other operations toward the end of this chapter.
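The gap between the balanced and unbalanced cases can be checked with a rough experiment. The sketch below uses hypothetical helpers (Node, insert, nodes_visited, median_first) that are not part of the book's class. It builds two trees from the same 127 keys, one inserted middle key first and one inserted in ascending order, and reports the most nodes any search has to visit.

```python
class Node(object):                    # Hypothetical minimal node, keys only
    def __init__(self, key):
        self.key = key
        self.leftChild = None
        self.rightChild = None

def insert(root, key):
    """Insert key below root without rebalancing; return the root."""
    new_node = Node(key)
    if root is None:
        return new_node
    current = root
    while True:
        if key < current.key:
            if current.leftChild is None:
                current.leftChild = new_node
                return root
            current = current.leftChild
        else:
            if current.rightChild is None:
                current.rightChild = new_node
                return root
            current = current.rightChild

def nodes_visited(root, goal):
    """Count the nodes examined while searching for goal."""
    count, current = 0, root
    while current:
        count += 1
        if goal == current.key:
            break
        current = current.leftChild if goal < current.key else current.rightChild
    return count

def median_first(lo, hi):
    """Yield keys lo..hi so that the middle key of each range comes first."""
    if lo > hi:
        return
    mid = (lo + hi) // 2
    yield mid
    yield from median_first(lo, mid - 1)
    yield from median_first(mid + 1, hi)

N = 127                                # 2**7 - 1 keys fill exactly 7 levels

balanced = None
for key in median_first(1, N):         # Middle keys first: balanced tree
    balanced = insert(balanced, key)

degenerate = None
for key in range(1, N + 1):            # Ascending keys: one long right chain
    degenerate = insert(degenerate, key)

worst_balanced = max(nodes_visited(balanced, k) for k in range(1, N + 1))
worst_degenerate = max(nodes_visited(degenerate, k) for k in range(1, N + 1))
print(worst_balanced, worst_degenerate)    # 7 127
```

For 127 keys, the balanced tree never needs more than 7 visits, which matches log₂(N + 1), while the chain built from sorted input needs up to N.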
