Saturday, 29 March 2014

Binary Search Trees and Linked Lists

Binary search trees (BSTs) and linked lists (LLs) are both abstract data types. In this post, I will explain their structure and utility.

Before defining BST, I would like to quickly review binary trees (BT). At their most basic level, BTs are defined as having a three node structure: a root node, a left node, and a right, node-- each of which can equal None. In a deeper recursive structure, each left and right node, stemming from the root, will have their own left and right nodes. A BST, is structurally similar to a BT, but with two conditions:


  1. All items to the left of the root node must be smaller in value than the root node
  2. All items to the right of the root node must be greater in value than the root node.

Please see the diagram below for an example of a B (a) vs. a BST (b).

BT vs BST
 Taken from http://msdn.microsoft.com/en-us/library/ms379572(v=vs.80).aspx
The diagram above shows a BT (a) and a BST (b), each containing similar information, only organized differently. In (b) all of the items to the left of the root node are less than 7. All the items to the right of the root node are greater than 7. BSTs have one obvious advantage over BTs: efficiency. For instance, if we want to search for a value in a BST, we can determine whether the value is less than or greater than the root node, and then traverse only the appropriate side. In this way, we can ignore one half of the tree. Because BTs are unorganized, searching for an item requires a traversal of both sides.

I would now like to switch my focus to linked lists (LLs). At its most basic level, an LL is structurally defined by a root/head node and a reference/tail node. A reference node can either be empty or contain another LL-- this is where recursion comes in handy. See the diagram below for an example of a linked list:

A Linked List Data Structure,
Taken from https://wiki.cs.auckland.ac.nz/compsci105ss/index.php/Linked_Lists

The Diagram above would be represented, in python, as:

LinkedList(2, LinkedList(4, LinkedList(6, LinkedList(8, None))))

We can tell immediately that parsing through this structure will require recursion because of its nested structure. This raises the questions, why use linked lists? Aren't regular lists simpler? They don't require recursion, after all. I immediately asked these two questions after reading about LLs. The benefit to LL is realized when muting them. Take the regular list [1, 2, 3, 4] for instance. In order to remove 2 from this list, we have to reassign the index of every item after it, i.e. 3 and 4. In this example, this might not seem so bad, given the short length of our list. However, if we were dealing with a list of millions of items, it would require a significant amount of memory. Now, if we express the previous example as a linked list, LinkedList(1, LinkedList(2, LinkedList(3, LinkedList(4, None)))), we can remove items with far less memory. In order to remove 2, python just has to reassign the reference node of 1 to 3. It does not have to reassign each index. The diagram below illustrates this concept.

Deleting an item in a LL
Taken from http://www.cs.iit.edu/~cs561/cs331/linked_lists/DeletionMiddle.html

In this way, the advantage of both BSTs and LLs is their efficiency relative to their counterparts, BTs and lists respectively. BSTs allow us to reject an entire side of a binary tree structure when searching for a value. LLs don't necessitate a reindexing of all its constituent items in the same way that lists do; LLs only require a reassignment of a reference node.

For more information on LLs and BSTs, I recommend checking out GR's post. He or she provides some great metaphors for binary search trees. Ilikechicken also provided a post with some useful graphical representations of LinkedLists.

No comments:

Post a Comment