•

Happy Holidays from Codecall!

### Recent Blog Entries

• Vaielab

I Wish You a Merry Whatever You Celebrate

### Popular Tags

Developed by TechBiz Xccelerator

# Trees 101

2 replies to this topic

### #1 WingedPanther

WingedPanther

A spammer's worst nightmare

• Moderator
• 17,283 posts
• Location:Upstate, South Carolina
• Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
• Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others

Posted 30 January 2009 - 06:29 PM

In the last tutorial, I compared the efficiency of searches and inserting elements for arrays and linked lists. While arrays offer better efficiency on searches, linked lists offer better efficiency for inserting elements. What would be nice is to have the best of both worlds. Let's look at how the search in arrays works, and see if we can apply it to something resembling a linked list. To begin with, let's look at which element of the array is potentially being looked at in each level of the search.
```       +----+----+----+----+----+----+----+----+----+----+
array: |  1 |  3 |  5 |  7 |  9 | 11 | 13 | 15 | 17 | 19 |
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9

round1:                       *
round2:        *                             *
round3:   *         *              *              *
round4:                  *              *              *

```
At each stage in the search, the array-based search checks the current location, and then moves either left or right. What this means is that at each stage, something resembling a linked list would need to be able to move to one of two links. The move can be to the left or the right. Also notice that going left is always moving to a lower value, and moving right is always moving to a larger value. Based on all this, we can create a new structure based on this idea:
```                          +-------------+
+-------------+
|        |
+------------+        +---------------+
\|/                                   \|/
+-------------+                       +-------------+
|left  3 right|                       |left 15 right|
+-------------+                       +-------------+
|        |                            |        |
+--+        +---+                   +----+        +---+
\|/             \|/                 \|/               \|/
+-------------+ +-------------+     +-------------+   +-------------+
|left  1 right| |left  5 right|     |left 11 right|   |left 17 right|
+-------------+ +-------------+     +-------------+   +-------------+
|        |      |        |          |        |        |        |
\|/      \|/    \|/       |         \|/       |       \|/       |
NULL     NULL   NULL     \|/        NULL     \|/      NULL     \|/
+-------------+     +-------------+   +-------------+
|left  7 right|     |left 13 right|   |left 19 right|
+-------------+     +-------------+   +-------------+
|        |          |        |        |        |
\|/      \|/        \|/      \|/      \|/      \|/
NULL     NULL       NULL     NULL     NULL     NULL
```
This type of structure is referred to as a binary tree (because each node can only have two children). head is pointing to the "root" of the tree, and the "leaves" are at the bottom (with two NULL pointers out), in this case containing 1, 7, 13, and 19. Binary trees can be implemented in other ways (such as an array where the left and right child are other elements of the array), but this particular diagram strongly suggests a variant of a linked list that uses two pointers instead of one. Also, if you take an ordinary linked list, and stick an additional pointer on each node that never gets used, it counts as a binary tree, even though you aren't gaining anything.

Binary trees are heavily used in a variety of applications, including parsers (both of code and of math expressions) and binary search trees. Binary search trees also have several variants, including height balanced (AVL) trees and splay trees. Also, before we go any further, notice that binary trees do incur a cost. Having two pointers per node rather than one is an additional memory overhead. If sizeof(int) = sizeof(node *), then switching from an array to a linked list doubles the size of the structure, and a binary tree is three times the size of the same sized array or 50% larger than the same sized linked list.

In general, there are three aspects of system design that interact: space complexity (memory use), time complexity (algorithm speed), and code complexity (tortured logic). The analysis above points out one of the truisms of programming: a gain in one complexity almost always comes at a cost in one or both of the others. If you can improve something in one area without a cost in the other two areas, DO IT! For example: in a standard binary tree, adding new data is a simple matter of adding a new node at a leaf (as a "child"). This can end up with one side of the tree being "deeper" (having more links to reach a leaf) than the other side, which costs time on a search. Height balanced trees avoid this problem, but come at a cost of taking more processing time when adding/removing a node (to make sure one side of the tree or a sub-tree doesn't get imbalanced). Height balanced trees have a problem that most of the data is at or near the leaves, which means accessing a node repeatedly is likely to have a high search cost. Splay trees accommodate this by moving recent searches to the root of the tree for easy frequent access. Every optimization has a cost.

Let's look at how find and insert operations might work on something like this. Let's also look at the structure of a node (to finish the implementation and be able to compare with trees/linked lists). Searching works like the hybrid it is. In each cycle it checks if the item has been found, if not, it checks for whether to continue the search on the left or right branch. Search efficiency is O(log(N)) if the tree is balanced (as in the diagram above). If everything's along one branch (say the left), then it is O(N)
```struct node
{
int data;
node* left;
node* right;
};

{
while (temp!=NULL)
{
if (temp->data == tofind) return temp;
if (temp->data > tofind)
{
temp=temp->left;
}
else
{
temp=temp->right;
}
}
return NULL;
}
```

Inserting a node uses the same logic as the search, only now the goal is to reach a NULL. When it is found, the pointer is used to allocate a new node and store the data. Notice here that inserting has the same efficiency as searching... O(log(N)) if balanced, O(N) if completely unbalanced. The diagram of an insert (for 10) is followed by the code:
```                          +-------------+
+-------------+
|        | choose right
+------------+        +---------------+
\|/                                   \|/
+-------------+                       +-------------+
|left  3 right|                       |left 15 right|
+-------------+                       +-------------+
|        |                choose left |        |
+--+        +---+                   +----+        +---+
\|/             \|/                 \|/               \|/
+-------------+ +-------------+     +-------------+   +-------------+
|left  1 right| |left  5 right|     |left 11 right|   |left 17 right|
+-------------+ +-------------+     +-------------+   +-------------+
|        |      |        |          |        |        |        |
\|/      \|/    \|/       |         \|/       |       \|/       |
NULL     NULL   NULL     \|/   10 goes here  \|/      NULL     \|/
+-------------+     +-------------+   +-------------+
|left  7 right|     |left 13 right|   |left 19 right|
+-------------+     +-------------+   +-------------+
|        |          |        |        |        |
\|/      \|/        \|/      \|/      \|/      \|/
NULL     NULL       NULL     NULL     NULL     NULL
```
```node* insert(node* head, int toadd)
{
while (temp!=NULL)
{
{
if (temp->left==NULL)
{
temp->left = new node;
temp=temp->left;
return temp;
}
else
{
temp=temp->left;
}
}
else
{
if (temp->right==NULL)
{
temp->right = new node;
temp=temp->right;
return temp;
}
else
{
temp=temp->right;
}
}
}
return NULL;
}
```

Notice that the efficiency can vary widely with this simple insert function. The various types of trees, such as AVL and splay trees seek to put a little more work into the insert functions in exchange for more control over the search efficiency. AVL trees make sure the tree is balanced, ensuring the best case scenario (along with half the nodes being at the bottom of the tree). Splay trees adjust the tree so the most recent search result is the head (useful having very quick repeat searches on a single value, but imbalances the tree). These variants deserve tutorials of their own.
• 2

Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

My MineCraft server site: http://banishedwings.enjin.com/

### #2 Guest_Jordan_*

Guest_Jordan_*
• Guest

Posted 31 January 2009 - 06:59 AM

Wow, impressive! I've never heard of this term: tortured logic. Blog idea?
• 0

### #3 Phoenixz

Phoenixz

• Just Joined
• 222 posts

Posted 31 January 2009 - 07:54 AM

Although admittedly I didn't understand all of it, I hope to some day, very detailed and nicely done. +rep'd
• 0