Data
Trees: Introduction
(Up to Basic
Computer Science)
It is recommended that you read the essay
on linked lists before proceeding to this essay.
Trees are data structures organized into a group of interconnected
nodes that form leaves and branches. This is a little confusing,
so looking at the diagram below would be easier to understand than
reading an explanation:
You can see why this structure is called a tree. A computer data
tree, however, is different from a plant tree in that the root is
at the top and the leaves are at the bottom. The root is
the first node of the entire tree, and there is only one root per
tree. In the diagram, D is the root node. Each of the letters in
the tree above represents a node, much like the linked
list.
Most of the nodes link to one or more other nodes, represented
by the red arrows in the diagram above. Nodes that are directly
linked to a node a level above them are called child nodes
or children to the parent node, and nodes that share
a parent node are a set of sibling nodes. The link between
a parent and a child is called a branch. Thus, in the diagram
above, H is a parent node to G, K, and L. These three child nodes
are siblings to each other. A sub-tree is just a section
of the tree. G, S, and F constitute the left sub-tree extending
from H.
Some nodes do not have any children, and these are known as leaf
nodes. So, S, F, B, L, and M are all leaf nodes.
What I have just described above is the general structure for a
tree. Still, most programmers use a variation of the general tree
that is more suited to the particular problem at hand. These variations
usually deal with the organization and structure of the trees themselves,
and each have their good points.
The simplest variation to the general tree is a binary tree.
In a binary tree, each node has at most two children. The binary
tree also has a couple of variations. The best known variation is
the binary search tree, which is ordered
to allow really fast data searching.
Another type of binary tree is the heap.
A heap is primarily used to prioritize data, but there is also a
very fast array sorting algorithm implemented with a heap.
Other ordered tree variations are the 2-3
tree and the 2-3-4 tree. Although ordered, they are not binary
trees. These take care of the problem of balancing the tree.
This speeds up the searching of the tree.
One of the most complex tree variations is the Red-Black tree.
This tree is basically a binary tree representation of the 2-3-4
tree, thus combining the more simple binary tree design with the
search efficiency of a 2-3-4 tree.
Finally, a tree variation that is not necessarily ordered, the
minimax game tree. This structure is important to programming Chess
and Go AI, as well as the AI of many other board games.