Trees in Data Structures
- A tree is non-linear data structure in which items are arranged in a sorted sequence. It is used to impose a hierarchical structure on a collection of data items.
- A tree is recursively defined as a set of one or more nodes where one node is designated as the root of the tree and all the remaining nodes can be partitioned into non-empty sets each of which is a sub-tree of the root.
- A node is a structure which may contain a value, a condition, or represent a separate data structure (which could be a tree of its own).
- Each node in a tree has zero or more child nodes, which are below it in the tree (by convention, trees grow down, not up as they do in nature).
- A node that has a child is called the child's parent node (or ancestor node, or superior). A node has at most one parent.
- There is a special data item called the root of the tree. And its remaining data items are partitioned into number of mutually exclusive subsets, each of which is itself a tree. And they are called subtrees.
- A tree is a data structure which is mainly used to store hierarchical data. A tree is recursively defined as collection of one or more nodes where one node is designated as the root of the tree and the remaining nodes can be partitioned into non-empty sets each of which is a sub-tree of the root.
- Root : It is specially designed data item in a tree. It is the first in the hierarchical arrangement of data items. The root node is the topmost node in the tree.
- Node : Each data item in a tree is called a node.
- Degree of a Node : The degree of a node of a tree is the number of subtrees having this node as a root. In other words, the degree is the number of descendants of a node. If the degree is zero, it is called a terminal or leaf node of a tree.
- Degree of a Tree : The degree of a tree is defined as the maximum of degree of the nodes of the tree, that is, degree of tree = max (degree(node i) for I = 1 to n)
- Terminal node : A node with degree zero is call a terminal node or a leaf.
- Non-terminal node : Any node (except the root node) whose degree is not zero is called non-terminal node.
- Siblings : The children nodes of a given parent node are called siblings.
- Level : Every node in the tree is assigned a level number in such a way that the root node is at level 0, children of the root node are at level number 1. Thus, every node is at one level higher than its parent. So, all child nodes have a level number given by parent’s level number + 1.
- Edge : It is a connecting line of two nodes. That is, the line drawn from one node to another node is called an edge.
- Path : It is a sequence of consecutive edges from the source node to the destination node.
- Depth : It is the maximum level of any node in a given tree.
- Forest : It is a set of disjoint tree. In a given tree, if you remove its root node then it becomes a forest.
Types of trees
- Binary Tree
- Strictly Binary Trees
- Complete Binary Tree
- Extended Binary Trees
- Binary search trees
- AVL tree or height balanced binary tree. Read More >>
- Expression trees
- Tournament trees
- The binary tree is a fundamental data structure used in computer science. The binary tree is a useful data structure for rapidly storing sorted data and rapidly retrieving stored data.
- A binary tree is composed of parent nodes, or leaves, each of which stores data and also links to up to two other child nodes (leaves) which can be visualized spatially as below the first node with one placed to the left and with one placed to the right.
- It is the relationship between the leaves linked to and the linking leaf, also known as the parent node, which makes the binary tree such an efficient data structure.
- It is the leaf on the left which has a lesser key value (i.e., the value used to search for a leaf in the tree), and it is the leaf on the right which has an equal or greater key value.
- As a result, the leaves on the farthest left of the tree have the lowest values, whereas the leaves on the right of the tree have the greatest values.
- More importantly, as each leaf connects to two other leaves, it is the beginning of a new, smaller, binary tree.
- Due to this nature, it is possible to easily access and insert data in a binary tree using search and insert functions recursively called on successive leaves.
Strictly Binary Trees
- If every non-terminal node in a binary tree consists of non empty left subtree and right subtree, then such a tree is called strictly binary tree.
Complete Binary Trees
- A complete binary tree is a binary tree that satisfies two properties. First, in a complete binary tree, every level, except possibly the last, is completely filled. Second, all nodes appear as far left as possible. In a complete binary tree Tn , there are exactly n nodes and level r of T can have at most 2r nodes.
- A complete binary tree can be represented in an array in the following approach. If root node is stored at index i, its left, and right children are stored at indices 2*i+1, 2*i+2 respectively.
Extended Binary Trees
- A binary tree T is said to be an extended binary tree (or a 2-tree) if each node in the tree has either no child or exactly two children.
- In an extended binary tree, nodes having two children are called internal nodes and nodes having no children are called external nodes.
Binary search trees
- We consider a particular kind of a binary tree called a Binary Search Tree (BST). The basic idea behind this data structure is to have such a storing repository that provides the efficient way of data sorting, searching and retrieving.
- A binary search tree, also known as an ordered binary tree, is a variant of binary tree in which the nodes are arranged in an order.
A BST is a binary tree where nodes are ordered in the following way:
- each node contains one key (also known as data) ·
- the keys in the left subtree are less than the key in its parent node, in short L < P;
- the keys in the right subtree are greater the key in its parent node, in short P < R;
- duplicate keys are not allowed.
A binary search tree has the following 2 characteristics for every node n in the tree:
- Every element in n's left subtree is less or equal to the element in node n.
- Every element in n's right subtree is greater than the element in node n.
- Huffman coding is an entropy encoding algorithm developed by David A. Huffman that is widely used as a lossless data compression technique.
- The Huffman coding algorithm uses a variablelength code table to encode a source character where the variable-length code table is derived on the basis of the estimated probability of occurrence of the source character.
- The key idea behind Huffman algorithm is that it encodes the most common characters using shorter strings of bits than those used for less common source characters.
Advantages of trees
Trees are so useful and frequently used, because they have some very serious advantages:
- Trees reflect structural relationships in the data
- Trees are used to represent hierarchies
- Trees provide an efficient insertion and searching
- Trees are very flexible data, allowing to move subtrees around with minumum effort
Applications of tree
- Trees are used to store simple as well as complex data. Here simple means an integer value, character value and complex data means a structure or a record.
- Trees are often used for implementing other types of data structures like hash tables, sets, and maps.
- Trees are an important data structure used for compiler construction.
- Trees are also used in database design.
- Trees are used in file system directories.
- Trees are also widely used for information storage and retrieval in symbol tables.