A Simple Binary Search Tree Written in C#

A simple Binary Search Tree written inC# Herbert Sauro, 18 Aug 2008 BSD 334.1K 12.3K 138 4.80 (34 votes)  Rate:vote 1vote 2vote 3vote 4vote 5 A simple Binary Search Tree written in C# that can be used to store and retrieve large amounts of data quickly. Download demo and source files - 27.1 KB Introduction In Computer Science, a binary tree is a hierarchical structure of nodes, each node referencing at most to two child nodes. Every binary tree has a root from which the first two child nodes originate. If a node has no children, then such nodes are usually termed leaves, and mark the extent of the tree structure. A particular kind of binary tree, called the binary search tree, is very useful for storing data for rapid access, storage, and deletion. Data in a binary search tree are stored in tree nodes, and must have associated with them an ordinal value or key; these keys are used to structure the tree such that the value of a left child node is less than that of the parent node, and the value of a right child node is greater than that of the parent node. Sometimes, the key and datum are one and the same. Typical key values include simple integers or strings, the actual data for the key will depend on the application. In this article, I describe a binary search tree that stores string/double pairs. That is, the key is the string value and the data associated with the key is a double value. Developers can search the tree using string values. Background There are a number of basic operations one can apply to a binary search tree, the most obvious include, insertion, searching, and deletion. To insert a new node into a tree, the following method can be used. We first start at the root of the tree, and compare the ordinal value of the root to the ordinal value of the node to be inserted. If the ordinal values are identical, then we have a duplicate and we return to the caller indicating so. If the ordinal value is less than the root, then we follow the left branch of the root, else we follow the right branch. We now start the comparison again but at the branch we took, comparing the ordinal value of the child with the node to be inserted. Traversal of the tree continues in this manner until we reach a left or right node which is empty and we can go no further. At this easier to understand when considered in terms of their unit operation. then the node is simply deleted. the next ordinal node (called the successive node) on the right branch is used to replaced the deleted node. even here. Both insertion and searching are naturally recursive and are. In the source code provided with this article. A basic recursive search algorithm will look like: Hide Copy Code node search (node. To delete a node with two children. Note that new nodes are always inserted as leaves into the tree. else return search (node. insertion is implemented recursively. The successive node will always be the left most node on the right branch (likewise. key). then the node is deleted and the child node is brought forward to link to the parent. if node.right. the predecessor node will be the right most node on the left branch). arguably. while searching uses an iterative approach. then we report this to the caller. nodes are thus appended rather than inserted. key) { if node is null then return null. and nodes with two children. If a node has no children. the rules are straightforward when stated. Searching a binary search tree is almost identical to inserting a new node except that we stop the traversal when we find the node we're looking for (during an insertion. Deletion is a little bit more complicated but boils down to three rules. However. . The complication occurs when a node has two children. nodes with one child. The figure below illustrates the deletion rules. If the node has one child.point. we insert the new node into this empty location. The three rules refer to deleting nodes without any children. key). and strictly speaking.key = key then return node if key < node then return search (node. If the node is not located. this would indicate a duplicate node in the tree).left. The successive node is then deleted. The . a binary search tree scales with log (N) where N is the number of data items (still far better than a linear search). In contrast. and although not implemented here. Balanced Trees The time taken to insert or search for a specific item in a tree will be affected by a tree's depth. Deep trees take longer to search. In theory. Hash tables have better search and insertion performance metrics. The figure below shows three trees generated by three identical data sets but inserted in a different order. A random insertion order will generally produce a more bushy and hence shallower tree compared to an ordered insert. Bushy trees are often called balanced trees. balancing a tree is a highly desirable feature for a binary search tree implementation. The first is the most balanced and hence the most shallow of the three examples.A common alternative to using binary search tree is to use Hash tables. . Certain algorithms such as the red-black tree will auto-balance as the tree is constructed (see Red/Black tree animation).NET libraries contain explicit support for Hash tables. the time it takes to insert or search for an item in a Hash table is independent of the number of data items stored. and the insertion order into a tree can affect a tree's shape. it returns null.insert ("Phi Centauri". 4. particularly when the trees are unbalanced. the insertion of data into the tree. The method insert() is used to insert new data. When successful. name and value. findSymbol() returns a TTreeNode object which has two properties. bt. 4. and the method findSymbol() is used to locate and retrieve data. The following code illustrates how to use the binary search tree.insert ("Canis Minoris". The following code illustrates the instantiation of a new binary tree.insert ("Gamma Cancri". Hide Copy Code // Create a new binary tree bt = new TBinarySTree(). . bt. If findSymbol() fails to locate the data item.findSymbol ("Phi Centauri"). 5. 3.37). and subsequent retrieval.83).Implementing the search and insertion methods using a recursive approach has the potential to yield poor performance. bt.66). // Insert data bt. // Retrieve data TTreeNode symbol = bt. The class name of the binary tree is TBinarySTree.insert ("Kappa Tauri". Using the Code Using the source code provided with this article is very easy.21). and the individual nodes have class type TTreeNode. Other methods of interest include: Hide Copy Code bt. the method throws a simple exception.NET's built-in Hashtable class. . The source should compile on C# 2. count returns the number of nodes in the tree. The source code is licensed under the BSD license.count().if (symbol != null) Console. that is.000 items (21 intervals in all).0. symbol. The X axis indicates the number of data items stored. In contrast. symbol. unpack the source. If the method fails to locate the node. To use the source code. load the binary tree solution (binaryTree. this is far better than a linear search which would scale as O(N). bt. ranging from 1000 to 1.WriteLine ("Star {1} has magnitude = {0}".value).000. Hide Copy Code bt. doubling the number of stored items doubles the average time taken to retrieve a single data item.clear(). The Y axis indicates the average time required to retrieve one data item (averaged over 20. The graph on the left shows the data plotted on log axes. TheHashtable follows roughly O(1).000 retrieval attempts). include as a reference to BinaryTree.delete (key).name. Points of Interest The following graphs compare the performance of the Binary Tree search with . The implementations follow the expected scaling laws as the number of stored data items increase.dll. This version was written using Visual Studio 2003. that is the time taken to retrieve data is independent of the number of data items stored. the binary search tree scales roughly O(log (N)).sln) and compile the BinaryTree project. However. In your own project. delete will delete the node with the given key. The project where this implementation originated did not require any further sophistication. Ideally one could implement the Red/Black variant to avoid this issue (see reference 1. However. Ideally. 2. Other minor changes include using a property in place of the public method count. Other Possibilities One should consider the implementation outlined here as the minimum practical implementation. . The code for timing was derived fromTobi+C#=T#. In particular. at the end of the article for details).NET. there are a number of areas where it could be significantly improved. two areas warrant further work: 1. adding further utility methods. one would prefer a more generic implementation where a developer could employ their own object type. The current implementation is specific to storing name/value pairs. The implementation may suffer performance degradation when subjected to large data sets if the trees become significantly unbalanced. and changing the naming convention on the classes and methods to make them more consistent with .Times were computed using the QueryPerformanceCounter() method.

A Simple Binary Search Tree Written in C#

Comments

Description