All the ML code discussed in this page is available here in a text file which can be downloaded and used in the assignment.
A labeled binary tree is one where each node is labeled, and has up to two daughter nodes. Such objects are ubiquitous in computer science. They are used for sorting, searching, parsing code, and many other tasks.
To manipulate such trees, we use a feature of ML, namely the possibility of defining new data types (beyond integer, boolean, etc.). A tree is defined recursively, and the data type of a tree reflects this fact. The line of code is
datatype 'a b_tree = Empty | Node of
'a b_tree
*
'a *
'a b_tree;
Here 'a
b_tree indicates a binary tree with nodes labeled by elements of
type 'a
(called val in ML), i.e. unrestricted here but set in an application,
such as integer, or string, or list. Each such tree is empty (the first choice
on the right, or (indicated by |) a (non-empty) node that is a triple of another
tree (the left subtree), an 'a object (the label), and another tree (the right
subtree). That is
Empty is
a binary tree. Node (tree1,
lab, tree2) is a binary tree. Although we usually do not include empty nodes when we visualize a tree, it is more convenient in ML to have every non-empty node have exactly two daughters, of which one or both may be empty. This is clearly a recursive definition (for which ML is perfect), and all our manipulation of trees will be recursive.
For example, two basic functions of trees are the height
(largest level of non-empty nodes in the tree), and size (number of non-empty
nodes) (for Moscow ML, you have to first use the command load "Int";):
fun
height Empty = 0 (* height of b_tree *)
| height (Node (lft, _, rht)) = 1 + Int.max (height (lft), height
(rht));
fun size Empty = 0 (* size of a tree *)
| size (Node (lft, _, rht)) = 1 + size lft + size rht;
Understand how these simple functions work over the recursion of the data type. A binary tree is quite analogous to a list (with the empty tree the analogue of the nil list), except two successors are attached at each stage. Recursion over a tree is often called structural recursion.
A standard use for a labeled binary tree is to create an object for searching--for example, a dictionary. Of course, to find a particular word in a dictionary, one could start at the beginning and check every word in order--i.e., treat the dictionary as a list. However, this is inefficient, and no one looks up a word that way. One makes a guess where to open the dictionary, and then moves forward and backward through by comparing the word of interest to the labels in the dictionary. A search tree is set up for precisely this kind of search.
Let us suppose we have a collection of objects (of type val), and a
comparison function comp(e.g.,
less than for integers, or lexicographically preceding for strings) which for
any two such objects lab1, lab2, the
evaluation comp
(lab1,lab2) is either true or false.
Then we can create a search tree where
comp)
lab lie to the left of lab, comp)
lab lie to the right of lab. We search the tree by dropping down to the left or right at each step until we either encounter the label we are looking for, or run out of tree.
There are two basic operations necessary,
The basic step for the first is inserting a new node in the proper place. The following code does the job:
fun
insert_node (lab:'a, comp:'a*'a->bool, Empty) = Node (Empty, lab, Empty)
(*new node*)
| insert_node (lab, comp, Node (lft, lab1, rht)) =
if comp (lab, lab1) then Node (insert_node (lab, comp,
lft), lab1, rht) (* go left *)
else Node (lft, lab1, insert_node (lab, comp, rht)); (* go
right *)
With this as the recursive step, the following code creates a tree from a list of labels:
fun
create_tree (nil, comp) = Empty
| create_tree (h::q, comp) = insert_node (h, comp, create_tree (q,
comp));
Here is a small list of numbers and a comparison function for testing:
val
nums = [~5,1,0,3,~2];
fun lt(n1:int, n2:int) = n1 < n2;
Thus
val
t = create_tree (nums, lt);
creates a tree with the numbers in the list ordered ready
for searching. Try it and see what a tree looks like. Test out the height and
size functions on t.
Assignment, Part 1. Draw a picture of the resulting binary tree, and hand it in (either by hardcopy or via the dropbox). Note the height. (Note: the trees are created from the right; the root is the last item in the list.)
Assignment, Part 2. Construct three ordering functions of a tree t: inorder t, preorder t, postorder t,
which take trees to lists, and realize the three orderings of the text (these
are short one-liners). In particular, given a list lst, and a
comparison function comp
of its labels, what does
inorder
(create_tree (lst, comp))
do (try it out and/or look at your picture from part 1). Starting hint:
fun
inorder Empty = nil
| inorder (Node (lft, lab, rht)) = <you take it from here>
Here is another list of labels. In this case, the labels are 2-tuples of strings, consisting of a word and a definition. We want to put these together in a (short) dictionary. The comparison function looks at the lexicographic ordering of the words. When a comparison function looks at one part of a tuple, often that part is called the key. Thus the word is the key in this case.
val
words = [
("new","not old"),
("nu","Greek letter"),
("gnu","a silly animal"),
("New","West Virginia river"),
("knew","used to know"),
("nu","Greek letter"),
("nuu","a nonsense word"),
("gnew","past tense of gnaw"),
("noo","sound of cow with nose cold")];
fun key (word:string, def:string ) = word;
fun compwords (word1:string*string, word2:string*string) = key(word1) <
key(word2);
Thus for example
val
newdict = create_tree (words, compwords);
creates a search tree of these words.
Assignment, part 3. Repeat part 1 for this tree.
Assignment, part 4. Define a function
lookup
(t, comp, key, lab)
so that for example
lookup
(newdict, comp, key, "gnu")
looks up the word "gnu" and its definition for me
(you have to figure out what comp should be). As a test for this part, run lookup a
couple of times.
The output from search
should be either the proper word and definition, or "Not found" (if,
e.g. I searched for "news"). Your search is going to end,
respectively, at a non-empty node (with the proper label) or an empty node. We
need to define a data type to handle this:
datatype
'a lookup_result = Stringans of string | Ans of 'a;
Thus for example, your definition could begin
fun
lookup (Empty, comp, key, lab) = Stringans ("Not found")
| lookup (Node (lft, lab1, rht), comp, key, lab) = <you take it from
here>
Assignment, part 5. In the text file of code, there is a list named "europe." Each entry is a 4-tuple, consisting of the name of the county, its area, its population, and its gross national product. Create a comparison function that will create a tree using the area as key.
For more detail on data types, see PSML.