CSSE 230
Editor Trees Programming Assignment

Milestones

1 (55 pts) Implement and test: add(char c, int pos) and add(char c) plus (for testing purposes) EditTree(), EditTree(char), get(int pos), height(), totalRotationCount() and toString(). Tree must be kept height-balanced.
2 (55 pts) All of the above, plus: delete(int pos), size(), and EditTree(EditTree e). Tree must be kept height-balanced.
3 (150 pts) Final submission of all code: all of the above, plus EditTree(String s), split(int pos), concatenate(EditTree other), find(String s), find(String s, int pos), and get(int pos, int length). Includes correctness, efficiency constraints (see below), and style.

Overview

EditorTrees is the major team project, as described in the syllabus.

You'll complete this assignment in teams of three (or two, depending on the number of students in the class).

In this project you will write methods of a class that could be the "behind the scenes" data structure used by a text editor. Each piece of text will be represented as a height-balanced tree with rank (and possibly some other features, for example, parent pointers and/or threads). Each node contains a single character. Note that an EditTree is not an AVL tree in the traditional sense, since the order of nodes reflects their position within the tree's text rather than an alphabetical ordering of the characters in the tree. To repeat, this is not a Binary Search Tree. For example, the text "SLIPPERY" could be represented by the height-balanced tree at the right.

You will focus on data structures and algorithms. You will not write an interface with this. (Of course, we could always have you build a text editor interface as a later project.)

Learning Objectives

Implement a complex data structure and associated algorithms.

Read algorithm descriptions in a language other than Java, and use them as a basis for your your Java code.

Gain additional practice with team skills.

Requirements

You must implement all of the methods whose stubs are in the provided EditTree code.  Each method that creates or modifies a tree must leave all trees height-balanced. Most of the requirements (including big-oh bounds on worst-case runtime) are presented to you as comments in that code, because context should make them clearer. 

These methods must run in O(log N) time, where N is the size (number of nodes) in the largest tree involved in the operation.

Note: details on how to do concatenate and split in O(log N) time can be found here. The excerpt is from Reingold and Hansen, Data Structures In Pascal, pages 334-368.

These methods must run in O(N) time, where N is the size (number of nodes) in the tree involved in the operation.

Finally, get(pos, length)  must be O(length * log N). There are no running time requirements on the other methods you are to write.

 

The following required method is for grading purposes only:  int totalRotationCount(), which returns the total number of rotations done in this tree since the tree was first created. A single rotation adds 1 to this count, a double rotation adds two.

 

Be sure to commit your versions often as you go, and mark your Milestone 1 and Milestone 2 submissions clearly in your commit log, so that we will know which one to grade!

Design Considerations

Certain operations would be really easy to implement if you didn't care about efficiency.

In order to do certain operations efficiently, like add() and get(), you already know that you need extra data in your nodes, like balance code and rank. However, you should consider whether or not you want to add yet more data:

These might work well for you if you are willing to make some tradeoffs. For example, the drawback with adding more data is that you need to keep it up to date when you make changes to the tree.

Another decision you'll need to make is how to how to traverse your tree. For instance, when adding a node, you'll need to walk down the tree to figure out where to insert it, and back up to figure out if it needs to be rotated. How will you do this traversal? It may depend on the method. Some options:

A Quick and Dirty Visualization program

In Winter term 2010-11 Emily Graetz was a student assistant for this class. Emily developed a class that allows you to display the nodes in an EditTree, along with their balance codes and ranks. The DisplayTree class provides static methods to display a whole tree or the subtree whose root is any specific node from your tree.

This should give you a visual way to check whether your rotations are working and whether they are correctly setting balance codes and ranks.

It does not produce pretty trees, and the trees sometimes extend outside the windows, but you are likely to find it helpful. If you want "pretty" you can adapt your Diaplayable code to show the extra information. You may have to adapt the given code some, depending on how you named your fields, whether you use a NULL_NODE, etc. The Java code for this class is here.  Also see http://en.wikipedia.org/wiki/Q%26D 

Final notes

Efficiency is a big deal: As stated above, some of these operations could be implemented very easily if we didn't care about efficiency. The purpose of this assignment is for you to learn how to implement efficient algorithms for maintaining balanced trees. So correctness is only part of the assignment. Solutions that pass all the unit tests but are inefficient will only earn a small fraction of the points.

How do you check if your code is O(log n)? At every node, go either to the left or the right child. If you have to pass down both sides, it will be O(n). And when you are re-balancing the tree, you should only take one path up the tree, rebalancing as you go and stopping as soon as you can - DON'T create a rebalance() method that operates on the whole tree!

The test files for Milestones 2 and 3 may be updated.