CSSE 230
SortingRaces Programming Assignment

Overview

For the first part of this assignment, you will implement and test a Binary Heap. See here: heaps.html

In this assignment, you will compare the speeds of various O(n log n) sorting algorithms.

  1. This is a pair assignment. Pairs are here: teams.txt
  2. You will study the runtimes of five sorting algorithms: three built into Java, and two that you write (or have recently written!). Built-in:
    1. mergesort. Java's mergesort is supposed to be optimized when array is partially sorted.
    2. quicksort. Java's quicksort is an optimized "dual-pivot" version that is supposedly faster than the single pivot version we learned.
    3. bstSort. Here you just insert the items into a binary search tree and then iterate through the tree, copying the data back into the array. (Can you explain why this is O(n log n)? It would be great if we could use your solution to EditorTrees, but EditorTrees don't sort their data like AVL trees do. Please use java's TreeSet instead. It won't be a perfect comparison if the array to be sorted has duplicates (since they sets throw out duplicates), but it will suffice. Be sure to resize the array to account for any duplicates that were thrown out.
    Ones you provide:
    1. heapsort. done (hopefully)! Drag BinaryHeap and BinaryHeapTest into your project. If Eclipse gives you an error in BinaryHeapTest, hover your mouse over the first error at the top of that file, and it will give suggestions as to how to fix it (you might need to click the error to see the suggestions). Choose the last option, "Fix Project Setup" and then add JUnit 4 to the build path in the next window.
    2. skiplist sort. Copy your SkipList class into this package, so you can do some minor modifications.
      1. First, add a public void sort(T[] array) method that inserts all the array elements into your SkipList, and iterates through the skip list, updating the array to be sorted.
      2. Second, make the max height to be O(log(n)), where n is the maximum number of elements you want to sort using this algorithm. Recall from our analysis in class your elements will require a height of 3 * log n with probability 1/n2, so that's a pretty safe choice.) You can figure out this value of the array size you are using (say 10 000 000) and hard code it into your program. Better (although admittedly requiring more changes), let your program calculate the size from the number of elements you are sorting.
      3. Third, we no longer need to pass our own "random number" sequences to the SkipList. Instead, we want to use the random number generator. Just replace the contents of the getRand() method with a simple this.random.nextInt(2);, where random is a SkipList field: private Random random = new Random(17); (We pass 17 as a seed to help debugging; remove the seed if you want truly random sequences.)
    3. quicksort (optional, due to lack of time). Implement a simple version like we did in class so we can see the effects of Java's optimizations. For simplicity, you can hardcode it for ints. Try to recall it from the concepts you remember from class, but feel free to refer to Weiss or other sources as needed. You may have to modify the given code to handle cases where the pivot repeats.
  3. Test your code using at least four kinds of arrays, so you can investigate sort performance under various conditions.

    1. Random order: We have given you tests for this.
    2. Random order, but guaranteed to have no duplicates. To create such an array, you can create an array such that a[i] = i, and shuffle it. The Collections class has a static method called shuffle that is efficient.
    3. Almost sorted. Start with a sorted array as in the previous one, and perform 0.01*n swaps of pairs of elements in random positions. This will introduce some inversions.
    4. Almost sorted, but backwards. Just reverse the array from the previous step.

    Your array size should be large enough so that all of your sorts take at least 25 milliseconds. If you go much smaller than that, it will be hard to tell them apart. You can go much larger for most of the sorts.

Deliverables and grading outline

  1. (30 pts) Binary Heaps and heapsort and tests, as discussed in that specification. Copy both BinaryHeap and BinaryHeapTest into this project (you may submit one implementation per pair) and commit it to the repo. Note:
  2. (20 pts) Code for your other sort, and your timing code. Each sort you provide (heapsort, skiplistsort) should be in its own class, and committed to the repo.
  3. (30 pts) A write up giving a 2D table of sort times (sort vs. data). Discuss the performance of each algorithm. Did all of them run fine? Were there any memory issues? Which was fastest for each type of array? Was anything surprising or did you expect this? From what you know, try to explain what you see. You may use the word document here Analysis.docx as a template.