Markov Chain Programming Assignment Index

CSSE221, Fall 2007-2008

Get a zip file of this whole folder, including the starting code, here.

Note: None of the documents can be fully understood until you have read the others. They are all inter-related. That means that you should probably read all of them twice before beginning your design and code.

I suggest that you quickly read this document, then the other documents in the order listed. Then re-read all of them more carefully.

This is a pair-programming assignment. You may talk with other people about it, of course, and get help as needed, but you must do the assignment yourself and submit it.

Unlike the recent programming assignments, your code must not contain any GUI components (such as Frame, JFrame, or JApplet).

Milestones

Milestone 1: After reading the input, your program should produce a correct Markov chain of words (as described in the markov.html document). Line spacing does not matter, and one word per line will be fine for your output.

Milestone 2: Process the words created by the Markov chain algorithm so that the output contains justified lines of the length specified by the command line.

Efficiency

Minor details of efficiency are not a concern here, but the running time should certainly be an asymptotic linear function of the number of input words. If your program takes 20 minutes to run for one of the large files in the texts folder, chances are good that you have an N2 algorithm. The first thing you might look at is how you do the input. (Note: be sure you use a StringBuffer, not a String, to read in the file.)

Documents and folders

There are four main documents associated with this assignment:

overview.html describes the goals of the assignment, the four required command-line arguments, the data structures to be used, and the submission instructions.

markov.html describes the Markov Chain algorithm that you are to use to generate semi-random text.

justification.html describes the algorithm for producing text that is left and right justified.

examples.html gives examples of using several data structures and methods useful for this assignment.

There are also two folders:

multiset. We are providing a MultiSet class for you. The multiset folder contains several useful flies:
Multiset.html provides javadoc for the Multiset class.
MultiSet.class is the actual class file that you should use in your project.

MultiSet$entry.class is the class file for a nested class that MultiSet uses.

TestMultiSet.java should help you to understand how the MultiSet class works.

Driver.java contains a sample of a driver file that we will use to run your program.

The texts folder contains a collection of text files that you can use as input to your program.

Suggested process for getting started on this assignment