CSCI213/ITCS907/MCS9213
Autumn Session, 2007

Sorting in Java

Sorting in Java is quite sophisticated. There are sort functions provided (the algorithm is some form of merge sort). These are defined in the java.util.Arrays class; and also in the java.util.Collections class. Arrays and Collections are not classes from which you instantiate Arrays or Collections objects; their role is to define a kind of namespace where Sun could put useful functions like those for sorting arrays/collections or for doing binary searches on sorted arrays.

The java.util package includes many useful "collection classes" (Vector, LinkedList, Hashtable, ... ) including some like TreeSet that automatically sort data elements as they are added to the collection. The interface Collection defines some functions that all kinds of collection classes can perform, e.g. add(element), remove(element), isEmpty() etc. There are specializations like the interface java.util.List; this adds the concept of an ordering so that there are methods to get the element at a particular point in a list-collection. Actual concrete classes - like Vector, LinkedList etc - implement the List interface or one of the other specialized interfaces that extends Collections.

(From Java 1.5, collections are defined as "templates"; these are much less complicated than C++ templates. A Java template collection class simply requires that you define the class of data elements that will be stored in that collection. So, if you want use a LinkedList to store String objects, you must declare a "linked list of Strings" - LinkedList<String> mylist = new LinkedList<String>();. If you look at older text books that use dialects of Java before 1.5, they will use "type unsafe" collections - you simply declared that you needed a LinkedList and then used it to store references to objects of whatever kinds you chose.)

Arrays.sort(), Comparable, and Comparators

If you look at the Arrays class (or the Collections class) in the Java documentation you will see functions like:

static int binarySearch(double[] a, double key) 
          Searches the specified array of doubles for the specified value using the binary search algorithm. 
static int binarySearch(float[] a, float key) 
          Searches the specified array of floats for the specified value using the binary search algorithm. 
static int binarySearch(int[] a, int key) 
          Searches the specified array of ints for the specified value using the binary search algorithm. 

	  
	  
static void sort(double[] a) 
          Sorts the specified array of doubles into ascending numerical order. 
static void sort(double[] a, int fromIndex, int toIndex) 
          Sorts the specified range of the specified array of doubles into ascending numerical order. 
static void sort(int[] a) 
          Sorts the specified array of ints into ascending numerical order. 
static void sort(int[] a, int fromIndex, int toIndex) 
          Sorts the specified range of the specified array of ints into ascending numerical order. 

Of course, Java knows how to sort primitive type data, numbers like int and double. But what about objects that are instances of classes? The following functions are defined:

static void sort(Object[] a) 
          Sorts the specified array of objects into ascending order, according to the natural ordering of its elements. 
static void sort(Object[] a, int fromIndex, int toIndex) 
          Sorts the specified range of the specified array of objects into ascending order, according to the natural ordering of its elements. 

static  void 
 sort(T[] a, Comparator c) 
          Sorts the specified array of objects according to the order induced by the specified comparator. 
static  void 
 sort(T[] a, int fromIndex, int toIndex, Comparator c) 
          Sorts the specified range of the specified array of objects according to the order induced by the specified comparator. 
	  

What is "the natural ordering" of a set of objects? What is a "comparator"?

There has to be a mechanism for specifying how objects are compared. Essentially, a comparison function has to be supplied that will be used by the sort function every time it compares two elements (if you ever met the qsort function in C/C++ stdlib you will remember providing a function that compared elements of an array). The compare function used for the java.lang.String class obviously compares two strings so that they are sorted alphabetically. The compare function for the java.util.Date class will arrange dates with the oldest date first. These comparison functions are built in with the classes that Sun provides.

If you are defining a class, you must define the comparison function. There are two ways of doing this.

Method 1 : implement Comparable

The first mechanism is for the case where you are designing a new class. In this case, you arrange that your new class implements the "Comparable" interface - which really means that it defines a compareTo() function.

The following example code illustrates this approach. The main program (main() in public class OrderDemo) creates a number of "Boxes" and stores them in an array. The array is then sorted using the Array sort function.

A box is something with width, breadth, height, and color. Boxes are sorted first by volume, if their volumes are equal they are sorted by color name, if their color names are the same they are sorted by unique box identifier numbers. This box class was designed for use with a sorted collection so it implements the Comparable interface and defines its compareTo method. (To work properly, a compareTo function should only return "equal" if the "two" objects being compared are actually the same object.)

Here are the class definitions:

import java.util.*;

class Box implements Comparable<Box> {

	static int idCount = 0;

	int	id;
	int 	width;
	int	length;
	int 	height;
	String	color;

	public Box(int w, int l, int h, String c) {
		width = w;
		length = l;
		height = h;
		color = c;
		id = ++idCount;
	}

	public long getVolume() { return width*length*height; }

	public int compareTo(Box other) {
		int val = (int) (other.getVolume() - getVolume());
		if(val != 0) return val;
		val = color.compareTo(other.color);
		if(val != 0) return val;
		return id - other.id;
	}

	public String toString() {
		StringBuffer sb = new StringBuffer();
		sb.append("Box ");
		sb.append(id);
		sb.append(", Volume ");
		sb.append(getVolume());
		sb.append(", Color ");
		sb.append(color);
		return sb.toString();
	}

}

public class OrderDemo {

	public static void main(String[] args) {
		Box[] theBoxes = new Box[10];
		theBoxes[0] = new Box(7, 8, 9, "red");
		theBoxes[1] = new Box(3, 13, 12, "yellow");
		theBoxes[2] = new Box(11, 14, 22, "blue");
		theBoxes[3] = new Box(14, 13, 33, "black");
		theBoxes[4] = new Box( 28, 1, 18, "green");
		theBoxes[5] = new Box(14, 13, 33, "black");

		
		// Invoke Arrays.sort
		// The boxes are sorted "by their natural ordering"
		// as defined by their compareTo function.
		
		Arrays.sort(theBoxes, 0, 6);
		for(int i= 0;i<6;i++)
			System.out.println(theBoxes[i]);

	}

}

A compareTo() function for a class that implement Comparable<T> must have the signature:

public int compareTo(<T> o)

where T is the type of the object, here Box

The compareTo code compares by volume (and if the volumes are equal, compares by color, then by box identifier).

Apart from comparing itself with another Box, a Box object can report its volume and turn itself into a String so that it can easily be printed. It is possible to build a String by concatenating shorter Strings; it is slightly more efficient to use the helper StringBuffer class to build a complex String. Here, the toString() method makes use of a StringBuffer while building its String representation. (When you call System.out.println(obj), the obj object executes its toString method to produce a String that the println function can print.)

Method 2: use a "comparator"

The other mechanism used with sorted collections is for cases where you have already implemented a class, and it doesn't "implement Comparable", or when you want to sort by some criteria that is different from the "natural ordering".

In this situation, you explicitly pass a comparison function as an extra argument when you call the sort function (just as you did when using qsort in C/C++). However, there is a problem. Java doesn't allow the definition of "free functions". The comparison function that you define must belong to a class. A little trickery is used to get around this.

The trick is to use a "Comparator". This is a fake class (it isn't a real class in the sense of defining a type of data object that owns resources and provides functions related to those resources). Such classes exist simply to provide a collection of functions (just one function here).

The compare function in a Comparator compares two objects (of the same kind). You define different comparators with different compare functions for comparing different kinds of data object.

The Boxes code has been reworked to use Comparator as well as a "natural ordering" defined by implementing Comparable. The "natural ordering" has been redefined to place the boxes in order of their identifier numbers. The Comparator class has been defined to provide a compare function that does the comparison based on volume, colour etc.

import java.util.*;

class BoxComparator implements Comparator<Box> {
	public int compare(Box b1, Box b2) {
		int val = (int) (b2.getVolume() - b1.getVolume());
		if(val != 0) return val;
		val = b1.getColor().compareTo(b2.getColor());
		if(val != 0) return val;
		return b1.getId() - b2.getId();
	}	}
}

class Box  implements Comparable<Box> {

	private static int idCount = 0;

	private int	id;
	private int 	width;
	private int	length;
	private int 	height;
	private String	color;

	public Box(int w, int l, int h, String c) {
		width = w;
		length = l;
		height = h;
		color = c;
		id = ++idCount;
	}

	public int compareTo(Box other)
	{
		return id - other.id;
	}
	
	public int getId() { return id; }

	public long getVolume() { return width*length*height; }

	public String getColor() { return color; }


	public String toString() {
		StringBuffer sb = new StringBuffer();
		sb.append("Box ");
		sb.append(id);
		sb.append(", Volume ");
		sb.append(getVolume());
		sb.append(", Color ");
		sb.append(color);
		return sb.toString();
	}

}

public class OrderDemo2 {

	public static void main(String[] args) {
		Box[] theBoxes = new Box[10];
		theBoxes[0] = new Box(7, 8, 9, "red");
		theBoxes[1] = new Box(3, 13, 12, "yellow");
		theBoxes[2] = new Box(11, 14, 22, "blue");
		theBoxes[3] = new Box(14, 13, 33, "black");
		theBoxes[4] = new Box( 28, 1, 18, "green");
		theBoxes[5] = new Box(14, 13, 33, "black");

		System.out.println("Boxes sorted by the volume/colour comparator");
		Arrays.sort(theBoxes, 0, 6, new BoxComparator());
		for(int i= 0;i<6;i++)
			System.out.println(theBoxes[i]);
		System.out.println("Boxes sorted by the 'natural ordering' i.e. ids");
		Arrays.sort(theBoxes,0,6);
		for(int i= 0;i<6;i++)
			System.out.println(theBoxes[i]);

	}

}

If you run this demo, you should get output like the following:

Boxes sorted by the volume/colour comparator
Box 4, Volume 6006, Color black
Box 6, Volume 6006, Color black
Box 3, Volume 3388, Color blue
Box 5, Volume 504, Color green
Box 1, Volume 504, Color red
Box 2, Volume 468, Color yellow
Boxes sorted by the 'natural ordering' i.e. ids
Box 1, Volume 504, Color red
Box 2, Volume 468, Color yellow
Box 3, Volume 3388, Color blue
Box 4, Volume 6006, Color black
Box 5, Volume 504, Color green
Box 6, Volume 6006, Color black

Sorted collections: TreeSet

Sometimes arrays of data elements are useful, but often the collection classes are more convenient. In order to create an array, you must either know or guess the number of data elements that you will have. If you guess wrongly, your program either dies with an "array subscript out of bounds" exception or you waste memory. Instances of collection classes "grow" to hold the data that you add. (You can find more details about the collection classes here.)

The most common collection class is java.util.Vector; this is basically a dynamic array that grows as you add elements. You can in effect use "array subscripting" to place elements at particular points in a Vector. Once you have filled your Vector with data elements, you can apply the Collections.sort() function to that Vector. If you use the simple sort() function, the elements are sorted in their "natural order"; you can use a version of sort() that takes a Comparator as an extra argument and sort your elements by whatever criteria you wish.

Clearly, sorting a LinkedList could be costly! (It would probably be N*N*lgN - you did some of this algorithm complexity stuff in CSCI103, you will do more in CSCI203.) In practice, the Collections.sort() function copies the data in a List into a temporary array; sorts the array using Arrays.sort() in N*lgN time; copies the sorted data back into the collection.

A TreeSet object sorts data as they are added. It is based on a standard data structure called a "red-black" tree (those who have taken an algorithms course will probably have encountered red-black trees).

In your program, you simply create an instance of class TreeSet (with Java 1.5 style type security you must specify the type of element that will be stored in the TreeSet). The TreeSet object that you get will then arrange elements in their "natural ordering" as they are added.

If you want to use some different ordering, you can create another TreeSet providing it with a Comparator. This TreeSet will then order data elements using the Comparator.

You can copy data from one collection to another, and get the data re-ordered as you wish. The following fragment of code illustrates the approach:

public static void main(String[] args) {
		TreeSet<Box> data = new TreeSet<Box>();
		data.add( new Box(7, 8, 9, "red"));
		data.add( new Box(3, 13, 12, "yellow"));
		data.add( new Box(11, 14, 22, "blue"));
		data.add( new Box(14, 13, 33, "black"));
		data.add( new Box( 28, 1, 18, "green"));
		data.add( new Box(14, 13, 33, "black"));

		System.out.println("The boxes in their natural order");
		for(Box b: data)
			System.out.println(b);
		
		TreeSet<Box> data2 = new TreeSet<Box>(new BoxComparator());
		data2.addAll(data);
		System.out.println("The boxes in their volume/colour order");
		for(Box b: data2)
			System.out.println(b);
		

	}