CSCI213/ITCS907/MCS9213
Autumn Session, 2007
Assignment 1: Beginning Java and learning to use the NetBeans development environment
You should complete "Laboratory Exercise 1" before starting this assignment.
This assignment aims to establish a basic familiarity with the NetBeans development environment and the Java class documentation. It also introduces the use of some simple Java library (package) classes, including I/O reader/printer classes, strings, and collections.
You will soon learn that the biggest difference from C++ is that you write very little of your Java programs and you almost never start from scratch. Instead, you construct your programs mainly through the use of library classes. Typically, you start by taking an existing Java program that is similar to what you want, you strip the bits that you don't need and then build up your new program. You gradually acquire a collection of reusable code fragments for things like simple standard graphical user interfaces.
The assignment involves several versions of essentially the same program; these versions introduce increasingly object-based, Java-oriented styles of solution to a problem. All versions of the program count toward your assignment mark.
On completion of this assignment you should be able to:
The programs for this assignment involve processing a collection of simple data records. The data are read from a file, manipulated in various ways, sorted, and listed. Several programs have to be implemented; they are increasingly sophisticated in the way in which they use Java to manipulate the data.
The actual data records supposedly describe medical data for patients. The data records (lines in the text data file) contain information like the patient's name and initials, insurance number, and results from a number of medical tests (blood-pressure, cholesterol-level, etc, etc). The programs produce reports such as listings of patients with the highest risks (obesity, high blood-pressure, etc).
Four versions of the program are implemented. The versions start with procedural code manipulating simple data records and move toward more object-based styles that make increasing use of features of the Java language and its libraries.
The assignment involves the following parts:
Vector, ArrayList, or
LinkedList).
Two datafiles are provided.
The main datafile should be used for
most of the test runs of your program. The second,
smaller datafile is used in the fourth
part of the assignment, where one of the operations involves
merging two sets of data.
Each line in these files contains the following data elements:
308970906 Ebbers WE F 61.1 41 5.7 145 74 3.7 - 46 14.7
102153708 Edmonds SAV F 56.7 32 7.1 130 92 6.5 - 49 12.5
108190991 Edash BJ F 61.3 30 4 147 83 7.3 - 41 21.6
218409054 Ederm LO M 56.9 35 5.8 135 79 5.2 3.2 - -
(W.E. Ebbers, a female patient, has insurance number 308970906. She is just over 61 years old. She is significantly obese, having a body mass index of 41. Surprisingly, her sugar and cholesterol levels are both quite good. The systolic blood pressure is slightly elevated. Apart from her obesity, she looks in reasonable condition for her age. Two of the three blood hormone levels were measured, with results as shown.)
There are several hundred such records in the file.
The following code fragment illustrates how such data may be read and processed. It produces an output listing showing each patient's insurance number, name, initials, gender, and age. Warning tags are appended to those patient records where there are clearly problems with obesity, blood-pressure (hypertension), blood-sugar (hyperglycemia), or blood-cholesterol (hypercholesterolemia). (This code fragment is the starting point from which you develop your solutions to the assignment. As noted earlier, you very rarely start a Java program from scratch; you almost always modify existing code.)
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class DemoCode {
    /**
     * Demonstration program reads the data file, splitting
     * input lines into component fields, printing patient ids, names, initials
     * and flagging those most at risk.
     * @param args the command line arguments, args[0] is name of input file
     */
    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("Need name of data file");
            System.exit(1);
        }
        BufferedReader input = null;
        try {
            input = new BufferedReader(new FileReader(args[0]));
        }
        catch (FileNotFoundException fnfe) {
            System.out.println("Didn't find the file");
            System.exit(1);
        }
        /*
         * Read input file line by line until get either empty line
         * or null (end-of-file indicator); split line into components
         * at whitespace, print id, name, initials, gender and age
         * Flag "at risk" persons - (check for any one of following conditions)
         *    body mass index > 35,
         *    systolic blood pressure > 175
         *    fasting blood sugar > 9
         *    total cholesterol > 8
         */
        System.out.println("Insurance #\tName\t\tInitials\t(M/F)\tAge\tAlerts");
        for (;;) {
            String line = null;
            try {
                line = input.readLine();
            }
            catch (IOException readfail) {
                System.out.println("Read failed on input file");
                System.exit(1);
            }
            if (line == null) break;
            if (line.equals("")) break;
            String[] items = line.split("\\s+");
            System.out.print(items[0] + "\t" + items[1] + "\t\t" + items[2] + "\t\t" + items[3] + "\t" + items[4] + "\t");
            // Field order: 5 = body mass index, 6 = blood sugar,
            // 7 = systolic pressure, 9 = total cholesterol
            String bmiStr = items[5];
            String systolicStr = items[7];
            String sugarStr = items[6];
            String cholesterolStr = items[9];
            try {
                int bmi = Integer.parseInt(bmiStr);
                int systolic = Integer.parseInt(systolicStr);
                double sugar = Double.parseDouble(sugarStr);
                double cholesterol = Double.parseDouble(cholesterolStr);
                if (bmi > 35) System.out.print("Obese, ");
                if (systolic > 175) System.out.print("Hypertension, ");
                // High blood sugar is hyperglycemia (not hypoglycemia)
                if (sugar > 9.0) System.out.print("Hyperglycemia, ");
                if (cholesterol > 8.0) System.out.print("Hypercholesterolemia");
            }
            catch (NumberFormatException nfe) {
                System.out.println("Invalid numerical data for patient " + items[1]);
            }
            System.out.println();
        }
    }
}
You should be able to cut and paste the code from this page into a Demo project that you create in NetBeans.
You should get the code to run in NetBeans, configuring your project so that its Properties/Run settings identifies the data file from which input is read.
The following is a fragment from the output that you should get if you run this program:
109642375 Denby CR M 57.9 Obese, Hypertension, Hypercholesterolemia
502263058 Eagham GH M 55.2
128529487 Easlee JA F 52.6
308970906 Ebbers WE F 61.1 Obese,
The main program creates a FileReader object that can be used
to extract character data from a file. The FileReader object
is wrapped inside a BufferedReader
object ("input"). A BufferedReader is more useful in that it allows the program
to read input data line by line.
The program then has a loop to read all the lines in
the file; the readLine operation of a BufferedReader returns the next line.
If the physical end of file is encountered, readLine returns null.
(The program also checks for an empty line, ""; there may be some blank lines at the
end of the data before the physical end of file.)
The loop terminates when all input data have been processed.
The data in a line of the file is obtained as a java.lang.String object.
String objects cannot be changed after creation. The String class defines
methods for obtaining substrings, finding characters, and for splitting up a line.
Here, the split function is used. The somewhat strange argument
for split "\\s+" is a "regular expression" that means
"a sequence of one or more white space characters". So the input line should
be split into substrings at the white space gaps.
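As a small illustration of the split call described above, the following sketch splits one record from the sample data. The class name SplitDemo and the helper splitRecord are hypothetical, and the call to trim() is an added precaution rather than part of the demonstration code: without it, a line that starts with white space would yield an empty first element.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits one data line into its fields at runs of white space.
    // trim() is an assumption added here: it stops leading white space
    // from producing an empty first element.
    static String[] splitRecord(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Mixed spaces and a tab, to show that any run of white space separates fields.
        String line = "308970906 Ebbers  WE\tF 61.1 41 5.7 145 74 3.7 - 46 14.7";
        String[] items = splitRecord(line);
        System.out.println(items.length + " fields: " + Arrays.toString(items));
    }
}
```

Each element of the resulting array is still a String; a field such as "-" (a test that was not carried out) comes through unchanged.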
In this demo code, the strings containing insurance number, name, initials, etc are simply listed. Strings that encode numeric values, like the blood sugar level, are converted into actual numbers for further processing.
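The conversion step can be sketched as follows. The demonstration code only converts fields that are always present; the hormone fields may contain "-" when a test was not done. Representing such a value as Double.NaN is purely an assumption made for this sketch (the assignment does not prescribe a representation), and the names ParseDemo and parseMeasurement are hypothetical.

```java
class ParseDemo {
    // Converts a numeric field to a double.
    // Assumption for this sketch: "-" (test not carried out) is mapped to NaN.
    static double parseMeasurement(String field) {
        if (field.equals("-")) {
            return Double.NaN;
        }
        // Throws NumberFormatException if the text is not a valid number.
        return Double.parseDouble(field);
    }

    public static void main(String[] args) {
        int bmi = Integer.parseInt("41");           // whole-number field
        double measured = parseMeasurement("14.7"); // measured value
        double missing = parseMeasurement("-");     // unmeasured value
        System.out.println(bmi + " " + measured + " " + Double.isNaN(missing));
    }
}
```

Any other invalid text still raises NumberFormatException, which the caller can catch in the same way as the demonstration code does.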
The demo code consists of a single main function. Your
programs will need to use private auxiliary functions.
Listings of each program are required in your final report so keep the different versions in separate projects. The code that you write to complete one part of the assignment represents a starting point for the code for the next part. NetBeans allows you to have several projects open simultaneously and provides a mechanism for copying a class definition file from one project to another. (There is also a mechanism for sharing class files between projects. Don't use this. It would result in any changes you make in later stages corrupting the code for earlier projects.)
Procedural code using basic Java classes
Write a program, derived from the demonstration code shown above, that reads the data from the file, storing each patient's insurance-number, name, initials, gender, and age in separate arrays (long[], String[], String[], String[], double[]). (The other data elements in each record are ignored in this part of the assignment.) When all data have been read, the contents of the arrays are to be sorted in parallel, using an insertion sort algorithm, and listed in order of increasing insurance-number. (Your insertion-sort code should check the insurance-numbers of compared records to determine whether records should be swapped; if the records must be swapped, you must make appropriate changes in each of the data arrays.) Output from your program should appear something like:
...
107438863 TP Ash F 60.3
107482636 IK Holly M 78.3
107576725 P Hartly F 76.9
107761288 B Lim M 53.1
107864861 TT Bailes M 69.9
...
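The idea of sorting several arrays "in parallel" can be sketched as below. This is a cut-down illustration, not a model solution: it uses only three of the five arrays, and the names ParallelSortDemo, swap, and insertionSort are hypothetical. The point is that the comparison looks only at the insurance numbers, while every swap is applied to all arrays so that each record's fields stay at the same index.

```java
class ParallelSortDemo {
    // Swaps positions i and j in every parallel array so records stay aligned.
    static void swap(long[] ids, String[] names, double[] ages, int i, int j) {
        long id = ids[i];       ids[i] = ids[j];     ids[j] = id;
        String name = names[i]; names[i] = names[j]; names[j] = name;
        double age = ages[i];   ages[i] = ages[j];   ages[j] = age;
    }

    // Insertion sort of the first count records by increasing insurance number.
    static void insertionSort(long[] ids, String[] names, double[] ages, int count) {
        for (int i = 1; i < count; i++) {
            // Move record i leftwards while the record before it has a larger id.
            for (int j = i; j > 0 && ids[j - 1] > ids[j]; j--) {
                swap(ids, names, ages, j - 1, j);
            }
        }
    }

    public static void main(String[] args) {
        long[] ids = {107576725L, 107438863L, 107761288L, 107482636L};
        String[] names = {"Hartly", "Ash", "Lim", "Holly"};
        double[] ages = {76.9, 60.3, 53.1, 78.3};
        insertionSort(ids, names, ages, ids.length);
        for (int i = 0; i < ids.length; i++) {
            System.out.println(ids[i] + "\t" + names[i] + "\t" + ages[i]);
        }
    }
}
```

The count argument is there because fixed-size arrays are usually only partly filled after reading the file.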
Your program will:
Your program will have a single public class that contains the declarations of all static data members and static functions.
(When editing this class, you will begin to see the use of the Navigator pane in the NetBeans window. It will show a listing of the class members with little icons that distinguish public from private members, and static from instance members - in this case all members are static).
Implement a new version of the program that has a main class (that has the main() function, input function, sort function, and report function) and a simple "Patient record" class.
Your "Patient record" class will:
- long data member for the insurance number
- String data members for name, initials, and gender
- int data members for body mass index, and the two blood pressure readings
- double data members for each of the other data fields used to characterize a patient (results of blood tests etc).
- String toString() function that returns a string with patient details in the form number-initials-name-gender-age
All data and function members are to be public. At this stage,
a "Patient record" object is really being used as a simple "struct" just to
hold data needed by other parts of the program. Normally, data members are
private.
Your main class is similar to that implemented in part1. The main difference is that it owns a single array of "Patient record" objects instead of arrays for separate String and double variables. This should result in substantial simplification of your insertion sort code.
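A minimal sketch of how a "struct"-style record class simplifies the sort is given below. It is deliberately incomplete: only a few of the required data members are shown, the field names and the class name RecordSortDemo are hypothetical, and the space-separated toString format is an assumption based on the sample output from the first part.

```java
class RecordSortDemo {
    // Insertion sort on a single array: a whole record moves in one assignment,
    // so no per-field copying is needed.
    static void insertionSort(PatientRecord[] records, int count) {
        for (int i = 1; i < count; i++) {
            PatientRecord current = records[i];
            int j = i - 1;
            while (j >= 0 && records[j].insuranceNumber > current.insuranceNumber) {
                records[j + 1] = records[j];
                j--;
            }
            records[j + 1] = current;
        }
    }

    public static void main(String[] args) {
        PatientRecord[] records = {
            new PatientRecord(308970906L, "Ebbers", "WE", "F", 61.1, 41),
            new PatientRecord(102153708L, "Edmonds", "SAV", "F", 56.7, 32),
            new PatientRecord(108190991L, "Edash", "BJ", "F", 61.3, 30)
        };
        insertionSort(records, records.length);
        for (PatientRecord record : records) {
            System.out.println(record); // println calls toString()
        }
    }
}

class PatientRecord {
    // Public data members, as this part asks for; only a subset is shown.
    public long insuranceNumber;
    public String name;
    public String initials;
    public String gender;
    public double age;
    public int bmi;

    public PatientRecord(long insuranceNumber, String name, String initials,
                         String gender, double age, int bmi) {
        this.insuranceNumber = insuranceNumber;
        this.name = name;
        this.initials = initials;
        this.gender = gender;
        this.age = age;
        this.bmi = bmi;
    }

    // Order of fields: number, initials, name, gender, age.
    public String toString() {
        return insuranceNumber + " " + initials + " " + name + " " + gender + " " + age;
    }
}
```

Compared with the parallel-array version, the sort no longer needs to know how many fields a record has.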
Defining a Comparable class and a Comparator for using Java sorting, and using a collection class.
A new version of the program should be created as a separate project. This version will have three classes. A main class with the driver code; a revised version of your "Patient record" class, and a Comparator class.
Your "patient record" class will now implement Comparable<PatientRecord>. The definition of the compareTo function will define the natural ordering of PatientRecords. They are to be ordered in increasing order by insurance number.
You will also define a class that implements Comparator<PatientRecord>.
This will define a compare function that will arrange that
PatientRecords are ordered so that the oldest patients are
listed first (if several patients are the same age they are to be
ordered in increasing order by insurance number).
You will remove your insertion sort code and eliminate the
use of fixed size arrays. Instead, your main
program will use a collection class (one of those
implementing the List interface)
and make use of the Collections.sort functions.
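The two orderings described above might be sketched as follows. The record class is cut down to the fields needed for ordering, and the names ComparatorDemo and OldestFirstComparator are hypothetical; ArrayList is just one of the permitted List implementations.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ComparatorDemo {
    public static void main(String[] args) {
        List<PatientRecord> patients = new ArrayList<PatientRecord>();
        patients.add(new PatientRecord(590736520L, "LT", "Prince", 79.9));
        patients.add(new PatientRecord(333829360L, "J", "Zhu", 83.0));
        patients.add(new PatientRecord(863802608L, "K", "Iverson", 86.7));
        patients.add(new PatientRecord(447864861L, "T", "Cao", 79.9));

        Collections.sort(patients); // natural order, via compareTo
        System.out.println("By insurance number: " + patients);

        Collections.sort(patients, new OldestFirstComparator()); // explicit ordering
        System.out.println("Oldest first: " + patients);
    }
}

// Cut-down record: only the fields needed to show the two orderings.
class PatientRecord implements Comparable<PatientRecord> {
    long insuranceNumber;
    String initials;
    String name;
    double age;

    PatientRecord(long insuranceNumber, String initials, String name, double age) {
        this.insuranceNumber = insuranceNumber;
        this.initials = initials;
        this.name = name;
        this.age = age;
    }

    // Natural ordering: increasing insurance number.
    public int compareTo(PatientRecord other) {
        return Long.compare(insuranceNumber, other.insuranceNumber);
    }

    public String toString() {
        return insuranceNumber + " " + initials + " " + name + ", " + age;
    }
}

class OldestFirstComparator implements Comparator<PatientRecord> {
    // Older patients first; equal ages fall back to the natural ordering.
    public int compare(PatientRecord a, PatientRecord b) {
        int byAge = Double.compare(b.age, a.age); // reversed operands: descending age
        if (byAge != 0) {
            return byAge;
        }
        return a.compareTo(b);
    }
}
```

Comparing with Long.compare and Double.compare, rather than subtracting and casting, avoids overflow and rounding mistakes in the returned sign.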
Your program is now to generate two report listings.
Once all the data records have been read and stored in the "list", the main
program is to use the Collections.sort() function to
sort them into their natural order and then call the first
report generating function.
The first report function prints details of those patients where an alert is needed (blood pressure too high etc - as defined in the demonstration code given above). A fragment from this report is as follows:
...
108267661 BG Ramses M 64.6 Obese,
108278713 JP MacMann F 79.2 Obese,
108429007 C Ng F 55.3 Hypertension,
109055281 EP Foz F 76.1 Obese,
109219564 H MacTavish M 65.4 Hypercholesterolemia
109494855 P Jagdish F 73.1 Hypercholesterolemia
...
After invoking the first reporting function, the main program is to re-sort the list so that the patients will now be listed with the oldest patient first. A second reporting function, which takes an argument defining the number of records required, will list the first few records in the sorted collection. (The main program will use some fixed number, ~15, in its call to this second reporting function.) A fragment from this second report is as follows:
...
107338692 PI Geharn 88.1
100434625 X Xu 87.5
429990610 SD Lafna 87.1
118740547 CD Lott 86.3
419265801 FG Kay 84.3
...
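One way to obtain "the first few records" from an already sorted list is sketched below. The names TopRecordsDemo and firstRecords are hypothetical, and plain strings stand in for patient records to keep the example short. The bounds check is the point of interest: the requested number may exceed the number of records actually read.

```java
import java.util.Arrays;
import java.util.List;

class TopRecordsDemo {
    // Returns a view of the first howMany elements of an already sorted list,
    // or of the whole list if it holds fewer elements than requested.
    static <T> List<T> firstRecords(List<T> sorted, int howMany) {
        int limit = Math.max(0, Math.min(howMany, sorted.size()));
        return sorted.subList(0, limit);
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList(
            "107338692 PI Geharn 88.1",
            "100434625 X Xu 87.5",
            "429990610 SD Lafna 87.1");
        // Asking for 15 records from a list of 3 simply lists all 3.
        for (String line : firstRecords(sorted, 15)) {
            System.out.println(line);
        }
    }
}
```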
(You will use the PatientRecord class and its comparators in later assignments. Keep a copy of your code.)
Defining a "Patient Analyzer" class.
For the final part of the assignment you will define a "Patient Analyzer" class and another "Comparator".
Instances of the PatientRecordAnalyzer class will:
The main() driver function (in a separate driver class distinct from class PatientRecordAnalyzer) will:
A fragment from the output that should be obtained is:
...
Second collection, 5 oldest
863802608 K Iverson, 86.7
333829360 J Zhu, 83.0
447864861 T Cao, 79.9
590736520 LT Prince, 79.9
...
First collection, statistics
Statistical report for dataset data2
Number of patients 245
Minimum age 38.9
Maximum age 95.2
Average age 66.3
Average bmi 30
...
The due date for submission will be announced in lectures; the date will probably be around the end of week 4 of session (currently set at Friday March 23rd). For this assignment, and most of the other assignments in CSCI213, you will be writing a report summarizing your work and submitting this report for assessment. The report must be prepared in a word processor, converted to PDF, transferred to your Unix account by ftp, and then submitted.
For CSCI213, assignments are submitted
electronically via the turnin system.
The turnin command only works if you are logged in on
banshee (it has to access files that are only available on the
banshee machine). Check that you are logged in on banshee (the Unix command
uname -n will tell you what machine you are connected to); if
you are not logged in on banshee, you must use ssh to remotely
login on banshee before using turnin. (If you try to use turnin when
not on banshee, you will get some enigmatic reply - most typically
turnin will reply that it doesn't know anything about the assignment that
you are trying to submit.) The turnin system is not chatty. You won't
receive any happy little emails thanking you for your work. It just takes
your file and saves it. You can use the turnout command to check that
your submission was received. The turnin and turnout programs have Unix
man pages describing their use.
You have four projects whose code is to be included in the report. You will use a word processor to create a report document and convert this document to a PDF file prior to submission. Your report will have an index and a titled section for each part. In each part, you will have your code, copied and pasted from your development editor. (Some people try capturing screen shots of the code formatted in the integrated development environment. This isn't a good idea. It is laborious and somewhat error prone: it is easy to miss out part of the code, etc. However, you might find screenshots of things like the NetBeans "project view" or "navigator view" to be quite useful as outlines at the start of each section documenting a program or a class.) After pasting code into a word processor, you should fix up any formatting problems (inserts of newlines etc where a long source line was badly split). With each part, you should provide some evidence for correct operation: you can capture screen shots showing your application, and you can capture output (which should be edited down to a few lines) and paste these into your report.
The pdf report (named A1.pdf) should be submitted using the following command on banshee:
turnin -c csci213 -a 1 A1.pdf
Once again, just to emphasise,
Late submissions do not have to be requested. Late submissions will be allowed for three days after close of scheduled submission. Late submissions attract a mark penalty; this penalty may be waived if an appropriate request for special consideration (for medical or similar problem) is made via the university SOLS system before the close of the late submission time. No work can be submitted after the late submission time. If submitting late, use the command:
turnin -c csci213 -a 1late A1.pdf
(That is 1late, 1 - l - a - t - e, all as one word.)
There is an example PDF report in the /share/cs-pub/csci213 directory - it illustrates how a report can be formatted with code, commentary and evidence for correct operation of a program. (It is a rather long report; it was produced by a student for an assignment worth 25% of a 400-level subject).
The /share/cs-pub/csci213 directory has a couple of PDF "driver" programs. These can be installed on a Windows PC and will allow "printing to PDF" file from any application.
(It is possible to create PDF files on the Unix system.
You must first create a postscript file - use a "print/save to
file" option - that should give you postscript. There is
a ps2pdf program that converts that to PDF. This
approach is clumsy and may not work completely.)
The /share/cs-pub/213 directory has Windows/Linux copies of the JDK and the Java documentation for those wishing to work on their own machines.
One mark of the ten marks for the assignment is for overall presentation of the report, inclusion of evidence of operation, well formatted code etc. You can only get this mark if your report presents solutions to at least two parts of the overall assignment.
Another mark is for general Java style - you can only get this mark if you complete all four parts, but may then lose it if you are making incorrect use of basic Java constructs.
Part1 is worth 2 marks - if it works, and the data elements are defined appropriately, and overall structure is correct with the separate functions as were specified, and the implementation of the insertion sort was done correctly.
Part2 is worth only 1 mark - and to get that mark you must have code that works and that properly defines the "patient record" class with its data members, constructor, and toString method, and have modified the other code to use your patient record objects appropriately.
Part3 is worth 2 marks - if it works, and you have defined the comparator and "implements comparable" aspects appropriately and made correct use of Collections.sort functions.
Part4 has 3 marks - again, you only get the marks if it works and does all that it is supposed to do.