CSCI399
Autumn Session, 2009

Actually, it's the same assignment as 2008 because the 2008 class seemed to enjoy this one.

Assignment 1: Web basics and Apache

You should complete exercise 1 before attempting this assignment.


Aims

This assignment aims firstly to introduce you to the basics of creating a web of documents for which you compose the HTML and Javascript, then to provide some limited experience in the basics of web programming with CGI (using C/C++), and finally to have you perform some minor Apache administration tasks such as setting controls on directories.

The assignment should be completed in the laboratory using the Ubuntu (Linux) environment. The necessary software has been installed on the laboratory machines and the Apache server that you are to use has been configured as required.

Such software can be downloaded from the Internet and installed on your own Linux or Windows machine. Installation and configuration of such software is NOT part of the assessment. The installation and correct configuration of the software is likely to prove time consuming. You may use your own machine, but the extra work entailed should not be considered a part of the work done for this subject.


Objectives

The objectives are for you to:


Resources

Apache on the Ubuntu machines, and Oracle

The Ubuntu machines have a Zend/Oracle Apache installation. (The correct configuration of an Apache/PHP/Oracle-client system is time consuming! Zend corporation, the professional PHP company, has a downloadable pre-configured system.) CSCI399 will be using this installation for assignments 1..3 (the web basics assignment, the Perl assignment, and the PHP assignment). Each Ubuntu machine runs its own copy of Apache.

The Zend/Oracle Apache system will allow access to the CSCI Oracle server that runs on the "wraith" machine. CSCI is the main Oracle server for undergraduate CS assignments; all students enrolled in CSCI399 will have accounts on this Oracle server. Your database username, and initial database password, are usually the same as your Unix login identifier. You will receive email from "yuan" (the database administrator) confirming the existence of your Oracle account and password. Standard Oracle client software has also been installed on the Ubuntu machines, so you will be able to use the sqlplus program for basic database administration tasks.

The Apache configuration was created for ease of use in this assignment and does not represent a good example of a secure system! It is configured to allow the use of "user directories" - i.e. each of you can have a public_html directory in your home directory. Your files should be placed in your public_html directory or subdirectories (see later for some specific directory arrangements).

The Apache runs as user "nobody". This creates a problem in relation to the CGI program. Your CGI program will be launched by the web server and so will also run as user "nobody". Unfortunately, your CGI program will need to update a data file (using a database from C++ is simply too painful). Your data file, and the directory it is in, will both need to permit "write" access by others. Such "write" access is dangerous (if someone in the class doesn't like you, he/she can scribble all over your files). You should be careful to enable write access to the appropriate file and directory only for the time that you are actually testing your CGI program; remove the "write" permissions as soon as you have run your tests.

HTML/Javascript/CGI

The following support materials are available:

  1. a brief overview of HTML basics, forms, Javascript etc;
  2. a very simplified example of CGI programming in C++

Those who have not had any practical experience with Web technologies should check out the WDVL (Web Developers) site. This has numerous tutorials:

It is not common practice to write client-side Javascript from scratch! Instead, you notice a web-site that has an interesting feature - view source - take inspiration from the Javascript code!

In the first part of the assignment (the web part) you are required to make use of Javascript for "enhancing the user experience". My support materials include some simple illustrations of "roll-over" and "pop-up menus". Provided you acknowledge the original source, you can use any other more interesting examples that you may have encountered on the Web.


Tasks

Your overall task is to write a report detailing how you completed the various subtasks listed below. You will be marked on your report. The report is to be submitted as a PDF file (the /share/cs-pub/399 directory contains a "driver" that you can add to your Windows machine that will let you "print" PDF files from Word or another word processor program; the OpenOffice word processor on the Ubuntu machines has a convert to PDF option.)

Web basics: "My Own Facebook"

For this part of the assignment, you are to create web pages containing HTML markup, content text etc, and where appropriate Javascript code. These pages are to be created using an ordinary text editor. One of the aims of this part is for you to become more familiar with the HTML markup language so that you can later write programs that generate correct HTML markup in dynamic pages.

Your web should be composed of material that you might later wish to post to your "Facebook" or "MySpace" web site - just the usual vanity press "about me" material. It should be constructed as a number (6..12) of relatively small interlinked pages. (It is probably worth skimming the book "Don't Make Me Think" for some general advice on web page design.) For this part of the assignment, you should save your HTML files in your public_html directory, with images in an "images" subdirectory. Javascript code will mostly be short and can be included in the HTML page; however, if you do import a large Javascript script then it might be better if you saved it in a "jscripts" subdirectory and used links in the HTML pages.

The material that you develop for this first part will be redeployed in the later stages when you experiment with aspects like Apache's access controls.

You need to use:

The above required elements are all typical of things that you will have in dynamic pages generated by server programs.

Features like "framesets" and "client pull" (attractor pages) are optional - use them if you feel that they enhance your site.

Your report on this part of the assignment should consist of:

  1. "Thumbnail" images of screenshots of your pages. (If you have many pages using similar layout but differing content, include only a representative example.)
  2. A diagram illustrating the primary structure of your web with its links.
  3. Your CSS stylesheet (try to be more ambitious than simply copying mine).
  4. A list of your pages giving details of the main HTML features exploited.
  5. Details of your use of Javascript. You do not need to include complete code listings, simply explain what the script did and reference the original source if it was based on something you found on the Internet.

HTML/Javascript/CGI-basics

The PBase photo site provided the "inspiration" for this CGI exercise. PBase is a site where people publish galleries of photos and receive comments on their work. Some galleries are "pretty" and get comments about their prettiness; others are more concerned with mechanics of lenses, lighting conditions etc. Of course, Facebook, MySpace and numerous other social sites have similar facilities for comments by your friends on aspects of your vanity pages.

You will create a static HTML page with:

  1. A header message, e.g. "my favorite photo of the week"
  2. An image
  3. A section using an IFRAME to include a separate file that contains visitor comments presented in a HTML table. This separate file gets updated by the CGI program when a viewer submits a comment.
    Look up the IFRAME HTML tag on the Internet. It allows you to reserve a portion of the current page for the "included" file. Scrollbars can be added automatically if the included content exceeds the space reserved.
  4. A data entry form with two input fields and a submit button.
    The first input field is a text input, limited to ~30 characters, that is to contain the visitor's email.
    The second field is a textarea that is to hold the visitor's comment.
  5. Javascript code that is invoked "onSubmit" that verifies the validity of the inputs to the best extent that you can manage.
    The email should be checked to see that it is of the form word-characters@word-characters and dots.
    The comment should be checked to detect < signs, or the % equivalent, in case a hacker is trying to inject scripting code or other problematic data.
    The Javascript code should put up an alert if it detects problems with the input and should prevent submission of known bad inputs.
Example: [image not reproduced]

The form should "post" the data to a compiled C++ CGI program for action.
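
A CGI program invoked by a "post" receives the form data on standard input, with the byte count in the CONTENT_LENGTH environment variable; the data arrive as name=value pairs joined by '&', with '+' standing for a space and %XX for other bytes. The following is a minimal sketch of reading and decoding such a body without the CGIHelper class. The function names (urlDecode, parseForm, readPostBody) and the 65536 byte cap are illustrative assumptions, not part of the supplied materials.

```cpp
#include <cctype>
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

// Decode one URL-encoded token: '+' becomes a space, %XX becomes that byte.
std::string urlDecode(const std::string& in) {
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(
                std::strtol(in.substr(i + 1, 2).c_str(), nullptr, 16));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}

// Split "a=1&b=2" into a map of decoded names and values.
std::map<std::string, std::string> parseForm(const std::string& body) {
    std::map<std::string, std::string> fields;
    std::string::size_type start = 0;
    while (start <= body.size()) {
        std::string::size_type end = body.find('&', start);
        if (end == std::string::npos) end = body.size();
        const std::string pair = body.substr(start, end - start);
        if (!pair.empty()) {
            const std::string::size_type eq = pair.find('=');
            const std::string name = urlDecode(pair.substr(0, eq));
            fields[name] = (eq == std::string::npos)
                               ? std::string()
                               : urlDecode(pair.substr(eq + 1));
        }
        start = end + 1;
    }
    return fields;
}

// Read exactly CONTENT_LENGTH bytes of posted data from the given stream.
std::string readPostBody(std::istream& in) {
    const char* len = std::getenv("CONTENT_LENGTH");
    const long n = len ? std::strtol(len, nullptr, 10) : 0;
    if (n <= 0 || n > 65536) return "";  // assumed upper bound on body size
    std::string body(static_cast<std::string::size_type>(n), '\0');
    in.read(&body[0], n);
    body.resize(static_cast<std::string::size_type>(in.gcount()));
    return body;
}
```

In a CGI main function this would typically be used as parseForm(readPostBody(std::cin)), after which the two inputs can be looked up by whatever names the form gave them.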

Your program should

  1. Pick up and validate the two inputs.
    You must always verify inputs even if you think that they have been checked by client side Javascript, or are composed entirely from selections of predefined choices that you presented in HTML selection-option lists. Hackers will attack. Their inputs are crafted by hand to try to disrupt your programs.
    Your checks should again verify that the "email" address looks vaguely plausible and that the comment does not contain any undesired characters.
    If the submitted data fail your validation tests, then your program should return a terse error page explaining the problem.
  2. Your CGI program will then update the "comments" file.
    The comments file should start with just the HTML defining a table with columns for email, time-stamp, and comment.
    Each user comment should be inserted as an extra row with these three elements.
    Your program will probably work by:
    1. Copying the contents of the existing file line-by-line to a new file until it encounters the closing </table> tag.
    2. Outputting the new row with its data.
    3. Outputting the </table> line.
    4. Closing input and output files.
    5. Replacing the original comments file by the new file that had been created in temporary file space.
  3. The CGI program returns a simple "thank you for your comment" page.
  4. The next time you visit (forced refresh) the original HTML page you should see the latest comment in the comments section.
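
The server-side checks in step 1 can be sketched as below. This is one loose interpretation of "vaguely plausible", assuming the inputs have already been URL-decoded; the function names and the 1000 character comment limit are illustrative assumptions, and the list of rejected characters can be extended.

```cpp
#include <cctype>
#include <string>

// True for letters, digits and underscore (a regex-style "word character").
bool isWordChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Loose plausibility check: word characters, one '@', then word characters
// and dots, with the whole address at most 30 characters long.
bool plausibleEmail(const std::string& email) {
    if (email.empty() || email.size() > 30) return false;
    const std::string::size_type at = email.find('@');
    if (at == std::string::npos || at == 0 || at + 1 >= email.size())
        return false;
    if (!isWordChar(email[at + 1])) return false;  // domain must not start with '.'
    for (std::string::size_type i = 0; i < at; ++i)
        if (!isWordChar(email[i])) return false;
    for (std::string::size_type i = at + 1; i < email.size(); ++i)
        if (!isWordChar(email[i]) && email[i] != '.') return false;
    return true;
}

// Returns an empty string when the comment is acceptable, otherwise a
// short reason that could be shown on the error page.
std::string commentProblem(const std::string& comment) {
    if (comment.empty()) return "comment is empty";
    if (comment.size() > 1000) return "comment is too long";  // assumed limit
    if (comment.find_first_of("<%") != std::string::npos)
        return "comment contains '<' or '%'";
    return "";
}
```

Rejecting every '%' is deliberately blunt: it also refuses harmless text such as "100% sharp", which seems an acceptable price for an exercise of this size.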

You will probably find it simplest to base your CGI program on my examples (like the Echo.cgi program provided in the supplementary materials).

Your C++ source code should be stored in some directory other than your public_html directory. You compile your code, along with the CGIHelper class if you choose to use it, and rename the a.out executable as "MyCGI.cgi" (the file extension must be .cgi). You move this .cgi program to your web area.

The Apache system has been configured to permit .cgi executables in "user directories" and their subdirectories. You can store the .cgi program in your base public_html directory, but it might be better to use a subdirectory (by convention named cgi-bin) to keep different files in separate areas. Your form would reference your cgi program as action="./cgi-bin/MyCGI.cgi".

The file with the table of comments ("Comments.html") should be created initially with simply the HTML markup for the table and its column headers. This file should be placed in a "data" subdirectory of your public_html directory and should be created with global write permission; the "data" directory itself must also allow for global writing. It is this file and this directory that represent your greatest security risks; keep the file permissions as normal except when you are testing your CGI code.

Your C/C++ code doesn't have to be careful about freeing memory. Each time it runs it only handles a single set of input data before terminating, so any memory leaks are inconsequential. A CGI setup that uses files can encounter problems with multiple concurrent updates. The approach suggested could lose all records of some updates; but it isn't worth dealing with the vagaries of the use of lock-files etc so the potential problem should be ignored.

Your report for this part should comprise:

  1. Screen shots of your page with its image, comments, and input form - both before and after the addition of an extra comment.
  2. A source listing of your HTML & Javascript.
  3. A source listing of your C++ CGI program (do not include the code of CGIHelper classes)

Apache administration

Traditionally, CSCI399 involved a lengthy but easy exercise on Apache configuration. But things are just becoming too simple. On Windows - install Apache means double click on a file icon. Most Linux systems these days come with Apache pre-installed and their auto-updaters will get the latest version for you. Then you get organizations like Zend who supply little helper scripts that step you through the install process anyway. It just isn't fun any more.

The Apache exercise for this year is consequently reduced to handling minor aspects such as content negotiation and access control.

The configuration file for the Apache that you are using specifies default options for your public_html directory:

MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec ExecCGI

and has the language options defined for European languages, Japanese, Korean, and Chinese (traditional and simplified).

But, since the directory specification also defines:

AllowOverride All

you can create .htaccess files in your public_html directory, or in subdirectories, to change options. (The Apache manual, which has details of option settings for things like "Server Side Includes" and "Authentication", should be accessible on your local Apache at "/manual/index.html".)

Complete the following Apache/HTTP exercises.

  1. MultiViews

    Prepare a second language version of your web site's primary Welcome page and reorganize your site so that the version of the page that is returned is appropriate to the browser's language preference settings.

  2. Simple Server Side Includes
    Modify your primary Welcome pages so that they use a common counter that shows how often they have been visited. The counter is to be implemented using "Server Side Includes". You will need to create an .htaccess file that adds "Includes" to the Options (overriding the IncludesNoExec that exists by default). The file used to store the count value will need to be writeable by all (actually just "nobody" - the www user, but you cannot set that).
  3. Authentication
    Some of the visitors to your site have been leaving unkind comments on your image-comments page.
    You have decided to make your site accessible only to your friends.
    Use the "htpasswd" command (at /usr/local/Zend/apache2/bin) to create a password file (this shouldn't be in your public_html directory). (If you give your password file a name starting with ".ht", the settings on the Apache server will prevent attempts to download the file via WWW). Add names and passwords for a few friends.
    Edit the .htaccess file for your public_html directory, inserting the extra controls that limit access to valid users. The AuthUserFile field in these controls should contain the full path name of your passwords file.
    Verify that your site can now only be accessed by valid-users.
    Modify the form part of your image-comments page. The email address input field can be removed.
    Modify your CGI program so that it gets the identity of the person making a comment from the environment variable that identifies the person who has logged into your site.
    Verify that comments now appear under the names of your friends.
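
For the counter in exercise 2, one possible arrangement is a small helper program whose printed output is inserted into the page by an SSI exec element (which is why "Includes" rather than IncludesNoExec is needed). The sketch below shows only the counting step; bumpCounter and the idea of keeping the count as a single number in a text file are assumptions, and other SSI-based designs are equally acceptable.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Read the stored count (treated as 0 if the file is missing or does not
// hold a sensible number), add one, write it back, and return the new value.
long bumpCounter(const std::string& path) {
    long count = 0;
    {
        std::ifstream in(path.c_str());
        if (!(in >> count) || count < 0) count = 0;
    }  // input stream closed before the file is rewritten
    ++count;
    std::ofstream out(path.c_str(), std::ios::trunc);
    out << count << '\n';
    return count;
}
```

A main function would simply print the returned value to standard output, passing the full path of the world-writeable count file. Like the comments file, this file can lose updates if two requests arrive at the same moment.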

For this part of your assignment you will need to submit:

  1. Scaled down screen shots that show your different language versions of your main welcome page and the operation of the page view counter.
  2. The final .htaccess file allowing server side includes and defining authentication mechanisms.
  3. Your SSI script that handles the counter.
  4. Evidence (screen shots, also segments from the Apache log - access and error logs should be in /usr/local/Zend/apache2/logs) that authentication mechanisms are applied to your site.
  5. The modified fragment of code from your CGI program that associates comments with the logged in user.
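
As a rough guide to the size of the fragment expected for item 5: under the CGI convention the server passes the authenticated user name to the program in the REMOTE_USER environment variable. A minimal sketch, in which the function name and the "anonymous" fallback are illustrative assumptions:

```cpp
#include <cstdlib>
#include <string>

// Return the user name supplied by the web server for an authenticated
// request, or the fallback text when the variable is absent or empty.
std::string loggedInUser(const std::string& fallback = "anonymous") {
    const char* user = std::getenv("REMOTE_USER");
    return (user && *user) ? std::string(user) : fallback;
}
```

The returned name would take the place of the email address when the new table row is written.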

Submission

The due date for submission will be announced in lectures; it is provisionally set for April 3rd.

You are assessed on a report summarizing your work. This report should be prepared using a word processor and converted to PDF format prior to submission. You have to include several screenshots; screenshots can be obtained on the Ubuntu systems via the "Take Screenshot" accessory in the Applications menu.

turnin

For CSCI399, assignments are submitted electronically via the turnin system. For this assignment you submit your report via the command:

turnin -c csci399 -a 1 A1.pdf

Late submissions would be submitted as:

turnin -c csci399 -a 1late A1.pdf

Remember, turnin only works when you are logged in to the main banshee undergraduate server machine. Transfer your report file to banshee using an ftp client. Then log in via ssh and run the turnin program.

Mark distribution