How to Make Forms on our WWW server

A program called form-exec is installed on our server to ease the writing of fill-out forms for WWW browsers that support them. Forms offer many conveniences for users accessing our server, but a form that is not implemented properly can be a security hole. See the documentation for NCSA Mosaic for more information about fill-out forms.

Form-exec description

The form-exec program uses the request sent by a form to run another program specified by the form writer. This program must be located within that user's account. Note that it is the user's responsibility to ensure that these programs do not compromise the system's security. However, because of the way form-exec is written, only the form writer's account can be affected.

Form Configuration

The configuration file for the form-exec command must exist in the server's path. It can be submitted through a linked directory, and provides the following information: program/script name, output mode, program arguments, environment variables, required variables, boolean variables, and text which can be mailed to the user's wiliki account. Note: If the form configuration file has the extension .cgi, form-exec will change its user to you and run the cgi program as you.

Elements used in the Configuration Form

The configuration file is a plain text file whose lines have one of the following formats:
# This is a comment
Lines beginning with the pound (#) sign signify comments, and the remainder of the line is ignored.
identifier(argument ...)
The identifier can be one of the following values: arg, bool, env, inp, mail, out, sep, prog, and req. For identifiers which take multiple arguments, the arguments are separated by whitespace or commas. Each identifier is described in its own section below, and a sample form configuration file is also shown.

Argument list
The arg identifier gives a list of variables which are passed as arguments to the program or script to be run. In addition to variable values, literal strings can also be included here. Literal strings are passed directly to the program. Certain special characters can be included by typing the appropriate notation used by printf. For example, \n stands for a newline. One exception to this is the double quotation mark. In order to include a double quotation mark, use the following sequence: \042. See the section on writing programs for how to access these values from the program. As with literal strings, environment variables (other than those created by the env identifier) can be included by preceding the variable name with the dollar sign ($).
Boolean Variables
The bool identifier can be used to declare certain variables as boolean. This is helpful when forms contain checkbox input elements. Normally, when a variable is left blank in a form, its value is simply null. Declaring a variable as type bool initializes its value to the string "off". This is useful when the variable is used as an argument and the arguments to the program should not be null. Boolean variables can have only one value, as opposed to regular variables, which can have several values, separated by the string specified with the sep identifier.
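As a minimal sketch of how a program might interpret a variable declared with bool, the C function below treats a missing value, an empty string, or the string "off" as unchecked. The name is_checked is hypothetical, and treating every other value as checked is an assumption, since the text above only documents the "off" default.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a bool variable passed by form-exec
 * looks checked, 0 otherwise. form-exec sets unchecked bool variables to
 * "off"; treating any other non-empty value as checked is an assumption. */
int is_checked(const char *value)
{
    if (value == NULL || value[0] == '\0')
        return 0;
    return strcmp(value, "off") != 0;
}
```

A program that receives the variable as its first argument could call is_checked(argv[1]) after confirming that argc is at least 2.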
Environment variable list
The env identifier is used to set environment variables with the same name as the variables. These variables can be accessed by shell scripts and C programs, as explained below. These variables can be null without causing problems in the program.
Standard Input list
The inp identifier lists the variables and literal strings which are concatenated to the standard input in the order specified in the list. This is useful when large fields of text exist within the form. However, care must be taken when sending many fields simultaneously, as there is no way to delimit the fields.
Mail String list
The mail identifier allows the form to send mail at Wiliki to the creator of the form. The programmer is responsible for all headers needed for this message, including the one that specifies the subject of the mail message. The sender of this mail message will automatically be set to webmaster@wiliki.eng.hawaii.edu. The behavior of the values being sent is the same as for the inp identifier explained above.
Output Method
The out identifier can contain only three types of values: plain, direct, or a location reference. A location reference can be either a local URL or a full URL (i.e. with http://). For example, out(plain) displays the output of the program as plain text, while out(direct) sends the output of the program directly. Note that the direct method requires a CGI header: the first line must be either "Content-type: text/html" or "Location: URL", followed by a single blank line. Note: If you write a program to print the output as HTML text, make sure your first line contains the title of the document or the special tag <HTML>.
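The header required by the direct method can be sketched in C as follows. This is only an illustration under stated assumptions: the function name write_html_reply and its parameters are hypothetical, and the page layout is arbitrary. Only the "Content-type: text/html" line and the blank line after it come from the description above.

```c
#include <stdio.h>

/* Hypothetical helper: writes a CGI header and a small HTML page to `out`.
 * Returns 0 on success, -1 if a write fails. */
int write_html_reply(FILE *out, const char *title, const char *body)
{
    /* Header line, then the single blank line the direct method requires. */
    if (fprintf(out, "Content-type: text/html\n\n") < 0)
        return -1;

    /* The page itself starts with the <HTML> tag. */
    if (fprintf(out, "<HTML>\n<title>%s</title>\n<h1>%s</h1>\n%s\n</HTML>\n",
                title, title, body) < 0)
        return -1;
    return 0;
}
```

A program run with out(direct) would typically pass stdout as the first argument.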
Program Location
The prog identifier contains the location of the program which is run from the person's account. The program is run as the owner of the configuration file. The directory ~/ should be specified in front of the path. If an absolute path is not specified, form-exec will assume the program is located relative to your home directory.
Required Variables
The req identifier contains the list of variables which cannot be left blank. If any of these variables is left blank, an error message is given to the user. The form should indicate that certain fields are required.
Multi-field Separator
The sep identifier contains a string which is used as a separator for those forms specifying many values for the same variable. The string can not contain a left parenthesis. If no separator is specified, a single space is assumed. In order to include special characters, such as parentheses, use backslashes with an octal number.
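When a variable carries several values joined by the separator, the receiving program has to split them again. The following C function is one possible sketch. The name split_values and its interface are hypothetical, and the caller is assumed to pass the same separator string that the configuration file declares (a single space if none is declared).

```c
#include <string.h>

/* Hypothetical helper: splits `value` in place on the separator string
 * `sep` and stores pointers to at most `max` pieces in `parts`.
 * Pieces beyond `max` are dropped. Returns the number of pieces stored. */
size_t split_values(char *value, const char *sep, char *parts[], size_t max)
{
    size_t count = 0;
    size_t sep_len = strlen(sep);
    char *next;

    if (sep_len == 0 || max == 0)
        return 0;

    while (count < max) {
        parts[count++] = value;
        next = strstr(value, sep);
        if (next == NULL)
            break;                  /* last piece reached */
        *next = '\0';               /* terminate the current piece */
        value = next + sep_len;     /* continue after the separator */
    }
    return count;
}
```

Because the buffer is modified in place, a value obtained from getenv or argv should be copied into a local buffer before it is split.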

Sample configuration file

# sample.form
bool(agree)
inp("Sender:" name "\nThought it was neat: " agree "\nComment:" comment "\n")
out(plain)
prog(/bin/cat)
req(name)
In this form, there is one boolean variable, agree. The standard input receives the literal strings shown, interleaved with the values of the variables name, agree, and comment. The name variable is required. The program being executed is /bin/cat, with the output being sent as plain text.

The following file is used as the form for the configuration file given above.

<title>Sample Form</title>
<h1>Sample Form</h1>

<form method=POST action="http://www.eng.hawaii.edu/cgi-bin/form-exec/Forms/sample.form">
<input type=submit value="Submit Form"><br>
<input type=reset value="Reset Form"><br>
<br>
Click this if you think it's neat.<input type=checkbox name="agree"><br>
Name: <input type=text name="name" size=30>(Required field)<br>
Comments: <input type=text name="comment" size=60,3><br>
</form>

Writing Programs for form-exec

The programs which are written for form-exec can be shell scripts or they can be C programs. Compile your programs on our main server, Wiliki, using the +DAportable option. The programs are executed on our web server host, but the +DAportable option compiles in a mode compatible with the server host. Arguments passed to the program using the arg identifier are stored into the argument variables. In a shell script, these values can be referenced by $1, $2, ... and so forth. For a C program, these values can be referenced by the variables argc and argv, as shown in the program segment below:
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;

    /* argv[0] is the program name; the arg values start at argv[1]. */
    for (i = 0; i < argc; i++)
        printf("Argv %d is %s\n", i, argv[i]);
    return 0;
}
Note that the first element in the argv array is the program name.

The standard input sent to the program by the inp identifier is accessible through the usual input commands, such as getchar and scanf. The size of the data is not sent as an argument, so the program must check for the end-of-file condition. The usual recommendation for the use of the inp identifier is for fields which can contain large amounts of text.
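Since no length is supplied, one way to collect the whole input is to keep reading until end-of-file and grow a buffer as needed. The function below is a sketch only. The name read_all and the initial buffer size are arbitrary choices, not part of form-exec.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from `in` until end-of-file into a
 * newly allocated, NUL-terminated buffer and stores the byte count in *len.
 * Returns NULL on allocation failure; the caller frees the result. */
char *read_all(FILE *in, size_t *len)
{
    size_t cap = 1024, used = 0, n;
    char *buf = malloc(cap);
    char *grown;

    if (buf == NULL)
        return NULL;

    /* fread returns 0 at end-of-file (or on error), which ends the loop. */
    while ((n = fread(buf + used, 1, cap - used - 1, in)) > 0) {
        used += n;
        if (cap - used <= 1) {          /* buffer full: double it */
            grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

A program would normally call read_all(stdin, &len) once at startup and then parse the returned text.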

For those variables which were stored as environment variables, the getenv function can be used to retrieve their values. Remember to include the standard library header stdlib.h at the beginning of your program, along with string.h if you copy the value with a string function. For example, to retrieve the environment variable pooky, the following code segment can be used:

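    char pooky[64] = "";
    const char *value;

    ...
    value = getenv("pooky");      /* NULL if the variable is not set */
    if (value != NULL) {
        strncpy(pooky, value, sizeof(pooky) - 1);
        pooky[sizeof(pooky) - 1] = '\0';
    }
    ...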
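When a program reads several environment variables, a small wrapper that substitutes a default for a missing or empty value can keep the code short. The sketch below is one possible form. The name env_or_default is hypothetical and is not provided by form-exec or the C library.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the value of environment variable `name`,
 * or `fallback` when the variable is unset or empty. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    if (value == NULL || value[0] == '\0')
        return fallback;
    return value;
}
```

For example, env_or_default("pooky", "none") yields "none" whenever the form left pooky blank.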
If your program is written so that it returns a URL instead of actual code, you can use the "direct" output method with the out identifier. Your program should merely print out the following two lines:
Location: URL

Here URL is the Uniform Resource Locator of the document you would like to point to. The second line must be blank; the server requires it.
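The two lines can be produced from a C program with a single formatted write. The function below is a minimal sketch. The name write_redirect is hypothetical, and the URL is supplied by the caller.

```c
#include <stdio.h>

/* Hypothetical helper: writes a CGI redirect to `out`.
 * Returns 0 on success, -1 if the write fails. */
int write_redirect(FILE *out, const char *url)
{
    /* "Location: URL" followed by the required blank line. */
    if (fprintf(out, "Location: %s\n\n", url) < 0)
        return -1;
    return 0;
}
```

A program run with out(direct) would call, for instance, write_redirect(stdout, "http://www.eng.hawaii.edu/") and then exit.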
Questions or comments can be sent to the address listed below:
webmaster@wiliki.eng.hawaii.edu
Minor change to server information on Thursday, May 16, 1996
Last updated on Friday, May 5, 1995
Copyright © 1995 University of Hawai`i, College of Engineering, Computer Facility
All rights reserved.