Introduction to CGI Programming

CGI stands for the Common Gateway Interface. Through this interface, a web server passes the input a user gives in a web browser to Python scripts and sends the output of those scripts back to the browser. Building a web interface is similar to building a graphical user interface.

Python Scripts in Browsers

The combination of an operating system, a web server, a database, and a scripting language is often referred to as the LAMP system.

LAMP stands for

  • L is Linux, the operating system;
  • A is Apache, the web server;
  • M is MySQL, the database;
  • P is Python, the scripting language.

Observe that all four are open source software. Apache makes a cute pun on a patchy web server, but its name is in honor of the Native American Apache tribe. Its web site is at http://www.apache.org.

Apache is platform independent. We will run our demonstrations on a Mac OS X computer.

  • Apache is already installed on Mac OS X; launch Safari with http://localhost/ to verify.
  • To enable web sharing, select Sharing from the System Preferences.
  • Instead of public_html, the Sites directory is where Mac users store their web pages.
  • Instead of /var/www/cgi-bin, CGI scripts are in /Library/WebServer/CGI-Executables. CGI (Common Gateway Interface) is a set of protocols through which applications interact with web servers.

By using localhost, we can keep working offline. Note that Apache also runs under Windows.

In older Mac OS X versions, the above list was all one needed to know, but more needs to be done in recent versions of the operating system and with later versions of Apache. To check the version of Apache, we can type the following at the command prompt in a terminal window:

$ httpd -v
Server version: Apache/2.4.23 (Unix)
Server built:   Aug  8 2016 16:31:34
$
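If the version number is needed inside a script, the output of httpd -v can be parsed. The sketch below shows one way to do this with a regular expression. The function name apache_version is our own choice, and the sample text is the output shown above.

```python
import re

def apache_version(text):
    """
    Returns the version in the output of httpd -v
    as a tuple of integers, or None if no version is found.
    The name apache_version is hypothetical.
    """
    match = re.search(r"Apache/(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())

# the output of httpd -v as listed in the text
SAMPLE = """Server version: Apache/2.4.23 (Unix)
Server built:   Aug  8 2016 16:31:34"""

print(apache_version(SAMPLE))
```

On a computer with Apache installed, the text could be obtained with subprocess.run(["httpd", "-v"], capture_output=True, text=True).stdout instead of the sample.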

To launch Apache, type sudo /usr/sbin/apachectl graceful in a Terminal window. Pointing the browser to localhost (or 127.0.0.1) shows It works! if Apache was configured correctly. To check the configuration of Apache, type apachectl configtest at the command prompt.

To serve web pages from user directories, in the file /etc/apache2/extra/httpd-userdir.conf uncomment the line that contains

Include /private/etc/apache2/users/*.conf

In /etc/apache2/httpd.conf uncomment the lines

LoadModule userdir_module libexec/apache2/mod_userdir.so
# User home directories
Include /private/etc/apache2/extra/httpd-userdir.conf

and also the line

LoadModule cgi_module libexec/apache2/mod_cgi.so

Finally, if <user> is your user name, then edit the file <user>.conf in the folder /etc/apache2/users and add the line

Require all granted

inside the Directory block, which then reads

<Directory "/Users/<user>/Sites/">
  Options Indexes MultiViews
  AllowOverride None
  Require all granted
</Directory>
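The lines to uncomment in /etc/apache2/httpd.conf are easy to overlook. As a sketch, and assuming the directives are spelled exactly as listed above, the function below reports which of them are still missing or commented out in a given configuration text. The names REQUIRED and missing_directives are our own.

```python
# the three directives from the text that must be active
REQUIRED = [
    "LoadModule userdir_module libexec/apache2/mod_userdir.so",
    "Include /private/etc/apache2/extra/httpd-userdir.conf",
    "LoadModule cgi_module libexec/apache2/mod_cgi.so",
]

def missing_directives(config_text, required=REQUIRED):
    """
    Returns the directives in required that do not occur
    as an active (uncommented) line in config_text.
    """
    active = set()
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            # collapse runs of spaces and tabs into one space
            active.add(" ".join(stripped.split()))
    return [item for item in required if item not in active]

# a made-up fragment in which the cgi module is still commented out
SAMPLE = """
LoadModule userdir_module libexec/apache2/mod_userdir.so
#LoadModule cgi_module libexec/apache2/mod_cgi.so
# User home directories
Include /private/etc/apache2/extra/httpd-userdir.conf
"""

print(missing_directives(SAMPLE))
```

To check the real file, the text could be read with open("/etc/apache2/httpd.conf").read(). The command apachectl configtest remains the authoritative check of the syntax.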

The working of our first CGI script is illustrated in Fig. 43.

_images/figwebworks.png

Fig. 43 Our first CGI script.

The Python script python_works.py is below.

#!/usr/bin/python
"""
Place this in /Library/WebServer/CGI-Executables.
"""
print("Content-Type: text/plain\n")
print("Python works")

The first line gives the location of the Python interpreter. The script must be executable for the web server to run it.
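What a CGI script writes to standard output has a fixed shape: a header, then an empty line, then the body. The web server relies on the empty line to tell where the header ends. The sketch below makes this structure explicit; the function name cgi_response is our own.

```python
def cgi_response(body, content_type="text/plain"):
    """
    Returns the text a CGI script writes to standard output:
    the header, an empty line, and then the body.
    """
    return "Content-Type: " + content_type + "\n\n" + body + "\n"

response = cgi_response("Python works")

# split at the first empty line to recover the two parts
header, body = response.split("\n\n", 1)
print(repr(header))
print(repr(body))
```

Since print adds a newline of its own, a header string that ends in one "\n" already produces the required empty line when printed.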

Our second script displays the current time (at the server) in a browser window, as shown in Fig. 44.

_images/figwebtime.png

Fig. 44 Displaying the current time at the server in a browser window.

The script for Fig. 44 is short:

#!/usr/bin/python
"""
Displays the current time in a browser window.
"""
import time
print("Content-Type: text/plain\n")
print(time.ctime(time.time()))

The time gets updated every time the page refreshes.

Internet Basics

HTTP (HyperText Transfer Protocol) defines the request-response communication between a web browser and a web server.

Two methods of HTTP are

  • The GET method is a request for a resource, such as an HTML page. Input parameters, if any, are appended to the URL. Simply typing the URL of the requested web page invokes the GET method.
  • The POST method is a request, typically for a dynamic resource, with the input parameters of the request contained within the body of the request.

HTTP defines other methods as well, but GET and POST are the most commonly used.
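The difference between the two methods shows in where the parameters travel. The sketch below builds one request of each kind with the standard urllib modules, without sending anything over the network. The URL is the one used by the form later in this lecture, and the value 42 is a made-up input.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://localhost/cgi-bin/give_number.py"
PARAMS = {"number": "42"}  # hypothetical input

# GET: the parameters are appended to the URL
get_request = Request(URL + "?" + urlencode(PARAMS))

# POST: the parameters are in the body of the request
post_request = Request(URL, data=urlencode(PARAMS).encode("ascii"))

# the requests are only constructed here, not sent
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

A Request object without data defaults to GET, and one with data defaults to POST. Passing either object to urllib.request.urlopen would send it, which requires a running web server.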

Some commonly used elements of HTML (HyperText Markup Language) are

  • HTML: <HTML> marks the start of an HTML document, and </HTML> marks the end.
  • HEAD specifies the header information of a document.
  • TITLE specifies the title of the document.
  • BODY contains the body text of the document.
  • FONT is used to alter the font size and color of text.
  • H1 displays a heading of level 1; other heading elements are H2 and H3.
  • P defines a paragraph.
  • OL starts an ordered list; for an unordered list, use UL.
  • LI is a list element in an ordered or unordered list.

One way to learn HTML is by looking at the source of web pages.
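Another way is to generate a small page ourselves. The sketch below assembles a document from the elements listed above; the function name html_page and the content of the page are our own choices.

```python
from html import escape

def html_page(title, heading, items):
    """
    Returns a string with an HTML document that has a title,
    a heading of level 1, and an unordered list of items.
    """
    lines = ["<HTML>", "<HEAD>",
             "<TITLE>" + escape(title) + "</TITLE>",
             "</HEAD>", "<BODY>",
             "<H1>" + escape(heading) + "</H1>",
             "<UL>"]
    for item in items:
        # escape replaces <, >, and & so they display as text
        lines.append("<LI>" + escape(item) + "</LI>")
    lines.extend(["</UL>", "</BODY>", "</HTML>"])
    return "\n".join(lines)

PAGE = html_page("LAMP", "What LAMP stands for",
                 ["Linux", "Apache", "MySQL", "Python"])
print(PAGE)
```

A CGI script that sends such a page to the browser would first print the header "Content-Type: text/html" followed by an empty line, instead of text/plain.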

To accept data from users, three elements are generally used:

  • FORM contains all the code related to a form. Its purpose is to accept user input in a systematic and structured manner.
  • INPUT specifies the code used to create the form controls that accept user input.
  • SELECT is used to display lists in a form.

Designing and creating interactive web pages is similar to GUI design.

A form is a collection of text boxes, radio buttons, check boxes, and buttons. Two attributes of a form are

  • METHOD is GET or POST.
  • ACTION is typically used to specify the code that will process the input data.

The general syntax of using a FORM is

<FORM METHOD="GET_or_POST" ACTION="file_name">
code_of_the_form
</FORM>

The INPUT element is specified inside a FORM element. INPUT elements create controls, such as text boxes, buttons, radio buttons, and check boxes. Each of these controls can have attributes:

  • TYPE specifies type of control to accept user input.
  • NAME specifies name of a control, for identification.
  • VALUE holds value entered by user, or default.

There are five types of controls: (1) submit buttons; (2) text boxes; (3) radio buttons; (4) check boxes; (5) combo boxes.
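To see the TYPE, NAME, and VALUE attributes at work, we can let Python read a form. The sketch below uses the HTMLParser class of the standard library to list the controls in a small form. The class name ControlLister and the form itself are made up for this illustration.

```python
from html.parser import HTMLParser

class ControlLister(HTMLParser):
    """
    Collects the TYPE, NAME, and VALUE attributes
    of the INPUT elements in an HTML document.
    """
    def __init__(self):
        super().__init__()
        self.controls = []

    def handle_starttag(self, tag, attrs):
        # the parser converts tag and attribute names to lower case
        if tag == "input":
            found = dict(attrs)
            self.controls.append(
                (found.get("type"), found.get("name"), found.get("value")))

# a hypothetical form with a text box and a submit button
FORM = """
<FORM METHOD="GET" ACTION="file_name">
<INPUT TYPE="text" NAME="number" VALUE="0">
<INPUT TYPE="submit" VALUE="go">
</FORM>
"""

lister = ControlLister()
lister.feed(FORM)
print(lister.controls)
```

The submit button in this form has no NAME, so None is listed in its place.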

We distinguish between

  • Client-Side Scripting: processed by the browser, which has the advantage that it saves time on the server.
  • Server-Side Scripting: processed by the server, used where synchronization is needed, such as data modification. The time at the server then serves as the common reference for synchronization.

Python is a powerful server-side scripting language. The cgi module must be imported in order to receive the data sent by the client.
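For a GET request, the web server hands the data to the script in the environment variable QUERY_STRING. The sketch below reads that variable with modules that remain in the standard library; this matters because the cgi module was deprecated in Python 3.11 and removed in Python 3.13. The function name form_fields is our own, and the sketch does not handle POST requests.

```python
import os
from urllib.parse import parse_qs

def form_fields(environ=os.environ):
    """
    Returns a dictionary with the fields of a GET request,
    taken from the QUERY_STRING environment variable.
    If a field occurs more than once, the first value is kept.
    """
    query = environ.get("QUERY_STRING", "")
    parsed = parse_qs(query)  # maps each name to a list of values
    return {name: values[0] for name, values in parsed.items()}

# simulate what the web server would set for ?A=12&B=18
print(form_fields({"QUERY_STRING": "A=12&B=18"}))
```

By default, parse_qs leaves out fields with an empty value, so a text box the user left blank does not appear in the dictionary.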

Interactive Web Pages

The form to prompt the user for a number is displayed in Fig. 45.

_images/figwebgivenumber.png

Fig. 45 Prompting the user for a number.

The order of operations is as follows:

  1. The displayed web page uses a form element.
  2. The form contains two input elements: a text box and a submit button.
  3. After the user hits submit, a Python script will run.

We distinguish two cases, depending on whether the user enters a number or nothing before hitting the submit button. The first case, in which the user enters a number, is illustrated in Fig. 46 and Fig. 47.

_images/figwebgivenumdata.png

Fig. 46 The user enters a number in an HTML form.

_images/figwebgivedata.png

Fig. 47 The number entered by the user is processed by a Python script.

Observe the URL in Fig. 47 and notice that it contains the number entered by the user.
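The number ends up in the URL because a form without a METHOD attribute is submitted with GET. As a sketch, the code below takes such a URL apart with the standard urllib.parse module. The value 42 is a made-up input, not necessarily the number in the figure.

```python
from urllib.parse import urlsplit, parse_qs

# hypothetical URL, as the browser could show it after entering 42
URL = "http://localhost/cgi-bin/give_number.py?number=42"

parts = urlsplit(URL)
print(parts.path)    # the script that processes the form
print(parts.query)   # everything after the question mark

# parse_qs maps each name to a list of values
fields = parse_qs(parts.query)
print(fields["number"][0])
```

The name before the equals sign is the NAME attribute of the text box in the form.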

When the user hits the submit button before entering a number, an error message should be displayed, as illustrated in Fig. 48.

_images/figwebgivenothing.png

Fig. 48 The user receives an error message when nothing is entered.

The HTML code for the form and its input elements is listed below.

<HTML>
<HEAD>
<TITLE> MCS 275 Lec 21: give a number </TITLE>
</HEAD>
<BODY>

<FORM action="http://localhost/cgi-bin/give_number.py">

give a number:

<INPUT type="text" name="number" size ="8">

<INPUT type="submit">

</FORM>
</BODY>
</HTML>

The action is performed by a Python script, listed below.

#!/usr/bin/python
"""
Accepts a number from a form.
"""
import cgi
form = cgi.FieldStorage()
print("Content-Type: text/plain\n")
try:
    n = form['number'].value
    print("your number is " + n)
except KeyError:
    print("please enter a number")

As an application of the above code, consider a web interface to a function to compute the greatest common divisor of two numbers. The two numbers are entered via a web form. The Python script computes the greatest common divisor of the two entered numbers and displays the result in the browser window. The form is shown in Fig. 49.

_images/figwebgcdinput.png

Fig. 49 The form to pass the input to the greatest common divisor calculator.

The HTML code we considered earlier extends naturally. When the user submits the numbers, their greatest common divisor will be computed and displayed, as shown in Fig. 50.

_images/figwebgcdoutput.png

Fig. 50 The output of the greatest common divisor calculator.

The HTML code which defines the form to prompt the user for the input is below.

<HTML>
<HEAD>
<TITLE> MCS 275 Lec 21: web gcd </TITLE>
</HEAD>
<BODY>

<FORM action="http://localhost/cgi-bin/web_gcd.py">
<P> give first number:
<input type="text" name="A" size ="40"> </P>
<P>give second number:
<input type="text" name="B" size ="40"> </P>
<P> <input type="submit"> </P>
</FORM>

</BODY>
</HTML>

The Python script is listed entirely below.

#!/usr/bin/python
"""
Script to compute the greatest common divisor
of two numbers entered via a form.
"""
import cgi
form = cgi.FieldStorage()
print("Content-Type: text/plain\n")
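try:
    x = form['A'].value
    print("your first number is " + x)
except KeyError:
    x = None  # no first number was submitted
    print("please enter a first number")
try:
    y = form['B'].value
    print("your second number is " + y)
except KeyError:
    y = None  # no second number was submitted
    print("please enter a second number")

def gcd(alpha, beta):
    """
    Returns the greatest common divisor
    of the numbers alpha and beta.
    """
    if beta == 0:  # gcd(alpha, 0) is alpha, avoids division by zero
        return alpha
    rest = alpha % beta
    if rest == 0:
        return beta
    else:
        return gcd(beta, rest)

# compute only if both numbers were given
if x is not None and y is not None:
    try:
        ix = int(x)
        iy = int(y)
        print("gcd(%d,%d) = %d" % (ix, iy, gcd(ix, iy)))
    except ValueError:  # the input is not a whole number
        print("please enter whole numbers only")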
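The computation can be tested without a web server. The sketch below uses an iterative variant of the Euclidean algorithm, which is our substitute for the recursion in the script, and compares its results with math.gcd of the standard library. The test pairs are made up.

```python
import math

def gcd(alpha, beta):
    """
    Returns the greatest common divisor of alpha and beta,
    computed with an iterative Euclidean algorithm.
    """
    while beta != 0:
        # replace (alpha, beta) by (beta, remainder)
        alpha, beta = beta, alpha % beta
    return alpha

for (a, b) in [(12, 18), (17, 5), (100, 75), (7, 0)]:
    # math.gcd serves as the reference
    assert gcd(a, b) == math.gcd(a, b)
    print("gcd(%d,%d) = %d" % (a, b, gcd(a, b)))
```

The loop stops as soon as the second number is zero, so a zero entered as second number does not lead to a division by zero.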

Exercises

  1. Make your own web page, using people.uic.edu.
  2. Verify if Apache is installed on your computer. Install Apache if necessary.
  3. Design a web interface to convert pounds into kilograms and kilograms into pounds.
  4. Take the script facnum.py of Lecture 1 and write a web interface for it.