The Common Gateway Interface (CGI) was originally developed as part of the NCSA HTTP server and
is an old standard for interfacing external applications with HTTP servers that still enjoys
considerable use. It was created to allow dynamic data to be generated in response to HTTP
requests and return the results to the user's browser. Plain HTML documents are typically
static, while a CGI program allows the response data to be dynamically created. However, since
CGI was first developed, several better means of creating dynamic web pages have been created
that are faster and more efficient. Read more about such replacements in Creating Dynamic Web Pages, Embedded Server
Pages and Using PHP.
Mbedthis Appweb supports CGI so that existing applications written to the CGI interface
can be fully supported. Appweb has a fully featured CGI handler that alleviates many of the
pains of configuring CGI.
Configuring CGI Programs
CGI programs may be configured and invoked in two
primary ways:
- By URL prefix
- By URL extension
When invoked by URL prefix, the CGI programs and scripts are stored in special CGI
directories (for example cgi-bin). When invoked by URL extension, the
CGI programs may be stored anywhere in the web directory. For security, it is usually best to
store all CGI programs and scripts outside the directory containing the web content.
Consequently invoking CGI programs by extension should only be used in combination with a URL
prefix that allows the CGI directory to be specified.
Appweb nominates a directory as a CGI directory via the ScriptAlias configuration file directive.
For example:
ScriptAlias /cgi-bin/ $SERVER_ROOT/web/cgi-bin/
When a browser requests a URL that includes the "/cgi-bin/" prefix, the script named
immediately after "/cgi-bin/" will be run. For example, consider the following URL:
http://www.mbedthis.com/cgi-bin/testCgi
This URL will cause the testCgi program to be run.
To configure Appweb to select CGI programs and scripts by URL extension, use the AddHandler configuration file directive.
For example:
AddHandler cgiHandler .myExt
This configures Appweb to process URLs that contain the .myExt extension via the CGI
handler. To determine which program to run, the Appweb CGI handler looks up the Mime type
associated with the ".myExt" extension in the Mime types file "mime.types". In this file, the
extension is mapped to a mime type. For example:
application/x-appweb-perl myExt
This definition maps ".myExt" to the perl mime type. This mime type must then be mapped
to a program via the Action directive. For example:
Action application/x-appweb-perl /usr/bin/perl
This will cause /usr/bin/perl to be run to process the request. Output from perl is captured by
the CGI handler and then returned to the user's browser.
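The lookup chain described above (extension to handler, extension to Mime type, Mime type to program) can be sketched as three table lookups. This is only an illustration: the dictionaries are hypothetical in-memory stand-ins for the AddHandler, mime.types and Action settings, and `resolve_program` is an invented name, not an Appweb API.

```python
import posixpath

# Hypothetical stand-ins for the three configuration sources described
# above; real values come from the Appweb configuration files.
HANDLERS = {".myExt": "cgiHandler"}                        # AddHandler
MIME_TYPES = {"myExt": "application/x-appweb-perl"}        # mime.types
ACTIONS = {"application/x-appweb-perl": "/usr/bin/perl"}   # Action


def resolve_program(url_path):
    """Return the program that would process url_path, or None."""
    ext = posixpath.splitext(url_path)[1]          # e.g. ".myExt"
    if HANDLERS.get(ext) != "cgiHandler":
        return None                                # not handled by CGI
    mime_type = MIME_TYPES.get(ext.lstrip("."))    # extension -> Mime type
    return ACTIONS.get(mime_type)                  # Mime type -> program
```

With these tables, a request for a path ending in ".myExt" resolves to /usr/bin/perl, while any other extension is left to other handlers.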
Invoking CGI Programs
When a CGI program is run, the Appweb CGI handler
communicates request information to the CGI program via environment variables and, in some
cases, via the command line. The command line is set to the name of the CGI program, the name
of the CGI script (if different from the program name), and the CGI query string. The query
string is the portion of the URL after the first "?" character, with special characters de-escaped.
CGI Command Line
The command line will be set differently depending on how the CGI
program is being invoked. There are four possible scenarios:
- The program is invoked directly via the request URL.
- The program is invoked indirectly because the CGI script contains a Bang path directive.
- The program is specified via an ActionProgram directive in the Appweb configuration
file.
- On Windows, the program is a Windows Batch file.
The tables below specify how the command line arguments are defined in each case:
Programs Invoked Directly via the Request URL
argv[0] | Program name immediately after the CGI URL prefix (e.g. after /cgi-bin/).
argv[1..N] | Each argument is set to one portion of the QUERY_STRING, which is split at "+" characters after de-escaping the query.
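As a rough sketch of the rule in the table above, the following de-escapes a query string and then splits it at "+" to build the argument list. The function names `query_to_args` and `direct_argv` are invented for this illustration and are not part of Appweb.

```python
from urllib.parse import unquote


def query_to_args(query_string):
    """De-escape the query, then split it at "+" characters."""
    if not query_string:
        return []
    # unquote() decodes %xx sequences but leaves "+" untouched,
    # so "+" remains available as the argument delimiter.
    return unquote(query_string).split("+")


def direct_argv(program, query_string):
    # argv[0] is the program name; argv[1..N] come from the query string.
    return [program] + query_to_args(query_string)
```

Note that "&" is not a delimiter at this level, so a query such as `var1=a+a&var2=b%20b` yields two arguments, the second of which still contains "&".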
Programs Invoked Indirectly with Bang Directive
If the CGI program or script specified in the URL contains a "#!/pathToProgram" directive on
its first line, that path is interpreted as the real CGI program to run. The script name is
then passed on the command line.
argv[0] | Program name defined in the first line of the CGI script after the "#!" characters.
argv[1] | The name of the CGI script originally specified in the URL.
argv[2..N] | Each argument is set to one portion of the QUERY_STRING, which is split at "+" characters after de-escaping the query.
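The Bang directive handling can be illustrated with a small sketch that inspects the first line of a script and builds the argument list accordingly. The helper names are hypothetical, and the treatment of any options following the interpreter path is an assumption, since the text above does not cover it.

```python
from urllib.parse import unquote


def interpreter_from_first_line(first_line):
    """Return the path after "#!" on a script's first line, or None."""
    if not first_line.startswith("#!"):
        return None
    fields = first_line[2:].split()
    # Only the program path is used here; interpreter options after
    # the path are outside the scope of this sketch.
    return fields[0] if fields else None


def bang_argv(script_name, first_line, query_string):
    interpreter = interpreter_from_first_line(first_line)
    args = unquote(query_string).split("+") if query_string else []
    if interpreter is None:
        return [script_name] + args           # invoked directly
    return [interpreter, script_name] + args  # argv[1] is the script
```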
Programs Specified via an ActionProgram Directive
argv[0] | Program name specified in the ActionProgram directive in the Appweb configuration file.
argv[1] | The name of the CGI script originally specified in the URL.
argv[2..N] | Each argument is set to one portion of the QUERY_STRING, which is split at "+" characters after de-escaping the query.
Windows Batch Commands
argv[0] | Set to "cmd.exe"
argv[1] | /Q
argv[2] | /C
argv[3] | Command
The "Command" is a quoted string set to the name of the CGI script originally specified
in the URL followed by the Query String split at "+" characters. The entire Command string is
escaped so that dangerous characters are preceded by "^" to prevent security attacks.
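The batch file case might be sketched as follows. The set of characters treated as dangerous and the use of a single space to join the pieces are assumptions made for illustration; the text above does not list the exact characters Appweb escapes, and `escape_for_cmd` and `batch_argv` are invented names.

```python
# Assumption: this character set is illustrative only; the exact set of
# characters Appweb treats as dangerous is not listed in the text above.
DANGEROUS = set('&|<>^()%!"')


def escape_for_cmd(text):
    """Prefix each dangerous character with "^"."""
    return "".join("^" + ch if ch in DANGEROUS else ch for ch in text)


def batch_argv(script_name, query_string):
    # Script name followed by the query string split at "+".
    parts = [script_name] + (query_string.split("+") if query_string else [])
    command = escape_for_cmd(" ".join(parts))
    # The Command is passed to cmd.exe as a single quoted string.
    return ["cmd.exe", "/Q", "/C", '"' + command + '"']
```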
CGI Environment Variables
CGI uses environment variables to pass additional parameters
to your program. The following environment variables are defined:
Variable | Description
AUTH_TYPE | Set to the value of the HTTP AUTHORIZATION header. Usually "basic" or "digest".
CONTENT_LENGTH | Set to the length of any associated posted content.
DOCUMENT_ROOT | Set to the path location of the web documents. Defined by the DocumentRoot directive in the Appweb configuration file.
GATEWAY_INTERFACE | Set to "CGI/1.1".
HTTP_ACCEPT | Set to the value of the HTTP ACCEPT header. This specifies what formats are acceptable and/or preferable for the client.
HTTP_CONNECTION | Set to the value of the HTTP CONNECTION header. This specifies how the connection should be persisted when the request is complete (for example, Keep-alive).
HTTP_HOST | Set to the value of the HTTP HOST header. This specifies the name of the server to process the request. When using named virtual hosting, requests to different servers (hosts) may be processed by a single HTTP server on a single IP address. The HTTP_HOST field permits the server to determine which virtual host should process the request.
HTTP_USER_AGENT | Set to the value of the HTTP USER-AGENT header.
PATH_INFO | Set to the URL portion (if any) after the SCRIPT_NAME.
PATH_TRANSLATED | The physical on-disk path name corresponding to PATH_INFO.
QUERY_STRING | Set to the URL string portion that follows the first "?" in the URL. The QUERY_STRING is URL encoded in the standard URL format by changing spaces to "+" and encoding all URL special characters with %xx hexadecimal encoding. Most major scripting languages provide routines to assist in decoding QUERY_STRINGs.
REMOTE_ADDR | Set to the IP address of the requesting client.
REMOTE_HOST | Set to the IP address of the requesting client (same as REMOTE_ADDR).
REMOTE_USER | Set to the name of the authenticated user.
REQUEST_METHOD | Set to the HTTP method used by the request. Valid values are: "GET", "HEAD", "OPTIONS", "POST", or "TRACE". "PUT" and "DELETE" are not supported.
REQUEST_URI | The complete request URL after the host name portion. It always begins with a leading "/".
SCRIPT_NAME | The name of the CGI script or program to run. If an ActionProgram is specifying the name of a CGI interpreter, then SCRIPT_NAME is set to the name of the script to interpret.
SERVER_ADDR | The IP address of the server or virtual host responding to the request.
SERVER_HOST | The name of the default server or virtual host serving the request.
SERVER_NAME | Same as SERVER_HOST.
SERVER_PORT | The HTTP port of the server or virtual host serving the request.
SERVER_PROTOCOL | Set to "HTTP/1.0" or "HTTP/1.1" depending on the protocol used by the client.
SERVER_SOFTWARE | Set to "Mbedthis Appweb/VERSION".
SERVER_URL | Same as SERVER_NAME.
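A CGI program reads these values from its process environment. The sketch below gathers a few of them into a dictionary; `describe_request` and the dictionary keys are names chosen for this example only, and defaulting a missing CONTENT_LENGTH to zero is an assumption rather than documented Appweb behavior.

```python
import os


def describe_request(environ=os.environ):
    """Collect a few of the variables listed above into a dict."""
    length = environ.get("CONTENT_LENGTH", "")
    return {
        "script": environ.get("SCRIPT_NAME", ""),
        "extra_path": environ.get("PATH_INFO", ""),
        "query": environ.get("QUERY_STRING", ""),
        "client": environ.get("REMOTE_ADDR", ""),
        # CONTENT_LENGTH may be absent or empty when nothing was posted.
        "content_length": int(length) if length.isdigit() else 0,
    }
```

Passing the environment as a parameter keeps the function easy to exercise outside a web server, for example with a hand-built dictionary.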
Example
Consider the following URL which will run the Perl interpreter to execute the
pricelists.pl script.
http://hostname/cgi-bin/myScript/products/pricelists.pl?id=23&payment=creditCard
This URL will cause the following environment settings:
Variable | Value
PATH_INFO | /products/pricelists.pl
PATH_TRANSLATED | /var/appweb/web/products/pricelists.pl (where /var/appweb/web is the DocumentRoot)
QUERY_STRING | id=23&payment=creditCard
REQUEST_URI | /cgi-bin/myScript/products/pricelists.pl?id=23&payment=creditCard
SCRIPT_NAME | myScript
ARGV[0] | /usr/bin/perl
ARGV[1] | pricelists.pl
ARGV[2] | id=23&payment=creditCard
The URL below demonstrates some rather cryptic URL encoding. The important thing to
remember is that command line arguments are delimited by "+". The hex encoding %20
represents the space character. Once passed to the CGI program, the convention is for CGI
variables to be delimited by "&".
http://hostname/cgi-bin/cgiProgram/extra/Path?var1=a+a&var2=b%20b&var3=c
This URL will cause the following environment settings:
Variable | Value
PATH_INFO | /extra/Path
PATH_TRANSLATED | /var/appweb/web/extra/Path
QUERY_STRING | var1=a+a&var2=b%20b&var3=c
REQUEST_URI | /cgi-bin/cgiProgram/extra/Path?var1=a+a&var2=b%20b&var3=c
SCRIPT_NAME | cgiProgram
ARGV[0] | cgiProgram
ARGV[1] | var1=a
ARGV[2] | a&var2=b b&var3=c
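The URL-derived values in the table above can be reproduced with a short sketch that splits the request URL around the CGI prefix. The function name `cgi_variables` and its `prefix` parameter are hypothetical, and the sketch assumes the program is the first path segment after the prefix; PATH_TRANSLATED is omitted because it depends on the configured DocumentRoot.

```python
from urllib.parse import urlsplit


def cgi_variables(url, prefix="/cgi-bin/"):
    """Derive the URL-based variables for a program under prefix."""
    parts = urlsplit(url)
    if not parts.path.startswith(prefix):
        raise ValueError("URL is not under the CGI prefix")
    # The first segment after the prefix names the program; anything
    # after it is the extra path.
    script, slash, extra = parts.path[len(prefix):].partition("/")
    request_uri = parts.path + ("?" + parts.query if parts.query else "")
    return {
        "SCRIPT_NAME": script,
        "PATH_INFO": slash + extra,
        "QUERY_STRING": parts.query,   # left URL encoded
        "REQUEST_URI": request_uri,
    }
```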
URL Encoding
When a URL is sent via HTTP, certain special characters must be escaped so
the URL can be processed unambiguously by the HTTP server. To escape the special characters, an
HTTP client should convert them to their %hex equivalent. Form and query variables are
separated by "&". For example, a=1&b=2 defines two form variables, "a" and "b", with
values "1" and "2" respectively.
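Most scripting languages ship routines for this encoding and decoding. As one illustration, Python's standard urllib.parse module can escape a value and split a query string into variables; the wrapper names `encode_value` and `decode_form` are invented for this sketch.

```python
from urllib.parse import parse_qsl, quote


def encode_value(value):
    """Escape special characters as %hex; a space becomes %20."""
    return quote(value, safe="")


def decode_form(query_string):
    """Split at "&" and "=", then de-escape names and values."""
    return dict(parse_qsl(query_string))
```

Escaping matters when a value itself contains "&" or "=", because otherwise the server could not tell where one variable ends and the next begins.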
CGI Programming
A CGI program can return almost any possible content type back to the client's
browser: plain HTML, audio, video or any other format. CGI programs can also control the user's
browser and redirect it to another URL. To do this, CGI programs return pseudo-HTTP headers
that are interpreted by Appweb before passing the data on to the client.
Appweb understands the following CGI headers that can be output by the CGI program. They are
case-insensitive.
Header | Description
Content-type | Nominate the content Mime type. Typically "text/html". See the mime.types file for a list of possible Mime types.
Status | Set to an HTTP response code. Success is 200. Server error is 500.
Location | Set to the URI of a new document to which to redirect the client's browser.
ANY | Pass any other header back to the client.
For example:
Content-type: text/html

<HTML><HEAD><TITLE>Sample CGI Output</TITLE></HEAD>
<BODY>
<H1>Hello World</H1>
</BODY></HTML>
To redirect the browser to a new location:
Location: /newUrl.html
To signify an error in the server:
Status: 500
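The three cases above share one shape: a block of pseudo-HTTP headers, a blank line, then any content. A minimal sketch of a helper that builds such output is shown below; `cgi_response` and its parameters are hypothetical names, and omitting Content-type on a redirect is a simplification made for this example.

```python
def cgi_response(body="", content_type="text/html", status=None, location=None):
    """Build the header block, a blank line, and then the body."""
    headers = []
    if status is not None:
        headers.append("Status: %d" % status)
    if location is not None:
        headers.append("Location: " + location)
    else:
        headers.append("Content-type: " + content_type)
    # A blank line separates the pseudo-HTTP headers from the content.
    return "\n".join(headers) + "\n\n" + body
```

A CGI program would write the returned string to its standard output, where the CGI handler captures it.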
Hints and Tips
If you have special data or environment variables that must be
passed to your CGI program, you can wrap it with a script that defines that environment before
invoking your script.
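One possible shape for such a wrapper is sketched below. It copies the environment it was started with, adds the extra settings, and then runs the real program. The names `wrapped_env` and `run_wrapped` are invented for this example, as is any variable name you pass in; none of them are defined by Appweb.

```python
import os
import subprocess


def wrapped_env(extra, base=None):
    """Return a copy of the environment with extra variables added."""
    env = dict(os.environ if base is None else base)
    env.update(extra)
    return env


def run_wrapped(program, args, extra):
    # The CGI variables are inherited from the wrapper's own environment;
    # standard input and output pass straight through to the program.
    result = subprocess.run([program] + list(args), env=wrapped_env(extra))
    return result.returncode
```

Because the wrapper starts from its own environment, the CGI variables set by the server remain visible to the wrapped program alongside the added ones.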
Other Resources
The following URLs may be helpful in further reading about CGI: