Getting Started with CGI in C
Kristaps Dzonsons
Source Code
I'll describe this as if reading a source file from top to bottom, so let's start with the header files. We'll obviously need kcgi and stdint.h, which is necessary for some of the types found in the kcgi header file. I'll also include the HTML library for kcgi; I'll explain why later.
Next, I'll assign the fields we're interested in to numeric identifiers. This will allow us later to assign names, then assign validators to named fields.
The enumeration will allow us to bound an array to KEY__MAX and refer to individual buckets in the array by the enumeration value. I'll assume that KEY_STRING is assigned 0 and KEY_INTEGER, 1.
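As a minimal sketch of the enumeration idiom described above: only KEY_STRING, KEY_INTEGER and KEY__MAX come from the text. The `enum key` tag, the `key_labels` array and the `key_label` helper are hypothetical names used for illustration.

```c
#include <stddef.h>

/* Field identifiers. KEY__MAX is not a field, only the field count. */
enum key {
    KEY_STRING,   /* 0 */
    KEY_INTEGER,  /* 1 */
    KEY__MAX      /* 2 */
};

/* One bucket per field; the array is bounded by the enumeration itself. */
static const char *const key_labels[KEY__MAX] = {
    "string",   /* KEY_STRING */
    "integer",  /* KEY_INTEGER */
};

/* Return the label for a key, or NULL if the key is out of range. */
static const char *
key_label(enum key k)
{
    return (k < KEY__MAX) ? key_labels[k] : NULL;
}
```

Adding a field then means adding one enumerator before KEY__MAX and one array entry; every array sized by KEY__MAX grows with it.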
Next, connect the indices with validation functions and names.
The validation function is run by khttp_parse(3); the name is the HTML form name for the given element. Built-in validation functions, which we'll use, are described in kvalid_string(3). In this example, kvalid_stringne will validate a non-empty (NUL-terminated) C string, while kvalid_int will validate a signed 64-bit integer.
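To make the pairing of validator and name concrete, here is a self-contained model of the idea in plain C. It is not kcgi's code: `struct field`, `valid_fn`, `valid_stringne` and `valid_int` are hypothetical stand-ins that only mimic the behaviour described above, and the form name "integer" is an assumption ("string" appears later in the example URL).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

enum key { KEY_STRING, KEY_INTEGER, KEY__MAX };

/* Hypothetical validator signature: returns 1 if raw is acceptable. */
typedef int (*valid_fn)(const char *raw, int64_t *out);

struct field {
    valid_fn    valid;  /* validation function */
    const char *name;   /* HTML form name */
};

/* Accept any non-empty string; out is unused. */
static int
valid_stringne(const char *raw, int64_t *out)
{
    (void)out;
    return raw != NULL && raw[0] != '\0';
}

/* Accept a base-10 signed 64-bit integer with no trailing characters. */
static int
valid_int(const char *raw, int64_t *out)
{
    char *end;
    long long v;

    if (raw == NULL || raw[0] == '\0')
        return 0;
    errno = 0;
    v = strtoll(raw, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return 0;
    *out = (int64_t)v;
    return 1;
}

/* Indexed by enum key, so fields[KEY_INTEGER] is the integer field. */
static const struct field fields[KEY__MAX] = {
    { valid_stringne, "string" },   /* KEY_STRING */
    { valid_int,      "integer" },  /* KEY_INTEGER */
};
```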
Next, I define a function that acts upon the parsed fields.
According to khttp_parse(3), if a valid value is found, it is assigned into the fieldmap array. If one was found but did not validate, it is assigned into the fieldnmap array. Both of these are indexed by the array position in keys. (We could also have run through the fields list, but that's for chumps.)
In this trivial example, the function emits the string values if found or indicates that they're not found (or not valid).
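The three outcomes (valid, invalid, absent) can be modelled without kcgi as follows. `struct request` is a simplified, hypothetical stand-in for the parsed request: here the buckets hold plain strings, and the output goes to a buffer rather than to the HTTP stream.

```c
#include <stdio.h>

enum key { KEY_STRING, KEY_INTEGER, KEY__MAX };

/* Simplified stand-in for a parsed request (an assumption, not kcgi's type). */
struct request {
    const char *fieldmap[KEY__MAX];   /* values that validated, else NULL */
    const char *fieldnmap[KEY__MAX];  /* values that failed, else NULL */
};

/* Format the paragraph into buf; returns whatever snprintf returns. */
static int
process(const struct request *r, char *buf, size_t sz)
{
    const char *msg;

    if (r->fieldmap[KEY_STRING] != NULL)
        msg = r->fieldmap[KEY_STRING];  /* emitted verbatim, unescaped */
    else if (r->fieldnmap[KEY_STRING] != NULL)
        msg = "failed parse";
    else
        msg = "not provided";
    return snprintf(buf, sz, "<p>The string value is %s</p>\n", msg);
}
```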
As is, this routine introduces a significant problem: if the KEY_STRING value contains HTML, it will be inserted directly into the stream, allowing attackers to mount cross-site scripting (XSS) attacks.
Instead, let's use the kcgihtml(3) library to perform the proper encoding and element nesting.
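To show what "proper encoding" means in practice, here is a small stand-alone escaping routine. It is only an illustration of the idea, not the kcgihtml(3) implementation, and `html_escape` is a hypothetical name; in the real application the library does this work for you.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, replacing HTML-significant characters with entities.
 * Returns 0 on success, or -1 if dst (of size sz) is too small.
 */
static int
html_escape(char *dst, size_t sz, const char *src)
{
    size_t used = 0;

    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        const char *rep;
        size_t len;

        switch (*src) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        default:   rep = one;      break;
        }
        len = strlen(rep);
        if (used + len + 1 > sz)  /* keep room for the terminator */
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }
    if (used + 1 > sz)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

With this in place, a value such as `<script>` reaches the browser as text rather than as markup.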
Before acting on any parsed fields, I sanitise the HTTP context. This consists of the page requested, MIME type, HTTP method, and so on.
To begin, I provide an array of indexed page identifiers, much as I did for the field validators and names.
This will also be passed to khttp_parse(3).
These define the page requests accepted by the application, in this case being only index, which I'll also set to be the default page when invoked without a path (i.e., just http://www.foo.com). Note: this is the first path component, so specifying index will also accept index/foo.
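The first-path-component rule can be sketched as a small lookup. The `pages` array mirrors the one described above; `enum page`, PAGE_INDEX, PAGE__MAX and `page_lookup` are assumed names, and the matching logic is a simplified model rather than what khttp_parse(3) actually does internally.

```c
#include <string.h>

enum page { PAGE_INDEX, PAGE__MAX };

static const char *const pages[PAGE__MAX] = {
    "index",  /* PAGE_INDEX */
};

/*
 * Resolve a request path such as "/index/foo" by its first component.
 * An empty path selects defpage; an unknown component yields PAGE__MAX.
 * Simplification: a '.' also ends the component, so "/index.html" matches.
 */
static enum page
page_lookup(const char *path, enum page defpage)
{
    size_t len;
    int i;

    if (*path == '/')
        path++;
    len = strcspn(path, "/.");
    if (len == 0)
        return defpage;
    for (i = 0; i < PAGE__MAX; i++)
        if (strlen(pages[i]) == len && strncmp(pages[i], path, len) == 0)
            return (enum page)i;
    return PAGE__MAX;
}
```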
Now, I validate the page request and HTTP context based upon the defined components.
This function checks the page request (it must be index without a subpath), HTML MIME type (expanding to index.html), and HTTP method (it must be an HTTP GET, such as index.html?string=foo).
To keep things reasonable, I'll have the sanitiser return an HTTP status code (see RFC 2616 for an explanation).
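The checks listed above can be modelled in isolation like this. The `struct request` here is a hypothetical, stripped-down stand-in for the parsed HTTP context, the enumerations are assumed names, and plain integers stand in for the status codes. The choice of 404 and 405 is one reasonable mapping, not the only one.

```c
enum page   { PAGE_INDEX, PAGE__MAX };
enum mime   { MIME_TEXT_HTML, MIME_OTHER };
enum method { METHOD_GET, METHOD_POST };

/* Simplified stand-in for the parsed HTTP context (an assumption). */
struct request {
    enum page   page;    /* resolved first path component */
    const char *path;    /* remainder after the page, "" if none */
    enum mime   mime;    /* derived from the suffix, e.g. ".html" */
    enum method method;  /* HTTP method of the request */
};

/* Return the HTTP status code that the request deserves. */
static int
sanitise(const struct request *r)
{
    if (r->page != PAGE_INDEX)
        return 404;  /* unknown page */
    if (r->path[0] != '\0')
        return 404;  /* no index/xxxx */
    if (r->mime != MIME_TEXT_HTML)
        return 404;  /* only index.html */
    if (r->method != METHOD_GET)
        return 405;  /* method not allowed */
    return 200;
}
```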
Putting all of these together: parse the HTTP context, validate it, process it, then free the resources. Headers are output using khttp_head(3), with the document body started with khttp_body(3). The HTTP context is closed with khttp_free(3).
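The ordering matters: headers first, then the body. As a rough picture of what a CGI program hands back to the web server, the following plain stdio sketch formats a response into a buffer. It is an assumption-laden model, not kcgi's implementation, and `respond` is a hypothetical helper.

```c
#include <stdio.h>

/*
 * Format a minimal CGI response: a Status header, a Content-Type header,
 * a blank line ending the headers, then the document body.
 * Returns whatever snprintf returns.
 */
static int
respond(char *buf, size_t sz, int status, const char *reason,
    const char *mime, const char *body)
{
    return snprintf(buf, sz,
        "Status: %d %s\r\n"
        "Content-Type: %s\r\n"
        "\r\n"
        "%s", status, reason, mime, body);
}
```

In the tutorial's flow, the status would come from the sanitiser and the body from the processing function; a failed request would get the error status and a short plain-text body instead.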
That's it!
Compile and Link
Your source is no good until it's compiled and linked into an executable. In this section I'll mention two strategies: in the first, the application is dynamically linked; in the second, statically. Dynamic linking is normal for most applications, but CGI applications are often placed in a file-system jail (a chroot(2)) without access to other libraries, and are thus statically linked. In short, it depends on your environment.
Let's call our application tutorial0.cgi and the source file tutorial0.c.
To dynamically link:
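A plausible invocation, assuming kcgi was installed along with its pkg-config files and that the HTML library's module is named kcgi-html (check your installation if these commands fail):

```sh
# Compile, taking include paths from pkg-config.
cc $(pkg-config --cflags kcgi-html) -c -o tutorial0.o tutorial0.c
# Link dynamically against kcgi and its HTML library.
cc -o tutorial0.cgi tutorial0.o $(pkg-config --libs kcgi-html)
```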
For static linking, which is the norm in more sophisticated systems like OpenBSD:
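A sketch of the static build, under the same assumption that pkg-config knows about a kcgi-html module; the --static flag asks pkg-config to include the private libraries a static link needs:

```sh
# Compile as before.
cc $(pkg-config --cflags kcgi-html) -c -o tutorial0.o tutorial0.c
# Link statically so the binary needs no libraries inside the chroot.
cc -static -o tutorial0.cgi tutorial0.o $(pkg-config --static --libs kcgi-html)
```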
Install
Installation steps depend on your operating system, web server, and a thousand other factors. I'll stick with the simplest installation using the defaults of OpenBSD with the default web server httpd(8). To begin with, configure /etc/httpd.conf with your server's root in /var/www and FastCGI in /var/www/cgi-bin. If you've already done this, or have a configuration file in place, you can skip this step.
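A minimal configuration along these lines might look as follows. The server name me.local and the port are placeholders; the document root defaults to the /var/www chroot, so only the CGI location needs spelling out.

```
server "me.local" {
	listen on * port 80
	location "/cgi-bin/*" {
		fastcgi
		root "/"
	}
}
```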
Next, we use the rcctl(8) tool to enable and start the httpd(8) web server and slowcgi(8) wrapper. (The latter is necessary because httpd(8) only directly supports FastCGI, so a proxy is necessary.) Again, you may not need to do this part. We also make sure the instructions on the main page are followed regarding OpenBSD sandboxing in the file-system jail.
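Assuming doas(1) is configured for your user, enabling and starting both daemons would go roughly like this:

```sh
doas rcctl enable httpd
doas rcctl start httpd
doas rcctl enable slowcgi
doas rcctl start slowcgi
```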
Assuming we built the static binary, we can now just install into the CGI directory and be ready to go!
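For example, again assuming doas(1) and the default /var/www layout (the file mode shown is one sensible choice: readable and executable, not writable):

```sh
doas install -m 0555 tutorial0.cgi /var/www/cgi-bin
```

With the configuration above, the page should then be reachable under /cgi-bin/tutorial0.cgi on your server.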