
kcgi is an open source CGI and FastCGI library for C web applications. It is minimal, secure, and auditable—a useful addition to the BCHS application stack. To start, install the library then read the usage guide. Contact Kristaps with questions or comments. kcgi is a BSD.lv project.

The following simple example implements a program that just echoes Hello, world! as an HTTP response to a CGI request. Click on any italicised fields to link to the documentation.
#include <stdint.h>
#include <stdlib.h>
#include <kcgi.h>

int main(void) {
  struct kreq r;
  const char *page = "index";

  /*
   * Parse the HTTP environment.
   * We only know a single page, "index", which is also
   * the default page if none is supplied.
   * (We don't validate any input fields.)
   */
  if (KCGI_OK != khttp_parse(&r, NULL, 0, &page, 1, 0))
    return(EXIT_FAILURE);

  /* 
   * Ordinarily, here I'd switch on the method (OPTIONS, etc.,
   * defined in the method variable) then switch on which
   * page was requested (page variable).
   * But for the sake of example, just output a response.
   */

  /* Emit the HTTP status 200 header: everything's ok. */
  khttp_head(&r, kresps[KRESP_STATUS], "%s", khttps[KHTTP_200]);
  /* Echo our content-type, defaulting to HTML if none was specified. */
  khttp_head(&r, kresps[KRESP_CONTENT_TYPE], "%s", kmimetypes[r.mime]);
  /* No more HTTP headers: start the HTTP document body. */
  khttp_body(&r);
  
  /*
   * We can put any content below here: JSON, HTML, etc.
   * Usually we'd switch on our MIME type.
   * However, we're just going to put the literal string as noted...
   */
  khttp_puts(&r, "Hello, world!");
  /* Flush the document and free resources. */
  khttp_free(&r);
  return(EXIT_SUCCESS);
}

Installation

First, check whether kcgi is already available as a third-party port for your system, such as on OpenBSD or FreeBSD. If so, install it using that system.

If not, you'll need a modern UNIX system. To date, kcgi has been built and run on GNU/Linux machines, BSD (OpenBSD, FreeBSD), and Mac OS X (Snow Leopard, Lion) on i386 and AMD64. It has been deployed under Apache, nginx, and OpenBSD's httpd(8) (the latter two natively over FastCGI and via the slowcgi wrapper). Begin by downloading kcgi.tgz and verifying the archive with kcgi.tgz.sha512. Then compile the software with make, which automatically runs a configuration script to conditionally deploy portability glue. Finally, install the software using make install, optionally specifying the PREFIX if you don't intend to use /usr/local.

If kcgi doesn't compile, please send me the config.log file and the output of the failed compilation. If you're running on an operating system with an unsupported sandbox, let me know and we can work together to fit it into the configuration and portability layer. Lastly, I'd love to compile with mingw for Microsoft machines: please contact me if you can do the small amount of work (I think?) to port the poll(2) and other non-Microsoft functions.

Usage

The kcgi manpages, starting with kcgi(3), are the canonical source of documentation. The following are introductory materials to the system.

Deploying

Applications using kcgi behave just like any other application. To compile kcgi applications, just include the kcgi.h header file and make sure it appears in the compiler inclusion path. (According to C99, you'll need to include stdint.h before it for the int64_t type used for parsing integers.) Linking is similarly normative: link to libkcgi and, if your system has compression support, libz.

Well-configured web servers, such as the default OpenBSD server, run within a chroot(2) by default. If this is the case, you'll need to statically link your binary. If running within a chroot(2) on OpenBSD, be aware that the systrace(4) sandbox requires /dev/systrace within the server root. By default, this file does not exist in the web server root. Moreover, the default web server root mount-point, /var, is mounted nodev. This complication does not exist for the other sandboxes.

FastCGI applications may either be started directly by the web server (which is popular with Apache) or started externally over a socket using kfcgi(8). The latter method is normative for OpenBSD's httpd(8) and is suggested for the security precautions taken by the wrapper.

Implementation Details

The bulk of kcgi's CGI handling lies in khttp_parse(3), which fully parses the HTTP request. Application developers must invoke this function before all others. For FastCGI, the work is split between khttp_fcgi_init(3), which initialises the context, and khttp_fcgi_parse(3), which receives each newly parsed request. In either case, requests must be freed with khttp_free(3).

All functions isolate the parsing and validation of untrusted network data within a sandboxed child process. Sandboxes limit the environment available to a process, so exploitable errors in the parsing process (or validation with third-party libraries) cannot touch the system environment. The parsed data is returned to the parent process over a socket. In the following, the HTTP parser and input validator manage a single HTTP request, while the connection delegator accepts new HTTP requests and passes them along.

[Figure: Implementation Details]
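
The parent and child arrangement described above can be sketched in plain POSIX C. This is only an illustration of the general pattern, not kcgi's own code: the structure parsed_pair, the functions parse_pair and parse_in_child, and the "key=value" input format are all hypothetical, and the point where a real implementation would enter its sandbox is marked with a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wire format: one key and one value in fixed buffers. */
struct parsed_pair {
  char key[32];
  char val[32];
};

/* Runs in the child on untrusted input: split "key=value". */
static int parse_pair(const char *in, struct parsed_pair *out) {
  const char *eq = strchr(in, '=');
  size_t klen, vlen;

  if (eq == NULL)
    return 0;
  klen = (size_t)(eq - in);
  vlen = strlen(eq + 1);
  if (klen == 0 || klen >= sizeof(out->key) || vlen >= sizeof(out->val))
    return 0;
  memset(out, 0, sizeof(*out));
  memcpy(out->key, in, klen);
  memcpy(out->val, eq + 1, vlen);
  return 1;
}

/*
 * Runs in the parent: fork a child to parse the input, then read the
 * parsed pair back over a socket pair.
 * Returns 1 on success, 0 if the child rejected the input or failed.
 */
static int parse_in_child(const char *untrusted, struct parsed_pair *out) {
  int fds[2], status;
  size_t got = 0;
  ssize_t n;
  pid_t pid;

  assert(untrusted != NULL && out != NULL);
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
    return 0;
  if ((pid = fork()) == -1) {
    close(fds[0]);
    close(fds[1]);
    return 0;
  }
  if (pid == 0) {
    struct parsed_pair p;

    close(fds[0]);
    /* A real implementation would enter its sandbox here. */
    if (!parse_pair(untrusted, &p))
      _exit(EXIT_FAILURE);
    if (write(fds[1], &p, sizeof(p)) != (ssize_t)sizeof(p))
      _exit(EXIT_FAILURE);
    _exit(EXIT_SUCCESS);
  }
  close(fds[1]);
  /* Read until the whole structure arrives or the child closes. */
  while (got < sizeof(*out)) {
    n = read(fds[0], (char *)out + got, sizeof(*out) - got);
    if (n <= 0)
      break;
    got += (size_t)n;
  }
  close(fds[0]);
  if (waitpid(pid, &status, 0) == -1)
    return 0;
  if (!WIFEXITED(status) || WEXITSTATUS(status) != EXIT_SUCCESS)
    return 0;
  if (got != sizeof(*out))
    return 0;
  /* Do not trust the child to have terminated its strings. */
  out->key[sizeof(out->key) - 1] = '\0';
  out->val[sizeof(out->val) - 1] = '\0';
  return 1;
}
```

Calling parse_in_child("name=kristaps", &p) from the parent fills p.key with "name" and p.val with "kristaps". Input without an equals sign makes the child exit with failure, so the call returns 0 and the parent never touches the malformed data itself.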

This method of sandboxing the untrusted parsing process follows OpenSSH, and requires special handling for each operating system:

seccomp(2) (Linux)
This requires a fairly new kernel (Linux ≥3.5). It is supplemented by setrlimit(2) limiting. For the time being, this feature is only available for x86, x86_64, and arm architectures. If you're using another one, please send me your uname -m and, if you know it, the correct AUDIT_ARCH_xxx found in /usr/include/linux/audit.h.
systrace(4) (OpenBSD)
This requires the existence of /dev/systrace if running in a chroot(2), which is strongly suggested. If you're using a stock OpenBSD, make sure that the mount-point of /dev/systrace isn't mounted nodev!
tame(2) (OpenBSD)
This will only work on OpenBSD ≥5.8. (As of this note, this has not been officially released: the system will compile with snapshots, but the function will not register as enabled during configuration.) It is selected with higher priority over systrace(4) on OpenBSD machines.
sandbox_init(3) (Mac OS X)
This uses the sandboxing profile for pure computation as provided in Mac OS X Leopard and later. This is supplemented by resource limiting via setrlimit(2).
capsicum(4) (FreeBSD)
Uses the capabilities facility on FreeBSD 10 and later. This is supplemented by resource limiting with setrlimit(2).
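
Several of the sandboxes above are supplemented by setrlimit(2). As a rough sketch of what such limiting can look like (which limits kcgi actually sets is not stated here, so the choice of RLIMIT_NOFILE is an assumption borrowed from the OpenSSH style of resource-limit sandboxing), a child can drop its open-file limit to zero so that it can no longer open new descriptors. The function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Drop the open-file limit to zero: descriptors that are already
 * open keep working, but no new ones can be created.
 */
static int limit_descriptors(void) {
  struct rlimit rl = { 0, 0 };

  return setrlimit(RLIMIT_NOFILE, &rl);
}

/*
 * Fork a child, apply the limit, and try to open a file.
 * Returns 1 if the open was refused, 0 if it succeeded,
 * and -1 if the experiment itself failed.
 */
static int open_refused_in_child(const char *path) {
  int status, fd;
  pid_t pid;

  assert(path != NULL);
  if ((pid = fork()) == -1)
    return -1;
  if (pid == 0) {
    if (limit_descriptors() == -1)
      _exit(2);
    fd = open(path, O_RDONLY);
    _exit(fd == -1 ? 0 : 1);
  }
  if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
    return -1;
  switch (WEXITSTATUS(status)) {
  case 0:
    return 1;
  case 1:
    return 0;
  default:
    return -1;
  }
}
```

The limit is applied only in the child, so the parent keeps its normal limits. Resource limits alone are weaker than the system-call filters listed above, which is why they appear here as a supplement and not as a replacement.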

Since validation occurs within the sandbox, special care must be taken that validation routines don't access the environment (e.g., by opening files, network connections, etc.), as the child might be abruptly killed by the sandbox facility. (Not all sandboxes do this.) If required, this kind of validation can take place after the parse validation sequence.
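
A validation check that is safe to run inside the sandbox is one that performs pure computation on the bytes it is given. The sketch below shows such a check in plain C. The function name and the rule it enforces are hypothetical; in a kcgi application the same logic would sit inside a validator callback as documented in kcgi(3), operating on the value and length of a parsed pair.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical rule: 1 to 16 bytes, starting with a lowercase ASCII
 * letter, followed by lowercase letters, digits, or underscores.
 * Touches only its arguments: no files, no sockets, no allocation.
 */
static int valid_username(const char *val, size_t valsz) {
  size_t i;
  char c;

  assert(val != NULL);
  if (valsz == 0 || valsz > 16)
    return 0;
  if (val[0] < 'a' || val[0] > 'z')
    return 0;
  for (i = 1; i < valsz; i++) {
    c = val[i];
    if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_'))
      return 0;
  }
  return 1;
}
```

A check that needs the environment, such as confirming that the user name exists in a database, would run in the parent after parsing has finished.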

The connection delegator is similar, but has different sandboxing rules, as it must manage an open socket connection and respond to new requests.

Testing

kcgi is shipped with a fully automated testing framework executed with make regress. Interfacing systems can also make use of this by working with the kcgiregress(3) library. This framework acts as a mini-webserver, listening on a local port, translating an HTTP document into a minimal CGI request, and passing the request to a kcgi CGI client. For internal tests, test requests are constructed with libcurl. The binding local port is fixed: if you plan on running the regression suite, you may need to tweak its access port.
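
To give a feel for what translating an HTTP document into a minimal CGI request involves, here is a small sketch that splits an HTTP request line into the pieces a CGI program receives through its environment. It is not taken from kcgiregress(3): the structure cgi_req and the function parse_request_line are hypothetical, and a real translator would also handle headers and the request body.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the pieces of a request line. */
struct cgi_req {
  char method[16]; /* maps to REQUEST_METHOD */
  char path[256];  /* split between SCRIPT_NAME and PATH_INFO */
  char query[256]; /* maps to QUERY_STRING */
};

/* Parse "METHOD target HTTP/x.y". Returns 1 on success, 0 otherwise. */
static int parse_request_line(const char *line, struct cgi_req *out) {
  char target[512], version[16];
  char *q;

  assert(line != NULL && out != NULL);
  if (sscanf(line, "%15s %511s %15s", out->method, target, version) != 3)
    return 0;
  if (strncmp(version, "HTTP/", 5) != 0)
    return 0;
  /* Everything after the first '?' is the query string. */
  if ((q = strchr(target, '?')) != NULL)
    *q++ = '\0';
  if (strlen(target) >= sizeof(out->path))
    return 0;
  if (q != NULL && strlen(q) >= sizeof(out->query))
    return 0;
  strcpy(out->path, target);
  strcpy(out->query, q != NULL ? q : "");
  return 1;
}
```

For the line "GET /index.html?name=foo HTTP/1.1" this yields the method "GET", the path "/index.html", and the query "name=foo". A line with no HTTP version is rejected.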

Another testing framework exists for use with American fuzzy lop. To use it, build the afl target with your compiler of choice, e.g., make clean followed by make afl CC=afl-gcc. Then run the afl-fuzz tool on the afl-multipart, afl-plain, and afl-urlencoded binaries using the test cases (and dictionaries, for the first) provided.

Performance

Security comes at a price, but not a stiff price. By design, kcgi incurs overhead in three ways: first, spawning a child to process the untrusted network data; second, enacting the sandbox framework; and third, passing parsed pairs back to the parent context. In the case of running CGI scripts, kcgi performance is bound to the operating system's ability to spawn and reap processes. For FastCGI, the bottleneck becomes the transfer of data. The following graph plots the responsiveness of kcgi against the baseline web-server performance.

This shows the empirical cumulative distribution of a statistically significant number of page requests as measured by ab(1) with 10 concurrent requests. The CGI line is the CGI sample included in the source; the FastCGI line is the FastCGI sample; the CGI (simple) line simply emits a 200 HTTP status and Hello, World; and the static line is a small static file on the web server. The test machine is an Air laptop (1.86 GHz Intel Core 2 Duo, 2 GB RAM) running Mac OS X 10.7.5 with the stock Apache. The FastCGI server was started using the kfcgi(8) defaults.
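
For readers who want to produce a similar plot from their own timings, an empirical cumulative distribution is simple to compute: sort the per-request times, then for any time x report the fraction of samples at or below x. The sketch below assumes the timings are already available as an array of doubles; the function names are hypothetical and no measured data is included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Ascending comparison of two doubles for qsort(3). */
static int cmp_double(const void *a, const void *b) {
  double x = *(const double *)a, y = *(const double *)b;

  return (x > y) - (x < y);
}

/* Sort the samples in place so that ecdf_at() can search them. */
static void ecdf_prepare(double *samples, size_t n) {
  qsort(samples, n, sizeof(double), cmp_double);
}

/* Fraction of the sorted samples that are less than or equal to x. */
static double ecdf_at(const double *sorted, size_t n, double x) {
  size_t lo = 0, hi = n, mid;

  assert(sorted != NULL && n > 0);
  /* Binary search for the first element greater than x. */
  while (lo < hi) {
    mid = lo + (hi - lo) / 2;
    if (sorted[mid] <= x)
      lo = mid + 1;
    else
      hi = mid;
  }
  return (double)lo / (double)n;
}
```

With four made-up samples of 12, 3, 7, and 5 milliseconds, ecdf_at returns 0.5 at x = 5, since two of the four samples are at or below that time.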