kcgi is a minimal CGI library for web applications in ISC licensed ISO C. It was designed to be secure and auditable. See a Comparison of CGI Libraries in C for alternatives. To start, download kcgi.tgz and run make install into your PREFIX of choice, then read kcgi(3). The system also ships with the kcgihtml(3), kcgijson(3), and kcgi_regress(3) libraries. Contact Kristaps with questions or comments. kcgi is a BSD.lv project.

Figure: this simple example implements a CGI program that just echoes Hello, world! as an HTTP response.
#include <stdint.h>
#include <stdlib.h>
#include <kcgi.h>

int main(void) {
  struct kreq r;
  const char *page = "index";

  /*
   * Parse the HTTP environment.
   * We only know a single page, "index", which is also
   * the default page if none is supplied.
   */
  if (KCGI_OK != khttp_parse(&r, NULL, 0, &page, 1, 0))
    return(EXIT_FAILURE);

  /* Emit the HTTP status 200 header: everything's ok. */
  khttp_head(&r, kresps[KRESP_STATUS], "%s", khttps[KHTTP_200]);
  /* Echo our content-type, defaulting to HTML if none was specified. */
  khttp_head(&r, kresps[KRESP_CONTENT_TYPE], "%s", kmimetypes[r.mime]);
  /* No more HTTP headers: start the HTTP document body. */
  khttp_body(&r);
  
  /*
   * We can put any content below here: JSON, HTML, etc.
   * Usually we'd switch on our MIME type.
   * However, we're just going to put the literal string as noted...
   */
  khttp_puts(&r, "Hello, world!");
  /* Flush the document and free resources. */
  khttp_free(&r);
  return(EXIT_SUCCESS);
}

How does it work?

The meat of kcgi lies in the khttp_parse(3) function, which parses key-value input pairs from the HTTP request and determines the requested page, MIME type, and so on. Application developers must invoke this function before all others, and each call must be matched by a khttp_free(3). As mentioned in the Security section, the subroutines of khttp_parse(3) are invoked inside a sandboxed child process, which isolates unvalidated, adversarial input.

Once khttp_parse(3) has been successfully invoked, the calling application is free to respond to the parsed and validated request.
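
To make the page-matching step concrete, here is a small standalone sketch of how a request path might be mapped onto the list of known pages. It does not use kcgi: resolve_page is a hypothetical helper, and the conventions it follows (an empty path selects the default page, an unknown page yields pagesz) are assumptions to be checked against khttp_parse(3).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kcgi): map the first path component
 * of a request, minus any ".suffix", to an index in pages[].
 * An empty path selects defpage; an unknown page returns pagesz.
 */
size_t
resolve_page(const char *path, const char *const *pages,
    size_t pagesz, size_t defpage)
{
  size_t i, len;

  /* Skip the leading slash, if any. */
  if (*path == '/')
    path++;
  /* The page name ends at the next slash, a suffix dot, or the end. */
  len = strcspn(path, "/.");
  if (len == 0)
    return defpage;
  for (i = 0; i < pagesz; i++)
    if (strlen(pages[i]) == len && strncmp(pages[i], path, len) == 0)
      return i;
  return pagesz;
}
```

With pages set to "index" and "about" and a default of 0, the paths "", "/" and "/index/extra" resolve to 0, "/about.html" to 1, and "/missing" to 2 (that is, pagesz), which an application would typically answer with a 404.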

Security

As a security precaution, the kcgi library parses and validates untrusted network data in a sandboxed child process by forking within khttp_parse(3); the child process is responsible for reading and parsing form data from the web server. This parsed data is returned to the parent process over a socket. This method of sandboxing the untrusted child process follows OpenSSH, and requires special handling for each operating system:

systrace(4) (OpenBSD)
This requires the existence of /dev/systrace if running in a chroot(2), which is strongly suggested. If you're using a stock OpenBSD, make sure that the mount-point of /dev/systrace isn't mounted nodev!
sandbox_init(3) (Apple OS X)
This uses the sandboxing profile for pure computation as provided in Mac OS X Leopard and later. This is supplemented by resource limiting via setrlimit(2).
capsicum(4) (FreeBSD)
Uses the capabilities facility on FreeBSD 10 and later. This is supplemented by resource limiting with setrlimit(2).

Since validation occurs within the sandbox, special care must be taken that validation routines don't access the environment (e.g., by opening files, network connections, etc.); otherwise the child will be abruptly killed by the sandbox facility. If required, this kind of validation can take place after the parse and validation sequence has finished.
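
The fork-and-socket arrangement described in this section can be sketched in plain POSIX C. This is only an illustration of the pattern, not kcgi's implementation: parse_digits and parse_in_child are hypothetical names, the "parser" merely accepts decimal digits, and no real sandbox (systrace, sandbox_init, capsicum) is entered where the comment says one would be.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>

#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a parser: accept only decimal digits.
 * It touches nothing but its arguments, as a sandboxed routine must.
 */
static int
parse_digits(const char *s, long *out)
{
  char *ep;

  if (*s == '\0')
    return 0;
  for (const char *p = s; *p != '\0'; p++)
    if (!isdigit((unsigned char)*p))
      return 0;
  errno = 0;
  *out = strtol(s, &ep, 10);
  return *ep == '\0' && errno != ERANGE;
}

/*
 * Parse untrusted input in a child process and hand the result to the
 * parent over a socket pair.  Returns 1 and fills *out on success.
 */
int
parse_in_child(const char *untrusted, long *out)
{
  int fd[2], st, ok = 0;
  pid_t pid;
  long val;

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == -1)
    return 0;
  if ((pid = fork()) == -1) {
    close(fd[0]);
    close(fd[1]);
    return 0;
  }
  if (pid == 0) {
    /* Child: a real implementation would enter its sandbox here. */
    close(fd[0]);
    if (!parse_digits(untrusted, &val))
      _exit(EXIT_FAILURE);
    if (write(fd[1], &val, sizeof(val)) != (ssize_t)sizeof(val))
      _exit(EXIT_FAILURE);
    _exit(EXIT_SUCCESS);
  }
  /* Parent: sees only the already-parsed value, never the raw input. */
  close(fd[1]);
  if (read(fd[0], &val, sizeof(val)) == (ssize_t)sizeof(val)) {
    *out = val;
    ok = 1;
  }
  close(fd[0]);
  /* A child that was killed or exited non-zero invalidates the result. */
  if (waitpid(pid, &st, 0) == -1 || !WIFEXITED(st) ||
      WEXITSTATUS(st) != EXIT_SUCCESS)
    ok = 0;
  return ok;
}
```

If the child dies without writing (for instance because a sandbox killed it), the parent's read sees end-of-file and the call simply fails, which is the behaviour one wants for adversarial input.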

Portability

kcgi should run on any modern UNIX system and with any web server. To date, it has been built and run on GNU/Linux machines, BSD (OpenBSD, FreeBSD), and Mac OS X (Snow Leopard, Lion) on i386 and AMD64. It has been deployed under Apache, nginx, and OpenBSD's httpd(8) (the latter two via the slowcgi wrapper).

Portability across UNIX systems is made possible by a small configure script that checks for minor platform differences such as the availability of strlcpy(3), the Security mechanisms, and Compression support.

Extensibility

While page maps and input validation are entirely driven by the interfacing application, kcgi also allows for extension of the default HTTP headers, schemas, MIME types, and so on. Reasonable defaults have been provided for convenience. For specifics, see khttp_parse(3).

The library can also be extended for different output modes. Two such modes, kcgihtml(3) and kcgijson(3), are bundled with the system. The former provides a mechanism for building HTML5 trees around the usual khttp_write(3) family of functions.

Compression

If HAVE_ZLIB is enabled during compilation (via the Portability mechanism), khttp_body(3) will signal use of zlib to compress the HTTP body. Compression is only enabled if the client provides the correct (gzip) HTTP request header.
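
As a rough illustration of the client-side check, the sketch below scans the value of an Accept-Encoding request header for a gzip token. The function name accepts_gzip is hypothetical and this is not how kcgi is known to do it; in particular, the sketch ignores quality values, so "gzip;q=0" is (incorrectly, strictly speaking) treated as acceptance.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper: return 1 if the Accept-Encoding header value
 * lists "gzip" as a coding, 0 otherwise.  Quality values are ignored.
 */
int
accepts_gzip(const char *hdr)
{
  size_t len;

  while (hdr != NULL && *hdr != '\0') {
    /* Skip separators before the next coding name. */
    while (*hdr == ' ' || *hdr == '\t' || *hdr == ',')
      hdr++;
    /* The name ends at a comma, a parameter, or whitespace. */
    len = strcspn(hdr, ",; \t");
    if (len == 4 && strncasecmp(hdr, "gzip", 4) == 0)
      return 1;
    /* Move on to the next comma-separated element. */
    hdr += strcspn(hdr, ",");
  }
  return 0;
}
```

Called with "gzip, deflate" or "deflate, GZIP;q=0.8" this returns 1; with "deflate, br", "x-gzip", or a NULL pointer (header absent) it returns 0.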

Input Processing

All common input methods are supported by kcgi: query string, cookie, and form (multipart form-data and mixed, urlencoded, and plain). As described in the Security section, these fields are all parsed and validated from network data in a child process. Each input key-value pair can be matched (by key name) to a validator, which is run when fields are parsed. You can then look up key-value pairs in constant time in a table indexed by that key.
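
The idea of matching pairs to validators by key name, then looking them up by index, can be sketched without kcgi as follows. Everything here is a stand-in invented for illustration (struct pair, struct validator, build_fieldmap, the two validators, and the keys "age" and "name"); the real structures and built-in validators are documented in khttp_parse(3).

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, loosely modelled on the description above. */
enum key { KEY_AGE, KEY_NAME, KEY__MAX };

struct pair {
  const char *key;  /* field name as sent by the client */
  const char *val;  /* raw field value */
  long parsed;      /* filled in by a numeric validator */
};

struct validator {
  int (*valid)(struct pair *);  /* non-zero if the value is acceptable */
  const char *name;             /* key name this validator applies to */
};

/* Accept an unsigned decimal integer and record its parsed value. */
static int
valid_uint(struct pair *p)
{
  char *ep;
  long v;

  if (p->val[0] < '0' || p->val[0] > '9')
    return 0;
  errno = 0;
  v = strtol(p->val, &ep, 10);
  if (errno == ERANGE || *ep != '\0')
    return 0;
  p->parsed = v;
  return 1;
}

/* Accept any non-empty string. */
static int
valid_nonempty(struct pair *p)
{
  return p->val[0] != '\0';
}

/* One validator per key, in the same order as enum key. */
static const struct validator validators[KEY__MAX] = {
  { valid_uint, "age" },
  { valid_nonempty, "name" },
};

/*
 * Match each pair to a validator by key name and store the first
 * accepted pair for each key in map[], which is indexed by enum key.
 */
void
build_fieldmap(struct pair *pairs, size_t pairsz,
    struct pair *map[KEY__MAX])
{
  size_t i, k;

  for (k = 0; k < KEY__MAX; k++)
    map[k] = NULL;
  for (i = 0; i < pairsz; i++)
    for (k = 0; k < KEY__MAX; k++) {
      if (strcmp(pairs[i].key, validators[k].name) != 0)
        continue;
      if (map[k] == NULL && validators[k].valid(&pairs[i]))
        map[k] = &pairs[i];
      break;
    }
}
```

After build_fieldmap runs, the application never searches by name again: map[KEY_AGE] is either NULL (missing or invalid) or points at a pair whose parsed value can be used directly.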

Templating

Many applications will want just to fill in an output template instead of creating complex output trees (relegating most work to JavaScript and JSON). kcgi provides the khttp_template(3) family of functions to fill in files or memory buffers with data. Templates are the most common usage of kcgi, as they allow for a strong disconnect between presentation and logic.
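
For a feel of what filling in a memory buffer involves, here is a self-contained sketch that replaces @@key@@ markers with values. It is not the kcgi implementation: fill_template is a hypothetical function, it takes a plain array of values where khttp_template(3) uses a callback, and the @@ delimiter is assumed to match the real template syntax (check the manual).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical in-memory template filler (not the kcgi implementation).
 * Copies tmpl to out, replacing each "@@key@@" with the matching value.
 * Unknown keys and unterminated delimiters are copied through verbatim.
 * Returns 1 on success, 0 if out (of size outsz) is too small.
 */
int
fill_template(const char *tmpl, const char *const *keys,
    const char *const *vals, size_t keysz, char *out, size_t outsz)
{
  size_t used = 0, i;
  const char *end;

  if (outsz == 0)
    return 0;
  while (*tmpl != '\0') {
    const char *src = tmpl;  /* what to copy */
    size_t len = 1;          /* how much of it to copy */
    size_t skip = 1;         /* how far to advance in tmpl */

    if (strncmp(tmpl, "@@", 2) == 0 &&
        (end = strstr(tmpl + 2, "@@")) != NULL) {
      size_t klen = (size_t)(end - (tmpl + 2));

      /* Either way, consume the whole "@@...@@" token. */
      len = skip = klen + 4;
      for (i = 0; i < keysz; i++)
        if (strlen(keys[i]) == klen &&
            strncmp(keys[i], tmpl + 2, klen) == 0) {
          src = vals[i];
          len = strlen(vals[i]);
          break;
        }
    }
    /* Always leave room for the terminating NUL. */
    if (len >= outsz - used)
      return 0;
    memcpy(out + used, src, len);
    used += len;
    tmpl += skip;
  }
  out[used] = '\0';
  return 1;
}
```

Given keys "title" and "name" with values "Greetings" and "world", the template "<h1>@@title@@</h1> Hello, @@name@@!" becomes "<h1>Greetings</h1> Hello, world!". The presentation lives entirely in the template text; the program only supplies values.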

Testing

kcgi is shipped with a fully automated testing framework executed with make regress. Interfacing systems can also make use of this by working with the kcgi_regress(3) function library. This framework acts as a mini-webserver, listening on a local port, translating an HTTP document into a minimal CGI request, and passing the request to a kcgi CGI client. For internal tests, test requests are constructed with libcurl.

The automated test framework, at the moment, only has a few tests for basic functionality and sandboxing. The local port it binds to is also fixed, so if you plan on running the regression suite, you may need to tweak its access port.
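
The translation step performed by such a mini-webserver can be pictured with the sketch below, which splits an HTTP request line into the pieces a CGI program expects. It is a guess at the general shape, not code from kcgi_regress(3): struct cgi_vars and request_to_cgi are hypothetical, headers and request bodies are ignored, and how the path is divided between SCRIPT_NAME and PATH_INFO is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parts of a minimal CGI request. */
struct cgi_vars {
  char method[16];  /* would become REQUEST_METHOD */
  char path[256];   /* request path, without the query string */
  char query[256];  /* would become QUERY_STRING */
};

/*
 * Split a request line such as "GET /index.html?a=b HTTP/1.1".
 * Returns 1 on success, 0 if the line is malformed or too long.
 */
int
request_to_cgi(const char *line, struct cgi_vars *cv)
{
  char target[512];
  char *q;

  /* Field widths keep sscanf within the buffers. */
  if (sscanf(line, "%15s %511s", cv->method, target) != 2)
    return 0;
  if (target[0] != '/')
    return 0;
  cv->query[0] = '\0';
  /* Everything after the first '?' is the query string. */
  if ((q = strchr(target, '?')) != NULL) {
    *q++ = '\0';
    if (strlen(q) >= sizeof(cv->query))
      return 0;
    strcpy(cv->query, q);
  }
  if (strlen(target) >= sizeof(cv->path))
    return 0;
  strcpy(cv->path, target);
  return 1;
}
```

A harness built this way would then export the fields as environment variables (for example with setenv(3)) before executing the CGI program under test.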