kcgi: minimal CGI and FastCGI library for C/C++

kcgi is an open source CGI and FastCGI library for C/C++ web applications. It is minimal, secure, and auditable. To start, install the library. Then read the deployment and usage guides.

For fuller examples, see sample.c (C and CGI), samplepp.cc (C++ and CGI), sample-fcgi.c (C and FastCGI), or jump to the Documentation section.

kcgi supports many features: auto-compression, handling of all HTTP input operations (query strings, cookies, page bodies, multipart) with validation, authentication, configurable output caching, request debugging, formatted writers (JSON, XML, etc.) and so on. Its strongest differentiating feature is using sandboxing and process separation for handling the untrusted input path.

installation

First, check whether kcgi is already packaged for your system, such as with pkg_add kcgi (OpenBSD), pkg install kcgi (FreeBSD), brew install kcgi (Homebrew), pacman -S kcgi (Arch Linux), and so on. If so, install it with that package manager.

If not, kcgi has been built and run on GNU/Linux (musl and glibc), OpenBSD, NetBSD, FreeBSD, Solaris, OmniOS, and Mac OS X (only Mojave and newer!) on i386, amd64, powerpc, arm64, and sparc64. Download kcgi.tgz and verify the archive with kcgi.tgz.sha512. To run bleeding-edge code between releases, use the repository on GitHub. In both cases, build and install instructions may be found on the repository page.

The only hard build-time dependencies are BSD make (bmake on Linux) and zlib. If you're running the regression tests (see Testing), you'll also need libcurl.

deployment

To use kcgi, you'll need a web server capable of running CGI or FastCGI. Apache, nginx, and OpenBSD's httpd(8) have all been used extensively, the latter two either natively over FastCGI or, for plain CGI, via the slowcgi wrapper.

To compile kcgi applications, take the compiler flags from pkg-config. Linking works the same way.

% cc `pkg-config --cflags kcgi` -c yourprog.c
% cc yourprog.o `pkg-config --libs kcgi`

Well-configured web servers, such as the default OpenBSD server, run within a chroot(2) by default. If this is the case, you'll need to statically link your binary (or copy all required shared libraries into the chroot).

% cc -static yourprog.o `pkg-config --static --libs kcgi`

FastCGI applications may either be started directly by the web server (which is popular with Apache) or externally given a socket and kfcgi(8) (this method is normative for OpenBSD's httpd(8) and suggested for the security precautions taken by the wrapper).

documentation

The kcgi manpages, starting with kcgi(3), are the canonical source of documentation.

Alternatively, the following are introductory materials to the system:

Getting Started with CGI in C
This tutorial describes a typical CGI example using kcgi. In it, I'll process HTML form input with two named fields, string (a non-empty, unbounded string) and integer, a signed 64-bit integer. I'll then output the input within a simple HTML page. The tutorial will be laid out in code snippets, which I'll put together at the end. I'll then follow with compilation instructions.
Getting and Setting CGI Cookies in C
Cookies are an integral part of any web application. In this tutorial, I'll describe how to use the HTTP header functionality in kcgi to set, recognise, and store cookies.
FastCGI Deployments
FastCGI allows for much higher throughput by running web applications as daemons. In this tutorial, I'll describe how to deploy a simple FastCGI application using kfcgi(8).
Custom Validation
Applications often need to validate more than the doubles, integers, or strings covered by kvalid_string(3). In this tutorial, I'll provide some examples on how to override the validation function.
Using Pages
The tutorial gives an overview of the basic path handling provided by kcgi, and then shows and discusses relevant code snippets.
CGI for C++ applications
kcgi supports C++ callers just as easily as C. In this brief tutorial I'll take you through compiling a C++ application that uses the kcgi library.
Best practises for pledge(2) security
Let's set the record straight for securing kcgi CGI and FastCGI applications with pledge(2). This is focussed on secure OpenBSD deployments.
CORS and kcgi
This article isn't about CORS (cross-origin resource sharing) but rather the enumerations and functions available in kcgi(3) to handle CORS requests.

In addition to these resources, the following conference sessions have referenced kcgi.

Dzonsons, Kristaps. Role-based Access Control in BCHS Web Applications. Proceedings of AsiaBSDCon, Tokyo, Japan, March 2018. (Slides, video, paper.)
Dzonsons, Kristaps. Secure BSD Web Applications in C: Practical Strategies. Proceedings of AsiaBSDCon, Tokyo, Japan, March 2017. (Slides.)
Dzonsons, Kristaps. Secure BSD Web Application Development in C. Proceedings of AsiaBSDCon, Tokyo, Japan, March 2016. (Slides.)
Dzonsons, Kristaps. kcgi: securing CGI applications in C. Proceedings of AsiaBSDCon, Tokyo, Japan, March 2015. (Slides, paper.)

And the following relate to extending standards:

Dzonsons, Kristaps. FastCGI Extensions for Management Control. March 2016.

implementation details

The bulk of kcgi's CGI handling lies in khttp_parse(3), which fully parses the HTTP request. Application developers must invoke this function before all others. For FastCGI, this function is split between khttp_fcgi_init(3), which initialises context; and khttp_fcgi_parse(3), which receives new parsed requests. In either case, requests must be freed by khttp_free(3).

All functions isolate the parsing and validation of untrusted network data within a sandboxed child process. Sandboxes limit the environment available to a process, so exploitable errors in the parsing process (or validation with third-party libraries) cannot touch the system environment. The parsed data is returned to the parent process over a socket. In this design, the HTTP parser and input validator manage a single HTTP request, while the connection delegator accepts new HTTP requests and passes them along.
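
The general shape of this process separation can be sketched in plain POSIX C. This is not kcgi's implementation: parse_untrusted() and parse_in_child() are hypothetical names, the "parsing" is just a strict integer conversion, and no sandbox is actually entered. It only shows a child doing the parsing and handing the result to the parent over a socket pair.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical parser: accept only a complete decimal integer. */
static int
parse_untrusted(const char *input, int64_t *out)
{
	char		*end;
	long long	 v;

	errno = 0;
	v = strtoll(input, &end, 10);
	if (errno != 0 || end == input || *end != '\0')
		return 0;
	*out = (int64_t)v;
	return 1;
}

/*
 * Parse "input" in a child process and pass the result back over a
 * socket pair.  Returns 1 on success, 0 if the child rejected the
 * input, and -1 on system failure.
 */
int
parse_in_child(const char *input, int64_t *out)
{
	int	 fds[2], status;
	pid_t	 pid;
	int64_t	 v = 0;
	ssize_t	 n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: a real design would enter its sandbox here. */
		close(fds[0]);
		if (!parse_untrusted(input, &v))
			_exit(1);
		if (write(fds[1], &v, sizeof(v)) != (ssize_t)sizeof(v))
			_exit(2);
		_exit(0);
	}
	/* Parent: only ever handles the already-parsed value. */
	close(fds[1]);
	n = read(fds[0], &v, sizeof(v));
	close(fds[0]);
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) == 1)
		return 0;	/* child rejected the input */
	if (WEXITSTATUS(status) != 0 || n != (ssize_t)sizeof(v))
		return -1;
	*out = v;
	return 1;
}
```

Calling parse_in_child("42", &v) stores 42 in v and returns 1, while malformed input such as "12abc" returns 0. If the child crashed while parsing, the parent would see only a failed wait status, not corrupted memory.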

This method of sandboxing the untrusted parsing process follows OpenSSH, and requires special handling for each operating system:

seccomp(2) (Linux): This requires a seccomp-enabled Linux kernel and a recognised hardware architecture. It is supplemented by setrlimit(2) limiting.
pledge(2) (OpenBSD): This will only work on OpenBSD >5.8.
sandbox_init(3) (Apple OSX): This uses the sandboxing profile for pure computation as provided in Mac OS X Leopard and later. This is supplemented by resource limiting via setrlimit(2).
capsicum(4) (FreeBSD): Uses the capabilities facility on FreeBSD 10 and later. This is supplemented by resource limiting with setrlimit(2).
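
The setrlimit(2) supplement mentioned above can be illustrated on its own. The following is a minimal sketch, not kcgi's code: open_refused_under_rlimit() is a hypothetical name, and the choice of RLIMIT_NOFILE is an assumption made for the example. The child drops its descriptor limit to zero and then checks that it can no longer open a file.

```c
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Fork a child, forbid it from acquiring new file descriptors, and
 * have it try to open "path".  Returns 1 if the open was refused
 * with EMFILE, 0 if it was not, and -1 on system failure.
 */
int
open_refused_under_rlimit(const char *path)
{
	struct rlimit	 rl = { 0, 0 };	/* soft and hard limits of zero */
	pid_t		 pid;
	int		 fd, status;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
			_exit(2);
		fd = open(path, O_RDONLY);
		_exit(fd == -1 && errno == EMFILE ? 1 : 0);
	}
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	switch (WEXITSTATUS(status)) {
	case 0:
		return 0;
	case 1:
		return 1;
	default:
		return -1;
	}
}
```

Descriptors that were already open before the limit was lowered stay usable, which is why a child can still write its results back to the parent over an existing socket.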

Since validation occurs within the sandbox, special care must be taken that validation routines don't access the environment (e.g., by opening files, network connections, etc.), as the child might be abruptly killed by the sandbox facility. (Not all sandboxes do this.) If required, this kind of validation can take place after the parse validation sequence.

The connection delegator is similar, but has different sandboxing rules, as it must manage an open socket connection and respond to new requests.

testing

kcgi is shipped with a fully automated testing framework executed with make regress. To test your own applications, use the kcgiregress(3) library. This framework acts as a mini-webserver, listening on a local port, translating an HTTP document into a minimal CGI request, and passing the request to a kcgi CGI client. For internal tests, test requests are constructed with libcurl. The binding local port is fixed: if you plan on running the regression suite, you may need to tweak its access port.
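
To give a feel for what translating an HTTP document into a minimal CGI request involves, here is a small sketch of one step: splitting a request line into the method, path, and query string that a CGI client would expect as REQUEST_METHOD, path information, and QUERY_STRING. This is not kcgiregress(3) code: struct cgi_vars and request_line_to_cgi() are hypothetical, and the buffer sizes are arbitrary.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for values a CGI client reads from its environment. */
struct cgi_vars {
	char	 method[16];	/* REQUEST_METHOD */
	char	 path[256];	/* source of SCRIPT_NAME and PATH_INFO */
	char	 query[256];	/* QUERY_STRING */
};

/*
 * Split a request line such as "GET /app/page?x=1 HTTP/1.1".
 * Returns 1 on success and 0 if the line is malformed or too long.
 */
int
request_line_to_cgi(const char *line, struct cgi_vars *v)
{
	char	 target[512];
	char	*q;

	memset(v, 0, sizeof(*v));
	if (sscanf(line, "%15s %511s", v->method, target) != 2)
		return 0;
	/* Everything after the first '?' is the query string. */
	if ((q = strchr(target, '?')) != NULL) {
		*q++ = '\0';
		if (strlen(q) >= sizeof(v->query))
			return 0;
		strcpy(v->query, q);
	}
	if (strlen(target) >= sizeof(v->path))
		return 0;
	strcpy(v->path, target);
	return 1;
}
```

A test harness would then export these values with setenv(3) before executing the CGI client, and pass any request body on standard input.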

The testing framework is used by GitHub Actions to make sure that all commits pass the regression tests on all supported systems (including with compiler sanitisation).

Another testing framework exists for use with American fuzzy lop (AFL). To use it, build the afl target with your AFL compiler of choice, e.g., make clean, then make afl CC=afl-gcc. Then run the afl-fuzz tool on the afl-multipart, afl-plain, and afl-urlencoded binaries using the test cases (and dictionaries, for the first) provided.

performance

Security comes at a price—but not a stiff price. By design, kcgi incurs overhead in three ways: first, spawning a child to process the untrusted network data; second, enacting the sandbox framework; and third, passing parsed pairs back to the parent context. In the case of running CGI scripts, kcgi performance is bound to the operating system's ability to spawn and reap processes. For FastCGI, the bottleneck becomes the transfer of data.
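
To get a rough feel for the spawn-and-reap cost on a given system, a loop like the following can be timed. This is only a sketch and not part of kcgi: spawn_reap_mean() is a hypothetical name, and the child here does no work at all, so the number it reports is a lower bound rather than a measurement of kcgi itself.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime(2) */

#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Fork and reap "n" trivial children in sequence.  Returns the mean
 * number of seconds per cycle, or -1.0 on failure.
 */
double
spawn_reap_mean(int n)
{
	struct timespec	 t0, t1;
	pid_t		 pid;
	int		 i;

	if (n <= 0 || clock_gettime(CLOCK_MONOTONIC, &t0) == -1)
		return -1.0;
	for (i = 0; i < n; i++) {
		if ((pid = fork()) == -1)
			return -1.0;
		if (pid == 0)
			_exit(0);	/* child exits immediately */
		if (waitpid(pid, NULL, 0) == -1)
			return -1.0;
	}
	if (clock_gettime(CLOCK_MONOTONIC, &t1) == -1)
		return -1.0;
	return ((double)(t1.tv_sec - t0.tv_sec) +
	    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9) / n;
}
```

Calling spawn_reap_mean(1000) and printing the result gives a per-request floor for the CGI case. FastCGI avoids paying this cost for the application itself on every request.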
