DevSecFlops:

Source code sandboxing

In 2024, how easy is it for a developer to sandbox a program by its source code?

Assume that the program wants only to manipulate open file descriptors (e.g., standard IO) and manage its own memory—regular console stuff. If the running program tries to do anything else, it must be terminated or otherwise unable to acquire the requested resources (opening files, network sockets, shared memory, etc.).

This should succeed…
print("Hello, world!")
This should fail…
from pathlib import Path
txt = Path(".profile").read_text()
This should also fail…
import http.client
connection = http.client.HTTPSConnection("www.bsd.lv")

In this article, I examine different open source systems that could be used to enact such a sandbox, such as OpenBSD's pledge and Linux's seccomp, by example. Jump to the examples for a full list.
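To make the idea concrete for the Python snippets above, here is a minimal sketch of calling OpenBSD's pledge(2) through ctypes. It is an illustration under assumptions, not one of the site's examples: the helper name pledge_stdio is hypothetical, it assumes pledge is reachable through ctypes.CDLL(None), and on any other operating system it does nothing and returns False.

```python
import ctypes
import os
import sys


def pledge_stdio() -> bool:
    """Restrict this process to the "stdio" promise.

    Returns True if the process was pledged, False if pledge(2) is not
    available because we are not running on OpenBSD (hypothetical helper).
    """
    if not sys.platform.startswith("openbsd"):
        return False  # pledge(2) only exists on OpenBSD

    # Assumption: the C library is reachable through the default handle.
    libc = ctypes.CDLL(None, use_errno=True)
    # int pledge(const char *promises, const char *execpromises);
    libc.pledge.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    libc.pledge.restype = ctypes.c_int

    if libc.pledge(b"stdio", None) == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return True


if __name__ == "__main__":
    pledge_stdio()
    print("Hello, world!")
```

On OpenBSD, anything outside the "stdio" promise, such as opening .profile or creating a socket, should then be refused by the kernel. Note that a later import would itself need to open files, so every module the program needs has to be imported before the call.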

How complex is it to sandbox your source code?

This graph plots the total text length of all references that must be read to understand each system (manual pages, etc.) against the number of lines in an example.

Systems easy to understand (references) and implement (source code) are grouped in the lower-left, while progressively more complicated systems are in the upper-right.

Below is a list of all examples. Click the folder icon on an example to see its source code, references, and notes. Which method would you choose? Sources readable on a normal mobile phone (<20 lines) have their lines shown in blue, barely readable ones (20–30 lines) in orange, and anything longer in red. Deprecated sandboxes are marked next to the subsystem name.
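The colour rule above can be sketched as a small function. This is only an illustration: the name readability_colour is hypothetical, and it assumes every line of the source counts, including blank lines and comments, which the text does not specify.

```python
def readability_colour(source: str) -> str:
    """Map an example's line count to the colour used in the list."""
    # Assumption: every line counts, including blank lines and comments.
    n_lines = len(source.splitlines())
    if n_lines < 20:
        return "blue"    # readable on a normal mobile phone
    if n_lines <= 30:
        return "orange"  # barely readable
    return "red"         # too long for a phone screen


if __name__ == "__main__":
    print(readability_colour('print("Hello, world!")\n'))
```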

We can further clarify this data by grouping it by environment: operating system and sandbox. How hard is it to implement a sandbox in your environment of choice?

Moving from right to left in these graphs, that is, reducing complexity, puts into practice Jerry Saltzer's reflections on Multics, which featured a predecessor to modern call-based capability systems (emphasis mine):

The user… must figure out for himself how to accomplish his intentions amid a myriad of possibilities, not all of which he understands… The solution to this problem lies in better understanding the nature of the typical user's mental description of protection intent, and then devising interfaces which permit more direct specification of that protection intent. [ref]

These are all trivial examples, and demonstrate only as much about sandboxing as Hello, world programs demonstrate about a programming language. Let's refocus on a real-world example, taken both as a snapshot and as developed over time.

Real-world sandboxing in 2024.

Let's use openssh-portable as a canonical example. OpenSSH is under tremendous pressure for its security. It has a number of source code sandboxes, including most of those mentioned in this survey. In general, these sandboxes do little more than our examples: they limit the process to communicating over standard IO and performing in-process duties (memory management, etc.). How do the source code lengths (in lines) for each sandbox compare, with the collected reference lengths shown alongside?


This shows only an instantaneous view of the system, but part of a system's security is its maintenance. How difficult is it to maintain the security implementations over time? In the following chart, I continue looking at openssh-portable, this time extracting the commit history (via GitHub) of the specific files used for sandboxing.

This plots cumulative commit counts in the GitHub repository over time. The size of each commit is not considered, only the frequency. Subsystems that require significant maintenance will grow much more quickly than those with lighter maintenance burdens.
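A cumulative series like this could be derived roughly as follows. This is a sketch rather than the script behind the chart: the function names are hypothetical, it assumes a local clone with git on the PATH, and it uses author dates, where the original may well use commit dates.

```python
import subprocess
from collections import Counter
from datetime import date


def commit_dates(repo, path):
    """Author dates (YYYY-MM-DD) of every commit touching `path` in `repo`."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--date=short", "--format=%ad", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def cumulative_commits(dates):
    """Turn ISO date strings into sorted (day, running commit total) pairs."""
    per_day = Counter(date.fromisoformat(d) for d in dates)
    total = 0
    series = []
    for day in sorted(per_day):
        total += per_day[day]  # frequency only; commit size is ignored
        series.append((day, total))
    return series
```

For instance, cumulative_commits(commit_dates("openssh-portable", "sandbox-seccomp-filter.c")) would give the points for one line of the chart, assuming the clone lives in that directory and that this is the file of interest.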

Findings so far? It depends on your environment.

Linux has a whopping seven-fold increase in reference complexity, and even more for source complexity, over OpenBSD and FreeBSD. Landlock showed promise as a simpler alternative, but has since grown in complexity. OpenBSD and FreeBSD rank roughly equivalently, with the former having increased mildly in complexity.

The maintenance burden for seccomp, as illustrated by openssh-portable, is considerable. The burden for the other systems is far lighter: almost nothing since their original implementation.

Ominously, Mac OS X has deprecated its source sandbox (seatbelt), as has Java by discontinuing the JSM.

How are these findings reflected in a survey of real-world sandboxed systems? The chart below counts open source systems using a sandbox.

My methodology was to mine the FreeBSD and OpenBSD git repositories (specifically usr.bin and usr.sbin) for sandbox invocations, then look up the earliest entry for a contributor. I've added non-BSD systems as I know of them (e.g., Chromium). I'm not aware of central repositories for Linux sandboxing, so it's hard to gather information.
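The first step of that methodology might look something like the sketch below. It is a guess at the approach, not the actual mining script: the function name is hypothetical, and treating calls to pledge, cap_enter, and caph_enter in C sources as "sandbox invocations" is my assumption.

```python
import re
from pathlib import Path

# Assumption: a "sandbox invocation" is a call to one of these functions.
SANDBOX_CALL = re.compile(r"\b(pledge|cap_enter|caph_enter)\s*\(")


def find_sandboxed_programs(src_root, subdirs=("usr.bin", "usr.sbin")):
    """Map each program directory to the sandbox calls found in its C files."""
    root = Path(src_root)
    found = {}
    for sub in subdirs:
        if not (root / sub).is_dir():
            continue
        for path in sorted((root / sub).rglob("*.c")):
            text = path.read_text(errors="replace")
            calls = {m.group(1) for m in SANDBOX_CALL.finditer(text)}
            if calls:
                # e.g. usr.bin/cat/cat.c is attributed to the program "cat"
                program = path.relative_to(root).parts[1]
                found.setdefault(program, set()).update(calls)
    return {prog: sorted(calls) for prog, calls in found.items()}
```

Finding the earliest contributor entry for each hit would be a separate pass over the git history, which this sketch does not attempt.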

This is very incomplete! Help make these charts more meaningful: if you've significantly contributed to a sandbox effort, please submit an attestation by a GitHub pull request. Thank you!

How can we make this better in 2025?

I'm starting this site to gain a full picture of the sandbox landscape in open systems. Having a list of possible combinations of languages, operating systems, and sandbox tools is a good start. And for that, if you have additions, please visit the GitHub page to add more examples.

More importantly, I want to know who is using these security systems, and where. Let's put together some numbers for how many systems in the wild really are protected, and start a conversation about why some systems are more popular than others and what we can do to raise the state of security on less-popular systems. To that end, I've added attestations to each example in the GitHub repository. Just open a pull request with your GitHub name (for now, just GitHub; I'll add more later) and add the repository to which you've contributed security sandboxing.

Full instructions on how to add examples, sandboxes, and attestations are on the GitHub page, and all of these are possible through pull requests. I'll merge them on an as-needed basis.