Assume that the program wants only to manipulate open file descriptors (e.g., standard IO) and manage its own memory: regular console stuff. If the running program tries to do anything else, it must be terminated or otherwise prevented from acquiring the requested resources (opening files, network sockets, shared memory, etc.).
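# Allowed: writes to an already-open descriptor (standard output).
print("Hello, world!")
# Must be refused: opens a file.
from pathlib import Path
txt = Path(".profile").read_text()
# Must be refused: network access.
import http.client
connection = http.client.HTTPSConnection("www.bsd.lv")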
In this article, I examine by example different open source systems that could be used to enact such a sandbox, such as OpenBSD's pledge and Linux's seccomp. Jump to the examples for a full list.
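To make the policy concrete, here is a minimal Python sketch of a much weaker stand-in for such a sandbox. It is not pledge or seccomp: it swaps in the POSIX resource-limit trick of dropping RLIMIT_NOFILE to zero, so descriptors that are already open keep working while new files and sockets cannot be opened. The function names are hypothetical, and the sketch assumes a Unix-like system where os.fork and the resource module are available. The restricted code runs in a forked child so the calling process is left untouched.

```python
import os
import resource
import socket


def try_forbidden_operations() -> dict:
    """Attempt the two operations the policy should refuse (hypothetical helper)."""
    results = {}
    try:
        with open(os.devnull) as handle:  # needs a new file descriptor
            handle.read()
        results["open_file"] = "allowed"
    except OSError:
        results["open_file"] = "denied"
    try:
        socket.socket().close()  # needs a new file descriptor too
        results["open_socket"] = "allowed"
    except OSError:
        results["open_socket"] = "denied"
    return results


def run_sandboxed_child() -> tuple:
    """Fork a child, restrict it, and return (exit code, bytes the child wrote)."""
    read_end, write_end = os.pipe()  # opened BEFORE the limit is applied
    pid = os.fork()
    if pid == 0:
        code = 2  # reported if anything unexpected happens
        try:
            os.close(read_end)
            # No new descriptors from here on; this cannot be undone
            # by an unprivileged process.
            resource.setrlimit(resource.RLIMIT_NOFILE, (0, 0))
            # Already-open descriptors still work.
            os.write(write_end, b"Hello, world!\n")
            outcome = try_forbidden_operations()
            code = 0 if all(v == "denied" for v in outcome.values()) else 1
        finally:
            os._exit(code)
    os.close(write_end)
    with os.fdopen(read_end, "rb") as pipe:
        message = pipe.read()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status), message
```

Calling run_sandboxed_child() should return exit code 0 when both forbidden operations were refused. Note what this sketch does not cover: it restricts only the creation of descriptors and says nothing about other resources, which is part of why purpose-built systems exist.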
Systems easy to understand (references) and implement (source code) are grouped in the lower-left, while progressively more complicated systems are in the upper-right.
Below is a list of all examples. Click the folder icon on an example to see its source code, references, and notes. Which method would you choose? Sources readable on a normal mobile phone (<20 lines) have their line counts shown in blue, barely readable ones (20–30 lines) in orange, and anything longer in red. Deprecated sandboxes are marked next to the subsystem name.
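The color coding amounts to a simple threshold rule, sketched below in Python. The function names are hypothetical, and counting every line (blank lines and comments included) is an assumption: the exact counting rule is not stated here.

```python
def count_lines(source: str) -> int:
    """Count the lines of a source listing (assumption: every line counts)."""
    return len(source.splitlines())


def readability_color(line_count: int) -> str:
    """Map a line count to the color bands described above."""
    if line_count < 20:
        return "blue"    # readable on a normal mobile phone
    if line_count <= 30:
        return "orange"  # barely readable
    return "red"         # anything longer
```

For instance, a 19-line example falls in the blue band, while one more line tips it into orange.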
We can further clarify this data by grouping by environment, that is, by operating system and sandbox. How hard is it to implement a sandbox in your environment of choice?
Moving from right to left in these graphs (reducing complexity) puts into practice Jerry Saltzer's reflections on Multics, which featured a predecessor to modern call-based capabilities systems (emphasis mine):
"The user… must figure out for himself how to accomplish his intentions amid a myriad of possibilities, not all of which he understands… The solution to this problem lies in better understanding the nature of the typical user's mental description of protection intent, and then devising interfaces which permit more direct specification of that protection intent." [ref]
These are all trivial examples, and demonstrate only as much about sandboxing as "Hello, world" programs demonstrate about a programming language. Let's refocus on a real-world example, taken both as a snapshot and as developed over time.
Let's use openssh-portable as a canonical example. OpenSSH is under tremendous pressure for its security. It has a number of source code sandboxes, including most of those mentioned in this survey. In general, these sandboxes do little more than our examples: they limit resources to communication over standard IO and in-process duties (memory management, etc.). How do the source code lengths (in lines) of each sandbox compare, with the length of the collected references shown alongside?
This shows only an instantaneous view of the system, but part of a system's security is its maintenance over time. How difficult is it to maintain the security implementations? In the following chart, I continue looking at openssh-portable, this time extracting the commit history (via GitHub) of the specific files used for sandboxing.
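One rough way to approximate this kind of extraction from a local clone is to ask git for the commit dates of a single file and bucket them by year. The sketch below is an assumption about method, not the procedure used for the chart: the function names are hypothetical, and commits per year are only a crude proxy for maintenance burden.

```python
import subprocess
from collections import Counter


def commit_dates(repo_dir: str, path: str) -> str:
    """Return one YYYY-MM-DD author date per commit touching `path`."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--follow",
         "--date=short", "--format=%ad", "--", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def commits_per_year(log_output: str) -> Counter:
    """Count commits per calendar year from the output of commit_dates()."""
    years = Counter()
    for line in log_output.splitlines():
        line = line.strip()
        if line:
            years[line[:4]] += 1  # first four characters are the year
    return years
```

Here repo_dir would be a checkout of the repository and path whichever sandbox source file is being tracked; feeding the result of commit_dates() to commits_per_year() gives a per-year tally for that file.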
Linux has a whopping seven-fold increase in reference complexity, and even more for source complexity, over OpenBSD and FreeBSD. Landlock showed promise as a simpler alternative, but has since grown in complexity. OpenBSD and FreeBSD rank roughly equivalently, with the former having increased mildly in complexity.
The maintenance burden for seccomp, as illustrated by openssh-portable, is considerable. The burden for the other systems is considerably less—almost nothing since original implementation.
Ominously, Mac OS X has deprecated its source sandbox (seatbelt), as has Java by discontinuing the JSM.
How are these findings reflected in a survey of real-world sandboxed systems? The chart below counts open source systems using a sandbox.
My methodology was to mine the FreeBSD and OpenBSD git repositories (specifically usr.bin and usr.sbin) for sandbox invocations, then look up the earliest entry for each contributor. I've added non-BSD systems as I know of them (e.g., Chromium). I'm not aware of central repositories for Linux sandboxing, so that information is hard to gather.
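The mining step can be sketched as a scan of a source tree for calls to the sandbox entry points. This is a guess at the approach rather than the script actually used: the function name and the choice of call names (pledge and unveil on OpenBSD, cap_enter and cap_rights_limit on FreeBSD) are assumptions, and a plain regular expression will also match mentions in comments.

```python
import re
from pathlib import Path

# Assumed set of sandbox entry points to look for.
SANDBOX_CALL = re.compile(r"\b(pledge|unveil|cap_enter|cap_rights_limit)\s*\(")


def find_sandbox_calls(root: str) -> dict:
    """Map each C source file under `root` to the sandbox calls it contains."""
    hits = {}
    for source in sorted(Path(root).rglob("*.c")):
        text = source.read_text(errors="replace")
        names = sorted(set(SANDBOX_CALL.findall(text)))
        if names:
            hits[str(source.relative_to(root))] = names
    return hits
```

Pointing find_sandbox_calls() at a checkout's usr.bin or usr.sbin directory yields the files worth inspecting; finding each contributor's earliest entry would then be a separate pass over the git history of those files.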
This is very incomplete! Help make these charts more meaningful: if you've significantly contributed to a sandbox effort, please submit an attestation by a GitHub pull request. Thank you!
I'm starting this site to gain a full picture of the sandbox landscape in open systems. Having a list of possible combinations of languages, operating systems, and sandbox tools is a good start. And for that, if you have additions, please visit the GitHub page to add more examples.
More importantly, I want to know who is using these security systems, and where. Let's put together some numbers for how many systems in the wild really are protected, and start a conversation about why some systems are more popular and what we can do to raise the state of security on less-popular systems.
To wit, I've added attestations to each example in the GitHub repository. Just open a pull request with your GitHub name (for now, just GitHub; I'll add more later) and add the repository to which you've contributed security sandboxing.
Full instructions on how to add examples, sandboxes, and attestations are all on the GitHub page, and all of it is possible through pull requests. I'll merge these in on an as-needed basis.