Cargo.lock 🔗
@@ -15153,6 +15153,10 @@ dependencies = [
"winapi-util",
]
+[[package]]
+name = "sandbox"
+version = "0.1.0"
+
[[package]]
name = "scc"
version = "3.5.6"
Richard Feldman created
Add a new sandbox crate as the future home for OS-level sandboxing code
(currently in the terminal crate). The README documents:
- Always-on process tracking for reliable cleanup, even without sandbox
restrictions configured
- macOS: convergent cleanup using Seatbelt sandbox fingerprinting with
sandbox_check() to find escaped processes, replacing the current
killpg + timer approach
- macOS: two-point fingerprint design (allow/deny sibling paths) that
uniquely identifies session processes without false positives
- Linux: cgroups v2 for inescapable process lifetime management
- Signal scoping rationale ((target children) vs bare (allow signal))
- Alternatives considered and rejected: audit sessions, Endpoint
Security, XNU coalitions, temp copies, symlinks, setsid blocking, VMs
Cargo.lock | 4
Cargo.toml | 1
crates/sandbox/Cargo.toml | 14 +
crates/sandbox/README.md | 289 +++++++++++++++++++++++++++++++++++++
crates/sandbox/src/sandbox.rs | 7
5 files changed, 315 insertions(+)
@@ -15153,6 +15153,10 @@ dependencies = [
"winapi-util",
]
+[[package]]
+name = "sandbox"
+version = "0.1.0"
+
[[package]]
name = "scc"
version = "3.5.6"
@@ -163,6 +163,7 @@ members = [
"crates/rope",
"crates/rpc",
"crates/rules_library",
+ "crates/sandbox",
"crates/scheduler",
"crates/schema_generator",
"crates/search",
@@ -0,0 +1,14 @@
+[package]
+name = "sandbox"
+version = "0.1.0"
+edition.workspace = true
+publish.workspace = true
+license = "GPL-3.0-or-later"
+description = "OS-level sandboxing for terminal processes in Zed"
+
+[lints]
+workspace = true
+
+[lib]
+path = "src/sandbox.rs"
+doctest = false
@@ -0,0 +1,289 @@
+# Sandbox
+
+OS-level sandboxing for terminal processes spawned by Zed — both interactive
+user terminals and agent tool invocations. The sandbox restricts filesystem
+access, network access, and other capabilities so that commands run in the
+terminal can only affect what they're explicitly permitted to.
+
+## Platform mechanisms
+
+- **macOS**: Seatbelt (SBPL profiles applied via `sandbox_init()`)
+- **Linux**: Landlock LSM for filesystem restrictions, cgroups v2 for process
+ lifetime management
+
+Both mechanisms are inherited by child processes and cannot be removed. A
+sandboxed shell and everything it spawns remain sandboxed for their entire
+lifetime.
+
+## Always-on process tracking
+
+Reliable process cleanup is valuable even when the user has not configured any
+sandbox restrictions. The standard approach of `killpg()` (kill by process
+group) is unreliable — a process can escape via `setsid()` or `setpgid()`, and
+the terminal's `Drop` impl will miss it.
+
+For this reason, **process tracking is always enabled for every terminal
+session**, regardless of whether sandbox restrictions are configured:
+
+- **macOS**: A minimal Seatbelt profile is applied containing only the session
+ fingerprint (see below) and `(allow default)` for everything else. This
+ doesn't restrict the process at all, but gives us the `sandbox_check()`
+ fingerprint needed to reliably find and kill all descendants. When full
+ sandbox restrictions are also enabled, the fingerprint is embedded in the
+ restrictive profile instead.
+
+- **Linux**: A cgroup is created for every terminal session. On cleanup, the
+ cgroup is frozen and all members are killed. This works regardless of whether
+ Landlock filesystem restrictions are also enabled.
+
+This replaces the current cleanup approach (100ms delay + `kill_child_process`)
+with a convergent, reliable mechanism on both platforms.
+
+## Process cleanup on terminal close
+
+When a terminal session ends, all processes it spawned must be killed. This is
+straightforward on Linux (cgroups v2 provides an atomic, inescapable kill), but
+requires careful handling on macOS where no equivalent kernel primitive exists.
+
+### The problem
+
+A process inside the sandbox can call `setsid()` or `setpgid()` to leave the
+shell's process group. After that, `killpg()` (which kills by process group)
+won't reach it. If the process also double-forks and the intermediate parent
+exits, the grandchild is reparented to PID 1 (launchd), severing the parent
+chain entirely. This means:
+
+- **Process group killing** misses it (different group).
+- **Parent chain walking** can't find it (parent is PID 1).
+- The process persists after the terminal closes, retaining whatever sandbox
+ permissions it was granted at spawn time.
+
+macOS Seatbelt has no operation for `setsid()` — it isn't a filterable
+operation in SBPL, so the sandbox can't prevent this. (On Linux, seccomp could
+block `setsid()`, but it would break legitimate programs like `ssh`.)
+
+### Why stale permissions matter
+
+The sandbox profile is a snapshot frozen at spawn time. If a process escapes
+cleanup, it retains the original permissions indefinitely. This is a problem
+because:
+
+- The user might later add secrets to a directory that was in the sandbox's
+ allowed paths.
+- The user might change sandbox settings for future sessions, but the escaped
+ process still has the old, more-permissive profile.
+- For agent tool use especially, the sandbox permissions are granted for a
+ specific task. An escaped process retaining those permissions after the task
+ is complete violates the principle of least privilege.
+
+### Linux: cgroups v2
+
+On Linux, the solution is to place the shell in a dedicated cgroup. All
+descendants are automatically tracked in the cgroup regardless of `setsid()`,
+`setpgid()`, or reparenting. No process can leave a cgroup without
+`CAP_SYS_ADMIN`. On terminal close:
+
+1. Freeze the cgroup (prevents new forks).
+2. Kill all processes in the cgroup.
+3. Delete the cgroup.
+
+This is a hard guarantee — the same mechanism containers use.
+
+cgroups v2 is the default on all modern Linux distributions (Ubuntu 21.10+,
+Fedora 31+, Debian 11+, Arch 2020+, RHEL 9+). No installation or
+configuration is needed. Regular (non-root) users can create child cgroups
+within their own systemd user slice, so no elevated privileges are required.
+
+### macOS: sandbox fingerprinting with convergent cleanup
+
+macOS has no public equivalent to cgroups. The approach is a convergent
+scan-and-kill loop that uses the Seatbelt sandbox profile itself as an
+unforgeable fingerprint.
+
+#### Sandbox fingerprint
+
+Each terminal session embeds a unique fingerprint in its SBPL profile: a
+per-session UUID path where one child path is allowed and a sibling is denied.
+
+```
+(allow file-read* (subpath "/tmp/.zed-sandbox-<uuid>/allow"))
+;; /tmp/.zed-sandbox-<uuid>/deny is implicitly denied by (deny default)
+```
+
+When the session has no sandbox restrictions (fingerprint-only mode), the
+profile uses `(allow default)` instead of `(deny default)`, but still includes
+an explicit deny for the fingerprint's deny-side path:
+
+```
+(version 1)
+(allow default)
+(deny file-read* (subpath "/tmp/.zed-sandbox-<uuid>/deny"))
+(allow file-read* (subpath "/tmp/.zed-sandbox-<uuid>/allow"))
+```
+
+This two-point fingerprint cannot be produced by any other sandbox profile:
+
+- A sandbox that blanket-allows `/tmp` would allow **both** paths — fails the
+ deny check.
+- A sandbox that blanket-denies `/tmp` would deny **both** paths — fails the
+ allow check.
+- An unsandboxed process allows everything — fails the deny check.
+- Only a process with our exact profile allows one and denies the other.
+
+The fingerprint is checked from outside the process using `sandbox_check()`:
+
+```c
+int allows = sandbox_check(pid, "file-read-data",
+ SANDBOX_FILTER_PATH, "/tmp/.zed-sandbox-<uuid>/allow") == 0;
+int denies = sandbox_check(pid, "file-read-data",
+ SANDBOX_FILTER_PATH, "/tmp/.zed-sandbox-<uuid>/deny") != 0;
+// Match requires: allows && denies
+```
+
+The fingerprint is unforgeable because the Seatbelt sandbox is a kernel-level
+invariant — no process can modify or remove its own sandbox profile.
+
+#### Convergent cleanup loop
+
+On terminal close:
+
+1. `killpg(pgid, SIGKILL)` — kill the process group. This instantly handles
+ the vast majority of descendants (everything that didn't escape the group).
+2. Enumerate all processes owned by the current UID (via `sysctl`
+ `KERN_PROC_UID`).
+3. For each process, probe with `sandbox_check` using the session fingerprint.
+4. `SIGKILL` every match.
+5. Go to step 2.
+6. When a full scan finds zero matches, every process from this session is
+ dead.
+7. Delete the fingerprint directory.
+
+**Why this terminates:** Each iteration either discovers processes (and kills
+them) or discovers none (loop exits). The total number of processes is finite,
+and the set of living fingerprinted processes shrinks monotonically.
+
+**Why this is correct:** The Seatbelt sandbox is inherited by all descendants
+and cannot be removed. Every descendant of the sandboxed shell — regardless of
+`setsid()`, `setpgid()`, double-forking, or reparenting to PID 1 — carries the
+session fingerprint. `sandbox_check` finds them by probing the kernel, not by
+walking the process tree.
+
+**Why SIGKILL on sight instead of SIGSTOP:** An earlier design froze escapees
+with `SIGSTOP` during scanning, then killed them all at the end. But `SIGSTOP`
+only stops the process you send it to, not its children — so children of a
+stopped process are still running and can fork. `SIGKILL` is equally effective:
+a dead process can't fork, and any children it already created are findable by
+fingerprint on the next scan iteration. The simpler approach is just to kill
+everything on sight and keep scanning until the scan comes back empty.
+
+**Why not process-group operations after step 1:** After `killpg` handles the
+initial process group, any remaining processes are by definition ones that
+escaped via `setsid()` or `setpgid()`. They're in different process groups (or
+their own sessions), so further `killpg` calls can't target them without
+knowing their group IDs. Worse, if a process double-forks and the intermediate
+parent exits, the grandchild is reparented to PID 1 (launchd) — there's no
+parent chain linking it back to the original shell, and its process group is
+unrelated to ours. The only reliable way to find these escapees is the
+fingerprint probe, which works regardless of process group, session, or parent
+relationship.
+
+**Residual race:** Between discovering a process (step 3) and killing it (step
+4), the process could fork. But the child inherits the fingerprint, so the next
+iteration of the loop finds it. The loop continues until no such children
+remain. The only way a process could escape is to fork a child that somehow
+doesn't inherit the sandbox — which the kernel guarantees cannot happen.
+
+### Alternatives considered and rejected
+
+#### Audit session IDs (BSM)
+
+macOS's BSM audit framework assigns each process an audit session ID
+(`ai_asid`) that is inherited by children. In principle, this could track
+descendants. Rejected because:
+
+- `getaudit_addr()` requires elevated privileges.
+- There is no "kill all processes in this audit session" syscall — you still
+ end up enumerating and killing individually.
+- macOS doesn't consistently use POSIX sessions (`ps -e -o sess` shows 0 for
+ all processes on many systems).
+
+#### Endpoint Security framework
+
+Apple's Endpoint Security framework provides kernel-level notifications for
+every fork/exec event, which would allow perfectly reliable tracking. Rejected
+because:
+
+- Requires the `com.apple.developer.endpoint-security.client` entitlement,
+ which must be approved by Apple.
+- Designed for security products (antivirus, MDM), not general-purpose apps.
+- Significantly increases the complexity and privilege requirements of Zed.
+
+#### XNU coalitions
+
+macOS has a kernel concept called "coalitions" that groups related processes for
+resource tracking and lifecycle management — essentially Apple's internal
+equivalent of cgroups. Rejected because:
+
+- The APIs (`coalition_create()`, `coalition_terminate()`) are private SPI.
+- They require entitlements not available to third-party apps.
+
+#### Temporary copy / overlay of project directory
+
+Instead of granting sandbox access to the real project directory, use a
+temporary copy or FUSE overlay, then delete it on terminal close. Rejected
+because:
+
+- Copying large projects is expensive.
+- File watching, symlinks, and build tool caching break.
+- FUSE on macOS requires macFUSE (third-party kext) or FSKit (macOS 15+).
+- Tools that embed absolute paths (compiler errors, debugger info) would show
+ wrong paths.
+
+#### Symlink indirection
+
+Grant sandbox access to a symlink path (e.g., `/tmp/.zed-link-<uuid>` →
+`/real/project/`), then delete the symlink on cleanup. Rejected because:
+
+- Seatbelt resolves symlinks to canonical paths when checking access (this is
+ why `canonicalize_paths()` is called before building the profile).
+- Deleting the symlink wouldn't revoke access to the underlying real path.
+
+#### Blocking `setsid()` / `setpgid()`
+
+Prevent processes from leaving the process group in the first place. Rejected
+because:
+
+- Seatbelt has no filterable operation for these syscalls.
+- On Linux, seccomp could block them, but this breaks legitimate programs
+ (`ssh`, some build tools, process managers).
+
+#### Lightweight VM via Virtualization framework
+
+Run agent commands inside a macOS Virtualization framework VM. This would give a
+hard process-lifetime guarantee (shutting down the VM kills everything).
+Rejected (for now) because:
+
+- Massive architectural change.
+- The VM runs Linux, not macOS — macOS-specific tools wouldn't work.
+- Resource overhead (memory, CPU, startup time).
+- Overkill for the current threat model.
+
+## Signal scoping (macOS)
+
+The SBPL profile uses `(allow signal (target children))` rather than a bare
+`(allow signal)`. This prevents the sandboxed process from signaling arbitrary
+same-user processes (other Zed instances, browsers, etc.) while still allowing
+the shell to:
+
+- Manage jobs (`kill %1`, `bg`, `fg`)
+- Use the `kill` command on child processes
+- Clean up background jobs on exit (SIGHUP)
+
+Note that Ctrl+C and Ctrl+Z are sent by the kernel's TTY driver, not by the
+shell, so they work regardless of signal sandbox rules.
+
+`(target self)` was considered but rejected because it would break all job
+control and shell cleanup of background processes.
+
+In fingerprint-only mode (no sandbox restrictions), `(allow default)` already
+permits all signals, so no explicit signal rule is needed.
@@ -0,0 +1,7 @@
+// Sandbox crate — OS-level sandboxing for terminal processes.
+//
+// See README.md for design context and rationale.
+//
+// The implementation currently lives in the `terminal` crate
+// (sandbox_exec.rs, sandbox_macos.rs, sandbox_linux.rs) and will
+// be migrated here.