README.md

# Sandbox

OS-level sandboxing for terminal processes spawned by Zed, covering both
interactive user terminals and agent tool invocations. The sandbox restricts
filesystem access, network access, and other capabilities so that commands run
in the terminal can only affect what they're explicitly permitted to.

## Platform mechanisms

- **macOS**: Seatbelt (SBPL profiles applied via `sandbox_init()`)
- **Linux**: Landlock LSM for filesystem restrictions, cgroups v2 for process
  lifetime management

Both mechanisms are inherited by child processes and cannot be removed. A
sandboxed shell and everything it spawns remain sandboxed for their entire
lifetime.

## Always-on process tracking

Reliable process cleanup is valuable even when the user has not configured any
sandbox restrictions. The standard approach of `killpg()` (kill by process
group) is unreliable: a process can escape via `setsid()` or `setpgid()`, and
the terminal's `Drop` impl will miss it.

For this reason, **process tracking is always enabled for every terminal
session**, regardless of whether sandbox restrictions are configured:

- **macOS**: A minimal Seatbelt profile is applied containing only the session
  fingerprint (see below) and `(allow default)` for everything else. This
  doesn't restrict the process at all, but gives us the `sandbox_check()`
  fingerprint needed to reliably find and kill all descendants. When full
  sandbox restrictions are also enabled, the fingerprint is embedded in the
  restrictive profile instead.

- **Linux**: A cgroup is created for every terminal session. On cleanup, the
  cgroup is frozen and all members are killed. This works regardless of whether
  Landlock filesystem restrictions are also enabled.

This replaces the current cleanup approach (100ms delay + `kill_child_process`)
with a convergent, reliable mechanism on both platforms.

## Process cleanup on terminal close

When a terminal session ends, all processes it spawned must be killed. This is
straightforward on Linux (cgroups v2 provides an atomic, inescapable kill), but
requires careful handling on macOS where no equivalent kernel primitive exists.

### The problem

A process inside the sandbox can call `setsid()` or `setpgid()` to leave the
shell's process group. After that, `killpg()` (which kills by process group)
won't reach it. If the process also double-forks and the intermediate parent
exits, the grandchild is reparented to PID 1 (launchd), severing the parent
chain entirely. This means:

- **Process group killing** misses it (different group).
- **Parent chain walking** can't find it (parent is PID 1).
- The process persists after the terminal closes, retaining whatever sandbox
  permissions it was granted at spawn time.

macOS Seatbelt has no operation for `setsid()`: it isn't a filterable
operation in SBPL, so the sandbox can't prevent this. (On Linux, seccomp could
block `setsid()`, but it would break legitimate programs like `ssh`.)

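To make the two failure modes above concrete, here is a small Rust model of a process table after a double-fork plus `setsid()`. It is only a sketch: `Proc`, `in_group`, and `descendants` are hypothetical names, and the code simulates the bookkeeping without making any system calls.

```rust
/// One row of a simplified process table (hypothetical model, not a real API).
#[derive(Clone, Copy, Debug)]
pub struct Proc {
    pub pid: i32,
    pub ppid: i32,
    pub pgid: i32,
}

/// The processes a `killpg(pgid, ...)` call would reach.
pub fn in_group(table: &[Proc], pgid: i32) -> Vec<i32> {
    table.iter().filter(|p| p.pgid == pgid).map(|p| p.pid).collect()
}

/// The processes found by walking parent links down from `root`.
pub fn descendants(table: &[Proc], root: i32) -> Vec<i32> {
    let mut found = vec![root];
    let mut i = 0;
    while i < found.len() {
        let parent = found[i];
        for p in table {
            if p.ppid == parent && !found.contains(&p.pid) {
                found.push(p.pid);
            }
        }
        i += 1;
    }
    found
}
```

With a shell (pid 10), one ordinary child (pid 11), and an escapee (pid 13, reparented to PID 1 and in its own group), both searches return only pids 10 and 11.
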
### Why stale permissions matter

The sandbox profile is a snapshot frozen at spawn time. If a process escapes
cleanup, it retains the original permissions indefinitely. This is a problem
because:

- The user might later add secrets to a directory that was in the sandbox's
  allowed paths.
- The user might change sandbox settings for future sessions, but the escaped
  process still has the old, more-permissive profile.
- For agent tool use especially, the sandbox permissions are granted for a
  specific task. An escaped process retaining those permissions after the task
  is complete violates the principle of least privilege.

### Linux: cgroups v2

On Linux, the solution is to place the shell in a dedicated cgroup. All
descendants are automatically tracked in the cgroup regardless of `setsid()`,
`setpgid()`, or reparenting. A process can leave a cgroup only by being written
into another cgroup's `cgroup.procs`, which requires write access to that file
and to the `cgroup.procs` of the common ancestor, so the guarantee holds as
long as sandboxed processes cannot write to those files. On terminal close:

1. Freeze the cgroup (prevents new forks).
2. Kill all processes in the cgroup.
3. Delete the cgroup.

This is a hard guarantee, and the same mechanism containers use.

cgroups v2 is the default on all modern Linux distributions (Ubuntu 21.10+,
Fedora 31+, Debian 11+, Arch 2020+, RHEL 9+). No installation or
configuration is needed. Regular (non-root) users can create child cgroups
within their own systemd user slice, so no elevated privileges are required.

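The freeze, kill, delete sequence above can be sketched with plain file writes against the cgroup v2 filesystem. This is a minimal sketch, not Zed's implementation: the function names are hypothetical, the kill step assumes a kernel that provides `cgroup.kill` (Linux 5.14 or later), and the retry count in `remove` is an arbitrary illustrative value.

```rust
use std::path::Path;
use std::time::Duration;
use std::{fs, io, thread};

/// Step 1: freeze the cgroup so no member can run, and therefore fork.
pub fn freeze(cgroup: &Path) -> io::Result<()> {
    fs::write(cgroup.join("cgroup.freeze"), "1")
}

/// Step 2: kill every member. Writing "1" to `cgroup.kill` sends SIGKILL to
/// all processes in the cgroup (assumes Linux 5.14+).
pub fn kill_all(cgroup: &Path) -> io::Result<()> {
    fs::write(cgroup.join("cgroup.kill"), "1")
}

/// Step 3: delete the cgroup. On cgroupfs, rmdir fails while the cgroup still
/// has members, so retry briefly while the killed processes are torn down.
pub fn remove(cgroup: &Path) -> io::Result<()> {
    let mut last = fs::remove_dir(cgroup);
    for _ in 0..50 {
        if last.is_ok() {
            break;
        }
        thread::sleep(Duration::from_millis(10));
        last = fs::remove_dir(cgroup);
    }
    last
}

/// Runs the three steps in order, stopping at the first failure.
pub fn teardown(cgroup: &Path) -> io::Result<()> {
    freeze(cgroup)?;
    kill_all(cgroup)?;
    remove(cgroup)
}
```

On older kernels without `cgroup.kill`, step 2 would instead read the PIDs from `cgroup.procs` and signal each one while the cgroup stays frozen.
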
### macOS: sandbox fingerprinting with convergent cleanup

macOS has no public equivalent to cgroups. The approach is a convergent
scan-and-kill loop that uses the Seatbelt sandbox profile itself as an
unforgeable fingerprint.

#### Sandbox fingerprint

Each terminal session embeds a unique fingerprint in its SBPL profile: a
per-session UUID path where one child path is allowed and a sibling is denied.

```
(allow file-read* (subpath "/tmp/.zed-sandbox-<uuid>/allow"))
;; /tmp/.zed-sandbox-<uuid>/deny is implicitly denied by (deny default)
```

When the session has no sandbox restrictions (fingerprint-only mode), the
profile uses `(allow default)` instead of `(deny default)`, but still includes
an explicit deny for the fingerprint's deny-side path:

```
(version 1)
(allow default)
(deny file-read* (subpath "/tmp/.zed-sandbox-<uuid>/deny"))
(allow file-read* (subpath "/tmp/.zed-sandbox-<uuid>/allow"))
```

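As an illustration of how the two profile shapes above relate, the following Rust sketch emits the fingerprint lines for either mode. `Mode` and `fingerprint_profile` are hypothetical names, the session ID is assumed to be a plain UUID string with no quotes or backslashes, and a real restrictive profile would carry many more rules after these.

```rust
/// Which of the two profile shapes to emit (hypothetical type).
pub enum Mode {
    /// `(allow default)` plus the fingerprint; no other restrictions.
    FingerprintOnly,
    /// `(deny default)` plus the fingerprint; the caller appends its own rules.
    Restricted,
}

/// Builds the fingerprint portion of an SBPL profile for one session.
pub fn fingerprint_profile(session_id: &str, mode: Mode) -> String {
    let base = format!("/tmp/.zed-sandbox-{session_id}");
    let mut profile = String::from("(version 1)\n");
    match mode {
        Mode::FingerprintOnly => {
            profile.push_str("(allow default)\n");
            // With everything allowed, the deny side needs an explicit rule.
            profile.push_str(&format!("(deny file-read* (subpath \"{base}/deny\"))\n"));
        }
        Mode::Restricted => {
            // The deny side needs no rule here: (deny default) covers it.
            profile.push_str("(deny default)\n");
        }
    }
    profile.push_str(&format!("(allow file-read* (subpath \"{base}/allow\"))\n"));
    profile
}
```
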
This two-point fingerprint cannot be produced by any other sandbox profile:

- A sandbox that blanket-allows `/tmp` would allow **both** paths, failing the
  deny check.
- A sandbox that blanket-denies `/tmp` would deny **both** paths, failing the
  allow check.
- An unsandboxed process allows everything, failing the deny check.
- Only a process with our exact profile allows one and denies the other.

The fingerprint is checked from outside the process using `sandbox_check()`:

```c
int allows = sandbox_check(pid, "file-read-data",
    SANDBOX_FILTER_PATH, "/tmp/.zed-sandbox-<uuid>/allow") == 0;
int denies = sandbox_check(pid, "file-read-data",
    SANDBOX_FILTER_PATH, "/tmp/.zed-sandbox-<uuid>/deny") != 0;
// Match requires: allows && denies
```

The fingerprint is unforgeable because the Seatbelt sandbox is a kernel-level
invariant: no process can modify or remove its own sandbox profile.

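The four cases in the list above reduce to a two-input predicate. The Rust sketch below spells that out; `Probe` and `is_session_member` are hypothetical names, and `from_raw` simply mirrors the `== 0` / `!= 0` convention used in the C snippet rather than documenting `sandbox_check` itself.

```rust
/// Outcome of probing one fingerprint path (hypothetical type).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Probe {
    Allowed,
    Denied,
}

impl Probe {
    /// Same convention as the C snippet: zero means allowed, anything else
    /// is treated as denied.
    pub fn from_raw(rc: i32) -> Probe {
        if rc == 0 {
            Probe::Allowed
        } else {
            Probe::Denied
        }
    }
}

/// A process belongs to the session only if the allow-side path is readable
/// AND the deny-side path is not.
pub fn is_session_member(allow_path: Probe, deny_path: Probe) -> bool {
    allow_path == Probe::Allowed && deny_path == Probe::Denied
}
```

An unsandboxed process or a blanket-allow profile yields `(Allowed, Allowed)`, a blanket-deny profile yields `(Denied, Denied)`, and only `(Allowed, Denied)` matches.
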
#### Convergent cleanup loop

On terminal close:

1. `killpg(pgid, SIGKILL)`: kill the process group. This instantly handles
   the vast majority of descendants (everything that didn't escape the group).
2. Enumerate all processes owned by the current UID (via `sysctl`
   `KERN_PROC_UID`).
3. For each process, probe with `sandbox_check` using the session fingerprint.
4. `SIGKILL` every match.
5. If the scan found any matches, go to step 2.
6. When a full scan finds zero matches, every process from this session is
   dead.
7. Delete the fingerprint directory.

**Why this terminates:** Each iteration either discovers processes (and kills
them) or discovers none (loop exits). A killed process can never fork again, so
the session can only survive a pass through children forked between discovery
and kill, and those are caught by the next pass. The loop ends as soon as one
full scan sees no fingerprinted process.

**Why this is correct:** The Seatbelt sandbox is inherited by all descendants
and cannot be removed. Every descendant of the sandboxed shell, regardless of
`setsid()`, `setpgid()`, double-forking, or reparenting to PID 1, carries the
session fingerprint. `sandbox_check` finds them by probing the kernel, not by
walking the process tree.

**Why SIGKILL on sight instead of SIGSTOP:** An earlier design froze escapees
with `SIGSTOP` during scanning, then killed them all at the end. But `SIGSTOP`
only stops the process you send it to, not its children, so children of a
stopped process are still running and can fork. `SIGKILL` is equally effective:
a dead process can't fork, and any children it already created are findable by
fingerprint on the next scan iteration. The simpler approach is just to kill
everything on sight and keep scanning until the scan comes back empty.

**Why not process-group operations after step 1:** After `killpg` handles the
initial process group, any remaining processes are by definition ones that
escaped via `setsid()` or `setpgid()`. They're in different process groups (or
their own sessions), so further `killpg` calls can't target them without
knowing their group IDs. Worse, if a process double-forks and the intermediate
parent exits, the grandchild is reparented to PID 1 (launchd); there's no
parent chain linking it back to the original shell, and its process group is
unrelated to ours. The only reliable way to find these escapees is the
fingerprint probe, which works regardless of process group, session, or parent
relationship.

**Zombie handling:** After `SIGKILL`, a process becomes a zombie until its
parent reaps it. If `sandbox_check` still reports the sandbox profile for
zombies, the loop could spin on unkillable processes. The scan should skip
processes in the zombie state (detectable via `kinfo_proc.kp_proc.p_stat ==
SZOMB` from the same `sysctl` call used for enumeration). Zombies are harmless
(they can't execute code or fork), so skipping them is correct.

**Residual race:** Between discovering a process (step 3) and killing it (step
4), the process could fork. But the child inherits the fingerprint, so the next
iteration of the loop finds it. The loop continues until no such children
remain. The only way a process could escape is to fork a child that somehow
doesn't inherit the sandbox, which the kernel guarantees cannot happen.

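Steps 2 through 6 can be sketched as a loop over an abstract process table, which also makes the residual race easy to simulate. Everything here is illustrative: the `ProcessTable` trait, `converge`, and `SimulatedTable` are hypothetical names, the real enumeration and probing calls are replaced by trait methods, and the `max_passes` bound is an added safety assumption that the steps above do not include.

```rust
use std::collections::BTreeMap;

pub type Pid = i32;

/// Abstraction over steps 2-4 so the loop can run against a simulation.
pub trait ProcessTable {
    /// Step 2: live processes owned by the current user, zombies excluded.
    fn candidates(&self) -> Vec<Pid>;
    /// Step 3: does this process carry the session fingerprint?
    fn has_fingerprint(&self, pid: Pid) -> bool;
    /// Step 4: kill the process.
    fn kill(&mut self, pid: Pid);
}

/// Scans and kills until one full pass finds no match. Returns the number of
/// passes performed, or `None` if `max_passes` was exhausted first.
pub fn converge<T: ProcessTable>(table: &mut T, max_passes: usize) -> Option<usize> {
    for pass in 1..=max_passes {
        let matches: Vec<Pid> = table
            .candidates()
            .into_iter()
            .filter(|&pid| table.has_fingerprint(pid))
            .collect();
        if matches.is_empty() {
            return Some(pass); // step 6: clean scan, session is dead
        }
        for pid in matches {
            table.kill(pid);
        }
    }
    None
}

/// In-memory stand-in: pid -> fingerprinted?, plus children that "appear"
/// as victims die, modelling a fork between discovery and kill.
pub struct SimulatedTable {
    pub live: BTreeMap<Pid, bool>,
    pub late_forks: Vec<Pid>,
}

impl ProcessTable for SimulatedTable {
    fn candidates(&self) -> Vec<Pid> {
        self.live.keys().copied().collect()
    }
    fn has_fingerprint(&self, pid: Pid) -> bool {
        self.live.get(&pid).copied().unwrap_or(false)
    }
    fn kill(&mut self, pid: Pid) {
        self.live.remove(&pid);
        if let Some(child) = self.late_forks.pop() {
            self.live.insert(child, true); // the child inherits the fingerprint
        }
    }
}
```

In the simulation, a child that forks during the first pass is invisible to that pass but is found and killed on the second, and the third pass comes back empty.
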
### Alternatives considered and rejected

#### Audit session IDs (BSM)

macOS's BSM audit framework assigns each process an audit session ID
(`ai_asid`) that is inherited by children. In principle, this could track
descendants. Rejected because:

- `getaudit_addr()` requires elevated privileges.
- There is no "kill all processes in this audit session" syscall, so you still
  end up enumerating and killing individually.
- macOS doesn't consistently use POSIX sessions (`ps -e -o sess` shows 0 for
  all processes on many systems).

#### Endpoint Security framework

Apple's Endpoint Security framework provides kernel-level notifications for
every fork/exec event, which would allow perfectly reliable tracking. Rejected
because:

- Requires the `com.apple.developer.endpoint-security.client` entitlement,
  which must be approved by Apple.
- Designed for security products (antivirus, MDM), not general-purpose apps.
- Significantly increases the complexity and privilege requirements of Zed.

#### XNU coalitions

macOS has a kernel concept called "coalitions" that groups related processes for
resource tracking and lifecycle management, essentially Apple's internal
equivalent of cgroups. Rejected because:

- The APIs (`coalition_create()`, `coalition_terminate()`) are private SPI.
- They require entitlements not available to third-party apps.

#### Temporary copy / overlay of project directory

Instead of granting sandbox access to the real project directory, use a
temporary copy or FUSE overlay, then delete it on terminal close. Rejected
because:

- Copying large projects is expensive.
- File watching, symlinks, and build tool caching break.
- FUSE on macOS requires macFUSE (third-party kext) or FSKit (macOS 15+).
- Tools that embed absolute paths (compiler errors, debugger info) would show
  wrong paths.

#### Symlink indirection

Grant sandbox access to a symlink path (e.g., `/tmp/.zed-link-<uuid>` →
`/real/project/`), then delete the symlink on cleanup. Rejected because:

- Seatbelt resolves symlinks to canonical paths when checking access (this is
  why `canonicalize_paths()` is called before building the profile).
- Deleting the symlink wouldn't revoke access to the underlying real path.

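The canonicalization point can be seen with the standard library alone. In this sketch, `profile_path` is a hypothetical stand-in (it is not Zed's `canonicalize_paths()`) for whatever resolves a granted path before it is written into the profile.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Resolves the path that would end up in the profile. Symlinks are followed,
/// so a grant made through a link lands on the real directory.
pub fn profile_path(granted: &Path) -> io::Result<PathBuf> {
    fs::canonicalize(granted)
}
```

Resolving a symlink and resolving its target give the same canonical path, so removing the link afterwards changes nothing about what the profile refers to.
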
#### Blocking `setsid()` / `setpgid()`

Prevent processes from leaving the process group in the first place. Rejected
because:

- Seatbelt has no filterable operation for these syscalls.
- On Linux, seccomp could block them, but this breaks legitimate programs
  (`ssh`, some build tools, process managers).

#### Lightweight VM via Virtualization framework

Run agent commands inside a macOS Virtualization framework VM. This would give a
hard process-lifetime guarantee (shutting down the VM kills everything).
Rejected (for now) because:

- Massive architectural change.
- The VM runs Linux, not macOS, so macOS-specific tools wouldn't work.
- Resource overhead (memory, CPU, startup time).
- Overkill for the current threat model.

## Signal scoping (macOS)

The SBPL profile uses `(allow signal (target children))` rather than a bare
`(allow signal)`. This prevents the sandboxed process from signaling arbitrary
same-user processes (other Zed instances, browsers, etc.) while still allowing
the shell to:

- Manage jobs (`kill %1`, `bg`, `fg`)
- Use the `kill` command on child processes
- Clean up background jobs on exit (SIGHUP)

Note that Ctrl+C and Ctrl+Z are sent by the kernel's TTY driver, not by the
shell, so they work regardless of signal sandbox rules.

`(target self)` was considered but rejected because it would break all job
control and shell cleanup of background processes.

In fingerprint-only mode (no sandbox restrictions), `(allow default)` already
permits all signals, so no explicit signal rule is needed.
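
The choice between the two modes can be written as a one-line selection. This is only a sketch with a hypothetical function name; it shows which signal rule, if any, would be appended to the profile.

```rust
/// Returns the signal rule to append to the profile, if one is needed.
/// `restricted` is true when full sandbox restrictions are enabled.
pub fn signal_rule(restricted: bool) -> Option<&'static str> {
    if restricted {
        // Under (deny default), signals must be re-allowed, scoped to children.
        Some("(allow signal (target children))")
    } else {
        // Fingerprint-only mode: (allow default) already permits all signals.
        None
    }
}
```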