From 5dacf9e40568e6f027ff2964c49617ddd25d75bb Mon Sep 17 00:00:00 2001 From: Christian Rocha Date: Mon, 27 Apr 2026 11:05:33 -0400 Subject: [PATCH] docs(hooks): document new embedded shell model Specifically, we describe the embedded-shell contract, Windows behavior, permissive shebang fallback, guaranteed CRUSH/AGENT env markers, and timeout hard-return semantics. --- docs/hooks/FUTURE.md | 120 +------------------------------------------ docs/hooks/README.md | 63 ++++++++++++++++++++--- 2 files changed, 59 insertions(+), 124 deletions(-) diff --git a/docs/hooks/FUTURE.md b/docs/hooks/FUTURE.md index 5e2f9b0b6f58b0ee0ccc5758e8f590049e9cb296..ee334f7308e3264772cbcadc1c581c4e462f937a 100644 --- a/docs/hooks/FUTURE.md +++ b/docs/hooks/FUTURE.md @@ -255,121 +255,5 @@ Reuses the universal rules: ## Cross-platform shell (Windows support) -**Status:** not implemented. - -### Problem - -Today the hook runner uses `exec.Command("sh", "-c", hook.Command)`. On Windows -this fails without WSL or Git Bash on PATH. Even with `sh.exe` available, -Windows has no kernel shebang handling — `./hooks/foo.sh` can't be exec'd -directly the way it can on Unix. Hooks are effectively Unix-only. - -### Approach - -Keep the `command` field as a string. Tokenize it shell-style, examine -`argv[0]`, and branch: - -- If `argv[0]` starts with `./`, `../`, `/`, or `~/` — treat it as a **file - invocation**. Read the first ≤128 bytes, parse a shebang if present, and - dispatch to the named interpreter via `os/exec`. Extra args from the command - string pass through to the interpreter. -- Otherwise — treat the whole string as **shell code** and hand it to mvdan's - in-process interpreter. mvdan resolves `node`, `bash`, `jq`, builtins, - pipelines, redirects, etc. via its own exec handler. - -No sentinel: a script with no shebang defaults to mvdan. A script with an -explicit shebang (`#!/bin/bash`, `#!/usr/bin/env python3`, etc.) uses the named -interpreter, which the user is responsible for having on PATH. Same contract on -every platform. - -### Dispatch examples - -| `command` | `argv[0]` | Route | -| ---------------------------------------- | -------------- | ------------------------ | -| `ls -la` | `ls` | mvdan | -| `bash -c 'ls'` | `bash` | mvdan (which execs bash) | -| `node ./script.js` | `node` | mvdan (which execs node) | -| `./script.sh` (no shebang) | `./script.sh` | mvdan, fed the file | -| `./script.sh` (`#!/bin/bash`) | `./script.sh` | `bash ./script.sh` | -| `./script.py` (`#!/usr/bin/env python3`) | `./script.py` | `python3 ./script.py` | -| `./script.exe` | `./script.exe` | `os/exec` direct | - -### Contract on Windows - -- Inline shell runs through mvdan natively. No external dependency. -- Shebang-dispatched scripts require the named interpreter on PATH (`bash.exe`, - `python.exe`, `node.exe`, etc.). Crush does the dispatch that the Windows - kernel won't. -- Shebang-less scripts run through mvdan regardless of extension. CRLF line - endings are tolerated. - -### Implementation sketch - -- New function - `dispatch(ctx, cmd string, env []string, stdin io.Reader) (stdout, stderr string, exitCode int, err error)` - in `internal/hooks/`. -- Tokenize using mvdan's parser (already a dep) for consistent quoting/escape - behavior with shell intuition. -- Path-prefix check on `argv[0]`; if path, read shebang with a bounded - `io.LimitReader` and parse. Support: - - `#!/absolute/interpreter args…` - - `#!/usr/bin/env NAME` → resolve `NAME` on PATH - - `#!/usr/bin/env -S NAME args…` → treat as above; `-S` is common enough to - handle. Other `env` flags can error. -- Unified exit-code helper. mvdan's `interp.ExitStatus` and `os/exec`'s - `ProcessState.ExitCode()` both become a single `int`. -- Context cancellation: mvdan's exec handler uses `exec.CommandContext` for its - children, so a cancelled hook kills both the interpreter and any children. - Verify with a `sleep 60` test. -- One fresh `interp.Runner` per hook invocation (parallel hooks must not share - state). - -### Swap the call site - -`Runner.runOne` in `internal/hooks/runner.go` replaces its -`exec.Command("sh", "-c", …)` with a call to `dispatch(…)`. Everything -downstream (exit-code 2 / 49 / other dispatch, stdout JSON parsing, -stderr-as-reason) stays identical. - -### Tests - -Cross-platform matrix: - -- Inline: `echo hi`; `exit 2`; pipelines; redirections. -- File, no shebang: treated as shell source through mvdan. -- File, `#!/bin/bash` on Unix — invokes system bash. -- File, `#!/usr/bin/env python3` — invokes python if present, skips if not. -- File, `#!/usr/bin/env -S node --foo` — extra flags preserved. -- File with CRLF line endings in the shebang. -- `./missing-file` — non-blocking error, hook proceeds as "no opinion". -- Timeout: hook that sleeps past its timeout gets killed; context cancellation - kills the interpreter and its children. -- Concurrency: 10 hooks in parallel don't leak env/cwd/state between runners. -- Windows-specific: `./script.exe` exec'd directly; bash-shebang script fails - gracefully when bash isn't on PATH. - -### Pitfalls to watch - -- **Userland shebang parsing is now our problem.** Edge cases around `env` - flags, args with spaces, CRLF, missing interpreter. Well-trodden but needs - real tests. -- **The path-prefix heuristic is a heuristic.** `relative/path.sh` (no leading - `./`) gets mvdan'd, not file-dispatched. Matches shell intuition — at a bash - prompt, `relative/path.sh` doesn't run unless `.` is on PATH — but worth - documenting. -- **Kernel shebang handling is bypassed on Unix.** Today a chmod+x'd script is - exec'd by the kernel; after this change, by our parser. Behavior should be - byte-identical; verify with tests. -- **Two code paths.** mvdan vs direct-exec. Exit codes, stdin, signal - propagation, env inheritance must be aligned. Discipline, not cleverness. - -### Explicit non-goals - -- No bundled `bash.exe` or `python.exe`. Users bring their own interpreters. -- No custom mvdan builtins (`crush_approve` etc.). Hooks stay portable and - testable under bare `bash`. -- No `.sh`-extension filter on discovery. Hook file shape is driven by shebang, - not filename. -- No Crush-as-script-interpreter mode (users can't write `#!/usr/bin/env crush` - and have it mean something). If we want that later, it's an additive feature, - not a dependency of this work. +**Status:** implemented. See the [Execution model](README.md#execution-model) +section in `README.md` for the current behavior and contract. diff --git a/docs/hooks/README.md b/docs/hooks/README.md index 920a61bf56b2c084e571587602b88b3074e19805..1a908a8b644f28170373fdd19908e81a7a92a6a0 100644 --- a/docs/hooks/README.md +++ b/docs/hooks/README.md @@ -29,6 +29,45 @@ languages at the end, too. - Crush currently supports just one hook, `PreToolUse`, with plans to support the full gamut. If there's a hook you'd like to see, let us know. +## Execution model + +Hooks run through Crush's embedded POSIX shell (`mvdan.cc/sh`) — the same +interpreter the `bash` tool uses. Inline commands and shebang-less scripts +execute in-process; scripts with a `#!` shebang dispatch to the named +interpreter via `os/exec`. This contract is identical on macOS, Linux, and +Windows. + +What this means in practice: + +- **Windows without Unix tooling**: inline shell (`echo`, pipelines, `jq`, + `grep`), shebang-less `.sh` scripts, inline PowerShell + (`powershell -Command …`), and `.exe` invocations all work out of the box + with no WSL, Git Bash, Cygwin, or MSYS required. +- **PowerShell scripts** (`.ps1`) are not auto-dispatched by extension. + Invoke them explicitly: `powershell -File ./audit.ps1` (or + `pwsh -File ./audit.ps1`). +- **Shebang'd scripts** require the named interpreter on `PATH`. Git for + Windows ships `bash.exe`, which makes `#!/bin/bash` and + `#!/usr/bin/env bash` scripts work on Windows the same way they do on + Unix. CRLF line endings in the shebang line are tolerated. +- **Permissive shebang fallback**: if the absolute path in a shebang + doesn't exist (e.g. `#!/bin/bash` on Windows), Crush falls back to a + `PATH` lookup of the base name (`bash`) before giving up. A debug-level + log records the fallback. If the interpreter isn't on `PATH` either, the + hook fails cleanly as a non-blocking warning and the agent proceeds as + "no opinion". +- **Environment**: every hook sees `CRUSH=1`, `AGENT=crush`, and + `AI_AGENT=crush` on top of the `CRUSH_*` hook-specific variables. These + three markers are guaranteed and match what the `bash` tool sets, so + scripts that detect "am I being run by an AI agent?" behave the same in + both contexts. +- **Timeout behavior**: when a hook exceeds its timeout, Crush cancels the + context and waits a short grace period (~1s) for the interpreter to + yield. If the hook still hasn't returned, Crush abandons it, logs a + warning, and treats the result as "no opinion" so the agent can proceed. + Long-running work should honor context cancellation or run in a + subprocess via a shebang. + ## Configuration Hooks can be added to your `crush.json` at both the global and project-level, @@ -89,13 +128,15 @@ When a hook fires, Crush: 1. Filters hooks whose `matcher` regex matches the tool name (no matcher = match all). 2. Deduplicates by `command` (identical commands run once). -3. Runs all matching hooks **in parallel** as subprocesses. +3. Runs all matching hooks **in parallel** through Crush's embedded POSIX + shell (see [Execution model](#execution-model)). 4. Waits for all to finish (or time out), then aggregates results **in config order**: deny wins over allow, allow wins over none; `updated_input` patches shallow-merge in order. -Note that you can omit `matcher` and match in your shell script instead, however -you'll incur some additional overhead as Crush will `exec` each script. +Note that you can omit `matcher` and match in your shell script instead, +however you'll incur some additional overhead as Crush will still parse and +run each hook. ### Input @@ -109,6 +150,9 @@ The available environment variables are: | Variable | Description | | ---------------------------- | ---------------------------------------------- | +| `CRUSH` | Always `1` when running under Crush. | +| `AGENT` | Always `crush`. | +| `AI_AGENT` | Always `crush`. | | `CRUSH_EVENT` | The hook event name (e.g. `PreToolUse`). | | `CRUSH_TOOL_NAME` | The tool being called (e.g. `bash`). | | `CRUSH_SESSION_ID` | Current session ID. | @@ -117,6 +161,10 @@ The available environment variables are: | `CRUSH_TOOL_INPUT_COMMAND` | For `bash` calls: the shell command being run. | | `CRUSH_TOOL_INPUT_FILE_PATH` | For file tools: the target file path. | +The `CRUSH`, `AGENT`, and `AI_AGENT` markers are also set by the `bash` +tool, so a script can detect "am I running under Crush?" the same way in +either context. + #### JSON Standard input provides the full context as JSON: @@ -267,9 +315,12 @@ When multiple hooks match the same tool call: ### Timeouts -If a hook exceeds its timeout, the process is killed and treated as a -non-blocking error and the tool call proceeds. The default timeout is 30 -seconds. +If a hook exceeds its timeout, Crush cancels its context and treats the +result as a non-blocking error so the tool call proceeds. The default +timeout is 30 seconds. Shebang-dispatched subprocesses are killed through +`exec.CommandContext`; in-process hooks get a short grace period to yield +and are then abandoned (the agent moves on regardless). Long-running work +should honor context cancellation or run out-of-process via a shebang. ## Examples