AGENTS.md

<!--
SPDX-FileCopyrightText: Amolith <amolith@secluded.site>

SPDX-License-Identifier: CC0-1.0
-->

# Rumilo Agent Guide

CLI that dispatches specialized AI research subagents for web research and git repository exploration.

## Commands

```bash
bun run dev       # Run CLI via bun (development)
bun run build     # Build to dist/
bun run typecheck # TypeScript check (also: bun run lint)
```

Run `bun test` to execute the test suite.

## Architecture

```
CLI entry (src/cli/index.ts)
    ├── web command → runWebCommand() → runAgent() with web tools
    └── repo command → runRepoCommand() → runAgent() with git tools
```

### Control Flow

1. **CLI** parses args with a custom parser (no library) and dispatches to command handlers
2. **Command handlers** (`src/cli/commands/`) load config, create a workspace, build a context-aware system prompt, configure tools, and invoke `runAgent()`
3. **Agent runner** (`src/agent/runner.ts`) wraps `@mariozechner/pi-agent`: it creates an `Agent`, subscribes to events, prompts, and extracts the final message
4. **Tools** are created via factory functions that close over the workspace path for sandboxing
5. **Workspace** (`src/workspace/manager.ts`) is a temp directory (cleaned up unless `--no-cleanup` is passed)
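As a rough illustration of step 1, a hand-rolled parser can be quite small. This is a minimal sketch, not the code in `src/cli/index.ts`: the `ParsedArgs` shape, the `parseArgs` name, and the assumption that `-u` is the only value-taking flag are all hypothetical.

```typescript
// Hypothetical result shape; the real parser may return something different.
interface ParsedArgs {
  command: string | undefined;
  positional: string[];
  flags: Record<string, string | boolean>;
}

// Flags that consume the next argument as their value (assumed set).
const VALUE_FLAGS = new Set(["-u"]);

function parseArgs(argv: string[]): ParsedArgs {
  const positional: string[] = [];
  const flags: Record<string, string | boolean> = {};
  const queue = [...argv];
  let arg: string | undefined;
  while ((arg = queue.shift()) !== undefined) {
    if (!arg.startsWith("-")) {
      positional.push(arg);
    } else if (VALUE_FLAGS.has(arg)) {
      const value = queue.shift(); // consume the flag's value
      if (value === undefined) throw new Error(`Missing value for ${arg}`);
      flags[arg] = value;
    } else {
      flags[arg] = true; // boolean switch such as --no-cleanup or --full
    }
  }
  // The first positional argument selects the command (web or repo).
  const [command, ...rest] = positional;
  return { command, positional: rest, flags };
}
```

The dispatcher can then switch on `command` and hand `positional` and `flags` to the matching handler.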

### Two Agent Modes

| Mode | Tools | External Services |
|------|-------|-------------------|
| `web` | `web_search`, `web_fetch`, `read`, `grep`, `ls`, `find` | Kagi (search), Tabstack (fetch→markdown) |
| `repo` | `read`, `grep`, `ls`, `find`, `git_log`, `git_show`, `git_blame`, `git_diff`, `git_refs`, `git_checkout` | None (clones repo to workspace) |

## Key Patterns

### Tool Factory Pattern

All tools follow the same signature:
```typescript
const createFooTool = (workspacePath: string): AgentTool => ({ ... })
```

Tools use `@sinclair/typebox` for parameter schemas. Execute functions return `{ content: [{type: "text", text}], details?: {...} }`.
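A dependency-free sketch of the pattern, for orientation only. The `AgentTool` and `ToolResult` interfaces below are local stand-ins (the real type comes from the agent library and may have different fields), `createWorkspaceInfoTool` is a hypothetical tool, and a plain JSON Schema object stands in for the TypeBox schema.

```typescript
// Local stand-ins for illustration; the real AgentTool interface may differ.
interface ToolResult {
  content: { type: "text"; text: string }[];
  details?: Record<string, unknown>;
}

interface AgentTool {
  name: string;
  description: string;
  parameters: unknown; // a TypeBox schema in the real project
  execute: (args: Record<string, unknown>) => Promise<ToolResult>;
}

// Hypothetical tool: reports the workspace root it was created for.
const createWorkspaceInfoTool = (workspacePath: string): AgentTool => ({
  name: "workspace_info",
  description: "Report the workspace root",
  // Plain JSON Schema used here in place of a TypeBox object schema.
  parameters: { type: "object", properties: {} },
  // The closure captures workspacePath, so callers cannot point the tool elsewhere.
  execute: async () => ({
    content: [{ type: "text", text: `Workspace: ${workspacePath}` }],
    details: { workspacePath },
  }),
});
```

Because the path is captured at creation time, the agent never supplies the workspace root itself, only arguments that are resolved relative to it.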

### Workspace Sandboxing

Filesystem tools (`read`, `grep`, `ls`, `find`) must constrain paths to the workspace:
- `ensureWorkspacePath()` in `src/agent/tools/index.ts` validates that paths don't escape it
- `resolveToCwd()` / `resolveReadPath()` in `src/agent/tools/path-utils.ts` handle expansion and normalization

Git tools (`git_show`, `git_blame`, `git_diff`, `git_checkout`, `git_log`, `git_refs`) do **not** apply path containment. Refs and paths are passed directly to `simple-git`, which is initialized with `workspacePath` so all commands are scoped to the cloned repository. The user explicitly chooses which repository to clone, so its git objects are trusted content. This is an accepted trust boundary: we sandbox the filesystem but trust git data within the user's chosen repo.
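The filesystem containment check described above can be sketched as follows. This is not the project's `ensureWorkspacePath()`; `assertInsideWorkspace` is a hypothetical helper, and the sketch is purely lexical, so it assumes symlinks are handled elsewhere.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve target against the workspace root and
// reject anything that lands outside it.
function assertInsideWorkspace(workspacePath: string, target: string): string {
  const root = path.resolve(workspacePath);
  const resolved = path.resolve(root, target);
  const rel = path.relative(root, resolved);
  // A relative path that climbs out of root starts with "..";
  // an absolute result means a different root (e.g. another drive on Windows).
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes workspace: ${target}`);
  }
  return resolved;
}
```

Comparing via `path.relative` rather than a string prefix avoids treating a sibling such as `/tmp/ws-other` as being inside `/tmp/ws`.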

### Config Cascade

```
defaultConfig → XDG_CONFIG_HOME/rumilo/config.toml → CLI flags
```

Config uses TOML, validated against a TypeBox schema (`src/config/schema.ts`).
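One way to picture the cascade is as a layered merge in which later layers win. The sketch below assumes nested tables merge key by key and that unset flags do not override anything; the actual merge behavior is not documented here. `mergeLayers` and the config keys are hypothetical.

```typescript
type ConfigLayer = { [key: string]: unknown };

const isPlainObject = (v: unknown): v is ConfigLayer =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Later layers override earlier ones; nested tables merge key by key.
function mergeLayers(...layers: ConfigLayer[]): ConfigLayer {
  const out: ConfigLayer = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (value === undefined) continue; // an unset flag leaves lower layers intact
      const existing = out[key];
      out[key] =
        isPlainObject(existing) && isPlainObject(value)
          ? mergeLayers(existing, value)
          : value;
    }
  }
  return out;
}

// Hypothetical keys, for illustration only.
const defaults = { defaults: { model: "anthropic:some-model", cleanup: true } };
const fromFile = { defaults: { model: "custom:local" } };
const fromFlags = { defaults: { cleanup: false } };

// model comes from the file layer, cleanup from the flags layer.
const config = mergeLayers(defaults, fromFile, fromFlags);
```

Schema validation would then run once on the merged result rather than on each layer.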

### Model Resolution

Model strings use the `provider:model` format. The `custom:name` form looks up a custom model definition in the config's `[custom_models]` section. Built-in providers delegate to `@mariozechner/pi-ai`.
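A minimal sketch of splitting such a string, assuming the split happens on the first colon so that model ids containing colons survive intact. `parseModelString` and `ModelRef` are hypothetical names, not the project's API.

```typescript
type ModelRef =
  | { kind: "custom"; name: string }
  | { kind: "builtin"; provider: string; model: string };

function parseModelString(input: string): ModelRef {
  const idx = input.indexOf(":");
  // Reject a missing colon, an empty provider, or an empty model part.
  if (idx <= 0 || idx === input.length - 1) {
    throw new Error(`Expected provider:model, got "${input}"`);
  }
  const provider = input.slice(0, idx);
  const model = input.slice(idx + 1); // may itself contain colons
  return provider === "custom"
    ? { kind: "custom", name: model }
    : { kind: "builtin", provider, model };
}
```

A caller would look up `name` under `[custom_models]` for the `custom` case and pass `provider` and `model` to the provider library otherwise.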

API key resolution for custom models uses `resolveConfigValue()` from `src/util/env.ts`, which supports bare env var names, `$VAR` / `${VAR}` references, and `!shell-command` execution. Built-in providers fall back to `pi-ai`'s `getEnvApiKey()` (e.g. `ANTHROPIC_API_KEY`).
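The three value forms can be sketched like this. It is not the real `resolveConfigValue()`: the environment and the shell runner are injected to keep the example self-contained, the name `resolveValueSketch` is hypothetical, and returning `undefined` for an unset bare name is an assumption (the real function may fall back to the literal string).

```typescript
type Env = Record<string, string | undefined>;
type ShellRunner = (command: string) => string;

function resolveValueSketch(
  raw: string,
  env: Env,
  runShell: ShellRunner,
): string | undefined {
  // "!cmd": run the command and use its trimmed output.
  if (raw.startsWith("!")) {
    return runShell(raw.slice(1)).trim();
  }
  // "${VAR}" or "$VAR": explicit environment reference.
  const ref =
    /^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$|^\$([A-Za-z_][A-Za-z0-9_]*)$/.exec(raw);
  if (ref) {
    const name = ref[1] ?? ref[2] ?? "";
    return env[name];
  }
  // Bare name: treated as an environment variable name (assumed: no literal fallback).
  return env[raw];
}
```

In real use the runner would shell out (for example to a password manager), which is why the `!` form should only ever come from the user's own config file.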

### Error Handling

Custom error classes in `src/util/errors.ts` extend `RumiloError` with error codes:
- `ConfigError`, `FetchError`, `CloneError`, `WorkspaceError`, `ToolInputError`
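A sketch of what such a hierarchy might look like. The class names come from the list above, but the constructor signatures and the code strings are assumptions, not the contents of `src/util/errors.ts`.

```typescript
// Base class carrying a machine-readable code (assumed shape).
class RumiloError extends Error {
  readonly code: string;

  constructor(message: string, code: string) {
    super(message);
    this.code = code;
    this.name = new.target.name; // report the subclass name in stack traces
  }
}

class ToolInputError extends RumiloError {
  constructor(message: string) {
    super(message, "TOOL_INPUT_ERROR"); // hypothetical code string
  }
}

class ConfigError extends RumiloError {
  constructor(message: string) {
    super(message, "CONFIG_ERROR"); // hypothetical code string
  }
}

// Callers can branch on the shared base class and read the code.
function describeError(err: unknown): string {
  return err instanceof RumiloError
    ? `${err.code}: ${err.message}`
    : `UNKNOWN: ${String(err)}`;
}
```

Keeping the code on the base class lets the CLI map every known failure to an exit message without matching on message text.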

### System Prompts

Prompts in `src/agent/prompts/` are builder functions, not static strings. Each takes a context object and returns a templated prompt with XML-style sections (`<approach>`, `<answering>`, `<environment>`).

- **Repo prompt** (`buildRepoPrompt`): receives `{ currentTime, hasHistory }`. Git history guidance is only included when `hasHistory` is true (i.e. `--full` clone). Instructs the agent to check for agent instruction files (AGENTS.md, CLAUDE.md, .cursorrules, etc.) before README when orienting.
- **Web prompt** (`buildWebPrompt`): receives `{ currentTime }`. Provider-agnostic: does not assume a specific search-then-fetch workflow.

A custom `system_prompt_path` in config replaces the built prompt entirely.
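To show the builder shape, here is a minimal sketch with placeholder wording. The real prompt text lives in `src/agent/prompts/`; `buildRepoPromptSketch` is a hypothetical name, and typing `currentTime` as a string is an assumption.

```typescript
interface RepoPromptContext {
  currentTime: string; // assumed to be pre-formatted by the caller
  hasHistory: boolean;
}

// Placeholder wording only; this is not the project's actual prompt.
function buildRepoPromptSketch({ currentTime, hasHistory }: RepoPromptContext): string {
  const approach = [
    "Check for agent instruction files (AGENTS.md, CLAUDE.md, .cursorrules) before the README.",
    // History guidance is only added when the clone actually has history.
    ...(hasHistory ? ["Use git history to see how the code evolved."] : []),
  ];
  return [
    "<approach>",
    ...approach,
    "</approach>",
    "<environment>",
    `Current time: ${currentTime}`,
    "</environment>",
  ].join("\n");
}
```

Building the prompt from context keeps the agent from being told about tools or history it cannot use in a shallow clone.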

### Output Truncation

`src/util/truncate.ts` handles large content:
- `DEFAULT_MAX_LINES = 2000`
- `DEFAULT_MAX_BYTES = 50KB`
- `GREP_MAX_LINE_LENGTH = 500` chars
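A sketch of how line and byte limits can be applied together. It assumes that 50KB means 50 × 1024 bytes and that truncation keeps the head of the content; neither is confirmed here. `truncateHead` is a hypothetical name, not necessarily what `src/util/truncate.ts` exports.

```typescript
const DEFAULT_MAX_LINES = 2000;
const DEFAULT_MAX_BYTES = 50 * 1024; // assumes "50KB" means 50 KiB

interface Truncated {
  text: string;
  truncated: boolean;
}

// Keep whole lines from the start until either limit would be exceeded.
function truncateHead(
  input: string,
  maxLines: number = DEFAULT_MAX_LINES,
  maxBytes: number = DEFAULT_MAX_BYTES,
): Truncated {
  const encoder = new TextEncoder();
  const kept: string[] = [];
  let bytes = 0;
  for (const line of input.split("\n")) {
    // Count UTF-8 bytes, plus one for the newline that joins this line to the previous one.
    const cost = encoder.encode(line).length + (kept.length > 0 ? 1 : 0);
    if (kept.length >= maxLines || bytes + cost > maxBytes) {
      return { text: kept.join("\n"), truncated: true };
    }
    kept.push(line);
    bytes += cost;
  }
  return { text: input, truncated: false };
}
```

Cutting on line boundaries keeps tool output readable, and the `truncated` flag lets the tool tell the agent that more content exists.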

## Gotchas

1. **Grep requires ripgrep** - `createGrepTool` checks for `rg` at tool creation time and throws if it is missing
2. **Web tools require credentials** - `KAGI_SESSION_TOKEN` and `TABSTACK_API_KEY` via env or config
3. **Shallow clones by default** - repo mode uses `--depth 1 --filter=blob:limit=5m` unless `--full` is passed
4. **Pre-fetch injection** - web command with `-u URL` wraps content in `<attached_content>` XML tags; content ≤50KB is inlined, and larger content is stored in the workspace
5. **`.js` extensions in imports** - source uses `.js` extensions for ESM compatibility even though files are `.ts`

## Adding a New Tool

1. Create `src/agent/tools/newtool.ts` following the factory pattern
2. Export from `src/agent/tools/index.ts` if general-purpose
3. Add to the appropriate command's tool array in `src/cli/commands/{web,repo}.ts`
4. Use `ToolInputError` for validation failures