# Agent Loop Patterns from Crush

How Crush orchestrates LLM conversations, tool execution, streaming, and permission handling.

## Table of Contents

1. [Architecture Overview](#architecture-overview)
2. [The Agent Loop](#the-agent-loop)
3. [Fantasy SDK Integration](#fantasy-sdk-integration)
4. [Streaming Bridge to TUI](#streaming-bridge-to-tui)
5. [Tool System](#tool-system)
6. [Permission System](#permission-system)
7. [Message Queue](#message-queue)
8. [Auto-Summarization](#auto-summarization)
9. [Coordinator Pattern](#coordinator-pattern)

## Architecture Overview

```
User types prompt
  -> UI.sendMessage() creates a tea.Cmd
    -> AgentCoordinator.Run(ctx, sessionID, prompt)
      -> SessionAgent.Run(ctx, call)
        -> fantasy.Agent.Stream(ctx, streamCall)
          -> LLM responds with text/tool calls
            -> Callbacks persist to DB via message.Service
              -> message.Service publishes via pubsub.Broker
                -> App.Events() channel delivers to bubbletea
                  -> UI.Update() receives pubsub.Event[message.Message]
                    -> Chat re-renders
```

## The Agent Loop

**Source:** `internal/agent/agent.go` `Run()` method

The core loop:

1. **Queue check** - if the session is busy, queue the prompt and return immediately
2. **Prepare** - copy tools, model, and system prompt (thread-safe via the `internal/csync` wrappers)
3. **Create user message** - persist to DB, which triggers a pubsub event
4. **Generate title** - async goroutine on the first message
5. **Stream** - call `fantasy.Agent.Stream()` with callbacks
6. **Handle result** - update session usage, check for summarization, process the queue

```go
// Abridged excerpt: some declarations and error handling are omitted.
func (a *sessionAgent) Run(ctx context.Context, call SessionAgentCall) (*fantasy.AgentResult, error) {
    // Queue if busy
    if a.IsSessionBusy(call.SessionID) {
        existing, _ := a.messageQueue.Get(call.SessionID)
        a.messageQueue.Set(call.SessionID, append(existing, call))
        return nil, nil
    }

    // Thread-safe copies
    agentTools := a.tools.Copy()
    largeModel := a.largeModel.Get()
    systemPrompt := a.systemPrompt.Get()

    // Add MCP server instructions to system prompt
    for _, server := range mcp.GetStates() {
        if server.State == mcp.StateConnected {
            instructions.WriteString(server.Client.InitializeResult().Instructions)
        }
    }

    // Create fantasy agent
    agent := fantasy.NewAgent(
        largeModel.Model,
        fantasy.WithSystemPrompt(systemPrompt),
        fantasy.WithTools(agentTools...),
    )

    // Get history, create user message, then stream
    history, files := a.preparePrompt(msgs, call.Attachments...)
    result, err := agent.Stream(ctx, fantasy.AgentStreamCall{...})
    // ... handle result: update usage, summarize if needed, process queue ...
    return result, err
}
```

## Fantasy SDK Integration

**Source:** `internal/agent/agent.go`

Fantasy (`charm.land/fantasy`) is Charm's multi-provider LLM abstraction. Crush uses the agent streaming API with rich callbacks:

```go
// Abridged excerpt: some declarations and error handling are omitted.
result, err := agent.Stream(genCtx, fantasy.AgentStreamCall{
    Prompt:          promptText,
    Messages:        history,
    Files:           files,
    ProviderOptions: call.ProviderOptions,
    MaxOutputTokens: &call.MaxOutputTokens,

    PrepareStep: func(ctx context.Context, opts fantasy.PrepareStepFunctionOptions) (context.Context, fantasy.PrepareStepResult, error) {
        // Called before each LLM call in the agentic loop
        // Refresh tools (MCP might have changed)
        prepared.Tools = a.tools.Copy()
        // Drain queued prompts and inject them
        for _, queued := range queuedCalls {
            prepared.Messages = append(prepared.Messages, userMessage.ToAIMessage()...)
        }
        // Create assistant message placeholder in DB
        assistantMsg, _ := a.messages.Create(ctx, sessionID, ...)
        currentAssistant = &assistantMsg
        return ctx, prepared, nil
    },

    OnReasoningStart: func(id string, reasoning fantasy.ReasoningContent) error {
        currentAssistant.AppendReasoningContent(reasoning.Text)
        return a.messages.Update(ctx, *currentAssistant)
    },
    OnReasoningDelta: func(id string, text string) error {
        currentAssistant.AppendReasoningContent(text)
        return a.messages.Update(ctx, *currentAssistant)
    },
    OnTextDelta: func(id string, text string) error {
        currentAssistant.AppendContent(text)
        return a.messages.Update(ctx, *currentAssistant)
    },
    OnToolInputStart: func(id string, toolName string) error {
        currentAssistant.AddToolCall(message.ToolCall{ID: id, Name: toolName})
        return a.messages.Update(ctx, *currentAssistant)
    },
    OnToolCall: func(tc fantasy.ToolCallContent) error {
        currentAssistant.AddToolCall(message.ToolCall{ID: tc.ToolCallID, Name: tc.ToolName, Input: tc.Input, Finished: true})
        return a.messages.Update(ctx, *currentAssistant)
    },
    OnToolResult: func(result fantasy.ToolResultContent) error {
        // Persist the tool result as its own message and surface any DB error
        _, err := a.messages.Create(ctx, sessionID, message.CreateMessageParams{Role: message.Tool, Parts: [...]})
        return err
    },
    OnStepFinish: func(stepResult fantasy.StepResult) error {
        // Update token usage on session
        a.updateSessionUsage(largeModel, &session, stepResult.Usage, ...)
        a.sessions.Save(ctx, session)
        return nil
    },

    StopWhen: []fantasy.StopCondition{
        // Stop when context window is nearly full
        func(_ []fantasy.StepResult) bool {
            remaining := contextWindow - tokensUsed
            return remaining <= threshold && !disableAutoSummarize
        },
        // Stop on repeated tool calls (loop detection)
        func(steps []fantasy.StepResult) bool {
            return hasRepeatedToolCalls(steps, windowSize, maxRepeats)
        },
    },
})
```

Key insight: `PrepareStep` runs before every step in the agentic loop, not just the first. This lets Crush:
- Inject queued user messages mid-conversation
- Refresh MCP tools dynamically
- Create a fresh assistant message for each step
- Apply Anthropic cache control to the right message positions (not shown in the excerpt above)

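The second `StopWhen` condition above relies on `hasRepeatedToolCalls`, whose body is not shown. The following is a minimal sketch of one way such a check could work, not Crush's implementation: `toolCall` and `step` are hypothetical stand-ins for Fantasy's types, and "repeated" is assumed to mean the same tool name with the same input appearing more than `maxRepeats` times within the last `windowSize` steps.

```go
package main

import "fmt"

// toolCall and step are hypothetical stand-ins for Fantasy's
// tool-call and step-result types.
type toolCall struct {
	Name  string
	Input string
}

type step struct {
	ToolCalls []toolCall
}

// hasRepeatedToolCalls reports whether any identical (name, input) call
// occurs more than maxRepeats times within the last windowSize steps.
func hasRepeatedToolCalls(steps []step, windowSize, maxRepeats int) bool {
	if windowSize <= 0 || len(steps) == 0 {
		return false
	}
	start := len(steps) - windowSize
	if start < 0 {
		start = 0
	}
	counts := make(map[toolCall]int)
	for _, s := range steps[start:] {
		for _, tc := range s.ToolCalls {
			counts[tc]++
			if counts[tc] > maxRepeats {
				return true
			}
		}
	}
	return false
}

func main() {
	view := toolCall{Name: "view", Input: "main.go"}
	stuck := []step{
		{ToolCalls: []toolCall{view}},
		{ToolCalls: []toolCall{view}},
		{ToolCalls: []toolCall{view}},
	}
	fmt.Println(hasRepeatedToolCalls(stuck, 10, 2)) // true: three identical calls
	fmt.Println(hasRepeatedToolCalls(stuck, 2, 2))  // false: only two fall in the window
}
```

Comparing on both name and input keeps legitimate repeated use of one tool (for example, viewing different files) from tripping the check.
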
## Streaming Bridge to TUI

The flow from agent goroutine to TUI:

1. **Agent callback** (`OnTextDelta`, etc.) calls `a.messages.Update(ctx, msg)`
2. **message.Service** persists to SQLite, then publishes: `broker.Publish(pubsub.UpdatedEvent, msg)`
3. **App** runs a goroutine that converts pubsub events to `tea.Msg` values and forwards them on the `app.Events()` channel
4. **Bubbletea** reads from `app.Events()` and dispatches to `UI.Update()`
5. **UI** receives `pubsub.Event[message.Message]` and updates the chat item

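Steps 2 and 3 can be pictured with a small generic broker. This is a sketch under assumptions rather than Crush's `pubsub` package: the `Broker`, `Subscribe`, and `Message` definitions here are hypothetical, and dropping events for a subscriber whose buffer is full is a design choice made for the example so that a slow UI cannot block the agent goroutine.

```go
package main

import (
	"fmt"
	"sync"
)

type EventType string

const (
	CreatedEvent EventType = "created"
	UpdatedEvent EventType = "updated"
)

// Event pairs an event type with a typed payload.
type Event[T any] struct {
	Type    EventType
	Payload T
}

// Broker fans each published event out to every subscriber channel.
type Broker[T any] struct {
	mu   sync.RWMutex
	subs []chan Event[T]
}

// Subscribe registers a buffered channel and returns its receive side.
func (b *Broker[T]) Subscribe(buffer int) <-chan Event[T] {
	ch := make(chan Event[T], buffer)
	b.mu.Lock()
	b.subs = append(b.subs, ch)
	b.mu.Unlock()
	return ch
}

// Publish never blocks: if a subscriber's buffer is full, that
// subscriber misses the event (assumption made for this sketch).
func (b *Broker[T]) Publish(t EventType, payload T) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs {
		select {
		case ch <- Event[T]{Type: t, Payload: payload}:
		default:
		}
	}
}

// Message is a hypothetical, trimmed-down message record.
type Message struct {
	ID      string
	Content string
}

func main() {
	var broker Broker[Message]
	events := broker.Subscribe(8)

	// The agent side: one create, then streaming updates.
	broker.Publish(CreatedEvent, Message{ID: "m1"})
	broker.Publish(UpdatedEvent, Message{ID: "m1", Content: "Hel"})
	broker.Publish(UpdatedEvent, Message{ID: "m1", Content: "Hello"})

	// The UI side: receive events and react to each one.
	for i := 0; i < 3; i++ {
		ev := <-events
		fmt.Printf("%s %q\n", ev.Type, ev.Payload.Content)
	}
}
```

Because every update carries the full message rather than a delta, a dropped event in this sketch is repaired by the next update for the same message.
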
```go
// In UI.Update():
case pubsub.Event[message.Message]:
    if msg.Payload.SessionID != m.session.ID {
        // Handle child session (agent tool)
        break
    }
    switch msg.Type {
    case pubsub.CreatedEvent:
        cmds = append(cmds, m.appendSessionMessage(msg.Payload))
    case pubsub.UpdatedEvent:
        cmds = append(cmds, m.updateSessionMessage(msg.Payload))
    }
```

The chat item caches its rendered output and re-renders only when the underlying message data changes.

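A minimal sketch of that caching idea follows. The `chatItem` type, its fields, and the quoting "renderer" are hypothetical; the point is only that `View` reuses its last output until `SetContent` sees different data.

```go
package main

import (
	"fmt"
	"strings"
)

// chatItem is a hypothetical chat entry that caches its rendered form.
type chatItem struct {
	content string
	cached  string
	valid   bool
	renders int // counts real renders, for demonstration only
}

// SetContent invalidates the cache only when the content differs.
func (c *chatItem) SetContent(s string) {
	if s == c.content {
		return
	}
	c.content = s
	c.valid = false
}

// View returns the cached rendering, rebuilding it if it is stale.
func (c *chatItem) View() string {
	if !c.valid {
		// Stand-in for an expensive render (markdown, styling, wrapping).
		c.cached = "> " + strings.ReplaceAll(c.content, "\n", "\n> ")
		c.valid = true
		c.renders++
	}
	return c.cached
}

func main() {
	item := &chatItem{}
	item.SetContent("hello\nworld")
	item.View()
	item.View()                     // served from cache
	item.SetContent("hello\nworld") // unchanged, cache stays valid
	item.View()
	fmt.Println(item.renders) // 1
}
```
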
## Tool System

**Source:** `internal/agent/tools/`

Tools are self-documenting pairs: a `.go` implementation file and a `.md` description file in the same directory.

Built-in tools: bash, edit, multiedit, view, write, grep, glob, ls, diagnostics, references, fetch, download, lsp_restart, sourcegraph, job_output, job_kill, list_mcp_resources, todos, agent, agentic_fetch.

MCP tools are dynamically loaded and prefixed with `mcp_`.

Each tool implements `fantasy.AgentTool`, which provides a JSON schema for the LLM and an execution function.

Tool results flow back through the `OnToolResult` callback -> `message.Service.Create()` -> pubsub -> TUI.

## Permission System

**Source:** `internal/permission/permission.go`

The permission service mediates tool execution:

```go
type Service interface {
    Request(ctx context.Context, opts CreatePermissionRequest) (bool, error)
    Grant(permission PermissionRequest)
    GrantPersistent(permission PermissionRequest)
    Deny(permission PermissionRequest)
    AutoApproveSession(sessionID string)
    SetSkipRequests(skip bool)  // yolo mode
}
```

When a tool needs permission:
1. Tool calls `permissions.Request()`, which blocks
2. Permission service publishes `pubsub.Event[permission.PermissionRequest]`
3. TUI receives the event and opens `dialog.Permissions`
4. User chooses: Allow / Allow for session / Deny
5. TUI calls `permissions.Grant()`, `permissions.GrantPersistent()`, or `permissions.Deny()`
6. The blocked `Request()` call returns, and the tool continues or errors

Allow-lists can be configured in `crush.json` to skip prompting for specific tools.

Yolo mode (`--yolo` flag) calls `SetSkipRequests(true)`, which auto-approves everything.

## Message Queue

**Source:** `internal/agent/agent.go`

If the user sends a new prompt while the agent is busy:

```go
if a.IsSessionBusy(call.SessionID) {
    existing, _ := a.messageQueue.Get(call.SessionID)
    existing = append(existing, call)
    a.messageQueue.Set(call.SessionID, existing)
    return nil, nil  // queued, not executed yet
}
```

Queued messages are drained in `PrepareStep` (called before each LLM step):

```go
PrepareStep: func(ctx context.Context, opts fantasy.PrepareStepFunctionOptions) (context.Context, fantasy.PrepareStepResult, error) {
    // Take everything queued for this session and clear the queue
    queuedCalls, _ := a.messageQueue.Get(call.SessionID)
    a.messageQueue.Del(call.SessionID)
    for _, queued := range queuedCalls {
        userMessage, _ := a.createUserMessage(ctx, queued)
        prepared.Messages = append(prepared.Messages, userMessage.ToAIMessage()...)
    }
    // ... (rest of PrepareStep omitted)
    return ctx, prepared, nil
},
```

This means queued prompts are injected into the conversation at the next natural break point (between LLM steps).

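The queue-then-drain pattern can be sketched in isolation. The `queue` type and its `Submit`, `Drain`, and `Finish` methods are hypothetical names for this example. One deliberate difference from the excerpts: each read-modify-write pair is done under a single lock, which is one way to guarantee that two prompts arriving at once cannot overwrite each other. Whether Crush needs that depends on code not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// call is a hypothetical, trimmed-down agent call.
type call struct {
	SessionID string
	Prompt    string
}

type queue struct {
	mu      sync.Mutex
	busy    map[string]bool
	pending map[string][]call
}

func newQueue() *queue {
	return &queue{busy: map[string]bool{}, pending: map[string][]call{}}
}

// Submit returns true if the caller should run the call now,
// or false if the session was busy and the call was queued.
func (q *queue) Submit(c call) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.busy[c.SessionID] {
		q.pending[c.SessionID] = append(q.pending[c.SessionID], c)
		return false
	}
	q.busy[c.SessionID] = true
	return true
}

// Drain removes and returns everything queued for the session,
// as a PrepareStep-style hook would do between LLM steps.
func (q *queue) Drain(sessionID string) []call {
	q.mu.Lock()
	defer q.mu.Unlock()
	calls := q.pending[sessionID]
	delete(q.pending, sessionID)
	return calls
}

// Finish marks the session idle again.
func (q *queue) Finish(sessionID string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.busy, sessionID)
}

func main() {
	q := newQueue()
	fmt.Println(q.Submit(call{SessionID: "s1", Prompt: "first"}))  // true: runs now
	fmt.Println(q.Submit(call{SessionID: "s1", Prompt: "second"})) // false: queued

	for _, c := range q.Drain("s1") {
		fmt.Println("injecting:", c.Prompt)
	}
	q.Finish("s1")
}
```
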
## Auto-Summarization

When the context window is nearly full, Crush auto-summarizes. "Nearly full" means at most 20K tokens remain for models with a window above 200K, or at most 20% of the window remains for smaller models:

```go
const (
    largeContextWindowThreshold = 200_000
    largeContextWindowBuffer    = 20_000
    smallContextWindowRatio     = 0.2
)

// In StopWhen condition:
remaining := contextWindow - tokensUsed
var threshold int64
if contextWindow > largeContextWindowThreshold {
    threshold = largeContextWindowBuffer  // 20K buffer for large models
} else {
    threshold = int64(float64(contextWindow) * smallContextWindowRatio)  // 20% for small models
}
if remaining <= threshold && !disableAutoSummarize {
    shouldSummarize = true
    return true  // stop the loop
}
```

After the loop stops, if `shouldSummarize` is true, the coordinator triggers summarization using the small model.

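To make the thresholds concrete, here is the same rule as a small standalone program with a few worked values. The constants are the ones quoted above; the function names `summarizeThreshold` and `shouldSummarize` are chosen for this sketch.

```go
package main

import "fmt"

const (
	largeContextWindowThreshold = 200_000
	largeContextWindowBuffer    = 20_000
	smallContextWindowRatio     = 0.2
)

// summarizeThreshold returns how many remaining tokens count as
// "nearly full" for a model with the given context window.
func summarizeThreshold(contextWindow int64) int64 {
	if contextWindow > largeContextWindowThreshold {
		return largeContextWindowBuffer
	}
	return int64(float64(contextWindow) * smallContextWindowRatio)
}

// shouldSummarize reports whether the remaining budget has fallen
// to or below the threshold.
func shouldSummarize(contextWindow, tokensUsed int64) bool {
	return contextWindow-tokensUsed <= summarizeThreshold(contextWindow)
}

func main() {
	for _, cw := range []int64{128_000, 200_000, 1_000_000} {
		fmt.Printf("window %d -> threshold %d\n", cw, summarizeThreshold(cw))
	}
	fmt.Println(shouldSummarize(128_000, 110_000))   // true: 18,000 left, threshold 25,600
	fmt.Println(shouldSummarize(1_000_000, 900_000)) // false: 100,000 left, threshold 20,000
}
```

Note the step at the boundary: a window of exactly 200,000 tokens uses the 20% rule (40,000), while anything larger drops to the fixed 20,000 buffer.
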
## Coordinator Pattern

**Source:** `internal/agent/coordinator.go`

The Coordinator manages the agent lifecycle:

```go
type Coordinator interface {
    Run(ctx context.Context, sessionID, prompt string, attachments ...message.Attachment) (*fantasy.AgentResult, error)
    Cancel(sessionID string)
    IsSessionBusy(sessionID string) bool
    QueuedPrompts(sessionID string) int
    UpdateModels(ctx context.Context) error
    Model() Model
}
```

It handles:
- **Multi-provider setup** - creates a `fantasy.LanguageModel` from config for each provider type (Anthropic, OpenAI, Google, Bedrock, Azure, OpenRouter, Vercel, etc.)
- **Model switching** - `UpdateModels()` reconfigures agents when the user changes model mid-session
- **Tool registration** - collects built-in tools and MCP tools, then passes them to the session agent
- **System prompt assembly** - loads Go templates from `internal/agent/templates/` and injects runtime data (working dir, OS, skills, context files)

The coordinator owns a map of named agents (currently "coder" and "task") and delegates to the current agent.

Thread safety throughout relies on `internal/csync`, which provides `Value[T]`, `Slice[T]`, and `Map[K,V]`: thin, mutex-protected wrappers around a value, a slice, and a map.