Fix API errors where tool_result blocks are sent without their corresponding tool_use blocks in the assistant message (#48002)

Daniel Strobusch and Bennet Bo Fenner created

When the JSON input of a tool call emitted by the model fails to parse, the system would:
1. Create a `LanguageModelToolResult` with the error
2. Add it to `pending_message.tool_results`
3. **Never add the corresponding `ToolUse` to
`pending_message.content`**

This left an orphaned `tool_result` that would be sent to the LLM API
without a matching `tool_use` block, causing the provider to reject the
entire request with an error like:

```
messages: Assistant message must contain at least one content block, if
immediately followed by a user message with tool_result
```

The issue was in `handle_tool_use_json_parse_error_event()`. It created
and returned a `LanguageModelToolResult` (which gets added to
`tool_results`), but **failed to add the corresponding `ToolUse` to the
message `content`**.

This asymmetry meant:
- `pending_message.content`: [] (empty - no ToolUse!)
- `pending_message.tool_results`: {id: result}

When `AgentMessage::to_request()` converted this to API messages, it
would create:
- Assistant message: no tool_use blocks ❌
- User message: tool_result block ✅

APIs require tool_use and tool_result to be paired, so this would fail.
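
The pairing rule can be illustrated with a minimal, self-contained sketch. The names below (`Content`, `PendingMessage`, `orphaned_tool_results`) are hypothetical simplifications for illustration only, not the real types in `crates/agent/src/thread.rs`. The check approximates what a provider is assumed to enforce: every `tool_result` id must have a `tool_use` with the same id in the assistant content.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical, simplified stand-in for assistant message content.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    Text(String),
    ToolUse { id: String },
}

/// Hypothetical, simplified stand-in for the pending assistant message.
#[derive(Debug, Default)]
struct PendingMessage {
    content: Vec<Content>,
    /// Tool results keyed by tool-use id.
    tool_results: BTreeMap<String, String>,
}

/// Returns the ids of tool results that have no matching tool use in `content`.
fn orphaned_tool_results(message: &PendingMessage) -> Vec<String> {
    // Collect the ids of all tool uses present in the assistant content.
    let used: HashSet<&str> = message
        .content
        .iter()
        .filter_map(|c| match c {
            Content::ToolUse { id } => Some(id.as_str()),
            Content::Text(_) => None,
        })
        .collect();

    // Any result whose id is not in that set is an orphan.
    message
        .tool_results
        .keys()
        .filter(|id| !used.contains(id.as_str()))
        .cloned()
        .collect()
}

fn main() {
    // State before the fix: a result was recorded, but no tool use was added.
    let mut message = PendingMessage::default();
    message.content.push(Content::Text("Calling the tool".to_string()));
    message
        .tool_results
        .insert("call_1".to_string(), "Error parsing input JSON".to_string());
    assert_eq!(orphaned_tool_results(&message), vec!["call_1".to_string()]);

    // State after the fix: the matching tool use is present, so nothing is orphaned.
    message.content.push(Content::ToolUse { id: "call_1".to_string() });
    assert!(orphaned_tool_results(&message).is_empty());
}
```

In this sketch the pre-fix state yields one orphaned id, and adding the matching tool use clears it.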

**Without this fix, the conversation becomes permanently broken** -
every subsequent message in the thread fails with the same API error
because the orphaned tool_result remains in the message history. The
only recovery is to start a completely new conversation, making this a
particularly annoying bug for users.

Modified `handle_tool_use_json_parse_error_event()` to:
1. **Add the `ToolUse` to `pending_message.content`** before returning
the result
2. Set `input` to an empty object `{}`, as the API requires an object,
while keeping the unparsable text in `raw_input`
3. Send the `tool_call` event to update the UI
4. Check for duplicates to avoid adding the same `tool_use` twice

This ensures `tool_use` and `tool_result` are always properly paired.
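
The duplicate check can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from the diff: `ToolUse`, `Content`, `Outcome`, and `push_or_update_tool_use` are hypothetical names, and UI events are reduced to a return value. As in `send_or_update_tool_use`, only the last content element is compared against the incoming tool use.

```rust
/// Hypothetical, simplified stand-in for a tool use block.
#[derive(Debug, Clone, PartialEq)]
struct ToolUse {
    id: String,
    raw_input: String,
}

/// Hypothetical, simplified stand-in for assistant message content.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    Text(String),
    ToolUse(ToolUse),
}

/// Tells the caller which UI event would be appropriate (new call vs. update).
#[derive(Debug, PartialEq)]
enum Outcome {
    Pushed,
    Updated,
}

/// Ensures `content` ends in `tool_use`: replaces the last element if it is a
/// tool use with the same id, otherwise appends a new element.
fn push_or_update_tool_use(content: &mut Vec<Content>, tool_use: &ToolUse) -> Outcome {
    if let Some(Content::ToolUse(last)) = content.last_mut() {
        if last.id == tool_use.id {
            *last = tool_use.clone();
            return Outcome::Updated;
        }
    }
    content.push(Content::ToolUse(tool_use.clone()));
    Outcome::Pushed
}

fn main() {
    let mut content = vec![Content::Text("Let me call the tool".to_string())];

    // First sighting of the tool use: it is appended.
    let partial = ToolUse { id: "call_1".to_string(), raw_input: "{inv".to_string() };
    assert_eq!(push_or_update_tool_use(&mut content, &partial), Outcome::Pushed);

    // Same id again (for example from the parse-error path): updated in place.
    let failed = ToolUse { id: "call_1".to_string(), raw_input: "{invalid json".to_string() };
    assert_eq!(push_or_update_tool_use(&mut content, &failed), Outcome::Updated);

    assert_eq!(content.len(), 2);
    assert_eq!(content[1], Content::ToolUse(failed));
}
```

Returning an outcome rather than emitting events directly is a choice made for this sketch so that it stays free of UI dependencies.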

Added a test for `handle_tool_use_json_parse_error_event()` that:
- ✅ Verifies the returned result is an error with the expected id and tool name
- ✅ Verifies tool_use is added to message content
- ✅ Confirms tool_use has correct metadata and the `{}` input fallback
- ✅ Verifies the tool_result is stored alongside the tool_use

## Manual Testing

To reproduce and test the fix:

1. Install the test MCP server:
   ```bash
   cargo install --git https://github.com/dastrobu/mcp-fail-server
   ```
2. Add the following to your Zed settings to enable the server:
   ```json
   {
     "context_servers": {
       "mcp-fail-server": {
         "command": "mcp-fail-server",
         "args":[]
       }
     }
   }
   ```

3. Open the assistant panel and ask it to use the `fail` tool
4. Without the fix: The conversation breaks permanently - every
subsequent message fails with the same API error, forcing you to start a
new thread

<img width="399" height="531" alt="image"
src="https://github.com/user-attachments/assets/533bdf40-80d3-4726-a9d9-dbabbe7379e5"
/>


5. With the fix: The error is handled gracefully, displayed in the UI,
and the conversation remains usable

<img width="391" height="512" alt="image"
src="https://github.com/user-attachments/assets/73aa6767-eeac-4d5d-bf6f-1beccca1d5cb"
/>


The mcp-fail-server always returns an error, triggering the JSON parse
error path that previously caused orphaned tool_result blocks.

Release Notes:

- Fixed an issue where errors could occur in the agent panel if an LLM
emitted a tool call with an invalid JSON payload

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>

Change summary

crates/agent/src/thread.rs | 213 +++++++++++++++++++++++++++++++++------
1 file changed, 177 insertions(+), 36 deletions(-)

Detailed changes

crates/agent/src/thread.rs

@@ -1965,6 +1965,7 @@ impl Thread {
                         tool_name,
                         raw_input,
                         json_parse_error,
+                        event_stream,
                     ),
                 )));
             }
@@ -2050,42 +2051,7 @@ impl Thread {
             kind = tool.kind();
         }
 
-        // Ensure the last message ends in the current tool use
-        let last_message = self.pending_message();
-        let push_new_tool_use = last_message.content.last_mut().is_none_or(|content| {
-            if let AgentMessageContent::ToolUse(last_tool_use) = content {
-                if last_tool_use.id == tool_use.id {
-                    *last_tool_use = tool_use.clone();
-                    false
-                } else {
-                    true
-                }
-            } else {
-                true
-            }
-        });
-
-        if push_new_tool_use {
-            event_stream.send_tool_call(
-                &tool_use.id,
-                &tool_use.name,
-                title,
-                kind,
-                tool_use.input.clone(),
-            );
-            last_message
-                .content
-                .push(AgentMessageContent::ToolUse(tool_use.clone()));
-        } else {
-            event_stream.update_tool_call_fields(
-                &tool_use.id,
-                acp::ToolCallUpdateFields::new()
-                    .title(title.as_str())
-                    .kind(kind)
-                    .raw_input(tool_use.input.clone()),
-                None,
-            );
-        }
+        self.send_or_update_tool_use(&tool_use, title, kind, event_stream);
 
         if !tool_use.is_input_complete {
             return None;
@@ -2152,7 +2118,23 @@ impl Thread {
         tool_name: Arc<str>,
         raw_input: Arc<str>,
         json_parse_error: String,
+        event_stream: &ThreadEventStream,
     ) -> LanguageModelToolResult {
+        let tool_use = LanguageModelToolUse {
+            id: tool_use_id.clone(),
+            name: tool_name.clone(),
+            raw_input: raw_input.to_string(),
+            input: serde_json::json!({}),
+            is_input_complete: true,
+            thought_signature: None,
+        };
+        self.send_or_update_tool_use(
+            &tool_use,
+            SharedString::from(&tool_use.name),
+            acp::ToolKind::Other,
+            event_stream,
+        );
+
         let tool_output = format!("Error parsing input JSON: {json_parse_error}");
         LanguageModelToolResult {
             tool_use_id,
@@ -2163,6 +2145,51 @@ impl Thread {
         }
     }
 
+    fn send_or_update_tool_use(
+        &mut self,
+        tool_use: &LanguageModelToolUse,
+        title: SharedString,
+        kind: acp::ToolKind,
+        event_stream: &ThreadEventStream,
+    ) {
+        // Ensure the last message ends in the current tool use
+        let last_message = self.pending_message();
+        let push_new_tool_use = last_message.content.last_mut().is_none_or(|content| {
+            if let AgentMessageContent::ToolUse(last_tool_use) = content {
+                if last_tool_use.id == tool_use.id {
+                    *last_tool_use = tool_use.clone();
+                    false
+                } else {
+                    true
+                }
+            } else {
+                true
+            }
+        });
+
+        if push_new_tool_use {
+            event_stream.send_tool_call(
+                &tool_use.id,
+                &tool_use.name,
+                title,
+                kind,
+                tool_use.input.clone(),
+            );
+            last_message
+                .content
+                .push(AgentMessageContent::ToolUse(tool_use.clone()));
+        } else {
+            event_stream.update_tool_call_fields(
+                &tool_use.id,
+                acp::ToolCallUpdateFields::new()
+                    .title(title.as_str())
+                    .kind(kind)
+                    .raw_input(tool_use.input.clone()),
+                None,
+            );
+        }
+    }
+
     pub fn title(&self) -> SharedString {
         self.title.clone().unwrap_or("New Thread".into())
     }
@@ -3511,3 +3538,117 @@ fn convert_image(image_content: acp::ImageContent) -> LanguageModelImage {
         size: None,
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use gpui::TestAppContext;
+    use language_model::LanguageModelToolUseId;
+    use serde_json::json;
+    use std::sync::Arc;
+
+    async fn setup_thread_for_test(cx: &mut TestAppContext) -> (Entity<Thread>, ThreadEventStream) {
+        cx.update(|cx| {
+            let settings_store = settings::SettingsStore::test(cx);
+            cx.set_global(settings_store);
+        });
+
+        let fs = fs::FakeFs::new(cx.background_executor.clone());
+        let templates = Templates::new();
+        let project = Project::test(fs.clone(), [], cx).await;
+
+        cx.update(|cx| {
+            let project_context = cx.new(|_cx| prompt_store::ProjectContext::default());
+            let context_server_store = project.read(cx).context_server_store();
+            let context_server_registry =
+                cx.new(|cx| ContextServerRegistry::new(context_server_store, cx));
+
+            let thread = cx.new(|cx| {
+                Thread::new(
+                    project,
+                    project_context,
+                    context_server_registry,
+                    templates,
+                    None,
+                    cx,
+                )
+            });
+
+            let (event_tx, _event_rx) = mpsc::unbounded();
+            let event_stream = ThreadEventStream(event_tx);
+
+            (thread, event_stream)
+        })
+    }
+
+    #[gpui::test]
+    async fn test_handle_tool_use_json_parse_error_adds_tool_use_to_content(
+        cx: &mut TestAppContext,
+    ) {
+        let (thread, event_stream) = setup_thread_for_test(cx).await;
+
+        cx.update(|cx| {
+            thread.update(cx, |thread, _cx| {
+                let tool_use_id = LanguageModelToolUseId::from("test_tool_id");
+                let tool_name: Arc<str> = Arc::from("test_tool");
+                let raw_input: Arc<str> = Arc::from("{invalid json");
+                let json_parse_error = "expected value at line 1 column 1".to_string();
+
+                // Call the function under test
+                let result = thread.handle_tool_use_json_parse_error_event(
+                    tool_use_id.clone(),
+                    tool_name.clone(),
+                    raw_input.clone(),
+                    json_parse_error,
+                    &event_stream,
+                );
+
+                // Verify the result is an error
+                assert!(result.is_error);
+                assert_eq!(result.tool_use_id, tool_use_id);
+                assert_eq!(result.tool_name, tool_name);
+                assert!(matches!(
+                    result.content,
+                    LanguageModelToolResultContent::Text(_)
+                ));
+
+                // Verify the tool use was added to the message content
+                {
+                    let last_message = thread.pending_message();
+                    assert_eq!(
+                        last_message.content.len(),
+                        1,
+                        "Should have one tool_use in content"
+                    );
+
+                    match &last_message.content[0] {
+                        AgentMessageContent::ToolUse(tool_use) => {
+                            assert_eq!(tool_use.id, tool_use_id);
+                            assert_eq!(tool_use.name, tool_name);
+                            assert_eq!(tool_use.raw_input, raw_input.to_string());
+                            assert!(tool_use.is_input_complete);
+                            // Should fall back to empty object for invalid JSON
+                            assert_eq!(tool_use.input, json!({}));
+                        }
+                        _ => panic!("Expected ToolUse content"),
+                    }
+                }
+
+                // Insert the tool result (simulating what the caller does)
+                thread
+                    .pending_message()
+                    .tool_results
+                    .insert(result.tool_use_id.clone(), result);
+
+                // Verify the tool result was added
+                let last_message = thread.pending_message();
+                assert_eq!(
+                    last_message.tool_results.len(),
+                    1,
+                    "Should have one tool_result"
+                );
+                assert!(last_message.tool_results.contains_key(&tool_use_id));
+            });
+        });
+    }
+}