acp_thread: Clarify max token limit error message (#52724)

Om Chillure and Bennet Bo Fenner created 1 week ago

When generation stopped due to the per-response output limit, Zed was
surfacing "Max tokens reached", which implies the full context window
was exhausted. In reality, `max_output_tokens` (the per-response cap)
may have been hit, which is a different condition.

This change distinguishes between the two cases: if `output_tokens >=
max_output_tokens`, it surfaces "Maximum output tokens reached";
otherwise it falls back to "Maximum tokens reached".
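
As a rough illustration of the check described above, here is a minimal standalone sketch. The `TokenUsage` struct and the `max_tokens_message` helper are hypothetical stand-ins written for this example (the real change lives inline in `AcpThread`), and the `u64` field types are an assumption:

```rust
/// Hypothetical stand-in for the thread's token usage record.
/// Only the two fields the check reads are modeled; `u64` is an assumption.
#[derive(Debug, Default)]
struct TokenUsage {
    output_tokens: u64,
    max_output_tokens: Option<u64>,
}

/// Hypothetical helper: picks the user-facing message for a max-tokens stop.
/// The output-limit message is chosen only when usage is known, a
/// per-response cap is reported, and the output count has reached that cap.
fn max_tokens_message(usage: Option<&TokenUsage>) -> &'static str {
    let exceeded_max_output_tokens = usage.is_some_and(|u| {
        u.max_output_tokens
            .is_some_and(|max| u.output_tokens >= max)
    });

    if exceeded_max_output_tokens {
        "Maximum output tokens reached"
    } else {
        "Maximum tokens reached"
    }
}
```

Note that missing usage data, or a usage record without a reported `max_output_tokens`, both fall through to the generic "Maximum tokens reached" message, so the more specific wording is only shown when the data supports it.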

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the UI/UX checklist
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #50254


Note: Reopens #50372, as suggested by @bennetbo

Release Notes:

- Fixed misleading "Max tokens reached" error by distinguishing between
the per-response output token limit and the total context window limit.

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>

Change summary

crates/acp_thread/src/acp_thread.rs      | 19 ++++++++++++++++++-
crates/agent_ui/src/conversation_view.rs |  6 +++---
2 files changed, 21 insertions(+), 4 deletions(-)

Detailed changes

crates/acp_thread/src/acp_thread.rs

@@ -2240,7 +2240,24 @@ impl AcpThread {
                             this.had_error = true;
                             cx.emit(AcpThreadEvent::Error);
                             log::error!("Max tokens reached. Usage: {:?}", this.token_usage);
-                            return Err(anyhow!("Max tokens reached"));
+
+                            let exceeded_max_output_tokens =
+                                this.token_usage.as_ref().is_some_and(|u| {
+                                    u.max_output_tokens
+                                        .is_some_and(|max| u.output_tokens >= max)
+                                });
+
+                            let message = if exceeded_max_output_tokens {
+                                log::error!(
+                                    "Max output tokens reached. Usage: {:?}",
+                                    this.token_usage
+                                );
+                                "Maximum output tokens reached"
+                            } else {
+                                log::error!("Max tokens reached. Usage: {:?}", this.token_usage);
+                                "Maximum tokens reached"
+                            };
+                            return Err(anyhow!(message));
                         }
 
                         let canceled = matches!(r.stop_reason, acp::StopReason::Cancelled);

crates/agent_ui/src/conversation_view.rs

@@ -6213,13 +6213,13 @@ pub(crate) mod tests {
             match error {
                 Some(ThreadError::Other { message, .. }) => {
                     assert!(
-                        message.contains("Max tokens reached"),
-                        "Expected 'Max tokens reached' error, got: {}",
+                        message.contains("Maximum tokens reached"),
+                        "Expected 'Maximum tokens reached' error, got: {}",
                         message
                     );
                 }
                 other => panic!(
-                    "Expected ThreadError::Other with 'Max tokens reached', got: {:?}",
+                    "Expected ThreadError::Other with 'Maximum tokens reached', got: {:?}",
                     other.is_some()
                 ),
             }