language_models: Fix the partial json streaming to not blast `\` everywhere (#51976)
Finn Eitreim
created
## Context
This PR fixes one of the issues in #51905, where model outputs are full
of errant `\` characters.
Here's the problem: as the response is streamed back to Zed, we
accumulate the message chunks and need to convert those chunks to
valid JSON. To do that we use `partial_json_fixer::fix_json`. When the
last character of a chunk is `\`, `fix_json` has to escape that
backslash, because it's inside of a string (if it isn't, it's invalid
JSON and the tool call will crash); otherwise the backslash would end up
escaping the closing `"` and everything would be messed up.
Why is this a problem for Zed:
T_0 is the output at some step.
T_1 is the output at the next step.
The `fix_json` system is meant to be used by replacing T_0 with T_1.
However, in the editor, replacing the entirety of T_0 with T_1 would be
slow, cause flickering, etc., so we calculate the difference between T_0
and T_1 and just add it to the current buffer state. So when a chunk
ends on `\`, we end up with something like `... end of line\\"}` at the
end of T_0.
In T_1, this becomes `... end of line\n ...`. Then when we add the new
chunk from T_1, it just picks up after the `\n`, because it's tracking
the length to manage the deltas, and the escaped backslash from T_0 is
left behind in the buffer.
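To make the failure concrete, here is a minimal sketch of the length-based delta logic described above. `append_delta` and `replay` are hypothetical stand-ins, not Zed's actual code, and the strings are the already-decoded tool input values rather than raw JSON:

```rust
/// Hypothetical stand-in for the editor's append-only update: it assumes the
/// bytes it has already shown are still a prefix of the latest snapshot and
/// appends only whatever comes after them.
fn append_delta(buffer: &mut String, latest: &str) {
    let already_shown = buffer.len();
    if let Some(delta) = latest.get(already_shown..) {
        buffer.push_str(delta);
    }
}

/// Replays two consecutive snapshots (T_0, then T_1) into an empty buffer
/// and returns what the buffer ends up containing.
fn replay(t0: &str, t1: &str) -> String {
    let mut buffer = String::new();
    append_delta(&mut buffer, t0);
    append_delta(&mut buffer, t1);
    buffer
}
```

With T_0 decoded as `end of line\` (the escaped backslash) and T_1 decoded as `end of line`, a newline, then `next`, `replay` keeps the stray backslash and skips the newline, because both occupy the same byte offset. If T_0 is `end of line` with the backslash stripped, the buffer ends up equal to T_1.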
## How to Review
utils.rs:
`fix_streamed_json` => remove trailing backslashes from incoming JSON
streams so that `partial_json_fixer::fix_json` doesn't try to escape
them.
other files: call `fix_streamed_json` before passing to `serde_json`
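As a rough illustration of the stripping step (not the code in this PR), the sketch below drops a trailing backslash only when it is unpaired. The name `strip_dangling_backslash` and the odd/even rule are assumptions made for this example; the result would then be handed to `partial_json_fixer::fix_json`:

```rust
/// Hypothetical helper: drops a trailing backslash that has not yet been
/// followed by the character it escapes. An even-length run of trailing
/// backslashes is treated as complete `\\` escapes and left untouched.
fn strip_dangling_backslash(partial_json: &str) -> &str {
    // Count the backslashes at the very end of the streamed text.
    let trailing = partial_json
        .bytes()
        .rev()
        .take_while(|&b| b == b'\\')
        .count();
    if trailing % 2 == 1 {
        // The last backslash starts an escape whose second half has not
        // arrived yet; it is ASCII, so slicing off one byte is safe.
        &partial_json[..partial_json.len() - 1]
    } else {
        partial_json
    }
}
```

Dropping the dangling backslash means the snapshot at T_0 never contains a character that T_1 will later replace, which is what the length-based delta tracking relies on.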
I had Claude write a bunch of tests while I was working on the fix,
which I have kept in for now, but the end functionality of
`fix_streamed_json` is really simple now, so maybe these aren't really
needed.
## Videos
Behavior Before:
https://github.com/user-attachments/assets/f23f5579-b2e1-4d71-9e24-f15ea831de52
Behavior After:
https://github.com/user-attachments/assets/40acdc23-4522-4621-be28-895965f4f262
## Self-Review Checklist
<!-- Check before requesting review: -->
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Release Notes:
- language_models: fixed partial json streaming
@@ -24,7 +24,7 @@ use ui::{ButtonLink, ConfiguredApiCard, List, ListBulletItem, prelude::*};
use ui_input::InputField;
use util::ResultExt;
-use crate::provider::util::parse_tool_arguments;
+use crate::provider::util::{fix_streamed_json, parse_tool_arguments};
pub use settings::AnthropicAvailableModel as AvailableModel;
@@ -873,9 +873,9 @@ impl AnthropicEventMapper {
// valid JSON that serde can accept, e.g. by closing
// unclosed delimiters. This way, we can update the
// UI with whatever has been streamed back so far.
- if let Ok(input) = serde_json::Value::from_str(
- &partial_json_fixer::fix_json(&tool_use.input_json),
- ) {
+ if let Ok(input) =
+ serde_json::Value::from_str(&fix_streamed_json(&tool_use.input_json))
+ {
return vec![Ok(LanguageModelCompletionEvent::ToolUse(
LanguageModelToolUse {
id: tool_use.id.clone().into(),