From a777605ec5736ddb71582a7939a78e9d3ea8a186 Mon Sep 17 00:00:00 2001
From: Anil Pai
Date: Tue, 17 Mar 2026 16:12:31 +0530
Subject: [PATCH] Use split token display for xAI models (#48719)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

### Split token display for xAI

Extends the split input/output token display (introduced in #46829 for OpenAI) to all xAI models. Instead of the combined `48k / 1M` token counter, xAI models now show:

- **↑** input tokens used / input token limit
- **↓** output tokens used / output token limit

#### Before

Screenshot 2026-02-08 at 11 07 13 AM

#### After

Screenshot 2026-02-08 at 11 05 36 AM

#### Changes

- **x_ai.rs** — Override `supports_split_token_display()` to return `true` on `XAiLanguageModel`. All built-in Grok models already implement `max_output_tokens()`, so no additional plumbing was needed.
- **cloud.rs** — Add `XAi` to the `matches!` pattern in `CloudLanguageModel::supports_split_token_display()` so cloud-routed xAI models also get the split display.

#### Tests

- `test_xai_supports_split_token_display` — Verifies all built-in Grok model variants return `true` for split token display.
- `test_xai_models_have_max_output_tokens` — Validates all built-in Grok models report `max_output_tokens` that is `Some`, positive, and less than `max_token_count` (required for the UI to compute the input token limit).
- `test_split_token_display_supported_providers` — Confirms the cloud provider match pattern includes `OpenAi` and `XAi` while excluding `Anthropic` and `Google`.

Release Notes:

- Changed the display of tokens for xAI models to reflect the input/output limits.
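The pattern in both files is the same: a trait method with a default of `false`, overridden per provider, plus a `matches!` allow-list for cloud-routed models. A minimal self-contained sketch of that pattern (using simplified stand-in types, not the real crate traits):

```rust
// Simplified stand-ins for the zed provider types; only the
// `supports_split_token_display` pattern from this patch is shown.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Anthropic,
    Google,
    OpenAi,
    XAi,
}

trait LanguageModel {
    // Default: providers fall back to the combined token counter.
    fn supports_split_token_display(&self) -> bool {
        false
    }
}

// Mirrors the x_ai.rs change: the provider opts in unconditionally.
struct XAiModel;

impl LanguageModel for XAiModel {
    fn supports_split_token_display(&self) -> bool {
        true
    }
}

// Mirrors the cloud.rs change: an allow-list over the upstream provider.
struct CloudModel {
    provider: Provider,
}

impl LanguageModel for CloudModel {
    fn supports_split_token_display(&self) -> bool {
        matches!(self.provider, Provider::OpenAi | Provider::XAi)
    }
}

fn main() {
    assert!(XAiModel.supports_split_token_display());
    assert!(CloudModel { provider: Provider::XAi }.supports_split_token_display());
    assert!(!CloudModel { provider: Provider::Anthropic }.supports_split_token_display());
    assert!(!CloudModel { provider: Provider::Google }.supports_split_token_display());
}
```

The default-`false` trait method keeps the change additive: providers that never implement `max_output_tokens()` are unaffected, and each new provider opts in with a one-line override.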
---------

Co-authored-by: Ben Brandt
Co-authored-by: Smit Barmase
---
 crates/language_models/src/provider/cloud.rs | 2 +-
 crates/language_models/src/provider/x_ai.rs  | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/crates/language_models/src/provider/cloud.rs b/crates/language_models/src/provider/cloud.rs
index 8f2b6c10f3434ed51e3908d0f9de93e54a12dae6..f2570e6516a9a69811bec726097e6318d9ede04b 100644
--- a/crates/language_models/src/provider/cloud.rs
+++ b/crates/language_models/src/provider/cloud.rs
@@ -631,7 +631,7 @@ impl LanguageModel for CloudLanguageModel {
 
     fn supports_split_token_display(&self) -> bool {
         use cloud_llm_client::LanguageModelProvider::*;
-        matches!(self.model.provider, OpenAi)
+        matches!(self.model.provider, OpenAi | XAi)
     }
 
     fn telemetry_id(&self) -> String {
diff --git a/crates/language_models/src/provider/x_ai.rs b/crates/language_models/src/provider/x_ai.rs
index f1f8bb658f04a91341951d1602af04f858af7bd3..c00637bce7e67b624f5cdcae9aebe43fb43971f8 100644
--- a/crates/language_models/src/provider/x_ai.rs
+++ b/crates/language_models/src/provider/x_ai.rs
@@ -288,6 +288,10 @@ impl LanguageModel for XAiLanguageModel {
         self.model.max_output_tokens()
     }
 
+    fn supports_split_token_display(&self) -> bool {
+        true
+    }
+
     fn count_tokens(
         &self,
         request: LanguageModelRequest,