copilot_chat: Return true context window size (#47557)

Created by Anil Pai

## Fix incorrect context size limits for GitHub Copilot Chat models

Fixes #44909

### Problem

The agent panel was displaying incorrect token limits for GitHub Copilot
models. Users reported that:
- The agent panel always showed a **128K token limit** for all GitHub
Copilot models, regardless of their actual context window size
- Claude models (e.g., Claude 3.7 Sonnet, Claude Opus 4.5) were showing
~90K instead of their actual 200K context window
- GPT-4o was showing 110K instead of its actual 128K context window
- Users could continue using models beyond the displayed limit, which
worked but was confusing

### Root Cause

The `max_token_count()` method in `copilot_chat.rs` was returning
`max_prompt_tokens` instead of `max_context_window_tokens`:

```rust
// Before (incorrect)
pub fn max_token_count(&self) -> u64 {
    self.capabilities.limits.max_prompt_tokens
}
```

GitHub's API returns three different token-related fields:
- `max_context_window_tokens`: The **full context window size** (e.g.,
200K for Claude 3.7)
- `max_prompt_tokens`: GitHub's limit for prompt input (e.g., 90K for
Claude 3.7)
- `max_output_tokens`: Maximum output tokens (e.g., 16K)
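
The distinction can be sketched as follows. This is a minimal illustration, not Zed's actual types: the struct, function, and `main` here are stand-ins, though the three field names match what GitHub's API reports.

```rust
// Minimal sketch of the three per-model limit fields GitHub's API reports.
// The struct and function names are illustrative, not Zed's actual code.
#[derive(Debug, Clone, Copy)]
#[allow(dead_code)]
struct ModelLimits {
    max_context_window_tokens: u64, // full context window (e.g. 200_000 for Claude 3.7)
    max_prompt_tokens: u64,         // GitHub's cap on prompt input (e.g. 90_000)
    max_output_tokens: u64,         // cap on generated output (e.g. 16_384)
}

// The fix in a nutshell: report the full window, not the prompt-input cap.
fn max_token_count(limits: &ModelLimits) -> u64 {
    limits.max_context_window_tokens
}

fn main() {
    let claude = ModelLimits {
        max_context_window_tokens: 200_000,
        max_prompt_tokens: 90_000,
        max_output_tokens: 16_384,
    };
    println!(
        "window = {}, prompt cap = {}",
        max_token_count(&claude),
        claude.max_prompt_tokens
    );
}
```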

The `max_token_count()` method in the `LanguageModel` trait is expected
to return the **full context window size** — this is consistent with all
other providers (Anthropic returns 200K for Claude, OpenAI returns 128K
for GPT-4o, etc.).
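
Sketched as a trait, the convention the fix aligns with looks like this. The trait and provider types below are hypothetical stand-ins, not Zed's real `LanguageModel` trait or provider implementations:

```rust
// Illustrative contract: max_token_count() reports the full context window.
// These types are stand-ins; Zed's actual trait and providers differ in detail.
trait ContextWindow {
    fn max_token_count(&self) -> u64;
}

struct ClaudeSketch;
struct Gpt4oSketch;

impl ContextWindow for ClaudeSketch {
    fn max_token_count(&self) -> u64 {
        200_000 // Claude's full context window, not a 90K prompt cap
    }
}

impl ContextWindow for Gpt4oSketch {
    fn max_token_count(&self) -> u64 {
        128_000 // GPT-4o's full context window
    }
}

fn main() {
    println!("claude = {}", ClaudeSketch.max_token_count());
    println!("gpt-4o = {}", Gpt4oSketch.max_token_count());
}
```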

### Solution

<img width="583" height="132" alt="Screenshot 2026-01-25 at 1 07 53 AM"
src="https://github.com/user-attachments/assets/847e2fdb-635d-44bc-a630-2d4867ba8c32"
/>


Changed `max_token_count()` to return `max_context_window_tokens`:

```rust
// After (correct)
pub fn max_token_count(&self) -> u64 {
    self.capabilities.limits.max_context_window_tokens as u64
}
```

### Impact

| Model | Before | After |
|-------|--------|-------|
| Claude 3.7 Sonnet | 90,000 | **200,000** |
| Claude Opus 4.5 | 90,000 | **200,000** |
| GPT-4o | 110,000 | **128,000** |

### Testing

Added a new test
`test_max_token_count_returns_context_window_not_prompt_tokens` that:

1. Deserializes model JSON with distinct `max_context_window_tokens` and
`max_prompt_tokens` values
2. Verifies Claude 3.7 Sonnet returns 200,000 (context window), not
90,000 (prompt tokens)
3. Verifies GPT-4o returns 128,000 (context window), not 110,000 (prompt
tokens)

All tests pass, including the new one:
```
running 4 tests
test tests::test_unknown_vendor_resilience ... ok
test tests::test_max_token_count_returns_context_window_not_prompt_tokens ... ok
test tests::test_resilient_model_schema_deserialize ... ok
test result: ok. 4 passed; 0 failed
```


Release Notes:

- copilot: Fixed incorrect context window size displayed for GitHub
Copilot Chat models in the agent panel.

### Change summary

```
crates/copilot_chat/src/copilot_chat.rs | 57 ++++++++++++++++++++++++++
1 file changed, 56 insertions(+), 1 deletion(-)
```

### Detailed changes

`crates/copilot_chat/src/copilot_chat.rs`

```diff
@@ -223,7 +223,7 @@ impl Model {
     }
 
     pub fn max_token_count(&self) -> u64 {
-        self.capabilities.limits.max_prompt_tokens
+        self.capabilities.limits.max_context_window_tokens as u64
     }
 
     pub fn supports_tools(&self) -> bool {
@@ -1038,6 +1038,61 @@ mod tests {
         assert_eq!(schema.data[0].vendor, ModelVendor::Unknown);
     }
 
+    #[test]
+    fn test_max_token_count_returns_context_window_not_prompt_tokens() {
+        let json = r#"{
+              "data": [
+                {
+                  "billing": { "is_premium": true, "multiplier": 1 },
+                  "capabilities": {
+                    "family": "claude-sonnet-4",
+                    "limits": { "max_context_window_tokens": 200000, "max_output_tokens": 16384, "max_prompt_tokens": 90000 },
+                    "object": "model_capabilities",
+                    "supports": { "streaming": true, "tool_calls": true },
+                    "type": "chat"
+                  },
+                  "id": "claude-sonnet-4",
+                  "is_chat_default": false,
+                  "is_chat_fallback": false,
+                  "model_picker_enabled": true,
+                  "name": "Claude Sonnet 4",
+                  "object": "model",
+                  "preview": false,
+                  "vendor": "Anthropic",
+                  "version": "claude-sonnet-4"
+                },
+                {
+                  "billing": { "is_premium": false, "multiplier": 1 },
+                  "capabilities": {
+                    "family": "gpt-4o",
+                    "limits": { "max_context_window_tokens": 128000, "max_output_tokens": 16384, "max_prompt_tokens": 110000 },
+                    "object": "model_capabilities",
+                    "supports": { "streaming": true, "tool_calls": true },
+                    "type": "chat"
+                  },
+                  "id": "gpt-4o",
+                  "is_chat_default": true,
+                  "is_chat_fallback": false,
+                  "model_picker_enabled": true,
+                  "name": "GPT-4o",
+                  "object": "model",
+                  "preview": false,
+                  "vendor": "Azure OpenAI",
+                  "version": "gpt-4o"
+                }
+              ],
+              "object": "list"
+            }"#;
+
+        let schema: ModelSchema = serde_json::from_str(json).unwrap();
+
+        // max_token_count() should return context window (200000), not prompt tokens (90000)
+        assert_eq!(schema.data[0].max_token_count(), 200000);
+
+        // GPT-4o should return 128000 (context window), not 110000 (prompt tokens)
+        assert_eq!(schema.data[1].max_token_count(), 128000);
+    }
+
     #[test]
     fn test_models_with_pending_policy_deserialize() {
         // This test verifies that models with policy states other than "enabled"
```