# Configuring the Assistant

Here's a bird's-eye view of all the configuration options available in Zed's Assistant:

- Configure LLM Providers
  - [Zed AI (Configured by default when signed in)](#zed-ai)
  - [Anthropic](#anthropic)
  - [GitHub Copilot Chat](#github-copilot-chat)
  - [Google AI](#google-ai)
  - [Ollama](#ollama)
  - [OpenAI](#openai)
  - [DeepSeek](#deepseek)
  - [LM Studio](#lmstudio)
- Advanced configuration options
  - [Configuring Endpoints](#custom-endpoint)
  - [Configuring Timeouts](#provider-timeout)
  - [Configuring Models](#default-model)
  - [Configuring Feature-specific Models](#feature-specific-models)
  - [Configuring Alternative Models for Inline Assists](#alternative-assists)
- [Common Panel Settings](#common-panel-settings)
- [General Configuration Example](#general-example)

## Providers {#providers}

To access the Assistant configuration view, run `assistant: show configuration` in the command palette, or click the hamburger menu at the top-right of the Assistant Panel and select "Configure".

The supported providers are listed below.

### Zed AI {#zed-ai}

A hosted service providing convenient and performant support for AI-enabled coding in Zed, powered by Anthropic's Claude 3.5 Sonnet and accessible just by signing in.

### Anthropic {#anthropic}

You can use Claude 3.5 Sonnet via [Zed AI](#zed-ai) for free. To use other Anthropic models, configure the provider with your own API key:

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.

#### Anthropic Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

You can configure a model to use [extended thinking](https://docs.anthropic.com/en/docs/about-claude/models/extended-thinking-models) (if it supports it)
by changing the mode in your model's configuration to `thinking`, for example:

```json
{
  "name": "claude-3-7-sonnet-latest",
  "display_name": "claude-3-7-sonnet-thinking",
  "max_tokens": 200000,
  "mode": {
    "type": "thinking",
    "budget_tokens": 4096
  }
}
```

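The fragment above is a single model entry. As a minimal sketch of where it might sit in `settings.json`, assuming it goes in the same `language_models.anthropic.available_models` array used for the custom models earlier in this section:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-7-sonnet-latest",
          "display_name": "claude-3-7-sonnet-thinking",
          "max_tokens": 200000,
          "mode": {
            "type": "thinking",
            "budget_tokens": 4096
          }
        }
      ]
    }
  }
}
```

With this in place, the entry should appear in the model dropdown under its `display_name`, like any other custom model.
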
### GitHub Copilot Chat {#github-copilot-chat}

You can use GitHub Copilot Chat with the Zed assistant by choosing it via the model dropdown in the assistant panel.

### Google AI {#google-ai}

You can use Gemini 1.5 Pro/Flash with the Zed assistant by choosing either model via the model dropdown in the assistant panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey).
2. Open the configuration view (`assistant: show configuration`) and navigate to the Google AI section.
3. Enter your Google AI API key and press enter.

The Google AI API key will be saved in your keychain.

Zed will also use the `GOOGLE_AI_API_KEY` environment variable if it's defined.

#### Google AI Custom Models {#google-ai-custom-models}

By default, Zed will use `stable` versions of models. You can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models), with the Google AI provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-1.5-flash-latest",
          "display_name": "Gemini 1.5 Flash (Latest)",
          "max_tokens": 1000000
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### Ollama {#ollama}

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and verify the installation with `ollama --version`.

1. Download one of the [available models](https://ollama.com/models), for example, for `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:

   ```sh
   ollama serve
   ```

3. In the assistant panel, select one of the Ollama models using the model dropdown.

#### Ollama Context Length {#ollama-context}

Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models. Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM are able to use most models out of the box. See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.

**Note**: Token counts displayed in the assistant panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use case, you may wish to limit or increase the context length for a specific model via `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

If you specify a context length that is too large for your hardware, Ollama will log an error. You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux). Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a value for `keep_alive` for each available model. This can be an integer (seconds) or a string duration like "5m", "10m", "1h", "1d", etc. For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.

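As a minimal sketch of the `keep_alive` option described above, assuming it sits next to `max_tokens` inside each entry of `available_models`, the earlier example could be extended like this (the `"10m"` value is only an illustration):

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "keep_alive": "10m"
        }
      ]
    }
  }
}
```

A shorter duration frees GPU memory sooner, while a longer one avoids reloading the model between requests.
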
### OpenAI {#openai}

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### OpenAI Custom Models {#openai-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini). If you wish to use alternate models, such as a preview release or a dated model release, or to control the request parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ],
      "version": "1"
    }
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; you can find it in the [OpenAI Model Docs](https://platform.openai.com/docs/models). OpenAI `o1` models should set `max_completion_tokens` as well to avoid incurring high reasoning token costs. Custom models will be listed in the model dropdown in the assistant panel.

### DeepSeek {#deepseek}

1. Visit the DeepSeek platform and [create an API key](https://platform.deepseek.com/api_keys)
2. Open the configuration view (`assistant: show configuration`) and navigate to the DeepSeek section
3. Enter your DeepSeek API key

The DeepSeek API key will be saved in your keychain.

Zed will also use the `DEEPSEEK_API_KEY` environment variable if it's defined.

#### DeepSeek Custom Models {#deepseek-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (DeepSeek Chat, DeepSeek Reasoner). If you wish to use alternate models or customize the API endpoint, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "deepseek": {
      "api_url": "https://api.deepseek.com",
      "available_models": [
        {
          "name": "deepseek-chat",
          "display_name": "DeepSeek Chat",
          "max_tokens": 64000
        },
        {
          "name": "deepseek-reasoner",
          "display_name": "DeepSeek Reasoner",
          "max_tokens": 64000,
          "max_output_tokens": 4096
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel. You can also modify the `api_url` to use a custom endpoint if needed.

### OpenAI API Compatible

Zed supports using OpenAI compatible APIs by specifying a custom `api_url` and `available_models` for the OpenAI provider.

#### X.ai Grok

Example configuration for using X.ai Grok with Zed:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.x.ai/v1",
      "available_models": [
        {
          "name": "grok-beta",
          "display_name": "X.ai Grok (Beta)",
          "max_tokens": 131072
        }
      ],
      "version": "1"
    }
  }
}
```

### LM Studio {#lmstudio}

1. Download and install the latest version of LM Studio from https://lmstudio.ai/download
2. In the app, press ⌘/Ctrl + Shift + M and download at least one model, e.g. qwen2.5-coder-7b

   You can also get models via the LM Studio CLI:

   ```sh
   lms get qwen2.5-coder-7b
   ```

3. Make sure the LM Studio API server is running. You can start it with:

   ```sh
   lms server start
   ```

Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.

## Advanced Configuration {#advanced-configuration}

### Custom Endpoints {#custom-endpoint}

You can use a custom API endpoint for different providers, as long as it's compatible with the provider's API structure.

To do so, add the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "api_url": "http://localhost:11434"
    }
  }
}
```

Where `some-provider` can be any of the following values: `anthropic`, `google`, `ollama`, `openai`.

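As a concrete sketch, the following points the `ollama` provider at an Ollama server on another machine. The address is hypothetical; substitute the host and port of your own server. The `//` comment is there only to flag the placeholder (Zed's settings file accepts comments).

```json
{
  "language_models": {
    "ollama": {
      // Hypothetical address of a machine on your network running `ollama serve`
      "api_url": "http://192.168.1.50:11434"
    }
  }
}
```

Other provider settings, such as `available_models`, can sit alongside `api_url` in the same object, as in the provider examples earlier in this document.
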
### Configuring Models {#default-model}

Zed's hosted LLM service sets `claude-3-7-sonnet-latest` as the default model.
However, you can change it either via the model dropdown in the Assistant Panel's bottom-left corner or by manually editing the `default_model` object in your settings:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "gpt-4o"
    }
  }
}
```

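The same object can presumably point at a model you defined yourself. The sketch below combines the OpenAI custom model from earlier in this document with `default_model`. It assumes that the `model` value must match the custom model's `name` and that `openai` is the matching `provider` value; treat both as assumptions to verify against your Zed version.

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        }
      ],
      "version": "1"
    }
  },
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "openai",
      "model": "gpt-4o-2024-08-06"
    }
  }
}
```
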
#### Feature-specific Models {#feature-specific-models}

> Currently only available in [Preview](https://zed.dev/releases/preview).

Zed allows you to configure different models for specific features.
This lets you use more powerful models for certain tasks and faster or more efficient models for others.

If a feature-specific model is not set, Zed falls back to the default model, which is the one you set in the Agent Panel.

You can configure the following feature-specific models:

- Thread summary model: Used for generating thread summaries
- Inline assistant model: Used for the inline assistant feature
- Commit message model: Used for generating Git commit messages

Example configuration:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-7-sonnet"
    },
    "inline_assistant_model": {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet"
    },
    "commit_message_model": {
      "provider": "openai",
      "model": "gpt-4o-mini"
    },
    "thread_summary_model": {
      "provider": "google",
      "model": "gemini-2.0-flash"
    }
  }
}
```

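Because unset feature models fall back to the default, you do not need to list every feature. A minimal sketch that overrides only the commit message model, reusing model names from the example above:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-7-sonnet"
    },
    "commit_message_model": {
      "provider": "openai",
      "model": "gpt-4o-mini"
    }
  }
}
```

Under the fallback rule described above, thread summaries and inline assists would use `claude-3-7-sonnet` here, and only commit messages would go to `gpt-4o-mini`.
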
### Configuring Alternative Models for Inline Assists {#alternative-assists}

You can configure additional models that will be used to perform inline assists in parallel. When you do this,
the inline assist UI will surface controls to cycle between the alternatives generated by each model. The models
you specify here are always used in _addition_ to your default model. For example, the following configuration
will generate two outputs for every assist: one with Claude 3.5 Sonnet, and one with GPT-4o.

```json
{
  "assistant": {
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-4o"
      }
    ],
    "version": "2"
  }
}
```

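Since `inline_alternatives` is an array, it can presumably hold more than one entry. The sketch below assumes that each entry adds one more parallel output and that entries may use different providers; the provider and model names are taken from other examples in this document.

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-4o"
      },
      {
        "provider": "google",
        "model": "gemini-1.5-flash"
      }
    ]
  }
}
```

If those assumptions hold, every inline assist would produce three outputs to cycle through, at the cost of three requests per assist.
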
## Common Panel Settings {#common-panel-settings}

| key            | type    | default | description                                                                           |
| -------------- | ------- | ------- | ------------------------------------------------------------------------------------- |
| enabled        | boolean | true    | Setting this to `false` will completely disable the assistant                         |
| button         | boolean | true    | Show the assistant icon in the status bar                                             |
| dock           | string  | "right" | The default dock position for the assistant panel. Can be ["left", "right", "bottom"] |
| default_height | number  | null    | The pixel height of the assistant panel when docked to the bottom                     |
| default_width  | number  | null    | The pixel width of the assistant panel when docked to the left or right               |

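As a short sketch of these keys in use, the following docks the panel at the bottom and hides the status bar icon. It assumes the keys live under the `assistant` object, as in the general example in this document, and that pixel sizes are written as plain numbers; the value `320` is only an illustration.

```json
{
  "assistant": {
    "version": "2",
    "enabled": true,
    "button": false,
    "dock": "bottom",
    "default_height": 320
  }
}
```

Going by the descriptions in the table, `default_width` would have no effect while the panel is docked at the bottom.
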
## General Configuration Example {#general-example}

```json
{
  "assistant": {
    "enabled": true,
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-7-sonnet"
    },
    "editor_model": {
      "provider": "openai",
      "model": "gpt-4o"
    },
    "inline_assistant_model": {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet"
    },
    "commit_message_model": {
      "provider": "openai",
      "model": "gpt-4o-mini"
    },
    "thread_summary_model": {
      "provider": "google",
      "model": "gemini-1.5-flash"
    },
    "version": "2",
    "button": true,
    "default_width": 480,
    "dock": "right"
  }
}
```