# Configuring the Assistant

## Providers {#providers}

The following providers are supported:

- [Zed AI (Configured by default when signed in)](#zed-ai)
- [Anthropic](#anthropic)
- [GitHub Copilot Chat](#github-copilot-chat)
- [Google AI](#google-ai)
- [Ollama](#ollama)
- [OpenAI](#openai)
- [DeepSeek](#deepseek)
- [LM Studio](#lmstudio)

To configure different providers, run `assistant: show configuration` in the command palette, or click on the hamburger menu at the top-right of the assistant panel and select "Configure".

To further customize providers, you can use `settings.json` as follows:

- [Configuring endpoints](#custom-endpoint)
- [Configuring models](#default-model)
- [Configuring feature-specific models](#feature-specific-models)
- [Configuring alternative models for inline assists](#alternative-assists)

### Zed AI {#zed-ai}

A hosted service providing convenient and performant support for AI-enabled coding in Zed, powered by Anthropic's Claude 3.5 Sonnet and accessible just by signing in.

### Anthropic {#anthropic}

You can use Claude 3.5 Sonnet via [Zed AI](#zed-ai) for free. To use other Anthropic models, you will need to configure the provider with your own API key.

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.
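
For example, you can export the key in your shell profile (the value below is a placeholder). The same pattern applies to the other provider keys mentioned in this document (`GOOGLE_AI_API_KEY`, `OPENAI_API_KEY`, `DEEPSEEK_API_KEY`):

```sh
export ANTHROPIC_API_KEY="your-api-key-here"
```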

#### Anthropic Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

You can configure a model to use [extended thinking](https://docs.anthropic.com/en/docs/about-claude/models/extended-thinking-models) (if it supports it)
by changing the mode of your model's configuration to `thinking`, for example:

```json
{
  "name": "claude-3-7-sonnet-latest",
  "display_name": "claude-3-7-sonnet-thinking",
  "max_tokens": 200000,
  "mode": {
    "type": "thinking",
    "budget_tokens": 4096
  }
}
```
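
As with the custom models above, this object goes inside the `available_models` array for the `anthropic` provider in your `settings.json`.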

### GitHub Copilot Chat {#github-copilot-chat}

You can use GitHub Copilot Chat with the Zed assistant by choosing it via the model dropdown in the assistant panel.

### Google AI {#google-ai}

You can use Gemini 1.5 Pro/Flash with the Zed assistant by choosing it via the model dropdown in the assistant panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey).
2. Open the configuration view (`assistant: show configuration`) and navigate to the Google AI section.
3. Enter your Google AI API key and press enter.

The Google AI API key will be saved in your keychain.

Zed will also use the `GOOGLE_AI_API_KEY` environment variable if it's defined.

#### Google AI custom models {#google-ai-custom-models}

By default, Zed will use `stable` versions of models, but you can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models), with the Google AI provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-1.5-flash-latest",
          "display_name": "Gemini 1.5 Flash (Latest)",
          "max_tokens": 1000000
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### Ollama {#ollama}

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and confirm the installation with `ollama --version`.

1. Download one of the [available models](https://ollama.com/models), for example, `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching it from a terminal (a quick connectivity check follows this list):

   ```sh
   ollama serve
   ```

3. In the assistant panel, select one of the Ollama models using the model dropdown.
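
If you're unsure whether the server is up, you can hit its default endpoint; a running server answers with "Ollama is running":

```sh
curl http://localhost:11434
```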
142
143#### Ollama Context Length {#ollama-context}
144
145Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models. Zed API requests to Ollama include this as `num_ctx` parameter, but the default values do not exceed `16384` so users with ~16GB of ram are able to use most models out of the box. See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.
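
For context, `num_ctx` is a standard option in Ollama's own chat API. The following hand-written `curl` sketch (an illustration, not Zed's literal request) shows a request that sets it explicitly:

```sh
curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [{ "role": "user", "content": "Hello" }],
  "options": { "num_ctx": 16384 }
}'
```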

**Note**: Token counts displayed in the assistant panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use case, you may wish to limit or increase the context length for a specific model via `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

If you specify a context length that is too large for your hardware, Ollama will log an error. You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux). Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a `keep_alive` value for each available model. This can be an integer (seconds) or a duration string like "5m", "10m", "1h", or "1d". For example, `"keep_alive": "120s"` allows the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.
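
For instance, extending the model entry from the example above (model name and values are illustrative):

```json
{
  "language_models": {
    "ollama": {
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "keep_alive": "120s"
        }
      ]
    }
  }
}
```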

### OpenAI {#openai}

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### OpenAI Custom Models {#openai-custom-models}

The Zed Assistant comes pre-configured to use the latest version of common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini). If you wish to use an alternate model, such as a preview release or a dated model release, or if you wish to control the request parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ],
      "version": "1"
    }
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model documentation](https://platform.openai.com/docs/models). OpenAI `o1` models should also set `max_completion_tokens` to avoid incurring high reasoning token costs. Custom models will be listed in the model dropdown in the assistant panel.

### DeepSeek {#deepseek}

1. Visit the DeepSeek platform and [create an API key](https://platform.deepseek.com/api_keys)
2. Open the configuration view (`assistant: show configuration`) and navigate to the DeepSeek section
3. Enter your DeepSeek API key

The DeepSeek API key will be saved in your keychain.

Zed will also use the `DEEPSEEK_API_KEY` environment variable if it's defined.

#### DeepSeek Custom Models {#deepseek-custom-models}

The Zed Assistant comes pre-configured to use the latest version of common models (DeepSeek Chat, DeepSeek Reasoner). If you wish to use alternate models or customize the API endpoint, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "deepseek": {
      "api_url": "https://api.deepseek.com",
      "available_models": [
        {
          "name": "deepseek-chat",
          "display_name": "DeepSeek Chat",
          "max_tokens": 64000
        },
        {
          "name": "deepseek-reasoner",
          "display_name": "DeepSeek Reasoner",
          "max_tokens": 64000,
          "max_output_tokens": 4096
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel. You can also modify the `api_url` to use a custom endpoint if needed.

### OpenAI API Compatible

Zed supports OpenAI-compatible APIs by specifying a custom `api_url` and `available_models` for the OpenAI provider.

#### X.ai Grok

Example configuration for using X.ai Grok with Zed:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.x.ai/v1",
      "available_models": [
        {
          "name": "grok-beta",
          "display_name": "X.ai Grok (Beta)",
          "max_tokens": 131072
        }
      ],
      "version": "1"
    }
  }
}
```

### LM Studio {#lmstudio}

1. Download and install the latest version of LM Studio from https://lmstudio.ai/download
2. In the app, press ⌘/Ctrl + Shift + M and download at least one model, e.g. qwen2.5-coder-7b

   You can also get models via the LM Studio CLI:

   ```sh
   lms get qwen2.5-coder-7b
   ```

3. Make sure the LM Studio API server is running by launching:

   ```sh
   lms server start
   ```

Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.

### Advanced configuration {#advanced-configuration}

#### Example Configuration

```json
{
  "assistant": {
    "enabled": true,
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-7-sonnet"
    },
    "editor_model": {
      "provider": "openai",
      "model": "gpt-4o"
    },
    "inline_assistant_model": {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet"
    },
    "commit_message_model": {
      "provider": "openai",
      "model": "gpt-4o-mini"
    },
    "thread_summary_model": {
      "provider": "google",
      "model": "gemini-1.5-flash"
    },
    "version": "2",
    "button": true,
    "default_width": 480,
    "dock": "right"
  }
}
```

#### Custom endpoints {#custom-endpoint}

You can use a custom API endpoint for different providers, as long as it's compatible with the provider's API structure.

To do so, add the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "api_url": "http://localhost:11434"
    }
  }
}
```

Where `some-provider` can be any of the following values: `anthropic`, `google`, `ollama`, `openai`.

#### Configuring models {#default-model}

The default model can be set via the model dropdown in the assistant panel's top-right corner. Selecting a model saves it as the default.
You can also manually edit the `default_model` object in your settings:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    }
  }
}
```

#### Feature-specific models {#feature-specific-models}

> Currently only available in [Preview](https://zed.dev/releases/preview).

Zed allows you to configure different models for specific features.
This provides flexibility to use more powerful models for certain tasks while using faster or more efficient models for others.

If a feature-specific model is not set, it will fall back to the default model, which is the one you set in the assistant panel.

You can configure the following feature-specific models:

- Thread summary model: Used for generating thread summaries
- Inline assistant model: Used for the inline assistant feature
- Commit message model: Used for generating Git commit messages

Example configuration:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-7-sonnet"
    },
    "inline_assistant_model": {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet"
    },
    "commit_message_model": {
      "provider": "openai",
      "model": "gpt-4o-mini"
    },
    "thread_summary_model": {
      "provider": "google",
      "model": "gemini-2.0-flash"
    }
  }
}
```

#### Configuring alternative models for inline assists {#alternative-assists}

You can configure additional models that will be used to perform inline assists in parallel. When you do this,
the inline assist UI will surface controls to cycle between the alternatives generated by each model. The models
you specify here are always used in _addition_ to your default model. For example, the following configuration
will generate two outputs for every assist: one with Claude 3.5 Sonnet and one with GPT-4o.

```json
{
  "assistant": {
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-4o"
      }
    ],
    "version": "2"
  }
}
```

#### Common Panel Settings

| key            | type    | default | description                                                                            |
| -------------- | ------- | ------- | -------------------------------------------------------------------------------------- |
| enabled        | boolean | true    | Setting this to `false` will completely disable the assistant                          |
| button         | boolean | true    | Show the assistant icon in the status bar                                              |
| dock           | string  | "right" | The default dock position for the assistant panel. Can be ["left", "right", "bottom"]  |
| default_height | number  | null    | The pixel height of the assistant panel when docked to the bottom                      |
| default_width  | number  | null    | The pixel width of the assistant panel when docked to the left or right                |
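
For example, to dock the assistant panel at the bottom with a fixed height (the values are illustrative):

```json
{
  "assistant": {
    "version": "2",
    "button": true,
    "dock": "bottom",
    "default_height": 320
  }
}
```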