# Configuring the Assistant

## Providers {#providers}

The following providers are supported:

- [Zed AI (Configured by default when signed in)](#zed-ai)
- [Anthropic](#anthropic)
- [GitHub Copilot Chat](#github-copilot-chat) [^1]
- [Google AI](#google-ai) [^1]
- [Ollama](#ollama)
- [OpenAI](#openai)
- [DeepSeek](#deepseek)
- [LM Studio](#lmstudio)

To configure different providers, run `assistant: show configuration` in the command palette, or click on the hamburger menu at the top-right of the assistant panel and select "Configure".

[^1]: This provider does not support the [`/workflow`](./commands#workflow-not-generally-available) command.

To further customize providers, you can use `settings.json` as follows:

- [Configuring endpoints](#custom-endpoint)
- [Configuring timeouts](#provider-timeout)
- [Configuring default model](#default-model)
- [Configuring alternative models for inline assists](#alternative-assists)
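
Throughout this page, provider options live under the top-level `language_models` key in `settings.json`, while the assistant panel itself is configured under the top-level `assistant` key. A minimal sketch of the two namespaces (comments are allowed in Zed's `settings.json`):

```json
{
  "assistant": {
    // Panel behavior and the default model (see "Advanced configuration")
  },
  "language_models": {
    // Per-provider settings such as `api_url` and `available_models`
  }
}
```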

### Zed AI {#zed-ai}

A hosted service providing convenient and performant support for AI-enabled coding in Zed, powered by Anthropic's Claude 3.5 Sonnet and accessible just by signing in.

### Anthropic {#anthropic}

You can use Claude 3.5 Sonnet via [Zed AI](#zed-ai) for free. To use other Anthropic models, you will need to configure the provider with your own API key.

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.
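
For example, you can make the key available by exporting it in your shell profile before launching Zed (a minimal sketch; the placeholder value is yours to fill in):

```sh
# Zed reads ANTHROPIC_API_KEY from the environment it was launched in
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
```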

#### Anthropic Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### GitHub Copilot Chat {#github-copilot-chat}

You can use GitHub Copilot Chat with the Zed assistant by choosing it via the model dropdown in the assistant panel.

### Google AI {#google-ai}

You can use Gemini 1.5 Pro/Flash with the Zed assistant by choosing it via the model dropdown in the assistant panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey)
2. Open the configuration view (`assistant: show configuration`) and navigate to the Google AI section
3. Enter your Google AI API key and press Enter

The Google AI API key will be saved in your keychain.

Zed will also use the `GOOGLE_AI_API_KEY` environment variable if it's defined.

#### Google AI custom models {#google-ai-custom-models}

By default Zed will use `stable` versions of models, but you can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models), with the Google AI provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-1.5-flash-latest",
          "display_name": "Gemini 1.5 Flash (Latest)",
          "max_tokens": 1000000
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### Ollama {#ollama}

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and ensure that it is running by executing `ollama --version`.

1. Download one of the [available models](https://ollama.com/models), for example, `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:

   ```sh
   ollama serve
   ```

3. In the assistant panel, select one of the Ollama models using the model dropdown.
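
   To confirm that the server is reachable, you can query Ollama's HTTP API, which lists the models you have pulled:

   ```sh
   # The Ollama server listens on localhost:11434 by default
   curl http://localhost:11434/api/tags
   ```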

#### Ollama Context Length {#ollama-context}

Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models. Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16 GB of RAM are able to use most models out of the box. See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.

**Note**: Token counts displayed in the assistant panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use-case, you may wish to limit or increase the context length for a specific model via `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

If you specify a context length that is too large for your hardware, Ollama will log an error. You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux). Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a value for `keep_alive` for each available model. This can be an integer (seconds) or a duration string like "5m", "10m", "1h", or "1d". For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.
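
For example, extending the model entry from the example above (a sketch reusing the same model name):

```json
{
  "language_models": {
    "ollama": {
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "keep_alive": "120s"
        }
      ]
    }
  }
}
```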

### OpenAI {#openai}

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### OpenAI Custom Models {#openai-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini). If you wish to use alternate models, such as a preview release or a dated model release, or if you wish to control the request parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ]
    },
    "version": "1"
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model docs](https://platform.openai.com/docs/models). OpenAI `o1` models should also set `max_completion_tokens` to avoid incurring high reasoning token costs. Custom models will be listed in the model dropdown in the assistant panel.

### DeepSeek {#deepseek}

1. Visit the DeepSeek platform and [create an API key](https://platform.deepseek.com/api_keys)
2. Open the configuration view (`assistant: show configuration`) and navigate to the DeepSeek section
3. Enter your DeepSeek API key

The DeepSeek API key will be saved in your keychain.

Zed will also use the `DEEPSEEK_API_KEY` environment variable if it's defined.

#### DeepSeek Custom Models {#deepseek-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (DeepSeek Chat, DeepSeek Reasoner). If you wish to use alternate models or customize the API endpoint, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "deepseek": {
      "api_url": "https://api.deepseek.com",
      "available_models": [
        {
          "name": "deepseek-chat",
          "display_name": "DeepSeek Chat",
          "max_tokens": 64000
        },
        {
          "name": "deepseek-reasoner",
          "display_name": "DeepSeek Reasoner",
          "max_tokens": 64000,
          "max_output_tokens": 4096
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel. You can also modify the `api_url` to use a custom endpoint if needed.

### OpenAI API Compatible

Zed supports using OpenAI-compatible APIs by specifying a custom `api_url` and `available_models` for the OpenAI provider.

#### X.ai Grok

Example configuration for using X.ai Grok with Zed:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.x.ai/v1",
      "available_models": [
        {
          "name": "grok-beta",
          "display_name": "X.ai Grok (Beta)",
          "max_tokens": 131072
        }
      ],
      "version": "1"
    }
  }
}
```

### Advanced configuration {#advanced-configuration}

#### Example Configuration

```json
{
  "assistant": {
    "enabled": true,
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "version": "2",
    "button": true,
    "default_width": 480,
    "dock": "right"
  }
}
```

### LM Studio {#lmstudio}

1. Download and install the latest version of LM Studio from [lmstudio.ai/download](https://lmstudio.ai/download)
2. In the app, press ⌘/Ctrl + Shift + M and download at least one model, e.g. `qwen2.5-coder-7b`

   You can also get models via the LM Studio CLI:

   ```sh
   lms get qwen2.5-coder-7b
   ```

3. Start the LM Studio API server by running:

   ```sh
   lms server start
   ```

Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.
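
To verify the server is up, you can hit its OpenAI-compatible models endpoint (a sketch assuming LM Studio's default port of 1234):

```sh
# Lists the models the local LM Studio server can serve
curl http://localhost:1234/v1/models
```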

#### Custom endpoints {#custom-endpoint}

You can use a custom API endpoint for different providers, as long as it's compatible with the provider's API structure.

To do so, add the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "api_url": "http://localhost:11434"
    }
  }
}
```

Where `some-provider` can be any of the following values: `anthropic`, `deepseek`, `google`, `ollama`, `openai`.

#### Configuring the default model {#default-model}

The default model can be set via the model dropdown in the assistant panel's top-right corner. Selecting a model saves it as the default.
You can also manually edit the `default_model` object in your settings:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    }
  }
}
```

#### Configuring alternative models for inline assists {#alternative-assists}

You can configure additional models that will be used to perform inline assists in parallel. When you do this, the inline assist UI will surface controls to cycle between the alternatives generated by each model. The models you specify here are always used in _addition_ to your default model. For example, the following configuration will generate two outputs for every assist: one with Claude 3.5 Sonnet and one with GPT-4o.

```json
{
  "assistant": {
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-4o"
      }
    ],
    "version": "2"
  }
}
```

#### Common Panel Settings

| key            | type    | default | description                                                                            |
| -------------- | ------- | ------- | -------------------------------------------------------------------------------------- |
| enabled        | boolean | true    | Setting this to `false` will completely disable the assistant                          |
| button         | boolean | true    | Show the assistant icon in the status bar                                              |
| dock           | string  | "right" | The default dock position for the assistant panel. Can be ["left", "right", "bottom"]  |
| default_height | string  | null    | The pixel height of the assistant panel when docked to the bottom                      |
| default_width  | string  | null    | The pixel width of the assistant panel when docked to the left or right                |