# Configuring the Assistant

## Providers {#providers}

The following providers are supported:

- [Zed AI (Configured by default when signed in)](#zed-ai)
- [Anthropic](#anthropic)
- [GitHub Copilot Chat](#github-copilot-chat) [^1]
- [Google AI](#google-ai) [^1]
- [Ollama](#ollama)
- [OpenAI](#openai)

To configure different providers, run `assistant: show configuration` in the command palette, or click on the hamburger menu at the top-right of the assistant panel and select "Configure".

[^1]: This provider does not support the [`/workflow`](./commands#workflow-not-generally-available) command.
To further customize providers, you can use `settings.json`:

- [Configuring endpoints](#custom-endpoint)
- [Configuring timeouts](#provider-timeout)
- [Configuring the default model](#default-model)
- [Configuring alternative models for inline assists](#alternative-assists)

### Zed AI {#zed-ai}

A hosted service providing convenient and performant support for AI-enabled coding in Zed, powered by Anthropic's Claude 3.5 Sonnet and accessible just by signing in.

### Anthropic {#anthropic}

You can use Claude 3.5 Sonnet via [Zed AI](#zed-ai) for free. To use other Anthropic models, you will need to configure the Anthropic provider with your own API key.

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.

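If you prefer, you can also make an Anthropic model the assistant's default in `settings.json` rather than via the model dropdown. A minimal sketch, assuming the `anthropic` provider key and reusing the model identifier from the custom-models example below:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet-20240620"
    }
  }
}
```
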
#### Anthropic Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### GitHub Copilot Chat {#github-copilot-chat}

You can use GitHub Copilot Chat with the Zed assistant by choosing it via the model dropdown in the assistant panel.

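If needed, you can adjust Copilot Chat's request timeout via the `copilot_chat` key under `language_models` (see [Configuring timeouts](#provider-timeout)). A minimal sketch, with an illustrative value:

```json
{
  "language_models": {
    "copilot_chat": {
      "low_speed_timeout_in_seconds": 120
    }
  }
}
```
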
### Google AI {#google-ai}

You can use Gemini 1.5 Pro/Flash with the Zed assistant by choosing it via the model dropdown in the assistant panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey).
2. Open the configuration view (`assistant: show configuration`) and navigate to the Google AI section
3. Enter your Google AI API key and press enter.

The Google AI API key will be saved in your keychain.

Zed will also use the `GOOGLE_AI_API_KEY` environment variable if it's defined.

#### Google AI custom models {#google-ai-custom-models}

By default, Zed will use `stable` versions of models, but you can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models), with the Google AI provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-1.5-flash-latest",
          "display_name": "Gemini 1.5 Flash (Latest)",
          "max_tokens": 1000000
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### Ollama {#ollama}

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and ensure it's running with `ollama --version`.

1. Download one of the [available models](https://ollama.com/models), for example, for `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:

   ```sh
   ollama serve
   ```

3. In the assistant panel, select one of the Ollama models using the model dropdown.

4. (Optional) Specify an [`api_url`](#custom-endpoint) or [`low_speed_timeout_in_seconds`](#provider-timeout) if required.

#### Ollama Context Length {#ollama-context}

Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models. Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM are able to use most models out of the box. See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.

**Note**: Token counts displayed in the assistant panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use-case, you may wish to limit or increase the context length for a specific model via `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "low_speed_timeout_in_seconds": 120,
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

If you specify a context length that is too large for your hardware, Ollama will log an error. You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux). Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a value for `keep_alive` for each available model. This can be an integer (seconds) or a string duration like "5m", "10m", "1h", "1d", etc. For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.

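For example, building on the context-length example above, this sketch keeps the model loaded for ten minutes of inactivity:

```json
{
  "language_models": {
    "ollama": {
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "keep_alive": "10m"
        }
      ]
    }
  }
}
```
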
### OpenAI {#openai}

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### OpenAI Custom Models {#openai-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini). If you wish to use alternate models, perhaps a preview release or a dated model release, or if you wish to control the request parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "provider": "openai",
          "name": "gpt-4o-2024-08-06",
          "max_tokens": 128000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ]
    }
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model docs](https://platform.openai.com/docs/models). OpenAI `o1` models should set `max_completion_tokens` as well to avoid incurring high reasoning token costs. Custom models will be listed in the model dropdown in the assistant panel.

### Advanced configuration {#advanced-configuration}

#### Example Configuration

```json
{
  "assistant": {
    "enabled": true,
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "version": "2",
    "button": true,
    "default_width": 480,
    "dock": "right"
  }
}
```

#### Custom endpoints {#custom-endpoint}

You can use a custom API endpoint for different providers, as long as it's compatible with the provider's API structure.

To do so, add the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "api_url": "http://localhost:11434"
    }
  }
}
```

Where `some-provider` can be any of the following values: `anthropic`, `google`, `ollama`, `openai`.

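For instance, to point the `openai` provider at a locally hosted, OpenAI-compatible server (the URL below is purely illustrative):

```json
{
  "language_models": {
    "openai": {
      "api_url": "http://localhost:8080/v1"
    }
  }
}
```
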
#### Custom timeout {#provider-timeout}

You can customize the timeout that's used for LLM requests by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "low_speed_timeout_in_seconds": 10
    }
  }
}
```

Where `some-provider` can be any of the following values: `anthropic`, `copilot_chat`, `google`, `ollama`, `openai`.

#### Configuring the default model {#default-model}

The default model can be set via the model dropdown in the assistant panel's top-right corner. Selecting a model saves it as the default. You can also manually edit the `default_model` object in your settings:

```json
{
  "assistant": {
    "version": "2",
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    }
  }
}
```

#### Configuring alternative models for inline assists {#alternative-assists}

You can configure additional models that will be used to perform inline assists in parallel. When you do this, the inline assist UI will surface controls to cycle between the alternatives generated by each model. The models you specify here are always used in _addition_ to your default model. For example, the following configuration will generate two outputs for every assist: one with Claude 3.5 Sonnet and one with GPT-4o.

```json
{
  "assistant": {
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-3-5-sonnet"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-4o"
      }
    ],
    "version": "2"
  }
}
```

#### Common Panel Settings

| key            | type    | default | description                                                                            |
| -------------- | ------- | ------- | -------------------------------------------------------------------------------------- |
| enabled        | boolean | true    | Setting this to `false` will completely disable the assistant                          |
| button         | boolean | true    | Show the assistant icon in the status bar                                              |
| dock           | string  | "right" | The default dock position for the assistant panel. Can be ["left", "right", "bottom"]  |
| default_height | number  | null    | The pixel height of the assistant panel when docked to the bottom                      |
| default_width  | number  | null    | The pixel width of the assistant panel when docked to the left or right                |
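
For example, a sketch combining several of these keys to dock the panel at the bottom with a fixed pixel height (the values shown are illustrative):

```json
{
  "assistant": {
    "version": "2",
    "button": true,
    "dock": "bottom",
    "default_height": 320
  }
}
```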