# Configuring Custom API Keys

While Zed offers hosted versions of models through our various plans, we're always happy to support users wanting to supply their own API keys for LLM providers.

> Using your own API keys is **_free_** - you do not need to subscribe to a Zed plan to use our AI features with your own keys.

## Supported LLM Providers

| Provider                                        | Tool Use Supported |
| ----------------------------------------------- | ------------------ |
| [Anthropic](#anthropic)                         | ✅                 |
| [GitHub Copilot Chat](#github-copilot-chat)     | In Some Cases      |
| [Google AI](#google-ai)                         | ✅                 |
| [Ollama](#ollama)                               | ✅                 |
| [OpenAI](#openai)                               | ✅                 |
| [DeepSeek](#deepseek)                           | 🚫                 |
| [OpenAI API Compatible](#openai-api-compatible) | 🚫                 |
| [LM Studio](#lmstudio)                          | 🚫                 |

## Providers {#providers}

To access the Assistant configuration view, run `assistant: show configuration` in the command palette, or click on the hamburger menu at the top-right of the Assistant Panel and select "Configure".

Below you can find all the supported providers available so far.

### Anthropic {#anthropic}

> 🔨 Supports tool use

You can use Anthropic models with the Zed assistant by choosing it via the model dropdown in the assistant panel.

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.
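
As a sketch, assuming a zsh shell on macOS or Linux, you could export the key before launching Zed. The same pattern applies to the `OPENAI_API_KEY`, `GOOGLE_AI_API_KEY`, and `DEEPSEEK_API_KEY` variables mentioned below:

```sh
# Hypothetical example: add your key to your shell profile so Zed inherits it.
# Replace the placeholder with your actual key from the Anthropic console.
echo 'export ANTHROPIC_API_KEY=sk-ant-...' >> ~/.zshrc
source ~/.zshrc
```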

#### Anthropic Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

You can configure a model to use [extended thinking](https://docs.anthropic.com/en/docs/about-claude/models/extended-thinking-models) (if it supports it)
by changing the mode in your model's configuration to `thinking`, for example:

```json
{
  "name": "claude-3-7-sonnet-latest",
  "display_name": "claude-3-7-sonnet-thinking",
  "max_tokens": 200000,
  "mode": {
    "type": "thinking",
    "budget_tokens": 4096
  }
}
```

### GitHub Copilot Chat {#github-copilot-chat}

> 🔨 Supports tool use in some cases.
> See [here](https://github.com/zed-industries/zed/blob/9e0330ba7d848755c9734bf456c716bddf0973f3/crates/language_models/src/provider/copilot_chat.rs#L189-L198) for the supported subset.

You can use GitHub Copilot Chat with the Zed assistant by choosing it via the model dropdown in the assistant panel.

### Google AI {#google-ai}

> 🔨 Supports tool use

You can use Gemini 1.5 Pro/Flash with the Zed assistant by choosing it via the model dropdown in the assistant panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey).
2. Open the configuration view (`assistant: show configuration`) and navigate to the Google AI section.
3. Enter your Google AI API key and press enter.

The Google AI API key will be saved in your keychain.

Zed will also use the `GOOGLE_AI_API_KEY` environment variable if it's defined.

#### Google AI custom models {#google-ai-custom-models}

By default, Zed will use `stable` versions of models, but you can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models), with the Google AI provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-1.5-flash-latest",
          "display_name": "Gemini 1.5 Flash (Latest)",
          "max_tokens": 1000000
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel.

### Ollama {#ollama}

> 🔨 Supports tool use

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and confirm the installation with `ollama --version`.

1. Download one of the [available models](https://ollama.com/models), for example, for `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:

   ```sh
   ollama serve
   ```

3. In the assistant panel, select one of the Ollama models using the model dropdown.
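
To confirm the server is reachable before selecting a model, you can query Ollama's REST API (served on port 11434 by default); this is a quick sanity check, not a required step:

```sh
# List the models your local Ollama server has available.
# A failed request means the server isn't running; an empty list means
# no models have been pulled yet.
curl http://localhost:11434/api/tags
```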

#### Ollama Context Length {#ollama-context}

Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models. Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16 GB of RAM are able to use most models out of the box. See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.

**Note**: Token counts displayed in the assistant panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use-case you may wish to limit or increase the context length for a specific model via `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```
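
If you want to check whether a given context length fits in memory before committing it to your settings, one sketch (using Ollama's public `/api/generate` endpoint and its `num_ctx` option) is to issue a one-off request and watch the server logs described below:

```sh
# Hypothetical smoke test: request a short completion while asking for a
# 32768-token context window. If the window doesn't fit in memory, Ollama
# will log an error (see the log commands below).
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder",
  "prompt": "Say hello.",
  "stream": false,
  "options": { "num_ctx": 32768 }
}'
```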

If you specify a context length that is too large for your hardware, Ollama will log an error. You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux). Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a value for `keep_alive` for each available model. This can be an integer (seconds) or alternatively a string duration like "5m", "10m", "1h", "1d", etc. For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.
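
As a sketch, extending the model entry from the example above with a two-minute `keep_alive`:

```json
{
  "name": "qwen2.5-coder",
  "display_name": "qwen 2.5 coder 32K",
  "max_tokens": 32768,
  "keep_alive": "120s"
}
```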

### OpenAI {#openai}

> 🔨 Supports tool use

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the configuration view (`assistant: show configuration`) and navigate to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### OpenAI Custom Models {#openai-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini). If you wish to use alternate models, perhaps a preview release or a dated model release, or if you wish to control the request parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ],
      "version": "1"
    }
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model docs](https://platform.openai.com/docs/models). OpenAI `o1` models should set `max_completion_tokens` as well to avoid incurring high reasoning token costs. Custom models will be listed in the model dropdown in the assistant panel.

### DeepSeek {#deepseek}

> 🚫 Does not support tool use 🚫

1. Visit the DeepSeek platform and [create an API key](https://platform.deepseek.com/api_keys)
2. Open the configuration view (`assistant: show configuration`) and navigate to the DeepSeek section
3. Enter your DeepSeek API key

The DeepSeek API key will be saved in your keychain.

Zed will also use the `DEEPSEEK_API_KEY` environment variable if it's defined.

#### DeepSeek Custom Models {#deepseek-custom-models}

The Zed Assistant comes pre-configured to use the latest version for common models (DeepSeek Chat, DeepSeek Reasoner). If you wish to use alternate models or customize the API endpoint, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "deepseek": {
      "api_url": "https://api.deepseek.com",
      "available_models": [
        {
          "name": "deepseek-chat",
          "display_name": "DeepSeek Chat",
          "max_tokens": 64000
        },
        {
          "name": "deepseek-reasoner",
          "display_name": "DeepSeek Reasoner",
          "max_tokens": 64000,
          "max_output_tokens": 4096
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the assistant panel. You can also modify the `api_url` to use a custom endpoint if needed.

### OpenAI API Compatible {#openai-api-compatible}

Zed supports using OpenAI-compatible APIs by specifying a custom `api_url` and `available_models` for the OpenAI provider.

#### X.ai Grok

Example configuration for using X.ai Grok with Zed:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.x.ai/v1",
      "available_models": [
        {
          "name": "grok-beta",
          "display_name": "X.ai Grok (Beta)",
          "max_tokens": 131072
        }
      ],
      "version": "1"
    }
  }
}
```

### LM Studio {#lmstudio}

> 🚫 Does not support tool use 🚫

1. Download and install the latest version of LM Studio from [lmstudio.ai/download](https://lmstudio.ai/download)
2. In the app press ⌘/Ctrl + Shift + M and download at least one model, e.g. `qwen2.5-coder-7b`.

   You can also get models via the LM Studio CLI:

   ```sh
   lms get qwen2.5-coder-7b
   ```

3. Start the LM Studio API server by running:

   ```sh
   lms server start
   ```

Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.
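
If Zed doesn't pick up your LM Studio server automatically, a minimal sketch of pointing the provider at it in `settings.json` might look like the following. The `api_url` shown assumes LM Studio's default port of 1234; check your LM Studio server settings for the exact address:

```json
{
  "language_models": {
    "lmstudio": {
      // Assumed default LM Studio server address; adjust the port to
      // match what LM Studio's server settings report.
      "api_url": "http://localhost:1234/api/v0"
    }
  }
}
```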