# LLM Providers

To use AI in Zed, you need to have at least one large language model provider set up.

You can do that by either subscribing to [one of Zed's plans](./plans-and-usage.md), or by using API keys you already have for the supported providers.

## Use Your Own Keys {#use-your-own-keys}

If you already have an API key for an existing LLM provider (say, Anthropic or OpenAI), you can insert it into Zed and use the Agent Panel **_for free_**.

You can add your API key to a given provider either via the Agent Panel's settings UI or directly in your `settings.json` through the `language_models` key.

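All of this configuration lives under the `language_models` key in your `settings.json`. As a minimal sketch of that shape (the OpenAI provider and its default endpoint are shown purely as an illustration; each provider's actual options are covered in the sections below):

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.openai.com/v1"
    }
  }
}
```
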
## Supported Providers

Here are all the supported LLM providers for which you can use your own API keys:

| Provider                                        | Tool Use Supported                                                                                                                                                          |
| ----------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Amazon Bedrock](#amazon-bedrock)               | Depends on the model                                                                                                                                                          |
| [Anthropic](#anthropic)                         | ✅                                                                                                                                                                            |
| [DeepSeek](#deepseek)                           | ✅                                                                                                                                                                            |
| [GitHub Copilot Chat](#github-copilot-chat)     | For some models ([link](https://github.com/zed-industries/zed/blob/9e0330ba7d848755c9734bf456c716bddf0973f3/crates/language_models/src/provider/copilot_chat.rs#L189-L198))   |
| [Google AI](#google-ai)                         | ✅                                                                                                                                                                            |
| [LM Studio](#lmstudio)                          | ✅                                                                                                                                                                            |
| [Mistral](#mistral)                             | ✅                                                                                                                                                                            |
| [Ollama](#ollama)                               | ✅                                                                                                                                                                            |
| [OpenAI](#openai)                               | ✅                                                                                                                                                                            |
| [OpenAI API Compatible](#openai-api-compatible) | ✅                                                                                                                                                                            |
| [OpenRouter](#openrouter)                       | ✅                                                                                                                                                                            |
| [Vercel](#vercel-v0)                            | ✅                                                                                                                                                                            |
| [xAI](#xai)                                     | ✅                                                                                                                                                                            |

### Amazon Bedrock {#amazon-bedrock}

> ✅ Supports tool use with models that support streaming tool use.
> More details can be found in the [Amazon Bedrock Tool Use documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html).

To use Amazon Bedrock's models, AWS authentication is required.
Ensure your credentials have the following permissions set up:

- `bedrock:InvokeModelWithResponseStream`
- `bedrock:InvokeModel`
- `bedrock:ConverseStream`

Your IAM policy should look similar to:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:ConverseStream"
      ],
      "Resource": "*"
    }
  ]
}
```

With that done, choose one of the two authentication methods:

#### Authentication via Named Profile (Recommended)

1. Ensure you have the AWS CLI installed and configured with a named profile
2. Open your `settings.json` (`zed: open settings`) and include the `bedrock` key under `language_models` with the following settings:

   ```json
   {
     "language_models": {
       "bedrock": {
         "authentication_method": "named_profile",
         "region": "your-aws-region",
         "profile": "your-profile-name"
       }
     }
   }
   ```

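If you don't yet have a named profile for Step 1, the AWS CLI can create one; a sketch, with an illustrative profile name:

```sh
aws configure --profile your-profile-name
```
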
#### Authentication via Static Credentials

While it's possible to configure static credentials through the Agent Panel settings UI by entering your AWS access key and secret directly, we recommend using named profiles instead for better security practices.
To do this:

1. Create an IAM User that you can assume in the [IAM Console](https://us-east-1.console.aws.amazon.com/iam/home?region=us-east-1#/users).
2. Create security credentials for that User, save them, and keep them secure.
3. Open the agent settings (`agent: open settings`) and go to the Amazon Bedrock section.
4. Copy the credentials from Step 2 into the respective **Access Key ID**, **Secret Access Key**, and **Region** fields.

#### Cross-Region Inference

The Zed implementation of Amazon Bedrock uses [Cross-Region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html) for all the model and region combinations that support it.
With Cross-Region inference, you can distribute traffic across multiple AWS Regions, enabling higher throughput.

For example, if you use `Claude Sonnet 3.7 Thinking` from `us-east-1`, it may be processed across the US regions, namely: `us-east-1`, `us-east-2`, or `us-west-2`.
Cross-Region inference requests are kept within the AWS Regions that are part of the geography where the data originally resides.
For example, a request made within the US is kept within the AWS Regions in the US.

Although the data remains stored only in the source Region, your input prompts and output results might move outside of your source Region during Cross-Region inference.
All data is transmitted encrypted across Amazon's secure network.

We support Cross-Region inference for each of the models on a best-effort basis; please refer to the [Cross-Region inference method code](https://github.com/zed-industries/zed/blob/main/crates/bedrock/src/models.rs#L297).

For the most up-to-date supported regions and models, refer to [Supported Models and Regions for Cross-Region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html).

### Anthropic {#anthropic}

> ✅ Supports tool use

You can use Anthropic models by choosing them via the model dropdown in the Agent Panel.

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the settings view (`agent: open settings`) and go to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.

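For instance, to expose the key to a Zed instance launched from a terminal; a sketch, assuming a POSIX shell and an illustrative key value:

```sh
export ANTHROPIC_API_KEY=sk-ant-your-key-here
zed
```
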
#### Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.

You can configure a model to use [extended thinking](https://docs.anthropic.com/en/docs/about-claude/models/extended-thinking-models) (if it supports it) by changing the mode in your model's configuration to `thinking`, for example:

```json
{
  "name": "claude-sonnet-4-latest",
  "display_name": "claude-sonnet-4-thinking",
  "max_tokens": 200000,
  "mode": {
    "type": "thinking",
    "budget_tokens": 4096
  }
}
```

### DeepSeek {#deepseek}

> ✅ Supports tool use

1. Visit the DeepSeek platform and [create an API key](https://platform.deepseek.com/api_keys)
2. Open the settings view (`agent: open settings`) and go to the DeepSeek section
3. Enter your DeepSeek API key

The DeepSeek API key will be saved in your keychain.

Zed will also use the `DEEPSEEK_API_KEY` environment variable if it's defined.

#### Custom Models {#deepseek-custom-models}

The Zed agent comes pre-configured to use the latest version for common models (DeepSeek Chat, DeepSeek Reasoner).
If you wish to use alternate models or customize the API endpoint, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "deepseek": {
      "api_url": "https://api.deepseek.com",
      "available_models": [
        {
          "name": "deepseek-chat",
          "display_name": "DeepSeek Chat",
          "max_tokens": 64000
        },
        {
          "name": "deepseek-reasoner",
          "display_name": "DeepSeek Reasoner",
          "max_tokens": 64000,
          "max_output_tokens": 4096
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.
You can also modify the `api_url` to use a custom endpoint if needed.

### GitHub Copilot Chat {#github-copilot-chat}

> ✅ Supports tool use in some cases.
> Visit [the Copilot Chat code](https://github.com/zed-industries/zed/blob/9e0330ba7d848755c9734bf456c716bddf0973f3/crates/language_models/src/provider/copilot_chat.rs#L189-L198) for the supported subset.

You can use GitHub Copilot Chat with the Zed agent by choosing it via the model dropdown in the Agent Panel.

1. Open the settings view (`agent: open settings`) and go to the GitHub Copilot Chat section
2. Click on `Sign in to use GitHub Copilot` and follow the steps shown in the modal

Alternatively, you can provide an OAuth token via the `GH_COPILOT_TOKEN` environment variable.

> **Note**: If you don't see specific models in the dropdown, you may need to enable them in your [GitHub Copilot settings](https://github.com/settings/copilot/features).

To use Copilot Enterprise with Zed (for both agent and completions), you must configure your enterprise endpoint as described in [Configuring GitHub Copilot Enterprise](./edit-prediction.md#github-copilot-enterprise).

### Google AI {#google-ai}

> ✅ Supports tool use

You can use Gemini models with the Zed agent by choosing them via the model dropdown in the Agent Panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey).
2. Open the settings view (`agent: open settings`) and go to the Google AI section.
3. Enter your Google AI API key and press enter.

The Google AI API key will be saved in your keychain.

Zed will also use the `GEMINI_API_KEY` environment variable if it's defined. See [Using Gemini API keys](https://ai.google.dev/gemini-api/docs/api-key) in the Gemini docs for more.

#### Custom Models {#google-ai-custom-models}

By default, Zed will use `stable` versions of models, but you can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models).
You can configure a model to use [thinking mode](https://ai.google.dev/gemini-api/docs/thinking) (if it supports it) by adding a `mode` configuration to your model.
This is useful for controlling reasoning token usage and response speed.
If not specified, Gemini will automatically choose the thinking budget.

Here is an example of a custom Google AI model you could add to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-2.5-flash-preview-05-20",
          "display_name": "Gemini 2.5 Flash (Thinking)",
          "max_tokens": 1000000,
          "mode": {
            "type": "thinking",
            "budget_tokens": 24000
          }
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.

### LM Studio {#lmstudio}

> ✅ Supports tool use

1. Download and install [the latest version of LM Studio](https://lmstudio.ai/download)
2. In the app, press `cmd/ctrl-shift-m` and download at least one model (e.g., qwen2.5-coder-7b). Alternatively, you can get models via the LM Studio CLI:

   ```sh
   lms get qwen2.5-coder-7b
   ```

3. Make sure the LM Studio API server is running by executing:

   ```sh
   lms server start
   ```

Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.

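If you run the server on a non-default address, you can point Zed at it. A sketch, assuming the `lmstudio` settings key and LM Studio's default local endpoint:

```json
{
  "language_models": {
    "lmstudio": {
      "api_url": "http://localhost:1234/api/v0"
    }
  }
}
```
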
### Mistral {#mistral}

> ✅ Supports tool use

1. Visit the Mistral platform and [create an API key](https://console.mistral.ai/api-keys/)
2. Open the configuration view (`agent: open settings`) and navigate to the Mistral section
3. Enter your Mistral API key

The Mistral API key will be saved in your keychain.

Zed will also use the `MISTRAL_API_KEY` environment variable if it's defined.

#### Custom Models {#mistral-custom-models}

The Zed agent comes pre-configured with several Mistral models (codestral-latest, mistral-large-latest, mistral-medium-latest, mistral-small-latest, open-mistral-nemo, and open-codestral-mamba).
All the default models support tool use.
If you wish to use alternate models or customize their parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "mistral": {
      "api_url": "https://api.mistral.ai/v1",
      "available_models": [
        {
          "name": "mistral-tiny-latest",
          "display_name": "Mistral Tiny",
          "max_tokens": 32000,
          "max_output_tokens": 4096,
          "max_completion_tokens": 1024,
          "supports_tools": true,
          "supports_images": false
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.

### Ollama {#ollama}

> ✅ Supports tool use

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and ensure it's running with `ollama --version`.

1. Download one of the [available models](https://ollama.com/models); for example, for `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:

   ```sh
   ollama serve
   ```

3. In the Agent Panel, select one of the Ollama models using the model dropdown.

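Before selecting a model, you can confirm the server is reachable; a quick check that also lists your locally installed models, assuming Ollama's default local address:

```sh
curl http://localhost:11434/api/tags
```
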
#### Ollama Context Length {#ollama-context}

Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models.
Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM are able to use most models out of the box.

See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.

> **Note**: Token counts displayed in the Agent Panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use case, you may wish to limit or increase the context length for a specific model via `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "supports_tools": true,
          "supports_thinking": true,
          "supports_images": true
        }
      ]
    }
  }
}
```

If you specify a context length that is too large for your hardware, Ollama will log an error.
You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux).
Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a value for `keep_alive` for each available model.
This can be an integer (seconds) or alternatively a string duration like "5m", "10m", "1h", "1d", etc.
For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.

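As a sketch, `keep_alive` sits alongside the other per-model options in `available_models` (reusing the model from the example above):

```json
{
  "language_models": {
    "ollama": {
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "max_tokens": 32768,
          "keep_alive": "120s" // unload after 120 seconds of inactivity
        }
      ]
    }
  }
}
```
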
The `supports_tools` option controls whether the model will use additional tools.
If the model is tagged with `tools` in the Ollama catalog, this option should be supplied, and the built-in profiles `Ask` and `Write` can be used.
If the model is not tagged with `tools` in the Ollama catalog, this option can still be supplied with the value `true`; however, be aware that only the `Minimal` built-in profile will work.

The `supports_thinking` option controls whether the model will perform an explicit "thinking" (reasoning) pass before producing its final answer.
If the model is tagged with `thinking` in the Ollama catalog, set this option and you can use it in Zed.

The `supports_images` option enables the model's vision capabilities, allowing it to process images included in the conversation context.
If the model is tagged with `vision` in the Ollama catalog, set this option and you can use it in Zed.

### OpenAI {#openai}

> ✅ Supports tool use

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the settings view (`agent: open settings`) and go to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### Custom Models {#openai-custom-models}

The Zed agent comes pre-configured to use the latest version for common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini).
To use alternate models, perhaps a preview release or a dated model release, or if you wish to control the request parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ],
      "version": "1"
    }
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model documentation](https://platform.openai.com/docs/models).

OpenAI `o1` models should set `max_completion_tokens` as well to avoid incurring high reasoning token costs.
Custom models will be listed in the model dropdown in the Agent Panel.

### OpenAI API Compatible {#openai-api-compatible}

Zed supports using [OpenAI-compatible APIs](https://platform.openai.com/docs/api-reference/chat) by specifying a custom `api_url` and `available_models` for the OpenAI provider.
This is useful for connecting to other hosted services (like Together AI, Anyscale, etc.) or local models.

You can add a custom, OpenAI-compatible model either via the UI or by editing your `settings.json`.

To do it via the UI, go to the Agent Panel settings (`agent: open settings`) and look for the "Add Provider" button to the right of the "LLM Providers" section title.
Then, fill in the input fields available in the modal.

To do it via your `settings.json`, add the following snippet under `language_models`:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.together.xyz/v1", // Using Together AI as an example
      "available_models": [
        {
          "name": "mistralai/Mixtral-8x7B-Instruct-v0.1",
          "display_name": "Together Mixtral 8x7B",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

Note that LLM API keys aren't stored in your settings file.
So, ensure you have the key set in your environment variables (`OPENAI_API_KEY=<your api key>`) so your settings can pick it up.

### OpenRouter {#openrouter}

> ✅ Supports tool use

OpenRouter provides access to multiple AI models through a single API. It supports tool use for compatible models.

1. Visit [OpenRouter](https://openrouter.ai) and create an account
2. Generate an API key from your [OpenRouter keys page](https://openrouter.ai/keys)
3. Open the settings view (`agent: open settings`) and go to the OpenRouter section
4. Enter your OpenRouter API key

The OpenRouter API key will be saved in your keychain.

Zed will also use the `OPENROUTER_API_KEY` environment variable if it's defined.

#### Custom Models {#openrouter-custom-models}

You can add custom models to the OpenRouter provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "open_router": {
      "api_url": "https://openrouter.ai/api/v1",
      "available_models": [
        {
          "name": "google/gemini-2.0-flash-thinking-exp",
          "display_name": "Gemini 2.0 Flash (Thinking)",
          "max_tokens": 200000,
          "max_output_tokens": 8192,
          "supports_tools": true,
          "supports_images": true,
          "mode": {
            "type": "thinking",
            "budget_tokens": 8000
          }
        }
      ]
    }
  }
}
```

The available configuration options for each model are:

- `name` (required): The model identifier used by OpenRouter
- `display_name` (optional): A human-readable name shown in the UI
- `max_tokens` (required): The model's context window size
- `max_output_tokens` (optional): Maximum tokens the model can generate
- `max_completion_tokens` (optional): Maximum completion tokens
- `supports_tools` (optional): Whether the model supports tool/function calling
- `supports_images` (optional): Whether the model supports image inputs
- `mode` (optional): Special mode configuration for thinking models

You can find available models and their specifications on the [OpenRouter models page](https://openrouter.ai/models).

Custom models will be listed in the model dropdown in the Agent Panel.

### Vercel v0 {#vercel-v0}

> ✅ Supports tool use

[Vercel v0](https://vercel.com/docs/v0/api) is an expert model for generating full-stack apps, with framework-aware completions optimized for modern stacks like Next.js and Vercel.
It supports text and image inputs and provides fast streaming responses.

The v0 models are [OpenAI-compatible models](#openai-api-compatible), but Vercel is listed as a first-class provider in the panel's settings view.

To start using it with Zed, ensure you have first created a [v0 API key](https://v0.dev/chat/settings/keys).
Once you have it, paste it directly into the Vercel provider section in the panel's settings view.

You should then find it as `v0-1.5-md` in the model dropdown in the Agent Panel.

### xAI {#xai}

> ✅ Supports tool use

Zed has first-class support for [xAI](https://x.ai/) models. You can use your own API key to access Grok models.

1. [Create an API key in the xAI Console](https://console.x.ai/team/default/api-keys)
2. Open the settings view (`agent: open settings`) and go to the **xAI** section
3. Enter your xAI API key

The xAI API key will be saved in your keychain. Zed will also use the `XAI_API_KEY` environment variable if it's defined.

> **Note:** While the xAI API is OpenAI-compatible, Zed has first-class support for it as a dedicated provider. For the best experience, we recommend using the dedicated `x_ai` provider configuration instead of the [OpenAI API Compatible](#openai-api-compatible) method.

#### Custom Models {#xai-custom-models}

The Zed agent comes pre-configured with common Grok models. If you wish to use alternate models or customize their parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "x_ai": {
      "api_url": "https://api.x.ai/v1",
      "available_models": [
        {
          "name": "grok-1.5",
          "display_name": "Grok 1.5",
          "max_tokens": 131072,
          "max_output_tokens": 8192
        },
        {
          "name": "grok-1.5v",
          "display_name": "Grok 1.5V (Vision)",
          "max_tokens": 131072,
          "max_output_tokens": 8192,
          "supports_images": true
        }
      ]
    }
  }
}
```

## Custom Provider Endpoints {#custom-provider-endpoint}

You can use a custom API endpoint for different providers, as long as it's compatible with the provider's API structure.
To do so, add the following to your `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "api_url": "http://localhost:11434"
    }
  }
}
```

Currently, `some-provider` can be any of the following values: `anthropic`, `google`, `ollama`, `openai`.

This is the same infrastructure that powers models that are, for example, [OpenAI-compatible](#openai-api-compatible).