# LLM Providers

To use AI in Zed, you need to have at least one large language model provider set up.

You can do that either by subscribing to [one of Zed's plans](./plans-and-usage.md) or by using API keys you already have for the supported providers.

## Use Your Own Keys {#use-your-own-keys}

If you already have an API key for an existing LLM provider—say, Anthropic or OpenAI—you can insert it into Zed and use the full power of the Agent Panel **_for free_**.

To add an existing API key for a given provider, go to the Agent Panel settings (`agent: open settings`), look for the desired provider, paste the key into the input, and hit enter.

> Note: API keys are _not_ stored as plain text in your `settings.json`, but rather in your OS's secure credential storage.

## Supported Providers

Here are all the supported LLM providers for which you can use your own API keys:

- [Amazon Bedrock](#amazon-bedrock)
- [Anthropic](#anthropic)
- [DeepSeek](#deepseek)
- [GitHub Copilot Chat](#github-copilot-chat)
- [Google AI](#google-ai)
- [LM Studio](#lmstudio)
- [Mistral](#mistral)
- [Ollama](#ollama)
- [OpenAI](#openai)
- [OpenAI API Compatible](#openai-api-compatible)
- [OpenRouter](#openrouter)
- [Vercel](#vercel-v0)
- [xAI](#xai)

### Amazon Bedrock {#amazon-bedrock}

> Supports tool use with models that support streaming tool use.
> More details can be found in the [Amazon Bedrock Tool Use documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html).

To use Amazon Bedrock's models, AWS authentication is required.
Ensure your credentials have the following permissions set up:

- `bedrock:InvokeModelWithResponseStream`
- `bedrock:InvokeModel`
- `bedrock:ConverseStream`

Your IAM policy should look similar to:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:ConverseStream"
      ],
      "Resource": "*"
    }
  ]
}
```

With that done, choose one of the two authentication methods:

#### Authentication via Named Profile (Recommended)

1. Ensure you have the AWS CLI installed and configured with a named profile (see the example after these steps)
2. Open your `settings.json` (`zed: open settings`) and include the `bedrock` key under `language_models` with the following settings:
   ```json
   {
     "language_models": {
       "bedrock": {
         "authentication_method": "named_profile",
         "region": "your-aws-region",
         "profile": "your-profile-name"
       }
     }
   }
   ```
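
If you don't have a named profile yet, one way to create it is with the AWS CLI; a minimal sketch, reusing the placeholder `your-profile-name` from the settings above:

```sh
# Prompts for the access key ID, secret access key, default region, and output
# format, then stores them under the given profile in ~/.aws/credentials.
aws configure --profile your-profile-name
```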

#### Authentication via Static Credentials

While it's possible to configure via the Agent Panel settings UI by entering your AWS access key and secret directly, we recommend using named profiles instead as a better security practice.
To use static credentials:

1. Create an IAM User that you can assume in the [IAM Console](https://us-east-1.console.aws.amazon.com/iam/home?region=us-east-1#/users).
2. Create security credentials for that User, save them, and keep them secure.
3. Open the Agent Panel settings (`agent: open settings`) and go to the Amazon Bedrock section.
4. Copy the credentials from Step 2 into the respective **Access Key ID**, **Secret Access Key**, and **Region** fields.

#### Cross-Region Inference

The Zed implementation of Amazon Bedrock uses [Cross-Region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html) for all the model and region combinations that support it.
With Cross-Region inference, you can distribute traffic across multiple AWS Regions, enabling higher throughput.

For example, if you use `Claude Sonnet 3.7 Thinking` from `us-east-1`, it may be processed across the US regions, namely `us-east-1`, `us-east-2`, or `us-west-2`.
Cross-Region inference requests are kept within the AWS Regions that are part of the geography where the data originally resides.
For example, a request made within the US is kept within the AWS Regions in the US.

Although the data remains stored only in the source Region, your input prompts and output results might move outside of your source Region during Cross-Region inference.
All data will be transmitted encrypted across Amazon's secure network.

We will support Cross-Region inference for each of the models on a best-effort basis; please refer to the [Cross-Region inference method code](https://github.com/zed-industries/zed/blob/main/crates/bedrock/src/models.rs#L297).

For the most up-to-date supported regions and models, refer to [Supported Models and Regions for Cross-Region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html).

### Anthropic {#anthropic}

You can use Anthropic models by choosing them via the model dropdown in the Agent Panel.

1. Sign up for Anthropic and [create an API key](https://console.anthropic.com/settings/keys)
2. Make sure that your Anthropic account has credits
3. Open the settings view (`agent: open settings`) and go to the Anthropic section
4. Enter your Anthropic API key

Even if you pay for Claude Pro, you will still have to [pay for additional credits](https://console.anthropic.com/settings/plans) to use it via the API.

Zed will also use the `ANTHROPIC_API_KEY` environment variable if it's defined.
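
For example, you can export the variable in the shell you launch Zed from (the key value below is a placeholder):

```sh
# Placeholder key; use the API key from the Anthropic console.
export ANTHROPIC_API_KEY=sk-ant-your-key-here
```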

#### Custom Models {#anthropic-custom-models}

You can add custom models to the Anthropic provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "anthropic": {
      "available_models": [
        {
          "name": "claude-3-5-sonnet-20240620",
          "display_name": "Sonnet 2024-June",
          "max_tokens": 128000,
          "max_output_tokens": 2560,
          "cache_configuration": {
            "max_cache_anchors": 10,
            "min_total_token": 10000,
            "should_speculate": false
          },
          "tool_override": "some-model-that-supports-toolcalling"
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.

You can configure a model to use [extended thinking](https://docs.anthropic.com/en/docs/about-claude/models/extended-thinking-models) (if it supports it) by changing the mode in your model's configuration to `thinking`, for example:

```json
{
  "name": "claude-sonnet-4-latest",
  "display_name": "claude-sonnet-4-thinking",
  "max_tokens": 200000,
  "mode": {
    "type": "thinking",
    "budget_tokens": 4096
  }
}
```

### DeepSeek {#deepseek}

1. Visit the DeepSeek platform and [create an API key](https://platform.deepseek.com/api_keys)
2. Open the settings view (`agent: open settings`) and go to the DeepSeek section
3. Enter your DeepSeek API key

The DeepSeek API key will be saved in your keychain.

Zed will also use the `DEEPSEEK_API_KEY` environment variable if it's defined.

#### Custom Models {#deepseek-custom-models}

The Zed agent comes pre-configured to use the latest versions of common models (DeepSeek Chat, DeepSeek Reasoner).
If you wish to use alternate models or customize the API endpoint, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "deepseek": {
      "api_url": "https://api.deepseek.com",
      "available_models": [
        {
          "name": "deepseek-chat",
          "display_name": "DeepSeek Chat",
          "max_tokens": 64000
        },
        {
          "name": "deepseek-reasoner",
          "display_name": "DeepSeek Reasoner",
          "max_tokens": 64000,
          "max_output_tokens": 4096
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.
You can also modify the `api_url` to use a custom endpoint if needed.

### GitHub Copilot Chat {#github-copilot-chat}

You can use GitHub Copilot Chat with the Zed agent by choosing it via the model dropdown in the Agent Panel.

1. Open the settings view (`agent: open settings`) and go to the GitHub Copilot Chat section
2. Click on `Sign in to use GitHub Copilot` and follow the steps shown in the modal

Alternatively, you can provide an OAuth token via the `GH_COPILOT_TOKEN` environment variable.
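
For example (the token value is a placeholder; use an OAuth token authorized for GitHub Copilot):

```sh
# Placeholder token; Zed will use this instead of the sign-in flow.
export GH_COPILOT_TOKEN=gho_your-token-here
```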

> **Note**: If you don't see specific models in the dropdown, you may need to enable them in your [GitHub Copilot settings](https://github.com/settings/copilot/features).

To use Copilot Enterprise with Zed (for both agent and completions), you must configure your enterprise endpoint as described in [Configuring GitHub Copilot Enterprise](./edit-prediction.md#github-copilot-enterprise).

### Google AI {#google-ai}

You can use Gemini models with the Zed agent by choosing one via the model dropdown in the Agent Panel.

1. Go to the Google AI Studio site and [create an API key](https://aistudio.google.com/app/apikey)
2. Open the settings view (`agent: open settings`) and go to the Google AI section
3. Enter your Google AI API key and press enter

The Google AI API key will be saved in your keychain.

Zed will also use the `GEMINI_API_KEY` environment variable if it's defined. See [Using Gemini API keys](https://ai.google.dev/gemini-api/docs/api-key) in the Gemini docs for more.

#### Custom Models {#google-ai-custom-models}

By default, Zed will use `stable` versions of models, but you can use specific versions of models, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models).

You can configure a model to use [thinking mode](https://ai.google.dev/gemini-api/docs/thinking) (if it supports it) by adding a `mode` configuration to your model.
This is useful for controlling reasoning token usage and response speed.
If not specified, Gemini will automatically choose the thinking budget.

Here is an example of a custom Google AI model you could add to your Zed `settings.json`:

```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-2.5-flash-preview-05-20",
          "display_name": "Gemini 2.5 Flash (Thinking)",
          "max_tokens": 1000000,
          "mode": {
            "type": "thinking",
            "budget_tokens": 24000
          }
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.

### LM Studio {#lmstudio}

1. Download and install [the latest version of LM Studio](https://lmstudio.ai/download)
2. In the app, press `cmd/ctrl-shift-m` and download at least one model (e.g., qwen2.5-coder-7b). Alternatively, you can get models via the LM Studio CLI:

   ```sh
   lms get qwen2.5-coder-7b
   ```

3. Make sure the LM Studio API server is running by executing:

   ```sh
   lms server start
   ```
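
To confirm the server is reachable, you can query its OpenAI-compatible endpoint; this sketch assumes LM Studio's default port of `1234`:

```sh
# Lists the models the local LM Studio server currently exposes.
curl http://localhost:1234/v1/models
```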

> Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.
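
If Zed needs to reach LM Studio at a non-default address, you can point the provider at a custom URL in your `settings.json`. A minimal sketch, assuming the `lmstudio` provider key and LM Studio's default server address (adjust both if your setup differs):

```json
{
  "language_models": {
    "lmstudio": {
      "api_url": "http://localhost:1234/api/v0"
    }
  }
}
```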

### Mistral {#mistral}

1. Visit the Mistral platform and [create an API key](https://console.mistral.ai/api-keys/)
2. Open the configuration view (`agent: open settings`) and navigate to the Mistral section
3. Enter your Mistral API key

The Mistral API key will be saved in your keychain.

Zed will also use the `MISTRAL_API_KEY` environment variable if it's defined.

#### Custom Models {#mistral-custom-models}

The Zed agent comes pre-configured with several Mistral models (codestral-latest, mistral-large-latest, mistral-medium-latest, mistral-small-latest, open-mistral-nemo, and open-codestral-mamba).
All the default models support tool use.
If you wish to use alternate models or customize their parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "mistral": {
      "api_url": "https://api.mistral.ai/v1",
      "available_models": [
        {
          "name": "mistral-tiny-latest",
          "display_name": "Mistral Tiny",
          "max_tokens": 32000,
          "max_output_tokens": 4096,
          "max_completion_tokens": 1024,
          "supports_tools": true,
          "supports_images": false
        }
      ]
    }
  }
}
```

Custom models will be listed in the model dropdown in the Agent Panel.

### Ollama {#ollama}

Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and verify the installation with `ollama --version`.

1. Download one of the [available models](https://ollama.com/models), for example, `mistral`:

   ```sh
   ollama pull mistral
   ```

2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:

   ```sh
   ollama serve
   ```

3. In the Agent Panel, select one of the Ollama models using the model dropdown.
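
To confirm the server is up before selecting a model, you can list the locally pulled models over Ollama's HTTP API; this sketch assumes the default address of `http://localhost:11434`:

```sh
# Returns JSON describing the models the local Ollama server has pulled.
curl http://localhost:11434/api/tags
```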

#### Ollama Context Length {#ollama-context}

Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models.
Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM are able to use most models out of the box.

See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.

> **Note**: Token counts displayed in the Agent Panel are only estimates and will differ from the model's native tokenizer.

Depending on your hardware or use-case, you may wish to limit or increase the context length for a specific model via your `settings.json`:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "supports_tools": true,
          "supports_thinking": true,
          "supports_images": true
        }
      ]
    }
  }
}
```

If you specify a context length that is too large for your hardware, Ollama will log an error.
You can watch these logs by running `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux).
Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.

You may also optionally specify a value for `keep_alive` for each available model.
This can be an integer (seconds) or alternatively a string duration like "5m", "10m", "1h", "1d", etc.
For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.
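
As a sketch, a model entry combining a custom context window with `keep_alive` might look like this (the model name and values are illustrative):

```json
{
  "name": "mistral",
  "display_name": "Mistral (1h keep-alive)",
  "max_tokens": 32768,
  "keep_alive": "1h"
}
```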

The `supports_tools` option controls whether the model will use additional tools.
If the model is tagged with `tools` in the Ollama catalog, this option should be supplied, and the built-in profiles `Ask` and `Write` can be used.
If the model is not tagged with `tools` in the Ollama catalog, this option can still be supplied with the value `true`; however, be aware that only the `Minimal` built-in profile will work.

The `supports_thinking` option controls whether the model will perform an explicit "thinking" (reasoning) pass before producing its final answer.
If the model is tagged with `thinking` in the Ollama catalog, set this option and you can use it in Zed.

The `supports_images` option enables the model's vision capabilities, allowing it to process images included in the conversation context.
If the model is tagged with `vision` in the Ollama catalog, set this option and you can use it in Zed.

### OpenAI {#openai}

1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the settings view (`agent: open settings`) and go to the OpenAI section
4. Enter your OpenAI API key

The OpenAI API key will be saved in your keychain.

Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.

#### Custom Models {#openai-custom-models}

The Zed agent comes pre-configured to use the latest versions of common models (GPT-5, GPT-5 mini, o4-mini, GPT-4.1, and others).
To use alternate models, such as a preview release, or to control request parameters, add the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-5",
          "display_name": "gpt-5 high",
          "reasoning_effort": "high",
          "max_tokens": 272000,
          "max_completion_tokens": 20000
        },
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        }
      ]
    }
  }
}
```

You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model documentation](https://platform.openai.com/docs/models).
Custom models will be listed in the model dropdown in the Agent Panel.

OpenAI `o1` models should set `max_completion_tokens` as well to avoid incurring high reasoning token costs.
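
A minimal sketch of such an entry (the token limits here are illustrative; check the OpenAI model documentation for your model's actual limits):

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "o1",
          "display_name": "o1",
          "max_tokens": 200000,
          "max_completion_tokens": 20000
        }
      ]
    }
  }
}
```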

### OpenAI API Compatible {#openai-api-compatible}

Zed supports using [OpenAI-compatible APIs](https://platform.openai.com/docs/api-reference/chat) by specifying a custom `api_url` and `available_models` for the OpenAI provider.
This is useful for connecting to other hosted services (like Together AI, Anyscale, etc.) or local models.

You can add a custom, OpenAI-compatible model either via the UI or by editing your `settings.json`.

To do it via the UI, go to the Agent Panel settings (`agent: open settings`) and look for the "Add Provider" button to the right of the "LLM Providers" section title.
Then, fill in the input fields available in the modal.

To do it via your `settings.json`, add the following snippet under `language_models`:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.together.xyz/v1", // Using Together AI as an example
      "available_models": [
        {
          "name": "mistralai/Mixtral-8x7B-Instruct-v0.1",
          "display_name": "Together Mixtral 8x7B",
          "max_tokens": 32768,
          "capabilities": {
            "tools": true,
            "images": false,
            "parallel_tool_calls": false,
            "prompt_cache_key": false
          }
        }
      ]
    }
  }
}
```

By default, OpenAI-compatible models inherit the following capabilities:

- `tools`: true (supports tool/function calling)
- `images`: false (does not support image inputs)
- `parallel_tool_calls`: false (does not support the `parallel_tool_calls` parameter)
- `prompt_cache_key`: false (does not support the `prompt_cache_key` parameter)

Note that LLM API keys aren't stored in your settings file.
So, ensure you have the key set in your environment variables (`OPENAI_API_KEY=<your api key>`) so your settings can pick it up.

### OpenRouter {#openrouter}

OpenRouter provides access to multiple AI models through a single API. It supports tool use for compatible models.

1. Visit [OpenRouter](https://openrouter.ai) and create an account
2. Generate an API key from your [OpenRouter keys page](https://openrouter.ai/keys)
3. Open the settings view (`agent: open settings`) and go to the OpenRouter section
4. Enter your OpenRouter API key

The OpenRouter API key will be saved in your keychain.

Zed will also use the `OPENROUTER_API_KEY` environment variable if it's defined.

#### Custom Models {#openrouter-custom-models}

You can add custom models to the OpenRouter provider by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "open_router": {
      "api_url": "https://openrouter.ai/api/v1",
      "available_models": [
        {
          "name": "google/gemini-2.0-flash-thinking-exp",
          "display_name": "Gemini 2.0 Flash (Thinking)",
          "max_tokens": 200000,
          "max_output_tokens": 8192,
          "supports_tools": true,
          "supports_images": true,
          "mode": {
            "type": "thinking",
            "budget_tokens": 8000
          }
        }
      ]
    }
  }
}
```

The available configuration options for each model are:

- `name` (required): The model identifier used by OpenRouter
- `display_name` (optional): A human-readable name shown in the UI
- `max_tokens` (required): The model's context window size
- `max_output_tokens` (optional): Maximum tokens the model can generate
- `max_completion_tokens` (optional): Maximum completion tokens
- `supports_tools` (optional): Whether the model supports tool/function calling
- `supports_images` (optional): Whether the model supports image inputs
- `mode` (optional): Special mode configuration for thinking models

You can find available models and their specifications on the [OpenRouter models page](https://openrouter.ai/models).

Custom models will be listed in the model dropdown in the Agent Panel.

### Vercel v0 {#vercel-v0}

[Vercel v0](https://vercel.com/docs/v0/api) is an expert model for generating full-stack apps, with framework-aware completions optimized for modern stacks like Next.js and Vercel.
It supports text and image inputs and provides fast streaming responses.

The v0 models are [OpenAI-compatible models](#openai-api-compatible), but Vercel is listed as a first-class provider in the panel's settings view.

To start using it with Zed, ensure you have first created a [v0 API key](https://v0.dev/chat/settings/keys).
Once you have it, paste it directly into the Vercel provider section in the panel's settings view.

You should then find it as `v0-1.5-md` in the model dropdown in the Agent Panel.

### xAI {#xai}

Zed has first-class support for [xAI](https://x.ai/) models. You can use your own API key to access Grok models.

1. [Create an API key in the xAI Console](https://console.x.ai/team/default/api-keys)
2. Open the settings view (`agent: open settings`) and go to the **xAI** section
3. Enter your xAI API key

The xAI API key will be saved in your keychain. Zed will also use the `XAI_API_KEY` environment variable if it's defined.

> **Note:** While the xAI API is OpenAI-compatible, Zed has first-class support for it as a dedicated provider. For the best experience, we recommend using the dedicated `x_ai` provider configuration instead of the [OpenAI API Compatible](#openai-api-compatible) method.

#### Custom Models {#xai-custom-models}

The Zed agent comes pre-configured with common Grok models. If you wish to use alternate models or customize their parameters, you can do so by adding the following to your Zed `settings.json`:

```json
{
  "language_models": {
    "x_ai": {
      "api_url": "https://api.x.ai/v1",
      "available_models": [
        {
          "name": "grok-1.5",
          "display_name": "Grok 1.5",
          "max_tokens": 131072,
          "max_output_tokens": 8192
        },
        {
          "name": "grok-1.5v",
          "display_name": "Grok 1.5V (Vision)",
          "max_tokens": 131072,
          "max_output_tokens": 8192,
          "supports_images": true
        }
      ]
    }
  }
}
```

## Custom Provider Endpoints {#custom-provider-endpoint}

You can use a custom API endpoint for different providers, as long as it's compatible with the provider's API structure.
To do so, add the following to your `settings.json`:

```json
{
  "language_models": {
    "some-provider": {
      "api_url": "http://localhost:11434"
    }
  }
}
```

Currently, `some-provider` can be any of the following values: `anthropic`, `google`, `ollama`, `openai`.

This is the same infrastructure that powers models that are, for example, [OpenAI-compatible](#openai-api-compatible).