Kilo Code CLI supports multiple AI providers, each with its own configuration requirements and optional settings that can be customized to suit your needs. This guide details the configuration fields for each provider, including required and optional parameters.
You can configure providers using:
kilocode config to configure providers interactively

Common Parameters

Description: Parameters that are shared by all providers.
Optional Fields:
enableReasoningEffort (boolean): Enable or disable reasoning for supported models. Many models have no reasoning, dynamic reasoning, or always-on reasoning regardless of this setting. Must be set to true for reasoningEffort or modelMaxThinkingTokens to take effect.
reasoningEffort (text): Specify reasoning effort for supported models. Can be "low", "medium", "high", or "xhigh". Requires enableReasoningEffort to be true; mutually exclusive with modelMaxThinkingTokens.
modelMaxThinkingTokens (number): Specify a reasoning token limit for supported models (mainly Claude models). Requires enableReasoningEffort to be true; mutually exclusive with reasoningEffort.
verbosity (text): Controls the verbosity and length of the model response for supported models (mainly GPT-5.x and Claude Opus 4.5). Also known as output effort. Supported values are "low", "medium", and "high".
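For example, a profile that caps extended thinking can combine these shared fields with a provider's own fields. A minimal sketch (the profile id is arbitrary; the Anthropic fields are documented in the Anthropic section below):
{
  "id": "default",
  "provider": "anthropic",
  "apiKey": "sk-ant-...",
  "apiModelId": "claude-sonnet-4.5",
  "enableReasoningEffort": true,
  "modelMaxThinkingTokens": 16000
}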
The official Kilo Code provider for accessing Kilo Code's managed AI services.
Description: Access Kilo Code's managed AI infrastructure with support for multiple models and organizations.
Required Fields:
kilocodeToken (password): Your Kilo Code authentication token
kilocodeModel (text): The model to use (default: anthropic/claude-sonnet-4.5)
Optional Fields:
kilocodeOrganizationId (text): Organization ID for team accounts (leave empty for personal use)
openRouterSpecificProvider (text): Specific OpenRouter provider to use when routing through OpenRouter
openRouterProviderDataCollection (text): OpenRouter provider data collection preference
allow: Allow data collection by the provider
deny: Deny data collection by the provider
openRouterProviderSort (text): OpenRouter provider sorting preference for model selection
price: Sort by price (lowest first)
throughput: Sort by throughput (highest first)
latency: Sort by latency (lowest first)
openRouterZdr (boolean): Enable OpenRouter Zero Data Retention for enhanced privacy
Example Configuration:
{
"id": "default",
"provider": "kilocode",
"kilocodeToken": "your-token-here",
"kilocodeModel": "anthropic/claude-sonnet-4",
"kilocodeOrganizationId": "org-123456"
}
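When requests are routed through OpenRouter, the routing preferences can be set on the same profile. A sketch using the optional fields listed above (the profile id is arbitrary):
{
  "id": "kilocode-routing",
  "provider": "kilocode",
  "kilocodeToken": "your-token-here",
  "kilocodeModel": "anthropic/claude-sonnet-4.5",
  "openRouterProviderSort": "latency",
  "openRouterProviderDataCollection": "deny",
  "openRouterZdr": true
}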
Default Model: anthropic/claude-sonnet-4.5
Direct integration with Anthropic's Claude API.
Description: Use Claude models directly from Anthropic with your own API key.
Required Fields:
apiKey (password): Your Anthropic API key
apiModelId (text): The Claude model to use (default: claude-sonnet-4.5)
Optional Fields:
anthropicBaseUrl (text): Custom base URL for API requests (leave empty for default)
anthropicUseAuthToken (boolean): Use an authentication token instead of an API key for requests. When enabled, the system will use token-based authentication instead of the standard API key authentication method.
anthropicBeta1MContext (boolean): Enable beta 1M context window support. This allows access to extended context windows for supported models, enabling processing of larger documents and conversations (an example appears below).
Example Configuration:
{
"id": "default",
"provider": "anthropic",
"apiKey": "sk-ant-...",
"apiModelId": "claude-sonnet-4.5",
"anthropicBaseUrl": "",
"anthropicUseAuthToken": false,
"anthropicBeta1MContext": false
}
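If your account has beta access to the 1M context window, the same profile can opt in. A sketch (the profile id is arbitrary):
{
  "id": "anthropic-1m",
  "provider": "anthropic",
  "apiKey": "sk-ant-...",
  "apiModelId": "claude-sonnet-4.5",
  "anthropicBeta1MContext": true
}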
Default Model: claude-sonnet-4.5
Notes:
The anthropicUseAuthToken option is useful for enterprise deployments with custom authentication systems.
The anthropicBeta1MContext feature requires beta access and may incur additional costs.

Native OpenAI API integration.
Description: Use OpenAI's models with native API support.
Required Fields:
openAiNativeApiKey (password): Your OpenAI API key
apiModelId (text): The OpenAI model to use (default: gpt-5-chat-latest)
Optional Fields:
openAiNativeBaseUrl (text): Custom base URL for API requests (leave empty for default)
openAiNativeServiceTier (text): Service tier for request prioritization and latency optimization
auto (default): Let OpenAI automatically select the best tier based on current system load and model availability
default: Use standard processing with balanced performance and cost
flex: Cost-optimized processing with variable latency, ideal for non-time-sensitive workloads
priority: Fastest processing with higher priority in the queue, recommended for latency-sensitive applications
Example Configuration:
{
"id": "default",
"provider": "openai-native",
"openAiNativeApiKey": "sk-...",
"apiModelId": "gpt-5-chat-latest",
"openAiNativeBaseUrl": "",
"openAiNativeServiceTier": "auto"
}
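For non-time-sensitive workloads where cost matters more than latency, the same profile could pin the flex tier. A sketch using the tier values listed above (the profile id is arbitrary):
{
  "id": "openai-flex",
  "provider": "openai-native",
  "openAiNativeApiKey": "sk-...",
  "apiModelId": "gpt-5-chat-latest",
  "openAiNativeServiceTier": "flex"
}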
Default Model: gpt-5-chat-latest
Notes:
The auto tier is recommended for most users as it balances performance and cost automatically.

Access multiple AI models through OpenRouter's unified API.
Description: Use OpenRouter to access various AI models from different providers through a single API.
Required Fields:
openRouterApiKey (password): Your OpenRouter API key
openRouterModelId (text): The model identifier (default: anthropic/claude-3-5-sonnet)
Optional Fields:
openRouterBaseUrl (text): Custom base URL (leave empty for default)
openRouterSpecificProvider (text): Specific OpenRouter provider to use for routing requests. When specified, OpenRouter will route your request to this specific provider instead of automatically selecting one. Useful when you want to ensure requests go to a particular infrastructure provider.
openRouterUseMiddleOutTransform (boolean): Enable the middle-out transform for optimized request routing. This feature can improve routing efficiency and reduce latency by using advanced request transformation techniques.
openRouterProviderDataCollection (text): OpenRouter provider data collection preference
allow: Allow data collection by the provider for model improvement and analytics
deny: Deny data collection by the provider to maintain stricter privacy controls
openRouterProviderSort (text): OpenRouter provider sorting preference for model selection
price: Sort by price (lowest first) - optimizes for cost efficiency
throughput: Sort by throughput (highest first) - optimizes for maximum tokens per second
latency: Sort by latency (lowest first) - optimizes for fastest response time
openRouterZdr (boolean): Enable OpenRouter Zero Data Retention (ZDR) for enhanced privacy. When enabled, OpenRouter will not store any request or response data, ensuring maximum privacy and compliance with data protection requirements.
Example Configuration:
{
"id": "default",
"provider": "openrouter",
"openRouterApiKey": "sk-or-...",
"openRouterModelId": "anthropic/claude-3-5-sonnet",
"openRouterBaseUrl": "",
"openRouterSpecificProvider": "anthropic",
"openRouterUseMiddleOutTransform": false,
"openRouterProviderDataCollection": "deny",
"openRouterProviderSort": "latency",
"openRouterZdr": true
}
Default Model: anthropic/claude-3-5-sonnet
AWS Bedrock for accessing foundation models on AWS infrastructure.
Description: Use AWS Bedrock to access various foundation models with AWS security and compliance.
Required Fields:
awsAccessKey (password): Your AWS access key ID
awsSecretKey (password): Your AWS secret access key
awsRegion (text): AWS region (default: us-east-1)
apiModelId (text): The model to use (default: anthropic.claude-sonnet-4.5-20250929-v1:0)
Optional Fields:
awsSessionToken (password): AWS session token for temporary credentials
awsUseCrossRegionInference (boolean): Enable cross-region inference to access models in different AWS regions
awsUsePromptCache (boolean): Enable prompt caching to reduce costs and latency for repeated prompts. When enabled, Bedrock caches portions of your prompts that are reused across requests, significantly reducing both API costs and response times for subsequent requests with similar context.
awsProfile (string): AWS profile name from your credentials file (typically ~/.aws/credentials). Use this to specify which AWS profile to use for authentication instead of providing access keys directly.
awsUseProfile (boolean): Use an AWS profile from the credentials file instead of access keys. When enabled, the system authenticates using the profile specified in awsProfile rather than awsAccessKey and awsSecretKey.
awsApiKey (string): AWS API key for alternative authentication methods. This is used for specific authentication scenarios that require API key-based access.
awsUseApiKey (boolean): Use API key authentication instead of access keys. Enable this when you want to authenticate using awsApiKey rather than the standard AWS access key/secret key pair (an example appears below).
awsCustomArn (string): Custom Amazon Resource Name (ARN) for cross-account access or custom model access. Use this when you need to access models in a different AWS account or when using custom fine-tuned models.
awsModelContextWindow (number): Override the model's default context window size with a custom token limit. Must be a positive integer. Use this when you need to limit or extend the context size beyond the model's default.
awsBedrockEndpointEnabled (boolean): Enable a custom Bedrock endpoint. Set to true when you want to use a custom endpoint URL instead of the standard AWS Bedrock endpoint.
awsBedrockEndpoint (string): Custom Bedrock endpoint URL. Specify a custom endpoint when using VPC endpoints, private endpoints, or region-specific endpoints. Only used when awsBedrockEndpointEnabled is true.
awsBedrock1MContext (boolean): Enable 1M token context window support for compatible models. When enabled, allows access to extended context windows (up to 1 million tokens) for models that support this feature, enabling processing of extremely large documents and conversations.
Example Configuration:
{
"id": "default",
"provider": "bedrock",
"awsAccessKey": "AKIA...",
"awsSecretKey": "...",
"awsRegion": "us-east-1",
"apiModelId": "anthropic.claude-sonnet-4.5-20250929-v1:0",
"awsSessionToken": "",
"awsUseCrossRegionInference": false,
"awsUsePromptCache": true,
"awsBedrock1MContext": false
}
Example Configuration with AWS Profile:
{
"id": "bedrock-profile",
"provider": "bedrock",
"awsProfile": "my-aws-profile",
"awsUseProfile": true,
"awsRegion": "us-east-1",
"apiModelId": "anthropic.claude-sonnet-4.5-20250929-v1:0",
"awsUsePromptCache": true
}
Example Configuration with Custom Endpoint:
{
"id": "bedrock-custom",
"provider": "bedrock",
"awsAccessKey": "AKIA...",
"awsSecretKey": "...",
"awsRegion": "us-east-1",
"apiModelId": "anthropic.claude-sonnet-4.5-20250929-v1:0",
"awsBedrockEndpointEnabled": true,
"awsBedrockEndpoint": "https://bedrock-runtime.us-east-1.amazonaws.com",
"awsModelContextWindow": 200000
}
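For the API key authentication path, a profile might look like this. A sketch using the awsApiKey and awsUseApiKey fields described above (the profile id is arbitrary):
{
  "id": "bedrock-api-key",
  "provider": "bedrock",
  "awsApiKey": "...",
  "awsUseApiKey": true,
  "awsRegion": "us-east-1",
  "apiModelId": "anthropic.claude-sonnet-4.5-20250929-v1:0"
}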
Default Model: anthropic.claude-sonnet-4.5-20250929-v1:0
Notes:
Three authentication methods are supported: static credentials (awsAccessKey and awsSecretKey), a credentials-file profile (awsProfile with awsUseProfile enabled), or an API key (awsApiKey with awsUseApiKey enabled).

Google's Gemini AI models via direct API access.
Description: Access Google's Gemini models directly with your API key.
Required Fields:
geminiApiKey (password): Your Google AI API key
apiModelId (text): The model to use (default: gemini-2.5-flash-preview-04-17)
Optional Fields:
googleGeminiBaseUrl (text): Custom base URL (leave empty for default)
enableUrlContext (boolean): Allows the model to access and process URLs in prompts. When enabled, the model can fetch and analyze content from URLs provided in the conversation, enabling web-based research and content analysis capabilities.
enableGrounding (boolean): Enables Google Search grounding for factual responses. When enabled, the model can use Google Search to ground its responses in real-time information, improving accuracy for factual queries and reducing hallucinations.
Example Configuration:
{
"id": "default",
"provider": "gemini",
"geminiApiKey": "AIza...",
"apiModelId": "gemini-2.5-flash-preview-04-17",
"googleGeminiBaseUrl": "",
"enableUrlContext": false,
"enableGrounding": false
}
Example Configuration with URL Context and Grounding:
{
"id": "gemini-enhanced",
"provider": "gemini",
"geminiApiKey": "AIza...",
"apiModelId": "gemini-2.5-flash-preview-04-17",
"enableUrlContext": true,
"enableGrounding": true
}
Default Model: gemini-2.5-flash-preview-04-17
Google Cloud Vertex AI for enterprise-grade AI deployment.
Description: Use Google Cloud's Vertex AI platform for accessing AI models with enterprise features.
Required Fields:
vertexProjectId (text): Your Google Cloud project ID
vertexRegion (text): Google Cloud region (default: us-central1)
apiModelId (text): The model to use (default: claude-4.5-sonnet)
Authentication (choose one):
vertexJsonCredentials (password): JSON service account credentials
vertexKeyFile (text): Path to a service account key file
Optional Fields:
enableUrlContext (boolean): Allows the model to access and process URLs in prompts. When enabled, the model can fetch and analyze content from URLs provided in the conversation, enabling web-based research and content analysis capabilities.
enableGrounding (boolean): Enables Google Search grounding for factual responses. When enabled, the model can use Google Search to ground its responses in real-time information, improving accuracy for factual queries and reducing hallucinations.
Example Configuration:
{
"id": "default",
"provider": "vertex",
"vertexProjectId": "my-project-123",
"vertexRegion": "us-central1",
"apiModelId": "claude-4.5-sonnet",
"vertexJsonCredentials": "{...}",
"vertexKeyFile": "",
"enableUrlContext": false,
"enableGrounding": false
}
Example Configuration with URL Context and Grounding:
{
"id": "vertex-enhanced",
"provider": "vertex",
"vertexProjectId": "my-project-123",
"vertexRegion": "us-central1",
"apiModelId": "claude-4.5-sonnet",
"vertexJsonCredentials": "{...}",
"enableUrlContext": true,
"enableGrounding": true
}
Default Model: claude-4.5-sonnet
Local Claude Code CLI integration.
Description: Use the Claude Code CLI tool for local AI interactions.
Required Fields:
claudeCodePath (text): Path to the Claude Code executable
apiModelId (text): The model to use (default: claude-sonnet-4-5)
claudeCodeMaxOutputTokens (text): Maximum output tokens (default: 8000)
Example Configuration:
{
"id": "default",
"provider": "claude-code",
"claudeCodePath": "/usr/local/bin/claude-code",
"apiModelId": "claude-sonnet-4-5",
"claudeCodeMaxOutputTokens": "8000"
}
Default Model: claude-sonnet-4-5
Mistral AI's language models.
Description: Access Mistral's powerful language models including Codestral for code generation.
Required Fields:
mistralApiKey (password): Your Mistral API key
apiModelId (text): The model to use (default: magistral-medium-latest)
Optional Fields:
mistralCodestralUrl (text): Custom Codestral base URL (leave empty for default)
Example Configuration:
{
"id": "default",
"provider": "mistral",
"mistralApiKey": "...",
"apiModelId": "magistral-medium-latest",
"mistralCodestralUrl": ""
}
Default Model: magistral-medium-latest
Groq's ultra-fast LPU inference.
Description: Use Groq's Language Processing Unit (LPU) for extremely fast inference.
Required Fields:
groqApiKey (password): Your Groq API key
apiModelId (text): The model to use (default: llama-3.3-70b-versatile)
Example Configuration:
{
"id": "default",
"provider": "groq",
"groqApiKey": "gsk_...",
"apiModelId": "llama-3.3-70b-versatile"
}
Default Model: llama-3.3-70b-versatile
DeepSeek's AI models.
Description: Access DeepSeek's language models optimized for coding and reasoning.
Required Fields:
deepSeekApiKey (password): Your DeepSeek API key
apiModelId (text): The model to use (default: deepseek-chat)
Example Configuration:
{
"id": "default",
"provider": "deepseek",
"deepSeekApiKey": "...",
"apiModelId": "deepseek-chat"
}
Default Model: deepseek-chat
xAI's Grok models.
Description: Access xAI's Grok language models.
Required Fields:
xaiApiKey (password): Your xAI API key
apiModelId (text): The model to use (default: grok-code-fast-1)
Example Configuration:
{
"id": "default",
"provider": "xai",
"xaiApiKey": "...",
"apiModelId": "grok-code-fast-1"
}
Default Model: grok-code-fast-1
Cerebras AI inference platform.
Description: Use Cerebras' wafer-scale AI inference platform.
Required Fields:
cerebrasApiKey (password): Your Cerebras API key
apiModelId (text): The model to use (default: qwen-3-coder-480b-free)
Example Configuration:
{
"id": "default",
"provider": "cerebras",
"cerebrasApiKey": "...",
"apiModelId": "qwen-3-coder-480b-free"
}
Default Model: qwen-3-coder-480b-free
Local Ollama instance for running models locally.
Description: Run AI models locally using Ollama.
Required Fields:
ollamaBaseUrl (text): Ollama server URL (default: http://localhost:11434)
ollamaModelId (text): Model identifier (default: llama3.2)
Optional Fields:
ollamaApiKey (password): API key if authentication is enabled
ollamaNumCtx (number): Context window size for the model. Controls the maximum number of tokens the model can process at once. Common values include:
2048: Small context, lower memory usage
4096: Standard context for most tasks
8192: Extended context for longer conversations
16384: Large context for complex tasks
32768: Very large context (requires significant memory)
Higher values allow processing longer conversations and larger documents but require more memory. If not specified, Ollama uses the model's default context size.
Example Configuration:
{
"id": "default",
"provider": "ollama",
"ollamaBaseUrl": "http://localhost:11434",
"ollamaModelId": "llama3.2",
"ollamaApiKey": "",
"ollamaNumCtx": 8192
}
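On memory-constrained machines, a smaller context window keeps RAM usage down. A sketch using the common values listed above (the profile id is arbitrary):
{
  "id": "ollama-small-ctx",
  "provider": "ollama",
  "ollamaBaseUrl": "http://localhost:11434",
  "ollamaModelId": "llama3.2",
  "ollamaNumCtx": 4096
}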
Default Model: llama3.2
Notes:
The ollamaNumCtx parameter directly affects memory usage - ensure your system has sufficient RAM for larger context windows.

LM Studio for local model inference.
Description: Use LM Studio to run models locally with a user-friendly interface.
Required Fields:
lmStudioBaseUrl (text): LM Studio server URL (default: http://localhost:1234/v1)
lmStudioModelId (text): Model identifier (default: local-model)
Optional Fields:
lmStudioDraftModelId (string): Draft model ID for speculative decoding. Specifies a smaller, faster model that generates initial token predictions which are then verified by the main model. This can significantly improve inference speed while maintaining output quality. The draft model should be compatible with the main model's vocabulary, and is typically a smaller version of the same model family.
lmStudioSpeculativeDecodingEnabled (boolean): Enable speculative decoding for faster inference. When enabled along with a draft model, generation uses a two-stage process: the draft model proposes multiple tokens ahead, and the main model verifies them in parallel. This technique can reduce latency by 2-3x for compatible model pairs without sacrificing quality.
Example Configuration:
{
"id": "default",
"provider": "lmstudio",
"lmStudioBaseUrl": "http://localhost:1234/v1",
"lmStudioModelId": "local-model",
"lmStudioDraftModelId": "local-model-draft",
"lmStudioSpeculativeDecodingEnabled": true
}
Example Configuration without Speculative Decoding:
{
"id": "default",
"provider": "lmstudio",
"lmStudioBaseUrl": "http://localhost:1234/v1",
"lmStudioModelId": "local-model",
"lmStudioSpeculativeDecodingEnabled": false
}
Default Model: local-model
Notes:
To use speculative decoding, set lmStudioSpeculativeDecodingEnabled to true and specify a compatible draft model in lmStudioDraftModelId.

VSCode's built-in language model API.
Description: Use VSCode's native language model capabilities (e.g., GitHub Copilot).
Required Fields:
vsCodeLmModelSelector: Model selector identifying the vendor and family (specified as an object, as shown below)
Example Configuration:
{
"id": "default",
"provider": "vscode-lm",
"vsCodeLmModelSelector": {
"vendor": "copilot",
"family": "gpt-4o"
}
}
Default Model: copilot-gpt-4o
OpenAI API integration (alternative configuration).
Description: Alternative OpenAI integration with simplified configuration and support for Azure OpenAI Service.
Required Fields:
openAiApiKey (password): Your OpenAI API key
openAiModelId (text): The model to use (default: gpt-4o)
Optional Fields:
openAiBaseUrl (text): Custom base URL for OpenAI API requests (leave empty for default). When using Azure OpenAI, set this to your Azure endpoint URL (e.g., https://your-resource.openai.azure.com)
openAiLegacyFormat (boolean): Use the legacy API format for compatibility with older OpenAI API versions. Enable this if you're using an older API version or a proxy that expects the legacy format.
openAiR1FormatEnabled (boolean): Enable the R1 format for reasoning models that support extended thinking capabilities. This format is optimized for models like o1 and o1-mini that perform chain-of-thought reasoning.
openAiUseAzure (boolean): Use Azure OpenAI Service instead of the standard OpenAI API. When enabled, ensure you set openAiBaseUrl to your Azure endpoint and azureApiVersion to a valid API version.
azureApiVersion (string): Azure OpenAI API version (e.g., 2024-02-15-preview, 2023-05-15). Required when openAiUseAzure is true. See Azure OpenAI API versions for available versions.
openAiStreamingEnabled (boolean): Enable streaming responses for real-time token generation. When enabled, responses are streamed as they're generated rather than waiting for the complete response.
openAiHeaders (object): Custom HTTP headers to include in OpenAI API requests. Useful for adding authentication headers, tracking headers, or other custom metadata. Example: {"X-Custom-Header": "value", "X-Request-ID": "123"}
Example Configuration:
{
"id": "default",
"provider": "openai",
"openAiApiKey": "sk-...",
"openAiModelId": "gpt-4o",
"openAiBaseUrl": "",
"openAiLegacyFormat": false,
"openAiR1FormatEnabled": false,
"openAiUseAzure": false,
"azureApiVersion": "",
"openAiStreamingEnabled": true,
"openAiHeaders": {}
}
Example Azure OpenAI Configuration:
{
"id": "azure-openai",
"provider": "openai",
"openAiApiKey": "your-azure-api-key",
"openAiModelId": "gpt-4",
"openAiBaseUrl": "https://your-resource.openai.azure.com",
"openAiUseAzure": true,
"azureApiVersion": "2024-02-15-preview",
"openAiStreamingEnabled": true
}
Example Configuration with Custom Headers:
{
"id": "openai-with-headers",
"provider": "openai",
"openAiApiKey": "sk-...",
"openAiModelId": "gpt-4o",
"openAiHeaders": {
"X-Organization-ID": "org-123456",
"X-Request-Source": "kilocode-cli"
}
}
Default Model: gpt-4o
Glama AI platform.
Description: Access AI models through the Glama platform.
Required Fields:
glamaApiKey (password): Your Glama API key
glamaModelId (text): Model identifier (default: llama-3.1-70b-versatile)
Example Configuration:
{
"id": "default",
"provider": "glama",
"glamaApiKey": "...",
"glamaModelId": "llama-3.1-70b-versatile"
}
Default Model: llama-3.1-70b-versatile
HuggingFace Inference API.
Description: Access models hosted on HuggingFace's inference infrastructure.
Required Fields:
huggingFaceApiKey (password): Your HuggingFace API token
huggingFaceModelId (text): Model identifier (default: meta-llama/Llama-2-70b-chat-hf)
huggingFaceInferenceProvider (text): Inference provider (default: auto)
Example Configuration:
{
"id": "default",
"provider": "huggingface",
"huggingFaceApiKey": "hf_...",
"huggingFaceModelId": "meta-llama/Llama-2-70b-chat-hf",
"huggingFaceInferenceProvider": "auto"
}
Default Model: meta-llama/Llama-2-70b-chat-hf
Notes:
huggingFaceInferenceProvider can be auto, hf-inference, or specific endpoints.

LiteLLM proxy for unified model access.
Description: Use LiteLLM as a proxy to access multiple AI providers through a unified interface.
Required Fields:
litellmBaseUrl (text): LiteLLM proxy URL
litellmApiKey (password): API key for the proxy
litellmModelId (text): Model identifier (default: gpt-4o)
Optional Fields:
litellmUsePromptCache (boolean): Enable prompt caching to reduce costs and improve performance for repeated prompts. When enabled, LiteLLM caches portions of your prompts that are reused across requests, significantly reducing both API costs and response times for subsequent requests with similar context. This is particularly beneficial for applications with repeated system prompts, documentation, or other static context.
Example Configuration:
{
"id": "default",
"provider": "litellm",
"litellmBaseUrl": "http://localhost:8000",
"litellmApiKey": "...",
"litellmModelId": "gpt-4o",
"litellmUsePromptCache": true
}
Example Configuration without Prompt Caching:
{
"id": "litellm-no-cache",
"provider": "litellm",
"litellmBaseUrl": "http://localhost:8000",
"litellmApiKey": "...",
"litellmModelId": "gpt-4o",
"litellmUsePromptCache": false
}
Default Model: gpt-4o
Moonshot AI platform.
Description: Access Moonshot AI's language models.
Required Fields:
moonshotBaseUrl (text): Moonshot API base URL (default: https://api.moonshot.ai/v1)
moonshotApiKey (password): Your Moonshot API key
apiModelId (text): The model to use (default: kimi-k2-0711-preview)
Example Configuration:
{
"id": "default",
"provider": "moonshot",
"moonshotBaseUrl": "https://api.moonshot.ai/v1",
"moonshotApiKey": "...",
"apiModelId": "kimi-k2-0711-preview"
}
Default Model: kimi-k2-0711-preview
Doubao AI platform.
Description: Access Doubao's AI models.
Required Fields:
doubaoApiKey (password): Your Doubao API key
apiModelId (text): The model to use (default: doubao-seed-1-6-250615)
Example Configuration:
{
"id": "default",
"provider": "doubao",
"doubaoApiKey": "...",
"apiModelId": "doubao-seed-1-6-250615"
}
Default Model: doubao-seed-1-6-250615
Chutes AI platform.
Description: Access AI models through the Chutes platform.
Required Fields:
chutesApiKey (password): Your Chutes API key
apiModelId (text): The model to use (default: deepseek-ai/DeepSeek-R1-0528)
Example Configuration:
{
"id": "default",
"provider": "chutes",
"chutesApiKey": "...",
"apiModelId": "deepseek-ai/DeepSeek-R1-0528"
}
Default Model: deepseek-ai/DeepSeek-R1-0528
SambaNova AI inference platform.
Description: Use SambaNova's AI inference platform for fast model execution.
Required Fields:
sambaNovaApiKey (password): Your SambaNova API key
apiModelId (text): The model to use (default: Meta-Llama-3.1-70B-Instruct)
Example Configuration:
{
"id": "default",
"provider": "sambanova",
"sambaNovaApiKey": "...",
"apiModelId": "Meta-Llama-3.1-70B-Instruct"
}
Default Model: Meta-Llama-3.1-70B-Instruct
Fireworks AI platform.
Description: Access models through Fireworks AI's fast inference platform.
Required Fields:
fireworksApiKey (password): Your Fireworks API key
apiModelId (text): The model to use (default: accounts/fireworks/models/kimi-k2-instruct-0905)
Example Configuration:
{
"id": "default",
"provider": "fireworks",
"fireworksApiKey": "...",
"apiModelId": "accounts/fireworks/models/kimi-k2-instruct-0905"
}
Default Model: accounts/fireworks/models/kimi-k2-instruct-0905
Featherless AI platform.
Description: Access AI models through the Featherless platform.
Required Fields:
featherlessApiKey (password): Your Featherless API key
apiModelId (text): The model to use (default: deepseek-ai/DeepSeek-V3-0324)
Example Configuration:
{
"id": "default",
"provider": "featherless",
"featherlessApiKey": "...",
"apiModelId": "deepseek-ai/DeepSeek-V3-0324"
}
Default Model: deepseek-ai/DeepSeek-V3-0324
DeepInfra's serverless AI inference.
Description: Use DeepInfra for serverless access to various AI models.
Required Fields:
deepInfraApiKey (password): Your DeepInfra API key
deepInfraModelId (text): Model identifier (default: meta-llama/Meta-Llama-3.1-70B-Instruct)
Optional Fields:
deepInfraBaseUrl (text): Custom base URL for DeepInfra API requests. Use this when you need to connect to a different DeepInfra endpoint or a custom proxy. Leave empty to use the default DeepInfra API URL (https://api.deepinfra.com/v1/openai).
Example Configuration:
{
"id": "default",
"provider": "deepinfra",
"deepInfraApiKey": "...",
"deepInfraModelId": "meta-llama/Meta-Llama-3.1-70B-Instruct"
}
Example Configuration with Custom Base URL:
{
"id": "deepinfra-custom",
"provider": "deepinfra",
"deepInfraApiKey": "...",
"deepInfraModelId": "meta-llama/Meta-Llama-3.1-70B-Instruct",
"deepInfraBaseUrl": "https://custom-endpoint.deepinfra.com/v1/openai"
}
Default Model: meta-llama/Meta-Llama-3.1-70B-Instruct
IO Intelligence platform.
Description: Access AI models through the IO Intelligence platform.
Required Fields:
ioIntelligenceApiKey (password): Your IO Intelligence API key
ioIntelligenceModelId (text): Model identifier (default: gpt-4o)
Example Configuration:
{
"id": "default",
"provider": "io-intelligence",
"ioIntelligenceApiKey": "...",
"ioIntelligenceModelId": "gpt-4o"
}
Default Model: gpt-4o
Qwen Code AI models.
Description: Access Qwen's code-specialized models using OAuth authentication.
Required Fields:
qwenCodeOauthPath (text): Path to the OAuth credentials file (default: ~/.qwen/oauth_creds.json)
apiModelId (text): The model to use (default: qwen3-coder-plus)
Example Configuration:
{
"id": "default",
"provider": "qwen-code",
"qwenCodeOauthPath": "~/.qwen/oauth_creds.json",
"apiModelId": "qwen3-coder-plus"
}
Default Model: qwen3-coder-plus
ZAI AI platform.
Description: Access AI models through the ZAI platform with support for both international and China-based API endpoints.
Required Fields:
zaiApiKey (password): Your ZAI API key
zaiApiLine (text): API line identifier (default: international_coding)
apiModelId (text): The model to use (default: glm-4.6)
Available API Lines:
The zaiApiLine parameter determines which API endpoint and region to use:
international_coding (default): International Coding Plan (https://api.z.ai/api/coding/paas/v4)
international: International Standard (https://api.z.ai/api/paas/v4)
china_coding: China Coding Plan (https://open.bigmodel.cn/api/coding/paas/v4)
china: China Standard (https://open.bigmodel.cn/api/paas/v4)
Example Configuration:
{
"id": "default",
"provider": "zai",
"zaiApiKey": "...",
"zaiApiLine": "international_coding",
"apiModelId": "glm-4.6"
}
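A profile pointed at the China Coding Plan line differs only in zaiApiLine. A sketch using the line identifiers listed above (the profile id is arbitrary):
{
  "id": "zai-china",
  "provider": "zai",
  "zaiApiKey": "...",
  "zaiApiLine": "china_coding",
  "apiModelId": "glm-4.6"
}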
Default Model: glm-4.6
Unbound AI platform.
Description: Access AI models through the Unbound platform.
Required Fields:
unboundApiKey (password): Your Unbound API key
unboundModelId (text): Model identifier (default: gpt-4o)
Example Configuration:
{
"id": "default",
"provider": "unbound",
"unboundApiKey": "...",
"unboundModelId": "gpt-4o"
}
Default Model: gpt-4o
Requesty AI platform.
Description: Access AI models through the Requesty platform.
Required Fields:
requestyApiKey (password): Your Requesty API key
requestyModelId (text): Model identifier (default: gpt-4o)
Optional Fields:
requestyBaseUrl (text): Custom base URL (leave empty for default)
Example Configuration:
{
"id": "default",
"provider": "requesty",
"requestyApiKey": "...",
"requestyBaseUrl": "",
"requestyModelId": "gpt-4o"
}
Default Model: gpt-4o
Roo AI platform.
Description: Access AI models through the Roo platform.
Required Fields:
apiModelId (text): Model identifier (default: deepseek-ai/DeepSeek-R1-0528)
Example Configuration:
{
"id": "default",
"provider": "roo",
"apiModelId": "deepseek-ai/DeepSeek-R1-0528"
}
Default Model: deepseek-ai/DeepSeek-R1-0528
Vercel AI Gateway for unified model access.
Description: Use Vercel's AI Gateway to access multiple AI providers.
Required Fields:
vercelAiGatewayApiKey (password): Your Vercel AI Gateway API key
vercelAiGatewayModelId (text): Model identifier (default: gpt-4o)
Example Configuration:
{
"id": "default",
"provider": "vercel-ai-gateway",
"vercelAiGatewayApiKey": "...",
"vercelAiGatewayModelId": "gpt-4o"
}
Default Model: gpt-4o
Virtual quota management with automatic fallback.
Description: Manage multiple provider profiles with automatic fallback when quotas are exceeded.
Required Fields:
profiles (array): Array of provider profiles with quota configurations
Example Configuration:
{
"id": "default",
"provider": "virtual-quota-fallback",
"profiles": [
{
"provider": "anthropic",
"quota": 1000000,
"config": {
"apiKey": "...",
"apiModelId": "claude-3-5-sonnet-20241022"
}
},
{
"provider": "openai",
"quota": 500000,
"config": {
"openAiApiKey": "...",
"apiModelId": "gpt-4o"
}
}
]
}
Default Model: gpt-4o
Human-in-the-loop relay for manual responses.
Description: Route requests to a human operator for manual responses.
Required Fields:
apiModelId (text): Model identifier (fixed value: human)
Example Configuration:
{
"id": "default",
"provider": "human-relay",
"apiModelId": "human"
}
Default Model: human
Fake AI provider for testing and development.
Description: Mock AI provider for testing purposes without making actual API calls.
Required Fields:
apiModelId (text): Model identifier (fixed value: fake-model)
Example Configuration:
{
"id": "default",
"provider": "fake-ai",
"apiModelId": "fake-model"
}
Default Model: fake-model
OVHcloud AI Endpoints inference provider.
Description: Use OVHcloud's cloud computing platform for accessing various open-source models, with GDPR compliance and data sovereignty.
Required Field:
ovhCloudAiEndpointsModelId (text): Model identifier (default: gpt-oss-120b)
Optional Fields:
ovhCloudAiEndpointsApiKey (password): Your OVHcloud AI Endpoints API key
If you do not provide an API key, the service can be used for free with a rate limit.
ovhCloudAiEndpointsBaseUrl (text): Custom base URL for OVHcloud AI Endpoints API requests. Use this when you need to connect to a different OVHcloud region or a custom endpoint. Leave empty to use the default OVHcloud AI Endpoints URL.
Example Configuration:
{
"id": "default",
"provider": "ovhcloud",
"ovhCloudAiEndpointsApiKey": "your-api-key",
"ovhCloudAiEndpointsModelId": "gpt-oss-120b"
}
Example Configuration with Custom Base URL:
{
"id": "ovhcloud-custom",
"provider": "ovhcloud",
"ovhCloudAiEndpointsApiKey": "your-api-key",
"ovhCloudAiEndpointsModelId": "gpt-oss-120b",
"ovhCloudAiEndpointsBaseUrl": "https://custom-endpoint.ovhcloud.com/v1"
}
Default Model: gpt-oss-120b
Notes:
API keys can be created in the OVHcloud Control Panel, in the Public Cloud > AI & Machine Learning section, then in AI Endpoints.

Inception Labs AI platform.
Description: Access AI models through the Inception Labs platform.
Required Fields:
inceptionLabsApiKey (password): Your Inception Labs API key
inceptionLabsModelId (text): Model identifier (default: gpt-4o)
Optional Fields:
inceptionLabsBaseUrl (text): Custom base URL (leave empty for default)
Example Configuration:
{
"id": "default",
"provider": "inception",
"inceptionLabsApiKey": "...",
"inceptionLabsModelId": "gpt-4o",
"inceptionLabsBaseUrl": ""
}
Default Model: gpt-4o
Synthetic AI provider.
Description: Access AI models through the Synthetic platform.
Required Fields:
syntheticApiKey (password): Your Synthetic API key
apiModelId (text): Model identifier (default: synthetic-model)
Example Configuration:
{
"id": "default",
"provider": "synthetic",
"syntheticApiKey": "...",
"apiModelId": "synthetic-model"
}
Default Model: synthetic-model
MiniMax AI platform.
Description: Access MiniMax's AI models.
Required Fields:
minimaxApiKey (password): Your MiniMax API key
minimaxBaseUrl (text): MiniMax API base URL (default: https://api.minimax.io/anthropic)
apiModelId (text): The model to use (default: MiniMax-M2)
Example Configuration:
{
"id": "default",
"provider": "minimax",
"minimaxBaseUrl": "https://api.minimax.io/anthropic",
"minimaxApiKey": "...",
"apiModelId": "MiniMax-M2"
}
Default Model: MiniMax-M2
Notes:
The MiniMax API is available on both .io and .com domains.

For issues or questions about provider configuration: