Provider Setup Guide
Step-by-step setup instructions for every LLM provider SF supports. If you ran the onboarding wizard (sf config) and picked a provider, you may already be configured — check with /model inside a session.
Table of Contents
- Quick Reference
- Built-in Providers
- Local Providers
- Custom OpenAI-Compatible Endpoints
- Common Pitfalls
- Verifying Your Setup
Quick Reference
| Provider | Auth Method | Env Variable | Config File |
|---|---|---|---|
| Anthropic | API key | ANTHROPIC_API_KEY | — |
| OpenAI | API key | OPENAI_API_KEY | — |
| Google Gemini | API key | GEMINI_API_KEY | — |
| OpenRouter | API key | OPENROUTER_API_KEY | Optional models.json |
| Groq | API key | GROQ_API_KEY | — |
| xAI | API key | XAI_API_KEY | — |
| Mistral | API key | MISTRAL_API_KEY | — |
| GitHub Copilot | OAuth | GH_TOKEN | — |
| Amazon Bedrock | IAM credentials | AWS_PROFILE or AWS_ACCESS_KEY_ID | — |
| Vertex AI | ADC | GOOGLE_APPLICATION_CREDENTIALS | — |
| Azure OpenAI | API key | AZURE_OPENAI_API_KEY | — |
| Ollama | None (local) | — | models.json required |
| LM Studio | None (local) | — | models.json required |
| vLLM / SGLang | None (local) | — | models.json required |
Built-in Providers
Built-in providers have models pre-registered in SF. You only need to supply credentials.
Anthropic (Claude)
Recommended. Anthropic models have the deepest integration: built-in web search, extended thinking, and prompt caching.
Option A — API key (recommended):
export ANTHROPIC_API_KEY="sk-ant-..."
Or run sf config and paste your key when prompted.
Get a key: console.anthropic.com/settings/keys
Note: SF does not support browser-based OAuth sign-in for Anthropic. Use an API key or a configured provider/runtime adapter.
Runtime boundary: SF may use Claude Code, Codex, or Gemini CLI core as model/runtime adapters when explicitly configured. These adapters are not project MCP dependencies, and SF does not expose its own workflow as an MCP server. Run SF directly with sf or /sf autonomous; reserve MCP configuration for external tools that SF may call.
OpenAI
export OPENAI_API_KEY="sk-..."
Or run sf config and choose "Paste an API key" then "OpenAI".
Get a key: platform.openai.com/api-keys
Google Gemini
export GEMINI_API_KEY="..."
Get a key: aistudio.google.com/app/apikey
OpenRouter
OpenRouter aggregates 200+ models from multiple providers behind a single API key.
Step 1 — Get your API key:
Go to openrouter.ai/keys and create a key.
Step 2 — Set the key:
export OPENROUTER_API_KEY="sk-or-..."
Or run sf config, choose "Paste an API key", then "OpenRouter".
Step 3 — Switch to an OpenRouter model:
Inside an SF session, type /model and select an OpenRouter model. Models are prefixed with openrouter/ (e.g., openrouter/anthropic/claude-sonnet-4).
Optional — Add custom OpenRouter models via models.json:
If you want models not in the built-in list, add them to ~/.sf/agent/models.json:
{
  "providers": {
    "openrouter": {
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "OPENROUTER_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "meta-llama/llama-3.3-70b",
          "name": "Llama 3.3 70B (OpenRouter)",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 131072,
          "maxTokens": 32768,
          "cost": { "input": 0.3, "output": 0.3, "cacheRead": 0, "cacheWrite": 0 }
        }
      ]
    }
  }
}
Note: the apiKey field here is the name of the environment variable, not the literal key. SF resolves it automatically. You can also use a literal value or a shell command (see Value Resolution).
Optional — Route through specific providers:
Use modelOverrides to control which upstream provider OpenRouter uses:
{
  "providers": {
    "openrouter": {
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "compat": {
            "openRouterRouting": {
              "only": ["amazon-bedrock"]
            }
          }
        }
      }
    }
  }
}
Groq
export GROQ_API_KEY="gsk_..."
Get a key: console.groq.com/keys
xAI (Grok)
export XAI_API_KEY="xai-..."
Get a key: console.x.ai
Mistral
export MISTRAL_API_KEY="..."
Get a key: console.mistral.ai/api-keys
GitHub Copilot
Uses OAuth — sign in through the browser:
sf config
# Choose "Sign in with your browser" → "GitHub Copilot"
Requires an active GitHub Copilot subscription.
Amazon Bedrock
Bedrock uses AWS IAM credentials, not API keys. Any of these work:
# Option 1: Named profile
export AWS_PROFILE="my-profile"
# Option 2: IAM keys
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"
# Option 3: Bedrock API key (bearer token)
export AWS_BEARER_TOKEN_BEDROCK="..."
ECS task roles and IRSA (Kubernetes) are also detected automatically.
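If you have the AWS CLI installed, you can optionally confirm which identity those credentials resolve to before starting SF:
aws sts get-caller-identity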
Anthropic on Vertex AI
Uses Google Cloud Application Default Credentials:
gcloud auth application-default login
export ANTHROPIC_VERTEX_PROJECT_ID="my-project-id"
Or set GOOGLE_CLOUD_PROJECT and ensure ADC credentials exist at ~/.config/gcloud/application_default_credentials.json.
Azure OpenAI
export AZURE_OPENAI_API_KEY="..."
Local Providers
Local providers run on your machine. They require a models.json configuration file because SF needs to know the endpoint URL and which models are available.
Config file location: ~/.sf/agent/models.json
The file reloads each time you open /model — no restart needed.
Ollama
Step 1 — Install and start Ollama:
# Linux
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
Step 2 — Pull a model:
ollama pull llama3.1:8b
ollama pull qwen2.5-coder:7b
Step 3 — Create ~/.sf/agent/models.json:
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false
      },
      "models": [
        { "id": "llama3.1:8b" },
        { "id": "qwen2.5-coder:7b" }
      ]
    }
  }
}
The apiKey is required by the config schema but Ollama ignores it — any value works.
Step 4 — Select the model:
Inside SF, type /model and pick your Ollama model.
Ollama tips:
- Ollama does not support the developer role or reasoning_effort — always set compat.supportsDeveloperRole: false and compat.supportsReasoningEffort: false.
- If you get empty responses, check that ollama serve is running and the model is pulled.
- Context window and max tokens default to 128K / 16K if not specified. Override these if your model has different limits (see the example below).
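For example, a model entry with explicit limits looks like this (the numbers are illustrative; use your model's actual limits):
{ "id": "llama3.1:8b", "contextWindow": 32768, "maxTokens": 8192 }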
LM Studio
Step 1 — Install LM Studio:
Download from lmstudio.ai.
Step 2 — Start the local server:
In LM Studio, go to the "Local Server" tab, load a model, and click "Start Server". The default port is 1234.
Step 3 — Create ~/.sf/agent/models.json:
{
  "providers": {
    "lm-studio": {
      "baseUrl": "http://localhost:1234/v1",
      "api": "openai-completions",
      "apiKey": "lm-studio",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false
      },
      "models": [
        {
          "id": "your-model-name",
          "name": "My Local Model",
          "contextWindow": 32768,
          "maxTokens": 4096
        }
      ]
    }
  }
}
Replace your-model-name with the model identifier shown in LM Studio's server tab.
LM Studio tips:
- The model ID in models.json must match what LM Studio reports in its server API. Check the server tab for the exact string, or query the endpoint as shown below.
- LM Studio defaults to port 1234. If you changed it, update baseUrl accordingly.
- Increase contextWindow and maxTokens if your model supports larger contexts.
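If the server is running on the default port, you can list the exact model identifiers it exposes by querying its OpenAI-compatible models endpoint:
curl http://localhost:1234/v1/models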
vLLM
{
  "providers": {
    "vllm": {
      "baseUrl": "http://localhost:8000/v1",
      "api": "openai-completions",
      "apiKey": "vllm",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false,
        "supportsUsageInStreaming": false
      },
      "models": [
        {
          "id": "meta-llama/Llama-3.1-8B-Instruct",
          "contextWindow": 128000,
          "maxTokens": 16384
        }
      ]
    }
  }
}
The model id must match the --model flag you passed to vllm serve.
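For reference, a matching launch might look like the following (a sketch; depending on your vLLM version the model is passed positionally or via --model, and 8000 is the default port):
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000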
SGLang
{
  "providers": {
    "sglang": {
      "baseUrl": "http://localhost:30000/v1",
      "api": "openai-completions",
      "apiKey": "sglang",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false
      },
      "models": [
        {
          "id": "meta-llama/Llama-3.1-8B-Instruct"
        }
      ]
    }
  }
}
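The same rule applies here: the model id must match what the server was launched with. A typical launch matching the config above (a sketch; check the flags for your installed SGLang version):
python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000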
Custom OpenAI-Compatible Endpoints
Any server that implements the OpenAI Chat Completions API can work with SF. This covers proxies (LiteLLM, Portkey, Helicone), self-hosted inference, and new providers.
Quickest path — use the onboarding wizard:
sf config
# Choose "Paste an API key" → "Custom (OpenAI-compatible)"
# Enter: base URL, API key, model ID
This writes ~/.sf/agent/models.json for you automatically.
Manual setup:
{
  "providers": {
    "my-provider": {
      "baseUrl": "https://my-endpoint.example.com/v1",
      "apiKey": "MY_PROVIDER_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "model-id-here",
          "name": "Friendly Model Name",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 16384,
          "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
        }
      ]
    }
  }
}
Adding custom headers (for proxies):
{
  "providers": {
    "litellm-proxy": {
      "baseUrl": "https://litellm.example.com/v1",
      "apiKey": "MY_API_KEY",
      "api": "openai-completions",
      "headers": {
        "x-custom-header": "value"
      },
      "models": [...]
    }
  }
}
Qwen models with thinking mode:
For Qwen-compatible servers, use thinkingFormat to enable thinking mode:
{
  "compat": {
    "thinkingFormat": "qwen",
    "supportsDeveloperRole": false
  }
}
Use "qwen-chat-template" instead if the server requires chat_template_kwargs.enable_thinking.
For the full reference on compat fields, modelOverrides, value resolution, and advanced configuration, see Custom Models.
Common Pitfalls
"Authentication failed" with a valid key
Cause: The key is set in your shell but not visible to SF.
Fix: Make sure the environment variable is exported in the same terminal where you run sf. Or use sf config to save the key to ~/.sf/agent/auth.json so it persists across sessions.
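A quick way to check is to print the variable from the same shell right before launching SF (shown here for the Anthropic key; substitute your provider's variable):
# Prints the value only if it is exported to child processes
printenv ANTHROPIC_API_KEY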
OpenRouter models not appearing in /model
Cause: No OPENROUTER_API_KEY set, so SF hides OpenRouter models.
Fix: Set the key and restart SF:
export OPENROUTER_API_KEY="sk-or-..."
sf
Ollama returns empty responses
Cause: Ollama server isn't running, or the model isn't pulled.
Fix:
# Verify the server is running
curl http://localhost:11434/v1/models
# Pull the model if missing
ollama pull llama3.1:8b
LM Studio model ID mismatch
Cause: The id in models.json doesn't match what LM Studio exposes via its API.
Fix: Check the LM Studio server tab for the exact model identifier. It often includes the filename or quantization level (e.g., lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF).
developer role error with local models
Cause: Most local inference servers don't support the OpenAI developer message role.
Fix: Add compat.supportsDeveloperRole: false to the provider config. This makes SF send system messages instead:
{
  "compat": {
    "supportsDeveloperRole": false,
    "supportsReasoningEffort": false
  }
}
stream_options error with local models
Cause: Some servers don't support stream_options: { include_usage: true }.
Fix: Add compat.supportsUsageInStreaming: false:
{
  "compat": {
    "supportsUsageInStreaming": false
  }
}
"apiKey is required" validation error
Cause: models.json schema requires apiKey when models are defined.
Fix: For local servers that don't need auth, set a dummy value:
"apiKey": "not-needed"
Cost shows $0.00 for custom models
Expected behavior. SF defaults cost to zero for custom models. Override with the cost field if you want accurate cost tracking:
"cost": { "input": 0.15, "output": 0.60, "cacheRead": 0.015, "cacheWrite": 0.19 }
Values are per million tokens.
Verifying Your Setup
After configuring a provider:
- Launch SF: sf
- Check available models: /model. Your provider's models should appear in the list.
- Switch to the model: Select it from the /model picker.
- Send a test message: Type anything to confirm the model responds.
If the model doesn't appear, check:
- The environment variable is set in the current shell
- models.json is valid JSON (use cat ~/.sf/agent/models.json | python3 -m json.tool)
- The server is running (for local providers)
For additional help, see Troubleshooting or run /sf doctor inside a session.