When all credentials for a provider are exhausted, the system now automatically falls back to the next available provider in a user-configured fallback chain. Higher-priority providers are restored automatically when their backoff expires. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
13 KiB
Issue #125: Provider Fallback When Multiple Providers Configured
Copyright (c) 2026 Jeremy McSpadden jeremy@fluxlabs.net
Overview
Add cross-provider fallback so that when a provider hits rate/quota limits, the system automatically switches to another provider that serves an equivalent model (or a user-configured fallback chain of different models).
Current State
The codebase already supports:
- Multi-credential per provider — round-robin or session-sticky selection
- Per-credential backoff tracking — rate_limit (30s), quota_exhausted (30min), server_error (20s)
- Credential rotation on error —
markUsageLimitReached()backs off one key and returns whether another key exists for the same provider - Retry with exponential backoff — 3 retries, 2s/4s/8s delays
- Error classification — quota_exhausted, rate_limit, server_error, unknown
The gap: fallback only works within a single provider (multiple API keys). There is no mechanism to fall back to a different provider serving the same or equivalent model.
Architecture
Phase 1: Fallback Chain Configuration & Storage
Goal: Let users define ordered fallback chains that map a primary model to backup model+provider combos.
1.1 — Settings Schema (settings-manager.ts)
Add a new top-level setting:
interface FallbackChainEntry {
provider: string; // e.g. "zai", "alibaba", "openai"
model: string; // e.g. "glm-5", "claude-opus-4-6"
priority: number; // lower = higher priority (1 = primary)
}
interface FallbackSettings {
enabled: boolean; // default: false
chains: Record<string, FallbackChainEntry[]>; // keyed by chain name
// Example:
// "coding": [
// { provider: "zai", model: "glm-5", priority: 1 },
// { provider: "alibaba", model: "glm-5", priority: 2 },
// { provider: "openai", model: "gpt-4.1", priority: 3 }
// ]
}
Files to modify:
packages/pi-coding-agent/src/core/settings-manager.ts— addgetFallbackSettings(),setFallbackChain(),removeFallbackChain(), getter/setter forfallback.enabled
1.2 — Settings File Location
Stored in the existing ~/.pi/agent/settings.json under a new fallback key.
1.3 — CLI Configuration Commands
Add subcommands to the existing settings CLI:
pi settings fallback enable/disablepi settings fallback add-chain <name> --provider <p> --model <m> --priority <n>pi settings fallback remove-chain <name>pi settings fallback list
Files to modify:
packages/pi-coding-agent/src/cli/commands/settings.ts(or equivalent CLI entry point)
Phase 2: Provider-Level Backoff Tracking
Goal: Track backoff state at the provider level (not just credential level) so the fallback system knows when an entire provider is unavailable.
2.1 — Extend AuthStorage (auth-storage.ts)
Add a provider-level backoff map alongside the existing credential-level one:
private providerBackoff: Map<string, number> = new Map();
// Map<provider, backoffExpiresAt>
New methods:
markProviderExhausted(provider: string, errorType: UsageLimitErrorType): void
isProviderAvailable(provider: string): boolean
getProviderBackoffRemaining(provider: string): number // ms until available, 0 if available
Logic: When markUsageLimitReached() returns false (all credentials for a provider
are backed off), also mark the provider itself as backed off with the longest remaining
credential backoff duration.
Files to modify:
packages/pi-coding-agent/src/core/auth-storage.ts
Phase 3: Fallback Resolution Engine
Goal: Given a current model+provider that just failed, find the next available fallback from the configured chain.
3.1 — FallbackResolver (fallback-resolver.ts — new file)
// packages/pi-coding-agent/src/core/fallback-resolver.ts
export interface FallbackResult {
model: Model<Api>;
reason: string; // "quota_exhausted on zai, falling back to alibaba"
}
export class FallbackResolver {
constructor(
private settings: SettingsManager,
private authStorage: AuthStorage,
private modelRegistry: ModelRegistry,
) {}
/**
* Find the next available fallback for the current model.
* Returns null if no fallback is configured or available.
*/
async findFallback(
currentModel: Model<Api>,
errorType: UsageLimitErrorType,
): Promise<FallbackResult | null> {
// 1. Check if fallback is enabled
// 2. Find chain(s) containing currentModel's provider+model
// 3. Sort by priority
// 4. Skip entries where provider is backed off
// 5. Skip entries without valid API keys
// 6. Return first available, or null
}
/**
* Find the chain a model belongs to.
*/
findChainForModel(provider: string, modelId: string): FallbackChainEntry[] | null
/**
* Get the highest-priority available model from a chain.
* Used on session start to pick the best available model.
*/
async getBestAvailable(chainName: string): Promise<FallbackResult | null>
}
3.2 — Model Equivalence
For same-model cross-provider fallback (Phase 1 of the feature), the chain entries explicitly name the provider+model pairs. No automatic equivalence detection needed — the user defines what's equivalent.
Phase 4: Integrate Fallback into Retry Flow
Goal: When credential rotation fails (all keys for a provider exhausted), try the fallback chain before giving up or doing exponential backoff.
4.1 — Modify _handleRetryableError() (agent-session.ts)
Current flow:
1. Classify error
2. Try credential rotation within provider → if success, retry immediately
3. If quota_exhausted and all backed off → give up
4. Exponential backoff retry
New flow:
1. Classify error
2. Try credential rotation within provider → if success, retry immediately
3. ** Try provider fallback via FallbackResolver **
a. If fallback found → swap model on agent, retry immediately
b. Emit event: "fallback_provider_switch" with old/new provider info
4. If quota_exhausted and no fallback → give up
5. Exponential backoff retry
Key changes in agent-session.ts (~lines 2317-2370):
// After credential rotation fails:
if (!hasAlternate) {
const fallbackResult = await this.fallbackResolver?.findFallback(
this.agent.model,
errorType,
);
if (fallbackResult) {
// Swap to fallback model
this.agent.setModel(fallbackResult.model);
this._removeLastError();
this._emitEvent("auto_retry_start", {
attempt: this._retryAttempt + 1,
delayMs: 0,
reason: fallbackResult.reason,
});
await this.agent.continue();
return true;
}
}
4.2 — Agent Model Swapping
The agent needs a method to swap its model mid-conversation:
// agent.ts or agent-loop.ts
setModel(model: Model<Api>): void {
this.config.model = model;
// Re-resolve API key for new provider
}
Important: The API key must also be re-resolved since we're switching providers.
The getApiKey callback in AgentOptions already takes a provider string, so this
should work naturally.
Files to modify:
packages/pi-coding-agent/src/core/agent-session.tspackages/pi-ai/src/agent.tsorpackages/pi-ai/src/agent-loop.ts
Phase 5: Provider Restoration (Auto-Upgrade)
Goal: When a higher-priority provider's backoff expires, switch back to it.
5.1 — Pre-Request Priority Check
Before each LLM request, check if a higher-priority provider in the chain has become available again:
// In agent-loop.ts streamAssistantResponse(), before calling streamFn:
if (this.fallbackResolver) {
const bestAvailable = await this.fallbackResolver.getBestAvailable(currentChain);
if (bestAvailable && bestAvailable.model.provider !== currentModel.provider) {
// Upgrade back to higher-priority provider
this.setModel(bestAvailable.model);
this._emitEvent("fallback_provider_restored", { ... });
}
}
5.2 — Quota Reset Awareness (Future Enhancement)
For now, rely on backoff expiry times. A future enhancement could:
- Parse rate limit headers for reset timestamps
- Store per-provider quota windows (5-hour, daily, weekly, monthly)
- Predict when quota will restore based on usage patterns
This is complex and should be a separate issue.
Phase 6: User-Facing Events & UI
Goal: Surface fallback activity to the user so they know what's happening.
6.1 — New Events
type FallbackEvent =
| { type: "fallback_provider_switch"; from: string; to: string; reason: string }
| { type: "fallback_provider_restored"; provider: string; reason: string }
| { type: "fallback_chain_exhausted"; chain: string; reason: string }
6.2 — TUI Integration
Display a brief notification in the TUI when fallback occurs:
⚡ Switched from zai/glm-5 → alibaba/glm-5 (rate limit)✓ Restored to zai/glm-5 (quota available)⚠ All providers in chain "coding" exhausted
Files to modify:
packages/pi-tui/src/— event handler for new fallback events- Status bar or notification area in the TUI
Implementation Order
| Step | Phase | Effort | Dependencies |
|---|---|---|---|
| 1 | Phase 1.1-1.2: Settings schema | Small | None |
| 2 | Phase 2: Provider-level backoff | Small | None |
| 3 | Phase 3: FallbackResolver | Medium | Steps 1, 2 |
| 4 | Phase 4: Retry integration | Medium | Step 3 |
| 5 | Phase 5.1: Auto-restoration | Small | Step 4 |
| 6 | Phase 1.3: CLI commands | Small | Step 1 |
| 7 | Phase 6: Events & UI | Small | Step 4 |
Steps 1 and 2 can be done in parallel. Steps 6 and 7 can be done in parallel.
Key Design Decisions
1. Explicit chains vs automatic model equivalence
Decision: Explicit user-configured chains. Why: Automatic equivalence is unreliable — models with the same name from different providers may have different capabilities, limits, or pricing. Users should explicitly opt in to which models they consider interchangeable.
2. Where fallback sits in the retry flow
Decision: After credential rotation, before exponential backoff. Why: Provider fallback is a better recovery than waiting and retrying the same exhausted provider. If the fallback also fails, exponential backoff still kicks in.
3. Model swap vs new agent
Decision: Swap model on existing agent mid-conversation.
Why: Creating a new agent would lose conversation context. The agent's streamFn
already accepts model as a parameter, and getApiKey resolves per-provider, so
swapping is straightforward.
4. Restoration strategy
Decision: Check before each request (lazy check on backoff expiry).
Why: No background timers needed. The cost of one isProviderAvailable() check
per request is negligible. More sophisticated quota tracking can be added later.
5. Scope of fallback
Decision: Per-session, not per-agent-type (initially).
Why: The issue mentions per-agent-type toggle, but the simpler initial implementation
is a global fallback chain that applies to any session using a model in the chain.
Per-agent-type scoping can be added by extending the chain config with an agentTypes
filter.
Risks & Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Model swap mid-conversation changes behavior | Medium | Log the swap, let user disable fallback |
| Different providers have different tool/feature support | High | Validate fallback model supports same API features before swapping |
| Credential resolution race conditions | Low | Use existing file-lock mechanism in auth-storage |
| Chain misconfiguration (nonexistent model) | Low | Validate chain entries on save, warn on invalid |
| Backoff timing mismatch with actual quota reset | Medium | Conservative backoff defaults; Phase 5.2 for future improvement |
Testing Strategy
- Unit tests for FallbackResolver — mock auth-storage and model-registry, test chain resolution, priority ordering, backoff skipping
- Unit tests for extended auth-storage — provider-level backoff tracking
- Integration test for retry flow — simulate rate limit → credential fallback → provider fallback → restoration
- E2E test — configure a chain, hit rate limit on provider A, verify automatic switch to provider B
- Settings tests — validate chain CRUD operations, persistence, invalid input handling
Files Summary
| File | Action | Changes |
|---|---|---|
packages/pi-coding-agent/src/core/settings-manager.ts |
Modify | Add FallbackSettings types, getters/setters |
packages/pi-coding-agent/src/core/auth-storage.ts |
Modify | Add provider-level backoff tracking |
packages/pi-coding-agent/src/core/fallback-resolver.ts |
New | FallbackResolver class |
packages/pi-coding-agent/src/core/agent-session.ts |
Modify | Integrate fallback into retry flow |
packages/pi-ai/src/agent.ts |
Modify | Add setModel() method |
packages/pi-coding-agent/src/cli/commands/settings.ts |
Modify | Add fallback CLI subcommands |
packages/pi-tui/src/ |
Modify | Fallback event display |