singularity-forge/mintlify-docs/guides/cost-management.mdx
ace-pm 9d739dfa5d Rename GSD→SF: complete rebrand from fork origin
- All gsdDir/gsdRoot/gsdHome → sfDir/sfRootDir/sfHome
- GSDWorkspace* → SFWorkspace* interfaces
- bootstrapGsdProject → bootstrapProject
- runGSDDoctor → runSFDoctor
- GsdClient → SfClient, gsd-client.ts → sf-client.ts
- .gsd/ → .sf/ in all tests, docs, docker, native, vscode
- Auto-migration: headless detects .gsd/ → renames to .sf/
- Deleted gsd-phase-state.ts backward-compat re-export
- Renamed bin/gsd-from-source → bin/sf-from-source
- Updated mintlify docs, github workflows, docker configs
2026-04-15 18:33:47 +02:00

---
title: "Cost management"
description: "Budget ceilings, cost tracking, projections, and enforcement modes."
---
SF tracks token usage and cost for every unit of work dispatched during auto mode. This data powers the dashboard, budget enforcement, and cost projections.
## Cost tracking
Every unit's metrics are captured automatically:

- **Token counts**: input, output, cache read, cache write, total
- **Cost**: USD cost per unit
- **Duration**: wall-clock time
- **Tool calls**: number of tool invocations
- **Message counts**: assistant and user messages

Data is stored in `.sf/metrics.json` and persists across sessions.
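
The schema of `.sf/metrics.json` is not documented on this page, so the following is only a sketch of what a per-unit record holding the metrics listed above might look like, and how it could round-trip through JSON between sessions. Every field name, the `UnitMetrics` class, and both helper functions are assumptions made for illustration, as is the idea that "total" is the sum of the four token counts.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class UnitMetrics:
    """Hypothetical per-unit record; field names are NOT SF's real schema."""
    slice_id: str
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_write_tokens: int
    cost_usd: float
    duration_s: float
    tool_calls: int
    assistant_messages: int
    user_messages: int

    @property
    def total_tokens(self) -> int:
        # Assumption: "total" is the sum of the four token counts.
        return (self.input_tokens + self.output_tokens
                + self.cache_read_tokens + self.cache_write_tokens)


def save_metrics(path: Path, units: list[UnitMetrics]) -> None:
    """Write all records to disk so they outlive the current session."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps([asdict(u) for u in units], indent=2))


def load_metrics(path: Path) -> list[UnitMetrics]:
    """Read records written by save_metrics back into dataclasses."""
    return [UnitMetrics(**row) for row in json.loads(path.read_text())]


# Round-trip one record through a temporary directory.
unit = UnitMetrics("M001/S01", 1000, 200, 300, 50, 0.42, 12.5, 3, 4, 1)
with tempfile.TemporaryDirectory() as tmp:
    metrics_path = Path(tmp) / ".sf" / "metrics.json"
    save_metrics(metrics_path, [unit])
    loaded = load_metrics(metrics_path)
```
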
### Viewing costs

Press `Ctrl+Alt+G` or run `/sf status` to see a real-time cost breakdown by:

- Phase (research, planning, execution, completion, reassessment)
- Slice (M001/S01, M001/S02, ...)
- Model (which models consumed the most budget)
- Project totals
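
To make the breakdown dimensions concrete, here is a rough sketch of the same grouping done by hand. The record keys (`phase`, `slice`, `model`, `cost_usd`), the model names, and the costs are invented for illustration. This is not how SF computes its status view.

```python
from collections import defaultdict

# Hypothetical per-unit records; keys and values are illustrative only.
units = [
    {"phase": "research", "slice": "M001/S01", "model": "model-a", "cost_usd": 0.50},
    {"phase": "execution", "slice": "M001/S01", "model": "model-b", "cost_usd": 1.25},
    {"phase": "execution", "slice": "M001/S02", "model": "model-b", "cost_usd": 2.00},
    {"phase": "planning", "slice": "M001/S02", "model": "model-a", "cost_usd": 0.75},
]


def breakdown(records, key):
    """Sum cost_usd per distinct value of `key`, most expensive first."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost_usd"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


by_phase = breakdown(units, "phase")
by_slice = breakdown(units, "slice")
by_model = breakdown(units, "model")
project_total = sum(record["cost_usd"] for record in units)
```
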
## Budget ceiling

Set `budget_ceiling` to cap spend, in USD:

```yaml
budget_ceiling: 50.00
```
### Enforcement modes
| Mode | Behavior |
|------|----------|
| `warn` | Log a warning, continue |
| `pause` | Pause auto mode (default when ceiling is set) |
| `halt` | Stop auto mode entirely |
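
The three modes can be read as a small decision function. The sketch below is only an illustration of the table: the `enforce` function, the `EnforcementMode` enum, and the returned action strings are hypothetical, and it assumes enforcement triggers once spend reaches or exceeds the ceiling.

```python
from enum import Enum
from typing import Optional


class EnforcementMode(Enum):
    WARN = "warn"
    PAUSE = "pause"
    HALT = "halt"


def enforce(spent_usd: float, ceiling_usd: Optional[float],
            mode: EnforcementMode = EnforcementMode.PAUSE) -> str:
    """Return the action to take; the action strings are illustrative."""
    # No ceiling configured, or still under it: nothing to enforce.
    # Assumption: the ceiling counts as hit when spend >= ceiling.
    if ceiling_usd is None or spent_usd < ceiling_usd:
        return "continue"
    if mode is EnforcementMode.WARN:
        return "log warning, continue"
    if mode is EnforcementMode.PAUSE:
        return "pause auto mode"
    return "stop auto mode"
```

The default argument mirrors the table: `pause` applies when a ceiling is set and no other mode is chosen.
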
## Cost projections
Once at least two slices have completed, SF projects the remaining cost from the per-slice average:
```
Projected remaining: $12.40 ($6.20/slice avg × 2 remaining)
```
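
The arithmetic behind that line is a plain mean multiplied by the number of slices left. A minimal sketch, assuming a simple unweighted average (the function names and the two sample slice costs are made up to reproduce the figures above):

```python
from typing import Optional


def project_remaining(completed_costs: list[float],
                      remaining_slices: int) -> Optional[float]:
    """Average cost per completed slice times the slices still to run."""
    # Projections need at least two completed slices.
    if len(completed_costs) < 2:
        return None
    average = sum(completed_costs) / len(completed_costs)
    return average * remaining_slices


def format_projection(completed_costs: list[float], remaining_slices: int) -> str:
    """Render the projection in the same shape as the status line."""
    projected = project_remaining(completed_costs, remaining_slices)
    if projected is None:
        return "Projected remaining: n/a (fewer than 2 slices complete)"
    average = sum(completed_costs) / len(completed_costs)
    return (f"Projected remaining: ${projected:.2f} "
            f"(${average:.2f}/slice avg × {remaining_slices} remaining)")


line = format_projection([5.90, 6.50], 2)
```
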
## Budget pressure and model downgrading

As spend approaches the budget ceiling, the [complexity router](/guides/token-optimization) automatically downgrades model assignments:

| Budget used | Effect |
|-------------|--------|
| Under 50% | No adjustment |
| 50-75% | Standard tasks downgrade to Light |
| 75-90% | More aggressive downgrading |
| Over 90% | Nearly everything downgrades |
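
The thresholds in the table can be expressed as a lookup on the fraction of budget used. This is a sketch only: the `pressure_tier` function and tier labels are hypothetical, and because the table does not say which band owns the exact 50%, 75%, and 90% boundaries, the code assumes each boundary belongs to the higher band.

```python
def pressure_tier(spent_usd: float, ceiling_usd: float) -> str:
    """Map budget usage to an illustrative pressure tier label."""
    if ceiling_usd <= 0:
        raise ValueError("ceiling_usd must be positive")
    used = spent_usd / ceiling_usd
    # Assumption: boundary values fall into the higher band.
    if used < 0.50:
        return "no adjustment"
    if used < 0.75:
        return "standard to light"
    if used < 0.90:
        return "more aggressive"
    return "nearly everything downgrades"
```

With a 50.00 ceiling, 30.00 spent is 60% used, which lands in the second band.
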
## Token profiles and cost

| Profile | Typical savings | How |
|---------|-----------------|-----|
| `budget` | 40-60% | Cheaper models, phase skipping, minimal context |
| `balanced` | 10-20% | Default models, skip slice research |
| `quality` | 0% (baseline) | Full models, all phases |

See [token optimization](/guides/token-optimization) for details.
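
As a rough worked example of the savings table, the sketch below turns a measured `quality` baseline into an expected cost range for each profile. The `estimated_cost_range` helper and the 100.00 baseline are illustrative assumptions. The percentages are the "typical" figures from the table, not guarantees.

```python
# Typical savings per profile, in percent, taken from the table above.
SAVINGS_PCT = {
    "budget": (40, 60),
    "balanced": (10, 20),
    "quality": (0, 0),
}


def estimated_cost_range(baseline_usd: float, profile: str) -> tuple[float, float]:
    """Return (lowest, highest) expected cost relative to a quality baseline."""
    low_saving, high_saving = SAVINGS_PCT[profile]
    # The largest saving gives the lowest cost, and vice versa.
    lowest = round(baseline_usd * (100 - high_saving) / 100, 2)
    highest = round(baseline_usd * (100 - low_saving) / 100, 2)
    return lowest, highest


budget_range = estimated_cost_range(100.0, "budget")
balanced_range = estimated_cost_range(100.0, "balanced")
quality_range = estimated_cost_range(100.0, "quality")
```
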
## Tips
- Start with `balanced` and a generous `budget_ceiling` to establish baseline costs
- Check `/sf status` after a few slices to see per-slice averages
- Switch to `budget` for well-understood, repetitive work
- Use `quality` only for architectural decisions
- Per-phase model selection lets you use Opus for planning while keeping execution on Sonnet
- Enable [dynamic routing](/guides/dynamic-model-routing) for automatic downgrading on simple tasks
- Use `/sf visualize` → Metrics tab to see where your budget is going