2.7 KiB
2.7 KiB
Quality Score
Principles
- Make code legible to agents with semantic names and explicit boundaries.
- Prefer small, testable modules over files that require broad context to edit.
- Enforce style, architecture, and reliability rules mechanically where possible.
- Keep a cleanup loop for stale docs, generated artifacts, and accumulated implementation debt.
Fast Checks (run on every change)
just typecheck # tsc --project tsconfig.resources.json, no emit
just lint # eslint across src/
Both must pass before any commit. Typecheck catches type drift early. Lint enforces import rules that enforce the Pi clean seam (ADR-010).
Slow Checks (run before shipping)
just test # full unit suite — node --test runner, no coverage overhead
just test-smoke # sf --version, sf --help, sf --print — all three must pass
Coverage thresholds (enforced by npm run test:coverage):
- Statements: 40% minimum
- Lines: 40% minimum
- Branches: 20% minimum
- Functions: 20% minimum
- Autonomous path overrides:
src/resources/extensions/sf/auto/**: 60% statements/lines/functions, 40% branchessrc/resources/extensions/sf/uok/**: 60% statements/lines/functions, 40% branches
These are floors, not targets. The real quality bar is purposeful tests that assert behavior contracts (see docs/SPEC_FIRST_TDD.md).
Evals (ad-hoc, not yet automated)
No automated eval suite exists yet. ADR-018 Phase 3 defines the eval runner contract. Until then, quality for autonomous behavior is measured by:
- Smoke test pass rate across providers
- Manual milestone runs with trace inspection (
.sf/traces/) - Decision register review at milestone close
Known Blind Spots
| Area | Gap | Risk |
|---|---|---|
headless.ts |
RPC lifecycle (spawn → event stream → restart) is not covered by unit tests; only integration-tested manually | High: crash recovery correctness |
| Parallel milestone orchestration | No tests for concurrent STATE.md mutations | Medium: data loss under parallelism |
| Notification routing | Text-matching classification has no per-pattern unit tests | Low: wrong exit code on wording change |
| Stuck detection | Sliding-window logic tested, but real-loop replay is not | Medium: false positives under unusual patterns |
| Provider fallback | Model routing under simulated provider failure not covered | Medium: silent routing to wrong tier |
Doc Quality Signal
grep -r "TODO\|placeholder\|Describe the\|Document.*here\|Record.*here\|Use this as\|Capture.*here\|Track cleanup" \
docs/ --include="*.md"
This should return empty. Any match is a placeholder doc that needs real content.