feat: implement ADR-001 gitignore split and fill placeholder docs

Gitignore (core change):
- Remove stale blanket .sf/ entries from .gitignore (migrated to .git/info/exclude on 2026-04-29, never cleaned up)
- gitignore.ts: split SF_RUNTIME_EXCLUSION_PATTERNS into two modes — SF_SYMLINK_EXCLUSION_PATTERNS (blanket .sf for symlink repos where git cannot traverse the symlink) and SF_RUNTIME_EXCLUSION_PATTERNS (granular runtime-only patterns for directory repos, enabling .sf/milestones/ and other durable planning artifacts to be tracked)
- ensureGitInfoExclude() now detects symlink vs directory and writes the correct patterns, handling transitions between modes cleanly
- ADR-001 status: Proposed → Accepted

Docs:
- Fill 11 placeholder scaffold docs with real SF-specific content: PLANS, DESIGN, PRODUCT_SENSE, QUALITY_SCORE, RELIABILITY, SECURITY, design-docs/index.md, exec-plans/active, exec-plans/completed, exec-plans/tech-debt-tracker, records/index
- Add records note: docs/records/2026-05-01-repo-vcs-and-notifications.md
- ADR-008 status: Accepted → Proposed (deferred — not applicable to current usage model where Claude Code assists externally, not as a Pi provider inside SF's dispatch loop)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

parent a611cd5792
commit 16ff608d80
15 changed files with 608 additions and 46 deletions

.gitignore (vendored, 7 changes)

@@ -71,8 +71,8 @@ docs/coherence-audit/
.plans/

# ── SF project state (per-worktree, never committed) ──
.sf/
.sf/
# Runtime-only patterns are managed per-clone in .git/info/exclude by sf.
# Tracked artifacts (.sf/milestones/, .sf/PROJECT.md, etc.) live in version control.

# ── Native Rust build outputs ──
native/addon/*.node

@@ -86,9 +86,6 @@ rust-engine/target/
pnpm-lock.yaml
bun.lock

# ── SF baseline (auto-generated) ──
.sf

# ── SF baseline (auto-generated) ──
.sf-id
.direnv/

@@ -1,3 +1,56 @@
# Design

Record interaction patterns, visual constraints, and design-system usage here.
SF's UI is a terminal application built on the Pi TUI framework (`@mariozechner/pi-tui`). These are the binding constraints any UI work must respect.

## The Cardinal Rule: Line Width

**Every line returned from `render(width)` must not exceed `width` in visible characters.** Exceeding it causes terminal line-wrapping, cursor misposition, and visual corruption the framework cannot fix.

Use the Pi TUI utilities — never raw `string.length`:

```typescript
import { visibleWidth, truncateToWidth, wrapTextWithAnsi } from "@mariozechner/pi-tui";

visibleWidth("\x1b[32mHello\x1b[0m"); // 5, not 14
truncateToWidth("Very long text here", 10); // "Very lo..."
wrapTextWithAnsi("\x1b[32mlong green\x1b[0m", 15); // preserves ANSI per line
```

`visibleWidth` strips ANSI escape codes before measuring. `truncateToWidth` preserves ANSI codes in the truncated output. Use these everywhere a line's display length matters.
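
To make the difference from `string.length` concrete, here is a rough sketch of ANSI-aware measuring. `approxVisibleWidth` and `linesFit` are hypothetical names used only for illustration, not part of `@mariozechner/pi-tui`, and the sketch assumes only SGR color sequences need stripping. It does not handle wide (for example CJK) characters, so SF code should keep using the real `visibleWidth`.

```typescript
// Hypothetical helper, NOT the Pi TUI implementation.
// Matches SGR sequences such as ESC[32m and ESC[0m.
const SGR_PATTERN = /\x1b\[[0-9;]*m/g;

function approxVisibleWidth(text: string): number {
  const stripped = text.replace(SGR_PATTERN, "");
  // Array.from iterates by Unicode code point, so surrogate pairs count once.
  return Array.from(stripped).length;
}

// A guard for render(width) output: true only when every line fits.
function linesFit(lines: string[], width: number): boolean {
  return lines.every((line) => approxVisibleWidth(line) <= width);
}
```

A check like `linesFit(component.render(80), 80)` could serve as a cheap assertion in component tests, under the same assumptions.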

## Render Pattern

```typescript
render(width: number): string[] {
  const lines: string[] = [];
  lines.push(truncateToWidth(` ${prefix}${content}`, width));

  const labelWidth = visibleWidth(label);
  const available = width - labelWidth - 4; // padding
  lines.push(` ${label}: ${truncateToWidth(value, available)}`);

  return lines;
}
```

## Overlays and Modals

Floating panels use the Pi TUI overlay pattern: they render at a fixed position within the terminal bounds and must still respect the outer `width` constraint. An overlay that overflows its bounds causes the same wrapping corruption as any other component.

Use `ctx.ui.dialog()` for modal user input. Use `ctx.ui.notify()` for transient non-blocking notices. Persistent notification state goes through `notification-store.ts` → `notification-overlay.ts`.

## Theming

Colors and styles come from the Pi TUI theme system, not from hardcoded ANSI codes. Access the active theme via the `ExtensionContext`. Respect theme changes: components must re-render when the theme changes (implement `onThemeChange` if caching rendered output).

## IME and Focus

Interactive input components must implement the `Focusable` interface to receive keyboard events correctly, especially for IME (input method editor) support on non-ASCII keyboards. Components that handle key input but do not implement `Focusable` will silently swallow events.

## Performance

Cache rendered output when the underlying data hasn't changed. Invalidate the cache on data change or theme change. Do not re-render on every tick. The TUI framework calls `render()` frequently; rendering must be cheap.
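
The caching rule above could look roughly like this. `CachedLines`, `setData`, and `invalidate` are hypothetical names, not Pi TUI API, and a plain `slice` stands in for `truncateToWidth` so the sketch stays self-contained.

```typescript
// Hypothetical component illustrating cache-and-invalidate; not Pi TUI API.
class CachedLines {
  private data: string[];
  private cache: string[] | null = null;
  private cachedWidth = -1;
  renderCount = 0; // exposed only so the sketch can be verified

  constructor(data: string[]) {
    this.data = data;
  }

  setData(data: string[]): void {
    this.data = data;
    this.invalidate();
  }

  // Call this from a theme-change hook as well as on data change.
  invalidate(): void {
    this.cache = null;
  }

  render(width: number): string[] {
    if (this.cache !== null && this.cachedWidth === width) {
      return this.cache; // cheap path: nothing changed
    }
    this.renderCount += 1;
    // Assumption: slice() stands in for truncateToWidth and ignores ANSI codes.
    const lines = this.data.map((line) => line.slice(0, width));
    this.cache = lines;
    this.cachedWidth = width;
    return lines;
  }
}
```

The width is part of the cache key because a terminal resize changes the output even when the data does not.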

## Reference

Full TUI documentation: [`docs/dev/pi-ui-tui/`](./dev/pi-ui-tui/README.md)

@@ -1,3 +1,24 @@
# Plans

Use this as the index for current and upcoming work. Link detailed plans in `docs/exec-plans/`.
Index of current and upcoming work. Detailed plans live in [`docs/exec-plans/`](./exec-plans/).

## Active

| Initiative | Purpose | ADR / Doc |
|-----------|---------|-----------|
| Repo-native harness evolution | Stage-by-stage wiring of the harness profiler, template kits, and evidence runner into autonomous dispatch | [ADR-018](./dev/ADR-018-repo-native-harness-evolution.md) |
| SF tools over MCP (Phase 1) | Expose workflow mutation tools over MCP so Claude Code and external providers can participate in autonomous execution | [ADR-008](./dev/ADR-008-sf-tools-over-mcp-for-provider-parity.md) |
| Notification event model | Implement structured source/kind/blocking metadata on all event paths, replacing fragile text matching | [design doc](./design-docs/notification-event-model.md) |
| repo-vcs skill | Landed — VCS context injection into system prompt; repo-vcs bundled skill for commit/push/safe-push | commit `a611cd579` |

## Upcoming

| Initiative | Depends on |
|-----------|-----------|
| Parallel milestone state locking (SQLite) | ADR-018 Phase 1 |
| ADR template + `just adr` / `just spec` generation recipes | — |
| Skill health dashboard (`/sf skill-health`) | Telemetry already wired |
| Go/Charm judge-calibration service | ADR-018 Phase 5 |

See [`exec-plans/active/`](./exec-plans/active/) for task-level breakdowns and
[`exec-plans/tech-debt-tracker.md`](./exec-plans/tech-debt-tracker.md) for known cleanup.

@@ -1,3 +1,43 @@
# Product Sense

Capture user goals, non-goals, tradeoffs, and examples of good product judgment for this repo.
## The Core Thesis

Autonomous execution is the end gate. SF exists to take a multi-phase software project — a milestone with slices and tasks — and run it to completion without human intervention, producing a clean git history, passing tests, and a deployable artifact.

Every design decision should be evaluated against this question: **does it make autonomous execution more reliable, more observable, or more recoverable?**

## User Goals

- Hand off a milestone and have it complete without babysitting
- Know the agent won't make irreversible mistakes (write gates, protected files, budget ceilings)
- Resume after a crash without losing work (state-on-disk, crash recovery)
- See what the agent did and why (trace files, decision register, records keeper)
- Steer mid-run without breaking the loop (message queue, steering gate)

## Non-Goals

- Being a chat interface — use the Pi interactive mode for exploratory conversation
- Replacing CI — SF triggers verification but does not replace your existing CI pipeline
- Working without context — SF needs a spec, a roadmap, and a task plan; it does not invent work from nothing

## What Good Product Judgment Looks Like

**Fresh context per unit, not accumulated context.** Each task gets a new session with exactly the context it needs pre-injected (task plan, slice plan, prior summaries, relevant skills). This prevents quality degradation from context accumulation — one of the primary failure modes of naive LLM agents on long projects.

**State machine, not LLM guessing.** The loop is deterministic: read STATE.md → validate → dispatch → post-unit → verify → advance. The LLM executes work inside a unit; it does not decide what the next unit is. Separating orchestration from execution keeps the system predictable.
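
The deterministic loop described above can be sketched as a fixed transition table. The phase names come from the text; the `Phase` type, `NEXT_PHASE`, and `runOneCycle` are hypothetical and only show that the next step is looked up rather than decided by a model.

```typescript
// Phase names follow the loop in the text; the table itself is an assumption.
type Phase =
  | "read-state"
  | "validate"
  | "dispatch"
  | "post-unit"
  | "verify"
  | "advance";

const NEXT_PHASE: Record<Phase, Phase> = {
  "read-state": "validate",
  validate: "dispatch",
  dispatch: "post-unit",
  "post-unit": "verify",
  verify: "advance",
  advance: "read-state", // the loop starts over with the next unit
};

function nextPhase(current: Phase): Phase {
  return NEXT_PHASE[current];
}

// Walks one full cycle and returns the phases in the order visited.
function runOneCycle(start: Phase): Phase[] {
  const visited: Phase[] = [start];
  let phase = nextPhase(start);
  while (phase !== start) {
    visited.push(phase);
    phase = nextPhase(phase);
  }
  return visited;
}
```

Because the table is total over `Phase`, the compiler rejects a phase that has no successor, which is one way to keep orchestration predictable.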

**Spec-first.** No behavior change without a failing test first. No completion without a real consumer. This is the iron law — not a suggestion. An agent that completes tasks without specs is just making things up.

**Crash recovery must be invisible.** A crashed session should resume within seconds with no visible data loss. If recovery requires human intervention, it is a product failure.

**User stays in the loop via gates, not via interrupts.** Discussion gates, write gates, budget ceilings, and approval prompts are the designed points of human interaction. The agent should not need to ask for help in the middle of a task.

## Tradeoffs

| Choice | What we gave up | Why |
|--------|----------------|-----|
| Fresh session per unit | Conversational continuity across units | Quality and predictability over convenience |
| State on disk (not in memory) | Speed of in-memory state | Crash recovery and multi-process visibility |
| Write gate during queue | Faster iteration in planning | Safety: prevents accidental file mutations during discussion |
| Protected files (ADRs, SPEC.md) | Agent autonomy over architecture docs | Human oversight over durable decisions |
| Serial execution default | Throughput | Correctness before parallelism; parallel locking is deferred debt |

@@ -1,10 +1,59 @@
# Quality Score

Define what good looks like for this repo. Include fast checks, slow checks, evals, and known blind spots.

Use these principles:
## Principles

- Make code legible to agents with semantic names and explicit boundaries.
- Prefer small, testable modules over files that require broad context to edit.
- Enforce style, architecture, and reliability rules mechanically where possible.
- Keep a cleanup loop for stale docs, generated artifacts, and accumulated implementation debt.

## Fast Checks (run on every change)

```bash
just typecheck  # tsc --project tsconfig.resources.json, no emit
just lint       # eslint across src/
```

Both must pass before any commit. Typecheck catches type drift early. Lint enforces import rules that enforce the Pi clean seam (ADR-010).

## Slow Checks (run before shipping)

```bash
just test        # full unit suite — node --test runner, no coverage overhead
just test-smoke  # sf --version, sf --help, sf --print — all three must pass
```

Coverage thresholds (enforced by `npm run test:coverage`):
- Statements: **40%** minimum
- Lines: **40%** minimum
- Branches: **20%** minimum
- Functions: **20%** minimum

These are floors, not targets. The real quality bar is purposeful tests that assert behavior contracts (see `docs/SPEC_FIRST_TDD.md`).

## Evals (ad-hoc, not yet automated)

No automated eval suite exists yet. ADR-018 Phase 3 defines the eval runner contract. Until then, quality for autonomous behavior is measured by:

- Smoke test pass rate across providers
- Manual milestone runs with trace inspection (`.sf/traces/`)
- Decision register review at milestone close

## Known Blind Spots

| Area | Gap | Risk |
|------|-----|------|
| `headless.ts` | RPC lifecycle (spawn → event stream → restart) is not covered by unit tests; only integration-tested manually | High: crash recovery correctness |
| Parallel milestone orchestration | No tests for concurrent STATE.md mutations | Medium: data loss under parallelism |
| Notification routing | Text-matching classification has no per-pattern unit tests | Low: wrong exit code on wording change |
| Stuck detection | Sliding-window logic tested, but real-loop replay is not | Medium: false positives under unusual patterns |
| Provider fallback | Model routing under simulated provider failure not covered | Medium: silent routing to wrong tier |

## Doc Quality Signal

```bash
grep -r "TODO\|placeholder\|Describe the\|Document.*here\|Record.*here\|Use this as\|Capture.*here\|Track cleanup" \
  docs/ --include="*.md"
```

This should return empty. Any match is a placeholder doc that needs real content.

@@ -1,3 +1,72 @@
# Reliability

Document expected failure modes, recovery paths, observability, and release checks here.
## Exit Codes (headless mode)

| Code | Meaning |
|------|---------|
| 0 | Success — unit or session completed cleanly |
| 1 | Error or timeout |
| 10 | Blocked — LLM called an interactive tool that requires user input; parent must respond or abort |
| 11 | Cancelled — SIGINT or SIGTERM received |
| 12 | Reload — agent requested restart-with-resume on the same session |
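
For a parent process that consumes these codes, a typed lookup keeps the numbers in one place. This is a sketch only: `HEADLESS_EXIT_CODES` and `exitNameFor` are hypothetical names, and the values are copied from the table above.

```typescript
// Values copied from the exit code table; the names are illustrative.
const HEADLESS_EXIT_CODES = {
  success: 0,
  error: 1, // error or timeout
  blocked: 10,
  cancelled: 11,
  reload: 12,
} as const;

type ExitName = keyof typeof HEADLESS_EXIT_CODES;

// Maps a numeric exit code back to its name, or "unknown" for anything else.
function exitNameFor(code: number): ExitName | "unknown" {
  const names = Object.keys(HEADLESS_EXIT_CODES) as ExitName[];
  for (const name of names) {
    if (HEADLESS_EXIT_CODES[name] === code) {
      return name;
    }
  }
  return "unknown";
}
```

A wrapper script could branch on `exitNameFor(status)` instead of comparing against bare numbers.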

## Failure Modes and Recovery

### Process crash mid-unit
**Detection:** Lock file in `.sf/` is present on next launch; RPC child process is gone.

**Recovery path (`src/resources/extensions/sf/auto-recovery.ts`):**
1. Read the surviving session JSONL from `~/.sf/sessions/<session-id>/`
2. Synthesize a recovery briefing from every tool call recorded on disk
3. Resume the LLM mid-unit with the briefing as context — no state is lost
4. If the session JSONL is unreadable, fall back to starting the unit fresh

### Timeout
**Detection:** Headless parent receives no heartbeat within `HEADLESS_HEARTBEAT_INTERVAL_MS` (60 000 ms), or the unit wall-clock exceeds the configured timeout.

**Recovery path:** `auto-timeout-recovery.ts` writes a timeout summary, marks the unit `needs_fix`, and advances the loop. The parent exits with code 1 unless `--max-restarts` allows a retry.

### Stuck detection (repeating-pattern loops)
**Detection (`src/resources/extensions/sf/auto-stuck-detection.ts`):** Sliding-window analysis over the last ~10 unit results. If the same A→B→A→B pattern repeats, the loop is classified as stuck.

**Recovery path:** Retry once with a deep diagnostic prompt that shows the pattern. If still stuck, stop and surface the exact expected file for human inspection. Stuck state persists across session restarts.
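
A minimal sketch of the A→B→A→B check described above, assuming unit results can be reduced to string labels. `isAlternatingLoop`, `windowSize`, and `minRepeats` are hypothetical; the real logic in `auto-stuck-detection.ts` may differ.

```typescript
// Hypothetical detector: true when the most recent results alternate
// between two distinct values at least `minRepeats` times (A,B,A,B = 2).
function isAlternatingLoop(
  results: string[],
  windowSize = 10,
  minRepeats = 2
): boolean {
  const recent = results.slice(-windowSize); // sliding window over the tail
  const needed = minRepeats * 2;
  if (recent.length < needed) {
    return false; // not enough history to call it a loop
  }
  const tail = recent.slice(-needed);
  const a = tail[0];
  const b = tail[1];
  if (a === b) {
    return false; // a run of identical results is a different failure mode
  }
  return tail.every((value, index) => value === (index % 2 === 0 ? a : b));
}
```

Only the tail of the window is inspected, so earlier progress does not mask a loop the run has just entered.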

### Provider API errors (transient)
**Detection:** `bootstrap/provider-error-resume.ts` intercepts 429, 500, 503 responses.

**Recovery path:** Exponential backoff; re-queue the unit. If a provider is consistently unavailable, route to the configured fallback model.

### Verification gate failures
**Detection:** `auto-verification.ts` runs lint/test after each task; non-zero exit = failure.

**Recovery path:** Auto-retry the task up to 2× with the agent receiving full command output as context. After 2 failures the task is marked `needs_fix` and the loop advances with a warning.

### Budget ceiling hit
**Detection:** `auto-budget.ts` tracks cumulative dollar cost; emits warnings at 75%, 80%, 90%, and halts at 100%.

**Recovery path:** Auto-mode pauses; user must explicitly approve resumption. The current unit is not retried.
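
The threshold behavior above can be sketched as a pure function. The percentages come from the text; `budgetAction`, `BudgetAction`, and the `alreadyWarned` bookkeeping are assumptions for illustration and not the `auto-budget.ts` API.

```typescript
type BudgetAction =
  | { kind: "ok" }
  | { kind: "warn"; threshold: number }
  | { kind: "halt" };

// Warning thresholds in percent, taken from the text above.
const WARN_THRESHOLDS = [75, 80, 90];

// alreadyWarned lists thresholds that were reported earlier (assumed bookkeeping).
function budgetAction(
  spentUsd: number,
  ceilingUsd: number,
  alreadyWarned: number[]
): BudgetAction {
  if (ceilingUsd <= 0) {
    throw new Error("ceilingUsd must be positive");
  }
  const percent = (spentUsd / ceilingUsd) * 100;
  if (percent >= 100) {
    return { kind: "halt" }; // auto-mode pauses until the user approves
  }
  const crossed = WARN_THRESHOLDS.filter(
    (t) => percent >= t && alreadyWarned.indexOf(t) === -1
  );
  if (crossed.length === 0) {
    return { kind: "ok" };
  }
  // Report only the highest newly crossed threshold.
  return { kind: "warn", threshold: Math.max(...crossed) };
}
```

Keeping the decision free of side effects makes each threshold easy to cover with a unit test.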

## Restart Loop (headless daemon mode)

`sf headless auto --max-restarts 3` applies exponential backoff: 5 s → 10 s → 30 s (cap). After exhausting restarts the parent exits with code 1. Each restart resumes via crash recovery above.
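
One way to read the schedule above is as a lookup table capped at its last entry. The delays are copied from the text; treating them as a table, and the names `RESTART_DELAYS_MS` and `restartDelayMs`, are assumptions rather than the actual headless implementation.

```typescript
// Delays from the text (5 s, 10 s, 30 s cap); the table form is an assumption.
const RESTART_DELAYS_MS = [5000, 10000, 30000];

// attempt is 1-based. Returns null once restarts are exhausted, at which
// point the parent would exit with code 1.
function restartDelayMs(attempt: number, maxRestarts: number): number | null {
  if (attempt < 1 || attempt > maxRestarts) {
    return null;
  }
  const index = Math.min(attempt, RESTART_DELAYS_MS.length) - 1;
  const delay = RESTART_DELAYS_MS[index];
  return delay === undefined ? null : delay;
}
```

With `--max-restarts 3` this yields 5 s, 10 s, then 30 s, and any larger limit stays at the 30 s cap.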

## Observability

| Signal | Location |
|--------|----------|
| Structured trace | `.sf/traces/trace-<timestamp>.json` — full session span tree with tokens, cost, duration |
| Event audit log | `.sf/event-log.jsonl` — every unit completion, tool call, decision save (v2 format) |
| Desktop notifications | OS-native; configurable via preferences (`notifications.*`) |
| Stderr progress | All headless output goes to stderr; stdout carries JSON result when `--output-format json` |
| Heartbeat | Emitted every 60 s to detect hung parent/child communication |

## Release Checks

Before shipping a build:

```bash
just test        # full unit test suite
just smoke-test  # sf --version, sf --help, sf --print
just typecheck   # tsc extensions, no emit
just lint        # eslint
```

@@ -1,3 +1,53 @@
# Security

Document trust boundaries, secrets handling, dependency risk, and security review requirements here.
## Auth Model and Trust Boundaries

SF never manages Anthropic OAuth directly. The safe paths are:

- **API key** — user sets `ANTHROPIC_API_KEY` or configures it in auth.json. SF reads it; never generates or exchanges it.
- **Claude Code CLI (`claude-code` provider)** — SF shells out to the real `claude` CLI and lets it handle its own credential selection. SF does not reuse Claude subscription tokens.
- **Cloud providers** — Bedrock, Vertex, Azure via their own credential chains.

**Prohibited patterns (from `docs/user-docs/claude-code-auth-compliance.md`):**
- SF-managed Anthropic OAuth flow for subscription accounts
- Reusing user Claude subscription credentials inside SF's own API client
- Making Anthropic believe requests come from Claude Code when they come from SF infrastructure

## Write Gate

`src/resources/extensions/sf/bootstrap/write-gate.ts` enforces a phase-aware write boundary:

- During **queue mode** (pre-dispatch planning): only `.sf/` writes and read-only tool calls are permitted. All other file writes are blocked.
- **QUEUE_SAFE_TOOLS** allowlist: `read`, `grep`, `find`, `ls`, `ask_user_questions`, planning tools, web research tools.
- **BASH_READ_ONLY_RE**: regex allowlist of commands safe to run during write-restricted phases (`cat`, `git log`, `npm run test|lint|typecheck`, `jq`, etc.).
- Write-gate violations are logged and surfaced to the user; they do not crash the session.
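
The queue-mode rules above could be expressed as a small decision function. This is a sketch under stated assumptions: the phase name `"execute"`, the tool name `"write"`, and the `ToolCall` shape are hypothetical, and `QUEUE_SAFE` lists only the tools named in the text rather than the full allowlist in `write-gate.ts`.

```typescript
type GatePhase = "queue" | "execute"; // "execute" is an assumed name

interface ToolCall {
  tool: string;
  path?: string; // target path for write-style tools
}

// Subset of the allowlist named in the text.
const QUEUE_SAFE = ["read", "grep", "find", "ls", "ask_user_questions"];

function isAllowed(phase: GatePhase, call: ToolCall): boolean {
  if (phase !== "queue") {
    return true; // the gate only restricts queue mode in this sketch
  }
  if (QUEUE_SAFE.indexOf(call.tool) !== -1) {
    return true;
  }
  if (call.tool === "write" && call.path !== undefined) {
    // Only .sf/ writes pass. No path normalization is done here; a real
    // gate would need to resolve ".." segments before this check.
    return call.path === ".sf" || call.path.indexOf(".sf/") === 0;
  }
  return false;
}
```

A blocked call would then be logged and surfaced to the user instead of throwing, matching the last bullet above.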

## Protected Files

The following files require human review before any automated modification (per `docs/SPEC_FIRST_TDD.md`):

- `ADR-*.md` — architecture decision records
- `SPEC.md`, `ARCHITECTURE.md`, `AGENTS.md`
- `docs/SECURITY.md`, `docs/RELIABILITY.md`

SF will not autonomously overwrite these. Any proposed change to a protected file is surfaced as a diff for human acceptance.

## Secret Scanning

Pre-commit hook via `npm run secret-scan:install-hook`. Blocks commits containing patterns matching API keys, tokens, and credentials. Install with:

```bash
npm run secret-scan:install-hook
```

## Dependency Risk

- `npm audit` runs in CI on every push.
- No `--ignore-scripts` bypass: postinstall scripts are reviewed before adding new dependencies.
- Rust N-API bindings (`packages/native/`) undergo separate native-build review for ABI safety.

## Sandbox Model

SF agents execute inside the Pi RPC child process. The write gate and tool allowlist are the primary sandbox. There is no OS-level sandbox (no container or seccomp) in the default local deployment.

**Headless unsupervised mode** (`--no-supervised`): SF exits with code 10 (blocked) rather than auto-responding to any interactive tool call. This is the safe default for CI pipelines where no human is available to respond.

@@ -1,3 +1,32 @@
# Design Docs

Durable design decisions live here. Link active proposals, completed decisions, and rejected alternatives.
Durable design decisions live here. ADRs (Architecture Decision Records) are numbered sequentially
in `docs/dev/`. Lighter design docs (problem framing, event model decisions) live in this directory.

## Architecture Decision Records (`docs/dev/`)

| ADR | Title | Status |
|-----|-------|--------|
| [ADR-001](../dev/ADR-001-branchless-worktree-architecture.md) | Branchless Worktree Architecture — `.sf/milestones/` tracked, runtime gitignored | Accepted |
| [ADR-003](../dev/ADR-003-pipeline-simplification.md) | Pipeline Simplification — research merged into planning | Accepted |
| [ADR-004](../dev/ADR-004-capability-aware-model-routing.md) | Capability-Aware Model Routing | Accepted |
| [ADR-005](../dev/ADR-005-multi-model-provider-tool-strategy.md) | Multi-Model Provider Tool Strategy | Accepted |
| [ADR-007](../dev/ADR-007-model-catalog-split.md) | Model Catalog Split | Accepted |
| [ADR-008](../dev/ADR-008-sf-tools-over-mcp-for-provider-parity.md) | SF Tools over MCP for Provider Parity | Proposed — deferred (usage model mismatch) |
| [ADR-009](../dev/ADR-009-orchestration-kernel-refactor.md) | Orchestration Kernel Refactor | Accepted |
| [ADR-010](../dev/ADR-010-pi-clean-seam-architecture.md) | Pi Clean Seam Architecture | Accepted |
| [ADR-011](../dev/ADR-011-swarm-chat-and-debate-mode.md) | Swarm Chat and Debate Mode | Proposed |
| [ADR-012](../dev/ADR-012-multi-instance-federation.md) | Multi-Instance Federation | Proposed |
| [ADR-013](../dev/ADR-013-network-and-remote-execution.md) | Network and Remote Execution | Proposed |
| [ADR-014](../dev/ADR-014-singularity-knowledge-and-agent-platform.md) | Singularity Knowledge and Agent Platform | Proposed |
| [ADR-015](../dev/ADR-015-flight-recorder.md) | Flight Recorder | Proposed |
| [ADR-016](../dev/ADR-016-charm-ai-stack-adoption.md) | Charm AI Stack Adoption | Proposed |
| [ADR-017](../dev/ADR-017-charm-tui-client.md) | Charm TUI Client | Proposed |
| [ADR-018](../dev/ADR-018-repo-native-harness-evolution.md) | Repo-Native Harness Evolution | Proposed — staged impl |

## Design Docs (this directory)

| Doc | Title | Status |
|-----|-------|--------|
| [core-beliefs.md](./core-beliefs.md) | Core Beliefs | Accepted |
| [notification-event-model.md](./notification-event-model.md) | Notification Event Model | Draft |

@@ -1,6 +1,6 @@
# ADR-001: Branchless Worktree Architecture

**Status:** Proposed
**Status:** Accepted
**Date:** 2026-03-15
**Deciders:** Lex Christopherson
**Advisors:** Claude Opus 4.6, Gemini 2.5 Pro, GPT-5.4 (Codex)

@@ -1,3 +1,36 @@
# Active Execution Plans

Link active plans here. Each plan should state purpose, scope, tasks, acceptance criteria, and verification.
## ADR-018: Repo-Native Harness Evolution

**Purpose:** Make SF's harness mechanisms (verification gates, repo profiler, template kits, eval runner) useful in every repo SF works on, adapting over time as the repo changes shape.

**Scope:** Staged in 7 phases per ADR-018. Only phases 1–2 are in scope for near-term execution.

**Phase 1 — Repo profile snapshots (next)**
- Add read-only `RepoProfile` snapshot before each planning milestone
- Record observed (untracked) files in `.sf/sf.db` as `observed_only`
- No tracked repo file writes; no worker-prompt changes

**Phase 2 — Template kit registry and harness manifest**
- Parameterized harness template kit registry (Agent Runtime, RAG, Web App, Nix, Charm)
- Dry-run harness proposals as planning artifacts only — no tracked repo writes

**Acceptance criteria:** Phase 1 produces a repo profile snapshot in `.sf/sf.db` before every planning milestone. Phase 2 produces a dry-run harness proposal as a planning artifact viewable at milestone review.

**Falsifier:** If a planning milestone produces no repo profile entry in `.sf/sf.db`, Phase 1 is incomplete.

**Verification:** `node -e "require('./src/resources/extensions/sf/repo-profiler.js').buildRepoProfile(process.cwd()).then(p => console.log(JSON.stringify(p, null, 2)))"`

**ADR:** [ADR-018](../../dev/ADR-018-repo-native-harness-evolution.md)

---

## Notification Event Model Implementation

**Purpose:** Replace text-matching heuristics in `src/headless-events.ts` and `src/resources/extensions/sf/notification-overlay.ts` with structured `source`/`kind`/`blocking`/`dedupe_key` metadata on all inbound transcript events.

**Scope:** Propagate event metadata through all notification paths; update headless event parser to use structured fields; add deduplication by key instead of by text.

**Acceptance criteria:** `headless-events.ts` no longer uses string matching for event classification. Duplicate non-blocking workflow notices are collapsed by `dedupe_key`. A regression test asserts that automated notices cannot supersede the latest real user message.
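
Deduplication by key, as named in the acceptance criteria above, might look like the sketch below. The `source`, `kind`, `blocking`, and `dedupe_key` fields come from the text; the `text` field, the `NotificationEvent` and `collapseNotices` names, and the keep-first policy are assumptions.

```typescript
interface NotificationEvent {
  source: string;
  kind: string;
  blocking: boolean;
  dedupe_key: string;
  text: string; // assumed display field
}

// Collapses duplicate non-blocking notices by dedupe_key, keeping the
// first occurrence. Blocking notices always pass through untouched.
function collapseNotices(events: NotificationEvent[]): NotificationEvent[] {
  const seen = new Set<string>();
  const out: NotificationEvent[] = [];
  for (const event of events) {
    if (event.blocking) {
      out.push(event);
      continue;
    }
    if (seen.has(event.dedupe_key)) {
      continue; // same key, wording is irrelevant
    }
    seen.add(event.dedupe_key);
    out.push(event);
  }
  return out;
}
```

Because the comparison uses `dedupe_key` and never `text`, a wording change in a notice cannot alter how it is collapsed.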

**Design doc:** [notification-event-model.md](../../design-docs/notification-event-model.md) · [product spec](../../product-specs/notification-source-hygiene.md)

@@ -1,3 +1,59 @@
# Completed Execution Plans

Move finished plan summaries here with evidence links and follow-up debt.
## repo-vcs skill — 2026-05-01

**What shipped:** `repository-vcs-context.ts` detects Git vs Jujutsu and injects VCS guidance into the agent system prompt. `src/resources/skills/repo-vcs/` bundled skill for commit, push, and safe-push workflows. Skill trigger registered in `bootstrap/system-context.ts`.

**Evidence:** commit `a611cd579` — 18 files, 943 insertions.

**Follow-up debt:** None — no regressions in smoke tests.

---

## Autonomous workflow stabilization — 2026-05-01

**What shipped:** Major hardening pass on the auto-loop: crash recovery, stuck detection, production mutation approval gate, safe-smoke task LLM approval, headless source startup progress. See commit `12e7333f1`.

**Evidence:** All existing tests pass. Smoke tests: `sf --version`, `sf --help`, `sf --print` all pass.

**Follow-up debt:** Parallel milestone state locking (SQLite) deferred to ADR-018 Phase 1+.

---

## JSDoc Purpose/Consumer annotations — 2026-05-01

**What shipped:** `Purpose:` and `Consumer:` JSDoc annotations added to `app-paths.ts`, `bundled-extension-paths.ts`, `errors.ts`, `extension-discovery.ts`, `extension-registry.ts`, `headless-types.ts`, `headless.ts`, `traces.ts`. Fulfills SPEC_FIRST_TDD "JSDoc is the purpose" iron law for core modules.

**Evidence:** commit `a611cd579`.

---

## Pi clean seam architecture (ADR-010) — 2026-04

**What shipped:** Hard boundary between SF extension code and Pi SDK internals. SF extensions may only call the public Pi extension API; no direct access to Pi internals. Enforced via import rules.

**ADR:** [ADR-010](../../dev/ADR-010-pi-clean-seam-architecture.md)

---

## Branchless worktree architecture (ADR-001) — prior

**What shipped:** Git worktrees for milestone isolation without branch-per-milestone overhead. Each milestone executes in its own worktree; changes merge back to main on completion.

**ADR:** [ADR-001](../../dev/ADR-001-branchless-worktree-architecture.md)

---

## Pipeline simplification (ADR-003) — prior

**What shipped:** Research phase merged into planning; mechanical completion model for tasks that need no LLM judgment. Eliminated a redundant dispatch phase.

**ADR:** [ADR-003](../../dev/ADR-003-pipeline-simplification.md)

---

## Capability-aware model routing (ADR-004) — prior

**What shipped:** Routing from tier/cost selection to task-capability matching. Model selection considers tool requirements, vision, function-calling, and context size, not just cost tier.

**ADR:** [ADR-004](../../dev/ADR-004-capability-aware-model-routing.md)

@@ -1,3 +1,69 @@
# Tech Debt Tracker

Track cleanup discovered during implementation. Include owner, impact, proposed fix, and verification.
## Notification event classification — text matching only

**Impact:** `src/headless-events.ts` classifies events (blocked, milestone-ready, auto-stopped) by regex against stderr text. Fragile: any wording change in a notification breaks classification silently.

**Proposed fix:** Implement structured `source`/`kind`/`blocking` metadata per notification-event-model.md. Update headless event parser to use typed fields.

**Verification:** Remove string-match classifiers; confirm headless exit-code logic still triggers correctly via integration test.

**Tracked in:** [active/index.md — Notification Event Model](./active/index.md)

---

## MCP workflow mutations — read-only only

**Impact:** External providers (Claude Code CLI, remote orchestrators) that route through MCP can query SF state but cannot advance it. `sf_complete_task`, `sf_plan_milestone`, etc. are in-process only.

**Proposed fix:** Extract shared handlers from native tools; expose over MCP server (ADR-008 Phase 1).

**Verification:** Claude Code provider session completes a task via MCP `sf_complete_task` and produces identical STATE.md outcome.

**Tracked in:** [active/index.md — ADR-008 Phase 1](./active/index.md) · [ADR-008](../dev/ADR-008-IMPLEMENTATION-PLAN.md)

---

## No ADR template or generation recipes

**Impact:** Every ADR is hand-authored from scratch. No enforced schema means some ADRs omit falsifiers, status dates, or sequencing. No `just adr` recipe means the numbering is manual.

**Proposed fix:** Add `docs/dev/ADR-TEMPLATE.md` with required sections (Status, Date, Context, Options, Decision, Consequences, Falsifiers, Verification). Add `just adr <number> <slug>` recipe that stamps the template.

**Verification:** `just adr 019 my-decision` produces `docs/dev/ADR-019-my-decision.md` with all required section headings.
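
As a sketch of what the stamping step behind such a recipe might compute, the function below builds the target path and a skeleton with the required headings. `stampAdr` and `ADR_SECTIONS` are hypothetical; the three-digit number and kebab-case slug rules are assumptions inferred from the existing file names, and no file is written.

```typescript
// Required sections as listed in the proposed fix above.
const ADR_SECTIONS = [
  "Status",
  "Date",
  "Context",
  "Options",
  "Decision",
  "Consequences",
  "Falsifiers",
  "Verification",
];

// Returns the path and content for a new ADR; the caller decides how to write it.
function stampAdr(
  adrNumber: string,
  slug: string
): { path: string; content: string } {
  if (!/^\d{3}$/.test(adrNumber)) {
    throw new Error("ADR number must be three digits, e.g. 019");
  }
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug)) {
    throw new Error("slug must be lowercase kebab-case");
  }
  const title = `# ADR-${adrNumber}: ${slug}`;
  const body = ADR_SECTIONS.map((section) => `## ${section}\n`).join("\n");
  return {
    path: `docs/dev/ADR-${adrNumber}-${slug}.md`,
    content: `${title}\n\n${body}`,
  };
}
```

Validating the number and slug up front keeps malformed file names out of `docs/dev/`.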

---

## Parallel milestone state locking — file-based, ad-hoc

**Impact:** Concurrent milestone execution uses ad-hoc file locks on STATE.md and roadmap.md. Race condition possible under heavy parallelism; not blocking for serial execution (current default).

**Proposed fix:** SQLite database in `.sf/sf.db` with atomic transactions for all state mutations. Deferred to ADR-018 Phase 1+.

**Verification:** Two milestone workers simultaneously completing tasks produce consistent STATE.md with no lost updates.

**Tracked in:** [ADR-018](../dev/ADR-018-repo-native-harness-evolution.md) sequencing stage 1.

---

## write-gate BASH_READ_ONLY_RE — monolithic regex

**Location:** `src/resources/extensions/sf/bootstrap/write-gate.ts`

**Impact:** 30+ command patterns are encoded in a single 900-character regex. Adding a new safe command requires editing the regex inline, with no unit coverage per command. Risk of subtle regex alternation bugs.

**Proposed fix:** Replace with a data-driven allowlist (array of patterns with names and comments) that the gate compiles at startup. Each entry is individually testable.

**Verification:** `write-gate.test.ts` achieves per-command coverage without a single monolithic regex match test.
|
||||
|
||||
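A data-driven allowlist of the kind proposed above might look like the following sketch. The four entries, the `SHELL_META` rejection rule, and the names `compileAllowlist` and `matchReadOnly` are illustrative assumptions; they are not the contents of `BASH_READ_ONLY_RE`.

```typescript
// Hypothetical allowlist entries; the real gate covers 30+ commands.
interface AllowlistEntry {
  name: string;
  pattern: string; // regex source, anchored at compile time
  note: string;
}

interface CompiledEntry {
  name: string;
  regex: RegExp;
}

const READ_ONLY_ALLOWLIST: readonly AllowlistEntry[] = [
  { name: "ls", pattern: "ls(\\s.*)?", note: "directory listing" },
  { name: "cat", pattern: "cat\\s.+", note: "print file contents" },
  { name: "git-status", pattern: "git\\s+status(\\s.*)?", note: "working tree status" },
  { name: "git-log", pattern: "git\\s+log(\\s.*)?", note: "history inspection" },
];

// Assumption for this sketch: any chaining, piping, redirection, or
// substitution character disqualifies the command outright.
const SHELL_META = /[;&|<>`$]/;

// Compile once at startup so each entry is anchored and independently testable.
function compileAllowlist(entries: readonly AllowlistEntry[]): CompiledEntry[] {
  return entries.map((e) => ({
    name: e.name,
    regex: new RegExp(`^(?:${e.pattern})$`),
  }));
}

// Returns the name of the matching entry, or null if the command is not allowed.
function matchReadOnly(
  command: string,
  compiled: readonly CompiledEntry[],
): string | null {
  const trimmed = command.trim();
  if (SHELL_META.test(trimmed)) return null;
  const hit = compiled.find((e) => e.regex.test(trimmed));
  return hit ? hit.name : null;
}
```

Returning the entry name rather than a boolean lets a test assert which rule admitted a command, which is what per-command coverage needs. The blanket metacharacter rejection is deliberately conservative and would also block harmless pipelines.
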
---

## ADR-018 runtime — harness profiler not yet wired

**Impact:** `repo-profiler.ts` exists but is not called during any autonomous dispatch phase. The harness evolution system (ADR-018) exists only as design documentation; no runtime behavior has shipped.

**Proposed fix:** Wire `buildRepoProfile()` call into the pre-dispatch phase (Phase 1). Record result in `.sf/sf.db`.

**Verification:** After any planning milestone, `.sf/sf.db` contains a `repo_profiles` row for the current session.

**Tracked in:** [active/index.md — ADR-018 Phase 1](./active/index.md)

50
docs/records/2026-05-01-repo-vcs-and-notifications.md
Normal file
@@ -0,0 +1,50 @@
# Records Note — 2026-05-01

## What Changed

**commit `a611cd579`** — feat: introduce repo-vcs skill and add JSDoc annotations across core modules

- `src/resources/extensions/sf/repository-vcs-context.ts` — new: detects Git vs Jujutsu, builds VCS guidance block injected into system prompt
- `src/resources/skills/repo-vcs/` — new: bundled skill for commit, push, safe-push workflows
- `src/resources/extensions/sf/bootstrap/system-context.ts` — added `repo-vcs` to bundled skill trigger table; injects `repositoryVcsBlock` into system prompt
- `src/resources/extensions/sf/tests/repository-vcs-context.test.ts` — new: test suite for VCS context detection
- JSDoc `Purpose:` and `Consumer:` annotations added to: `app-paths.ts`, `bundled-extension-paths.ts`, `errors.ts`, `extension-discovery.ts`, `extension-registry.ts`, `headless-types.ts`, `headless.ts`, `traces.ts`
- `flake.nix` — added `just` to devShell
- `justfile` — new: build, test, typecheck, lint, sf recipes

**Notification specs drafted:**
- `docs/design-docs/notification-event-model.md` — design decision: structured source/kind/blocking/dedupe_key on all events
- `docs/product-specs/notification-source-hygiene.md` — product spec: separate user messages from automated notices

**Docs filled (previously placeholder):**
- `docs/design-docs/index.md` — ADR index
- `docs/PLANS.md` — active and upcoming work index
- `docs/exec-plans/active/index.md` — ADR-018, ADR-008, notification model
- `docs/exec-plans/completed/index.md` — repo-vcs, stabilization, JSDoc, ADR-001/003/004/010
- `docs/exec-plans/tech-debt-tracker.md` — 6 known items
- `docs/RELIABILITY.md` — exit codes, failure modes, recovery paths, observability
- `docs/SECURITY.md` — auth model, write gate, protected files, secret scan
- `docs/DESIGN.md` — TUI line-width rule, overlays, theming, IME, performance
- `docs/PRODUCT_SENSE.md` — product thesis, user goals, non-goals, tradeoffs
- `docs/QUALITY_SCORE.md` — thresholds, fast/slow checks, known blind spots
- `docs/records/index.md` — this index

## What Canonical Docs Were Updated

- `docs/design-docs/index.md` — now indexes all 18 ADRs and 2 design docs
- `docs/PLANS.md` — now reflects active initiatives and upcoming work
- All exec-plan index files — now have real content

## Contradictions Found

- ADR-008 (SF tools over MCP) is marked "Accepted — impl in progress" but the user has clarified that SF is the only runtime in use; Claude Code is used as an external dev assistant, not as a provider inside SF. ADR-008's premise (provider parity for Claude Code CLI as a Pi provider) may not apply to the current usage model. Needs clarification.

- `docs/design-docs/` and `docs/dev/ADR-*.md` are split across two directories. The design-docs folder has 2 files; 18 ADRs live in dev/. This split is navigable with the index but worth consolidating eventually.

## What Remains Unresolved

- ADR-008 relevance: does exposing workflow mutations over MCP make sense if SF is always the sole runtime?
- ADR-018 Phase 1 (repo profiler wired into dispatch) is not yet started
- Notification event model implementation (Phase 2 of the spec) is not yet started
- No ADR template or `just adr` recipe
- `write-gate.ts` BASH_READ_ONLY_RE monolithic regex not yet refactored

@@ -1,3 +1,9 @@
 # Records
 
-This folder holds repo-memory audits, decision ledgers, context-gardening notes, and records-keeper outputs.
+Repo-memory audits, decision ledgers, context-gardening notes, and records-keeper outputs. Each entry is a dated note describing what changed, what canonical docs were updated, and what remains unresolved.
+
+## Index
+
+| Date | Note | Summary |
+|------|------|---------|
+| 2026-05-01 | [repo-vcs and notifications](./2026-05-01-repo-vcs-and-notifications.md) | repo-vcs skill landed; notification specs drafted; JSDoc annotations added; placeholder docs filled |

@@ -47,17 +47,29 @@ const SF_RUNTIME_PATTERNS = [
 ] as const;
 
 /**
- * SF-specific runtime exclusion patterns. These live in .git/info/exclude
- * (per-clone, never committed) instead of .gitignore so that:
- * - Re-running sf doesn't dirty the working tree on every invocation
- * - The project's .gitignore stays human-curated (sf doesn't own it)
- * - User-equivalent patterns like `/.sf` (root-only) coexist without
- *   triggering naive duplicate-add since we don't touch .gitignore at all
- *   for these.
+ * SF runtime exclusion patterns for repos where .sf/ is a LOCAL DIRECTORY.
+ * Granular so that durable planning artifacts (.sf/milestones/, .sf/PROJECT.md,
+ * .sf/DECISIONS.md) remain trackable in git per ADR-001.
  *
- * Migrated out of BASELINE_PATTERNS on 2026-04-29.
+ * NOT used when .sf/ is a symlink — symlinks need the blanket SF_SYMLINK_EXCLUSION_PATTERNS
+ * because git cannot traverse symlinks to match per-file patterns.
+ *
+ * Migrated from blanket `.sf` on 2026-05-01 to implement ADR-001.
+ * Previously migrated out of BASELINE_PATTERNS into .git/info/exclude on 2026-04-29.
  */
-const SF_RUNTIME_EXCLUSION_PATTERNS = [".sf", ".sf-id", ".bg-shell/"] as const;
+const SF_RUNTIME_EXCLUSION_PATTERNS: readonly string[] = [
+  ".sf-id",
+  ".bg-shell/",
+  ...SF_RUNTIME_PATTERNS,
+];
+
+/**
+ * SF exclusion patterns for repos where .sf/ is a SYMLINK (external state).
+ * Git sees the symlink as an opaque file and cannot traverse it, so granular
+ * patterns like .sf/activity/ would never match. The blanket .sf pattern
+ * excludes the symlink itself.
+ */
+const SF_SYMLINK_EXCLUSION_PATTERNS = [".sf", ".sf-id", ".bg-shell/"] as const;
 
 const BASELINE_PATTERNS = [
   // SF-specific patterns now live in SF_RUNTIME_EXCLUSION_PATTERNS, applied

@@ -216,25 +228,56 @@ export function ensureGitInfoExclude(basePath: string): boolean {
     ? readFileSync(excludePath, "utf-8")
     : "";
 
-  const existingLines = new Set(
-    existing
-      .split("\n")
-      .map((l) => l.trim())
-      .filter((l) => l && !l.startsWith("#")),
-  );
-  const missing = SF_RUNTIME_EXCLUSION_PATTERNS.filter(
-    (p) => !existingLines.has(p),
-  );
-  if (missing.length === 0) return false;
+  // Determine whether .sf is a symlink (external state) or a local directory.
+  // Symlink: git cannot traverse it, so only the blanket .sf pattern works.
+  // Directory: use granular patterns so .sf/milestones/ and other durable
+  // planning artifacts can be tracked per ADR-001.
+  const sfIsSymlink = (() => {
+    const localSf = join(basePath, ".sf");
+    try {
+      return existsSync(localSf) && lstatSync(localSf).isSymbolicLink();
+    } catch {
+      return false;
+    }
+  })();
 
-  const block = [
-    "",
-    "# ── SF runtime exclusion (managed by sf, per-clone) ──",
-    ...missing,
-    "",
-  ].join("\n");
-  const prefix = existing && !existing.endsWith("\n") ? "\n" : "";
-  writeFileSync(excludePath, existing + prefix + block, "utf-8");
+  const targetPatterns: readonly string[] = sfIsSymlink
+    ? SF_SYMLINK_EXCLUSION_PATTERNS
+    : SF_RUNTIME_EXCLUSION_PATTERNS;
+
+  // Patterns to remove: entries only the OTHER mode would have written (shared ones stay).
+  // This handles transitions (symlink↔directory) by cleaning up stale entries.
+  const stalePatterns = sfIsSymlink
+    ? SF_RUNTIME_EXCLUSION_PATTERNS
+    : SF_SYMLINK_EXCLUSION_PATTERNS;
+
+  const existingLines = existing.split("\n").map((l) => l.trim());
+  const existingSet = new Set(
+    existingLines.filter((l) => l && !l.startsWith("#")),
+  );
+
+  const missing = targetPatterns.filter((p) => !existingSet.has(p));
+  const toRemove = new Set(stalePatterns.filter((p) => existingSet.has(p) && !targetPatterns.includes(p)));
+
+  if (missing.length === 0 && toRemove.size === 0) return false;
+
+  let content = existing
+    .split("\n")
+    .filter((l) => !toRemove.has(l.trim()))
+    .join("\n");
+
+  if (missing.length > 0) {
+    const block = [
+      "",
+      "# ── SF runtime exclusion (managed by sf, per-clone) ──",
+      ...missing,
+      "",
+    ].join("\n");
+    const prefix = content && !content.endsWith("\n") ? "\n" : "";
+    content = content + prefix + block;
+  }
+
+  writeFileSync(excludePath, content, "utf-8");
   return true;
 }
 
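The reconcile step in `ensureGitInfoExclude()` can be exercised without touching the filesystem if it is viewed as a pure function over the file's text. The sketch below is an assumption-laden restatement, not the code in `gitignore.ts`: the name `reconcileExclude` and the shortened header comment are hypothetical. Because the two pattern lists share `.sf-id` and `.bg-shell/`, the sketch only removes a stale entry when the target list does not also contain it; otherwise shared entries would be dropped on one run and re-added on the next.

```typescript
// Returns the rewritten exclude-file text, or null when nothing needs to change.
function reconcileExclude(
  existing: string,
  target: readonly string[],
  stale: readonly string[],
): string | null {
  const lines = existing.split("\n");
  // Active (non-blank, non-comment) entries currently in the file.
  const present = new Set(
    lines.map((l) => l.trim()).filter((l) => l && !l.startsWith("#")),
  );

  const missing = target.filter((p) => !present.has(p));
  // Drop only entries the other mode wrote that the target mode does not need.
  const toRemove = new Set(
    stale.filter((p) => present.has(p) && !target.includes(p)),
  );
  if (missing.length === 0 && toRemove.size === 0) return null;

  let content = lines.filter((l) => !toRemove.has(l.trim())).join("\n");
  if (missing.length > 0) {
    const prefix = content && !content.endsWith("\n") ? "\n" : "";
    // Shortened, hypothetical header line for the managed block.
    content += prefix + ["", "# managed by sf", ...missing, ""].join("\n");
  }
  return content;
}
```

Calling it twice with the same arguments should return `null` the second time, and swapping the `target` and `stale` arguments models a symlink-to-directory transition or the reverse.
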