Repository Guidelines
Setup Checklist for New Contributors
- Install dev dependencies: npm install
- Install pre-commit hooks: npm run secret-scan:install-hook
- Apply GitHub labels: gh label create priority/P0 --color B60205 --description "Critical" (see .github/labels.yml for full list)
- Verify devcontainer: devcontainer build --workspace-folder .
- Run first tech-debt scan: node scripts/tech-debt-scan.mjs
Purpose-First Doctrine
SF follows spec-first TDD: see docs/SPEC_FIRST_TDD.md for the full constitution.
SF's foundational architecture decision is ADR-0000: SF Is a Purpose-to-Software Compiler.
Treat this as the product contract for all planning and implementation:
- capture bounded intent
- translate intent into the eight PDD fields
- research missing context and name assumptions
- apply autonomy policy from confidence, risk, reversibility, blast radius, cost, legal/compliance scope, and production/customer impact
- generate milestone/slice/task contracts from structured state
- write failing tests or executable evidence before implementation
- implement the smallest code change that satisfies the contract
- verify, record evidence, retain useful memory, and continue
Iron Law:
THE TEST IS THE SPEC. THE JSDOC IS THE PURPOSE. CODE EXISTS TO FULFILL PURPOSE.
NO BEHAVIOR CHANGE WITHOUT A FAILING TEST FIRST.
NO COMPLETION WITHOUT A REAL CONSUMER.
NO JUDGMENT CALL WITHOUT A CONFIDENCE AND FALSIFIER.
Every artifact (slice plan, task plan, function, test, ADR) must answer:
- why this behaviour exists
- what value it creates or protects
- who uses it in production (real consumer, not just tests)
- what breaks if it returns the wrong answer
If any answer is missing: BLOCKED: purpose unclear — [field]. Surfacing the gap beats rationalising past it.
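The four-question check above can be sketched as a small gate. This is a minimal illustration, not SF's implementation: the interface name and the field names `why`, `value`, `consumer`, and `failureCost` are hypothetical labels for the four questions, and formatting the BLOCKED message is left to the caller.

```typescript
// Hypothetical shape: these field names are illustrative labels for the four
// questions, not identifiers taken from the SF codebase.
interface PurposeAnswers {
  why?: string; // why this behaviour exists
  value?: string; // what value it creates or protects
  consumer?: string; // who uses it in production
  failureCost?: string; // what breaks if it returns the wrong answer
}

const PURPOSE_FIELDS = ["why", "value", "consumer", "failureCost"] as const;

type PurposeField = (typeof PURPOSE_FIELDS)[number];

// Return the fields that are absent or blank, in question order.
function missingPurposeFields(answers: PurposeAnswers): PurposeField[] {
  return PURPOSE_FIELDS.filter((field) => {
    const answer = answers[field];
    return answer === undefined || answer.trim() === "";
  });
}

// An artifact is blocked as soon as one answer is missing.
function isBlocked(answers: PurposeAnswers): boolean {
  return missingPurposeFields(answers).length > 0;
}
```

Returning the list of missing fields, rather than a bare boolean, lets the caller name the exact gap in the BLOCKED line.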
Project Structure
This is a TypeScript monorepo with npm workspaces. The main entry point is dist/loader.js (bin: sf).
- src/: Main CLI source (sf-run core, extensions, agents)
- packages/: Workspace packages (8 total): pi-tui, pi-ai, pi-agent-core, pi-coding-agent, daemon, mcp-server, native, rpc-client
- web/: Next.js web frontend (optional web host mode)
- rust-engine/: Rust N-API bindings for performance-critical operations
- scripts/: Build, dev, release, and CI helper scripts
- tests/: Fixtures, smoke tests, live tests, live-regression tests
- docs/: User guides and developer documentation
- docker/: Docker sandbox and builder configurations
Build, Test, and Development Commands
# Full build (core + web)
npm run build
# Build core only (packages + tsc + resources)
npm run build:core
# Dev mode with hot reload
npm run dev
# Run all tests (unit + integration)
npm test
# Unit tests only
npm run test:unit
# Integration tests only
npm run test:integration
# Coverage check (Vitest V8 provider; thresholds: statements 40%, lines 40%, branches 20%, functions 20%)
npm run test:coverage
# Type check extensions (no emit)
npm run typecheck:extensions
# Native Rust build
npm run build:native
# Root lint checks (Biome over src/)
npm run lint
npm run lint:fix
# Web lint (Next.js ESLint; separate package)
npm --prefix web run lint
# Release workflow (changelog + version bump)
npm run release:changelog
npm run release:bump
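For orientation, the coverage thresholds quoted in the commands above (statements 40%, lines 40%, branches 20%, functions 20%) would typically be expressed in a Vitest config roughly as follows. This is a sketch under assumptions: the repository's actual config file and its exact option layout are not shown here, and older Vitest versions place the threshold keys differently.

```typescript
// Sketch of the coverage section of a vitest.config.ts; the real config may differ.
const coverage = {
  provider: "v8" as const,
  thresholds: {
    statements: 40,
    lines: 40,
    branches: 20,
    functions: 20,
  },
};

// In vitest.config.ts, an object of this shape would be the default export.
const config = { test: { coverage } };
```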
Coding Style & Naming Conventions
- Language: TypeScript with "strict": true enabled in all packages
- Module resolution: NodeNext
- Target: ES2022
- Package manager: npm (canonical; do not commit bun.lock or pnpm-lock.yaml)
- Commit format: Conventional Commits enforced via commit-msg hook
- Branch naming: <type>/<short-description>, e.g. feat/new-command, fix/login-bug
  - Types: feat, fix, docs, chore, refactor, test, infra, ci, perf, build, revert
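The branch naming rule can be checked mechanically. The sketch below is illustrative only: the type list comes from the convention above, but the restriction of the short description to lowercase kebab-case is an assumption inferred from the two examples, and `isValidBranchName` is a hypothetical helper, not part of the repository.

```typescript
// Branch types listed in the naming convention.
const BRANCH_TYPES = [
  "feat", "fix", "docs", "chore", "refactor", "test",
  "infra", "ci", "perf", "build", "revert",
] as const;

// Assumption: short descriptions are lowercase kebab-case, as in the examples.
const BRANCH_PATTERN = new RegExp(
  `^(${BRANCH_TYPES.join("|")})/[a-z0-9]+(?:-[a-z0-9]+)*$`,
);

// True when the name has the form <type>/<short-description>.
function isValidBranchName(name: string): boolean {
  return BRANCH_PATTERN.test(name);
}
```

With this pattern, `feat/new-command` and `fix/login-bug` pass, while `feature/new-command` fails because `feature` is not a listed type.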
JSDoc Purpose Convention
Every exported function, type, class, and module-level constant opens with a JSDoc block whose first sentence is its purpose — the consumer-facing reason it exists. Not what it does (the signature shows that), but why.
/**
* Acquire a unit claim atomically. Returns true on success, false if another worker
* already holds an unexpired lease.
*
* Purpose: prevent two workers from dispatching the same unit when the run-lock is
* unavailable (shared NFS, broken filesystem semantics) — the conditional UPDATE in
* SQLite is the safety net.
*
* Consumer: auto-dispatch.ts when picking the next eligible unit per poll tick.
*/
export function claimUnit(unitId: string, leaseMs: number): boolean { ... }
Required for every exported symbol whose behaviour is non-trivial:
- First line — what it returns / does, in the present tense.
- Purpose: — why it exists; the value it protects.
- Consumer: — who calls it in production. If you can't name a consumer, the symbol shouldn't exist yet.
A bare /** Helper. */ is a code smell. Either write the purpose or delete the symbol.
For module-level JSDoc (file headers): keep the existing module-name.ts — short description opening, then a Purpose: line stating why the module exists as a separable unit.
Testing Guidelines
- Primary test runner: Vitest via npm run test:unit, npm run test:integration, and npm test
- Node test runner: used only by specific package/native/browser-tool scripts where package.json says node --test
- Coverage tool: Vitest coverage with @vitest/coverage-v8; thresholds are enforced in CI
- Naming: *.test.ts and *.test.mjs patterns
- Smoke tests: npm run test:smoke
- Live tests: npm run test:live (requires environment variables)
Purposeful Tests
Test names are contract claims. Use the form <what>_<when>_<expected>:
| Good | Bad |
|---|---|
| claim_when_lease_expired_returns_true | test claim |
| dispatch_when_blocker_unresolved_skips_unit | test dispatch logic |
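To make "test names are contract claims" concrete, here is a dependency-free sketch. Everything in it is hypothetical: `LeaseTable` is an in-memory stand-in, not the SQLite-backed `claimUnit`, and a real suite would register these names with Vitest rather than a plain object.

```typescript
// Hypothetical in-memory stand-in for a lease table; not the real claimUnit.
interface Lease {
  expiresAt: number; // epoch milliseconds
}

class LeaseTable {
  private readonly leases = new Map<string, Lease>();

  // Grant the claim unless an unexpired lease already exists for the unit.
  claim(unitId: string, leaseMs: number, now: number): boolean {
    const current = this.leases.get(unitId);
    if (current !== undefined && current.expiresAt > now) {
      return false;
    }
    this.leases.set(unitId, { expiresAt: now + leaseMs });
    return true;
  }
}

// Each key is a contract claim in <what>_<when>_<expected> form; each value checks it.
const contracts: Record<string, () => boolean> = {
  claim_when_unit_unclaimed_returns_true: () =>
    new LeaseTable().claim("unit-1", 1000, 0) === true,
  claim_when_lease_active_returns_false: () => {
    const table = new LeaseTable();
    table.claim("unit-1", 1000, 0);
    return table.claim("unit-1", 1000, 500) === false;
  },
  claim_when_lease_expired_returns_true: () => {
    const table = new LeaseTable();
    table.claim("unit-1", 1000, 0);
    return table.claim("unit-1", 1000, 1000) === true;
  },
};

// Names of contracts that do not hold; an empty list means every claim is true.
function failedContracts(): string[] {
  return Object.entries(contracts)
    .filter(([, check]) => !check())
    .map(([name]) => name);
}
```

Read aloud, each name states what the consumer can rely on, so a failure report is already a sentence about broken behaviour rather than a pointer to a test file.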
Three-tier organisation:
- Behaviour contracts (primary): what the consumer receives. The spec. A different implementation that passes these is equally correct.
- Degradation contracts: what happens when dependencies fail. Consumer must always get a useful response; failure must degrade, not crash.
- Implementation guards (secondary, labelled with a // guard: comment): protect specific failure modes (resource leaks, infinite loops). Refactors update guards, not behaviour contracts.
Write behaviour contracts first. They are the work order.
A test that asserts call counts or mock interactions is mechanical, not purposeful — it should be a labelled implementation guard, not a primary contract test. A test that breaks on a refactor without behaviour change is mechanical too. Fix the test or relabel it.
Bug = missing correct-behaviour test. When fixing a bug, write a test for the correct behaviour first — it must fail (RED) because the bug exists. If it passes immediately, the test is testing the broken behaviour; fix the test, not the code.
Extension Development
Extensions live in src/resources/extensions/. Each extension should:
- Export a manifest with name, version, tools[], and agents[]
- Include tests in src/resources/extensions/<name>/tests/
- Register tools via the extension API
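As a rough picture of the manifest requirement, the sketch below types the four keys named above and checks for them. Only `name`, `version`, `tools`, and `agents` come from these guidelines; the `ToolEntry` and `AgentEntry` shapes, the example values, and `manifestProblems` are assumptions for illustration, and the real extension API may look different.

```typescript
// Hypothetical entry shapes: the guidelines name the four manifest keys only.
interface ToolEntry {
  name: string;
  description: string;
}

interface AgentEntry {
  name: string;
}

interface ExtensionManifest {
  name: string;
  version: string;
  tools: ToolEntry[];
  agents: AgentEntry[];
}

// Illustrative manifest with one tool and no agents.
const manifest: ExtensionManifest = {
  name: "example-extension",
  version: "0.1.0",
  tools: [{ name: "example_tool", description: "Illustrative tool entry." }],
  agents: [],
};

// Report which of the four required keys are missing or have the wrong basic type.
function manifestProblems(
  candidate: Partial<Record<keyof ExtensionManifest, unknown>>,
): string[] {
  const problems: string[] = [];
  if (typeof candidate.name !== "string") problems.push("name");
  if (typeof candidate.version !== "string") problems.push("version");
  if (!Array.isArray(candidate.tools)) problems.push("tools");
  if (!Array.isArray(candidate.agents)) problems.push("agents");
  return problems;
}
```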
Pull Request Guidelines
- Link an issue — PRs without a linked issue will be closed without review
- One concern per PR — don't bundle unrelated changes
- No drive-by formatting — don't reformat code you didn't touch
- CI must pass — fix failing tests before requesting review
- Rebase onto main — do not merge main into your feature branch
- Use the PR template at .github/PULL_REQUEST_TEMPLATE.md
Environment Setup
Copy docker/.env.example to .env and fill in API keys. At minimum you need one LLM provider key (Anthropic, OpenAI, Google, or OpenRouter).
Architecture Notes
- State lives on disk in .sf/; no in-memory state survives across sessions
- Bundled extensions/agents sync to ~/.sf/agent/ on every launch
- LLM providers are lazy-loaded on first use to reduce cold-start time
- Native Rust engine handles grep, glob, ps, highlight, ast, diff
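The lazy-loading note above describes a common pattern: defer an expensive load until first use, then reuse the result. A minimal sketch, assuming nothing about SF's actual provider registry (`Provider`, `lazy`, and `getExampleProvider` are hypothetical names):

```typescript
// Hypothetical provider shape; not the SF provider API.
interface Provider {
  id: string;
}

// Counts how many times the loader body actually runs.
let loadCount = 0;

// Wrap a loader so it runs at most once, on first use.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}

const getExampleProvider = lazy<Provider>(async () => {
  loadCount += 1; // stands in for an expensive dynamic import()
  return { id: "example" };
});
```

Caching the promise rather than the resolved value means concurrent first callers share one load. The trade-off is that a rejected load is cached too, so a production version would likely clear the cache on failure.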
SF Planning State
.sf/ is the canonical home for SF agent state. It contains milestone plans, slice plans, task plans, and ephemeral working files under .sf/milestones/, .sf/STATE.md, .sf/QUEUE.md, and related artifacts.
Promote-only rule: Agent state (the .sf/ directory under ~/.sf/projects/<hash>/) is transient and gitignored — never committed directly. Project state (.sf/ tracked in the repo root) contains only human-authored artifacts such as DECISIONS.md, KNOWLEDGE.md, REQUIREMENTS.md, ROADMAP.md, and STATE.md.
Promoted artifacts — milestone summaries, architecture decision records (ADRs), and durable specifications — belong in tracked documentation directories:
- docs/plans/: reviewed implementation plans promoted from .sf/ milestone planning
- docs/adr/: accepted architectural decisions promoted from .sf/DECISIONS.md
- docs/specs/: long-lived behavior contracts and API specifications
Naming conventions:
- Milestone IDs: M001, M002, …
- Slice IDs: S01, S02, …
- Task IDs: T01, T02, …
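The ID scheme amounts to a prefix plus a zero-padded counter. The helper below is a hypothetical sketch: it assumes a fixed width of three digits for milestones and two for slices and tasks, as the examples suggest, and leaves counters beyond those widths unpadded because the convention does not define that case.

```typescript
// Zero-pad a positive integer and prefix it: M uses three digits, S and T use two.
function formatId(prefix: "M" | "S" | "T", n: number): string {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError(`ID counter must be a positive integer, got ${n}`);
  }
  const width = prefix === "M" ? 3 : 2;
  return prefix + String(n).padStart(width, "0");
}

// Patterns matching the three ID forms shown in the naming conventions.
const ID_PATTERNS = {
  M: /^M\d{3}$/,
  S: /^S\d{2}$/,
  T: /^T\d{2}$/,
};
```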
Commands:
- sf plan promote <source>: copy a file from .sf/ to docs/plans/, docs/adr/, or docs/specs/
- sf plan list: list milestone and slice files in .sf/
- sf plan diff: compare .sf/ state with promoted docs/ artifacts
See docs/plans/README.md, docs/adr/README.md, and docs/specs/README.md for directory-specific conventions.
Eval Dump Inbox
SF/Pi automatically loads AGENTS.md and CLAUDE.md from the repo tree at
startup. It does not automatically load TODO.md, but this repo uses root
TODO.md as a temporary human dump inbox for eval and self-evolution ideas.
When a repo contains a root TODO.md, treat it as a temporary dump inbox and
read it before planning substantive work in that repo. This applies even when
the user does not explicitly mention evals. Treat the Raw Dump Inbox section
as untriaged source material, not as durable instructions. Triage it into
reviewable artifacts: concrete eval cases, harness gaps, memory extraction
requirements, docs, tests, or follow-up implementation tasks. After triage,
remove the processed dump notes from TODO.md so the file returns to an empty
inbox/template state. Do not treat dumped notes as runtime memory or approved
behavior until they are converted into tested, versioned project artifacts.
CI/CD
- ci.yml: builds, tests, gates merges to main
- pipeline.yml: three-stage release (dev → test → prod)
- pr-risk.yml: PR risk classification
- ai-triage.yml: AI-based issue/PR triage
Code Quality Tooling
The repository uses the following quality tools:
- Biome: root source linting via npm run lint and autofix via npm run lint:fix
  - Scope: src/ plus versioned JSON checks
  - Config: biome.json
  - Format touched files with npx biome check --write <paths>; full-repo formatting is not the current CI gate.
- ESLint: web app linting via npm --prefix web run lint
  - Scope: web/
  - Config: web/eslint.config.mjs
- TypeScript: strict mode enabled; run npm run typecheck:extensions
- Knip: detects unused code and dependencies; run npx knip (config at knip.json)
- jscpd: detects duplicate code; run npx jscpd (config at .jscpd.json)
- Tech Debt Scanner: node scripts/tech-debt-scan.mjs
  - Tracks TODO/FIXME/HACK/XXX counts against thresholds
- Secret Scan: npm run secret-scan (pre-commit hook available via npm run secret-scan:install-hook)
- Coverage: npm run test:coverage (Vitest V8 coverage with 40/40/20/20 thresholds)
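To illustrate what "tracks TODO/FIXME/HACK/XXX counts against thresholds" means in practice, here is a small sketch. It is not the logic of scripts/tech-debt-scan.mjs, whose implementation and threshold values are not shown here; `countMarkers`, `overThreshold`, and the whole-word matching rule are assumptions.

```typescript
// Markers named in the tooling list above.
const MARKERS = ["TODO", "FIXME", "HACK", "XXX"] as const;

type Marker = (typeof MARKERS)[number];

// Count whole-word occurrences of each marker in a piece of source text.
function countMarkers(source: string): Record<Marker, number> {
  const counts: Record<Marker, number> = { TODO: 0, FIXME: 0, HACK: 0, XXX: 0 };
  for (const marker of MARKERS) {
    const matches = source.match(new RegExp(`\\b${marker}\\b`, "g"));
    counts[marker] = matches ? matches.length : 0;
  }
  return counts;
}

// List the markers whose count exceeds its threshold; markers without a threshold are ignored.
function overThreshold(
  counts: Record<Marker, number>,
  thresholds: Partial<Record<Marker, number>>,
): Marker[] {
  return MARKERS.filter((marker) => {
    const limit = thresholds[marker];
    return limit !== undefined && counts[marker] > limit;
  });
}
```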
Dev Container
A Dev Container configuration is available at .devcontainer/devcontainer.json.
Open the repository in VS Code with the Dev Containers extension, or run:
devcontainer up --workspace-folder .
The container includes Node 24, Rust, GitHub CLI, Docker-in-Docker, and recommended VS Code extensions.
Dependency Updates
Dependabot is configured at .github/dependabot.yml for:
- Root npm dependencies (weekly, grouped by ecosystem)
- Web app dependencies (weekly)
- GitHub Actions (weekly)
Issue Labels
Label definitions are at .github/labels.yml. Apply labels using:
# Create a single label
gh label create priority/P0 --color B60205 --description "Critical — blocks release"
# Or use a label management action in CI