singularity/singularity-forge

Mikael Hugo 3960e42b26 docs: align sf purpose doctrine and docs

2026-05-06 00:38:36 +02:00

Repository Guidelines

Setup Checklist for New Contributors

Install dev dependencies: npm install
Install pre-commit hooks: npm run secret-scan:install-hook
Apply GitHub labels: gh label create priority/P0 --color B60205 --description "Critical" (see .github/labels.yml for full list)
Verify devcontainer: devcontainer build --workspace-folder .
Run first tech-debt scan: node scripts/tech-debt-scan.mjs

Purpose-First Doctrine

SF follows spec-first TDD: see docs/SPEC_FIRST_TDD.md for the full constitution.

SF's foundational architecture decision is ADR-0000: SF Is a Purpose-to-Software Compiler. Treat this as the product contract for all planning and implementation:

capture bounded intent
translate intent into the eight PDD fields
research missing context and name assumptions
apply autonomy policy from confidence, risk, reversibility, blast radius, cost, legal/compliance scope, and production/customer impact
generate milestone/slice/task contracts from structured state
write failing tests or executable evidence before implementation
implement the smallest code change that satisfies the contract
verify, record evidence, retain useful memory, and continue

Iron Law:

THE TEST IS THE SPEC.  THE JSDOC IS THE PURPOSE.  CODE EXISTS TO FULFILL PURPOSE.

NO BEHAVIOR CHANGE WITHOUT A FAILING TEST FIRST.
NO COMPLETION WITHOUT A REAL CONSUMER.
NO JUDGMENT CALL WITHOUT A CONFIDENCE AND FALSIFIER.

Every artifact (slice plan, task plan, function, test, ADR) must answer:

why this behaviour exists
what value it creates or protects
who uses it in production (real consumer, not just tests)
what breaks if it returns the wrong answer

If any answer is missing: BLOCKED: purpose unclear — [field]. Surfacing the gap beats rationalising past it.
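The four questions and the BLOCKED rule can be read as a small gate. The following is a minimal sketch, not SF's actual implementation: the `PurposeAnswers` shape, the field names, and `checkPurpose` are hypothetical, and only the message format is copied from the rule above.

```typescript
// Hypothetical shape: one optional answer per purpose question.
interface PurposeAnswers {
  why?: string;      // why this behaviour exists
  value?: string;    // what value it creates or protects
  consumer?: string; // who uses it in production
  failure?: string;  // what breaks if it returns the wrong answer
}

// Order mirrors the list of questions in the doctrine.
const PURPOSE_FIELDS = ["why", "value", "consumer", "failure"] as const;

// Returns a BLOCKED message naming the first unanswered field,
// or null when all four questions have a non-empty answer.
function checkPurpose(answers: PurposeAnswers): string | null {
  for (const field of PURPOSE_FIELDS) {
    const answer = answers[field];
    if (answer === undefined || answer.trim() === "") {
      return `BLOCKED: purpose unclear — ${field}`;
    }
  }
  return null;
}
```

Returning the message instead of throwing keeps the gap visible to whoever reviews the artifact, which matches the intent of surfacing the gap.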

Project Structure

This is a TypeScript monorepo with npm workspaces. The main entry point is dist/loader.js (bin: sf).

src/ — Main CLI source (sf-run core, extensions, agents)
packages/ — Workspace packages (8 total): pi-tui, pi-ai, pi-agent-core, pi-coding-agent, daemon, mcp-server, native, rpc-client
web/ — Next.js web frontend (optional web host mode)
rust-engine/ — Rust N-API bindings for performance-critical operations
scripts/ — Build, dev, release, and CI helper scripts
tests/ — Fixtures, smoke tests, live tests, live-regression tests
docs/ — User guides and developer documentation
docker/ — Docker sandbox and builder configurations

Build, Test, and Development Commands

# Full build (core + web)
npm run build

# Build core only (packages + tsc + resources)
npm run build:core

# Dev mode with hot reload
npm run dev

# Run all tests (unit + integration)
npm test

# Unit tests only
npm run test:unit

# Integration tests only
npm run test:integration

# Coverage check (Vitest V8 provider; thresholds: statements 40%, lines 40%, branches 20%, functions 20%)
npm run test:coverage

# Type check extensions (no emit)
npm run typecheck:extensions

# Native Rust build
npm run build:native

# Root lint checks (Biome over src/)
npm run lint
npm run lint:fix

# Web lint (Next.js ESLint; separate package)
npm --prefix web run lint

# Release workflow (changelog + version bump)
npm run release:changelog
npm run release:bump

Coding Style & Naming Conventions

Language: TypeScript with "strict": true enabled in all packages
Module resolution: NodeNext
Target: ES2022
Package manager: npm (canonical; do not commit bun.lock or pnpm-lock.yaml)
Commit format: Conventional Commits enforced via commit-msg hook
Branch naming: <type>/<short-description> — e.g. feat/new-command, fix/login-bug
- Types: feat, fix, docs, chore, refactor, test, infra, ci, perf, build, revert
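As an illustration of the branch naming rule, here is a minimal sketch of a validator. The type list comes from the guidelines; the function name `isValidBranchName` is hypothetical, and the description pattern (lowercase kebab-case) is an assumption inferred from the two examples, since the guidelines do not spell it out.

```typescript
// Branch types listed in the guidelines.
const BRANCH_TYPES = [
  "feat", "fix", "docs", "chore", "refactor", "test",
  "infra", "ci", "perf", "build", "revert",
] as const;

// Assumption: the short description is lowercase kebab-case.
const DESCRIPTION_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Checks a branch name against <type>/<short-description>.
function isValidBranchName(branch: string): boolean {
  const slash = branch.indexOf("/");
  if (slash === -1) return false;
  const type = branch.slice(0, slash);
  const description = branch.slice(slash + 1);
  return (
    (BRANCH_TYPES as readonly string[]).includes(type) &&
    DESCRIPTION_PATTERN.test(description)
  );
}
```

With this sketch, `feat/new-command` and `fix/login-bug` pass, while `feature/new-command` and a bare `main` do not.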

JSDoc Purpose Convention

Every exported function, type, class, and module-level constant opens with a JSDoc block that states its purpose: the consumer-facing reason it exists. The summary line says what the symbol does; the Purpose: line says why, which the signature cannot show.

/**
 * Acquire a unit claim atomically. Returns true on success, false if another worker
 * already holds an unexpired lease.
 *
 * Purpose: prevent two workers from dispatching the same unit when the run-lock is
 * unavailable (shared NFS, broken filesystem semantics) — the conditional UPDATE in
 * SQLite is the safety net.
 *
 * Consumer: auto-dispatch.ts when picking the next eligible unit per poll tick.
 */
export function claimUnit(unitId: string, leaseMs: number): boolean { ... }

Required for every exported symbol whose behaviour is non-trivial:

First line — what it returns / does, in the present tense.
Purpose: — why it exists; the value it protects.
Consumer: — who calls it in production. If you can't name a consumer, the symbol shouldn't exist yet.

A bare /** Helper. */ is a code smell. Either write the purpose or delete the symbol.

For module-level JSDoc (file headers): keep the existing module-name.ts — short description opening, then a Purpose: line stating why the module exists as a separable unit.
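The convention is mechanical enough to check. The sketch below is a hypothetical helper, not part of the repository's tooling: `missingJsdocLabels` takes the raw text of one JSDoc block and reports which of the required labels are absent.

```typescript
// Labels the convention requires for non-trivial exported symbols.
const REQUIRED_LABELS = ["Purpose:", "Consumer:"] as const;

// Returns the required labels that do not start any line of the block.
function missingJsdocLabels(jsdoc: string): string[] {
  // Strip the comment delimiters and the leading " * " of each line.
  const lines = jsdoc
    .replace(/^\s*\/\*\*/, "")
    .replace(/\*\/\s*$/, "")
    .split("\n")
    .map((line) => line.replace(/^\s*\*?\s?/, "").trim());
  return REQUIRED_LABELS.filter(
    (label) => !lines.some((line) => line.startsWith(label)),
  );
}
```

Applied to the `claimUnit` block above, this returns an empty array; applied to a bare `/** Helper. */`, it returns both labels.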

Testing Guidelines

Primary test runner: Vitest via npm run test:unit, npm run test:integration, and npm test
Node test runner: used only by specific package/native/browser-tool scripts where package.json says node --test
Coverage tool: Vitest coverage with @vitest/coverage-v8; thresholds are enforced in CI
Naming: *.test.ts and *.test.mjs patterns
Smoke tests: npm run test:smoke
Live tests: npm run test:live (requires environment variables)

Purposeful Tests

Test names are contract claims. Use the form <what>_<when>_<expected>:

| Good | Bad |
| --- | --- |
| `claim_when_lease_expired_returns_true` | `test claim` |
| `dispatch_when_blocker_unresolved_skips_unit` | `test dispatch logic` |
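A rough lint for this naming form might look like the sketch below. It is an assumption-laden approximation: the regular expression only checks for snake_case with a `_when_` separator followed by at least two words, because the boundary between `<when>` and `<expected>` cannot be recovered from the name alone. `isContractTestName` is a hypothetical helper.

```typescript
// Assumption: lowercase snake_case, a subject before "_when_",
// and at least two words after it (condition plus expected outcome).
const CONTRACT_NAME =
  /^[a-z0-9]+(?:_[a-z0-9]+)*_when_[a-z0-9]+(?:_[a-z0-9]+)+$/;

// True when a test name plausibly follows <what>_<when>_<expected>.
function isContractTestName(name: string): boolean {
  return CONTRACT_NAME.test(name);
}
```

Both names in the Good column pass this check and both names in the Bad column fail it. A name such as `claim_when_expired` also fails, since it states a condition but no expected outcome.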

Three-tier organisation:

Behaviour contracts (primary) — what the consumer receives. The spec. A different implementation that passes these is equally correct.
Degradation contracts — what happens when dependencies fail. Consumer must always get a useful response; failure must degrade, not crash.
Implementation guards (secondary, labelled // guard:) — protect specific failure modes (resource leaks, infinite loops). Refactors update guards, not behaviour contracts.

Write behaviour contracts first. They are the work order.

A test that asserts call counts or mock interactions is mechanical, not purposeful — it should be a labelled implementation guard, not a primary contract test. A test that breaks on a refactor without behaviour change is mechanical too. Fix the test or relabel it.

Bug = missing correct-behaviour test. When fixing a bug, write a test for the correct behaviour first — it must fail (RED) because the bug exists. If it passes immediately, the test is testing the broken behaviour; fix the test, not the code.

Extension Development

Extensions live in src/resources/extensions/. Each extension should:

Export a manifest with name, version, tools[], and agents[]
Include tests in src/resources/extensions/<name>/tests/
Register tools via the extension API
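To make the manifest requirement concrete, here is a minimal sketch. Only the four field names (name, version, tools, agents) come from the guidelines; the `ExtensionManifest` interface, the use of string arrays for tools and agents, and the `isExtensionManifest` guard are assumptions for illustration and may not match the real extension API types.

```typescript
// Hypothetical manifest shape; element types of tools/agents are assumed.
interface ExtensionManifest {
  name: string;
  version: string;
  tools: string[];
  agents: string[];
}

// Narrowing guard: true only when all four fields have the assumed types.
function isExtensionManifest(value: unknown): value is ExtensionManifest {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const isStringArray = (v: unknown): v is string[] =>
    Array.isArray(v) && v.every((item) => typeof item === "string");
  return (
    typeof candidate.name === "string" &&
    typeof candidate.version === "string" &&
    isStringArray(candidate.tools) &&
    isStringArray(candidate.agents)
  );
}

// Example manifest for an imaginary extension; an extension module
// would export an object like this.
const exampleManifest: ExtensionManifest = {
  name: "example-extension",
  version: "0.1.0",
  tools: ["example_tool"],
  agents: [],
};
```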

Pull Request Guidelines

Link an issue — PRs without a linked issue will be closed without review
One concern per PR — don't bundle unrelated changes
No drive-by formatting — don't reformat code you didn't touch
CI must pass — fix failing tests before requesting review
Rebase onto main — do not merge main into your feature branch
Use the PR template at .github/PULL_REQUEST_TEMPLATE.md

Environment Setup

Copy docker/.env.example to .env and fill in the API keys. At minimum, you need one LLM provider key (Anthropic, OpenAI, Google, or OpenRouter).

Architecture Notes

State lives on disk in .sf/ — no in-memory state survives across sessions
Bundled extensions/agents sync to ~/.sf/agent/ on every launch
LLM providers are lazy-loaded on first use to reduce cold-start time
Native Rust engine handles grep, glob, ps, highlight, ast, diff
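The lazy-loading note can be sketched as a memoised async factory. This is a generic pattern shown under assumptions, not SF's actual provider loader: `lazy` and the stand-in provider object are hypothetical, and in real code the factory would typically wrap a dynamic `import()` of the provider module.

```typescript
// Wraps an async factory so it runs at most once, on first use.
// Later calls reuse the same promise.
function lazy<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = factory();
    return cached;
  };
}

// Counts how often the (stand-in) provider is actually loaded.
let providerLoads = 0;

// Nothing is loaded at definition time, which keeps cold start cheap.
const getProvider = lazy(async () => {
  providerLoads += 1;
  return { name: "example-provider" };
});
```

Caching the promise instead of the resolved value means concurrent first calls share one load.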

SF Planning State

.sf/ is the canonical home for SF agent state. It holds milestone plans, slice plans, task plans, and ephemeral working files in .sf/milestones/, .sf/STATE.md, .sf/QUEUE.md, and related artifacts.

Promote-only rule: Agent state (the .sf/ directory under ~/.sf/projects/<hash>/) is transient and gitignored — never committed directly. Project state (.sf/ tracked in the repo root) contains only human-authored artifacts such as DECISIONS.md, KNOWLEDGE.md, REQUIREMENTS.md, ROADMAP.md, and STATE.md.

Promoted artifacts — milestone summaries, architecture decision records (ADRs), and durable specifications — belong in tracked documentation directories:

docs/plans/ — reviewed implementation plans promoted from .sf/ milestone planning
docs/adr/ — accepted architectural decisions promoted from .sf/DECISIONS.md
docs/specs/ — long-lived behavior contracts and API specifications

Naming conventions:

Milestone IDs: M001, M002, …
Slice IDs: S01, S02, …
Task IDs: T01, T02, …
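The ID scheme above amounts to a prefix plus a zero-padded counter. A minimal sketch, with hypothetical helper names (`formatPlanId`, `parsePlanId`) and the assumption that counters start at 1 and simply grow wider once they exceed the padded width:

```typescript
type PlanLevel = "milestone" | "slice" | "task";

// Prefix and padded width per level, read off M001, S01, and T01.
const ID_FORMAT: Record<PlanLevel, { prefix: string; width: number }> = {
  milestone: { prefix: "M", width: 3 },
  slice: { prefix: "S", width: 2 },
  task: { prefix: "T", width: 2 },
};

// Builds an ID such as M001 from a level and a positive integer.
function formatPlanId(level: PlanLevel, n: number): string {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError(`ID counter must be a positive integer, got ${n}`);
  }
  const { prefix, width } = ID_FORMAT[level];
  return prefix + String(n).padStart(width, "0");
}

// Parses an ID back into level and counter; returns null if it does not fit.
function parsePlanId(id: string): { level: PlanLevel; n: number } | null {
  for (const level of Object.keys(ID_FORMAT) as PlanLevel[]) {
    const { prefix, width } = ID_FORMAT[level];
    const match = new RegExp(`^${prefix}(\\d{${width},})$`).exec(id);
    if (match) return { level, n: Number(match[1]) };
  }
  return null;
}
```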

Commands:

sf plan promote <source> — copy a file from .sf/ to docs/plans/, docs/adr/, or docs/specs/
sf plan list — list milestone and slice files in .sf/
sf plan diff — compare .sf/ state with promoted docs/ artifacts

See docs/plans/README.md, docs/adr/README.md, and docs/specs/README.md for directory-specific conventions.

Eval Dump Inbox

SF/Pi automatically loads AGENTS.md and CLAUDE.md from the repo tree at startup. It does not automatically load TODO.md, but this repo uses root TODO.md as a temporary human dump inbox for eval and self-evolution ideas.

When a repo contains a root TODO.md, treat it as a temporary dump inbox and read it before planning substantive work in that repo. This applies even when the user does not explicitly mention evals. Treat the Raw Dump Inbox section as untriaged source material, not as durable instructions. Triage it into reviewable artifacts: concrete eval cases, harness gaps, memory extraction requirements, docs, tests, or follow-up implementation tasks. After triage, remove the processed dump notes from TODO.md so the file returns to an empty inbox/template state. Do not treat dumped notes as runtime memory or approved behavior until they are converted into tested, versioned project artifacts.

CI/CD

ci.yml — builds, tests, gates merges to main
pipeline.yml — three-stage release (dev → test → prod)
pr-risk.yml — PR risk classification
ai-triage.yml — AI-based issue/PR triage

Code Quality Tooling

The repository uses the following quality tools:

Biome — root source linting via npm run lint and autofix via npm run lint:fix
- Scope: src/ plus versioned JSON checks
- Config: biome.json
- Format touched files with npx biome check --write <paths>; full-repo formatting is not the current CI gate.
ESLint — web app linting via npm --prefix web run lint
- Scope: web/
- Config: web/eslint.config.mjs
TypeScript — Strict mode enabled; run npm run typecheck:extensions
Knip — Detect unused code and dependencies: npx knip (config at knip.json)
jscpd — Detect duplicate code: npx jscpd (config at .jscpd.json)
Tech Debt Scanner — node scripts/tech-debt-scan.mjs
- Tracks TODO/FIXME/HACK/XXX counts against thresholds
Secret Scan — npm run secret-scan (pre-commit hook available via npm run secret-scan:install-hook)
Coverage — npm run test:coverage (Vitest V8 coverage; thresholds: statements 40%, lines 40%, branches 20%, functions 20%)
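To show what tracking TODO/FIXME/HACK/XXX counts against thresholds can mean in practice, here is a minimal sketch. It does not reproduce scripts/tech-debt-scan.mjs, whose implementation is not shown here; the function names and the threshold shape are hypothetical.

```typescript
const DEBT_MARKERS = ["TODO", "FIXME", "HACK", "XXX"] as const;
type DebtMarker = (typeof DEBT_MARKERS)[number];

// Counts whole-word occurrences of each marker in a source string.
function countDebtMarkers(source: string): Record<DebtMarker, number> {
  const counts: Record<DebtMarker, number> = { TODO: 0, FIXME: 0, HACK: 0, XXX: 0 };
  for (const marker of DEBT_MARKERS) {
    // Word boundaries keep strings like "TODOS" from counting.
    const matches = source.match(new RegExp(`\\b${marker}\\b`, "g"));
    counts[marker] = matches ? matches.length : 0;
  }
  return counts;
}

// Returns the markers whose count is above its configured limit.
// Markers without a limit are never reported.
function exceededMarkers(
  counts: Record<DebtMarker, number>,
  thresholds: Partial<Record<DebtMarker, number>>,
): DebtMarker[] {
  return DEBT_MARKERS.filter((marker) => {
    const limit = thresholds[marker];
    return limit !== undefined && counts[marker] > limit;
  });
}
```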

Dev Container

A Dev Container configuration is available at .devcontainer/devcontainer.json. Open the repository in VS Code with the Dev Containers extension, or run:

devcontainer up --workspace-folder .

The container includes Node 24, Rust, GitHub CLI, Docker-in-Docker, and recommended VS Code extensions.

Dependency Updates

Dependabot is configured at .github/dependabot.yml for:

Root npm dependencies (weekly, grouped by ecosystem)
Web app dependencies (weekly)
GitHub Actions (weekly)

Issue Labels

Label definitions are at .github/labels.yml. Apply labels using:

# Create a single label
gh label create priority/P0 --color B60205 --description "Critical — blocks release"

# Or use a label management action in CI
