diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 7680d868e..7d79f8211 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -2,7 +2,7 @@ ## Purpose -Singularity Forge (SF) is an autonomous agent orchestration system. It runs long-horizon coding work through the Unified Operation Kernel (UOK): milestones → slices → tasks. Each dispatch unit runs a fresh AI context, writes its output to disk, then terminates. UOK owns lifecycle, recovery, and the DB-backed run ledger; runtime files under `.sf/runtime/` are projections for query, UI, and compatibility. A deterministic controller (not an LLM) reads canonical state and decides what to dispatch next. The user is the end-gate — autonomous mode delivers work to human review, it does not merge to production unattended. +Singularity Forge (SF) is the product. It runs long-horizon coding work through the Unified Operation Kernel (UOK): milestones → slices → tasks. Each dispatch unit runs a fresh AI context, writes its output to disk, then terminates. UOK owns lifecycle, recovery, and the DB-backed run ledger; runtime files under `.sf/runtime/` are projections for query, UI, and compatibility. A deterministic controller (not an LLM) reads canonical state and decides what to dispatch next. Core changes follow purpose-driven TDD: purpose and consumer first, then failing tests, then implementation. The user is the end-gate — autonomous mode delivers work to human review, it does not merge to production unattended. ## Codemap diff --git a/BUILD_PLAN.md b/BUILD_PLAN.md index 9cbed7644..62ca99d41 100644 --- a/BUILD_PLAN.md +++ b/BUILD_PLAN.md @@ -4,6 +4,29 @@ A practical cut of the 56 NEW items in `SPEC.md` into tiers. Not every spec item This document is the answer to: **what should we actually ship for v3?** +## Strategic frame — 2026-05 + +We are already on a strong base: Forge is the product, UOK is the kernel, and core work is gated by purpose-driven TDD plus the eight PDD fields. 
The goal of this build plan is not to turn SF into a generic CLI coder. The goal is to sharpen Forge's autonomous single-repo execution while borrowing the best ideas from adjacent systems. + +This file is a **planning document**, not a verified implementation ledger. An item can be mapped here and still be open, partial, or only folded into milestone planning. Close-out still requires code evidence, tests, and milestone artifacts that prove the behavior exists in the repo. + +Use external comparisons to sharpen, not to steer identity: + +- **Claude Code / Codex** — interaction and execution ergonomics +- **Aider / gsd-2** — direct execution and repo work loop +- **Plandex** — workflow decomposition and staged progress +- **ACE Coder** — future multi-repo and large-scale convergence patterns, not the near-term product path for Forge + +The end state is not "SF plus a pile of borrowed references." The end state is that proven workflow, execution, and reliability patterns are absorbed into Forge and UOK as first-party behavior. + +## High-level milestone sequence + +1. **Stabilize the core.** Keep UOK, purpose-driven TDD, the eight PDD fields, and repo-local state/evidence as the non-negotiable base. +2. **Sharpen single-repo execution.** Port the highest-value correctness and workflow ideas from pi-mono, gsd-2, and adjacent CLI systems where they improve Forge without changing its product identity. +3. **Deepen autonomous reliability.** Improve evidence capture, recovery, verification, and self-improvement loops inside the single-repo boundary. +4. **Polish product surfaces.** Make the autonomous workflow legible in TUI, CLI, and docs without introducing separate planning semantics. +5. **Absorb and converge deliberately.** Fold proven external patterns into Forge/UOK as native behavior, and keep interfaces/concepts compatible with ACE Coder where useful, while letting Forge and ACE grow from their different starting points. 
+ --- ## Tier 0 — Pi-mono ports (sf: do these FIRST) diff --git a/BUILD_PLAN_MILESTONE_MAP.md b/BUILD_PLAN_MILESTONE_MAP.md index 389c7b336..14be3351d 100644 --- a/BUILD_PLAN_MILESTONE_MAP.md +++ b/BUILD_PLAN_MILESTONE_MAP.md @@ -2,6 +2,27 @@ Every BUILD_PLAN.md tier item mapped to a milestone. **Rule D015**: every new milestone must cite which BUILD_PLAN tier/item it implements. +This file answers **where work belongs**, not **whether code is done**. "Mapped" means a BUILD_PLAN item has a milestone/slice home. It does **not** mean the implementation is verified in the current repo. + +## Mapping vs. code truth + +- **Mapped** — the item has a milestone/slice destination. +- **Verified in code** — the behavior exists in the repo and has evidence/tests/artifacts. +- **Open** — still planned or partially folded in, but not yet verified as complete. +- **Deferred** — intentionally out of the active plan. + +--- + +## High-level milestone direction + +These are the strategy bands above the itemized mapping: + +1. **Core foundation** — UOK, purpose-driven TDD, eight-field PDD gate, repo-local state +2. **Single-repo sharpening** — adopt the best execution/workflow ideas from pi-mono, gsd-2, Claude Code, Codex, Aider, and Plandex where they strengthen Forge +3. **Autonomous reliability** — evidence, recovery, verification, and self-improvement loops +4. **Surface coherence** — CLI, TUI, docs, and workflow language all reflect the same UOK-driven model +5. **ACE convergence prep** — keep concepts compatible with ACE Coder without turning Forge into the multi-repo system + --- ## Tier 0 — Pi-mono ports → **M006** @@ -44,4 +65,6 @@ All mapped. See BUILD_PLAN.md for item-level status. | Tier 2 | 7 (M012, M009, M013, M016) | 0 | | Tier 3+ | 0 | deferred | -**Zero gaps.** Every BUILD_PLAN tier item is either mapped to a milestone or explicitly deferred. +**Zero mapping gaps.** Every BUILD_PLAN tier item is either mapped to a milestone or explicitly deferred. 
+ +That does **not** mean zero implementation gaps. Open `TODO`, `NEW`, and `⬜` markers in `BUILD_PLAN.md`, this map, and milestone artifacts still represent real work until they are reconciled against code evidence. diff --git a/README.md b/README.md index b5f3ea386..f8ca0857d 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ # SF -**The evolution of [Singularity Forge](https://github.com/sf-build/get-shit-done) — now a real coding agent.** +**The evolution of [Singularity Forge](https://github.com/sf-build/get-shit-done) — now a standalone autonomous repo operator.** [![npm version](https://img.shields.io/npm/v/singularity-forge?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/singularity-forge) [![npm downloads](https://img.shields.io/npm/dm/singularity-forge?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/singularity-forge) @@ -15,6 +15,10 @@ The original SF went viral as a prompt framework for Claude Code. It worked, but This version is different. SF is now a standalone CLI built on the [Pi SDK](https://github.com/badlogic/pi-mono), which gives it direct TypeScript access to the agent harness itself. That means SF can actually _do_ what v1 could only _ask_ the LLM to do: clear context between tasks, inject exactly the right files at dispatch time, manage git branches, track cost and tokens, detect stuck loops, recover from crashes, and auto-advance through an entire milestone without human intervention. +Forge is the product. The Unified Operation Kernel (UOK) is the internal runtime kernel. Core behavior is governed by purpose-driven TDD and the eight PDD fields: purpose, consumer, contract, failure boundary, evidence, non-goals, invariants, and assumptions. 
+ +We sharpen Forge against the best external ideas we can find — Claude Code and Codex for ergonomics, Aider and gsd-2 for execution, Plandex for workflow structure — but those are reference inputs, not the destination. Forge stays focused on autonomous single-repo execution. ACE Coder is the separate multi-repo and large-scale path. + One command. Walk away. Come back to a built project with clean git history.
npm install -g singularity-forge@latest
@@ -153,7 +157,7 @@ The original SF was a collection of markdown prompts installed into `~/.claude/c - **No crash recovery.** If the session died mid-task, you started over. - **No observability.** No cost tracking, no progress dashboard, no stuck detection. -SF v2 solves all of these because it's not a prompt framework anymore — it's a TypeScript application that _controls_ the agent session. +SF v2 solves all of these because it's not a prompt framework anymore — it's a TypeScript application that _controls_ the agent session. Forge is the product; UOK is the internal kernel that drives the run loop. | | v1 (Prompt Framework) | v2 (Agent Application) | | -------------------- | ---------------------------- | ------------------------------------------------------- | diff --git a/VISION.md b/VISION.md index db2b4da6c..b65eeaae3 100644 --- a/VISION.md +++ b/VISION.md @@ -1,6 +1,6 @@ # Vision -SF is the orchestration layer between you and AI coding agents. It handles planning, execution, verification, and shipping so you can focus on what to build, not how to wrangle the tools. +SF is an autonomous single-repo software operator. Forge is the product; UOK is the internal execution kernel. It handles planning, execution, verification, and shipping so you can focus on what to build, not how to wrangle the tools. ## Who it's for @@ -14,10 +14,21 @@ Anyone who codes with AI agents — solo developers shipping faster, open-source **Tests are the contract.** If you change behavior, the tests tell you what you broke. Write tests for new behavior. Trust the test suite. +**Purpose-driven TDD.** The eight PDD fields — purpose, consumer, contract, failure boundary, evidence, non-goals, invariants, and assumptions — are the core gate. Non-trivial work should not move to implementation before purpose is explicit and a falsifier exists. + **Ship fast, fix fast.** Get it out, iterate quickly, don't let perfect be the enemy of good. 
Every release should work, but we'd rather ship and patch than delay and accumulate. **Provider-agnostic.** SF works with any LLM provider. No architectural decisions should privilege one provider over another. +**Sharpen by comparison, not imitation.** Learn from Claude Code, Codex, Aider, gsd-2, and Plandex where they are strong, but do not collapse Forge into a generic coder CLI. Forge's differentiator is autonomous single-repo execution on top of UOK. When an external pattern proves itself, absorb it into SF/UOK as first-party behavior instead of leaving it as a permanent comparison layer. + +## Direction + +- **Forge** grows as the single-repo product. +- **UOK** leads the runtime model and execution semantics. +- **ACE Coder** grows the multi-repo and large-scale orchestration path. +- External CLIs are comparison inputs used to sharpen workflow and execution choices. + ## What we won't accept These save everyone time. Don't open PRs for: diff --git a/docs/records/2026-05-07-cli-agent-code-survey.md b/docs/records/2026-05-07-cli-agent-code-survey.md new file mode 100644 index 000000000..6ecd13d76 --- /dev/null +++ b/docs/records/2026-05-07-cli-agent-code-survey.md @@ -0,0 +1,29 @@ +# CLI Agent Code Survey — 2026-05-07 + +We compared Forge-relevant CLI agent implementations to pull workflow and autonomy patterns into SF planning. + +## What was checked + +- `claude-code` +- `codex` +- `gemini-cli` +- `opencode` +- `aider` +- `goose` +- `qwen-code` +- `crush` +- `plandex` +- agentless-style repos: `Agentless`, `open-codex`, `RA.Aid`, `letta-code`, `neovate-code`, `amazon-q-developer-cli` +- `ace-coder` for curator, memory, and autonomy patterns + +## Where the code lives + +All reference checkouts are local under `/home/mhugo/code/`. + +## Takeaways + +- `plandex` is the closest workflow match for SF planning. +- `claude-code`, `aider`, `codex`, and `gemini-cli` are the best ergonomics references. 
+- `ace-coder` is our own codebase and the long-term direction is convergence between Forge and ACE. +- The code survey is done; future planning can rely on the local checkouts instead of rescanning remote repos. + diff --git a/docs/records/2026-05-07-strategy-alignment.md b/docs/records/2026-05-07-strategy-alignment.md new file mode 100644 index 000000000..ca0610709 --- /dev/null +++ b/docs/records/2026-05-07-strategy-alignment.md @@ -0,0 +1,28 @@ +# Strategy Alignment — 2026-05-07 + +Aligned the top-level SF docs and roadmap framing around the current architecture and end goal. + +## Canonical direction + +- **Forge** is the product. +- **UOK** is the internal execution kernel and leads runtime semantics. +- **Purpose-driven TDD** and the **eight PDD fields** are the core gate. +- **ACE Coder** is the multi-repo and large-scale path. + +## How external systems are used + +External CLIs are comparison inputs used to sharpen SF, not the destination: + +- **Claude Code / Codex** for interaction and execution ergonomics +- **Aider / gsd-2** for direct execution patterns +- **Plandex** for workflow structure + +When a pattern proves itself in practice, it should be absorbed into Forge/UOK as first-party behavior rather than preserved as a permanent external dependency in the product story. + +## High-level milestone framing + +1. Stabilize the UOK + PDD/TDD core. +2. Sharpen single-repo execution from external references where the fit is real. +3. Deepen autonomous reliability, evidence, recovery, and self-improvement. +4. Keep product surfaces coherent across CLI, TUI, and docs. +5. Absorb proven patterns into Forge/UOK and prepare concept-level convergence with ACE without collapsing the Forge/ACE split. 
diff --git a/docs/records/index.md b/docs/records/index.md index dfa5562f5..82e5633bd 100644 --- a/docs/records/index.md +++ b/docs/records/index.md @@ -7,3 +7,5 @@ Repo-memory audits, decision ledgers, context-gardening notes, and records-keepe | Date | Note | Summary | |------|------|---------| | 2026-05-01 | [repo-vcs and notifications](./2026-05-01-repo-vcs-and-notifications.md) | repo-vcs skill landed; notification specs drafted; JSDoc annotations added; placeholder docs filled | +| 2026-05-07 | [cli agent code survey](./2026-05-07-cli-agent-code-survey.md) | compared local CLI agent checkouts; Plandex is the workflow analog; ACE is owned code and future convergence target | +| 2026-05-07 | [strategy alignment](./2026-05-07-strategy-alignment.md) | aligned top-level docs and roadmap framing around Forge as product, UOK as kernel, and external CLIs as sharpening inputs | diff --git a/src/resources/extensions/sf-tui/extension-manifest.json b/src/resources/extensions/sf-tui/extension-manifest.json index f25703c53..74da251d9 100644 --- a/src/resources/extensions/sf-tui/extension-manifest.json +++ b/src/resources/extensions/sf-tui/extension-manifest.json @@ -2,7 +2,7 @@ "id": "sf-tui", "name": "SF TUI", "version": "1.0.0", - "description": "Adds SF-specific header, footer, prompt stash, color, emoji, and marketplace UI controls", + "description": "Adds SF-specific header, footer, prompt history, color, emoji, and marketplace UI controls", "tier": "bundled", "requires": { "platform": ">=2.29.0" }, "provides": { diff --git a/src/resources/extensions/sf-tui/index.js b/src/resources/extensions/sf-tui/index.js index bfc2afcd9..b1d88b96c 100644 --- a/src/resources/extensions/sf-tui/index.js +++ b/src/resources/extensions/sf-tui/index.js @@ -4,17 +4,24 @@ * Features: * - Powerline footer: git branch, diff stats, last commit, model, cost, context * - Header: project name + branch + model - * - Prompt history stash: Ctrl+Alt+H overlay + * - Prompt history: Ctrl+Alt+H 
overlay */ +import { randomUUID } from "node:crypto"; import { Key } from "@singularity-forge/pi-tui"; import { isAutoActive } from "../sf/auto.js"; +import { projectRoot } from "../sf/commands/context.js"; import { registerSessionColor } from "./color-band.js"; import { registerSessionEmoji } from "./emoji.js"; import { renderFooter } from "./footer.js"; import { invalidateGitStatus } from "./git.js"; import { renderHeader } from "./header.js"; import { openMarketplaceOverlay } from "./marketplace.js"; -import { openStashOverlay, pushStash, readStash, writeStash } from "./stash.js"; +import { + appendPromptHistory, + openPromptHistoryOverlay, + pushPromptHistory, + readPromptHistory, +} from "./prompt-history.js"; function installHeader(ctx) { if (!ctx.hasUI) return; @@ -45,19 +52,30 @@ function installFooter(ctx) { export default function sfTui(pi) { registerSessionEmoji(pi); registerSessionColor(pi); - const stash = readStash(); + const promptHistory = readPromptHistory(); + const promptHistorySessionId = randomUUID(); + let projectBasePath = null; let wasAutoActive = false; pi.on("session_start", async (_event, ctx) => { if (!ctx.hasUI) return; + try { + projectBasePath = projectRoot(); + const projectPromptHistory = readPromptHistory(projectBasePath); + promptHistory.splice(0, promptHistory.length, ...projectPromptHistory); + } catch { + projectBasePath = null; + } installHeader(ctx); installFooter(ctx); + const openProjectPromptHistory = (overlayCtx) => + openPromptHistoryOverlay(overlayCtx, projectBasePath ?? 
undefined); pi.registerShortcut(Key.ctrlAlt("h"), { - description: "Open prompt history stash", - handler: openStashOverlay, + description: "Open prompt history", + handler: openProjectPromptHistory, }); pi.registerShortcut(Key.ctrlShift("h"), { - description: "Open prompt history stash (fallback)", - handler: openStashOverlay, + description: "Open prompt history (fallback)", + handler: openProjectPromptHistory, }); pi.registerShortcut(Key.ctrlAlt("m"), { description: "Open marketplace browser", @@ -68,8 +86,18 @@ export default function sfTui(pi) { pi.on("before_agent_start", async (event) => { const prompt = event.prompt?.trim(); if (prompt) { - pushStash(stash, prompt); - writeStash(stash); + pushPromptHistory(promptHistory, prompt); + appendPromptHistory( + prompt, + projectBasePath ?? undefined, + promptHistorySessionId, + ); + pi.appendEntry("sf-prompt-history", { + prompt, + projectRoot: projectBasePath, + sessionId: promptHistorySessionId, + timestamp: Date.now(), + }); } }); pi.on("tool_result", async (_event, ctx) => { diff --git a/src/resources/extensions/sf-tui/stash.js b/src/resources/extensions/sf-tui/prompt-history.js similarity index 61% rename from src/resources/extensions/sf-tui/stash.js rename to src/resources/extensions/sf-tui/prompt-history.js index 65be82358..03a5c83b9 100644 --- a/src/resources/extensions/sf-tui/stash.js +++ b/src/resources/extensions/sf-tui/prompt-history.js @@ -1,4 +1,4 @@ -import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs"; +import { appendFileSync, existsSync, mkdirSync, readFileSync } from "node:fs"; import { homedir } from "node:os"; import { dirname, join } from "node:path"; import { @@ -9,39 +9,87 @@ import { } from "@singularity-forge/pi-tui"; const LIMIT = 20; -function stashPath() { - return join(homedir(), ".sf", "agent", "prompt-history.json"); +const SCAN_LINE_LIMIT = 2000; +function promptHistoryPath() { + return join(homedir(), ".sf", "agent", "prompt-history.jsonl"); } -export 
function readStash() { +function readEntries() { try { - const path = stashPath(); + const path = promptHistoryPath(); if (!existsSync(path)) return []; - const d = JSON.parse(readFileSync(path, "utf-8")); - return d.history.filter( - (h) => typeof h === "string" && h.trim().length > 0, - ); + return readFileSync(path, "utf-8") + .split(/\r?\n/) + .reverse() + .slice(0, SCAN_LINE_LIMIT) + .flatMap((line) => { + const text = line.trim(); + if (!text) return []; + const entry = JSON.parse(text); + if ( + !entry || + typeof entry !== "object" || + entry.version !== 1 || + typeof entry.prompt !== "string" || + entry.prompt.trim().length === 0 || + typeof entry.projectRoot !== "string" || + entry.projectRoot.trim().length === 0 + ) { + return []; + } + return [entry]; + }); } catch { return []; } } -export function writeStash(history) { +function normalizeHistory(history) { + const seen = new Set(); + const merged = []; + for (const item of history) { + const text = String(item ?? "").trim(); + if (!text || seen.has(text)) continue; + seen.add(text); + merged.push(text); + if (merged.length >= LIMIT) break; + } + return merged; +} +function appendEntries(entries) { try { - const path = stashPath(); + const path = promptHistoryPath(); mkdirSync(dirname(path), { recursive: true }); - writeFileSync( + appendFileSync( path, - JSON.stringify( - { version: 1, history: history.slice(0, LIMIT) }, - null, - 2, - ) + "\n", - "utf-8", + entries.map((entry) => JSON.stringify(entry)).join("\n") + "\n", + { encoding: "utf-8", mode: 0o600 }, ); } catch { /* non-fatal */ } } -export function pushStash(history, text) { +export function readPromptHistory(basePath) { + if (!basePath) return []; + return normalizeHistory( + readEntries() + .filter((entry) => entry.projectRoot === basePath) + .map((entry) => entry.prompt), + ); +} +export function appendPromptHistory(prompt, basePath, sessionId) { + if (!basePath) return; + const normalized = normalizeHistory([prompt]); + if 
(!normalized.length) return; + const now = Date.now(); + const entries = normalized.toReversed().map((prompt, index) => ({ + version: 1, + prompt, + projectRoot: basePath, + sessionId: sessionId ?? null, + timestamp: now - (normalized.length - index - 1), + })); + appendEntries(entries); +} +export function pushPromptHistory(history, text) { const t = text.trim(); if (!t || history[0] === t) return; history.unshift(t); @@ -53,7 +101,7 @@ function preview(text, maxWidth) { const c = text.replace(/\s+/g, " ").trim(); return c ? truncateToWidth(c, maxWidth, "…") : "(empty)"; } -class StashOverlay { +class PromptHistoryOverlay { tui; theme; done; @@ -138,7 +186,7 @@ class StashOverlay { } lines.push(pad(box(""))); lines.push(pad(th.fg("dim", "├" + "─".repeat(bw) + "┤"))); - lines.push(pad(box(th.fg("dim", `${this.items.length} stashed prompts`)))); + lines.push(pad(box(th.fg("dim", `${this.items.length} prompts`)))); lines.push(pad(th.fg("dim", "╰" + "─".repeat(bw) + "╯"))); lines.push(""); this.cacheL = lines; @@ -146,22 +194,22 @@ class StashOverlay { return lines; } } -export async function openStashOverlay(ctx) { +export async function openPromptHistoryOverlay(ctx, basePath) { if (!ctx.hasUI) { ctx.ui.notify("Prompt history requires interactive mode", "error"); return; } - const items = readStash(); + const items = readPromptHistory(basePath ?? undefined); if (!items.length) { ctx.ui.notify( - "No stashed prompts yet. Send a message to build history.", + "No prompt history yet. 
Send a message to build history.", "info", ); return; } const selected = await ctx.ui.custom( (tui, theme, _kb, done) => { - const o = new StashOverlay(tui, theme, items, done); + const o = new PromptHistoryOverlay(tui, theme, items, done); return { render: (w) => o.render(w), invalidate: () => o.invalidate(), diff --git a/src/resources/extensions/sf/auto-dispatch.js b/src/resources/extensions/sf/auto-dispatch.js index 6019933ce..4cb05b253 100644 --- a/src/resources/extensions/sf/auto-dispatch.js +++ b/src/resources/extensions/sf/auto-dispatch.js @@ -33,8 +33,6 @@ import { buildRunUatPrompt, buildValidateMilestonePrompt, buildWorkflowPreferencesPrompt, - checkNeedsReassessment, - checkNeedsRunUat, } from "./auto-prompts.js"; import { hasImplementationArtifacts } from "./auto-recovery.js"; import { getCanonicalMilestonePlan } from "./canonical-milestone-plan.js"; @@ -51,6 +49,10 @@ import { parseDeferredRequirements, resolveAllOverrides, } from "./files.js"; +import { + checkNeedsReassessment, + checkNeedsRunUat, +} from "./workflow-helpers.js"; import { getRelevantMemoriesRanked, isDbAvailable as isMemoryDbAvailable, diff --git a/src/resources/extensions/sf/tests/prompt-history.test.mjs b/src/resources/extensions/sf/tests/prompt-history.test.mjs new file mode 100644 index 000000000..5f2f6b15c --- /dev/null +++ b/src/resources/extensions/sf/tests/prompt-history.test.mjs @@ -0,0 +1,143 @@ +import { + existsSync, + mkdirSync, + mkdtempSync, + readFileSync, + rmSync, + writeFileSync, +} from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { afterEach, beforeEach, describe, expect, it } from "vitest"; +import { + appendPromptHistory, + readPromptHistory, +} from "../../sf-tui/prompt-history.js"; + +describe("prompt history", () => { + let oldHome; + let homeDir; + let projectDir; + + beforeEach(() => { + oldHome = process.env.HOME; + homeDir = mkdtempSync(join(tmpdir(), "sf-home-")); + projectDir = mkdtempSync(join(tmpdir(), 
"sf-project-")); + process.env.HOME = homeDir; + }); + + afterEach(() => { + process.env.HOME = oldHome; + rmSync(homeDir, { recursive: true, force: true }); + rmSync(projectDir, { recursive: true, force: true }); + }); + + it("appendPromptHistory_when_projectPathProvided_persists_tagged_global_entry", () => { + appendPromptHistory("first prompt", projectDir, "session-1"); + + const globalPath = join(homeDir, ".sf", "agent", "prompt-history.jsonl"); + + expect(existsSync(globalPath)).toBe(true); + expect( + readFileSync(globalPath, "utf-8") + .trim() + .split(/\r?\n/) + .map((line) => JSON.parse(line)), + ).toMatchObject([ + { + version: 1, + prompt: "first prompt", + projectRoot: projectDir, + sessionId: "session-1", + }, + ]); + }); + + it("readPromptHistory_when_history_contains_multiple_projects_returns_current_project_only", () => { + const globalPath = join(homeDir, ".sf", "agent", "prompt-history.jsonl"); + mkdirSync(join(homeDir, ".sf", "agent"), { recursive: true }); + writeFileSync( + globalPath, + [ + JSON.stringify({ + version: 1, + prompt: "shared", + projectRoot: projectDir, + sessionId: "session-1", + timestamp: 1, + }), + JSON.stringify({ + version: 1, + prompt: "project", + projectRoot: projectDir, + sessionId: "session-1", + timestamp: 2, + }), + JSON.stringify({ + version: 1, + prompt: "other project", + projectRoot: join(projectDir, "other"), + sessionId: "session-2", + timestamp: 3, + }), + ].join("\n") + "\n", + "utf-8", + ); + + expect(readPromptHistory(projectDir)).toEqual(["project", "shared"]); + }); + + it("readPromptHistory_when_project_history_has_duplicates_returns_newest_unique_prompts", () => { + const globalPath = join(homeDir, ".sf", "agent", "prompt-history.jsonl"); + mkdirSync(join(homeDir, ".sf", "agent"), { recursive: true }); + writeFileSync( + globalPath, + [ + JSON.stringify({ + version: 1, + prompt: "repeat", + projectRoot: projectDir, + sessionId: "session-1", + timestamp: 1, + }), + JSON.stringify({ + version: 1, + 
prompt: "newest", + projectRoot: projectDir, + sessionId: "session-1", + timestamp: 2, + }), + JSON.stringify({ + version: 1, + prompt: "repeat", + projectRoot: projectDir, + sessionId: "session-2", + timestamp: 3, + }), + ].join("\n") + "\n", + "utf-8", + ); + + expect(readPromptHistory(projectDir)).toEqual(["repeat", "newest"]); + }); + + it("readPromptHistory_when_legacy_untagged_history_exists_ignores_it", () => { + const globalPath = join(homeDir, ".sf", "agent", "prompt-history.json"); + mkdirSync(join(homeDir, ".sf", "agent"), { recursive: true }); + writeFileSync( + globalPath, + JSON.stringify({ version: 1, history: ["global leak"] }, null, 2) + "\n", + "utf-8", + ); + + expect(readPromptHistory(projectDir)).toEqual([]); + }); + + it("appendPromptHistory_when_projectPathMissing_does_not_persist_history", () => { + appendPromptHistory("global leak"); + const globalPath = join(homeDir, ".sf", "agent", "prompt-history.jsonl"); + + expect(existsSync(globalPath)).toBe(false); + expect(readPromptHistory()).toEqual([]); + }); +});
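
Reviewer note: the read path that the tests above exercise (scan the JSONL file newest-first, keep only well-formed version-1 entries, filter by `projectRoot`, de-duplicate, cap at `LIMIT`) can be sketched as a pure function over the file text. This is a minimal sketch, not the code in `prompt-history.js`: the helper names `parseEntries` and `historyForProject` are hypothetical, and only the entry shape and the two limits are taken from the diff. It also makes one deliberate variation, which is an assumption about desirable behaviour rather than what the PR does: a malformed line is skipped on its own, whereas `readEntries` in the diff wraps the whole scan in a single `try`/`catch`, so one bad line makes the entire read return `[]`.

```javascript
// Limits copied from the diff; everything else here is illustrative.
const LIMIT = 20;
const SCAN_LINE_LIMIT = 2000;

// Parse JSONL text newest-first, keeping only well-formed version-1 entries.
// Hypothetical helper: operates on a string so it needs no filesystem access.
function parseEntries(jsonl) {
  return jsonl
    .split(/\r?\n/)
    .reverse() // the file is append-only, so the last line is the newest
    .slice(0, SCAN_LINE_LIMIT)
    .flatMap((line) => {
      const text = line.trim();
      if (!text) return [];
      let entry;
      try {
        entry = JSON.parse(text);
      } catch {
        // Variation from the diff: skip this line instead of discarding everything.
        return [];
      }
      const ok =
        entry &&
        typeof entry === "object" &&
        entry.version === 1 &&
        typeof entry.prompt === "string" &&
        entry.prompt.trim().length > 0 &&
        typeof entry.projectRoot === "string" &&
        entry.projectRoot.trim().length > 0;
      return ok ? [entry] : [];
    });
}

// Newest-first, de-duplicated prompts for a single project, capped at LIMIT.
function historyForProject(jsonl, projectRoot) {
  if (!projectRoot) return []; // no project path means no history, as in the diff
  const seen = new Set();
  const prompts = [];
  for (const entry of parseEntries(jsonl)) {
    if (entry.projectRoot !== projectRoot) continue;
    const text = entry.prompt.trim();
    if (seen.has(text)) continue;
    seen.add(text);
    prompts.push(text);
    if (prompts.length >= LIMIT) break;
  }
  return prompts;
}

// Usage: three entries for /repo/a (one repeated), one for /repo/b, one bad line.
const sample =
  [
    JSON.stringify({ version: 1, prompt: "repeat", projectRoot: "/repo/a", timestamp: 1 }),
    JSON.stringify({ version: 1, prompt: "newest", projectRoot: "/repo/a", timestamp: 2 }),
    JSON.stringify({ version: 1, prompt: "other", projectRoot: "/repo/b", timestamp: 3 }),
    "not json",
    JSON.stringify({ version: 1, prompt: "repeat", projectRoot: "/repo/a", timestamp: 4 }),
  ].join("\n") + "\n";

console.log(historyForProject(sample, "/repo/a")); // [ 'repeat', 'newest' ]
```

Keeping the parsing step separate from the file read makes the ordering and de-duplication rules testable without redirecting `HOME`, which is what the test file in this PR has to do. Whether a corrupt line should be skipped or should invalidate the whole read is a judgment call for the author.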