Final rebrand: rename remaining Rust source file to complete the gsd → forge transition. All parser references already use forge_parser after earlier commits. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
18 KiB
PRD: Branchless Worktree Architecture
Author: Lex Christopherson Date: 2026-03-15 ADR: ADR-001-branchless-worktree-architecture.md Priority: Critical — blocks reliable auto-mode operation
Problem Statement
SF's auto-mode is unreliable. Users experience:
-
Infinite loop detection failures — the agent writes planning artifacts on slice branches that become invisible after branch switching, causing
verifyExpectedArtifact()to fail repeatedly. Auto-mode burns budget retrying the same unit 3-6 times before hard-stopping. This is the #1 user complaint. -
State corruption across branches —
.sf/planning artifacts (roadmaps, plans, decisions) are gitignored but branch-specific. Multiple branches sharing a single.sf/directory clobber each other's state. Users see wrong milestones marked complete, wrong roadmaps loaded, and auto-mode starting from the wrong phase. -
Excessive complexity — 770+ lines of merge, conflict resolution, branch switching, and self-healing code exist solely to manage slice branches inside worktrees. This code has required 15+ bug fixes across versions and remains the primary source of auto-mode failures.
These problems are architectural. They cannot be fixed by patching individual symptoms.
Vision
Auto-mode uses git worktrees for isolation and sequential commits for history. No branch switching. No merge conflicts within a worktree. Planning artifacts are tracked in git and travel with the branch. The git layer is so simple it can't break.
Success Criteria
| Criterion | Measurement |
|---|---|
| Zero loop detection failures from branch visibility | No verifyExpectedArtifact() failures caused by branch mismatch in 50 consecutive auto-mode runs |
Zero .sf/ state corruption |
Manual worktrees created via git worktree add have correct .sf/ state without any SF-specific initialization |
| Code deletion | Net removal of ≥500 lines of merge/conflict/branch-switching code |
| Test simplification | Removal or simplification of ≥6 merge-specific test files |
| Backwards compatibility | Existing projects with sf/M001/S01 slice branches continue to work (read-only; new work uses new model) |
| No new git primitives | The implementation uses only: worktrees, commits, squash-merge. No new branch types, merge strategies, or conflict resolution. |
Non-Goals
- Parallel slice execution within a single worktree (if needed later, use separate worktrees)
- Changing how milestones relate to
main(squash-merge stays) - Modifying the dispatch unit types or state machine (except removing
fix-merge) - Changing the worktree-manager.ts manual worktree API (
/worktreecommand)
Current Architecture
Branch Model (M003, v2.13.0)
main
└─ milestone/M001 (worktree at .sf/worktrees/M001/)
├─ sf/M001/S01 (slice branch — code + .sf/ artifacts)
│ └── merge --no-ff → milestone/M001
├─ sf/M001/S02
│ └── merge --no-ff → milestone/M001
└── squash merge → main
Data Flow
Agent writes file → on slice branch → handleAgentEnd → auto-commit on slice branch
→ switch to milestone branch → verifyExpectedArtifact → FILE NOT FOUND (it's on slice branch)
→ loop counter++ → retry → same result → HARD STOP
Code Involved
| File | Lines | Purpose |
|---|---|---|
auto-worktree.ts |
512 | Worktree lifecycle + slice→milestone merge |
git-service.ts |
915 | Branch creation, switching, merge with conflict resolution |
git-self-heal.ts |
198 | Merge failure recovery |
auto.ts |
~150 lines | Merge dispatch guards, fix-merge routing, branch-mode vs worktree-mode branching |
worktree.ts |
~40 lines | Slice branch delegates |
| 11 test files | ~2000 lines | Merge/branch/worktree test coverage |
.sf/ Tracking (Current — Contradictory)
.gitignoreline 52:.sf/— ignores everythingsmartStage()lines 338-349: force-addsSF_DURABLE_PATHS— tracks milestones/, DECISIONS.md, PROJECT.md, REQUIREMENTS.md, QUEUE.md- Result:
.sf/milestones/is partially tracked on some branches, fully ignored on others. The code fights the config.
Proposed Architecture
Branch Model
main
└─ milestone/M001 (worktree at .sf/worktrees/M001/)
│
commit: feat(M001): context + roadmap
commit: feat(M001/S01): research
commit: feat(M001/S01): plan
commit: feat(M001/S01/T01): implement auth service
commit: feat(M001/S01/T02): implement auth tests
commit: feat(M001/S01): summary + UAT
commit: docs(M001): reassess roadmap after S01
commit: feat(M001/S02): research
commit: feat(M001/S02): plan
commit: ...
commit: feat(M001): milestone complete
│
└── squash merge → main
One branch. Sequential commits. No merges within the worktree.
Data Flow
Agent writes file → on milestone branch → handleAgentEnd → auto-commit on milestone branch
→ verifyExpectedArtifact → FILE FOUND (same branch) → persist completion → next dispatch
.sf/ Tracking (Proposed — Coherent)
Tracked (travels with branch):
.sf/milestones/**/*.md (except CONTINUE markers)
.sf/milestones/**/*.json (META.json integration records)
.sf/PROJECT.md
.sf/DECISIONS.md
.sf/REQUIREMENTS.md
.sf/QUEUE.md
Gitignored (ephemeral):
.sf/auto.lock
.sf/completed-units.json
.sf/STATE.md
.sf/metrics.json
.sf/sf.db
.sf/activity/
.sf/runtime/
.sf/worktrees/
.sf/DISCUSSION-MANIFEST.json
.sf/milestones/**/*-CONTINUE.md
.sf/milestones/**/continue.md
Why This Works
| Problem | How It's Solved |
|---|---|
| Artifact invisibility after branch switch | No branch switching. Artifacts commit on the one branch. |
.sf/ state clobbering |
Artifacts tracked in git. Each branch carries its own .sf/. git worktree add and git checkout give correct state. |
| Merge conflict complexity | No merges within a worktree. Only merge is milestone→main (squash). |
| Manual worktree initialization | Tracked artifacts are checked out with the branch. No SF-specific bootstrap needed. |
| Dual isolation mode maintenance | Single mode: worktree. Branch-mode (git.isolation: "branch") deprecated. |
Implementation Plan
Phase 1: .gitignore + Tracking Fix
Goal: Planning artifacts are tracked in git. .gitignore reflects reality.
-
Update
.gitignore:- Remove blanket
.sf/ignore - Add explicit runtime-only ignores (see proposed list above)
- Remove blanket
-
Force-add existing planning artifacts on current branch:
git add --force .sf/milestones/ .sf/PROJECT.md .sf/DECISIONS.md .sf/REQUIREMENTS.md .sf/QUEUE.md -
Ensure runtime files are NOT tracked:
git rm --cached -r .sf/runtime/ .sf/activity/ .sf/STATE.md .sf/metrics.json .sf/completed-units.json .sf/auto.lock -
Update README suggested
.gitignoresection -
Remove
smartStage()force-add ofSF_DURABLE_PATHS— no longer needed since.gitignoredoesn't block them
Verification: git status shows planning artifacts tracked, runtime files untracked. git worktree add on a new worktree has correct .sf/milestones/ state.
Phase 2: Remove Slice Branch Creation + Switching
Goal: No code creates, switches to, or references slice branches for new work.
- Remove
ensureSliceBranch()fromgit-service.ts(lines 485-544) - Remove
switchToMain()fromgit-service.ts(lines 549-563) - Remove
getSliceBranchName()fromworktree.ts(lines 94-98) - Remove
isOnSliceBranch()andgetActiveSliceBranch()fromworktree.ts - Update
auto.tsdispatch paths — remove branch creation beforeexecute-task - Update
handleAgentEnd— remove branch-switching logic post-dispatch
Verification: Auto-mode runs a full slice (research → plan → execute → complete) without creating any branches. All commits land on milestone/<MID>.
Phase 3: Remove Slice Merge Code
Goal: All slice→milestone and slice→main merge code is deleted.
- Remove
mergeSliceToMilestone()fromauto-worktree.ts(lines 253-350) - Remove
mergeSliceToMain()fromgit-service.ts(lines 705-893) - Remove merge dispatch guards from
auto.ts(lines 1635-1679) - Remove
fix-mergedispatch unit type fromauto.ts - Remove
buildPromptForFixMerge()fromauto.ts - Remove
withMergeHeal()fromgit-self-heal.ts(lines 99-136) - Remove
abortAndReset()fromgit-self-heal.ts(lines 37-84) — or simplify to crash-recovery-only - Remove
shouldUseWorktreeIsolation()preference resolution — worktree is the only mode - Remove
getMergeToMainMode()— milestone merge is the only mode - Deprecate
git.isolation: "branch"andgit.merge_to_main: "slice"preferences
Verification: git grep mergeSliceToMilestone returns zero results. git grep mergeSliceToMain returns zero results. git grep fix-merge returns zero results (outside of changelog/docs).
Phase 4: Simplify mergeMilestoneToMain()
Goal: Milestone→main merge is clean and minimal.
The function becomes:
- Auto-commit any dirty state in worktree
process.chdir(originalBasePath)— back to main repogit checkout maingit merge --squash milestone/<MID>- Build commit message with milestone summary + slice manifest
git commit- Optional:
git push removeWorktree()+git branch -D milestone/<MID>
No conflict categorization. No runtime file stripping (runtime files are gitignored, not in the merge). No .sf/ special handling.
If squash-merge conflicts (parallel milestone edge case): stop auto-mode with clear error, user resolves manually or SF dispatches a one-time resolution session.
Verification: Complete a full milestone in auto-mode. main receives one squash commit with all code and planning artifacts.
Phase 5: Test Cleanup
Goal: Test suite reflects the simplified architecture.
-
Delete or rewrite:
auto-worktree-merge.test.ts— tests slice→milestone merge (deleted)auto-worktree-milestone-merge.test.ts— rewrite for simplified milestone→mainworktree-e2e.test.ts— rewrite for branchless flowworktree-integration.test.ts— rewrite for branchless flow- Merge-related test cases in
git-service.test.ts
-
Add new tests:
- Branchless worktree lifecycle: create → commit → commit → squash-merge → cleanup
.sf/tracking: planning artifacts tracked, runtime files ignored- Manual worktree:
git worktree addhas correct.sf/state - Crash recovery: dirty state on milestone branch, restart, auto-commit, continue
-
Remove merge-specific doctor checks or simplify:
corrupt_merge_state— keep (still relevant for milestone→main)orphaned_auto_worktree— keepstale_milestone_branch— keeptracked_runtime_files— keep
Verification: npm run test passes. No test references mergeSliceToMilestone, mergeSliceToMain, or ensureSliceBranch.
Phase 6: Migration + Backwards Compatibility
Goal: Existing projects with slice branches continue to work.
- State derivation (
deriveState()) continues to readsf/M001/S01branch naming for legacy detection - On first run after upgrade:
- Detect existing slice branches
- Notify user: "SF no longer creates slice branches. Existing branches are preserved but new work commits directly to the milestone branch."
- No forced migration — legacy branches are read-only context
- Doctor check:
legacy_slice_branches— informational, not auto-fix - Update
shouldUseWorktreeIsolation()preference handling:git.isolation: "worktree"→ default behavior (only option)git.isolation: "branch"→ warning, treated as worktree- Remove preference UI for isolation mode
Verification: Open a project with existing sf/M001/S01 branches. SF reads state correctly, new work commits on milestone branch without slice branches.
Stress Test Results
Validated by three independent models:
Gemini 2.5 Pro — 6 Attack Vectors
| Attack | Severity | Mitigation |
|---|---|---|
| Parallel milestone code conflict at squash-merge | Medium | git rebase main before squash. Rare in single-user. |
SQLite desync after git reset --hard |
Low | DB rebuilt from tracked markdown on startup (M001/S02 importers). |
| Ghost lock after SIGKILL | Low | Existing heartbeat lock detection handles this. |
| Squash merge loses bisect granularity | Low | Commit messages tag slices. Branch preservable if needed. |
| Disk space with multiple worktrees | Low | Single active milestone at a time. Immediate cleanup. |
| Plan-action atomicity gap (crash between write and commit) | Low | handleAgentEnd auto-commits. Sequential model simplifies recovery. |
GPT-5.4 (Codex) — Codebase-Informed Analysis
- Confirmed
smartStage()force-add already implements tracked-artifact intent - Confirmed
resolveMainWorktreeRoot(PR #487) contradicts this architecture - Confirmed
.sf/milestones/partially tracked onmaindespite.gitignore - Verdict: Model is sound. Removes only accidental complexity.
GPT-5.4 (Codex) — Dissenting Opinion
Codex agreed on tracked artifacts and worktree-per-milestone, but pushed back on removing slice branches, calling it "a redesign, not a simplification." Specific concerns:
| Concern | Rebuttal |
|---|---|
| Crash recovery for orphaned slice branches disappears | The failure mode (orphaned branch needing merge) is caused by slice branches. Removing branches removes the failure. Sequential commits on one branch need no orphan recovery. |
| Concurrent edits to shared root docs (DECISIONS.md) from two terminals | Standard content conflict at squash-merge time. Not caused by or solved by slice branches. |
| Continuous integration via slice→milestone merges | In sequential single-user work, there's nothing to integrate against within the worktree. Pre-flight rebase before squash-merge is more direct. |
| Need a replacement slice-boundary primitive | Accepted: conventional commit tags (feat(M001/S01):) + optional git tags (sf/M001/S01-complete) serve as boundaries. |
Codex's analysis confirms the tracked-artifact approach but recommends treating branchless as a deliberate redesign with explicit replacement primitives, not a casual deletion.
Edge Case: Two Milestones Touching Same Source Files
Scenario: M001 and M002 both modify src/auth.ts. M001 squash-merges first.
Resolution: Before M002 squash-merges, rebase onto updated main:
cd .sf/worktrees/M002
git fetch origin main
git rebase main
# Resolve any conflicts (code-only, never .sf/)
# Then squash-merge
This is standard git workflow. SF can automate the rebase step as a pre-merge check.
Edge Case: Agent Crash Mid-Commit
Scenario: Power loss during git commit on the milestone branch.
Resolution: Git's internal journaling protects the object store. On restart:
- If commit completed: state is consistent
- If commit didn't complete: working directory has uncommitted changes,
handleAgentEndauto-commits on next dispatch - No branch to be "stuck between" — single branch means no split-brain state
Edge Case: User Edits Main While Worktree Active
Scenario: User makes manual commits on main while M001 worktree is active.
Resolution: Worktree is on milestone/M001 branch, independent of main. Manual main commits don't affect the worktree. At squash-merge time, git merge --squash handles the divergence normally. If there's a conflict, it's resolved once.
Metrics
Before (Current)
| Metric | Value |
|---|---|
| Merge/conflict/branch code | 770+ lines across 4 files |
| Merge-related test files | 11 files |
| Branch types | 4 (main, milestone/, sf//, worktree/) |
| Merge strategies | 3 (--no-ff, --squash, conflict resolution) |
| Dispatch unit types with merge logic | 2 (complete-slice, fix-merge) |
| Isolation modes | 2 (branch, worktree) |
| Doctor git checks | 4 |
After (Proposed)
| Metric | Value |
|---|---|
| Merge/conflict/branch code | ~50 lines (simplified mergeMilestoneToMain only) |
| Merge-related test files | 3-4 files (rewritten) |
| Branch types | 2 (main, milestone/*) |
| Merge strategies | 1 (--squash) |
| Dispatch unit types with merge logic | 0 |
| Isolation modes | 1 (worktree) |
| Doctor git checks | 3-4 (simplified) |
Net Impact
- ~720 lines deleted (net, after simplified replacements)
- ~7 test files deleted or consolidated
- 2 branch types eliminated
- 2 merge strategies eliminated
- 1 dispatch unit type eliminated (fix-merge)
- 1 isolation mode eliminated (branch)
- 0 merge conflicts possible within a worktree
Dependencies
-
M001 (Memory Database): The SQLite database (
sf.db) must remain gitignored. The M001/S02 importer layer rebuilds it from tracked markdown. This PRD's.gitignoreupdate explicitly ignoressf.db. -
PR #487: Must be closed. The
resolveMainWorktreeRootapproach (sharing.sf/across worktrees) contradicts tracked-artifact architecture.
Open Questions
-
Squash vs
--no-fffor milestone→main merge? Squash gives clean history onmainbut loses bisect granularity.--no-ffpreserves granular commits but cluttersmain. Current proposal: squash (matching existing behavior), with option to preserve milestone branch for debugging. -
Should
worktrees/move outside.sf/? Having worktrees inside.sf/creates a nesting-doll pattern (worktree contains.sf/which is inside.sf/worktrees/). Relocating to.sf-worktrees/or~/.sf/worktrees/<repo-hash>/is cleaner but changes the filesystem layout. Recommendation: defer, address separately if it causes issues. -
Pre-flight rebase automation? Before milestone→main squash-merge, should SF automatically
git rebase main? Gemini recommends yes. Risk: rebase can fail with conflicts, adding a code path. Recommendation: implement as a doctor check ("milestone branch is behind main by N commits") with manual resolution, automate later if needed.