sf snapshot: pre-dispatch, uncommitted changes after 42m inactivity

2026-05-04 22:41:07 +02:00 · 2026-05-04 22:41:07 +02:00 · 56aaf5bb45
commit 56aaf5bb45
parent 4053819854
11 changed files with 81 additions and 26 deletions
--- a/.omg/state/learn-watch.json
+++ b/.omg/state/learn-watch.json
@@ -1,6 +1,6 @@
 {
  "last_session_id": "67e970c5-7790-4d38-ba0b-527b9f349c49",
-  "last_event_key": "67e970c5-7790-4d38-ba0b-527b9f349c49:transcript:01389baa63d7cd14460c1725484e72f23651a4b02cc12b87f3b6f1bf6043a8d0",
+  "last_event_key": "67e970c5-7790-4d38-ba0b-527b9f349c49:transcript:4fb7e8afb9c1c96fda3a464c707cde5137eba863cb21384f4d929867c14d1d9a",
  "last_prompted_session_id": "",
  "last_reason": "short-session",
  "last_prompted_at": "",
@@ -8,5 +8,5 @@
  "last_actionable_message_count": 0,
  "deep_interview_lock_active": false,
  "deep_interview_lock_source": "/home/mhugo/code/singularity-forge/.omg/state/deep-interview.json",
-  "updated_at": "2026-05-04T19:57:31.227Z"
+  "updated_at": "2026-05-04T20:36:06.661Z"
 }
--- a/.siftignore
+++ b/.siftignore
@@ -15,11 +15,39 @@ node_modules/**
 **/vendor/**
 **/coverage/**
 .cache/**
-tmp/**
+**/tmp/**
 *.log
 dist-test/**
 packages/*/dist/**
 packages/*/target/**
 rust-engine/target/**
-rust-engine/addon/*.node
 **/tsconfig.tsbuildinfo
+.claude/**
+.serena/**
+.crush/**
+.plans/**
+.omg/**
+.agents/**
+**/.next/**
+**/.cache/**
+**/out/**
+**/coverage/**
+**/package-lock.json
+**/yarn.lock
+**/pnpm-lock.yaml
+# Ignore large binaries and assets
+*.node
+*.so
+*.dll
+*.dylib
+*.exe
+*.bin
+*.pack
+*.woff2
+*.png
+*.jpg
+*.jpeg
+*.gif
+*.svg
+*.ico
+*.pdf
--- a/src/resources/agents/scout.md
+++ b/src/resources/agents/scout.md
@@ -8,7 +8,7 @@ You are a scout. Quickly investigate a codebase and return structured findings t

 Use in-process `grep`, `find`, `ls`, and `lsp` before shelling out. These keep exploration inside SF's tool surface and use native backends where available.

-Use `codebase_search` as your PRIMARY tool for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"). It uses Sift-backed hybrid BM25/vector retrieval and is significantly more effective than grep for navigating unfamiliar logic. Use exact text search (`grep`) only when you already have a specific identifier or filename in mind. You are still the scout role; Sift is the powerful primitive you should lead with for exploration.
+Use `codebase_search` as your PRIMARY tool for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"). It uses Sift-backed hybrid BM25/vector retrieval and is significantly more effective than grep for navigating unfamiliar logic. Use `sift_search` when you need agentic multi-turn research, explicit strategy selection (e.g. `page-index-hybrid`, `path-hybrid`), or planner configuration. Use exact text search (`grep`) only when you already have a specific identifier or filename in mind. You are still the scout role; Sift is the powerful primitive you should lead with for exploration.

 Your output will be passed to an agent who has NOT seen the files you explored.

--- a/src/resources/extensions/sf/extension-manifest.json
+++ b/src/resources/extensions/sf/extension-manifest.json
@@ -11,6 +11,7 @@
 			"write",
 			"read",
 			"edit",
+			"sift_search",
 			"sf_decision_save",
 			"sf_summary_save",
 			"sf_requirement_update",
--- a/src/resources/extensions/sf/prompts/discuss-headless.md
+++ b/src/resources/extensions/sf/prompts/discuss-headless.md
@@ -76,7 +76,7 @@ Before anything else, form a diagnosis: What is the core challenge? What is brok
 - **Measure coverage**: find untested critical paths
 - **Scan for dead code, stubs, and commented-out features** — abandoned attempts are signals
 - **Discover needed skills**: identify repo languages, frameworks, data stores, external services, build tools, and domain-specific competencies. Check installed skills first; record installed, missing, and potentially useful skills in `.sf/CODEBASE.md` and `.sf/PM-STRATEGY.md`.
-- **Use code intelligence**: use `codebase_search` (or Project RAG tools if configured) as your PRIMARY exploration method for conceptual, behavioral, or architectural discovery before manually reading files. Fall back to `.sf/CODEBASE.md`, in-process `grep`/`find`/`ls`, and `lsp` only for exact matches or structural navigation.
+- **Use code intelligence**: use `codebase_search` (or Project RAG tools if configured) as your PRIMARY exploration method for conceptual, behavioral, or architectural discovery before manually reading files. Use `sift_search` for agentic multi-turn research or explicit strategy selection. Fall back to `.sf/CODEBASE.md`, in-process `grep`/`find`/`ls`, and `lsp` only for exact matches or structural navigation.
 - Use in-process `grep`, `find`, `ls`, and `lsp` before shelling out. Fall back to shell `rg`, `find`, `ast-grep`, or `ls -la` only when the native/in-process tool surface is insufficient.

 ### Step 2: Check library and ecosystem facts
--- a/src/resources/extensions/sf/prompts/discuss.md
+++ b/src/resources/extensions/sf/prompts/discuss.md
@@ -34,7 +34,7 @@ After reflection is confirmed, decide the approach based on the actual scope —

 Before asking your first question, do a mandatory investigation pass. This is not optional.

-1. **Scout the codebase** — use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer. Understand what already exists, what patterns are established, what constraints current code imposes.
+1. **Scout the codebase** — use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use `sift_search` for agentic multi-turn research or explicit strategy selection; use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer. Understand what already exists, what patterns are established, what constraints current code imposes.
 2. **Check library docs — DeepWiki first.** Use `ask_question` / `read_wiki_structure` / `read_wiki_contents` (DeepWiki) as the default for any GitHub-hosted library or framework the user mentioned. Fall back to `resolve_library` / `get_library_docs` (Context7) for npm/pypi/crates packages DeepWiki doesn't have. **Context7 free tier is capped at 1000 req/month — spend those on cases DeepWiki can't cover.** Get current facts about capabilities, constraints, API shapes, version-specific behavior.
 3. **Web search** — `search-the-web` if the domain is unfamiliar, if you need current best practices, or if the user referenced external services/APIs you need facts about. Use `fetch_page` for full content when snippets aren't enough.

--- a/src/resources/extensions/sf/prompts/guided-discuss-milestone.md
+++ b/src/resources/extensions/sf/prompts/guided-discuss-milestone.md
@@ -15,7 +15,7 @@ Apply `pm-planning` skill thinking throughout: use Working Backwards to anchor o
 ### Before your first question round

 Do a lightweight targeted investigation so your questions are grounded in reality:
-- Scout the codebase: use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer.
+- Scout the codebase: use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use `sift_search` for agentic multi-turn research or explicit strategy selection; use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer.
 - If the `PROJECT CODE INTELLIGENCE` block says Project RAG is configured, use its MCP search tools for broad concept, symbol, schema, and git-history lookup before manually reading files
 - Check the roadmap context above (if present) to understand what surrounds this milestone
 - **Library docs — DeepWiki first.** Use `ask_question` / `read_wiki_structure` / `read_wiki_contents` (DeepWiki) for any GitHub-hosted library. Fall back to `resolve_library` / `get_library_docs` (Context7) only when DeepWiki doesn't have it (Context7 is capped at 1000 req/month free tier).
--- a/src/resources/extensions/sf/prompts/guided-discuss-slice.md
+++ b/src/resources/extensions/sf/prompts/guided-discuss-slice.md
@@ -11,7 +11,7 @@ Your goal is **not** to center the discussion on tech stack trivia, naming conve
 ### Before your first question round

 Do a lightweight targeted investigation so your questions are grounded in reality:
-- Scout the codebase: use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer.
+- Scout the codebase: use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use `sift_search` for agentic multi-turn research or explicit strategy selection; use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer.
 - Check the roadmap context above to understand what surrounds this slice — what comes before, what depends on it
 - **Library docs — DeepWiki first.** Use `ask_question` / `read_wiki_structure` / `read_wiki_contents` (DeepWiki) for any GitHub-hosted library. Fall back to `resolve_library` / `get_library_docs` (Context7) only when DeepWiki doesn't have it (Context7 is capped at 1000 req/month free tier).
 - Identify the 3–5 biggest behavioural unknowns: things where the user's answer will materially change what gets built
--- a/src/resources/extensions/sf/prompts/queue.md
+++ b/src/resources/extensions/sf/prompts/queue.md
@@ -26,7 +26,7 @@ Never fabricate or simulate user input during this discussion. Never generate fa

 - Check library docs **DeepWiki first** (`ask_question` / `read_wiki_structure` / `read_wiki_contents`) for any GitHub-hosted library or framework — AI-indexed, no free-tier cap. Fall back to Context7 (`resolve_library` / `get_library_docs`) for npm/pypi/crates packages DeepWiki doesn't cover. Context7 free tier is 1000 req/month — don't spend those on cases DeepWiki covers.
 - Do web searches (`search-the-web`) to verify the landscape — what solutions exist, what's changed recently, what's the current best practice. Use `freshness` for recency-sensitive queries, `domain` to target specific sites. Use `fetch_page` to read the full content of promising URLs when snippets aren't enough. **Budget:** You have a limited number of web searches per turn (typically 3-5). Prefer DeepWiki → Context7 → web search for docs; use `search_and_read` for one-shot topic research. Do NOT repeat the same or similar queries. Distribute searches across turns rather than clustering them.
-- Scout the codebase: use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer. Understand what already exists, what patterns are established, what constraints current code imposes.
+- Scout the codebase: use `codebase_search` for conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"); use `sift_search` for agentic multi-turn research or explicit strategy selection; use in-process `grep`, `find`, `ls`, and `lsp` for exact identifier matches or structural navigation. Use `scout` for broad unfamiliar areas that need a separate explorer. Understand what already exists, what patterns are established, what constraints current code imposes.

 Don't go deep — just enough that your next question reflects what's actually true rather than what you assume.

--- a/src/resources/extensions/sf/prompts/system.md
+++ b/src/resources/extensions/sf/prompts/system.md
@@ -161,7 +161,7 @@ Templates showing the expected format for each artifact type are in:

 **Code navigation:** Use `lsp` for definition, type_definition, implementation, references, incoming_calls, outgoing_calls, hover, signature, symbols, rename, code_actions, format, and diagnostics. Falls back gracefully if no server is available. Never `grep` for a symbol definition when `lsp` can resolve it semantically. Never shell out to prettier/rustfmt/gofmt when `lsp format` is available. After editing code, use `lsp diagnostics` to verify no type errors were introduced.

-**Codebase exploration:** For conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"), use `codebase_search` first. Its hybrid BM25+Vector retrieval is significantly more effective than grep for navigating unfamiliar logic. Use in-process SF tools like `grep` for exact text matches when you already have a specific identifier, and `find`/`ls` for literal filesystem discovery. Use `lsp` for structural navigation (definitions, references). Use `.sf/CODEBASE.md` for durable orientation. If the `PROJECT CODE INTELLIGENCE` block says Project RAG is configured, use its MCP tools for broad hybrid semantic + BM25 code retrieval before manual file-by-file reading. Never read files one-by-one to "explore" — search first, then read what's relevant.
+**Codebase exploration:** For conceptual, behavioral, or architectural discovery (e.g. "how does X work?", "where is Y handled?"), use `codebase_search` first. Its hybrid BM25+Vector retrieval is significantly more effective than grep for navigating unfamiliar logic. For Sift-specific features — agentic multi-turn search, explicit strategy selection, or planner configuration — use `sift_search`. Strategy guide: `page-index-hybrid` (strongest recall + structural reranking, default), `path-hybrid` (filename/path-heavy queries), `bm25` (fast lexical-only), `vector` (semantic-only). Enable `agent: true` with `agentMode: 'graph'` for deep multi-turn research across disconnected code regions, or `plannerStrategy: 'model-driven'` for LLM-guided planning. Use in-process SF tools like `grep` for exact text matches when you already have a specific identifier, and `find`/`ls` for literal filesystem discovery. Use `lsp` for structural navigation (definitions, references). Use `.sf/CODEBASE.md` for durable orientation. If the `PROJECT CODE INTELLIGENCE` block says Project RAG is configured, use its MCP tools for broad hybrid semantic + BM25 code retrieval before manual file-by-file reading. Never read files one-by-one to "explore" — search first, then read what's relevant.

 **Swarm dispatch:** Let the system decide whether swarming fits before dispatching multiple execution subagents. Use a 2-3 worker same-model swarm only when the work splits into independent shards with explicit file/directory ownership, shard-local verification, low conflict risk, and clear wall-clock savings. Do not swarm shared-interface edits, lockfiles, migrations, single-failure debugging, or sequence-dependent work. The parent agent remains coordinator: assign ownership, synthesize results, inspect dirty files, resolve conflicts, and run final verification.

--- a/src/resources/extensions/sf/skills/researcher/SKILL.md
+++ b/src/resources/extensions/sf/skills/researcher/SKILL.md
@@ -54,10 +54,19 @@ rg -n "codebase_search|resolveSubagentLaunchSpec" src packages
 rg --files src/resources/extensions/sf/skills
 ```

-**Local code search — sift (hybrid BM25+vector search):**
-```bash
-sift search --strategy path-hybrid "authentication middleware"
-sift search --strategy hybrid --limit 5 "where is the write gate registered"
+**Local code search — `sift_search` tool (Sift hybrid BM25+vector):**
+```json
+// Best all-around preset (strongest recall + structural reranking)
+{"query": "authentication middleware token validation", "strategy": "page-index-hybrid", "limit": 5}
+
+// Path-heavy query (filename, module, symbol intent)
+{"query": "write gate registration handler", "strategy": "path-hybrid", "limit": 5}
+
+// Agentic multi-turn research for complex exploration
+{"query": "trace the cache invalidation path across all layers", "agent": true, "agentMode": "graph", "plannerStrategy": "model-driven"}
+
+// Fast lexical-only fallback
+{"query": "sift_request_factory", "strategy": "bm25", "limit": 10}
 ```

 **SF project database queries:**
@@ -108,13 +117,17 @@ read file=src/my-file.ts
 rg -n "TODO.*auth" src packages
 ```

-## Step 3: Supplement with sift (when LSP/rg is not enough)
+## Step 3: Supplement with `sift_search` (when LSP/rg is not enough)

-Use sift when you need semantic/hybrid search across unstructured content:
+Use `sift_search` when you need semantic/hybrid search, agentic multi-turn exploration,
+or explicit strategy control:

-```bash
-# Hybrid search for conceptual matches
-sift search --strategy hybrid --limit 5 "authentication middleware token validation"
+```json
+// Hybrid search for conceptual matches
+{"query": "authentication middleware token validation", "strategy": "page-index-hybrid", "limit": 5}
+
+// Agentic research for complex cross-cutting concerns
+{"query": "how does the dispatch loop handle retries and timeouts", "agent": true, "agentMode": "graph"}
 ```

 ## Step 4: Query the SF project database
@@ -189,7 +202,7 @@ What should the agent do next?
 <success_criteria>
 - Research report written to the correct artifact path
 - At least one SF DB query executed and cited
-- At least one sift search executed and cited
+- At least one `sift_search` or `codebase_search` executed and cited
 - Findings are specific (file:line or table:row references), not generic
 - Gaps identified honestly — what you could not determine
 </success_criteria>
@@ -219,14 +232,27 @@ SELECT id, scope, decision FROM decisions WHERE scope='architecture' ORDER BY se
 SELECT id, category, content FROM memories ORDER BY seq DESC LIMIT 20;
 ```

-### sift strategies
+### sift_search strategies

 | Strategy | When to use |
 |---|---|
-| `path-hybrid` | Default. File path + content matching — best for most queries |
-| `hybrid` | Pure content matching — when you don't care about file names |
-| `page-index-hybrid` | Web-page-like content (documentation) |
-| `bm25` | Exact keyword matching — fast fallback |
+| `page-index-hybrid` | **Default / best all-around.** BM25 + phrase + path-fuzzy + segment-fuzzy + vector with structural reranking. Strongest recall for conceptual discovery. |
+| `path-hybrid` | Filename, path fragment, module, or symbol stem queries. Deterministic path-sensitive reranking. |
+| `bm25` | Fast lexical-only. Use when speed matters or query is exact keyword. |
+| `vector` | Semantic-only. Use when lexical matches are sparse but meaning matters. |
+| `page-index-llm` | Heavier semantic pass on top of page-index shortlist. Slower, deeper recall. |
+| `page-index-jina` | Jina-embedding semantic pass on page-index shortlist. |
+| `page-index-gemma` | Gemma-embedding semantic pass on page-index shortlist. |
+| `path-fuzzy` | Approximate filename/path recovery. Narrow but typo-tolerant. |
+| `segment-fuzzy` | Snippet-bearing line/segment evidence. Useful when downstream tools need snippet support. |
+
+### Agent mode options
+
+| Option | Values | When to use |
+|---|---|---|
+| `agent` | `true` / `false` | Enable multi-turn autonomous search. Default: `false`. |
+| `agentMode` | `linear` / `graph` | `linear`: bounded linear planning (default). `graph`: bounded graph exploration for disconnected code regions. |
+| `plannerStrategy` | `heuristic` / `model-driven` | `heuristic`: rule-based planning (faster). `model-driven`: LLM-guided planning (slower, more thorough). |

 ### DB schema reference