Frontier Techniques for SF

Research into cutting-edge AI agent techniques that map directly to SF's architecture, ranked by impact and feasibility.

Date: 2026-03-25
Status: Research / Pre-RFC


Executive Summary

SF is a multi-layered, event-driven agent platform with strong extensibility primitives: a skill system, file-based memory, session branching, compaction, and 16+ extension lifecycle hooks. These existing primitives create natural integration points for six frontier techniques that could fundamentally change how SF operates.

The techniques fall into three categories:

| Category | Techniques | Theme |
|---|---|---|
| Self-Improvement | Skill Library Evolution, Cross-Session Learning Graph | SF gets better the more you use it |
| Performance | DAG Tool Execution, Speculative Tool Execution | SF gets faster per turn |
| Intelligence | Semantic Context Compression, MCTS Planning | SF reasons better with the same context budget |

1. Skill Library Evolution

Category: Self-Improvement | Impact: Massive | Effort: Medium | Priority: #1

What It Is

Inspired by SkillRL (ICLR 2026), this technique transforms SF's skill system from static instruction files into a self-improving knowledge base. Instead of skills being written once and updated manually, they evolve based on execution outcomes.

SkillRL demonstrates that agents with learned skill libraries outperform baselines by 15.3%+ across task benchmarks, with 10-20% token compression compared to raw trajectory storage.

How It Works

┌─────────────────────────────────────────────────────────┐
│                    EXECUTION LOOP                       │
│                                                         │
│  1. Skill invoked → agent executes task                 │
│  2. Outcome captured (success/failure + trajectory)     │
│  3. Trajectory distilled:                               │
│     ├─ Success → strategic pattern extracted            │
│     └─ Failure → anti-pattern + lesson recorded         │
│  4. Skill file updated with versioned improvement       │
│  5. Next invocation benefits from accumulated learnings │
│                                                         │
└─────────────────────────────────────────────────────────┘

Two types of learned knowledge:

| Type | Description | Example |
|---|---|---|
| General Skills | Universal strategic guidance applicable across tasks | "When editing TypeScript files, always check for type errors via LSP before committing" |
| Task-Specific Skills | Category-level heuristics for specific skill domains | "The fix-issue skill should check CI status before opening a PR, not after" |
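
To make the distinction concrete, here is a hypothetical skill file after two distillation passes. The frontmatter fields and the `Learned Patterns` section are illustrative assumptions, not an existing SF format:

```markdown
---
name: fix-issue
version: 3   # bumped by the updater each time a learning is accepted
---

Fix a reported issue end-to-end: reproduce, patch, test, open a PR.

## Learned Patterns (auto-appended)

<!-- v2 · task-specific · from a failed session -->
- Check CI status before opening a PR, not after.

<!-- v3 · general · from a successful session -->
- When editing TypeScript files, check for type errors via LSP before committing.
```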

Why It Fits SF

SF already has every primitive needed:

  • Skill files (~/.claude/skills/, .claude/skills/) — the storage layer exists
  • Extension hooks (turn_end, agent_end) — outcome capture points exist
  • Memory system (MEMORY.md + individual files) — persistence exists
  • /improve-skill and /heal-skill commands — manual versions of this loop already exist

The gap is automation: connecting execution outcomes back to skill files without human intervention.
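
A minimal sketch of closing that loop as an extension, assuming an `agent_end` hook payload shaped roughly as below. The event fields, the `onAgentEnd` entry point, and the `callCheapModel` helper are hypothetical, not SF's actual extension API:

```typescript
import { appendFile } from "node:fs/promises";

// Hypothetical stand-in for a fast/cheap model call.
declare function callCheapModel(prompt: string): Promise<string | null>;

interface AgentEndEvent {
  skillPath?: string;                          // skill invoked this session, if any
  outcome: "success" | "failure" | "partial";
  trajectory: string;                          // compacted session transcript
}

// Distill one short strategic pattern (success) or anti-pattern + lesson (failure).
async function distill(e: AgentEndEvent): Promise<string | null> {
  const kind = e.outcome === "success" ? "strategic pattern" : "anti-pattern and lesson";
  return callCheapModel(
    `Extract one ${kind} (two sentences max) from this trajectory:\n${e.trajectory}`,
  );
}

// Hook body: append a versioned learning without touching the original skill text.
export async function onAgentEnd(e: AgentEndEvent): Promise<void> {
  if (!e.skillPath || e.outcome === "partial") return; // partials need deeper analysis
  const lesson = await distill(e);
  if (!lesson) return;
  await appendFile(
    e.skillPath,
    `\n<!-- learned ${new Date().toISOString()} (${e.outcome}) -->\n- ${lesson}\n`,
  );
}
```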

Integration Points

| SF Component | Role in Integration |
|---|---|
| agent-session.ts turn_end event | Captures execution outcome (success/failure signals) |
| Extension hook: agent_end | Triggers trajectory distillation |
| Skill file system | Receives versioned updates with learned patterns |
| compaction.ts | Provides trajectory data from the session for distillation |

Architecture

User invokes skill
        │
        ▼
┌──────────────┐     ┌──────────────────┐
│ AgentSession  │────▶│  Skill Executor   │
│ (turn_end)    │     │  (tracks outcome) │
└──────────────┘     └────────┬─────────┘
                              │
                    ┌─────────▼──────────┐
                    │ Outcome Classifier  │
                    │ (success/failure/   │
                    │  partial)           │
                    └─────────┬──────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
     ┌────────────┐  ┌──────────────┐  ┌───────────┐
     │  Success   │  │   Failure    │  │  Partial   │
     │  Distiller │  │  Distiller   │  │  Analyzer  │
     └─────┬──────┘  └──────┬───────┘  └─────┬─────┘
           │                │                 │
           ▼                ▼                 ▼
     ┌─────────────────────────────────────────────┐
     │           Skill File Updater                 │
     │  • Appends learned pattern to skill          │
     │  • Versions the update                       │
     │  • Preserves original skill intent           │
     └─────────────────────────────────────────────┘

Open Questions

  • Drift prevention: How to prevent accumulated learnings from overwhelming the original skill intent?
  • Conflict resolution: What happens when a lesson from one session contradicts another?
  • Quality gate: Should updates require a validation pass before being written?

2. DAG-Based Parallel Tool Execution

Category: Performance | Impact: High | Effort: Medium | Priority: #2

What It Is

The LLM Compiler pattern (ICML 2024) treats multi-tool workflows like a compiler optimization pass. When the model returns multiple tool calls in a single response, instead of executing them sequentially, the system:

  1. Analyzes dependencies between tool calls
  2. Constructs a Directed Acyclic Graph (DAG)
  3. Executes independent tools in parallel
  4. Blocks only on actual data dependencies

How It Works

Current SF behavior (sequential):

Read(auth.ts) ─── 150ms ───▶ result
                               │
Read(types.ts) ─── 120ms ──▶ result
                               │
Grep("login") ─── 80ms ────▶ result
                               │
Read(test.ts) ─── 130ms ───▶ result
                               │
Total: ~480ms sequential

With DAG execution (parallel):

Read(auth.ts)  ─── 150ms ──▶ result ─┐
Read(types.ts) ─── 120ms ──▶ result ─┤
Grep("login")  ─── 80ms ───▶ result ─┤── all complete at 150ms
Read(test.ts)  ─── 130ms ──▶ result ─┘
                                      │
Total: ~150ms (max of parallel set)

Dependency analysis rules:

| Tool A | Tool B | Dependency? | Reason |
|---|---|---|---|
| Read(file) | Read(file) | No | Reads are idempotent |
| Read(file) | Grep(pattern) | No | Independent data sources |
| Read(file) | Edit(file) | Yes | Edit depends on Read content |
| Edit(file) | Edit(file) | Yes | Edits to the same file must serialize |
| Bash(cmd) | Bash(cmd) | Maybe | Depends on side effects |
| Write(file) | Read(file) | Yes | A read after a write must wait for the write to complete |
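
As a sketch, those rules reduce to a small classifier. The `ToolCall` shape and tool names are assumptions standing in for SF's real tool definitions, and the Bash rule is hardened from "Maybe" to a conservative "Yes":

```typescript
interface ToolCall {
  id: string;
  name: "Read" | "Grep" | "Glob" | "Edit" | "Write" | "Bash";
  file?: string; // target path, when the tool has one
}

const PURE = new Set(["Read", "Grep", "Glob"]);

/** Must call `b` wait for earlier call `a`? */
function dependsOn(a: ToolCall, b: ToolCall): boolean {
  if (a.name === "Bash" && b.name === "Bash") return true; // can't prove independence statically
  if (PURE.has(a.name) && PURE.has(b.name)) return false;  // two pure reads never conflict
  return !!a.file && a.file === b.file;                    // a write is involved: serialize same-file pairs
}

/** DAG edges: each call depends on every earlier call it conflicts with. */
function buildEdges(calls: ToolCall[]): Map<string, string[]> {
  const deps = new Map<string, string[]>();
  calls.forEach((b, i) => {
    deps.set(
      b.id,
      calls.slice(0, i).filter((a) => dependsOn(a, b)).map((a) => a.id),
    );
  });
  return deps;
}
```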

Why It Fits SF

The model already emits multiple tool_use blocks in a single response. SF processes them, but the execution path in agent-loop.ts handles them in sequence. The parallelism opportunity is sitting right there.

Estimated impact: a typical coding turn involves 3-5 tool calls. If roughly 60% of them are parallelizable (reads, greps, globs), per-turn latency drops by an estimated 40-60%. Over a 50-turn session, that adds up to minutes saved.

Integration Points

| SF Component | Role in Integration |
|---|---|
| agent-loop.ts tool execution path | Replace sequential execution with a DAG scheduler |
| Tool definitions | Annotate tools with side-effect metadata (pure/impure) |
| Extension hooks (tool_*) | Must still fire in the correct order per dependency chain |

Architecture

Model response with N tool_use blocks
                │
                ▼
┌──────────────────────────────┐
│      Dependency Analyzer      │
│  • Parse tool calls           │
│  • Identify file overlaps     │
│  • Identify data dependencies │
│  • Classify: pure vs impure   │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│        DAG Constructor        │
│  • Nodes = tool calls         │
│  • Edges = dependencies       │
│  • Topological sort           │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│      Parallel Executor        │
│  • Execute roots immediately  │
│  • On completion, unlock      │
│    dependent nodes            │
│  • Collect all results        │
│  • Return in original order   │
└──────────────────────────────┘
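
The executor stage can then be a small promise scheduler. A sketch under the same assumed `ToolCall` shape as above, with `runTool` standing in for SF's actual tool dispatch:

```typescript
declare function runTool(call: ToolCall): Promise<unknown>; // hypothetical dispatch

/** Execute calls respecting `deps`; independent nodes run concurrently. */
async function executeDag(
  calls: ToolCall[],               // in the model's original order
  deps: Map<string, string[]>,     // from buildEdges() above
): Promise<Map<string, unknown>> {
  const running = new Map<string, Promise<unknown>>();
  for (const call of calls) {
    const upstream = (deps.get(call.id) ?? []).map((id) => running.get(id)!);
    // Roots start immediately; other nodes start the moment their deps settle.
    // If an upstream rejects, Promise.all rejects and the dependent never runs.
    running.set(call.id, Promise.all(upstream).then(() => runTool(call)));
  }
  const results = new Map<string, unknown>();
  for (const call of calls) results.set(call.id, await running.get(call.id)!);
  return results; // collected back in original order for the model
}
```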

Open Questions

  • Bash side effects: How to determine if two Bash commands conflict without executing them?
  • Extension hooks: Should tool_start/tool_end events fire in execution order or original order?
  • Error propagation: If a parallel tool fails, do dependent tools get cancelled or receive the error?

3. Speculative Tool Execution

Category: Performance | Impact: High | Effort: Low-Medium | Priority: #3

What It Is

Based on Speculative Tool Calls research, this technique predicts which tools the model will request and pre-executes them before the model responds. Correct predictions eliminate the first tool-call round-trip entirely; wrong predictions are simply discarded, so the only cost is the wasted compute.

How It Works

┌─────────────────────────────────────────────────────────────┐
│ User: "fix the bug in auth.ts"                              │
│                                                             │
│ BEFORE model responds:                                      │
│   Speculator predicts:                                      │
│     ├─ Read("auth.ts")           → pre-executed ✓           │
│     ├─ Grep("error|bug", "auth") → pre-executed ✓           │
│     ├─ LSP diagnostics(auth.ts)  → pre-executed ✓           │
│     └─ Read("auth.test.ts")      → pre-executed ✓           │
│                                                             │
│ Model responds with tool calls:                             │
│     ├─ Read("auth.ts")           → CACHE HIT (0ms)         │
│     ├─ Read("auth.test.ts")      → CACHE HIT (0ms)         │
│     └─ Grep("login", "src/")     → cache miss (execute)    │
│                                                             │
│ Hit rate: 2/3 = 67%                                         │
│ Latency saved: ~300ms on this turn                          │
└─────────────────────────────────────────────────────────────┘

Prediction strategies (simplest to most sophisticated):

| Strategy | Description | Expected Hit Rate |
|---|---|---|
| Keyword extraction | Parse the user prompt for file paths and function names, then Read those files | 40-60% |
| Session history | Track which tools follow which user-prompt patterns | 50-70% |
| Learned patterns | Use the skill library evolution data to predict tool sequences | 60-80% |
| Model pre-query | Ask a fast, cheap model to predict the tool calls | 70-85% |
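
A sketch of the simplest strategy, keyword extraction, wired to a speculation cache. The `runTool` dispatch, the cache shape, and the path regex are assumptions; only pure Reads are speculated:

```typescript
declare function runTool(name: string, args: Record<string, string>): Promise<unknown>;

// Speculation cache keyed by tool name + canonicalized args.
const speculated = new Map<string, Promise<unknown>>();
const keyOf = (name: string, args: Record<string, string>) =>
  `${name}:${JSON.stringify(args)}`;

/** Keyword extraction: pre-Read anything in the prompt that looks like a path. */
function speculate(prompt: string): void {
  const paths = prompt.match(/[\w./-]+\.(?:ts|tsx|js|rs|md|json)\b/g) ?? [];
  for (const path of new Set(paths)) {
    const key = keyOf("Read", { path });
    if (!speculated.has(key)) speculated.set(key, runTool("Read", { path }));
  }
}

/** In the tool executor: serve a cache hit, otherwise execute normally. */
function executeWithCache(name: string, args: Record<string, string>): Promise<unknown> {
  return speculated.get(keyOf(name, args)) ?? runTool(name, args);
}
```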

Why It Fits SF

The #1 latency bottleneck in SF is the round-trip: user prompt → model thinks → model requests tool → tool executes → result sent back → model thinks again. Speculative execution attacks the highest-latency step.

SF's architecture makes this easy to add:

  • AgentSession.prompt() already processes user input before sending to the model
  • Tool results are already cached in the message array
  • The extension system can intercept input and spawn pre-fetches

Integration Points

| SF Component | Role in Integration |
|---|---|
| AgentSession.prompt() | Trigger speculation after user input, before the model call |
| Tool result cache (new) | Store speculated results keyed by tool + args |
| agent-loop.ts tool execution | Check the cache before executing; serve the cached result on a hit |
| Extension hook: input | Parse user intent for file paths and patterns |

Architecture

User input arrives
        │
        ├──────────────────────────────────────┐
        │                                      │
        ▼                                      ▼
┌───────────────┐                    ┌──────────────────┐
│  Send to LLM  │                    │   Speculator      │
│  (normal path) │                    │  • Extract paths   │
│               │                    │  • Predict tools   │
│  ... waiting  │                    │  • Pre-execute     │
│  for response │                    │  • Cache results   │
│               │                    └──────────────────┘
│               │                              │
│               │◀─── model returns ──────────│
│               │     tool_use blocks         │
└───────┬───────┘                              │
        │                                      │
        ▼                                      │
┌───────────────┐                              │
│ Tool Executor  │◀──── check cache ───────────┘
│ • Cache hit?   │
│   → return     │
│ • Cache miss?  │
│   → execute    │
└───────────────┘

Cost Analysis

| Scenario | Cost |
|---|---|
| Correct prediction | ~0ms latency (result already available). Compute cost: the pre-execution itself (trivial for Read/Grep). |
| Wrong prediction | Wasted compute for the pre-executed tool. For Read/Grep/Glob, this is <10ms of I/O. |
| Partial hit | Net positive as long as hit rate > 20% (given how cheap misses are). |
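
As a rough sanity check on that break-even claim, assume ~150ms saved per hit (the example above saved ~300ms on two hits) and ~10ms wasted per miss. The expected saving per speculated call at hit rate h is then 150h − 10(1 − h) ms, which turns positive around h ≈ 6%, so the 20% threshold is comfortably conservative.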

Open Questions

  • TTL for cached results: How long are speculated results valid? File contents can change between speculation and model request.
  • Side effects: Should only pure tools (Read, Grep, Glob, LSP) be speculatable?
  • Resource limits: Cap on number of speculative executions per turn to prevent I/O storms?

4. Semantic Context Compression

Category: Intelligence | Impact: High | Effort: High | Priority: #4

What It Is

SF's compaction system uses a char/4 heuristic for token estimation and all-or-nothing LLM summarization for context reduction. Research from Zylos and context engineering literature shows that embedding-based compression achieves 80-90% token reduction while preserving the ability to selectively recall specific historical context.

Current SF Compaction (Weaknesses Highlighted)

Messages: [M1, M2, M3, M4, M5, M6, M7, M8, M9, M10]
                                                    ▲
Token budget exceeded                               │ recent
                                                    │
Current approach:
┌─────────────────────────┬─────────────────────────┐
│  M1-M6: LLM-summarized │  M7-M10: kept verbatim  │
│  into single blob       │  (last ~20k tokens)     │
│                         │                         │
│  ⚠ All detail lost      │  ✓ Full fidelity        │
│  ⚠ No selective recall  │                         │
│  ⚠ char/4 overestimates │                         │
└─────────────────────────┴─────────────────────────┘

Three specific weaknesses:

| Weakness | Impact | Current Code Location |
|---|---|---|
| char/4 token estimation | ~25% overestimate → compacts too early → wastes context | compaction.ts:201-259 |
| All-or-nothing summarization | Loses specific details that may be relevant later | compaction.ts:327-400 |
| No retrieval from compacted history | Once summarized, detail is gone forever | compaction-orchestrator.ts |

Proposed: Tiered Memory Architecture

┌─────────────────────────────────────────────────────────┐
│                    HOT TIER                              │
│  Recent turns (last ~20k tokens)                        │
│  Full text, full fidelity                               │
│  Storage: in-context messages                           │
│  Access: always in prompt                               │
├─────────────────────────────────────────────────────────┤
│                    WARM TIER                             │
│  Older turns (beyond context window)                    │
│  Stored as embeddings + compressed text                 │
│  Storage: session-local vector index                    │
│  Access: retrieved when semantically relevant to        │
│          current turn                                   │
│  Token cost: only retrieved segments count              │
├─────────────────────────────────────────────────────────┤
│                    COLD TIER                             │
│  Ancient turns / previous sessions                      │
│  Stored as summaries + metadata                         │
│  Storage: disk (existing session files)                 │
│  Access: retrieved only on explicit recall              │
│  Token cost: minimal summary headers                    │
└─────────────────────────────────────────────────────────┘

How retrieval works per turn:

New user prompt arrives
        │
        ▼
┌───────────────────┐
│  Embed the prompt  │ (compute embedding of user's question)
└────────┬──────────┘
         │
         ├──── query warm tier ──▶ top-K relevant historical turns
         │                         (cosine similarity > threshold)
         │
         ├──── always include ──▶ hot tier (recent turns, full text)
         │
         ▼
┌───────────────────┐
│  Compose context   │
│  = hot + retrieved │
│  + system prompt   │
└───────────────────┘
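
A sketch of that warm-tier lookup, assuming a flat in-memory index with cosine scoring (the simplest of the index options listed in the open questions); `embed` stands in for whichever embedding model is chosen:

```typescript
declare function embed(text: string): Promise<number[]>; // local or API model, TBD

interface WarmEntry { text: string; vector: number[]; tokens: number }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

/** Top-K warm-tier segments relevant to the prompt, within a token budget. */
async function retrieveWarm(
  prompt: string, index: WarmEntry[], k = 5, minSim = 0.35, budget = 4000,
): Promise<WarmEntry[]> {
  const q = await embed(prompt);
  const ranked = index
    .map((e) => ({ e, sim: cosine(q, e.vector) }))
    .filter((r) => r.sim >= minSim)          // similarity threshold
    .sort((a, b) => b.sim - a.sim);
  const out: WarmEntry[] = [];
  let used = 0;
  for (const { e } of ranked.slice(0, k)) {
    if (used + e.tokens > budget) break;     // retrieval budget per turn
    out.push(e); used += e.tokens;
  }
  return out;
}
```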

Token Estimation Improvement

Replace char/4 with adaptive estimation:

| Approach | Accuracy | Cost |
|---|---|---|
| char/4 (current) | ~75% (overestimates) | Zero |
| Provider-reported usage | 100% (for the last turn) | Zero (already tracked) |
| tiktoken / provider tokenizer | ~98% | ~5ms per message |
| Hybrid: actual counts for recent, char/4 for old | ~95% | Negligible |

The hybrid approach — use actual token counts from provider responses for recent messages, fall back to char/4 for older messages — is a quick win that requires no new dependencies.
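
The hybrid estimator itself is small enough to sketch directly. The `Message` shape, with an optional provider-reported `usageTokens` field, is an assumption about how SF tracks usage:

```typescript
interface Message {
  content: string;
  usageTokens?: number; // provider-reported count, present for recent turns
}

/** Hybrid estimate: trust provider counts where available, else fall back to char/4. */
function estimateTokens(messages: Message[]): number {
  return messages.reduce(
    (sum, m) => sum + (m.usageTokens ?? Math.ceil(m.content.length / 4)),
    0,
  );
}
```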

Integration Points

| SF Component | Role in Integration |
|---|---|
| compaction.ts | Replace the cut-point algorithm with the tiered approach |
| compaction-orchestrator.ts | Add warm-tier retrieval before the model call |
| agent-session.ts message building | Inject retrieved warm-tier segments |
| Session persistence layer | Store embeddings alongside session entries |

Open Questions

  • Embedding model: Local (fast, private) or API (better quality, adds latency)?
  • Index format: Simple cosine similarity on flat arrays vs. HNSW index?
  • Retrieval budget: How many tokens to allocate to warm-tier retrievals per turn?
  • Coherence: How to prevent retrieved historical context from confusing the model about the current state?

5. Cross-Session Learning Graph

Category: Self-Improvement | Impact: Transformative | Effort: High | Priority: #5

What It Is

SF's memory system (MEMORY.md + individual files) stores flat, file-based memories. A learning graph extends this into a structured knowledge base that captures relationships between codebases, files, errors, solutions, and patterns across all sessions.

This is informed by research on agent memory architectures and the emerging discipline of context engineering.

Current Memory vs Learning Graph

| Aspect | Current (MEMORY.md) | Learning Graph |
|---|---|---|
| Structure | Flat file list | Nodes + edges (graph) |
| Relationships | None | "file X often breaks when Y changes" |
| Retrieval | All loaded into context | Query-driven, only relevant nodes |
| Learning | Manual (user says "remember X") | Automatic from execution outcomes |
| Scope | Per-project directory | Per-project with cross-project patterns |
| Staleness | Manual cleanup | Confidence decay over time |

Graph Schema

┌──────────┐     touches      ┌──────────┐
│  Session  │────────────────▶│   File    │
│           │                 │           │
│ • date    │                 │ • path    │
│ • outcome │                 │ • type    │
│ • tokens  │                 │ • churn   │
└────┬──────┘                 └─────┬─────┘
     │                              │
     │ encountered                  │ involved_in
     │                              │
     ▼                              ▼
┌──────────┐    resolved_by   ┌──────────┐
│  Error    │────────────────▶│ Solution  │
│           │                 │           │
│ • type    │                 │ • pattern │
│ • message │                 │ • success │
│ • freq    │                 │   rate    │
└──────────┘                 └──────────┘
     │                              │
     │ prevented_by                 │ uses
     │                              │
     ▼                              ▼
┌──────────┐                 ┌──────────┐
│  Pattern  │                │   Tool   │
│           │                │          │
│ • type    │                │ • name   │
│ • desc    │                │ • avg    │
│ • conf    │                │   time   │
└──────────┘                 └──────────┘

Example Queries

| Query | Result |
|---|---|
| "What errors have occurred in auth.ts?" | List of error nodes connected to that file node |
| "What's the typical fix for TypeError in this codebase?" | Solution nodes with the highest success rate for that error type |
| "Which files tend to break together?" | File clusters with high co-occurrence in error sessions |
| "What tools are slowest in this project?" | Tool nodes sorted by average execution time |

Integration Points

| SF Component | Role in Integration |
|---|---|
| session-manager.ts | Write graph nodes on session save |
| agent-session.ts prompt building | Query the graph for relevant context before the model call |
| Memory system (MEMORY.md) | Coexists — the graph handles structured knowledge; memory handles preferences/feedback |
| Extension hook: agent_end | Trigger a graph update with the session outcome |

Storage Options

| Option | Pros | Cons |
|---|---|---|
| SQLite + JSON columns | Simple, no dependencies, fast queries | No native vector search |
| SQLite + sqlite-vss | Adds vector similarity to SQLite | Extra native dependency |
| Flat JSON files | Zero dependencies, git-friendly | Slow for large graphs |
| LanceDB | Embedded vector DB, no server | Additional dependency |

Open Questions

  • Privacy: Graph contains detailed codebase interaction history — should it be encrypted at rest?
  • Portability: Should the graph travel with the project (.claude/ dir) or stay user-local?
  • Garbage collection: How to prune stale nodes (e.g., files that no longer exist)?

6. MCTS-Based Planning

Category: Intelligence | Impact: Transformative | Effort: Very High | Priority: #6

What It Is

Inspired by ToolTree and Monte Carlo Tree Search, this technique replaces SF's linear action selection with a tree-based planner that explores multiple solution paths simultaneously.

Instead of the model deciding one action at a time and hoping it works, the system:

  1. Generates N candidate next-actions
  2. Scores each based on estimated probability of reaching the goal
  3. Explores promising branches in parallel
  4. Backtracks when a path fails, without wasting the user's context on dead ends

Current vs MCTS Approach

Current (linear):

User: "fix the auth bug"
  │
  ▼
Action 1: Read auth.ts ──▶ Action 2: Edit line 45 ──▶ Action 3: Run tests
                                                              │
                                                         Tests fail ✗
                                                              │
                                                         ▼
                                                    Action 4: Try different edit
                                                              │
                                                         Tests fail ✗
                                                              │
                                                         ▼
                                                    Action 5: Read error log...
                                                    (linear flailing)

With MCTS (tree search):

User: "fix the auth bug"
  │
  ▼
Read auth.ts
  │
  ├── Branch A: Edit line 45 (score: 0.6)
  │     └── Run tests → FAIL → prune
  │
  ├── Branch B: Check auth middleware (score: 0.7)  ◀── highest score
  │     └── Edit middleware.ts → Run tests → PASS ✓
  │
  └── Branch C: Check env config (score: 0.3)
        └── (not explored — lower score)

Result: Branch B succeeds after 2 actions, not 5+

Why It Fits SF

SF already has session branching primitives:

  • fork() creates a branch from any message
  • Branch summaries compress history at fork points
  • Tree navigation (/tree) lets users explore branches
  • Session tree is already a first-class concept

The gap: these primitives are user-triggered. MCTS would make the agent trigger them automatically during problem-solving.
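
What agent-triggered branching could look like, reduced to a best-first simplification of the full MCTS loop. `proposeCandidates` and `tryBranch` (fork a session, run the action in an isolated worktree, check the outcome) are hypothetical helpers layered over the primitives above:

```typescript
interface Candidate { description: string; score: number } // score: estimated P(success)

declare function proposeCandidates(goal: string, n: number): Promise<Candidate[]>;
declare function tryBranch(goal: string, c: Candidate): Promise<{ ok: boolean }>;

/** Best-first exploration: try the highest-scored branches until one verifies. */
async function explore(goal: string, budget = 3): Promise<Candidate | null> {
  const candidates = await proposeCandidates(goal, 5);
  candidates.sort((a, b) => b.score - a.score);
  for (const c of candidates.slice(0, budget)) {
    // Each attempt runs in a forked session plus isolated worktree, so a
    // failed branch never pollutes the user's main context.
    if ((await tryBranch(goal, c)).ok) return c; // winning branch becomes the result
  }
  return null; // budget exhausted: fall back to linear execution
}
```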

Architecture

┌─────────────────────────────────────────────────────────┐
│                    MCTS Planning Layer                   │
│                                                         │
│  ┌─────────────┐    ┌──────────────┐    ┌────────────┐ │
│  │   Proposer   │───▶│   Scorer     │───▶│  Selector  │ │
│  │ Generate N   │    │ Estimate P   │    │ Pick best  │ │
│  │ candidates   │    │ of success   │    │ to explore │ │
│  └─────────────┘    └──────────────┘    └─────┬──────┘ │
│                                               │        │
│  ┌─────────────┐    ┌──────────────┐          │        │
│  │  Pruner     │◀───│   Executor   │◀─────────┘        │
│  │ Kill dead   │    │ Run action   │                   │
│  │ branches    │    │ in worktree  │                   │
│  └─────────────┘    └──────────────┘                   │
└─────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────────┐
│  Agent Session       │
│  (receives winning   │
│   branch as result)  │
└─────────────────────┘

Scoring Approaches

| Approach | Speed | Quality | Cost |
|---|---|---|---|
| Heuristic (file relevance, error proximity) | Fast | Low | Free |
| Fast model (haiku-class rates candidates) | Medium | Medium | Low |
| Self-evaluation (main model rates its own proposals) | Slow | High | High |
| Learned scorer (trained on past outcomes from the learning graph) | Fast | High | Free at inference |
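
The heuristic row is the only free option and fits in a few lines. A sketch where the feature set and weights are illustrative guesses, not tuned values:

```typescript
interface ActionCandidate {
  touchesFiles: string[];    // files the action would read or edit
  mentionedInError: boolean; // does the action target code named in the failing output?
}

/** Cheap heuristic score in [0, 1]: file relevance plus error proximity. */
function heuristicScore(c: ActionCandidate, relevantFiles: Set<string>): number {
  const overlap = c.touchesFiles.filter((f) => relevantFiles.has(f)).length;
  const fileRelevance = c.touchesFiles.length
    ? overlap / c.touchesFiles.length
    : 0;
  return 0.7 * fileRelevance + 0.3 * (c.mentionedInError ? 1 : 0);
}
```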

Integration Points

| SF Component | Role in Integration |
|---|---|
| agent-loop.ts | New planning phase between user prompt and action execution |
| Session branching (fork()) | Used to create exploration branches |
| Git worktrees | Each branch is explored in an isolated worktree |
| agent-session.ts | Receives the winning branch and presents it as the result |
| Skill Library Evolution (#1) | Provides learned patterns that improve the scorer over time |

Cost-Benefit Analysis

| Factor | Value |
|---|---|
| LLM calls per turn | 2-5x more (proposal generation + scoring) |
| Token usage | 3-10x more per complex problem |
| Success rate on hard problems | Estimated 30-50% improvement |
| Time to solution | Fewer total turns, despite more LLM calls per turn |
| User experience | The agent appears to "think harder" on hard problems |

Open Questions

  • When to activate: MCTS is expensive. Should it only activate when the agent detects a hard problem (repeated failures, high uncertainty)?
  • Branch isolation: Git worktrees work for file changes, but how to isolate Bash side effects?
  • Budget control: How many branches to explore before falling back to linear execution?
  • Transparency: Should the user see the exploration tree or just the winning path?

Priority Matrix

| # | Technique | Impact | Effort | Compounding | Dependencies |
|---|---|---|---|---|---|
| 1 | Skill Library Evolution | Massive | Medium | Yes — improves all other techniques | None |
| 2 | DAG Tool Execution | High | Medium | No — static speedup | None |
| 3 | Speculative Tool Execution | High | Low-Med | Yes — improves with learning | Benefits from #1 |
| 4 | Semantic Context Compression | High | High | No — static improvement | None |
| 5 | Cross-Session Learning Graph | Transformative | High | Yes — feeds #1, #3, #6 | Benefits from #1 |
| 6 | MCTS Planning | Transformative | Very High | Yes — improves with #1, #5 | Benefits from #1, #5 |

Phase 1 (Foundation)          Phase 2 (Performance)       Phase 3 (Intelligence)
─────────────────────         ─────────────────────       ─────────────────────
┌─────────────────┐          ┌─────────────────┐         ┌─────────────────┐
│ Skill Library    │          │ DAG Tool Exec   │         │ Semantic Context│
│ Evolution        │──feeds──▶│                 │         │ Compression     │
│                  │          │ Speculative     │         │                 │
│                  │──feeds──▶│ Tool Exec       │         │ MCTS Planning   │
└─────────────────┘          └─────────────────┘         └─────────────────┘
                                      │                          ▲
┌─────────────────┐                   │                          │
│ Cross-Session   │───────────────────┴──────────────────────────┘
│ Learning Graph  │         (feeds intelligence layer)
└─────────────────┘

Phase 1 creates the feedback loop that makes everything else better over time. Phase 2 delivers immediate, measurable performance wins. Phase 3 requires the most architectural change but delivers the deepest capability gains.


Sources & References

Papers

Industry & Analysis

Workshops & Events