SF vs RA.Aid — Full Feature Comparison
Date: 2026-05-07
Scope: Complete feature-by-feature comparison across all subsystems
Executive Summary
| Dimension | SF | RA.Aid | Verdict |
|---|---|---|---|
| Architecture | TypeScript monorepo, extension-based, DB-first | Python, LangGraph agents, ORM-based | Both valid; SF more modular |
| State Model | SQLite + JSONL dual persistence | SQLite (Peewee ORM) single source | RA.Aid simpler; SF more durable |
| Agent Stages | UOK gates (implicit) | Explicit research → plan → implement | RA.Aid clearer stage boundaries |
| Memory | Key facts, snippets, notes, trajectory | Key facts, snippets, notes, trajectory | Parity |
| Cost Tracking | Per-unit SQLite + JSONL ledger | Per-trajectory DB records + CLI commands | RA.Aid more queryable |
| Shell Safety | Execution policy profiles + inheritance | cowboy_mode + interactive approval | SF more granular |
| Subagents | Full subagent system with inheritance | No subagent delegation | SF wins |
| Mode System | 6 work modes × 3 run controls × 4 permission profiles × 3 model modes | --research-only, --research-and-plan-only, --hil, --chat | SF far ahead |
| Web UI | Next.js web app + TUI + headless + RPC | FastAPI server (optional) | SF more complete |
| Testing | Vitest, 144+ tests | pytest (suite not inspected) | SF appears more tested |
| Observability | Prometheus metrics + journal + audit | Trajectory DB + cost CLI | Different philosophies |
| Skills System | .agents/skills/ with YAML frontmatter | No skill system | SF wins |
| Recovery | Crash recovery, verification retry, rethink | Fallback handler, retry with backoff | Different strengths |
| MCP | MCP client only | No MCP | SF wins |
1. Architecture & State Model
SF
singularity-forge/
├── src/resources/extensions/sf/ # Core extension
│ ├── uok/ # UOK kernel (safety)
│ ├── auto/ # Autonomous mode state
│ ├── commands/ # CLI command handlers
│ ├── skills/ # Skill system
│ └── metrics-central.js # Prometheus metrics
├── packages/ # npm workspaces
│ ├── pi-tui/ # Terminal UI
│ ├── pi-ai/ # AI provider abstraction
│ └── ...
├── web/ # Next.js web UI
└── .sf/ # Project-local state
    ├── sf.db # SQLite (schema v43)
    ├── runtime/ # Working files
    └── sessions/ # Per-session state
State Philosophy: DB-first with JSONL durability. SQLite is the queryable source of truth; JSONL is the append-only audit log.
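The dual-persistence idea can be sketched in a few lines. This is a minimal illustration, not SF's actual code: the `events` table, the `record_event` and `open_db` helpers, and the ledger layout are all assumptions made for the example.

```python
import json
import sqlite3
from pathlib import Path


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the queryable store (hypothetical single-table schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(id INTEGER PRIMARY KEY, unit_id TEXT, payload TEXT)"
    )
    return conn


def record_event(conn: sqlite3.Connection, ledger_path: Path,
                 unit_id: str, payload: dict) -> None:
    """Append to the JSONL ledger, then write the queryable SQLite row."""
    line = json.dumps({"unit_id": unit_id, **payload}, sort_keys=True)
    # Append-only audit log: one JSON object per line, never rewritten.
    with ledger_path.open("a", encoding="utf-8") as ledger:
        ledger.write(line + "\n")
    # Queryable copy; the context manager commits the transaction.
    with conn:
        conn.execute(
            "INSERT INTO events (unit_id, payload) VALUES (?, ?)",
            (unit_id, line),
        )
```

Writing the ledger line first means a crash between the two writes leaves the audit log ahead of the database rather than behind it, which is the safer direction if the database is to be rebuilt from the ledger.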
RA.Aid
ra_aid/
├── agents/ # LangGraph agents
│ ├── research_agent.py
│ ├── planning_agent.py
│ └── implementation_agent.py
├── database/ # Peewee ORM
│ ├── models.py # Trajectory, Session, KeyFact, ...
│ ├── connection.py # SQLite with WAL
│ └── repositories/ # Repository pattern
├── tools/ # Tool implementations
├── prompts/ # Prompt templates
└── .ra-aid/ # Project-local state
    └── pk.db # SQLite database
State Philosophy: Single SQLite database with Peewee ORM. Everything is a model: sessions, human inputs, trajectories, key facts, snippets, research notes.
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| ORM | Raw SQLite (better-sqlite3) | Peewee (higher-level) |
| Schema Evolution | Manual versioned migrations | Peewee migrate |
| Query Surface | Direct SQL + tool wrappers | Repository pattern + Pydantic models |
| Session Isolation | Per-session files in ~/.sf/sessions/ | Single DB with session_id FK |
| Cross-Process | SQLite WAL + file-based locks | Peewee connection pooling |
| Backup/Export | JSONL ledger + DB file | DB file only |
Verdict: SF's dual persistence (DB + JSONL) is more durable for audit trails. RA.Aid's ORM is more ergonomic for queries.
2. Agent Stage Boundaries
SF: UOK Gate System
SF doesn't have explicit "research agent" / "planning agent" / "implementation agent". Instead, it has:
- UOK Kernel: Unified Orchestration Kernel that manages unit execution
- Gates: Pass/fail checkpoints between phases
- Work Modes: chat→plan→build→review→repair→research
- Run Control: manual→assisted→autonomous
The stage boundary is implicit in the work mode + unit type combination.
RA.Aid: Explicit Agent Pipeline
# Main flow in __main__.py (simplified)
if is_informational_query() or args.research_only:
    run_research_agent(...)  # Stage 1 only
else:
    run_research_agent(...)  # Stage 1
    run_planning_agent(...)  # Stage 2
    if not args.research_and_plan_only:
        run_task_implementation_agent(...)  # Stage 3
Each agent is a separate LangGraph agent with its own:
- Prompt template
- Tool set
- Memory/checkpointer
- Optional expert reasoning assistance
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Stage Definition | Work mode + unit type | Explicit agent function |
| Prompt Separation | Single prompt with mode injection | Separate prompt per agent |
| Tool Separation | All tools available, gated by policy | Different tools per agent |
| Memory Separation | Shared session state | Separate MemorySaver per agent |
| Expert Consultation | Model mode routing | Explicit reasoning_assist prompt |
| Stage Skipping | /mode command | --research-only, --research-and-plan-only |
Verdict: RA.Aid's explicit pipeline is clearer for users. SF's implicit gates are more flexible but harder to reason about.
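To make the explicit stage boundaries concrete, the sketch below computes which stages run for a given flag combination. It assumes that --research-only stops after research and --research-and-plan-only stops after planning; the `Flags` class and `stages_to_run` function are hypothetical names for illustration, not part of either project.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """Hypothetical stand-in for the parsed CLI arguments."""
    research_only: bool = False
    research_and_plan_only: bool = False


def stages_to_run(flags: Flags, informational: bool = False) -> list[str]:
    """Return the ordered stages implied by the flags."""
    stages = ["research"]
    # Informational queries and --research-only stop after research.
    if informational or flags.research_only:
        return stages
    stages.append("plan")
    # --research-and-plan-only stops before implementation.
    if not flags.research_and_plan_only:
        stages.append("implement")
    return stages
```

Because the result is a plain list, the same function can drive both execution and a dry-run display of what a command would do.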
3. Memory System
SF
| Memory Type | Storage | Access |
|---|---|---|
| Key Facts | SQLite (key_facts table) | get_key_facts() / add_key_fact() |
| Code Snippets | SQLite (code_snippets table) | get_code_snippets() |
| Research Notes | SQLite (research_notes table) | get_research_notes() |
| Trajectory | JSONL (uok-audit.jsonl) + SQLite | uok/audit.js |
| Prompt History | JSONL (~/.sf/agent/prompt-history.jsonl) | prompt-history.js |
| Work Log | SQLite (work_log table) | get_work_log() |
RA.Aid
| Memory Type | Storage | Access |
|---|---|---|
| Key Facts | SQLite (key_fact table) | KeyFactRepository |
| Key Snippets | SQLite (key_snippet table) | KeySnippetRepository |
| Research Notes | SQLite (research_note table) | ResearchNoteRepository |
| Trajectory | SQLite (trajectory table) | TrajectoryRepository |
| Human Input | SQLite (human_input table) | HumanInputRepository |
| Work Log | SQLite (work_log table) | WorkLogRepository |
| Related Files | SQLite (related_files table) | RelatedFilesRepository |
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Storage | Mixed (SQLite + JSONL) | Unified (SQLite only) |
| Queryability | SQL + JSONL grep | SQL only |
| Repository Pattern | Ad hoc functions | Formal repository classes |
| Pydantic Models | No | Yes (TrajectoryModel, etc.) |
| Garbage Collection | Manual | Automatic (garbage_collect()) |
| Session Scoping | Per-session files | session_id foreign key |
Verdict: RA.Aid's unified repository pattern is cleaner. SF's dual persistence is more audit-friendly.
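The repository pattern with session scoping can be illustrated as follows. This sketch uses the standard-library `sqlite3` module rather than Peewee, and the schema and method names (`create`, `for_session`) are assumptions for the example; it borrows only the `KeyFactRepository` name and the `key_fact` table name from the tables above.

```python
import sqlite3


class KeyFactRepository:
    """Minimal repository: all SQL for one table lives behind one class."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS key_fact ("
            "id INTEGER PRIMARY KEY, "
            "session_id INTEGER NOT NULL, "
            "content TEXT NOT NULL)"
        )

    def create(self, session_id: int, content: str) -> int:
        """Insert one fact and return its row id."""
        with self.conn:
            cur = self.conn.execute(
                "INSERT INTO key_fact (session_id, content) VALUES (?, ?)",
                (session_id, content),
            )
        return cur.lastrowid

    def for_session(self, session_id: int) -> list[str]:
        """Return the facts for one session, oldest first."""
        rows = self.conn.execute(
            "SELECT content FROM key_fact WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return [row[0] for row in rows]
```

Callers never see SQL, so the storage backend or schema can change without touching agent code, which is the main ergonomic benefit credited to RA.Aid here.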
4. Cost Tracking
SF
// metrics.js — per-unit cost tracking
export function recordTokenUsage(unitId, modelId, inputTokens, outputTokens, cost) {
// Writes to SQLite + JSONL
}
// Usage:
recordTokenUsage("unit-123", "claude-sonnet-4", 1500, 800, 0.045);
- Per-unit cost in SQLite
- JSONL ledger for durability
- Dashboard integration via sf cost command
- No session-level aggregation
RA.Aid
# Trajectory record with cost
trajectory_repo.create(
tool_name="llm_call",
current_cost=0.045,
input_tokens=1500,
output_tokens=800,
record_type="model_usage"
)
# Session-level aggregation
session_totals = trajectory_repo.get_session_usage_totals(session_id)
# Returns: {"total_cost": 1.23, "total_tokens": 45000, ...}
# CLI commands:
# ra-aid last-cost # Latest session
# ra-aid all-costs # All sessions
- Per-trajectory cost in DB
- SQL aggregation for session totals
- Built-in CLI commands for cost queries
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Granularity | Per-unit | Per-trajectory (finer) |
| Aggregation | Manual | SQL SUM |
| CLI Query | sf cost (basic) | ra-aid last-cost, ra-aid all-costs |
| Budget Limits | Cost guard gate | --max-cost, --max-tokens |
| Show Cost | TUI overlay | --show-cost flag |
Verdict: RA.Aid's cost tracking is more mature with built-in aggregation and CLI queries.
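Session-level aggregation with SQL SUM is straightforward once usage rows carry a session id. The sketch below is an assumption-laden illustration: the `model_usage` table, its columns, and the `session_usage_totals` helper are hypothetical and do not reproduce either project's schema.

```python
import sqlite3

# Hypothetical usage table: one row per model call.
SCHEMA = (
    "CREATE TABLE model_usage ("
    "session_id INTEGER, cost REAL, "
    "input_tokens INTEGER, output_tokens INTEGER)"
)


def session_usage_totals(conn: sqlite3.Connection, session_id: int) -> dict:
    """Aggregate per-call usage rows into one per-session total."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost), 0.0), "
        "COALESCE(SUM(input_tokens), 0), "
        "COALESCE(SUM(output_tokens), 0) "
        "FROM model_usage WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    return {
        "total_cost": row[0],
        "total_input_tokens": row[1],
        "total_output_tokens": row[2],
    }
```

`COALESCE` keeps the result numeric for sessions with no rows, so a CLI command built on this does not need a special case for empty sessions.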
5. Shell Safety & Execution Policy
SF
// execution-policy.js
const PROFILES = {
  restricted: { // No destructive tools
    allowDestructive: false,
    allowBash: false,
    allowWrite: false,
  },
  normal: { // Read-only + planning writes
    allowDestructive: false,
    allowBash: true, // But classified commands blocked
    allowWrite: true, // But source mutations gated
  },
  trusted: { // Most tools allowed
    allowDestructive: true,
    allowBash: true,
    allowWrite: true,
  },
  unrestricted: { // Everything
    allowDestructive: true,
    allowBash: true,
    allowWrite: true,
  },
};
// Subagent inheritance enforces parent policy
validateSubagentDispatch(envelope, proposal);
- 4 permission profiles
- Subagent inheritance (parent → child)
- Execution policy tool_call hook
- Destructive command classifier
RA.Aid
# tools/shell.py
cowboy_mode = get_config_repository().get("cowboy_mode", False)
if not cowboy_mode:
    response = Prompt.ask(
        "Execute this command? (y=yes, n=no, c=enable cowboy mode)",
        choices=["y", "n", "c"],
        default="y",
    )
    if response == "n":
        return {"success": False, "output": "Cancelled"}
    elif response == "c":
        get_config_repository().set("cowboy_mode", True)
- Binary: cowboy_mode on/off
- Interactive approval per command
- No subagent delegation (no inheritance needed)
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Policy Granularity | 4 profiles + model mode + work mode | Binary (cowboy_mode) |
| Approval UX | Policy-driven automatic | Interactive per-command |
| Subagent Inheritance | Full envelope propagation | N/A (no subagents) |
| Destructive Classification | Static list + dynamic analysis | None |
| Audit Trail | Journal + metrics | Trajectory |
Verdict: SF's execution policy is far more sophisticated. RA.Aid's cowboy_mode is simpler but less safe.
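A profile-plus-classifier check of the kind described for SF might look like the sketch below. It is a simplified illustration under stated assumptions: the three-profile table, the `is_destructive` and `may_run` helpers, and the regex list are all hypothetical, and a real classifier would need a far broader pattern set.

```python
import re

# Reduced, hypothetical profile table (snake_case keys for Python).
PROFILES = {
    "restricted": {"allow_bash": False, "allow_destructive": False},
    "normal": {"allow_bash": True, "allow_destructive": False},
    "trusted": {"allow_bash": True, "allow_destructive": True},
}

# Illustrative patterns only; not an exhaustive destructive-command list.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*[rf]"),
    re.compile(r"\bgit\s+push\s+.*--force\b"),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
]


def is_destructive(command: str) -> bool:
    """Static classification: does any known-dangerous pattern match?"""
    return any(p.search(command) for p in DESTRUCTIVE_PATTERNS)


def may_run(profile: str, command: str) -> bool:
    """Decide without prompting, based only on the active profile."""
    policy = PROFILES[profile]
    if not policy["allow_bash"]:
        return False
    if is_destructive(command) and not policy["allow_destructive"]:
        return False
    return True
```

The contrast with an interactive prompt is that the decision is a pure function of profile and command, so it can be logged, tested, and applied identically to child agents.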
6. Subagent System
SF
Full subagent system with:
- Modes: single, chain, parallel, debate, background
- Inheritance: Parent mode state propagates to children via env vars
- Validation: Subagent dispatch blocked if it violates parent policy
- Coordination: Parallel intent registry prevents conflicting work
// subagent-inheritance.js
export function validateSubagentDispatch(envelope, proposal) {
// Block if provider not allowed
// Block if heavy model in fast mode
// Block if destructive tools in restricted mode
}
RA.Aid
None. RA.Aid is a single-agent system and does not dispatch child agents.
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Subagent Modes | 5 modes | None |
| Inheritance | Full mode envelope | N/A |
| Parallel Work | Parallel intent registry | N/A |
| Debate Mode | Advocate + challenger | N/A |
Verdict: SF has a significant advantage for complex multi-agent workflows.
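The three blocking rules named in the validateSubagentDispatch comments can be sketched as a function that returns the reasons a dispatch is refused. The dictionary field names (`allowed_providers`, `model_mode`, `heavy_model`, `destructive_tools`) are assumptions for this example, not SF's actual envelope format.

```python
def validate_subagent_dispatch(envelope: dict, proposal: dict) -> list[str]:
    """Return the reasons a child dispatch is blocked (empty list = allowed)."""
    reasons = []
    # Rule 1: the child may only use providers the parent allows.
    allowed = envelope.get("allowed_providers")
    if allowed is not None and proposal["provider"] not in allowed:
        reasons.append("provider not allowed by parent")
    # Rule 2: a parent in fast mode may not spawn a heavy model.
    if envelope["model_mode"] == "fast" and proposal.get("heavy_model", False):
        reasons.append("heavy model requested in fast mode")
    # Rule 3: a restricted parent may not hand out destructive tools.
    if (envelope["permission_profile"] == "restricted"
            and proposal.get("destructive_tools", False)):
        reasons.append("destructive tools requested in restricted mode")
    return reasons
```

Returning every violated rule, rather than failing on the first, gives the audit trail a complete explanation for each blocked dispatch.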
7. Mode System
SF
Orthogonal axes:
- Work Mode: chat|plan|build|review|repair|research
- Run Control: manual|assisted|autonomous
- Permission Profile: restricted|normal|trusted|unrestricted
- Model Mode: fast|smart|deep
- Surface: tui|web|headless|rpc
// Direct commands
/mode build
/control autonomous
/trust trusted
/model-mode deep
// TUI shortcuts
Ctrl+Shift+M // Cycle work mode
Ctrl+Shift+A // Autonomous
Ctrl+Shift+P // Cycle permission
RA.Aid
Flags:
- --research-only: Research only, no implementation
- --research-and-plan-only: Research + plan, then exit
- --hil: Human-in-the-loop
- --chat: Chat mode (implies --hil)
- --cowboy-mode: Skip shell approval
ra-aid -m "task" --research-only
ra-aid -m "task" --research-and-plan-only
ra-aid -m "task" --hil --chat
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Work Mode | 6 modes with transitions | 2 flags (research-only, research-and-plan-only) |
| Run Control | 3 levels | Implicit (hil/chat vs default) |
| Permission | 4 profiles | 1 flag (cowboy-mode) |
| Model Routing | 3 modes (fast/smart/deep) | Per-task provider/model flags |
| Surface | 4 surfaces | 2 (CLI, server) |
| Keyboard Shortcuts | 8 shortcuts | None |
| Mode Persistence | SQLite + terminal title | In-memory only |
Verdict: SF's mode system is far more sophisticated and user-friendly.
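Orthogonal axes mean that changing one setting leaves the others untouched, and that the state space is the product of the axis sizes. The sketch below models the four behavioural axes listed for SF (surface is omitted); the `AXES` table, `set_axis`, and `count_combinations` are hypothetical names used only for illustration.

```python
from itertools import product

# Axis values as listed in this comparison; surface is left out.
AXES = {
    "work_mode": ["chat", "plan", "build", "review", "repair", "research"],
    "run_control": ["manual", "assisted", "autonomous"],
    "permission_profile": ["restricted", "normal", "trusted", "unrestricted"],
    "model_mode": ["fast", "smart", "deep"],
}


def set_axis(state: dict, axis: str, value: str) -> dict:
    """Return a new state with one axis changed; other axes are untouched."""
    if axis not in AXES:
        raise KeyError(f"unknown axis: {axis}")
    if value not in AXES[axis]:
        raise ValueError(f"invalid {axis}: {value}")
    return {**state, axis: value}


def count_combinations() -> int:
    """Number of distinct states across all four axes (6 * 3 * 4 * 3)."""
    return sum(1 for _ in product(*AXES.values()))
```

With these values the four axes yield 216 combinations, compared with the handful of flag combinations available in RA.Aid.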
8. Web UI
SF
- TUI: Terminal UI with color bands, emojis, mode badges, cost overlay
- Web: Next.js app with real-time updates
- Headless: JSON/JSONL output for automation
- RPC: gRPC/JSON-RPC for external control
sf tui # Terminal UI
sf web # Start web server
sf headless # JSON output
sf rpc # RPC server
RA.Aid
- CLI: Rich console output with panels
- Server: FastAPI server (optional)
ra-aid -m "task" # CLI
ra-aid --server # FastAPI on :1818
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Terminal UI | Full TUI with mode badges | Rich panels |
| Web Interface | Next.js | FastAPI |
| Headless/Machine | JSON/JSONL event stream | None |
| Real-time Updates | WebSocket | HTTP polling |
| Multi-session | Session manager | Single session |
Verdict: SF has a more complete multi-surface architecture.
9. Testing
SF
- Runner: Vitest
- Count: 144+ tests across 12 suites
- Coverage: V8 provider, 40/40/20/20 thresholds
- Types: Unit + integration + smoke + live
npm test # All tests
npm run test:unit # Unit only
npm run test:integration # Integration
npm run test:smoke # Smoke tests
npm run test:live # Live tests (need env)
RA.Aid
- Runner: pytest
- Count: Unknown (not inspected)
- Coverage: Unknown
- Types: Unit tests
pytest tests/
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Test Runner | Vitest | pytest |
| Test Count | 144+ | Unknown |
| Coverage | Enforced in CI | Unknown |
| Integration Tests | Yes | Unknown |
| Smoke Tests | Yes | Unknown |
| Live Tests | Yes | Unknown |
Verdict: SF appears to have more comprehensive testing infrastructure.
10. Observability
SF
| System | Purpose | Format |
|---|---|---|
| metrics-central.js | Aggregated metrics | Prometheus text |
| uok/audit.js | Per-unit audit trail | JSONL |
| journal.js | Mode transitions, decisions | SQLite |
| self-feedback.js | Inline self-correction | SQLite |
| TUI footer | Real-time cost/context | ANSI text |
RA.Aid
| System | Purpose | Format |
|---|---|---|
| Trajectory | Universal event log | SQLite (Peewee) |
| Cost CLI | Session cost queries | JSON |
| Work Log | Human-readable activity | SQLite |
| Console panels | Real-time status | Rich text |
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Metrics Format | Prometheus | None (DB queries) |
| Event Granularity | Per-unit + per-metric | Per-trajectory |
| Queryability | SQL + Prometheus | SQL only |
| Dashboard Ready | Yes (Grafana) | No |
| Real-time Display | TUI footer | Console panels |
Verdict: SF is better for external observability (Prometheus). RA.Aid is better for internal debugging (unified trajectory).
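For readers unfamiliar with the Prometheus text format, the sketch below renders one counter family by hand. The metric name `sf_tokens_total` and the `render_counter` helper are made up for the example, and label-value escaping is deliberately omitted.

```python
def render_counter(name: str, help_text: str, samples: dict) -> str:
    """Render one counter family in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to a number.
    Label values are assumed to need no escaping in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"
```

Usage, with a hypothetical metric:

```python
text = render_counter(
    "sf_tokens_total",
    "Tokens used.",
    {(("model", "claude-sonnet-4"),): 2300},
)
print(text, end="")
```

Any scraper that understands this format can ingest the output directly, which is what makes the Grafana path available to SF without a custom exporter.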
11. Skills System
SF
# .agents/skills/my-skill/SKILL.md
---
name: my-skill
user-invocable: true
model-invocable: true
side-effects: none
permission-profile: normal
---
# Skill documentation...
- YAML frontmatter
- Hierarchical discovery
- Permission filtering
- Work-mode relevance
- Eval harness
RA.Aid
No skill system. RA.Aid has custom tools (--custom-tools) but no structured skill framework.
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Skill Definition | YAML frontmatter | Python module |
| Discovery | Hierarchical .agents/skills/ | --custom-tools flag |
| Permissions | Per-skill profile | None |
| Eval | Built-in harness | None |
| Auto-creation | Pattern detection | None |
Verdict: SF has a significant advantage for structured skill management.
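Reading skill metadata amounts to splitting the frontmatter from the body. The sketch below handles only flat `key: value` lines and keeps every value as a string; it is not a YAML parser, and `parse_frontmatter` is a hypothetical helper rather than SF's loader.

```python
def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a SKILL.md-style document into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing delimiter after the opening one.
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
        None,
    )
    if end is None:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])
```

Once the metadata is a dictionary, permission filtering reduces to comparing the skill's permission-profile value with the active profile before offering the skill to the model.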
12. Recovery & Resilience
SF
| Mechanism | Purpose |
|---|---|
| Crash recovery | Resume from checkpoint after failure |
| Verification retry | Re-run failed verification gates |
| Rethink | Inject rethink prompt on stuck detection |
| Circuit breaker | Exponential backoff on gate failures |
| Cost guard | Block expensive operations |
| Writer tokens | Prevent concurrent writes |
| Parity system | Detect and recover from drift |
RA.Aid
| Mechanism | Purpose |
|---|---|
| Fallback handler | Switch to alternative models on failure |
| Retry with backoff | Re-run failed agent invocations |
| Token limiter | Remove old messages to prevent overflow |
| Recursion limit | Prevent infinite loops |
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Checkpoint/Resume | Yes | No |
| Model Fallback | Yes (on 429/rate-limit) | Yes |
| Token Management | No | Yes (limiter) |
| Circuit Breaker | Yes | No |
| Cost Guard | Yes | No (budget only) |
| Concurrent Write Prevention | Yes (writer tokens) | No |
Verdict: Different strengths. SF better for operational resilience; RA.Aid better for model resilience.
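A circuit breaker with exponential backoff, as listed among SF's mechanisms, can be sketched as follows. The class name, default values, and the choice to reset fully on success are assumptions made for this example.

```python
class CircuitBreaker:
    """Track consecutive failures and compute a growing retry delay."""

    def __init__(self, base_delay: float = 1.0, max_delay: float = 60.0,
                 threshold: int = 3) -> None:
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.threshold = threshold
        self.failures = 0

    def record_failure(self) -> float:
        """Count a failure and return the delay before the next attempt."""
        self.failures += 1
        # Delay doubles per consecutive failure, capped at max_delay.
        return min(self.max_delay, self.base_delay * 2 ** (self.failures - 1))

    def record_success(self) -> None:
        """Any success closes the breaker and resets the backoff."""
        self.failures = 0

    @property
    def is_open(self) -> bool:
        """True once failures reach the threshold; callers should stop retrying."""
        return self.failures >= self.threshold
```

The same delay calculation also covers RA.Aid's retry-with-backoff; the breaker adds the threshold that turns repeated failure into a hard stop.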
13. MCP Integration
SF
- MCP Client: Full MCP client with tool discovery, resource listing, OAuth
- MCP Server Guard: Running SF as an MCP server is explicitly forbidden (a test enforces this)
// No SF MCP server — client only
pi.registerMcpClient("filesystem", { ... });
RA.Aid
No MCP integration. RA.Aid uses LangChain tools directly.
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| MCP Client | Yes | No |
| MCP Server | Explicitly forbidden | N/A |
| Tool Discovery | Dynamic from MCP servers | Static tool definitions |
Verdict: SF is ahead for MCP ecosystem integration.
14. Provider Abstraction
SF
// pi-ai package
const provider = await resolveProvider("anthropic", "claude-sonnet-4");
const response = await provider.complete(prompt, { thinking: true });
- Abstract provider interface
- Model mode routing (fast/smart/deep)
- Temperature/thinking level management
- Provider allowlists/blocklists
RA.Aid
# llm.py
model = initialize_llm(provider, model, temperature=temperature)
response = model.invoke(prompt)
- LiteLLM for provider abstraction
- Per-task provider/model override
- Temperature support
- Expert model consultation
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Abstraction Layer | Custom (pi-ai) | LiteLLM |
| Model Routing | Mode-based (fast/smart/deep) | Explicit flags |
| Expert Model | No | Yes (reasoning_assist) |
| Temperature | Yes | Yes |
| Thinking Level | Yes | No |
Verdict: RA.Aid's expert model consultation is a unique feature. SF's mode-based routing is more automatic.
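Mode-based routing combined with fallback on rate limits can be sketched as below. The model names are placeholders, and `ROUTES`, `RateLimited`, and `complete` are hypothetical; neither project's real provider API is used.

```python
class RateLimited(Exception):
    """Stand-in for a provider's HTTP 429 error."""


# Hypothetical routing table: model mode -> ordered candidate models.
ROUTES = {
    "fast": ["small-model", "small-model-backup"],
    "smart": ["medium-model", "small-model"],
    "deep": ["large-model", "medium-model"],
}


def complete(mode: str, prompt: str, call) -> tuple[str, str]:
    """Try each candidate for the mode in order, moving on when rate-limited.

    `call(model, prompt)` is supplied by the caller and performs the request.
    Returns (model_used, response).
    """
    last_error = None
    for model in ROUTES[mode]:
        try:
            return model, call(model, prompt)
        except RateLimited as exc:
            last_error = exc
    raise RuntimeError(f"all models for mode {mode!r} are rate-limited") from last_error
```

The caller picks only a mode; which concrete model answers is decided by the table and by availability, which is what makes this style of routing more automatic than explicit per-task flags.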
15. Documentation & Prompt Engineering
SF
- AGENTS.md: Project-specific instructions
- CLAUDE.md: Claude-specific guidance
- PDD: Purpose-Driven Development fields
- Skills: .agents/skills/ with structured prompts
- Prompt History: Per-project JSONL
RA.Aid
- Prompt Templates: Separate files per agent
- Expert Prompts: Optional expert consultation
- Human Prompts: HIL sections
- Custom Tools: Dynamic tool injection
Comparison
| Aspect | SF | RA.Aid |
|---|---|---|
| Prompt Organization | Skills + PDD | Agent-specific files |
| Expert Consultation | Model mode routing | Explicit reasoning_assist |
| Human-in-the-loop | Permission profiles | --hil flag |
| Custom Tools | Skill system | --custom-tools flag |
| Prompt Versioning | Git-tracked skills | Package-bundled |
Verdict: SF's skill system is more structured. RA.Aid's expert consultation is more dynamic.
Overall Assessment
SF Strengths
- Mode system: 5 axes of control vs RA.Aid's binary flags
- Subagent system: Full delegation with inheritance
- Skills system: Structured, evaluable, discoverable
- MCP integration: Client-only, ecosystem-ready
- Execution policy: Granular permission profiles
- Observability: Prometheus-compatible metrics
- Multi-surface: TUI + web + headless + RPC
RA.Aid Strengths
- Explicit pipeline: Clear research → plan → implement flow
- Expert consultation: Dynamic reasoning assistance
- Cost tracking: Built-in aggregation and CLI queries
- Repository pattern: Clean data access
- Fallback handling: not a differentiator, since SF already has model switching on 429/rate-limit
- Token limiting: Prevent context overflow
- Simplicity: Easier to understand and modify
Where SF Should Borrow from RA.Aid
- Explicit stage boundaries: Add /research, /plan, /implement commands that mirror RA.Aid's agent pipeline
- Expert consultation: Add optional "expert model" for reasoning assistance before complex operations
- Cost CLI: Add sf cost --session, sf cost --all commands
- Repository pattern: Formalize data access with repository classes
- Token limiting: Add context window management
- Fallback handler: nothing to borrow, since SF already has model fallback on 429/rate-limit errors
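Of these items, token limiting is the easiest to prototype. The sketch below drops the oldest non-system messages until an estimated token count fits a budget. The whitespace-based token estimate and the `trim_messages` helper are assumptions for illustration; a real implementation would use the model's tokenizer.

```python
def trim_messages(messages: list[dict], max_tokens: int,
                  count=lambda text: len(text.split())) -> list[dict]:
    """Drop the oldest non-system messages until the estimate fits the budget.

    `count` is a crude whitespace-based token estimate by default.
    System messages are always kept and placed first in the result.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs: list[dict]) -> int:
        return sum(count(m["content"]) for m in msgs)

    while rest and total(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest conversational turn
    return system + rest
```

Keeping system messages unconditionally preserves the agent's instructions even when the conversation history is cut back hard.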
Where RA.Aid Should Borrow from SF
- Mode system: Add work modes, permission profiles, model modes
- Subagent system: Add delegation for parallel work
- Execution policy: Replace cowboy_mode with granular profiles
- Skills system: Add structured skill framework
- MCP integration: Add MCP client support
- UOK gates: Add safety checkpoints between stages
- Observability: Add Prometheus metrics
Conclusion
SF and RA.Aid are complementary rather than competitive:
- SF is a platform: modular, multi-surface, safety-first, designed for complex multi-agent workflows
- RA.Aid is a tool: focused, simple, explicit, designed for single-agent coding tasks
The ideal system would combine:
- SF's mode system + subagent system + skills system
- RA.Aid's explicit pipeline + expert consultation + cost tracking
- Both projects' DB-first state philosophy