singularity-forge/docs/dev/MEMORY-SYSTEM-ARCHITECTURE.md
2026-05-08 03:01:20 +02:00

SF Memory System Architecture

Overview

Singularity-forge includes a complete autonomous memory system built on SQLite (Node 26 native) with no external dependencies. The memory system enables SF to:

  • Learn from unit execution patterns and outcomes
  • Recall similar past situations for context-aware decisions
  • Adapt dispatch ranking based on historical patterns
  • Detect recurring issues and gotchas
  • Preserve architectural knowledge and conventions

Core Modules

1. memory-store.js (Core CRUD Layer)

Location: src/resources/extensions/sf/memory-store.js

Purpose: Foundational CRUD operations and ranking engine for all memory operations.

Key Functions:

  • createMemory(category, content, confidence = 0.8) — Create a new memory entry
  • getRelevantMemoriesRanked(embedding, category, limit = 5) — Query by similarity and category
  • updateMemoryConfidence(memoryId, confidence) — Adjust confidence scores
  • deleteMemory(memoryId) — Remove outdated memories
  • getMemoriesByRelation(fromId, relationName) — Follow relationship graphs
  • isDbAvailable() — Check database connectivity

Categories Supported:

  • gotcha — Known issues, workarounds, edge cases
  • convention — Coding standards, naming patterns, architectural rules
  • architecture — Design decisions, module responsibilities
  • pattern — Recurring execution patterns (unit types, dependencies)
  • environment — Configuration, setup, environment-specific behaviors
  • preference — User preferences, optimization decisions

Storage Schema:

memories (
  id TEXT PRIMARY KEY,
  category TEXT,
  content TEXT,
  confidence REAL,
  created_at TEXT,
  updated_at TEXT,
  hit_count INTEGER
)
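As a rough illustration of how a call like getRelevantMemoriesRanked might combine the category filter with a similarity score, the sketch below ranks in-memory rows shaped like the memories table. The rankMemoriesSketch name, the similarityOf callback, and sorting purely by similarity are assumptions made for illustration; the actual ranking engine in memory-store.js may also weigh confidence or hit_count.

```javascript
// Hypothetical sketch: rank rows shaped like the `memories` table.
// `similarityOf(row)` is an assumed callback that returns a similarity score.
function rankMemoriesSketch(rows, similarityOf, category, limit = 5) {
  return rows
    .filter((row) => row.category === category)
    .map((row) => ({ ...row, similarity: similarityOf(row) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}

const rows = [
  { id: 'm1', category: 'gotcha', content: 'Network timeout', confidence: 0.95 },
  { id: 'm2', category: 'gotcha', content: 'Lock contention', confidence: 0.87 },
  { id: 'm3', category: 'pattern', content: 'Research units', confidence: 0.9 },
];
const scores = { m1: 0.4, m2: 0.7, m3: 0.99 };
const top = rankMemoriesSketch(rows, (row) => scores[row.id], 'gotcha', 2);
console.log(top.map((row) => row.id)); // m2 first, then m1 (m3 is filtered out)
```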

2. memory-embeddings.js (Vector Operations)

Location: src/resources/extensions/sf/memory-embeddings.js

Purpose: Convert content to embeddings and perform similarity operations (cosine similarity).

Key Functions:

  • computeEmbedding(content) — Generate deterministic embedding
  • storeEmbedding(memoryId, embedding, model = "default") — Persist embedding as BLOB
  • getEmbedding(memoryId) — Retrieve stored embedding
  • cosineSimilarity(embedding1, embedding2) — Compute similarity score (0-1)

Vector Format:

  • Embeddings stored as Float32Array → serialized BLOB in SQLite
  • Default: 128-dimensional vectors
  • Deterministic: same content always produces same embedding

Storage Schema:

memory_embeddings (
  memory_id TEXT PRIMARY KEY,
  model TEXT,
  dimensions INTEGER,
  vector BLOB,
  updated_at TEXT
)
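The Float32Array to BLOB round trip can be sketched without a database. The helper names below are hypothetical; the only assumption about the driver is that it accepts and returns a Buffer or Uint8Array for BLOB columns.

```javascript
// Float32Array -> Buffer suitable for a BLOB column (a view over the same bytes).
function embeddingToBlob(embedding) {
  return Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength);
}

// Buffer/Uint8Array -> Float32Array. Copy first so the bytes start at offset 0
// of a fresh ArrayBuffer, which guarantees 4-byte alignment.
function blobToEmbedding(blob) {
  const copy = new Uint8Array(blob);
  return new Float32Array(copy.buffer);
}

const embedding = new Float32Array([0.25, -0.5, 1]);
const row = {
  memory_id: 'm1',
  model: 'default',
  dimensions: embedding.length,
  vector: embeddingToBlob(embedding),
  updated_at: new Date().toISOString(),
};
const restored = blobToEmbedding(row.vector);
console.log(row.vector.byteLength, Array.from(restored)); // 12 [ 0.25, -0.5, 1 ]
```

Storing dimensions alongside the BLOB lets a reader check that vector.byteLength equals dimensions * 4 before decoding.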

3. memory-relations.js (Graph Layer)

Location: src/resources/extensions/sf/memory-relations.js

Purpose: Create and query relationship graphs between memories.

Key Functions:

  • createRelation(fromId, toId, relationName, confidence = 0.8) — Link two memories
  • getRelatedMemories(fromId, relationName) — Follow outgoing edges
  • getReverseRelations(toId, relationName) — Follow incoming edges
  • computePathWeight(fromId, toId, relationName) — Path strength

Relationship Types:

  • "caused_by" — Unit failure → root cause
  • "similar_to" — Pattern similarity
  • "workaround_for" — Known fix for issue
  • "depends_on" — Architectural dependency

Storage Schema:

memory_relations (
  from_id TEXT,
  to_id TEXT,
  relation_name TEXT,
  confidence REAL,
  created_at TEXT,
  PRIMARY KEY (from_id, to_id, relation_name)
)

4. memory-ingest.js (Input Layer)

Location: src/resources/extensions/sf/memory-ingest.js

Purpose: Ingest external knowledge (files, URLs, documentation) into memory.

Key Functions:

  • ingestFile(filePath, category, options) — Load from local file
  • ingestUrl(url, category, options) — Fetch and parse URL content
  • ingestMarkdown(content, category) — Parse markdown headers as memory entries
  • ingestCodeSnippet(code, language, category) — Extract and learn from code

Use Cases:

  • Load README.md as architectural conventions
  • Import docs/ as foundational knowledge
  • Parse error logs as gotchas
  • Extract code patterns from examples
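For the README use case, a minimal sketch of header-based splitting might look like the following. It assumes each markdown heading starts a new memory entry whose content is the heading plus the text beneath it; splitMarkdownIntoEntries is a hypothetical name, and the real ingestMarkdown may chunk content differently.

```javascript
// Hypothetical sketch: each markdown heading starts a new memory entry.
function splitMarkdownIntoEntries(markdown, category) {
  const sections = [];
  let current = null;
  for (const line of markdown.split('\n')) {
    const heading = /^#{1,6}\s+(.*)$/.exec(line);
    if (heading) {
      current = { title: heading[1].trim(), body: [] };
      sections.push(current);
    } else if (current && line.trim() !== '') {
      current.body.push(line.trim());
    }
  }
  // Text before the first heading is ignored in this sketch.
  return sections.map(({ title, body }) => ({
    category,
    content: [title, ...body].join('\n'),
  }));
}

const readme = '# Naming\nUse kebab-case for files.\n\n## Tests\nCo-locate tests with sources.';
const entries = splitMarkdownIntoEntries(readme, 'convention');
console.log(entries.length); // 2
```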

5. memory-extractor.js (Auto-Learning)

Location: src/resources/extensions/sf/memory-extractor.js

Purpose: Automatically extract and learn patterns from unit execution.

Key Functions:

  • extractPatternFromUnit(unit, status, result) — Learn from unit completion
  • extractFailureGotcha(unit, error) — Record and categorize failures
  • extractConventionFromCode(filePath, codeContent) — Detect patterns
  • deduplicateMemory(memoryId, similarMemories) — Merge similar learnings

Learning Strategy:

  • Success: high confidence (0.9) — strong signal
  • Failure: medium confidence (0.5) — more variable
  • Conventions: learned from code reviews
  • Architectures: extracted from design docs
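The success and failure confidences above can be expressed as a small mapping. In this sketch the outcomeToMemory name, the unit shape, and the choice to file successes under pattern and failures under gotcha are assumptions for illustration.

```javascript
// Confidence values taken from the learning strategy above.
const CONFIDENCE_BY_OUTCOME = { success: 0.9, failure: 0.5 };

// Hypothetical sketch: turn a unit outcome into a memory candidate.
function outcomeToMemory(unit, result) {
  const outcome = result.success ? 'success' : 'failure';
  const confidence = CONFIDENCE_BY_OUTCOME[outcome];
  return {
    category: result.success ? 'pattern' : 'gotcha', // assumed mapping
    content: `unit-type:${unit.type} ${outcome} confidence:${confidence}`,
    confidence,
  };
}

const learned = outcomeToMemory({ id: 'unit-1', type: 'code-review' }, { success: true });
console.log(learned.content); // unit-type:code-review success confidence:0.9
```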

6. memory-embeddings-llm-gateway.js (Semantic Reranking)

Location: src/resources/extensions/sf/memory-embeddings-llm-gateway.js

Purpose: Optional LLM-powered semantic reranking of retrieved memories.

Key Functions:

  • rerankedByLLM(memories, query, topK = 3) — Use LLM to rerank results
  • isLLMAvailable() — Check if LLM provider configured
  • cacheRerankResult(query, topK, result) — Cache LLM rankings

Workflow:

  1. Vector similarity returns candidates (cosine-based)
  2. LLM gateway reranks semantically
  3. Top results returned with adjusted scores
  4. Cache results for subsequent identical queries

Fallback: If LLM unavailable, returns original vector-ranked results
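The workflow and fallback can be sketched as a single function. Here llmRerank is an injected async function standing in for whatever provider is configured; it, the cache key format, and the rerankWithFallback name are hypothetical and do not describe the gateway's real API.

```javascript
const rerankCache = new Map(); // keyed by query and topK

// Hypothetical sketch: rerank with an LLM, cache successes, fall back on failure.
async function rerankWithFallback(candidates, query, topK, llmRerank) {
  const cacheKey = JSON.stringify([query, topK]);
  if (rerankCache.has(cacheKey)) return rerankCache.get(cacheKey);
  try {
    const reranked = (await llmRerank(candidates, query)).slice(0, topK);
    rerankCache.set(cacheKey, reranked);
    return reranked;
  } catch {
    // LLM unavailable or failed: keep the original vector ranking.
    return candidates.slice(0, topK);
  }
}

const candidates = [{ id: 'm1' }, { id: 'm2' }, { id: 'm3' }];
const failingLlm = async () => { throw new Error('provider not configured'); };
rerankWithFallback(candidates, 'timeout', 2, failingLlm)
  .then((result) => console.log(result.map((memory) => memory.id))); // [ 'm1', 'm2' ]
```

Only successful rerankings are cached in this sketch, so a temporary provider outage does not pin the fallback order for later identical queries.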


7. memory-relations.js (Graph Operations)

Location: src/resources/extensions/sf/memory-relations.js

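Purpose: Create and traverse memory relationship graphs. This is the same file as section 3; the functions listed here cover path finding and multi-hop confidence rather than single-edge lookups.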

Key Functions:

  • linkMemories(fromId, toId, relationName, confidence) — Create edges
  • findRelationPath(fromId, toId, maxDepth) — Path finding (similar to BFS)
  • computeGraphConfidence(fromId, toId) — Multi-hop confidence decay

Graph Traversal:

  • Relation strength decays with path depth
  • Can find indirect causes of failures
  • Enables multi-hop pattern matching
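A breadth-first traversal with per-hop decay could look like the sketch below. It assumes path confidence is the product of the edge confidences along the path, which is one plausible reading of "decays with path depth"; the function name and the in-memory edge list are illustrative only.

```javascript
// Hypothetical sketch: breadth-first search over relation edges.
// Path confidence is the product of edge confidences, so it shrinks per hop.
function findRelationPathSketch(edges, fromId, toId, maxDepth = 3) {
  const queue = [{ id: fromId, path: [fromId], confidence: 1 }];
  const visited = new Set([fromId]);
  while (queue.length > 0) {
    const node = queue.shift();
    if (node.id === toId) return node;
    if (node.path.length - 1 >= maxDepth) continue; // hop limit reached
    for (const edge of edges) {
      if (edge.from_id !== node.id || visited.has(edge.to_id)) continue;
      visited.add(edge.to_id);
      queue.push({
        id: edge.to_id,
        path: [...node.path, edge.to_id],
        confidence: node.confidence * edge.confidence,
      });
    }
  }
  return null; // no path within maxDepth hops
}

const edges = [
  { from_id: 'fail-1', to_id: 'timeout', relation_name: 'caused_by', confidence: 0.8 },
  { from_id: 'timeout', to_id: 'dns-misconfig', relation_name: 'caused_by', confidence: 0.5 },
];
const found = findRelationPathSketch(edges, 'fail-1', 'dns-misconfig');
console.log(found.path, found.confidence); // [ 'fail-1', 'timeout', 'dns-misconfig' ] 0.4
```

The two-hop result shows how an indirect cause is reachable but carries less weight (0.4) than either direct edge.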

8. memory-sleeper.js (Decay & Supersession)

Location: src/resources/extensions/sf/memory-sleeper.js

Purpose: Age memories and mark superseded entries.

Key Functions:

  • markSuperseded(oldMemoryId, newMemoryId) — Chain updates
  • decayOldMemories(olderThanDays = 30) — Reduce confidence of old entries
  • archiveMemory(memoryId) — Mark as historical
  • reactivateMemory(memoryId) — Re-promote archived memory

Strategy:

  • Memories age over time (confidence decay)
  • New learnings override old ones via supersession
  • Archive doesn't delete; keeps full history
  • Old memories still searchable with lower weight
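The aging step might be sketched as follows. The multiplicative factor of 0.9 and the 0.1 floor are illustrative assumptions, as is the use of updated_at to determine age; the document does not state the actual decay curve.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical sketch: multiply confidence by `factor` for rows older than the cutoff.
// The 0.9 factor and the 0.1 floor are assumed values, not SF's real settings.
function decayOldMemoriesSketch(memories, now, olderThanDays = 30, factor = 0.9, floor = 0.1) {
  const cutoff = now.getTime() - olderThanDays * DAY_MS;
  return memories.map((memory) => {
    if (Date.parse(memory.updated_at) >= cutoff) return memory; // still fresh
    return { ...memory, confidence: Math.max(floor, memory.confidence * factor) };
  });
}

const now = new Date('2026-05-08T00:00:00Z');
const decayed = decayOldMemoriesSketch(
  [
    { id: 'old', confidence: 0.8, updated_at: '2026-01-01T00:00:00Z' },
    { id: 'fresh', confidence: 0.8, updated_at: '2026-05-01T00:00:00Z' },
  ],
  now,
);
console.log(decayed.map((memory) => [memory.id, memory.confidence.toFixed(2)]));
// [ [ 'old', '0.72' ], [ 'fresh', '0.80' ] ]
```

Keeping a floor above zero matches the strategy that old memories remain searchable with lower weight instead of disappearing.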

9. memory-backfill.js (Historical Data)

Location: src/resources/extensions/sf/memory-backfill.js

Purpose: Bulk-load historical data from past runs into memory.

Key Functions:

  • backfillFromRunLogs(logPath) — Import execution history
  • backfillFromGitHistory(repoPath) — Learn from git commits
  • backfillFromTestResults(testPath) — Ingest test data
  • computeBackfillConfidence(dataSource) — Adjust confidence by source quality

Use Cases:

  • Initial knowledge load from project history
  • Recover from database reset
  • Merge memories from multiple SF instances

10. memory-source-store.js (Source Tracking)

Location: src/resources/extensions/sf/memory-source-store.js

Purpose: Track origins of memories for traceability and debugging.

Key Functions:

  • trackMemorySource(memoryId, sourceUri, sourceType) — Record where memory came from
  • getMemorySources(memoryId) — Audit trail of memory
  • validateSourceFreshness(sourceUri) — Check if source updated
  • revalidateMemory(memoryId) — Re-fetch from source if changed

Source Types:

  • "unit-outcome" — Learned from unit execution
  • "documentation" — From docs/
  • "user-input" — Manually added
  • "llm-extracted" — From LLM analysis
  • "git-history" — From commits

Storage Schema:

memory_sources (
  memory_id TEXT,
  source_uri TEXT,
  source_type TEXT,
  created_at TEXT,
  last_validated_at TEXT
)

11. commands-memory.js (CLI Interface)

Location: src/resources/extensions/sf/commands-memory.js

Purpose: Command-line interface to memory system.

Commands:

  • /sf memory list [category] — List all memories (optionally filtered)
  • /sf memory search <query> — Find memories by content
  • /sf memory note <content> — Manually add memory
  • /sf memory status — Memory database statistics
  • /sf memory decay [--older-than-days N] — Age memories
  • /sf memory export <path.json> — Export all memories to JSON
  • /sf memory import <path.json> — Import memories from JSON

12. memory-tools.js (Tool Exports)

Location: src/resources/extensions/sf/tools/memory-tools.js

Purpose: Export memory functions as SF tools for agent use.

Exported Tools:

  • recall-memory — Query by context
  • create-memory — Store new learning
  • link-memories — Create relationships
  • search-memories — Full-text search
  • get-memory-stats — Analytics

Databases

sf-db.js (SQLite Backend)

Location: src/resources/extensions/sf/sf-db.js

Purpose: Core SQLite database abstraction (Node 26 native, no external deps).

Tables:

  • memories — Memory entries
  • memory_embeddings — Vector data
  • memory_relations — Relationship graph
  • memory_sources — Source tracking
  • Plus other SF tables (uok, env, etc.)

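Key Advantage: SQLite support is built into Node through the node:sqlite module (introduced in Node 22.5; SF targets Node 26.1+), so no native add-on has to be installed.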


Integration Points

1. UOK Kernel Integration (Unit Recording)

File: src/resources/extensions/sf/uok/unit-runtime.js

Function added: recordUnitOutcomeInMemory(unit, status, result)

recordUnitOutcomeInMemory(unit, "completed", { 
  success: true, 
  executionTimeMs: 2341 
})
// Stores pattern: "unit-type:code-review success confidence:0.9"

2. Dispatch Ranking (Decision Enhancement)

File: src/resources/extensions/sf/auto-dispatch.js

Function added: enhanceUnitRankingWithMemory(units, baseScores)

const enhanced = await enhanceUnitRankingWithMemory(candidates, {
  'unit-1': 0.75,
  'unit-2': 0.60
})
// Boosts scores based on learned patterns
// Boosted score = baseScore + (topMemoryConfidence * 0.15)
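As a worked instance of the formula, the sketch below applies the 0.15 weight to a map of base scores. The applyMemoryBoost name and the topConfidenceByUnit input are hypothetical, and rounding to two decimals is only there to keep the printed scores readable.

```javascript
const MEMORY_BOOST_WEIGHT = 0.15; // weight from the formula above

// Hypothetical sketch: `topConfidenceByUnit` maps a unit id to the confidence
// of its best matching memory; units without a match get no boost.
function applyMemoryBoost(baseScores, topConfidenceByUnit) {
  return Object.entries(baseScores)
    .map(([id, baseScore]) => {
      const boost = (topConfidenceByUnit[id] ?? 0) * MEMORY_BOOST_WEIGHT;
      return { id, score: Math.round((baseScore + boost) * 100) / 100 };
    })
    .sort((a, b) => b.score - a.score);
}

const boosted = applyMemoryBoost({ 'unit-1': 0.75, 'unit-2': 0.6 }, { 'unit-1': 0.8 });
console.log(boosted);
// [ { id: 'unit-1', score: 0.87 }, { id: 'unit-2', score: 0.6 } ]
```

With this weight a single memory can add at most 0.15 to a score, so memory nudges the ranking without overriding a large gap in base scores.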

3. Gate Context (Failure Diagnostics)

File: src/resources/extensions/sf/uok/gate-runner.js

Function added: enrichGateResultWithMemory(gateResult, gateId)

const enriched = await enrichGateResultWithMemory(
  { outcome: 'fail', reason: 'timeout' },
  'deployment-gate'
)
// Adds memoryContext: { hasHistoricalPattern: true, ... }
// Pure diagnostic, never changes gate logic
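The "pure diagnostic" property can be illustrated with a sketch that copies the gate result and only adds a memoryContext field. The substring match stands in for a real similarity lookup, and relatedMemoryIds and the function name are hypothetical.

```javascript
// Hypothetical sketch: attach diagnostic context without touching the outcome.
function enrichGateResultSketch(gateResult, gateId, gotchas) {
  // Assumed stand-in for a similarity query: match on the failure reason text.
  const matches = gotchas.filter((memory) => memory.content.includes(gateResult.reason));
  return {
    ...gateResult, // outcome and reason are copied unchanged
    memoryContext: {
      gateId,
      hasHistoricalPattern: matches.length > 0,
      relatedMemoryIds: matches.map((memory) => memory.id),
    },
  };
}

const enrichedSketch = enrichGateResultSketch(
  { outcome: 'fail', reason: 'timeout' },
  'deployment-gate',
  [{ id: 'm1', content: 'Network timeout during deploy' }],
);
console.log(enrichedSketch.outcome, enrichedSketch.memoryContext.hasHistoricalPattern); // fail true
```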

Usage Examples

Example 1: Record Unit Completion

import { recordUnitOutcomeInMemory } from './uok/unit-runtime.js';

// After unit executes
recordUnitOutcomeInMemory(unit, 'completed', {
  success: true,
  executionTimeMs: 2341
});
// Fire-and-forget: stores pattern in memory

Example 2: Get Dispatch Context

import { enhanceUnitRankingWithMemory } from './auto-dispatch.js';

const candidates = [
  { id: 'unit-a', type: 'research', readiness: 0.8 },
  { id: 'unit-b', type: 'research', readiness: 0.6 },
];

const enhanced = await enhanceUnitRankingWithMemory(candidates, {
  'unit-a': 0.8,
  'unit-b': 0.6
});

// Returns ranked with memory boost
// { id: 'unit-a', score: 0.92 } (boosted by 0.12)
// { id: 'unit-b', score: 0.60 } (no pattern match)

Example 3: Search for Gotchas

import { getRelevantMemoriesRanked } from './memory-store.js';

const gotchas = await getRelevantMemoriesRanked(
  unitEmbedding,
  'gotcha',
  3 // top 3
);

// Returns similar past issues
// [
//   { id: 'm1', confidence: 0.95, content: 'Network timeout during...' },
//   { id: 'm2', confidence: 0.87, content: 'Database lock contention...' },
//   ...
// ]

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                    SF Dispatch Loop                         │
│                                                             │
│  ┌────────────────────────────────────────────────────┐   │
│  │ For each unit candidate:                            │   │
│  │ 1. Base score (readiness, priority, etc.)          │   │
│  │ 2. enhanceUnitRankingWithMemory()                  │   │
│  │    ├─→ query memory for similar patterns           │   │
│  │    └─→ boost matching candidates                   │   │
│  │ 3. Apply dispatch rules                             │   │
│  │ 4. Return selected unit                             │   │
│  └────────────────────────────────────────────────────┘   │
│                         ↓                                   │
│                  Execute Selected Unit                      │
│                         ↓                                   │
│  ┌────────────────────────────────────────────────────┐   │
│  │ recordUnitOutcomeInMemory()                        │   │
│  │ ├─→ Extract pattern from result                   │   │
│  │ ├─→ Compute confidence (0.9 for success)          │   │
│  │ └─→ Store in memory (fire-and-forget)             │   │
│  └────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────┐
│                   Memory System                             │
│                                                             │
│  ┌─────────────────┐  ┌──────────────────┐                │
│  │ memory-store.js │  │memory-embeddings.│  ← Cosine sim  │
│  │  (CRUD layer)   │  │   (vectors)      │                │
│  └─────────────────┘  └──────────────────┘                │
│           ↓                     ↓                          │
│  ┌─────────────────────────────────────┐                 │
│  │  memory-relations.js (Graph)        │                 │
│  │  memory-sleeper.js (Decay)          │                 │
│  │  memory-source-store.js (Tracking)  │                 │
│  └─────────────────────────────────────┘                 │
│                     ↓                                     │
│      ┌─────────────────────────────┐                     │
│      │  SQLite (sf-db.js)          │                     │
│      │  Node 26 native sqlite      │                     │
│      └─────────────────────────────┘                     │
└─────────────────────────────────────────────────────────────┘

Graceful Degradation

Memory writes are fire-and-forget, and memory reads fall back to a safe default when they fail:

  1. Memory unavailable → dispatch continues without boost
  2. DB error → operation fails silently, decision unaffected
  3. LLM reranking fails → fall back to vector similarity
  4. Embedding computation fails → use default embedding

Result: Memory is always optional; never blocks dispatch or UOK execution.
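One way to get this behavior is to wrap every memory call in a helper that swallows failures. Both helpers below are hypothetical sketches of the pattern, not functions exported by SF.

```javascript
// Hypothetical helper: run a memory read, return a fallback on any failure.
async function withMemoryFallback(operation, fallbackValue) {
  try {
    return await operation();
  } catch {
    return fallbackValue; // fail silently; the caller's decision is unaffected
  }
}

// Hypothetical helper: start a memory write, never await it, swallow rejections.
function fireAndForget(operation) {
  Promise.resolve().then(operation).catch(() => {});
}

const baseScores = { 'unit-1': 0.75 };
withMemoryFallback(async () => { throw new Error('db locked'); }, baseScores)
  .then((scores) => console.log(scores)); // { 'unit-1': 0.75 }
fireAndForget(async () => { throw new Error('write failed'); }); // no crash, no output
```

Swallowing errors keeps dispatch unblocked, at the cost of hiding persistent failures; logging inside the catch blocks would be a reasonable addition.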


Performance Characteristics

| Operation | Latency | Notes |
|---|---|---|
| createMemory() | <5 ms | Async write, fire-and-forget |
| getRelevantMemoriesRanked() | 10-50 ms | Depends on DB size and vector dimension |
| cosineSimilarity() | <1 ms | 128D vectors, hardware-accelerated |
| computeEmbedding() | 5-20 ms | Deterministic, hash-based |
| Dispatch boost overhead | <10 ms | Per dispatch cycle |

Data Retention & Growth

Memory Lifecycle:

  1. Created with confidence score (0.0-1.0)
  2. Hit count incremented on each use
  3. Confidence may decay over time (sleeper)
  4. Marked superseded or archived
  5. Historical records preserved (decay, supersession, and archiving never delete rows; only an explicit deleteMemory() call does)

Growth Management:

  • Embeddings indexed by memory_id (fast lookup)
  • Relations indexed by from_id, to_id (graph traversal)
  • Decay/supersession prevent stale data
  • Archive doesn't grow real table (historical only)

Security & Privacy

  • Memory is local — All data stored in SF's SQLite (no external services except optional LLM)
  • Source tracking — Full audit trail of where memories came from
  • No sensitive data — Memory system stores patterns and architecture, not credentials
  • Encapsulated — Memory functions exported only to SF extensions

Future Enhancements

  1. Distributed memory — Share learnings across SF instances
  2. Memory compression — Archive old embeddings to reduce DB size
  3. Active learning — Automatically query for improvements
  4. Temporal indexing — Query memories by creation date
  5. Semantic clustering — Group similar memories automatically
  6. Telemetry — Track which memories most influence dispatch

Architecture Decision: SF Tools, Not MCP

Memory is NOT exposed as an MCP server.

  • SF is an MCP client only — SF consumes MCP tools from external services
  • Memory is internal SF infrastructure — uses SQLite (Node 26 native)
  • Memory exported as SF tools — LLM agents within SF call memory functions
  • No external exposure — Memory system is not a service; it's SF's autonomous learning mechanism

This keeps memory private to SF and prevents:

  • External memory pollution
  • Uncontrolled confidence scoring
  • Inconsistent learning patterns
  • Loss of autonomy (memory decisions stay internal)

See Also

  • ADR-0075: UOK Gate Architecture
  • ADR-0000: Purpose-to-Software Compiler
  • docs/dev/UOK-SELF-EVOLUTION.md — How SF learns
  • src/resources/extensions/sf/uok/unit-runtime.js — Unit recording
  • src/resources/extensions/sf/auto-dispatch.js — Dispatch ranking
  • src/resources/extensions/sf/tools/memory-tools.js — SF tool executors