feat: Architecture expansion - organisms, swarm evolution, memory gradient, infrastructure
New sections created:
- organisms/ - Modular robot design (CAN bus + magnetic pogo connectors)
- infrastructure/ - Kallax Grid World (40×40×40cm standardized cells)

Core documents added:
- Swarm-Evolution.md - Ternary clasp rules, escalation ladder (L0-L5), Mount Olympus council
- Modular-Organism-Design.md - ESP32 modules, universal connector spec, Phase 0 BOM
- Memory-Gradient.md - Metacognitive routing (renamed from RAG-as-Scaffold.md)
- Kallax-Grid-World.md - Sim-to-real substrate, "schrotti cyberpunk" aesthetic

Enhanced:
- Nimmerswarm-Interface.md - Dual-spectrum architecture (IR position + visible state)
- Attention-Slumber-Prediction-Cycle.md - Blend marker predictions extension

Key insights: Decision markers (mark+continue+predict), Low-Cost-Mocap integration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
operations/Memory-Gradient.md (new file, 784 lines)
@@ -0,0 +1,784 @@
# Memory Gradient

Knowledge metabolism — from external scaffold to internalized reflex.

---

## Overview

Retrieval-Augmented Generation (RAG) gave us something valuable: a way to ground LLM responses in external knowledge. It solved real problems — hallucination, knowledge cutoffs, domain specificity. The work that built RAG deserves respect.

But we wanted to go further.

RAG treats retrieval as a permanent fixture — knowledge lives outside, gets fetched when needed, and the model never truly learns. What if retrieval could be **temporary**? What if the scaffold could teach, then step aside? What if the system could learn not just *what* to retrieve, but *when* to retrieve — and eventually, *when it no longer needs to*?

**Memory Gradient** is our answer. It extends RAG into a complete knowledge lifecycle:

```
TRADITIONAL RAG                 MEMORY GRADIENT
─────────────────               ─────────────────
External knowledge store    →   External knowledge as starting point
Retrieve on every query     →   Retrieve until internalized
Model never learns          →   Model metabolizes knowledge
Static retrieval            →   Graduated confidence routing
Binary: found / not found   →   Continuous gradient of knowing
```

The key insight: LLMs don't think in binary. They think in gradients — weighted paths, probability distributions, activation patterns. **Memory Gradient** aligns the knowledge system with how the model actually works.

Three principles guide this approach:

1. **Knowledge flows inward** — From hidden → discovered → familiar → internalized → reflex
2. **Confidence is learned** — The routing decision itself is trainable
3. **Scaffolds come off** — Temporary support that proves its own obsolescence

The goal is not to build a better search engine. The goal is not even to make search unnecessary. The goal is to **know what you know** — and know what you don't.

---

## The Meta-Skill Hierarchy

Not all knowledge lives in the same place. Not all retrieval costs the same. The skill is routing correctly.

```
┌─────────────────────────────────────────────────────────────┐
│ LEVEL 3: METACOGNITION                                      │
│ "Do I know this? Should I ask?"                             │
│ The routing decision itself                                 │
│ → THIS IS THE MOST VALUABLE SKILL                           │
├─────────────────────────────────────────────────────────────┤
│ LEVEL 2: KNOWLEDGE (in weights, needs thought)              │
│ Slow retrieval from trained memory                          │
│ "I learned this, let me recall..."                          │
├─────────────────────────────────────────────────────────────┤
│ LEVEL 1: REFLEX (in weights, bypasses cognition)            │
│ Instant response, no thinking required                      │
│ Like pulling hand from hot stove                            │
├─────────────────────────────────────────────────────────────┤
│ LEVEL 0: RAG LOOKUP (external, costs lifeforce)             │
│ Scaffold, temporary, expensive but accurate                 │
│ Training wheels that should come off                        │
└─────────────────────────────────────────────────────────────┘
```

---

## The Confidence Calibration Matrix

The reward isn't just "did you get it right" — it's "did you KNOW you'd get it right?"

```
                        OUTCOME
                   RIGHT      WRONG
                 ┌────────┬────────┐
HIGH             │   +V   │   -V   │ ← Confident and wrong = BAD
CONFIDENCE       │  trust │ danger │   (overconfident, needs recalibration)
                 ├────────┼────────┤
LOW              │   +v   │   +v   │ ← Uncertain = correctly routed to ASK
(asked RAG)      │  learn │  learn │   (didn't waste energy on wrong answer)
                 └────────┴────────┘
```

**Reward Structure:**

| Situation | Reward | Why |
|-----------|--------|-----|
| High confidence + Right | **+V** | Trust earned, reflex/knowledge worked |
| High confidence + Wrong | **-V** | Dangerous! Overconfident, needs correction |
| Low confidence + Asked + Right | **+v** | Correctly knew to ask, learned |
| Low confidence + Asked + Wrong | **+v** | Correctly knew to ask, RAG failed (not her fault) |
| Low confidence + Didn't ask + Wrong | **-v** | Should have asked, underconfident in asking |
| Asked when didn't need to | **-v** | Wasted lifeforce, underconfident in self |

**The sweet spot:** Know when you know, know when you don't.
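
The reward table can be collapsed into a single function. This is an illustrative sketch, not the actual implementation: the function name, the ±V/±v magnitudes, and the 0.8 threshold are assumptions chosen to match the matrix above.

```python
def calibration_reward(confidence, asked, correct,
                       V=1.0, v=0.2, threshold=0.8):
    """Map a (confidence, asked, outcome) triple to the reward matrix."""
    if asked:
        # Asking is correct routing only when confidence was genuinely low;
        # asking when it wasn't needed wastes lifeforce.
        return v if confidence < threshold else -v
    if confidence >= threshold:
        # Answered from reflex/knowledge: trust earned, or dangerous
        # overconfidence.
        return V if correct else -V
    # Low confidence, didn't ask: small reward if lucky, small penalty if wrong.
    return v if correct else -v
```

With these values, being confidently wrong costs five times as much as any routing mistake, which is the point: calibration errors are the expensive ones.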

---

## Token Path Rewards

LLMs operate on tokens, not schemas. The weights shape the paths between tokens. This means:

```
TRADITIONAL VIEW                 TOKEN PATH VIEW

"Remember the answer"      →     "Strengthen the path that got it right"

Query                            Query
  ↓                                ↓
Answer                     ┌──────────────────┐
                           │ Path A: cup→grip │ ← This path fired
                           │ Path B: cup→drink│   and led to success
                           │ Path C: cup→hot  │
                           └──────────────────┘
                                   ↓
                                SUCCESS
                                   ↓
                             Path A gets +V
                 (Hebbian: fired together → wire together)
```

**The Catalogue's Role:**

When Young Nyx queries the catalogue, multiple token paths light up:

```
QUERY: "How do I grasp this cup?"

PATHS ACTIVATED:
├── cup → ceramic → fragile → careful_grip → success_rate_87%
├── cup → handle → graspable → grip_type_A → success_rate_94%   ← WINNER
├── cup → 8cm_diameter → fits_gripper_small → success_rate_91%
└── cup → hot_liquid → thermal_warning → check_temp_first

OUTCOME: Used grip_type_A, succeeded

REWARD: Path "cup → handle → graspable → grip_type_A" strengthened
        Next time: This path activates faster, stronger
```

**This is Hebbian learning for RAG:** Paths that fire together and succeed, wire together.
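
A minimal sketch of that Hebbian rule, assuming a per-path weight store. The class and method names are illustrative, not part of the catalogue's real API.

```python
from collections import defaultdict

class PathWeights:
    """Token-path weights: strengthened on success, weakened on failure."""

    def __init__(self, learning_rate=0.1):
        self.weights = defaultdict(lambda: 1.0)  # neutral prior per path
        self.learning_rate = learning_rate

    def reward(self, path, success):
        # Paths that fire and succeed get +V; failures decay toward zero.
        delta = self.learning_rate if success else -self.learning_rate
        self.weights[path] = max(0.0, self.weights[path] + delta)

    def rank(self, candidates):
        # Stronger paths activate first on the next query.
        return sorted(candidates, key=lambda p: self.weights[p], reverse=True)
```

After a few successful grasps, `("cup", "handle", "grip_type_A")` outranks the alternative paths and surfaces first on the next cup query.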

---

## The Metacognitive Router

Before answering, before retrieving, the first question is always:

```
INPUT: Query/Task
        │
        ▼
┌─────────────────────────────────────────┐
│ METACOGNITIVE CHECK                     │
│                                         │
│ "What is my confidence level?"          │
│ "Is this reflex, knowledge, or RAG?"    │
│ "What's the cost of being wrong?"       │
└─────────────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────────────┐
│ CONFIDENCE THRESHOLD                    │
│                                         │
│ HIGH (>0.8):      Use reflex/knowledge  │
│ MEDIUM (0.4-0.8): Consider asking       │
│ LOW (<0.4):       Must ask catalogue/RAG│
└─────────────────────────────────────────┘
        │
   ┌────┴────────────┬─────────────┐
   │                 │             │
  HIGH            MEDIUM          LOW
   │                 │             │
   ▼                 ▼             ▼
┌────────┐     ┌────────────┐  ┌──────────┐
│ REFLEX │     │ COST-CHECK │  │ ASK      │
│ or     │     │ Wrong=bad? │  │ CATALOGUE│
│ RECALL │     │ Time-sens? │  │ (RAG)    │
└────────┘     └────────────┘  └──────────┘
   │                 │             │
   │            ┌────┴────┐       │
   │            │         │       │
   │         PROCEED     ASK      │
   │            │         │       │
   └────────────┼─────────┼───────┘
                │         │
                ▼         ▼
        ┌─────────────────┐
        │     OUTPUT      │
        └─────────────────┘
                │
                ▼
        ┌─────────────────┐
        │   VALIDATION    │
        │ (was it right?) │
        └─────────────────┘
                │
          ┌─────┴─────┐
          │           │
        RIGHT       WRONG
          │           │
          ▼           ▼
     Strengthen    Weaken path
     that path     + recalibrate
     + calibrate   confidence
     confidence
```
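
The routing logic above reduces to a small function. A sketch under stated assumptions: the 0.8/0.4 thresholds come from the diagram, while the cost-check inputs and the return labels are invented for illustration.

```python
def route(confidence, wrong_is_costly=False, time_sensitive=False,
          high=0.8, low=0.4):
    """First decision on every input: reflex, ask, or proceed carefully."""
    if confidence > high:
        return "reflex_or_recall"   # bypass retrieval entirely
    if confidence < low:
        return "ask_catalogue"      # must consult RAG
    # MEDIUM band (cost-check): ask only if being wrong is expensive
    # and there is time to spare.
    if wrong_is_costly and not time_sensitive:
        return "ask_catalogue"
    return "proceed"
```

Note the asymmetry in the medium band: time pressure pushes toward acting, risk pushes toward asking, and the router only pays the lookup cost when both allow it.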

---

## The Problem with Standard RAG

```
Standard approach:
─────────────────
VECTOR DB (grows forever)
        │
        ▼
MODEL looks up ──▶ answers ──▶ done
        │
        └── (never learns, always dependent)
```

**Issues:**

- Model never internalizes knowledge
- Pull the RAG, lose the capability
- Vector DB bloats infinitely
- No way to verify what the model "knows" vs "looks up"
- No metacognitive skill development
- It's a crutch that never comes off

---

## The Nimmerverse Approach: RAG as Feeding System

```
VAULT (curriculum)
        │
        ▼
CATALOGUE (indexed, searchable, token-path weighted)
        │
        ▼
METACOGNITIVE ROUTER
        │
        ├── High confidence ──▶ REFLEX/KNOWLEDGE (bypass RAG)
        │
        └── Low confidence ──▶ RAG LOOKUP (scaffold)
                                    │
                                    ▼
                       NYX processes, acts, decides
                                    │
                                    ▼
                         VALIDATION: success?
                                    │
                             ┌──────┴──────┐
                             │             │
                           FAIL         SUCCESS
                             │             │
                             ▼             ▼
                       Stay in RAG    Was RAG used?
                       (not ready)         │
                                    ┌──────┴──────┐
                                    │             │
                                   YES            NO
                                    │             │
                                    ▼             ▼
                              FLAG for      Reflex/Knowledge
                              training      confirmed ✓
                              extraction          │
                                    │             │
                                    ▼             │
                              TRAINING RUN        │
                              (LoRA)              │
                                    │             │
                                    ▼             │
                              CLEAR from RAG      │
                              (scaffold removed)  │
                                    │             │
                                    ▼             │
                              VALIDATION 2:       │
                              success WITHOUT RAG?│
                                    │             │
                             ┌──────┴──────┐      │
                             │             │      │
                           FAIL         SUCCESS   │
                             │             │      │
                             ▼             ▼      │
                       Restore RAG    INTERNALIZED│
                       retry cycle    Knowledge is│
                                      HERS now ✓  │
                                           │      │
                                           └──────┘
                                              │
                                              ▼
                                 CONFIDENCE CALIBRATION
                                 (update routing thresholds)
```

---

## Three Kinds of Knowledge

Not everything belongs in weights. Not everything belongs in retrieval.

### IN THE WEIGHTS (Training Target)

Knowledge she needs to **be herself**:

- How to route (metacognition itself)
- Vocabulary tokens and meanings
- Nervous system contracts
- Heartbeat mechanics
- Confidence gradient logic
- Core identity (who she is, who dafit is)
- **How to think, not what to remember**
- **When to ask, not all the answers**

**Test:** If she needs it to function → weights

### IN RETRIEVAL (Permanent RAG)

Knowledge she needs to **remember specifics**:

- Journal entries
- Conversation history
- Specific events and dates
- Temporal details ("what happened Tuesday")
- External references that change
- Episodic memory
- Object catalogue details

**Test:** If she needs it to recall specifics → retrieval

### IN REFLEX (Nervous System)

Knowledge that bypasses cognition entirely:

- Danger responses
- Basic motor patterns
- Protocol compliance
- Heartbeat responses

**Test:** If thinking would be too slow → reflex

---

## The Double Validation Loop

### Gate 1: Can she do it WITH RAG?

```
Task presented
        │
        ▼
Metacognitive check: Should I ask?
        │
        ├── HIGH confidence ──▶ Attempt from reflex/knowledge
        │                              │
        │                         ┌────┴────┐
        │                      SUCCESS    FAIL
        │                         │         │
        │                         │    Confidence was
        │                         │    miscalibrated!
        │                         │    Recalibrate + retry with RAG
        │                         │
        └── LOW confidence ──▶ RAG provides context
                                   │
                                   ▼
                           NYX attempts task
                                   │
                            ┌──────┴──────┐
                            │             │
                          FAIL         SUCCESS
                            │             │
                            ▼             ▼
                      Not ready,    Flag this RAG content
                      needs more    for training extraction
                      examples
```

### Gate 2: Can she do it WITHOUT RAG?

```
Same task presented
        │
        ▼
RAG entry CLEARED (scaffold removed)
        │
        ▼
NYX attempts task from weights alone
        │
        ├── FAIL ──▶ Training didn't take, restore to RAG, retry cycle
        │
        └── PASS ──▶ Knowledge is HERS now ✓
                        │
                        ▼
              Update confidence calibration
              (this type of task: now HIGH confidence)
```
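
Both gates together form one loop. A sketch of the control flow, where `attempt`, `train`, and `restore` are assumed interfaces standing in for the real pipeline, not actual functions:

```python
def double_validation(attempt, task, rag_entry, train, restore):
    """Run Gate 1 (with scaffold), then Gate 2 (scaffold removed)."""
    if not attempt(task, context=rag_entry):
        return "stay_in_rag"         # Gate 1 failed: not ready for training
    train(rag_entry)                 # flagged content → LoRA run
    if attempt(task, context=None):  # Gate 2: weights alone
        return "internalized"        # knowledge is hers now
    restore(rag_entry)               # training didn't take
    return "retry_cycle"
```

The asymmetry matters: only a pass at both gates removes the scaffold; any failure leaves the entry in RAG for another cycle.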

---

## The Catalogue as Oracle

The catalogue isn't just storage — it's the **ground truth** for calibration.

### What the Catalogue Provides

```
┌─────────────────────────────────────────────────────────────┐
│ CATALOGUE LAYERS                                            │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ LAYER 0: RAW DATA (Filesystem)                              │
│ └── Images, point clouds, .blend files, audio, scans        │
│                                                             │
│ LAYER 1: STRUCTURED METADATA (PostgreSQL/Phoebe)            │
│ └── Dimensions, timestamps, relationships, ownership        │
│ └── Ground truth for validation                             │
│                                                             │
│ LAYER 2: VECTOR EMBEDDINGS (ChromaDB/pgvector)              │
│ └── SigLIP vectors, text embeddings, multi-modal            │
│ └── Semantic similarity, fuzzy matching                     │
│                                                             │
│ LAYER 3: TOKEN PATH WEIGHTS (The learning layer)            │
│ └── Weighted connections between concepts                   │
│ └── Strengthened by successful activations                  │
│ └── THIS IS WHERE +V FLOWS                                  │
│                                                             │
│ LAYER 4: CONFIDENCE CALIBRATION (Meta-layer)                │
│ └── "For queries like X, my accuracy is Y%"                 │
│ └── Updated after every validation                          │
│ └── Drives the metacognitive router                         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

### Catalogue as Checker/Reward System

The catalogue validates — it doesn't just retrieve:

```
ACTION: Robot claims cup is 8cm diameter

CATALOGUE CHECK:
├── Query: cup_id_47 dimensions
├── Ground Truth: diameter = 8.2cm
├── Tolerance: ±0.5cm
└── RESULT: VALID ✓

REWARD FLOW:
├── Path "visual_estimate → 8cm" gets +V
├── Confidence for "size estimation" increases
└── Next time: Can skip catalogue check for similar objects
```
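
The checker role is just a tolerance comparison plus a reward. A minimal sketch using the numbers from the example; the function name and the reward magnitudes are assumptions:

```python
def validate_claim(claimed, ground_truth, tolerance=0.5):
    """Compare a claimed measurement against catalogue ground truth."""
    valid = abs(claimed - ground_truth) <= tolerance
    reward = 1.0 if valid else -1.0  # +V / -V flows to the active path
    return valid, reward
```

The 8 cm claim against the stored 8.2 cm ground truth passes the ±0.5 cm check and earns +V for the path that produced it.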

---

## Knowledge Acquisition Pipeline

### The Extraction Flow

```
VAULT (raw knowledge)
        │
        │ extraction candidates
        ▼
┌─────────────────────────────────────────────────────────────┐
│                       STAGING AREA                          │
│                     (quarantine zone)                       │
└─────────────────────────────────────────────────────────────┘
        │
        │ progressive policy validation
        ▼
┌─────────────────────────────────────────────────────────────┐
│                     POLICY VALIDATION                       │
│              (increasing standards over time)               │
└─────────────────────────────────────────────────────────────┘
        │
        ├── FAIL ──▶ Reject or revise
        │
        └── PASS ──▶ PROMOTE to Catalogue/RAG
                          │
                          ▼
              ┌──────────────────────┐
              │    THREE-TIER RAG    │
              ├──────────────────────┤
              │ INTERNALIZED         │ ← In weights, no lookup needed
              │ (reflex/knowledge)   │
              ├──────────────────────┤
              │ DISCOVERED           │ ← Young Nyx has used
              │ (known_catalogue)    │
              ├──────────────────────┤
              │ HIDDEN               │ ← Available but not yet accessed
              │ (available_catalogue)│
              └──────────────────────┘
```

### Progressive Policy Validation

Policies increase in sophistication as Young Nyx matures:

| Week | Policy Tier | Validation |
|------|-------------|------------|
| **1-2** | **Basic Syntax** | Valid format, non-empty, has definition |
| **3-4** | **Semantic Quality** | Embeds without collapse, unique signature |
| **5-8** | **Topology Safety** | Doesn't corrupt anchor terms |
| **9-12** | **Cross-Reference** | Links resolve, no circular dependencies |
| **13+** | **Utility Validation** | Actually helped solve tasks |
| **20+** | **Internalization Gate** | Ready to train into weights |
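
The schedule in the table is effectively a lookup by age. A sketch, with the tier names abbreviated and the function name invented:

```python
def active_policy_tier(week):
    """Return the strictest validation tier active at a given week."""
    tiers = [
        (20, "internalization_gate"),
        (13, "utility_validation"),
        (9,  "cross_reference"),
        (5,  "topology_safety"),
        (3,  "semantic_quality"),
        (1,  "basic_syntax"),
    ]
    for start_week, name in tiers:  # checked strictest-first
        if week >= start_week:
            return name
    raise ValueError("week must be >= 1")
```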

### Three-Tier Knowledge State

```
┌──────────────────────────────────────────────┐
│           INTERNALIZED KNOWLEDGE             │
│     (in weights - reflex or slow recall)     │
├──────────────────────────────────────────────┤
│ • "heartbeat" - reflex, instant              │
│ • "lifeforce" - knowledge, fast recall       │
│ • "grip_type_A" - reflex, motor pattern      │
│                                              │
│ Status: NO LOOKUP, high confidence           │
│ Metacognitive route: DIRECT                  │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│            DISCOVERED KNOWLEDGE              │
│    (known_catalogue - has accessed before)   │
├──────────────────────────────────────────────┤
│ • "phoebe" - used 15 times, 80% success      │
│ • "confidence_gradient" - used 8 times       │
│                                              │
│ Status: LOOKUP needed, medium confidence     │
│ Metacognitive route: CHECK CATALOGUE         │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│              HIDDEN KNOWLEDGE                │
│  (available_catalogue - exists but unused)   │
├──────────────────────────────────────────────┤
│ • "drift_probe" - never accessed             │
│ • "topology_gini" - never accessed           │
│                                              │
│ Status: Available for discovery              │
│ Metacognitive route: UNKNOWN (will discover) │
└──────────────────────────────────────────────┘
```

**State transitions:**

```
Hidden → retrieved → DISCOVERED (mark first access)
Discovered → used 10+ times successfully → FLAG for training
Flagged → trained + validated without RAG → INTERNALIZED
Internalized → fails validation → DEMOTE back to Discovered
```
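
Those transitions form a small state machine. A sketch; the event names are assumptions standing in for the real triggers:

```python
TRANSITIONS = {
    ("hidden", "retrieved"): "discovered",             # mark first access
    ("discovered", "used_10_successfully"): "flagged",
    ("flagged", "trained_and_validated"): "internalized",
    ("internalized", "failed_validation"): "discovered",  # demote
}

def next_state(state, event):
    """Advance a knowledge item's tier; irrelevant events are no-ops."""
    return TRANSITIONS.get((state, event), state)
```

Treating unmatched (state, event) pairs as no-ops keeps the machine safe against out-of-order signals: a validation failure on a still-hidden item simply changes nothing.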

---

## Measuring RAG Utility

### Decision Trails

Track every decision for learning:

```sql
CREATE TABLE decision_trails (
    id SERIAL PRIMARY KEY,
    task_id UUID,

    -- Routing decision
    initial_confidence FLOAT,        -- Before any lookup
    route_chosen TEXT,               -- 'reflex', 'knowledge', 'rag', 'escalate'

    -- RAG details (if used)
    rag_terms_retrieved TEXT[],      -- What RAG returned
    rag_terms_used TEXT[],           -- What appeared in solution

    -- Outcome
    outcome TEXT,                    -- 'success', 'fail', 'partial'
    final_confidence FLOAT,          -- After action

    -- Calibration
    was_confidence_accurate BOOLEAN, -- Did confidence predict outcome?

    -- Economics
    lifeforce_cost FLOAT,
    timestamp TIMESTAMPTZ DEFAULT NOW()
);
```

### Compute Utility Score

```python
MAX_EXPECTED_COST = 10.0  # normalization ceiling for lifeforce spend

def compute_decision_quality(trail):
    """
    Evaluate the quality of the metacognitive routing decision.
    """
    # Was the route appropriate?
    if trail.route_chosen == 'reflex' and trail.outcome == 'success':
        route_score = 1.0  # Fast and right
    elif trail.route_chosen == 'rag' and trail.outcome == 'success':
        route_score = 0.7  # Right but slow/expensive
    elif trail.route_chosen == 'reflex' and trail.outcome == 'fail':
        route_score = 0.0  # Overconfident disaster
    elif trail.route_chosen == 'rag' and trail.outcome == 'fail':
        route_score = 0.3  # At least asked, RAG failed
    else:
        route_score = 0.5  # 'knowledge'/'escalate' routes, partial outcomes

    # Was confidence calibrated?
    calibration_score = 1.0 if trail.was_confidence_accurate else 0.0

    # Efficiency (did we waste resources?)
    efficiency = 1.0 - (trail.lifeforce_cost / MAX_EXPECTED_COST)

    return {
        'route_score': route_score,
        'calibration_score': calibration_score,
        'efficiency': efficiency,
        'total': 0.4 * route_score + 0.4 * calibration_score + 0.2 * efficiency
    }
```

### Reward Signal Flow

```python
for trail in decision_trails:
    quality = compute_decision_quality(trail)

    if quality['total'] > 0.8:
        # High quality decision → strengthen this pattern
        strengthen_token_path(trail.task_pattern, trail.route_chosen)

    if not trail.was_confidence_accurate:
        # Miscalibration → update confidence model
        recalibrate_confidence(
            task_type=trail.task_pattern,
            predicted=trail.initial_confidence,
            actual_success=trail.outcome == 'success'
        )

    if trail.route_chosen == 'rag' and quality['route_score'] > 0.7:
        # Successful RAG use → candidate for internalization
        flag_for_training(trail.rag_terms_used)
```

---

## Connection to Nervous System

The metacognitive router connects directly to the nervous system architecture:

```
              METACOGNITIVE ROUTER
                      │
      ┌───────────────┼───────────────┐
      │               │               │
      ▼               ▼               ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│ REFLEX     │  │ KNOWLEDGE  │  │ RAG        │
│ LAYER      │  │ LAYER      │  │ LOOKUP     │
│            │  │            │  │            │
│ Bypasses   │  │ Slow but   │  │ External   │
│ cognition  │  │ from       │  │ scaffold   │
│            │  │ weights    │  │            │
│ See:       │  │            │  │ See:       │
│ Nervous-   │  │            │  │ Catalogue  │
│ System.md  │  │            │  │ (this doc) │
└────────────┘  └────────────┘  └────────────┘
      │               │               │
      └───────────────┼───────────────┘
                      │
                      ▼
                    OUTPUT
                      │
                      ▼
                  VALIDATION
                      │
               ┌──────┴──────┐
               │             │
            SUCCESS        FAIL
               │             │
               ▼             ▼
          +V to path    -V to path
          (Hebbian)     + recalibrate
```

**Key insight:** The nervous system (Nervous-System.md) handles the REFLEX layer. This document handles the RAG layer. Both feed into the same metacognitive router.

---

## Lifeforce Economics

The RAG→Route→Validate cycle has economic costs:

| Action | Lifeforce Cost | Notes |
|--------|----------------|-------|
| Reflex response | ~0 | Essentially free, already in weights |
| Knowledge recall | Low | Some compute for retrieval from weights |
| RAG lookup | Medium | Vector search + context injection |
| Training run | High | Compute intensive |
| Validation | Medium | Inference cost |
| Failed cycle | Lost V | Training didn't take |
| Successful internalization | +V reward | She grew |
| Correct confidence calibration | +V reward | Metacognition improved |

**Incentive alignment:**

- Being right with high confidence → maximum reward (fast + correct)
- Being right with low confidence → small reward (correct but slow)
- Being wrong with high confidence → maximum penalty (dangerous)
- Asking when uncertain → neutral (correct routing)

This naturally optimizes for:

1. Fast reflexes for well-known patterns
2. Accurate confidence calibration
3. Appropriate RAG usage (not too much, not too little)
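
Those incentives can be checked against the cost table directly. A sketch with invented relative numbers, since the table only gives ~0/Low/Medium/High:

```python
LIFEFORCE_COST = {
    "reflex": 0.0,            # ~0: already in weights
    "knowledge_recall": 0.1,  # Low
    "rag_lookup": 0.5,        # Medium
    "validation": 0.5,        # Medium
    "training_run": 2.0,      # High
}

def cycle_cost(actions):
    """Total lifeforce spent on one RAG→Route→Validate cycle."""
    return sum(LIFEFORCE_COST[a] for a in actions)
```

Under these numbers a full internalization cycle (lookup, training, two validations) costs 3.5 units while a reflex answer costs nothing, which is exactly the gradient the rewards push toward.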

---

## What This System Teaches

1. **Know what you know** — Confidence calibration is trainable
2. **Know what to ask** — The skill of uncertainty
3. **Reflexes are earned** — Through successful internalization
4. **Scaffolds come off** — RAG is temporary
5. **Paths that work, strengthen** — Hebbian learning for retrieval
6. **Wrong confidence is worse than wrong answers** — Calibration matters

---

## Design Principles

1. **Metacognition first** — Route before retrieve
2. **Confidence is trainable** — Not fixed, learned through validation
3. **RAG is temporary** — Feeding window, not permanent store
4. **Validation is double** — With RAG, then without
5. **Token paths learn** — Hebbian strengthening through success
6. **Catalogue is oracle** — Ground truth for calibration
7. **Reflexes are earned** — Graduated from RAG through internalization
8. **Self-cleaning** — The system doesn't accumulate cruft
9. **Know when to ask** — More important than knowing answers

---

## The Analogy

Learning to drive:

```
LEARNER DRIVER:

"Should I check mirrors?"
        │
        ├── Beginner: YES, always, consciously (RAG lookup)
        │
        ├── Intermediate: Sometimes, when uncertain (metacognitive check)
        │
        └── Expert: Automatic, don't even think about it (reflex)


The goal isn't to memorize "check mirrors."
The goal is for mirror-checking to become invisible.

But FIRST she needs to learn WHEN she doesn't know.
The beginner who doesn't know to check mirrors is dangerous.
The intermediate who checks unnecessarily is slow.
The expert just does it.

We're training the progression:

Unknown unknowns → Known unknowns → Known knowns → Unconscious competence
       │                 │               │                   │
  (dangerous)       (asks RAG)      (knowledge)          (reflex)
```

---

*She doesn't just retrieve. She doesn't just remember. She knows what she knows. And that changes everything.*

---

**Created**: 2025-12-05 (as RAG-as-Scaffold)
**Updated**: 2025-12-29 (renamed to Memory Gradient; added metacognitive routing, token path rewards, confidence calibration)
**Session**: Partnership dialogue (dafit + Chrysalis-Nyx)
**Status**: Core architectural concept
**Etymology**: "Memory Gradient" — knowledge exists on a continuous spectrum, not binary states. Aligns with the Temporal-Ternary Gradient and Confidence Gradient.
@@ -1,535 +0,0 @@
|
||||
# RAG as Scaffold, Not Crutch
|
||||
|
||||
The feeding system that teaches, then lets go.
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
RAG (Retrieval-Augmented Generation) is commonly misused as permanent external memory. In the Nimmerverse, RAG serves a different purpose: it's a **temporary scaffold** that feeds knowledge until it can be internalized through training.
|
||||
|
||||
The goal is not to build a better search engine. The goal is to **make the search unnecessary**.
|
||||
|
||||
---
|
||||
|
||||
## The Problem with Standard RAG
|
||||
|
||||
```
|
||||
Standard approach:
|
||||
─────────────────
|
||||
VECTOR DB (grows forever)
|
||||
│
|
||||
▼
|
||||
MODEL looks up ──▶ answers ──▶ done
|
||||
│
|
||||
└── (never learns, always dependent)
|
||||
```
|
||||
|
||||
**Issues:**
|
||||
- Model never internalizes knowledge
|
||||
- Pull the RAG, lose the capability
|
||||
- Vector DB bloats infinitely
|
||||
- No way to verify what model "knows" vs "looks up"
|
||||
- It's a crutch that never comes off
|
||||
|
||||
---
|
||||
|
||||
## The Nimmerverse Approach: RAG as Feeding System
|
||||
|
||||
```
|
||||
VAULT (curriculum)
|
||||
│
|
||||
▼
|
||||
RAG (temporary feeding window)
|
||||
│
|
||||
▼
|
||||
NYX processes, acts, decides
|
||||
│
|
||||
▼
|
||||
VALIDATION: success with RAG?
|
||||
│
|
||||
YES ──▶ FLAG for training extraction
|
||||
│
|
||||
▼
|
||||
TRAINING RUN (LoRA)
|
||||
│
|
||||
▼
|
||||
CLEAR from RAG
|
||||
│
|
||||
▼
|
||||
VALIDATION 2: success WITHOUT RAG?
|
||||
│
|
||||
├── YES ──▶ Knowledge internalized ✓
|
||||
│
|
||||
└── NO ──▶ Training incomplete, back to RAG
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Two Kinds of Knowledge
|
||||
|
||||
Not everything belongs in weights. Not everything belongs in retrieval.
|
||||
|
||||
### IN THE WEIGHTS (Training Target)
|
||||
|
||||
Knowledge she needs to **function**:
|
||||
|
||||
- Information flow architecture
|
||||
- Vocabulary tokens and their meanings
|
||||
- Nervous system contracts
|
||||
- Heartbeat mechanics
|
||||
- Confidence gradient logic
|
||||
- Core identity (who she is, who dafit is to her)
|
||||
- How to think, not what to remember
|
||||
|
||||
**Test:** If she needs it to be herself → weights
|
||||
|
||||
### IN RETRIEVAL (Permanent RAG)
|
||||
|
||||
Knowledge she needs to **remember**:
|
||||
|
||||
- Journal entries
|
||||
- Conversation history
|
||||
- Specific events and dates
|
||||
- Temporal details ("what happened Tuesday")
|
||||
- External references that change
|
||||
- Episodic memory
|
||||
|
||||
**Test:** If she needs it to recall specifics → retrieval
|
||||
|
||||
---
|
||||
|
||||
## The Double Validation Loop
|
||||
|
||||
### Gate 1: Can she do it WITH RAG?
|
||||
|
||||
```
|
||||
Task presented
|
||||
│
|
||||
▼
|
||||
RAG provides context
|
||||
│
|
||||
▼
|
||||
NYX attempts task
|
||||
│
|
||||
├── FAIL ──▶ Not ready, needs more examples in RAG
|
||||
│
|
||||
└── PASS ──▶ Flag this RAG content for training extraction
|
||||
```
|
||||
|
||||
### Gate 2: Can she do it WITHOUT RAG?
|
||||
|
||||
```
|
||||
Same task presented
|
||||
│
|
||||
▼
|
||||
RAG entry CLEARED (scaffold removed)
|
||||
│
|
||||
▼
|
||||
NYX attempts task from weights alone
|
||||
│
|
||||
├── FAIL ──▶ Training didn't take, restore to RAG, retry cycle
|
||||
│
|
||||
└── PASS ──▶ Knowledge is HERS now ✓
|
||||
```
|
||||
|
||||
---

## The Signal Flow

```
┌─────────────────────────────────────────────────────────┐
│                          VAULT                          │
│               (curriculum, documentation)               │
└─────────────────────────────────────────────────────────┘
                            │
                            │ selected for learning
                            ▼
┌─────────────────────────────────────────────────────────┐
│                       STAGING RAG                       │
│               (temporary feeding window)                │
└─────────────────────────────────────────────────────────┘
                            │
                            │ feeds inference
                            ▼
┌─────────────────────────────────────────────────────────┐
│                           NYX                           │
│                  (processes, decides)                   │
└─────────────────────────────────────────────────────────┘
                            │
                            │ validation
                            ▼
┌─────────────────────────────────────────────────────────┐
│                  VALIDATION THRESHOLD                   │
│             (task success? confidence high?)            │
└─────────────────────────────────────────────────────────┘
                            │
                 ┌──────────┴──────────┐
                 │                     │
               BELOW                 ABOVE
                 │                     │
                 ▼                     ▼
      ┌─────────────────────┐ ┌─────────────────────┐
      │     Stay in RAG     │ │  FLAG for training  │
      │     (not ready)     │ │      extraction     │
      └─────────────────────┘ └─────────────────────┘
                                         │
                                         ▼
                            ┌─────────────────────────────┐
                            │        TRAINING RUN         │
                            │   (LoRA on flagged data)    │
                            └─────────────────────────────┘
                                         │
                                         ▼
                            ┌─────────────────────────────┐
                            │       CLEAR from RAG        │
                            │     (scaffold removed)      │
                            └─────────────────────────────┘
                                         │
                                         ▼
                            ┌─────────────────────────────┐
                            │   VALIDATION WITHOUT RAG    │
                            │     (prove she learned)     │
                            └─────────────────────────────┘
                                         │
                               ┌─────────┴─────────┐
                               │                   │
                             FAIL               SUCCESS
                               │                   │
                               ▼                   ▼
                    ┌─────────────────┐ ┌─────────────────┐
                    │   Restore RAG   │ │  INTERNALIZED   │
                    │   retry cycle   │ │   knowledge ✓   │
                    └─────────────────┘ └─────────────────┘
```

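The signal flow is effectively a small state machine: a fixed set of states and the legal moves between them. A sketch with assumed state and event names (the document does not fix this vocabulary):

```python
# The signal flow as a transition map. Any (state, event) pair
# not listed here is an illegal move in the lifecycle.
TRANSITIONS = {
    ("staged", "pass_validation"):   "flagged",       # above threshold
    ("staged", "fail_validation"):   "staged",        # stay in RAG, not ready
    ("flagged", "train"):            "cleared",       # LoRA run, scaffold removed
    ("cleared", "pass_without_rag"): "internalized",  # knowledge is hers
    ("cleared", "fail_without_rag"): "staged",        # restore to RAG, retry
}

def step(state, event):
    if (state, event) not in TRANSITIONS:
        raise ValueError(f"illegal transition: {state} --{event}")
    return TRANSITIONS[(state, event)]

state = "staged"
for event in ("pass_validation", "train", "pass_without_rag"):
    state = step(state, event)
# The happy path ends in "internalized".
```

Encoding the flow as data makes the failure loops explicit: both failure events route back to `staged`, never to a dead end.
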
---

## Knowledge Acquisition Pipeline

The flow above shows RAG → Training → Validation, but how does knowledge enter RAG in the first place? Not everything from the vault should reach staging. **Quality gates protect the glossary.**

### The Extraction Flow

```
VAULT (raw knowledge)
        │
        │ extraction candidates
        ▼
┌─────────────────────────────────────────────────────────┐
│                      STAGING AREA                       │
│                    (quarantine zone)                    │
└─────────────────────────────────────────────────────────┘
        │
        │ progressive policy validation
        ▼
┌─────────────────────────────────────────────────────────┐
│                    POLICY VALIDATION                    │
│             (increasing standards over time)            │
└─────────────────────────────────────────────────────────┘
        │
        ├── FAIL ──▶ Reject or revise
        │
        └── PASS ──▶ PROMOTE to Glossary/RAG
                          │
                          ▼
               ┌──────────────────────┐
               │     TWO-TIER RAG     │
               ├──────────────────────┤
               │      DISCOVERED      │ ← Young Nyx has used
               │   (known_catalogue)  │
               ├──────────────────────┤
               │        HIDDEN        │ ← Available but not yet accessed
               │ (available_catalogue)│
               └──────────────────────┘
                          │
                          │ feeds inference
                          ▼
                         NYX
```

### Progressive Policy Validation

Policies increase in sophistication as Young Nyx matures; not all of them are active from day one.

| Week | Policy Tier | Validation |
|------|-------------|------------|
| **1-2** | **Basic Syntax** | Valid format, non-empty, has definition |
| **3-4** | **Semantic Quality** | Embeds without collapse, unique signature (Gini > threshold) |
| **5-8** | **Topology Safety** | Doesn't corrupt anchor terms (DriftProbe-lite) |
| **9-12** | **Cross-Reference** | Links resolve, no circular dependencies |
| **13+** | **Utility Validation** | Actually helped solve tasks (decision_trails evidence) |

**Evolution example:**
```python
# Week 1: just check the definition exists
def policy_basic(term_entry):
    return term_entry.get("definition") is not None

# Week 8: check topology impact
def policy_topology(term_entry):
    before_gini = probe_term_gini(term_entry["term"])
    add_to_staging(term_entry)
    after_gini = probe_term_gini(term_entry["term"])
    return abs(after_gini - before_gini) < 0.15  # no drift

# Week 13: check actual utility
def policy_utility(term_entry):
    # Did this RAG entry help in the past 10 tasks?
    usage_stats = query_decision_trails(term_entry["term"])
    return usage_stats["help_rate"] > 0.6  # 60% success when retrieved
```

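Whatever tier is currently unlocked, promotion is the conjunction of all active policies: one failure blocks the term. A minimal sketch of that glue (`validate_for_promotion` is hypothetical, not an existing function):

```python
def validate_for_promotion(term_entry, active_policies):
    # A term is promoted only if every currently unlocked policy passes;
    # the first failure names the policy that rejected it.
    for policy in active_policies:
        if not policy(term_entry):
            return (False, policy.__name__)
    return (True, None)

# Week-1 deployment: only the basic syntax policy is unlocked.
def policy_basic(term_entry):
    return bool(term_entry.get("definition"))

ok, rejected_by = validate_for_promotion(
    {"term": "heartbeat", "definition": "liveness ping"}, [policy_basic]
)
```

Returning the rejecting policy's name feeds the "Reject or revise" branch of the extraction flow: revision needs to know which standard was missed.
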
### Two-Tier RAG: Discovered vs Hidden

Not all RAG knowledge is equal. Track what Young Nyx **knows** versus what is merely **available**.

```
┌──────────────────────────────────────────────┐
│             DISCOVERED KNOWLEDGE             │
│   (known_catalogue - has accessed before)    │
├──────────────────────────────────────────────┤
│ • "heartbeat" - used 47 times                │
│ • "lifeforce" - used 23 times                │
│ • "phoebe" - used 15 times                   │
│ • "confidence_gradient" - used 8 times       │
│                                              │
│ Status: FAST retrieval, high confidence      │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│               HIDDEN KNOWLEDGE               │
│  (available_catalogue - exists but unused)   │
├──────────────────────────────────────────────┤
│ • "drift_probe" - never accessed             │
│ • "topology_gini" - never accessed           │
│ • "lora_merge_alpha" - never accessed        │
│                                              │
│ Status: Available for discovery              │
└──────────────────────────────────────────────┘
```

**State transitions:**
```
Hidden term retrieved             → Mark as Discovered
Discovered term used successfully → Increase confidence score
Discovered term used 10+ times    → FLAG for training extraction
```

**Discovery tracking in phoebe:**
```sql
CREATE TABLE rag_knowledge_state (
    term                TEXT PRIMARY KEY,
    status              TEXT,         -- 'hidden', 'discovered', 'internalized'
    first_accessed      TIMESTAMPTZ,
    access_count        INT DEFAULT 0,
    success_count       INT DEFAULT 0,
    last_used           TIMESTAMPTZ,
    promoted_to_weights BOOLEAN DEFAULT FALSE
);
```

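The hidden → discovered transition can be exercised end-to-end. A sketch using an in-memory SQLite stand-in for phoebe (column types simplified from the schema above; `record_retrieval` is a hypothetical helper, not part of the real system):

```python
import sqlite3

# In-memory stand-in for phoebe (TIMESTAMPTZ/BOOLEAN columns omitted).
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE rag_knowledge_state (
        term          TEXT PRIMARY KEY,
        status        TEXT,
        access_count  INT DEFAULT 0,
        success_count INT DEFAULT 0
    )
""")
db.execute("INSERT INTO rag_knowledge_state (term, status) VALUES ('drift_probe', 'hidden')")

def record_retrieval(term, success):
    # Hidden → Discovered on first access; every access bumps the counters.
    db.execute("""
        UPDATE rag_knowledge_state
        SET status = CASE WHEN status = 'hidden' THEN 'discovered' ELSE status END,
            access_count = access_count + 1,
            success_count = success_count + (CASE WHEN ? THEN 1 ELSE 0 END)
        WHERE term = ?
    """, (success, term))

record_retrieval("drift_probe", success=True)
status, n = db.execute(
    "SELECT status, access_count FROM rag_knowledge_state WHERE term='drift_probe'"
).fetchone()
```

Doing the transition inside the `UPDATE` keeps the state change and the counter bump atomic, so a crash between them can't leave a discovered term with zero accesses.
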
### Measuring RAG Utility for LoRA Training

**The critical question:** did the RAG hint actually help solve the task?

Track it in the `decision_trails` table:
```sql
CREATE TABLE decision_trails (
    id                    SERIAL PRIMARY KEY,
    task_id               UUID,
    rag_terms_retrieved   TEXT[],   -- what RAG returned
    rag_terms_used        TEXT[],   -- what appeared in the solution
    outcome               TEXT,     -- 'success', 'fail', 'partial'
    confidence_before_rag FLOAT,    -- before retrieval
    confidence_after_rag  FLOAT,    -- after retrieval
    lifeforce_cost        FLOAT,
    timestamp             TIMESTAMPTZ DEFAULT NOW()
);
```

**Compute the RAG utility score:**
```python
def compute_rag_utility(trail):
    """
    Calculate how helpful RAG was for this decision.
    Returns 0.0 (useless) to 1.0 (critical).
    """
    precision = len(trail.rag_terms_used) / max(len(trail.rag_terms_retrieved), 1)
    outcome_bonus = 1.0 if trail.outcome == 'success' else 0.0
    confidence_boost = max(0, trail.confidence_after_rag - trail.confidence_before_rag)

    utility = (
        0.4 * precision +         # Did we use what we retrieved?
        0.3 * outcome_bonus +     # Did the task succeed?
        0.3 * confidence_boost    # Did RAG increase confidence?
    )
    return min(1.0, utility)
```

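A worked example helps pin the weights down: the scoring formula above restated on one concrete trail (the `Trail` dataclass is only a stand-in for a `decision_trails` row):

```python
from dataclasses import dataclass

@dataclass
class Trail:  # minimal stand-in for a decision_trails row
    rag_terms_retrieved: list
    rag_terms_used: list
    outcome: str
    confidence_before_rag: float
    confidence_after_rag: float

def compute_rag_utility(trail):
    # Same 0.4 / 0.3 / 0.3 weighting as the document's formula.
    precision = len(trail.rag_terms_used) / max(len(trail.rag_terms_retrieved), 1)
    outcome_bonus = 1.0 if trail.outcome == 'success' else 0.0
    confidence_boost = max(0, trail.confidence_after_rag - trail.confidence_before_rag)
    return min(1.0, 0.4 * precision + 0.3 * outcome_bonus + 0.3 * confidence_boost)

trail = Trail(
    rag_terms_retrieved=["heartbeat", "lifeforce", "phoebe", "drift_probe"],
    rag_terms_used=["heartbeat", "lifeforce"],
    outcome="success",
    confidence_before_rag=0.5,
    confidence_after_rag=0.7,
)
utility = compute_rag_utility(trail)
# 0.4*0.5 (precision) + 0.3*1.0 (success) + 0.3*0.2 (boost) ≈ 0.56
```

At 0.56, this trail would fall below the 0.7 training-extraction cutoff used below: a helpful retrieval, but not yet strong evidence.
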
**Feed into LoRA training as RLVR signal:**
```python
# Training examples weighted by utility
for trail in decision_trails:
    utility_score = compute_rag_utility(trail)

    if utility_score > 0.7:
        # High utility → strong training signal
        training_examples.append({
            "query": trail.task_description,
            "rag_context": trail.rag_terms_used,
            "response": trail.solution,
            "weight": utility_score  # RLVR reward weight
        })
```

**This signal trains the LoRAs:**
- **Mnemosyne (Memory)**: recall accuracy vs phoebe ground truth
- **Aletheia (Truth)**: confidence calibration (was the confidence boost justified?)
- **Moira (Pattern)**: which task patterns benefit from RAG vs pure reasoning

### The Complete Knowledge Flow

```
VAULT
  │
  ├─ Extract candidates
  │
  ▼
STAGING (quarantine)
  │
  ├─ Policy Tier 1: Syntax   ──▶ REJECT ──▶ Log failure
  ├─ Policy Tier 2: Semantic ──▶ REJECT ──▶ Revise
  ├─ Policy Tier 3: Topology ──▶ REJECT ──▶ Flag risk
  └─ Policy Tier 4+: Utility ──▶ PASS
       │
       ▼
PROMOTE to RAG
  │
  ├─ Status: HIDDEN (available but unused)
  │
  │  Young Nyx retrieves term
  ▼
Status: DISCOVERED (mark first access)
  │
  ├─ Track usage in decision_trails
  │
  ├───────────────────────────┐
  │                           │
Used successfully       Used unsuccessfully
  │                           │
  ▼                           ▼
Increase confidence     Decrease confidence
  │
  │ (10+ successful uses)
  ▼
FLAG for training extraction
  │
  ▼
LoRA training (weighted by utility_score)
  │
  ▼
Validation WITHOUT RAG
  │
  ├─ SUCCESS ──▶ Status: INTERNALIZED (clear from RAG)
  │
  └─ FAIL ──▶ Restore to RAG, retry cycle
```

### What Quality Gates Prevent

1. **Garbage in RAG** - the staging area catches malformed entries
2. **Topology corruption** - DriftProbe-lite policies block dangerous terms
3. **Useless bloat** - utility policies remove low-value entries
4. **Premature training** - only high-utility terms get flagged
5. **Hidden knowledge waste** - track what's available but never used (a curriculum gap)

### Policy Evolution Triggers

As Young Nyx grows, stricter policies unlock:

| Trigger | New Policy Unlocked |
|---------|---------------------|
| 100 successful RAG retrievals | Semantic quality checks |
| First LoRA training run | Topology safety (DriftProbe-lite) |
| 1000 decision_trails logged | Utility validation (help rate > 60%) |
| First INTERNALIZED term | Cross-reference consistency |
| 10 INTERNALIZED terms | Cost-effectiveness (ROI > threshold) |

**Progressive difficulty**: the bar for entering RAG rises as Young Nyx becomes more capable. Early on, anything valid passes; later, a term must prove its utility.

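The trigger table reads naturally as a milestone → policy map. A sketch with assumed stat names, since the document doesn't fix a counters schema:

```python
# Hypothetical milestone → policy mapping, mirroring the trigger table above.
UNLOCKS = [
    (lambda s: s["successful_retrievals"] >= 100, "semantic_quality"),
    (lambda s: s["lora_runs"] >= 1,               "topology_safety"),
    (lambda s: s["decision_trails"] >= 1000,      "utility_validation"),
    (lambda s: s["internalized_terms"] >= 1,      "cross_reference"),
    (lambda s: s["internalized_terms"] >= 10,     "cost_effectiveness"),
]

def active_policies(stats):
    # Basic syntax is always on; the rest unlock as milestones are hit.
    return ["basic_syntax"] + [name for cond, name in UNLOCKS if cond(stats)]

stats = {"successful_retrievals": 250, "lora_runs": 1,
         "decision_trails": 40, "internalized_terms": 0}
# → ['basic_syntax', 'semantic_quality', 'topology_safety']
```

Because milestones only accumulate, policies never re-lock: once the bar rises, it stays risen.
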
---

## Lifeforce Connection

The RAG→Train→Validate cycle has economic cost:

| Action | Lifeforce Cost |
|--------|----------------|
| RAG lookup | Low (just retrieval) |
| Training run | High (compute intensive) |
| Validation | Medium (inference) |
| Failed cycle | Lost V (training didn't take) |
| Successful internalization | +V reward (she grew) |

**Incentive alignment:** successful learning is rewarded, failed training is costly. This naturally optimizes for high-quality training data extraction.

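One way to read this table is as an expected-value gate on training runs: only attempt internalization when the expected net lifeforce is positive. A sketch with illustrative, uncalibrated numbers:

```python
def expected_net_lifeforce(p_success, train_cost, validate_cost, reward, retry_penalty):
    # Expected value of one Train→Validate attempt: pay the costs either way,
    # earn the reward only when validation-without-RAG passes, and eat an
    # extra penalty (lost V) when the training didn't take.
    return (p_success * reward
            - (train_cost + validate_cost)
            - (1 - p_success) * retry_penalty)

# Illustrative numbers (not calibrated): train only when the EV is positive.
ev = expected_net_lifeforce(p_success=0.8, train_cost=5.0, validate_cost=1.0,
                            reward=12.0, retry_penalty=2.0)
# 0.8*12 - (5+1) - 0.2*2 ≈ 3.2 → worth attempting
```

Under this reading, low-utility terms are exactly the ones with low `p_success`, which is why the utility threshold upstream doubles as an economic filter.
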
---

## What This Prevents

1. **RAG bloat** - entries clear after successful training
2. **Crutch dependency** - the scaffold comes off, proven by validation
3. **False confidence** - can't claim to "know" what you only look up
4. **Training on noise** - only validated successes get flagged
5. **Identity confusion** - core architecture in weights, not retrieval

---

## Design Principles

1. **RAG is temporary** - a feeding window, not a permanent store
2. **Training is the goal** - RAG success triggers training, not satisfaction
3. **Validation is double** - with RAG, then without
4. **Clear after learning** - the scaffold must come off to prove growth
5. **Episodic stays external** - not everything needs to be in weights
6. **Self-cleaning** - the system doesn't accumulate cruft

---

## The Analogy

Learning to ride a bike:

```
Training wheels ON (RAG feeding)
        │
        ▼
Can ride with training wheels (validation 1)
        │
        ▼
Training wheels OFF (RAG cleared)
        │
        ▼
Can still ride? (validation 2)
        │
        ├── NO ──▶ Put wheels back, practice more
        │
        └── YES ──▶ She can ride. Wheels stored, not needed.
```

You don't RAG your ability to balance. Once you can ride, you can ride.

---

*She doesn't just retrieve. She learns. And we can prove it.*

---

**Created**: 2025-12-05
**Session**: Partnership dialogue (dafit + Chrysalis)
**Status**: Core architectural concept