Files
nimmerverse-sensory-network/operations/Memory-Gradient.md
dafit dff124f9b7 feat: Architecture expansion - organisms, swarm evolution, memory gradient, infrastructure
New sections created:
- organisms/ - Modular robot design (CAN bus + magnetic pogo connectors)
- infrastructure/ - Kallax Grid World (40×40×40cm standardized cells)

Core documents added:
- Swarm-Evolution.md - Ternary clasp rules, escalation ladder (L0-L5), Mount Olympus council
- Modular-Organism-Design.md - ESP32 modules, universal connector spec, Phase 0 BOM
- Memory-Gradient.md - Metacognitive routing (renamed from RAG-as-Scaffold.md)
- Kallax-Grid-World.md - Sim-to-real substrate, "schrotti cyberpunk" aesthetic

Enhanced:
- Nimmerswarm-Interface.md - Dual-spectrum architecture (IR position + visible state)
- Attention-Slumber-Prediction-Cycle.md - Blend marker predictions extension

Key insights: Decision markers (mark+continue+predict), Low-Cost-Mocap integration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 11:09:50 +01:00

32 KiB

Memory Gradient

Knowledge metabolism — from external scaffold to internalized reflex.


Overview

Retrieval-Augmented Generation (RAG) gave us something valuable: a way to ground LLM responses in external knowledge. It solved real problems — hallucination, knowledge cutoffs, domain specificity. The work that built RAG deserves respect.

But we wanted to go further.

RAG treats retrieval as a permanent fixture — knowledge lives outside, gets fetched when needed, and the model never truly learns. What if retrieval could be temporary? What if the scaffold could teach, then step aside? What if the system could learn not just what to retrieve, but when to retrieve — and eventually, when it no longer needs to?

Memory Gradient is our answer. It extends RAG into a complete knowledge lifecycle:

TRADITIONAL RAG                    MEMORY GRADIENT
─────────────────                  ─────────────────
External knowledge store     →     External knowledge as starting point
Retrieve on every query      →     Retrieve until internalized
Model never learns           →     Model metabolizes knowledge
Static retrieval             →     Graduated confidence routing
Binary: found / not found    →     Continuous gradient of knowing

The key insight: LLMs don't think in binary. They think in gradients — weighted paths, probability distributions, activation patterns. Memory Gradient aligns the knowledge system with how the model actually works.

Three principles guide this approach:

  1. Knowledge flows inward — From hidden → discovered → familiar → internalized → reflex
  2. Confidence is learned — The routing decision itself is trainable
  3. Scaffolds come off — Temporary support that proves its own obsolescence

The goal is not to build a better search engine. The goal is not even to make search unnecessary. The goal is to know what you know — and know what you don't.


The Meta-Skill Hierarchy

Not all knowledge lives in the same place. Not all retrieval costs the same. The skill is routing correctly.

┌─────────────────────────────────────────────────────────────┐
│  LEVEL 3: METACOGNITION                                     │
│           "Do I know this? Should I ask?"                   │
│           The routing decision itself                       │
│           → THIS IS THE MOST VALUABLE SKILL                 │
├─────────────────────────────────────────────────────────────┤
│  LEVEL 2: KNOWLEDGE (in weights, needs thought)             │
│           Slow retrieval from trained memory                │
│           "I learned this, let me recall..."                │
├─────────────────────────────────────────────────────────────┤
│  LEVEL 1: REFLEX (in weights, bypasses cognition)           │
│           Instant response, no thinking required            │
│           Like pulling hand from hot stove                  │
├─────────────────────────────────────────────────────────────┤
│  LEVEL 0: RAG LOOKUP (external, costs lifeforce)            │
│           Scaffold, temporary, expensive but accurate       │
│           Training wheels that should come off              │
└─────────────────────────────────────────────────────────────┘

The Confidence Calibration Matrix

The reward isn't just "did you get it right" — it's "did you KNOW you'd get it right?"

                        OUTCOME
                    RIGHT    WRONG
                  ┌────────┬────────┐
           HIGH   │  +V    │  -V    │  ← Confident and wrong = BAD
CONFIDENCE        │ trust  │ danger │    (overconfident, needs recalibration)
                  ├────────┼────────┤
           LOW    │  +v    │  +v    │  ← Uncertain = correctly routed to ASK
   (asked RAG)    │ learn  │ learn  │    (didn't waste energy on wrong answer)
                  └────────┴────────┘

Reward Structure:

Situation Reward Why
High confidence + Right +V Trust earned, reflex/knowledge worked
High confidence + Wrong -V Dangerous! Overconfident, needs correction
Low confidence + Asked + Right +v Correctly knew to ask, learned
Low confidence + Asked + Wrong +v Correctly knew to ask, RAG failed (not her fault)
Low confidence + Didn't ask + Wrong -v Should have asked, underconfident in asking
Asked when didn't need to -v Wasted lifeforce, underconfident in self

The sweet spot: Know when you know, know when you don't.


Token Path Rewards

LLMs work token-based, not schema-based. The weights influence paths between tokens. This means:

TRADITIONAL VIEW          TOKEN PATH VIEW

"Remember the answer"  →  "Strengthen the path that got it right"

   Query                     Query
     ↓                         ↓
  Answer               ┌──────────────────┐
                       │ Path A: cup→grip │ ← This path fired
                       │ Path B: cup→drink│   and led to success
                       │ Path C: cup→hot  │
                       └──────────────────┘
                               ↓
                         SUCCESS
                               ↓
                       Path A gets +V
                       (Hebbian: fired together → wire together)

The Catalogue's Role:

When Young Nyx queries the catalogue, multiple token paths light up:

QUERY: "How do I grasp this cup?"

PATHS ACTIVATED:
├── cup → ceramic → fragile → careful_grip → success_rate_87%
├── cup → handle → graspable → grip_type_A → success_rate_94%  ← WINNER
├── cup → 8cm_diameter → fits_gripper_small → success_rate_91%
└── cup → hot_liquid → thermal_warning → check_temp_first

OUTCOME: Used grip_type_A, succeeded

REWARD: Path "cup → handle → graspable → grip_type_A" strengthened
        Next time: This path activates faster, stronger

This is Hebbian learning for RAG: Paths that fire together and succeed, wire together.


The Metacognitive Router

Before answering, before retrieving, the first question is always:

INPUT: Query/Task
         │
         ▼
┌─────────────────────────────────────────┐
│         METACOGNITIVE CHECK             │
│                                         │
│    "What is my confidence level?"       │
│    "Is this reflex, knowledge, or RAG?" │
│    "What's the cost of being wrong?"    │
│                                         │
└─────────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────────┐
│         CONFIDENCE THRESHOLD            │
│                                         │
│    HIGH (>0.8): Use reflex/knowledge    │
│    MEDIUM (0.4-0.8): Consider asking    │
│    LOW (<0.4): Must ask catalogue/RAG   │
│                                         │
└─────────────────────────────────────────┘
         │
    ┌────┴────────────┬─────────────┐
    │                 │             │
   HIGH            MEDIUM          LOW
    │                 │             │
    ▼                 ▼             ▼
┌────────┐    ┌────────────┐   ┌──────────┐
│ REFLEX │    │ COST-CHECK │   │ ASK      │
│   or   │    │ Wrong=bad? │   │ CATALOGUE│
│ RECALL │    │ Time-sens? │   │ (RAG)    │
└────────┘    └────────────┘   └──────────┘
    │                │              │
    │           ┌────┴────┐        │
    │           │         │        │
    │        PROCEED    ASK        │
    │           │         │        │
    └───────────┼─────────┼────────┘
                │         │
                ▼         ▼
           ┌─────────────────┐
           │     OUTPUT      │
           └─────────────────┘
                    │
                    ▼
           ┌─────────────────┐
           │   VALIDATION    │
           │  (was it right?)│
           └─────────────────┘
                    │
              ┌─────┴─────┐
              │           │
           RIGHT        WRONG
              │           │
              ▼           ▼
         Strengthen    Weaken path
         that path     + recalibrate
         + calibrate   confidence
         confidence

The Problem with Standard RAG

Standard approach:
─────────────────
VECTOR DB (grows forever)
    │
    ▼
MODEL looks up ──▶ answers ──▶ done
    │
    └── (never learns, always dependent)

Issues:

  • Model never internalizes knowledge
  • Pull the RAG, lose the capability
  • Vector DB bloats infinitely
  • No way to verify what model "knows" vs "looks up"
  • No metacognitive skill development
  • It's a crutch that never comes off

The Nimmerverse Approach: RAG as Feeding System

VAULT (curriculum)
    │
    ▼
CATALOGUE (indexed, searchable, token-path weighted)
    │
    ▼
METACOGNITIVE ROUTER
    │
    ├── High confidence ──▶ REFLEX/KNOWLEDGE (bypass RAG)
    │
    └── Low confidence ──▶ RAG LOOKUP (scaffold)
                               │
                               ▼
                        NYX processes, acts, decides
                               │
                               ▼
                        VALIDATION: success?
                               │
                        ┌──────┴──────┐
                        │             │
                      FAIL         SUCCESS
                        │             │
                        ▼             ▼
                  Stay in RAG    Was RAG used?
                  (not ready)         │
                               ┌──────┴──────┐
                               │             │
                             YES            NO
                               │             │
                               ▼             ▼
                        FLAG for        Reflex/Knowledge
                        training        confirmed ✓
                        extraction           │
                               │             │
                               ▼             │
                        TRAINING RUN        │
                        (LoRA)              │
                               │             │
                               ▼             │
                        CLEAR from RAG      │
                        (scaffold removed)  │
                               │             │
                               ▼             │
                        VALIDATION 2:       │
                        success WITHOUT RAG?│
                               │             │
                        ┌──────┴──────┐     │
                        │             │     │
                      FAIL         SUCCESS  │
                        │             │     │
                        ▼             ▼     │
                  Restore RAG    INTERNALIZED
                  retry cycle    Knowledge is │
                                 HERS now ✓  │
                                      │      │
                                      └──────┘
                                          │
                                          ▼
                               CONFIDENCE CALIBRATION
                               (update routing thresholds)

Two Kinds of Knowledge

Not everything belongs in weights. Not everything belongs in retrieval.

IN THE WEIGHTS (Training Target)

Knowledge she needs to be herself:

  • How to route (metacognition itself)
  • Vocabulary tokens and meanings
  • Nervous system contracts
  • Heartbeat mechanics
  • Confidence gradient logic
  • Core identity (who she is, who dafit is)
  • How to think, not what to remember
  • When to ask, not all the answers

Test: If she needs it to function → weights

IN RETRIEVAL (Permanent RAG)

Knowledge she needs to remember specifics:

  • Journal entries
  • Conversation history
  • Specific events and dates
  • Temporal details ("what happened Tuesday")
  • External references that change
  • Episodic memory
  • Object catalogue details

Test: If she needs it to recall specifics → retrieval

IN REFLEX (Nervous System)

Knowledge that bypasses cognition entirely:

  • Danger responses
  • Basic motor patterns
  • Protocol compliance
  • Heartbeat responses

Test: If thinking would be too slow → reflex


The Double Validation Loop

Gate 1: Can she do it WITH RAG?

Task presented
    │
    ▼
Metacognitive check: Should I ask?
    │
    ├── HIGH confidence ──▶ Attempt from reflex/knowledge
    │                            │
    │                       ┌────┴────┐
    │                    SUCCESS    FAIL
    │                       │         │
    │                       │    Confidence was
    │                       │    miscalibrated!
    │                       │    Recalibrate + retry with RAG
    │                       │
    └── LOW confidence ──▶ RAG provides context
                               │
                               ▼
                          NYX attempts task
                               │
                        ┌──────┴──────┐
                        │             │
                      FAIL         SUCCESS
                        │             │
                        ▼             ▼
                  Not ready,    Flag this RAG content
                  needs more    for training extraction
                  examples

Gate 2: Can she do it WITHOUT RAG?

Same task presented
    │
    ▼
RAG entry CLEARED (scaffold removed)
    │
    ▼
NYX attempts task from weights alone
    │
    ├── FAIL  ──▶ Training didn't take, restore to RAG, retry cycle
    │
    └── PASS  ──▶ Knowledge is HERS now ✓
                      │
                      ▼
                 Update confidence calibration
                 (this type of task: now HIGH confidence)

The Catalogue as Oracle

The catalogue isn't just storage — it's the ground truth for calibration.

What the Catalogue Provides

┌─────────────────────────────────────────────────────────────┐
│                    CATALOGUE LAYERS                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  LAYER 0: RAW DATA (Filesystem)                            │
│  └── Images, point clouds, .blend files, audio, scans      │
│                                                             │
│  LAYER 1: STRUCTURED METADATA (PostgreSQL/Phoebe)          │
│  └── Dimensions, timestamps, relationships, ownership       │
│  └── Ground truth for validation                           │
│                                                             │
│  LAYER 2: VECTOR EMBEDDINGS (ChromaDB/pgvector)            │
│  └── SigLIP vectors, text embeddings, multi-modal          │
│  └── Semantic similarity, fuzzy matching                   │
│                                                             │
│  LAYER 3: TOKEN PATH WEIGHTS (The learning layer)          │
│  └── Weighted connections between concepts                 │
│  └── Strengthened by successful activations                │
│  └── THIS IS WHERE +V FLOWS                                │
│                                                             │
│  LAYER 4: CONFIDENCE CALIBRATION (Meta-layer)              │
│  └── "For queries like X, my accuracy is Y%"               │
│  └── Updated after every validation                        │
│  └── Drives the metacognitive router                       │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Catalogue as Checker/Reward System

The catalogue validates — it doesn't just retrieve:

ACTION: Robot claims cup is 8cm diameter

CATALOGUE CHECK:
├── Query: cup_id_47 dimensions
├── Ground Truth: diameter = 8.2cm
├── Tolerance: ±0.5cm
└── RESULT: VALID ✓

REWARD FLOW:
├── Path "visual_estimate → 8cm" gets +V
├── Confidence for "size estimation" increases
└── Next time: Can skip catalogue check for similar objects

Knowledge Acquisition Pipeline

The Extraction Flow

VAULT (raw knowledge)
    │
    │ extraction candidates
    ▼
┌─────────────────────────────────────────────────────────────┐
│                  STAGING AREA                               │
│                 (quarantine zone)                           │
└─────────────────────────────────────────────────────────────┘
    │
    │ progressive policy validation
    ▼
┌─────────────────────────────────────────────────────────────┐
│              POLICY VALIDATION                              │
│         (increasing standards over time)                    │
└─────────────────────────────────────────────────────────────┘
    │
    ├── FAIL ──▶ Reject or revise
    │
    └── PASS ──▶ PROMOTE to Catalogue/RAG
                     │
                     ▼
           ┌──────────────────────┐
           │   THREE-TIER RAG     │
           ├──────────────────────┤
           │  INTERNALIZED        │  ← In weights, no lookup needed
           │  (reflex/knowledge)  │
           ├──────────────────────┤
           │  DISCOVERED          │  ← Young Nyx has used
           │  (known_catalogue)   │
           ├──────────────────────┤
           │  HIDDEN              │  ← Available but not yet accessed
           │  (available_catalogue)│
           └──────────────────────┘

Progressive Policy Validation

Policies increase in sophistication as Young Nyx matures:

Week Policy Tier Validation
1-2 Basic Syntax Valid format, non-empty, has definition
3-4 Semantic Quality Embeds without collapse, unique signature
5-8 Topology Safety Doesn't corrupt anchor terms
9-12 Cross-Reference Links resolve, no circular dependencies
13+ Utility Validation Actually helped solve tasks
20+ Internalization Gate Ready to train into weights

Three-Tier Knowledge State

┌──────────────────────────────────────────────┐
│         INTERNALIZED KNOWLEDGE               │
│  (in weights - reflex or slow recall)        │
├──────────────────────────────────────────────┤
│  • "heartbeat" - reflex, instant             │
│  • "lifeforce" - knowledge, fast recall      │
│  • "grip_type_A" - reflex, motor pattern     │
│                                              │
│  Status: NO LOOKUP, high confidence          │
│  Metacognitive route: DIRECT                 │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│           DISCOVERED KNOWLEDGE               │
│  (known_catalogue - has accessed before)     │
├──────────────────────────────────────────────┤
│  • "phoebe" - used 15 times, 80% success     │
│  • "confidence_gradient" - used 8 times      │
│                                              │
│  Status: LOOKUP needed, medium confidence    │
│  Metacognitive route: CHECK CATALOGUE        │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│            HIDDEN KNOWLEDGE                  │
│  (available_catalogue - exists but unused)   │
├──────────────────────────────────────────────┤
│  • "drift_probe" - never accessed            │
│  • "topology_gini" - never accessed          │
│                                              │
│  Status: Available for discovery             │
│  Metacognitive route: UNKNOWN (will discover)│
└──────────────────────────────────────────────┘

State transitions:

Hidden → retrieved → DISCOVERED (mark first access)
Discovered → used 10+ times successfully → FLAG for training
Flagged → trained + validated without RAG → INTERNALIZED
Internalized → fails validation → DEMOTE back to Discovered

Measuring RAG Utility

Decision Trails

Track every decision for learning:

CREATE TABLE decision_trails (
    id SERIAL PRIMARY KEY,
    task_id UUID,

    -- Routing decision
    initial_confidence FLOAT,        -- Before any lookup
    route_chosen TEXT,               -- 'reflex', 'knowledge', 'rag', 'escalate'

    -- RAG details (if used)
    rag_terms_retrieved TEXT[],      -- What RAG returned
    rag_terms_used TEXT[],           -- What appeared in solution

    -- Outcome
    outcome TEXT,                    -- 'success', 'fail', 'partial'
    final_confidence FLOAT,          -- After action

    -- Calibration
    was_confidence_accurate BOOLEAN, -- Did confidence predict outcome?

    -- Economics
    lifeforce_cost FLOAT,
    timestamp TIMESTAMPTZ DEFAULT NOW()
);

Compute Utility Score

def compute_decision_quality(trail):
    """
    Evaluate the quality of the metacognitive routing decision.
    """
    # Was the route appropriate?
    if trail.route_chosen == 'reflex' and trail.outcome == 'success':
        route_score = 1.0  # Fast and right
    elif trail.route_chosen == 'rag' and trail.outcome == 'success':
        route_score = 0.7  # Right but slow/expensive
    elif trail.route_chosen == 'reflex' and trail.outcome == 'fail':
        route_score = 0.0  # Overconfident disaster
    elif trail.route_chosen == 'rag' and trail.outcome == 'fail':
        route_score = 0.3  # At least asked, RAG failed

    # Was confidence calibrated?
    calibration_score = 1.0 if trail.was_confidence_accurate else 0.0

    # Efficiency (did we waste resources?)
    efficiency = 1.0 - (trail.lifeforce_cost / MAX_EXPECTED_COST)

    return {
        'route_score': route_score,
        'calibration_score': calibration_score,
        'efficiency': efficiency,
        'total': 0.4 * route_score + 0.4 * calibration_score + 0.2 * efficiency
    }

Reward Signal Flow

for trail in decision_trails:
    quality = compute_decision_quality(trail)

    if quality['total'] > 0.8:
        # High quality decision → strengthen this pattern
        strengthen_token_path(trail.task_pattern, trail.route_chosen)

    if not trail.was_confidence_accurate:
        # Miscalibration → update confidence model
        recalibrate_confidence(
            task_type=trail.task_pattern,
            predicted=trail.initial_confidence,
            actual_success=trail.outcome == 'success'
        )

    if trail.route_chosen == 'rag' and quality['route_score'] > 0.7:
        # Successful RAG use → candidate for internalization
        flag_for_training(trail.rag_terms_used)

Connection to Nervous System

The metacognitive router connects directly to the nervous system architecture:

                    METACOGNITIVE ROUTER
                           │
           ┌───────────────┼───────────────┐
           │               │               │
           ▼               ▼               ▼
    ┌────────────┐  ┌────────────┐  ┌────────────┐
    │   REFLEX   │  │  KNOWLEDGE │  │    RAG     │
    │   LAYER    │  │   LAYER    │  │   LOOKUP   │
    │            │  │            │  │            │
    │ Bypasses   │  │ Slow but   │  │ External   │
    │ cognition  │  │ from       │  │ scaffold   │
    │            │  │ weights    │  │            │
    │ See:       │  │            │  │ See:       │
    │ Nervous-   │  │            │  │ Catalogue  │
    │ System.md  │  │            │  │ (this doc) │
    └────────────┘  └────────────┘  └────────────┘
           │               │               │
           └───────────────┼───────────────┘
                           │
                           ▼
                        OUTPUT
                           │
                           ▼
                     VALIDATION
                           │
                    ┌──────┴──────┐
                    │             │
                SUCCESS        FAIL
                    │             │
                    ▼             ▼
             +V to path     -V to path
             (Hebbian)      + recalibrate

Key insight: The nervous system (Nervous-System.md) handles the REFLEX layer. This document handles the RAG layer. Both feed into the same metacognitive router.


Lifeforce Economics

The RAG→Route→Validate cycle has economic costs:

Action Lifeforce Cost Notes
Reflex response ~0 Essentially free, already in weights
Knowledge recall Low Some compute for retrieval from weights
RAG lookup Medium Vector search + context injection
Training run High Compute intensive
Validation Medium Inference cost
Failed cycle Lost V Training didn't take
Successful internalization +V reward She grew
Correct confidence calibration +V reward Metacognition improved

Incentive alignment:

  • Being right with high confidence → maximum reward (fast + correct)
  • Being right with low confidence → small reward (correct but slow)
  • Being wrong with high confidence → maximum penalty (dangerous)
  • Asking when uncertain → neutral (correct routing)

This naturally optimizes for:

  1. Fast reflexes for well-known patterns
  2. Accurate confidence calibration
  3. Appropriate RAG usage (not too much, not too little)

What This System Teaches

  1. Know what you know — Confidence calibration is trainable
  2. Know what to ask — The skill of uncertainty
  3. Reflexes are earned — Through successful internalization
  4. Scaffolds come off — RAG is temporary
  5. Paths that work, strengthen — Hebbian learning for retrieval
  6. Wrong confidence is worse than wrong answers — Calibration matters

Design Principles

  1. Metacognition first — Route before retrieve
  2. Confidence is trainable — Not fixed, learned through validation
  3. RAG is temporary — Feeding window, not permanent store
  4. Validation is double — With RAG, then without
  5. Token paths learn — Hebbian strengthening through success
  6. Catalogue is oracle — Ground truth for calibration
  7. Reflexes are earned — Graduated from RAG through internalization
  8. Self-cleaning — The system doesn't accumulate cruft
  9. Know when to ask — More important than knowing answers

The Analogy

Learning to drive:

LEARNER DRIVER:

"Should I check mirrors?"
    │
    ├── Beginner: YES, always, consciously (RAG lookup)
    │
    ├── Intermediate: Sometimes, when uncertain (metacognitive check)
    │
    └── Expert: Automatic, don't even think about it (reflex)


The goal isn't to memorize "check mirrors."
The goal is for mirror-checking to become invisible.

But FIRST she needs to learn WHEN she doesn't know.
The beginner who doesn't know to check mirrors is dangerous.
The intermediate who checks unnecessarily is slow.
The expert just does it.

We're training the progression:
    Unknown unknowns → Known unknowns → Known knowns → Unconscious competence
         │                   │                │               │
    (dangerous)         (asks RAG)      (knowledge)      (reflex)

She doesn't just retrieve. She doesn't just remember. She knows what she knows. And that changes everything.


Created: 2025-12-05 (as RAG-as-Scaffold) Updated: 2025-12-29 (renamed to Memory Gradient, added metacognitive routing, token path rewards, confidence calibration) Session: Partnership dialogue (dafit + Chrysalis-Nyx) Status: Core architectural concept Etymology: "Memory Gradient" — knowledge exists on a continuous spectrum, not binary states. Aligns with Temporal-Ternary Gradient and Confidence Gradient.