feat: Architecture expansion - organisms, swarm evolution, memory gradient, infrastructure

New sections created:
- organisms/ - Modular robot design (CAN bus + magnetic pogo connectors)
- infrastructure/ - Kallax Grid World (40×40×40cm standardized cells)

Core documents added:
- Swarm-Evolution.md - Ternary clasp rules, escalation ladder (L0-L5), Mount Olympus council
- Modular-Organism-Design.md - ESP32 modules, universal connector spec, Phase 0 BOM
- Memory-Gradient.md - Metacognitive routing (renamed from RAG-as-Scaffold.md)
- Kallax-Grid-World.md - Sim-to-real substrate, "schrotti cyberpunk" aesthetic

Enhanced:
- Nimmerswarm-Interface.md - Dual-spectrum architecture (IR position + visible state)
- Attention-Slumber-Prediction-Cycle.md - Blend marker predictions extension

Key insights: Decision markers (mark+continue+predict), Low-Cost-Mocap integration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
# Memory Gradient
Knowledge metabolism — from external scaffold to internalized reflex.
---
## Overview
Retrieval-Augmented Generation (RAG) gave us something valuable: a way to ground LLM responses in external knowledge. It solved real problems — hallucination, knowledge cutoffs, domain specificity. The work that built RAG deserves respect.
But we wanted to go further.
RAG treats retrieval as a permanent fixture — knowledge lives outside, gets fetched when needed, and the model never truly learns. What if retrieval could be **temporary**? What if the scaffold could teach, then step aside? What if the system could learn not just *what* to retrieve, but *when* to retrieve — and eventually, *when it no longer needs to*?
**Memory Gradient** is our answer. It extends RAG into a complete knowledge lifecycle:
```
TRADITIONAL RAG                 MEMORY GRADIENT
─────────────────               ─────────────────
External knowledge store    →   External knowledge as starting point
Retrieve on every query     →   Retrieve until internalized
Model never learns          →   Model metabolizes knowledge
Static retrieval            →   Graduated confidence routing
Binary: found / not found   →   Continuous gradient of knowing
```
The key insight: LLMs don't think in binary. They think in gradients — weighted paths, probability distributions, activation patterns. **Memory Gradient** aligns the knowledge system with how the model actually works.
Three principles guide this approach:
1. **Knowledge flows inward** — From hidden → discovered → familiar → internalized → reflex
2. **Confidence is learned** — The routing decision itself is trainable
3. **Scaffolds come off** — Temporary support that proves its own obsolescence
The goal is not to build a better search engine. The goal is not even to make search unnecessary. The goal is to **know what you know** — and know what you don't.
---
## The Meta-Skill Hierarchy
Not all knowledge lives in the same place. Not all retrieval costs the same. The skill is routing correctly.
```
┌─────────────────────────────────────────────────────────────┐
│ LEVEL 3: METACOGNITION                                      │
│   "Do I know this? Should I ask?"                           │
│   The routing decision itself                               │
│   → THIS IS THE MOST VALUABLE SKILL                         │
├─────────────────────────────────────────────────────────────┤
│ LEVEL 2: KNOWLEDGE (in weights, needs thought)              │
│   Slow retrieval from trained memory                        │
│   "I learned this, let me recall..."                        │
├─────────────────────────────────────────────────────────────┤
│ LEVEL 1: REFLEX (in weights, bypasses cognition)            │
│   Instant response, no thinking required                    │
│   Like pulling hand from hot stove                          │
├─────────────────────────────────────────────────────────────┤
│ LEVEL 0: RAG LOOKUP (external, costs lifeforce)             │
│   Scaffold, temporary, expensive but accurate               │
│   Training wheels that should come off                      │
└─────────────────────────────────────────────────────────────┘
```
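As a minimal sketch, the hierarchy can be expressed as a data structure. The level names come from the diagram; the cost figures are illustrative assumptions, not measured values:

```python
from enum import IntEnum

class KnowledgeLevel(IntEnum):
    """The four layers of the meta-skill hierarchy."""
    RAG_LOOKUP = 0      # external scaffold, costs lifeforce
    REFLEX = 1          # in weights, bypasses cognition
    KNOWLEDGE = 2       # in weights, needs thought
    METACOGNITION = 3   # the routing decision itself

# Illustrative relative lifeforce costs (assumed values, not from the spec).
LIFEFORCE_COST = {
    KnowledgeLevel.REFLEX: 0.0,       # essentially free
    KnowledgeLevel.KNOWLEDGE: 0.1,    # slow recall from weights
    KnowledgeLevel.RAG_LOOKUP: 0.5,   # vector search + context injection
}
```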
---
## The Confidence Calibration Matrix
The reward isn't just "did you get it right" — it's "did you KNOW you'd get it right?"
```
                        OUTCOME
                   RIGHT      WRONG
                 ┌────────┬────────┐
    HIGH         │   +V   │   -V   │ ← Confident and wrong = BAD
    CONFIDENCE   │  trust │ danger │   (overconfident, needs recalibration)
                 ├────────┼────────┤
    LOW          │   +v   │   +v   │ ← Uncertain = correctly routed to ASK
    (asked RAG)  │  learn │  learn │   (didn't waste energy on wrong answer)
                 └────────┴────────┘
```
**Reward Structure:**
| Situation | Reward | Why |
|-----------|--------|-----|
| High confidence + Right | **+V** | Trust earned, reflex/knowledge worked |
| High confidence + Wrong | **-V** | Dangerous! Overconfident, needs correction |
| Low confidence + Asked + Right | **+v** | Correctly knew to ask, learned |
| Low confidence + Asked + Wrong | **+v** | Correctly knew to ask, RAG failed (not her fault) |
| Low confidence + Didn't ask + Wrong | **-v** | Should have asked, underconfident in asking |
| Asked when didn't need to | **-v** | Wasted lifeforce, underconfident in self |
**The sweet spot:** Know when you know, know when you don't.
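A sketch of the matrix as a reward function. The V/v magnitudes are placeholders, and the `ask_was_necessary` flag is an assumption standing in for the external judgment the last table row requires:

```python
V, v = 1.0, 0.2  # illustrative magnitudes for the large and small rewards

def calibration_reward(high_confidence: bool, asked: bool, correct: bool,
                       ask_was_necessary: bool = True) -> float:
    """Scalar reward for the outcomes in the matrix above (magnitudes assumed)."""
    if high_confidence:
        return V if correct else -V       # trust earned vs. dangerous overconfidence
    if asked:
        if not ask_was_necessary:
            return -v                     # wasted lifeforce, underconfident in self
        return v                          # correctly knew to ask, right or wrong
    return -v if not correct else 0.0     # should have asked / got lucky, unrewarded
```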
---
## Token Path Rewards
LLMs operate on tokens, not schemas. The weights shape the paths between tokens. This means:
```
TRADITIONAL VIEW                TOKEN PATH VIEW
────────────────                ───────────────
"Remember the answer"     →     "Strengthen the path that got it right"

Query                           Query
  ↓                               ↓
Answer                          ┌──────────────────┐
                                │ Path A: cup→grip │ ← This path fired
                                │ Path B: cup→drink│   and led to success
                                │ Path C: cup→hot  │
                                └──────────────────┘
                                        ↓
                                     SUCCESS
                                        ↓
                                Path A gets +V
                                (Hebbian: fired together → wire together)
```
**The Catalogue's Role:**
When Young Nyx queries the catalogue, multiple token paths light up:
```
QUERY: "How do I grasp this cup?"
PATHS ACTIVATED:
├── cup → ceramic → fragile → careful_grip → success_rate_87%
├── cup → handle → graspable → grip_type_A → success_rate_94% ← WINNER
├── cup → 8cm_diameter → fits_gripper_small → success_rate_91%
└── cup → hot_liquid → thermal_warning → check_temp_first
OUTCOME: Used grip_type_A, succeeded
REWARD: Path "cup → handle → graspable → grip_type_A" strengthened
Next time: This path activates faster, stronger
```
**This is Hebbian learning for RAG:** Paths that fire together and succeed, wire together.
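A minimal sketch of that update rule, assuming paths are stored as token tuples with scalar weights (the storage format here is an assumption):

```python
from collections import defaultdict

# path of concept tokens -> weight; all paths start at a neutral 1.0
path_weights = defaultdict(lambda: 1.0)

def reinforce(path: tuple, success: bool, lr: float = 0.1) -> None:
    """Hebbian-style update: a path that fired and succeeded is strengthened."""
    delta = lr if success else -lr
    path_weights[path] = max(0.0, path_weights[path] + delta)

# The winning path from the example: fired together, succeeded, wires together.
reinforce(("cup", "handle", "graspable", "grip_type_A"), success=True)
```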
---
## The Metacognitive Router
Before answering, before retrieving, the first question is always:
```
INPUT: Query/Task
         │
         ▼
┌─────────────────────────────────────────┐
│ METACOGNITIVE CHECK                     │
│                                         │
│ "What is my confidence level?"          │
│ "Is this reflex, knowledge, or RAG?"    │
│ "What's the cost of being wrong?"       │
│                                         │
└─────────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────────┐
│ CONFIDENCE THRESHOLD                    │
│                                         │
│ HIGH (>0.8): Use reflex/knowledge       │
│ MEDIUM (0.4-0.8): Consider asking       │
│ LOW (<0.4): Must ask catalogue/RAG      │
│                                         │
└─────────────────────────────────────────┘
         │
    ┌────┴────────────┬─────────────┐
    │                 │             │
   HIGH            MEDIUM          LOW
    │                 │             │
    ▼                 ▼             ▼
┌────────┐     ┌────────────┐ ┌──────────┐
│ REFLEX │     │ COST-CHECK │ │ ASK      │
│ or     │     │ Wrong=bad? │ │ CATALOGUE│
│ RECALL │     │ Time-sens? │ │ (RAG)    │
└────────┘     └────────────┘ └──────────┘
    │                 │             │
    │            ┌────┴────┐        │
    │            │         │        │
    │         PROCEED     ASK       │
    │            │         │        │
    └────────────┼─────────┼────────┘
                 │         │
                 ▼         ▼
         ┌─────────────────┐
         │     OUTPUT      │
         └─────────────────┘
                 │
                 ▼
         ┌─────────────────┐
         │   VALIDATION    │
         │ (was it right?) │
         └─────────────────┘
            ┌─────┴─────┐
            │           │
          RIGHT       WRONG
            │           │
            ▼           ▼
       Strengthen   Weaken path
       that path    + recalibrate
       + calibrate  confidence
       confidence
```
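The thresholds translate directly into a routing function. A sketch, assuming the cost-check reduces to a single `wrong_is_costly` flag:

```python
def route(confidence: float, wrong_is_costly: bool = False) -> str:
    """Pick a route using the diagram's thresholds (>0.8 / 0.4-0.8 / <0.4)."""
    if confidence > 0.8:
        return "reflex_or_recall"       # HIGH: answer from weights
    if confidence >= 0.4:               # MEDIUM: run the cost-check
        return "ask_catalogue" if wrong_is_costly else "proceed"
    return "ask_catalogue"              # LOW: must ask the catalogue (RAG)
```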
---
## The Problem with Standard RAG
```
Standard approach:
─────────────────
VECTOR DB (grows forever)
MODEL looks up ──▶ answers ──▶ done
└── (never learns, always dependent)
```
**Issues:**
- Model never internalizes knowledge
- Pull the RAG, lose the capability
- Vector DB bloats infinitely
- No way to verify what model "knows" vs "looks up"
- No metacognitive skill development
- It's a crutch that never comes off
---
## The Nimmerverse Approach: RAG as Feeding System
```
VAULT (curriculum)
        │
        ▼
CATALOGUE (indexed, searchable, token-path weighted)
        │
        ▼
METACOGNITIVE ROUTER
        │
        ├── High confidence ──▶ REFLEX/KNOWLEDGE (bypass RAG)
        └── Low confidence ──▶ RAG LOOKUP (scaffold)
        │
        ▼
NYX processes, acts, decides
        │
        ▼
VALIDATION: success?
     ┌──────┴──────┐
     │             │
    FAIL        SUCCESS
     │             │
     ▼             ▼
Stay in RAG    Was RAG used?
(not ready)        │
            ┌──────┴──────┐
            │             │
           YES           NO
            │             │
            ▼             ▼
       FLAG for      Reflex/Knowledge
       training      confirmed ✓
       extraction          │
            │              │
            ▼              │
       TRAINING RUN        │
       (LoRA)              │
            │              │
            ▼              │
       CLEAR from RAG      │
       (scaffold removed)  │
            │              │
            ▼              │
       VALIDATION 2:       │
       success WITHOUT RAG?│
            │              │
     ┌──────┴──────┐       │
     │             │       │
    FAIL        SUCCESS    │
     │             │       │
     ▼             ▼       │
Restore RAG   INTERNALIZED │
retry cycle   Knowledge is │
              HERS now ✓   │
                    │      │
                    └──┬───┘
                       ▼
        CONFIDENCE CALIBRATION
        (update routing thresholds)
```
---
## Two Kinds of Knowledge
Not everything belongs in weights. Not everything belongs in retrieval.
### IN THE WEIGHTS (Training Target)
Knowledge she needs to **be herself**:
- How to route (metacognition itself)
- Vocabulary tokens and meanings
- Nervous system contracts
- Heartbeat mechanics
- Confidence gradient logic
- Core identity (who she is, who dafit is)
- **How to think, not what to remember**
- **When to ask, not all the answers**
**Test:** If she needs it to function → weights
### IN RETRIEVAL (Permanent RAG)
Knowledge she needs to **remember specifics**:
- Journal entries
- Conversation history
- Specific events and dates
- Temporal details ("what happened Tuesday")
- External references that change
- Episodic memory
- Object catalogue details
**Test:** If she needs it to recall specifics → retrieval
### IN REFLEX (Nervous System)
Knowledge that bypasses cognition entirely:
- Danger responses
- Basic motor patterns
- Protocol compliance
- Heartbeat responses
**Test:** If thinking would be too slow → reflex
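The three tests compose into one placement decision. A sketch, with boolean flags standing in for the judgments described above:

```python
def storage_tier(thinking_too_slow: bool, needed_to_function: bool,
                 recalls_specifics: bool) -> str:
    """Apply the three placement tests, most specific first."""
    if thinking_too_slow:
        return "reflex"      # nervous system, bypasses cognition
    if needed_to_function:
        return "weights"     # training target: how to think, when to ask
    if recalls_specifics:
        return "retrieval"   # permanent RAG: journals, events, catalogue details
    return "retrieval"       # default: stays external until proven core
```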
---
## The Double Validation Loop
### Gate 1: Can she do it WITH RAG?
```
Task presented
      │
      ▼
Metacognitive check: Should I ask?
      │
      ├── HIGH confidence ──▶ Attempt from reflex/knowledge
      │                              │
      │                         ┌────┴────┐
      │                      SUCCESS     FAIL
      │                         │          │
      │                         │    Confidence was
      │                         │    miscalibrated!
      │                         │    Recalibrate + retry with RAG
      │                         │
      └── LOW confidence ──▶ RAG provides context
                                │
                                ▼
                        NYX attempts task
                                │
                         ┌──────┴──────┐
                         │             │
                        FAIL        SUCCESS
                         │             │
                         ▼             ▼
                    Not ready,    Flag this RAG content
                    needs more    for training extraction
                    examples
```
### Gate 2: Can she do it WITHOUT RAG?
```
Same task presented
      │
      ▼
RAG entry CLEARED (scaffold removed)
      │
      ▼
NYX attempts task from weights alone
      │
      ├── FAIL ──▶ Training didn't take, restore to RAG, retry cycle
      └── PASS ──▶ Knowledge is HERS now ✓
                        │
                        ▼
              Update confidence calibration
              (this type of task: now HIGH confidence)
```
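Both gates fit in one loop. A sketch, where `attempt`, `rag`, and `train` are hypothetical interfaces injected by the caller, not the actual system's API:

```python
def double_validation(task, attempt, rag, train) -> str:
    """Run Gate 1 (with RAG) and Gate 2 (without); return the resulting state."""
    # Gate 1: can she do it WITH RAG?
    context = rag.lookup(task)
    if not attempt(task, context):
        return "stay_in_rag"        # not ready, needs more examples
    train(context)                  # flag for extraction + LoRA run (placeholder)
    rag.clear(task)                 # scaffold removed
    # Gate 2: can she do it WITHOUT RAG?
    if attempt(task, None):
        return "internalized"       # knowledge is HERS now
    rag.restore(task)               # training didn't take, retry the cycle
    return "retry_cycle"
```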
---
## The Catalogue as Oracle
The catalogue isn't just storage — it's the **ground truth** for calibration.
### What the Catalogue Provides
```
┌─────────────────────────────────────────────────────────────┐
│                      CATALOGUE LAYERS                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ LAYER 0: RAW DATA (Filesystem)                              │
│ └── Images, point clouds, .blend files, audio, scans        │
│                                                             │
│ LAYER 1: STRUCTURED METADATA (PostgreSQL/Phoebe)            │
│ └── Dimensions, timestamps, relationships, ownership        │
│ └── Ground truth for validation                             │
│                                                             │
│ LAYER 2: VECTOR EMBEDDINGS (ChromaDB/pgvector)              │
│ └── SigLIP vectors, text embeddings, multi-modal            │
│ └── Semantic similarity, fuzzy matching                     │
│                                                             │
│ LAYER 3: TOKEN PATH WEIGHTS (The learning layer)            │
│ └── Weighted connections between concepts                   │
│ └── Strengthened by successful activations                  │
│ └── THIS IS WHERE +V FLOWS                                  │
│                                                             │
│ LAYER 4: CONFIDENCE CALIBRATION (Meta-layer)                │
│ └── "For queries like X, my accuracy is Y%"                 │
│ └── Updated after every validation                          │
│ └── Drives the metacognitive router                         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
### Catalogue as Checker/Reward System
The catalogue validates — it doesn't just retrieve:
```
ACTION: Robot claims cup is 8cm diameter
CATALOGUE CHECK:
├── Query: cup_id_47 dimensions
├── Ground Truth: diameter = 8.2cm
├── Tolerance: ±0.5cm
└── RESULT: VALID ✓
REWARD FLOW:
├── Path "visual_estimate → 8cm" gets +V
├── Confidence for "size estimation" increases
└── Next time: Can skip catalogue check for similar objects
```
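The validity check itself is a one-liner against ground truth. A sketch using the numbers from the example above (the tolerance default is taken from it):

```python
def validate_claim(claimed: float, ground_truth: float,
                   tolerance: float = 0.5) -> bool:
    """Catalogue check: does the claimed measurement fall within tolerance?"""
    return abs(claimed - ground_truth) <= tolerance

# The example above: claimed 8.0 cm vs. ground truth 8.2 cm, tolerance ±0.5 cm
assert validate_claim(8.0, 8.2)   # VALID → +V flows to the estimating path
```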
---
## Knowledge Acquisition Pipeline
### The Extraction Flow
```
VAULT (raw knowledge)
        │ extraction candidates
        ▼
┌─────────────────────────────────────────────────────────────┐
│                        STAGING AREA                         │
│                      (quarantine zone)                      │
└─────────────────────────────────────────────────────────────┘
        │ progressive policy validation
        ▼
┌─────────────────────────────────────────────────────────────┐
│                      POLICY VALIDATION                      │
│               (increasing standards over time)              │
└─────────────────────────────────────────────────────────────┘
        │
        ├── FAIL ──▶ Reject or revise
        └── PASS ──▶ PROMOTE to Catalogue/RAG
                           │
                           ▼
              ┌──────────────────────┐
              │    THREE-TIER RAG    │
              ├──────────────────────┤
              │ INTERNALIZED         │ ← In weights, no lookup needed
              │ (reflex/knowledge)   │
              ├──────────────────────┤
              │ DISCOVERED           │ ← Young Nyx has used
              │ (known_catalogue)    │
              ├──────────────────────┤
              │ HIDDEN               │ ← Available but not yet accessed
              │ (available_catalogue)│
              └──────────────────────┘
```
### Progressive Policy Validation
Policies increase in sophistication as Young Nyx matures:
| Week | Policy Tier | Validation |
|------|-------------|------------|
| **1-2** | **Basic Syntax** | Valid format, non-empty, has definition |
| **3-4** | **Semantic Quality** | Embeds without collapse, unique signature |
| **5-8** | **Topology Safety** | Doesn't corrupt anchor terms |
| **9-12** | **Cross-Reference** | Links resolve, no circular dependencies |
| **13+** | **Utility Validation** | Actually helped solve tasks |
| **20+** | **Internalization Gate** | Ready to train into weights |
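One way to encode the schedule is as validators that switch on by week. The predicate names and entry fields below are assumptions for illustration, not the actual policy code:

```python
# (week the tier comes into force, predicate over a catalogue entry)
POLICY_TIERS = [
    (1,  lambda e: bool(e.get("definition"))),          # basic syntax
    (3,  lambda e: e.get("embedding") is not None),     # semantic quality
    (5,  lambda e: not e.get("corrupts_anchor")),       # topology safety
    (9,  lambda e: all(e.get("links_resolved", []))),   # cross-reference
    (13, lambda e: e.get("task_utility", 0) > 0),       # utility validation
]

def validate(entry: dict, week: int) -> bool:
    """Apply every policy whose tier is active by this week."""
    return all(check(entry) for start, check in POLICY_TIERS if week >= start)
```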
### Three-Tier Knowledge State
```
┌──────────────────────────────────────────────┐
│ INTERNALIZED KNOWLEDGE                       │
│ (in weights - reflex or slow recall)         │
├──────────────────────────────────────────────┤
│ • "heartbeat" - reflex, instant              │
│ • "lifeforce" - knowledge, fast recall       │
│ • "grip_type_A" - reflex, motor pattern      │
│                                              │
│ Status: NO LOOKUP, high confidence           │
│ Metacognitive route: DIRECT                  │
└──────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ DISCOVERED KNOWLEDGE                         │
│ (known_catalogue - has accessed before)      │
├──────────────────────────────────────────────┤
│ • "phoebe" - used 15 times, 80% success      │
│ • "confidence_gradient" - used 8 times       │
│                                              │
│ Status: LOOKUP needed, medium confidence     │
│ Metacognitive route: CHECK CATALOGUE         │
└──────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ HIDDEN KNOWLEDGE                             │
│ (available_catalogue - exists but unused)    │
├──────────────────────────────────────────────┤
│ • "drift_probe" - never accessed             │
│ • "topology_gini" - never accessed           │
│                                              │
│ Status: Available for discovery              │
│ Metacognitive route: UNKNOWN (will discover) │
└──────────────────────────────────────────────┘
```
**State transitions:**
```
Hidden → retrieved → DISCOVERED (mark first access)
Discovered → used 10+ times successfully → FLAG for training
Flagged → trained + validated without RAG → INTERNALIZED
Internalized → fails validation → DEMOTE back to Discovered
```
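A sketch of those transitions as a function: the 10-use threshold comes from the rules above, while the entry fields are illustrative placeholders:

```python
def next_state(state: str, entry: dict) -> str:
    """Advance one knowledge entry through the three-tier lifecycle."""
    if state == "hidden" and entry.get("accessed"):
        return "discovered"                        # mark first access
    if state == "discovered" and entry.get("successful_uses", 0) >= 10:
        return "flagged"                           # candidate for training
    if state == "flagged" and entry.get("passed_without_rag"):
        return "internalized"                      # validated without RAG
    if state == "internalized" and entry.get("failed_validation"):
        return "discovered"                        # demote back
    return state
```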
---
## Measuring RAG Utility
### Decision Trails
Track every decision for learning:
```sql
CREATE TABLE decision_trails (
    id SERIAL PRIMARY KEY,
    task_id UUID,
    -- Routing decision
    initial_confidence FLOAT,        -- Before any lookup
    route_chosen TEXT,               -- 'reflex', 'knowledge', 'rag', 'escalate'
    -- RAG details (if used)
    rag_terms_retrieved TEXT[],      -- What RAG returned
    rag_terms_used TEXT[],           -- What appeared in solution
    -- Outcome
    outcome TEXT,                    -- 'success', 'fail', 'partial'
    final_confidence FLOAT,          -- After action
    -- Calibration
    was_confidence_accurate BOOLEAN, -- Did confidence predict outcome?
    -- Economics
    lifeforce_cost FLOAT,
    timestamp TIMESTAMPTZ DEFAULT NOW()
);
```
### Compute Utility Score
```python
MAX_EXPECTED_COST = 100.0  # normalization ceiling for lifeforce spend (illustrative)

def compute_decision_quality(trail):
    """
    Evaluate the quality of the metacognitive routing decision.
    """
    # Was the route appropriate?
    if trail.route_chosen == 'reflex' and trail.outcome == 'success':
        route_score = 1.0  # Fast and right
    elif trail.route_chosen == 'rag' and trail.outcome == 'success':
        route_score = 0.7  # Right but slow/expensive
    elif trail.route_chosen == 'reflex' and trail.outcome == 'fail':
        route_score = 0.0  # Overconfident disaster
    elif trail.route_chosen == 'rag' and trail.outcome == 'fail':
        route_score = 0.3  # At least asked, RAG failed
    else:
        route_score = 0.5  # 'knowledge'/'escalate' routes or 'partial' outcomes
    # Was confidence calibrated?
    calibration_score = 1.0 if trail.was_confidence_accurate else 0.0
    # Efficiency (did we waste resources?), clamped to [0, 1]
    efficiency = max(0.0, 1.0 - (trail.lifeforce_cost / MAX_EXPECTED_COST))
    return {
        'route_score': route_score,
        'calibration_score': calibration_score,
        'efficiency': efficiency,
        'total': 0.4 * route_score + 0.4 * calibration_score + 0.2 * efficiency
    }
```
### Reward Signal Flow
```python
for trail in decision_trails:
    quality = compute_decision_quality(trail)
    if quality['total'] > 0.8:
        # High quality decision → strengthen this pattern
        strengthen_token_path(trail.task_pattern, trail.route_chosen)
    if not trail.was_confidence_accurate:
        # Miscalibration → update confidence model
        recalibrate_confidence(
            task_type=trail.task_pattern,
            predicted=trail.initial_confidence,
            actual_success=trail.outcome == 'success'
        )
    if trail.route_chosen == 'rag' and quality['route_score'] > 0.7:
        # Successful RAG use → candidate for internalization
        flag_for_training(trail.rag_terms_used)
```
---
## Connection to Nervous System
The metacognitive router connects directly to the nervous system architecture:
```
              METACOGNITIVE ROUTER
                      │
      ┌───────────────┼───────────────┐
      │               │               │
      ▼               ▼               ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│ REFLEX     │  │ KNOWLEDGE  │  │ RAG        │
│ LAYER      │  │ LAYER      │  │ LOOKUP     │
│            │  │            │  │            │
│ Bypasses   │  │ Slow but   │  │ External   │
│ cognition  │  │ from       │  │ scaffold   │
│            │  │ weights    │  │            │
│ See:       │  │            │  │ See:       │
│ Nervous-   │  │            │  │ Catalogue  │
│ System.md  │  │            │  │ (this doc) │
└────────────┘  └────────────┘  └────────────┘
      │               │               │
      └───────────────┼───────────────┘
                      │
                      ▼
                   OUTPUT
                      │
                      ▼
                 VALIDATION
               ┌──────┴──────┐
               │             │
            SUCCESS         FAIL
               │             │
               ▼             ▼
          +V to path    -V to path
          (Hebbian)     + recalibrate
```
**Key insight:** The nervous system (Nervous-System.md) handles the REFLEX layer. This document handles the RAG layer. Both feed into the same metacognitive router.
---
## Lifeforce Economics
The RAG→Route→Validate cycle has economic costs:
| Action | Lifeforce Cost | Notes |
|--------|----------------|-------|
| Reflex response | ~0 | Essentially free, already in weights |
| Knowledge recall | Low | Some compute for retrieval from weights |
| RAG lookup | Medium | Vector search + context injection |
| Training run | High | Compute intensive |
| Validation | Medium | Inference cost |
| Failed cycle | Lost V | Training didn't take |
| Successful internalization | +V reward | She grew |
| Correct confidence calibration | +V reward | Metacognition improved |
**Incentive alignment:**
- Being right with high confidence → maximum reward (fast + correct)
- Being right with low confidence → small reward (correct but slow)
- Being wrong with high confidence → maximum penalty (dangerous)
- Asking when uncertain → small reward (correct routing)
This naturally optimizes for:
1. Fast reflexes for well-known patterns
2. Accurate confidence calibration
3. Appropriate RAG usage (not too much, not too little)
---
## What This System Teaches
1. **Know what you know** — Confidence calibration is trainable
2. **Know what to ask** — The skill of uncertainty
3. **Reflexes are earned** — Through successful internalization
4. **Scaffolds come off** — RAG is temporary
5. **Paths that work, strengthen** — Hebbian learning for retrieval
6. **Wrong confidence is worse than wrong answers** — Calibration matters
---
## Design Principles
1. **Metacognition first** — Route before retrieve
2. **Confidence is trainable** — Not fixed, learned through validation
3. **RAG is temporary** — Feeding window, not permanent store
4. **Validation is double** — With RAG, then without
5. **Token paths learn** — Hebbian strengthening through success
6. **Catalogue is oracle** — Ground truth for calibration
7. **Reflexes are earned** — Graduated from RAG through internalization
8. **Self-cleaning** — The system doesn't accumulate cruft
9. **Know when to ask** — More important than knowing answers
---
## The Analogy
Learning to drive:
```
LEARNER DRIVER:
"Should I check mirrors?"
├── Beginner: YES, always, consciously (RAG lookup)
├── Intermediate: Sometimes, when uncertain (metacognitive check)
└── Expert: Automatic, don't even think about it (reflex)
The goal isn't to memorize "check mirrors."
The goal is for mirror-checking to become invisible.
But FIRST she needs to learn WHEN she doesn't know.
The beginner who doesn't know to check mirrors is dangerous.
The intermediate who checks unnecessarily is slow.
The expert just does it.
We're training the progression:
Unknown unknowns → Known unknowns → Known knowns → Unconscious competence
        │                │               │                  │
   (dangerous)      (asks RAG)      (knowledge)         (reflex)
```
---
*She doesn't just retrieve. She doesn't just remember. She knows what she knows. And that changes everything.*
---
**Created**: 2025-12-05 (as RAG-as-Scaffold)
**Updated**: 2025-12-29 (renamed to Memory Gradient, added metacognitive routing, token path rewards, confidence calibration)
**Session**: Partnership dialogue (dafit + Chrysalis-Nyx)
**Status**: Core architectural concept
**Etymology**: "Memory Gradient" — knowledge exists on a continuous spectrum, not binary states. Aligns with Temporal-Ternary Gradient and Confidence Gradient.