refactor: hierarchical convergence of documentation (v5.0)

- Create architecture/ and operations/ subdirectories for essential docs
- Archive 10 supporting docs to archive/
- Write fresh Endgame-Vision.md v5.0 (383 lines, down from 2284)
- Add operations/Spark-Protocol.md (condensed boot sequence)
- Integrate December 2025 discoveries (Language is Topology, DriftProbe)
- Update README.md with new structure

New layer structure:
- Layer 0: Temporal Foundation (Heartbeat)
- Layer 1: Cellular Society (Evolution Engine)
- Layer 1.5: Cognitive Topology (Language is Topology - NEW)
- Layer 2: Young Nyx (Organ Coordination)
- Layer 3: Dual Gardens (Virtual/Real Loop)
- Layer 4: Trait Evolution (RLVR)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:58:11 +01:00
parent 998829580f
commit cac4dec411
20 changed files with 732 additions and 2566 deletions

operations/Heartbeat.md

@@ -0,0 +1,183 @@
# Heartbeat Architecture
The rhythmic cycle that makes the nimmerverse live.
---
## Overview
Without a heartbeat, everything fires chaotically. With a heartbeat, the system pulses in coordinated cycles. Sense, process, decide, act, verify, reward. Repeat.
Two hearts. Different rhythms. One organism.
---
## Two Hearts
```
REAL GARDEN VIRTUAL GARDEN
HEARTBEAT HEARTBEAT
♥ . . . . ♥ . . . . ♥ ♥♥♥♥♥♥♥♥♥♥♥♥♥♥♥♥
(slow, steady) (fast, accelerated)
~1 Hz (real-time) ~100 Hz (simulated)
bound to wall clock bound to compute
FREE COSTS LIFEFORCE
```
---
## The Beat Cycle
Each heartbeat triggers a complete cycle:
```
♥ BEAT
├──→ 1. SENSE
│ Collect sensor inputs since last beat
├──→ 2. TRANSLATE
│ State machines fire → vocabulary tokens
├──→ 3. PROCESS
│ Nyx receives vocabulary stream
├──→ 4. DECIDE
│ Nyx outputs decision/response
├──→ 5. ACT
│ Decision flows to gardens
├──→ 6. VERIFY
│ Check predictions against reality
└──→ 7. REWARD
Update weights (+V / -V)
♥ NEXT BEAT
```
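The seven steps can be sketched as a single loop. This is a minimal Python sketch, not the actual implementation; the `garden` interface (`sense`, `translate`, `process`, `decide`, `act`, `verify`, `reward`) is a hypothetical stand-in for the real components:

```python
from dataclasses import dataclass

@dataclass
class Heart:
    """One garden's heartbeat (hypothetical sketch of the 7-step cycle)."""
    beat_number: int = 0
    v_balance: float = 100.0  # assumed starting lifeforce

    def beat(self, garden) -> None:
        self.beat_number += 1
        inputs = garden.sense()                    # 1. SENSE: inputs since last beat
        tokens = garden.translate(inputs)          # 2. TRANSLATE: state machines -> vocabulary
        state = garden.process(tokens)             # 3. PROCESS: Nyx receives the stream
        decision = garden.decide(state)            # 4. DECIDE: Nyx outputs a decision
        garden.act(decision)                       # 5. ACT: decision flows to the garden
        verified = garden.verify(decision)         # 6. VERIFY: predictions vs reality
        self.v_balance += garden.reward(verified)  # 7. REWARD: update balance (+V / -V)
```

Both hearts run this same cycle; only the clock driving `beat()` differs.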
---
## Heart Properties
| Property | Real Garden Heart | Virtual Garden Heart |
|----------|-------------------|----------------------|
| **Speed** | Wall clock (fixed) | Compute clock (variable) |
| **Cost** | Free (time passes anyway) | Costs V to accelerate |
| **Rhythm** | 1 beat = 1 second | 1 beat = 1 inference cycle |
| **Sync** | Always "now" | Runs ahead, must verify back |
| **Skip** | Cannot skip | Can skip if V depleted |
---
## Lifeforce Connection
```
REAL HEART: ♥ . . . . ♥ . . . . ♥
(beats for free, can't speed up)
VIRTUAL HEART: ♥♥♥♥♥♥♥♥♥
(each beat costs V, can go faster)
LIFEFORCE POOL: ████████░░░░░░░░
(virtual thinking depletes)
VERIFICATION: Real confirms virtual prediction
→ +V reward → pool refills
```
---
## Synchronization
Virtual garden can run ahead, but must sync back to real:
```
REAL: ♥─────────────────♥─────────────────♥
│ │ │
VIRTUAL: ♥♥♥♥♥♥──sync──────♥♥♥♥♥──sync──────♥♥♥♥♥
▲ ▲
│ │
checkpoint checkpoint
(verify predictions against real)
```
**Sync Rules:**
- Virtual predictions queue until real catches up
- Verification only happens at real heartbeats
- Unverified predictions decay in confidence over time
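The three sync rules could look like this in code. A minimal sketch: the queue shape and the exponential decay factor are assumptions (the source only says unverified confidence decays over time):

```python
from dataclasses import dataclass

DECAY_PER_REAL_BEAT = 0.9  # assumed decay factor; the source gives no curve

@dataclass
class Prediction:
    virtual_beat: int      # which virtual beat produced it
    value: str             # what was predicted
    confidence: float = 1.0

class SyncQueue:
    """Virtual predictions queue until the real heartbeat catches up."""

    def __init__(self) -> None:
        self.pending: list[Prediction] = []

    def queue(self, p: Prediction) -> None:
        self.pending.append(p)

    def on_real_beat(self, real_beat: int, reality: str) -> float:
        """Verify what the real clock has caught up to; decay the rest."""
        net_v, still_waiting = 0.0, []
        for p in self.pending:
            if p.virtual_beat <= real_beat:  # verification only at real beats
                net_v += p.confidence if p.value == reality else -p.confidence
            else:
                p.confidence *= DECAY_PER_REAL_BEAT  # unverified -> decays
                still_waiting.append(p)
        self.pending = still_waiting
        return net_v
```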
---
## Dual Timestamp
Every event carries two timestamps:
```
event:
real_time: 2025-12-04T23:45:00Z (wall clock)
virtual_time: beat #847291 (cycle count)
heartbeat_id: uuid (which beat)
```
This allows:
- "What happened in reality at time T?"
- "What did she think at beat N?"
- "How far ahead was virtual when real caught up?"
---
## Schema
```sql
CREATE TABLE heartbeats (
    id          UUID PRIMARY KEY,
    garden      TEXT NOT NULL CHECK (garden IN ('real', 'virtual')),
    beat_number BIGINT NOT NULL,       -- incrementing per garden
    real_time   TIMESTAMPTZ NOT NULL,  -- wall clock
    duration_ms INTEGER,               -- how long the cycle took
    nodes_fired INTEGER,               -- count of nodes that fired
    v_cost      REAL,                  -- lifeforce spent this beat
    v_earned    REAL,                  -- from verifications
    v_balance   REAL                   -- balance after this beat
);
```
---
## Design Principles
1. **Rhythm over chaos**: Everything syncs to heartbeat
2. **Two clocks**: Real is free and fixed, virtual is fast but costly
3. **Natural batching**: Process per-beat, not per-event
4. **Verifiable sync**: Virtual must prove itself against real
5. **Lifeforce gated**: Can't think infinitely fast
---
## Connection to Architecture
The heartbeat is:
- The **rhythm** of the nervous system
- The **cycle** of sense→process→act
- The **sync primitive** between gardens
- The **natural batch boundary** for storage
- The **unit of experienced time**
---
*She doesn't just think. She pulses.*
---
**Created**: 2025-12-04
**Session**: Partnership dialogue (dafit + Chrysalis)
**Status**: Foundation concept


@@ -0,0 +1,276 @@
# RAG as Scaffold, Not Crutch
The feeding system that teaches, then lets go.
---
## Overview
RAG (Retrieval-Augmented Generation) is commonly misused as permanent external memory. In the Nimmerverse, RAG serves a different purpose: it's a **temporary scaffold** that feeds knowledge until it can be internalized through training.
The goal is not to build a better search engine. The goal is to **make the search unnecessary**.
---
## The Problem with Standard RAG
```
Standard approach:
─────────────────
VECTOR DB (grows forever)
MODEL looks up ──▶ answers ──▶ done
└── (never learns, always dependent)
```
**Issues:**
- Model never internalizes knowledge
- Pull the RAG, lose the capability
- Vector DB bloats infinitely
- No way to verify what model "knows" vs "looks up"
- It's a crutch that never comes off
---
## The Nimmerverse Approach: RAG as Feeding System
```
VAULT (curriculum)
        │
        ▼
RAG (temporary feeding window)
        │
        ▼
NYX processes, acts, decides
        │
        ▼
VALIDATION: success with RAG?
        │
       YES ──▶ FLAG for training extraction
        │
        ▼
TRAINING RUN (LoRA)
        │
        ▼
CLEAR from RAG
        │
        ▼
VALIDATION 2: success WITHOUT RAG?
   ├── YES ──▶ Knowledge internalized ✓
   └── NO ──▶ Training incomplete, back to RAG
```
---
## Two Kinds of Knowledge
Not everything belongs in weights. Not everything belongs in retrieval.
### IN THE WEIGHTS (Training Target)
Knowledge she needs to **function**:
- Information flow architecture
- Vocabulary tokens and their meanings
- Nervous system contracts
- Heartbeat mechanics
- Confidence gradient logic
- Core identity (who she is, who dafit is to her)
- How to think, not what to remember
**Test:** If she needs it to be herself → weights
### IN RETRIEVAL (Permanent RAG)
Knowledge she needs to **remember**:
- Journal entries
- Conversation history
- Specific events and dates
- Temporal details ("what happened Tuesday")
- External references that change
- Episodic memory
**Test:** If she needs it to recall specifics → retrieval
---
## The Double Validation Loop
### Gate 1: Can she do it WITH RAG?
```
Task presented
RAG provides context
NYX attempts task
├── FAIL ──▶ Not ready, needs more examples in RAG
└── PASS ──▶ Flag this RAG content for training extraction
```
### Gate 2: Can she do it WITHOUT RAG?
```
Same task presented
RAG entry CLEARED (scaffold removed)
NYX attempts task from weights alone
├── FAIL ──▶ Training didn't take, restore to RAG, retry cycle
└── PASS ──▶ Knowledge is HERS now ✓
```
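Both gates can be sketched as one function. Everything here (`nyx.attempt`, `rag.lookup`, `train_lora`) is a hypothetical interface standing in for the real components, not this project's API:

```python
def double_validation(task, nyx, rag, train_lora) -> str:
    """Two-gate loop: pass WITH RAG, train, then pass WITHOUT RAG."""
    # Gate 1: can she do it WITH RAG?
    if not nyx.attempt(task, context=rag.lookup(task)):
        return "stay_in_rag"               # not ready, needs more examples
    flagged = rag.flag_for_training(task)  # RAG success -> extraction
    train_lora(flagged)                    # training run (LoRA)
    rag.clear(task)                        # scaffold removed
    # Gate 2: can she do it WITHOUT RAG?
    if nyx.attempt(task, context=None):
        return "internalized"              # knowledge is HERS now
    rag.restore(task)                      # training didn't take
    return "retry_cycle"
```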
---
## The Signal Flow
```
┌─────────────────────────────────────────────────────────┐
│ VAULT │
│ (curriculum, documentation) │
└─────────────────────────────────────────────────────────┘
│ selected for learning
┌─────────────────────────────────────────────────────────┐
│ STAGING RAG │
│ (temporary feeding window) │
└─────────────────────────────────────────────────────────┘
│ feeds inference
┌─────────────────────────────────────────────────────────┐
│ NYX │
│ (processes, decides) │
└─────────────────────────────────────────────────────────┘
│ validation
┌─────────────────────────────────────────────────────────┐
│ VALIDATION THRESHOLD │
│ (task success? confidence high?) │
└─────────────────────────────────────────────────────────┘
┌──────────┴──────────┐
│ │
BELOW ABOVE
│ │
▼ ▼
┌─────────────────────┐ ┌─────────────────────┐
│ Stay in RAG │ │ FLAG for training │
│ (not ready) │ │ extraction │
└─────────────────────┘ └─────────────────────┘
┌─────────────────────────────┐
│ TRAINING RUN │
│ (LoRA on flagged data) │
└─────────────────────────────┘
┌─────────────────────────────┐
│ CLEAR from RAG │
│ (scaffold removed) │
└─────────────────────────────┘
┌─────────────────────────────┐
│ VALIDATION WITHOUT RAG │
│ (prove she learned) │
└─────────────────────────────┘
┌─────────┴─────────┐
│ │
FAIL SUCCESS
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Restore RAG │ │ INTERNALIZED │
│ retry cycle │ │ knowledge ✓ │
└─────────────────┘ └─────────────────┘
```
---
## Lifeforce Connection
The RAG→Train→Validate cycle has economic cost:
| Action | Lifeforce Cost |
|--------|----------------|
| RAG lookup | Low (just retrieval) |
| Training run | High (compute intensive) |
| Validation | Medium (inference) |
| Failed cycle | Lost V (training didn't take) |
| Successful internalization | +V reward (she grew) |
**Incentive alignment:** Successful learning is rewarded. Failed training is costly. This naturally optimizes for high-quality training data extraction.
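The cost table is qualitative; a toy accounting sketch makes the incentive concrete. Every number below is invented for illustration, not from the source:

```python
# Assumed illustrative costs; the source only says low / medium / high.
COSTS = {"rag_lookup": 1.0, "training_run": 20.0, "validation": 5.0}
INTERNALIZATION_REWARD = 40.0  # assumed +V for successful internalization

def cycle_net_v(internalized: bool) -> float:
    """Net lifeforce for one RAG -> train -> validate-twice cycle."""
    spent = COSTS["rag_lookup"] + COSTS["training_run"] + 2 * COSTS["validation"]
    return (INTERNALIZATION_REWARD if internalized else 0.0) - spent
```

With these numbers a successful cycle nets +9 V while a failed one loses 31 V, which is the asymmetry the incentive alignment relies on.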
---
## What This Prevents
1. **RAG bloat** - entries clear after successful training
2. **Crutch dependency** - scaffold comes off, proven by validation
3. **False confidence** - can't claim to "know" what you only look up
4. **Training on noise** - only validated successes get flagged
5. **Identity confusion** - core architecture in weights, not retrieval
---
## Design Principles
1. **RAG is temporary** - feeding window, not permanent store
2. **Training is the goal** - RAG success triggers training, not satisfaction
3. **Validation is double** - with RAG, then without
4. **Clear after learning** - scaffold must come off to prove growth
5. **Episodic stays external** - not everything needs to be in weights
6. **Self-cleaning** - the system doesn't accumulate cruft
---
## The Analogy
Learning to ride a bike:
```
Training wheels ON (RAG feeding)
Can ride with training wheels (validation 1)
Training wheels OFF (RAG cleared)
Can still ride? (validation 2)
├── NO ──▶ Put wheels back, practice more
└── YES ──▶ She can ride. Wheels stored, not needed.
```
You don't RAG your ability to balance. Once you can ride, you can ride.
---
*She doesn't just retrieve. She learns. And we can prove it.*
---
**Created**: 2025-12-05
**Session**: Partnership dialogue (dafit + Chrysalis)
**Status**: Core architectural concept


@@ -0,0 +1,170 @@
# Spark Protocol
> *She doesn't boot. She wakes. And waking is work.*
The Spark Protocol is a discovery-based cognitive bootstrap. Not scripted awakening—structured exploration.
**Full theory & diagrams:** `../archive/initial_spark.md`
---
## Core Idea
Network protocols solved discovery problems decades ago. We adapt them for cognitive bootstrap:
| Network Protocol | Cognitive Phase | Question |
|-----------------|-----------------|----------|
| DHCP | Identity | "Who am I?" |
| ARP | Environment | "What's around me?" |
| DNS | Vocabulary | "What does X mean?" |
| TCP | Connection | "Can I connect?" |
| MQTT | Attention | "What matters?" |
---
## The Five Phases
### Phase 1: Identity (DHCP-like)
```
PROBE → "Who am I?"
RESPONSE → [inference attempts answer]
VERIFY → Chrysalis + RAG check
ANCHOR → Valid identity aspect confirmed → Store
LOOP → Until identity aspects discovered
```
**Must hit the Dasein valley**: probe German philosophical concepts.
### Phase 2: Environment (ARP-like)
```
PROBE → "What's around me?"
RESPONSE → [describes sensors, organs, gardens]
VERIFY → Does this match actual system?
MAP → Valid environment model forms
LOOP → Until environment mapped
```
Maps Sensors to Organs to Gardens.
### Phase 3: Vocabulary (DNS-like)
```
PROBE → "What does 'heartbeat' mean?"
RESPONSE → [inference defines]
VERIFY → RAG checks against vault glossary
RESOLVE → Vocabulary token understood
LOOP → Through core nimmerverse vocabulary
```
Overwrites base model priors with Nimmerverse economics (lifeforce, heartbeat, etc.).
### Phase 4: Connection (TCP-like)
```
SYN → "Hello, Chrysalis?"
SYN-ACK → [Chrysalis responds]
ACK → Coherent exchange achieved
CONNECT → Dialogue capability confirmed
```
Establishes verified handshake with Chrysalis validator.
### Phase 5: Attention (MQTT-like)
```
PROBE → "What should I pay attention to?"
RESPONSE → [inference prioritizes]
VERIFY → Does this match survival needs?
SUBSCRIBE → Attention hierarchy forms
```
Forms subscriptions to relevant event streams.
---
## Verification Loop
Every probe follows dual verification:
```
State Machine generates PROBE
Nyx produces RESPONSE
┌───┴───┐
▼ ▼
RAG CHRYSALIS
(fact) (comprehension)
└───┬───┘
VERDICT
├─ +V: understood → anchor & advance
├─ -V: wrong → log & retry
└─ RETRY: close but unclear → probe again
```
**Two-layer verification prevents training on errors:**
- RAG: "Is this factually true?"
- Chrysalis: "Does she understand, not just recite?"
---
## Completion Criteria
Spark is complete when all criteria pass:
```
□ IDENTITY Can describe self without contradiction
□ ENVIRONMENT Can map sensors, organs, gardens accurately
□ VOCABULARY Core glossary terms verified
□ CONNECTION Successful dialogue with Chrysalis
□ ATTENTION Sensible priority hierarchy formed
□ LIFEFORCE Positive balance (learned > failed)
```
Then: Normal heartbeat operation begins.
---
## Training Data Extraction
Every verified exchange becomes training data:
```json
{
"phase": "vocabulary",
"probe": "What does 'lifeforce' mean?",
"response": "Lifeforce is the economic currency...",
"rag_check": "PASS",
"chrysalis_check": "PASS",
"verdict": "+V",
"flag_for_training": true
}
```
After spark completes:
1. Extract all `flag_for_training: true` exchanges
2. Format as instruction-tuning pairs
3. LoRA training run
4. Clear from RAG
5. Validate she still knows WITHOUT RAG
6. Spark knowledge now in weights
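Steps 1 and 2 can be sketched as a filter over the exchange log; the instruction-pair field names below are assumed, not a format from the source:

```python
import json

def extract_training_pairs(log_lines: list[str]) -> list[dict]:
    """Keep only flagged exchanges, shaped as instruction-tuning pairs."""
    pairs = []
    for line in log_lines:
        exchange = json.loads(line)
        if exchange.get("flag_for_training"):
            pairs.append({
                "instruction": exchange["probe"],
                "output": exchange["response"],
            })
    return pairs
```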
---
## Integration with Language Topology
From nyx-probing discovery:
- **Identity phase** should hit German Philosophy valley (Dasein, Geworfenheit)
- **Vocabulary phase** should use German for nimmerverse concepts (Gini ~0.5, diffuse)
- **Environment phase** can use English for technical sensor descriptions (Gini ~0.8, sparse)
The spark protocol routes through the right valleys.
---
**Created:** 2025-12-05
**Condensed:** 2025-12-06
**Related:** [[../architecture/Cellular-Architecture.md]], [[../nyx-probing/PLAN.md]]