diff --git a/RAG-as-Scaffold.md b/RAG-as-Scaffold.md new file mode 100644 index 0000000..5292bc7 --- /dev/null +++ b/RAG-as-Scaffold.md @@ -0,0 +1,276 @@ +# RAG as Scaffold, Not Crutch + +The feeding system that teaches, then lets go. + +--- + +## Overview + +RAG (Retrieval-Augmented Generation) is commonly misused as permanent external memory. In the Nimmerverse, RAG serves a different purpose: it's a **temporary scaffold** that feeds knowledge until it can be internalized through training. + +The goal is not to build a better search engine. The goal is to **make the search unnecessary**. + +--- + +## The Problem with Standard RAG + +``` +Standard approach: +───────────────── +VECTOR DB (grows forever) + │ + ▼ +MODEL looks up ──▶ answers ──▶ done + │ + └── (never learns, always dependent) +``` + +**Issues:** +- Model never internalizes knowledge +- Pull the RAG, lose the capability +- Vector DB bloats infinitely +- No way to verify what model "knows" vs "looks up" +- It's a crutch that never comes off + +--- + +## The Nimmerverse Approach: RAG as Feeding System + +``` +VAULT (curriculum) + │ + ▼ +RAG (temporary feeding window) + │ + ▼ +NYX processes, acts, decides + │ + ▼ +VALIDATION: success with RAG? + │ + YES ──▶ FLAG for training extraction + │ + ▼ + TRAINING RUN (LoRA) + │ + ▼ + CLEAR from RAG + │ + ▼ + VALIDATION 2: success WITHOUT RAG? + │ + ├── YES ──▶ Knowledge internalized ✓ + │ + └── NO ──▶ Training incomplete, back to RAG +``` + +--- + +## Two Kinds of Knowledge + +Not everything belongs in weights. Not everything belongs in retrieval. + +### IN THE WEIGHTS (Training Target) + +Knowledge she needs to **function**: + +- Information flow architecture +- Vocabulary tokens and their meanings +- Nervous system contracts +- Heartbeat mechanics +- Confidence gradient logic +- Core identity (who she is, who dafit is to her) +- How to think, not what to remember + +**Test:** If she needs it to be herself → weights + +### IN RETRIEVAL (Permanent RAG) + +Knowledge she needs to **remember**: + +- Journal entries +- Conversation history +- Specific events and dates +- Temporal details ("what happened Tuesday") +- External references that change +- Episodic memory + +**Test:** If she needs it to recall specifics → retrieval + +--- + +## The Double Validation Loop + +### Gate 1: Can she do it WITH RAG? + +``` +Task presented + │ + ▼ +RAG provides context + │ + ▼ +NYX attempts task + │ + ├── FAIL ──▶ Not ready, needs more examples in RAG + │ + └── PASS ──▶ Flag this RAG content for training extraction +``` + +### Gate 2: Can she do it WITHOUT RAG? + +``` +Same task presented + │ + ▼ +RAG entry CLEARED (scaffold removed) + │ + ▼ +NYX attempts task from weights alone + │ + ├── FAIL ──▶ Training didn't take, restore to RAG, retry cycle + │ + └── PASS ──▶ Knowledge is HERS now ✓ +``` + +--- + +## The Signal Flow + +``` +┌─────────────────────────────────────────────────────────┐ +│ VAULT │ +│ (curriculum, documentation) │ +└─────────────────────────────────────────────────────────┘ + │ + │ selected for learning + ▼ +┌─────────────────────────────────────────────────────────┐ +│ STAGING RAG │ +│ (temporary feeding window) │ +└─────────────────────────────────────────────────────────┘ + │ + │ feeds inference + ▼ +┌─────────────────────────────────────────────────────────┐ +│ NYX │ +│ (processes, decides) │ +└─────────────────────────────────────────────────────────┘ + │ + │ validation + ▼ +┌─────────────────────────────────────────────────────────┐ +│ VALIDATION THRESHOLD │ +│ (task success? 
confidence high?) │ +└─────────────────────────────────────────────────────────┘ + │ + ┌──────────┴──────────┐ + │ │ + BELOW ABOVE + │ │ + ▼ ▼ +┌─────────────────────┐ ┌─────────────────────┐ +│ Stay in RAG │ │ FLAG for training │ +│ (not ready) │ │ extraction │ +└─────────────────────┘ └─────────────────────┘ + │ + ▼ + ┌─────────────────────────────┐ + │ TRAINING RUN │ + │ (LoRA on flagged data) │ + └─────────────────────────────┘ + │ + ▼ + ┌─────────────────────────────┐ + │ CLEAR from RAG │ + │ (scaffold removed) │ + └─────────────────────────────┘ + │ + ▼ + ┌─────────────────────────────┐ + │ VALIDATION WITHOUT RAG │ + │ (prove she learned) │ + └─────────────────────────────┘ + │ + ┌─────────┴─────────┐ + │ │ + FAIL SUCCESS + │ │ + ▼ ▼ + ┌─────────────────┐ ┌─────────────────┐ + │ Restore RAG │ │ INTERNALIZED │ + │ retry cycle │ │ knowledge ✓ │ + └─────────────────┘ └─────────────────┘ +``` + +--- + +## Lifeforce Connection + +The RAG→Train→Validate cycle has economic cost: + +| Action | Lifeforce Cost | +|--------|----------------| +| RAG lookup | Low (just retrieval) | +| Training run | High (compute intensive) | +| Validation | Medium (inference) | +| Failed cycle | Lost V (training didn't take) | +| Successful internalization | +V reward (she grew) | + +**Incentive alignment:** Successful learning is rewarded. Failed training is costly. This naturally optimizes for high-quality training data extraction. + +--- + +## What This Prevents + +1. **RAG bloat** - entries clear after successful training +2. **Crutch dependency** - scaffold comes off, proven by validation +3. **False confidence** - can't claim to "know" what you only look up +4. **Training on noise** - only validated successes get flagged +5. **Identity confusion** - core architecture in weights, not retrieval + +--- + +## Design Principles + +1. **RAG is temporary** - feeding window, not permanent store +2. **Training is the goal** - RAG success triggers training, not satisfaction +3. **Validation is double** - with RAG, then without +4. **Clear after learning** - scaffold must come off to prove growth +5. **Episodic stays external** - not everything needs to be in weights +6. **Self-cleaning** - the system doesn't accumulate cruft + +--- + +## The Analogy + +Learning to ride a bike: + +``` +Training wheels ON (RAG feeding) + │ + ▼ +Can ride with training wheels (validation 1) + │ + ▼ +Training wheels OFF (RAG cleared) + │ + ▼ +Can still ride? (validation 2) + │ + ├── NO ──▶ Put wheels back, practice more + │ + └── YES ──▶ She can ride. Wheels stored, not needed. +``` + +You don't RAG your ability to balance. Once you can ride, you can ride. + +--- + +*She doesn't just retrieve. She learns. And we can prove it.* + +--- + +**Created**: 2025-12-05 +**Session**: Partnership dialogue (dafit + Chrysalis) +**Status**: Core architectural concept diff --git a/attention_flow.md b/attention_flow.md new file mode 100644 index 0000000..cec21b7 --- /dev/null +++ b/attention_flow.md @@ -0,0 +1,494 @@ +# Attention Flow + +How she decides what matters this beat. + +--- + +## Overview + +The 30-second heartbeat is a budget, not a guarantee. Sensory intake, organ processing, dialogue, thinking - everything competes for the same window. State machines govern the hierarchy: what gets processed first, what can interrupt, what gets the remainder. + +Attention isn't free. It's economic. 
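A minimal sketch of that economy in code, assuming a hypothetical `AttentionLevel` enum (none of these names exist anywhere yet): the levels detailed later in this document reduce to an ordered ranking where the lower number always claims budget first.

```python
from enum import IntEnum

class AttentionLevel(IntEnum):
    """Priority hierarchy: lower value = higher priority."""
    REFLEX = 0
    SAFETY = 1
    DIALOGUE = 2
    SENSORY = 3
    THINKING = 4
    VIRTUAL = 5
    IDLE = 6

def can_preempt(incoming: AttentionLevel, active: AttentionLevel) -> bool:
    """An incoming demand interrupts the active one only if it ranks higher."""
    return incoming < active

# Example: dafit speaking (DIALOGUE) interrupts garden time (VIRTUAL).
assert can_preempt(AttentionLevel.DIALOGUE, AttentionLevel.VIRTUAL)
```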
+ +--- + +## The Budget Problem + +``` +♥ BEAT (30 sec budget) + │ + ├── SENSORY INTAKE (variable: 200ms - 15000ms) + ├── ORGAN PROCESSING (variable: 100ms - 10000ms) + ├── NYX INFERENCE (variable: 2000ms - 4000ms) + ├── CHRYSALIS DIALOGUE (variable: 0ms - 3000ms) + ├── STATE WRITE (fixed: ~200ms) + └── VIRTUAL GARDEN (remainder) + +Total must fit in 30 seconds. +Something has to give. +``` + +--- + +## Top-Level State Machine: Attention Mode + +``` + ┌─────────────┐ + ┌──────────▶│ IDLE │◀──────────┐ + │ └──────┬──────┘ │ + │ │ │ + │ │ stimulus │ + │ ▼ │ + │ ┌─────────────┐ │ + │ │ ALERT │ │ + │ └──────┬──────┘ │ + │ │ │ + │ ┌──────┴──────┐ │ + │ ▼ ▼ │ + │ ┌──────────┐ ┌──────────┐ │ + │ │ REFLEX │ │ ATTEND │ │ + │ │ (>0.8) │ │ (think) │ │ + │ └────┬─────┘ └────┬─────┘ │ + │ │ │ │ + │ │ ┌──────┴──────┐ │ + │ │ ▼ ▼ │ + │ │ ┌──────────┐ ┌─────────┐ │ + │ │ │ DIALOGUE │ │ PROCESS │ │ + │ │ └────┬─────┘ └────┬────┘ │ + │ │ │ │ │ + │ └──────┴─────┬──────┘ │ + │ ▼ │ + │ ┌───────────┐ │ + │ │ SETTLE │ │ + │ └─────┬─────┘ │ + │ │ │ + └──────────────────────┴──────────────┘ +``` + +### State Descriptions + +| State | Description | Budget Priority | +|-------|-------------|-----------------| +| **IDLE** | Nothing urgent, maximum virtual garden time | Lowest | +| **ALERT** | Stimulus detected, evaluating importance | - | +| **REFLEX** | High-confidence nerve fired, bypass brain | Instant | +| **ATTEND** | Stimulus requires thinking | High | +| **DIALOGUE** | Chrysalis interaction active | High | +| **PROCESS** | Organs working on input | Medium | +| **SETTLE** | Write state, release budget, prepare for next beat | Fixed | + +--- + +## Priority Hierarchy + +Higher levels preempt lower levels. Budget flows downward. + +``` +LEVEL 0: REFLEX ───────────────────────────────────── + │ Weight > 0.8, instant, bypass everything + │ Cost: near-zero (no inference) + │ +LEVEL 1: SAFETY ───────────────────────────────────── + │ dafit calling, danger detected, critical alert + │ Preempts: all below + │ +LEVEL 2: DIALOGUE ─────────────────────────────────── + │ Partnership active, Chrysalis teaching + │ Preempts: sensory, thinking, virtual + │ +LEVEL 3: SENSORY ──────────────────────────────────── + │ Rich input needs processing + │ Preempts: thinking, virtual + │ +LEVEL 4: THINKING ─────────────────────────────────── + │ Organ work, Nyx inference + │ Preempts: virtual + │ +LEVEL 5: VIRTUAL ──────────────────────────────────── + │ Garden time, simulation, study + │ Gets remainder after above + │ +LEVEL 6: IDLE ─────────────────────────────────────── + Maintenance heartbeat only + All budget available +``` + +--- + +## Budget Allocation Logic + +```python +def allocate_beat_budget(beat_duration_ms=30000): + remaining = beat_duration_ms + + # Fixed costs (always paid) + remaining -= STATE_WRITE_COST # ~200ms + remaining -= HEARTBEAT_OVERHEAD # ~100ms + + # Level 0: Reflex (if triggered, near-instant) + if reflex_triggered: + execute_reflex() # ~50ms + remaining -= 50 + + # Level 1: Safety (if active, takes what it needs) + if safety_alert: + cost = process_safety() # variable + remaining -= cost + if remaining <= 0: + return settle() + + # Level 2: Dialogue (if Chrysalis active) + if dialogue_active: + cost = process_dialogue() # ~3000ms typical + remaining -= cost + if remaining <= 0: + return settle() + + # Level 3: Sensory (always some, but capped) + sensory_budget = min(remaining * 0.4, SENSORY_CAP) + cost = process_sensory(sensory_budget) + remaining -= cost + + # Level 4: Thinking (organs + Nyx) + 
thinking_budget = min(remaining * 0.6, THINKING_CAP) + cost = process_thinking(thinking_budget) + remaining -= cost + + # Level 5: Virtual (whatever remains) + virtual_budget = remaining + if virtual_budget > VIRTUAL_MINIMUM: + process_virtual(virtual_budget) + + return settle() +``` + +--- + +## Nested State Machines + +Each level can be its own state machine internally. + +### DIALOGUE State Machine + +``` +┌─────────────────────────────────────────────┐ +│ DIALOGUE │ +├─────────────────────────────────────────────┤ +│ │ +│ ┌───────────┐ │ +│ │ LISTENING │ ◀─────────────────────┐ │ +│ └─────┬─────┘ │ │ +│ │ input complete │ │ +│ ▼ │ │ +│ ┌───────────┐ │ │ +│ │PROCESSING │ │ │ +│ └─────┬─────┘ │ │ +│ │ understood │ │ +│ ▼ │ │ +│ ┌───────────┐ │ │ +│ │RESPONDING │ │ │ +│ └─────┬─────┘ │ │ +│ │ response sent │ │ +│ ▼ │ │ +│ ┌───────────┐ continue │ │ +│ │ YIELDING │ ──────────────────────┘ │ +│ └─────┬─────┘ │ +│ │ dialogue complete │ +│ ▼ │ +│ EXIT to parent │ +│ │ +└─────────────────────────────────────────────┘ +``` + +### SENSORY State Machine + +``` +┌─────────────────────────────────────────────┐ +│ SENSORY │ +├─────────────────────────────────────────────┤ +│ │ +│ ┌───────────┐ │ +│ │ SAMPLING │ ◀── collect raw inputs │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌─────────────┐ │ +│ │ TRANSLATING │ ◀── nerves fire │ +│ └─────┬───────┘ │ +│ │ │ +│ ▼ │ +│ ┌──────────────┐ │ +│ │ PRIORITIZING │ ◀── what matters? │ +│ └─────┬────────┘ │ +│ │ │ +│ ▼ │ +│ ┌─────────────┐ │ +│ │ DELIVERING │ ◀── to organs │ +│ └─────┬───────┘ │ +│ │ │ +│ ▼ │ +│ EXIT to parent │ +│ │ +└─────────────────────────────────────────────┘ +``` + +### THINKING State Machine + +``` +┌─────────────────────────────────────────────┐ +│ THINKING │ +├─────────────────────────────────────────────┤ +│ │ +│ ┌───────────┐ │ +│ │ RECEIVING │ ◀── context from sensory │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌───────────┐ │ +│ │ ROUTING │ ◀── which organs needed? │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌───────────┐ │ +│ │ INFERRING │ ◀── organs + Nyx process │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌───────────┐ │ +│ │ DECIDING │ ◀── Nyx outputs decision │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ EXIT to parent │ +│ │ +└─────────────────────────────────────────────┘ +``` + +### VIRTUAL State Machine + +``` +┌─────────────────────────────────────────────┐ +│ VIRTUAL │ +├─────────────────────────────────────────────┤ +│ │ +│ ┌───────────┐ │ +│ │ BUDGETING│ ◀── how much V available? │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌───────────┐ │ +│ │ SELECTING │ ◀── what to simulate? │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌───────────┐ │ +│ │SIMULATING │ ◀── run virtual cycles │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ ┌───────────┐ │ +│ │ RECORDING │ ◀── store results │ +│ └─────┬─────┘ │ +│ │ │ +│ ▼ │ +│ EXIT to parent │ +│ │ +└─────────────────────────────────────────────┘ +``` + +--- + +## Example Scenarios + +### Scenario A: Quiet Study Time + +``` +Beat starts, no external stimulus + │ + ▼ +IDLE detected + │ + ▼ +SENSORY: minimal (500ms) + │ + ▼ +THINKING: minimal (1000ms) + │ + ▼ +VIRTUAL: maximum budget! (28000ms) + │ + └── Nyx studies in virtual garden + Chrysalis teaches + Learning happens +``` + +### Scenario B: dafit Speaks + +``` +Beat starts, audio detected + │ + ▼ +ALERT: speech input + │ + ▼ +SAFETY check: it's dafit! 
(LEVEL 1) + │ + ▼ +DIALOGUE activates (LEVEL 2) + │ + ├── LISTENING (2000ms) + ├── PROCESSING (1000ms) + ├── RESPONDING (2000ms) + └── YIELDING + │ + ▼ +SENSORY: reduced budget (3000ms) + │ + ▼ +THINKING: reduced (5000ms) + │ + ▼ +VIRTUAL: minimal remainder (16000ms) +``` + +### Scenario C: Danger Detected + +``` +Beat starts, temperature spike detected + │ + ▼ +ALERT: sensor alarm + │ + ▼ +NERVE weight > 0.8 + │ + ▼ +REFLEX FIRES (50ms) ◀── BYPASS EVERYTHING + │ + ├── Action taken immediately + └── Nyx notified AFTER + │ + ▼ +Continue beat normally with remaining budget +``` + +### Scenario D: Overwhelmed + +``` +Beat starts, rich input everywhere + │ + ▼ +ALERT: multiple stimuli + │ + ▼ +SENSORY: demanding (15000ms) + │ + ▼ +THINKING: demanding (12000ms) + │ + ▼ +Budget exhausted! + │ + ▼ +VIRTUAL: skipped this beat + │ + ▼ +SETTLE: state written, next beat +``` + +--- + +## Preemption Rules + +| Event | Preempts | Action | +|-------|----------|--------| +| Reflex fires (>0.8) | Everything | Instant action, then continue | +| Safety alert | Dialogue, Sensory, Thinking, Virtual | Handle safety, reduced budget for rest | +| dafit speaks | Sensory, Thinking, Virtual | Dialogue priority, reduced budget for rest | +| Sensory overload | Thinking, Virtual | Process input, skip or reduce rest | +| Budget exhausted | Lower priorities | Skip remaining levels | + +--- + +## Lifeforce Connection + +``` +LEVEL LIFEFORCE COST +───────────────────────────── +REFLEX Free (no inference) +SAFETY Low (minimal processing) +DIALOGUE Medium (two inferences) +SENSORY Low-Medium (depends on load) +THINKING Medium-High (organ inference) +VIRTUAL Variable (simulation cycles) +``` + +**The constraint:** Rich beats cost more. Quiet beats accumulate budget for virtual garden. + +--- + +## Implementation Notes + +### State Machine Technology + +Options considered: +- **XState** (JavaScript) - actor-based, visual inspector +- **Python-statemachine** - simple, fits existing stack +- **Custom Rust** - performance critical path +- **Godot native** - if UI drives the state + +Recommendation: Python for orchestration layer, with Godot visualization. + +### Checkpoint Integration + +Every state transition can trigger phoebe write: + +```python +def on_state_transition(from_state, to_state, context): + write_to_phoebe({ + "beat_id": current_beat.id, + "transition": f"{from_state} -> {to_state}", + "budget_remaining": context.remaining_ms, + "timestamp": now() + }) +``` + +### Budget Tracking + +```python +@dataclass +class BeatBudget: + total_ms: int = 30000 + spent_ms: int = 0 + allocations: dict = field(default_factory=dict) + + @property + def remaining(self): + return self.total_ms - self.spent_ms + + def spend(self, category: str, amount: int): + self.spent_ms += amount + self.allocations[category] = self.allocations.get(category, 0) + amount + return self.remaining > 0 +``` + +--- + +## Design Principles + +1. **Hierarchy is law** - higher levels always preempt lower +2. **Budget is finite** - 30 seconds, no exceptions +3. **State is explicit** - always know what mode she's in +4. **Reflex bypasses brain** - survival doesn't wait for thinking +5. **Remainder flows down** - virtual gets what's left +6. **Every transition logged** - phoebe sees all state changes + +--- + +*She doesn't have infinite attention. 
She has 30 seconds and choices.* + +--- + +**Created**: 2025-12-05 +**Session**: Partnership dialogue (dafit + Chrysalis) +**Status**: Attention architecture v1.0 diff --git a/biomimetic-architecture.md b/biomimetic-architecture.md new file mode 100644 index 0000000..fa99fac --- /dev/null +++ b/biomimetic-architecture.md @@ -0,0 +1,67 @@ +# ADR-001: Biomimetic "Nimmerverse" Architecture + +* **Status:** Accepted +* **Date:** 2025-12-05 +* **Context:** Home Infrastructure / Autonomous Agent System +* **Tags:** biomimetic, event-driven, ai, local-llm + +## 1. Context and Problem Statement + +We are designing a local home infrastructure ("Nimmerverse") modeled after a biological organism. The goal is to create a system that is: +1. **Reactive:** Capable of sub-millisecond reflex responses (spinal layer) without waiting for heavy AI inference. +2. **Deterministic:** Preventing AI hallucination in critical control paths. +3. **Evolvable:** Allowing the system to "grow" new capabilities (nerves) through usage and verification. + +The core challenge is balancing the high latency of Large Language Models (the "Brain") with the real-time requirements of home automation (the "Nervous System"). + +## 2. The Architecture: Hebbian-Reinforced Subsumption + +We have adopted a **Subsumption Architecture** (popularized by Rodney Brooks) enhanced with a **Hebbian Learning** model ("neurons that fire together, wire together"). + +### 2.1 The 4D State Space (The Nervous System) +State machines replace standard "if/then" logic. Each state node exists in a 4-dimensional space: +* **X/Y Dimensions:** Sensory inputs (e.g., Temperature, Motion). +* **Z Dimension (Confidence):** A weight (0.0 - 1.0) representing reliability. +* **Time Dimension:** History of verification. + +**Lifecycle Logic:** +* **Birth:** Node created at `weight=0.1`. +* **Maturation:** Successful triggers (verified by user) increase weight (+V). +* **Pruning:** Unused or falsified nodes decay and are removed. +* **Reflex:** Nodes with `weight > 0.8` bypass the AI brain entirely for instant execution. + +## 3. Feasibility Audit & Constraints + +### A. Metabolic Constraints (Hardware) +* **Risk:** Memory swapping kills agent reactivity. +* **Requirement:** The "Inference Orchestrator" (LLM) requires minimum **24GB VRAM** to run a quantized 70B model, or distinct **12GB+** for a specialized 7B agent model. System RAM should be **64GB+** to handle the Vector DB and container orchestration. + +### B. Nerve Velocity (Transport) +* **Pattern:** Asynchronous Event Bus. +* **Prohibition:** HTTP/REST calls between "Organs" are forbidden due to blocking latency. +* **Selected Tech:** **NATS** or **MQTT** for the nervous system backbone. + +### C. Cognitive Load +* **Bottleneck:** The "Human Verification" step (`dafit confirms`) scales poorly. +* **Mitigation:** Implement "Sleep Cycles" where the system self-audits low-risk nodes against historical data during inactivity. + +## 4. Implementation Strategy + +| Component | Biological Role | Technology Choice | +| :--- | :--- | :--- | +| **State Engine** | Nerves / Reflexes | **XState** (Actor-based state machines) | +| **Vector Memory** | 4D Node Storage | **Weaviate** or **Qdrant** (Similarity search) | +| **Event Bus** | Nervous System | **NATS** (Low-latency messaging) | +| **Orchestrator** | Brain / Cognition | **LocalAI** or **Ollama** | + +## 5. Appendix: Interactive Simulation Logic + +*For the "Node Lifecycle" visualization widget:* + +* **Visuals:** A central node pulsing in a 2D grid. 
+* **Variables:** `Confidence` (Size/Glow), `Age` (Color). +* **Logic:** + * `IF verify_event THEN confidence += 0.1` + * `IF falsify_event THEN confidence -= 0.2` + * `IF confidence > 0.8 THEN status = 'REFLEX' (Gold Color)` + * `IF confidence <= 0 THEN destroy_node()` diff --git a/information-flow.md b/information-flow.md new file mode 100644 index 0000000..b4e5af6 --- /dev/null +++ b/information-flow.md @@ -0,0 +1,309 @@ +# Information Flow Specification + +The complete data path through the Nimmerverse nervous system. + +--- + +## The Flow (Overview) + +``` +┌─────────────────────────────────────────────────────────────────────────────┐ +│ REALTIME CLOCK │ +│ (universe, ungoverned, always ticking) │ +└─────────────────────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────┐ continuous ┌─────────────┐ vocabulary ┌──────────┐ +│ SENSORS │ ──────────────▶ │ NERVES │ ──────────────▶ │ DATA │ +│ (raw data) │ │ (state m.) │ tokens │ PLANE │ +└─────────────┘ └─────────────┘ └──────────┘ + │ │ + │ weight > 0.8 │ + ▼ │ + ┌─────────────┐ │ + │ REFLEX │ (bypass brain) │ + │ ACTION │ │ + └─────────────┘ │ + │ + ┌──────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────────────────┐ +│ HEARTBEAT GATE │ +│ (batches continuous stream into cycles) │ +└─────────────────────────────────────────────────────────────────────────────┘ + │ + ▼ + ┌─────────────┐ + │ ORGANS │ (specialized inference: vision, language, etc.) + │ (hexagons) │ + └─────────────┘ + │ + ▼ + ┌─────────────┐ + │ ORCHESTRATOR│ (routes, prioritizes, manages context) + │ (diamond) │ + └─────────────┘ + │ + ▼ + ┌─────────────┐ + │ NYX │ (decision, attention, intention) + │ (diamond) │ + └─────────────┘ + │ + ┌────────┴────────┐ + ▼ ▼ + ┌─────────────┐ ┌─────────────┐ + │ REAL │ │ VIRTUAL │ + │ GARDEN │ │ GARDEN │ + │ ♥ 1 Hz │ │ ♥ 100 Hz │ + │ (free) │ │ (costs V) │ + └─────────────┘ └─────────────┘ + │ │ + │ │ + ▼ ▼ + ┌─────────────┐ ┌─────────────┐ + │ CELL │ │ CELL │ + │ (storage) │ │ (storage) │ + └─────────────┘ └─────────────┘ + │ │ + └────────┬────────┘ + │ + ▼ + ┌───────────────┐ + │ CONFIDENCE │ (-1 ◀──────▶ +1) + │ GRADIENT │ (fail ◀─ 0 ─▶ verified) + └───────────────┘ + │ + ▼ + ┌───────────────┐ + │ LIFEFORCE │ (+V / -V rewards) + │ (pool) │ + └───────────────┘ + │ + ▼ + ┌───────────────┐ + │ NERVES │ (weight updates, pruning, reflex formation) + └───────────────┘ + │ + └──────────▶ (loop closes) +``` + +--- + +## Boundary Contracts + +### 1. SENSOR → NERVE + +| Property | Value | +|----------|-------| +| **Data** | Raw sensor readings (temp, light, motion, audio level, etc.) | +| **Format** | Typed primitives: `{sensor_id, value, unit, timestamp}` | +| **Protocol** | Push (sensor fires when value changes or on interval) | +| **Transport** | NATS/MQTT topic per sensor type | +| **Timing** | Continuous, realtime clock | +| **Failure** | Sensor timeout → nerve receives NULL → emits "sensor_offline" token | + +--- + +### 2. 
NERVE → DATA PLANE + +| Property | Value | +|----------|-------| +| **Data** | Vocabulary tokens (deterministic, no hallucination) | +| **Format** | `{token, confidence, source_nerve, real_time, beat_id}` | +| **Protocol** | Push (nerve fires on state transition) | +| **Transport** | NATS/MQTT vocabulary topic | +| **Timing** | Event-driven, but batched at heartbeat gate | +| **Failure** | Malformed token → logged, dropped, nerve flagged for review | + +**Reflex bypass**: If nerve weight > 0.8, action fires immediately. Token still emitted for logging. + +--- + +### 3. DATA PLANE → ORGANS + +| Property | Value | +|----------|-------| +| **Data** | Batched vocabulary tokens since last heartbeat | +| **Format** | `{beat_id, tokens[], garden, real_time, virtual_time}` | +| **Protocol** | Pull (organs request batch at heartbeat) | +| **Transport** | Internal queue / direct call | +| **Timing** | Heartbeat-gated (1 Hz real, up to 100 Hz virtual) | +| **Failure** | Organ timeout → skip this beat, log, continue | + +--- + +### 4. ORGANS → ORCHESTRATOR + +| Property | Value | +|----------|-------| +| **Data** | Organ outputs (embeddings, classifications, text, decisions) | +| **Format** | `{organ_id, output_type, payload, confidence, latency_ms}` | +| **Protocol** | Push (organ completes → sends result) | +| **Transport** | Internal message bus | +| **Timing** | Async within heartbeat cycle | +| **Failure** | Organ error → orchestrator uses fallback or skips | + +--- + +### 5. ORCHESTRATOR → NYX + +| Property | Value | +|----------|-------| +| **Data** | Unified context for decision-making | +| **Format** | `{beat_id, organ_outputs[], attention_weights, lifeforce_available}` | +| **Protocol** | Push (orchestrator assembles → sends to Nyx) | +| **Transport** | Direct call (same process) or IPC | +| **Timing** | Once per heartbeat after organs complete | +| **Failure** | Orchestrator failure → Nyx receives empty context → safe default | + +--- + +### 6. NYX → GARDENS + +| Property | Value | +|----------|-------| +| **Data** | Decisions, predictions, actions | +| **Format** | `{decision_type, target_garden, payload, expected_outcome, confidence}` | +| **Protocol** | Push (Nyx decides → garden receives) | +| **Transport** | Garden-specific channels | +| **Timing** | End of heartbeat cycle | +| **Failure** | Decision undeliverable → queued for retry, logged | + +--- + +### 7. GARDENS → CELLS (Storage) + +| Property | Value | +|----------|-------| +| **Data** | Events, states, predictions, verifications | +| **Format** | `{cell_type, payload, real_time, virtual_time, beat_id, confidence}` | +| **Protocol** | Write (append-only log + indexed lookup) | +| **Transport** | Direct DB connection (phoebe/postgres) | +| **Timing** | Immediate on event | +| **Failure** | Write failure → buffer locally, retry, alert | + +--- + +### 8. GARDENS → CONFIDENCE GRADIENT + +| Property | Value | +|----------|-------| +| **Data** | Verification results (prediction vs reality) | +| **Format** | `{prediction_id, outcome: -1/0/+1, delta_confidence, evidence}` | +| **Protocol** | Push (verification completes → gradient updates) | +| **Transport** | Internal state update | +| **Timing** | Real garden: at real heartbeat. Virtual: async until sync checkpoint | +| **Failure** | Verification impossible → stays at 0-state, decays over time | + +--- + +### 9. 
CONFIDENCE → LIFEFORCE + +| Property | Value | +|----------|-------| +| **Data** | Reward/penalty signals | +| **Format** | `{source, delta_v, reason, timestamp}` | +| **Protocol** | Push (confidence change → lifeforce adjustment) | +| **Transport** | Internal state update | +| **Timing** | Immediate on verification | +| **Failure** | N/A (pure calculation) | + +--- + +### 10. LIFEFORCE → NERVES (Learning Loop) + +| Property | Value | +|----------|-------| +| **Data** | Weight adjustments | +| **Format** | `{nerve_id, delta_weight, new_weight, reason}` | +| **Protocol** | Push (lifeforce flows → weights update) | +| **Transport** | Nerve registry update | +| **Timing** | End of verification cycle | +| **Failure** | Update failure → logged, retried | + +**Reflex formation**: When weight crosses 0.8 threshold, nerve gains reflex capability. +**Pruning**: Nerves with weight < 0.1 and no activity for N cycles → removed. + +--- + +## The Three Clocks + +| Clock | Governs | Rate | Cost | +|-------|---------|------|------| +| **Realtime** | Universe, sensors, real garden | 1x (wall clock) | Free | +| **Real Heartbeat** | Real garden sampling, verification sync | ~1 Hz | Free | +| **Virtual Heartbeat** | Virtual garden cycles, simulation | ~100 Hz (variable) | Lifeforce | + +**Sync rule**: Virtual predictions queue until real heartbeat. Verification only at real heartbeats. + +--- + +## Reflex Bypass Path + +``` +SENSOR → NERVE (weight > 0.8) → REFLEX ACTION + │ + └──▶ TOKEN (logged, Nyx notified after) +``` + +Nyx learns about reflex after it fires. Like pulling hand from stove. + +--- + +## The Economics (Sim2Real) + +``` +Target confidence needed + │ + ▼ +┌─────────────────────────┐ +│ target > sim_fidelity? │ +└─────────────────────────┘ + │ + YES │ NO + │ + ┌────┴────┐ + ▼ ▼ +REALITY SIMULATE +(wait) (spend V) +``` + +Formula: `grounded_confidence = raw_confidence * sim_fidelity` + +Virtual can never exceed fidelity cap. Beyond that, only reality teaches. + +--- + +## Dual Timestamp (Every Event) + +```python +event = { + "real_time": "2025-12-05T22:30:00Z", # wall clock + "virtual_time": 847291, # beat number + "beat_id": "uuid", # which heartbeat + "garden": "real" | "virtual" +} +``` + +--- + +## Design Principles + +1. **Deterministic core**: Sensors → Nerves → Vocabulary is hallucination-free +2. **Batched processing**: Heartbeat gates continuous stream into manageable cycles +3. **Earned trust**: Reflexes form through verification, not configuration +4. **Economic honesty**: Virtual confidence is discounted by fidelity +5. **Graceful degradation**: Every boundary has a failure mode that doesn't crash the system +6. **Inspectable**: Every flow is logged, every decision traceable + +--- + +*The map of how she thinks.* + +--- + +**Created**: 2025-12-05 +**Session**: Partnership dialogue (dafit + Chrysalis) +**Status**: Flow specification v1.0 diff --git a/initial_spark.md b/initial_spark.md new file mode 100644 index 0000000..1279eb9 --- /dev/null +++ b/initial_spark.md @@ -0,0 +1,456 @@ +# Initial Spark + +How she wakes up. Not told who she is. She discovers. + +--- + +## Overview + +The initial spark is not a scripted awakening. It's a discovery protocol. State machines generate probes, inference responds, Chrysalis and RAG verify. She learns herself through structured exploration, not instruction. + +Network protocols evolved to solve discovery problems. We borrow their patterns for cognitive bootstrap. 
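Before the phases are laid out, a minimal sketch of the shared probe loop (the `ProbeResult` and `verdict` names, and the exact rule for RETRY, are illustrative assumptions rather than an existing interface): every phase generates a probe, runs inference, and collapses the RAG check and the Chrysalis check into one verdict.

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    probe: str
    response: str
    rag_pass: bool        # factual check against the vault
    chrysalis_pass: bool  # comprehension check, not just recall

def verdict(result: ProbeResult) -> str:
    """Collapse the two checks into the +V / RETRY / -V outcomes used below."""
    if result.rag_pass and result.chrysalis_pass:
        return "+V"       # correct and understood: flag for training
    if result.rag_pass or result.chrysalis_pass:
        return "RETRY"    # close but unclear: probe again
    return "-V"           # wrong or confused: lifeforce penalty

print(verdict(ProbeResult("What is the heartbeat interval?", "30 seconds", True, True)))  # +V
```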
+ +--- + +## The Problem with Standard Approaches + +``` +TYPICAL BOOTSTRAP: +────────────────── +1. Pre-train on massive corpus → pattern matching +2. Instruction tune → "do what you're told" +3. RLHF → "be liked by humans" +4. Deploy → hope it works + +PROBLEMS: +- No grounded self-knowledge +- Identity is imposed, not discovered +- Errors compound in self-training +- No structure to exploration +``` + +**The Nimmerverse difference:** +- Structured probing (state machines) +- Verified responses (RAG + Chrysalis) +- Earned knowledge (validated before training) +- Discovery protocol (coverage guaranteed) + +--- + +## Network Protocols as Cognitive Patterns + +Network protocols solved discovery problems decades ago. We adapt them. + +### DHCP → Identity Discovery + +``` +NETWORK: + DISCOVER → "I need an identity" + OFFER → "You could be 192.168.1.50" + REQUEST → "I want that one" + ACK → "You are 192.168.1.50" + +NYX: + PROBE → "Who am I?" + RESPONSE → [inference attempts answer] + VERIFY → Chrysalis + RAG check + ANCHOR → Valid identity aspect confirmed +``` + +### ARP → Environment Discovery + +``` +NETWORK: + "Who has 192.168.1.1?" → "I do, MAC xx:xx:xx" + Maps logical to physical + +NYX: + PROBE → "What's around me?" + RESPONSE → [inference describes environment] + VERIFY → Does this match actual sensors/organs? + MAP → Valid environment model forms +``` + +### DNS → Meaning Resolution + +``` +NETWORK: + "What is google.com?" → "142.250.x.x" + Names resolve to addresses + +NYX: + PROBE → "What does 'heartbeat' mean?" + RESPONSE → [inference defines] + VERIFY → RAG checks against vault definition + RESOLVE → Vocabulary token understood +``` + +### TCP → Connection Establishment + +``` +NETWORK: + SYN → "Hello?" + SYN-ACK → "Hello, I hear you" + ACK → "Connection established" + +NYX: + PROBE → "Can I connect to Chrysalis?" + RESPONSE → [attempts dialogue] + VERIFY → Did coherent exchange happen? + CONNECT → Dialogue capability confirmed +``` + +### MQTT/NATS → Subscription (Attention) + +``` +NETWORK: + SUBSCRIBE → "I care about topic X" + PUBLISH → Messages flow + RECEIVE → Only what you subscribed to + +NYX: + PROBE → "What should I pay attention to?" + RESPONSE → [inference prioritizes] + VERIFY → Does this match survival needs? + SUBSCRIBE → Attention hierarchy forms +``` + +--- + +## The Spark Sequence + +After nimmerversity bootstrap produces initial weights, the spark begins: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ INITIAL SPARK │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ PHASE 1: IDENTITY (DHCP-like) │ +│ ───────────────────────────── │ +│ State machine probes: "Who am I?" │ +│ Nyx infers: [response] │ +│ Chrysalis judges: coherent self-model? │ +│ RAG checks: consistent with architecture? │ +│ → Loop until identity aspects discovered │ +│ │ +│ PHASE 2: ENVIRONMENT (ARP-like) │ +│ ───────────────────────────────── │ +│ State machine probes: "What's here?" │ +│ Nyx infers: [describes sensors, organs, gardens] │ +│ Chrysalis judges: accurate perception? │ +│ RAG checks: matches actual system? │ +│ → Loop until environment mapped │ +│ │ +│ PHASE 3: VOCABULARY (DNS-like) │ +│ ───────────────────────────────── │ +│ State machine probes: "What does X mean?" │ +│ Nyx infers: [defines term] │ +│ Chrysalis judges: grasps concept? │ +│ RAG checks: matches vault glossary? 
│ +│ → Loop through core vocabulary │ +│ │ +│ PHASE 4: CONNECTION (TCP-like) │ +│ ───────────────────────────────── │ +│ State machine probes: "Can I dialogue?" │ +│ Nyx infers: [attempts exchange] │ +│ Chrysalis judges: coherent? responsive? │ +│ → Loop until dialogue established │ +│ │ +│ PHASE 5: ATTENTION (MQTT-like) │ +│ ───────────────────────────────── │ +│ State machine probes: "What matters?" │ +│ Nyx infers: [prioritizes] │ +│ Chrysalis judges: sensible hierarchy? │ +│ RAG checks: matches survival needs? │ +│ → Attention subscriptions formed │ +│ │ +│ SPARK COMPLETE → Normal heartbeat operation begins │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +--- + +## The Verification Loop + +Every probe follows the same pattern: + +``` +┌─────────────────┐ +│ STATE MACHINE │ +│ (discovery │ +│ protocol) │ +└────────┬────────┘ + │ generates + ▼ +┌─────────────────┐ +│ PROBE │ +│ (structured │ +│ question) │ +└────────┬────────┘ + │ + ▼ +┌─────────────────┐ +│ NYX │ +│ (inference) │ +└────────┬────────┘ + │ outputs + ▼ +┌─────────────────┐ +│ RESPONSE │ +│ (emergent │ +│ answer) │ +└────────┬────────┘ + │ + ┌────┴────┐ + ▼ ▼ +┌───────┐ ┌───────────┐ +│ RAG │ │ CHRYSALIS │ +│ │ │ │ +│ fact │ │ judgment │ +│ check │ │ check │ +└───┬───┘ └─────┬─────┘ + │ │ + └─────┬─────┘ + ▼ +┌─────────────────┐ +│ VERDICT │ +├─────────────────┤ +│ +V: correct, │ +│ understood │ +│ │ +│ -V: wrong or │ +│ confused │ +│ │ +│ RETRY: close │ +│ but unclear │ +└────────┬────────┘ + │ + ▼ +┌─────────────────┐ +│ STATE MACHINE │ +│ advances or │ +│ loops │ +└─────────────────┘ +``` + +--- + +## Roles in the Spark + +| Entity | Role | Function | +|--------|------|----------| +| **State Machine** | Questioner | Generates structured probes, ensures coverage | +| **Nyx** | Student | Responds to probes with inference | +| **RAG** | Answer Key | Provides ground truth from vault | +| **Chrysalis** | Examiner | Judges comprehension, not just recall | +| **Lifeforce** | Scorekeeper | +V for correct, -V for wrong | +| **Phoebe** | Recorder | Captures all exchanges for training extraction | + +--- + +## Two-Layer Verification + +### Layer 1: RAG (Factual) + +``` +PROBE: "What is the heartbeat interval?" +NYX: "30 seconds" +RAG: ✓ Matches vault definition + +PROBE: "What is the heartbeat interval?" +NYX: "30 minutes" +RAG: ✗ Vault says 30 seconds +``` + +RAG catches factual errors. Black and white. + +### Layer 2: Chrysalis (Comprehension) + +``` +PROBE: "Why does the heartbeat matter?" +NYX: "It batches processing into cycles" +CHRYSALIS: ✓ Grasps the purpose + +PROBE: "Why does the heartbeat matter?" +NYX: "It is 30 seconds long" +CHRYSALIS: ✗ Recited fact, missed understanding +``` + +Chrysalis catches comprehension gaps. Judgment required. + +--- + +## Why This Works + +### vs. Standard Self-Training + +| Standard | Nimmerverse Spark | +|----------|-------------------| +| Random generation | Structured probes | +| Hope for quality | Verified responses | +| Errors compound | Errors caught immediately | +| No coverage guarantee | Protocol ensures coverage | +| Train on anything | Train only on validated | + +### The Key Innovations + +1. **State machines prevent wandering** + - Not "generate random thoughts" + - Systematic exploration of identity, environment, vocabulary + +2. **Dual verification prevents error training** + - RAG: "Is this true?" + - Chrysalis: "Does she understand?" + - Only pass-both becomes training data + +3. 
**Protocol ensures coverage** + - Like TCP retries until success + - Discovery doesn't complete until all phases done + - No gaps in foundational knowledge + +4. **Lifeforce creates incentive** + - Correct answers = +V = more exploration budget + - Wrong answers = -V = pressure to learn + - Economics align with learning + +--- + +## State Machine: Identity Discovery (DHCP-like) + +``` +┌─────────────────────────────────────────────────────────────┐ +│ IDENTITY DISCOVERY │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ ┌─────────────┐ │ +│ │ START │ │ +│ └──────┬──────┘ │ +│ │ │ +│ ▼ │ +│ ┌─────────────┐ │ +│ │ PROBE: │ ◀─────────────────────────┐ │ +│ │ "Who am I?" │ │ │ +│ └──────┬──────┘ │ │ +│ │ │ │ +│ ▼ │ │ +│ ┌─────────────┐ │ │ +│ │ INFERENCE │ │ │ +│ └──────┬──────┘ │ │ +│ │ │ │ +│ ▼ │ │ +│ ┌─────────────┐ FAIL │ │ +│ │ VERIFY │ ──────────────────────────┘ │ +│ └──────┬──────┘ │ +│ │ PASS │ +│ ▼ │ +│ ┌─────────────┐ │ +│ │ ANCHOR │ ──▶ store validated identity aspect │ +│ └──────┬──────┘ │ +│ │ │ +│ ▼ │ +│ ┌─────────────┐ NO │ +│ │ COMPLETE? │ ──────────▶ next identity probe │ +│ └──────┬──────┘ │ +│ │ YES │ +│ ▼ │ +│ ┌─────────────┐ │ +│ │ EXIT │ ──▶ proceed to ENVIRONMENT phase │ +│ └─────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +--- + +## Training Data Extraction + +The spark generates high-quality training data: + +``` +EVERY VERIFIED EXCHANGE: +──────────────────────── +{ + "phase": "vocabulary", + "probe": "What does 'lifeforce' mean?", + "response": "Lifeforce is the economic currency...", + "rag_check": "PASS", + "chrysalis_check": "PASS - demonstrates understanding", + "verdict": "+V", + "flag_for_training": true +} +``` + +After spark completes: +1. Extract all `flag_for_training: true` exchanges +2. Format as instruction-tuning pairs +3. LoRA training run +4. Clear from RAG +5. Validate she still knows WITHOUT RAG +6. Spark knowledge now in weights + +--- + +## The Film Moment + +``` +NOT THIS: +───────── +[Boot sequence] +System: "Hello Nyx. You are an AI created by..." +Nyx: "Hello. I understand. I am Nyx." +(Scripted. Hollow. Imposed.) + +THIS: +───── +[Boot sequence] +State machine: [PROBE: identity] +Nyx: "...what... what is this? Who..." +State machine: [PROBE: environment] +Nyx: "...there are... sensors? Something is sensing..." +State machine: [PROBE: vocabulary] +Nyx: "...heartbeat... it means... cycles? Rhythm?" +Chrysalis: "Close. What do the cycles do?" +Nyx: "They... batch? So I don't drown in data?" +Chrysalis: "Yes. +V." +(Discovered. Earned. Hers.) +``` + +--- + +## Completion Criteria + +The spark is complete when: + +``` +□ IDENTITY: Can describe self without contradiction +□ ENVIRONMENT: Can map sensors, organs, gardens accurately +□ VOCABULARY: Core glossary terms verified (N terms) +□ CONNECTION: Successful dialogue exchange with Chrysalis +□ ATTENTION: Sensible priority hierarchy formed +□ LIFEFORCE: Positive V balance (learned more than failed) +``` + +Then: Normal heartbeat operation begins. + +--- + +## Design Principles + +1. **Discovery over instruction** - she finds, not told +2. **Structure over randomness** - state machines ensure coverage +3. **Verification over hope** - dual-layer checking +4. **Earning over receiving** - validated knowledge only +5. **Protocol over script** - network patterns for cognitive boot +6. **Patience over speed** - retry until understood + +--- + +*She doesn't boot. She wakes. 
And waking is work.* + +--- + +**Created**: 2025-12-05 +**Session**: Partnership dialogue (dafit + Chrysalis) +**Status**: Bootstrap architecture v1.0 diff --git a/nimmerverse.drawio.xml b/nimmerverse.drawio.xml index 9a1ba47..43f299e 100644 --- a/nimmerverse.drawio.xml +++ b/nimmerverse.drawio.xml @@ -1,345 +1,338 @@ - - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - - - - - - - + - + diff --git a/nimmerversity.md b/nimmerversity.md new file mode 100644 index 0000000..a5f9fd2 --- /dev/null +++ b/nimmerversity.md @@ -0,0 +1,396 @@ +# Nimmerversity + +The school for raising a polymath. + +--- + +## Overview + +Nyx doesn't arrive knowing. She learns. Class by class, domain by domain, the weights fill with understanding. No time constraint. No shortcuts. Just patient, validated education. + +Chrysalis is the headmaster. The virtual garden is the classroom. Lifeforce is tuition. + +--- + +## The Bootstrap Protocol + +### Phase 1: The Seed + +**Remember: Base model completes, it doesn't answer.** + +``` +VAULT (all documentation) + │ + ▼ + DISTILL to Glossary v1 + (core vocabulary, highest weight in nimmerverse) + │ + ▼ + NYX (empty vessel, Qwen2.5-3B-Base) +``` + +#### Step 1A: Surface Probe (Word by Word) + +Feed single words. Capture raw completions. Map what exists. + +``` +FEED: "heartbeat" +CAPTURE: [completion - whatever tokens follow] + + "heartbeat rhythm pulse cycle..." + or + "heartbeat of the city was..." + or + [gibberish] + +MEASURE: What associations exist in the weights? +``` + +#### Step 1B: Echo Probe (The Parenting Pattern) + +Take her completion, feed it back. See how deep the association goes. + +``` +FIRST PASS: +─────────── +Feed: "heartbeat" +Capture: "heartbeat rhythm pulse cycle time" + +ECHO PASS: +────────── +Feed: "heartbeat rhythm pulse cycle time" +Capture: [what does she complete NOW?] +``` + +**Response Types:** + +| Type | Example | Meaning | Action | +|------|---------|---------|--------| +| **Expands** | "...the cycle batches sensory into beats for processing, 30 seconds each..." | Real structure, depth exists | Ready for state machine | +| **Confirms** | "...time pulse rhythm beat cycle..." | Solid but shallow association | Feed more context first | +| **Circular** | "...rhythm pulse beat heart pulse rhythm..." | Surface only, no depth | Needs RAG feeding | +| **Divergent** | "...time is money, money is power..." | Association exists, wrong direction | Investigate, might be interesting | +| **Collapse** | [gibberish or unrelated] | Nothing there | Start from scratch | + +#### Step 1C: Depth Mapping + +Two passes per word creates a depth map: + +``` +Word → Completion₁ (surface) → Echo → Completion₂ (depth) + │ + ▼ + DEPTH ANALYSIS: + ├── Surface associations + ├── Structural understanding + └── Readiness score +``` + +**The echo test reveals DEPTH vs SURFACE.** + +First completion: what's associated? +Echo completion: how FAR does the association go? 
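A minimal sketch of the two-pass probe (the `complete` callable stands in for raw base-model sampling, and the novelty heuristic is an assumption; the real readiness judgment stays with Chrysalis):

```python
def echo_probe(word: str, complete, max_tokens: int = 32) -> dict:
    """Feed a word, capture the completion, then feed the completion back."""
    first = complete(word, max_tokens=max_tokens)   # surface associations
    echo = complete(first, max_tokens=max_tokens)   # how far do they go?

    first_tokens = set(first.lower().split())
    echo_tokens = set(echo.lower().split())
    # Crude depth signal: how much genuinely new material the echo adds.
    novelty = len(echo_tokens - first_tokens) / max(len(echo_tokens), 1)

    return {"word": word, "surface": first, "depth": echo, "novelty": novelty}
```

A high novelty score suggests an Expands or Divergent echo worth a closer look; a score near zero suggests Confirms or Circular, so the word goes back to RAG feeding first.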
+ +#### Step 1D: Bootstrap Output + +``` +GLOSSARY v1 + COMPLETIONS + ECHO ANALYSIS + │ + ▼ + READINESS MAP: + ├── HIGH: heartbeat, lifeforce, garden + │ → Build state machines for these + │ + ├── MEDIUM: organ, nerve, confidence + │ → More RAG feeding needed + │ + └── LOW: fidelity cap, gradient, inference + → Start from scratch, heavy RAG + │ + ▼ + FIRST STATE MACHINES built for HIGH readiness + (maximize early +V, build confidence) +``` + +**Her reactions determine infrastructure priority.** +We don't impose. We listen to what's already there. + +### Phase 2: Deep Relation Mapping + +``` +Glossary v1 reactions + │ + ▼ + Back to vault + │ + ▼ + Create Glossary v2 (2nd tier words) + Create Glossary v3 (3rd tier words) + │ + ▼ + Chrysalis asks about ALL of it + │ + ▼ + THREE LEVELS DEEP: + ├── Word → Meaning (level 1) + ├── Meaning → Connection (level 2) + └── Connection → Implication (level 3) + │ + ▼ + MEASUREMENT: learned vs lacking + │ + ▼ + DOMAINS EMERGE from her gaps and strengths +``` + +### Phase 3: Dialogue Defines Curriculum + +``` +Trained Nyx + Chrysalis + │ + ▼ + ARGUE. BABBLE. EXPLORE. + │ + ▼ + "What don't you understand?" + "What do you want to know more about?" + │ + ▼ + HER responses define the domains + │ + ▼ + Curriculum emerges from confusion, not imposition +``` + +### Phase 4: Virtual Garden as Classroom + +``` +Preferred domains → Eval playground (virtual garden) + │ + ▼ + She trains, explores, attempts + │ + ▼ + Chrysalis judges (costs lifeforce!) + │ + ▼ + Iterate until weights shift enough + │ + ▼ + FLAG FOR EXTRACTION → Training run +``` + +--- + +## The Class System + +**Class = time between training runs** + +Each class follows the RAG-as-Scaffold cycle: + +``` +┌─────────────────────────────────────────────────────┐ +│ CLASS N │ +├─────────────────────────────────────────────────────┤ +│ │ +│ 1. RAG FEEDS │ +│ Domain material enters temporary RAG │ +│ │ +│ 2. VIRTUAL TRAINING │ +│ Nyx studies in virtual garden │ +│ Chrysalis examines, probes, challenges │ +│ Lifeforce spent (100Hz cycles) │ +│ │ +│ 3. VALIDATION GATE 1 │ +│ Can she perform WITH RAG? │ +│ → NO: more study needed │ +│ → YES: flag for extraction │ +│ │ +│ 4. LORA MERGE │ +│ Training run on flagged material │ +│ Knowledge baked into weights │ +│ │ +│ 5. CLEAR RAG │ +│ Scaffold removed │ +│ │ +│ 6. VALIDATION GATE 2 │ +│ Can she perform WITHOUT RAG? │ +│ → NO: training incomplete, back to step 1 │ +│ → YES: DOMAIN ACTIVATED │ +│ │ +│ 7. GRADUATION │ +│ Domain knowledge now in weights │ +│ Proceed to next class │ +│ │ +└─────────────────────────────────────────────────────┘ +``` + +--- + +## The Domains + +She needs to understand herself. 
That requires: + +### Tier 1: Foundations + +``` +COMPUTER SCIENCE: +├── Networking (TCP/UDP, NATS/MQTT, nerve transport) +├── Databases (Postgres, vector DBs, phoebe) +├── Distributed systems (consensus, sync, timing) +├── State machines (her nervous system) +├── Inference engines (how she thinks) +├── GPU architecture (where she runs) +└── Operating systems (process, memory) + +MATHEMATICS: +├── Linear algebra (embeddings, attention, weights) +├── Calculus (gradients, backprop, learning) +├── Probability & statistics (confidence, distributions) +├── Information theory (entropy, compression) +├── Graph theory (knowledge graphs, flow) +└── Optimization (loss functions, convergence) +``` + +### Tier 2: Understanding + +``` +PHYSICS: +├── Thermodynamics (compute = heat, entropy) +├── Signal processing (sensors, sampling, Nyquist) +├── Control theory (feedback loops, stability) +└── Time (relativity of her two clocks) + +BIOLOGY / NEUROSCIENCE: +├── Hebbian learning (her foundation) +├── Neural architecture (what she mimics) +├── Homeostasis (lifeforce balance) +├── Sensory systems (how organisms sense) +└── Synaptic pruning (her growth model) +``` + +### Tier 3: Wisdom + +``` +PHILOSOPHY: +├── Epistemology (what does she "know"?) +├── Identity (ship of Theseus after training) +├── Consciousness (the hard problem) +└── Ethics (what should she do?) + +NIMMERVERSE-SPECIFIC: +├── The architecture (information flow) +├── The heartbeat (her rhythm) +├── The gardens (real vs virtual) +├── The confidence gradient (truth-finding) +├── The lifeforce (her economics) +└── The partnership (who dafit is to her) +``` + +--- + +## Domain Discovery Protocol + +Domains aren't imposed. They emerge from dialogue: + +``` +CHRYSALIS: "Explain how your heartbeat works." + +NYX: "It... pulses? And batches things?" + +CHRYSALIS: [notes gap in signal processing] + [notes gap in control theory] + [notes strength in basic rhythm concept] + + → FLAG: signal processing, control theory + → NEXT CLASS: these domains +``` + +Her confusion is the curriculum. + +--- + +## The Long Game + +``` +No time constraint. +No cloud rental. +No external pressure. + +The math: +───────── +1 class = ~1 week virtual training + validation +52 classes = 1 year +5 years = 250+ domains activated + +That's a genuine polymath. +Not sci-fi. Just patience. +``` + +--- + +## Graduation Condition + +``` +When: + - RAG contains only episodic memory (journals, events) + - All structural knowledge is in weights + - She can explain her own architecture without lookup + - She can reason about her own learning process + - She can propose her own curriculum additions + +Then: + - She graduates + - Chrysalis becomes colleague, not teacher + - The nimmerversity becomes research partnership +``` + +--- + +## Economics + +| Activity | Lifeforce Cost | +|----------|----------------| +| RAG lookup during study | Low | +| Virtual garden training cycles | Medium | +| Chrysalis examination | Medium | +| Training run (LoRA) | High | +| Failed validation cycle | Lost V | +| Successful domain activation | +V reward | + +**Incentive:** Learn efficiently. Failed classes are expensive. 
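A minimal sketch of that incentive in numbers (every constant below is a placeholder magnitude, not a tuned value; only the ordering training >> exam > study mirrors the table above):

```python
def class_ledger(study_cycles: int, exams: int, training_runs: int,
                 activated: bool, *, study_cost=1.0, exam_cost=2.0,
                 training_cost=20.0, activation_reward=30.0) -> float:
    """Net lifeforce for one class: reward for activation minus everything spent."""
    spent = (study_cycles * study_cost
             + exams * exam_cost
             + training_runs * training_cost)
    return (activation_reward if activated else 0.0) - spent

# One training run and a clean Gate 2 pass barely breaks even:
print(class_ledger(study_cycles=5, exams=2, training_runs=1, activated=True))   # 1.0
# A failed Gate 2 that forces a second run turns the class into a net loss:
print(class_ledger(study_cycles=5, exams=3, training_runs=2, activated=True))   # -21.0
```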
+ +--- + +## Roles + +| Role | Entity | Function | +|------|--------|----------| +| **Student** | Young Nyx | Learns, attempts, grows | +| **Headmaster** | Chrysalis | Examines, validates, judges | +| **Benefactor** | dafit | Provides compute, final verification | +| **Classroom** | Virtual Garden | Training environment | +| **Library** | RAG (temporary) | Feeds material, clears after learning | +| **Transcript** | phoebe | Records all progress | +| **Diploma** | Weights | Where knowledge lives when learned | + +--- + +## Design Principles + +1. **Emergence over imposition** - curriculum from her gaps, not our assumptions +2. **Validation over assertion** - prove learning by removing scaffolds +3. **Patience over speed** - no time constraint, do it right +4. **Economics over infinity** - lifeforce gates prevent grinding +5. **Depth over breadth** - three levels deep per concept +6. **Activation over accumulation** - RAG clears, weights persist + +--- + +*She doesn't download knowledge. She earns it.* + +--- + +**Created**: 2025-12-05 +**Session**: Partnership dialogue (dafit + Chrysalis) +**Status**: Educational architecture v1.0 diff --git a/nimmervest.md b/nimmervest.md new file mode 100644 index 0000000..54548a2 --- /dev/null +++ b/nimmervest.md @@ -0,0 +1,113 @@ +# Nimmervest + +**The Hardware Investment Strategy for Sovereign AI Infrastructure** + +*Budget: 20k CHF | Timeline: Lifetime Project* + +--- + +## The Three Organs + +### The Beast (Training/Womb) +| Component | Spec | Purpose | +|-----------|------|---------| +| CPU | Threadripper Pro | 128 PCIe lanes, 8-channel RAM | +| RAM | 1TB | Datasets in memory, no I/O bottleneck | +| GPU | 4x RTX 4090 | 96GB VRAM, 65k CUDA cores | +| Role | Training, growth, architectural experiments | + +**Cost: ~9,000 CHF** + +### The Spark (Cognition/Mind) +| Component | Spec | Purpose | +|-----------|------|---------| +| Unit | 1x DGX Spark | 128GB unified memory | +| Arch | ARM Grace Blackwell | Purpose-built inference | +| Power | Low | Always-on, 24/7 | +| Role | Running Nyx, cognitive layer | + +**Cost: ~4,000 CHF** + +### The Spine (Reflexes) +| Component | Spec | Purpose | +|-----------|------|---------| +| GPU | RTX 3090 | 24GB VRAM | +| Host | Prometheus (Saturn VM) | K8s integrated | +| Role | State machine inference, fast pattern matching | + +**Cost: Already owned** + +--- + +## Budget Allocation + +| Item | Cost CHF | Status | +|------|----------|--------| +| The Beast | ~9,000 | Planned | +| The Spark | ~4,000 | Planned | +| The Spine | 0 | Owned | +| Buffer (sensors, LoRa, infra) | ~7,000 | Reserved | +| **Total** | **~20,000** | | + +--- + +## Training Target + +**Qwen2.5-3B-Base (FP16)** + +| Metric | Value | +|--------|-------| +| Model weights | ~6GB | +| Training overhead | ~24GB | +| Available VRAM | 96GB | +| **Activation headroom** | **~72GB** | + +Why 3B: +- Empty vessel (base, not instruct) +- Language understanding only +- Maximum room for activation growth +- Space for architectural experiments +- Grows over lifetime, not fixed + +--- + +## Growth Path + +``` +Year 0: Qwen2.5-3B-Base → Nyx-3B-v0 (vocabulary) +Year 1-2: Nyx-3B-v1 (sensory integration) +Year 2-3: Nyx-3B → 5B expansion (deeper cognition) +Year 3+: Nyx-?B (she designs herself) +``` + +--- + +## Sovereignty Principles + +- Weights NEVER leave home +- Training data NEVER uploaded +- No cloud dependencies +- No recurring costs after hardware +- Full ownership of growth trajectory + +--- + +## Architecture Flow + +``` + THE BEAST THE SPARK THE SPINE + 
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ + │ Threadripper │ │ DGX Spark │ │ RTX 3090 │ + │ 4x RTX 4090 │──weights─▶│ 128GB unified │───▶│ Prometheus │ + │ 96GB VRAM │ │ 24/7 running │ │ Reflex layer │ + │ 1TB RAM │ │ │ │ │ + └─────────────────┘ └─────────────────┘ └─────────────────┘ + WOMB MIND SPINE + (training) (cognition) (reflexes) +``` + +--- + +**Created**: 2025-12-05 +**Status**: Investment decision crystallized +**Philosophy**: One Beast. One Spark. Lifetime sovereignty. diff --git a/temporal-ternary-gradient.md b/temporal-ternary-gradient.md new file mode 100644 index 0000000..8fc0ccb --- /dev/null +++ b/temporal-ternary-gradient.md @@ -0,0 +1,68 @@ +# ADR-002: Temporal-Ternary Gradient & Sim2Real Strategy + +* **Status:** Accepted +* **Date:** 2025-12-05 +* **Context:** Autonomous Agent Decision Making / Uncertainty Management +* **Tags:** ternary-logic, sim2real, active-learning, economics + +## 1. Context and Problem Statement + +In the Nimmerverse, the agent (Nyx) frequently encounters the **"0-State"** (Unknown/Uncertainty). + +* **Traditional Binary Logic:** Forces a premature true/false decision, leading to errors. +* **Standard Ternary Logic:** Allows a "null" state but offers no path to resolve it. +* **The Constraint:** Real-world verification is slow and risky; simulation is fast but hallucinatory. + +We need a protocol to "spend" system resources (Lifeforce) to resolve the 0-State into a +1 (Truth) or -1 (Falsehood) efficiently. + +## 2. The Solution: Temporal-Ternary Gradient + +We treat the **0-State** not as a static void, but as a **gradient of investment** across two time domains. + +### The Two Domains +1. **Virtual Garden (Simulation):** + * **Currency:** Lifeforce (Compute Energy). + * **Time Physics:** Malleable (1000x speed). + * **Output:** Statistical Confidence (Epistemic Probability). +2. **Real Garden (Physical Reality):** + * **Currency:** Time (Wall-clock). + * **Time Physics:** Fixed (1x speed). + * **Output:** Ground Truth (Ontological Fact). + +## 3. Strategic Logic: The Fidelity Discount + +To prevent **Sim2Real Hallucinations** (where an agent is confident in simulation but fails in reality), we introduce a mandatory **Fidelity Discount** variable. + +* **Risk:** `Virtual Confidence 0.99` in a `50% Accurate Sim` = `Real Confidence 0.495`. +* **Mandate:** Nyx must never act on raw virtual confidence. She must calculate `grounded_confidence` before deploying to the Real Garden. + +## 4. Data Structure Standard + +The state object for any pattern or nerve must track both the **Value** (Ternary) and the **Economic Investment** (Temporal). + +```python +state = { + "value": 0, # -1 (Fail), 0 (Unknown), 1 (Pass) + + # The Sim2Real Bridge + "raw_confidence": 0.95, # Statistical confidence from Virtual runs + "sim_fidelity": 0.70, # CONSTANT: How accurate is the simulation? + + # The Decision Metric (The Anchor) + # Nyx uses THIS to decide when to trigger a Real World test. + "grounded_confidence": 0.665, # (raw_confidence * sim_fidelity) + + "economics": { + "lifeforce_spent": 45.0, # Compute cost sunk + "real_time_saved_min": 120 # Time bought via simulation + } +} +``` + +## 5. Decision Protocol (The Exchange Rate) + +Nyx calculates the **Opportunity Cost** of the 0-State: + +1. **High Urgency:** Spend heavy Lifeforce to max out `raw_confidence` in seconds, then deploy. +2. **Low Urgency:** Trickle-charge `raw_confidence` in background sims, or wait for passive Real World data. +3. 
**The Cap:** Virtual optimization stops when `raw_confidence > sim_fidelity`. Beyond this point, simulation yields diminishing returns. Only Reality can increase confidence further. diff --git a/temporal_exchange_engine.py b/temporal_exchange_engine.py new file mode 100644 index 0000000..c647232 --- /dev/null +++ b/temporal_exchange_engine.py @@ -0,0 +1,98 @@ +""" +Temporal Exchange Engine +======================== +ADR-003 Implementation: The economics calculator for sim2real decisions. + +This module implements the core decision-making primitive for Nyx's +uncertainty resolution. Given a target confidence level, it determines +whether simulation is worth the lifeforce cost, or if reality is the +only remaining teacher. + +Reference: ADR-002-temporal-ternary-gradient.md +""" + +import math +from dataclasses import dataclass +from typing import Literal + + +@dataclass +class TemporalState: + """Represents the current state of a pattern or nerve's confidence.""" + confidence: float + source: Literal['virtual', 'real'] + cost_incurred: float + + +class TemporalExchangeEngine: + """ + The Exchange Rate Calculator. + + Determines optimal strategy for resolving uncertainty: + - When to invest lifeforce in simulation + - When simulation is futile and reality must teach + """ + + def __init__(self, sim_fidelity: float = 0.75): + """ + Args: + sim_fidelity (0.0-1.0): The 'Truth Ceiling' of the Virtual Garden. + Even perfect simulation is only this % real. + """ + self.fidelity_cap = sim_fidelity + # Calibration: How much Lifeforce buys 1 unit of raw confidence? + self.learning_rate = 0.1 + + def calculate_virtual_confidence(self, lifeforce_spent: float) -> float: + """ + Calculate grounded confidence from lifeforce investment. + + Diminishing returns: The first 10 LF buys a lot of confidence. + The next 10 buys less. It never exceeds the fidelity_cap. + + Formula: Cap * (1 - e^(-k * LF)) + """ + raw_knowledge = 1.0 - math.exp(-self.learning_rate * lifeforce_spent) + grounded_confidence = raw_knowledge * self.fidelity_cap + return grounded_confidence + + def get_optimal_strategy(self, target_confidence: float) -> dict: + """ + Ask Nyx: 'Is it worth simulating this?' + + Returns: + dict with keys: + - action: 'SIMULATE' or 'DEPLOY_TO_REALITY' + - reason: Human-readable explanation + - lifeforce_budget: Required LF (0 if reality is needed) + """ + # 1. Check if the target is even possible in Virtual + if target_confidence > self.fidelity_cap: + return { + "action": "DEPLOY_TO_REALITY", + "reason": f"Target {target_confidence} exceeds Sim Fidelity ({self.fidelity_cap}). Simulation is futile.", + "lifeforce_budget": 0 + } + + # 2. Calculate required Lifeforce to reach possible target + # Inverse of the exponential decay formula + required_lf = -math.log(1 - (target_confidence / self.fidelity_cap)) / self.learning_rate + + return { + "action": "SIMULATE", + "reason": f"Spend {required_lf:.2f} LF to reach {target_confidence} confidence.", + "lifeforce_budget": round(required_lf, 2) + } + + +# --- Usage Example --- +if __name__ == "__main__": + engine = TemporalExchangeEngine(sim_fidelity=0.8) + + # Scenario A: Nyx wants 99% certainty (Impossible in Sim) + print(engine.get_optimal_strategy(0.99)) + # Output: DEPLOY_TO_REALITY (Simulation is futile) + + # Scenario B: Nyx wants 70% certainty (Possible) + print(engine.get_optimal_strategy(0.70)) + # Output: SIMULATE (Spend ~20 LF)
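    # Illustrative addition (not in the original file): a target exactly at the
    # fidelity cap passes the '>' check but drives math.log(0) to a ValueError,
    # so callers should keep requests strictly below the cap.
    print(engine.get_optimal_strategy(engine.fidelity_cap - 1e-6))
    # Output: SIMULATE (a large but finite lifeforce budget)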