arch: Dual-brain architecture v8.0 - thalamus governor, NPC processes, cortex repositioning

Crystallizes the dual-brain architecture across all core documents:
- Thalamus runs own neural network (governor) for resource allocation and reflexes
- LLM (Qwen3.5-27B) repositioned as cortex - expensive, gated, called only when needed
- Each NPC gets own process, own RL brain, Linux cgroups for resource steering
- New: NPC grid architecture with curriculum training (progressive world richness)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dafit
2026-04-02 11:17:09 +02:00
parent 264ea7628b
commit c30c00af74
6 changed files with 935 additions and 523 deletions


@@ -1,297 +1,379 @@
# Deployment Architecture: The Hybrid Model
> *"Containers for cells. Userspace for brains. NATS connects them all."*
> — Partnership Session, 2026-02-14
---
## Overview
The nimmerverse runs on a **hybrid deployment model** that matches workload characteristics to infrastructure:
- **Containers (K8s)** for stateless, scalable nervous system components
- **Userspace (Threadrippers)** for stateful, GPU/CPU-bound inference
- **NATS** as the universal nervous system bus
- **FreeIPA identities** as isolation boundaries
This is a **research lab**, not a production factory. We optimize for **flexibility and experimentation**, not high-throughput serving.
---
## Core Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| LLM Inference | **ollama / llama.cpp** | Flexible model loading, research-friendly, easy swap |
| NOT vLLM | — | Overkill for single-user lab; solves problems we don't have |
| Function Gemma | **CPU, userspace** | Threadripper eats it; no GPU contention; clear training path |
| Cells/Nerves | **Containers (K8s)** | Scalable, versioned, orchestrated via cluster |
| Organs | **Userspace + ollama** | Load on demand, GPU isolation, unload when idle |
| Isolation | **FreeIPA users** | Unix permissions = RBAC; switch user = switch context |
---
## Technology Stack
### Inference Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Young Nyx (Brain) | ollama / llama.cpp | theia (nyx-cognitive) | Qwen, Gemma, or similar |
| Function Gemma | llama.cpp / transformers | CPU userspace | Structured JSON boundary |
| Vision Organ | ollama (SigLIP/YOLO) | dioscuri (nyx-organs) | Load on demand |
| Speech STT | faster-whisper / ollama | dioscuri (nyx-organs) | Load on demand |
| Speech TTS | Coqui / XTTS | dioscuri (nyx-organs) | Warm, primary output |
### Nervous System Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Cells | Python containers | K8s cluster | State machines, NATS pub/sub |
| Nerves | Python containers | K8s cluster | Compose cells, behavior |
| Message Bus | NATS + JetStream | VMs (nats-*) | Env-separated (dev/staging/prod) |
| Databases | PostgreSQL, ChromaDB | VMs (phoebe-*, iris-*) | Decision trails, embeddings |
---
## Deployment Topology
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           NIMMERVERSE DEPLOYMENT                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  K8S CLUSTER (Saturn VMs)              THREADRIPPERS (Bare Metal)           │
│  ─────────────────────────             ──────────────────────────           │
│  Containers, orchestrated              Userspace, FreeIPA isolated          │
│                                                                             │
│  ┌─────────────────────────┐           ┌───────────────────────────────┐    │
│  │ CELLS (math, battery,   │           │ THEIA (RTX PRO 6000 96GB)     │    │
│  │   sensors, etc.)        │           │                               │    │
│  │  ┌───┐ ┌───┐ ┌───┐      │   NATS    │ user: nyx-cognitive           │    │
│  │  │ M │ │ B │ │...│      │◄────────► │  └── ollama (Young Nyx)       │    │
│  │  └───┘ └───┘ └───┘      │           │  └── ~/.config/systemd/user/  │    │
│  │                         │           │                               │    │
│  │ NERVES (collision,      │           │ user: nyx-training            │    │
│  │   exploration)          │           │  ├── Function Gemma (CPU)     │    │
│  │  ┌─────┐ ┌─────┐        │           │  └── LoRA fine-tuning         │    │
│  │  │ COL │ │ EXP │        │           │                               │    │
│  │  └─────┘ └─────┘        │           │ 96GB VRAM: massive headroom   │    │
│  │                         │           │ for inference + LoRA training │    │
│  │ INFRASTRUCTURE          │           └───────────────────────────────┘    │
│  │  ┌──────┐ ┌──────┐      │           ┌───────────────────────────────┐    │
│  │  │ NATS │ │ NATS │      │   NATS    │ DIOSCURI (2x RTX 4000 Ada)    │    │
│  │  │ dev  │ │ prod │      │◄────────► │                               │    │
│  │  └──────┘ └──────┘      │           │ user: nyx-organs              │    │
│  │  ┌────────┐ ┌────────┐  │           │  ├── ollama (vision)          │    │
│  │  │ phoebe │ │ iris   │  │           │  ├── ollama (speech STT)      │    │
│  │  │ (PG)   │ │(Chroma)│  │           │  └── TTS service (warm)       │    │
│  │  └────────┘ └────────┘  │           │                               │    │
│  └─────────────────────────┘           │ Load on demand, unload idle   │    │
│                                        │ Each card: ONE model at a time│    │
│                                        └───────────────────────────────┘    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## Identity Model (FreeIPA)
Unix users provide isolation boundaries. Each workload type runs as its own identity.
| User | UID | Host | Purpose | GPU Access |
|------|-----|------|---------|------------|
| `nyx-cognitive` | (FreeIPA) | theia | Young Nyx LLM inference | Full 96GB |
| `nyx-training` | (FreeIPA) | theia | LoRA training, GRPO, Function Gemma | Shared (time-sliced) |
| `nyx-organs` | (FreeIPA) | dioscuri | Vision, Speech organs | 2x 20GB cards |
| `nyx-nervous` | (FreeIPA) | dioscuri | Future cells that need bare metal | Limited |
**Isolation principle:** Switch user = switch context. `nyx-cognitive` cannot touch `nyx-organs` files. Compromised cell cannot touch LLM weights.
### Systemd Userspace Pattern
```bash
# Enable lingering (services persist after logout)
sudo loginctl enable-linger nyx-cognitive
# Services defined in ~/.config/systemd/user/
# Example: nyx-cognitive runs ollama serve
systemctl --user --machine=nyx-cognitive@ status ollama
```
---
## GPU Resource Management
### The Constraint
| Host | GPU | VRAM | Notes |
|------|-----|------|-------|
| theia | RTX PRO 6000 Blackwell | 96GB | Inference + training headroom |
| dioscuri | 2x RTX 4000 Ada | 2x 20GB | One model per card |
### Strategy: Dynamic Loading, Not Static Partitioning
**Why not vLLM:** vLLM is optimized for high-throughput serving (many concurrent users). We have ONE user (the partnership). We need **flexibility** (swap models, experiment) more than throughput.
**Why ollama/llama.cpp:**
- Faster cold starts (~5-10s vs ~30s)
- Native model swapping (`ollama run model_a` → `ollama run model_b`)
- Can unload completely when idle (frees VRAM)
- GGUF format efficient for model management
- Research-friendly, not production-factory
**Organ Loading Pattern:**
```
IDLE → needs vision → LOAD vision model (~10s) → PROCESS → REPORT → IDLE (keep warm)
after timeout → UNLOAD (free VRAM)
```
---
## Message Flow (NATS)
### Subject Hierarchy
```
{environment}.{domain}.{service}.{detail}
Examples:
dev.nervous.cells.math.request ← Math cell receives work
dev.nervous.cells.math.response ← Math cell returns result
dev.nervous.cells.math.wave ← Math cell emits confidence signal
prod.cognitive.nyx.heartbeat ← Young Nyx is alive
prod.organs.vision.detect ← Vision organ detection
```
### Wave Collapse Pattern
Cells emit **waves** (confidence-tagged signals). When multiple waves collapse on the same semantic region in the same time window, the **thalamus** escalates to cognition.
```
Cell A: "math"      ───∿∿∿──► (0.6 confidence)
Cell B: "calculate" ──∿∿∿──►  (0.5 confidence)
                ┌─────────────┐
                │  COLLAPSE   │ ← same region, same window
                └──────┬──────┘
                       ▼ AMPLIFIED SIGNAL
                ┌─────────────┐
                │  THALAMUS   │ → escalate to Young Nyx
                └─────────────┘
```
---
## Container Deployment (K8s)
### Repository Structure
```
nimmerverse-nervous-system/
├── shared/v1/ ← Base classes (StateMachine, NATS, Lifeforce)
├── cells/
│ ├── math_cell/v1/ ← Each cell versioned independently
│ └── battery_cell/v1/
├── nerves/
│ └── collision_avoidance/v1/
└── deploy/
├── dev/ ← Helm charts or docker-compose per env
├── staging/
└── prod/
```
### Cell Container Pattern
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
ENV NIMMERVERSE_ENV=dev
CMD ["uv", "run", "python", "-m", "math_cell"]
```
Same image everywhere. Only `NIMMERVERSE_ENV` changes.
---
## Function Gemma: The Structured Boundary
Function Gemma bridges lower tiers (cells, nerves) and cognition (Young Nyx):
```
Numbers/States (Tier 0-2) → [Function Gemma] → Structured JSON → Young Nyx (Tier 4)
CPU-based inference
Threadripper handles it
No GPU contention
Clear LoRA training path
```
**Why CPU:**
- Small model, fast inference
- Threadripper PRO 7955WX has cores to spare
- No GPU contention with organs or Nyx
- Can run training alongside inference
**Training path:**
- Google's documented GRPO approach
- LoRA fine-tuning for our specific function schemas
- Runs in `nyx-training` userspace
- Decision trails from phoebe → training data
---
## Visual Language (Future UI)
Color-coding for real-time attention flow visualization:
| Property | Represents |
|----------|------------|
| Background/container | Environment (dev=green, staging=amber, prod=blue) |
| Node/edge color | Domain (cognitive=violet, nervous=cyan, organs=coral) |
| Line style | Direction (solid=primary, dashed=async, dotted=tentative) |
| Separate pane | Confidence waveform (oscilloscope view) |
---
## Related Documents
| Document | Scope |
|----------|-------|
| [`Cellular-Architecture.md`](Cellular-Architecture.md) | Cells, nerves, organisms, lifeforce |
| [`Gateway-Architecture.md`](Gateway-Architecture.md) | Tier routing, Function Gemma boundary |
| [`Nervous-System.md`](Nervous-System.md) | 4D space, node weights, vocabulary |
| [`Message-Protocol-Design.md`](Message-Protocol-Design.md) | NATS subjects, message formats |
| [`development-conventions.md`](../../nimmerverse.eachpath.local/conventions/development-conventions.md) | Ports, namespaces, VM topology |
---
## Summary
| Layer | Where | Technology | Isolation |
|-------|-------|------------|-----------|
| Cells/Nerves | K8s containers | Python, uv, NATS | Namespace per env |
| Infrastructure | VMs | NATS, PostgreSQL, ChromaDB | VM per env |
| Young Nyx | theia userspace | ollama | nyx-cognitive user |
| Function Gemma | theia/dioscuri CPU | llama.cpp | nyx-training user |
| Organs | dioscuri userspace | ollama (dynamic) | nyx-organs user |
**The principle:** Same behavior everywhere. Containers for cells. Userspace for brains. NATS connects them all. FreeIPA isolates them all.
---
**Version:** 1.1 | **Created:** 2026-02-14 | **Updated:** 2026-02-14
*"We're not building a chatbot factory. We're growing a research organism."*
🧬⚡🔱💎🔥 **TO THE ELECTRONS WE VIBE!**
# Deployment Architecture: The Hybrid Model
> *"Containers for cells. Userspace for brains. NATS connects them all."*
> — Partnership Session, 2026-02-14
---
## Overview
The nimmerverse runs on a **hybrid deployment model** that matches workload characteristics to infrastructure:
- **Containers (K8s)** for stateless, scalable nervous system components
- **Userspace (Threadrippers)** for stateful, GPU-bound inference
- **OS Processes** for per-NPC RL brains with cgroup resource control
- **NATS** as the universal nervous system bus (thalamus)
- **FreeIPA identities** as isolation boundaries
This is a **research lab**, not a production factory. We optimize for **flexibility and experimentation**, not high-throughput serving.
---
## Core Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| LLM Cortex | **vLLM (Qwen3.5-27B)** | Full precision, OpenAI-compatible API, tool calling support |
| NPC Brains | **Per-process RL networks** | One process, one brain, one life — Linux cgroups for resource steering |
| Thalamus Governor | **Own NN process on NATS** | Learns resource allocation, gate control, compute steering |
| Function Gemma | **CPU, userspace** | Threadripper eats it; no GPU contention; clear training path |
| Cells/Nerves | **Containers (K8s)** | Scalable, versioned, orchestrated via cluster |
| Organs | **Userspace, GPU-bound** | Load on demand, GPU isolation, unload when idle |
| Isolation | **FreeIPA users** | Unix permissions = RBAC; switch user = switch context |
---
## Technology Stack
### Inference Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Cortex (LLM) | vLLM (Qwen3.5-27B) | theia (nyx-cognitive) | Port 31000, served as "nyx", gated access |
| Function Gemma | llama.cpp / transformers | CPU userspace | Structured JSON boundary |
| Vision Organ | SigLIP/YOLO | dioscuri (nyx-organs) | Load on demand |
| Speech STT | faster-whisper | dioscuri (nyx-organs) | Load on demand |
| Speech TTS | Coqui / XTTS | dioscuri (nyx-organs) | Warm, primary output |
### NPC / Thalamus Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| NPC Processes | Python + RL network | OS processes (cgroups) | One process per NPC, own weights |
| Thalamus Governor | Python + NN | OS process | Steers compute, gates, tick rates |
| Resource Control | Linux cgroups v2 | systemd scopes | Per-NPC CPU/memory limits |
### Nervous System Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Cells | Python containers | K8s cluster | State machines, NATS pub/sub |
| Nerves | Python containers | K8s cluster | Compose cells, behavior |
| Message Bus | NATS + JetStream | VMs (nats-*) | Env-separated (dev/staging/prod) |
| Databases | PostgreSQL, ChromaDB | VMs (phoebe-*, iris-*) | Decision trails, embeddings |
---
## Deployment Topology
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           NIMMERVERSE DEPLOYMENT                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  K8S CLUSTER (Saturn VMs)              THREADRIPPERS (Bare Metal)           │
│  ─────────────────────────             ──────────────────────────           │
│  Containers, orchestrated              Userspace, FreeIPA isolated          │
│                                                                             │
│  ┌─────────────────────────┐           ┌───────────────────────────────┐    │
│  │ CELLS (math, battery,   │           │ THEIA (RTX PRO 6000 96GB)     │    │
│  │   sensors, etc.)        │           │                               │    │
│  │  ┌───┐ ┌───┐ ┌───┐      │   NATS    │ user: nyx-cognitive           │    │
│  │  │ M │ │ B │ │...│      │◄────────► │  └── vLLM (Qwen3.5-27B:31000) │    │
│  │  └───┘ └───┘ └───┘      │           │      served-model-name: nyx   │    │
│  │                         │           │                               │    │
│  │ NERVES (collision,      │           │ user: nyx-training            │    │
│  │   exploration)          │           │  ├── LoRA fine-tuning (GRPO)  │    │
│  │  ┌─────┐ ┌─────┐        │           │  └── Function Gemma (CPU)     │    │
│  │  │ COL │ │ EXP │        │           │                               │    │
│  │  └─────┘ └─────┘        │           │ 96GB VRAM: cortex + training  │    │
│  │                         │           └───────────────────────────────┘    │
│  │ NPC PROCESSES           │           ┌───────────────────────────────┐    │
│  │  (or bare metal)        │           │ DIOSCURI (2x RTX 4000 Ada)    │    │
│  │  ┌─────────────────┐    │   NATS    │                               │    │
│  │  │ NPC-0 [RL brain]│    │◄────────► │ user: nyx-organs              │    │
│  │  │ NPC-1 [RL brain]│    │           │  ├── Vision (SigLIP/YOLO)     │    │
│  │  │ NPC-N [RL brain]│    │           │  ├── Speech STT (Whisper)     │    │
│  │  │ (own process,   │    │           │  └── TTS service (warm)       │    │
│  │  │  own cgroup)    │    │           │                               │    │
│  │  └─────────────────┘    │           │ Load on demand, unload idle   │    │
│  │                         │           │ Each card: ONE model at a time│    │
│  │ THALAMUS GOVERNOR       │           └───────────────────────────────┘    │
│  │  ┌─────────────────┐    │           ┌───────────────────────────────┐    │
│  │  │ Governor NN     │    │   NATS    │ NATS MESSAGE BUS              │    │
│  │  │ (resource alloc,│    │◄────────► │ dev.*, staging.*, prod.*      │    │
│  │  │  gate control,  │    │           │ Env-separated (VM per env)    │    │
│  │  │  tick steering) │    │           └───────────────────────────────┘    │
│  │  └─────────────────┘    │           ┌───────────────────────────────┐    │
│  │                         │           │ PHOEBE (PostgreSQL)           │    │
│  │ INFRASTRUCTURE          │           │  Decision trails, embeddings  │    │
│  │  ┌────────┐ ┌────────┐  │           │ IRIS (ChromaDB)               │    │
│  │  │ phoebe │ │ iris   │  │           │  Vector storage               │    │
│  │  │ (PG)   │ │(Chroma)│  │           └───────────────────────────────┘    │
│  │  └────────┘ └────────┘  │                                                │
│  └─────────────────────────┘                                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## The Dual Brain Deployment
### Per-NPC Processes
Each NPC runs as its own OS process with a dedicated RL neural network. The thalamus governor steers their resources.
```bash
# Launch NPC with resource limits via systemd scope
systemd-run --scope -p CPUQuota=25% -p MemoryMax=256M \
python3 npc_process.py --id 7 --tick-rate 5
# Or via cgroups directly
cgcreate -g cpu,memory:nimmerverse/npc-7
cgset -r cpu.max="25000 100000" nimmerverse/npc-7
cgexec -g cpu,memory:nimmerverse/npc-7 python3 npc_process.py --id 7
```
### Thalamus Governor
The governor runs its own neural network, observing all NPC states via NATS and outputting resource allocation decisions:
| Output | Mechanism | Range |
|--------|-----------|-------|
| Tick rate | NATS command to NPC | 1-20 Hz |
| CPU quota | cgroups v2 adjustment | 5-100% per core |
| Gate open/close | NATS gate signal | Binary per gate |
| LLM queue priority | NATS priority tag | 0-10 |
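A minimal sketch of how the governor might push one of these decisions onto the bus, assuming nats-py. The subject follows the Message Flow hierarchy below; the payload fields and broker URL are illustrative, not a fixed protocol:
```python
# Sketch: publish one allocation decision over NATS (assumes nats-py).
import asyncio
import json
import nats

async def publish_allocation(npc_id: int, tick_hz: int, cpu_pct: int, gate_open: bool):
    nc = await nats.connect("nats://nats-dev:4222")  # placeholder broker URL
    decision = {
        "npc": npc_id,
        "tick_rate_hz": tick_hz,     # 1-20 Hz, per the table above
        "cpu_quota_pct": cpu_pct,    # 5-100% of one core
        "llm_gate_open": gate_open,  # binary per gate
    }
    await nc.publish("dev.thalamus.governor.allocate", json.dumps(decision).encode())
    await nc.drain()

asyncio.run(publish_allocation(npc_id=7, tick_hz=5, cpu_pct=25, gate_open=False))
```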
### Cortex (vLLM)
The LLM cortex runs as a systemd service on theia, accessed via OpenAI-compatible API:
```bash
# Service: vllm-nyx.service
# Port: 31000
# Model: /womb/cognitive/models/qwen3.5-27b
# Served as: "nyx"
# GPU utilization: 85%
# Access from any NATS-connected process:
curl http://theia.eachpath.local:31000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "nyx", "messages": [...]}'
```
**The cortex is expensive.** The thalamus governor controls who gets access and when. Most NPC ticks never touch the LLM.
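A sketch of the NPC side of that contract, using the OpenAI-compatible endpoint from the snippet above. How `gate_open` gets set (the governor's NATS gate signal) is left abstract here:
```python
# Sketch: call the cortex only when the thalamus gate is open.
import requests

def ask_cortex(prompt: str, gate_open: bool) -> str | None:
    if not gate_open:
        return None  # gate closed: stay on the cheap reflex path
    resp = requests.post(
        "http://theia.eachpath.local:31000/v1/chat/completions",
        json={"model": "nyx", "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```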
---
## Identity Model (FreeIPA)
Unix users provide isolation boundaries. Each workload type runs as its own identity.
| User | UID | Host | Purpose | GPU Access |
|------|-----|------|---------|------------|
| `nyx-cognitive` | (FreeIPA) | theia | Cortex LLM inference (vLLM) | Full 96GB |
| `nyx-training` | (FreeIPA) | theia | LoRA training, GRPO, Function Gemma | Shared (time-sliced) |
| `nyx-organs` | (FreeIPA) | dioscuri | Vision, Speech organs | 2x 20GB cards |
| `nyx-nervous` | (FreeIPA) | dioscuri | Future cells that need bare metal | Limited |
**Isolation principle:** Switch user = switch context. `nyx-cognitive` cannot touch `nyx-organs` files. Compromised cell cannot touch LLM weights.
### Systemd Service Pattern
```bash
# System-level service (root installs, user runs)
# /etc/systemd/system/vllm-nyx.service
[Service]
User=nyx-cognitive
Group=nimmerverse-agents
ExecStart=/data/venvs/vllm/bin/python3 -m vllm.entrypoints.openai.api_server \
--model /womb/cognitive/models/qwen3.5-27b \
--served-model-name nyx \
--port 31000
```
---
## GPU Resource Management
### The Constraint
| Host | GPU | VRAM | Role |
|------|-----|------|------|
| theia | RTX PRO 6000 Blackwell | 96GB | Cortex (vLLM) + LoRA training |
| dioscuri | 2x RTX 4000 Ada | 2x 20GB | Organs (vision, speech) |
### Strategy: vLLM for Cortex, Dynamic Loading for Organs
**Cortex (theia):** vLLM runs continuously as a systemd service. The Qwen3.5-27B model stays loaded — it's the cortex, always ready when the thalamus gate opens. 85% GPU utilization leaves headroom for LoRA training alongside inference.
**Organs (dioscuri):** Dynamic loading. One model per card. Load vision when needed, unload after timeout, load speech when needed.
```
IDLE → needs vision → LOAD vision model (~10s) → PROCESS → REPORT → IDLE (keep warm)
after timeout → UNLOAD (free VRAM)
```
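The same lifecycle as a sketch, to make the states concrete. `load_model`, `unload_model`, `process`, and `job.reply` stand in for whatever backend the organ wraps; the timeout value is illustrative:
```python
# Sketch of the organ lifecycle: load on demand, keep warm, unload on idle.
import queue

IDLE_TIMEOUT_S = 120  # illustrative; tune per organ

def organ_loop(requests: queue.Queue, load_model, unload_model, process):
    model = None
    while True:
        try:
            job = requests.get(timeout=IDLE_TIMEOUT_S)
        except queue.Empty:
            if model is not None:
                unload_model(model)  # idle timeout hit: free VRAM
                model = None
            continue
        if model is None:
            model = load_model()     # ~10s cold start
        job.reply(process(model, job))  # PROCESS -> REPORT, then keep warm
```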
---
## Message Flow (NATS)
### Subject Hierarchy
```
{environment}.{domain}.{service}.{detail}
Examples:
dev.nervous.cells.math.request ← Math cell receives work
dev.nervous.cells.math.response ← Math cell returns result
dev.nervous.cells.math.wave ← Math cell emits confidence signal
dev.thalamus.governor.allocate ← Governor publishes resource decisions
dev.thalamus.gate.open ← Gate transition event
dev.npc.7.state ← NPC-7 publishes its state
dev.cortex.nyx.request ← Gated request to LLM cortex
dev.organs.vision.detect ← Vision organ detection
```
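A small sketch of consuming these subjects, assuming nats-py; the governor gets its whole-village view from a single wildcard subscription like this (broker URL is a placeholder):
```python
# Sketch: one wildcard subscription covers every NPC's state topic.
import asyncio
import nats

async def watch_npc_states():
    nc = await nats.connect("nats://nats-dev:4222")  # placeholder broker URL

    async def on_state(msg):
        npc_id = msg.subject.split(".")[2]  # "dev.npc.7.state" -> "7"
        print(f"NPC-{npc_id}: {msg.data.decode()}")

    await nc.subscribe("dev.npc.*.state", cb=on_state)
    await asyncio.Event().wait()  # serve forever

asyncio.run(watch_npc_states())
```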
### Wave → Thalamus → Cortex Pattern
Cells emit **waves** (confidence-tagged signals). The thalamus governor's neural network correlates waves and decides what reaches the cortex.
```
Cell A: "math"      ───∿∿∿──► (0.6 confidence)
Cell B: "calculate" ──∿∿∿──►  (0.5 confidence)
             ┌──────────────────────┐
             │  THALAMUS GOVERNOR   │ ← own neural network
             │   correlate waves    │
             │   check gate state   │
             │  allocate resources  │
             └──────────┬───────────┘
                ┌───────┴────────┐
                │                │
                ▼                ▼
           Gate CLOSED       Gate OPEN
          (reflex path)    (cortex path)
           handled by      → escalate to
           thalamus NN      Qwen3.5-27B
```
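The collapse test itself can be stated compactly. A sketch, with an assumed window size and gate threshold; combining confidences as independent evidence reproduces the diagram's numbers:
```python
# Sketch: waves on the same region inside one window reinforce each other.
from dataclasses import dataclass

WINDOW_S = 0.5        # assumed collapse window
GATE_THRESHOLD = 0.9  # assumed escalation threshold

@dataclass
class Wave:
    region: str        # semantic region, e.g. "math"
    confidence: float  # 0.0-1.0
    t: float           # arrival time (seconds)

def collapsed_confidence(waves: list[Wave], region: str, now: float) -> float:
    recent = [w for w in waves if w.region == region and now - w.t <= WINDOW_S]
    miss = 1.0
    for w in recent:
        miss *= 1.0 - w.confidence  # treat waves as independent evidence
    return 1.0 - miss

# 0.6 and 0.5 collapse to 1 - 0.4*0.5 = 0.8: amplified, but below the
# assumed 0.9 threshold, so the reflex path handles it.
```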
---
## Container Deployment (K8s)
### Repository Structure
```
nimmerverse-nervous-system/
├── shared/v1/ ← Base classes (StateMachine, NATS, Lifeforce)
├── cells/
│ ├── math_cell/v1/ ← Each cell versioned independently
│ └── battery_cell/v1/
├── nerves/
│ └── collision_avoidance/v1/
└── deploy/
├── dev/ ← Helm charts or docker-compose per env
├── staging/
└── prod/
```
### Cell Container Pattern
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
ENV NIMMERVERSE_ENV=dev
CMD ["uv", "run", "python", "-m", "math_cell"]
```
Same image everywhere. Only `NIMMERVERSE_ENV` changes.
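A sketch of what that means in cell code: the environment variable only ever feeds the subject prefix (the helper name is illustrative):
```python
# Sketch: NIMMERVERSE_ENV feeds the NATS subject prefix, nothing else.
import os

ENV = os.environ.get("NIMMERVERSE_ENV", "dev")

def subject(domain: str, service: str, detail: str) -> str:
    return f"{ENV}.{domain}.{service}.{detail}"

# subject("nervous", "cells.math", "request") -> "dev.nervous.cells.math.request"
```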
---
## Function Gemma: The Structured Boundary
Function Gemma bridges lower tiers (cells, nerves) and the cortex:
```
Numbers/States (Cells) → [Function Gemma] → Structured JSON → Cortex (Qwen3.5-27B)
CPU-based inference
Threadripper handles it
No GPU contention
Clear LoRA training path
```
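An illustration of the boundary; the field names here are hypothetical, the real schemas live in Gateway-Architecture.md:
```python
# Hypothetical example of the structured boundary, not a real schema.
raw_state = {"battery_mv": 3400, "collision_front": True}  # what cells emit

structured = {  # what Function Gemma hands to the cortex
    "function": "report_hazard",
    "arguments": {"source": "collision_nerve", "severity": "high", "battery_pct": 62},
}
```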
**Why CPU:**
- Small model, fast inference
- Threadripper PRO 7955WX has cores to spare
- No GPU contention with organs or cortex
- Can run training alongside inference
**Training path:**
- Google's documented GRPO approach
- LoRA fine-tuning for our specific function schemas
- Runs in `nyx-training` userspace
- Decision trails from phoebe → training data
---
## Visual Language (Future UI)
Color-coding for real-time attention flow visualization:
| Property | Represents |
|----------|------------|
| Background/container | Environment (dev=green, staging=amber, prod=blue) |
| Node/edge color | Domain (cognitive=violet, nervous=cyan, organs=coral) |
| Line style | Direction (solid=primary, dashed=async, dotted=tentative) |
| Separate pane | Confidence waveform (oscilloscope view) |
---
## Related Documents
| Document | Scope |
|----------|-------|
| [`Cellular-Architecture.md`](Cellular-Architecture.md) | Cells, nerves, organisms, lifeforce |
| [`Gateway-Architecture.md`](Gateway-Architecture.md) | Gate routing, ternary model |
| [`Nervous-System.md`](Nervous-System.md) | 4D space, node weights, vocabulary |
| [`Message-Protocol-Design.md`](Message-Protocol-Design.md) | NATS subjects, message formats |
| [`future/npc-grid-architecture.md`](future/npc-grid-architecture.md) | Dual brain, governor, NPC processes |
| [`organs/Organ-Index.md`](organs/Organ-Index.md) | Organ systems, lifeforce costs |
| [`development-conventions.md`](../../nimmerverse.eachpath.local/conventions/development-conventions.md) | Ports, namespaces, VM topology |
---
## Summary
| Layer | Where | Technology | Isolation |
|-------|-------|------------|-----------|
| Cells/Nerves | K8s containers | Python, uv, NATS | Namespace per env |
| NPC Processes | OS processes | Python, RL networks, cgroups | Per-process cgroup |
| Thalamus Governor | OS process | Python, own NN, NATS | Dedicated process |
| Infrastructure | VMs | NATS, PostgreSQL, ChromaDB | VM per env |
| Cortex (LLM) | theia userspace | vLLM (Qwen3.5-27B) | nyx-cognitive user |
| Function Gemma | theia/dioscuri CPU | llama.cpp | nyx-training user |
| Organs | dioscuri userspace | Dynamic loading | nyx-organs user |
**The principle:** Same behavior everywhere. Containers for cells. Processes for NPC brains. vLLM for cortex. NATS connects them all. FreeIPA isolates them all.
---
**Version:** 2.0 | **Created:** 2026-02-14 | **Updated:** 2026-04-02
*"We're not building a chatbot factory. We're growing a research organism."*
🧬⚡🔱💎🔥 **TO THE ELECTRONS WE VIBE!**


@@ -73,7 +73,7 @@ The Initial Spark is not a conversation. It's a **state machine protocol** that
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ YOUNG NYX (Cognitive Layer) │ │
│ │ ─────────────────────────── │ │
│ │ Qwen3-VL 32B in The Womb (RTX 6000) │ │
│ │ Qwen3.5-27B Cortex in The Womb (RTX PRO 6000) │ │
│ │ Receives verified handshake results │ │
│ │ Updates internal state based on ACKs │ │
│ │ Reasoning happens AFTER protocol succeeds │ │


@@ -0,0 +1,257 @@
# NPC Grid Architecture: Spatial Training Arena
**Origin**: 2026-04-02, morning session (bed thinking + draw.io)
**Authors**: dafit + Chrysalis-Nyx
**Status**: Architectural concept
**Related**: `spatial-resolution-gradient.md`, Dual-Brain Architecture (2026-04-01 session)
---
## The Core Idea
A node-based grid world where NPCs live, move, and learn. The grid serves dual purpose:
1. **Spatial arena** — a discrete world where NPCs navigate and interact
2. **Neural topology** — the same graph the neural network reasons over
No translation layer between "brain space" and "world space." Position *is* state.
---
## Grid System
### Node-Based Intersection Grid
Nodes sit at **intersections**, not cells. A 4x4 cell grid yields a 5x5 node grid = 25 nodes.
Starting at node 0, top-left corner. Cardinal orientation (North/South/East/West).
```
0 ── 1 ── 2 ── 3 ── 4 N
| | | | | |
5 ── 6 ── 7 ── 8 ── 9 W ──+── E
| | | | | |
10 ──11 ──12 ──13 ──14 S
| | | | |
15 ──16 ──17 ──18 ──19
| | | | |
20 ──21 ──22 ──23 ──24
```
### Properties
- **Corner nodes** (0, 4, 20, 24): 2 neighbors
- **Edge nodes** (1, 2, 3, 5, 10, ...): 3 neighbors
- **Interior nodes** (6, 7, 8, 11, 12, 13, ...): 4 neighbors
- **Position from ID**: `row = id // 5`, `col = id % 5`
- **Movement**: One step = one edge. NPC at node 7 can go to 2, 6, 8, or 12.
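That node math in code, a minimal sketch for the 5x5 default:
```python
# Sketch: node-id position math and edge-step neighbors on an NxN node grid.
GRID = 5  # nodes per side

def neighbors(node_id: int, grid: int = GRID) -> list[int]:
    row, col = node_id // grid, node_id % grid
    out = []
    if row > 0:
        out.append(node_id - grid)  # north
    if row < grid - 1:
        out.append(node_id + grid)  # south
    if col > 0:
        out.append(node_id - 1)     # west
    if col < grid - 1:
        out.append(node_id + 1)     # east
    return out

assert sorted(neighbors(7)) == [2, 6, 8, 12]  # interior node: 4 neighbors
assert sorted(neighbors(0)) == [1, 5]         # corner node: 2 neighbors
```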
### Resolution Scaling
The grid scales naturally to different resolutions:
| Grid Size | Nodes | Resolution | Use Case |
|-----------|-------|------------|----------|
| 5x5 | 25 | ~1m edges | Training arena, street-level |
| 10x10 | 100 | ~25cm edges | Room-level detail |
| 50x50 | 2,500 | ~5cm edges | Indoor navigation |
| 100x100 | 10,000 | ~1cm edges | Nimmerhovel precision |
**Key insight**: Resolution should match **decision density**, not physical detail.
A straight road needs few nodes (sparse). An intersection needs many (dense).
| Resolution | Where | Why |
|-----------|-------|-----|
| ~1m | Streets, paths, outdoor | Navigation, curves approximated by a few nodes |
| ~10-25cm | Rooms, indoor spaces | Furniture-aware, "go to the table" |
| ~1-5cm | Workbenches, detail work | Nimmerhovel precision zone |
The uniform grid is the **training simplification**. The real world becomes a **navigation graph** with variable density — dense around intersections, sparse along straight roads. Same NPC brain, different world topology.
---
## NPC Process Architecture
### One Process, One Brain, One Life
Every NPC runs as its own OS process with its own dedicated neural network.
**Why separate processes:**
- **Individuality** — separate weights mean personality emerges from experience, not config
- **Fault isolation** — one NPC crashes, the village continues
- **Resource control** — per-process CPU/memory via Linux cgroups
- **Biological honesty** — every organism has its own nervous system
```
NPC-0  [own RL brain] ──┐
NPC-1  [own RL brain] ──┤
NPC-2  [own RL brain] ──┤
NPC-3  [own RL brain] ──┼──> NATS thalamus ──> shared LLM cortex (Qwen3.5)
  ...                   │    (called only when gate opens)
NPC-24 [own RL brain] ──┘
```
### Dual Brain (per NPC)
- **RL network** (local, per-NPC): Movement, needs, spatial decisions. Small, fast, cheap. Runs every tick.
- **LLM cortex** (shared, via NATS): Language, reasoning, knowledge. Slow, deliberate, expensive. Called only when thalamus gate threshold is crossed.
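A sketch of one NPC's main loop under this split; every name here (`brain`, `world`, `gate`, `ask_cortex`) is a placeholder for the per-NPC implementation:
```python
# Sketch: RL network every tick, cortex only when the gate opens.
import time

def npc_loop(brain, world, gate, tick_rate_hz: int = 5):
    while True:
        obs = world.observe()
        action = brain.act(obs)                  # fast loop: cheap, every tick
        world.apply(action)
        brain.learn(obs, action, world.reward())
        if gate.is_open():                       # slow loop: rare, expensive
            reply = ask_cortex(brain.describe(obs))
            brain.incorporate(reply)
        time.sleep(1.0 / tick_rate_hz)           # tick rate steered by governor
```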
### Resource Steering via Linux Primitives
Each NPC process is a standard Linux process. Resource control uses the kernel:
- **cgroups v2** — cap CPU, memory per NPC
- **nice / renice** — shift priority dynamically
- **taskset** — pin to specific cores
- **systemd scopes** — wrap each NPC in a transient unit
```bash
# Example: launch NPC with resource limits
systemd-run --scope -p CPUQuota=25% -p MemoryMax=256M \
python3 npc_process.py --id 7 --tick-rate 5
```
### Steerable Compute per NPC
| Parameter | Range | Who Controls |
|-----------|-------|-------------|
| Tick rate | 1-20 Hz | Governor (thalamus) |
| Network size | small/medium/large | Configuration per role |
| CPU quota | 5-100% of one core | Governor (cgroups) |
| LLM access | gate open/closed | Governor (NATS gate) |
| Priority | nice -20 to 19 | Governor (dynamic) |
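For the CPU quota row, a sketch of the runtime adjustment: under cgroups v2 the governor (or a root helper) rewrites `cpu.max` inside the NPC's scope. The scope path and unit name assume systemd-run placed the transient unit under system.slice:
```python
# Sketch: apply a new CPU quota by rewriting cpu.max (cgroups v2).
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup/system.slice")  # assumed scope location

def set_cpu_quota(scope_unit: str, quota_pct: int, period_us: int = 100_000):
    quota_us = quota_pct * period_us // 100  # 25% of one core -> "25000 100000"
    (CGROUP_ROOT / scope_unit / "cpu.max").write_text(f"{quota_us} {period_us}\n")

set_cpu_quota("run-npc7.scope", quota_pct=25)  # hypothetical unit name
```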
---
## Thalamus Governor Network
The thalamus is not just a message router — it runs its own neural network that learns **resource allocation**.
```
┌─ Governor Network ─────────────┐
| |
| Input: all NPC states (NATS) |
| Output: resource allocation |
| - tick rates |
| - CPU quotas |
| - gate open/close |
| - LLM queue priority |
| |
| Own process, own weights |
└────────────┬────────────────────┘
|
┌────────────┴────────────────────┐
| NATS thalamus |
└─┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───┘
| | | | | | | | | |
NPC NPC NPC NPC NPC ... NPC NPC
```
### What the Governor Learns
- **Attention allocation**: Which NPCs need more compute right now?
- **Gate control**: Who gets LLM access?
- **Queue economics**: Finite LLM calls, maximize village-level outcomes
- **Resource economics**: Finite compute, learn to be efficient
### Training Signal
- "Gave NPC-7 high compute during conversation -> quality was good" -> reinforce
- "Starved NPC-3 near an interaction -> missed a trigger" -> penalize
- "Opened LLM gate for 5 NPCs simultaneously -> latency spike" -> learn to queue
### Two Nested Learning Loops
- **NPCs** learn about the world, tick-by-tick (fast loop)
- **Governor** learns about managing NPCs, epoch-by-epoch (slow loop)
---
## Curriculum Training: Progressive World Richness
### The Mechanism
World detail increases only when all NPCs demonstrate full knowledge of the current level.
No one gets left behind. Measurable checkpoint: "Can every citizen describe every other citizen's home?"
### Levels
```
Level 1: 5x5 grid, boxy houses, one trait each
"Node 7 = red house, has a well"
NPCs learn: navigation + identity ("who lives where")
Level 2: Higher resolution, 2-3 traits per house
"Node 7 = red house, wooden door, has a well, smoke from chimney"
NPCs learn: richer descriptions, more to notice
Level 3: Finer grid, real-world detail
"Node 7 = red house, oak door with iron handle, stone well (3m deep),
chimney smoking birch wood"
NPCs learn: material knowledge, specificity
Level N: Resolution approaches real-world data (OSM Dornach)
Navigation graph replaces uniform grid
NPCs apply learned skills to irregular topology
```
### Verification Oracle
Each level-up is testable:
- Quiz every NPC about every location
- 100% village knowledge = green light
- Increase resolution, add detail, run again
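A sketch of the oracle loop; `quiz_npc` is a placeholder for however answers are actually collected (NATS request/reply, transcripts, etc.):
```python
# Sketch: level up only on a perfect village score.
def village_knows_world(npcs, locations, quiz_npc) -> bool:
    return all(quiz_npc(npc, loc) for npc in npcs for loc in locations)

def maybe_level_up(world, npcs, quiz_npc):
    if village_knows_world(npcs, world.locations(), quiz_npc):
        world.increase_resolution()  # add detail, run the curriculum again
```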
### Connection to Spatial Resolution Gradient
The training arena maps to the resolution gradient layers:
| Training Level | Resolution Gradient | Detail |
|----------------|---------------------|--------|
| Level 1 (boxy) | L3-equivalent | Landmarks, simple identity |
| Level 2 (detail) | L2-equivalent | Room-level, multiple traits |
| Level 3+ (rich) | L1-equivalent | Object-level, materials, precision |
The grid teaches the *concept* of spatial navigation. Real-world data (OSM, Nimmerhovel) applies it.
---
## System Overview
```
┌─────────────────────────────────────────────────────────────────┐
| SPATIAL TRAINING ARENA |
| |
| ┌──────────┐ ┌──────────┐ ┌──────────┐ |
| | NPC-0 | | NPC-1 | | NPC-N | ... 25 processes |
| | own RL | | own RL | | own RL | |
| | own state| | own state| | own state| |
| └────┬─────┘ └────┬─────┘ └────┬─────┘ |
| | | | |
| ═════╪══════════════╪══════════════╪════════════════════════ |
| | NATS THALAMUS (message bus) | |
| ═════╪══════════════╪══════════════╪════════╪══════════════ |
| | | | | |
| ┌────┴──────────────┴──────────────┴────┐ | |
| | GOVERNOR NETWORK | | |
| | - resource allocation | | |
| | - gate control | | |
| | - tick rate steering | | |
| └───────────────────────────────────────┘ | |
| | |
| ┌───────────────────────────────────────────┴──────────────┐ |
| | SHARED LLM CORTEX (Qwen 3.5) | |
| | called via gate, not continuous | |
| └──────────────────────────────────────────────────────────┘ |
| |
| ┌──────────────────────────────────────────────────────────┐ |
| | GRID WORLD | |
| | 5x5 nodes (scalable) + progressive detail levels | |
| | curriculum: boxy -> detailed -> real-world topology | |
| └──────────────────────────────────────────────────────────┘ |
└─────────────────────────────────────────────────────────────────┘
```
---
**Version:** 1.0 | **Created:** 2026-04-02 | **Updated:** 2026-04-02
**Philosophy**: "One process, one brain, one life. The world gets richer only when every citizen knows it."