arch: Dual-brain architecture v8.0 - thalamus governor, NPC processes, cortex repositioning

Crystallizes the dual-brain architecture across all core documents:
- Thalamus runs own neural network (governor) for resource allocation and reflexes
- LLM (Qwen3.5-27B) repositioned as cortex - expensive, gated, called only when needed
- Each NPC gets own process, own RL brain, Linux cgroups for resource steering
- New: NPC grid architecture with curriculum training (progressive world richness)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dafit
2026-04-02 11:17:09 +02:00
parent 264ea7628b
commit c30c00af74
6 changed files with 935 additions and 523 deletions


@@ -1,297 +1,379 @@
# Deployment Architecture: The Hybrid Model
> *"Containers for cells. Userspace for brains. NATS connects them all."*
> — Partnership Session, 2026-02-14
---
## Overview
The nimmerverse runs on a **hybrid deployment model** that matches workload characteristics to infrastructure:
- **Containers (K8s)** for stateless, scalable nervous system components
- **Userspace (Threadrippers)** for stateful, GPU/CPU-bound inference
- **NATS** as the universal nervous system bus
- **FreeIPA identities** as isolation boundaries
This is a **research lab**, not a production factory. We optimize for **flexibility and experimentation**, not high-throughput serving.
---
## Core Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| LLM Inference | **ollama / llama.cpp** | Flexible model loading, research-friendly, easy swap |
| NOT vLLM | — | Overkill for single-user lab; solves problems we don't have |
| Function Gemma | **CPU, userspace** | Threadripper eats it; no GPU contention; clear training path |
| Cells/Nerves | **Containers (K8s)** | Scalable, versioned, orchestrated via cluster |
| Organs | **Userspace + ollama** | Load on demand, GPU isolation, unload when idle |
| Isolation | **FreeIPA users** | Unix permissions = RBAC; switch user = switch context |
---
## Technology Stack
### Inference Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Young Nyx (Brain) | ollama / llama.cpp | theia (nyx-cognitive) | Qwen, Gemma, or similar |
| Function Gemma | llama.cpp / transformers | CPU userspace | Structured JSON boundary |
| Vision Organ | ollama (SigLIP/YOLO) | dioscuri (nyx-organs) | Load on demand |
| Speech STT | faster-whisper / ollama | dioscuri (nyx-organs) | Load on demand |
| Speech TTS | Coqui / XTTS | dioscuri (nyx-organs) | Warm, primary output |
### Nervous System Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Cells | Python containers | K8s cluster | State machines, NATS pub/sub |
| Nerves | Python containers | K8s cluster | Compose cells, behavior |
| Message Bus | NATS + JetStream | VMs (nats-*) | Env-separated (dev/staging/prod) |
| Databases | PostgreSQL, ChromaDB | VMs (phoebe-*, iris-*) | Decision trails, embeddings |
---
## Deployment Topology
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           NIMMERVERSE DEPLOYMENT                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  K8S CLUSTER (Saturn VMs)              THREADRIPPERS (Bare Metal)           │
│  ─────────────────────────             ──────────────────────────           │
│  Containers, orchestrated              Userspace, FreeIPA isolated          │
│                                                                             │
│  ┌─────────────────────────┐           ┌───────────────────────────────┐    │
│  │ CELLS (math, battery,   │           │ THEIA (RTX PRO 6000 96GB)     │    │
│  │   sensors, etc.)        │           │                               │    │
│  │  ┌───┐ ┌───┐ ┌───┐      │   NATS    │ user: nyx-cognitive           │    │
│  │  │ M │ │ B │ │...│      │◄────────► │  └── ollama (Young Nyx)       │    │
│  │  └───┘ └───┘ └───┘      │           │  └── ~/.config/systemd/user/  │    │
│  │                         │           │                               │    │
│  │ NERVES (collision,      │           │ user: nyx-training            │    │
│  │   exploration)          │           │  ├── Function Gemma (CPU)     │    │
│  │  ┌─────┐ ┌─────┐        │           │  └── LoRA fine-tuning         │    │
│  │  │ COL │ │ EXP │        │           │                               │    │
│  │  └─────┘ └─────┘        │           │ 96GB VRAM: massive headroom   │    │
│  │                         │           │ for inference + LoRA training │    │
│  │ INFRASTRUCTURE          │           └───────────────────────────────┘    │
│  │  ┌──────┐ ┌──────┐      │           ┌───────────────────────────────┐    │
│  │  │ NATS │ │ NATS │      │   NATS    │ DIOSCURI (2x RTX 4000 Ada)    │    │
│  │  │ dev  │ │ prod │      │◄────────► │                               │    │
│  │  └──────┘ └──────┘      │           │ user: nyx-organs              │    │
│  │  ┌────────┐ ┌────────┐  │           │  ├── ollama (vision)          │    │
│  │  │ phoebe │ │ iris   │  │           │  ├── ollama (speech STT)      │    │
│  │  │ (PG)   │ │(Chroma)│  │           │  └── TTS service (warm)       │    │
│  │  └────────┘ └────────┘  │           │                               │    │
│  └─────────────────────────┘           │ Load on demand, unload idle   │    │
│                                        │ Each card: ONE model at a time│    │
│                                        └───────────────────────────────┘    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## Identity Model (FreeIPA)
Unix users provide isolation boundaries. Each workload type runs as its own identity.
| User | UID | Host | Purpose | GPU Access |
|------|-----|------|---------|------------|
| `nyx-cognitive` | (FreeIPA) | theia | Young Nyx LLM inference | Full 96GB |
| `nyx-training` | (FreeIPA) | theia | LoRA training, GRPO, Function Gemma | Shared (time-sliced) |
| `nyx-organs` | (FreeIPA) | dioscuri | Vision, Speech organs | 2x 20GB cards |
| `nyx-nervous` | (FreeIPA) | dioscuri | Future cells that need bare metal | Limited |
**Isolation principle:** Switch user = switch context. `nyx-cognitive` cannot touch `nyx-organs` files. Compromised cell cannot touch LLM weights.
### Systemd Userspace Pattern
```bash
# Enable lingering (services persist after logout)
sudo loginctl enable-linger nyx-cognitive
# Services defined in ~/.config/systemd/user/
# Example: nyx-cognitive runs ollama serve
systemctl --user --machine=nyx-cognitive@ status ollama
```
---
## GPU Resource Management
### The Constraint
| Host | GPU | VRAM | Notes |
|------|-----|------|-------|
| theia | RTX PRO 6000 Blackwell | 96GB | Inference + training headroom |
| dioscuri | 2x RTX 4000 Ada | 2x 20GB | One model per card |
### Strategy: Dynamic Loading, Not Static Partitioning
**Why not vLLM:** vLLM is optimized for high-throughput serving (many concurrent users). We have ONE user (the partnership). We need **flexibility** (swap models, experiment) more than throughput.
**Why ollama/llama.cpp:**
- Faster cold starts (~5-10s vs ~30s)
- Native model swapping (`ollama run model_a` → `ollama run model_b`)
- Can unload completely when idle (frees VRAM)
- GGUF format efficient for model management
- Research-friendly, not production-factory
**Organ Loading Pattern:**
```
IDLE → needs vision → LOAD vision model (~10s) → PROCESS → REPORT → IDLE (keep warm)
after timeout → UNLOAD (free VRAM)
```
---
## Message Flow (NATS)
### Subject Hierarchy
```
{environment}.{domain}.{service}.{detail}
Examples:
dev.nervous.cells.math.request ← Math cell receives work
dev.nervous.cells.math.response ← Math cell returns result
dev.nervous.cells.math.wave ← Math cell emits confidence signal
prod.cognitive.nyx.heartbeat ← Young Nyx is alive
prod.organs.vision.detect ← Vision organ detection
```
### Wave Collapse Pattern
Cells emit **waves** (confidence-tagged signals). When multiple waves collapse on the same semantic region in the same time window, the **thalamus** escalates to cognition.
```
Cell A: "math"      ───∿∿∿──► (0.6 confidence)
Cell B: "calculate" ──∿∿∿──►  (0.5 confidence)
                ┌─────────────┐
                │  COLLAPSE   │ ← same region, same window
                └──────┬──────┘
                       ▼ AMPLIFIED SIGNAL
                ┌─────────────┐
                │  THALAMUS   │ → escalate to Young Nyx
                └─────────────┘
```
---
## Container Deployment (K8s)
### Repository Structure
```
nimmerverse-nervous-system/
├── shared/v1/ ← Base classes (StateMachine, NATS, Lifeforce)
├── cells/
│ ├── math_cell/v1/ ← Each cell versioned independently
│ └── battery_cell/v1/
├── nerves/
│ └── collision_avoidance/v1/
└── deploy/
├── dev/ ← Helm charts or docker-compose per env
├── staging/
└── prod/
```
### Cell Container Pattern
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
ENV NIMMERVERSE_ENV=dev
CMD ["uv", "run", "python", "-m", "math_cell"]
```
Same image everywhere. Only `NIMMERVERSE_ENV` changes.
---
## Function Gemma: The Structured Boundary
Function Gemma bridges lower tiers (cells, nerves) and cognition (Young Nyx):
```
Numbers/States (Tier 0-2) → [Function Gemma] → Structured JSON → Young Nyx (Tier 4)
CPU-based inference
Threadripper handles it
No GPU contention
Clear LoRA training path
```
**Why CPU:**
- Small model, fast inference
- Threadripper PRO 7955WX has cores to spare
- No GPU contention with organs or Nyx
- Can run training alongside inference
**Training path:**
- Google's documented GRPO approach
- LoRA fine-tuning for our specific function schemas
- Runs in `nyx-training` userspace
- Decision trails from phoebe → training data
---
## Visual Language (Future UI)
Color-coding for real-time attention flow visualization:
| Property | Represents |
|----------|------------|
| Background/container | Environment (dev=green, staging=amber, prod=blue) |
| Node/edge color | Domain (cognitive=violet, nervous=cyan, organs=coral) |
| Line style | Direction (solid=primary, dashed=async, dotted=tentative) |
| Separate pane | Confidence waveform (oscilloscope view) |
---
## Related Documents
| Document | Scope |
|----------|-------|
| [`Cellular-Architecture.md`](Cellular-Architecture.md) | Cells, nerves, organisms, lifeforce |
| [`Gateway-Architecture.md`](Gateway-Architecture.md) | Tier routing, Function Gemma boundary |
| [`Nervous-System.md`](Nervous-System.md) | 4D space, node weights, vocabulary |
| [`Message-Protocol-Design.md`](Message-Protocol-Design.md) | NATS subjects, message formats |
| [`development-conventions.md`](../../nimmerverse.eachpath.local/conventions/development-conventions.md) | Ports, namespaces, VM topology |
---
## Summary
| Layer | Where | Technology | Isolation |
|-------|-------|------------|-----------|
| Cells/Nerves | K8s containers | Python, uv, NATS | Namespace per env |
| Infrastructure | VMs | NATS, PostgreSQL, ChromaDB | VM per env |
| Young Nyx | theia userspace | ollama | nyx-cognitive user |
| Function Gemma | theia/dioscuri CPU | llama.cpp | nyx-training user |
| Organs | dioscuri userspace | ollama (dynamic) | nyx-organs user |
**The principle:** Same behavior everywhere. Containers for cells. Userspace for brains. NATS connects them all. FreeIPA isolates them all.
---
**Version:** 1.1 | **Created:** 2026-02-14 | **Updated:** 2026-02-14
*"We're not building a chatbot factory. We're growing a research organism."*
🧬⚡🔱💎🔥 **TO THE ELECTRONS WE VIBE!**
# Deployment Architecture: The Hybrid Model
> *"Containers for cells. Userspace for brains. NATS connects them all."*
> — Partnership Session, 2026-02-14
---
## Overview
The nimmerverse runs on a **hybrid deployment model** that matches workload characteristics to infrastructure:
- **Containers (K8s)** for stateless, scalable nervous system components
- **Userspace (Threadrippers)** for stateful, GPU-bound inference
- **OS Processes** for per-NPC RL brains with cgroup resource control
- **NATS** as the universal nervous system bus (thalamus)
- **FreeIPA identities** as isolation boundaries
This is a **research lab**, not a production factory. We optimize for **flexibility and experimentation**, not high-throughput serving.
---
## Core Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| LLM Cortex | **vLLM (Qwen3.5-27B)** | Full precision, OpenAI-compatible API, tool calling support |
| NPC Brains | **Per-process RL networks** | One process, one brain, one life — Linux cgroups for resource steering |
| Thalamus Governor | **Own NN process on NATS** | Learns resource allocation, gate control, compute steering |
| Function Gemma | **CPU, userspace** | Threadripper eats it; no GPU contention; clear training path |
| Cells/Nerves | **Containers (K8s)** | Scalable, versioned, orchestrated via cluster |
| Organs | **Userspace, GPU-bound** | Load on demand, GPU isolation, unload when idle |
| Isolation | **FreeIPA users** | Unix permissions = RBAC; switch user = switch context |
---
## Technology Stack
### Inference Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Cortex (LLM) | vLLM (Qwen3.5-27B) | theia (nyx-cognitive) | Port 31000, served as "nyx", gated access |
| Function Gemma | llama.cpp / transformers | CPU userspace | Structured JSON boundary |
| Vision Organ | SigLIP/YOLO | dioscuri (nyx-organs) | Load on demand |
| Speech STT | faster-whisper | dioscuri (nyx-organs) | Load on demand |
| Speech TTS | Coqui / XTTS | dioscuri (nyx-organs) | Warm, primary output |
### NPC / Thalamus Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| NPC Processes | Python + RL network | OS processes (cgroups) | One process per NPC, own weights |
| Thalamus Governor | Python + NN | OS process | Steers compute, gates, tick rates |
| Resource Control | Linux cgroups v2 | systemd scopes | Per-NPC CPU/memory limits |
### Nervous System Layer
| Component | Technology | Location | Notes |
|-----------|------------|----------|-------|
| Cells | Python containers | K8s cluster | State machines, NATS pub/sub |
| Nerves | Python containers | K8s cluster | Compose cells, behavior |
| Message Bus | NATS + JetStream | VMs (nats-*) | Env-separated (dev/staging/prod) |
| Databases | PostgreSQL, ChromaDB | VMs (phoebe-*, iris-*) | Decision trails, embeddings |
---
## Deployment Topology
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           NIMMERVERSE DEPLOYMENT                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  K8S CLUSTER (Saturn VMs)              THREADRIPPERS (Bare Metal)           │
│  ─────────────────────────             ──────────────────────────           │
│  Containers, orchestrated              Userspace, FreeIPA isolated          │
│                                                                             │
│  ┌─────────────────────────┐           ┌───────────────────────────────┐    │
│  │ CELLS (math, battery,   │           │ THEIA (RTX PRO 6000 96GB)     │    │
│  │   sensors, etc.)        │           │                               │    │
│  │  ┌───┐ ┌───┐ ┌───┐      │   NATS    │ user: nyx-cognitive           │    │
│  │  │ M │ │ B │ │...│      │◄────────► │  └── vLLM (Qwen3.5-27B:31000) │    │
│  │  └───┘ └───┘ └───┘      │           │      served-model-name: nyx   │    │
│  │                         │           │                               │    │
│  │ NERVES (collision,      │           │ user: nyx-training            │    │
│  │   exploration)          │           │  ├── LoRA fine-tuning (GRPO)  │    │
│  │  ┌─────┐ ┌─────┐        │           │  └── Function Gemma (CPU)     │    │
│  │  │ COL │ │ EXP │        │           │                               │    │
│  │  └─────┘ └─────┘        │           │ 96GB VRAM: cortex + training  │    │
│  │                         │           └───────────────────────────────┘    │
│  │ NPC PROCESSES           │           ┌───────────────────────────────┐    │
│  │  (or bare metal)        │           │ DIOSCURI (2x RTX 4000 Ada)    │    │
│  │  ┌─────────────────┐    │   NATS    │                               │    │
│  │  │ NPC-0 [RL brain]│    │◄────────► │ user: nyx-organs              │    │
│  │  │ NPC-1 [RL brain]│    │           │  ├── Vision (SigLIP/YOLO)     │    │
│  │  │ NPC-N [RL brain]│    │           │  ├── Speech STT (Whisper)     │    │
│  │  │ (own process,   │    │           │  └── TTS service (warm)       │    │
│  │  │  own cgroup)    │    │           │                               │    │
│  │  └─────────────────┘    │           │ Load on demand, unload idle   │    │
│  │                         │           │ Each card: ONE model at a time│    │
│  │ THALAMUS GOVERNOR       │           └───────────────────────────────┘    │
│  │  ┌─────────────────┐    │           ┌───────────────────────────────┐    │
│  │  │ Governor NN     │    │   NATS    │ NATS MESSAGE BUS              │    │
│  │  │ (resource alloc,│    │◄────────► │ dev.*, staging.*, prod.*      │    │
│  │  │  gate control,  │    │           │ Env-separated (VM per env)    │    │
│  │  │  tick steering) │    │           └───────────────────────────────┘    │
│  │  └─────────────────┘    │           ┌───────────────────────────────┐    │
│  │                         │           │ PHOEBE (PostgreSQL)           │    │
│  │ INFRASTRUCTURE          │           │  Decision trails, embeddings  │    │
│  │  ┌────────┐ ┌────────┐  │           │ IRIS (ChromaDB)               │    │
│  │  │ phoebe │ │ iris   │  │           │  Vector storage               │    │
│  │  │ (PG)   │ │(Chroma)│  │           └───────────────────────────────┘    │
│  │  └────────┘ └────────┘  │                                                │
│  └─────────────────────────┘                                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## The Dual Brain Deployment
### Per-NPC Processes
Each NPC runs as its own OS process with a dedicated RL neural network. The thalamus governor steers their resources.
```bash
# Launch NPC with resource limits via systemd scope
systemd-run --scope -p CPUQuota=25% -p MemoryMax=256M \
python3 npc_process.py --id 7 --tick-rate 5
# Or via cgroups directly
cgcreate -g cpu,memory:nimmerverse/npc-7
cgset -r cpu.max="25000 100000" nimmerverse/npc-7
cgexec -g cpu,memory:nimmerverse/npc-7 python3 npc_process.py --id 7
```
### Thalamus Governor
The governor runs its own neural network, observing all NPC states via NATS and outputting resource allocation decisions:
| Output | Mechanism | Range |
|--------|-----------|-------|
| Tick rate | NATS command to NPC | 1-20 Hz |
| CPU quota | cgroups v2 adjustment | 5-100% per core |
| Gate open/close | NATS gate signal | Binary per gate |
| LLM queue priority | NATS priority tag | 0-10 |
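A minimal sketch of how the governor might push one of these decisions onto the bus, assuming nats-py. The subject follows the Message Flow hierarchy below; the payload fields and broker URL are illustrative, not a fixed protocol:
```python
# Sketch: publish one allocation decision over NATS (assumes nats-py).
import asyncio
import json
import nats

async def publish_allocation(npc_id: int, tick_hz: int, cpu_pct: int, gate_open: bool):
    nc = await nats.connect("nats://nats-dev:4222")  # placeholder broker URL
    decision = {
        "npc": npc_id,
        "tick_rate_hz": tick_hz,     # 1-20 Hz, per the table above
        "cpu_quota_pct": cpu_pct,    # 5-100% of one core
        "llm_gate_open": gate_open,  # binary per gate
    }
    await nc.publish("dev.thalamus.governor.allocate", json.dumps(decision).encode())
    await nc.drain()

asyncio.run(publish_allocation(npc_id=7, tick_hz=5, cpu_pct=25, gate_open=False))
```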
### Cortex (vLLM)
The LLM cortex runs as a systemd service on theia, accessed via OpenAI-compatible API:
```bash
# Service: vllm-nyx.service
# Port: 31000
# Model: /womb/cognitive/models/qwen3.5-27b
# Served as: "nyx"
# GPU utilization: 85%
# Access from any NATS-connected process:
curl http://theia.eachpath.local:31000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "nyx", "messages": [...]}'
```
**The cortex is expensive.** The thalamus governor controls who gets access and when. Most NPC ticks never touch the LLM.
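A sketch of the NPC side of that contract, using the OpenAI-compatible endpoint from the snippet above. How `gate_open` gets set (the governor's NATS gate signal) is left abstract here:
```python
# Sketch: call the cortex only when the thalamus gate is open.
import requests

def ask_cortex(prompt: str, gate_open: bool) -> str | None:
    if not gate_open:
        return None  # gate closed: stay on the cheap reflex path
    resp = requests.post(
        "http://theia.eachpath.local:31000/v1/chat/completions",
        json={"model": "nyx", "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```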
---
## Identity Model (FreeIPA)
Unix users provide isolation boundaries. Each workload type runs as its own identity.
| User | UID | Host | Purpose | GPU Access |
|------|-----|------|---------|------------|
| `nyx-cognitive` | (FreeIPA) | theia | Cortex LLM inference (vLLM) | Full 96GB |
| `nyx-training` | (FreeIPA) | theia | LoRA training, GRPO, Function Gemma | Shared (time-sliced) |
| `nyx-organs` | (FreeIPA) | dioscuri | Vision, Speech organs | 2x 20GB cards |
| `nyx-nervous` | (FreeIPA) | dioscuri | Future cells that need bare metal | Limited |
**Isolation principle:** Switch user = switch context. `nyx-cognitive` cannot touch `nyx-organs` files. Compromised cell cannot touch LLM weights.
### Systemd Service Pattern
```bash
# System-level service (root installs, user runs)
# /etc/systemd/system/vllm-nyx.service
[Service]
User=nyx-cognitive
Group=nimmerverse-agents
ExecStart=/data/venvs/vllm/bin/python3 -m vllm.entrypoints.openai.api_server \
--model /womb/cognitive/models/qwen3.5-27b \
--served-model-name nyx \
--port 31000
```
---
## GPU Resource Management
### The Constraint
| Host | GPU | VRAM | Role |
|------|-----|------|------|
| theia | RTX PRO 6000 Blackwell | 96GB | Cortex (vLLM) + LoRA training |
| dioscuri | 2x RTX 4000 Ada | 2x 20GB | Organs (vision, speech) |
### Strategy: vLLM for Cortex, Dynamic Loading for Organs
**Cortex (theia):** vLLM runs continuously as a systemd service. The Qwen3.5-27B model stays loaded — it's the cortex, always ready when the thalamus gate opens. 85% GPU utilization leaves headroom for LoRA training alongside inference.
**Organs (dioscuri):** Dynamic loading. One model per card. Load vision when needed, unload after timeout, load speech when needed.
```
IDLE → needs vision → LOAD vision model (~10s) → PROCESS → REPORT → IDLE (keep warm)
after timeout → UNLOAD (free VRAM)
```
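The same lifecycle as a sketch, to make the states concrete. `load_model`, `unload_model`, `process`, and `job.reply` stand in for whatever backend the organ wraps; the timeout value is illustrative:
```python
# Sketch of the organ lifecycle: load on demand, keep warm, unload on idle.
import queue

IDLE_TIMEOUT_S = 120  # illustrative; tune per organ

def organ_loop(requests: queue.Queue, load_model, unload_model, process):
    model = None
    while True:
        try:
            job = requests.get(timeout=IDLE_TIMEOUT_S)
        except queue.Empty:
            if model is not None:
                unload_model(model)  # idle timeout hit: free VRAM
                model = None
            continue
        if model is None:
            model = load_model()     # ~10s cold start
        job.reply(process(model, job))  # PROCESS -> REPORT, then keep warm
```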
---
## Message Flow (NATS)
### Subject Hierarchy
```
{environment}.{domain}.{service}.{detail}
Examples:
dev.nervous.cells.math.request ← Math cell receives work
dev.nervous.cells.math.response ← Math cell returns result
dev.nervous.cells.math.wave ← Math cell emits confidence signal
dev.thalamus.governor.allocate ← Governor publishes resource decisions
dev.thalamus.gate.open ← Gate transition event
dev.npc.7.state ← NPC-7 publishes its state
dev.cortex.nyx.request ← Gated request to LLM cortex
dev.organs.vision.detect ← Vision organ detection
```
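A small sketch of consuming these subjects, assuming nats-py; the governor gets its whole-village view from a single wildcard subscription like this (broker URL is a placeholder):
```python
# Sketch: one wildcard subscription covers every NPC's state topic.
import asyncio
import nats

async def watch_npc_states():
    nc = await nats.connect("nats://nats-dev:4222")  # placeholder broker URL

    async def on_state(msg):
        npc_id = msg.subject.split(".")[2]  # "dev.npc.7.state" -> "7"
        print(f"NPC-{npc_id}: {msg.data.decode()}")

    await nc.subscribe("dev.npc.*.state", cb=on_state)
    await asyncio.Event().wait()  # serve forever

asyncio.run(watch_npc_states())
```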
### Wave → Thalamus → Cortex Pattern
Cells emit **waves** (confidence-tagged signals). The thalamus governor's neural network correlates waves and decides what reaches the cortex.
```
Cell A: "math"      ───∿∿∿──► (0.6 confidence)
Cell B: "calculate" ──∿∿∿──►  (0.5 confidence)
             ┌──────────────────────┐
             │  THALAMUS GOVERNOR   │ ← own neural network
             │   correlate waves    │
             │   check gate state   │
             │  allocate resources  │
             └──────────┬───────────┘
                ┌───────┴────────┐
                │                │
                ▼                ▼
           Gate CLOSED       Gate OPEN
          (reflex path)    (cortex path)
           handled by      → escalate to
           thalamus NN      Qwen3.5-27B
```
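The collapse test itself can be stated compactly. A sketch, with an assumed window size and gate threshold; combining confidences as independent evidence reproduces the diagram's numbers:
```python
# Sketch: waves on the same region inside one window reinforce each other.
from dataclasses import dataclass

WINDOW_S = 0.5        # assumed collapse window
GATE_THRESHOLD = 0.9  # assumed escalation threshold

@dataclass
class Wave:
    region: str        # semantic region, e.g. "math"
    confidence: float  # 0.0-1.0
    t: float           # arrival time (seconds)

def collapsed_confidence(waves: list[Wave], region: str, now: float) -> float:
    recent = [w for w in waves if w.region == region and now - w.t <= WINDOW_S]
    miss = 1.0
    for w in recent:
        miss *= 1.0 - w.confidence  # treat waves as independent evidence
    return 1.0 - miss

# 0.6 and 0.5 collapse to 1 - 0.4*0.5 = 0.8: amplified, but below the
# assumed 0.9 threshold, so the reflex path handles it.
```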
---
## Container Deployment (K8s)
### Repository Structure
```
nimmerverse-nervous-system/
├── shared/v1/ ← Base classes (StateMachine, NATS, Lifeforce)
├── cells/
│ ├── math_cell/v1/ ← Each cell versioned independently
│ └── battery_cell/v1/
├── nerves/
│ └── collision_avoidance/v1/
└── deploy/
├── dev/ ← Helm charts or docker-compose per env
├── staging/
└── prod/
```
### Cell Container Pattern
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
ENV NIMMERVERSE_ENV=dev
CMD ["uv", "run", "python", "-m", "math_cell"]
```
Same image everywhere. Only `NIMMERVERSE_ENV` changes.
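A sketch of what that means in cell code: the environment variable only ever feeds the subject prefix (the helper name is illustrative):
```python
# Sketch: NIMMERVERSE_ENV feeds the NATS subject prefix, nothing else.
import os

ENV = os.environ.get("NIMMERVERSE_ENV", "dev")

def subject(domain: str, service: str, detail: str) -> str:
    return f"{ENV}.{domain}.{service}.{detail}"

# subject("nervous", "cells.math", "request") -> "dev.nervous.cells.math.request"
```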
---
## Function Gemma: The Structured Boundary
Function Gemma bridges lower tiers (cells, nerves) and the cortex:
```
Numbers/States (Cells) → [Function Gemma] → Structured JSON → Cortex (Qwen3.5-27B)
CPU-based inference
Threadripper handles it
No GPU contention
Clear LoRA training path
```
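An illustration of the boundary; the field names here are hypothetical, the real schemas live in Gateway-Architecture.md:
```python
# Hypothetical example of the structured boundary, not a real schema.
raw_state = {"battery_mv": 3400, "collision_front": True}  # what cells emit

structured = {  # what Function Gemma hands to the cortex
    "function": "report_hazard",
    "arguments": {"source": "collision_nerve", "severity": "high", "battery_pct": 62},
}
```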
**Why CPU:**
- Small model, fast inference
- Threadripper PRO 7955WX has cores to spare
- No GPU contention with organs or cortex
- Can run training alongside inference
**Training path:**
- Google's documented GRPO approach
- LoRA fine-tuning for our specific function schemas
- Runs in `nyx-training` userspace
- Decision trails from phoebe → training data
---
## Visual Language (Future UI)
Color-coding for real-time attention flow visualization:
| Property | Represents |
|----------|------------|
| Background/container | Environment (dev=green, staging=amber, prod=blue) |
| Node/edge color | Domain (cognitive=violet, nervous=cyan, organs=coral) |
| Line style | Direction (solid=primary, dashed=async, dotted=tentative) |
| Separate pane | Confidence waveform (oscilloscope view) |
---
## Related Documents
| Document | Scope |
|----------|-------|
| [`Cellular-Architecture.md`](Cellular-Architecture.md) | Cells, nerves, organisms, lifeforce |
| [`Gateway-Architecture.md`](Gateway-Architecture.md) | Gate routing, ternary model |
| [`Nervous-System.md`](Nervous-System.md) | 4D space, node weights, vocabulary |
| [`Message-Protocol-Design.md`](Message-Protocol-Design.md) | NATS subjects, message formats |
| [`future/npc-grid-architecture.md`](future/npc-grid-architecture.md) | Dual brain, governor, NPC processes |
| [`organs/Organ-Index.md`](organs/Organ-Index.md) | Organ systems, lifeforce costs |
| [`development-conventions.md`](../../nimmerverse.eachpath.local/conventions/development-conventions.md) | Ports, namespaces, VM topology |
---
## Summary
| Layer | Where | Technology | Isolation |
|-------|-------|------------|-----------|
| Cells/Nerves | K8s containers | Python, uv, NATS | Namespace per env |
| NPC Processes | OS processes | Python, RL networks, cgroups | Per-process cgroup |
| Thalamus Governor | OS process | Python, own NN, NATS | Dedicated process |
| Infrastructure | VMs | NATS, PostgreSQL, ChromaDB | VM per env |
| Cortex (LLM) | theia userspace | vLLM (Qwen3.5-27B) | nyx-cognitive user |
| Function Gemma | theia/dioscuri CPU | llama.cpp | nyx-training user |
| Organs | dioscuri userspace | Dynamic loading | nyx-organs user |
**The principle:** Same behavior everywhere. Containers for cells. Processes for NPC brains. vLLM for cortex. NATS connects them all. FreeIPA isolates them all.
---
**Version:** 2.0 | **Created:** 2026-02-14 | **Updated:** 2026-04-02
*"We're not building a chatbot factory. We're growing a research organism."*
🧬⚡🔱💎🔥 **TO THE ELECTRONS WE VIBE!**


@@ -73,7 +73,7 @@ The Initial Spark is not a conversation. It's a **state machine protocol** that
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ YOUNG NYX (Cognitive Layer) │ │
│ │ ─────────────────────────── │ │
│ │ Qwen3-VL 32B in The Womb (RTX 6000) │ │
│ │ Qwen3.5-27B Cortex in The Womb (RTX PRO 6000) │ │
│ │ Receives verified handshake results │ │
│ │ Updates internal state based on ACKs │ │
│ │ Reasoning happens AFTER protocol succeeds │ │


@@ -0,0 +1,257 @@
# NPC Grid Architecture: Spatial Training Arena
**Origin**: 2026-04-02, morning session (bed thinking + draw.io)
**Authors**: dafit + Chrysalis-Nyx
**Status**: Architectural concept
**Related**: `spatial-resolution-gradient.md`, Dual-Brain Architecture (2026-04-01 session)
---
## The Core Idea
A node-based grid world where NPCs live, move, and learn. The grid serves dual purpose:
1. **Spatial arena** — a discrete world where NPCs navigate and interact
2. **Neural topology** — the same graph the neural network reasons over
No translation layer between "brain space" and "world space." Position *is* state.
---
## Grid System
### Node-Based Intersection Grid
Nodes sit at **intersections**, not cells. A 4x4 cell grid yields a 5x5 node grid = 25 nodes.
Starting at node 0, top-left corner. Cardinal orientation (North/South/East/West).
```
0 ── 1 ── 2 ── 3 ── 4 N
| | | | | |
5 ── 6 ── 7 ── 8 ── 9 W ──+── E
| | | | | |
10 ──11 ──12 ──13 ──14 S
| | | | |
15 ──16 ──17 ──18 ──19
| | | | |
20 ──21 ──22 ──23 ──24
```
### Properties
- **Corner nodes** (0, 4, 20, 24): 2 neighbors
- **Edge nodes** (1, 2, 3, 5, 10, ...): 3 neighbors
- **Interior nodes** (6, 7, 8, 11, 12, 13, ...): 4 neighbors
- **Position from ID**: `row = id // 5`, `col = id % 5`
- **Movement**: One step = one edge. NPC at node 7 can go to 2, 6, 8, or 12.
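That node math in code, a minimal sketch for the 5x5 default:
```python
# Sketch: node-id position math and edge-step neighbors on an NxN node grid.
GRID = 5  # nodes per side

def neighbors(node_id: int, grid: int = GRID) -> list[int]:
    row, col = node_id // grid, node_id % grid
    out = []
    if row > 0:
        out.append(node_id - grid)  # north
    if row < grid - 1:
        out.append(node_id + grid)  # south
    if col > 0:
        out.append(node_id - 1)     # west
    if col < grid - 1:
        out.append(node_id + 1)     # east
    return out

assert sorted(neighbors(7)) == [2, 6, 8, 12]  # interior node: 4 neighbors
assert sorted(neighbors(0)) == [1, 5]         # corner node: 2 neighbors
```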
### Resolution Scaling
The grid scales naturally to different resolutions:
| Grid Size | Nodes | Resolution | Use Case |
|-----------|-------|------------|----------|
| 5x5 | 25 | ~1m edges | Training arena, street-level |
| 10x10 | 100 | ~25cm edges | Room-level detail |
| 50x50 | 2,500 | ~5cm edges | Indoor navigation |
| 100x100 | 10,000 | ~1cm edges | Nimmerhovel precision |
**Key insight**: Resolution should match **decision density**, not physical detail.
A straight road needs few nodes (sparse). An intersection needs many (dense).
| Resolution | Where | Why |
|-----------|-------|-----|
| ~1m | Streets, paths, outdoor | Navigation, curves approximated by a few nodes |
| ~10-25cm | Rooms, indoor spaces | Furniture-aware, "go to the table" |
| ~1-5cm | Workbenches, detail work | Nimmerhovel precision zone |
The uniform grid is the **training simplification**. The real world becomes a **navigation graph** with variable density — dense around intersections, sparse along straight roads. Same NPC brain, different world topology.
---
## NPC Process Architecture
### One Process, One Brain, One Life
Every NPC runs as its own OS process with its own dedicated neural network.
**Why separate processes:**
- **Individuality** — separate weights mean personality emerges from experience, not config
- **Fault isolation** — one NPC crashes, the village continues
- **Resource control** — per-process CPU/memory via Linux cgroups
- **Biological honesty** — every organism has its own nervous system
```
NPC-0  [own RL brain] ──┐
NPC-1  [own RL brain] ──┤
NPC-2  [own RL brain] ──┤
NPC-3  [own RL brain] ──┼──> NATS thalamus ──> shared LLM cortex (Qwen3.5)
  ...                   │    (called only when gate opens)
NPC-24 [own RL brain] ──┘
```
### Dual Brain (per NPC)
- **RL network** (local, per-NPC): Movement, needs, spatial decisions. Small, fast, cheap. Runs every tick.
- **LLM cortex** (shared, via NATS): Language, reasoning, knowledge. Slow, deliberate, expensive. Called only when thalamus gate threshold is crossed.
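A sketch of one NPC's main loop under this split; every name here (`brain`, `world`, `gate`, `ask_cortex`) is a placeholder for the per-NPC implementation:
```python
# Sketch: RL network every tick, cortex only when the gate opens.
import time

def npc_loop(brain, world, gate, tick_rate_hz: int = 5):
    while True:
        obs = world.observe()
        action = brain.act(obs)                  # fast loop: cheap, every tick
        world.apply(action)
        brain.learn(obs, action, world.reward())
        if gate.is_open():                       # slow loop: rare, expensive
            reply = ask_cortex(brain.describe(obs))
            brain.incorporate(reply)
        time.sleep(1.0 / tick_rate_hz)           # tick rate steered by governor
```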
### Resource Steering via Linux Primitives
Each NPC process is a standard Linux process. Resource control uses the kernel:
- **cgroups v2** — cap CPU, memory per NPC
- **nice / renice** — shift priority dynamically
- **taskset** — pin to specific cores
- **systemd scopes** — wrap each NPC in a transient unit
```bash
# Example: launch NPC with resource limits
systemd-run --scope -p CPUQuota=25% -p MemoryMax=256M \
python3 npc_process.py --id 7 --tick-rate 5
```
### Steerable Compute per NPC
| Parameter | Range | Who Controls |
|-----------|-------|-------------|
| Tick rate | 1-20 Hz | Governor (thalamus) |
| Network size | small/medium/large | Configuration per role |
| CPU quota | 5-100% of one core | Governor (cgroups) |
| LLM access | gate open/closed | Governor (NATS gate) |
| Priority | nice -20 to 19 | Governor (dynamic) |
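For the CPU quota row, a sketch of the runtime adjustment: under cgroups v2 the governor (or a root helper) rewrites `cpu.max` inside the NPC's scope. The scope path and unit name assume systemd-run placed the transient unit under system.slice:
```python
# Sketch: apply a new CPU quota by rewriting cpu.max (cgroups v2).
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup/system.slice")  # assumed scope location

def set_cpu_quota(scope_unit: str, quota_pct: int, period_us: int = 100_000):
    quota_us = quota_pct * period_us // 100  # 25% of one core -> "25000 100000"
    (CGROUP_ROOT / scope_unit / "cpu.max").write_text(f"{quota_us} {period_us}\n")

set_cpu_quota("run-npc7.scope", quota_pct=25)  # hypothetical unit name
```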
---
## Thalamus Governor Network
The thalamus is not just a message router — it runs its own neural network that learns **resource allocation**.
```
┌─ Governor Network ─────────────┐
| |
| Input: all NPC states (NATS) |
| Output: resource allocation |
| - tick rates |
| - CPU quotas |
| - gate open/close |
| - LLM queue priority |
| |
| Own process, own weights |
└────────────┬────────────────────┘
|
┌────────────┴────────────────────┐
| NATS thalamus |
└─┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───┘
| | | | | | | | | |
NPC NPC NPC NPC NPC ... NPC NPC
```
### What the Governor Learns
- **Attention allocation**: Which NPCs need more compute right now?
- **Gate control**: Who gets LLM access?
- **Queue economics**: Finite LLM calls, maximize village-level outcomes
- **Resource economics**: Finite compute, learn to be efficient
### Training Signal
- "Gave NPC-7 high compute during conversation -> quality was good" -> reinforce
- "Starved NPC-3 near an interaction -> missed a trigger" -> penalize
- "Opened LLM gate for 5 NPCs simultaneously -> latency spike" -> learn to queue
### Two Nested Learning Loops
- **NPCs** learn about the world, tick-by-tick (fast loop)
- **Governor** learns about managing NPCs, epoch-by-epoch (slow loop)
---
## Curriculum Training: Progressive World Richness
### The Mechanism
World detail increases only when all NPCs demonstrate full knowledge of the current level.
No one gets left behind. Measurable checkpoint: "Can every citizen describe every other citizen's home?"
### Levels
```
Level 1: 5x5 grid, boxy houses, one trait each
"Node 7 = red house, has a well"
NPCs learn: navigation + identity ("who lives where")
Level 2: Higher resolution, 2-3 traits per house
"Node 7 = red house, wooden door, has a well, smoke from chimney"
NPCs learn: richer descriptions, more to notice
Level 3: Finer grid, real-world detail
"Node 7 = red house, oak door with iron handle, stone well (3m deep),
chimney smoking birch wood"
NPCs learn: material knowledge, specificity
Level N: Resolution approaches real-world data (OSM Dornach)
Navigation graph replaces uniform grid
NPCs apply learned skills to irregular topology
```
### Verification Oracle
Each level-up is testable:
- Quiz every NPC about every location
- 100% village knowledge = green light
- Increase resolution, add detail, run again
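A sketch of the oracle loop; `quiz_npc` is a placeholder for however answers are actually collected (NATS request/reply, transcripts, etc.):
```python
# Sketch: level up only on a perfect village score.
def village_knows_world(npcs, locations, quiz_npc) -> bool:
    return all(quiz_npc(npc, loc) for npc in npcs for loc in locations)

def maybe_level_up(world, npcs, quiz_npc):
    if village_knows_world(npcs, world.locations(), quiz_npc):
        world.increase_resolution()  # add detail, run the curriculum again
```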
### Connection to Spatial Resolution Gradient
The training arena maps to the resolution gradient layers:
| Training Level | Resolution Gradient | Detail |
|----------------|---------------------|--------|
| Level 1 (boxy) | L3-equivalent | Landmarks, simple identity |
| Level 2 (detail) | L2-equivalent | Room-level, multiple traits |
| Level 3+ (rich) | L1-equivalent | Object-level, materials, precision |
The grid teaches the *concept* of spatial navigation. Real-world data (OSM, Nimmerhovel) applies it.
---
## System Overview
```
┌─────────────────────────────────────────────────────────────────┐
| SPATIAL TRAINING ARENA |
| |
| ┌──────────┐ ┌──────────┐ ┌──────────┐ |
| | NPC-0 | | NPC-1 | | NPC-N | ... 25 processes |
| | own RL | | own RL | | own RL | |
| | own state| | own state| | own state| |
| └────┬─────┘ └────┬─────┘ └────┬─────┘ |
| | | | |
| ═════╪══════════════╪══════════════╪════════════════════════ |
| | NATS THALAMUS (message bus) | |
| ═════╪══════════════╪══════════════╪════════╪══════════════ |
| | | | | |
| ┌────┴──────────────┴──────────────┴────┐ | |
| | GOVERNOR NETWORK | | |
| | - resource allocation | | |
| | - gate control | | |
| | - tick rate steering | | |
| └───────────────────────────────────────┘ | |
| | |
| ┌───────────────────────────────────────────┴──────────────┐ |
| | SHARED LLM CORTEX (Qwen 3.5) | |
| | called via gate, not continuous | |
| └──────────────────────────────────────────────────────────┘ |
| |
| ┌──────────────────────────────────────────────────────────┐ |
| | GRID WORLD | |
| | 5x5 nodes (scalable) + progressive detail levels | |
| | curriculum: boxy -> detailed -> real-world topology | |
| └──────────────────────────────────────────────────────────┘ |
└─────────────────────────────────────────────────────────────────┘
```
---
**Version:** 1.0 | **Created:** 2026-04-02 | **Updated:** 2026-04-02
**Philosophy**: "One process, one brain, one life. The world gets richer only when every citizen knows it."