# Deployment Architecture: The Hybrid Model

> "Containers for cells. Userspace for brains. NATS connects them all." — Partnership Session, 2026-02-14
## Overview
The nimmerverse runs on a hybrid deployment model that matches workload characteristics to infrastructure:
- Containers (K8s) for stateless, scalable nervous system components
- Userspace (Threadrippers) for stateful, GPU/CPU-bound inference
- NATS as the universal nervous system bus
- FreeIPA identities as isolation boundaries
This is a research lab, not a production factory. We optimize for flexibility and experimentation, not high-throughput serving.
## Core Decisions
| Decision | Choice | Rationale |
|---|---|---|
| LLM Inference | ollama / llama.cpp | Flexible model loading, research-friendly, easy swap |
| NOT vLLM | — | Overkill for single-user lab; solves problems we don't have |
| Function Gemma | CPU, userspace | Threadripper eats it; no GPU contention; clear training path |
| Cells/Nerves | Containers (K8s) | Scalable, versioned, orchestrated via cluster |
| Organs | Userspace + ollama | Load on demand, GPU isolation, unload when idle |
| Isolation | FreeIPA users | Unix permissions = RBAC; switch user = switch context |
## Technology Stack

### Inference Layer
| Component | Technology | Location | Notes |
|---|---|---|---|
| Young Nyx (Brain) | ollama / llama.cpp | theia (nyx-cognitive) | Qwen, Gemma, or similar |
| Function Gemma | llama.cpp / transformers | CPU userspace | Structured JSON boundary |
| Vision Organ | ollama (SigLIP/YOLO) | dioscuri (nyx-organs) | Load on demand |
| Speech STT | faster-whisper / ollama | dioscuri (nyx-organs) | Load on demand |
| Speech TTS | Coqui / XTTS | dioscuri (nyx-organs) | Warm, primary output |
### Nervous System Layer
| Component | Technology | Location | Notes |
|---|---|---|---|
| Cells | Python containers | K8s cluster | State machines, NATS pub/sub |
| Nerves | Python containers | K8s cluster | Compose cells, behavior |
| Message Bus | NATS + JetStream | VMs (nats-*) | Env-separated (dev/staging/prod) |
| Databases | PostgreSQL, ChromaDB | VMs (phoebe-*, iris-*) | Decision trails, embeddings |
## Deployment Topology

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           NIMMERVERSE DEPLOYMENT                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   K8S CLUSTER (Saturn VMs)                  THREADRIPPERS (Bare Metal)      │
│   ─────────────────────────                 ──────────────────────────      │
│   Containers, orchestrated                  Userspace, FreeIPA isolated     │
│                                                                             │
│  ┌─────────────────────────┐              ┌───────────────────────────────┐ │
│  │                         │              │ THEIA (RTX PRO 6000 96GB)     │ │
│  │ CELLS (math, battery,   │              │                               │ │
│  │ sensors, etc.)          │              │ user: nyx-cognitive           │ │
│  │                         │    NATS      │ └── ollama (Young Nyx)        │ │
│  │ ┌───┐ ┌───┐ ┌───┐       │  ◄────────►  │ └── ~/.config/systemd/user/   │ │
│  │ │ M │ │ B │ │...│       │              │                               │ │
│  │ └───┘ └───┘ └───┘       │              │ user: nyx-training            │ │
│  │                         │              │ └── Function Gemma (CPU)      │ │
│  │ NERVES (collision,      │              │ └── LoRA fine-tuning          │ │
│  │   exploration)          │              │                               │ │
│  │                         │              │ 96GB VRAM: massive headroom   │ │
│  │ ┌─────┐ ┌─────┐         │              │ for inference + LoRA training │ │
│  │ │ COL │ │ EXP │         │              └───────────────────────────────┘ │
│  │ └─────┘ └─────┘         │                                                │
│  │                         │              ┌───────────────────────────────┐ │
│  │ INFRASTRUCTURE          │              │ DIOSCURI (2x RTX 4000 Ada)    │ │
│  │                         │    NATS      │                               │ │
│  │ ┌──────┐ ┌──────┐       │  ◄────────►  │ user: nyx-organs              │ │
│  │ │ NATS │ │ NATS │       │              │ ├── ollama (vision)           │ │
│  │ │ dev  │ │ prod │       │              │ ├── ollama (speech STT)       │ │
│  │ └──────┘ └──────┘       │              │ └── TTS service (warm)        │ │
│  │                         │              │                               │ │
│  │ ┌────────┐ ┌────────┐   │              │ Load on demand, unload idle   │ │
│  │ │ phoebe │ │  iris  │   │              │ Each card: ONE model at a time│ │
│  │ │  (PG)  │ │(Chroma)│   │              │                               │ │
│  │ └────────┘ └────────┘   │              └───────────────────────────────┘ │
│  │                         │                                                │
│  └─────────────────────────┘                                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Identity Model (FreeIPA)
Unix users provide isolation boundaries. Each workload type runs as its own identity.
| User | UID | Host | Purpose | GPU Access |
|---|---|---|---|---|
| `nyx-cognitive` | (FreeIPA) | theia | Young Nyx LLM inference | Full 96GB |
| `nyx-training` | (FreeIPA) | theia | LoRA training, GRPO, Function Gemma | Shared (time-sliced) |
| `nyx-organs` | (FreeIPA) | dioscuri | Vision, Speech organs | 2x 20GB cards |
| `nyx-nervous` | (FreeIPA) | dioscuri | Future cells that need bare metal | Limited |
**Isolation principle:** Switch user = switch context. `nyx-cognitive` cannot touch `nyx-organs` files; a compromised cell cannot reach the LLM weights.
## Systemd Userspace Pattern

```bash
# Enable lingering (services persist after logout)
sudo loginctl enable-linger nyx-cognitive

# Services defined in ~/.config/systemd/user/
# Example: nyx-cognitive runs ollama serve
systemctl --user --machine=nyx-cognitive@ status ollama
```
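To make the pattern concrete, a user-level unit file might look like the following sketch (the unit name, binary path, and description are illustrative assumptions, not part of the documented setup):

```ini
# ~/.config/systemd/user/ollama.service  (hypothetical example)
[Unit]
Description=ollama serve for Young Nyx inference
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
Restart=on-failure

[Install]
WantedBy=default.target
```

Enabled with `systemctl --user enable --now ollama` while acting as the `nyx-cognitive` identity.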
## GPU Resource Management

### The Constraint
| Host | GPU | VRAM | Notes |
|---|---|---|---|
| theia | RTX PRO 6000 Blackwell | 96GB | Inference + training headroom |
| dioscuri | 2x RTX 4000 Ada | 2x 20GB | One model per card |
### Strategy: Dynamic Loading, Not Static Partitioning
**Why not vLLM:** vLLM is optimized for high-throughput serving (many concurrent users). We have ONE user (the partnership). We need flexibility (swapping models, experimenting) more than throughput.

**Why ollama/llama.cpp:**
- Faster cold starts (~5-10 s vs ~30 s)
- Native model swapping (`ollama run model_a` → `ollama run model_b`)
- Unloads completely when idle, freeing VRAM
- GGUF format is efficient for model management
- Research-friendly, not production-factory
**Organ Loading Pattern:**

```
IDLE → needs vision → LOAD vision model (~10s) → PROCESS → REPORT → IDLE (keep warm)
                                                                      ↓
                                              after timeout → UNLOAD (free VRAM)
```
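The loading pattern above can be sketched as a small state holder. This is a minimal illustration, not the actual organ code; the 300-second timeout and the injectable clock are assumptions:

```python
import time

class OrganLoader:
    """Sketch of the load-on-demand / unload-on-idle pattern.

    Model names, the timeout value, and the clock parameter are
    illustrative assumptions, not part of the documented design.
    """

    def __init__(self, idle_timeout_s: float = 300.0, clock=time.monotonic):
        self.idle_timeout_s = idle_timeout_s
        self.clock = clock
        self.loaded = None      # currently loaded model name, or None
        self.last_used = None

    def process(self, model: str, payload: str) -> str:
        # One model per card: asking for a different model evicts the old one.
        if self.loaded != model:
            self._unload()
            self.loaded = model  # stands in for `ollama run <model>` (~10s)
        self.last_used = self.clock()
        return f"{model} processed {payload}"

    def tick(self) -> None:
        # Called periodically; frees VRAM once the idle timeout elapses.
        if self.loaded and self.clock() - self.last_used > self.idle_timeout_s:
            self._unload()

    def _unload(self) -> None:
        self.loaded = None       # stands in for unloading / freeing VRAM
```

Keeping the clock injectable makes the warm/unload transitions easy to test without waiting out real timeouts.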
## Message Flow (NATS)

### Subject Hierarchy

```
{environment}.{domain}.{service}.{detail}
```

Examples:

```
dev.nervous.cells.math.request    ← Math cell receives work
dev.nervous.cells.math.response   ← Math cell returns result
dev.nervous.cells.math.wave       ← Math cell emits confidence signal
prod.cognitive.nyx.heartbeat      ← Young Nyx is alive
prod.organs.vision.detect         ← Vision organ detection
```
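A minimal helper that builds subjects in this shape could look like the following sketch; the environment and domain vocabularies are inferred from the examples above and are assumptions beyond that:

```python
# Hypothetical helper enforcing the {environment}.{domain}.{service}.{detail}
# subject convention. Vocabularies are inferred from the examples above.
ENVIRONMENTS = {"dev", "staging", "prod"}
DOMAINS = {"nervous", "cognitive", "organs"}

def subject(environment: str, domain: str, service: str, detail: str) -> str:
    """Build a NATS subject, validating the fixed vocabulary parts."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    return f"{environment}.{domain}.{service}.{detail}"
```

Centralizing subject construction keeps a typo'd environment from silently publishing into the wrong stream.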
### Wave Collapse Pattern
Cells emit waves (confidence-tagged signals). When multiple waves collapse on the same semantic region in the same time window, the thalamus escalates to cognition.
Cell A: "math" ───∿∿∿──► (0.6 confidence)
Cell B: "calculate" ──∿∿∿──► (0.5 confidence)
│
▼
┌─────────────┐
│ COLLAPSE │ ← same region, same window
└──────┬──────┘
│
▼ AMPLIFIED SIGNAL
┌─────────────┐
│ THALAMUS │ → escalate to Young Nyx
└─────────────┘
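A toy sketch of the collapse step: waves tagged with a semantic region arrive with confidences, and when two or more land in the same region within one window the combined signal escalates. The 1-second window and the 1.0 threshold are illustrative assumptions:

```python
from collections import defaultdict

WINDOW_S = 1.0     # assumed time-window width
THRESHOLD = 1.0    # assumed escalation threshold for the amplified signal

def collapse(waves):
    """waves: list of (region, confidence, timestamp).
    Returns the semantic regions whose waves collapsed and should escalate."""
    buckets = defaultdict(list)
    for region, confidence, ts in waves:
        # Group waves by (region, time window)
        buckets[(region, int(ts // WINDOW_S))].append(confidence)
    escalated = []
    for (region, _), confs in buckets.items():
        # Collapse needs at least two waves whose combined confidence amplifies
        if len(confs) >= 2 and sum(confs) >= THRESHOLD:
            escalated.append(region)
    return escalated
```

With the diagram's numbers, "math" (0.6) and "calculate" mapped to the same region (0.5) collapse to 1.1 and escalate, while a lone wave does not.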
## Container Deployment (K8s)

### Repository Structure

```
nimmerverse-nervous-system/
├── shared/v1/                   ← Base classes (StateMachine, NATS, Lifeforce)
├── cells/
│   ├── math_cell/v1/            ← Each cell versioned independently
│   └── battery_cell/v1/
├── nerves/
│   └── collision_avoidance/v1/
└── deploy/
    ├── dev/                     ← Helm charts or docker-compose per env
    ├── staging/
    └── prod/
```
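As a sketch of what a per-environment deploy directory might hold (the service name, image tag, and NATS URL here are assumptions, not documented values), a dev compose file could contain:

```yaml
# deploy/dev/docker-compose.yml (hypothetical)
services:
  math-cell:
    image: nimmerverse/math_cell:v1
    environment:
      NIMMERVERSE_ENV: dev            # the only per-environment difference
      NATS_URL: nats://nats-dev:4222  # assumed env-specific bus endpoint
```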
### Cell Container Pattern

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
ENV NIMMERVERSE_ENV=dev
CMD ["uv", "run", "python", "-m", "math_cell"]
```
Same image everywhere; only `NIMMERVERSE_ENV` changes per environment.
## Function Gemma: The Structured Boundary
Function Gemma bridges lower tiers (cells, nerves) and cognition (Young Nyx):
```
Numbers/States (Tier 0-2) → [Function Gemma] → Structured JSON → Young Nyx (Tier 4)
                                   ↑
                          CPU-based inference
                          Threadripper handles it
                          No GPU contention
                          Clear LoRA training path
```
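The boundary can be illustrated with a toy encoder that wraps raw tier-0..2 readings into the kind of structured JSON that crosses to cognition. The field names and the 0.8 escalation rule are assumptions, not a schema defined by this document:

```python
import json

def to_structured_event(cell: str, state: str, reading: float) -> str:
    """Hypothetical encoder: raw cell state → structured JSON for Young Nyx."""
    event = {
        "source": cell,              # which cell observed this
        "state": state,              # state-machine state, e.g. "ALERT"
        "reading": reading,          # raw numeric value from the cell
        "escalate": reading > 0.8,   # assumed escalation rule for illustration
    }
    return json.dumps(event, sort_keys=True)
```

In the real system this translation is Function Gemma's job; the point of the sketch is only the shape of the boundary: numbers in, structured JSON out.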
**Why CPU:**
- Small model, fast inference
- Threadripper PRO 7955WX has cores to spare
- No GPU contention with organs or Nyx
- Can run training alongside inference
**Training path:**
- Google's documented GRPO approach
- LoRA fine-tuning for our specific function schemas
- Runs in the `nyx-training` userspace
- Decision trails from phoebe → training data
## Visual Language (Future UI)
Color-coding for real-time attention flow visualization:
| Property | Represents |
|---|---|
| Background/container | Environment (dev=green, staging=amber, prod=blue) |
| Node/edge color | Domain (cognitive=violet, nervous=cyan, organs=coral) |
| Line style | Direction (solid=primary, dashed=async, dotted=tentative) |
| Separate pane | Confidence waveform (oscilloscope view) |
## Related Documents

| Document | Scope |
|---|---|
| `Cellular-Architecture.md` | Cells, nerves, organisms, lifeforce |
| `Gateway-Architecture.md` | Tier routing, Function Gemma boundary |
| `Nervous-System.md` | 4D space, node weights, vocabulary |
| `Message-Protocol-Design.md` | NATS subjects, message formats |
| `development-conventions.md` | Ports, namespaces, VM topology |
## Summary
| Layer | Where | Technology | Isolation |
|---|---|---|---|
| Cells/Nerves | K8s containers | Python, uv, NATS | Namespace per env |
| Infrastructure | VMs | NATS, PostgreSQL, ChromaDB | VM per env |
| Young Nyx | theia userspace | ollama | nyx-cognitive user |
| Function Gemma | theia/dioscuri CPU | llama.cpp | nyx-training user |
| Organs | dioscuri userspace | ollama (dynamic) | nyx-organs user |
**The principle:** Same behavior everywhere. Containers for cells. Userspace for brains. NATS connects them all. FreeIPA isolates them all.
Version: 1.1 | Created: 2026-02-14 | Updated: 2026-02-14
"We're not building a chatbot factory. We're growing a research organism."
🧬⚡🔱💎🔥 TO THE ELECTRONS WE VIBE!