# Nyx Model Architecture: The Dual Brain

> *"One process, one brain, one life."*
> — The Dual Brain Principle (2026-04-02)

---

## Current Architecture

The nimmerverse uses a **dual-brain architecture**: cheap RL networks handle continuous, tick-by-tick processing, while an expensive LLM cortex is reserved for deep reasoning.

### Cortex (Shared LLM)

| Property | Value |
|----------|-------|
| **Model** | Qwen3.5-27B |
| **Parameters** | 27B (full precision, bfloat16) |
| **Host** | theia (RTX PRO 6000 Blackwell, 96GB VRAM) |
| **Serving** | vLLM, port 31000, served as "nyx" |
| **Service** | `vllm-nyx.service` (systemd, user: nyx-cognitive) |
| **Access** | Gated — thalamus governor controls who gets LLM access |
| **License** | Apache 2.0 |
| **Context** | 32,768 tokens (max-model-len) |
| **GPU utilization** | 85% (leaves headroom for LoRA training) |

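For illustration, clients reach the cortex through vLLM's OpenAI-compatible HTTP API under the served model name `nyx`, not the checkpoint name. A minimal sketch of building such a request (the endpoint URL is assembled from the table above and assumed reachable; actually sending it is left out):

```python
import json

# Hypothetical endpoint, assembled from the table above (host theia, port 31000).
CORTEX_URL = "http://theia:31000/v1/chat/completions"

def cortex_request(prompt: str, max_tokens: int = 512) -> bytes:
    """Build an OpenAI-compatible chat request body for the vLLM server.

    The "model" field must be the served name ("nyx"), not "Qwen3.5-27B".
    """
    payload = {
        "model": "nyx",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload).encode("utf-8")

body = cortex_request("Describe the weather over the arena.")
```
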
**Why Qwen3.5-27B:**
- True base model — we shape every behavior through training
- 27B fits comfortably in 96GB with room for LoRA adapters
- Apache 2.0 — full sovereignty, no usage restrictions
- Strong multilingual capability (German + English topology access)
- Vision-capable variant available for future Omnisight consolidation

**The cortex is expensive.** It is not called every tick. The thalamus governor decides when language, reasoning, or deep knowledge is needed. Most NPC processing happens in cheap RL networks.
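The gate itself can be pictured as a small predicate over NPC state. Everything below (field names, thresholds) is a hypothetical sketch, not the governor's real learned policy:

```python
from dataclasses import dataclass

@dataclass
class NPCState:
    """Snapshot of one NPC as seen by the thalamus (illustrative fields)."""
    npc_id: str
    novelty: float         # 0..1, how unfamiliar the current situation is
    needs_language: bool   # tick requires dialogue or verbal reasoning
    rl_confidence: float   # confidence of the NPC's own RL policy

def gate_to_cortex(state: NPCState, novelty_threshold: float = 0.7,
                   confidence_floor: float = 0.4) -> bool:
    """Open the LLM gate only when the cheap RL brain is not enough.

    Most ticks stay in the RL network; the cortex is reserved for
    language, high novelty, or low policy confidence.
    """
    if state.needs_language:
        return True
    if state.novelty > novelty_threshold:
        return True
    return state.rl_confidence < confidence_floor

# Routine movement tick: stays in the cheap RL brain.
routine = NPCState("npc-17", novelty=0.1, needs_language=False, rl_confidence=0.9)
# Dialogue tick: escalates to the cortex.
dialogue = NPCState("npc-17", novelty=0.1, needs_language=True, rl_confidence=0.9)
```
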

### NPC Brains (Per-Process RL Networks)

Each NPC runs its own lightweight neural network in its own OS process:

| Property | Value |
|----------|-------|
| **Architecture** | Small RL network (movement, needs, spatial decisions) |
| **Deployment** | One Linux process per NPC |
| **Resource control** | cgroups v2 (CPU, memory per process) |
| **Learning** | Tick-by-tick (fast loop) |
| **Cost** | Cheap — runs on CPU, no GPU needed |

Personality emerges from experience, not configuration. Each NPC develops its own weights.
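As a stand-in for the real per-NPC network, the fast loop can be pictured as tabular Q-learning; the states, actions, and hyperparameters here are illustrative only:

```python
import random
from collections import defaultdict

class NPCBrain:
    """Tiny per-process RL brain: CPU-only, updated every tick."""

    def __init__(self, actions, lr=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.actions = actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def act(self, state):
        """Epsilon-greedy action choice for the current tick."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        """One tick of temporal-difference update."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.lr * (target - self.q[(state, action)])

brain = NPCBrain(actions=["move", "eat", "rest"])
brain.learn("hungry", "eat", reward=1.0, next_state="sated")
```

Each NPC's `q` table (its weights) diverges with its own experience, which is the mechanism behind "personality emerges from experience, not configuration."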

### Thalamus Governor (Resource Allocation NN)

The thalamus runs its own neural network that learns resource allocation:

| Property | Value |
|----------|-------|
| **Function** | Gate control, compute steering, LLM queue priority |
| **Input** | All NPC states via NATS |
| **Output** | Tick rates, CPU quotas, gate open/close, LLM priority |
| **Learning** | Epoch-by-epoch (slow loop) |

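Applying a governor decision through cgroups v2 reduces to writing control files; a sketch with a hypothetical cgroup hierarchy and helper names (only the quota formatting is exercised here, since writing requires the hierarchy to exist):

```python
from pathlib import Path

# Hypothetical per-NPC cgroup hierarchy; the real layout lives elsewhere.
CGROUP_ROOT = Path("/sys/fs/cgroup/npcs")

def cpu_max_value(quota_pct: float, period_us: int = 100_000) -> str:
    """Render a cgroup v2 cpu.max line: '<quota_us> <period_us>'.

    quota_pct is the fraction of one CPU the governor grants this NPC.
    """
    quota_us = round(quota_pct * period_us)
    return f"{quota_us} {period_us}"

def apply_cpu_quota(npc_id: str, quota_pct: float) -> None:
    """Write the quota into one NPC's cgroup (needs root and an existing cgroup)."""
    (CGROUP_ROOT / npc_id / "cpu.max").write_text(cpu_max_value(quota_pct))

# A 35% share of one core:
line = cpu_max_value(0.35)  # "35000 100000"
```
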
### Structured Output Boundary

| Model | Role | Host |
|-------|------|------|
| **Function Gemma** | Intent → Action (100% predictable JSON) | CPU userspace (Threadripper) |
| **T5Gemma 2 (SigLIP)** | Vision → Vectors (no text bottleneck) | dioscuri |

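What "100% predictable JSON" buys is that actions can be validated mechanically before anything downstream runs; a sketch against a hypothetical action schema (the real schema is not defined in this document):

```python
import json

# Hypothetical schema for the intent -> action boundary.
REQUIRED_FIELDS = {"action": str, "target": str, "params": dict}

def parse_action(raw: str) -> dict:
    """Parse and validate a structured action; reject anything off-schema.

    Downstream code never sees free text, only validated fields.
    """
    obj = json.loads(raw)
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(obj.get(field), ftype):
            raise ValueError(f"bad or missing field: {field}")
    return obj

action = parse_action('{"action": "move_to", "target": "well", "params": {"speed": 1}}')
```
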
---

## Model Selection History

| Date | Decision | Reasoning |
|------|----------|-----------|
| 2025-11 | LLaMA 3 70B considered | Early exploration, different hardware |
| 2025-12 | Qwen3-VL 32B selected | Vision capability, multilingual, fits 96GB |
| 2026-04-01 | Mistral-Small-3.1-24B-Base tested | "Raw clay" approach, but thinking-bleed was SkyrimNet-specific |
| 2026-04-01 | **Qwen3.5-27B reinstated** | Best balance of capability, size, and trainability |

**The model question is settled.** Qwen3.5-27B is nyx's cortex. Training focus shifts to LoRA traits (GRPO) and the RL networks (per-NPC).

---

## Trait LoRAs (Cortex Specialization)

Traits evolve as LoRA adapters on the Qwen3.5-27B base, trained through GRPO with gate-verified rewards:

| Trait | Domain | Training Signal |
|-------|--------|-----------------|
| **Mnemosyne** | Memory | +reward when recall matches phoebe |
| **Moira** | Pattern | +reward when prediction succeeds |
| **Synesis** | Resources | +reward when estimates are accurate |
| **Aletheia** | Truth | +reward when confidence is calibrated |
| **Sophrosyne** | Balance | +reward for graceful degradation |
| **Kairos** | Timing | +reward when timing is optimal |
| **Philotes** | Bond | +reward from dafit feedback |
| **Dikaiosyne** | Fairness | +reward when resources are shared fairly |

**Consolidation path:** Traits train during slumber → GRPO updates → DriftProbe validates → merge at α=0.3 → eventually bake into base weights.
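The merge at α=0.3 amounts to a scaled addition of the adapter delta into the base weights; a minimal sketch with plain lists standing in for weight tensors (assuming the low-rank product has already been materialized):

```python
def merge_lora(base, lora_delta, alpha=0.3):
    """Blend a LoRA delta into base weights: W' = W + alpha * delta.

    alpha=0.3 keeps the trait's influence partial until DriftProbe
    validates it and the trait is eventually baked into the base.
    """
    return [w + alpha * d for w, d in zip(base, lora_delta)]

base = [1.0, 0.0, -2.0]
delta = [0.5, 1.0, 0.0]
merged = merge_lora(base, delta)  # [1.15, 0.3, -2.0], up to float rounding
```
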

**Detail:** → [Nyx_Traits.md](Nyx_Traits.md) | [Endgame-Vision.md](../Endgame-Vision.md)

---

## Infrastructure

| Component | Host | GPU | Storage |
|-----------|------|-----|---------|
| Cortex (vLLM) | theia | RTX PRO 6000 (96GB) | `/womb/cognitive/models/qwen3.5-27b` |
| LoRA Training | theia | Shared (time-sliced) | `/womb/cognitive/loras/` |
| Organs | dioscuri | 2x RTX 4000 Ada (40GB) | Dynamic loading |
| NPC Brains | K8s / bare metal | CPU | Per-process |

**Canonical paths** are exposed via `/womb/` symlinks. Phoebe is the source of truth for artifact locations.

**Detail:** → [Deployment-Architecture.md](../architecture/Deployment-Architecture.md) | [womb-architecture.md](../../nimmerverse.eachpath.local/storage/womb-architecture.md)

---

## Related Documentation

- [Nyx_Traits.md](Nyx_Traits.md) - Trait definitions, mythological framing
- [Metamorphosis-Substrate-Philosophy.md](Metamorphosis-Substrate-Philosophy.md) - Identity anchors
- [Endgame-Vision.md](../Endgame-Vision.md) - Architecture overview (v8.0)
- [npc-grid-architecture.md](../architecture/future/npc-grid-architecture.md) - Dual brain, governor, spatial arena

---

**Version:** 3.0 | **Created:** 2025-11-07 | **Updated:** 2026-04-02

🌙💜 *The cortex reasons. The RL brains act. The thalamus decides who gets what.*