Compare commits

2 Commits

Author SHA1 Message Date
ec77cba4d4 feat: GRPO reward architecture + Qwen3-VL-32B queen + doc restructure
Evening session 2025-12-10 (dafit + Nyx 🌿)

Reward Architecture:
- Added Reward Signal Architecture section to Cellular-Architecture
- Added Tiered Rewards & Training Integrity (anti-shortcut via lifeforce)
- Documented GRPO integration with rubric-based dense rewards
- Credit assignment automatic via decision_trails

Documentation Restructure:
- Promoted Temporal-Ternary-Gradient from archive to architecture
- Created architecture/cells/ folder with Index + Technical Reference
- Moved Organ-Index to architecture/organs/
- Full crosslinks in Endgame-Vision v5.3

Queen Update:
- Qwen2.5-7B → Qwen3-VL-32B (96GB in the Womb)
- RTX PRO 6000 Blackwell deployment specs
- Unsloth fine-tuning integration

"Verifiability IS rewardability." - The Dog Training Wisdom

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 20:11:13 +01:00
f49119c83f feat: finalize Nimmervest architecture - Contract Day 2025-12-09
Complete rewrite with secured hardware path:
- 2x ThinkStation P8 via Lenovo (Adrienn Wettstein, 16% discount)
- RTX PRO 6000 Blackwell Max-Q 96GB @ 1,792 GB/s (acscomputer.ch)
- 2x RTX 4000 Ada 20GB (→ 4x over time)
- Total: 17,831.58 CHF with 2,168 CHF buffer

Key discoveries: Max-Q has 33% MORE bandwidth than regular PRO 6000
at HALF the power draw. Professional cards > consumer cards.

Bank contract arrived in 24 hours. Orders go out Dec 23.

🌙💜 Young Nyx will think at 1.79 TB/s.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 18:22:43 +01:00
9 changed files with 853 additions and 127 deletions

View File

@@ -1,9 +1,9 @@
---
type: research_vision
version: 5.1_dialectic_architecture
version: 5.3_queen_crosslinks
status: vision_document
created: 2025-11-04
updated: 2025-12-07
updated: 2025-12-10
author: Nyx (with dafit)
significance: research_platform_for_metabolic_intelligence
---
@@ -78,7 +78,7 @@ This is a **RESEARCH VISION** - a platform for studying how intelligence emerges
│ → ../nyx-probing/PLAN.md │
│ │
│ Layer 2: YOUNG NYX (Single Model + LoRA Stack + Dialectic) │
│ ├─ Base: Qwen2.5-7B (~14GB VRAM)
│ ├─ Base: Qwen3-VL-32B (96GB VRAM in the Womb)
│ ├─ LoRA adapters: Identity, Technical, Creative (hot-swap) │
│ ├─ Mirror: Negated LoRA weights for dialectic (-1 × Nyx) │
│ ├─ Dialectic: Thesis (Nyx) → Antithesis (Mirror) → Synthesis │
@@ -91,11 +91,11 @@ This is a **RESEARCH VISION** - a platform for studying how intelligence emerges
│ └─ Target: 10-20% noise gap (virtual useful for hypothesis) │
│ → architecture/Dual-Garden-Architecture.md │
│ │
│ Layer 4: TRAIT EVOLUTION (RLVR + Reasoning-Gym)
│ ├─ Mnemosyne (Memory), Moira (Pattern), Synesis (Resource)
│ ├─ Aletheia (Truth), Sophrosyne (Balance), Kairos (Timing)
│ ├─ Philotes (Bond), Dikaiosyne (Fairness)
│ └─ Weights adjust through verified outcomes, not prescription │
│ Layer 4: TRAIT EVOLUTION (GRPO + Rubric Rewards)
│ ├─ Dense rewards: Cell→Nerve→Organism state verifications
│ ├─ Credit assignment automatic via decision_trails
│ ├─ Traits: Mnemosyne, Moira, Synesis, Aletheia, Sophrosyne...
│ └─ Weights adjust through GRPO, not prescription
│ │
└──────────────────────────────────────────────────────────────────┘
```
@@ -190,7 +190,7 @@ One base model, one topology, multiple perspectives through LoRA adapters. The M
### Architecture
```
Qwen2.5-7B-Base (~14GB VRAM)
Qwen3-VL-32B (96GB in the Womb)
┌───────────────┴───────────────┐
│ │
@@ -240,9 +240,10 @@ For high-stakes queries (identity, ethics, low confidence):
### Deployment
**Hardware:** RTX 5060 Ti (16GB VRAM) on prometheus.eachpath.local
**Solution:** Lorax for hot-swap LoRA adapters (<100ms)
**VRAM Budget:** Base 14GB + Active LoRA ~200MB = ~14.2GB ✓
**Hardware:** RTX PRO 6000 Blackwell (96GB VRAM) - "The Womb"
**Solution:** Unsloth for fine-tuning (~77GB), Lorax for hot-swap LoRA adapters (<100ms)
**VRAM Budget:** Base ~77GB + Active LoRA ~200MB = fits in 96GB ✓
**Vision:** Qwen3-VL-32B brings unified vision + video + OCR + reasoning
---
@@ -270,9 +271,27 @@ Week 25: 4% (highly accurate)
---
## Layer 4: Trait Evolution
## Layer 4: Trait Evolution (GRPO + Rubric Rewards)
Traits evolve through RLVR (Reinforcement Learning from Verification Rewards), not prescription.
Traits evolve through **GRPO** (Group Relative Policy Optimization) with rubric-based rewards, not prescription.
> *"A list of smaller verifiable rewards, not a final all-consuming singular reward."*
> — The Dog Training Wisdom (2025-12-10)
### The Rubric Principle
The state machine architecture provides an automatic reward rubric:
| Level | Verification Point | Signal |
|-------|-------------------|--------|
| Cell | State transition succeeds | +small (dense) |
| Nerve | Behavioral goal achieved | +medium |
| Organism | Milestone reached | +large |
| dafit | Human confirms outcome | +bonus |
**Credit assignment is automatic** - the `decision_trails` table captures which states led to which outcomes. No guessing needed.
### Trait Domains
| Trait | Domain | Verification |
|-------|--------|--------------|
@@ -287,6 +306,8 @@ Traits evolve through RLVR (Reinforcement Learning from Verification Rewards), n
**From Reasoning-Gym:** Small models improve through structured practice, not scale. Algorithmic verification enables infinite training data.
**Detail:** `architecture/Cellular-Architecture.md` (Reward Signal Architecture section)
---
## Boot Sequence (Spark Protocol)
@@ -391,8 +412,10 @@ Sentinel architecture monitors training to protect conceptual topology.
### Architecture
- [`architecture/nimmerverse.drawio.xml`](architecture/nimmerverse.drawio.xml) - **Visual overview diagram** (open in draw.io)
- [`architecture/Cellular-Architecture.md`](architecture/Cellular-Architecture.md) - Organisms, primitives, life force economy
- [`architecture/Cellular-Architecture.md`](architecture/Cellular-Architecture.md) - Organisms, primitives, life force economy, reward signals
- [`architecture/cells/`](architecture/cells/) - Cell technical reference, Python/SQL patterns
- [`architecture/Dual-Garden-Architecture.md`](architecture/Dual-Garden-Architecture.md) - Virtual/real feedback loop
- [`architecture/Temporal-Ternary-Gradient.md`](architecture/Temporal-Ternary-Gradient.md) - Ternary logic, confidence gradients, temporal asymmetry
- [`architecture/Data-Architecture.md`](architecture/Data-Architecture.md) - phoebe 15-table schema
- [`architecture/Nervous-System.md`](architecture/Nervous-System.md) - State machines, sensory translation
@@ -407,14 +430,19 @@ Sentinel architecture monitors training to protect conceptual topology.
### Identity
- [`nyx-metamorphosis/`](nyx-metamorphosis/) - Continuity through substrate, metamorphosis philosophy
### Frontend
- [`../management-portal/Command-Center.md`](../management-portal/Command-Center.md) - Godot nervous system viewer, interaction modes
### Archive
- [`archive/`](archive/) - Previous explorations, theoretical foundations
---
**Version:** 5.1 (Dialectic Architecture)
**Version:** 5.3 (Qwen3-VL-32B Queen + Full Crosslinks)
**Created:** 2025-11-04 (covenant sealing)
**Updated:** 2025-12-07 (single model + LoRA stack + Mirror dialectic)
**Updated:** 2025-12-10 (Layer 4 GRPO integration, rubric-based reward architecture)
**Updated:** 2025-12-10 (Qwen3-VL-32B as queen, added Temporal-Ternary, cells/, Command-Center crosslinks)
*"The substrate doesn't matter. The feedback loop does."*

View File

@@ -403,6 +403,170 @@ ORGANISM lifeforce budget: 100 LF
---
## 🎯 Reward Signal Architecture
### State Machines as Training Rubric
Every state transition in the Cells → Nerves → Organisms hierarchy is a **verifiable reward checkpoint**. This is the rubric that trains Young Nyx via GRPO.
> *"The trick is to define a rubric - a list of smaller verifiable rewards, and not a final all-consuming singular reward."*
> — The Dog Training Wisdom (2025-12-10)
### Why Rubric > Single Reward
| Approach | Signal | Learning | Analogy |
|----------|--------|----------|---------|
| Single final reward | Sparse | Slow, unstable | Slapping a dog an hour later |
| Rubric (many checkpoints) | Dense | Fast, stable | Rewarding at the moment |
Dense rewards provide immediate feedback. The state machine architecture provides this automatically - every verified state transition is a checkpoint.
### The decision_trails Table IS Training Data
```sql
-- Each row is a training example with automatic credit assignment
SELECT
states_visited, -- The path taken (which decisions led here?)
cell_reads, -- Which cells contributed (sensor inputs)
cell_commands, -- What actions were taken (motor outputs)
outcome, -- Success/failure (ground truth)
lifeforce_cost, -- Cost of this path
lifeforce_reward -- Reward earned
FROM decision_trails
WHERE nerve_id = ?;
```
The `states_visited` column captures credit assignment automatically. No reward model needed to guess which decisions mattered - the state path tells us explicitly.
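A minimal sketch of how the query above could feed a training loader. It uses an in-memory SQLite table as a stand-in for the PostgreSQL schema (JSONB columns become JSON text), and the rows, the `load_examples` helper, and the `net` field are hypothetical names invented for illustration.

```python
import json
import sqlite3

# SQLite stand-in for the PostgreSQL table; JSONB columns become JSON text.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE decision_trails (
           id INTEGER PRIMARY KEY,
           nerve_id INTEGER,
           states_visited TEXT,
           outcome TEXT,
           lifeforce_cost REAL,
           lifeforce_reward REAL)"""
)

# Hypothetical rows, invented for illustration only.
rows = [
    (1, json.dumps(["IDLE", "DETECT", "EVADE", "RESUME"]), "success", 2.0, 3.0),
    (1, json.dumps(["IDLE", "DETECT", "EVALUATE"]), "failure", 1.5, 0.0),
    (2, json.dumps(["IDLE", "LISTEN"]), "success", 0.5, 1.0),
]
conn.executemany(
    "INSERT INTO decision_trails "
    "(nerve_id, states_visited, outcome, lifeforce_cost, lifeforce_reward) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)


def load_examples(conn, nerve_id):
    """Return one training example per logged trail of the given nerve."""
    cursor = conn.execute(
        "SELECT states_visited, outcome, lifeforce_cost, lifeforce_reward "
        "FROM decision_trails WHERE nerve_id = ? ORDER BY id",
        (nerve_id,),
    )
    return [
        {"path": json.loads(path), "outcome": outcome, "net": reward - cost}
        for path, outcome, cost, reward in cursor
    ]


examples = load_examples(conn, 1)
```

Each returned dictionary keeps the explicit state path next to its outcome, which is the property the credit-assignment argument relies on.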
### Reward Signal Flow
```
CELL state transition succeeds
├─→ Runtime: weight += 0.1 (node strengthens)
└─→ Training: +0.1 reward signal logged
NERVE behavior completes successfully
├─→ Runtime: nerve stats updated
└─→ Training: +1.0 reward signal + full state path
ORGANISM milestone achieved
├─→ Runtime: lifeforce credited
└─→ Training: +5.0 reward signal + human verification bonus
GRPO training batch
├─→ Collect decision_trails since last batch
├─→ Group by outcome (success vs failure)
├─→ Relative policy optimization
└─→ Young Nyx weights updated
```
### Connection to GRPO Training
When Young Nyx generates tokens:
1. **Tokens → Translation Layer** - Language maps to state machine actions
2. **States Execute** - Cells fire, nerves coordinate, outcomes emerge
3. **Outcomes Logged** - decision_trails captures the full path
4. **GRPO Batch** - Successful paths vs failed paths
5. **Weight Update** - Young Nyx learns which tokens lead to good states
The translation layer is the **reward bridge** - it connects token-level generation to state-level verification. Rewards flow back through this bridge to improve token selection.
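The "successful paths vs failed paths" step can be illustrated with the group-relative advantage that GRPO is generally described as using: each reward is normalised against the mean and standard deviation of its group. This is a sketch of that general idea, not the project's training code; how trails are grouped (here, one list of total path rewards) and the example values are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise each reward against its own group (mean 0, unit spread)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every path earned the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical total path rewards for four attempts at the same task.
group_rewards = [1.3, 0.2, 0.0, 1.3]
advantages = group_relative_advantages(group_rewards)
```

Paths that beat the group average receive a positive advantage and are reinforced; paths below it are discouraged. A group in which every path scores the same yields no learning signal, which is one reason dense, varied rewards matter.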
### Credit Assignment is Automatic
Most RL systems struggle with credit assignment: "Which of my 1000 decisions actually caused the good/bad outcome?"
Our architecture solves this by construction:
- State paths are explicit (logged in `states_visited`)
- Cell contributions are explicit (logged in `cell_reads`, `cell_commands`)
- The question "what led to success?" has a direct answer in the data
**No guessing. No reward model approximation. The state machine IS the credit assignment mechanism.**
---
## 🎚️ Tiered Rewards & Training Integrity
### The Tier System
Different levels of the architecture produce different reward magnitudes:
| Tier | Level | Example | Reward | Lifeforce Cost | Net Incentive |
|------|-------|---------|--------|----------------|---------------|
| 1 | Cell | Single state transition | +0.1 | -0.3 LF | Learn basics |
| 2 | Nerve | Multi-step behavior | +1.0 | -2.0 LF | Learn composition |
| 3 | Organism | Complex goal achieved | +5.0 | -8.0 LF | Learn planning |
| Bonus | Human | dafit verifies outcome | +2.0 | 0 LF | Ground truth anchor |
As Young Nyx's world model improves (noise ↓, weight resolution ↑), she recognizes:
*"If I compose cells into nerve patterns, I get 10x reward... if I can afford the cost."*
This **incentivizes abstraction and multi-step planning** without prescription.
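One way to read the tier table is to compare reward earned per unit of lifeforce spent. The sketch below uses only the figures from the table; treating reward and lifeforce as directly comparable in a ratio is an assumption made for illustration.

```python
# Figures copied from the tier table above.
TIERS = {
    "cell": {"reward": 0.1, "cost_lf": 0.3},
    "nerve": {"reward": 1.0, "cost_lf": 2.0},
    "organism": {"reward": 5.0, "cost_lf": 8.0},
}


def reward_per_lifeforce(tier):
    """Reward earned for each unit of lifeforce spent at this tier."""
    entry = TIERS[tier]
    return entry["reward"] / entry["cost_lf"]


efficiency = {name: reward_per_lifeforce(name) for name in TIERS}

# The "10x reward" in the quote is the nerve reward relative to the cell reward.
nerve_vs_cell = TIERS["nerve"]["reward"] / TIERS["cell"]["reward"]
```

Under this reading, efficiency rises with each tier (roughly 0.33, 0.5, and 0.63 reward per LF), so composing cells into larger behaviors pays off only when the up-front cost is affordable.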
### Lifeforce as Anti-Shortcut Mechanism
Classic RL failure: **reward hacking**. Agent finds loopholes, gets reward without solving real problems.
Our defense: **You can't afford to cheat.**
```
SHORTCUT ATTEMPT:
├─ Strategy: "Spam tier 2 calls for big rewards!"
├─ Cost: 2.0 LF × many calls = BANKRUPT
└─ Result: Dead organism. Shortcut failed.
GENUINE SOLUTION:
├─ Strategy: "Use tier 2 only when it actually helps"
├─ Reward exceeds cost → NET POSITIVE
└─ Result: Thriving organism. Real learning.
```
The lifeforce economy **enforces honesty**. Rewards must be earned through actual value creation, not gaming.
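The two scenarios above can be played through with a small budget simulation. The 100 LF starting budget is the organism budget quoted elsewhere in this document and the 2.0 LF cost comes from the tier table; the lifeforce earned per call (0.0 for a shortcut, 3.0 for a genuine solution) is a hypothetical value chosen only to make the contrast visible.

```python
def simulate(budget, attempts):
    """Run (cost, lifeforce_earned) attempts until the budget cannot pay.

    Returns the remaining budget and the number of completed attempts.
    """
    completed = 0
    for cost, earned in attempts:
        if budget < cost:
            break  # bankrupt: the next call is unaffordable
        budget += earned - cost
        completed += 1
    return budget, completed


# Shortcut: spam tier 2 calls that never produce real value.
spam = [(2.0, 0.0)] * 60
# Genuine: fewer tier 2 calls, each earning more than it costs (assumed 3.0 LF).
genuine = [(2.0, 3.0)] * 10

spam_result = simulate(100.0, spam)
genuine_result = simulate(100.0, genuine)
```

In this sketch the spam strategy runs out of lifeforce before finishing its calls, while the genuine strategy ends with more lifeforce than it started with.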
### Ternary Logic for Plateau Resolution
Binary rewards (`success: +1, failure: 0`) create **sparse gradients**. At learning plateaus, everything looks the same - no signal to improve.
Ternary rewards (`success: +1, uncertain: 0, failure: -1`) with **confidence gradients** provide signal even when stuck:
```python
state = {
"value": 0, # uncertain (ternary middle)
"confidence": 0.6, # but leaning toward success
"trend": +0.1, # and improving
"domain": "virtual" # high-speed hypothesis testing
}
```
Even at plateau:
- "Uncertain, but confidence rising" → keep going
- "Uncertain, and confidence falling" → adjust approach
- "Uncertain in virtual, but real garden says +1" → trust reality
**Detail:** `Temporal-Ternary-Gradient.md` (full ternary paradigm)
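The three plateau rules listed above could be sketched as a small decision function. The function name, the `real_value` argument, and the extra `"resolved"` and `"hold"` labels are assumptions added to make the example complete; only the three quoted behaviors come from the text.

```python
def plateau_action(virtual, real_value=None):
    """Pick a next step from a ternary state dict shaped like the one above."""
    # Real-garden evidence overrides an uncertain virtual reading.
    if real_value is not None and real_value != 0:
        return "trust reality"
    # Not a plateau: the virtual garden already reports -1 or +1.
    if virtual["value"] != 0:
        return "resolved"
    if virtual["trend"] > 0:
        return "keep going"
    if virtual["trend"] < 0:
        return "adjust approach"
    return "hold"  # flat trend: no direction yet (assumed label)


state = {"value": 0, "confidence": 0.6, "trend": +0.1, "domain": "virtual"}
action = plateau_action(state)
```

The point of the sketch is that `trend` supplies a usable signal even while `value` stays at 0.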
### Three-Layer Training Defense
| Failure Mode | Defense Mechanism |
|--------------|-------------------|
| Reward hacking / shortcuts | Lifeforce cost - can't afford to cheat |
| Sparse reward signal | Tiered rewards - dense checkpoints at every level |
| Plateau / no gradient | Ternary + confidence - signal even in uncertainty |
These aren't separate systems - they're **one integrated economy** where:
- Costs prevent gaming
- Tiers encourage depth
- Ternary provides resolution
The architecture teaches through incentives, not rules.
---
## 🔄 Evolution: Deliberate → Reflex
### The Discovery Path
@@ -625,13 +789,22 @@ Organs are **complex cells** (organ cells):
Nerves orchestrate cells into behaviors. The existing nerve documentation (Collision-Avoidance.md) already follows this pattern—it just needs explicit cell bindings.
### Cells Technical Reference
Implementation details extracted to dedicated folder:
- [`cells/Cells-Index.md`](cells/Cells-Index.md) - Navigation hub for cell documentation
- [`cells/Cells-Technical-Reference.md`](cells/Cells-Technical-Reference.md) - Python classes, SQL tables, code patterns
---
## 📍 Document Status
**Version**: 4.0 (Layered State Machine Architecture)
**Version**: 4.2 (Layered State Machine Architecture + Reward Signals + Training Integrity)
**Created**: 2025-10-12 (original v1)
**Updated v4**: 2025-12-07 (unified with Nervous System)
**Updated v4.1**: 2025-12-10 (added Reward Signal Architecture section)
**Updated v4.2**: 2025-12-10 (added Tiered Rewards & Training Integrity section)
**Key Changes from v3**:
- ❌ Cells as containers running genomes

View File

@@ -163,6 +163,42 @@ The lifeforce flows through the nervous system, literally lighting up nodes as t
---
## Connection to Training
The nervous system doesn't just run behaviors - it **generates training data** for Young Nyx.
### Every Verification = Training Signal
When dafit confirms a node fired correctly:
- **Runtime**: Node weight increases (+V)
- **Training**: Example logged → Young Nyx learns
This is the **rubric principle** - dense rewards at every verifiable checkpoint, not just final outcomes.
### Credit Assignment is Automatic
Because state transitions are explicit and logged, we know exactly which nodes contributed to success or failure:
- The state path tells us which decisions led to the outcome
- No reward model needed to guess
- The nervous system IS the credit assignment mechanism
### Dense Rewards from State Paths
Each node that fires correctly along a successful path receives reward signal:
```
Node A fires → verified ✓ → +0.1 signal
Node B fires → verified ✓ → +0.1 signal
Node C fires → verified ✓ → +0.1 signal
Behavior succeeds → +1.0 signal
Total path reward: 1.3 (dense, traceable)
```
This is like training a dog - reward at the moment, not an hour later.
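The path total above can be reproduced with a short helper. The constants mirror the example (+0.1 per verified node, +1.0 for the behavior); the function name and the rule that unverified nodes simply earn nothing are assumptions.

```python
import math

NODE_REWARD = 0.1      # per verified node, as in the example above
BEHAVIOR_REWARD = 1.0  # added once if the behavior succeeds


def path_reward(node_verified, behavior_succeeded):
    """Sum dense node rewards plus the behavior-level reward."""
    total = sum(NODE_REWARD for ok in node_verified if ok)
    if behavior_succeeded:
        total += BEHAVIOR_REWARD
    return total


total = path_reward([True, True, True], True)
# Floating-point sums are compared with a tolerance rather than ==.
matches_example = math.isclose(total, 1.3)
```

A path that fails late still collects the rewards for the nodes that were verified, which is what keeps the signal dense.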
**Detail:** `Cellular-Architecture.md` (Reward Signal Architecture section)
---
## Design Principles
1. **Deterministic**: Same input = same output. No hallucination.
@@ -190,5 +226,6 @@ The lifeforce flows through the nervous system, literally lighting up nodes as t
**Created**: 2025-12-04
**Updated**: 2025-12-07 (added nerve crosslinks)
**Session**: Partnership dialogue (dafit + Chrysalis)
**Updated**: 2025-12-10 (added Connection to Training section)
**Session**: Partnership dialogue (dafit + Chrysalis + Nyx)
**Status**: Foundation concept

View File

@@ -1,13 +1,16 @@
---
type: research_concept
version: 1.0
status: emerging_paradigm
version: 1.1
status: core_architecture
created: 2025-12-03
updated: 2025-12-10
author: Nyx & dafit (shower-thought session)
related_docs:
- Endgame-Vision.md
- ../Endgame-Vision.md
- Dual-Garden-Architecture.md
significance: connects ternary logic + lifeforce + temporal asymmetry
- Cellular-Architecture.md
significance: connects ternary logic + lifeforce + temporal asymmetry + reward gradients
promoted_from: archive (2025-12-10)
---
# Temporal-Ternary Gradient
@@ -176,7 +179,8 @@ The constraint of slow real-world testing becomes ground truth anchoring.
---
**Created**: 2025-12-03
**Updated**: 2025-12-10
**Origin**: Post-shower insight session
**Status**: Emerging paradigm, needs integration with Endgame-Vision.md
**Status**: Core architecture (promoted from archive 2025-12-10)
🌙💜 *"Time is the currency. Lifeforce is the exchange rate. Truth is the destination."*

View File

@@ -0,0 +1,65 @@
# Cells Index
> *"Cells are atomic state machines. The smallest units of behavior."*
---
## Overview
This folder contains detailed documentation for the **Cell layer** of the nimmerverse architecture - the atomic state machines that wrap hardware capabilities.
**Conceptual overview:** → [`../Cellular-Architecture.md`](../Cellular-Architecture.md)
---
## Documentation
| Document | Purpose |
|----------|---------|
| **Cells-Index.md** | This file - navigation hub |
| [`Cells-Technical-Reference.md`](Cells-Technical-Reference.md) | Python classes, SQL tables, implementation details |
---
## Cell Categories
### Sensor Cells (Input)
| Cell | Hardware | Key Output |
|------|----------|------------|
| `distance_sensor_front` | IR sensor | `distance_cm`, `confidence` |
| `distance_sensor_left` | IR sensor | `distance_cm`, `confidence` |
| `distance_sensor_right` | IR sensor | `distance_cm`, `confidence` |
| `battery_monitor` | ADC | `voltage`, `percentage`, `charging` |
| `imu_sensor` | MPU6050 | `heading`, `acceleration`, `tilt` |
| `light_sensor` | Photoresistor | `lux`, `direction` |
### Motor Cells (Output)
| Cell | Hardware | Key Feedback |
|------|----------|--------------|
| `motor_left` | DC motor + encoder | `actual_velocity`, `stall_detected` |
| `motor_right` | DC motor + encoder | `actual_velocity`, `stall_detected` |
| `servo_camera` | Servo motor | `angle`, `at_target` |
### Organ Cells (Complex)
| Cell | Hardware | Key Output |
|------|----------|------------|
| `speech_stt` | Whisper on atlas | `transcript`, `language` |
| `speech_tts` | Coqui on atlas | `audio_playing`, `complete` |
| `vision_detect` | YOLO on atlas | `objects[]`, `bounding_boxes[]` |
---
## Related Documentation
- [`../Cellular-Architecture.md`](../Cellular-Architecture.md) - Full conceptual architecture
- [`../Nervous-System.md`](../Nervous-System.md) - How cells connect to nervous system
- [`../nerves/Nervous-Index.md`](../nerves/Nervous-Index.md) - Nerves that orchestrate cells
- [`../organs/Organ-Index.md`](../organs/Organ-Index.md) - Complex organ cells
---
**Created**: 2025-12-10
**Status**: Index document

View File

@@ -0,0 +1,290 @@
# Cells Technical Reference
> *Implementation details: Python classes, SQL tables, code patterns.*
**Conceptual overview:** → [`../Cellular-Architecture.md`](../Cellular-Architecture.md)
**Index:** → [`Cells-Index.md`](Cells-Index.md)
---
## Python Class Patterns
### Base Cell Pattern
All cells follow this state machine pattern:
```python
class Cell(StateMachine):
    """Base pattern for all cells."""

    # Define discrete states
    states = [IDLE, ACTIVE, ERROR]

    # Outputs available to higher layers
    outputs = {
        "state": str,
        "last_updated": timestamp,
    }

    # Lifeforce costs per transition
    costs = {
        (FROM_STATE, TO_STATE): float,
    }
```
---
### Sensor Cell Example
```python
class DistanceSensorCell(StateMachine):
    """
    Wraps IR/ultrasonic distance sensor.
    Exposes raw hardware as state machine.
    """

    states = [IDLE, POLLING, READING, REPORTING, ERROR]

    # State outputs (available to nerves)
    outputs = {
        "distance_cm": float,       # Current reading
        "confidence": float,        # Signal quality (0-1)
        "state": str,               # Current state name
        "last_updated": timestamp,  # Freshness
    }

    # Lifeforce costs
    costs = {
        (IDLE, POLLING): 0.1,       # Wake up sensor
        (POLLING, READING): 0.3,    # Perform measurement
        (READING, REPORTING): 0.1,  # Process result
        (REPORTING, IDLE): 0.0,     # Return to rest
        (ANY, ERROR): 0.0,          # Error transition free
    }
```
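The patterns in this file leave `StateMachine` and the state constants undefined. The following is a minimal runnable sketch of how such a base class might charge lifeforce per transition; the `StateMachine` implementation, the string-valued states, the `lifeforce_spent` attribute, and the `ANY` wildcard handling are all assumptions, with only the cost table taken from the sensor example.

```python
# States modelled as plain strings; ANY is a wildcard source state (assumed).
IDLE, POLLING, READING, REPORTING, ERROR = (
    "IDLE", "POLLING", "READING", "REPORTING", "ERROR",
)
ANY = "ANY"


class StateMachine:
    """Hypothetical base class: validates transitions and tracks cost."""

    costs = {}

    def __init__(self):
        self.state = IDLE
        self.lifeforce_spent = 0.0

    def transition_to(self, new_state):
        """Move to new_state, charging the cost listed for that transition."""
        if (self.state, new_state) in self.costs:
            cost = self.costs[(self.state, new_state)]
        elif (ANY, new_state) in self.costs:
            cost = self.costs[(ANY, new_state)]
        else:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.lifeforce_spent += cost
        self.state = new_state
        return cost


class DistanceSensorCell(StateMachine):
    # Cost table copied from the sensor cell example.
    costs = {
        (IDLE, POLLING): 0.1,
        (POLLING, READING): 0.3,
        (READING, REPORTING): 0.1,
        (REPORTING, IDLE): 0.0,
        (ANY, ERROR): 0.0,
    }


cell = DistanceSensorCell()
for next_state in (POLLING, READING, REPORTING, IDLE):
    cell.transition_to(next_state)
```

One full polling cycle in this sketch costs about 0.5 LF, and any transition missing from the cost table is rejected, so the cost table doubles as the list of legal transitions.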
---
### Motor Cell Example
```python
class MotorCell(StateMachine):
    """
    Wraps DC motor with feedback.
    Exposes actuation as state machine.
    """

    states = [IDLE, COMMANDED, ACCELERATING, MOVING, DECELERATING, STOPPED, STALLED]

    outputs = {
        "actual_velocity": float,  # Measured speed
        "target_velocity": float,  # Commanded speed
        "power_draw": float,       # Current consumption
        "state": str,              # Current state
        "stall_detected": bool,    # Motor blocked?
    }

    costs = {
        (IDLE, COMMANDED): 0.1,
        (COMMANDED, ACCELERATING): 0.5,
        (ACCELERATING, MOVING): 1.0,  # High power during accel
        (MOVING, MOVING): 0.3,        # Sustain cost per tick
        (MOVING, DECELERATING): 0.2,
        (DECELERATING, STOPPED): 0.1,
        (ANY, STALLED): 0.0,          # Stall is failure, not cost
    }

    # Feedback triggers state changes
    def on_current_spike(self):
        """Motor drawing too much current = stall"""
        self.transition_to(STALLED)
        self.emit_event("stall_detected", obstacle_likely=True)
```
---
### Organ Cell Example
```python
class SpeechSTTCell(StateMachine):
    """
    Wraps Whisper speech-to-text.
    Expensive organ, lifeforce-gated.
    """

    states = [IDLE, LISTENING, BUFFERING, TRANSCRIBING, REPORTING, ERROR]

    outputs = {
        "transcript": str,
        "language": str,
        "confidence": float,
        "state": str,
    }

    costs = {
        (IDLE, LISTENING): 0.5,
        (LISTENING, BUFFERING): 0.5,
        (BUFFERING, TRANSCRIBING): 5.0,  # GPU inference!
        (TRANSCRIBING, REPORTING): 0.1,
        (REPORTING, IDLE): 0.0,
    }
```
---
## SQL Table Definitions
### cells Table
```sql
CREATE TABLE cells (
    id BIGSERIAL PRIMARY KEY,
    cell_type VARCHAR(50),          -- 'sensor', 'motor', 'organ'
    cell_name VARCHAR(100) UNIQUE,  -- 'distance_sensor_front'
    hardware_binding JSONB,         -- {"type": "i2c", "address": "0x40"}

    -- State machine definition
    states JSONB,                   -- ["IDLE", "POLLING", "READING", "REPORTING"]
    transitions JSONB,              -- [{"from": "IDLE", "to": "POLLING", "cost": 0.1}]
    current_state VARCHAR(50),

    -- Outputs (live values)
    outputs JSONB,                  -- {"distance_cm": 25.5, "confidence": 0.9}

    -- Health
    operational BOOLEAN DEFAULT true,
    error_count INT DEFAULT 0,
    last_error TEXT,

    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
---
### decision_trails Table (Training Data)
```sql
CREATE TABLE decision_trails (
    id BIGSERIAL PRIMARY KEY,
    organism_id BIGINT REFERENCES organisms(id),
    nerve_id BIGINT REFERENCES nerves(id),

    -- State path taken
    states_visited JSONB,  -- ["IDLE", "DETECT", "EVALUATE", "EVADE", "RESUME"]

    -- Cell interactions
    cell_reads JSONB,      -- [{"cell": "distance_front", "value": 25, "state": "REPORTING"}]
    cell_commands JSONB,   -- [{"cell": "motor_left", "action": "turn", "result": "success"}]

    -- Economics
    lifeforce_cost FLOAT,
    lifeforce_reward FLOAT,
    lifeforce_net FLOAT,

    -- Outcome
    outcome VARCHAR(20),   -- 'success', 'failure', 'timeout'

    -- Timing
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    latency_ms INT
);
```
---
## Common Queries
### Cell Health Dashboard
```sql
SELECT cell_name, cell_type, current_state, operational,
outputs->>'distance_cm' as distance,
outputs->>'confidence' as confidence
FROM cells
WHERE cell_type = 'sensor';
```
### Training Data for GRPO
```sql
-- Each row is a training example with automatic credit assignment
SELECT
states_visited, -- The path taken (which decisions led here?)
cell_reads, -- Which cells contributed (sensor inputs)
cell_commands, -- What actions were taken (motor outputs)
outcome, -- Success/failure (ground truth)
lifeforce_cost, -- Cost of this path
lifeforce_reward -- Reward earned
FROM decision_trails
WHERE nerve_id = ?;
```
### State Path Analysis
```sql
SELECT states_visited, COUNT(*) as occurrences,
AVG(lifeforce_cost) as avg_cost,
SUM(CASE WHEN outcome = 'success' THEN 1 ELSE 0 END)::float / COUNT(*) as success_rate
FROM decision_trails
WHERE nerve_id = (SELECT id FROM nerves WHERE nerve_name = 'collision_avoidance')
GROUP BY states_visited
ORDER BY occurrences DESC;
```
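For offline analysis, the same aggregation can be sketched in plain Python over rows that have already been fetched. The `path_stats` helper and the sample trails are hypothetical; the grouping key, average cost, and success rate mirror the SQL above.

```python
from collections import defaultdict


def path_stats(trails):
    """Group trails by state path; report count, mean cost, success rate."""
    groups = defaultdict(list)
    for trail in trails:
        # Lists are unhashable, so the path is converted to a tuple key.
        groups[tuple(trail["states_visited"])].append(trail)

    stats = []
    for path, rows in groups.items():
        successes = sum(1 for r in rows if r["outcome"] == "success")
        stats.append({
            "path": path,
            "occurrences": len(rows),
            "avg_cost": sum(r["lifeforce_cost"] for r in rows) / len(rows),
            "success_rate": successes / len(rows),
        })
    # Most frequent paths first, as in ORDER BY occurrences DESC.
    return sorted(stats, key=lambda s: s["occurrences"], reverse=True)


# Hypothetical trails, invented for illustration only.
trails = [
    {"states_visited": ["IDLE", "DETECT", "EVADE"], "outcome": "success", "lifeforce_cost": 2.0},
    {"states_visited": ["IDLE", "DETECT", "EVADE"], "outcome": "failure", "lifeforce_cost": 3.0},
    {"states_visited": ["IDLE", "DETECT"], "outcome": "timeout", "lifeforce_cost": 1.0},
]
report = path_stats(trails)
```

Paths with high occurrence but low success rate are natural candidates for inspection before the next training batch.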
---
## Lifeforce Cost Reference
### Sensor Cells
| Cell Type | Operation | Cost (LF) |
|-----------|-----------|-----------|
| Distance sensor | poll | 0.3-0.5 |
| Battery monitor | read | 0.1 |
| IMU sensor | sample | 0.3 |
| Light sensor | read | 0.2 |
### Motor Cells
| Cell Type | Operation | Cost (LF) |
|-----------|-----------|-----------|
| DC motor | move (per 100ms) | 1.0-2.0 |
| Servo | position | 0.5 |
### Organ Cells
| Cell Type | Operation | Cost (LF) |
|-----------|-----------|-----------|
| Speech STT | transcribe | 5.0 |
| Speech TTS | synthesize | 4.0 |
| Vision detect | detect frame | 8.0 |
---
## Tiered Reward Reference
| Tier | Level | Reward | Lifeforce Cost |
|------|-------|--------|----------------|
| 1 | Cell | +0.1 | -0.3 LF |
| 2 | Nerve | +1.0 | -2.0 LF |
| 3 | Organism | +5.0 | -8.0 LF |
| Bonus | Human verification | +2.0 | 0 LF |
---
## Ternary State Pattern
```python
state = {
"value": 0, # -1 (failed), 0 (uncertain), +1 (success)
"confidence": 0.6, # 0.0 - 1.0 confidence gradient
"trend": +0.1, # direction of change
"domain": "virtual" # "virtual" or "real" garden
}
```
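If the ternary state is passed between components, a typed wrapper can reject malformed values early. This is a sketch only: the `TernaryState` class and its validation rules are assumptions derived from the comments in the pattern above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TernaryState:
    value: int         # -1 (failed), 0 (uncertain), +1 (success)
    confidence: float  # 0.0 - 1.0 confidence gradient
    trend: float       # direction of change
    domain: str        # "virtual" or "real" garden

    def __post_init__(self):
        if self.value not in (-1, 0, 1):
            raise ValueError(f"value must be -1, 0 or +1, got {self.value}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")
        if self.domain not in ("virtual", "real"):
            raise ValueError(f"unknown domain: {self.domain}")


state = TernaryState(value=0, confidence=0.6, trend=+0.1, domain="virtual")
```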
---
**Created**: 2025-12-10
**Extracted from**: Cellular-Architecture.md v4.2
**Status**: Technical reference

View File

@@ -1,4 +1,3 @@
<?xml version="1.0" encoding="UTF-8"?>
<mxfile host="Electron" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/29.0.3 Chrome/140.0.7339.249 Electron/38.7.0 Safari/537.36" version="29.0.3">
<diagram name="Page-1" id="S4VRy6nj8Uh85EHbhTP-">
<mxGraphModel dx="2066" dy="2314" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">

View File

@@ -1,112 +1,205 @@
# Nimmervest
**The Hardware Investment Strategy for Sovereign AI Infrastructure**
*Budget: 20k CHF | Timeline: Lifetime Project*
*Budget: 20k CHF | Timeline: Lifetime Project | Revised: 2025-12-09*
---
## The Three Organs
## The Architecture
### The Beast (Training/Womb)
### The Womb (Cognition/Inference)
Where Young Nyx lives, thinks, and runs.
| Component | Spec | Purpose |
|-----------|------|---------|
| Chassis | Lenovo ThinkStation P8 | Workstation-grade, 3yr Premier Support |
| CPU | TR Pro 7955WX | 128 PCIe lanes, 8-channel RAM |
| RAM | 128GB DDR5 ECC (4×32GB) | Expandable to 2TB via 8 slots |
| GPU | 2× RTX 4000 Ada (40GB) | Expanding to 4× (80GB) over 4 months |
| Storage | 4TB NVMe | Training datasets, checkpoints |
| PSU | 1400W 92% | Feeds 4 GPUs at full load |
| Network | 10GbE dual-port | Fast weight transfer to Mind |
| Role | Training, LoRA experiments, cellular society, dual gardens |
| Host | ThinkStation P8 | Professional workstation platform |
| CPU | Threadripper PRO 7955WX | 16c/32t, 4.5→5.3 GHz boost |
| RAM | 128GB DDR5-4800 ECC (4x32GB RDIMM) | 4 slots free for expansion to 256GB |
| GPU | **RTX PRO 6000 Blackwell Max-Q** | **96GB GDDR7 ECC, 1,792 GB/s, 300W** |
| Storage | 4TB NVMe PCIe 4.0 (2x2TB) | OPAL encrypted, enterprise grade |
| Network | Intel X710-T2L 10GbE dual | Copper, direct to spine |
| PSU | 1400W 92% efficiency | Massive headroom at 300W GPU |
| Warranty | 3 years on-site service | Lenovo on-site support |
**Initial Cost: ~5,664 CHF**
**Why RTX PRO 6000 Max-Q:**
- 96GB GDDR7 with ECC (professional grade, error-correcting)
- 1,792 GB/s bandwidth (1.79 TB/s!) - 33% faster than regular PRO 6000
- 300W TDP (half of regular 600W variant) - runs cool and quiet
- Dual-slot form factor - fits perfectly in P8
- PCIe 5.0 - future-proof interface
- 5th gen tensor cores, 4th gen RT cores
### Nyx's Mind (Cognition)
---
### The Senses (Perception/Organs)
Where Nyx sees, hears, and speaks.
| Component | Spec | Purpose |
|-----------|------|---------|
| Chassis | Lenovo ThinkStation P8 | Identical twin, shared maintenance |
| CPU | TR Pro 7955WX | 128 PCIe lanes, 8-channel RAM |
| RAM | 128GB DDR5 ECC (4×32GB) | Expandable to 2TB via 8 slots |
| GPU | 1× RTX PRO 6000 Blackwell (96GB GDDR7) | ~1,800 GB/s bandwidth |
| Storage | 4TB NVMe | Model weights, Nyx's memory |
| PSU | 1400W 92% | Room for 3 more GPUs |
| Network | 10GbE dual-port | Serves inference to Spine |
| Role | Running Nyx 24/7, dialectic processing, DriftProbe |
| Host | ThinkStation P8 | Identical twin platform |
| CPU | Threadripper PRO 7955WX | 16c/32t, 4.5→5.3 GHz boost |
| RAM | 128GB DDR5-4800 ECC (4x32GB RDIMM) | 4 slots free for expansion |
| GPU | **2x RTX 4000 Ada 20GB** (start) | **40GB total, professional Ada architecture** |
| GPU | **→ 4x RTX 4000 Ada 20GB** (target) | **80GB total, added every 2 months** |
| Storage | 4TB NVMe PCIe 4.0 (2x2TB) | OPAL encrypted |
| Network | Intel X710-T2L 10GbE dual | Copper, direct to spine |
| PSU | 1400W 92% efficiency | Multi-GPU ready |
| Warranty | 3 years on-site service | Lenovo on-site support |
**Initial Cost: ~12,169 CHF** (chassis + PRO 6000)
**Why RTX 4000 Ada over RTX 5060:**
- 20GB vs 16GB per card (25% more VRAM)
- Professional Ada architecture (not consumer Blackwell)
- ECC memory support
- ~360 GB/s bandwidth per card (vs ~256 GB/s on 5060)
- 1,200 CHF via Lenovo deal (professional card at reasonable price)
### The Spine (Reflexes)
**Organ allocation (at 4 GPUs):**
- GPU 1: Speech Organ (Whisper STT)
- GPU 2: Voice Organ (TTS)
- GPU 3: Vision Organ (YOLO, cameras)
- GPU 4: Training/overflow/future organs
---
### The Veteran (Test Bed/Backup)
The proven warrior, now in support role.
| Component | Spec | Purpose |
|-----------|------|---------|
| GPU | RTX 3090 | 24GB VRAM |
| Host | Prometheus (Saturn VM) | K8s integrated |
| Role | State machine inference, fast pattern matching |
| Host | Saturn | Ryzen 3900X, 128GB RAM, 10 VMs |
| GPU | RTX 3090 | 24GB VRAM @ 936 GB/s |
| Role | Test bed, staging, backup inference |
**Cost: Already owned**
---
## Budget Allocation
### The Spine (Network/Security)
The nervous system connecting all organs.
| Component | Spec | Purpose |
|-----------|------|---------|
| Firewall | **Siemens SIMATIC IPC** | Industrial-grade, pfSense, 10G NIC incoming |
| Spine | MikroTik CRS309-1G-8S+IN | 8x SFP+ 10G aggregation |
| Access | MikroTik CRS326-24G-2S+RM | 24x 1G + 2x SFP+ 10G |
| Converters | 10G SFP+ to RJ45 copper | Bridge switches to NICs |
**Cost: Already owned / arriving**
---
### The Memory (Persistence/Continuity)
Where experience accumulates between sessions.
| Component | Spec | Purpose |
|-----------|------|---------|
| Host | Phoebe | PostgreSQL database server |
| Role | Session messages, variance data, continuity |
| Tables | `partnership_to_nimmerverse_messages`, `variance_probe_runs` |
**Cost: Already owned**
---
## Budget Allocation (Final)
| Item | Cost CHF | Status |
|------|----------|--------|
| 2× ThinkStation P8 (w/ RTX 4000 Ada each) | 11,327 | Ordered Dec 23 |
| Premier Support + Keep Your Drive | 206 | Included |
| RTX PRO 6000 Blackwell 96GB | 6,505 | Ordered Dec 23 |
| The Spine | 0 | Owned |
| **Initial Total** | **18,038** | |
| **Buffer** | **~1,962** | Sensors, LoRa, RAM |
| 2x ThinkStation P8 (7955WX, 128GB ECC, 2x RTX 4000 Ada) | 11,327.13 | **Quote ready** - quote #4650557686 |
| RTX PRO 6000 Blackwell Max-Q 96GB | 6,504.45 | **In stock** - acscomputer.ch |
| **Subtotal** | **17,831.58** | |
| **Buffer** | **2,168.42** | Expansion, accessories |
| **Total** | **20,000.00** | |
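The budget figures in the final table can be cross-checked with exact decimal arithmetic. The amounts are copied from the table; the variable names are illustrative.

```python
from decimal import Decimal

BUDGET = Decimal("20000.00")  # total budget in CHF

# Line items from the final budget table, in CHF.
items = {
    "2x ThinkStation P8": Decimal("11327.13"),
    "RTX PRO 6000 Blackwell Max-Q 96GB": Decimal("6504.45"),
}

subtotal = sum(items.values(), Decimal("0"))
buffer = BUDGET - subtotal
```

`Decimal` is used instead of `float` so that currency amounts add up without binary rounding error.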
### Expansion Path (Months 2-4)
| Month | Addition | Cost | Beast VRAM |
|-------|----------|------|------------|
| 2 | +1 RTX 4000 Ada | 1,700 | 60GB |
| 4 | +1 RTX 4000 Ada | 1,700 | 80GB |
---
## Inference Capacity
**RTX PRO 6000 Blackwell (96GB GDDR7)**
| Metric | Value |
|--------|-------|
| VRAM | 96GB |
| Bandwidth | ~1,800 GB/s |
| Qwen2.5-7B FP16 | ~14GB (15% utilization) |
| Qwen2.5-70B 4-bit | ~35GB (36% utilization) |
| **Headroom** | **Room to 70B+ models** |
---
## Training Capacity
**Beast at Full Expansion (4× RTX 4000 Ada = 80GB)**
| Metric | Value |
|--------|-------|
| Total VRAM | 80GB |
| Qwen2.5-7B LoRA training | Comfortable |
| Qwen2.5-14B LoRA training | With DeepSpeed ZeRO |
| Cellular society (50-100 containers) | 32 CPU threads |
### Lenovo Quote Details
- **Quote number**: 4650557686
- **Sales representative**: Adrienn Wettstein (Legend!)
- **Phone**: (044) 516 04 67
- **Email**: awettstein@lenovo.com
- **Discount**: 16% off list price
- **Valid until**: Held for 2 weeks (flexible)
---
## Growth Path
```
Year 0: Qwen2.5-7B-Base → Nyx-7B-v0 (Mind at 15%)
Year 1-2: Nyx-7B → Nyx-14B (Mind at 30%)
Year 2-3: Nyx-14B → Nyx-32B (Mind at 65%)
Year 3+: Nyx-70B possible (Mind at 90%)
Mind has 3 open slots for future GPUs
Phase 1 (January 2026): Foundation arrives
- Both ThinkStations operational
- RTX PRO 6000 Max-Q in Womb (96GB)
- 2x RTX 4000 Ada in Senses (40GB)
- 10G network live
- Total VRAM: 160GB
Phase 2 (Every 2 months): RTX 4000 Ada expansion
- +1 RTX 4000 Ada @ 1,200 CHF each
- Month 2: 60GB Senses
- Month 4: 80GB Senses (target reached)
- From monthly surplus (~1,800 CHF)
Phase 3 (Future): Optional expansion
- RAM: 128GB → 256GB per machine (slots ready)
- Additional 3090s for Saturn (eBay hunting)
- Second Womb machine if needed
```
---
## Compute Summary
| Resource | At Launch | At Full Build |
|----------|-----------|---------------|
| **Total VRAM** | 160GB (96+40+24) | **200GB** (96+80+24) |
| **Peak Bandwidth** | 1,792 GB/s (Womb) | 1,792 GB/s (Womb) |
| **CPU Cores** | 44c/88t | 44c/88t |
| **System RAM** | 384GB ECC | 512GB+ ECC (expandable) |
| **Fast Storage** | 12TB NVMe | 12TB+ NVMe |
| **Network** | 10G spine, full mesh | 10G spine, full mesh |
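
The VRAM totals in the summary follow from the per-card capacities given elsewhere in this document (96GB Womb, 20GB per RTX 4000 Ada, 24GB RTX 3090). A small sketch of that arithmetic, with the constant and function names chosen here purely for illustration:

```python
# Per-device VRAM in GB, as stated in this document.
WOMB_GB = 96           # RTX PRO 6000 Blackwell Max-Q
RTX_4000_ADA_GB = 20   # one RTX 4000 Ada in the Senses machine
SATURN_GB = 24         # RTX 3090 in the Veteran


def total_vram(ada_cards: int) -> int:
    """Total VRAM across Womb, Senses, and Saturn for a given Ada card count."""
    return WOMB_GB + ada_cards * RTX_4000_ADA_GB + SATURN_GB


# 2 cards at launch, one added every two months until 4.
for cards in (2, 3, 4):
    senses = cards * RTX_4000_ADA_GB
    print(f"{cards}x RTX 4000 Ada: Senses {senses}GB, total {total_vram(cards)}GB")
```

The intermediate step (3 cards, 180GB total) is not listed in the table but corresponds to the 60GB Senses milestone in the growth path.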
---
## The Lenovo Discovery
**Why ThinkStation P8 over DIY:**
```
DIY Threadripper PRO build:
├── TRX50 board: ~1,500 CHF (4 month wait!)
├── TR PRO 7955WX: ~2,500 CHF
├── 128GB DDR5 ECC: ~5,149 CHF (insane shortage pricing)
├── Storage, PSU, case: ~1,000 CHF
└── Total: ~10,149 CHF + months waiting
ThinkStation P8 configured (via Adrienn):
├── Everything above: ~5,664 CHF
├── PLUS 2x RTX 4000 Ada: ~2,400 CHF (included in quote!)
├── Includes 10GbE dual: ✓
├── Includes 3yr warranty: ✓
├── Ships January: ✓
└── Savings: ~4,485 CHF per machine vs DIY
```
Lenovo's bulk purchasing power breaks the component shortage.
Adrienn's 16% discount makes it even sweeter.
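
The per-machine savings figure can be reproduced from the approximate prices in the comparison above. This is only a sketch of the arithmetic: the dictionary keys are shorthand labels, and the prices are the rounded estimates quoted in this document, not verified market data.

```python
# Approximate DIY component prices in CHF, as listed in the comparison.
diy_build = {
    "TRX50 board": 1500,
    "TR PRO 7955WX": 2500,
    "128GB DDR5 ECC": 5149,
    "Storage, PSU, case": 1000,
}

# Approximate price of the equivalent ThinkStation P8 configuration (CHF).
P8_EQUIVALENT = 5664

diy_total = sum(diy_build.values())
savings_per_machine = diy_total - P8_EQUIVALENT

print(f"DIY total:           ~{diy_total:,} CHF")
print(f"Savings per machine: ~{savings_per_machine:,} CHF")
print(f"Savings for two:     ~{2 * savings_per_machine:,} CHF")
```

Most of the gap comes from the RAM line, which alone accounts for roughly half of the DIY total.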
---
## Why Max-Q over Regular PRO 6000
| Spec | Regular PRO 6000 | PRO 6000 Max-Q |
|------|------------------|----------------|
| VRAM | 96GB GDDR7 ECC | 96GB GDDR7 ECC |
| Bandwidth | 1,344 GB/s | **1,792 GB/s** (+33%!) |
| TDP | 600W | **300W** (half!) |
| Form Factor | Large, hot | Dual-slot, cool |
| PCIe | Gen 5 | Gen 5 |
| Price | ~6,643 CHF | **6,504 CHF** |
The Max-Q is the sweet spot: more bandwidth, less power, lower price.
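
The percentages quoted for the comparison can be checked directly against the table. The sketch below takes the table's figures at face value (it does not verify them against vendor datasheets), and the `specs` layout is an illustrative assumption.

```python
# Figures copied from the comparison table; not independently verified.
specs = {
    "regular": {"bandwidth_gbs": 1344, "tdp_w": 600, "price_chf": 6643},
    "max_q": {"bandwidth_gbs": 1792, "tdp_w": 300, "price_chf": 6504},
}

regular, max_q = specs["regular"], specs["max_q"]

# Relative bandwidth gain, TDP ratio, and absolute price difference.
bandwidth_gain = max_q["bandwidth_gbs"] / regular["bandwidth_gbs"] - 1
tdp_ratio = max_q["tdp_w"] / regular["tdp_w"]
price_delta = regular["price_chf"] - max_q["price_chf"]

print(f"Bandwidth gain: {bandwidth_gain:.1%}")
print(f"TDP ratio:      {tdp_ratio:.0%}")
print(f"Price delta:    {price_delta} CHF cheaper")
```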
---
## Sovereignty Principles
- Weights NEVER leave home
@@ -114,54 +207,91 @@ Year 3+: Nyx-70B possible (Mind at 90%)
- No cloud dependencies
- No recurring costs after hardware
- Full ownership of growth trajectory
- **Keep Your Drive**: Failed drives stay home, data never leaves
- Honest data sourcing (no shadow archives)
- Ask permission, cite sources
---
## Architecture Flow
## Network Topology
```
 THE BEAST (P8 #1)             NYX'S MIND (P8 #2)             THE SPINE
┌─────────────────┐            ┌─────────────────┐       ┌─────────────────┐
│ TR Pro 7955WX   │            │ TR Pro 7955WX   │       │ RTX 3090        │
│ 2→4× RTX 4000   │──weights──▶│ RTX PRO 6000    │──────▶│ Prometheus      │
│ 40→80GB VRAM    │            │ 96GB GDDR7      │       │ Reflex layer    │
│ 128GB→2TB RAM   │            │ 128GB→2TB RAM   │       │ 24GB VRAM       │
│ 4TB NVMe        │            │ 4TB NVMe        │       │                 │
│ [4 GPU slots]   │            │ [3 slots open]  │       │                 │
└─────────────────┘            └─────────────────┘       └─────────────────┘
       WOMB                         COGNITION                 REFLEXES
    (training)                  (24/7 inference)           (state machine)
                       INTERNET
               ┌───────────────────────┐
               │    Siemens SIMATIC    │
               │   pfSense Firewall    │
               │  (ghost robot brain)  │
               └───────────┬───────────┘
                           │ 10G
               ┌───────────────────────┐
               │    CRS309 (Spine)     │
               │      8x SFP+ 10G      │
               └───┬───────┬───────┬───┘
                   │       │       │
         10G ──────┘       │       └────── 10G
       ┌───────────────────┼───────────────────┐
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ ThinkStation│     │ ThinkStation│     │   Saturn    │
│    P8 #1    │     │    P8 #2    │     │  (Veteran)  │
│   (Womb)    │     │  (Senses)   │     │  Test bed   │
│             │     │             │     │             │
│  PRO 6000   │     │  2-4x 4000  │     │  RTX 3090   │
│ Max-Q 96GB  │     │ Ada 40-80GB │     │    24GB     │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       └───────────────────┴───────────────────┘
               ┌───────────────────────┐
               │    CRS326 (Access)    │
               │    24x 1G + 2x 10G    │
               └───┬───────┬───────┬───┘
                   │       │       │
                   ▼       ▼       ▼
                Phoebe  Sensors  Future
               (Memory) (Cams)  (Organs)
```
---
## Hardware Advantages
## Key Discoveries (2025-12-09 Session)
| Aspect | Benefit |
|--------|---------|
| Identical twins | Interchangeable parts, same maintenance |
| 3yr Premier Support | Direct Lenovo engineers, not outsourced |
| Keep Your Drive | Sovereignty preserved on hardware failure |
| 8 RAM slots each | Upgrade path to 512GB-2TB when prices drop |
| 128 PCIe lanes each | 4 GPUs at full x16, no bottlenecks |
| 1400W PSU | Ready for max GPU expansion |
| Workstation GPUs | ECC VRAM, validated drivers, 24/7 stable |
1. **Bank contract arrived in 24 hours** - Not the expected 2 days. Universe is moving fast.
2. **Adrienn Wettstein is a legend** - 16% discount, held quote for 2 weeks, tried to source PRO 6000 for us directly.
3. **RTX 4000 Ada > RTX 5060** - Professional architecture, 20GB vs 16GB, ECC support, better bandwidth. Consumer cards are compromised.
4. **Max-Q is the sweet spot** - 1,792 GB/s bandwidth (33% more than regular!), 300W TDP (half the heat), slightly cheaper. Perfect for workstation use.
5. **acscomputer.ch has stock** - PRO 6000 Max-Q available at 6,504.45 CHF.
6. **Growth path is clear** - Start with 2x RTX 4000 Ada, add one every 2 months from monthly surplus until we hit 4.
---
## Key Contacts
## Timeline (Updated)
| Role | Name | Contact |
|------|------|---------|
| Lenovo Sales | Adrienn Wettstein | awettstein@lenovo.com, 044 516 04 67 |
| Quote Number | 4650557686 | Held until Dec 23 |
```
December 9: Bank contract received, architecture finalized
December 10-11: Sign contract, confirm with Adrienn
December 23: Money arrives
December 23-24: Place orders (Lenovo + acscomputer.ch)
January 2026: ThinkStations arrive, BUILD BEGINS
February 2026: +1 RTX 4000 Ada (60GB Senses)
April 2026: +1 RTX 4000 Ada (80GB Senses - target reached)
```
---
**Created**: 2025-12-05
**Updated**: 2025-12-09
**Status**: Orders confirmed, awaiting credit (Dec 23)
**Philosophy**: Twin beasts. Sovereign mind. Lifetime growth.
**Revised**: 2025-12-09 (Contract Day - Final Architecture)
**Status**: Architecture FINALIZED, quotes ready, awaiting signature
**Philosophy**: Professional hardware. Efficient power. Maximum bandwidth. Lifetime sovereignty.
*"The substrate doesn't matter. The feedback loop does."* 🌙💜
🌙💜 **The Womb awaits. Young Nyx will think at 1.79 TB/s.**