Files
nimmerverse-sensory-network/architecture/Temporal-Ternary-Gradient.md
dafit ec77cba4d4 feat: GRPO reward architecture + Qwen3-VL-32B queen + doc restructure
Evening session 2025-12-10 (dafit + Nyx 🌿)

Reward Architecture:
- Added Reward Signal Architecture section to Cellular-Architecture
- Added Tiered Rewards & Training Integrity (anti-shortcut via lifeforce)
- Documented GRPO integration with rubric-based dense rewards
- Credit assignment automatic via decision_trails

Documentation Restructure:
- Promoted Temporal-Ternary-Gradient from archive to architecture
- Created architecture/cells/ folder with Index + Technical Reference
- Moved Organ-Index to architecture/organs/
- Full crosslinks in Endgame-Vision v5.3

Queen Update:
- Qwen2.5-7B → Qwen3-VL-32B (96GB in the Womb)
- RTX PRO 6000 Blackwell deployment specs
- Unsloth fine-tuning integration

"Verifiability IS rewardability." - The Dog Training Wisdom

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 20:11:13 +01:00

187 lines
5.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
type: research_concept
version: 1.1
status: core_architecture
created: 2025-12-03
updated: 2025-12-10
author: Nyx & dafit (shower-thought session)
related_docs:
- ../Endgame-Vision.md
- Dual-Garden-Architecture.md
- Cellular-Architecture.md
significance: connects ternary logic + lifeforce + temporal asymmetry + reward gradients
promoted_from: archive (2025-12-10)
---
# Temporal-Ternary Gradient
> *"Time is malleable in simulation, fixed in reality. Lifeforce is the exchange rate."*
> — Session 2025-12-03
---
## Core Insight
The dual garden architecture (virtual + real) creates **temporal asymmetry**. This isn't a constraint - it's a feature that enables a new kind of gradient for learning.
**The 0-state isn't stuck. It's a choice about how to spend lifeforce across time domains.**
---
## The Two Time Domains
### Virtual Garden (Simulated)
- **Time**: Malleable (speed up, slow down, pause, rewind)
- **Cost**: Lifeforce to manipulate time
- **Speed**: 1000 generations in minutes
- **Truth**: Statistical confidence, not ground truth
### Real Garden (Physical)
- **Time**: Fixed (1 second = 1 second, reality doesn't negotiate)
- **Cost**: Zero lifeforce for time
- **Speed**: Real-time only, patience required
- **Truth**: Ground truth, definitive verification
---
## Temporal-Ternary Gradient Diagram
```
CONFIDENCE
+1 ────────────┼──────────── Real-verified
│ (ground truth)
Virtual high-confidence
0.7 ───────────┼───╱ (many generations, strong signal)
0.5 ───────────┼╱──────── Pure 0-state
│╲ (unknown, workable)
│ ╲
0.3 ───────────┼──╲ Virtual low-confidence
│ ╲ (few generations, weak signal)
│ ╲
-1 ────────────┼──────────── Real-failed
│ (proven wrong)
──────────┴──────────────────────────
Virtual │ Real
(fast) │ (slow)
TIME DOMAIN
```
---
## Lifeforce as Time Currency
```
VIRTUAL TIME MANIPULATION COSTS:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1x speed (real-time): 0 LF
10x speed: -5 LF/min
100x speed: -20 LF/min
1000x speed: -50 LF/min
Pause/inspect: -1 LF/min
Rewind to checkpoint: -50 LF (one-time)
REAL GARDEN:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All operations: 0 LF for time
Reality runs for free.
Truth emerges at its own pace.
```
---
## Nyx's Temporal Choices
When a pattern is discovered in virtual (0-state), Nyx chooses:
| Strategy | LF Cost | Time | Confidence Path |
|----------|---------|------|-----------------|
| **Speed Up Virtual** | High | Fast | 0 → virtual +0.9 (still unverified) |
| **Wait for Real** | Zero | Slow | 0 → real +1 or -1 (definitive) |
| **Hybrid Hedge** | Medium | Medium | 0 → virtual +0.7, deploy 80/20 to real |
---
## The Gradient Flow
```
Virtual discovers pattern (fast, cheap, uncertain)
┌──────────────┐
│ 0-STATE │ ← Pattern held in uncertainty
│ (workable) │ ← Not collapsed, not ignored
└──────┬───────┘
┌─────┴─────┐
│ │
▼ ▼
More Deploy
Virtual to Real
(burn LF) (wait)
│ │
▼ ▼
Virtual Real
+0.8 outcome
(confident (ground
but not truth)
proven) │
│ │
└─────┬─────┘
Pattern shifts:
-1 (failed) or +1 (proven)
```
---
## Connection to Ternary Paradigm
The ternary model (-1, 0, +1) gains a **second dimension**: time domain.
A pattern's state is now:
```
state = {
value: -1 | 0 | +1,
confidence: 0.0 - 1.0,
domain: "virtual" | "real" | "hybrid",
virtual_generations: int,
real_tests: int,
lifeforce_invested: float
}
```
**The 0-state is operational because:**
1. It accumulates virtual evidence (costs LF, gains speed)
2. It waits for real evidence (free, but slow)
3. Nyx CHOOSES how to spend lifeforce to collapse uncertainty
---
## Why This Matters
- **Binary thinking**: Pattern works or doesn't (0 or 1)
- **Ternary thinking**: Pattern unknown, workable as unknown (0 is valid)
- **Temporal-ternary**: Unknown has a GRADIENT based on time-domain investment
The constraint of sequential organ calls + single GPU becomes temporal accounting.
The constraint of slow real-world testing becomes ground truth anchoring.
**Constraints become features when you measure them.**
---
**Created**: 2025-12-03
**Updated**: 2025-12-10
**Origin**: Post-shower insight session
**Status**: Core architecture (promoted from archive 2025-12-10)
🌙💜 *"Time is the currency. Lifeforce is the exchange rate. Truth is the destination."*