feat: add organ and nervous system modular architecture
Created modular architecture for organs (hardware) and nerves (behavioral primitives):

## Organ Architecture (Hardware Substrate)
- Created architecture/Organ-Index.md: hardware capabilities catalog
- Created architecture/organs/Speech-Organ.md: complete speech processing architecture
  - Atlas (RTX 2080 8GB) deployment
  - Whisper STT + Coqui TTS (GPU-accelerated, multilingual)
  - Kubernetes pod specs, Dockerfiles, service code
  - Heartbeat-bound queue processing, lifeforce-gated priority
  - German (Philosophy Valley) + English (Technical Cluster) routing
  - Database schemas, monitoring metrics

## Nervous System Architecture (Behavioral Primitives)
- Created architecture/nerves/Nervous-Index.md: nerve catalog and evolution framework
  - Deliberate (LLM) → Hybrid (heuristics) → Reflex (compiled) evolution
  - Lifeforce costs per state/transition
  - Organ dependency declarations
  - RLVR training integration
- Created architecture/nerves/Collision-Avoidance.md: complete example reflex nerve
  - Full state machine implementation (IDLE → DETECT → EVALUATE → EVADE → RESUME)
  - Evolution from 10 LF/1000ms (deliberate) → 2.5 LF/200ms (reflex)
  - Edge cases, training data, metrics
- Moved architecture/Nervous-Protocol.md → architecture/nerves/
  - Three-tier protocol belongs with nerve implementations
- Updated architecture/Nervous-System.md: added crosslinks to nerves/

## RAG Knowledge Pipeline
- Extended operations/RAG-as-Scaffold.md with "Knowledge Acquisition Pipeline" section
  - Vault extraction → Staging area → Progressive policy validation
  - Two-tier RAG (Discovered vs Hidden knowledge)
  - RAG utility measurement for LoRA training signals
  - Policy evolution triggers (increasing standards as Young Nyx matures)
  - Quality gates (mythology weight, AI assistant bias, topology safety)

## Architecture Principles
- Organs = hardware capabilities (Speech, Vision future)
- Nerves = behavioral state machines (Collision, Charging future)
- Both use lifeforce economy, heartbeat synchronization, priority queues
- Nerves compose organs into coherent behaviors
- Reflexes emerge from repetition (60% cost reduction, 80% latency reduction)

Documentation: ~3500 lines total
- Speech-Organ.md: ~850 lines
- Nervous-Index.md: ~500 lines
- Collision-Avoidance.md: ~800 lines
- RAG knowledge pipeline: ~260 lines

🌙💜 Generated with Claude Code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -205,6 +205,265 @@ NYX attempts task from weights alone
---

## Knowledge Acquisition Pipeline

The existing flow shows RAG→Training→Validation, but how does knowledge enter RAG in the first place? Not everything from the vault should reach staging. **Quality gates protect the glossary.**
### The Extraction Flow

```
VAULT (raw knowledge)
        │
        │ extraction candidates
        ▼
┌─────────────────────────────────────────────────────────┐
│                      STAGING AREA                       │
│                    (quarantine zone)                    │
└─────────────────────────────────────────────────────────┘
        │
        │ progressive policy validation
        ▼
┌─────────────────────────────────────────────────────────┐
│                    POLICY VALIDATION                    │
│            (increasing standards over time)             │
└─────────────────────────────────────────────────────────┘
        │
        ├── FAIL ──▶ Reject or revise
        │
        └── PASS ──▶ PROMOTE to Glossary/RAG
                        │
                        ▼
              ┌──────────────────────┐
              │     TWO-TIER RAG     │
              ├──────────────────────┤
              │ DISCOVERED           │ ← Young Nyx has used
              │ (known_catalogue)    │
              ├──────────────────────┤
              │ HIDDEN               │ ← Available but not yet accessed
              │ (available_catalogue)│
              └──────────────────────┘
                        │
                        │ feeds inference
                        ▼
                       NYX
```
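The staging flow above can be sketched in code. This is a minimal, hypothetical sketch: `StagingArea`, `submit`, and the return strings are illustrative names, not part of the described system.

```python
from dataclasses import dataclass, field

@dataclass
class StagingArea:
    validators: list = field(default_factory=list)  # active policy checks
    glossary: dict = field(default_factory=dict)    # promoted entries (the RAG side)
    rejected: list = field(default_factory=list)    # failed candidates, kept for logging

    def submit(self, term_entry):
        """Quarantine a vault extraction candidate, run every active
        policy, and promote only if all of them pass."""
        for policy in self.validators:
            if not policy(term_entry):
                self.rejected.append(term_entry)
                return "REJECT"
        self.glossary[term_entry["term"]] = term_entry
        return "PROMOTE"

# Week-1 configuration: only the basic "has a definition" policy is active.
staging = StagingArea(validators=[lambda e: e.get("definition") is not None])
print(staging.submit({"term": "heartbeat", "definition": "..."}))  # PROMOTE
print(staging.submit({"term": "broken"}))                          # REJECT
```

Later weeks would append stricter validators to the same list, which is what makes the validation "progressive".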

### Progressive Policy Validation

Policies increase in sophistication as Young Nyx matures; not all of them are active from day one.

| Week | Policy Tier | Validation |
|------|-------------|------------|
| **1-2** | **Basic Syntax** | Valid format, non-empty, has definition |
| **3-4** | **Semantic Quality** | Embeds without collapse, unique signature (Gini > threshold) |
| **5-8** | **Topology Safety** | Doesn't corrupt anchor terms (DriftProbe-lite) |
| **9-12** | **Cross-Reference** | Links resolve, no circular dependencies |
| **13+** | **Utility Validation** | Actually helped solve tasks (decision_trails evidence) |

**Evolution example:**

```python
# Week 1: just check the entry exists
def policy_basic(term_entry):
    return term_entry.get("definition") is not None

# Week 8: check topology impact
def policy_topology(term_entry):
    before_gini = probe_term_gini(term_entry["term"])
    add_to_staging(term_entry)
    after_gini = probe_term_gini(term_entry["term"])
    return abs(after_gini - before_gini) < 0.15  # No drift

# Week 13: check actual utility
def policy_utility(term_entry):
    # Did this RAG entry help in the past 10 tasks?
    usage_stats = query_decision_trails(term_entry["term"])
    return usage_stats["help_rate"] > 0.6  # 60% success when retrieved
```
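The per-week activation itself can be sketched as a tier table gated on the current week. Everything here is hypothetical scaffolding (the unlock weeks match the table above; the stand-in checks do not):

```python
# (unlock_week, tier_name, check) — stand-in checks, not the real policies.
POLICY_TIERS = [
    (1, "basic_syntax", lambda e: e.get("definition") is not None),
    (3, "semantic",     lambda e: len(e.get("definition", "")) > 10),
    (5, "topology",     lambda e: e.get("gini_drift", 0.0) < 0.15),
]

def active_policies(week):
    """Only tiers whose unlock week has passed are enforced."""
    return [(name, fn) for unlock, name, fn in POLICY_TIERS if week >= unlock]

def validate(entry, week):
    return all(fn(entry) for _, fn in active_policies(week))

entry = {"term": "drift_probe", "definition": "short", "gini_drift": 0.02}
print(validate(entry, week=1))  # True  — only the syntax tier is active
print(validate(entry, week=4))  # False — the semantic tier now rejects it
```

The same staged entry can thus pass early and fail later, which is exactly the "increasing standards over time" behavior.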

### Two-Tier RAG: Discovered vs Hidden

Not all RAG knowledge is equal. Track what Young Nyx **knows** vs what's merely **available**.

```
┌──────────────────────────────────────────────┐
│             DISCOVERED KNOWLEDGE             │
│   (known_catalogue - has accessed before)    │
├──────────────────────────────────────────────┤
│ • "heartbeat" - used 47 times                │
│ • "lifeforce" - used 23 times                │
│ • "phoebe" - used 15 times                   │
│ • "confidence_gradient" - used 8 times       │
│                                              │
│ Status: FAST retrieval, high confidence      │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│               HIDDEN KNOWLEDGE               │
│  (available_catalogue - exists but unused)   │
├──────────────────────────────────────────────┤
│ • "drift_probe" - never accessed             │
│ • "topology_gini" - never accessed           │
│ • "lora_merge_alpha" - never accessed        │
│                                              │
│ Status: Available for discovery              │
└──────────────────────────────────────────────┘
```

**State transitions:**

```
Hidden term retrieved → Mark as Discovered
Discovered term used successfully → Increase confidence score
Discovered term used 10+ times → FLAG for training extraction
```
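These transitions can be sketched as a small in-memory state machine, assuming a dict that mirrors the per-term tracking row (field names here are illustrative, not the real schema):

```python
def record_retrieval(state, term, success):
    """Apply the state transitions above for one retrieval of `term`."""
    entry = state.setdefault(term, {"status": "hidden", "access_count": 0,
                                    "success_count": 0, "flagged": False})
    if entry["status"] == "hidden":
        entry["status"] = "discovered"   # first retrieval: Hidden → Discovered
    entry["access_count"] += 1
    if success:
        entry["success_count"] += 1      # successful use → confidence signal
    if entry["success_count"] >= 10:
        entry["flagged"] = True          # 10+ successful uses → flag for training
    return entry

state = {}
for _ in range(10):
    record_retrieval(state, "heartbeat", success=True)
print(state["heartbeat"]["status"], state["heartbeat"]["flagged"])  # discovered True
```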

**Discovery tracking in phoebe:**

```sql
CREATE TABLE rag_knowledge_state (
    term                 TEXT PRIMARY KEY,
    status               TEXT,         -- 'hidden', 'discovered', 'internalized'
    first_accessed       TIMESTAMPTZ,
    access_count         INT DEFAULT 0,
    success_count        INT DEFAULT 0,
    last_used            TIMESTAMPTZ,
    promoted_to_weights  BOOLEAN DEFAULT FALSE
);
```

### Measuring RAG Utility for LoRA Training

**The critical question:** Did the RAG hint actually help solve the task?

Track in `decision_trails` table:

```sql
CREATE TABLE decision_trails (
    id                     SERIAL PRIMARY KEY,
    task_id                UUID,
    rag_terms_retrieved    TEXT[],      -- What RAG returned
    rag_terms_used         TEXT[],      -- What appeared in the solution
    outcome                TEXT,        -- 'success', 'fail', 'partial'
    confidence_before_rag  FLOAT,       -- Before retrieval
    confidence_after_rag   FLOAT,       -- After retrieval
    lifeforce_cost         FLOAT,
    timestamp              TIMESTAMPTZ DEFAULT NOW()
);
```

**Compute RAG utility score:**

```python
def compute_rag_utility(trail):
    """
    Calculate how helpful RAG was for this decision.
    Returns 0.0 (useless) to 1.0 (critical).
    """
    precision = len(trail.rag_terms_used) / max(len(trail.rag_terms_retrieved), 1)
    outcome_bonus = 1.0 if trail.outcome == 'success' else 0.0
    confidence_boost = max(0, trail.confidence_after_rag - trail.confidence_before_rag)

    utility = (
        0.4 * precision +          # Did we use what we retrieved?
        0.3 * outcome_bonus +      # Did the task succeed?
        0.3 * confidence_boost     # Did RAG increase confidence?
    )
    return min(1.0, utility)
```
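As a quick check on the weighting, here is a hypothetical trail run through the score. The trail values are invented for illustration, and the function is restated so the snippet runs standalone:

```python
from types import SimpleNamespace

def compute_rag_utility(trail):
    # Restated from above so this snippet is self-contained.
    precision = len(trail.rag_terms_used) / max(len(trail.rag_terms_retrieved), 1)
    outcome_bonus = 1.0 if trail.outcome == 'success' else 0.0
    confidence_boost = max(0, trail.confidence_after_rag - trail.confidence_before_rag)
    return min(1.0, 0.4 * precision + 0.3 * outcome_bonus + 0.3 * confidence_boost)

# Hypothetical trail: 4 terms retrieved, 2 actually used, task succeeded,
# confidence rose from 0.45 to 0.80.
trail = SimpleNamespace(
    rag_terms_retrieved=["heartbeat", "lifeforce", "phoebe", "drift_probe"],
    rag_terms_used=["heartbeat", "lifeforce"],
    outcome="success",
    confidence_before_rag=0.45,
    confidence_after_rag=0.80,
)
# 0.4 * 0.5 (precision) + 0.3 * 1.0 (success) + 0.3 * 0.35 (boost)
print(round(compute_rag_utility(trail), 3))  # → 0.605
```

Note that a trail which succeeds but uses none of the retrieved terms still scores at most 0.6, so retrieval precision materially shapes the training weight.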

**Feed into LoRA training as RLVR signal:**

```python
# Training examples weighted by utility
for trail in decision_trails:
    utility_score = compute_rag_utility(trail)

    if utility_score > 0.7:
        # High utility → strong training signal
        training_examples.append({
            "query": trail.task_description,
            "rag_context": trail.rag_terms_used,
            "response": trail.solution,
            "weight": utility_score,  # RLVR reward weight
        })
```

**This trains LoRAs to:**

- **Mnemosyne (Memory)**: Recall accuracy vs phoebe ground truth
- **Aletheia (Truth)**: Confidence calibration (was the confidence boost justified?)
- **Moira (Pattern)**: Which task patterns benefit from RAG vs pure reasoning

### The Complete Knowledge Flow

```
VAULT
  │
  ├─ Extract candidates
  │
  ▼
STAGING (quarantine)
  │
  ├─ Policy Tier 1: Syntax   ──▶ REJECT ──▶ Log failure
  ├─ Policy Tier 2: Semantic ──▶ REJECT ──▶ Revise
  ├─ Policy Tier 3: Topology ──▶ REJECT ──▶ Flag risk
  └─ Policy Tier 4+: Utility ──▶ PASS
       │
       ▼
PROMOTE to RAG
  │
  ├─ Status: HIDDEN (available but unused)
  │
  │ Young Nyx retrieves term
  │
  ▼
Status: DISCOVERED (mark first access)
  │
  ├─ Track usage in decision_trails
  │
  ┌─────────┴──────────────┐
  │                        │
Used successfully    Used unsuccessfully
  │                        │
  ▼                        ▼
Increase confidence  Decrease confidence
  │
  │ (10+ successful uses)
  │
  ▼
FLAG for training extraction
  │
  ▼
LoRA training (weighted by utility_score)
  │
  ▼
Validation WITHOUT RAG
  │
  ├─ SUCCESS ──▶ Status: INTERNALIZED (clear from RAG)
  │
  └─ FAIL ──▶ Restore to RAG, retry cycle
```

### What Quality Gates Prevent

1. **Garbage in RAG** - the staging area catches malformed entries
2. **Topology corruption** - DriftProbe-lite policies block dangerous terms
3. **Useless bloat** - utility policies remove low-value entries
4. **Premature training** - only high-utility terms get flagged
5. **Hidden knowledge waste** - tracking what's available but never used reveals curriculum gaps

### Policy Evolution Triggers

As Young Nyx grows, unlock stricter policies:

| Trigger | New Policy Unlocked |
|---------|---------------------|
| 100 successful RAG retrievals | Semantic quality checks |
| First LoRA training run | Topology safety (DriftProbe-lite) |
| 1000 decision_trails logged | Utility validation (help rate > 60%) |
| First INTERNALIZED term | Cross-reference consistency |
| 10 INTERNALIZED terms | Cost-effectiveness (ROI > threshold) |
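
The trigger table can be read as a set of predicates over running counters. A hedged sketch, with an invented metrics dict standing in for the real telemetry:

```python
# (trigger condition, policy it unlocks) — thresholds from the table above;
# the metrics keys are hypothetical.
TRIGGERS = [
    (lambda m: m["successful_retrievals"] >= 100, "semantic_quality"),
    (lambda m: m["lora_runs"] >= 1,               "topology_safety"),
    (lambda m: m["decision_trails"] >= 1000,      "utility_validation"),
    (lambda m: m["internalized_terms"] >= 1,      "cross_reference"),
    (lambda m: m["internalized_terms"] >= 10,     "cost_effectiveness"),
]

def unlocked_policies(metrics):
    """Return every policy whose trigger condition is currently met."""
    return [name for cond, name in TRIGGERS if cond(metrics)]

metrics = {"successful_retrievals": 250, "lora_runs": 1,
           "decision_trails": 40, "internalized_terms": 0}
print(unlocked_policies(metrics))  # ['semantic_quality', 'topology_safety']
```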

**Progressive difficulty**: The bar for entering RAG rises as Young Nyx becomes more capable. Early on, anything valid passes; later, entries must prove their utility.

---

## Lifeforce Connection

The RAG→Train→Validate cycle has economic cost: