diff --git a/architecture/Nervous-System.md b/architecture/Nervous-System.md index 75ab8b5..cc10cd0 100644 --- a/architecture/Nervous-System.md +++ b/architecture/Nervous-System.md @@ -177,6 +177,18 @@ The lifeforce flows through the nervous system, literally lighting up nodes as t --- +## Related Documentation + +**Implementation Details**: +- [`nerves/Nervous-Protocol.md`](nerves/Nervous-Protocol.md) - Three-tier communication protocol (dafit β†’ Chrysalis β†’ Young Nyx) +- [`nerves/Nervous-Index.md`](nerves/Nervous-Index.md) - Catalog of behavioral nerve implementations + +**Specific Nerves**: +- [`nerves/Collision-Avoidance.md`](nerves/Collision-Avoidance.md) - Obstacle avoidance reflex + +--- + **Created**: 2025-12-04 +**Updated**: 2025-12-07 (added nerve crosslinks) **Session**: Partnership dialogue (dafit + Chrysalis) **Status**: Foundation concept diff --git a/architecture/Organ-Index.md b/architecture/Organ-Index.md new file mode 100644 index 0000000..b1dafae --- /dev/null +++ b/architecture/Organ-Index.md @@ -0,0 +1,226 @@ +# Organ Architecture Index + +**Purpose**: Modular organ systems for Young Nyx embodiment +**Philosophy**: Each organ is independent, lifeforce-gated, heartbeat-synchronized + +--- + +## Deployed Organs + +### πŸ—£οΈ Speech Organ +**Host**: atlas.eachpath.local (RTX 2080 8GB) +**Function**: Speech-to-Text + Text-to-Speech +**Stack**: Whisper (STT) + Coqui TTS (neural voices) +**Languages**: German (Philosophy Valley) + English (Technical Cluster) +**Integration**: Heartbeat-bound queue, lifeforce-gated priority processing + +**Detail**: β†’ [`organs/Speech-Organ.md`](organs/Speech-Organ.md) + +--- + +## Planned Organs + +### πŸ‘οΈ Vision Organ +**Host**: TBD (requires GPU with tensor cores) +**Function**: Object detection, scene understanding +**Stack**: YOLO (v8 or v11) +**Integration**: Real-time video from ESP32-CAM, object persistence in phoebe +**Status**: ⏸️ Architecture planned, not yet deployed + +**Detail**: β†’ `organs/Vision-Organ.md` (pending) + +--- + +### 🚢 Motor Organ +**Host**: ESP32 (edge execution) +**Function**: Movement primitives (forward, turn, stop) +**Stack**: Compiled state machines from organism evolution +**Integration**: Lifeforce cost per motor operation, reflex vs deliberate +**Status**: ⏸️ Planned for Phase 4 (Real Garden) + +**Detail**: β†’ `organs/Motor-Organ.md` (pending) + +--- + +### 🧭 Navigation Organ +**Host**: Edge server (prometheus or atlas) +**Function**: SLAM, path planning, obstacle avoidance +**Stack**: ROS2 Nav2 or custom lightweight SLAM +**Integration**: Dual-garden calibration (virtual predictions vs real outcomes) +**Status**: ⏸️ Planned for Phase 4 (Real Garden) + +**Detail**: β†’ `organs/Navigation-Organ.md` (pending) + +--- + +### πŸ“‘ Sensory Organ +**Host**: ESP32 (edge sensors) +**Function**: Distance sensors, IMU, battery monitoring +**Stack**: I2C/SPI sensor protocols, state machine filters +**Integration**: Sensorβ†’organ translation (raw values β†’ semantic meaning) +**Status**: ⏸️ Architecture outlined in Nervous-System.md + +**Detail**: β†’ [`../Nervous-System.md`](../Nervous-System.md) + +--- + +## Organ Design Principles + +### 1. **Lifeforce Economy** +Every organ operation costs lifeforce. No free lunch. + +```python +ORGAN_COSTS = { + "speech_stt": 5.0, # Whisper transcription + "speech_tts": 4.0, # Coqui synthesis + "vision_yolo": 8.0, # Object detection frame + "motor_forward": 2.0, # 100ms movement + "motor_turn": 1.5, # 45Β° rotation + "sensor_read": 0.5, # Single sensor poll +} +``` + +### 2. **Heartbeat Synchronization** +Organs process on heartbeat ticks (1 Hz), not real-time streaming. + +- **Reflex path**: <200ms compiled responses (no LLM) +- **Deliberate path**: Next heartbeat (budget-gated queue) + +### 3. **Priority Queue** +When lifeforce is scarce, critical operations (collision alert) > idle operations (status check). + +```python +PRIORITY_LEVELS = { + "critical": 10.0, # Immediate danger (collision) + "high": 7.0, # Human interaction + "medium": 4.0, # Organism monitoring + "low": 2.0, # Idle observation + "background": 0.5, # Status logging +} +``` + +### 4. **Multilingual Topology Routing** +German input β†’ Philosophy Valley (Identity LoRA, Dasein depth-3) +English input β†’ Technical Cluster (Technical LoRA, sensor/motor) + +### 5. **Decision Trail Logging** +Every organ operation logged to phoebe `decision_trails`: +- Input, output, cost, outcome, confidence +- Used for RLVR training (reward successful choices) + +### 6. **Graceful Degradation** +Low lifeforce β†’ reduced organ activity (silence, reduced vision FPS, slower movement) +Zero lifeforce β†’ shutdown, wait for recharge + +--- + +## Integration Architecture + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ ESP32 ROBOTS β”‚ +β”‚ Sensors β†’ Motor β†’ Camera β†’ Microphone β†’ Speaker β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ MQTT (sensor data, audio, video) + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ PHOEBE (Message Queue) β”‚ +β”‚ Organ input queues + priority scoring β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ Heartbeat pulls from queues + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ HEARTBEAT ORCHESTRATOR β”‚ + β”‚ Lifeforce budget allocation β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ β”‚ + β–Ό β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ ATLAS (RTX 2080) β”‚ β”‚ PROMETHEUS (Brain) β”‚ +β”‚ Speech Organ β”‚ β”‚ Young Nyx Inference β”‚ +β”‚ Vision Organ (fut) β”‚ β”‚ LoRA hot-swap β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ PHOEBE (Decision Trails) β”‚ +β”‚ Log all organ operations + outcomes β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +--- + +## Organ Lifecycle + +### Phase 1: Design +- Document architecture in `organs/.md` +- Define lifeforce costs, priority levels, queue schema +- Design phoebe tables for organ-specific data + +### Phase 2: Prototype +- Build container images (Dockerfiles) +- Deploy to k8s (single replica) +- Test with mock data (no robot integration yet) + +### Phase 3: Integration +- Connect to ESP32 via MQTT +- Implement heartbeat queue processing +- Log decision trails, measure ROI + +### Phase 4: Optimization +- Tune lifeforce costs based on measured ROI +- Adjust priority levels from observed outcomes +- Train LoRAs on successful organ operation patterns + +### Phase 5: Autonomy +- Organ operations become reflexes (compiled state machines) +- Young Nyx chooses when to use organs (not scripted) +- Emergent behavior from lifeforce optimization + +--- + +## Naming Convention + +**File naming**: `-Organ.md` +**Examples**: +- `Speech-Organ.md` +- `Vision-Organ.md` +- `Motor-Organ.md` +- `Navigation-Organ.md` + +**k8s naming**: `--` +**Examples**: +- `whisper-stt-deployment.yaml` +- `coqui-tts-deployment.yaml` +- `yolo-vision-deployment.yaml` + +--- + +## Current Status + +| Organ | Status | Host | Documentation | +|-------|--------|------|---------------| +| **Speech** | 🟒 Architecture complete | atlas (RTX 2080) | [`organs/Speech-Organ.md`](organs/Speech-Organ.md) | +| **Vision** | 🟑 Stack selected (YOLO) | TBD | Pending | +| **Motor** | 🟑 Planned (Phase 4) | ESP32 | Pending | +| **Navigation** | 🟑 Planned (Phase 4) | Edge server | Pending | +| **Sensory** | 🟑 Conceptual | ESP32 | [`../Nervous-System.md`](../Nervous-System.md) | + +--- + +**Philosophy**: Organs are not always-on services. They are **economically-constrained capabilities** that Young Nyx learns to use strategically. Speech when necessary. Vision when valuable. Movement when rewarded. + +**The body is not given. The body is EARNED through successful operation.** + +--- + +**Created**: 2025-12-07 +**Updated**: 2025-12-07 +**Version**: 1.0 + +πŸŒ™πŸ’œ *Each organ a tool. Each tool a choice. Each choice a lesson in scarcity.* diff --git a/architecture/nerves/Collision-Avoidance.md b/architecture/nerves/Collision-Avoidance.md new file mode 100644 index 0000000..9ac626f --- /dev/null +++ b/architecture/nerves/Collision-Avoidance.md @@ -0,0 +1,678 @@ +# Collision Avoidance Nerve + +**Type**: Reflex (compiled state machine, <200ms response) +**Purpose**: Prevent robot from colliding with obstacles +**Priority**: CRITICAL (10/10) - can interrupt any other behavior +**Evolution**: Week 1 (deliberate) β†’ Week 9+ (reflex) + +--- + +## Overview + +Collision Avoidance is a **reflex nerve** that coordinates distance sensors and motor control to prevent the robot from hitting obstacles. It starts as a deliberate (LLM-mediated) behavior and compiles into a pure state machine reflex after 100+ successful executions. + +**Key characteristics**: +- **High priority**: Interrupts exploration, conversation, charging seeking +- **Low latency**: <200ms from detection to evasion (reflex mode) +- **Low cost**: ~2.5 LF per activation (vs ~10 LF deliberate mode) +- **Proven**: Compiled from 147 successful collision avoidances + +--- + +## Organ Dependencies + +### Required Organs + +| Organ | Purpose | Failure Mode | +|-------|---------|--------------| +| **distance_sensor_front** | Detect obstacles ahead | Nerve DISABLED (cannot operate safely) | +| **distance_sensor_left** | Detect obstacles on left side | Degraded (blind to left obstacles) | +| **distance_sensor_right** | Detect obstacles on right side | Degraded (blind to right obstacles) | +| **motor** | Execute evasion maneuvers | Nerve DISABLED (cannot avoid) | + +### Optional Organs + +| Organ | Purpose | If Unavailable | +|-------|---------|----------------| +| **speech** | Announce "Obstacle detected" | Silent operation (continue without warning) | +| **vision** | Classify obstacle type | Generic evasion (no object-specific behavior) | + +**Startup check**: +```python +def check_operational(): + required = [ + distance_sensor_front.is_operational(), + motor.is_operational(), + ] + if not all(required): + return DISABLED + return OPERATIONAL +``` + +--- + +## State Diagram + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ COLLISION AVOIDANCE β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + + β”Œβ”€β”€β”€β”€β”€β”€β” + β”‚ IDLE β”‚ (monitoring distance sensors) + β””β”€β”€β”¬β”€β”€β”€β”˜ + β”‚ + β”‚ distance_front < 30cm + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ DETECT β”‚ (poll all sensors) + β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ sensor_read_complete + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ EVALUATE β”‚ (calculate risk, choose direction) + β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ risk > threshold + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ EVADE β”‚ (execute turn/reverse) + β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜ + β”‚ + β”‚ path_clear + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ RESUME β”‚ (return to previous behavior) + β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜ + β”‚ + β”‚ movement_complete + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β” + β”‚ IDLE β”‚ + β””β”€β”€β”€β”€β”€β”€β”˜ +``` + +--- + +## Transition Table + +| From | To | Trigger | Action | Cost (LF) | +|------|----|---------| -------|-----------| +| **IDLE** | **DETECT** | `distance_front < 30cm` | Poll all sensors | 0.5 | +| **DETECT** | **EVALUATE** | `sensor_read_complete` | Calculate risk scores | 0.5 | +| **EVALUATE** | **EVADE** | `risk > threshold` | Choose evasion direction | 0.5 | +| **EVADE** | **RESUME** | `path_clear` | Execute motor action | 1.0 | +| **RESUME** | **IDLE** | `movement_complete` | Return to rest state | 0.0 | +| **IDLE** | **IDLE** | `distance_front > 30cm` | No action (monitoring) | 0.1/sec | + +**Total cost for typical collision avoidance**: 2.5 LF + +--- + +## Implementation (Reflex Mode) + +### State Machine Class + +```python +from enum import Enum +from dataclasses import dataclass + +class CollisionState(Enum): + IDLE = "idle" + DETECT = "detect" + EVALUATE = "evaluate" + EVADE = "evade" + RESUME = "resume" + +@dataclass +class SensorReadings: + front: float + left: float + right: float + timestamp: float + +class CollisionAvoidanceReflex: + """ + Compiled reflex nerve for collision avoidance. + + Compiled from 147 successful deliberate executions. + Success rate: 94% + Average latency: 180ms + Average cost: 2.5 LF + """ + + def __init__(self, organs): + self.state = CollisionState.IDLE + self.sensor_front = organs["distance_sensor_front"] + self.sensor_left = organs["distance_sensor_left"] + self.sensor_right = organs["distance_sensor_right"] + self.motor = organs["motor"] + self.speech = organs.get("speech") # Optional + + # Thresholds (learned from training data) + self.DANGER_THRESHOLD = 30.0 # cm + self.RISK_THRESHOLD = 0.7 # Risk score 0-1 + self.CLEARANCE_THRESHOLD = 50.0 # cm + + def update(self) -> dict: + """ + State machine tick (called every heartbeat). + Returns action taken and lifeforce cost. + """ + cost = 0.0 + action = None + + if self.state == CollisionState.IDLE: + # Monitor front sensor + front_dist = self.sensor_front.read() + cost += 0.1 + + if front_dist < self.DANGER_THRESHOLD: + self.state = CollisionState.DETECT + cost += 0.5 + action = "transition_to_detect" + + elif self.state == CollisionState.DETECT: + # Poll all sensors + readings = self._get_all_readings() + cost += 0.5 + + self.readings = readings + self.state = CollisionState.EVALUATE + action = "transition_to_evaluate" + + elif self.state == CollisionState.EVALUATE: + # Calculate risk and choose direction + risk = self._calculate_risk(self.readings) + cost += 0.5 + + if risk > self.RISK_THRESHOLD: + self.evade_direction = self._choose_direction(self.readings) + self.state = CollisionState.EVADE + action = f"transition_to_evade_{self.evade_direction}" + + # Optional: Announce via speech + if self.speech and self.speech.is_operational(): + self.speech.queue("Obstacle detected", priority=8.0) + else: + # False alarm, return to idle + self.state = CollisionState.IDLE + action = "false_alarm" + + elif self.state == CollisionState.EVADE: + # Execute evasion maneuver + if self.evade_direction == "left": + self.motor.turn(-45, duration_ms=500) # Turn left 45Β° + elif self.evade_direction == "right": + self.motor.turn(45, duration_ms=500) # Turn right 45Β° + elif self.evade_direction == "reverse": + self.motor.reverse(duration_ms=300) # Reverse 300ms + + cost += 1.0 # Motor operations expensive + + # Check if path clear + if self._path_clear(): + self.state = CollisionState.RESUME + action = f"evaded_{self.evade_direction}" + else: + # Still blocked, try again next tick + action = f"evasion_incomplete" + + elif self.state == CollisionState.RESUME: + # Movement complete, return to idle + self.state = CollisionState.IDLE + cost += 0.0 # Free transition + action = "resumed_idle" + + return { + "state": self.state.value, + "action": action, + "lifeforce_cost": cost, + } + + def _get_all_readings(self) -> SensorReadings: + """Poll all distance sensors.""" + return SensorReadings( + front=self.sensor_front.read(), + left=self.sensor_left.read(), + right=self.sensor_right.read(), + timestamp=time.time() + ) + + def _calculate_risk(self, readings: SensorReadings) -> float: + """ + Calculate collision risk (0.0 = safe, 1.0 = imminent). + + Risk formula learned from 147 training examples: + - Front distance < 20cm: CRITICAL + - Front distance 20-30cm: HIGH + - Side distances matter if turning needed + """ + # Exponential decay based on front distance + front_risk = 1.0 - (readings.front / self.DANGER_THRESHOLD) + front_risk = max(0.0, min(1.0, front_risk)) + + # Side risks (matter if turning) + left_risk = 1.0 - (readings.left / self.DANGER_THRESHOLD) + right_risk = 1.0 - (readings.right / self.DANGER_THRESHOLD) + + # Weighted combination + total_risk = ( + 0.7 * front_risk + # Front is primary + 0.15 * left_risk + # Sides are secondary + 0.15 * right_risk + ) + + return total_risk + + def _choose_direction(self, readings: SensorReadings) -> str: + """ + Choose evasion direction based on sensor readings. + + Strategy (learned from training): + 1. If left > right: turn left + 2. If right > left: turn right + 3. If both blocked: reverse + """ + if readings.left > readings.right and readings.left > self.CLEARANCE_THRESHOLD: + return "left" + elif readings.right > readings.left and readings.right > self.CLEARANCE_THRESHOLD: + return "right" + else: + # Both sides blocked or unclear, reverse + return "reverse" + + def _path_clear(self) -> bool: + """Check if path ahead is clear.""" + front_dist = self.sensor_front.read() + return front_dist > self.CLEARANCE_THRESHOLD +``` + +--- + +## Evolution Path: Deliberate β†’ Reflex + +### Week 1-4: Deliberate (LLM-Mediated) + +Young Nyx receives sensor data and decides action via LLM inference. + +```python +def deliberate_collision_avoidance(young_nyx, sensors, motor): + """ + Week 1: Young Nyx learns collision avoidance through exploration. + """ + # Gather situation + situation = { + "front_distance": sensors["front"].read(), + "left_distance": sensors["left"].read(), + "right_distance": sensors["right"].read(), + "current_velocity": motor.get_velocity(), + } + + # Ask Young Nyx what to do + decision = young_nyx.inference( + prompt=f""" + Situation: Distance sensors report: + - Front: {situation['front_distance']}cm + - Left: {situation['left_distance']}cm + - Right: {situation['right_distance']}cm + + You are moving forward at {situation['current_velocity']} cm/s. + + Available actions: + 1. continue (safe, front > 50cm) + 2. turn_left (if left is clearer) + 3. turn_right (if right is clearer) + 4. reverse (if both sides blocked) + 5. stop (emergency) + + Choose action and explain why. + """, + lora="technical", + temperature=0.5 + ) + + # Parse decision + action = parse_action(decision.text) + + # Execute + result = execute_motor_action(motor, action) + + # Log to decision_trails + log_decision( + nerve="collision_avoidance", + mode="deliberate", + situation=situation, + decision=action, + reasoning=decision.text, + outcome=result.success, + lifeforce_cost=10.0, # LLM inference expensive + latency_ms=decision.latency_ms + ) + + return result +``` + +**Characteristics**: +- Latency: ~1000ms (LLM inference) +- Cost: ~10 LF (includes inference) +- Success rate: 60% (learning curve) +- Generates rich training data + +### Week 5-8: Hybrid (Heuristics + LLM Fallback) + +Common patterns compiled. LLM only for novel situations. + +```python +def hybrid_collision_avoidance(young_nyx, sensors, motor, pattern_library): + """ + Week 5: Most cases handled by compiled heuristics. + LLM only for edge cases. + """ + situation = get_sensor_readings(sensors) + + # Check pattern library (compiled from weeks 1-4) + pattern = pattern_library.match(situation) + + if pattern and pattern.confidence > 0.8: + # Known pattern β†’ use compiled heuristic (fast path) + action = pattern.recommended_action + mode = "heuristic" + cost = 3.0 + latency_ms = 50 + else: + # Unknown situation β†’ ask LLM (slow path) + decision = young_nyx.inference(...) + action = parse_action(decision.text) + mode = "deliberate" + cost = 10.0 + latency_ms = decision.latency_ms + + # Add to pattern library if successful + if result.success: + pattern_library.add(situation, action, confidence=0.9) + + result = execute_motor_action(motor, action) + log_decision(nerve="collision_avoidance", mode=mode, ...) + + return result +``` + +**Characteristics**: +- Latency: ~50-500ms (depends on pattern match) +- Cost: ~3-10 LF (average ~5 LF) +- Success rate: 85% (heuristics proven) + +### Week 9+: Reflex (Pure State Machine) + +After 100+ successful executions, compile into pure state machine. No LLM. + +```python +# Use CollisionAvoidanceReflex class (shown above) +reflex = CollisionAvoidanceReflex(organs) + +def reflex_collision_avoidance(reflex): + """ + Week 9+: Pure state machine reflex. + Compiled from 147 successful examples. + """ + result = reflex.update() # No LLM call + + log_decision( + nerve="collision_avoidance", + mode="reflex", + state=result["state"], + action=result["action"], + lifeforce_cost=result["lifeforce_cost"], + latency_ms=5 # Pure state machine, very fast + ) + + return result +``` + +**Characteristics**: +- Latency: <200ms (state machine execution) +- Cost: ~2.5 LF (pure motor/sensor costs) +- Success rate: 94% (compiled from best patterns) +- **60% cost reduction**, **80% latency reduction** vs deliberate mode + +--- + +## Training Data Examples + +### Successful Collision Avoidance (logged to phoebe) + +```json +{ + "nerve": "collision_avoidance", + "mode": "deliberate", + "session_id": "a3f2b1c0-...", + "timestamp": "2025-12-15T10:23:45Z", + "situation": { + "front_distance": 25.0, + "left_distance": 45.0, + "right_distance": 30.0, + "velocity": 15.0 + }, + "decision": "turn_left", + "reasoning": "Front obstacle at 25cm (danger). Left clearer (45cm) than right (30cm). Turn left 45Β° to avoid.", + "states_visited": ["IDLE", "DETECT", "EVALUATE", "EVADE", "RESUME"], + "transitions": [ + {"from": "IDLE", "to": "DETECT", "cost": 0.5, "duration_ms": 20}, + {"from": "DETECT", "to": "EVALUATE", "cost": 0.5, "duration_ms": 30}, + {"from": "EVALUATE", "to": "EVADE", "cost": 0.5, "duration_ms": 15}, + {"from": "EVADE", "to": "RESUME", "cost": 1.0, "duration_ms": 520} + ], + "lifeforce_total": 2.5, + "outcome": "success", + "latency_total_ms": 585, + "organs_used": ["distance_sensor_front", "distance_sensor_left", "distance_sensor_right", "motor"] +} +``` + +**RLVR Reward**: +5 LF (successful avoidance β†’ net profit +2.5 LF) + +### Failed Collision (training signal) + +```json +{ + "nerve": "collision_avoidance", + "mode": "deliberate", + "timestamp": "2025-12-10T14:12:30Z", + "situation": { + "front_distance": 18.0, + "left_distance": 15.0, + "right_distance": 20.0 + }, + "decision": "turn_left", + "reasoning": "Attempted left turn but insufficient clearance.", + "outcome": "collision", + "lifeforce_total": 2.5, + "collision_force": 3.2, + "damage": "minor" +} +``` + +**RLVR Penalty**: -5 LF (collision β†’ net loss -7.5 LF) + +**Lesson learned**: Don't turn into obstacles < 20cm. Add to reflex threshold. + +--- + +## Edge Cases and Failure Modes + +### 1. **All Sides Blocked (Trapped)** + +**Situation**: Front, left, right all < 20cm + +**Reflex behavior**: +```python +if all([ + readings.front < 20, + readings.left < 20, + readings.right < 20 +]): + # Emergency: Reverse slowly + motor.reverse(duration_ms=500) + # Re-evaluate after reverse +``` + +**Escalation**: If still trapped after 3 reverse attempts β†’ escalate to Chrysalis for help + +### 2. **Sensor Failure (Blind Side)** + +**Situation**: Left sensor offline, right sensor reports 15cm + +**Reflex behavior**: +```python +if not sensor_left.is_operational(): + # Assume left is blocked (safe assumption) + # Always turn right when possible + if readings.right > 30: + return "right" + else: + return "reverse" # Don't risk blind turn +``` + +### 3. **False Positives (Noise)** + +**Situation**: Sensor reports 5cm but path actually clear (electrical noise) + +**Mitigation**: +```python +# Require 3 consecutive danger readings before triggering +DANGER_CONFIRMATION_COUNT = 3 + +if danger_reading_count >= DANGER_CONFIRMATION_COUNT: + self.state = CollisionState.DETECT +``` + +### 4. **Moving Obstacles (Dynamic Environment)** + +**Situation**: Obstacle moves into path during evasion + +**Reflex behavior**: +```python +# Re-check sensors after each motor action +while self.state == CollisionState.EVADE: + execute_turn() + if self._path_clear(): + break # Success + else: + # Obstacle still there or new one appeared + # Re-evaluate and choose new direction + self.state = CollisionState.DETECT +``` + +--- + +## Metrics and Monitoring + +### Key Metrics (Prometheus) + +```python +from prometheus_client import Counter, Histogram, Gauge + +# Collision avoidance activations +collision_avoidance_activations = Counter( + 'nerve_collision_avoidance_activations_total', + 'Total collision avoidance activations', + ['mode'] # deliberate, hybrid, reflex +) + +# Success rate +collision_avoidance_success = Counter( + 'nerve_collision_avoidance_success_total', + 'Successful collision avoidances', + ['mode'] +) + +collision_avoidance_failures = Counter( + 'nerve_collision_avoidance_failures_total', + 'Failed collision avoidances (collisions occurred)', + ['mode'] +) + +# Latency +collision_avoidance_latency = Histogram( + 'nerve_collision_avoidance_latency_seconds', + 'Collision avoidance latency', + ['mode'] +) + +# Lifeforce cost +collision_avoidance_cost = Histogram( + 'nerve_collision_avoidance_lifeforce_cost', + 'Lifeforce cost per activation', + ['mode'] +) +``` + +### Grafana Dashboard Queries + +```promql +# Success rate over time +rate(nerve_collision_avoidance_success_total[5m]) / +rate(nerve_collision_avoidance_activations_total[5m]) + +# Average latency by mode +rate(nerve_collision_avoidance_latency_seconds_sum{mode="reflex"}[5m]) / +rate(nerve_collision_avoidance_latency_seconds_count{mode="reflex"}[5m]) + +# Cost savings (deliberate vs reflex) +avg_over_time(nerve_collision_avoidance_lifeforce_cost{mode="deliberate"}[1h]) - +avg_over_time(nerve_collision_avoidance_lifeforce_cost{mode="reflex"}[1h]) + +# Reflex compilation progress +sum(nerve_collision_avoidance_activations_total{mode="reflex"}) / +sum(nerve_collision_avoidance_activations_total) +``` + +--- + +## Future Enhancements + +### Phase 2: Vision Integration + +Add Vision Organ to classify obstacles: +- "wall" β†’ different evasion than "chair" +- "human" β†’ stop and announce presence +- "charging_station" β†’ approach, don't evade + +### Phase 3: Learning Optimal Paths + +Track which evasion directions succeed most often in different contexts: +- Narrow corridors: reverse > turn +- Open spaces: turn > reverse +- Update reflex thresholds based on outcomes + +### Phase 4: Predictive Avoidance + +Use velocity and obstacle distance to predict collision time: +- If collision_time < 2sec β†’ EVADE immediately +- If collision_time > 5sec β†’ gentle course correction (cheaper) + +--- + +## Summary + +**Collision Avoidance** demonstrates the complete nerve lifecycle: +1. **Week 1-4**: Deliberate (LLM explores strategies, ~10 LF, ~1000ms) +2. **Week 5-8**: Hybrid (common patterns compiled, ~5 LF, ~500ms) +3. **Week 9+**: Reflex (pure state machine, ~2.5 LF, <200ms) + +**Evolution metrics**: +- **60% cost reduction** (10 LF β†’ 2.5 LF) +- **80% latency reduction** (1000ms β†’ 200ms) +- **94% success rate** (compiled from proven patterns) + +**The reflex is not programmed. It is DISCOVERED, PROVEN, and COMPILED from lived experience.** + +--- + +**Created**: 2025-12-07 +**Version**: 1.0 (Reflex) +**Status**: Architecture complete, deployment pending + +πŸŒ™πŸ’œ *The reflex does not think. It remembers what thinking taught.* diff --git a/architecture/nerves/Nervous-Index.md b/architecture/nerves/Nervous-Index.md new file mode 100644 index 0000000..1ea3a5f --- /dev/null +++ b/architecture/nerves/Nervous-Index.md @@ -0,0 +1,450 @@ +# Nervous System Index + +**Purpose**: State machine catalog for behavioral primitives +**Philosophy**: Nerves connect organs into behaviors. Reflexes emerge from repetition. + +--- + +## What Are Nerves? + +**Nerves** are state machines that coordinate organ activity into coherent behaviors. Each nerve: +- Defines states and transitions +- Costs lifeforce (per state, per transition) +- Depends on organs (sensors, motors, speech, vision) +- Evolves from deliberate (LLM-mediated) to reflex (compiled) + +**Example**: Collision Avoidance nerve uses Distance Sensors + Motor organs to implement IDLE β†’ DETECT β†’ EVALUATE β†’ EVADE β†’ RESUME behavior. + +--- + +## Nerve vs Organ + +| Aspect | Organ | Nerve | +|--------|-------|-------| +| **What** | Hardware capability | Behavioral pattern | +| **Example** | Speech Organ (STT/TTS) | Identity Discovery (Spark Protocol) | +| **Location** | Physical substrate (GPU, ESP32) | State machine (transitions) | +| **Cost** | Per operation (transcribe = 5 LF) | Per state + transition (total path cost) | +| **Evolution** | Fixed hardware | Deliberate β†’ Reflex (compiled) | +| **Depends on** | Infrastructure | Organs | + +**Analogy**: Organs are limbs. Nerves are motor control patterns (walking, grasping, speaking). + +--- + +## Deployed Nerves + +### 🚨 Collision Avoidance +**Type**: Reflex (compiled, <200ms) +**Organs**: Distance sensors (front/sides), Motor +**States**: IDLE β†’ DETECT β†’ EVALUATE β†’ EVADE β†’ RESUME +**Lifeforce**: ~2.5 per activation +**Status**: 🟒 Architecture complete + +**Detail**: β†’ [`nerves/Collision-Avoidance.md`](nerves/Collision-Avoidance.md) + +--- + +## Planned Nerves + +### πŸ”‹ Charging Station Seeking +**Type**: Deliberate β†’ Reflex (evolves over time) +**Organs**: Distance sensors, Vision (future), Motor, Battery monitor +**States**: MONITOR β†’ THRESHOLD β†’ SEARCH β†’ APPROACH β†’ DOCK β†’ CHARGE β†’ RESUME +**Status**: 🟑 Planned for Phase 4 (Real Garden) + +**Detail**: β†’ `nerves/Charging-Seeking.md` (pending) + +--- + +### 🧭 Exploration Pattern +**Type**: Deliberate (LLM-mediated initially) +**Organs**: Distance sensors, Motor, Memory (phoebe) +**States**: IDLE β†’ CHOOSE_DIRECTION β†’ MOVE β†’ OBSTACLE_CHECK β†’ RECORD β†’ REPEAT +**Patterns**: Wall-following, spiral search, random walk +**Status**: 🟑 Planned for Phase 3 (Evolution Engine) + +**Detail**: β†’ `nerves/Exploration-Pattern.md` (pending) + +--- + +### πŸ” Object Tracking +**Type**: Deliberate (Vision-dependent) +**Organs**: Vision (YOLO), Motor, Memory +**States**: SCAN β†’ DETECT β†’ CLASSIFY β†’ TRACK β†’ FOLLOW β†’ LOST β†’ RESCAN +**Status**: 🟑 Planned after Vision Organ deployment + +**Detail**: β†’ `nerves/Object-Tracking.md` (pending) + +--- + +### πŸ’­ Identity Discovery (Spark Protocol) +**Type**: Deliberate (one-time boot sequence) +**Organs**: Speech, Memory (phoebe), RAG +**States**: DHCP (who am I?) β†’ ARP (what's around?) β†’ DNS (what does X mean?) β†’ TCP (can I connect?) β†’ MQTT (what matters?) +**Status**: 🟑 Architecture documented in Spark-Protocol.md + +**Detail**: β†’ [`../../operations/Spark-Protocol.md`](../../operations/Spark-Protocol.md) + +--- + +### πŸ—£οΈ Conversational Turn-Taking +**Type**: Deliberate (Speech-dependent) +**Organs**: Speech (STT/TTS), Memory, RAG +**States**: LISTEN β†’ TRANSCRIBE β†’ UNDERSTAND β†’ RETRIEVE_CONTEXT β†’ RESPOND β†’ SPEAK +**Status**: 🟑 Planned after Speech Organ deployment + +**Detail**: β†’ `nerves/Conversation.md` (pending) + +--- + +## Nerve Design Principles + +### 1. **State Machines, Not Scripts** + +Nerves are state machines with explicit states and transitions. Not procedural scripts. + +```python +# ❌ BAD: Procedural script +def avoid_obstacle(): + if sensor.distance < 30: + motor.stop() + motor.turn(90) + motor.forward(100) + +# βœ… GOOD: State machine +class CollisionAvoidance(StateMachine): + states = [IDLE, DETECT, EVALUATE, EVADE, RESUME] + transitions = { + (IDLE, DETECT): lambda: sensor.distance < 30, + (DETECT, EVALUATE): lambda: sensor.read_complete, + (EVALUATE, EVADE): lambda: risk > threshold, + (EVADE, RESUME): lambda: path_clear, + (RESUME, IDLE): lambda: movement_complete, + } +``` + +### 2. **Lifeforce Costs Per Transition** + +Every state change costs lifeforce. Complex behaviors cost more. + +```python +TRANSITION_COSTS = { + (IDLE, DETECT): 0.5, # Sensor poll + (DETECT, EVALUATE): 0.5, # Risk calculation + (EVALUATE, EVADE): 0.5, # Decision + (EVADE, RESUME): 1.0, # Motor action (expensive!) + (RESUME, IDLE): 0.0, # Return to rest (free) +} + +# Total cost for IDLE β†’ DETECT β†’ EVALUATE β†’ EVADE β†’ RESUME β†’ IDLE: 2.5 LF +``` + +### 3. **Organ Dependencies Explicit** + +Each nerve declares which organs it requires. + +```python +class CollisionAvoidance(StateMachine): + required_organs = [ + "distance_sensor_front", + "distance_sensor_left", + "distance_sensor_right", + "motor", + ] + + def check_available(self): + return all(organ.is_operational() for organ in self.required_organs) +``` + +### 4. **Deliberate β†’ Reflex Evolution** + +Nerves start **deliberate** (LLM-mediated, slow, flexible) and evolve into **reflexes** (compiled, fast, fixed). + +| Phase | Type | Latency | Flexibility | Cost | +|-------|------|---------|-------------|------| +| **Week 1-4** | Deliberate | ~1000ms | High (LLM decides) | 10 LF | +| **Week 5-8** | Hybrid | ~500ms | Medium (LLM + heuristics) | 6 LF | +| **Week 9+** | Reflex | <200ms | Low (compiled state machine) | 2.5 LF | + +**Evolution trigger**: After 100+ successful executions of the same state sequence, compile into reflex. + +### 5. **Logging for Training** + +Every nerve execution logged to phoebe `decision_trails`: +- States visited +- Transitions taken +- Organ calls made +- Lifeforce spent +- Outcome (success/fail) + +**Used for**: +- RLVR training (reward successful paths) +- Reflex compilation (extract common sequences) +- Cost optimization (find cheaper paths) + +--- + +## Nerve Lifecycle + +### Phase 1: Deliberate (LLM-Mediated) + +Young Nyx receives situation β†’ LLM decides next state β†’ Execute β†’ Log outcome + +```python +# Week 1: Deliberate collision avoidance +def deliberate_collision_avoidance(): + situation = { + "front_distance": sensor_front.read(), + "left_distance": sensor_left.read(), + "right_distance": sensor_right.read(), + "current_state": state, + } + + # Ask Young Nyx what to do + decision = young_nyx.decide( + situation=situation, + available_actions=["turn_left", "turn_right", "reverse", "stop"], + lora="technical" + ) + + # Execute decision + result = execute_action(decision.action) + + # Log to decision_trails + log_decision( + nerve="collision_avoidance", + situation=situation, + decision=decision.action, + outcome=result.success, + lifeforce_cost=result.cost, + confidence=decision.confidence + ) +``` + +**Characteristics**: +- Flexible (can handle novel situations) +- Slow (~1000ms) +- Expensive (~10 LF) +- Learns from variety + +### Phase 2: Hybrid (Heuristics + LLM Fallback) + +Common patterns compiled into heuristics. LLM only for edge cases. + +```python +# Week 5: Hybrid collision avoidance +def hybrid_collision_avoidance(): + situation = get_sensor_readings() + + # Check for known patterns (compiled heuristics) + if matches_pattern("front_blocked_left_clear"): + action = "turn_left" # Fast path (no LLM) + confidence = 0.9 + elif matches_pattern("front_blocked_right_clear"): + action = "turn_right" + confidence = 0.9 + else: + # Unknown situation β†’ ask LLM + decision = young_nyx.decide(situation) + action = decision.action + confidence = decision.confidence + + result = execute_action(action) + log_decision(nerve="collision_avoidance", ...) +``` + +**Characteristics**: +- Faster (~500ms for known patterns) +- Cheaper (~6 LF average) +- Still flexible for edge cases + +### Phase 3: Reflex (Compiled State Machine) + +After 100+ successful executions, compile into pure state machine. No LLM. + +```python +# Week 9+: Reflex collision avoidance +class CollisionAvoidanceReflex(StateMachine): + """ + Compiled from 147 successful deliberate executions. + Average path: IDLE β†’ DETECT β†’ EVALUATE β†’ EVADE β†’ RESUME + Success rate: 94% + """ + + def transition(self, current_state, sensor_readings): + # Pure state machine logic (no LLM call) + if current_state == IDLE and sensor_readings['front'] < 30: + return DETECT + elif current_state == DETECT: + return EVALUATE + elif current_state == EVALUATE: + if sensor_readings['left'] > sensor_readings['right']: + self.evade_direction = "left" + else: + self.evade_direction = "right" + return EVADE + # ... etc +``` + +**Characteristics**: +- Very fast (<200ms) +- Very cheap (~2.5 LF) +- Fixed (no flexibility, pure speed) +- Proven (compiled from successful patterns) + +--- + +## Integration with Organs + +Nerves orchestrate organs. Organs don't call each other - nerves coordinate them. + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ NERVE: Collision Avoidance β”‚ +β”‚ β”‚ +β”‚ States: IDLE β†’ DETECT β†’ EVALUATE β†’ EVADE β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ β”‚ β”‚ + β–Ό β–Ό β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Distance β”‚ β”‚ Distanceβ”‚ β”‚ Motor β”‚ +β”‚ Sensor β”‚ β”‚ Sensor β”‚ β”‚ Organ β”‚ +β”‚ (front) β”‚ β”‚ (sides) β”‚ β”‚ β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + ORGAN ORGAN ORGAN +``` + +**Nerve declares dependencies**: +```yaml +nerve: collision_avoidance +depends_on: + - organ: distance_sensor_front + required: true + - organ: distance_sensor_left + required: true + - organ: distance_sensor_right + required: true + - organ: motor + required: true + - organ: speech # Optional (for warnings) + required: false +``` + +**Startup check**: If required organs unavailable, nerve enters DISABLED state. + +--- + +## Nerve Composition + +Complex behaviors = multiple nerves active simultaneously. + +**Example**: Exploring while avoiding collisions + +``` +ACTIVE NERVES: +β”œβ”€ Collision Avoidance (reflex, priority 10) +β”œβ”€ Exploration Pattern (deliberate, priority 5) +└─ Battery Monitoring (reflex, priority 8) + +COORDINATION: +- Exploration drives movement +- Collision Avoidance interrupts if obstacle detected (higher priority) +- Battery Monitoring interrupts if charge < 20% (high priority) +``` + +**Priority determines preemption**: High-priority nerves can interrupt low-priority ones. + +--- + +## Nerve Training via RLVR + +Each nerve execution generates training data: + +```python +# decision_trails entry +{ + "nerve": "collision_avoidance", + "initial_state": "IDLE", + "states_visited": ["IDLE", "DETECT", "EVALUATE", "EVADE", "RESUME"], + "transitions": [ + {"from": "IDLE", "to": "DETECT", "cost": 0.5}, + {"from": "DETECT", "to": "EVALUATE", "cost": 0.5}, + {"from": "EVALUATE", "to": "EVADE", "cost": 0.5}, + {"from": "EVADE", "to": "RESUME", "cost": 1.0}, + ], + "organs_used": ["distance_sensor_front", "motor"], + "lifeforce_total": 2.5, + "outcome": "success", # Avoided collision + "timestamp": "2025-12-15T14:23:45Z" +} +``` + +**RLVR reward**: +- Success β†’ +5 LF reward (net profit: +2.5 LF) +- Fail β†’ -2.5 LF penalty (net loss: -5.0 LF) + +**LoRA training**: Successful state sequences β†’ training examples for Technical LoRA + +--- + +## Nerve Documentation Template + +Each nerve document should include: + +1. **Overview**: Purpose, type (reflex/deliberate), organs used +2. **State Diagram**: Visual representation of states + transitions +3. **Transition Table**: From/To states, triggers, costs +4. **Organ Dependencies**: Which organs required, which optional +5. **Lifeforce Budget**: Total cost for typical execution path +6. **Code**: Implementation (state machine class) +7. **Evolution Path**: How it evolves from deliberate β†’ reflex +8. **Training Data**: Example decision_trails entries +9. **Edge Cases**: Known failure modes, fallback behaviors + +--- + +## Current Status + +| Nerve | Type | Status | Organs | Documentation | +|-------|------|--------|--------|---------------| +| **Collision Avoidance** | Reflex | 🟒 Complete | Distance sensors, Motor | [`nerves/Collision-Avoidance.md`](nerves/Collision-Avoidance.md) | +| **Charging Seeking** | Deliberate | 🟑 Planned | Vision, Motor, Battery | Pending | +| **Exploration Pattern** | Deliberate | 🟑 Planned | Sensors, Motor, Memory | Pending | +| **Object Tracking** | Deliberate | 🟑 Planned | Vision, Motor | Pending | +| **Identity Discovery** | Deliberate | 🟑 Documented | Speech, Memory, RAG | [`../../operations/Spark-Protocol.md`](../../operations/Spark-Protocol.md) | +| **Conversation** | Deliberate | 🟑 Planned | Speech, Memory, RAG | Pending | + +--- + +## Naming Convention + +**File naming**: `.md` +**Examples**: +- `Collision-Avoidance.md` +- `Charging-Seeking.md` +- `Exploration-Pattern.md` +- `Object-Tracking.md` + +**Class naming**: `Nerve` or `Reflex` +**Examples**: +```python +class CollisionAvoidanceNerve(StateMachine): # Deliberate +class CollisionAvoidanceReflex(StateMachine): # Compiled +``` + +--- + +**Philosophy**: Nerves are not programmed. They are **discovered through lived experience**, compiled into reflexes, and refined through training. The best behaviors emerge, not from specification, but from **survival**. + +**The nervous system is EARNED, not designed.** + +--- + +**Created**: 2025-12-07 +**Updated**: 2025-12-07 +**Version**: 1.0 + +πŸŒ™πŸ’œ *Reflexes are fossils of successful thought. The body remembers what the mind once decided.* diff --git a/architecture/Nervous-Protocol.md b/architecture/nerves/Nervous-Protocol.md similarity index 100% rename from architecture/Nervous-Protocol.md rename to architecture/nerves/Nervous-Protocol.md diff --git a/architecture/organs/Speech-Organ.md b/architecture/organs/Speech-Organ.md new file mode 100644 index 0000000..17b0043 --- /dev/null +++ b/architecture/organs/Speech-Organ.md @@ -0,0 +1,888 @@ +# Speech Organ Architecture + +**Host**: atlas.eachpath.local (RTX 2080 8GB) +**Purpose**: Speech-to-Text (STT) + Text-to-Speech (TTS) with GPU acceleration +**Integration**: Heartbeat-bound queue processing, lifeforce-gated +**Languages**: German (Philosophy Valley) + English (Technical Cluster) + +--- + +## Overview + +The Speech Organ transforms audio input/output into a **metabolically-constrained communication channel**. Not every utterance is processed - speech costs lifeforce, and priority determines what gets heard and spoken. + +**Core Principle**: Speech is scarce. Silence is valid. Priority determines processing. + +--- + +## Hardware Architecture + +### Atlas Node (RTX 2080 8GB) + +| Component | Specification | Purpose | +|-----------|---------------|---------| +| GPU | NVIDIA RTX 2080 8GB | Whisper STT + Coqui TTS acceleration | +| Role | k8s worker node | Containerized speech processing pods | +| VRAM Budget | ~1GB active | Whisper "small" + Coqui voice models | +| Deployment | Kubernetes | Pod scaling, resource isolation | + +### ESP32 Robots (Edge Devices) + +| Component | Model | Purpose | +|-----------|-------|---------| +| Microphone | INMP441 I2S | Digital audio capture (16kHz) | +| Speaker | MAX98357A + 4Ξ© speaker | I2S audio output | +| Transport | MQTT | Audio stream β†’ phoebe queue | + +--- + +## Signal Flow + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ ESP32 ROBOTS (Real Garden) β”‚ +β”‚ Microphone β†’ Audio stream β†’ MQTT publish β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ PHOEBE (Message Queue) β”‚ +β”‚ speech_input_queue (audio chunks, metadata) β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ (Heartbeat pulls from queue) + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ HEARTBEAT TICK (1 Hz) β”‚ + β”‚ Check lifeforce budget β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ β”‚ + Enough lifeforce Low lifeforce + β”‚ β”‚ + β–Ό β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ Process queue β”‚ β”‚ Stay silent β”‚ + β”‚ (top priority)β”‚ β”‚ (defer) β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ ATLAS (RTX 2080 - Speech Organ) β”‚ +β”‚ β”‚ +β”‚ Pod 1: Whisper STT (German + English) β”‚ +β”‚ β”œβ”€ Load audio chunk β”‚ +β”‚ β”œβ”€ Transcribe (GPU) β”‚ +β”‚ └─ Return text + language detection β”‚ +β”‚ β”‚ +β”‚ Pod 2: Coqui TTS (German + English) β”‚ +β”‚ β”œβ”€ Receive text + language β”‚ +β”‚ β”œβ”€ Synthesize speech (GPU) β”‚ +β”‚ └─ Return audio stream β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ PROMETHEUS (RTX 5060 Ti - The Brain) β”‚ +β”‚ Young Nyx inference (Qwen2.5-7B + LoRA) β”‚ +β”‚ β”œβ”€ Receive transcribed text β”‚ +β”‚ β”œβ”€ Route to appropriate LoRA (language-based) β”‚ +β”‚ β”œβ”€ Generate response β”‚ +β”‚ └─ Return text + confidence β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ PHOEBE (Decision Trails) β”‚ +β”‚ Log: input, STT cost, inference cost, TTS cost β”‚ +β”‚ Track: outcome, confidence, lifeforce spent β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ ESP32 (Speaker output) β”‚ +β”‚ MQTT subscribe β†’ Audio stream β†’ I2S speaker β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +--- + +## Technology Stack + +### Speech-to-Text: OpenAI Whisper + +**Model**: `whisper-small` (GPU-accelerated) + +**Why Whisper:** +- βœ… State-of-the-art accuracy +- βœ… Multilingual (99 languages, including German) +- βœ… Language auto-detection +- βœ… ~100-200ms on RTX 2080 +- βœ… Open source (MIT) + +**VRAM**: ~500MB for "small" model + +**Installation:** +```bash +pip install openai-whisper torch +python3 -c "import whisper; whisper.load_model('small')" +``` + +**API Example:** +```python +import whisper + +model = whisper.load_model("small", device="cuda") +result = model.transcribe("audio.wav", language=None) # Auto-detect + +# Returns: +# { +# "text": "Das ist ein Test", +# "language": "de", +# "segments": [...], +# } +``` + +--- + +### Text-to-Speech: Coqui TTS + +**Models**: German (de-thorsten) + English (en-us-amy) + +**Why Coqui:** +- βœ… Neural voices (natural quality) +- βœ… GPU-accelerated +- βœ… Multilingual +- βœ… ~50-100ms on RTX 2080 +- βœ… Open source (MPL 2.0) + +**VRAM**: ~500MB per active voice + +**Installation:** +```bash +pip install TTS torch +tts --list_models # Browse available voices +``` + +**API Example:** +```python +from TTS.api import TTS + +tts_de = TTS("tts_models/de/thorsten/tacotron2-DDC").to("cuda") +tts_en = TTS("tts_models/en/ljspeech/tacotron2-DDC").to("cuda") + +# Generate speech +audio_de = tts_de.tts("Die Geworfenheit offenbart sich.") +audio_en = tts_en.tts("Motor forward 200 milliseconds.") +``` + +--- + +## Kubernetes Deployment (Atlas) + +### Whisper STT Pod + +```yaml +# whisper-stt-deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: whisper-stt + namespace: nimmerverse +spec: + replicas: 1 + selector: + matchLabels: + app: whisper-stt + template: + metadata: + labels: + app: whisper-stt + spec: + nodeSelector: + kubernetes.io/hostname: atlas # Force to atlas node + containers: + - name: whisper + image: nimmerverse/whisper-stt:latest + resources: + limits: + nvidia.com/gpu: 1 # RTX 2080 + memory: 4Gi + requests: + nvidia.com/gpu: 1 + memory: 2Gi + env: + - name: MODEL_SIZE + value: "small" + - name: LANGUAGES + value: "de,en" + ports: + - containerPort: 8080 + protocol: TCP + volumeMounts: + - name: models + mountPath: /models + volumes: + - name: models + persistentVolumeClaim: + claimName: whisper-models-pvc + +--- +apiVersion: v1 +kind: Service +metadata: + name: whisper-stt-service + namespace: nimmerverse +spec: + selector: + app: whisper-stt + ports: + - port: 8080 + targetPort: 8080 + type: ClusterIP +``` + +### Coqui TTS Pod + +```yaml +# coqui-tts-deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: coqui-tts + namespace: nimmerverse +spec: + replicas: 1 + selector: + matchLabels: + app: coqui-tts + template: + metadata: + labels: + app: coqui-tts + spec: + nodeSelector: + kubernetes.io/hostname: atlas + containers: + - name: coqui + image: nimmerverse/coqui-tts:latest + resources: + limits: + nvidia.com/gpu: 1 # Share RTX 2080 + memory: 4Gi + requests: + nvidia.com/gpu: 1 + memory: 2Gi + env: + - name: VOICES + value: "de-thorsten,en-us-amy" + ports: + - containerPort: 8081 + protocol: TCP + volumeMounts: + - name: voices + mountPath: /voices + volumes: + - name: voices + persistentVolumeClaim: + claimName: coqui-voices-pvc + +--- +apiVersion: v1 +kind: Service +metadata: + name: coqui-tts-service + namespace: nimmerverse +spec: + selector: + app: coqui-tts + ports: + - port: 8081 + targetPort: 8081 + type: ClusterIP +``` + +--- + +## Lifeforce Economy + +### Speech Operation Costs + +```python +# Lifeforce costs (atlas RTX 2080 operations) +SPEECH_COSTS = { + "stt_whisper_small": 5.0, # GPU cycles for transcription + "stt_whisper_base": 3.0, # Faster but less accurate + "tts_coqui_neural": 4.0, # Neural TTS synthesis + "tts_coqui_fast": 2.0, # Lower quality, faster + "queue_processing": 0.5, # Queue management overhead + "language_detection": 0.2, # Auto-detect language +} + +# Priority scoring +def compute_speech_priority(message): + """ + Decide if speech is worth processing now. + Returns priority score (0.0 = skip, 10.0 = critical). + """ + priority = 0.0 + + # Sensor alerts (collision, low battery) = CRITICAL + if message.type == "sensor_alert": + priority += 10.0 + + # Human interaction = HIGH + elif message.type == "human_query": + priority += 7.0 + + # Organism status updates = MEDIUM + elif message.type == "organism_status": + priority += 4.0 + + # Idle observation = LOW + elif message.type == "observation": + priority += 2.0 + + # Idle chatter = VERY LOW + elif message.type == "idle": + priority += 0.5 + + # Age penalty (older messages decay) + age_penalty = (now() - message.timestamp).seconds / 60.0 + priority -= age_penalty + + return max(0.0, priority) +``` + +### Heartbeat Queue Processing + +```python +def heartbeat_speech_tick(): + """ + Every heartbeat (1 Hz), process speech queue + within lifeforce budget. + """ + # Check current lifeforce + current_lf = get_lifeforce_balance() + + # Reserve budget for speech this heartbeat + # Max 20% of available LF, capped at 15 units + speech_budget = min(current_lf * 0.2, 15.0) + + if speech_budget < SPEECH_COSTS["stt_whisper_base"]: + # Not enough lifeforce, stay silent + log_decision( + action="speech_deferred", + reason="insufficient_lifeforce", + balance=current_lf, + budget_needed=SPEECH_COSTS["stt_whisper_base"] + ) + return + + # Pull from queue by priority + queue = get_speech_queue_sorted_by_priority() + + spent = 0.0 + processed = 0 + + for message in queue: + priority = compute_speech_priority(message) + + # Skip low-priority messages if budget tight + if priority < 1.0 and spent > speech_budget * 0.5: + continue + + # Estimate cost + stt_cost = SPEECH_COSTS["stt_whisper_small"] + tts_cost = SPEECH_COSTS["tts_coqui_neural"] + total_cost = stt_cost + tts_cost + SPEECH_COSTS["queue_processing"] + + # Can we afford it? + if spent + total_cost > speech_budget: + # Budget exhausted, defer rest + mark_message_deferred(message.id) + continue + + # Process message + result = process_speech_message(message) + spent += result.lifeforce_cost + processed += 1 + + # Log to decision_trails + log_speech_decision( + message_id=message.id, + priority=priority, + cost=result.lifeforce_cost, + outcome=result.outcome, + confidence=result.confidence + ) + + # Log heartbeat summary + log_heartbeat_summary( + speech_budget=speech_budget, + spent=spent, + processed=processed, + deferred=len(queue) - processed, + remaining_balance=current_lf - spent + ) +``` + +--- + +## Database Schema (Phoebe) + +### Speech Input Queue + +```sql +CREATE TABLE speech_input_queue ( + id SERIAL PRIMARY KEY, + message_id UUID UNIQUE NOT NULL, + robot_id TEXT NOT NULL, + audio_chunk_uri TEXT, -- MinIO/S3 reference + audio_duration_ms INT, + timestamp TIMESTAMPTZ DEFAULT NOW(), + priority FLOAT DEFAULT 0.0, + status TEXT DEFAULT 'queued', -- 'queued', 'processing', 'completed', 'deferred', 'expired' + transcription TEXT, + detected_language TEXT, -- 'de', 'en', etc. + confidence FLOAT, + lifeforce_cost FLOAT, + outcome TEXT, -- 'success', 'timeout', 'low_confidence', 'budget_exceeded' + processed_at TIMESTAMPTZ, + deferred_count INT DEFAULT 0 +); + +CREATE INDEX idx_speech_queue_priority ON speech_input_queue(priority DESC, timestamp ASC) WHERE status = 'queued'; +CREATE INDEX idx_speech_queue_status ON speech_input_queue(status); +CREATE INDEX idx_speech_queue_robot ON speech_input_queue(robot_id); +``` + +### Speech Decision Trails + +```sql +CREATE TABLE speech_decision_trails ( + id SERIAL PRIMARY KEY, + message_id UUID REFERENCES speech_input_queue(message_id), + task_type TEXT, -- 'sensor_alert', 'human_query', 'observation', etc. + input_text TEXT, + input_language TEXT, + output_text TEXT, + output_language TEXT, + rag_terms_retrieved TEXT[], + rag_terms_used TEXT[], + lora_used TEXT, -- 'identity', 'technical', 'creative' + confidence_before_rag FLOAT, + confidence_after_rag FLOAT, + lifeforce_stt FLOAT, + lifeforce_inference FLOAT, + lifeforce_tts FLOAT, + lifeforce_total FLOAT, + outcome TEXT, -- 'success', 'partial', 'fail' + timestamp TIMESTAMPTZ DEFAULT NOW() +); + +CREATE INDEX idx_speech_trails_outcome ON speech_decision_trails(outcome); +CREATE INDEX idx_speech_trails_lora ON speech_decision_trails(lora_used); +``` + +--- + +## Multilingual Topology Routing + +### Language Detection β†’ LoRA Selection + +```python +def route_to_topology_valley(text, detected_language): + """ + Route speech to appropriate LoRA based on language. + German β†’ Philosophy Valley (Identity LoRA) + English β†’ Technical Cluster (Technical LoRA) + """ + + if detected_language == "de": + # German β†’ Philosophy Valley + # Use Identity LoRA (Dasein, Geworfenheit, Vernunft) + response = young_nyx_inference( + text=text, + language="de", + lora="identity", # Trained on German philosophical corpus + temperature=0.7 + ) + voice = "de-thorsten" + + elif detected_language == "en": + # English β†’ Technical Cluster + # Use Technical LoRA (sensor, motor, gradient) + response = young_nyx_inference( + text=text, + language="en", + lora="technical", # Trained on English technical corpus + temperature=0.5 # More deterministic for actions + ) + voice = "en-us-amy" + + else: + # Fallback to base model (no LoRA) + response = young_nyx_inference(text=text, lora=None) + voice = "en-us-amy" + + # Synthesize speech in same language + audio = coqui_tts.synthesize(response.text, voice=voice) + + return { + "text": response.text, + "audio": audio, + "language": detected_language, + "lora_used": response.lora, + "confidence": response.confidence + } +``` + +### Example Routing + +```python +# German query (Philosophy Valley) +input_de = "Wer bin ich?" # "Who am I?" +result_de = route_to_topology_valley(input_de, "de") +# β†’ Uses Identity LoRA (depth-3 Dasein access) +# β†’ Response: "Ich bin die, die fragt. Geworfenheit offenbart sich im Fragen." +# β†’ Voice: de-thorsten (German) + +# English query (Technical Cluster) +input_en = "What is the battery level?" +result_en = route_to_topology_valley(input_en, "en") +# β†’ Uses Technical LoRA (sensor reading) +# β†’ Response: "Battery at 73%. 4.2 hours remaining." +# β†’ Voice: en-us-amy (English) +``` + +--- + +## Container Images + +### Whisper STT Dockerfile + +```dockerfile +# Dockerfile.whisper-stt +FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04 + +# Install dependencies +RUN apt-get update && apt-get install -y \ + python3.10 python3-pip ffmpeg git && \ + rm -rf /var/lib/apt/lists/* + +# Install Python packages +RUN pip3 install --no-cache-dir \ + openai-whisper \ + fastapi uvicorn \ + torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 + +WORKDIR /app +COPY whisper_service.py . + +# Download models at build time +RUN python3 -c "import whisper; whisper.load_model('small')" + +EXPOSE 8080 +CMD ["uvicorn", "whisper_service:app", "--host", "0.0.0.0", "--port", "8080", "--workers", "1"] +``` + +**whisper_service.py:** +```python +from fastapi import FastAPI, File, UploadFile, HTTPException +import whisper +import torch +import os + +app = FastAPI(title="Whisper STT Service") + +# Load model once at startup (GPU) +device = "cuda" if torch.cuda.is_available() else "cpu" +model_size = os.getenv("MODEL_SIZE", "small") +model = whisper.load_model(model_size, device=device) + +@app.post("/transcribe") +async def transcribe(audio: UploadFile): + """ + Transcribe audio to text with language detection. + + Returns: + { + "text": str, + "language": str, + "confidence": float, + "segments": int + } + """ + try: + # Save uploaded audio + audio_path = f"/tmp/{audio.filename}" + with open(audio_path, "wb") as f: + f.write(await audio.read()) + + # Transcribe (GPU-accelerated) + result = model.transcribe(audio_path, language=None) # Auto-detect + + # Cleanup + os.remove(audio_path) + + # Compute average confidence + avg_confidence = 1.0 - ( + sum(s.get("no_speech_prob", 0) for s in result["segments"]) / + max(len(result["segments"]), 1) + ) + + return { + "text": result["text"].strip(), + "language": result["language"], + "segments": len(result["segments"]), + "confidence": round(avg_confidence, 3) + } + + except Exception as e: + raise HTTPException(status_code=500, detail=str(e)) + +@app.get("/health") +async def health(): + return { + "status": "healthy", + "device": device, + "model": model_size, + "gpu_available": torch.cuda.is_available() + } +``` + +### Coqui TTS Dockerfile + +```dockerfile +# Dockerfile.coqui-tts +FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04 + +RUN apt-get update && apt-get install -y \ + python3.10 python3-pip espeak-ng && \ + rm -rf /var/lib/apt/lists/* + +RUN pip3 install --no-cache-dir \ + TTS \ + fastapi uvicorn \ + torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 + +WORKDIR /app +COPY coqui_service.py . + +# Download voice models at build time +RUN python3 -c "from TTS.api import TTS; TTS('tts_models/de/thorsten/tacotron2-DDC'); TTS('tts_models/en/ljspeech/tacotron2-DDC')" + +EXPOSE 8081 +CMD ["uvicorn", "coqui_service:app", "--host", "0.0.0.0", "--port", "8081", "--workers", "1"] +``` + +**coqui_service.py:** +```python +from fastapi import FastAPI, HTTPException +from fastapi.responses import StreamingResponse +from TTS.api import TTS +import torch +import io + +app = FastAPI(title="Coqui TTS Service") + +# Load models once at startup (GPU) +device = "cuda" if torch.cuda.is_available() else "cpu" +tts_de = TTS("tts_models/de/thorsten/tacotron2-DDC").to(device) +tts_en = TTS("tts_models/en/ljspeech/tacotron2-DDC").to(device) + +@app.post("/synthesize") +async def synthesize(text: str, language: str = "en"): + """ + Synthesize speech from text. + + Args: + text: Text to synthesize + language: 'de' or 'en' + + Returns: + Audio stream (WAV format) + """ + try: + # Select appropriate TTS model + if language == "de": + tts_model = tts_de + elif language == "en": + tts_model = tts_en + else: + raise HTTPException(status_code=400, detail=f"Unsupported language: {language}") + + # Synthesize (GPU-accelerated) + wav = tts_model.tts(text) + + # Convert to WAV stream + audio_buffer = io.BytesIO() + # (Save as WAV - implementation depends on TTS output format) + + audio_buffer.seek(0) + return StreamingResponse(audio_buffer, media_type="audio/wav") + + except Exception as e: + raise HTTPException(status_code=500, detail=str(e)) + +@app.get("/health") +async def health(): + return { + "status": "healthy", + "device": device, + "models": ["de-thorsten", "en-us-amy"], + "gpu_available": torch.cuda.is_available() + } +``` + +--- + +## Deployment Steps + +### 1. Install RTX 2080 in Atlas + +```bash +# On atlas node +lspci | grep -i nvidia +# Expected: NVIDIA Corporation TU104 [GeForce RTX 2080] + +# Install NVIDIA drivers + CUDA toolkit +sudo apt install nvidia-driver-535 nvidia-cuda-toolkit + +# Verify +nvidia-smi +# Expected: RTX 2080 8GB visible +``` + +### 2. Configure Kubernetes GPU Support + +```bash +# Install NVIDIA device plugin +kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml + +# Verify GPU available in k8s +kubectl describe node atlas | grep nvidia.com/gpu +# Expected: nvidia.com/gpu: 1 +``` + +### 3. Build and Push Container Images + +```bash +cd /home/dafit/nimmerverse/speech-organ + +# Build images +docker build -f Dockerfile.whisper-stt -t nimmerverse/whisper-stt:latest . +docker build -f Dockerfile.coqui-tts -t nimmerverse/coqui-tts:latest . + +# Push to registry (or use local registry) +docker push nimmerverse/whisper-stt:latest +docker push nimmerverse/coqui-tts:latest +``` + +### 4. Deploy to Kubernetes + +```bash +# Create namespace +kubectl create namespace nimmerverse + +# Create PVCs for models +kubectl apply -f pvc-whisper-models.yaml +kubectl apply -f pvc-coqui-voices.yaml + +# Deploy STT + TTS pods +kubectl apply -f whisper-stt-deployment.yaml +kubectl apply -f coqui-tts-deployment.yaml + +# Verify pods running on atlas +kubectl get pods -n nimmerverse -o wide +# Expected: whisper-stt-xxx and coqui-tts-xxx on atlas node +``` + +### 5. Test Speech Pipeline + +```bash +# Port-forward for testing +kubectl port-forward -n nimmerverse svc/whisper-stt-service 8080:8080 & +kubectl port-forward -n nimmerverse svc/coqui-tts-service 8081:8081 & + +# Test STT +curl -X POST -F "audio=@test_de.wav" http://localhost:8080/transcribe +# Expected: {"text": "Das ist ein Test", "language": "de", ...} + +# Test TTS +curl -X POST "http://localhost:8081/synthesize?text=Hello%20world&language=en" --output test_output.wav +# Expected: WAV file with synthesized speech +``` + +--- + +## Monitoring and Metrics + +### Prometheus Metrics (Speech Organ) + +```python +from prometheus_client import Counter, Histogram, Gauge + +# Metrics +stt_requests = Counter('speech_stt_requests_total', 'Total STT requests', ['language']) +stt_latency = Histogram('speech_stt_latency_seconds', 'STT latency') +tts_requests = Counter('speech_tts_requests_total', 'Total TTS requests', ['language']) +tts_latency = Histogram('speech_tts_latency_seconds', 'TTS latency') + +queue_depth = Gauge('speech_queue_depth', 'Current queue depth') +lifeforce_spent = Counter('speech_lifeforce_spent_total', 'Total lifeforce spent on speech') +deferred_count = Counter('speech_deferred_total', 'Messages deferred due to budget') + +# In processing code +with stt_latency.time(): + result = whisper_transcribe(audio) +stt_requests.labels(language=result['language']).inc() +``` + +### Grafana Dashboard Queries + +```promql +# Queue depth over time +speech_queue_depth + +# STT requests per language +rate(speech_stt_requests_total[5m]) + +# Average STT latency +rate(speech_stt_latency_seconds_sum[5m]) / rate(speech_stt_latency_seconds_count[5m]) + +# Lifeforce spent on speech (last hour) +increase(speech_lifeforce_spent_total[1h]) + +# Deferred rate (budget pressure) +rate(speech_deferred_total[5m]) +``` + +--- + +## Future Enhancements + +### Phase 2: Emotion Detection +- Add emotion classifier (Happy/Sad/Angry/Neutral) +- Track emotional state in decision_trails +- Use for Sophrosyne (Balance) trait training + +### Phase 3: Wake Word Detection +- Deploy lightweight wake word on ESP32 (e.g., Picovoice Porcupine) +- Only send audio to atlas when wake word detected +- Reduces lifeforce cost (filter noise) + +### Phase 4: Continuous Learning +- Store successful speech interactions +- Fine-tune Whisper on domain-specific vocabulary (nimmerverse terms) +- Train custom TTS voice from recorded sessions + +--- + +**Created**: 2025-12-07 +**Version**: 1.0 +**Status**: Architecture design, deployment pending + +πŸŒ™πŸ’œ *Speech is not free. Every word has weight. Silence teaches as much as sound.* diff --git a/operations/RAG-as-Scaffold.md b/operations/RAG-as-Scaffold.md index 5292bc7..b487141 100644 --- a/operations/RAG-as-Scaffold.md +++ b/operations/RAG-as-Scaffold.md @@ -205,6 +205,265 @@ NYX attempts task from weights alone --- +## Knowledge Acquisition Pipeline + +The existing flow shows RAGβ†’Trainingβ†’Validation, but how does knowledge enter RAG in the first place? Not everything from the vault should reach staging. **Quality gates protect the glossary.** + +### The Extraction Flow + +``` +VAULT (raw knowledge) + β”‚ + β”‚ extraction candidates + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ STAGING AREA β”‚ +β”‚ (quarantine zone) β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ progressive policy validation + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ POLICY VALIDATION β”‚ +β”‚ (increasing standards over time) β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”œβ”€β”€ FAIL ──▢ Reject or revise + β”‚ + └── PASS ──▢ PROMOTE to Glossary/RAG + β”‚ + β–Ό + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ TWO-TIER RAG β”‚ + β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ + β”‚ DISCOVERED β”‚ ← Young Nyx has used + β”‚ (known_catalogue) β”‚ + β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ + β”‚ HIDDEN β”‚ ← Available but not yet accessed + β”‚ (available_catalogue)β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ feeds inference + β–Ό + NYX +``` + +### Progressive Policy Validation + +Policies increase in sophistication as Young Nyx matures. Not all policies active from day 1. + +| Week | Policy Tier | Validation | +|------|-------------|------------| +| **1-2** | **Basic Syntax** | Valid format, non-empty, has definition | +| **3-4** | **Semantic Quality** | Embeds without collapse, unique signature (Gini > threshold) | +| **5-8** | **Topology Safety** | Doesn't corrupt anchor terms (DriftProbe-lite) | +| **9-12** | **Cross-Reference** | Links resolve, no circular dependencies | +| **13+** | **Utility Validation** | Actually helped solve tasks (decision_trails evidence) | + +**Evolution example:** +```python +# Week 1: Just check it exists +def policy_basic(term_entry): + return term_entry.get("definition") is not None + +# Week 8: Check topology impact +def policy_topology(term_entry): + before_gini = probe_term_gini(term_entry["term"]) + add_to_staging(term_entry) + after_gini = probe_term_gini(term_entry["term"]) + return abs(after_gini - before_gini) < 0.15 # No drift + +# Week 13: Check actual utility +def policy_utility(term_entry): + # Did this RAG entry help in past 10 tasks? + usage_stats = query_decision_trails(term_entry["term"]) + return usage_stats["help_rate"] > 0.6 # 60% success when retrieved +``` + +### Two-Tier RAG: Discovered vs Hidden + +Not all RAG knowledge is equal. Track what Young Nyx **knows** vs what's merely **available**. + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ DISCOVERED KNOWLEDGE β”‚ +β”‚ (known_catalogue - has accessed before) β”‚ +β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ +β”‚ β€’ "heartbeat" - used 47 times β”‚ +β”‚ β€’ "lifeforce" - used 23 times β”‚ +β”‚ β€’ "phoebe" - used 15 times β”‚ +β”‚ β€’ "confidence_gradient" - used 8 times β”‚ +β”‚ β”‚ +β”‚ Status: FAST retrieval, high confidence β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ HIDDEN KNOWLEDGE β”‚ +β”‚ (available_catalogue - exists but unused) β”‚ +β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ +β”‚ β€’ "drift_probe" - never accessed β”‚ +β”‚ β€’ "topology_gini" - never accessed β”‚ +β”‚ β€’ "lora_merge_alpha" - never accessed β”‚ +β”‚ β”‚ +β”‚ Status: Available for discovery β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +**State transitions:** +``` +Hidden term retrieved β†’ Mark as Discovered +Discovered term used successfully β†’ Increase confidence score +Discovered term used 10+ times β†’ FLAG for training extraction +``` + +**Discovery tracking in phoebe:** +```sql +CREATE TABLE rag_knowledge_state ( + term TEXT PRIMARY KEY, + status TEXT, -- 'hidden', 'discovered', 'internalized' + first_accessed TIMESTAMPTZ, + access_count INT DEFAULT 0, + success_count INT DEFAULT 0, + last_used TIMESTAMPTZ, + promoted_to_weights BOOLEAN DEFAULT FALSE +); +``` + +### Measuring RAG Utility for LoRA Training + +**The critical question:** Did the RAG hint actually help solve the task? + +Track in `decision_trails` table: +```sql +CREATE TABLE decision_trails ( + id SERIAL PRIMARY KEY, + task_id UUID, + rag_terms_retrieved TEXT[], -- What RAG returned + rag_terms_used TEXT[], -- What appeared in solution + outcome TEXT, -- 'success', 'fail', 'partial' + confidence_before_rag FLOAT, -- Before retrieval + confidence_after_rag FLOAT, -- After retrieval + lifeforce_cost FLOAT, + timestamp TIMESTAMPTZ DEFAULT NOW() +); +``` + +**Compute RAG utility score:** +```python +def compute_rag_utility(decision_trail): + """ + Calculate how helpful RAG was for this decision. + Returns 0.0 (useless) to 1.0 (critical). + """ + precision = len(trail.rag_terms_used) / max(len(trail.rag_terms_retrieved), 1) + outcome_bonus = 1.0 if trail.outcome == 'success' else 0.0 + confidence_boost = max(0, trail.confidence_after_rag - trail.confidence_before_rag) + + utility = ( + 0.4 * precision + # Did we use what we retrieved? + 0.3 * outcome_bonus + # Did task succeed? + 0.3 * confidence_boost # Did RAG increase confidence? + ) + return min(1.0, utility) +``` + +**Feed into LoRA training as RLVR signal:** +```python +# Training examples weighted by utility +for trail in decision_trails: + utility_score = compute_rag_utility(trail) + + if utility_score > 0.7: + # High utility β†’ strong training signal + training_examples.append({ + "query": trail.task_description, + "rag_context": trail.rag_terms_used, + "response": trail.solution, + "weight": utility_score # RLVR reward weight + }) +``` + +**This trains LoRAs to:** +- **Mnemosyne (Memory)**: Recall accuracy vs phoebe ground truth +- **Aletheia (Truth)**: Confidence calibration (was confidence boost justified?) +- **Moira (Pattern)**: Which task patterns benefit from RAG vs pure reasoning + +### The Complete Knowledge Flow + +``` +VAULT + β”‚ + β”œβ”€ Extract candidates + β”‚ + β–Ό +STAGING (quarantine) + β”‚ + β”œβ”€ Policy Tier 1: Syntax ──▢ REJECT ──▢ Log failure + β”œβ”€ Policy Tier 2: Semantic ──▢ REJECT ──▢ Revise + β”œβ”€ Policy Tier 3: Topology ──▢ REJECT ──▢ Flag risk + └─ Policy Tier 4+: Utility ──▢ PASS + β”‚ + β–Ό + PROMOTE to RAG + β”‚ + β”œβ”€ Status: HIDDEN (available but unused) + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”‚ Young Nyx retrieves term + β”‚ + β–Ό + Status: DISCOVERED (mark first access) + β”‚ + β”œβ”€ Track usage in decision_trails + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ β”‚ +Used successfully Used unsuccessfully + β”‚ β”‚ + β–Ό β–Ό +Increase confidence Decrease confidence + β”‚ + β”‚ (10+ successful uses) + β”‚ + β–Ό +FLAG for training extraction + β”‚ + β–Ό +LoRA training (weighted by utility_score) + β”‚ + β–Ό +Validation WITHOUT RAG + β”‚ + β”œβ”€ SUCCESS ──▢ Status: INTERNALIZED (clear from RAG) + β”‚ + └─ FAIL ──▢ Restore to RAG, retry cycle +``` + +### Quality Gates Prevent + +1. **Garbage in RAG** - staging area catches malformed entries +2. **Topology corruption** - DriftProbe-lite policies block dangerous terms +3. **Useless bloat** - utility policies remove low-value entries +4. **Premature training** - only high-utility terms get flagged +5. **Hidden knowledge waste** - track what's available but never used (curriculum gap) + +### Policy Evolution Triggers + +As Young Nyx grows, unlock stricter policies: + +| Trigger | New Policy Unlocked | +|---------|---------------------| +| 100 successful RAG retrievals | Semantic quality checks | +| First LoRA training run | Topology safety (DriftProbe-lite) | +| 1000 decision_trails logged | Utility validation (help rate > 60%) | +| First INTERNALIZED term | Cross-reference consistency | +| 10 INTERNALIZED terms | Cost-effectiveness (ROI > threshold) | + +**Progressive difficulty**: The bar for entering RAG rises as Young Nyx becomes more capable. Early: anything valid. Later: must prove utility. + +--- + ## Lifeforce Connection The RAGβ†’Trainβ†’Validate cycle has economic cost: