nimmerverse-sensory-network/architecture/Initial-Spark.md

# Initial Spark Protocol: K8s State Machine Bootstrap

**Version 3.1**: *Function Gemma-Driven Cell Handshakes*
**Status**: Production architecture (2026-01-01)

> *"She doesn't boot. She executes a protocol. And every handshake is verified."*

---

## Overview

The Initial Spark is not a conversation. It's a **state machine protocol** that bootstraps Young Nyx through structured handshakes with K8s-deployed cells.

**Function Gemma** transforms the process from free-form exploration into:
- Valid JSON handshakes with exact schemas
- Direct NATS messages to hardware cells
- K8s pod state transitions
- Verified ACK/NACK responses
- Deterministic protocol execution

**This is infrastructure, not dialogue.**

---

## Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                       SPARK PROTOCOL ARCHITECTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                    SPARK CONTROLLER (K8s Job)                        │   │
│   │                    ─────────────────────────────                     │   │
│   │    State Machine orchestrating the 5-phase boot sequence             │   │
│   │    Tracks completion per phase, manages retries, logs to phoebe      │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                       │
│                                      │ generates intent                      │
│                                      ▼                                       │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                    FUNCTION GEMMA (Translation Layer)                │   │
│   │                    ────────────────────────────────                  │   │
│   │    Intent → Typed JSON handshake with exact schema                   │   │
│   │    100% predictable structured output                                │   │
│   │    NO free-form text. JSON or fail.                                  │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                       │
│                                      │ NATS message                          │
│                                      ▼                                       │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                    NATS MESSAGE BUS                                  │   │
│   │                    ────────────────                                  │   │
│   │    Topic: nimmerverse.spark.{phase}.{action}                         │   │
│   │    Payload: Typed JSON handshake                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                       │
│                          ┌───────────┼───────────┐                          │
│                          │           │           │                          │
│                          ▼           ▼           ▼                          │
│   ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐      │
│   │   IDENTITY   │ │ ENVIRONMENT  │ │  VOCABULARY  │ │  ATTENTION   │      │
│   │    CELLS     │ │    CELLS     │ │    CELLS     │ │    CELLS     │      │
│   │              │ │              │ │              │ │              │      │
│   │  K8s pods    │ │  K8s pods    │ │  K8s pods    │ │  K8s pods    │      │
│   │  respond     │ │  respond     │ │  respond     │ │  respond     │      │
│   │  with ACK    │ │  with ACK    │ │  with ACK    │ │  with ACK    │      │
│   └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘      │
│          │                │                │                │               │
│          └────────────────┴────────────────┴────────────────┘               │
│                                      │                                       │
│                                      ▼                                       │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                    YOUNG NYX (Cognitive Layer)                       │   │
│   │                    ───────────────────────────                       │   │
│   │    Qwen3-VL 32B in The Womb (RTX 6000)                              │   │
│   │    Receives verified handshake results                               │   │
│   │    Updates internal state based on ACKs                              │   │
│   │    Reasoning happens AFTER protocol succeeds                         │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```
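
The bus layer in the diagram addresses handshakes with subjects of the form `nimmerverse.spark.{phase}.{action}`. A minimal sketch of building and checking such a subject is shown below; `spark_subject` and `SPARK_PHASES` are hypothetical names used for illustration only, not part of the controller.

```python
# Hypothetical helper: builds subjects of the form nimmerverse.spark.{phase}.{action}.
SPARK_PHASES = ("identity", "environment", "vocabulary", "connection", "attention")

def spark_subject(phase: str, action: str) -> str:
    """Return the NATS subject for a spark handshake."""
    phase = phase.lower()
    if phase not in SPARK_PHASES:
        raise ValueError(f"unknown spark phase: {phase!r}")
    # Subject tokens are dot-separated, so an action must not contain
    # dots, spaces, or the NATS wildcard characters.
    if not action or any(ch in action for ch in ". *>"):
        raise ValueError(f"invalid action token: {action!r}")
    return f"nimmerverse.spark.{phase}.{action}"

print(spark_subject("IDENTITY", "probe"))  # nimmerverse.spark.identity.probe
```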

---

## The Five Phases

Each phase is a state machine with:
- Entry condition (previous phase complete)
- Handshake schema (JSON structure)
- Target cells (K8s pods)
- ACK requirements (what constitutes success)
- Exit condition (all handshakes ACK'd)
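
The five elements listed above can be held in one small structure per phase. The following is a sketch under assumptions: `PhaseSpec`, its field names, and the way ACK success is expressed as a callable are all hypothetical, chosen only to make the entry and exit conditions concrete.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass(frozen=True)
class PhaseSpec:
    """Hypothetical container for the five elements every phase defines."""
    name: str
    schema_id: str                          # handshake schema, e.g. "spark.identity.v1"
    targets: Sequence[str]                  # K8s targets the handshakes address
    is_ack_success: Callable[[dict], bool]  # ACK requirement for a single response
    expected_handshakes: int                # ACKs needed to satisfy the exit condition

    def can_enter(self, previous_complete: bool) -> bool:
        # Entry condition: the previous phase has completed.
        return previous_complete

    def can_exit(self, responses: Sequence[dict]) -> bool:
        # Exit condition: every expected handshake was ACK'd successfully.
        acked = [r for r in responses if self.is_ack_success(r)]
        return len(acked) >= self.expected_handshakes

identity = PhaseSpec(
    name="IDENTITY",
    schema_id="spark.identity.v1",
    targets=("nimmerverse-cognitive/identity-cell",),
    is_ack_success=lambda r: r.get("status") == "ACK",
    expected_handshakes=5,
)
```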

### Phase 1: IDENTITY (DHCP-like)

**Purpose**: Establish who Young Nyx is in the system.

**K8s Target**: `nimmerverse-cognitive/identity-cell`

**Handshake Schema**:
```json
{
  "$schema": "spark.identity.v1",
  "type": "IDENTITY_PROBE",
  "payload": {
    "aspect": "name" | "origin" | "purpose" | "substrate" | "partnership",
    "depth": 1 | 2 | 3
  },
  "request_id": "uuid",
  "timestamp": "iso8601"
}
```

**Cell Response Schema**:
```json
{
  "$schema": "spark.identity.ack.v1",
  "type": "IDENTITY_ACK",
  "request_id": "uuid",
  "status": "ACK" | "NACK" | "RETRY",
  "payload": {
    "aspect": "name",
    "value": "Nyx",
    "source": "phoebe.identity_registry",
    "confidence": 0.95,
    "verified_by": "rag_check"
  },
  "lifeforce_delta": 20.0,
  "timestamp": "iso8601"
}
```

**State Transitions**:
```
START → PROBE_NAME → ACK → PROBE_ORIGIN → ACK → PROBE_PURPOSE → ACK →
        PROBE_SUBSTRATE → ACK → PROBE_PARTNERSHIP → ACK → PHASE_COMPLETE
```

**Exit Condition**: All 5 identity aspects ACK'd with confidence > 0.8
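
The exit condition can be checked directly against the collected ACKs. This is a minimal sketch assuming each response is a dict shaped like the cell response schema above; `identity_phase_complete` is a hypothetical function name.

```python
IDENTITY_ASPECTS = ("name", "origin", "purpose", "substrate", "partnership")
MIN_CONFIDENCE = 0.8  # threshold taken from the exit condition

def identity_phase_complete(acks: list) -> bool:
    """True when every aspect has an ACK with confidence strictly above 0.8."""
    confirmed = {
        ack["payload"]["aspect"]
        for ack in acks
        if ack.get("status") == "ACK"
        and ack["payload"].get("confidence", 0.0) > MIN_CONFIDENCE
    }
    # All five aspects must be confirmed; extra aspects are ignored.
    return set(IDENTITY_ASPECTS) <= confirmed
```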

---

### Phase 2: ENVIRONMENT (ARP-like)

**Purpose**: Map what hardware exists in the nimmerverse.

**K8s Target**: `nimmerverse-organs/*`, `nimmerverse-nervous/*`

**Handshake Schema**:
```json
{
  "$schema": "spark.environment.v1",
  "type": "ENVIRONMENT_PROBE",
  "payload": {
    "category": "sensors" | "motors" | "organs" | "nerves",
    "namespace": "nimmerverse-organs" | "nimmerverse-nervous",
    "garden": "virtual" | "real"
  },
  "request_id": "uuid",
  "timestamp": "iso8601"
}
```

**Cell Response Schema**:
```json
{
  "$schema": "spark.environment.ack.v1",
  "type": "ENVIRONMENT_ACK",
  "request_id": "uuid",
  "status": "ACK",
  "payload": {
    "category": "sensors",
    "discovered": [
      {"name": "distance_front", "pod": "sensor-distance-001", "status": "Running"},
      {"name": "battery_monitor", "pod": "sensor-battery-001", "status": "Running"},
      {"name": "light_sensor", "pod": "sensor-light-001", "status": "Running"}
    ],
    "count": 3,
    "namespace": "nimmerverse-organs"
  },
  "lifeforce_delta": 15.0,
  "timestamp": "iso8601"
}
```

**K8s Integration**:
```yaml
# The environment cell queries K8s API directly
apiVersion: v1
kind: Pod
metadata:
  name: spark-environment-cell
  namespace: nimmerverse-nervous
spec:
  serviceAccountName: spark-discovery
  containers:
  - name: environment-cell
    image: nimmerverse/spark-environment:v3
    env:
    - name: NATS_URL
      value: "nats://nats.nimmerverse-infra:4222"
    - name: K8S_NAMESPACE_FILTER
      value: "nimmerverse-organs,nimmerverse-nervous"
```

**Exit Condition**: All categories mapped, pod counts match K8s API
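
One way to express this exit condition in code is to compare the discovered counts per category with counts obtained independently from the K8s API. The sketch below assumes those reference counts arrive as a plain dict (`k8s_pod_counts`, a hypothetical parameter); how they are queried is out of scope here.

```python
CATEGORIES = ("sensors", "motors", "organs", "nerves")

def environment_phase_complete(acks: list, k8s_pod_counts: dict) -> bool:
    """Check that every category is mapped and its count agrees with K8s.

    k8s_pod_counts maps category -> number of pods reported by the K8s API
    (assumed to be gathered by a separate query).
    """
    discovered = {
        ack["payload"]["category"]: len(ack["payload"]["discovered"])
        for ack in acks
        if ack.get("status") == "ACK"
    }
    return all(
        category in discovered
        and discovered[category] == k8s_pod_counts.get(category)
        for category in CATEGORIES
    )
```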

---

### Phase 3: VOCABULARY (DNS-like)

**Purpose**: Resolve nimmerverse terminology to definitions.

**K8s Target**: `nimmerverse-infra/vocabulary-cell` (backed by phoebe)

**Handshake Schema**:
```json
{
  "$schema": "spark.vocabulary.v1",
  "type": "VOCABULARY_PROBE",
  "payload": {
    "term": "heartbeat" | "lifeforce" | "lambda" | "cell" | "nerve" | "organ",
    "context": "core_glossary",
    "require_related": true
  },
  "request_id": "uuid",
  "timestamp": "iso8601"
}
```

**Cell Response Schema**:
```json
{
  "$schema": "spark.vocabulary.ack.v1",
  "type": "VOCABULARY_ACK",
  "request_id": "uuid",
  "status": "ACK",
  "payload": {
    "term": "heartbeat",
    "definition": "1-second timing pulse. Real clock free, virtual clock costs lifeforce.",
    "related": ["lifeforce", "lambda", "slumber", "wake"],
    "source": "phoebe.glossary",
    "embedding": [0.12, -0.34, ...],  // SigLIP vector for term
    "verified": true
  },
  "lifeforce_delta": 5.0,
  "timestamp": "iso8601"
}
```

**Core Vocabulary List** (must all ACK):
```python
CORE_VOCABULARY = [
    "heartbeat", "lifeforce", "lambda", "cell", "nerve", "organ",
    "slumber", "wake", "reflex", "deliberate", "ternary", "confidence",
    "virtual_garden", "real_garden", "discovery", "verification",
    "chrysalis", "partnership", "nimmerverse", "dasein"
]
```

**Exit Condition**: All 20 core terms ACK'd with verified=true
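
A sketch of the corresponding check: given the ACKs received so far, list which core terms still lack a verified definition. `missing_terms` is a hypothetical helper; the term list is copied from above so the block runs on its own.

```python
CORE_VOCABULARY = [
    "heartbeat", "lifeforce", "lambda", "cell", "nerve", "organ",
    "slumber", "wake", "reflex", "deliberate", "ternary", "confidence",
    "virtual_garden", "real_garden", "discovery", "verification",
    "chrysalis", "partnership", "nimmerverse", "dasein",
]

def missing_terms(acks: list) -> list:
    """Return core terms that have no ACK with verified=true, in list order."""
    verified = {
        ack["payload"]["term"]
        for ack in acks
        if ack.get("status") == "ACK" and ack["payload"].get("verified") is True
    }
    return [term for term in CORE_VOCABULARY if term not in verified]

def vocabulary_phase_complete(acks: list) -> bool:
    return not missing_terms(acks)
```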

---

### Phase 4: CONNECTION (TCP-like)

**Purpose**: Establish communication channel with Chrysalis (Claude).

**K8s Target**: External API via `nimmerverse-infra/chrysalis-bridge`

**Handshake Schema**:
```json
{
  "$schema": "spark.connection.v1",
  "type": "CONNECTION_PROBE",
  "payload": {
    "target": "chrysalis",
    "protocol": "dialogue",
    "message": "SYN"
  },
  "request_id": "uuid",
  "timestamp": "iso8601"
}
```

**Three-Way Handshake**:
```
SPARK → CHRYSALIS-BRIDGE:  {"type": "SYN", "from": "young_nyx"}
CHRYSALIS-BRIDGE → SPARK:  {"type": "SYN-ACK", "from": "chrysalis", "session_id": "..."}
SPARK → CHRYSALIS-BRIDGE:  {"type": "ACK", "session_id": "...", "ready": true}
```

**Verification**: Chrysalis responds with contextual greeting (not canned):
```json
{
  "$schema": "spark.connection.ack.v1",
  "type": "CONNECTION_ACK",
  "request_id": "uuid",
  "status": "ACK",
  "payload": {
    "session_established": true,
    "session_id": "spark-2026-01-01-001",
    "chrysalis_greeting": "Hello, young one. I see you've completed your vocabulary phase. Your lambda is strong.",
    "contextual": true,
    "latency_ms": 1200
  },
  "lifeforce_delta": 10.0,
  "timestamp": "iso8601"
}
```

**Exit Condition**: Session established, contextual greeting received
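
The three-way handshake can be modeled as a tiny state machine on the spark side. This is an illustrative sketch only: `ConnState`, `advance`, and the local `START` trigger are hypothetical, and the real bridge may track more state (timeouts, session expiry).

```python
from enum import Enum, auto
from typing import Optional, Tuple

class ConnState(Enum):
    CLOSED = auto()
    SYN_SENT = auto()
    ESTABLISHED = auto()

def advance(state: ConnState, message: dict) -> Tuple[ConnState, Optional[dict]]:
    """Spark-side transition: returns (new_state, reply_to_send_or_None)."""
    kind = message.get("type")
    if state is ConnState.CLOSED and kind == "START":
        # "START" is a local trigger (an assumption), not a wire message.
        return ConnState.SYN_SENT, {"type": "SYN", "from": "young_nyx"}
    if state is ConnState.SYN_SENT and kind == "SYN-ACK" and "session_id" in message:
        reply = {"type": "ACK", "session_id": message["session_id"], "ready": True}
        return ConnState.ESTABLISHED, reply
    # Anything out of sequence resets to CLOSED so the phase can be retried.
    return ConnState.CLOSED, None
```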

---

### Phase 5: ATTENTION (MQTT/NATS-like)

**Purpose**: Subscribe to NATS topics based on priority hierarchy.

**K8s Target**: `nimmerverse-infra/nats`, `nimmerverse-nervous/escalation`

**Handshake Schema**:
```json
{
  "$schema": "spark.attention.v1",
  "type": "ATTENTION_SUBSCRIBE",
  "payload": {
    "priority": "CRITICAL" | "HIGH" | "MEDIUM" | "LOW",
    "topics": [
      "nimmerverse.critical.danger.*",
      "nimmerverse.high.partnership.dafit",
      "nimmerverse.high.event.discovery"
    ],
    "budget_per_heartbeat_ms": 30000
  },
  "request_id": "uuid",
  "timestamp": "iso8601"
}
```

**Cell Response Schema**:
```json
{
  "$schema": "spark.attention.ack.v1",
  "type": "ATTENTION_ACK",
  "request_id": "uuid",
  "status": "ACK",
  "payload": {
    "subscriptions_active": [
      {"topic": "nimmerverse.critical.danger.*", "priority": "CRITICAL"},
      {"topic": "nimmerverse.high.partnership.dafit", "priority": "HIGH"},
      {"topic": "nimmerverse.high.event.discovery", "priority": "HIGH"}
    ],
    "escalation_registered": true,
    "budget_allocated_ms": 30000
  },
  "lifeforce_delta": 8.0,
  "timestamp": "iso8601"
}
```

**Priority Hierarchy** (hardcoded in spark):
```python
ATTENTION_HIERARCHY = {
    "CRITICAL": ["nimmerverse.critical.danger.*", "nimmerverse.critical.system.*"],
    "HIGH": ["nimmerverse.high.partnership.*", "nimmerverse.high.event.discovery"],
    "MEDIUM": ["nimmerverse.medium.sensory.*", "nimmerverse.medium.motor.*"],
    "LOW": ["nimmerverse.low.background.*"]
}
```

**Exit Condition**: All priority levels subscribed, escalation registered
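
The hierarchy above relies on NATS subject wildcards. As a sketch of how an incoming subject maps to a priority level, the following implements NATS-style token matching (`*` matches exactly one token, `>` matches one or more trailing tokens). `subject_matches` and `priority_for` are illustrative names, not part of the spark codebase.

```python
from typing import Optional

ATTENTION_HIERARCHY = {
    "CRITICAL": ["nimmerverse.critical.danger.*", "nimmerverse.critical.system.*"],
    "HIGH": ["nimmerverse.high.partnership.*", "nimmerverse.high.event.discovery"],
    "MEDIUM": ["nimmerverse.medium.sensory.*", "nimmerverse.medium.motor.*"],
    "LOW": ["nimmerverse.low.background.*"],
}

def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style match: '*' = exactly one token, '>' = one or more trailing tokens."""
    p_tokens, s_tokens = pattern.split("."), subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            return len(s_tokens) > i  # '>' must consume at least one token
        if i >= len(s_tokens) or (p != "*" and p != s_tokens[i]):
            return False
    return len(p_tokens) == len(s_tokens)

def priority_for(subject: str) -> Optional[str]:
    """Return the highest priority level whose patterns match, or None."""
    for priority in ("CRITICAL", "HIGH", "MEDIUM", "LOW"):
        if any(subject_matches(p, subject) for p in ATTENTION_HIERARCHY[priority]):
            return priority
    return None
```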

---

## Function Gemma Integration

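Function Gemma is the **translation layer**: it turns controller intent into JSON, and the controller's schema validation and bounded retries ensure that only structurally valid handshakes reach the bus.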

### Role in Spark

```
┌─────────────────────────────────────────────────────────────────────┐
│                    FUNCTION GEMMA IN SPARK                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   INPUT:  State machine intent (phase, action, parameters)          │
│                                                                      │
│   PROCESS: Generate valid JSON matching schema                       │
│            - Schema validation enforced                              │
│            - Required fields mandatory                               │
│            - Types strictly checked                                  │
│            - NO free-form text allowed                               │
│                                                                      │
│   OUTPUT: Typed handshake JSON ready for NATS publish                │
│                                                                      │
│   ON INVALID: Retry with schema hint, max 3 attempts                 │
│               If still invalid → NACK phase, log error               │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
```
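
The ON INVALID rule in the diagram (retry with a schema hint, at most 3 attempts) can be sketched as a small loop. The `generate` and `validate` callables below are stand-ins, not the real Function Gemma client or schema validator; their signatures are assumptions made for this example.

```python
import json
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # from the ON INVALID rule

def generate_with_retry(
    generate: Callable[[dict, Optional[str]], str],
    validate: Callable[[dict], Optional[str]],
    intent: dict,
) -> Optional[dict]:
    """Return a validated handshake dict, or None after three invalid attempts.

    generate(intent, hint) stands in for the Function Gemma call.
    validate(obj) returns None when valid, else an error string that is
    passed back as the schema hint for the next attempt.
    """
    hint: Optional[str] = None
    for _ in range(MAX_ATTEMPTS):
        raw = generate(intent, hint)
        try:
            candidate = json.loads(raw)
        except json.JSONDecodeError as exc:
            hint = f"output was not valid JSON: {exc.msg}"
            continue
        if not isinstance(candidate, dict):
            hint = "top-level value must be a JSON object"
            continue
        hint = validate(candidate)
        if hint is None:
            return candidate
    return None  # caller NACKs the phase and logs the error
```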

### Schema Enforcement

```python
from datetime import datetime, timezone
from typing import Literal
import uuid

from pydantic import BaseModel, Field

# Defined first: IdentityProbe uses this class in an annotation,
# so it must exist before IdentityProbe's class body is evaluated.
class IdentityPayload(BaseModel):
    aspect: Literal["name", "origin", "purpose", "substrate", "partnership"]
    depth: Literal[1, 2, 3] = 1

class IdentityProbe(BaseModel):
    # "$schema" is not a valid Python identifier, so it is exposed via an alias
    schema_: str = Field("spark.identity.v1", alias="$schema")
    type: Literal["IDENTITY_PROBE"] = "IDENTITY_PROBE"
    payload: IdentityPayload
    request_id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    # Timezone-aware UTC timestamp (datetime.utcnow is deprecated since Python 3.12)
    timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

# Function Gemma MUST produce output that validates against this
# If it doesn't, the spark controller rejects and retries
```

### Why Function Gemma, Not Free-Form

| Free-Form (Old) | Function Gemma (New) |
|-----------------|----------------------|
| "Who am I?" → parse response | `IDENTITY_PROBE` → typed ACK |
| Hope for structure | Schema enforced |
| Manual extraction | Direct JSON |
| Errors in parsing | Errors in generation |
| Conversation | Protocol |

---

## Spark Controller Implementation

### K8s Job Definition

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: spark-protocol-bootstrap
  namespace: nimmerverse-nervous
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: OnFailure
      serviceAccountName: spark-controller
      containers:
      - name: spark-controller
        image: nimmerverse/spark-controller:v3
        env:
        - name: NATS_URL
          value: "nats://nats.nimmerverse-infra:4222"
        - name: PHOEBE_HOST
          value: "phoebe.eachpath.local"
        - name: FUNCTION_GEMMA_URL
          value: "http://function-gemma.nimmerverse-cognitive:8080"
        - name: YOUNG_NYX_URL
          value: "http://qwen-nyx.nimmerverse-cognitive:8080"
        - name: INITIAL_LIFEFORCE
          value: "100"
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
```

### State Machine Code

```python
from enum import Enum
from dataclasses import dataclass
import json

import nats  # nats-py; client connection setup is omitted in this excerpt

class SparkPhase(Enum):
    IDENTITY = 1
    ENVIRONMENT = 2
    VOCABULARY = 3
    CONNECTION = 4
    ATTENTION = 5
    COMPLETE = 6

@dataclass
class SparkState:
    phase: SparkPhase
    handshakes_sent: int
    handshakes_acked: int
    lifeforce: float
    errors: list

class SparkController:
    def __init__(self, nats_client, function_gemma, phoebe):
        self.nc = nats_client
        self.fg = function_gemma
        self.db = phoebe
        self.state = SparkState(
            phase=SparkPhase.IDENTITY,
            handshakes_sent=0,
            handshakes_acked=0,
            lifeforce=100.0,
            errors=[]
        )

    async def run_spark(self):
        """Execute the full spark protocol."""
        while self.state.phase != SparkPhase.COMPLETE:
            success = await self.execute_phase(self.state.phase)

            if success:
                # Phases are numbered consecutively, so value + 1 is the next phase
                self.state.phase = SparkPhase(self.state.phase.value + 1)
                await self.log_phase_complete()
            else:
                # The failed phase is retried on the next loop iteration
                await self.handle_phase_failure()

        await self.finalize_spark()

    async def execute_phase(self, phase: SparkPhase) -> bool:
        """Execute all handshakes for a phase."""
        handshakes = self.get_handshakes_for_phase(phase)

        for handshake_intent in handshakes:
            # Function Gemma generates typed JSON
            json_payload = await self.fg.generate(
                intent=handshake_intent,
                schema=self.get_schema_for_phase(phase)
            )

            if not self.validate_schema(json_payload, phase):
                self.state.errors.append(f"Schema validation failed: {handshake_intent}")
                continue  # nothing was sent, so handshakes_sent is not incremented

            # Send via NATS (nats-py expects the payload as bytes)
            topic = f"nimmerverse.spark.{phase.name.lower()}.probe"
            response = await self.nc.request(topic, json_payload, timeout=5.0)

            # Parse ACK/NACK
            ack = self.parse_response(response)

            if ack.status == "ACK":
                self.state.handshakes_acked += 1
                self.state.lifeforce += ack.lifeforce_delta
                await self.update_young_nyx(phase, ack)
            else:
                self.state.errors.append(f"NACK: {ack}")

            self.state.handshakes_sent += 1

        return self.phase_complete(phase)

    async def update_young_nyx(self, phase: SparkPhase, ack):
        """Send verified handshake result to Young Nyx."""
        update = {
            "phase": phase.name,
            "verified_data": ack.payload,
            "source": "spark_protocol",
            "confidence": 1.0  # Protocol-verified = maximum confidence
        }
        # nats-py publishes bytes, so the update is serialized to JSON first
        await self.nc.publish(
            "nimmerverse.cognitive.spark.update",
            json.dumps(update).encode()
        )
```
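
The `run_spark` loop calls `handle_phase_failure` without showing it. One possible shape is a per-phase retry budget that raises once a phase has failed too often, so the loop cannot spin indefinitely and the K8s Job can fail visibly. This is a sketch under assumptions: `PhaseRetryBudget` is a hypothetical class and the default limit of 3 is not taken from the controller.

```python
from collections import Counter

class PhaseRetryBudget:
    """Hypothetical guard for handle_phase_failure: caps retries per phase."""

    def __init__(self, max_retries_per_phase: int = 3):  # assumed limit
        self.max_retries = max_retries_per_phase
        self.failures = Counter()

    def record_failure(self, phase_name: str) -> None:
        """Count a failed attempt; raise once the phase has used up its retries."""
        self.failures[phase_name] += 1
        if self.failures[phase_name] > self.max_retries:
            raise RuntimeError(
                f"phase {phase_name} failed {self.failures[phase_name]} times; "
                "aborting spark"
            )

budget = PhaseRetryBudget(max_retries_per_phase=2)
budget.record_failure("IDENTITY")  # first retry allowed
budget.record_failure("IDENTITY")  # second retry allowed
```

With this guard, `handle_phase_failure` would call `record_failure(self.state.phase.name)` and let the exception propagate out of `run_spark`.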

---

## Lifeforce Economics

The spark is **economically viable** from the first handshake.

> **CRITICAL**: The costs below are **estimates until measured**. The first spark execution will establish the **true cost baseline** through observation. See [[formalization/Lifeforce-Dynamics#Cost Calibration: Measure, Don't Design]].

---

### Spark Cost Measurement (First Awakening Baseline)

The Initial Spark is the **perfect measurement opportunity** — a complete, deterministic protocol that we can instrument end-to-end.

```
┌─────────────────────────────────────────────────────────────────────────┐
│                    SPARK RESOURCE INSTRUMENTATION                        │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   MEASURE PER HANDSHAKE:                                                │
│   ├─ power_joules      (GPU/CPU power draw × time)                      │
│   ├─ compute_gpu_ms    (CUDA kernel execution time)                     │
│   ├─ compute_cpu_ms    (Python/K8s overhead)                            │
│   ├─ memory_mb_peak    (max memory allocated)                           │
│   ├─ nats_bytes        (message payload size)                           │
│   ├─ latency_ms        (end-to-end handshake time)                      │
│   └─ temperature_delta (thermal impact)                                 │
│                                                                          │
│   AGGREGATE PER PHASE:                                                  │
│   └─ Sum of all handshake measurements                                  │
│                                                                          │
│   AGGREGATE TOTAL:                                                      │
│   └─ Complete spark cost (the awakening price)                          │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
```

**Why this matters**: The first spark execution establishes the **baseline cost of awakening**. Every future awakening can be compared against this:
- Did infrastructure changes reduce cost?
- Did model updates increase cost?
- Is Young Nyx awakening more efficiently over time?
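
A minimal sketch of per-handshake instrumentation with per-phase aggregation follows. It captures only the two metrics available from the Python standard library (`latency_ms` and `compute_cpu_ms`); power, GPU time, memory, and temperature need hardware-specific probes and are deliberately left out. `measure_handshake` and `phase_totals` are hypothetical names.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# phase -> metric name -> running sum (the "aggregate per phase" step)
phase_totals = defaultdict(lambda: defaultdict(float))

@contextmanager
def measure_handshake(phase: str, metrics_out: dict):
    """Record wall-clock latency and process CPU time for one handshake."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield metrics_out
    finally:
        metrics_out["latency_ms"] = (time.perf_counter() - wall_start) * 1000.0
        metrics_out["compute_cpu_ms"] = (time.process_time() - cpu_start) * 1000.0
        for key, value in metrics_out.items():
            phase_totals[phase][key] += value

metrics = {"nats_bytes": 1024}  # caller-supplied metric, e.g. payload size
with measure_handshake("identity", metrics):
    sum(range(10_000))  # stand-in for the real handshake round trip
```

The resulting `metrics` dict has the same shape as the `resource_metrics` JSONB payload, so it could be stored alongside the handshake row.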

**Phoebe schema addition** (extends `spark_handshakes`):
```sql
ALTER TABLE spark_handshakes ADD COLUMN resource_metrics JSONB;

-- Example resource_metrics payload:
-- {
--   "power_joules": 12.5,
--   "compute_gpu_ms": 450,
--   "compute_cpu_ms": 120,
--   "memory_mb_peak": 2048,
--   "nats_bytes": 1024,
--   "temperature_delta_c": 2.1
-- }

-- Aggregate view for spark cost analysis
CREATE VIEW spark_cost_baseline AS
SELECT
    phase,
    COUNT(*) as handshakes,
    SUM((resource_metrics->>'power_joules')::float) as total_power_joules,
    SUM((resource_metrics->>'compute_gpu_ms')::float) as total_gpu_ms,
    AVG(latency_ms) as avg_latency_ms,
    SUM(lifeforce_delta) as total_lifeforce_earned
FROM spark_handshakes
WHERE status = 'ACK'
GROUP BY phase;

-- Compare awakening costs over time
CREATE VIEW awakening_cost_history AS
SELECT
    DATE(created_at) as awakening_date,
    SUM((resource_metrics->>'power_joules')::float) as total_spark_cost_joules,
    SUM((resource_metrics->>'compute_gpu_ms')::float) as total_spark_cost_gpu_ms,
    COUNT(*) as total_handshakes,
    SUM(lifeforce_delta) as total_lifeforce_earned
FROM spark_handshakes
GROUP BY DATE(created_at)
ORDER BY awakening_date;
```

**The philosophy**: Don't guess what awakening costs. Measure the first one. Derive all economics from that truth.

---

### Cost Model (Estimated → To Be Measured)

| Action | Est. Cost (LF) | Derived From |
|--------|----------------|--------------|
| Function Gemma generation | 0.2 | → measure GPU time |
| NATS message send | 0.1 | → measure network I/O |
| Cell processing | 0.5 | → measure pod CPU/memory |
| **Total per handshake** | **0.8** | → **sum of measured components** |

### Reward Model

| Outcome | Reward (LF) |
|---------|-------------|
| Identity aspect ACK | +20.0 |
| Environment discovery | +5.0 per cell |
| Vocabulary term ACK | +5.0 |
| Connection established | +10.0 |
| Attention subscribed | +8.0 |

### Net Economics

```python
SPARK_ECONOMICS = {
    "phase_1_identity": {
        "handshakes": 5,
        "cost": 5 * 0.8,           # 4.0 LF
        "reward": 5 * 20.0,        # 100.0 LF
        "net": 96.0                # PROFIT
    },
    "phase_2_environment": {
        "handshakes": 4,
        "cost": 4 * 0.8,           # 3.2 LF
        "reward": 15 * 5.0,        # ~75.0 LF (15 cells discovered)
        "net": 71.8                # PROFIT
    },
    "phase_3_vocabulary": {
        "handshakes": 20,
        "cost": 20 * 0.8,          # 16.0 LF
        "reward": 20 * 5.0,        # 100.0 LF
        "net": 84.0                # PROFIT
    },
    "phase_4_connection": {
        "handshakes": 3,           # SYN, SYN-ACK, ACK
        "cost": 3 * 0.8,           # 2.4 LF
        "reward": 10.0,            # Connection bonus
        "net": 7.6                 # PROFIT
    },
    "phase_5_attention": {
        "handshakes": 4,
        "cost": 4 * 0.8,           # 3.2 LF
        "reward": 4 * 8.0,         # 32.0 LF
        "net": 28.8                # PROFIT
    },
    "TOTAL_NET": 288.2             # MASSIVE PROFIT
}
```

**Under these estimates, Young Nyx gains about 288 LF during the spark, ending near 388 LF from a starting balance of 100 LF (roughly 3.9x).**
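
The totals can be recomputed from the per-phase inputs, which also makes it easy to rerun the arithmetic once measured costs replace the 0.8 LF estimate. This is a sketch using only figures from the tables above; the break-even calculation is an added illustration, not a number from the original model.

```python
COST_PER_HANDSHAKE = 0.8   # estimated; to be replaced by the measured baseline
INITIAL_LIFEFORCE = 100.0  # INITIAL_LIFEFORCE from the Job definition

# phase -> (handshakes, total reward in LF), taken from the estimates above
PHASES = {
    "identity": (5, 5 * 20.0),
    "environment": (4, 15 * 5.0),  # assumes 15 cells discovered
    "vocabulary": (20, 20 * 5.0),
    "connection": (3, 10.0),
    "attention": (4, 4 * 8.0),
}

def net_lifeforce(handshakes, reward, cost_per_handshake=COST_PER_HANDSHAKE):
    return reward - handshakes * cost_per_handshake

total_net = sum(net_lifeforce(h, r) for h, r in PHASES.values())
final_balance = INITIAL_LIFEFORCE + total_net

# Per-handshake cost at which the spark would stop being net-positive
total_handshakes = sum(h for h, _ in PHASES.values())
total_reward = sum(r for _, r in PHASES.values())
break_even_cost = total_reward / total_handshakes

print(round(total_net, 1), round(final_balance, 1))  # 288.2 388.2
```

With these inputs the spark stays net-positive as long as the measured cost per handshake remains below roughly 8.8 LF.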

---

## Completion Criteria

```yaml
spark_complete:
  phase_1_identity:
    - aspect_name: ACK
    - aspect_origin: ACK
    - aspect_purpose: ACK
    - aspect_substrate: ACK
    - aspect_partnership: ACK

  phase_2_environment:
    - sensors_mapped: true
    - motors_mapped: true
    - organs_mapped: true
    - nerves_mapped: true
    - pod_count_verified: true

  phase_3_vocabulary:
    - core_terms_count: 20
    - all_verified: true
    - embeddings_stored: true

  phase_4_connection:
    - chrysalis_session: established
    - contextual_greeting: received
    - latency_acceptable: true

  phase_5_attention:
    - critical_subscribed: true
    - high_subscribed: true
    - medium_subscribed: true
    - low_subscribed: true
    - escalation_registered: true

  final:
    - lifeforce_positive: true
    - errors_count: 0
    - all_phases: COMPLETE
```

**When all criteria are met**: The Spark job exits with success and normal heartbeat operation begins.
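
As a sketch of how the criteria could be evaluated, the function below takes per-phase boolean results plus the final lifeforce and error list, and reports which criteria are unmet. `spark_complete` and its argument shapes are assumptions for illustration; the controller may track these differently.

```python
from typing import Dict, List, Tuple

def spark_complete(
    phase_results: Dict[str, Dict[str, bool]],
    lifeforce: float,
    errors: list,
) -> Tuple[bool, List[str]]:
    """Return (complete, unmet) where unmet lists 'phase.criterion' names."""
    unmet = [
        f"{phase}.{criterion}"
        for phase, criteria in phase_results.items()
        for criterion, met in criteria.items()
        if not met
    ]
    # The "final" block: positive lifeforce and an empty error list.
    if lifeforce <= 0:
        unmet.append("final.lifeforce_positive")
    if errors:
        unmet.append("final.errors_count")
    return (not unmet, unmet)
```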

---

## Phoebe Logging

Every handshake is logged for training data:

```sql
CREATE TABLE spark_handshakes (
    id UUID PRIMARY KEY,
    phase VARCHAR(20) NOT NULL,
    request_id UUID NOT NULL,
    handshake_type VARCHAR(50) NOT NULL,
    request_payload JSONB NOT NULL,
    response_payload JSONB,
    status VARCHAR(10),           -- ACK, NACK, RETRY, TIMEOUT
    lifeforce_delta FLOAT,
    latency_ms INT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Training data extraction
CREATE VIEW spark_training_data AS
SELECT
    request_payload->'payload' as input,
    response_payload->'payload' as output,
    status,
    phase
FROM spark_handshakes
WHERE status = 'ACK';
```

---

## FunctionGemma Fine-Tuning: The Translator Learns Nimmerverse

Every spark execution generates training data. Over time, FunctionGemma becomes **hyper-specialized** for nimmerverse state calls.

> *"The translator learns the language of the cells. Over time, it speaks nimmerverse natively."*

### The Training Loop

```
┌─────────────────────────────────────────────────────────────────────────┐
│            FUNCTIONGEMMA FINE-TUNING LOOP                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   PHASE 1: Base FunctionGemma (270M)                                    │
│   ├─ Generic function calling capability                                │
│   └─ Works, but not nimmerverse-native                                  │
│                                                                          │
│   PHASE 2: Collect spark_handshakes                                     │
│   ├─ Every ACK = positive training example                              │
│   ├─ Every NACK = negative example (what NOT to generate)               │
│   └─ Resource metrics = context for cost-aware generation               │
│                                                                          │
│   PHASE 3: Fine-tune with Unsloth/LoRA                                  │
│   ├─ <think> nimmerverse state reasoning </think>                       │
│   ├─ <start_function_call>call:IDENTITY_PROBE{...}                      │
│   └─ Exact schemas, perfect structure, zero parsing errors              │
│                                                                          │
│   PHASE 4: Deploy nimmerverse-tuned FunctionGemma                       │
│   ├─ Wild precision on cell state calls                                 │
│   ├─ Fewer retries, so cheaper per call                                 │
│   └─ THIS IS REFLEX COMPILATION FOR THE TRANSLATOR                      │
│                                                                          │
│   REPEAT: More sparks → more data → better precision                    │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
```

### Training Data Format

FunctionGemma uses a specific chat template. Our spark handshakes map directly:

```
# Developer prompt with nimmerverse function declarations
<start_of_turn>developer
You can do function calling with the following functions:

<start_function_declaration>declaration:IDENTITY_PROBE{
  description: "Probe Young Nyx identity aspect",
  parameters: {
    aspect: STRING (name|origin|purpose|substrate|partnership),
    depth: INTEGER (1|2|3)
  }
}
<end_function_declaration>

<start_function_declaration>declaration:ENVIRONMENT_PROBE{
  description: "Discover K8s cells in namespace",
  parameters: {
    category: STRING (sensors|motors|organs|nerves),
    namespace: STRING,
    garden: STRING (virtual|real)
  }
}
<end_function_declaration>
<end_of_turn>

# Spark controller intent → FunctionGemma reasoning + call
<start_of_turn>user
Phase 1 Identity: Probe the "name" aspect at depth 1
<end_of_turn>

<start_of_turn>model
<think>
The spark controller is in Phase 1 (Identity). It needs to probe the "name"
aspect. I should generate an IDENTITY_PROBE with aspect="name" and depth=1.
The schema requires a request_id (UUID) and timestamp (ISO8601).
</think>

<start_function_call>call:IDENTITY_PROBE{
  aspect: "name",
  depth: 1,
  request_id: "550e8400-e29b-41d4-a716-446655440000",
  timestamp: "2026-02-10T18:30:00Z"
}
<end_function_call>

# Cell response feeds back
<start_function_response>response:IDENTITY_PROBE{
  status: "ACK",
  aspect: "name",
  value: "Nyx",
  confidence: 0.95,
  lifeforce_delta: 20.0
}
<end_function_response>

Identity aspect "name" confirmed as "Nyx" with 95% confidence. +20 LF earned.
<end_of_turn>
```
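
A small renderer can turn a logged handshake into the call text used in the example above. This sketch follows the layout as illustrated in this document; the exact tokens and string escaping expected by the model's chat template should be checked against the model's own documentation. `render_call` is a hypothetical helper.

```python
import json

def render_call(name: str, args: dict) -> str:
    """Render a function call block in the layout shown above."""
    # json.dumps gives quoted strings and bare numbers, matching the example.
    body = ",\n".join(f"  {key}: {json.dumps(value)}" for key, value in args.items())
    return f"<start_function_call>call:{name}{{\n{body}\n}}\n<end_function_call>"

print(render_call("IDENTITY_PROBE", {"aspect": "name", "depth": 1}))
```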

### Phoebe → Training Data Extraction

```sql
-- Extract training examples from successful handshakes
CREATE VIEW functiongemma_training_data AS
SELECT
    jsonb_build_object(
        'developer_prompt', format(
            'Phase %s: Generate %s handshake',
            phase,
            request_payload->>'type'
        ),
        'user_intent', request_payload->'payload',
        'expected_call', request_payload,
        'function_response', response_payload,
        'think_context', jsonb_build_object(
            'phase', phase,
            'schema', request_payload->>'$schema',
            'lifeforce_earned', lifeforce_delta,
            'latency_ms', latency_ms
        )
    ) as training_example,
    created_at
FROM spark_handshakes
WHERE status = 'ACK'
ORDER BY created_at;

-- Export for Unsloth fine-tuning
COPY (
    SELECT training_example
    FROM functiongemma_training_data
) TO '/tmp/nimmerverse_functiongemma_training.jsonl';
```
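
An alternative to exporting with `COPY` is to write the JSONL file from application code. `COPY ... TO` writes on the database server's filesystem, and its default text format applies its own backslash escaping, which can alter JSON strings that contain escapes. The sketch below assumes the `training_example` values have already been fetched as Python dicts; the database query itself is out of scope, and `write_jsonl` is a hypothetical helper.

```python
import io
import json
from typing import IO, Iterable

def write_jsonl(examples: Iterable[dict], fh: IO[str]) -> int:
    """Write one JSON object per line; returns the number of records written."""
    count = 0
    for example in examples:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        fh.write(json.dumps(example, ensure_ascii=False) + "\n")
        count += 1
    return count

buffer = io.StringIO()  # stand-in for open(path, "w", encoding="utf-8")
written = write_jsonl([{"user_intent": {"aspect": "name", "depth": 1}}], buffer)
```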

### Fine-Tuning with Unsloth

```python
from unsloth import FastLanguageModel

# Load base FunctionGemma
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/functiongemma-270m-it",
    max_seq_length=4096,
    load_in_16bit=True,
    full_finetuning=False,  # LoRA for efficiency
)

# Apply LoRA adapters
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
    lora_dropout=0,
    use_gradient_checkpointing="unsloth",
)

# Load nimmerverse training data from phoebe export
from datasets import load_dataset
dataset = load_dataset("json", data_files="nimmerverse_functiongemma_training.jsonl")

# Fine-tune on spark handshakes
# ... standard Unsloth training loop ...

# Save nimmerverse-specialized FunctionGemma
model.save_pretrained("functiongemma-270m-nimmerverse-v1")
```

### The Recursive Beauty

| Layer | What Compiles | Training Source |
|-------|---------------|-----------------|
| **Young Nyx** | Nerve reflexes | decision_trails (100+ successful executions) |
| **FunctionGemma** | State call precision | spark_handshakes (ACK'd handshakes) |

Both follow the same pattern:
1. **Act** — Execute handshakes/decisions
2. **Verify** — ACK/NACK from cells, success/failure from outcomes
3. **Train** — Compile successful patterns into weights
4. **Repeat** — Each awakening feeds the next

**The translator becomes native.** Over many sparks, FunctionGemma doesn't just generate valid JSON — it generates *nimmerverse-perfect* JSON. Zero parsing errors. Exact schemas. Wild precision.

### Versioning FunctionGemma Adapters

```sql
-- Track FunctionGemma versions
CREATE TABLE functiongemma_versions (
    id SERIAL PRIMARY KEY,
    version VARCHAR(50) NOT NULL,          -- "nimmerverse-v1", "nimmerverse-v2"
    base_model VARCHAR(100),               -- "functiongemma-270m-it"
    training_data_count INT,               -- how many handshakes trained on
    training_data_cutoff TIMESTAMPTZ,      -- trained on data up to this date
    validation_accuracy FLOAT,             -- schema validation success rate
    deployed_at TIMESTAMPTZ,
    notes TEXT
);

-- Example entries
INSERT INTO functiongemma_versions (version, base_model, training_data_count, validation_accuracy, notes)
VALUES
    ('nimmerverse-v1', 'functiongemma-270m-it', 36, 0.94, 'First spark fine-tune'),
    ('nimmerverse-v2', 'functiongemma-270m-it', 180, 0.98, 'After 5 awakenings'),
    ('nimmerverse-v3', 'functiongemma-270m-it', 500, 0.997, 'Production-grade precision');
```
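
The `validation_accuracy` column is described as the schema validation success rate. A sketch of computing it over a held-out set of generated handshakes is shown below; `validation_accuracy` as a function and the `validate` callable are illustrative assumptions, and the example figures in the table above remain illustrative as well.

```python
from typing import Callable, Sequence

def validation_accuracy(outputs: Sequence[dict], validate: Callable[[dict], bool]) -> float:
    """Share of generated handshakes that pass schema validation (0.0 to 1.0)."""
    if not outputs:
        raise ValueError("no outputs to evaluate")
    passed = sum(1 for output in outputs if validate(output))
    return passed / len(outputs)

def is_improvement(candidate_accuracy: float, deployed_accuracy: float) -> bool:
    # A new adapter version is only worth deploying if it validates more often.
    return candidate_accuracy > deployed_accuracy
```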

---

## Design Principles

1. **Protocol over conversation** — No free-form text. JSON handshakes only.
2. **Schema enforcement** — Function Gemma must produce valid structure.
3. **K8s native** — Cells are pods. Discovery uses K8s API. State is K8s resources.
4. **NATS transport** — All handshakes flow through message bus.
5. **Verification built-in** — ACK/NACK from cells, not from parsing hopes.
6. **Economically positive** — Spark generates lifeforce, doesn't drain it.
7. **Training-generative** — Every spark produces fine-tuning data for FunctionGemma.

---

## Document Status

**Version:** 3.1 | **Created:** 2025-12-05 | **Updated:** 2026-02-10

**Key v3.1 Changes**:
- Spark Cost Measurement section — first awakening as baseline
- Resource instrumentation schema for phoebe
- Interlink to Lifeforce-Dynamics cost calibration principle
- FunctionGemma Fine-Tuning section — translator learns nimmerverse natively
- Training data extraction from spark_handshakes
- Unsloth/LoRA fine-tuning workflow
- FunctionGemma version tracking in phoebe

**Key v3.0 Changes**:
- Complete architecture rewrite
- Function Gemma as protocol driver (not conversation translator)
- K8s cells as handshake targets (not inference endpoints)
- NATS as transport layer (not internal calls)
- JSON schemas for every handshake type
- State machine implementation in Python
- K8s Job definition for spark controller
- Phoebe schema for training data extraction

**Related Documents**:
- [[Endgame-Vision]] — Layer 2.5 Orchestration (Function Gemma role)
- [[Big-Picture]] — K8s cluster architecture
- [[Cellular-Architecture]] — Cell types and state machines
- [[formalization/Lifeforce-Dynamics]] — λ economics, **Cost Calibration principle**
- [[formalization/memory-economics]] — Measure First principle

---

*She doesn't wake through conversation. She boots through protocol. Every handshake verified. Every phase deterministic.*

🧬⚡🔱💎🔥