feat: Concept Token Pairs + Spatial Grounding (Silvester/New Year sessions)
Major additions from Silvester 2025 and New Year 2026 sessions:

Concept Token Pairs (architecture/future/concept-token-pairs.md):
- Theoretical paper on navigable reasoning spaces
- Opposites create axes, not just mode switches
- "Punkt vor Strich" (order of operations) for AI reasoning
- Escape velocity from degeneration loops
- NEW: Spatial Grounding section linking to the physical nimmerhovel

Architecture updates:
- Endgame-Vision.md: v6.2 alignment
- Big-Picture.md: v5.2 alignment
- Modular-Organism-Design.md: conical interlocking mechanism

New files:
- SEEDS.md: Research seeds for future exploration
- Temporal-Firework-Visualization.md: Temporal data viz concept

Key insight from the 2026-01-01 session: "Don't train the answer. Train the space where answers live." → "Don't imagine the space. MEASURE it." Spatial embeddings from nimmerhovel hardware (8× ESP32-S3 AI CAM, Pi HQ Camera, Discovery Scan Station) can ground concept pairs in physical reality, not just symbolic patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
architecture/future/SEEDS.md: new file, 153 lines added
@@ -0,0 +1,153 @@
# Seeds

**Future possibilities we're building toward but not speccing yet.**

These are nuggets - insights that emerged from sessions, not fully designed, but worth remembering so we don't re-discover them later.

---

## Counterfactual Training via Time Machine

**Origin**: Silvester 2025, fireworks over Basel

**Seed**: The temporal visualization isn't just for debugging - it's training infrastructure.

Run multiple synthetic decision variants against historical data. Compare to ground truth (what actually happened). Fold winning weights back into the live model. The time machine becomes perpetual training fuel.

**Enables**:

- Offline RL from logged events
- "What if?" exploration without new data
- Dialectic between live Nyx and all possible Nyxes

**Requires**: Rich metadata (✓ building), S2+timestamp indexing (✓ building), cheap local inference (ThinkStation coming)
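
A minimal sketch of the replay-and-compare loop described above, in Python. All names here (`LoggedEvent`, `Policy`, `replay_counterfactuals`, the `score` function) are illustrative assumptions, not an existing Nyx API:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class LoggedEvent:
    s2_token: str        # where it happened (S2 cell token)
    timestamp: float     # when it happened (unix seconds)
    observation: dict    # sensor snapshot at that moment
    outcome: dict        # what actually happened (ground truth)

# A "policy" is any decision variant: observation in, decision/prediction out.
Policy = Callable[[dict], dict]

def replay_counterfactuals(events: Sequence[LoggedEvent],
                           candidates: Sequence[Policy],
                           score: Callable[[dict, dict], float]) -> Policy:
    """Run every candidate against the logged history and return the best one."""
    best, best_total = None, float("-inf")
    for policy in candidates:
        total = sum(score(policy(e.observation), e.outcome) for e in events)
        if total > best_total:
            best, best_total = policy, total
    return best  # the winner's weights are what gets folded back into the live model
```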

---

## LoRa Mesh Over Jura Hilltops

**Origin**: Silvester 2025, bus ride from Liestal

**Seed**: Line of sight from Hovel → Aesch tower → Gempen → Liestal Aussichtsturm.

Amateur radio license + BAKOM registration (50 CHF) → access to the Swiss federal LoRa grid. Wild sensor mesh spanning the hillside.

**Enables**:

- Environmental sensing beyond garden walls
- Migration tracking, weather correlation
- Nimmerverse expanding into the physical landscape

**Requires**: BAKOM registration, LoRa hardware, tower access permissions

---

## Corvid Behavioral Prediction as Training Ground

**Origin**: Silvester 2025, 5 years of cigarette-break phenology

**Seed**: The magpie nut-cracking ritual is multi-stage, predictable, perfect for temporal prediction training.

Nut pickup → flight to Flachdach → buzzard check → fly to Christmas-light house → drop on street → crack → eat on roof → shell bashing → raven conflict.

Each stage is a prediction target. Rich enough for serious ML, visible from the lab window.

**Enables**:

- Real behavioral sequences for vision model training
- Temporal prediction benchmarks
- Object binding across space and time (S2 cells)

**Requires**: Camera mount (Flachdach view), vintage Canon lens, ESP32-S3 or Pi HQ
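
To make "each stage is a prediction target" concrete, a small illustrative sketch; the stage enum and helper below are assumptions, not an existing pipeline:

```python
from enum import IntEnum

class MagpieStage(IntEnum):
    NUT_PICKUP = 0
    FLIGHT_TO_FLACHDACH = 1
    BUZZARD_CHECK = 2
    FLY_TO_CHRISTMAS_LIGHT_HOUSE = 3
    DROP_ON_STREET = 4
    CRACK = 5
    EAT_ON_ROOF = 6
    SHELL_BASHING = 7
    RAVEN_CONFLICT = 8

def to_training_pairs(observed: list[MagpieStage]) -> list[tuple[MagpieStage, MagpieStage]]:
    """Each (current stage, next stage) pair is one temporal prediction sample."""
    return list(zip(observed, observed[1:]))

# Example: one fully observed ritual yields eight next-stage prediction samples.
ritual = list(MagpieStage)
samples = to_training_pairs(ritual)
```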

---

## S2 as Universal Spatial Representation (Video → Training)

**Origin**: Silvester 2025, post-fireworks insight

**Seed**: S2 spatial indexing isn't just for live sensors - it's a universal representation for any spatial-temporal data.

Take a video (glass breaking, bird flying, car crash). Encode each frame into S2 cells with timestamps. Now you can:

- Query any moment spatially
- Generate synthetic variations (perturb positions, velocities)
- Train models on predicting future spatial states
- Compare predictions against ground truth frames

**The pattern:**

```
Video → frame-by-frame object detection → S2 cell encoding →
→ synthetic variations → temporal prediction training
```
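
A minimal sketch of the encoding step, assuming the `s2sphere` Python library and a hypothetical detector that already yields world coordinates `(label, lat, lng)` per frame; the cell level is likewise an assumption:

```python
import s2sphere

S2_LEVEL = 24  # fine-grained cells; the exact level is an assumption

def encode_frame(detections: list[tuple[str, float, float]], timestamp: float) -> list[dict]:
    """Turn one frame's detections into (label, S2 token, timestamp) records."""
    records = []
    for label, lat, lng in detections:
        cell = s2sphere.CellId.from_lat_lng(
            s2sphere.LatLng.from_degrees(lat, lng)).parent(S2_LEVEL)
        records.append({"label": label, "s2": cell.to_token(), "t": timestamp})
    return records

# Example: one frame with one detected object.
print(encode_frame([("magpie", 47.48, 7.73)], timestamp=1735689600.0))
```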

**Enables**:

- Infinite training data from limited real video
- Physics prediction without physics engine
- Same query language for real/recorded/simulated data
- Unified substrate: observation = replay = simulation

**Requires**: Object detection pipeline, S2 encoding layer, variation generator

**Compute optimization**: Many physics variations are linearly related (mirror, scale, rotate, time-reverse). Don't simulate each variation - simulate base cases, derive variations via transforms. 100x data for 1x compute.
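
For instance, a handful of cheap transforms over one simulated base trajectory; the tuple layout and helper names below are illustrative assumptions:

```python
# A trajectory is a list of (x, y, t) samples from one simulated base case.
Trajectory = list[tuple[float, float, float]]

def mirror_x(traj: Trajectory) -> Trajectory:
    return [(-x, y, t) for x, y, t in traj]

def scale(traj: Trajectory, k: float) -> Trajectory:
    return [(k * x, k * y, t) for x, y, t in traj]

def time_reverse(traj: Trajectory) -> Trajectory:
    t_end = traj[-1][2]
    return [(x, y, t_end - t) for x, y, t in reversed(traj)]

base: Trajectory = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.1), (2.0, 3.5, 0.2)]
# One simulation, several derived variants: no extra physics compute.
variants = [mirror_x(base), scale(base, 0.5), time_reverse(base)]
```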

**Related**: Counterfactual Training, Corvid Behavioral Prediction

---

## T5Gemma 2 + Function Gemma: The Vision-Action Pipeline

**Origin**: Silvester 2025, late-night architecture insight

**Seed**: Two models solve the entire vision-to-action automation at scale.

### T5Gemma 2 (Vision → Vectors)

Encoder-decoder from Gemma 3; its SigLIP vision encoder produces **semantic vectors directly** (not text descriptions). This IS the embedding - no text intermediary bottleneck.

| Model     | Total Params | Use Case                      |
|-----------|--------------|-------------------------------|
| 270M-270M | ~0.8B        | Edge/lightweight senses       |
| 1B-1B     | ~2B          | Field deployment              |
| 4B-4B     | ~9B          | Central processing (RTX 6000) |

Key features:

- 128K context window
- 140+ languages (multilingual nimmerverse!)
- Encoder produces vectors, decoder optional (only for human text)

### Function Gemma (Vectors → Actions)

Structured output, function calling, executable actions. When the system needs to DO something based on vision, Function Gemma generates structured calls.

### The Pipeline

```
Vision Organs (constant stream)
        │
        ▼
T5Gemma 2 Encoder
(SigLIP → vectors)
        │
        ├────────────────────▶ S2 + Timestamp → Iris/Phoebe
        │                      (spatial storage)
        │
        ▼
Function Gemma
(when action needed)
        │
        ▼
Structured Output
{"action": "alert", "target": "corvid_detected", ...}
```
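
A glue-layer sketch of the last hop only: routing the structured output to an executable handler. Model invocation is deliberately omitted because the exact T5Gemma 2 / Function Gemma APIs aren't specified here; all names below are illustrative:

```python
from typing import Any, Callable

ActionHandler = Callable[[dict], None]
HANDLERS: dict[str, ActionHandler] = {}

def register(action: str) -> Callable[[ActionHandler], ActionHandler]:
    """Decorator that maps an action name to its handler."""
    def wrap(fn: ActionHandler) -> ActionHandler:
        HANDLERS[action] = fn
        return fn
    return wrap

@register("alert")
def alert(payload: dict) -> None:
    print(f"ALERT: {payload.get('target')}")

def dispatch(structured_output: dict[str, Any]) -> None:
    """Route a structured call like {"action": "alert", "target": "corvid_detected"}."""
    handler = HANDLERS.get(structured_output.get("action", ""))
    if handler is None:
        raise ValueError(f"no handler for {structured_output!r}")
    handler(structured_output)

dispatch({"action": "alert", "target": "corvid_detected"})
```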

**Enables**:

- Massive scale vision processing without text bottleneck
- Direct vector storage in spatial system
- Structured, reliable action generation
- Edge deployment (small models) + central processing (large models)

**Crucial interlink**: These two models together automate the full loop from seeing to storing to acting. The pipeline can "go wild" with vision data at scale.

**Related**: S2 Spatial Representation, Data Artifact Model, Corvid Observation

---

## How to Use This File

1. **Add nuggets** when insights emerge in sessions
2. **Don't over-spec** - keep entries short, seed-like
3. **Reference origin** - when/where the idea came from
4. **Note what it enables** - why it matters
5. **Note what it requires** - what foundations are needed
6. **Graduate to ADR or spec** when we're ready to build

---

**Philosophy**: *"Plant seeds. Water foundations. Harvest when ready."*

**Last Updated**: 2025-12-31