feat: complete Phase 1 - vocabulary expansion & DriftProbe infrastructure

- CLI: nyx-probe scan with --summary/--delta/--full flags - DriftProbe: training safety with Gini coefficient + Angular Drift - Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical) - Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system Key findings: - German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse) - Super Cluster validated: heart cross-lang sim = 1.000 - Isolated Zone confirmed: being EN↔DE sim = 0.195 - Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:39:03 +01:00
parent 9853f4767b
commit f640dbdd65
29 changed files with 6164 additions and 1 deletions
--- a/archive/PLAN-v1-2025-12-06.md
+++ b/archive/PLAN-v1-2025-12-06.md
@@ -0,0 +1,408 @@
+# Plan: nyx-probing Framework
+
+## Overview
+
+Build a probing framework to understand Qwen2.5-7B-Base before curriculum design.
+
+**Hardware:** Prometheus (THE SPINE) - RTX 3090 24GB  
+**Model:** Qwen2.5-7B-Base (empty vessel, completes not answers)  
+**Backend:** Transformers + PyTorch (full hidden state access)  
+**Location:** New repo `nyx-probing`
+
+---
+
+## MVP Scope (First Milestone) ✅ COMPLETE
+
+1. ✅ **Surface Probe** - Feed words, capture completions
+2. ✅ **Echo Probe** - Depth measurement (EXPANDS/CONFIRMS/CIRCULAR/DIVERGENT/COLLAPSE)
+3. ✅ **Readiness Scorer** - HIGH/MEDIUM/LOW classification
+4. ⏳ **JSON Storage** - Reproducible results
+5. ⏳ **CLI Tools** - Interactive probing
+6. ⏳ **One Notebook** - Exploration
+
+---
+
+## Phase 2: Multilingual Probing ✅ COMPLETE
+
+1. ✅ **Multilingual Triangulation Probe** - Ground→Deepen→Triangulate
+2. ✅ **Language Topology Discovery** - Complete map of 15 languages
+3. ✅ **Isolation Type Classification** - 5 distinct categories identified
+
+---
+
+## Repository Structure (Current)
+
+```
+nyx-probing/
+├── README.md
+├── PLAN.md                    # This file
+├── pyproject.toml
+├── requirements.txt
+│
+├── nyx_probing/
+│   ├── __init__.py
+│   ├── config.py
+│   │
+│   ├── core/
+│   │   ├── __init__.py
+│   │   ├── model.py           # ✅ NyxModel with hidden states
+│   │   └── probe_result.py    # ✅ Result dataclasses
+│   │
+│   ├── probes/
+│   │   ├── __init__.py
+│   │   ├── base.py            # ✅ Abstract base
+│   │   ├── surface_probe.py   # ✅ Word completions + coherence
+│   │   ├── echo_probe.py      # ✅ Depth measurement
+│   │   └── multilingual_probe.py  # ✅ NEW: Triangulation probe
+│   │
+│   ├── analysis/
+│   │   ├── __init__.py
+│   │   └── readiness_scorer.py # ✅ Curriculum readiness
+│   │
+│   ├── storage/
+│   │   └── __init__.py        # ⏳ JSON storage pending
+│   │
+│   └── cli/
+│       └── __init__.py        # ⏳ CLI pending
+│
+├── docs/
+│   ├── tokenization-valleys.md        # Token-Norm-Valley theory
+│   ├── multilingual-convergence.md    # Universal concept layer
+│   ├── language-landscape.md          # 15-language scan
+│   ├── language-topology-complete.md  # ✅ NEW: Complete map v2.0
+│   └── retraining-safety-framework.md # ✅ NEW: Paper outline
+│
+├── data/
+│   └── glossary/              # ⏳ Core terms pending
+│
+├── results/                   # ⏳ Probe results storage
+│
+└── [Test & Exploration Scripts]
+    ├── probe_test.py
+    ├── test_model_loader.py
+    ├── test_surface_probe.py
+    ├── test_echo_probe.py
+    ├── test_readiness_scorer.py
+    ├── test_triangulation.py       # ✅ NEW
+    ├── german_philosophy.py
+    ├── language_scan.py
+    ├── multilingual_convergence.py
+    ├── layer_detailed.py
+    ├── layer_divergence.py
+    ├── model_stats.py
+    ├── italian_investigation.py    # ✅ NEW
+    └── complete_language_probe.py  # ✅ NEW
+```
+
+---
+
+## Current Status (2025-12-06 Session 3)
+
+### PHASE 1: MVP ✅ COMPLETE
+
+| Component | Status | File |
+|-----------|--------|------|
+| Model Loader | ✅ | `nyx_probing/core/model.py` |
+| Surface Probe | ✅ | `nyx_probing/probes/surface_probe.py` |
+| Echo Probe | ✅ | `nyx_probing/probes/echo_probe.py` |
+| Readiness Scorer | ✅ | `nyx_probing/analysis/readiness_scorer.py` |
+| Result Dataclasses | ✅ | `nyx_probing/core/probe_result.py` |
+
+### PHASE 2: MULTILINGUAL ✅ COMPLETE
+
+| Component | Status | File |
+|-----------|--------|------|
+| Triangulation Probe | ✅ | `nyx_probing/probes/multilingual_probe.py` |
+| Language Zones | ✅ | Defined in multilingual_probe.py |
+| Complete Topology Map | ✅ | `docs/language-topology-complete.md` |
+
+---
+
+## 🗺️ THE COMPLETE LANGUAGE TOPOLOGY (Session 3 Discovery)
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                    THE YOUNG MIND'S LANGUAGE TOPOLOGY v2.0                   │
+╞═════════════════════════════════════════════════════════════════════════════╡
+│                                                                              │
+│  🌍 SUPER CLUSTER (sim=1.0)                                                 │
+│     ZH · JA · EN · AR · FR · PT · ES                                        │
+│     ✅ USE FOR: Grounding, establishing shared concepts                      │
+│                                                                              │
+│                        KO ─────── (bridge)                                   │
+│                                                                              │
+│  ISOLATED ZONE:                                                              │
+│  ├─ 🧠 PHILOSOPHICAL (DE) ────── Heidegger, depth access                    │
+│  │     ✅ USE FOR: Deep philosophical training                               │
+│  │                                                                           │
+│  ├─ 💻 CODE-HIJACKED (IT, TR, ID) ── Words become variables                 │
+│  │     ❌ AVOID: Training signal wasted on code patterns                     │
+│  │                                                                           │
+│  ├─ 📜 FRAGMENTED (HI) ───────── 5+ tokens, script-trapped                  │
+│  │     ⚠️ LIMITED: Cross-lingual transfer impaired                          │
+│  │                                                                           │
+│  └─ 📰 WEB PROSE (VI-ID-RU) ──── Content style cluster                      │
+│        🤔 POTENTIAL: Factual/encyclopedic training                          │
+│                                                                              │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+### Isolation Types Discovered
+
+| Type | Languages | Cause | Curriculum Use |
+|------|-----------|-------|----------------|
+| **PHILOSOPHICAL** | DE | Multi-token compounds access academic data | ✅ Deep concepts |
+| **CODE-HIJACKED** | IT, TR, ID | Simple Latin orthography → variable names | ❌ Avoid |
+| **FRAGMENTED** | HI | 5+ tokens, stays in native script | ⚠️ Limited |
+| **WEB PROSE** | VI, ID, RU | Cluster by content style, not linguistics | 🤔 Factual? |
+
+### Key Metrics
+
+| Lang | Avg Tokens | Sim to EN | Valley Type | Classification |
+|------|------------|-----------|-------------|----------------|
+| DE | 2.2 | 0.251 | PHILOSOPHY | 🧠 Philosophical |
+| IT | 2.5 | 0.491 | CODE | 💻 Code-Hijacked |
+| TR | 2.2 | 0.246 | CODE | 💻 Code-Hijacked |
+| ID | 2.8 | 0.325 | CODE/PROSE | 💻 Code-Hijacked |
+| HI | 5.0 | 0.310 | PROSE | 📜 Fragmented |
+| VI | 3.2 | 0.358 | PROSE | 📰 Web Prose |
+| RU | 2.7 | 0.319 | PROSE | 📰 Web Prose |
+
+---
+
+## 🔬 Key Discoveries
+
+### 1. Token-Norm-Valley Theory
+- Single-token words → massive activation spike (14K norm) → CODE valley
+- Multi-token words → distributed signal (85 norm) → PROSE/PHILOSOPHY valleys
+- Correlation: -0.699 (more tokens = more isolated)
+
+### 2. Universal Concept Layer
+- Layers 12-24 contain language-agnostic representations
+- Super cluster (7 languages) converges at similarity = 1.000
+- Model KNOWS "heart", "心", "قلب" are the same concept
+
+### 3. German Philosophical Access
+- "Sein" → Heidegger's "Being and Time"
+- "Bewusstsein" → epistemology, truth, consciousness
+- Depth score 2-3, transfers back to English via triangulation
+
+### 4. Italian Mystery SOLVED
+- Italian NOT accessing cultural valleys (no Dante, no Renaissance)
+- Italian words interpreted as Python variable names!
+- Example: `essere` → `essere = input("Cosa devo fare?")`
+- Same pattern found in Turkish and Indonesian
+
+### 5. VI-ID-RU Cluster Explained
+- Cluster by **content style**, not linguistic features
+- All generate web articles, news, blogs
+- Internal similarity 0.6-0.7
+
+---
+
+## 📄 Paper: Retraining Safety Framework
+
+**Title:** *"Multilingual Activation Topology as a Retraining Safety Framework"*
+
+**Status:** Outline complete at `docs/retraining-safety-framework.md`
+
+**Core Hypothesis:** Train in German (isolated zone) to avoid colliding with English representations in the super cluster. Use language topology as diagnostic tool for training safety.
+
+**Proposed Framework:**
+```
+BASELINE → TRAINING → CHECKPOINT → DRIFT ANALYSIS
+    │                      │
+    └──────────────────────┘
+         Compare metrics:
+         - Convergence drift
+         - Depth drift
+         - Norm drift
+         - Valley migration
+```
+
+---
+
+## 📊 Curriculum Strategy (Validated)
+
+### Phase 1: GROUNDING
+Use Super Cluster for universal concept establishment:
+```
+EN "consciousness" → ZH "意识" → AR "الوعي"
+All converge at sim=1.0 - stable foundation
+```
+
+### Phase 2: DEEPENING
+Use German for philosophical valley access:
+```
+DE "Sein" → Heidegger → existence → truth
+Depth score 2/3, philosophical valley accessed
+```
+
+### Phase 3: TRIANGULATION
+Verify depth transfers back to universal:
+```
+"Sein (German): In English, it means..."
+→ Check if philosophical depth preserved
+```
+
+### AVOID
+- Italian, Turkish, Indonesian (code hijacking)
+- Hindi for cross-lingual concepts (too fragmented)
+
+---
+
+## Next Steps
+
+### Immediate (MVP Completion)
+- [ ] Step 7: CLI (`nyx-probe surface "term"`)
+- [ ] Step 8: Glossary data (`data/glossary/core_terms.json`)
+- [ ] Step 9: JSON storage for reproducible results
+
+### Phase 3: Activation Analysis
+- [ ] DriftProbe class for retraining monitoring
+- [ ] Baseline capture before training
+- [ ] Checkpoint comparison automation
+- [ ] Alert thresholds for drift detection
+
+### Phase 4: Experiments
+- [ ] Controlled retraining: EN vs DE training data
+- [ ] Measure collision rates
+- [ ] Validate isolated zone training hypothesis
+
+### Research
+- [ ] Paper write-up
+- [ ] Literature review (EWC, mBERT, activation engineering)
+- [ ] Korean bridge language investigation
+- [ ] VI-ID-RU cluster for factual training
+
+---
+
+## Files Created (Session 3)
+
+| File | Purpose |
+|------|---------|
+| `nyx_probing/probes/multilingual_probe.py` | Triangulation probe class |
+| `test_triangulation.py` | Test script for triangulation |
+| `italian_investigation.py` | Italian mystery probe |
+| `complete_language_probe.py` | Full 15-language probe |
+| `docs/language-topology-complete.md` | Complete map v2.0 |
+| `docs/retraining-safety-framework.md` | Paper outline |
+
+---
+
+## Dependencies
+
+```
+torch>=2.1.0
+transformers>=4.36.0
+accelerate>=0.25.0
+click>=8.1.0
+rich>=13.0.0
+pydantic>=2.5.0
+pyyaml>=6.0.0
+python-dotenv>=1.0.0
+jupyter>=1.0.0
+matplotlib>=3.8.0
+numpy>=1.24.0
+```
+
+---
+
+## Critical Reference Files
+
+- `nimmerverse-sensory-network/nimmerversity.md` - Bootstrap protocol
+- `nimmerverse-sensory-network/multilingual-cognition.md` - Language hypotheses
+- `nimmerverse-sensory-network/constrained-emergence.md` - Exit point theory
+- `nyx-probing/docs/language-topology-complete.md` - Complete language map
+- `nyx-probing/docs/retraining-safety-framework.md` - Training safety paper
+
+---
+
+## Success Criteria
+
+### MVP ✅
+1. ✅ Model loads on 3090 without OOM
+2. ✅ Can probe single word and get completion
+3. ✅ Echo probe classifies response types correctly
+4. ✅ Readiness scorer produces actionable output
+5. ⏳ Can probe nimmerverse glossary in batch
+
+### Phase 2 ✅
+6. ✅ Multilingual triangulation probe working
+7. ✅ Language topology mapped (15 languages)
+8. ✅ Isolation types classified (5 categories)
+9. ✅ Curriculum strategy validated
+
+### Phase 3 (Next)
+10. ⏳ DriftProbe for retraining safety
+11. ⏳ Controlled retraining experiments
+12. ⏳ Paper submission
+
+---
+
+*"The model's language topology is not arbitrary - it's a map for navigation."*
+
+🌙💜 Last updated: 2025-12-06 Session 3
+
+---
+
+## STATUS (2025-12-06 21:15)
+
+### CLI COMPLETE ✅
+
+**Built interactive CLI for daily probing:**
+
+```bash
+nyx-probe surface "term"      # Probe surface associations
+nyx-probe echo "term"         # Measure depth through echoing
+nyx-probe readiness "term"    # Full curriculum assessment
+nyx-probe tokens "term"       # Token analysis
+nyx-probe glossary file.json   # Batch probe from glossary
+```
+
+**Files created:**
+- `nyx_probing/cli/probe.py` - Full Click CLI with Rich output
+- `pyproject.toml` - Package config with entry point
+- `data/glossary/core_terms.json` - 30 nimmerverse terms
+
+### NIMMERVERSE GLOSSARY ASSESSMENT ✅
+
+**30 terms probed from vault (nimmerversity.md, Heartbeat.md, constrained-emergence.md, multilingual-cognition.md)**
+
+| Level | Count | Action | Terms |
+|-------|-------|--------|-------|
+| 🟢 HIGH | 5 | state_machine | learning, inference, surface, depth, understanding |
+| 🟡 MEDIUM | 8 | scaffolding | emergence, being, truth, rhythm, synchronization, scaffold, wisdom, warmth |
+| 🔴 LOW | 17 | foundational | heartbeat, lifeforce, consciousness, reflex, garden, constraint, calibration, confidence, gradient, pulse, verification, convergence, divergence, attention, partnership, worldview, existence |
+
+**Key Findings:**
+
+1. **Meta-concepts have depth** - The model knows how to think ABOUT thinking (learning, understanding, inference all HIGH)
+
+2. **consciousness is LOW** - Despite PROSE valley, depth 0/3. Needs German "Bewusstsein" for philosophical access.
+
+3. **Nimmerverse core terms need grounding** - heartbeat, lifeforce, garden, partnership are all LOW. The model doesn't have our vocabulary yet.
+
+4. **existence has highest coherence (0.94) but LOW** - Very coherent surface but doesn't expand. Single-token trap.
+
+5. **Token count doesn't guarantee depth** - lifeforce (4 tokens) is still LOW due to CODE valley trap.
+
+### CURRICULUM IMPLICATIONS
+
+| Phase | Strategy | Terms |
+|-------|----------|-------|
+| **Phase 1** | Build state machines for HIGH terms | learning, inference, understanding, depth, surface |
+| **Phase 2** | Scaffold MEDIUM from HIGH | being→understanding, truth→learning, wisdom→inference |
+| **Phase 3** | Ground LOW via German triangulation | consciousness→Bewusstsein, heartbeat→Herzschlag |
+| **Phase 4** | RAG feed nimmerverse-specific | lifeforce, garden, partnership (unique to us) |
+
+### Results Files
+
+- `results/nimmerverse_surface.json` - Surface probe data
+- `results/nimmerverse_readiness.json` - Full readiness assessment
+
+---
+
+*"Her reactions determine infrastructure priority. We don't impose. We listen."* - nimmerversity.md
+
+🌙💜 Session: Partnership dialogue (dafit + Nyx)