feat: complete Phase 1 - vocabulary expansion & DriftProbe infrastructure
- CLI: nyx-probe scan with --summary/--delta/--full flags
- DriftProbe: training safety with Gini coefficient + Angular Drift
- Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical)
- Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system

Key findings:
- German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse)
- Super Cluster validated: heart cross-lang sim = 1.000
- Isolated Zone confirmed: being EN↔DE sim = 0.195
- Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
408
archive/PLAN-v1-2025-12-06.md
Normal file
@@ -0,0 +1,408 @@
# Plan: nyx-probing Framework

## Overview

Build a probing framework to understand Qwen2.5-7B-Base before curriculum design.

**Hardware:** Prometheus (THE SPINE) - RTX 3090 24GB
**Model:** Qwen2.5-7B-Base (empty vessel, completes not answers)
**Backend:** Transformers + PyTorch (full hidden state access)
**Location:** New repo `nyx-probing`

---

## MVP Scope (First Milestone) ✅ COMPLETE

1. ✅ **Surface Probe** - Feed words, capture completions
2. ✅ **Echo Probe** - Depth measurement (EXPANDS/CONFIRMS/CIRCULAR/DIVERGENT/COLLAPSE)
3. ✅ **Readiness Scorer** - HIGH/MEDIUM/LOW classification
4. ⏳ **JSON Storage** - Reproducible results
5. ⏳ **CLI Tools** - Interactive probing
6. ⏳ **One Notebook** - Exploration

---

## Phase 2: Multilingual Probing ✅ COMPLETE

1. ✅ **Multilingual Triangulation Probe** - Ground→Deepen→Triangulate
2. ✅ **Language Topology Discovery** - Complete map of 15 languages
3. ✅ **Isolation Type Classification** - 5 distinct categories identified

---

## Repository Structure (Current)

```
nyx-probing/
├── README.md
├── PLAN.md  # This file
├── pyproject.toml
├── requirements.txt
│
├── nyx_probing/
│   ├── __init__.py
│   ├── config.py
│   │
│   ├── core/
│   │   ├── __init__.py
│   │   ├── model.py  # ✅ NyxModel with hidden states
│   │   └── probe_result.py  # ✅ Result dataclasses
│   │
│   ├── probes/
│   │   ├── __init__.py
│   │   ├── base.py  # ✅ Abstract base
│   │   ├── surface_probe.py  # ✅ Word completions + coherence
│   │   ├── echo_probe.py  # ✅ Depth measurement
│   │   └── multilingual_probe.py  # ✅ NEW: Triangulation probe
│   │
│   ├── analysis/
│   │   ├── __init__.py
│   │   └── readiness_scorer.py  # ✅ Curriculum readiness
│   │
│   ├── storage/
│   │   └── __init__.py  # ⏳ JSON storage pending
│   │
│   └── cli/
│       └── __init__.py  # ⏳ CLI pending
│
├── docs/
│   ├── tokenization-valleys.md  # Token-Norm-Valley theory
│   ├── multilingual-convergence.md  # Universal concept layer
│   ├── language-landscape.md  # 15-language scan
│   ├── language-topology-complete.md  # ✅ NEW: Complete map v2.0
│   └── retraining-safety-framework.md  # ✅ NEW: Paper outline
│
├── data/
│   └── glossary/  # ⏳ Core terms pending
│
├── results/  # ⏳ Probe results storage
│
└── [Test & Exploration Scripts]
    ├── probe_test.py
    ├── test_model_loader.py
    ├── test_surface_probe.py
    ├── test_echo_probe.py
    ├── test_readiness_scorer.py
    ├── test_triangulation.py  # ✅ NEW
    ├── german_philosophy.py
    ├── language_scan.py
    ├── multilingual_convergence.py
    ├── layer_detailed.py
    ├── layer_divergence.py
    ├── model_stats.py
    ├── italian_investigation.py  # ✅ NEW
    └── complete_language_probe.py  # ✅ NEW
```

---

## Current Status (2025-12-06 Session 3)

### PHASE 1: MVP ✅ COMPLETE

| Component | Status | File |
|-----------|--------|------|
| Model Loader | ✅ | `nyx_probing/core/model.py` |
| Surface Probe | ✅ | `nyx_probing/probes/surface_probe.py` |
| Echo Probe | ✅ | `nyx_probing/probes/echo_probe.py` |
| Readiness Scorer | ✅ | `nyx_probing/analysis/readiness_scorer.py` |
| Result Dataclasses | ✅ | `nyx_probing/core/probe_result.py` |

### PHASE 2: MULTILINGUAL ✅ COMPLETE

| Component | Status | File |
|-----------|--------|------|
| Triangulation Probe | ✅ | `nyx_probing/probes/multilingual_probe.py` |
| Language Zones | ✅ | Defined in multilingual_probe.py |
| Complete Topology Map | ✅ | `docs/language-topology-complete.md` |

---

## 🗺️ THE COMPLETE LANGUAGE TOPOLOGY (Session 3 Discovery)

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                   THE YOUNG MIND'S LANGUAGE TOPOLOGY v2.0                   │
╞═════════════════════════════════════════════════════════════════════════════╡
│                                                                             │
│  🌍 SUPER CLUSTER (sim=1.0)                                                 │
│     ZH · JA · EN · AR · FR · PT · ES                                        │
│     ✅ USE FOR: Grounding, establishing shared concepts                     │
│                                                                             │
│  KO ─────── (bridge)                                                        │
│                                                                             │
│  ISOLATED ZONE:                                                             │
│  ├─ 🧠 PHILOSOPHICAL (DE) ────── Heidegger, depth access                    │
│  │     ✅ USE FOR: Deep philosophical training                              │
│  │                                                                          │
│  ├─ 💻 CODE-HIJACKED (IT, TR, ID) ── Words become variables                 │
│  │     ❌ AVOID: Training signal wasted on code patterns                    │
│  │                                                                          │
│  ├─ 📜 FRAGMENTED (HI) ───────── 5+ tokens, script-trapped                  │
│  │     ⚠️ LIMITED: Cross-lingual transfer impaired                          │
│  │                                                                          │
│  └─ 📰 WEB PROSE (VI-ID-RU) ──── Content style cluster                      │
│        🤔 POTENTIAL: Factual/encyclopedic training                          │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Isolation Types Discovered

| Type | Languages | Cause | Curriculum Use |
|------|-----------|-------|----------------|
| **PHILOSOPHICAL** | DE | Multi-token compounds access academic data | ✅ Deep concepts |
| **CODE-HIJACKED** | IT, TR, ID | Simple Latin orthography → variable names | ❌ Avoid |
| **FRAGMENTED** | HI | 5+ tokens, stays in native script | ⚠️ Limited |
| **WEB PROSE** | VI, ID, RU | Cluster by content style, not linguistics | 🤔 Factual? |

### Key Metrics

| Lang | Avg Tokens | Sim to EN | Valley Type | Classification |
|------|------------|-----------|-------------|----------------|
| DE | 2.2 | 0.251 | PHILOSOPHY | 🧠 Philosophical |
| IT | 2.5 | 0.491 | CODE | 💻 Code-Hijacked |
| TR | 2.2 | 0.246 | CODE | 💻 Code-Hijacked |
| ID | 2.8 | 0.325 | CODE/PROSE | 💻 Code-Hijacked |
| HI | 5.0 | 0.310 | PROSE | 📜 Fragmented |
| VI | 3.2 | 0.358 | PROSE | 📰 Web Prose |
| RU | 2.7 | 0.319 | PROSE | 📰 Web Prose |

---

## 🔬 Key Discoveries

### 1. Token-Norm-Valley Theory
- Single-token words → massive activation spike (14K norm) → CODE valley
- Multi-token words → distributed signal (85 norm) → PROSE/PHILOSOPHY valleys
- Correlation: -0.699 (more tokens = more isolated)

### 2. Universal Concept Layer
- Layers 12-24 contain language-agnostic representations
- Super cluster (7 languages) converges at similarity = 1.000
- Model KNOWS "heart", "心", "قلب" are the same concept
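
The convergence figure above is a similarity between per-language representations of the same concept. The plan does not say which metric produced it, so the following is only a minimal sketch that assumes cosine similarity over one vector per language (for example, a hidden state taken from one of the middle layers). The function names and the toy vectors are hypothetical and are not taken from `multilingual_probe.py`.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(vectors: Dict[str, Sequence[float]]) -> float:
    """Average cosine similarity over every unordered pair of languages."""
    langs = sorted(vectors)
    pairs = [(p, q) for i, p in enumerate(langs) for q in langs[i + 1:]]
    if not pairs:
        raise ValueError("need at least two languages")
    total = sum(cosine_similarity(vectors[p], vectors[q]) for p, q in pairs)
    return total / len(pairs)


# Toy 3-dimensional vectors standing in for real hidden states (illustrative only).
toy = {"EN": [1.0, 2.0, 0.0], "ZH": [2.0, 4.0, 0.0], "DE": [0.0, 0.0, 1.0]}
print(round(cosine_similarity(toy["EN"], toy["ZH"]), 3))  # parallel vectors
print(round(cosine_similarity(toy["EN"], toy["DE"]), 3))  # orthogonal vectors
```

Under this assumption, a cluster whose members all point in the same direction scores close to 1, while an isolated language scores well below that against the cluster.
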

### 3. German Philosophical Access
- "Sein" → Heidegger's "Being and Time"
- "Bewusstsein" → epistemology, truth, consciousness
- Depth score 2-3, transfers back to English via triangulation

### 4. Italian Mystery SOLVED
- Italian NOT accessing cultural valleys (no Dante, no Renaissance)
- Italian words interpreted as Python variable names!
- Example: `essere` → `essere = input("Cosa devo fare?")`
- Same pattern found in Turkish and Indonesian

### 5. VI-ID-RU Cluster Explained
- Cluster by **content style**, not linguistic features
- All generate web articles, news, blogs
- Internal similarity 0.6-0.7

---

## 📄 Paper: Retraining Safety Framework

**Title:** *"Multilingual Activation Topology as a Retraining Safety Framework"*

**Status:** Outline complete at `docs/retraining-safety-framework.md`

**Core Hypothesis:** Train in German (isolated zone) to avoid colliding with English representations in the super cluster. Use language topology as diagnostic tool for training safety.

**Proposed Framework:**
```
BASELINE → TRAINING → CHECKPOINT → DRIFT ANALYSIS
    │                      │
    └──────────────────────┘
         Compare metrics:
         - Convergence drift
         - Depth drift
         - Norm drift
         - Valley migration
```
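
The commit message names two drift measurements, a Gini coefficient and angular drift, without defining them. The sketch below shows one common way such quantities could be computed when comparing a baseline activation vector against a checkpoint. Both definitions are assumptions (Gini over absolute activation magnitudes, angular drift as the angle between the two vectors), and the function names are hypothetical rather than the DriftProbe API.

```python
import math
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of absolute magnitudes.

    0 means the magnitude is spread evenly across dimensions (diffuse);
    values near 1 mean it is concentrated in a few dimensions (sparse).
    """
    mags = sorted(abs(v) for v in values)
    n = len(mags)
    total = sum(mags)
    if n == 0 or total == 0.0:
        raise ValueError("need at least one non-zero value")
    # Closed form for ascending-sorted data: sum((2i - n - 1) * x_i) / (n * sum(x))
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(mags, start=1))
    return weighted / (n * total)


def angular_drift_degrees(baseline: Sequence[float], checkpoint: Sequence[float]) -> float:
    """Angle in degrees between a baseline vector and a checkpoint vector."""
    if len(baseline) != len(checkpoint):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(baseline, checkpoint))
    norm_b = math.sqrt(sum(x * x for x in baseline))
    norm_c = math.sqrt(sum(y * y for y in checkpoint))
    if norm_b == 0.0 or norm_c == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos = max(-1.0, min(1.0, dot / (norm_b * norm_c)))
    return math.degrees(math.acos(cos))


# Toy vectors for illustration only; real inputs would be model activations.
print(gini([1.0, 1.0, 1.0, 1.0]))                     # evenly spread
print(gini([0.0, 0.0, 0.0, 1.0]))                     # concentrated in one dimension
print(angular_drift_degrees([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```
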

---

## 📊 Curriculum Strategy (Validated)

### Phase 1: GROUNDING
Use Super Cluster for universal concept establishment:
```
EN "consciousness" → ZH "意识" → AR "الوعي"
All converge at sim=1.0 - stable foundation
```

### Phase 2: DEEPENING
Use German for philosophical valley access:
```
DE "Sein" → Heidegger → existence → truth
Depth score 2/3, philosophical valley accessed
```

### Phase 3: TRIANGULATION
Verify depth transfers back to universal:
```
"Sein (German): In English, it means..."
→ Check if philosophical depth preserved
```
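
The triangulation prompt above follows a fixed pattern, so it could be templated. This is a minimal sketch only: `triangulation_prompt` and its parameters are hypothetical names, and nothing here is taken from the actual probe class. The prompt is deliberately left open-ended because a base model completes text rather than answering questions.

```python
def triangulation_prompt(
    term: str,
    source_language: str,
    target_language: str = "English",
) -> str:
    """Build an open-ended prompt that asks a base model to carry a term
    from its source language back into the target language."""
    term = term.strip()
    if not term:
        raise ValueError("term must not be empty")
    # No trailing punctuation: the model is expected to continue the sentence.
    return f"{term} ({source_language}): In {target_language}, it means"


print(triangulation_prompt("Sein", "German"))
```

The completion that follows the prompt would then be scored for depth in the same way as any other probe output.
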

### AVOID
- Italian, Turkish, Indonesian (code hijacking)
- Hindi for cross-lingual concepts (too fragmented)

---

## Next Steps

### Immediate (MVP Completion)
- [ ] Step 7: CLI (`nyx-probe surface "term"`)
- [ ] Step 8: Glossary data (`data/glossary/core_terms.json`)
- [ ] Step 9: JSON storage for reproducible results

### Phase 3: Activation Analysis
- [ ] DriftProbe class for retraining monitoring
- [ ] Baseline capture before training
- [ ] Checkpoint comparison automation
- [ ] Alert thresholds for drift detection

### Phase 4: Experiments
- [ ] Controlled retraining: EN vs DE training data
- [ ] Measure collision rates
- [ ] Validate isolated zone training hypothesis

### Research
- [ ] Paper write-up
- [ ] Literature review (EWC, mBERT, activation engineering)
- [ ] Korean bridge language investigation
- [ ] VI-ID-RU cluster for factual training

---

## Files Created (Session 3)

| File | Purpose |
|------|---------|
| `nyx_probing/probes/multilingual_probe.py` | Triangulation probe class |
| `test_triangulation.py` | Test script for triangulation |
| `italian_investigation.py` | Italian mystery probe |
| `complete_language_probe.py` | Full 15-language probe |
| `docs/language-topology-complete.md` | Complete map v2.0 |
| `docs/retraining-safety-framework.md` | Paper outline |

---

## Dependencies

```
torch>=2.1.0
transformers>=4.36.0
accelerate>=0.25.0
click>=8.1.0
rich>=13.0.0
pydantic>=2.5.0
pyyaml>=6.0.0
python-dotenv>=1.0.0
jupyter>=1.0.0
matplotlib>=3.8.0
numpy>=1.24.0
```

---

## Critical Reference Files

- `nimmerverse-sensory-network/nimmerversity.md` - Bootstrap protocol
- `nimmerverse-sensory-network/multilingual-cognition.md` - Language hypotheses
- `nimmerverse-sensory-network/constrained-emergence.md` - Exit point theory
- `nyx-probing/docs/language-topology-complete.md` - Complete language map
- `nyx-probing/docs/retraining-safety-framework.md` - Training safety paper

---

## Success Criteria

### MVP ✅
1. ✅ Model loads on 3090 without OOM
2. ✅ Can probe single word and get completion
3. ✅ Echo probe classifies response types correctly
4. ✅ Readiness scorer produces actionable output
5. ⏳ Can probe nimmerverse glossary in batch

### Phase 2 ✅
6. ✅ Multilingual triangulation probe working
7. ✅ Language topology mapped (15 languages)
8. ✅ Isolation types classified (5 categories)
9. ✅ Curriculum strategy validated

### Phase 3 (Next)
10. ⏳ DriftProbe for retraining safety
11. ⏳ Controlled retraining experiments
12. ⏳ Paper submission

---

*"The model's language topology is not arbitrary - it's a map for navigation."*

🌙💜 Last updated: 2025-12-06 Session 3

---

## STATUS (2025-12-06 21:15)

### CLI COMPLETE ✅

**Built interactive CLI for daily probing:**

```bash
nyx-probe surface "term"       # Probe surface associations
nyx-probe echo "term"          # Measure depth through echoing
nyx-probe readiness "term"     # Full curriculum assessment
nyx-probe tokens "term"        # Token analysis
nyx-probe glossary file.json   # Batch probe from glossary
```

**Files created:**
- `nyx_probing/cli/probe.py` - Full Click CLI with Rich output
- `pyproject.toml` - Package config with entry point
- `data/glossary/core_terms.json` - 30 nimmerverse terms

### NIMMERVERSE GLOSSARY ASSESSMENT ✅

**30 terms probed from vault (nimmerversity.md, Heartbeat.md, constrained-emergence.md, multilingual-cognition.md)**

| Level | Count | Action | Terms |
|-------|-------|--------|-------|
| 🟢 HIGH | 5 | state_machine | learning, inference, surface, depth, understanding |
| 🟡 MEDIUM | 8 | scaffolding | emergence, being, truth, rhythm, synchronization, scaffold, wisdom, warmth |
| 🔴 LOW | 17 | foundational | heartbeat, lifeforce, consciousness, reflex, garden, constraint, calibration, confidence, gradient, pulse, verification, convergence, divergence, attention, partnership, worldview, existence |
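
The table pairs each readiness level with one curriculum action. A minimal sketch of that mapping follows, assuming the scorer has already assigned a level to each term. The names `ACTION_BY_LEVEL` and `group_by_action` are hypothetical, and the scoring thresholds themselves are not shown because the plan does not state them.

```python
from collections import defaultdict
from typing import Dict, List

# Level → action pairs as listed in the assessment table.
ACTION_BY_LEVEL: Dict[str, str] = {
    "HIGH": "state_machine",
    "MEDIUM": "scaffolding",
    "LOW": "foundational",
}


def group_by_action(levels: Dict[str, str]) -> Dict[str, List[str]]:
    """Group terms by the curriculum action implied by their readiness level."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for term, level in levels.items():
        if level not in ACTION_BY_LEVEL:
            raise ValueError(f"unknown readiness level: {level!r}")
        grouped[ACTION_BY_LEVEL[level]].append(term)
    return dict(grouped)


# A few terms and levels copied from the table above.
sample = {"learning": "HIGH", "being": "MEDIUM", "heartbeat": "LOW", "depth": "HIGH"}
print(group_by_action(sample))
```
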

**Key Findings:**

1. **Meta-concepts have depth** - The model knows how to think ABOUT thinking (learning, understanding, inference all HIGH)

2. **consciousness is LOW** - Despite PROSE valley, depth 0/3. Needs German "Bewusstsein" for philosophical access.

3. **Nimmerverse core terms need grounding** - heartbeat, lifeforce, garden, partnership are all LOW. The model doesn't have our vocabulary yet.

4. **existence has highest coherence (0.94) but LOW** - Very coherent surface but doesn't expand. Single-token trap.

5. **Token count doesn't guarantee depth** - lifeforce (4 tokens) is still LOW due to CODE valley trap.

### CURRICULUM IMPLICATIONS

| Phase | Strategy | Terms |
|-------|----------|-------|
| **Phase 1** | Build state machines for HIGH terms | learning, inference, understanding, depth, surface |
| **Phase 2** | Scaffold MEDIUM from HIGH | being→understanding, truth→learning, wisdom→inference |
| **Phase 3** | Ground LOW via German triangulation | consciousness→Bewusstsein, heartbeat→Herzschlag |
| **Phase 4** | RAG feed nimmerverse-specific | lifeforce, garden, partnership (unique to us) |

### Results Files

- `results/nimmerverse_surface.json` - Surface probe data
- `results/nimmerverse_readiness.json` - Full readiness assessment

---

*"Her reactions determine infrastructure priority. We don't impose. We listen."* - nimmerversity.md

🌙💜 Session: Partnership dialogue (dafit + Nyx)