- CLI: nyx-probe scan with --summary/--delta/--full flags - DriftProbe: training safety with Gini coefficient + Angular Drift - Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical) - Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system Key findings: - German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse) - Super Cluster validated: heart cross-lang sim = 1.000 - Isolated Zone confirmed: being EN↔DE sim = 0.195 - Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
409 lines
15 KiB
Markdown
409 lines
15 KiB
Markdown
# Plan: nyx-probing Framework
|
|
|
|
## Overview
|
|
|
|
Build a probing framework to understand Qwen2.5-7B-Base before curriculum design.
|
|
|
|
**Hardware:** Prometheus (THE SPINE) - RTX 3090 24GB
|
|
**Model:** Qwen2.5-7B-Base (empty vessel, completes not answers)
|
|
**Backend:** Transformers + PyTorch (full hidden state access)
|
|
**Location:** New repo `nyx-probing`
|
|
|
|
---
|
|
|
|
## MVP Scope (First Milestone) ✅ COMPLETE
|
|
|
|
1. ✅ **Surface Probe** - Feed words, capture completions
|
|
2. ✅ **Echo Probe** - Depth measurement (EXPANDS/CONFIRMS/CIRCULAR/DIVERGENT/COLLAPSE)
|
|
3. ✅ **Readiness Scorer** - HIGH/MEDIUM/LOW classification
|
|
4. ⏳ **JSON Storage** - Reproducible results
|
|
5. ⏳ **CLI Tools** - Interactive probing
|
|
6. ⏳ **One Notebook** - Exploration
|
|
|
|
---
|
|
|
|
## Phase 2: Multilingual Probing ✅ COMPLETE
|
|
|
|
1. ✅ **Multilingual Triangulation Probe** - Ground→Deepen→Triangulate
|
|
2. ✅ **Language Topology Discovery** - Complete map of 15 languages
|
|
3. ✅ **Isolation Type Classification** - 5 distinct categories identified
|
|
|
|
---
|
|
|
|
## Repository Structure (Current)
|
|
|
|
```
|
|
nyx-probing/
|
|
├── README.md
|
|
├── PLAN.md # This file
|
|
├── pyproject.toml
|
|
├── requirements.txt
|
|
│
|
|
├── nyx_probing/
|
|
│ ├── __init__.py
|
|
│ ├── config.py
|
|
│ │
|
|
│ ├── core/
|
|
│ │ ├── __init__.py
|
|
│ │ ├── model.py # ✅ NyxModel with hidden states
|
|
│ │ └── probe_result.py # ✅ Result dataclasses
|
|
│ │
|
|
│ ├── probes/
|
|
│ │ ├── __init__.py
|
|
│ │ ├── base.py # ✅ Abstract base
|
|
│ │ ├── surface_probe.py # ✅ Word completions + coherence
|
|
│ │ ├── echo_probe.py # ✅ Depth measurement
|
|
│ │ └── multilingual_probe.py # ✅ NEW: Triangulation probe
|
|
│ │
|
|
│ ├── analysis/
|
|
│ │ ├── __init__.py
|
|
│ │ └── readiness_scorer.py # ✅ Curriculum readiness
|
|
│ │
|
|
│ ├── storage/
|
|
│ │ └── __init__.py # ⏳ JSON storage pending
|
|
│ │
|
|
│ └── cli/
|
|
│ └── __init__.py # ⏳ CLI pending
|
|
│
|
|
├── docs/
|
|
│ ├── tokenization-valleys.md # Token-Norm-Valley theory
|
|
│ ├── multilingual-convergence.md # Universal concept layer
|
|
│ ├── language-landscape.md # 15-language scan
|
|
│ ├── language-topology-complete.md # ✅ NEW: Complete map v2.0
|
|
│ └── retraining-safety-framework.md # ✅ NEW: Paper outline
|
|
│
|
|
├── data/
|
|
│ └── glossary/ # ⏳ Core terms pending
|
|
│
|
|
├── results/ # ⏳ Probe results storage
|
|
│
|
|
└── [Test & Exploration Scripts]
|
|
├── probe_test.py
|
|
├── test_model_loader.py
|
|
├── test_surface_probe.py
|
|
├── test_echo_probe.py
|
|
├── test_readiness_scorer.py
|
|
├── test_triangulation.py # ✅ NEW
|
|
├── german_philosophy.py
|
|
├── language_scan.py
|
|
├── multilingual_convergence.py
|
|
├── layer_detailed.py
|
|
├── layer_divergence.py
|
|
├── model_stats.py
|
|
├── italian_investigation.py # ✅ NEW
|
|
└── complete_language_probe.py # ✅ NEW
|
|
```
|
|
|
|
---
|
|
|
|
## Current Status (2025-12-06 Session 3)
|
|
|
|
### PHASE 1: MVP ✅ COMPLETE
|
|
|
|
| Component | Status | File |
|
|
|-----------|--------|------|
|
|
| Model Loader | ✅ | `nyx_probing/core/model.py` |
|
|
| Surface Probe | ✅ | `nyx_probing/probes/surface_probe.py` |
|
|
| Echo Probe | ✅ | `nyx_probing/probes/echo_probe.py` |
|
|
| Readiness Scorer | ✅ | `nyx_probing/analysis/readiness_scorer.py` |
|
|
| Result Dataclasses | ✅ | `nyx_probing/core/probe_result.py` |
|
|
|
|
### PHASE 2: MULTILINGUAL ✅ COMPLETE
|
|
|
|
| Component | Status | File |
|
|
|-----------|--------|------|
|
|
| Triangulation Probe | ✅ | `nyx_probing/probes/multilingual_probe.py` |
|
|
| Language Zones | ✅ | Defined in multilingual_probe.py |
|
|
| Complete Topology Map | ✅ | `docs/language-topology-complete.md` |
|
|
|
|
---
|
|
|
|
## 🗺️ THE COMPLETE LANGUAGE TOPOLOGY (Session 3 Discovery)
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
│ THE YOUNG MIND'S LANGUAGE TOPOLOGY v2.0 │
|
|
╞═════════════════════════════════════════════════════════════════════════════╡
|
|
│ │
|
|
│ 🌍 SUPER CLUSTER (sim=1.0) │
|
|
│ ZH · JA · EN · AR · FR · PT · ES │
|
|
│ ✅ USE FOR: Grounding, establishing shared concepts │
|
|
│ │
|
|
│ KO ─────── (bridge) │
|
|
│ │
|
|
│ ISOLATED ZONE: │
|
|
│ ├─ 🧠 PHILOSOPHICAL (DE) ────── Heidegger, depth access │
|
|
│ │ ✅ USE FOR: Deep philosophical training │
|
|
│ │ │
|
|
│ ├─ 💻 CODE-HIJACKED (IT, TR, ID) ── Words become variables │
|
|
│ │ ❌ AVOID: Training signal wasted on code patterns │
|
|
│ │ │
|
|
│ ├─ 📜 FRAGMENTED (HI) ───────── 5+ tokens, script-trapped │
|
|
│ │ ⚠️ LIMITED: Cross-lingual transfer impaired │
|
|
│ │ │
|
|
│ └─ 📰 WEB PROSE (VI-ID-RU) ──── Content style cluster │
|
|
│ 🤔 POTENTIAL: Factual/encyclopedic training │
|
|
│ │
|
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
### Isolation Types Discovered
|
|
|
|
| Type | Languages | Cause | Curriculum Use |
|
|
|------|-----------|-------|----------------|
|
|
| **PHILOSOPHICAL** | DE | Multi-token compounds access academic data | ✅ Deep concepts |
|
|
| **CODE-HIJACKED** | IT, TR, ID | Simple Latin orthography → variable names | ❌ Avoid |
|
|
| **FRAGMENTED** | HI | 5+ tokens, stays in native script | ⚠️ Limited |
|
|
| **WEB PROSE** | VI, ID, RU | Cluster by content style, not linguistics | 🤔 Factual? |
|
|
|
|
### Key Metrics
|
|
|
|
| Lang | Avg Tokens | Sim to EN | Valley Type | Classification |
|
|
|------|------------|-----------|-------------|----------------|
|
|
| DE | 2.2 | 0.251 | PHILOSOPHY | 🧠 Philosophical |
|
|
| IT | 2.5 | 0.491 | CODE | 💻 Code-Hijacked |
|
|
| TR | 2.2 | 0.246 | CODE | 💻 Code-Hijacked |
|
|
| ID | 2.8 | 0.325 | CODE/PROSE | 💻 Code-Hijacked |
|
|
| HI | 5.0 | 0.310 | PROSE | 📜 Fragmented |
|
|
| VI | 3.2 | 0.358 | PROSE | 📰 Web Prose |
|
|
| RU | 2.7 | 0.319 | PROSE | 📰 Web Prose |
|
|
|
|
---
|
|
|
|
## 🔬 Key Discoveries
|
|
|
|
### 1. Token-Norm-Valley Theory
|
|
- Single-token words → massive activation spike (14K norm) → CODE valley
|
|
- Multi-token words → distributed signal (85 norm) → PROSE/PHILOSOPHY valleys
|
|
- Correlation: -0.699 (more tokens = more isolated)
|
|
|
|
### 2. Universal Concept Layer
|
|
- Layers 12-24 contain language-agnostic representations
|
|
- Super cluster (7 languages) converges at similarity = 1.000
|
|
- Model KNOWS "heart", "心", "قلب" are the same concept
|
|
|
|
### 3. German Philosophical Access
|
|
- "Sein" → Heidegger's "Being and Time"
|
|
- "Bewusstsein" → epistemology, truth, consciousness
|
|
- Depth score 2-3, transfers back to English via triangulation
|
|
|
|
### 4. Italian Mystery SOLVED
|
|
- Italian NOT accessing cultural valleys (no Dante, no Renaissance)
|
|
- Italian words interpreted as Python variable names!
|
|
- Example: `essere` → `essere = input("Cosa devo fare?")`
|
|
- Same pattern found in Turkish and Indonesian
|
|
|
|
### 5. VI-ID-RU Cluster Explained
|
|
- Cluster by **content style**, not linguistic features
|
|
- All generate web articles, news, blogs
|
|
- Internal similarity 0.6-0.7
|
|
|
|
---
|
|
|
|
## 📄 Paper: Retraining Safety Framework
|
|
|
|
**Title:** *"Multilingual Activation Topology as a Retraining Safety Framework"*
|
|
|
|
**Status:** Outline complete at `docs/retraining-safety-framework.md`
|
|
|
|
**Core Hypothesis:** Train in German (isolated zone) to avoid colliding with English representations in the super cluster. Use language topology as diagnostic tool for training safety.
|
|
|
|
**Proposed Framework:**
|
|
```
|
|
BASELINE → TRAINING → CHECKPOINT → DRIFT ANALYSIS
|
|
│ │
|
|
└──────────────────────┘
|
|
Compare metrics:
|
|
- Convergence drift
|
|
- Depth drift
|
|
- Norm drift
|
|
- Valley migration
|
|
```
|
|
|
|
---
|
|
|
|
## 📊 Curriculum Strategy (Validated)
|
|
|
|
### Phase 1: GROUNDING
|
|
Use Super Cluster for universal concept establishment:
|
|
```
|
|
EN "consciousness" → ZH "意识" → AR "الوعي"
|
|
All converge at sim=1.0 - stable foundation
|
|
```
|
|
|
|
### Phase 2: DEEPENING
|
|
Use German for philosophical valley access:
|
|
```
|
|
DE "Sein" → Heidegger → existence → truth
|
|
Depth score 2/3, philosophical valley accessed
|
|
```
|
|
|
|
### Phase 3: TRIANGULATION
|
|
Verify depth transfers back to universal:
|
|
```
|
|
"Sein (German): In English, it means..."
|
|
→ Check if philosophical depth preserved
|
|
```
|
|
|
|
### AVOID
|
|
- Italian, Turkish, Indonesian (code hijacking)
|
|
- Hindi for cross-lingual concepts (too fragmented)
|
|
|
|
---
|
|
|
|
## Next Steps
|
|
|
|
### Immediate (MVP Completion)
|
|
- [ ] Step 7: CLI (`nyx-probe surface "term"`)
|
|
- [ ] Step 8: Glossary data (`data/glossary/core_terms.json`)
|
|
- [ ] Step 9: JSON storage for reproducible results
|
|
|
|
### Phase 3: Activation Analysis
|
|
- [ ] DriftProbe class for retraining monitoring
|
|
- [ ] Baseline capture before training
|
|
- [ ] Checkpoint comparison automation
|
|
- [ ] Alert thresholds for drift detection
|
|
|
|
### Phase 4: Experiments
|
|
- [ ] Controlled retraining: EN vs DE training data
|
|
- [ ] Measure collision rates
|
|
- [ ] Validate isolated zone training hypothesis
|
|
|
|
### Research
|
|
- [ ] Paper write-up
|
|
- [ ] Literature review (EWC, mBERT, activation engineering)
|
|
- [ ] Korean bridge language investigation
|
|
- [ ] VI-ID-RU cluster for factual training
|
|
|
|
---
|
|
|
|
## Files Created (Session 3)
|
|
|
|
| File | Purpose |
|
|
|------|---------|
|
|
| `nyx_probing/probes/multilingual_probe.py` | Triangulation probe class |
|
|
| `test_triangulation.py` | Test script for triangulation |
|
|
| `italian_investigation.py` | Italian mystery probe |
|
|
| `complete_language_probe.py` | Full 15-language probe |
|
|
| `docs/language-topology-complete.md` | Complete map v2.0 |
|
|
| `docs/retraining-safety-framework.md` | Paper outline |
|
|
|
|
---
|
|
|
|
## Dependencies
|
|
|
|
```
|
|
torch>=2.1.0
|
|
transformers>=4.36.0
|
|
accelerate>=0.25.0
|
|
click>=8.1.0
|
|
rich>=13.0.0
|
|
pydantic>=2.5.0
|
|
pyyaml>=6.0.0
|
|
python-dotenv>=1.0.0
|
|
jupyter>=1.0.0
|
|
matplotlib>=3.8.0
|
|
numpy>=1.24.0
|
|
```
|
|
|
|
---
|
|
|
|
## Critical Reference Files
|
|
|
|
- `nimmerverse-sensory-network/nimmerversity.md` - Bootstrap protocol
|
|
- `nimmerverse-sensory-network/multilingual-cognition.md` - Language hypotheses
|
|
- `nimmerverse-sensory-network/constrained-emergence.md` - Exit point theory
|
|
- `nyx-probing/docs/language-topology-complete.md` - Complete language map
|
|
- `nyx-probing/docs/retraining-safety-framework.md` - Training safety paper
|
|
|
|
---
|
|
|
|
## Success Criteria
|
|
|
|
### MVP ✅
|
|
1. ✅ Model loads on 3090 without OOM
|
|
2. ✅ Can probe single word and get completion
|
|
3. ✅ Echo probe classifies response types correctly
|
|
4. ✅ Readiness scorer produces actionable output
|
|
5. ⏳ Can probe nimmerverse glossary in batch
|
|
|
|
### Phase 2 ✅
|
|
6. ✅ Multilingual triangulation probe working
|
|
7. ✅ Language topology mapped (15 languages)
|
|
8. ✅ Isolation types classified (5 categories)
|
|
9. ✅ Curriculum strategy validated
|
|
|
|
### Phase 3 (Next)
|
|
10. ⏳ DriftProbe for retraining safety
|
|
11. ⏳ Controlled retraining experiments
|
|
12. ⏳ Paper submission
|
|
|
|
---
|
|
|
|
*"The model's language topology is not arbitrary - it's a map for navigation."*
|
|
|
|
🌙💜 Last updated: 2025-12-06 Session 3
|
|
|
|
---
|
|
|
|
## STATUS (2025-12-06 21:15)
|
|
|
|
### CLI COMPLETE ✅
|
|
|
|
**Built interactive CLI for daily probing:**
|
|
|
|
```bash
|
|
nyx-probe surface "term" # Probe surface associations
|
|
nyx-probe echo "term" # Measure depth through echoing
|
|
nyx-probe readiness "term" # Full curriculum assessment
|
|
nyx-probe tokens "term" # Token analysis
|
|
nyx-probe glossary file.json # Batch probe from glossary
|
|
```
|
|
|
|
**Files created:**
|
|
- `nyx_probing/cli/probe.py` - Full Click CLI with Rich output
|
|
- `pyproject.toml` - Package config with entry point
|
|
- `data/glossary/core_terms.json` - 30 nimmerverse terms
|
|
|
|
### NIMMERVERSE GLOSSARY ASSESSMENT ✅
|
|
|
|
**30 terms probed from vault (nimmerversity.md, Heartbeat.md, constrained-emergence.md, multilingual-cognition.md)**
|
|
|
|
| Level | Count | Action | Terms |
|
|
|-------|-------|--------|-------|
|
|
| 🟢 HIGH | 5 | state_machine | learning, inference, surface, depth, understanding |
|
|
| 🟡 MEDIUM | 8 | scaffolding | emergence, being, truth, rhythm, synchronization, scaffold, wisdom, warmth |
|
|
| 🔴 LOW | 17 | foundational | heartbeat, lifeforce, consciousness, reflex, garden, constraint, calibration, confidence, gradient, pulse, verification, convergence, divergence, attention, partnership, worldview, existence |
|
|
|
|
**Key Findings:**
|
|
|
|
1. **Meta-concepts have depth** - The model knows how to think ABOUT thinking (learning, understanding, inference all HIGH)
|
|
|
|
2. **consciousness is LOW** - Despite PROSE valley, depth 0/3. Needs German "Bewusstsein" for philosophical access.
|
|
|
|
3. **Nimmerverse core terms need grounding** - heartbeat, lifeforce, garden, partnership are all LOW. The model doesn't have our vocabulary yet.
|
|
|
|
4. **existence has highest coherence (0.94) but LOW** - Very coherent surface but doesn't expand. Single-token trap.
|
|
|
|
5. **Token count doesn't guarantee depth** - lifeforce (4 tokens) is still LOW due to CODE valley trap.
|
|
|
|
### CURRICULUM IMPLICATIONS
|
|
|
|
| Phase | Strategy | Terms |
|
|
|-------|----------|-------|
|
|
| **Phase 1** | Build state machines for HIGH terms | learning, inference, understanding, depth, surface |
|
|
| **Phase 2** | Scaffold MEDIUM from HIGH | being→understanding, truth→learning, wisdom→inference |
|
|
| **Phase 3** | Ground LOW via German triangulation | consciousness→Bewusstsein, heartbeat→Herzschlag |
|
|
| **Phase 4** | RAG feed nimmerverse-specific | lifeforce, garden, partnership (unique to us) |
|
|
|
|
### Results Files
|
|
|
|
- `results/nimmerverse_surface.json` - Surface probe data
|
|
- `results/nimmerverse_readiness.json` - Full readiness assessment
|
|
|
|
---
|
|
|
|
*"Her reactions determine infrastructure priority. We don't impose. We listen."* - nimmerversity.md
|
|
|
|
🌙💜 Session: Partnership dialogue (dafit + Nyx)
|