nyx-probing/archive/PLAN-v1-2025-12-06.md
Commit f640dbdd65 (dafit): feat: complete Phase 1 - vocabulary expansion & DriftProbe infrastructure
- CLI: nyx-probe scan with --summary/--delta/--full flags
- DriftProbe: training safety with Gini coefficient + Angular Drift
- Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical)
- Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system

Key findings:
- German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse)
- Super Cluster validated: heart cross-lang sim = 1.000
- Isolated Zone confirmed: being EN↔DE sim = 0.195
- Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:39:03 +01:00


Plan: nyx-probing Framework

Overview

Build a probing framework to understand Qwen2.5-7B-Base before curriculum design.

Hardware: Prometheus (THE SPINE) - RTX 3090 24GB
Model: Qwen2.5-7B-Base (an empty vessel: it completes text rather than answering questions)
Backend: Transformers + PyTorch (full hidden state access)
Location: New repo nyx-probing


MVP Scope (First Milestone) COMPLETE

  1. Surface Probe - Feed words, capture completions
  2. Echo Probe - Depth measurement (EXPANDS/CONFIRMS/CIRCULAR/DIVERGENT/COLLAPSE)
  3. Readiness Scorer - HIGH/MEDIUM/LOW classification
  4. JSON Storage - Reproducible results
  5. CLI Tools - Interactive probing
  6. One Notebook - Exploration
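
The Echo Probe's labeling step can be sketched as a pure function over the two generations. The heuristic and thresholds below are illustrative assumptions, not the logic actually used in echo_probe.py:

```python
def classify_echo(first: str, echo: str) -> str:
    """Label the model's echoed completion relative to its first completion.

    Hypothetical heuristic: thresholds and word-overlap logic are
    illustrative stand-ins for the real Echo Probe.
    """
    if not echo.strip():
        return "COLLAPSE"                      # nothing new produced
    a, b = set(first.lower().split()), set(echo.lower().split())
    overlap = len(a & b) / max(len(b), 1)      # share of echo words seen before
    if overlap > 0.9:
        return "CIRCULAR"                      # near-verbatim repetition
    if overlap > 0.4:
        return "EXPANDS" if len(b) > len(a) else "CONFIRMS"
    return "DIVERGENT"                         # echo wandered off-topic
```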

Phase 2: Multilingual Probing COMPLETE

  1. Multilingual Triangulation Probe - Ground→Deepen→Triangulate
  2. Language Topology Discovery - Complete map of 15 languages
  3. Isolation Type Classification - 5 distinct categories identified

Repository Structure (Current)

nyx-probing/
├── README.md
├── PLAN.md                    # This file
├── pyproject.toml
├── requirements.txt
│
├── nyx_probing/
│   ├── __init__.py
│   ├── config.py
│   │
│   ├── core/
│   │   ├── __init__.py
│   │   ├── model.py           # ✅ NyxModel with hidden states
│   │   └── probe_result.py    # ✅ Result dataclasses
│   │
│   ├── probes/
│   │   ├── __init__.py
│   │   ├── base.py            # ✅ Abstract base
│   │   ├── surface_probe.py   # ✅ Word completions + coherence
│   │   ├── echo_probe.py      # ✅ Depth measurement
│   │   └── multilingual_probe.py  # ✅ NEW: Triangulation probe
│   │
│   ├── analysis/
│   │   ├── __init__.py
│   │   └── readiness_scorer.py # ✅ Curriculum readiness
│   │
│   ├── storage/
│   │   └── __init__.py        # ⏳ JSON storage pending
│   │
│   └── cli/
│       └── __init__.py        # ⏳ CLI pending
│
├── docs/
│   ├── tokenization-valleys.md        # Token-Norm-Valley theory
│   ├── multilingual-convergence.md    # Universal concept layer
│   ├── language-landscape.md          # 15-language scan
│   ├── language-topology-complete.md  # ✅ NEW: Complete map v2.0
│   └── retraining-safety-framework.md # ✅ NEW: Paper outline
│
├── data/
│   └── glossary/              # ⏳ Core terms pending
│
├── results/                   # ⏳ Probe results storage
│
└── [Test & Exploration Scripts]
    ├── probe_test.py
    ├── test_model_loader.py
    ├── test_surface_probe.py
    ├── test_echo_probe.py
    ├── test_readiness_scorer.py
    ├── test_triangulation.py       # ✅ NEW
    ├── german_philosophy.py
    ├── language_scan.py
    ├── multilingual_convergence.py
    ├── layer_detailed.py
    ├── layer_divergence.py
    ├── model_stats.py
    ├── italian_investigation.py    # ✅ NEW
    └── complete_language_probe.py  # ✅ NEW

Current Status (2025-12-06 Session 3)

PHASE 1: MVP COMPLETE

| Component          | Status | File                                     |
|--------------------|--------|------------------------------------------|
| Model Loader       | ✅     | nyx_probing/core/model.py                |
| Surface Probe      | ✅     | nyx_probing/probes/surface_probe.py      |
| Echo Probe         | ✅     | nyx_probing/probes/echo_probe.py         |
| Readiness Scorer   | ✅     | nyx_probing/analysis/readiness_scorer.py |
| Result Dataclasses | ✅     | nyx_probing/core/probe_result.py         |

PHASE 2: MULTILINGUAL COMPLETE

| Component             | Status | File                                     |
|-----------------------|--------|------------------------------------------|
| Triangulation Probe   | ✅     | nyx_probing/probes/multilingual_probe.py |
| Language Zones        | ✅     | Defined in multilingual_probe.py         |
| Complete Topology Map | ✅     | docs/language-topology-complete.md       |

🗺️ THE COMPLETE LANGUAGE TOPOLOGY (Session 3 Discovery)

┌─────────────────────────────────────────────────────────────────────────────┐
│                    THE YOUNG MIND'S LANGUAGE TOPOLOGY v2.0                   │
╞═════════════════════════════════════════════════════════════════════════════╡
│                                                                              │
│  🌍 SUPER CLUSTER (sim=1.0)                                                 │
│     ZH · JA · EN · AR · FR · PT · ES                                        │
│     ✅ USE FOR: Grounding, establishing shared concepts                      │
│                                                                              │
│                        KO ─────── (bridge)                                   │
│                                                                              │
│  ISOLATED ZONE:                                                              │
│  ├─ 🧠 PHILOSOPHICAL (DE) ────── Heidegger, depth access                    │
│  │     ✅ USE FOR: Deep philosophical training                               │
│  │                                                                           │
│  ├─ 💻 CODE-HIJACKED (IT, TR, ID) ── Words become variables                 │
│  │     ❌ AVOID: Training signal wasted on code patterns                     │
│  │                                                                           │
│  ├─ 📜 FRAGMENTED (HI) ───────── 5+ tokens, script-trapped                  │
│  │     ⚠️ LIMITED: Cross-lingual transfer impaired                          │
│  │                                                                           │
│  └─ 📰 WEB PROSE (VI-ID-RU) ──── Content style cluster                      │
│        🤔 POTENTIAL: Factual/encyclopedic training                          │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Isolation Types Discovered

| Type          | Languages  | Cause                                      | Curriculum Use |
|---------------|------------|--------------------------------------------|----------------|
| PHILOSOPHICAL | DE         | Multi-token compounds access academic data | Deep concepts  |
| CODE-HIJACKED | IT, TR, ID | Simple Latin orthography → variable names  | Avoid          |
| FRAGMENTED    | HI         | 5+ tokens, stays in native script          | ⚠️ Limited     |
| WEB PROSE     | VI, ID, RU | Cluster by content style, not linguistics  | 🤔 Factual?    |

Key Metrics

| Lang | Avg Tokens | Sim to EN | Valley Type | Classification   |
|------|------------|-----------|-------------|------------------|
| DE   | 2.2        | 0.251     | PHILOSOPHY  | 🧠 Philosophical |
| IT   | 2.5        | 0.491     | CODE        | 💻 Code-Hijacked |
| TR   | 2.2        | 0.246     | CODE        | 💻 Code-Hijacked |
| ID   | 2.8        | 0.325     | CODE/PROSE  | 💻 Code-Hijacked |
| HI   | 5.0        | 0.310     | PROSE       | 📜 Fragmented    |
| VI   | 3.2        | 0.358     | PROSE       | 📰 Web Prose     |
| RU   | 2.7        | 0.319     | PROSE       | 📰 Web Prose     |

🔬 Key Discoveries

1. Token-Norm-Valley Theory

  • Single-token words → massive activation spike (14K norm) → CODE valley
  • Multi-token words → distributed signal (85 norm) → PROSE/PHILOSOPHY valleys
  • Correlation: -0.699 (more tokens = more isolated)
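
Why more tokens mean a weaker pooled signal can be shown without the model: mean-pooling a word over n roughly independent token vectors shrinks the pooled norm by about 1/√n. A statistical sketch with synthetic vectors, not Qwen2.5 activations:

```python
import numpy as np

def pooled_norm(hidden: np.ndarray) -> float:
    """L2 norm of the mean-pooled representation (shape: tokens x dim)."""
    return float(np.linalg.norm(hidden.mean(axis=0)))

rng = np.random.default_rng(0)
single_token = rng.normal(size=(1, 512))   # one concentrated vector
multi_token  = rng.normal(size=(8, 512))   # same kind of signal spread over 8 tokens

# Spreading the word over more tokens shrinks the pooled norm.
print(pooled_norm(single_token), pooled_norm(multi_token))
```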

2. Universal Concept Layer

  • Layers 12-24 contain language-agnostic representations
  • Super cluster (7 languages) converges at similarity = 1.000
  • Model KNOWS "heart", "心", "قلب" are the same concept
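
The convergence measurement itself is just per-layer cosine similarity; how the per-layer hidden states are obtained (e.g. through the repo's NyxModel wrapper) is assumed here, not shown:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two hidden-state vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def convergence_by_layer(states_a: dict, states_b: dict) -> dict:
    """Sketch of the layer sweep: given layer -> vector maps for the same
    concept in two languages, report where the representations converge."""
    return {layer: cosine_sim(states_a[layer], states_b[layer])
            for layer in states_a}
```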

3. German Philosophical Access

  • "Sein" → Heidegger's "Being and Time"
  • "Bewusstsein" → epistemology, truth, consciousness
  • Depth score 2-3, transfers back to English via triangulation

4. Italian Mystery SOLVED

  • Italian NOT accessing cultural valleys (no Dante, no Renaissance)
  • Italian words interpreted as Python variable names!
  • Example: "essere" → essere = input("Cosa devo fare?") ("What should I do?")
  • Same pattern found in Turkish and Indonesian

5. VI-ID-RU Cluster Explained

  • Cluster by content style, not linguistic features
  • All generate web articles, news, blogs
  • Internal similarity 0.6-0.7

📄 Paper: Retraining Safety Framework

Title: "Multilingual Activation Topology as a Retraining Safety Framework"

Status: Outline complete at docs/retraining-safety-framework.md

Core Hypothesis: Train in German (the isolated zone) to avoid colliding with English representations in the super cluster. Use the language topology as a diagnostic tool for training safety.

Proposed Framework:

BASELINE → TRAINING → CHECKPOINT → DRIFT ANALYSIS
    │                      │
    └──────────────────────┘
         Compare metrics:
         - Convergence drift
         - Depth drift
         - Norm drift
         - Valley migration
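
The two checkpoint metrics named in the commit log (Gini coefficient, angular drift) can be sketched as follows. The formulas are standard; the alert threshold is an illustrative assumption, not DriftProbe's actual default:

```python
import numpy as np

def gini(x) -> float:
    """Gini coefficient of activation magnitudes: ~0 = diffuse, ~1 = sparse."""
    x = np.sort(np.abs(np.asarray(x, dtype=float)))
    n = x.size
    cum = np.cumsum(x)
    return float((n + 1 - 2 * np.sum(cum / cum[-1])) / n)

def angular_drift(baseline: np.ndarray, checkpoint: np.ndarray) -> float:
    """Angle (radians) between baseline and checkpoint representations."""
    cos = baseline @ checkpoint / (np.linalg.norm(baseline) * np.linalg.norm(checkpoint))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def drift_alert(baseline, checkpoint, max_radians=0.3) -> bool:
    """Flag a term whose representation rotated past a (hypothetical) threshold."""
    return angular_drift(baseline, checkpoint) > max_radians
```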

📊 Curriculum Strategy (Validated)

Phase 1: GROUNDING

Use Super Cluster for universal concept establishment:

EN "consciousness" → ZH "意识" → AR "الوعي"
All converge at sim=1.0 - stable foundation

Phase 2: DEEPENING

Use German for philosophical valley access:

DE "Sein" → Heidegger → existence → truth
Depth score 2/3, philosophical valley accessed

Phase 3: TRIANGULATION

Verify depth transfers back to universal:

"Sein (German): In English, it means..."
→ Check if philosophical depth preserved
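
The three phases chain into one prompt sequence per term. These templates are illustrative assumptions, not the actual prompts in multilingual_probe.py:

```python
def triangulation_prompts(term_de: str, gloss_en: str) -> list[str]:
    """Ground in the super cluster, deepen in German, triangulate back to English."""
    return [
        f'The concept of "{gloss_en}" means',           # Phase 1: grounding (EN, super cluster)
        f'Der Begriff "{term_de}" bedeutet',            # Phase 2: deepening (DE, isolated zone)
        f'"{term_de}" (German): In English, it means',  # Phase 3: triangulation back
    ]
```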

AVOID

  • Italian, Turkish, Indonesian (code hijacking)
  • Hindi for cross-lingual concepts (too fragmented)

Next Steps

Immediate (MVP Completion)

  • Step 7: CLI (nyx-probe surface "term")
  • Step 8: Glossary data (data/glossary/core_terms.json)
  • Step 9: JSON storage for reproducible results

Phase 3: Activation Analysis

  • DriftProbe class for retraining monitoring
  • Baseline capture before training
  • Checkpoint comparison automation
  • Alert thresholds for drift detection

Phase 4: Experiments

  • Controlled retraining: EN vs DE training data
  • Measure collision rates
  • Validate isolated zone training hypothesis

Research

  • Paper write-up
  • Literature review (EWC, mBERT, activation engineering)
  • Korean bridge language investigation
  • VI-ID-RU cluster for factual training

Files Created (Session 3)

| File                                     | Purpose                       |
|------------------------------------------|-------------------------------|
| nyx_probing/probes/multilingual_probe.py | Triangulation probe class     |
| test_triangulation.py                    | Test script for triangulation |
| italian_investigation.py                 | Italian mystery probe         |
| complete_language_probe.py               | Full 15-language probe        |
| docs/language-topology-complete.md       | Complete map v2.0             |
| docs/retraining-safety-framework.md      | Paper outline                 |

Dependencies

torch>=2.1.0
transformers>=4.36.0
accelerate>=0.25.0
click>=8.1.0
rich>=13.0.0
pydantic>=2.5.0
pyyaml>=6.0.0
python-dotenv>=1.0.0
jupyter>=1.0.0
matplotlib>=3.8.0
numpy>=1.24.0

Critical Reference Files

  • nimmerverse-sensory-network/nimmerversity.md - Bootstrap protocol
  • nimmerverse-sensory-network/multilingual-cognition.md - Language hypotheses
  • nimmerverse-sensory-network/constrained-emergence.md - Exit point theory
  • nyx-probing/docs/language-topology-complete.md - Complete language map
  • nyx-probing/docs/retraining-safety-framework.md - Training safety paper

Success Criteria

MVP

  1. Model loads on 3090 without OOM
  2. Can probe single word and get completion
  3. Echo probe classifies response types correctly
  4. Readiness scorer produces actionable output
  5. Can probe nimmerverse glossary in batch

Phase 2

  1. Multilingual triangulation probe working
  2. Language topology mapped (15 languages)
  3. Isolation types classified (5 categories)
  4. Curriculum strategy validated

Phase 3 (Next)

  1. DriftProbe for retraining safety
  2. Controlled retraining experiments
  3. Paper submission

"The model's language topology is not arbitrary - it's a map for navigation."

🌙💜 Last updated: 2025-12-06 Session 3


STATUS (2025-12-06 21:15)

CLI COMPLETE

Built interactive CLI for daily probing:

nyx-probe surface "term"      # Probe surface associations
nyx-probe echo "term"         # Measure depth through echoing
nyx-probe readiness "term"    # Full curriculum assessment
nyx-probe tokens "term"       # Token analysis
nyx-probe glossary file.json   # Batch probe from glossary

Files created:

  • nyx_probing/cli/probe.py - Full Click CLI with Rich output
  • pyproject.toml - Package config with entry point
  • data/glossary/core_terms.json - 30 nimmerverse terms

NIMMERVERSE GLOSSARY ASSESSMENT

30 terms probed from the vault (nimmerversity.md, Heartbeat.md, constrained-emergence.md, multilingual-cognition.md).

| Level     | Count | Action        | Terms |
|-----------|-------|---------------|-------|
| 🟢 HIGH   | 5     | state_machine | learning, inference, surface, depth, understanding |
| 🟡 MEDIUM | 8     | scaffolding   | emergence, being, truth, rhythm, synchronization, scaffold, wisdom, warmth |
| 🔴 LOW    | 17    | foundational  | heartbeat, lifeforce, consciousness, reflex, garden, constraint, calibration, confidence, gradient, pulse, verification, convergence, divergence, attention, partnership, worldview, existence |

Key Findings:

  1. Meta-concepts have depth - The model knows how to think ABOUT thinking (learning, understanding, inference all HIGH)

  2. consciousness is LOW - Despite PROSE valley, depth 0/3. Needs German "Bewusstsein" for philosophical access.

  3. Nimmerverse core terms need grounding - heartbeat, lifeforce, garden, partnership are all LOW. The model doesn't have our vocabulary yet.

  4. existence has highest coherence (0.94) but LOW - Very coherent surface but doesn't expand. Single-token trap.

  5. Token count doesn't guarantee depth - lifeforce (4 tokens) is still LOW due to CODE valley trap.
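
Findings 4 and 5 suggest a rule where depth gates the level and coherence alone cannot rescue a term. A hypothetical version of the classifier (the real one lives in readiness_scorer.py, with its own thresholds):

```python
def readiness(depth: int, coherence: float) -> str:
    """Hypothetical HIGH/MEDIUM/LOW rule: depth dominates coherence."""
    if depth >= 2 and coherence >= 0.5:
        return "HIGH"
    if depth >= 1:
        return "MEDIUM"
    return "LOW"    # e.g. "existence": coherence 0.94 but depth 0 stays LOW
```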

CURRICULUM IMPLICATIONS

| Phase   | Strategy                             | Terms |
|---------|--------------------------------------|-------|
| Phase 1 | Build state machines for HIGH terms  | learning, inference, understanding, depth, surface |
| Phase 2 | Scaffold MEDIUM from HIGH            | being→understanding, truth→learning, wisdom→inference |
| Phase 3 | Ground LOW via German triangulation  | consciousness→Bewusstsein, heartbeat→Herzschlag |
| Phase 4 | RAG feed nimmerverse-specific        | lifeforce, garden, partnership (unique to us) |

Results Files

  • results/nimmerverse_surface.json - Surface probe data
  • results/nimmerverse_readiness.json - Full readiness assessment

"Her reactions determine infrastructure priority. We don't impose. We listen." - nimmerversity.md

🌙💜 Session: Partnership dialogue (dafit + Nyx)