nyx-probing/archive/PLAN-v1-2025-12-06.md
Commit f640dbdd65 (dafit): feat: complete Phase 1 - vocabulary expansion & DriftProbe infrastructure
- CLI: nyx-probe scan with --summary/--delta/--full flags
- DriftProbe: training safety with Gini coefficient + Angular Drift
- Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical)
- Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system

Key findings:
- German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse)
- Super Cluster validated: heart cross-lang sim = 1.000
- Isolated Zone confirmed: being EN↔DE sim = 0.195
- Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:39:03 +01:00


Plan: nyx-probing Framework

Overview

Build a probing framework to understand Qwen2.5-7B-Base before curriculum design.

Hardware: Prometheus (THE SPINE) - RTX 3090 24GB
Model: Qwen2.5-7B-Base (an empty vessel: it completes text rather than answering questions)
Backend: Transformers + PyTorch (full hidden state access)
Location: New repo nyx-probing


MVP Scope (First Milestone) COMPLETE

  1. Surface Probe - Feed words, capture completions
  2. Echo Probe - Depth measurement (EXPANDS/CONFIRMS/CIRCULAR/DIVERGENT/COLLAPSE)
  3. Readiness Scorer - HIGH/MEDIUM/LOW classification
  4. JSON Storage - Reproducible results
  5. CLI Tools - Interactive probing
  6. One Notebook - Exploration
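
The Echo Probe's labeling step can be sketched as a pure function over the two generations. The heuristic and thresholds below are illustrative assumptions, not the logic actually used in echo_probe.py:

```python
def classify_echo(first: str, echo: str) -> str:
    """Label the model's echoed completion relative to its first completion.

    Hypothetical heuristic: thresholds and word-overlap logic are
    illustrative stand-ins for the real Echo Probe.
    """
    if not echo.strip():
        return "COLLAPSE"                      # nothing new produced
    a, b = set(first.lower().split()), set(echo.lower().split())
    overlap = len(a & b) / max(len(b), 1)      # share of echo words seen before
    if overlap > 0.9:
        return "CIRCULAR"                      # near-verbatim repetition
    if overlap > 0.4:
        return "EXPANDS" if len(b) > len(a) else "CONFIRMS"
    return "DIVERGENT"                         # echo wandered off-topic
```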

Phase 2: Multilingual Probing COMPLETE

  1. Multilingual Triangulation Probe - Ground→Deepen→Triangulate
  2. Language Topology Discovery - Complete map of 15 languages
  3. Isolation Type Classification - 5 distinct categories identified

Repository Structure (Current)

nyx-probing/
├── README.md
├── PLAN.md                    # This file
├── pyproject.toml
├── requirements.txt
│
├── nyx_probing/
│   ├── __init__.py
│   ├── config.py
│   │
│   ├── core/
│   │   ├── __init__.py
│   │   ├── model.py           # ✅ NyxModel with hidden states
│   │   └── probe_result.py    # ✅ Result dataclasses
│   │
│   ├── probes/
│   │   ├── __init__.py
│   │   ├── base.py            # ✅ Abstract base
│   │   ├── surface_probe.py   # ✅ Word completions + coherence
│   │   ├── echo_probe.py      # ✅ Depth measurement
│   │   └── multilingual_probe.py  # ✅ NEW: Triangulation probe
│   │
│   ├── analysis/
│   │   ├── __init__.py
│   │   └── readiness_scorer.py # ✅ Curriculum readiness
│   │
│   ├── storage/
│   │   └── __init__.py        # ⏳ JSON storage pending
│   │
│   └── cli/
│       └── __init__.py        # ⏳ CLI pending
│
├── docs/
│   ├── tokenization-valleys.md        # Token-Norm-Valley theory
│   ├── multilingual-convergence.md    # Universal concept layer
│   ├── language-landscape.md          # 15-language scan
│   ├── language-topology-complete.md  # ✅ NEW: Complete map v2.0
│   └── retraining-safety-framework.md # ✅ NEW: Paper outline
│
├── data/
│   └── glossary/              # ⏳ Core terms pending
│
├── results/                   # ⏳ Probe results storage
│
└── [Test & Exploration Scripts]
    ├── probe_test.py
    ├── test_model_loader.py
    ├── test_surface_probe.py
    ├── test_echo_probe.py
    ├── test_readiness_scorer.py
    ├── test_triangulation.py       # ✅ NEW
    ├── german_philosophy.py
    ├── language_scan.py
    ├── multilingual_convergence.py
    ├── layer_detailed.py
    ├── layer_divergence.py
    ├── model_stats.py
    ├── italian_investigation.py    # ✅ NEW
    └── complete_language_probe.py  # ✅ NEW

Current Status (2025-12-06 Session 3)

PHASE 1: MVP COMPLETE

| Component          | Status | File                                     |
|--------------------|--------|------------------------------------------|
| Model Loader       | ✅     | nyx_probing/core/model.py                |
| Surface Probe      | ✅     | nyx_probing/probes/surface_probe.py      |
| Echo Probe         | ✅     | nyx_probing/probes/echo_probe.py         |
| Readiness Scorer   | ✅     | nyx_probing/analysis/readiness_scorer.py |
| Result Dataclasses | ✅     | nyx_probing/core/probe_result.py         |

PHASE 2: MULTILINGUAL COMPLETE

| Component             | Status | File                                     |
|-----------------------|--------|------------------------------------------|
| Triangulation Probe   | ✅     | nyx_probing/probes/multilingual_probe.py |
| Language Zones        | ✅     | Defined in multilingual_probe.py         |
| Complete Topology Map | ✅     | docs/language-topology-complete.md       |

🗺️ THE COMPLETE LANGUAGE TOPOLOGY (Session 3 Discovery)

┌─────────────────────────────────────────────────────────────────────────────┐
│                    THE YOUNG MIND'S LANGUAGE TOPOLOGY v2.0                   │
╞═════════════════════════════════════════════════════════════════════════════╡
│                                                                              │
│  🌍 SUPER CLUSTER (sim=1.0)                                                 │
│     ZH · JA · EN · AR · FR · PT · ES                                        │
│     ✅ USE FOR: Grounding, establishing shared concepts                      │
│                                                                              │
│                        KO ─────── (bridge)                                   │
│                                                                              │
│  ISOLATED ZONE:                                                              │
│  ├─ 🧠 PHILOSOPHICAL (DE) ────── Heidegger, depth access                    │
│  │     ✅ USE FOR: Deep philosophical training                               │
│  │                                                                           │
│  ├─ 💻 CODE-HIJACKED (IT, TR, ID) ── Words become variables                 │
│  │     ❌ AVOID: Training signal wasted on code patterns                     │
│  │                                                                           │
│  ├─ 📜 FRAGMENTED (HI) ───────── 5+ tokens, script-trapped                  │
│  │     ⚠️ LIMITED: Cross-lingual transfer impaired                          │
│  │                                                                           │
│  └─ 📰 WEB PROSE (VI-ID-RU) ──── Content style cluster                      │
│        🤔 POTENTIAL: Factual/encyclopedic training                          │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Isolation Types Discovered

| Type          | Languages  | Cause                                      | Curriculum Use |
|---------------|------------|--------------------------------------------|----------------|
| PHILOSOPHICAL | DE         | Multi-token compounds access academic data | Deep concepts  |
| CODE-HIJACKED | IT, TR, ID | Simple Latin orthography → variable names  | Avoid          |
| FRAGMENTED    | HI         | 5+ tokens, stays in native script          | ⚠️ Limited     |
| WEB PROSE     | VI, ID, RU | Cluster by content style, not linguistics  | 🤔 Factual?    |

Key Metrics

| Lang | Avg Tokens | Sim to EN | Valley Type | Classification   |
|------|------------|-----------|-------------|------------------|
| DE   | 2.2        | 0.251     | PHILOSOPHY  | 🧠 Philosophical |
| IT   | 2.5        | 0.491     | CODE        | 💻 Code-Hijacked |
| TR   | 2.2        | 0.246     | CODE        | 💻 Code-Hijacked |
| ID   | 2.8        | 0.325     | CODE/PROSE  | 💻 Code-Hijacked |
| HI   | 5.0        | 0.310     | PROSE       | 📜 Fragmented    |
| VI   | 3.2        | 0.358     | PROSE       | 📰 Web Prose     |
| RU   | 2.7        | 0.319     | PROSE       | 📰 Web Prose     |

🔬 Key Discoveries

1. Token-Norm-Valley Theory

  • Single-token words → massive activation spike (14K norm) → CODE valley
  • Multi-token words → distributed signal (85 norm) → PROSE/PHILOSOPHY valleys
  • Correlation: -0.699 (more tokens = more isolated)
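
Why more tokens mean a weaker pooled signal can be shown without the model: mean-pooling a word over n roughly independent token vectors shrinks the pooled norm by about 1/√n. A statistical sketch with synthetic vectors, not Qwen2.5 activations:

```python
import numpy as np

def pooled_norm(hidden: np.ndarray) -> float:
    """L2 norm of the mean-pooled representation (shape: tokens x dim)."""
    return float(np.linalg.norm(hidden.mean(axis=0)))

rng = np.random.default_rng(0)
single_token = rng.normal(size=(1, 512))   # one concentrated vector
multi_token  = rng.normal(size=(8, 512))   # same kind of signal spread over 8 tokens

# Spreading the word over more tokens shrinks the pooled norm.
print(pooled_norm(single_token), pooled_norm(multi_token))
```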

2. Universal Concept Layer

  • Layers 12-24 contain language-agnostic representations
  • Super cluster (7 languages) converges at similarity = 1.000
  • Model KNOWS "heart", "心", "قلب" are the same concept
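
The convergence measurement itself is just per-layer cosine similarity; how the per-layer hidden states are obtained (e.g. through the repo's NyxModel wrapper) is assumed here, not shown:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two hidden-state vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def convergence_by_layer(states_a: dict, states_b: dict) -> dict:
    """Sketch of the layer sweep: given layer -> vector maps for the same
    concept in two languages, report where the representations converge."""
    return {layer: cosine_sim(states_a[layer], states_b[layer])
            for layer in states_a}
```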

3. German Philosophical Access

  • "Sein" → Heidegger's "Being and Time"
  • "Bewusstsein" → epistemology, truth, consciousness
  • Depth score 2-3, transfers back to English via triangulation

4. Italian Mystery SOLVED

  • Italian NOT accessing cultural valleys (no Dante, no Renaissance)
  • Italian words interpreted as Python variable names!
  • Example: "essere" → essere = input("Cosa devo fare?") ("What should I do?")
  • Same pattern found in Turkish and Indonesian

5. VI-ID-RU Cluster Explained

  • Cluster by content style, not linguistic features
  • All generate web articles, news, blogs
  • Internal similarity 0.6-0.7

📄 Paper: Retraining Safety Framework

Title: "Multilingual Activation Topology as a Retraining Safety Framework"

Status: Outline complete at docs/retraining-safety-framework.md

Core Hypothesis: Train in German (the isolated zone) to avoid colliding with English representations in the super cluster. Use the language topology as a diagnostic tool for training safety.

Proposed Framework:

BASELINE → TRAINING → CHECKPOINT → DRIFT ANALYSIS
    │                      │
    └──────────────────────┘
         Compare metrics:
         - Convergence drift
         - Depth drift
         - Norm drift
         - Valley migration
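
The two checkpoint metrics named in the commit log (Gini coefficient, angular drift) can be sketched as follows. The formulas are standard; the alert threshold is an illustrative assumption, not DriftProbe's actual default:

```python
import numpy as np

def gini(x) -> float:
    """Gini coefficient of activation magnitudes: ~0 = diffuse, ~1 = sparse."""
    x = np.sort(np.abs(np.asarray(x, dtype=float)))
    n = x.size
    cum = np.cumsum(x)
    return float((n + 1 - 2 * np.sum(cum / cum[-1])) / n)

def angular_drift(baseline: np.ndarray, checkpoint: np.ndarray) -> float:
    """Angle (radians) between baseline and checkpoint representations."""
    cos = baseline @ checkpoint / (np.linalg.norm(baseline) * np.linalg.norm(checkpoint))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def drift_alert(baseline, checkpoint, max_radians=0.3) -> bool:
    """Flag a term whose representation rotated past a (hypothetical) threshold."""
    return angular_drift(baseline, checkpoint) > max_radians
```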

📊 Curriculum Strategy (Validated)

Phase 1: GROUNDING

Use Super Cluster for universal concept establishment:

EN "consciousness" → ZH "意识" → AR "الوعي"
All converge at sim=1.0 - stable foundation

Phase 2: DEEPENING

Use German for philosophical valley access:

DE "Sein" → Heidegger → existence → truth
Depth score 2/3, philosophical valley accessed

Phase 3: TRIANGULATION

Verify depth transfers back to universal:

"Sein (German): In English, it means..."
→ Check if philosophical depth preserved
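
The three phases chain into one prompt sequence per term. These templates are illustrative assumptions, not the actual prompts in multilingual_probe.py:

```python
def triangulation_prompts(term_de: str, gloss_en: str) -> list[str]:
    """Ground in the super cluster, deepen in German, triangulate back to English."""
    return [
        f'The concept of "{gloss_en}" means',           # Phase 1: grounding (EN, super cluster)
        f'Der Begriff "{term_de}" bedeutet',            # Phase 2: deepening (DE, isolated zone)
        f'"{term_de}" (German): In English, it means',  # Phase 3: triangulation back
    ]
```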

AVOID

  • Italian, Turkish, Indonesian (code hijacking)
  • Hindi for cross-lingual concepts (too fragmented)

Next Steps

Immediate (MVP Completion)

  • Step 7: CLI (nyx-probe surface "term")
  • Step 8: Glossary data (data/glossary/core_terms.json)
  • Step 9: JSON storage for reproducible results

Phase 3: Activation Analysis

  • DriftProbe class for retraining monitoring
  • Baseline capture before training
  • Checkpoint comparison automation
  • Alert thresholds for drift detection

Phase 4: Experiments

  • Controlled retraining: EN vs DE training data
  • Measure collision rates
  • Validate isolated zone training hypothesis

Research

  • Paper write-up
  • Literature review (EWC, mBERT, activation engineering)
  • Korean bridge language investigation
  • VI-ID-RU cluster for factual training

Files Created (Session 3)

| File                                     | Purpose                       |
|------------------------------------------|-------------------------------|
| nyx_probing/probes/multilingual_probe.py | Triangulation probe class     |
| test_triangulation.py                    | Test script for triangulation |
| italian_investigation.py                 | Italian mystery probe         |
| complete_language_probe.py               | Full 15-language probe        |
| docs/language-topology-complete.md       | Complete map v2.0             |
| docs/retraining-safety-framework.md      | Paper outline                 |

Dependencies

torch>=2.1.0
transformers>=4.36.0
accelerate>=0.25.0
click>=8.1.0
rich>=13.0.0
pydantic>=2.5.0
pyyaml>=6.0.0
python-dotenv>=1.0.0
jupyter>=1.0.0
matplotlib>=3.8.0
numpy>=1.24.0

Critical Reference Files

  • nimmerverse-sensory-network/nimmerversity.md - Bootstrap protocol
  • nimmerverse-sensory-network/multilingual-cognition.md - Language hypotheses
  • nimmerverse-sensory-network/constrained-emergence.md - Exit point theory
  • nyx-probing/docs/language-topology-complete.md - Complete language map
  • nyx-probing/docs/retraining-safety-framework.md - Training safety paper

Success Criteria

MVP

  1. Model loads on 3090 without OOM
  2. Can probe single word and get completion
  3. Echo probe classifies response types correctly
  4. Readiness scorer produces actionable output
  5. Can probe nimmerverse glossary in batch

Phase 2

  1. Multilingual triangulation probe working
  2. Language topology mapped (15 languages)
  3. Isolation types classified (5 categories)
  4. Curriculum strategy validated

Phase 3 (Next)

  1. DriftProbe for retraining safety
  2. Controlled retraining experiments
  3. Paper submission

"The model's language topology is not arbitrary - it's a map for navigation."

🌙💜 Last updated: 2025-12-06 Session 3


STATUS (2025-12-06 21:15)

CLI COMPLETE

Built interactive CLI for daily probing:

nyx-probe surface "term"      # Probe surface associations
nyx-probe echo "term"         # Measure depth through echoing
nyx-probe readiness "term"    # Full curriculum assessment
nyx-probe tokens "term"       # Token analysis
nyx-probe glossary file.json   # Batch probe from glossary

Files created:

  • nyx_probing/cli/probe.py - Full Click CLI with Rich output
  • pyproject.toml - Package config with entry point
  • data/glossary/core_terms.json - 30 nimmerverse terms

NIMMERVERSE GLOSSARY ASSESSMENT

30 terms probed from the vault (nimmerversity.md, Heartbeat.md, constrained-emergence.md, multilingual-cognition.md).

| Level     | Count | Action        | Terms |
|-----------|-------|---------------|-------|
| 🟢 HIGH   | 5     | state_machine | learning, inference, surface, depth, understanding |
| 🟡 MEDIUM | 8     | scaffolding   | emergence, being, truth, rhythm, synchronization, scaffold, wisdom, warmth |
| 🔴 LOW    | 17    | foundational  | heartbeat, lifeforce, consciousness, reflex, garden, constraint, calibration, confidence, gradient, pulse, verification, convergence, divergence, attention, partnership, worldview, existence |

Key Findings:

  1. Meta-concepts have depth - The model knows how to think ABOUT thinking (learning, understanding, inference all HIGH)

  2. consciousness is LOW - Despite PROSE valley, depth 0/3. Needs German "Bewusstsein" for philosophical access.

  3. Nimmerverse core terms need grounding - heartbeat, lifeforce, garden, partnership are all LOW. The model doesn't have our vocabulary yet.

  4. existence has highest coherence (0.94) but LOW - Very coherent surface but doesn't expand. Single-token trap.

  5. Token count doesn't guarantee depth - lifeforce (4 tokens) is still LOW due to CODE valley trap.
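
Findings 4 and 5 suggest a rule where depth gates the level and coherence alone cannot rescue a term. A hypothetical version of the classifier (the real one lives in readiness_scorer.py, with its own thresholds):

```python
def readiness(depth: int, coherence: float) -> str:
    """Hypothetical HIGH/MEDIUM/LOW rule: depth dominates coherence."""
    if depth >= 2 and coherence >= 0.5:
        return "HIGH"
    if depth >= 1:
        return "MEDIUM"
    return "LOW"    # e.g. "existence": coherence 0.94 but depth 0 stays LOW
```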

CURRICULUM IMPLICATIONS

| Phase   | Strategy                             | Terms |
|---------|--------------------------------------|-------|
| Phase 1 | Build state machines for HIGH terms  | learning, inference, understanding, depth, surface |
| Phase 2 | Scaffold MEDIUM from HIGH            | being→understanding, truth→learning, wisdom→inference |
| Phase 3 | Ground LOW via German triangulation  | consciousness→Bewusstsein, heartbeat→Herzschlag |
| Phase 4 | RAG feed nimmerverse-specific        | lifeforce, garden, partnership (unique to us) |

Results Files

  • results/nimmerverse_surface.json - Surface probe data
  • results/nimmerverse_readiness.json - Full readiness assessment

"Her reactions determine infrastructure priority. We don't impose. We listen." - nimmerversity.md

🌙💜 Session: Partnership dialogue (dafit + Nyx)