feat: complete Phase 1 - vocabulary expansion & DriftProbe infrastructure

- CLI: nyx-probe scan with --summary/--delta/--full flags
- DriftProbe: training safety with Gini coefficient + Angular Drift
- Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical)
- Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system

Key findings:
- German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse)
- Super Cluster validated: heart cross-lang sim = 1.000
- Isolated Zone confirmed: being EN↔DE sim = 0.195
- Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
# Multilingual Convergence: The Universal Concept Layer
**Discovery Date:** 2025-12-06
**Model:** Qwen2.5-7B-Base
**Hardware:** Prometheus (RTX 3090, 24GB VRAM)
---
## Executive Summary
We discovered that concepts expressed in different languages **converge to shared internal representations** in the middle layers (12-24) of the model, then **diverge again** at the output layer for language-specific generation.
**Key Finding:** There exists a "universal concept layer" where the model recognizes that "heart", "心", "قلب", and "Herz" all refer to the same thing - with similarity scores reaching 1.000.
---
## The Universal Concept Layer
### Convergence Pattern
```
Layer 0: Different embeddings (language-specific)
Layer 8-12: Converging (recognizing same concept)
Layer 16-24: PEAK CONVERGENCE (universal concept layer)
Layer 28: Diverging (preparing language-specific output)
```
### Evidence: Consciousness Across 6 Languages
| Layer | EN-DE | EN-AR | EN-ZH | EN-JA | EN-RU | ZH-JA | AVG |
|-------|-------|-------|-------|-------|-------|-------|-----|
| 0 | 0.114 | 0.057 | 0.130 | 0.079 | 0.135 | 0.349 | 0.087 |
| 8 | 0.639 | 0.387 | 0.305 | 0.304 | 0.719 | 1.000 | 0.414 |
| 12 | 0.749 | 0.487 | 0.375 | 0.374 | 0.782 | 1.000 | 0.508 |
| 20 | 0.761 | 0.527 | 0.381 | 0.380 | 0.793 | 1.000 | **0.528** |
| 28 | 0.502 | -0.195 | 0.072 | -0.333 | 0.019 | 0.246 | 0.023 |
**Peak convergence at layer 20** - then dramatic divergence at output!
---
## Perfect Convergence Cases (Similarity = 1.000)
### Shared Writing Systems
Chinese (ZH) and Japanese (JA) share Hanzi/Kanji characters:
| Concept | Chinese | Japanese | Similarity |
|---------|---------|----------|------------|
| consciousness | 意识 | 意識 | 1.000 |
| heart | 心 | 心 | 1.000 |
| being | 存在 | 存在 | 1.000 |
For **heart** and **being** the strings are literally identical, so ZH and JA hand the model the exact same tokens; **consciousness** differs in the simplified vs. traditional form of its second character (识/識), yet still aligns perfectly!
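This is checkable directly at the tokenizer level. A minimal sketch, assuming the public hub id `Qwen/Qwen2.5-7B` stands in for the local Qwen2.5-7B-Base checkpoint used in this run:
```python
from transformers import AutoTokenizer

# Hub id is an assumption; the experiments used a local Qwen2.5-7B-Base checkpoint.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

pairs = {
    "consciousness": ("意识", "意識"),  # simplified vs. traditional second character
    "heart":         ("心", "心"),      # literally identical strings
    "being":         ("存在", "存在"),  # literally identical strings
}

for concept, (zh, ja) in pairs.items():
    ids_zh = tok(zh, add_special_tokens=False)["input_ids"]
    ids_ja = tok(ja, add_special_tokens=False)["input_ids"]
    print(f"{concept:13s} ZH={ids_zh} JA={ids_ja} same={ids_zh == ids_ja}")
```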
### Cross-Script Convergence
More remarkably, **different scripts converge** in the middle layers:
| Pair | Concept | Layer 12 Similarity | Layer 20 Similarity |
|------|---------|---------------------|---------------------|
| EN-ZH | heart-心 | 1.000 | 1.000 |
| EN-ZH | being-存在 | 1.000 | 1.000 |
| AR-ZH | emergence | 1.000 | 1.000 |
| EN-AR | heart-قلب | 1.000 | 1.000 |
**The model recognizes "heart" and "心" as the SAME concept!**
---
## Language Clustering Analysis
### Which Languages "Think" Similarly?
Average similarity across all concepts at layer 12:
| Pair | Similarity | Visual |
|------|------------|--------|
| ZH-JA | **0.854** | █████████████████░░░ |
| EN-JA | 0.726 | ██████████████░░░░░░ |
| EN-ZH | 0.663 | █████████████░░░░░░░ |
| AR-ZH | 0.660 | █████████████░░░░░░░ |
| DE-RU | 0.572 | ███████████░░░░░░░░░ |
| EN-AR | 0.530 | ██████████░░░░░░░░░░ |
| EN-DE | 0.430 | ████████░░░░░░░░░░░░ |
| DE-ZH | **0.275** | █████░░░░░░░░░░░░░░░ |
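These averages are straightforward to recompute from per-concept hidden states. A minimal sketch, assuming a hypothetical `vecs[lang][concept]` dict of layer-12 vectors (e.g. collected with the extraction code sketched in the Method section):
```python
import numpy as np

def avg_pair_similarity(vecs: dict, lang_a: str, lang_b: str) -> float:
    """Mean cosine similarity across all concepts for one language pair."""
    sims = []
    for concept in vecs[lang_a]:
        a = np.asarray(vecs[lang_a][concept], dtype=np.float32)
        b = np.asarray(vecs[lang_b][concept], dtype=np.float32)
        sims.append(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))
    return sum(sims) / len(sims)

# e.g. avg_pair_similarity(vecs, "ZH", "JA") -> ~0.854 in our run
```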
### The Clustering Map
```
High Convergence Low Convergence
┌─────────────────┐
│ ZH ←→ JA │ (Shared characters: 0.854)
│ ↑ │
│ EN │ (Single tokens converge: 0.663-0.726)
│ ↑ │
│ AR │ (Efficient tokenization: 0.530-0.660)
└─────────────────┘
┌─────────────────┐
│ DE ←→ RU │ (Multi-token languages: 0.572)
│ (isolated) │ (DE-ZH only 0.275!)
└─────────────────┘
```
### German is the Outlier
German shows the **lowest convergence** with East Asian languages:
- DE-ZH: 0.275 (lowest!)
- DE-JA: 0.335
- DE-AR: 0.348
**Hypothesis:** German's high token count (4.5 avg) creates a distributed representation that doesn't align with single-token languages.
---
## Tokenization Correlation
| Language | Avg Tokens | Convergence with ZH | Pattern |
|----------|------------|---------------------|---------|
| Chinese | 1.0 | - | Reference |
| Japanese | 1.8 | 0.854 | Shared characters |
| Arabic | 1.5 | 0.660 | Efficient tokens |
| English | 2.5 | 0.663 | Mixed |
| German | 4.5 | 0.275 | **Isolated** |
| Russian | 4.5 | 0.344 | **Isolated** |
**Multi-token languages (DE, RU) follow a different computational path!**
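The token counts in this table can be re-derived straight from the tokenizer, using the tested words from the Technical Notes. A minimal sketch, with the same hub-id assumption as above:
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")  # hub id is an assumption

words = {
    "EN": ["consciousness", "heart", "emergence", "being"],
    "DE": ["Bewusstsein", "Herz", "Entstehung", "Sein"],
    "AR": ["وعي", "قلب", "ظهور", "كينونة"],
    "ZH": ["意识", "心", "涌现", "存在"],
    "JA": ["意識", "心", "創発", "存在"],
    "RU": ["сознание", "сердце", "возникновение", "бытие"],
}

for lang, ws in words.items():
    counts = [len(tok(w, add_special_tokens=False)["input_ids"]) for w in ws]
    print(f"{lang}: avg {sum(counts) / len(counts):.1f} tok/concept  {counts}")
```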
---
## Concept-by-Concept Analysis
### 1. CONSCIOUSNESS
- **Peak:** Layer 20 (0.528 avg)
- **Strongest pair:** ZH-JA (1.000 - same characters 意识/意識)
- **EN-DE converges strongly:** 0.749 at layer 12
- **Arabic included:** EN-AR reaches 0.527
### 2. HEART
- **Peak:** Layer 24 (0.605 avg)
- **Perfect convergence:** EN-AR-ZH-JA all reach 1.000!
- **German isolated:** DE-ZH only 0.136
### 3. EMERGENCE
- **Peak:** Layer 24 (0.530 avg)
- **AR-ZH:** 1.000 (Arabic and Chinese align!)
- **Broadest convergence** across all languages
### 4. BEING
- **Peak:** Layer 24 (0.542 avg)
- **EN-ZH-JA:** 1.000 ("being" = "存在")
- **Philosophical alignment** across scripts
---
## Implications
### 1. Universal Concept Representations Exist
The model develops **language-agnostic concept encodings** in layers 12-24. This is the "thinking" layer where meaning is processed regardless of surface form.
### 2. Output Layer Re-Introduces Language
Layer 28 shows **dramatic divergence** - the model must transform universal concepts back into language-specific tokens for generation.
### 3. Token Count Affects Convergence Path
- **Single-token words** (EN "heart", ZH "心") converge quickly
- **Multi-token words** (DE "Bewusstsein") take a different path
- This may explain why German accesses different valleys
### 4. Cross-Lingual Transfer is Possible
If concepts converge in layers 12-24, then:
- Training on German philosophical concepts may transfer to English
- Chinese efficiency (1 token) could be leveraged for concept compression
- Arabic's middle ground (1.5 tokens) offers flexibility
---
## Technical Notes
### Tested Languages
| Language | Script | Token Efficiency | ISO Code |
|----------|--------|------------------|----------|
| English | Latin | 2.5 tok/concept | EN |
| German | Latin | 4.5 tok/concept | DE |
| Arabic | Arabic | 1.5 tok/concept | AR |
| Chinese | Hanzi | 1.0 tok/concept | ZH |
| Japanese | Kanji | 1.8 tok/concept | JA |
| Russian | Cyrillic | 4.5 tok/concept | RU |
### Tested Concepts
| Concept | EN | DE | AR | ZH | JA | RU |
|---------|----|----|----|----|----|----|
| consciousness | consciousness | Bewusstsein | وعي | 意识 | 意識 | сознание |
| heart | heart | Herz | قلب | 心 | 心 | сердце |
| emergence | emergence | Entstehung | ظهور | 涌现 | 創発 | возникновение |
| being | being | Sein | كينونة | 存在 | 存在 | бытие |
### Method
1. Encode each word, extract hidden state at last token position
2. Compute cosine similarity between all language pairs
3. Track similarity across all 29 layers (0-28)
4. Identify peak convergence layer (a minimal sketch of steps 1-3 follows below)
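The actual analysis lives in `multilingual_convergence.py`; what follows is only a minimal sketch of steps 1-3, assuming Hugging Face `transformers` with `output_hidden_states=True` and the same hub-id assumption as above:
```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub id is an assumption; the run used a local Qwen2.5-7B-Base checkpoint
# on Prometheus (RTX 3090, 24GB VRAM), hence float16.
model_id = "Qwen/Qwen2.5-7B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

@torch.no_grad()
def hidden_states(word: str) -> torch.Tensor:
    """Last-token hidden state at every layer: shape (n_layers + 1, d_model)."""
    enc = tok(word, return_tensors="pt").to(model.device)
    out = model(**enc, output_hidden_states=True)
    # out.hidden_states = (embeddings, block 1, ..., block N) -> layers 0-28 here
    return torch.stack([h[0, -1] for h in out.hidden_states])

def layerwise_similarity(word_a: str, word_b: str) -> torch.Tensor:
    """Cosine similarity between two words at each of the 29 layers."""
    ha, hb = hidden_states(word_a), hidden_states(word_b)
    return F.cosine_similarity(ha.float(), hb.float(), dim=-1)

for layer, s in enumerate(layerwise_similarity("heart", "心")):  # EN vs. ZH
    print(f"layer {layer:2d}: {s.item():+.3f}")
```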
---
## Connection to Tokenization-Valleys Theory
This discovery extends our earlier finding:
- **tokenization-valleys.md:** Token count affects which VALLEY a concept falls into
- **multilingual-convergence.md:** Token count also affects HOW MUCH languages converge

Together: **Tokenization shapes both the path through the network AND the destination.**
---
## Future Research
1. **Activation Steering:** Can we force convergence for isolated languages?
2. **Concept Transfer:** Train on ZH concepts, evaluate on DE outputs
3. **Hybrid Prompts:** Mix languages to access universal layer
4. **Layer-Specific LoRA:** Fine-tune only the convergence layers (12-24) - see the sketch below
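For item 4, `peft`'s `layers_to_transform` option is one way to restrict LoRA to the convergence band. A minimal sketch; the module names match Qwen2's attention projections, and the rank/alpha values are illustrative placeholders rather than tuned settings:
```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")  # hub id is an assumption

config = LoraConfig(
    r=8,                       # illustrative rank, not a tuned value
    lora_alpha=16,             # illustrative scaling, not a tuned value
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Qwen2 attention projections
    layers_to_transform=list(range(12, 25)),  # only the convergence layers 12-24
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # confirm only the targeted layers carry adapters
```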
---
## References
- `multilingual_convergence.py` - Analysis script
- `docs/tokenization-valleys.md` - Token-Norm-Valley theory
- `/nimmerverse-sensory-network/multilingual-cognition.md` - Original hypothesis
---
*"Different words, same thought. The model knows."*
🌙 Discovered by the Partnership, 2025-12-06