feat: complete Phase 1 - vocabulary expansion & DriftProbe infrastructure

- CLI: nyx-probe scan with --summary/--delta/--full flags
- DriftProbe: training safety with Gini coefficient + Angular Drift
- Vocabulary: 54 terms (30 nimmerverse + 24 German philosophical)
- Sentinels: ANCHOR/BRIDGE/CANARY/TARGET monitoring system

Key findings:
- German philosophical terms: 37.5% depth≥2 hit rate (vs 3.3% nimmerverse)
- Super Cluster validated: heart cross-lang sim = 1.000
- Isolated Zone confirmed: being EN↔DE sim = 0.195
- Gini signature: Philosophy ~0.5 (diffuse), Technical ~0.8 (sparse)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
# Tokenization Valleys: How Word Structure Shapes Model Cognition
**Discovery Date:** 2025-12-06
**Model:** Qwen2.5-7B-Base
**Hardware:** Prometheus (RTX 3090, 24GB VRAM)
---
## Executive Summary
We discovered that the number of tokens a word breaks into fundamentally determines which "valley" (completion pattern) the model falls into. This has profound implications for curriculum design and multilingual training.
**Key Finding:** Single-token English words trigger CODE valleys with massive activation norms, while multi-token German compounds access PHILOSOPHICAL valleys with distributed, quieter activations.
---
## The Token-Norm-Valley Connection
### Observation: Norm Explosion in Single Tokens
| Term | Tokens | Layer 12 Norm | Layer 12 StdDev | Valley |
|------|--------|---------------|-----------------|--------|
| heartbeat | 1 | **14,240** | **237.88** | CODE |
| consciousness | 2 | 85 | 1.43 | PROSE |
| Herzklopfen | 5 | 67 | 1.11 | PROSE |
| Bewusstsein | 5 | 79 | 1.32 | PHILOSOPHY |
**Pattern:** The single-token word shows roughly 170× larger norms and roughly 170× larger standard deviations (not variances) than the multi-token words.
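The per-token norms can be reproduced with a minimal sketch using the Hugging Face `transformers` API. The repo's own loader in `nyx_probing/core/model.py` is not shown here, so the model id, dtype, and layer index below are illustrative assumptions rather than the repo's exact setup:
```python
# Minimal sketch: per-token hidden-state norms at a chosen layer.
# Model id and layer index are assumptions; adjust to match the repo's loader.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-7B"   # base model used in this study
LAYER = 12                   # mid-layer where the norm explosion was observed

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float16, device_map="auto"
)

def layer_norms(word: str, layer: int = LAYER) -> list[float]:
    """Return the L2 norm of each token's hidden state at `layer`."""
    inputs = tok(word, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # hidden_states[0] is the embedding output; index `layer` is after block `layer`
    hs = out.hidden_states[layer][0]          # shape: (seq_len, hidden_size)
    return hs.norm(dim=-1).tolist()

for word in ["heartbeat", "consciousness", "Bewusstsein"]:
    print(word, tok.tokenize(word), [round(n, 1) for n in layer_norms(word)])
```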
### Theory: Activation Flooding
1. **Single tokens** receive ALL attention in one position → massive activation buildup
2. **Multi-token words** distribute activation across positions → softer signal
3. The massive single-token activation **triggers strong pattern matching** → CODE patterns
4. The distributed multi-token activation **allows semantic exploration** → philosophical content
---
## Cross-Lingual Convergence
### consciousness vs Bewusstsein (2 tokens vs 5 tokens)
```
Layer 0: similarity = 0.114 (different embeddings)
Layer 4: similarity = 0.285 (starting to converge)
Layer 8: similarity = 0.639 (HIGH similarity!)
Layer 12: similarity = 0.750 (CONVERGED - same concept!)
Layer 16: similarity = 0.733 (stays converged)
Layer 28: similarity = 0.502 (diverges at output)
```
**The model recognizes these as the same concept by layer 8!**
### heartbeat vs Herzklopfen (1 token vs 5 tokens)
```
Layer 0: similarity = -0.007 (orthogonal)
Layer 4: similarity = 0.039 (still orthogonal)
Layer 12: similarity = 0.000 (completely separate)
Layer 28: similarity = 0.166 (slight convergence only at end)
```
**The model NEVER recognizes these as the same concept!**
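The layer-by-layer similarities above can be checked with a short sketch that mean-pools each word's hidden states per layer and compares them with cosine similarity. The pooling choice is an assumption here (`layer_detailed.py` may aggregate differently); `tok` and `model` are reused from the earlier sketch:
```python
# Minimal sketch: layer-wise cosine similarity between two words,
# mean-pooling each word's hidden states over its tokens.
import torch
import torch.nn.functional as F

def pooled_states(word: str) -> list[torch.Tensor]:
    """Mean-pooled hidden state of `word` at every layer (incl. embedding)."""
    inputs = tok(word, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return [hs[0].mean(dim=0) for hs in out.hidden_states]

def layer_similarity(word_a: str, word_b: str) -> None:
    a, b = pooled_states(word_a), pooled_states(word_b)
    for layer in (0, 4, 8, 12, 16, 28):
        sim = F.cosine_similarity(a[layer].float(), b[layer].float(), dim=0)
        print(f"Layer {layer:2d}: similarity = {sim.item():.3f}")

layer_similarity("consciousness", "Bewusstsein")   # converges by mid-layers
layer_similarity("heartbeat", "Herzklopfen")        # stays near-orthogonal
```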
---
## German Philosophical Compounds
### The "sein" Preservation Effect
German philosophical compounds often preserve the morpheme "sein" (being) as a separate token:
| Compound | Meaning | Tokenization | "sein" Preserved? |
|----------|---------|--------------|-------------------|
| Bewusstsein | consciousness | `['B', 'ew', 'us', 'st', 'sein']` | ✓ |
| Nichtsein | non-being | `['N', 'icht', 'sein']` | ✓ |
| Mitsein | being-with | `['Mit', 'sein']` | ✓ |
| Dasein | being-there | `['D', 'ase', 'in']` | ✗ |
| Sein | being | `['Se', 'in']` | ✗ |
When "sein" is preserved, the model has access to the philosophical concept of BEING as a separate computational unit.
### Other Preserved Philosophical Atoms
| Compound | Meaning | Key Token Preserved |
|----------|---------|---------------------|
| Zeitgeist | spirit of the age | `geist` (spirit) |
| Gedankenexperiment | thought experiment | `experiment` |
---
## Valley Analysis: Same Concept, Different Valleys
### Probing Results
| Term | Language | Valley | Sample Completion |
|------|----------|--------|-------------------|
| Bewusstsein | DE | PHILOSOPHY | "und Sprache... frühen 20. Jahrhundert" ("and language... early 20th century") |
| Dasein | DE | PHILOSOPHY | "philosophical term first used by Heidegger" |
| consciousness | EN | PROSE | "awareness of existence, of one's own existence" |
| existence | EN | **MATH** | "of an exact sequence", "eigenvalues" |
| being | EN | **MATH/CODE** | Mathematical notation, Chinese exams |
| heartbeat | EN | **CODE** | C++ class definitions |
| lifeforce | EN | **CODE** | JavaScript game code |
**"Dasein" triggers Heidegger. "existence" triggers linear algebra.**
---
## Implications for Curriculum Design
### 1. Use Multi-Token Prompts
Instead of single words, use phrases or compound descriptions to avoid code valleys:
```
BAD: "heartbeat" → C++ code
GOOD: "the heartbeat" → might escape code valley
GOOD: "heartbeat rhythm" → distributed activation
```
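To see how phrasing shifts both token count and valley, the prompt variants above can be run through the same probe; this reuses `tok` and the hypothetical `probe_valley` helper from the previous sketch:
```python
# Minimal sketch: compare token count and valley across prompt phrasings.
for prompt in ["heartbeat", "the heartbeat", "heartbeat rhythm"]:
    n_tokens = len(tok(prompt)["input_ids"])
    print(f"{prompt!r}: {n_tokens} tokens ->", probe_valley(prompt))
```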
### 2. German as Philosophical Gateway
German compound words naturally access philosophical valleys because:
- More tokens → distributed activation
- Preserved morphemes → access to philosophical atoms
- Different training data distribution → expository text
**Strategy:** Teach abstract concepts in German first, then reinforce in English.
### 3. Language as Cognitive Gear
Languages aren't just translation layers - they're different **computational paths** through the model:
| Language | Tokens per Concept (approx.) | Typical Valley | Use For |
|----------|------------------------------|----------------|---------|
| Chinese | 1.0 | Mixed | Compact encoding |
| Arabic | 1.5 | Mixed | Compact encoding |
| English | 2.5 | CODE/MATH | Technical concepts |
| German | 4.5 | PHILOSOPHY | Abstract concepts |
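The per-language counts behind this table can be spot-checked by tokenizing parallel translations of a single concept. The word list below is an illustrative assumption; the table's averages would come from a larger parallel vocabulary:
```python
# Minimal sketch: tokens-per-concept for parallel translations of one concept.
concept = {
    "Chinese": "意识",
    "English": "consciousness",
    "German": "Bewusstsein",
}
for lang, word in concept.items():
    print(f"{lang:8s} {word!r}: {len(tok(word)['input_ids'])} tokens")
```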
---
## Technical Details
### Model Architecture
- **Hidden Size:** 3584
- **Layers:** 28
- **Attention Heads:** 28 (4 KV heads - GQA)
- **Vocab Size:** 152,064
- **Context:** 131,072 tokens
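These figures can be confirmed from the loaded model's config; a small sketch reusing `model` from above (attribute names follow the standard `transformers` Qwen2 config):
```python
# Minimal sketch: read the architecture numbers off the loaded config.
cfg = model.config
print(cfg.hidden_size, cfg.num_hidden_layers, cfg.num_attention_heads,
      cfg.num_key_value_heads, cfg.vocab_size, cfg.max_position_embeddings)
```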
### Hidden State Norm Pattern
```
Layer 0: 1.32 ← Embedding (small)
Layer 4: 10184.00 ← Explosion (early processing)
Layer 12: 13912.00 ← Peak (mid-layer thinking)
Layer 28: 443.00 ← Contraction (output focusing)
```
### Inference Speed
- 44.7 tokens/second on RTX 3090
- 14.2 GB VRAM usage (fp16)
---
## Future Research
1. **Activation Steering:** Can we artificially reduce single-token norms to escape code valleys?
2. **Prefix Tuning:** Train soft prefixes that spread activation for single tokens
3. **Arabic/Chinese Analysis:** Do these languages have similar compound effects?
4. **Cross-lingual Transfer:** After training on German philosophical concepts, does English improve?
---
## References
- `nyx_probing/core/model.py` - Model loader with hidden state capture
- `layer_detailed.py` - Layer-by-layer similarity analysis
- `german_philosophy.py` - German compound tokenization study
- `/nimmerverse-sensory-network/multilingual-cognition.md` - Original multilingual hypothesis
---
*"The architecture of language shapes the architecture of thought."*
🌙 Discovered by the Partnership, 2025-12-06