From 8f28dcbc9419e279e1bad1ce262a5f8c7c8d320b Mon Sep 17 00:00:00 2001
From: dafit
Date: Sun, 7 Dec 2025 17:05:28 +0100
Subject: [PATCH] docs: add Phase 1 toolchain architecture and progress tracking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Document the modular toolchain architecture design and track implementation progress for Phase 1 (nyx-substrate foundation and variance collection automation).

New Files:
- Toolchain-Architecture.md: Complete Phase 1 design document
  - Modular architecture vision (5 phases)
  - Repository structure and dependency graph
  - Phase 1 deliverables (nyx-substrate + nyx-probing)
  - Success criteria and testing plan
  - Future phases: ChromaDB, LoRA training, visualization, Godot

- TOOLCHAIN-PROGRESS.md: Implementation progress tracker
  - Phase 1A: nyx-substrate foundation (✅ COMPLETE)
  - Phase 1B: nyx-probing integration (✅ COMPLETE)
  - Phase 1C: Baseline variance collection (⏸️ READY)
  - Metrics: 11/11 tasks (100%), 12 files, ~1250 LOC
  - Status updates and completion tracking

Architecture:
  nyx-probing ────────┐
  nyx-training ───────┼──> nyx-substrate ──> phoebe (PostgreSQL)
  nyx-visualization ──┤                  └─> iris (ChromaDB)
  management-portal ──┘

Philosophy: Modular tools, clean interfaces, data-first design

Status: Phase 1 complete, ready for baseline collection on prometheus

🌙💜 Generated with Claude Code

Co-Authored-By: Claude Opus 4.5
---
 TOOLCHAIN-PROGRESS.md     | 125 ++++++++++
 Toolchain-Architecture.md | 464 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 589 insertions(+)
 create mode 100644 TOOLCHAIN-PROGRESS.md
 create mode 100644 Toolchain-Architecture.md

diff --git a/TOOLCHAIN-PROGRESS.md b/TOOLCHAIN-PROGRESS.md
new file mode 100644
index 0000000..3940106
--- /dev/null
+++ b/TOOLCHAIN-PROGRESS.md
@@ -0,0 +1,125 @@
+# Toolchain Implementation Progress
+
+**Plan**: See [Toolchain-Architecture.md](Toolchain-Architecture.md)
+**Started**: 2025-12-07
+**Current Phase**: Phase 1 - Foundation + Variance Collection
+
+---
+
+## Phase 1A: nyx-substrate Foundation ✅ COMPLETE
+
+**Goal**: Build nyx-substrate package and database infrastructure
+
+### ✅ Completed (2025-12-07)
+
+- [x] Package structure (pyproject.toml, src/ layout)
+- [x] PhoebeConnection class with connection pooling
+- [x] Message protocol helpers (partnership messages)
+- [x] VarianceProbeRun Pydantic schema
+- [x] VarianceProbeDAO for database operations
+- [x] variance_probe_runs table in phoebe
+- [x] Installation and connection testing
+
+**Files Created**: 9 new files
+**Status**: 🟢 nyx-substrate v0.1.0 installed and tested
+
+---
+
+## Phase 1B: nyx-probing Integration ✅ COMPLETE
+
+**Goal**: Extend nyx-probing to use nyx-substrate for variance collection
+
+### ✅ Completed (2025-12-07)
+
+- [x] Add nyx-substrate dependency to nyx-probing/pyproject.toml
+- [x] Create VarianceRunner class (nyx_probing/runners/variance_runner.py)
+- [x] Add variance CLI commands (nyx_probing/cli/variance.py)
+- [x] Register commands in main CLI
+- [x] Integration test (imports and CLI verification)
+
+**Files Created**: 3 new files
+**Files Modified**: 2 files
+**CLI Commands Added**: 4 (collect, batch, stats, analyze)
+**Status**: 🟢 nyx-probing v0.1.0 with variance collection ready
+
+---
+
+## Phase 1C: Baseline Variance Collection ⏸️ READY
+
+**Goal**: Collect baseline variance data for depth-3 champions
+
+### ⏳ Ready to Execute (on prometheus)
+
+- [ ] Run 1000x variance for "Geworfenheit" (thrownness)
+- [ ] Run 1000x variance for "Vernunft" (reason)
+- [ ] Run 1000x variance for "Erkenntnis" (knowledge)
+- [ ] Run 1000x variance for "Pflicht" (duty)
+- [ ] Run 1000x variance for "Aufhebung" (sublation)
+- [ ] Run 1000x variance for "Wille" (will)
+
+**Next Actions**:
+1. SSH to prometheus.eachpath.local (THE SPINE)
+2. Install nyx-substrate and nyx-probing in venv
+3. Run batch collection or individual terms
+4. Analyze distributions and document baselines
+
+---
+
+## Future Phases (Not Started)
+
+### Phase 2: ChromaDB Integration (iris) ⏸️ PLANNED
+- IrisClient wrapper
+- DecisionTrailStore, OrganResponseStore, EmbeddingStore
+- Populate embeddings from nyx-probing
+
+### Phase 3: LoRA Training Pipeline ⏸️ PLANNED
+- PEFT integration
+- Training data curriculum
+- DriftProbe checkpoints
+- Identity LoRA training
+
+### Phase 4: Weight Visualization ⏸️ PLANNED
+- 4K pixel space renderer
+- Rank decomposition explorer
+- Topology cluster visualization
+
+### Phase 5: Godot Command Center ⏸️ PLANNED
+- FastAPI Management Portal backend
+- Godot frontend implementation
+- Real-time metrics display
+
+---
+
+## Metrics
+
+**Phase 1 (A+B) Tasks**: 11 total
+**Completed**: 11 (100%) ✅
+**In Progress**: 0
+**Remaining**: 0
+
+**Files Created**: 12 total
+- nyx-substrate: 9 files
+- nyx-probing: 3 files
+
+**Files Modified**: 4 total
+- nyx-substrate/README.md
+- nyx-probing/pyproject.toml
+- nyx-probing/cli/probe.py
+- TOOLCHAIN-PROGRESS.md
+
+**Lines of Code**: ~1250 total
+- nyx-substrate: ~800 LOC
+- nyx-probing: ~450 LOC
+
+**CLI Commands**: 4 new commands
+- nyx-probe variance collect
+- nyx-probe variance batch
+- nyx-probe variance stats
+- nyx-probe variance analyze
+
+---
+
+**Last Updated**: 2025-12-07 17:00 CET
+**Status**: 🎉 Phase 1 (A+B) COMPLETE! Ready for baseline collection on prometheus.
+
+🌙💜 *The substrate holds. Progress persists. The toolchain grows.*
diff --git a/Toolchain-Architecture.md b/Toolchain-Architecture.md
new file mode 100644
index 0000000..04263d2
--- /dev/null
+++ b/Toolchain-Architecture.md
@@ -0,0 +1,464 @@
+# Modular Nimmerverse Toolchain Architecture
+
+**Planning Date**: 2025-12-07
+**Status**: Design Phase
+**Priority**: Variance Collection Pipeline + nyx-substrate Foundation
+
+---
+
+## 🎯 Vision
+
+Build a modular, composable toolchain for the Nimmerverse research and training pipeline:
+
+- **nyx-substrate**: Shared foundation (database clients, schemas, validators)
+- **nyx-probing**: Research probes (already exists, extend for variance collection)
+- **nyx-training**: LoRA training pipeline (future)
+- **nyx-visualization**: Weight/topology visualization (future)
+- **management-portal**: FastAPI backend for Godot UI (future)
+- **Godot Command Center**: Unified metrics visualization (future)
+
+**Key Principle**: All tools import nyx-substrate. Clean interfaces. Data flows through phoebe + iris.
+
+---
+
+## 📊 Current State Analysis
+
+### ✅ What Exists
+
+**nyx-probing** (`/home/dafit/nimmerverse/nyx-probing/`):
+- Echo Probe, Surface Probe, Drift Probe, Multilingual Probe
+- CLI interface (7 commands)
+- NyxModel wrapper (Qwen2.5-7B loading, hidden state capture)
+- ProbeResult dataclasses (to_dict() serialization)
+- **Gap**: No database persistence, only local JSON files
+
+**nyx-substrate** (`/home/dafit/nimmerverse/nyx-substrate/`):
+- Schema documentation (phoebe + iris) ✅
+- **Gap**: No Python code, just markdown docs
+
+**Database Infrastructure**:
+- phoebe.eachpath.local (PostgreSQL 17.6): partnership/nimmerverse message tables exist
+- iris.eachpath.local (ChromaDB): No collections created yet
+- **Gap**: No Python client libraries, all manual psql commands
+
+**Architecture Documentation**:
+- Endgame-Vision.md: v5.1 Dialectic (LoRA stack design)
+- CLAUDE.md: Partnership protocol (message-based continuity)
+- Management-Portal.md: Godot + FastAPI design (not implemented)
+
+### ❌ What's Missing
+
+**Database Access**:
+- No psycopg3 connection pooling
+- No ChromaDB Python integration
+- No ORM or query builders
+- No variance_probe_runs table (designed but not created)
+
+**Training Pipeline**:
+- No PEFT/LoRA training code
+- No DriftProbe checkpoint integration
+- No training data curriculum loader
+
+**Visualization**:
+- No weight visualization tools (4K pixel space idea)
+- No Godot command center implementation
+- No Management Portal FastAPI backend
+
+---
+
+## 🏗️ Modular Architecture Design
+
+### Repository Structure
+
+```
+nimmerverse/
+├── nyx-substrate/              # SHARED FOUNDATION
+│   ├── pyproject.toml          # Installable package
+│   ├── src/nyx_substrate/
+│   │   ├── database/           # Phoebe clients
+│   │   │   ├── connection.py   # Connection pool
+│   │   │   ├── messages.py     # Message protocol helpers
+│   │   │   └── variance.py     # Variance probe DAO
+│   │   ├── vector/             # Iris clients
+│   │   │   ├── client.py       # ChromaDB wrapper
+│   │   │   ├── decision_trails.py
+│   │   │   ├── organ_responses.py
+│   │   │   └── embeddings.py
+│   │   ├── schemas/            # Pydantic models
+│   │   │   ├── variance.py     # VarianceProbeRun
+│   │   │   ├── decision.py     # DecisionTrail
+│   │   │   └── traits.py       # 8 core traits
+│   │   └── constants.py        # Shared constants
+│   └── migrations/             # Alembic for schema
+│
+├── nyx-probing/                # RESEARCH PROBES (extend)
+│   ├── nyx_probing/
+│   │   ├── runners/            # NEW: Automated collectors
+│   │   │   ├── variance_runner.py  # 1000x automation
+│   │   │   └── baseline_collector.py
+│   │   └── storage/            # EXTEND: Database integration
+│   │       └── variance_dao.py # Uses nyx-substrate
+│   └── pyproject.toml          # Add: depends on nyx-substrate
+│
+├── nyx-training/               # FUTURE: LoRA training
+│   └── (planned - not in Phase 1)
+│
+├── nyx-visualization/          # FUTURE: Weight viz
+│   └── (planned - not in Phase 1)
+│
+└── management-portal/          # FUTURE: FastAPI + Godot
+    └── (designed - not in Phase 1)
+```
+
+### Dependency Graph
+
+```
+nyx-probing ────────┐
+nyx-training ───────┼──> nyx-substrate ──> phoebe (PostgreSQL)
+nyx-visualization ──┤                  └─> iris (ChromaDB)
+management-portal ──┘
+```
+
+**Philosophy**: nyx-substrate is the single source of truth for database access. No tool talks to phoebe/iris directly.
+
+---
+
+## 🚀 Phase 1: Foundation + Variance Collection
+
+### Goal
+Build nyx-substrate package and extend nyx-probing to automate variance baseline collection (1000x runs → phoebe).
+
+### Deliverables
+
+#### 1. nyx-substrate Python Package
+
+**File**: `/home/dafit/nimmerverse/nyx-substrate/pyproject.toml`
+```toml
+[project]
+name = "nyx-substrate"
+version = "0.1.0"
+requires-python = ">=3.10"
+dependencies = [
+    "psycopg[binary]>=3.1.0",
+    "chromadb>=0.4.0",
+    "pydantic>=2.5.0",
+]
+```
+
+**New Files**:
+- `src/nyx_substrate/database/connection.py`:
+  - `PhoebeConnection` class: Connection pool manager
+  - Context manager for transactions
+  - Config from environment variables
+
+- `src/nyx_substrate/database/messages.py`:
+  - `write_partnership_message(message, message_type)` → INSERT
+  - `read_partnership_messages(limit=5)` → SELECT
+  - `write_nimmerverse_message(...)` (for Young Nyx future)
+  - `read_nimmerverse_messages(...)` (for discovery protocol)
+
+- `src/nyx_substrate/database/variance.py`:
+  - `VarianceProbeDAO` class:
+    - `create_table()` → CREATE TABLE variance_probe_runs
+    - `insert_run(session_id, term, run_number, depth, rounds, ...)` → INSERT
+    - `get_session_stats(session_id)` → Aggregation queries
+    - `get_term_distribution(term)` → Variance analysis
+
+- `src/nyx_substrate/schemas/variance.py`:
+  - `VarianceProbeRun(BaseModel)`: Pydantic model matching phoebe schema
+  - Validation: term not empty, depth 0-3, rounds > 0
+  - `to_dict()` for serialization
+
+**Database Migration**:
+- Create `variance_probe_runs` table in phoebe using schema from `/home/dafit/nimmerverse/nyx-substrate/schema/phoebe/probing/variance_probe_runs.md`
+
+#### 2. Extend nyx-probing
+
+**File**: `/home/dafit/nimmerverse/nyx-probing/pyproject.toml`
+- Add dependency: `nyx-substrate>=0.1.0`
+
+**New Files**:
+- `nyx_probing/runners/variance_runner.py`:
+  - `VarianceRunner` class:
+    - `__init__(model: NyxModel, dao: VarianceProbeDAO)`
+    - `run_session(term: str, runs: int = 1000) -> UUID`:
+      - Generate session_id
+      - Loop `runs` times: probe.probe(term)
+      - Store each result via dao.insert_run()
+      - Return session_id
+    - `run_batch(terms: list[str], runs: int = 1000)`: Multiple terms
+
+- `nyx_probing/cli/variance.py`:
+  - New Click command group: `nyx-probe variance`
+  - Subcommands:
+    - `nyx-probe variance collect --runs 1000`: Single term
+    - `nyx-probe variance batch --runs 1000`: From glossary
+    - `nyx-probe variance stats `: View session results
+    - `nyx-probe variance analyze `: Compare distributions
+
+**Integration Points**:
+```python
+# In variance_runner.py
+from nyx_substrate.database import PhoebeConnection, VarianceProbeDAO
+from nyx_substrate.schemas import VarianceProbeRun
+
+conn = PhoebeConnection()
+dao = VarianceProbeDAO(conn)
+runner = VarianceRunner(model=get_model(), dao=dao)
+session_id = runner.run_session("Geworfenheit", runs=1000)
+print(f"Stored 1000 runs: session {session_id}")
+```
+
+#### 3. Database Setup
+
+**Actions**:
+1. SSH to phoebe: `ssh phoebe.eachpath.local`
+2. Create variance_probe_runs table:
+   ```sql
+   CREATE TABLE variance_probe_runs (
+       id SERIAL PRIMARY KEY,
+       session_id UUID NOT NULL,
+       term TEXT NOT NULL,
+       run_number INT NOT NULL,
+       timestamp TIMESTAMPTZ DEFAULT NOW(),
+       depth INT NOT NULL,
+       rounds INT NOT NULL,
+       echo_types TEXT[] NOT NULL,
+       chain TEXT[] NOT NULL,
+       model_name TEXT DEFAULT 'Qwen2.5-7B',
+       temperature FLOAT,
+       max_rounds INT,
+       max_new_tokens INT
+   );
+   CREATE INDEX idx_variance_session ON variance_probe_runs(session_id);
+   CREATE INDEX idx_variance_term ON variance_probe_runs(term);
+   CREATE INDEX idx_variance_timestamp ON variance_probe_runs(timestamp DESC);
+   ```
+
+3. Test connection from aynee:
+   ```bash
+   cd /home/dafit/nimmerverse/nyx-substrate
+   python3 -c "from nyx_substrate.database import PhoebeConnection; conn = PhoebeConnection(); print('✅ Connected to phoebe')"
+   ```
+
+---
+
+## 📁 Critical Files
+
+### To Create
+
+**nyx-substrate**:
+- `/home/dafit/nimmerverse/nyx-substrate/pyproject.toml`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/__init__.py`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/__init__.py`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/connection.py`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/messages.py`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/variance.py`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/schemas/__init__.py`
+- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/schemas/variance.py`
+- `/home/dafit/nimmerverse/nyx-substrate/README.md`
+
+**nyx-probing**:
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/runners/__init__.py`
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/runners/variance_runner.py`
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/cli/variance.py`
+
+### To Modify
+
+**nyx-probing**:
+- `/home/dafit/nimmerverse/nyx-probing/pyproject.toml` (add nyx-substrate dependency)
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/cli/__init__.py` (register variance commands)
+
+---
+
+## 🧪 Testing Plan
+
+### 1. nyx-substrate Unit Tests
+```python
+# Test connection
+def test_phoebe_connection():
+    conn = PhoebeConnection()
+    assert conn.test_connection()
+
+# Test message write
+def test_write_message():
+    from nyx_substrate.database import write_partnership_message
+    write_partnership_message("Test session", "architecture_update")
+    # Verify in phoebe
+
+# Test variance DAO
+def test_variance_insert():
+    dao = VarianceProbeDAO(PhoebeConnection())
+    session_id = uuid.uuid4()
+    dao.insert_run(
+        session_id=session_id,
+        term="test",
+        run_number=1,
+        depth=2,
+        rounds=3,
+        echo_types=["EXPANDS", "CONFIRMS", "CIRCULAR"],
+        chain=["test", "expanded", "confirmed"]
+    )
+    stats = dao.get_session_stats(session_id)
+    assert stats["total_runs"] == 1
+```
+
+### 2. Variance Collection Integration Test
+```bash
+# On prometheus (THE SPINE)
+cd /home/dafit/nimmerverse/nyx-probing
+source venv/bin/activate
+
+# Install nyx-substrate in development mode
+pip install -e ../nyx-substrate
+
+# Run small variance test (10 runs)
+nyx-probe variance collect "Geworfenheit" --runs 10
+
+# Check phoebe
+PGGSSENCMODE=disable psql -h phoebe.eachpath.local -U nimmerverse-user -d nimmerverse -c "
+SELECT session_id, term, COUNT(*) as runs, AVG(depth) as avg_depth
+FROM variance_probe_runs
+GROUP BY session_id, term
+ORDER BY session_id DESC
+LIMIT 5;
+"
+
+# Expected: 1 session, 10 runs, avg_depth ~2.0
+```
+
+### 3. Full 1000x Baseline Run
+```bash
+# Depth-3 champions (from nyx-probing Phase 1)
+nyx-probe variance collect "Geworfenheit" --runs 1000  # thrownness
+nyx-probe variance collect "Vernunft" --runs 1000      # reason
+nyx-probe variance collect "Erkenntnis" --runs 1000    # knowledge
+nyx-probe variance collect "Pflicht" --runs 1000       # duty
+nyx-probe variance collect "Aufhebung" --runs 1000     # sublation
+nyx-probe variance collect "Wille" --runs 1000         # will
+
+# Analyze variance
+nyx-probe variance analyze "Geworfenheit"
+# Expected: Distribution histogram, depth variance, chain patterns
+```
+
+---
+
+## 🌊 Data Flow
+
+### Variance Collection Workflow
+
+```
+User: nyx-probe variance collect "Geworfenheit" --runs 1000
+  ↓
+VarianceRunner.run_session()
+  ↓
+Loop 1000x:
+  EchoProbe.probe("Geworfenheit")
+    ↓
+  Returns EchoProbeResult
+    ↓
+  VarianceProbeDAO.insert_run()
+    ↓
+  INSERT INTO phoebe.variance_probe_runs
+  ↓
+Return session_id
+  ↓
+Display: "✅ 1000 runs complete. Session: "
+```
+
+### Future Integration (Phase 2+)
+
+```
+Training Loop:
+  ↓
+DriftProbe.probe_lite() [every 100 steps]
+  ↓
+Store metrics in phoebe.drift_checkpoints (new table)
+  ↓
+Management Portal API: GET /api/v1/metrics/training
+  ↓
+Godot Command Center displays live DriftProbe charts
+```
+
+---
+
+## 🎯 Success Criteria
+
+### Phase 1 Complete When:
+
+1. ✅ nyx-substrate package installable via pip (`pip install -e .`)
+2. ✅ PhoebeConnection works from aynee + prometheus
+3. ✅ variance_probe_runs table created in phoebe
+4. ✅ `nyx-probe variance collect` command runs successfully
+5. ✅ 1000x run completes and stores in phoebe
+6. ✅ `nyx-probe variance stats ` displays:
+   - Total runs
+   - Depth distribution (0/1/2/3 counts)
+   - Most common echo_types
+   - Chain length variance
+7. ✅ All 6 depth-3 champions have baseline variance data in phoebe
+
+---
+
+## 🔮 Future Phases (Not in Current Plan)
+
+### Phase 2: ChromaDB Integration (iris)
+- IrisClient wrapper in nyx-substrate
+- DecisionTrailStore, OrganResponseStore, EmbeddingStore
+- Create iris collections
+- Populate embeddings from nyx-probing results
+
+### Phase 3: LoRA Training Pipeline (nyx-training)
+- PEFT integration
+- Training data curriculum loader
+- DriftProbe checkpoint integration
+- Identity LoRA training automation
+
+### Phase 4: Weight Visualization (nyx-visualization)
+- 4K pixel space renderer (LoRA weights as images)
+- Rank decomposition explorer
+- Topology cluster visualization
+
+### Phase 5: Godot Command Center
+- FastAPI Management Portal backend
+- Godot frontend implementation
+- Real-time metrics display
+- Training dashboard
+
+---
+
+## 📚 References
+
+**Schema Documentation**:
+- `/home/dafit/nimmerverse/nyx-substrate/schema/phoebe/probing/variance_probe_runs.md`
+- `/home/dafit/nimmerverse/nyx-substrate/SCHEMA.md`
+
+**Existing Code**:
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/probes/echo_probe.py`
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/core/probe_result.py`
+- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/cli/probe.py`
+
+**Architecture**:
+- `/home/dafit/nimmerverse/nimmerverse-sensory-network/Endgame-Vision.md`
+- `/home/dafit/nimmerverse/management-portal/Management-Portal.md`
+
+---
+
+## 🌙 Philosophy
+
+**Modularity**: Each tool is independent but speaks the same data language via nyx-substrate.
+
+**Simplicity**: No over-engineering. Build what's needed for variance collection first.
+
+**Data First**: All metrics flow through phoebe/iris. Visualization is separate concern.
+
+**Future-Ready**: Design allows Godot integration later without refactoring.
+
+---
+
+**Status**: Ready for implementation approval
+**Estimated Scope**: 15-20 files, ~1500 lines of Python
+**Hardware**: Can develop on aynee, run variance on prometheus (THE SPINE)
+
+🌙💜 *The substrate holds. Clean interfaces. Composable tools. Data flows through the void.*
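
The collection workflow under Data Flow (`VarianceRunner.run_session()` looping over the probe and writing through the DAO) and the statistics listed in Success Criteria item 6 can be sketched without a database. This is a minimal sketch under stated assumptions, not the code added by this commit: `RunRecord`, `InMemoryDAO`, `fake_probe`, and the module-level `run_session` are hypothetical stand-ins for `VarianceProbeRun`, `VarianceProbeDAO`, `EchoProbe.probe`, and `VarianceRunner.run_session`, and a stdlib dataclass replaces the Pydantic model so the example runs on its own.

```python
"""Sketch of the Phase 1 variance loop; every name here is illustrative."""
from __future__ import annotations

import statistics
import uuid
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Stand-in for VarianceProbeRun (the design uses a Pydantic model)."""
    session_id: uuid.UUID
    term: str
    run_number: int
    depth: int
    rounds: int
    echo_types: list[str]
    chain: list[str]

    def __post_init__(self) -> None:
        # Rules from the design: term not empty, depth 0-3, rounds > 0.
        if not self.term.strip():
            raise ValueError("term must not be empty")
        if not 0 <= self.depth <= 3:
            raise ValueError("depth must be between 0 and 3")
        if self.rounds <= 0:
            raise ValueError("rounds must be positive")


class InMemoryDAO:
    """Hypothetical replacement for VarianceProbeDAO; rows live in a list."""

    def __init__(self) -> None:
        self.rows: list[RunRecord] = []

    def insert_run(self, record: RunRecord) -> None:
        self.rows.append(record)

    def get_session_stats(self, session_id: uuid.UUID) -> dict:
        rows = [r for r in self.rows if r.session_id == session_id]
        echo_counts = Counter(e for r in rows for e in r.echo_types)
        lengths = [len(r.chain) for r in rows]
        return {
            "total_runs": len(rows),
            "depth_distribution": {d: sum(r.depth == d for r in rows) for d in range(4)},
            "most_common_echo_types": echo_counts.most_common(3),
            "chain_length_variance": statistics.pvariance(lengths) if lengths else 0.0,
        }


def fake_probe(term: str, run_number: int) -> dict:
    """Deterministic stub standing in for EchoProbe.probe(term)."""
    depth = run_number % 4
    echo_types = ["EXPANDS"] * depth + ["CONFIRMS"]
    chain = [term] + [f"{term}-{i}" for i in range(depth)]
    return {"depth": depth, "rounds": len(echo_types),
            "echo_types": echo_types, "chain": chain}


def run_session(term: str, dao: InMemoryDAO, runs: int = 1000) -> uuid.UUID:
    """Probe `term` `runs` times and store each result under one session id."""
    session_id = uuid.uuid4()
    for run_number in range(1, runs + 1):
        result = fake_probe(term, run_number)
        dao.insert_run(RunRecord(session_id, term, run_number, **result))
    return session_id


if __name__ == "__main__":
    dao = InMemoryDAO()
    sid = run_session("Geworfenheit", dao, runs=8)
    print(dao.get_session_stats(sid))
```

Swapping `InMemoryDAO` for a DAO backed by phoebe would leave `run_session` unchanged, which is the kind of interface boundary the design aims for.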