# Modular Nimmerverse Toolchain Architecture

**Planning Date**: 2025-12-07
**Status**: Design Phase
**Priority**: Variance Collection Pipeline + nyx-substrate Foundation

---

## 🎯 Vision

Build a modular, composable toolchain for the Nimmerverse research and training pipeline:

- **nyx-substrate**: Shared foundation (database clients, schemas, validators)
- **nyx-probing**: Research probes (already exists, extend for variance collection)
- **nyx-training**: LoRA training pipeline (future)
- **nyx-visualization**: Weight/topology visualization (future)
- **management-portal**: FastAPI backend for Godot UI (future)
- **Godot Command Center**: Unified metrics visualization (future)

**Key Principle**: All tools import nyx-substrate. Clean interfaces. Data flows through phoebe + iris.

---

## 📊 Current State Analysis

### ✅ What Exists

**nyx-probing** (`/home/dafit/nimmerverse/nyx-probing/`):
- Echo Probe, Surface Probe, Drift Probe, Multilingual Probe
- CLI interface (7 commands)
- NyxModel wrapper (Qwen2.5-7B loading, hidden state capture)
- ProbeResult dataclasses (to_dict() serialization)
- **Gap**: No database persistence, only local JSON files

**nyx-substrate** (`/home/dafit/nimmerverse/nyx-substrate/`):
- Schema documentation (phoebe + iris) ✅
- **Gap**: No Python code, just markdown docs

**Database Infrastructure**:
- phoebe.eachpath.local (PostgreSQL 17.6): partnership/nimmerverse message tables exist
- iris.eachpath.local (ChromaDB): No collections created yet
- **Gap**: No Python client libraries, all manual psql commands

**Architecture Documentation**:
- Endgame-Vision.md: v5.1 Dialectic (LoRA stack design)
- CLAUDE.md: Partnership protocol (message-based continuity)
- Management-Portal.md: Godot + FastAPI design (not implemented)

### ❌ What's Missing

**Database Access**:
- No psycopg3 connection pooling
- No ChromaDB Python integration
- No ORM or query builders
- No variance_probe_runs table (designed but not created)

**Training Pipeline**:
- No
PEFT/LoRA training code
- No DriftProbe checkpoint integration
- No training data curriculum loader

**Visualization**:
- No weight visualization tools (4K pixel space idea)
- No Godot command center implementation
- No Management Portal FastAPI backend

---

## 🏗️ Modular Architecture Design

### Repository Structure

```
nimmerverse/
├── nyx-substrate/                    # SHARED FOUNDATION
│   ├── pyproject.toml                # Installable package
│   ├── src/nyx_substrate/
│   │   ├── database/                 # Phoebe clients
│   │   │   ├── connection.py         # Connection pool
│   │   │   ├── messages.py           # Message protocol helpers
│   │   │   └── variance.py           # Variance probe DAO
│   │   ├── vector/                   # Iris clients
│   │   │   ├── client.py             # ChromaDB wrapper
│   │   │   ├── decision_trails.py
│   │   │   ├── organ_responses.py
│   │   │   └── embeddings.py
│   │   ├── schemas/                  # Pydantic models
│   │   │   ├── variance.py           # VarianceProbeRun
│   │   │   ├── decision.py           # DecisionTrail
│   │   │   └── traits.py             # 8 core traits
│   │   └── constants.py              # Shared constants
│   └── migrations/                   # Alembic for schema
│
├── nyx-probing/                      # RESEARCH PROBES (extend)
│   ├── nyx_probing/
│   │   ├── runners/                  # NEW: Automated collectors
│   │   │   ├── variance_runner.py    # 1000x automation
│   │   │   └── baseline_collector.py
│   │   └── storage/                  # EXTEND: Database integration
│   │       └── variance_dao.py       # Uses nyx-substrate
│   └── pyproject.toml                # Add: depends on nyx-substrate
│
├── nyx-training/                     # FUTURE: LoRA training
│   └── (planned - not in Phase 1)
│
├── nyx-visualization/                # FUTURE: Weight viz
│   └── (planned - not in Phase 1)
│
└── management-portal/                # FUTURE: FastAPI + Godot
    └── (designed - not in Phase 1)
```

### Dependency Graph

```
nyx-probing ────────┐
nyx-training ───────┼──> nyx-substrate ──> phoebe (PostgreSQL)
nyx-visualization ──┤                  └─> iris (ChromaDB)
management-portal ──┘
```

**Philosophy**: nyx-substrate is the single source of truth for database access. No tool talks to phoebe/iris directly.

---

## 🚀 Phase 1: Foundation + Variance Collection

### Goal

Build the nyx-substrate package and extend nyx-probing to automate variance baseline collection (1000x runs → phoebe).

### Deliverables

#### 1. nyx-substrate Python Package

**File**: `/home/dafit/nimmerverse/nyx-substrate/pyproject.toml`

```toml
[project]
name = "nyx-substrate"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    # The "pool" extra installs psycopg_pool, which psycopg 3 needs for connection pooling
    "psycopg[binary,pool]>=3.1.0",
    "chromadb>=0.4.0",
    "pydantic>=2.5.0",
]
```

**New Files**:
- `src/nyx_substrate/database/connection.py`:
  - `PhoebeConnection` class: Connection pool manager
  - Context manager for transactions
  - Config from environment variables
- `src/nyx_substrate/database/messages.py`:
  - `write_partnership_message(message, message_type)` → INSERT
  - `read_partnership_messages(limit=5)` → SELECT
  - `write_nimmerverse_message(...)` (for Young Nyx future)
  - `read_nimmerverse_messages(...)` (for discovery protocol)
- `src/nyx_substrate/database/variance.py`:
  - `VarianceProbeDAO` class:
    - `create_table()` → CREATE TABLE variance_probe_runs
    - `insert_run(session_id, term, run_number, depth, rounds, ...)` → INSERT
    - `get_session_stats(session_id)` → Aggregation queries
    - `get_term_distribution(term)` → Variance analysis
- `src/nyx_substrate/schemas/variance.py`:
  - `VarianceProbeRun(BaseModel)`: Pydantic model matching phoebe schema
  - Validation: term not empty, depth 0-3, rounds > 0
  - `to_dict()` for serialization

**Database Migration**:
- Create `variance_probe_runs` table in phoebe using schema from `/home/dafit/nimmerverse/nyx-substrate/schema/phoebe/probing/variance_probe_runs.md`

#### 2. Extend nyx-probing

**File**: `/home/dafit/nimmerverse/nyx-probing/pyproject.toml`
- Add dependency: `nyx-substrate>=0.1.0`

**New Files**:
- `nyx_probing/runners/variance_runner.py`:
  - `VarianceRunner` class:
    - `__init__(model: NyxModel, dao: VarianceProbeDAO)`
    - `run_session(term: str, runs: int = 1000) -> UUID`:
      - Generate session_id
      - Loop `runs` times: `probe.probe(term)`
      - Store each result via `dao.insert_run()`
      - Return session_id
    - `run_batch(terms: list[str], runs: int = 1000)`: Multiple terms
- `nyx_probing/cli/variance.py`:
  - New Click command group: `nyx-probe variance`
  - Subcommands:
    - `nyx-probe variance collect --runs 1000`: Single term
    - `nyx-probe variance batch --runs 1000`: From glossary
    - `nyx-probe variance stats `: View session results
    - `nyx-probe variance analyze `: Compare distributions

**Integration Points**:

```python
# In variance_runner.py
from nyx_substrate.database import PhoebeConnection, VarianceProbeDAO
from nyx_substrate.schemas import VarianceProbeRun

conn = PhoebeConnection()
dao = VarianceProbeDAO(conn)
runner = VarianceRunner(model=get_model(), dao=dao)
session_id = runner.run_session("Geworfenheit", runs=1000)
print(f"Stored 1000 runs: session {session_id}")
```

#### 3. Database Setup

**Actions**:
1. SSH to phoebe: `ssh phoebe.eachpath.local`
2. Create variance_probe_runs table:
   ```sql
   CREATE TABLE variance_probe_runs (
       id SERIAL PRIMARY KEY,
       session_id UUID NOT NULL,
       term TEXT NOT NULL,
       run_number INT NOT NULL,
       timestamp TIMESTAMPTZ DEFAULT NOW(),
       depth INT NOT NULL,
       rounds INT NOT NULL,
       echo_types TEXT[] NOT NULL,
       chain TEXT[] NOT NULL,
       model_name TEXT DEFAULT 'Qwen2.5-7B',
       temperature FLOAT,
       max_rounds INT,
       max_new_tokens INT
   );

   CREATE INDEX idx_variance_session ON variance_probe_runs(session_id);
   CREATE INDEX idx_variance_term ON variance_probe_runs(term);
   CREATE INDEX idx_variance_timestamp ON variance_probe_runs(timestamp DESC);
   ```
3. Test connection from aynee:
   ```bash
   cd /home/dafit/nimmerverse/nyx-substrate
   python3 -c "from nyx_substrate.database import PhoebeConnection; conn = PhoebeConnection(); print('✅ Connected to phoebe')"
   ```

---

## 📁 Critical Files

### To Create

**nyx-substrate**:
- `/home/dafit/nimmerverse/nyx-substrate/pyproject.toml`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/__init__.py`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/__init__.py`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/connection.py`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/messages.py`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/database/variance.py`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/schemas/__init__.py`
- `/home/dafit/nimmerverse/nyx-substrate/src/nyx_substrate/schemas/variance.py`
- `/home/dafit/nimmerverse/nyx-substrate/README.md`

**nyx-probing**:
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/runners/__init__.py`
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/runners/variance_runner.py`
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/cli/variance.py`

### To Modify

**nyx-probing**:
- `/home/dafit/nimmerverse/nyx-probing/pyproject.toml` (add nyx-substrate dependency)
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/cli/__init__.py` (register variance commands)

---

## 🧪 Testing Plan

### 1. nyx-substrate Unit Tests

```python
import uuid

from nyx_substrate.database import (
    PhoebeConnection,
    VarianceProbeDAO,
    write_partnership_message,
)


# Test connection
def test_phoebe_connection():
    conn = PhoebeConnection()
    assert conn.test_connection()


# Test message write
def test_write_message():
    write_partnership_message("Test session", "architecture_update")
    # Verify in phoebe


# Test variance DAO
def test_variance_insert():
    # Each test opens its own connection instead of relying on a global `conn`
    dao = VarianceProbeDAO(PhoebeConnection())
    session_id = uuid.uuid4()
    dao.insert_run(
        session_id=session_id,
        term="test",
        run_number=1,
        depth=2,
        rounds=3,
        echo_types=["EXPANDS", "CONFIRMS", "CIRCULAR"],
        chain=["test", "expanded", "confirmed"],
    )
    stats = dao.get_session_stats(session_id)
    assert stats["total_runs"] == 1
```

### 2. Variance Collection Integration Test

```bash
# On prometheus (THE SPINE)
cd /home/dafit/nimmerverse/nyx-probing
source venv/bin/activate

# Install nyx-substrate in development mode
pip install -e ../nyx-substrate

# Run small variance test (10 runs)
nyx-probe variance collect "Geworfenheit" --runs 10

# Check phoebe (newest sessions first; session_id is a random UUID, so sort by time)
PGGSSENCMODE=disable psql -h phoebe.eachpath.local -U nimmerverse-user -d nimmerverse -c "
SELECT session_id, term, COUNT(*) as runs, AVG(depth) as avg_depth
FROM variance_probe_runs
GROUP BY session_id, term
ORDER BY MAX(timestamp) DESC
LIMIT 5;
"

# Expected: 1 session, 10 runs, avg_depth ~2.0
```

### 3. Full 1000x Baseline Run

```bash
# Depth-3 champions (from nyx-probing Phase 1)
nyx-probe variance collect "Geworfenheit" --runs 1000  # thrownness
nyx-probe variance collect "Vernunft" --runs 1000      # reason
nyx-probe variance collect "Erkenntnis" --runs 1000    # knowledge
nyx-probe variance collect "Pflicht" --runs 1000       # duty
nyx-probe variance collect "Aufhebung" --runs 1000     # sublation
nyx-probe variance collect "Wille" --runs 1000         # will

# Analyze variance
nyx-probe variance analyze "Geworfenheit"

# Expected: Distribution histogram, depth variance, chain patterns
```

---

## 🌊 Data Flow

### Variance Collection Workflow

```
User: nyx-probe variance collect "Geworfenheit" --runs 1000
    ↓
VarianceRunner.run_session()
    ↓
Loop 1000x:
    EchoProbe.probe("Geworfenheit")
        ↓
    Returns EchoProbeResult
        ↓
    VarianceProbeDAO.insert_run()
        ↓
    INSERT INTO phoebe.variance_probe_runs
    ↓
Return session_id
    ↓
Display: "✅ 1000 runs complete. Session: "
```

### Future Integration (Phase 2+)

```
Training Loop:
    ↓
DriftProbe.probe_lite() [every 100 steps]
    ↓
Store metrics in phoebe.drift_checkpoints (new table)
    ↓
Management Portal API: GET /api/v1/metrics/training
    ↓
Godot Command Center displays live DriftProbe charts
```

---

## 🎯 Success Criteria

### Phase 1 Complete When:

1. ✅ nyx-substrate package installable via pip (`pip install -e .`)
2. ✅ PhoebeConnection works from aynee + prometheus
3. ✅ variance_probe_runs table created in phoebe
4. ✅ `nyx-probe variance collect` command runs successfully
5. ✅ 1000x run completes and stores in phoebe
6. ✅ `nyx-probe variance stats ` displays:
   - Total runs
   - Depth distribution (0/1/2/3 counts)
   - Most common echo_types
   - Chain length variance
7. ✅ All 6 depth-3 champions have baseline variance data in phoebe

---

## 🔮 Future Phases (Not in Current Plan)

### Phase 2: ChromaDB Integration (iris)
- IrisClient wrapper in nyx-substrate
- DecisionTrailStore, OrganResponseStore, EmbeddingStore
- Create iris collections
- Populate embeddings from nyx-probing results

### Phase 3: LoRA Training Pipeline (nyx-training)
- PEFT integration
- Training data curriculum loader
- DriftProbe checkpoint integration
- Identity LoRA training automation

### Phase 4: Weight Visualization (nyx-visualization)
- 4K pixel space renderer (LoRA weights as images)
- Rank decomposition explorer
- Topology cluster visualization

### Phase 5: Godot Command Center
- FastAPI Management Portal backend
- Godot frontend implementation
- Real-time metrics display
- Training dashboard

---

## 📚 References

**Schema Documentation**:
- `/home/dafit/nimmerverse/nyx-substrate/schema/phoebe/probing/variance_probe_runs.md`
- `/home/dafit/nimmerverse/nyx-substrate/SCHEMA.md`

**Existing Code**:
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/probes/echo_probe.py`
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/core/probe_result.py`
- `/home/dafit/nimmerverse/nyx-probing/nyx_probing/cli/probe.py`

**Architecture**:
- `/home/dafit/nimmerverse/nimmerverse-sensory-network/Endgame-Vision.md`
- `/home/dafit/nimmerverse/management-portal/Management-Portal.md`

---

## 🌙 Philosophy

**Modularity**: Each tool is independent but speaks the same data language via nyx-substrate.

**Simplicity**: No over-engineering. Build what's needed for variance collection first.

**Data First**: All metrics flow through phoebe/iris. Visualization is a separate concern.

**Future-Ready**: Design allows Godot integration later without refactoring.

---

**Status**: Ready for implementation approval
**Estimated Scope**: 15-20 files, ~1500 lines of Python
**Hardware**: Can develop on aynee, run variance on prometheus (THE SPINE)

🌙💜 *The substrate holds. Clean interfaces. Composable tools. Data flows through the void.*
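
The `run_session` flow described in Phase 1 (generate a session id, probe the term repeatedly, store each result, return the id) can be sketched without a database or model. This is a minimal illustration under assumptions, not the planned implementation: `InMemoryVarianceDAO` and `fake_probe` are hypothetical stand-ins for `VarianceProbeDAO` and `EchoProbe.probe()`, and the runner takes a probe callable instead of the `NyxModel` named in the plan so the example stays self-contained.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class InMemoryVarianceDAO:
    """Hypothetical stand-in for VarianceProbeDAO: keeps rows in a list, not in phoebe."""

    rows: list[dict] = field(default_factory=list)

    def insert_run(self, **row) -> None:
        self.rows.append(row)


class VarianceRunner:
    """Sketch of the planned runner: probe one term `runs` times and store every result."""

    def __init__(self, probe: Callable[[str], dict], dao) -> None:
        # Assumption: the probe returns a dict with depth, rounds, echo_types, chain.
        self.probe = probe
        self.dao = dao

    def run_session(self, term: str, runs: int = 1000) -> uuid.UUID:
        session_id = uuid.uuid4()
        for run_number in range(1, runs + 1):
            result = self.probe(term)
            self.dao.insert_run(
                session_id=session_id, term=term, run_number=run_number, **result
            )
        return session_id


def fake_probe(term: str) -> dict:
    """Deterministic placeholder for EchoProbe.probe(); a real probe would vary per run."""
    return {
        "depth": 2,
        "rounds": 3,
        "echo_types": ["EXPANDS", "CONFIRMS", "CIRCULAR"],
        "chain": [term, "expanded", "confirmed"],
    }


dao = InMemoryVarianceDAO()
runner = VarianceRunner(probe=fake_probe, dao=dao)
session_id = runner.run_session("Geworfenheit", runs=10)
print(f"Stored {len(dao.rows)} runs: session {session_id}")
```

Because the runner only depends on an object with an `insert_run` method, the same loop could be exercised in unit tests with the in-memory stand-in and pointed at phoebe in production.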
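
The session summary listed under the success criteria (total runs, depth distribution, most common echo_types, chain length variance) could be computed along these lines. This is a sketch on in-memory rows using only the standard library; `summarize_session` is a hypothetical helper, the sample rows are made up for illustration, and the planned `get_session_stats` may well do the aggregation in SQL instead.

```python
from collections import Counter
from statistics import pvariance


def summarize_session(rows: list[dict]) -> dict:
    """Aggregate the four statistics named in the Phase 1 success criteria."""
    depth_counts = Counter(row["depth"] for row in rows)
    echo_counts = Counter(echo for row in rows for echo in row["echo_types"])
    chain_lengths = [len(row["chain"]) for row in rows]
    return {
        "total_runs": len(rows),
        # Always report depths 0-3, even when a depth never occurred.
        "depth_distribution": {d: depth_counts.get(d, 0) for d in range(4)},
        "most_common_echo_types": echo_counts.most_common(3),
        # Population variance over the whole session; 0.0 for an empty session.
        "chain_length_variance": pvariance(chain_lengths) if chain_lengths else 0.0,
    }


# Illustrative rows only; real rows would come from variance_probe_runs.
rows = [
    {"depth": 3, "echo_types": ["EXPANDS", "CONFIRMS"], "chain": ["a", "b", "c"]},
    {"depth": 2, "echo_types": ["EXPANDS"], "chain": ["a", "b"]},
    {"depth": 3, "echo_types": ["CIRCULAR"], "chain": ["a", "b", "c", "d"]},
]
stats = summarize_session(rows)
print(stats)
```

Population variance is used here on the assumption that a session's runs are the full set of interest; sample variance (`statistics.variance`) would be the alternative if each session is treated as a sample of the model's behavior.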