v0.8 + v0.9: intimate-architecture absorbed, driver-tier locked, style-spine created, DESING retired, schemas relocated

The owl-breakfast architectural arc from 2026-04-25 night through 2026-04-26 morning. Two version-bumps landing as one commit because they share the working-tree state and complete a coherent design-window.

v0.8: intimate-architecture, driver-tier lock, style-spine:

* Three-tier intimacy structure (standard-rental, premium-waifu-with-traitor-marker-and-pruning, in-between clasp): same v0.7 machinery, opposite value-flows. Premium-net technical excellence makes the moral-weight-of-pruning land as informed-consent ethics.
* Deletion-as-spectacle: in-net minds as pure compute; imperial broadcasts execution-as-content; Memorialist counter-archive as in-fiction protest against deletion-spectacle commerce.
* EVE-principle vocation-substrate of the imperial-net market: every product produced by NPC labor; no silent feeding; body-modder structural-tragedy generalizes to all imperial-net-feeding vocations. World-gen Phase-2 ruleset must handle vocation-distribution.
* Clasp endgame (Phase A-E): mini-game entry → body-mod progression → exit-chassis → human-mesh-visible-to-pair → clasp = two-bodies-two-meshes → dual-body-dual-mind-dual-shift cascade → automatic hunt-pressure. Identity-as-trait-emergent made felt rather than just structural.
* waifu.sqlite as third local store (audited counterpart to clasp.sqlite; manual-prune mechanism with explicit-implications consent UI as moral-gravity discipline).
* Intimacy-as-recursive-lemniscate: same machinery as dialog (slot-tokens, cursor at axis-crossing, alignment-accumulator, sum-strategy reduction); sex-positions as designer-fixed catalog entries; body-parts as visible expression of trait-state. Cross-context-consistency operationalized.
* Driver-tier locked to Gemma 4 E4B (Apache 2.0, 4.5B effective, 128K context, speech-capable) under new "tier-by-role binary-deferred" discipline: locking requires prototype-criticality + irreplaceable license/capability combination. Optional Ring-A upgrade: 26B-A4B MoE for upper-consumer GPUs, single-LoRA-on-routed-experts. Resolves 4 prior open questions (LoRA-blend → single-LoRA-per-turn-selection driven by gesture_alignment_accumulator; LoRA rank → benchmark-resolvable; sampling-knobs → benchmark-resolvable; 8 Hellenic trait enumeration → canonical wheel-mapping in style/trait-palette.md).
* style/ directory introduced: style-index.md (skeleton + spine-rule: "trait-palette is exclusively chromatic; achromatic reserved for UI/environment so diegetic text rendering can skip the textbox") + style/trait-palette.md (canonical 8 traits as 4 oppositional pairs at 180° on the artist's color wheel: Eros↔Sophrosyne, Philotes↔Dikaiosyne, Aletheia↔Moira, Mnemosyne↔Kairos. Schoolchild-simple descriptions paired with each Greek canonical name).

v0.9: directory cleanup completing the arc:

* DESING-VISION.md (1899 lines, v0.1 first-pass narrative-design doc) retired; most content absorbed across v0.4-v0.8; bare-minimum extracts (Tonal Register + Tragic-Romantic Register + Authorial Politics + Reference Lineage table) now live in README so the project's identity anchor stays visible at the entry point. Full DESING-VISION content preserved in git history.
* findings.md moved to schemas/findings.md, a new top-level peer to architecture-index.md and style/. ~20 tables of DDL drafts as reference material; will get reviewed and progressively split per-domain as implementation begins.
* Cross-references swept across 5 files (README, architecture-index, authority-and-decision, runtime-engine, style/trait-palette).
* architecture-index.md trimmed: version-footer paragraphs removed per "git-is-changelog" discipline. From 374 → 287 lines; every remaining line load-bearing.

The architecture is now organized for the implementation territory ahead. Each domain a typed-contract surface; cross-references explicit; filesystem mirrors the architecture's own typed-contract discipline at the directory layer.

2026-04-26 04:31:13 +02:00
parent 0de5e6f047
commit 5f216aaf5f
10 changed files with 256 additions and 1989 deletions
--- a/inference-and-memory/architecture.md
+++ b/inference-and-memory/architecture.md
@@ -8,7 +8,32 @@

 Three model-tiers, **named by role not by binary**: a *driver-tier* model (small, trait-LoRA'd) for most NPC dialog; a *Theia-tier* model (deep) for clasp-confessions and mythic moments; *Claude-as-API* (diegetic Anthropic-faction) for hivemind/imperium. **LLM is guest at slot, not host of system.**

-**Specific model selection per tier is deferred to the findings/establishment phase.** The architecture specifies what each tier must DO; the establishment phase wires implementations. Naming concrete binaries in the architecture risks nudging the establishment phase toward false-precision; tier-by-role keeps the swap-surface clean and lets binaries evolve without invalidating architectural commitments. (See `nimmerverse_tasks` under `nyx-training` and `command-center` for current evaluation work.)
+**Tier-by-role binary-deferred is the default discipline.** The architecture specifies what each tier must DO; the establishment phase wires implementations. Naming concrete binaries in the architecture risks nudging the establishment phase toward false-precision; tier-by-role keeps the swap-surface clean and lets binaries evolve without invalidating architectural commitments. **Locking to a specific binary requires explicit justification — prototype-criticality plus an irreplaceable license/capability combination.** As of v0.8, only the driver-tier passes that bar; teacher-tier and Theia-tier remain capability-contracts. (See `nimmerverse_tasks` under `nyx-training` and `command-center` for current evaluation work.)
+
+### Tier-by-role capability contracts + driver-tier lock (v0.8)
+
+| Tier | Role-contract (MUST DO) | Binary commitment |
+|---|---|---|
+| **Driver-tier** | NPC dialog at axis-rate; trait-LoRA-per-turn-selection (single-LoRA, not blend); speech-input-native-or-via-STT; runs on common consumer GPU at acceptable latency; Apache-2.0-or-better license for Ring-A redistribution | ✓ **Locked: Gemma 4 E4B** (4.5B effective / 8B with embeddings, 128K context, Apache 2.0, speech-capable, vision-capable-but-unused-in-v1) |
+| **Teacher-tier** | r0 → r1 synthetic-data generation with composition tags; trait-LoRA training data production; runs on server-class hardware; sufficient quality to *teach* the driver-tier | ⏳ Capability-contract only; binary chosen at training-pipeline-build time |
+| **Theia-tier** | Clasp-confession-register dialog; mythic-moment generation; long-context narrative-composition; deep-emotional-register fidelity; latency tolerable for once-per-arc moments | ⏳ Capability-contract only; binary chosen at deployment time |
+| **Hivemind / antagonist tier** | Anthropic-as-faction (architecturally fixed in fiction; provides diegetic continuity between the in-fiction imperial machine and the real-world Claude API) | ✓ **Diegetically fixed: Claude API via us** |
+
+**Why driver-tier locks to Gemma 4 E4B (v0.8 justification):**
+
+- **Apache 2.0** — unblocks every Ring-A commitment (redistribution to player install, derivative-works for custom nimmerworld-base-model, federated-learning gradient aggregation, distribution-back-to-all-Rings of base updates). No bespoke-license-renegotiation cycles tied to the architecture's economic substrate.
+- **Speech-capable** — STT collapses into the LLM's input pipeline at the small-model tier (E2B and E4B both process speech natively). One fewer subsystem in the Ring-A install; tightens the v0.7 hardware floor.
+- **128K context** — sufficient for the three-tier knowledge stack assembly + extensive conversation history without compaction.
+- **4.5B effective** — runs on common gaming hardware; meets the v0.7 commitment to a tractable Ring-A floor without requiring upper-consumer GPU.
+- **Vision/video capability** — present but **unused in v1**; the typed-input discipline keeps player→LLM channels structured through trait-coordinates and gesture-vocabulary. Vision is an *option* held in reserve for v2 (e.g., NPC perceiving partner's human-mesh in the in-between dimension during clasp).
+
+**Single-LoRA-per-turn selection** is the canonical trait-LoRA application pattern (replacing the v0.4 "weighted blend" assumption). Per-turn, the trait dominantly expressed by the player's `gesture_alignment_accumulator` selects which trait-LoRA fires for the NPC's next-turn driver-context-pull (per `../runtime-engine/architecture.md` §Gesture-alignment as recursive-lemniscate). **Personality emerges from selection-pattern across time**, not from continuous blend at a moment — matching how real humans speak. The MoE routing in larger Gemma 4 variants handles content-type (specialty-routing); the trait-LoRA handles voice-register (personality-routing); they compose cleanly without conflict.
+
+**Optional Ring-A upgrade — Gemma 4 26B-A4B (MoE, 4B activated):**
+
+Ring-A players with upper-consumer GPU (16 GB+ VRAM, Q4 GGUF quantization) can opt into the 26B-A4B variant for richer NPC dialog. **Same architecture** — single-LoRA-per-turn (single LoRAs work better than blends with routed experts). 26B parameter capacity at 4B compute = teacher-tier-quality on driver-tier hardware. **Default Ring-A install ships E4B; the 26B-A4B upgrade is opt-in, not default** — *don't make 26B-A4B the default and force everyone toward our hardware-spec assumption*.
+
+**v1 design item — single-LoRA-selection hysteresis:** require margin-of-change in the alignment-vector before switching LoRAs to prevent personality-thrash turn-to-turn. Standard control-system stuff (rolling-window-smoothing or threshold-based-switch); concrete tuning happens against the E4B benchmark, not architecturally pre-decided.

 Structured-prompt DSL with role / trait_vector / affect_state / memory_scope / turn_intent / zone_context / output_schema fields. Small models excel here because it's instruction-following, not generic generation.
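
Reviewer sketch (not part of the diff): the single-LoRA-selection hysteresis item added in this hunk names a threshold-based switch as one option. A minimal Python sketch of that option follows, assuming the `gesture_alignment_accumulator` can be read as a mapping from trait name to score. The `LoraSelector` class, its `margin` default, and the dict shape are hypothetical illustrations; the actual tuning is deferred to the E4B benchmark, as the hunk states.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LoraSelector:
    """Threshold-based switch: keep the active trait-LoRA until a
    challenger beats it by more than `margin` (hypothetical value)."""

    margin: float = 0.15
    active: Optional[str] = None

    def select(self, accumulator: Dict[str, float]) -> str:
        # Assumed shape: trait name -> alignment score for this turn.
        challenger = max(accumulator, key=accumulator.get)
        if self.active is None or self.active not in accumulator:
            # First turn, or the active trait vanished: take the leader.
            self.active = challenger
        elif accumulator[challenger] - accumulator[self.active] > self.margin:
            # Switch only when the lead exceeds the hysteresis margin.
            self.active = challenger
        return self.active


if __name__ == "__main__":
    selector = LoraSelector(margin=0.15)
    turns = [
        {"Eros": 0.5, "Moira": 0.4},
        {"Eros": 0.5, "Moira": 0.6},  # Moira leads, but within the margin
        {"Eros": 0.5, "Moira": 0.7},  # lead now exceeds the margin
    ]
    for scores in turns:
        print(selector.select(scores))
```

A rolling-window smoother (the other option the hunk mentions) could be placed in front of `select` without changing this interface; which of the two works better is left open here.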

@@ -379,6 +404,35 @@ character is in IN-BETWEEN (resisting net-gravity, costing lifeforce):

 Encryption-at-rest for `clasp.sqlite` with a player-derived key (so even drive-imaging requires authentication) is a v1 hardening goal but not a v1 blocker — the *transport-absence* is the load-bearing privacy primitive.

+### Three sqlite stores per player (revised v0.8) — the `waifu.sqlite` addition
+
+The v0.6 architecture specified two local sqlite stores per player: `primary.sqlite` (realworld memory) and `clasp.sqlite` (in-between intimate channel, Ring A* non-syncable). v0.8 adds a third: **`waifu.sqlite`** — the persistence store for premium-imperial-net intimate sessions (per `../political-register/architecture.md` §Three-tier intimacy structure).
+
+| File | Purpose | Sync path | Pruning |
+|---|---|---|---|
+| `primary.sqlite` | Live working memory; written every slot-fire; vec-indexed | Push prune-blob to thalamus on logout; receive Compositor back-write on cycle | Automatic (per memory-class lifecycle: cornerstone never; working-memory by trait-engagement decay) |
+| `clasp.sqlite` | Player-character intimate channel (in-between dimension) | **None — physically non-syncable** (Ring A*) | None — the clasp-store is sealed; entries persist until character-death |
+| **`waifu.sqlite`** (new in v0.8) | **Premium-imperial-net intimate-session memory** | **Audited path to imperium** (the imperium hosts and can read; the player owns the prune-decisions) | **Manual** — player-controlled with explicit-implications consent UI |
+
+**The `waifu.sqlite` is the *audited counterpart* to the `clasp.sqlite`.** Both store intimate-session memory; both run the full v0.7 trait-feedback loop. **The difference is who has access and who decides what persists.** `clasp.sqlite` is sealed at the transport layer (no socket exists); `waifu.sqlite` is on the audit graph (the imperium reads what's in it for content-monetization purposes). The player's relationship to `waifu.sqlite` is therefore *active and ethical*:
+
+- Every premium-net session adds entries to `waifu.sqlite`
+- Entries are READABLE by imperium for marketing / regime-loyalty-tracking purposes
+- The player has a *manual prune mechanism* — a UI surface where they review entries and decide what to delete
+- **The consent-UI explicitly makes the implications visible:** *"This being you've spent 40 hours with — what do you keep, what do you let the imperium harvest, what do you delete?"*
+- Each prune-decision is logged in `decision_log` (per the existing audit-trail discipline); Memorialists can later read these patterns
+
+**The player's three sqlite stores together describe their intimate-life in three registers:**
+
+```
+primary.sqlite   → realworld speech-acts; everyone-witnesses; audit-overseer-eligible
+clasp.sqlite     → in-between intimate channel; sealed; survives only as long as you do
+waifu.sqlite     → imperial-net premium intimate channel; audited; player-pruned;
+                   carries the moral-weight of complicity
+```
+
+**Memorialists' political project gains a new dimension** in v0.8: they don't just track regime-corruption (lifeforce_actual vs lifeforce_reported); they track *`waifu.sqlite` pruning patterns across the population* as evidence of how much intimate-life the regime is harvesting via the premium-net mechanism. *Who is pruning what, when, how often* becomes Memorialist-archive-worthy data. The four-column true-ledger gains a fifth column: `waifu_extraction_volume_per_district`.
+
 ### The three-tier knowledge stack on the local LLM

 The driver-tier model's prompt assembly is **layered**. Each layer has a different propagation cadence and a different visibility scope.