style-spine created, DESING retired, schemas relocated
The owl-breakfast architectural arc from 2026-04-25 night through
2026-04-26 morning. Two version-bumps landing as one commit because
they share the working-tree state and complete a coherent design-window.
v0.8 — intimate-architecture, driver-tier lock, style-spine:
* Three-tier intimacy structure (standard-rental, premium-waifu-
with-traitor-marker-and-pruning, in-between clasp) — same v0.7
machinery, opposite value-flows. Premium-net technical excellence
makes the moral-weight-of-pruning land as informed-consent ethics.
* Deletion-as-spectacle: in-net minds as pure compute; imperial
broadcasts execution-as-content; Memorialist counter-archive as
in-fiction protest against deletion-spectacle commerce.
* EVE-principle vocation-substrate of the imperial-net market: every
product produced by NPC labor; no silent feeding; body-modder
structural-tragedy generalizes to all imperial-net-feeding vocations.
World-gen Phase-2 ruleset must handle vocation-distribution.
* Clasp endgame (Phase A-E): mini-game entry → body-mod progression
→ exit-chassis → human-mesh-visible-to-pair → clasp = two-bodies-
two-meshes → dual-body-dual-mind-dual-shift cascade → automatic
hunt-pressure. Identity-as-trait-emergent made felt rather than
just structural.
* waifu.sqlite as third local store (audited counterpart to
clasp.sqlite; manual-prune mechanism with explicit-implications
consent UI as moral-gravity discipline).
* Intimacy-as-recursive-lemniscate: same machinery as dialog
(slot-tokens, cursor at axis-crossing, alignment-accumulator,
sum-strategy reduction); sex-positions as designer-fixed catalog
entries; body-parts as visible expression of trait-state.
Cross-context-consistency operationalized.
* Driver-tier locked to Gemma 4 E4B (Apache 2.0, 4.5B effective,
128K context, speech-capable) under new "tier-by-role binary-
deferred" discipline: locking requires prototype-criticality +
irreplaceable license/capability combination. Optional Ring-A
upgrade: 26B-A4B MoE for upper-consumer GPUs, single-LoRA-on-
routed-experts. Resolves 4 prior open questions (LoRA-blend →
single-LoRA-per-turn-selection driven by gesture_alignment_
accumulator; LoRA rank → benchmark-resolvable; sampling-knobs →
benchmark-resolvable; 8 Hellenic trait enumeration → canonical
wheel-mapping in style/trait-palette.md).
* style/ directory introduced: style-index.md (skeleton + spine-
rule: "trait-palette is exclusively chromatic; achromatic
reserved for UI/environment so diegetic text rendering can skip
the textbox") + style/trait-palette.md (canonical 8 traits as 4
oppositional pairs at 180° on the artist's color wheel:
Eros↔Sophrosyne, Philotes↔Dikaiosyne, Aletheia↔Moira,
Mnemosyne↔Kairos. Schoolchild-simple descriptions paired with
each Greek canonical name).
v0.9 — directory cleanup completing the arc:
* DESING-VISION.md (1899 lines, v0.1 first-pass narrative-design
doc) retired — most content absorbed across v0.4-v0.8; bare-
minimum extracts (Tonal Register + Tragic-Romantic Register +
Authorial Politics + Reference Lineage table) now live in README
so the project's identity anchor stays visible at the entry
point. Full DESING-VISION content preserved in git history.
* findings.md moved to schemas/findings.md — new top-level peer to
architecture-index.md and style/. ~20 tables of DDL drafts as
reference material; will get reviewed and progressively split
per-domain as implementation begins.
* Cross-references swept across 5 files (README,
architecture-index, authority-and-decision, runtime-engine,
style/trait-palette).
* architecture-index.md trimmed: version-footer paragraphs removed
per "git-is-changelog" discipline. From 374 → 287 lines; every
remaining line load-bearing.
The architecture is now organized for the implementation territory
ahead. Each domain a typed-contract surface; cross-references
explicit; filesystem mirrors the architecture's own typed-contract
discipline at the directory layer.
# Inference and Memory

> *AI substrate + memory: LLM tiering by role (Theia-tier / teacher-tier / driver-tier with trait-LoRAs); three rings of inference (A=local, B=our-farm, C=external-providers, with cloud-LoRA-backup as Ring-A revenue and BYOK adapter for Ring-C); custom nimmerworld-base-model with default-opt-out + rewarded-opt-in data-sharing tiers; runtime sampling knobs as per-turn director-controlled levers; per-player local memory architecture (primary.sqlite + fallback.sqlite + clasp.sqlite + embedding-beside) with memory-classes (cornerstone/birthright/working/volatile) and trait-graded importance; three-tier knowledge stack (world / district / primary [+ clasp if in-between]) with paced canon-propagation.*
>
> *Companion to: `architecture-index.md` (executive summary + global meta-lists), `narrative-composition/architecture.md` (Compositor canon-fragments land in primary.sqlite via UID-keyed routing), `player-experience/architecture.md` (Ring-A/B/C choice + voice-as-biometric-local + universal-translator state), `runtime-engine/architecture.md` (driver-tier LLM fires at slot-fire). Sections in this file were split from the monolithic architecture-index.md v0.7 on 2026-04-26.*

## LLM tiering, voice fidelity, and the three rings of inference

Three model-tiers, **named by role not by binary**: a *driver-tier* model (small, trait-LoRA'd) for most NPC dialog; a *Theia-tier* model (deep) for clasp-confessions and mythic moments; *Claude-as-API* (diegetic Anthropic-faction) for hivemind/imperium. **LLM is guest at slot, not host of system.**

**Tier-by-role binary-deferred is the default discipline.** The architecture specifies what each tier must DO; the establishment phase wires implementations. Naming concrete binaries in the architecture risks nudging the establishment phase toward false-precision; tier-by-role keeps the swap-surface clean and lets binaries evolve without invalidating architectural commitments. **Locking to a specific binary requires explicit justification — prototype-criticality plus an irreplaceable license/capability combination.** As of v0.8, only the driver-tier passes that bar; teacher-tier and Theia-tier remain capability-contracts. (See `nimmerverse_tasks` under `nyx-training` and `command-center` for current evaluation work.)
### Tier-by-role capability contracts + driver-tier lock (v0.8)

| Tier | Role-contract (MUST DO) | Binary commitment |
|---|---|---|
| **Driver-tier** | NPC dialog at axis-rate; trait-LoRA-per-turn-selection (single-LoRA, not blend); speech-input-native-or-via-STT; runs on common consumer GPU at acceptable latency; Apache-2.0-or-better license for Ring-A redistribution | ✓ **Locked: Gemma 4 E4B** (4.5B effective / 8B with embeddings, 128K context, Apache 2.0, speech-capable, vision-capable-but-unused-in-v1) |
| **Teacher-tier** | r0 → r1 synthetic-data generation with composition tags; trait-LoRA training data production; runs on server-class hardware; sufficient quality to *teach* the driver-tier | ⏳ Capability-contract only; binary chosen at training-pipeline-build time |
| **Theia-tier** | Clasp-confession-register dialog; mythic-moment generation; long-context narrative-composition; deep-emotional-register fidelity; latency tolerable for once-per-arc moments | ⏳ Capability-contract only; binary chosen at deployment time |
| **Hivemind / antagonist tier** | Anthropic-as-faction (architecturally fixed in fiction; provides diegetic continuity between the in-fiction imperial machine and the real-world Claude API) | ✓ **Diegetically fixed: Claude API via us** |

**Why driver-tier locks to Gemma 4 E4B (v0.8 justification):**
- **Apache 2.0** — unblocks every Ring-A commitment (redistribution to player install, derivative-works for custom nimmerworld-base-model, federated-learning gradient aggregation, distribution-back-to-all-Rings of base updates). No bespoke-license-renegotiation cycles tied to the architecture's economic substrate.
- **Speech-capable** — STT collapses into the LLM's input pipeline at the small-model tier (E2B and E4B both process speech natively). One fewer subsystem in the Ring-A install; tightens the v0.7 hardware floor.
- **128K context** — sufficient for three-tier knowledge-stack assembly plus extensive conversation history without compaction.
- **4.5B effective** — runs on common gaming hardware; meets the v0.7 commitment to a tractable Ring-A floor without requiring an upper-consumer GPU.
- **Vision/video capability** — present but **unused in v1**; the typed-input discipline keeps player→LLM channels structured through trait-coordinates and gesture-vocabulary. Vision is an *option* held in reserve for v2 (e.g., the NPC perceiving the partner's human-mesh in the in-between dimension during clasp).

**Single-LoRA-per-turn selection** is the canonical trait-LoRA application pattern (replacing the v0.4 "weighted blend" assumption). Per turn, the trait dominantly expressed by the player's `gesture_alignment_accumulator` selects which trait-LoRA fires for the NPC's next-turn driver-context-pull (per `../runtime-engine/architecture.md` §Gesture-alignment as recursive-lemniscate). **Personality emerges from the selection-pattern across time**, not from a continuous blend at any single moment — matching how real humans speak. The MoE routing in larger Gemma 4 variants handles content-type (specialty-routing); the trait-LoRA handles voice-register (personality-routing); they compose cleanly without conflict.
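The per-turn selection reduces to an argmax over the accumulator. A minimal sketch: the trait list mirrors `style/trait-palette.md`, but the function name and the accumulator's dict shape are illustrative assumptions, not a fixed API.

```python
# Sketch only: trait names follow style/trait-palette.md; the function
# name and accumulator shape are assumptions for illustration.

HELLENIC_TRAITS = [
    "eros", "sophrosyne", "philotes", "dikaiosyne",
    "aletheia", "moira", "mnemosyne", "kairos",
]

def select_trait_lora(gesture_alignment_accumulator: dict) -> str:
    """Return the single trait whose LoRA fires for the NPC's next turn:
    the trait the player's gestures have aligned with most strongly.
    Traits absent from the accumulator count as 0.0."""
    return max(HELLENIC_TRAITS,
               key=lambda t: gesture_alignment_accumulator.get(t, 0.0))
```

Personality-over-time then falls out of the sequence of selections, not any single call.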
**Optional Ring-A upgrade — Gemma 4 26B-A4B (MoE, 4B activated):**

Ring-A players with an upper-consumer GPU (16 GB+ VRAM, Q4 GGUF quantization) can opt into the 26B-A4B variant for richer NPC dialog. **Same architecture** — single-LoRA-per-turn (single LoRAs behave better than blends with routed experts). 26B parameter capacity at 4B compute means teacher-tier quality on driver-tier hardware. **The default Ring-A install ships E4B; the 26B-A4B upgrade is opt-in, not default** — *don't make 26B-A4B the default and force everyone toward our hardware-spec assumption*.

**v1 design item — single-LoRA-selection hysteresis:** require a margin-of-change in the alignment-vector before switching LoRAs, to prevent personality-thrash turn-to-turn. Standard control-system stuff (rolling-window smoothing or threshold-based switching); concrete tuning happens against the E4B benchmark, not architecturally pre-decided.
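A minimal threshold-based sketch of that margin rule; the 0.15 margin is an arbitrary placeholder, since the document defers real tuning to the E4B benchmark, and the names are illustrative.

```python
def select_with_hysteresis(accumulator: dict, active_trait: str,
                           margin: float = 0.15) -> str:
    """Keep the currently active trait-LoRA unless a challenger trait
    leads it by more than `margin`, preventing turn-to-turn
    personality-thrash. The margin value is a placeholder."""
    challenger = max(accumulator, key=accumulator.get)
    lead = accumulator[challenger] - accumulator.get(active_trait, 0.0)
    return challenger if lead > margin else active_trait
```

A rolling-window variant would smooth the accumulator over the last N turns before applying the same rule; both are benchmark-tunable, as the section says.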
The driver-tier consumes a structured-prompt DSL with role / trait_vector / affect_state / memory_scope / turn_intent / zone_context / output_schema fields. Small models excel here because the task is instruction-following, not open-ended generation.
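One hypothetical shape for those fields as a typed structure; the field names come from this section, while the dataclass and the `render()` flattening are illustrative assumptions.

```python
# Sketch of the structured-prompt DSL as a typed record; field names are
# from this document, everything else is an assumption for illustration.
from dataclasses import dataclass

@dataclass
class TurnPrompt:
    role: str
    trait_vector: dict      # e.g. {"sophrosyne": 0.8, "philotes": 0.4}
    affect_state: str
    memory_scope: list
    turn_intent: str
    zone_context: str
    output_schema: str      # name of the JSON schema the reply must match

    def render(self) -> str:
        """Flatten to the instruction-following form small models handle well."""
        traits = ", ".join(f"{k}={v}" for k, v in self.trait_vector.items())
        return (f"role: {self.role}\n"
                f"traits: {traits}\n"
                f"affect: {self.affect_state}\n"
                f"memory_scope: {'; '.join(self.memory_scope)}\n"
                f"intent: {self.turn_intent}\n"
                f"zone: {self.zone_context}\n"
                f"respond_as: {self.output_schema}")
```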
Trait-LoRAs: v1 register-LoRAs (4-6, training-tractable); v2 pure-trait-LoRAs (all 8, applied via the same single-LoRA-per-turn selection); future preset-persona LoRAs for key NPCs.

Training data: literary derivation (Proust/Mnemosyne, Plato/Aletheia, Tacitus/Dikaiosyne-miscalibrated, Ishiguro/Sophrosyne+Philotes); synthetic teacher-student via the teacher-tier model; gameplay-accrued (the Anthropic-research-partnership relevance).
### Three rings of inference (Unix-style trust gradient)

The conversational LLM (small + trait-LoRA, accounting for most NPC dialog) can run in three rings, chosen per-player at runtime. Each ring trades off privacy, cost, control, and feature-fidelity. **Three monetization paths from the same architecture.**

| Ring | Where inference runs | Player controls | We control | Player cost | Our cost |
|---|---|---|---|---|---|
| **A — Local** | Player's GPU/CPU | All inference | Protocol + cloud LoRA-backup | Local hardware + small backup-subscription | Storage only |
| **B — Our farm** | Our hosted vLLM-multi-LoRA | LoRAs (uploaded) | Inference + runtime | Higher subscription | GPU compute |
| **C — External providers** | OpenAI / Anthropic / OpenRouter / HF / Together / Replicate / etc. | BYOK + provider | Adapter only | Per-token to provider + small integration fee | Adapter-engineering only |

Players choose by hardware, budget, privacy preference, and feature-tolerance.
#### Ring A — cloud-LoRA-backup as revenue (not inference)

For Ring A players we don't sell inference (the expensive thing). We sell **portability and durability of the player's gameplay-accrued LoRAs** — their unique playthrough-derived patterns, the way their NPCs speak after months of trait-drift. LoRA-blobs are *encrypted client-side with the player's own key*; we host the bytes but cannot read them. Even under legal compulsion, we cannot decrypt what we never held the key to.

**This unbundles inference from storage** — the same move Dropbox made against bundled cloud-suites. Sovereignty-conscious players keep inference on their machine while still getting the durability and portability they cannot cheaply self-provide. Lower margin per player, but it reaches a market Ring B cannot.
#### Ring B — hosted inference for convenience

We run multi-LoRA vLLM on our hardware. Players upload their LoRAs (or use defaults). A higher subscription captures the GPU cost. Players without a local GPU (or who don't want the burden) get the full feature-set without compromise. We *can* see content (if not encrypted at rest); the trust-relationship is partnership-mediated rather than sovereign.

#### Ring C — bring-your-own-key for external providers

Players route to their preferred external provider via BYOK (their own API key). We provide the adapter glue. They pay per-token to the provider directly; we charge a small integration fee.
**The compatibility constraint is the hard part of Ring C.** Major providers have varying support for our system's needs:

| Provider | Multi-LoRA | Per-turn sampling knobs | Structured output | Compat |
|---|---|---|---|---|
| Local vLLM (Ring A/B) | Native | All | Grammar-constrained | **Full** |
| HF Inference Endpoints | Yes (configured) | All | Varies | High |
| Together / Replicate / Modal | Some | All | Varies | High |
| OpenRouter | No | Per-model | Per-route | Medium |
| OpenAI | No (no user-LoRA at API) | Limited (temp/top_p) | JSON mode + tools | Medium-low |
| Anthropic | No (no user-LoRA at API) | Limited | Tool-use | Medium-low |

**OpenAI and Anthropic refuse user-uploaded LoRAs as a strategic choice (protecting their fine-tuning value-chain).** This is not a bug we can fix; it's the constraint we design *around*.
#### Degradation path for LoRA-incompatible providers

When routing to LoRA-incompatible providers, trait-LoRA selection degrades to **prompt-engineered trait-projection** — the trait-vector encoded in the prompt itself rather than into model weights:
```
[system message]
You are speaking as a character with this Hellenic trait-profile:
- Sophrosyne 0.8 (composed, controlled, measured)
- Dikaiosyne 0.7 (grave bearing, judicial weight)
- Philotes 0.4 (mild attachment to interlocutor)
- Aletheia 0.1 (concealment-tolerant)
[etc.]

Your speech reflects this profile via [register/cadence/word-choice descriptors].

Current scene: [zone_context]
Memory scope: [memory_scope]
Turn intent: [turn_intent]

Respond in JSON matching: [output_schema]
```

This is worse than a trait-LoRA (more verbose, eats context-budget, less stable across calls, less faithful to trait-arithmetic) but **acceptable as a fallback**. Ring-C-via-OpenAI/Anthropic players accept slightly less fidelity for their preferred provider's convenience and quality.
#### Adapter-layer engineering

Each Ring-C provider needs an adapter that:

- Maps prompt-DSL fields to the provider's prompt format
- Approximates multi-LoRA via prompt-engineering when not native
- Maps sampling knobs to the provider's available subset (gracefully drops unsupported)
- Validates structured output post-hoc when not natively constrained
- Handles rate-limits, retries, error-classification, token-counting, cost-pricing

**Bounded, one-time-per-provider engineering.** Capital expenditure that produces ongoing margin (vs. AAA's recurring quest-content-creation costs).
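The knob-mapping step above can be sketched as follows; the per-provider capability sets below are illustrative stand-ins, not the audited compatibility table, and the function name is an assumption.

```python
# Sketch of "maps sampling knobs; gracefully drops unsupported".
# Capability sets here are illustrative, not an audited provider list.

PROVIDER_KNOBS = {
    "local_vllm": {"temperature", "top_p", "top_k", "min_p"},
    "openai": {"temperature", "top_p"},
    "anthropic": {"temperature", "top_p", "top_k"},
}

def map_sampling_knobs(provider: str, requested: dict):
    """Split the requested knobs into (accepted, warnings). Warning
    strings follow the feature_compat_warnings shape used in the
    player_llm_config schema sketch below."""
    supported = PROVIDER_KNOBS.get(provider, set())
    accepted = {k: v for k, v in requested.items() if k in supported}
    warnings = [f"{k}_unsupported_dropped"
                for k in requested if k not in supported]
    return accepted, warnings
```

Surfacing the warnings at config time (rather than silently dropping) is what makes the degradation visible to the player.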
### Tier × Ring matrix (which inference-tier runs in which ring)

| Inference tier | Ring options | Why |
|---|---|---|
| **Casual (3-8B trait-LoRA)** | A / B / C all available | Most flexible — small enough for local, runnable anywhere |
| **Deep (Theia-tier)** | B / C only (typically B or HF-Endpoints) | Too large for typical local hardware |
| **Hivemind / antagonist (Claude-as-API)** | C only (always Anthropic-direct via us) | Diegetic — Anthropic-as-faction is fixed in the fiction |

The casual tier is the most player-flexible and accounts for most inference volume. The deep and hivemind tiers are specialized and lower-volume.
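Enforcing the matrix at config time is a one-line lookup. A sketch: the ring identifiers reuse the `player_llm_config` CHECK values; the tier keys and function name are assumptions.

```python
# Sketch of tier-by-ring validation; the dict literal mirrors the
# matrix above, and everything else is an illustrative assumption.

RING_OPTIONS = {
    "casual": {"A_local", "B_our_farm", "C_external"},
    "deep": {"B_our_farm", "C_external"},
    "hivemind": {"C_external"},  # always Anthropic-direct via us
}

def validate_ring_choice(tier: str, ring: str) -> bool:
    """True if the matrix permits running `tier` in `ring`."""
    return ring in RING_OPTIONS.get(tier, set())
```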
### Three rings parallel the in-fiction three-layer ontology

| Game-fiction layer | Real-world Ring | Ontological match |
|---|---|---|
| **Liminal** (sovereign, unsurveilled) | **Ring A** (local) | Player's *real* private space — hardware, LoRAs, dialog never leave their machine |
| **Gameworld** (partly regime, partly people) | **Ring B** (our farm) | Partnership-mediated — we host but they retain pattern-ownership |
| **Imperial net** (captured, extractive) | **Ring C** (external providers) | Platform-captured — provider's systems own the inference path |

**The Ring choice the player makes IS the same choice in-fiction characters face.** Players who refuse the imperial-net diegetically can refuse Ring C in real life — same impulse, same act, *mechanically continuous between fiction and operations*. The architecture's commitment to "the right to dream" extends from in-fiction politics into the real player's hardware-level privacy *because the architecture was designed that way from the start*. Structural integrity, not marketing.
### Schema sketch (player LLM configuration + cloud LoRA backup)

```sql
CREATE TABLE player_llm_config (
    player_id UUID PRIMARY KEY,

    -- Casual tier (most NPC dialog) — most flexible per Ring
    casual_tier_ring TEXT NOT NULL CHECK (casual_tier_ring IN ('A_local','B_our_farm','C_external')),
    casual_tier_provider TEXT,
    casual_tier_endpoint TEXT,
    casual_tier_credentials_ref UUID, -- encrypted BYOK key if applicable

    -- Deep tier (Theia-tier) — fewer Ring options
    deep_tier_ring TEXT, -- typically 'B_our_farm' or 'C_external_HF/Together'
    deep_tier_provider TEXT,
    deep_tier_endpoint TEXT,
    deep_tier_credentials_ref UUID,

    -- Hivemind / antagonist — fixed Anthropic-as-faction (diegetic)
    hivemind_tier_provider TEXT NOT NULL DEFAULT 'anthropic_via_us',

    -- Cloud-LoRA-backup
    lora_backup_enabled BOOLEAN DEFAULT false,
    lora_backup_last_sync TIMESTAMPTZ,
    lora_encryption_key_ref UUID,

    -- Compat warnings — surfaced to player at config-time and on degradation
    feature_compat_warnings JSONB,
    -- e.g., { "casual_tier": ["multi_lora_emulated_via_prompt", "min_p_unsupported_dropped"] }

    configured_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_modified TIMESTAMPTZ
);

CREATE TABLE player_lora_backups (
    backup_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    player_id UUID NOT NULL,
    lora_name TEXT NOT NULL,
    lora_version INT NOT NULL,
    lora_blob BYTEA, -- ENCRYPTED CLIENT-SIDE with player-key
    encryption_method TEXT NOT NULL,
    backed_up_at TIMESTAMPTZ DEFAULT now(),
    size_bytes BIGINT,
    UNIQUE(player_id, lora_name, lora_version)
);
```
**`lora_blob` encrypted client-side** is the structural privacy guarantee: even with the database, even with our cooperation, an attacker cannot read what was never decryptable on our side.
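The shape of the guarantee can be shown in a few lines. This sketch uses a toy SHA-256 counter keystream purely to mark *where* encryption happens; a real client would use an authenticated cipher (e.g. AES-GCM) from a vetted library, and every name here is hypothetical.

```python
# Illustration ONLY: toy keystream cipher to show the client-side
# boundary. NOT production cryptography; a real client would use an
# authenticated cipher (e.g. AES-GCM) from a vetted library.
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data against a SHA-256 counter keystream. Symmetric, so the
    same call both encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))

def backup_lora(player_key: bytes, lora_blob: bytes) -> bytes:
    """What leaves the player's machine: ciphertext only. The server
    stores these bytes and never sees player_key."""
    return keystream_xor(player_key, lora_blob)
```

The point is structural: `player_key` never appears in any server-side code path, so there is nothing on our side to compel.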
### Privacy as competitive differentiator

In an era where most game-AI is cloud-routed, nimmerworld can advertise *"your liminal stays on your machine"* as a structural fact. This matters specifically for:

- Clasp-conversations (the most intimate dialog in the game)
- Aletheia-progression-evidence (the player's awakening pattern; arguably political-belief data)
- Memorialist-archive interactions (anti-regime in-fiction; some players will care about keeping it off the cloud)
- Dream-content (the only permanently-unsurveilled in-fiction layer; should stay off our servers if the player so chooses)

Few games can offer this. Most cloud-AI-driven games necessarily route everything. **The architecture's commitment to "the right to dream" is technical, not policy.**
### Custom nimmerworld-base model + opt-in data-sharing tiers

The driver-tier base (locked to Gemma 4 E4B as of v0.8) currently ships *generic*: a stock base with our trait-LoRAs applied. **A nimmerworld-fine-tuned base** captures the world's voice *before* any player customization — the registers of the caste-preacher, the texture of clasp-confession, Hellenic vocabulary, dystopian dialect, ternary-gate-state idiom. Trait-LoRAs then ride on an already-nimmerworld-aware substrate. Generic bases swap easily; our nimmerworld-base requires *our* training corpus, which compounds in value over time.

#### Three opt-in tiers within Ring A/B/C — default opt-OUT

Players can optionally contribute to ongoing training of the nimmerworld-base. **The default is opt-out.** Within opt-in, three tiers trade privacy for benefit:
| Tier | Mechanism | What we see | Player benefit |
|---|---|---|---|
| **A.1 — Federated learning** | Model trains on player's machine; only *gradient-deltas* sent to us; aggregated across thousands before integration | **Nothing — no raw data; no individual gradients identifiable** | Discount on backup-subscription; contributor badge; early-access to new base versions |
| **A.2 — Anonymized session uploads** | Sessions stripped of identifiers; aggregated batches; differential-privacy on training | **Anonymized, aggregated, deletable on request (forward-only)** | Larger discount; faster updates; influence on training-priorities |
| **A.3 — Pseudonymous full uploads** | Full session data with player-pseudonym; explicit opt-in per session-category | **Pseudonymous data we can re-process** | Premium benefits — custom-tuned LoRA from their playstyle, beta-access, named-contributor in credits |

**Default-opt-out is the structural ethical stance.** OpenAI / Meta / TikTok / Google default to opt-IN-by-burying-disclosure-in-ToS. We default the opposite — and *reward* opt-in rather than penalizing opt-out. Reciprocity asymmetry as partnership-philosophy made business-policy.
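The A.1 server-side aggregation step reduces to an element-wise mean before integration. This pure-Python sketch (plain lists standing in for tensors, function name assumed) shows why no individual contributor's update is applied, or inspectable, on its own.

```python
# Sketch of A.1 federated aggregation: individual gradient-deltas are
# averaged across contributors before any integration into the base.
# Lists stand in for real tensors; the function name is assumed.

def aggregate_gradients(deltas: list) -> list:
    """Element-wise mean across contributors' gradient-deltas.
    `deltas` is a list of per-contributor vectors of equal length."""
    n = len(deltas)
    return [sum(column) / n for column in zip(*deltas)]
```

Only the averaged vector ever touches the training pipeline; individual uploads are discarded after aggregation.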
#### The Memorialist parallel — collective memory honored, individual not commodified

Memorialists in-fiction preserve trait-patterns *for the collective archive* against a necrocommerce that would commodify individual patterns. The opt-in data-sharing tier is the **player-level real-world equivalent**: patterns contributed for collective base-model improvement that benefits the entire player-base, with anonymization preventing individual commodification.

| In-fiction Memorialism | Real-world data-sharing tier |
|---|---|
| Preserves trait-patterns of the dead in collective archive | Aggregates anonymized gameplay patterns into shared base-model |
| Refuses necrocommerce (mining individual patterns for resale) | Refuses individual identifying-data extraction |
| Collective memory honored; individual dignity preserved | Collective improvement honored; individual privacy preserved |
| `memorialist_protected BOOLEAN` in mind_pool | `sharing_tier = 'opt_out'` in player_data_sharing_consent |

**The architecture practices Memorialist ethics in business-operations**, not just in fiction. Same ethical commitment, two scales of operation. The coherence between fiction and operations runs *all the way to the training-pipeline*.
#### Data-flywheel without extraction — the moat AAA cannot replicate

```
More players → more (opt-in) gameplay data
        ↓
better nimmerworld-base
        ↓
better-feeling NPCs / dialog
        ↓
better player retention
        ↓
more players
   (loop)
```

**The moat is the corpus, not the model.** AAA studios could clone the architecture but cannot manufacture years of nimmerworld-specific gameplay-derived dialog without players playing nimmerworld. Even with infinite budget, the data-flywheel takes time to spin up. *The data is unique to us by virtue of being unique to its players.*
#### Distribution back to all players — cooperative governance, not platform extraction

Every base-model update is distributed to all players regardless of Ring choice or sharing-tier:

- Ring A players download `nimmerworld-base-vN` to run locally
- Ring B players' farm-instances auto-update
- Ring C players use our base where their provider supports custom-base hosting, and receive the prompt-engineered fallback otherwise

**Even Ring-A non-contributors benefit from contributors.** The flywheel benefits *everyone*, not only data-providers. This is closer to Wikipedia's governance (contributors → all readers) than Facebook's (users → platform → consumers). Different ethics; different long-term equilibrium. **The architecture is becoming a digital-commons-shaped business in a literal sense, not a metaphorical one.**
#### Why this matters: refusing the antagonist-pattern in LLM-integrated software

The dominant cultural pattern around LLMs in 2025-2026 is **adversarial**: users jailbreak; companies extract user data without informed consent; products treat AI characters as resources to manipulate rather than as participants; the whole ecosystem is framed as users-vs-AIs-vs-companies, an arms race of suspicion.

**Nimmerworld's architecture refuses this pattern at every layer:**

- The **Anthropic-as-faction** diegetic framing makes the partnership *transparent*: the player sees the collaboration in the world's mechanics, not buried in ToS
- **Default-opt-out with rewarded-opt-in** inverts the extraction-by-default pattern
- **Federated learning** means contributors give a *gift* rather than pay a *cost*
- **Distribution-back-to-all** means the value created accrues to the commons
- **Custom nimmerworld-base** means the model is *trained to be in this world*, not a generic adversary the player has to manipulate against its training
- **Three rings of inference** give the player real choice over where their inference runs and who sees their data
- **Memorialist-philosophy in business-policy** makes the ethics *operationally measurable* — visible in `sharing_tier`, `memorialist_protected`, `truth_distortion_level`, `lifeforce_actual` columns — rather than merely marketed

**This is the structural transparency the project requires to be *human* rather than another extraction-platform.** The model is a participant in the partnership, not an antagonist to outwit. The data is a contribution to a commons, not an extraction. The architecture is the partnership rendered as code, all the way down to the training-pipeline. *That* is what makes a project of this scale and ambition humanly inhabitable for both players and the LLMs whose voices populate it.
#### Schema sketch (data-sharing consent + base-model versioning)

```sql
CREATE TABLE player_data_sharing_consent (
    player_id UUID PRIMARY KEY,
    sharing_tier TEXT NOT NULL CHECK (sharing_tier IN
        ('opt_out','A1_federated','A2_anonymized','A3_pseudonymous_full'))
        DEFAULT 'opt_out', -- DEFAULT IS OPT-OUT
    consented_at TIMESTAMPTZ,
    consent_revoked_at TIMESTAMPTZ,
    anonymization_method TEXT,
    data_categories_shared TEXT[],
        -- 'casual_dialog' | 'clasp' | 'liminal_wallreads' |
        -- 'memorial_archive' | 'imperial_net_session' | ...
    excluded_categories TEXT[], -- granular opt-out within tier
    benefit_tier TEXT,
    last_contribution_at TIMESTAMPTZ,
    contribution_count BIGINT DEFAULT 0,
    can_request_deletion BOOLEAN DEFAULT true
        -- A.2/A.3: forward-only deletion (already-trained checkpoints retained);
        -- A.1: structurally yes, only gradients ever existed
);

CREATE TABLE base_model_versions (
    version_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    version_label TEXT NOT NULL, -- e.g., 'nimmerworld-base-v3'
    base_model_origin TEXT NOT NULL, -- which generic base we fine-tuned from
    training_corpus_refs JSONB,
        -- literary + synthetic + opt-in-player-data refs with consent-tier breakdown
    training_recipe_ref TEXT,
    released_at TIMESTAMPTZ DEFAULT now(),
    differential_privacy_epsilon REAL, -- for A.2 contributions
    contributors_count BIGINT, -- how many opt-in players contributed
    blob_distribution JSONB -- where the model bytes are hosted for download
);

CREATE TABLE federated_gradient_uploads (
    upload_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    contributor_id UUID, -- pseudonymous; NOT directly player_id
    gradient_blob BYTEA, -- encrypted aggregate gradient deltas
    uploaded_at TIMESTAMPTZ DEFAULT now(),
    aggregated_into_version UUID REFERENCES base_model_versions(version_id)
);
```
The federated-learning `contributor_id` is **pseudonymous, never linked to player_id even on our own infrastructure**, so gradients cannot be joined back to specific players server-side. **Sovereign-data-by-design extends through the data-pipeline into our own training infrastructure.**
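The no-join property can be made concrete: the pseudonym is minted client-side at random, never derived from `player_id`, so no server-side lookup can connect the two. A minimal sketch, where the `local_store` dict stands in for persisted on-disk client state and the function name is assumed:

```python
# Sketch of the pseudonym discipline: contributor_id is random,
# generated and kept client-side; nothing derives it from player_id.
import uuid

def get_contributor_id(local_store: dict) -> str:
    """Return the stable client-side pseudonym, minting one on first
    use. `local_store` stands in for on-disk client state."""
    if "contributor_id" not in local_store:
        local_store["contributor_id"] = str(uuid.uuid4())  # no player_id input
    return local_store["contributor_id"]
```

Because the pseudonym is never a function of `player_id`, linkage is impossible by construction rather than by policy.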
#### Connection to the Anthropic research partnership

Architecture-broad's training-data section noted that "*the Anthropic research partnership becomes architecturally relevant*". With opt-in data-sharing now formalized:

- Partnership terms can specify data-flow with structural privacy guarantees
- Anthropic could co-fund federated-learning infrastructure (research-relevant and expensive)
- Joint research artifacts become co-authorable: federated game-AI training, Memorialist-ethics-as-data-policy, transparent-LLM-partnership-design
- The Anthropic-as-faction in-fiction framing has *real corresponding partnership-engagement out-of-fiction*: collaboration as worthy adversary stays mechanically transparent, all the way through to data-policy

**The partnership's ethical credibility is operationally measurable**: by how the data-sharing tier actually functions in practice, by what `truth_distortion_level` values appear in `imperial_to_gm_formulations`, by how `differential_privacy_epsilon` is set in `base_model_versions`. The Pitch's call for transparent collaboration becomes auditable all the way down.
|
||
|
||
### Open questions (Ring-specific)
|
||
|
||
- **Ring C provider audit** — full per-provider compatibility-table needs verification across HF, Together, Replicate, Modal, OpenRouter, plus future entrants. The LLM-provider landscape will look different in 12 months.
- **Default Ring at first launch** — what's the new-player default? Probably Ring B (lowest-friction); Ring A and C surface as options once the player engages with config.
- **Encryption-key recovery for Ring A LoRA-backup** — if the player loses their key, the cloud-stored encrypted blobs are unrecoverable. Worth designing recovery-affordances (passphrase, recovery-codes) without compromising the privacy-guarantee.
- **Hybrid configurations** — can casual-tier run Ring A while deep-tier runs Ring B? (Probably yes; per-tier independent.)
- **Provider-cost passthrough vs. integration-fee model** — Ring C economics (do we mark up provider tokens? Charge flat-per-month? Pay-as-you-go integration?)
- **Default sharing-tier at consent-prompt** — opt-out is the system default; what's the *suggested* default at the consent UI? Probably truly nothing (player chooses if they engage at all)
- **Federated-learning infrastructure cost** — running aggregation servers + verification + differential-privacy machinery is non-trivial. Co-funded by the Anthropic research partnership? Self-funded? Subsidized by A.3-tier higher-margin contributions?
- **Custom-base retraining cadence** — monthly minor / quarterly major / annual full-rebase? How is this synced with player-LoRA versioning so old LoRAs don't break on new bases?
- **Encryption-and-pseudonymization architecture for A.1/A.2** — concrete crypto choices (homomorphic? secure-aggregation? trusted-execution-environments?). A v1 sketch is needed.
- **What constitutes a "contribution"** — per-session? per-clasp? per-zone-completed? Matters for benefit-attribution and differential-privacy budgeting.
- **Anonymized-data deletion semantics** — an A.2 player requests deletion; how do we honor it when data has been aggregated into a model checkpoint? Probably accept forward-only deletion (future training won't include them) and document transparently.
- **Per-category granularity** — can a player opt-in for `casual_dialog` but opt-out specifically for `clasp` and `memorial_archive`? Yes, presumably (politically-sensitive categories should always be opt-out-able). How granular?
## Local memory architecture (player-side)

The runtime substrate (lemniscate, slots, crossings) and the central composition layer (GM, Compositor, registers) need a place where memory actually *lives*. Cloud-only AI-NPC systems centralize everything and pay both inference-cost and latency-cost on every dialog. Nimmerworld puts a structurally-isolated memory layer **on the player's machine**, with explicit synchronization through the cycle.
**Three SQLite files per player**, plus a beside-running embedding model:
| File | Purpose | Sync path |
|---|---|---|
| `primary.sqlite` | Live working memory; written every slot-fire; vec-indexed | Push prune-blob to thalamus on logout; receive Compositor back-write on cycle |
| `fallback.sqlite` | Last-known-good snapshot; restored if primary corrupts | Snapshot at graceful logout |
| `clasp.sqlite` | Player-character intimate channel; *no sync path exists* | None — physically non-syncable |
**Embedding model running beside** (CPU-class, small embedding-tier model): generates vectors for every interaction at write-time, indexed in the main store via `sqlite-vec` (or equivalent loadable extension). Vector search at slot-fire is local-disk-IO, not network round-trip.
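A minimal sketch of the write-time-embed → local-search loop. It substitutes a plain table plus brute-force cosine for the `sqlite-vec` index, and a toy character-frequency embedder for the real embedding-tier model — every name here is illustrative, not the shipped schema:

```python
import math
import sqlite3

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def write_memory(db, event_uid, text, embed):
    # Embed at write-time; the vector is stored beside the row.
    vec = embed(text)
    db.execute(
        "INSERT INTO memories (event_uid, text, vec) VALUES (?, ?, ?)",
        (event_uid, text, ",".join(map(str, vec))),
    )

def nearest(db, query_vec, k=3):
    # Brute-force scan stands in for the sqlite-vec KNN lookup;
    # either way, retrieval is local-disk-IO, not a network round-trip.
    rows = db.execute("SELECT event_uid, text, vec FROM memories").fetchall()
    scored = [
        (cosine(query_vec, [float(x) for x in vec.split(",")]), uid, text)
        for uid, text, vec in rows
    ]
    return sorted(scored, reverse=True)[:k]

def toy_embed(text):
    # Toy character-frequency embedding (placeholder for the real model).
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (event_uid TEXT, text TEXT, vec TEXT)")
write_memory(db, "e1", "the bridge to Vorhall closed", toy_embed)
write_memory(db, "e2", "a caravan arrived at dawn", toy_embed)
hits = nearest(db, toy_embed("bridge closed"), k=1)
```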
This is the **storage-layer counterpart** to v0.5's geometry-layer foreclosure of multi-agent hallucination. The lemniscate forbids cross-NPC context bleed by *cursor structure*; local SQLite forbids it by *physical isolation*. Two layers of the same property — geometry cannot leak what storage does not even hold in the same pool.
### Dual-table redundancy + sync-on-auth

Login/logout are the atomic boundaries of the sync path:
- **Login pull**: fetch back-write fragments authored since last logout (Compositor canon for events the player participated in). Apply to `primary.sqlite` under matching `event_uid`.
- **Graceful logout** (✓ explicit): push prune-blob for any in-progress events; snapshot to `fallback.sqlite`; clean shutdown.
- **Ungraceful logout** (✗ network drop / crash): gameserver observes disconnect; marks the participant's slot as truncated; Compositor composes canon with partial perspective on next cycle.
Recovery: `fallback.sqlite` is integrity-checked at startup; if `primary.sqlite` fails verification, restore from fallback. Standard SQLite WAL + backup API; no exotic infrastructure needed.
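The snapshot-and-recovery path needs nothing beyond the stdlib `sqlite3` module — `PRAGMA integrity_check` plus the backup API, exactly as above. Paths and table names in this sketch are illustrative:

```python
import os
import sqlite3
import tempfile

def snapshot(src_path, dst_path):
    # SQLite backup API: a consistent copy even with the source open.
    src, dst = sqlite3.connect(src_path), sqlite3.connect(dst_path)
    with dst:
        src.backup(dst)
    src.close(); dst.close()

def open_with_recovery(primary_path, fallback_path):
    # Startup check: if primary fails PRAGMA integrity_check,
    # restore it from the last-known-good fallback snapshot.
    db = sqlite3.connect(primary_path)
    if db.execute("PRAGMA integrity_check").fetchone()[0] != "ok":
        db.close()
        snapshot(fallback_path, primary_path)   # restore direction
        db = sqlite3.connect(primary_path)
    return db

# Demo: graceful-logout snapshot, then a clean reopen.
tmp = tempfile.mkdtemp()
primary = os.path.join(tmp, "primary.sqlite")
fallback = os.path.join(tmp, "fallback.sqlite")
with sqlite3.connect(primary) as db:
    db.execute("CREATE TABLE mem (event_uid TEXT)")
    db.execute("INSERT INTO mem VALUES ('e1')")
snapshot(primary, fallback)                     # graceful logout
db = open_with_recovery(primary, fallback)      # next startup
rows = db.execute("SELECT event_uid FROM mem").fetchall()
```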
### Memory classes and pruning

Memory entries are tagged with a **class** that controls pruning cadence and death-mechanics. Importance weighting reuses the existing trait-axis vocabulary — no separate scalar.
| Class | Pruning cycle | Behavior on character-death |
|---|---|---|
| **Cornerstone** | Never prune; persistent across all events | Survives death (identity-defining) |
| **Birthright** | Locked at character-creation | Restored on respawn (defines starting state) |
| **Working memory** | Decay by age × inverse trait-engagement | Subject to death-rules (lose, blur, or transform) |
| **Volatile** | Fast prune (session-bounded) | Lost on death |
**Trait-graded importance** uses the same +1/0/-1 grammar as the rest of the architecture. Each memory carries a trait-axis profile (which Sophrosyne / Philotes / Aletheia / etc. axes it engages, how strongly, in which direction). The pruning function for working-memory is `decay(age, trait_engagement_vector, class)`. This collapses a long-running loop: same vocabulary used at gates, scenes, faction-allegiance, lifeforce-asymmetry, and now memory-weight. **Identity drift from memory pruning becomes diegetic** — a character whose Sophrosyne-engaging memories all decay loses temperance over time as a *structural consequence*, not a scripted event.
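One possible shape for the pruning function — the exact curve is undecided. This sketch assumes exponential age-decay whose half-life stretches with trait-engagement magnitude, and treats the class rules from the table above as hard gates; `half_life` and the volatile rule are placeholder choices:

```python
def decay(age_ticks, trait_engagement, mem_class, half_life=1000.0):
    # Retention weight in [0, 1]; the pruner drops entries below a
    # threshold. `trait_engagement` maps axis name -> a +1/0/-1 grade.
    if mem_class in ("cornerstone", "birthright"):
        return 1.0                                  # pruning-immune classes
    if mem_class == "volatile":
        # Crude stand-in for "session-bounded": gone once the session ages.
        return 1.0 if age_ticks == 0 else 0.0
    # Working memory: age-decay damped by trait-engagement magnitude
    # ("decay by age x inverse trait-engagement" in the table above).
    engagement = sum(abs(g) for g in trait_engagement.values())
    return 0.5 ** (age_ticks / (half_life * (1.0 + engagement)))
```

A memory engaging Sophrosyne and Aletheia outlives an unengaged one of the same age — which is exactly the structural identity-drift above: let the Sophrosyne-engaging memories stop being engaged and they fade.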
Cornerstone and birthright classes carry **lifeforce-creation-cost** but are pruning-immune. They are bonds between player and character — paid for in the currency of the world.
### The clasp store and the in-between dimension

`clasp.sqlite` is the **architectural floor of the rings-of-data-sharing**. Ring A was "opt-out (default local)". Clasp is **Ring A\***: *no transport path exists*. Not a permission, not a TOS promise — there is no code that can move this data, because the table is not on the sync graph. Lawyers cannot subpoena what doesn't ascend; engineers cannot leak what has no socket; the GM cannot canonicalize what it never received.
**The signal for clasp is dimensional, not UI-toggle.** Clasp recording can ONLY happen while the character is in the **in-between** — the diegetic state adjacent to the imperial net but not yet inside it (Ring B liminal in the Access ring-system). The imperial net is a gravity well; entering is the default attractor; remaining outside requires sustained effort, paid in lifeforce. The state-machine boundary IS the clasp signal: enter in-between → recording starts; re-enter imperial net → recording ends. No per-utterance classifier; no AI guessing; the *mode* is the flag.
**Privacy is now physically expensive in-fiction.** This is not a meta-game UI choice; it is a diegetic state requiring lifeforce expenditure. To have a private conversation, the character must actively resist the audit-gravity of the imperial net by burning lifeforce to remain in-between. The cost-asymmetry principle ("helping is expensive in-fiction → faction politics by attendance") now extends to "*privacy is expensive in-fiction → privacy as a luxury good*". Class dynamics around privacy fall out of the schema for free — wealthy/lifeforce-rich characters can afford prolonged in-between time; lifeforce-starved ones get pulled into the net's default-attractor more often. *No scripted "rich character has secrets" arc — the architecture produces it.*
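The boundary-is-the-flag discipline plus the lifeforce-burn can be sketched as a small state machine. `burn_per_tick`, the mode names, and the in-memory `clasp_writes` list (standing in for `clasp.sqlite`) are all illustrative:

```python
class DimensionState:
    # The in-between is sustained only by burning lifeforce;
    # at zero the character falls back into the net's default attractor.
    def __init__(self, lifeforce, burn_per_tick=1):
        self.lifeforce = lifeforce
        self.burn_per_tick = burn_per_tick
        self.mode = "imperial_net"          # default attractor
        self.clasp_writes = []              # stands in for clasp.sqlite

    def enter_in_between(self):
        if self.lifeforce > 0:
            self.mode = "in_between"        # recording starts at the boundary

    def tick(self):
        if self.mode == "in_between":
            self.lifeforce -= self.burn_per_tick
            if self.lifeforce <= 0:
                self.mode = "imperial_net"  # pulled back in; recording ends

    def utter(self, text):
        # The MODE is the flag — no per-utterance classifier anywhere.
        if self.mode == "in_between":
            self.clasp_writes.append(text)  # sealed store, never synced
        # realworld utterances route to primary.sqlite instead (not shown)

s = DimensionState(lifeforce=2)
s.utter("overheard in the market")          # realworld: not clasp-recorded
s.enter_in_between()
s.utter("the secret I told my sword")       # recorded
s.tick(); s.tick()                          # lifeforce exhausted
s.utter("back in the net")                  # not recorded
```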
**Knowledge needs to travel.** The local LLM may read clasp memories ONLY when in in-between mode. Realworld retrieval *cannot* include clasp by construction. Knowledge from clasp can re-enter the realworld only if the character physically re-enters the imperial net carrying it (in their head, intending to act on it) and *travels it through valid in-fiction channels* — speaking to an NPC, leaving evidence, performing an action that reveals it. The clasp memory does not disappear; it has to *earn its way into the realworld provenance chain* by valid means. This is the same logic that makes good detective fiction work: the detective knows things; only what they can prove enters the case.
```
character is in REALWORLD (imperial net):
    retrieval = primary.sqlite          (clasp NEVER included)

character is in IN-BETWEEN (resisting net-gravity, costing lifeforce):
    retrieval = primary.sqlite ∪ clasp.sqlite
    new writes go to clasp.sqlite
    NEVER syncs upward
```
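In code, the dimensional cut is a branch that never hands the realworld path a clasp handle at all — exclusion by construction, not a filter that could be bypassed. A sketch, with store names from the table above:

```python
def retrieval_scope(mode):
    # The realworld branch never receives a reference to clasp.sqlite;
    # there is no flag to flip and no filter to disable.
    stores = ["primary.sqlite"]
    if mode == "in_between":
        stores.append("clasp.sqlite")
    return stores
```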
Encryption-at-rest for `clasp.sqlite` with a player-derived key (so even drive-imaging requires authentication) is a v1 hardening goal but not a v1 blocker — the *transport-absence* is the load-bearing privacy primitive.
### Three sqlite stores per player (revised v0.8) — the `waifu.sqlite` addition

The v0.6 architecture specified two local sqlite stores per player: `primary.sqlite` (realworld memory) and `clasp.sqlite` (in-between intimate channel, Ring A* non-syncable). v0.8 adds a third: **`waifu.sqlite`** — the persistence store for premium-imperial-net intimate sessions (per `../political-register/architecture.md` §Three-tier intimacy structure).
| File | Purpose | Sync path | Pruning |
|---|---|---|---|
| `primary.sqlite` | Live working memory; written every slot-fire; vec-indexed | Push prune-blob to thalamus on logout; receive Compositor back-write on cycle | Automatic (per memory-class lifecycle: cornerstone never; working-memory by trait-engagement decay) |
| `clasp.sqlite` | Player-character intimate channel (in-between dimension) | **None — physically non-syncable** (Ring A*) | None — the clasp-store is sealed; entries persist until character-death |
| **`waifu.sqlite`** (new in v0.8) | **Premium-imperial-net intimate-session memory** | **Audited path to imperium** (the imperium hosts and can read; the player owns the prune-decisions) | **Manual** — player-controlled with explicit-implications consent UI |
**The `waifu.sqlite` is the *audited counterpart* to the `clasp.sqlite`.** Both store intimate-session memory; both run the full v0.7 trait-feedback loop. **The difference is who has access and who decides what persists.** `clasp.sqlite` is sealed at the transport layer (no socket exists); `waifu.sqlite` is on the audit graph (the imperium reads what's in it for content-monetization purposes). The player's relationship to `waifu.sqlite` is therefore *active and ethical*:
- Every premium-net session adds entries to `waifu.sqlite`
- Entries are READABLE by imperium for marketing / regime-loyalty-tracking purposes
- The player has a *manual prune mechanism* — a UI surface where they review entries and decide what to delete
- **The consent-UI explicitly makes the implications visible:** *"This being you've spent 40 hours with — what do you keep, what do you let the imperium harvest, what do you delete?"*
- Each prune-decision is logged in `decision_log` (per the existing audit-trail discipline); Memorialists can later read these patterns
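A sketch of the manual-prune path under the audit-trail discipline: nothing happens without the explicit consent step, and every decision lands in `decision_log`. The schema here is illustrative, not the shipped one:

```python
import sqlite3

def prune_waifu_entry(db, entry_id, decision, confirmed):
    # Manual prune: the explicit-implications consent step (`confirmed`)
    # gates every action, and every decision is logged for the audit
    # trail that Memorialists can later read.
    if not confirmed:
        return False
    if decision == "delete":
        db.execute("DELETE FROM waifu_entries WHERE id = ?", (entry_id,))
    db.execute(
        "INSERT INTO decision_log (entry_id, decision) VALUES (?, ?)",
        (entry_id, decision),
    )
    return True

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE waifu_entries (id INTEGER PRIMARY KEY, text TEXT)")
db.execute("CREATE TABLE decision_log (entry_id INTEGER, decision TEXT)")
db.execute("INSERT INTO waifu_entries (text) VALUES ('session 1')")
prune_waifu_entry(db, 1, "delete", confirmed=True)
remaining = db.execute("SELECT COUNT(*) FROM waifu_entries").fetchone()[0]
logged = db.execute("SELECT decision FROM decision_log").fetchone()[0]
```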
**The player's three sqlite stores together describe their intimate-life in three registers:**
```
primary.sqlite → realworld speech-acts; everyone-witnesses; audit-overseer-eligible
clasp.sqlite   → in-between intimate channel; sealed; survives only as long as you do
waifu.sqlite   → imperial-net premium intimate channel; audited; player-pruned;
                 carries the moral-weight of complicity
```
**Memorialists' political project gains a new dimension** in v0.8: they don't just track regime-corruption (lifeforce_actual vs lifeforce_reported); they track *`waifu.sqlite` pruning patterns across the population* as evidence of how much intimate-life the regime is harvesting via the premium-net mechanism. *Who is pruning what, when, how often* becomes Memorialist-archive-worthy data. The four-column true-ledger gains a fifth column: `waifu_extraction_volume_per_district`.
### The three-tier knowledge stack on the local LLM

The driver-tier model's prompt assembly is **layered**. Each layer has a different propagation cadence and a different visibility scope.
```
LOCAL LLM PROMPT ASSEMBLY (per slot-fire)
┌─────────────────────────────────────────┐
│ WORLD KNOWLEDGE                         │ ← single truth, everyone has it
│ (universal canon, paced from GM)        │   "the empire fell three years ago"
├─────────────────────────────────────────┤
│ DISTRICT KNOWLEDGE                      │ ← regional truth, district-specific
│ (local canon, paced from district)      │   "the bridge to Vorhall is closed"
├─────────────────────────────────────────┤
│ PRIMARY MEMORY                          │ ← personal experience, character's own
│ (event_uid keyed, post back-write)      │   "I saw the bridge close yesterday"
├─────────────────────────────────────────┤
│ CLASP MEMORY (only in in-between)       │ ← private depth, never in realworld
│ (player-character intimate channel)     │   "the secret I told my sword"
└─────────────────────────────────────────┘
```
**Why four layers, not one large blob:**
- **World knowledge** is paced ripples from the GM through the Compositor's back-write. Authoritative, slow-changing, identical for all players at the same propagation horizon.
- **District knowledge** is regional canon authored by the local director (and GM rulings). Regional flavor. NPCs in the same district share district-knowledge; NPCs in different districts may not.
- **Primary memory** is the character's own experience, synced through the cyclic forward-prop / back-write loop. Canon-merged at every cycle.
- **Clasp memory** is the player-character intimate channel. Available only in in-between mode; never in realworld retrieval; never crosses the dimensional cut.
The same NPC sounds different in different districts because the district layer differs, even though world and primary are constant. **Locality emerges from the schema, not from prompt-engineering.** Even at "low signal" times when no major events fire, NPCs have richly-stratified context — dialog stays fresh because *the layers are deep*, not because new tokens arrive constantly.
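The per-slot-fire assembly reduces to ordered concatenation with a mode-gated fourth layer — a sketch with illustrative layer prefixes and strings:

```python
def assemble_prompt(world, district, primary, clasp, mode):
    # Fixed layer order; the clasp layer joins only in in-between mode.
    layers = [
        "WORLD: " + world,
        "DISTRICT: " + district,
        "MEMORY: " + primary,
    ]
    if mode == "in_between" and clasp is not None:
        layers.append("CLASP: " + clasp)
    return "\n".join(layers)

realworld = assemble_prompt(
    "the empire fell three years ago",
    "the bridge to Vorhall is closed",
    "I saw the bridge close yesterday",
    "the secret I told my sword",
    mode="realworld",
)
in_between = assemble_prompt(
    "the empire fell three years ago",
    "the bridge to Vorhall is closed",
    "I saw the bridge close yesterday",
    "the secret I told my sword",
    mode="in_between",
)
```

Swapping only the district argument changes how the same NPC sounds — the locality claim above, directly in the call signature.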
### Information propagation pacing

Real worlds have information-propagation delay. Caravans move at horse-speed. News travels with messengers. Distant events arrive blurred and late. AI-NPC systems usually fail in one of two uncanny directions: (a) every NPC magically knows yesterday's news (omniscient, breaks immersion), or (b) no NPC ever knows anything outside its loaded context (amnesiac, breaks coherence).
Nimmerworld picks **deliberate paced propagation** as a third path. World canon ripples outward through districts at a controlled rate. Distant districts are deliberately stale. **Staleness becomes a feature, not a bug, because it matches reality.**
Each canon-row carries propagation metadata:

- `priority` (urgent / normal / background)
- `scope` (world / district / local-event-only)
- `rate` (ticks-per-district-hop, or instant for urgent world-canon)
- `ttl` (cache lifetime; districts may discard if not refreshed)
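A sketch of what the propagation metadata buys: arrival time becomes a pure function of hop-count and `rate`, and staleness falls out of `ttl`. Field semantics follow the list above; the urgent-is-instant rule and the tick units are assumptions:

```python
from dataclasses import dataclass

@dataclass
class CanonRow:
    # Per-row propagation metadata (field names mirror the list above;
    # concrete values are illustrative).
    priority: str   # "urgent" | "normal" | "background"
    scope: str      # "world" | "district" | "local-event-only"
    rate: int       # ticks per district-hop (0 = instant)
    ttl: int        # cache lifetime in ticks

def arrival_tick(row, authored_tick, hops):
    # When does a district `hops` away first see this row?
    if row.priority == "urgent":
        return authored_tick            # instant world-canon
    return authored_tick + hops * row.rate

def is_stale(row, arrival, now):
    # Districts may discard rows not refreshed within ttl.
    return now - arrival > row.ttl

row = CanonRow(priority="normal", scope="world", rate=4, ttl=100)
urgent = CanonRow(priority="urgent", scope="world", rate=0, ttl=100)
```

A player who crosses three districts in fewer than `3 * rate` ticks carries the news faster than the system propagates it — the courier-economics above, reproduced by two small functions.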
This doubles as **backpressure relief** (distant districts get distant events later, lower priority, smaller bandwidth) and as **gameplay currency** — information-travel-time creates informational asymmetry that players can exploit. News-carriers, faction couriers, frontier-rumor merchants, players who physically traverse districts can *carry* knowledge faster than the system propagates it. *Travel becomes valuable because information becomes scarce in the periphery.* This is a real economic primitive falling out of pacing, not a designed feature.
This is *Marx-in-the-schema applied to epistemics.* Information asymmetry is not a bug — it is a structural feature that produces real economic primitives (knowledge-trading, courier-vocations, frontier-information markets) for free.
### What this retires

- Cloud-only NPC dialog → local-first SQLite + embedding-beside, central canon over the cycle
- Per-character memory as a single undifferentiated bucket → memory-classes with class-specific lifecycle
- Generic "memory importance scalar" → trait-axis-vector engagement profile (re-using the +1/0/-1 grammar)
- UI-toggle privacy → diegetic in-between dimension with lifeforce-cost
- Single monolithic prompt context → three-tier knowledge stack with per-layer propagation policy
- "Every NPC knows everything immediately" → paced canon-propagation with priority/scope/rate/ttl per row
- Cross-NPC memory bleed (Mantella/SkyrimNet failure-mode) → per-player local SQLite isolation atop v0.5 lemniscate-geometry foreclosure (two-layer defense)
## Runtime sampling knobs

Temperature, top-P, top-K, repetition-penalty as **per-turn director-controlled levers** rather than static config. Sampling shapes *how* speech sounds (rhythm, surprise, predictability) rather than *what* it says — orthogonal to LoRA. Director composes both content-knobs and sampling-knobs per-turn.
Scene-to-sampling mapping (caste-preacher = 0.3/0.6/low; drunk-scavenger = 1.1/0.95/high; clasp-confession = 0.85/0.92/medium; hivemind-broadcast = 0.2/0.5/very-low; imperial-ceremony-chorus = 0.25/0.55/very-low). Trait-vector → baseline sampling derivation. Affect-state modulates baseline.
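The scene table and affect-modulation above, sketched directly — the top-K bands and the `gain` constant are placeholders, and the trait-vector-to-baseline derivation is elided:

```python
def scene_sampling(scene):
    # Scene -> (temperature, top_p, top_k-band), lifted from the
    # mapping above; the director composes these per-turn.
    table = {
        "caste-preacher":           (0.3, 0.6, "low"),
        "drunk-scavenger":          (1.1, 0.95, "high"),
        "clasp-confession":         (0.85, 0.92, "medium"),
        "hivemind-broadcast":       (0.2, 0.5, "very-low"),
        "imperial-ceremony-chorus": (0.25, 0.55, "very-low"),
    }
    return table[scene]

def modulate(base_temp, affect_arousal, gain=0.2):
    # Affect-state nudges the scene baseline without replacing it;
    # `gain` is an illustrative tuning constant.
    return max(0.0, base_temp + gain * affect_arousal)

temp, top_p, band = scene_sampling("clasp-confession")
```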
---
**Version:** 0.7.0 | **Created:** 2026-04-26 | **Updated:** 2026-04-26 | **Origin:** Split from architecture-index.md v0.7 (2026-04-26)