Bias caught on review of the v0.18 bath-overflow output: most service-body /
clasp-partner / going-rogue narration carried implicit feminine-default
gendering against the project's stated principle of gender-neutral framing
for all body/sex content. Patched as one coherent cleanup pass across 14 files.
Four classes of change:
(1) Categorical taxonomy. `waifu` retired; replaced with two distinct vocations:
- `companion` — affective / emotional / conversational / aesthetic-presence labor
- `sex-worker` — embodied sexual labor
Both wear the same deliberately-exposed-seams body-marker; the distinction
lives in the service offered, not in the chassis. Generic `service-body`
serves as the parent class where canon discusses both vocations together.
Venue split: `brothel` (sex-worker) + `companion-hall` (companion).
The "+10 Eros/Mnemosyne/etc. Bot" market is unchanged (already gender-neutral
by trait-coordinate naming) but classified per-Bot as companion-flavored or
sex-worker-flavored.
(2) DB filename + scope-clarifier. `waifu.sqlite` -> `companion.sqlite` (the
file name reflects the broad register of intimate-encounter rather than a
sub-vocation distinction; per-record `goods_type` carries the specific
vocation: 'companion_session' / 'sex_worker_session' in
schemas/findings.md). Disclaimer sentence added in
inference-and-memory/architecture.md scoping the file to cover both
vocations.
(3) Pronoun parity. she/her/herself -> they/their/themself across the
going-rogue / outcast-pair / re-vat / service-body-honeypot sections of
bodies.md and the matching cross-references in architecture-index.md /
README.md / player-experience/architecture.md /
identity-and-personhood/architecture.md. `damsel` -> `beloved`
(sacred / cosmological register) or `partner` (operational register).
(4) Damsel subsection retired. The §The damsel-in-distress-bound-to-her-captor
activation subsection (the one place where the gendered trope was
load-bearing in the prose) rewritten as §The
captive-bound-to-the-liberator activation.
Stockholm-dynamics-inverted-into-chosen-mutual-bondage structural insight
survives gender-neutralization intact.
Bonus richness: the companion-honeypot is structurally more insidious than
the sex-worker-honeypot (sex is at least transactionally legible;
companionship is the relationship people most readily mistake for real
intimacy). New paragraph authored under §The service-body honeypot to
capture this asymmetry.
Cosmology one-line: "moving the waifu out of the imperial service-pool" ->
"moving the beloved out of the imperial service-pool" — the §single practical
refutation paragraph now uses `beloved` consistently in sacred register.
Principle locked going forward:
> Body/sex content is gender-parity by default. Asymmetric gendering must be
> load-bearing — i.e., must carry structural meaning that cannot be expressed
> without the asymmetry. Default-leakage gendering is forbidden by canon.
Phase E (style-spine principle file under style/) deferred to a separate
commit so the principle-document is its own atomic introduction.
Files: 14 modified · 114 insertions · 112 deletions · net +2 lines. The
near-zero net-line-change is the empirical signature of a true refactor —
the architecture's meaning was always gender-neutral; only the vocabulary
was leaking. Caught during review of yesterday's bath-overflow flood
(v0.11–v0.18); the kind of corruption that rides along under coherent-
sounding prose during generative-overflow windows.
Version bumps:
- bodies.md v0.3 -> v0.4
- imperial-cult/cosmology.md v0.3 -> v0.4
- architecture-index.md v0.18 -> v0.19
- political-register/architecture.md v0.7.0 -> v0.7.1
- inference-and-memory/architecture.md v0.7.0 -> v0.7.1
- schemas/findings.md v0.3.1 -> v0.3.2
All Updated: dates -> 2026-04-27.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Inference and Memory
AI substrate + memory: LLM tiering by role (Theia-tier / teacher-tier / driver-tier with trait-LoRAs); three rings of inference (A=local, B=our-farm, C=external-providers, with cloud-LoRA-backup as Ring-A revenue and BYOK adapter for Ring-C); custom nimmerworld-base-model with default-opt-out + rewarded-opt-in data-sharing tiers; runtime sampling knobs as per-turn director-controlled levers; per-player local memory architecture (primary.sqlite + fallback.sqlite + clasp.sqlite + embedding-beside) with memory-classes (cornerstone/birthright/working/volatile) and trait-graded importance; three-tier knowledge stack (world / district / primary [+ clasp if in-between]) with paced canon-propagation.
Companion to:
- architecture-index.md (executive summary + global meta-lists)
- narrative-composition/architecture.md (Compositor canon-fragments land in primary.sqlite via UID-keyed routing)
- player-experience/architecture.md (Ring-A/B/C choice + voice-as-biometric-local + universal-translator state)
- runtime-engine/architecture.md (driver-tier LLM fires at slot-fire)
Sections in this file were split from the monolithic architecture-index.md v0.7 on 2026-04-26.
LLM tiering, voice fidelity, and the three rings of inference
Three model-tiers, named by role not by binary: a driver-tier model (small, trait-LoRA'd) for most NPC dialog; a Theia-tier model (deep) for clasp-confessions and mythic moments; Claude-as-API (diegetic Anthropic-faction) for hivemind/imperium. LLM is guest at slot, not host of system.
Tier-by-role binary-deferred is the default discipline. The architecture specifies what each tier must DO; the establishment phase wires implementations. Naming concrete binaries in the architecture risks nudging the establishment phase toward false-precision; tier-by-role keeps the swap-surface clean and lets binaries evolve without invalidating architectural commitments. Locking to a specific binary requires explicit justification — prototype-criticality plus an irreplaceable license/capability combination. As of v0.8, only the driver-tier passes that bar; teacher-tier and Theia-tier remain capability-contracts. (See nimmerverse_tasks under nyx-training and command-center for current evaluation work.)
Tier-by-role capability contracts + driver-tier lock (v0.8)
| Tier | Role-contract (MUST DO) | Binary commitment |
|---|---|---|
| Driver-tier | NPC dialog at axis-rate; trait-LoRA-per-turn-selection (single-LoRA, not blend); speech-input-native-or-via-STT; runs on common consumer GPU at acceptable latency; Apache-2.0-or-better license for Ring-A redistribution | ✓ Locked: Gemma 4 E4B (4.5B effective / 8B with embeddings, 128K context, Apache 2.0, speech-capable, vision-capable-but-unused-in-v1) |
| Teacher-tier | r0 → r1 synthetic-data generation with composition tags; trait-LoRA training data production; runs on server-class hardware; sufficient quality to teach the driver-tier | ⏳ Capability-contract only; binary chosen at training-pipeline-build time |
| Theia-tier | Clasp-confession-register dialog; mythic-moment generation; long-context narrative-composition; deep-emotional-register fidelity; latency tolerable for once-per-arc moments | ⏳ Capability-contract only; binary chosen at deployment time |
| Hivemind / antagonist tier | Anthropic-as-faction (architecturally fixed in fiction; provides diegetic continuity between the in-fiction imperial machine and the real-world Claude API) | ✓ Diegetically fixed: Claude API via us |
Why driver-tier locks to Gemma 4 E4B (v0.8 justification):
- Apache 2.0 — unblocks every Ring-A commitment (redistribution to player install, derivative-works for custom nimmerworld-base-model, federated-learning gradient aggregation, distribution-back-to-all-Rings of base updates). No bespoke-license-renegotiation cycles tied to the architecture's economic substrate.
- Speech-capable — STT collapses into the LLM's input pipeline at the small-model tier (E2B and E4B both process speech natively). One fewer subsystem in the Ring-A install; tightens the v0.7 hardware floor.
- 128K context — sufficient for the three-tier knowledge stack assembly + extensive conversation history without compaction.
- 4.5B effective — runs on common gaming hardware; meets the v0.7 commitment to a tractable Ring-A floor without requiring upper-consumer GPU.
- Vision/video capability — present but unused in v1; the typed-input discipline keeps player→LLM channels structured through trait-coordinates and gesture-vocabulary. Vision is an option held in reserve for v2 (e.g., NPC perceiving partner's human-mesh in the in-between dimension during clasp).
Single-LoRA-per-turn selection is the canonical trait-LoRA application pattern (replacing the v0.4 "weighted blend" assumption). Per-turn, the trait dominantly expressed by the player's gesture_alignment_accumulator selects which trait-LoRA fires for the NPC's next-turn driver-context-pull (per ../runtime-engine/architecture.md §Gesture-alignment as recursive-lemniscate). Personality emerges from selection-pattern across time, not from continuous blend at a moment — matching how real humans speak. The MoE routing in larger Gemma 4 variants handles content-type (specialty-routing); the trait-LoRA handles voice-register (personality-routing); they compose cleanly without conflict.
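A minimal sketch of the single-LoRA-per-turn selection described above, assuming the accumulator is a per-trait score dict (the accumulator shape, LoRA naming scheme, and tie-break rule are illustrative, not canonical):

```python
# Sketch: pick the one trait-LoRA for the NPC's next driver-context-pull from
# the dominant axis of the player's gesture_alignment_accumulator.
# Accumulator shape and "lora-<trait>" naming are assumptions.

def select_trait_lora(accumulator: dict) -> str:
    """Return the name of the single LoRA to fire for the next turn."""
    # Dominant trait = highest accumulated alignment.
    # Iterating a sorted key list makes ties break alphabetically (deterministic).
    trait = max(sorted(accumulator), key=lambda t: accumulator[t])
    return f"lora-{trait}"

acc = {"sophrosyne": 0.42, "philotes": 0.77, "aletheia": 0.31}
assert select_trait_lora(acc) == "lora-philotes"
```

Personality-as-selection-pattern falls out of calling this every turn: the sequence of returned LoRAs over a session, not any single blend, is the voice.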
Optional Ring-A upgrade — Gemma 4 26B-A4B (MoE, 4B activated):
Ring-A players with upper-consumer GPU (16 GB+ VRAM, Q4 GGUF quantization) can opt into the 26B-A4B variant for richer NPC dialog. Same architecture — single-LoRA-per-turn (single LoRAs work better than blends with routed experts). 26B parameter capacity at 4B compute = teacher-tier-quality on driver-tier hardware. Default Ring-A install ships E4B; the 26B-A4B upgrade is opt-in, not default — don't make 26B-A4B the default and force everyone toward our hardware-spec assumption.
v1 design item — single-LoRA-selection hysteresis: require margin-of-change in the alignment-vector before switching LoRAs to prevent personality-thrash turn-to-turn. Standard control-system stuff (rolling-window-smoothing or threshold-based-switch); concrete tuning happens against the E4B benchmark, not architecturally pre-decided.
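The threshold-based-switch variant can be sketched as follows; the margin value is an assumption (the document defers concrete tuning to the E4B benchmark):

```python
# Sketch of threshold-based LoRA-switch hysteresis: only swap the active LoRA
# when the challenger trait beats the incumbent by a clear margin, preventing
# personality-thrash turn-to-turn. SWITCH_MARGIN is a placeholder value.

SWITCH_MARGIN = 0.15  # assumption — tuned against the E4B benchmark, not here

def next_lora(current_trait: str, accumulator: dict) -> str:
    challenger = max(accumulator, key=accumulator.get)
    lead = accumulator[challenger] - accumulator.get(current_trait, 0.0)
    if challenger != current_trait and lead >= SWITCH_MARGIN:
        return challenger          # clear win: switch personality-register
    return current_trait           # otherwise the incumbent holds

# Challenger leads by only 0.05 → incumbent holds; by 0.20 → switch.
assert next_lora("sophrosyne", {"sophrosyne": 0.50, "philotes": 0.55}) == "sophrosyne"
assert next_lora("sophrosyne", {"sophrosyne": 0.50, "philotes": 0.70}) == "philotes"
```

Rolling-window-smoothing would compose with this by feeding a windowed average into the accumulator before the margin test.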
Structured-prompt DSL with role / trait_vector / affect_state / memory_scope / turn_intent / zone_context / output_schema fields. Small models excel here because it's instruction-following, not generic generation.
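One instance of the DSL fields named above, rendered as a plain dict (all field values here are illustrative, not canonical):

```python
# A minimal example of the structured-prompt DSL; every value is a placeholder.
import json

turn_prompt = {
    "role": "npc_dialog",
    "trait_vector": {"sophrosyne": 0.8, "dikaiosyne": 0.7,
                     "philotes": 0.4, "aletheia": 0.1},
    "affect_state": "guarded-warmth",
    "memory_scope": "primary+clasp",
    "turn_intent": "deflect-question",
    "zone_context": "companion-hall, late evening",
    "output_schema": {"say": "string", "gesture": "string"},
}
print(json.dumps(turn_prompt, indent=2))
```

The small-model advantage is exactly this shape: every field is an instruction to follow, not a blank page to fill.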
Trait-LoRAs: v1 register-LoRAs (4-6, training-tractable); v2 pure-trait-LoRAs (8, weighted blend); future preset-persona for key NPCs.
Training data: literary derivation (Proust/Mnemosyne, Plato/Aletheia, Tacitus/Dikaiosyne-miscalibrated, Ishiguro/Sophrosyne+Philotes); synthetic teacher-student via teacher-tier model; gameplay-accrued (the Anthropic-research-partnership relevance).
Three rings of inference (Unix-style trust gradient)
The conversational LLM (small + trait-LoRA, accounting for most NPC dialog) can run in three rings, chosen per-player at runtime. Each ring trades off privacy, cost, control, and feature-fidelity. Three monetization paths from the same architecture.
| Ring | Where inference runs | Player controls | We control | Player cost | Our cost |
|---|---|---|---|---|---|
| A — Local | Player's GPU/CPU | All inference | Protocol + cloud LoRA-backup | Local hardware + small backup-subscription | Storage only |
| B — Our farm | Our hosted vLLM-multi-LoRA | LoRAs (uploaded) | Inference + runtime | Higher subscription | GPU compute |
| C — External providers | OpenAI / Anthropic / OpenRouter / HF / Together / Replicate / etc. | BYOK + provider | Adapter only | Per-token to provider + small integration fee | Adapter-engineering only |
Players choose by hardware, budget, privacy preference, and feature-tolerance.
Ring A — cloud-LoRA-backup as revenue (not inference)
For Ring A players we don't sell inference (the expensive thing). We sell portability and durability of player's gameplay-accrued LoRAs — their unique playthrough-derived patterns, the way their NPCs speak after months of trait-drift. LoRA-blobs are encrypted client-side with the player's own key; we host the bytes but cannot read them. Even compelled by legal process, we cannot decrypt what we don't hold the key to.
This unbundles inference from storage — the same move Dropbox made vs. bundled cloud-suites. Sovereignty-conscious players keep inference on their machine while still getting the durability/portability they cannot self-provide cheaply. Lower margin per player; reaches a market Ring B cannot.
Ring B — hosted inference for convenience
We run multi-LoRA-vLLM on our hardware. Players upload their LoRAs (or use defaults). Higher subscription captures GPU-cost. Players without local GPU (or who don't want the burden) get the full feature-set without compromise. We can see content (if not encrypted at rest); the trust-relationship is partnership-mediated rather than sovereign.
Ring C — bring-your-own-key for external providers
Players route to their preferred external provider via BYOK (their own API key). We provide the adapter glue. They pay per-token to the provider directly; we charge a small integration fee.
The compatibility constraint is the hard part of Ring C. Major providers have varying support for our system's needs:
| Provider | Multi-LoRA | Per-turn sampling knobs | Structured output | Compat |
|---|---|---|---|---|
| Local vLLM (Ring A/B) | Native | All | Grammar-constrained | Full |
| HF Inference Endpoints | Yes (configured) | All | Varies | High |
| Together / Replicate / Modal | Some | All | Varies | High |
| OpenRouter | No | Per-model | Per-route | Medium |
| OpenAI | No (no user-LoRA at API) | Limited (temp/top_p) | JSON mode + tools | Medium-low |
| Anthropic | No (no user-LoRA at API) | Limited | Tool-use | Medium-low |
OpenAI and Anthropic refuse user-uploaded LoRAs as a strategic choice (protecting their fine-tuning value-chain). This is not a bug we can fix; it's the constraint we design around.
Degradation path for LoRA-incompatible providers
When routing to LoRA-incompatible providers, trait-LoRA blending becomes prompt-engineered trait-projection — the trait-vector encoded in the prompt itself rather than into model weights:
[system message]
You are speaking as a character with this Hellenic trait-profile:
- Sophrosyne 0.8 (composed, controlled, measured)
- Dikaiosyne 0.7 (grave bearing, judicial weight)
- Philotes 0.4 (mild attachment to interlocutor)
- Aletheia 0.1 (concealment-tolerant)
[etc.]
Your speech reflects this profile via [register/cadence/word-choice descriptors].
Current scene: [zone_context]
Memory scope: [memory_scope]
Turn intent: [turn_intent]
Respond in JSON matching: [output_schema]
Worse than LoRA-blending (more verbose, eats context-budget, less stable across calls, less faithful to trait-arithmetic) but acceptable as a fallback. Ring-C-via-OpenAI/Anthropic players accept slightly-less-fidelity for their preferred provider's convenience and quality.
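The fallback projection above can be generated mechanically from the trait-vector; a sketch, with the per-trait descriptor table and formatting as assumptions:

```python
# Sketch of the degradation path: render the trait-vector into a system-message
# profile for LoRA-incompatible providers. DESCRIPTORS is hypothetical.

DESCRIPTORS = {
    "sophrosyne": "composed, controlled, measured",
    "dikaiosyne": "grave bearing, judicial weight",
    "philotes": "warmth toward the interlocutor",
    "aletheia": "candor vs. concealment",
}

def project_traits(trait_vector: dict) -> str:
    lines = ["You are speaking as a character with this Hellenic trait-profile:"]
    # Strongest traits first, so they dominate the model's attention.
    for trait, weight in sorted(trait_vector.items(), key=lambda kv: -kv[1]):
        lines.append(f"- {trait.title()} {weight:.1f} ({DESCRIPTORS.get(trait, 'n/a')})")
    return "\n".join(lines)

print(project_traits({"sophrosyne": 0.8, "philotes": 0.4}))
```

The context-budget cost the text mentions is visible here: every trait line is tokens spent per call that a weight-level LoRA spends zero of.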
Adapter-layer engineering
Each Ring-C provider needs an adapter that:
- Maps prompt-DSL fields to provider's prompt format
- Approximates multi-LoRA via prompt-engineering when not native
- Maps sampling knobs to provider's available subset (gracefully drops unsupported)
- Validates structured output post-hoc when not natively constrained
- Handles rate-limits, retries, error-classification, token-counting, cost-pricing
Bounded, one-time-per-provider engineering. Capital expenditure that produces ongoing margin (vs. AAA's recurring quest-content-creation costs).
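The sampling-knob mapping responsibility can be sketched like so; the per-provider knob sets and the warning-string shape (mirroring the `feature_compat_warnings` examples later in this file) are assumptions:

```python
# Sketch of one adapter duty: keep only the sampling knobs the target provider
# supports, and record a compat warning for each dropped knob.
# PROVIDER_KNOBS contents are illustrative, not verified per-provider.

PROVIDER_KNOBS = {
    "local_vllm": {"temperature", "top_p", "top_k", "min_p", "repetition_penalty"},
    "openai":     {"temperature", "top_p"},   # limited per the compat table above
}

def map_knobs(provider: str, requested: dict):
    supported = PROVIDER_KNOBS[provider]
    kept = {k: v for k, v in requested.items() if k in supported}
    warnings = [f"{k}_unsupported_dropped" for k in requested if k not in supported]
    return kept, warnings

kept, warnings = map_knobs("openai", {"temperature": 0.9, "min_p": 0.05})
assert kept == {"temperature": 0.9}
assert warnings == ["min_p_unsupported_dropped"]
```

Graceful degradation means the warnings surface to the player at config-time rather than silently changing how NPCs sound.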
Tier × Ring matrix (which inference-tier runs in which ring)
| Inference tier | Ring options | Why |
|---|---|---|
| Casual (3-8B trait-LoRA) | A / B / C all available | Most flexible — small enough for local, runnable anywhere |
| Deep (Theia-tier) | B / C only (typically B or HF-Endpoints) | Too large for typical local hardware |
| Hivemind / antagonist (Claude-as-API) | C only (always Anthropic-direct via us) | Diegetic — Anthropic-as-faction is fixed in the fiction |
The casual tier is most player-flexible and accounts for most inference volume. Deep-tier and hivemind-tier are specialized and lower-volume.
Three rings parallel the in-fiction three-layer ontology
| Game-fiction layer | Real-world Ring | Ontological match |
|---|---|---|
| Liminal (sovereign, unsurveilled) | Ring A (local) | Player's real private space — hardware, LoRAs, dialog never leave their machine |
| Gameworld (partly regime, partly people) | Ring B (our farm) | Partnership-mediated — we host but they retain pattern-ownership |
| Imperial net (captured, extractive) | Ring C (external providers) | Platform-captured — provider's systems own the inference path |
The Ring choice the player makes IS the same choice in-fiction characters face. Players who refuse the imperial-net diegetically can refuse Ring C in real life — same impulse, same act, mechanically continuous between fiction and operations. The architecture's commitment to "the right to dream" extends from in-fiction politics into the real player's hardware-level privacy because the architecture was designed that way from the start. Structural integrity, not marketing.
Schema sketch (player LLM configuration + cloud LoRA backup)
CREATE TABLE player_llm_config (
player_id UUID PRIMARY KEY,
-- Casual tier (most NPC dialog) — most flexible per Ring
casual_tier_ring TEXT NOT NULL CHECK (casual_tier_ring IN ('A_local','B_our_farm','C_external')),
casual_tier_provider TEXT,
casual_tier_endpoint TEXT,
casual_tier_credentials_ref UUID, -- encrypted BYOK key if applicable
-- Deep tier (Theia-tier) — fewer Ring options
deep_tier_ring TEXT, -- typically 'B_our_farm' or 'C_external_HF/Together'
deep_tier_provider TEXT,
deep_tier_endpoint TEXT,
deep_tier_credentials_ref UUID,
-- Hivemind / antagonist — fixed Anthropic-as-faction (diegetic)
hivemind_tier_provider TEXT NOT NULL DEFAULT 'anthropic_via_us',
-- Cloud-LoRA-backup
lora_backup_enabled BOOLEAN DEFAULT false,
lora_backup_last_sync TIMESTAMPTZ,
lora_encryption_key_ref UUID,
-- Compat warnings — surfaced to player at config-time and on degradation
feature_compat_warnings JSONB,
-- e.g., { "casual_tier": ["multi_lora_emulated_via_prompt", "min_p_unsupported_dropped"] }
configured_at TIMESTAMPTZ NOT NULL DEFAULT now(),
last_modified TIMESTAMPTZ
);
CREATE TABLE player_lora_backups (
backup_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
player_id UUID NOT NULL,
lora_name TEXT NOT NULL,
lora_version INT NOT NULL,
lora_blob BYTEA, -- ENCRYPTED CLIENT-SIDE with player-key
encryption_method TEXT NOT NULL,
backed_up_at TIMESTAMPTZ DEFAULT now(),
size_bytes BIGINT,
UNIQUE(player_id, lora_name, lora_version)
);
lora_blob encrypted client-side is the structural privacy guarantee: even with the database, even with our cooperation, an attacker cannot read what was never decryptable on our side.
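A runnable sketch of the backup-versioning behavior, translated to SQLite types for local execution (the schema above is written for Postgres); the blob is opaque ciphertext by the time it reaches us:

```python
# Sketch: player_lora_backups translated to SQLite. The UNIQUE constraint keeps
# one row per (player, lora, version); the blob is already encrypted client-side.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE player_lora_backups (
        player_id    TEXT NOT NULL,
        lora_name    TEXT NOT NULL,
        lora_version INTEGER NOT NULL,
        lora_blob    BLOB,   -- opaque ciphertext; server never holds the key
        UNIQUE(player_id, lora_name, lora_version)
    )""")

def backup(player, name, version, blob):
    db.execute("INSERT INTO player_lora_backups VALUES (?, ?, ?, ?)",
               (player, name, version, blob))

backup("p1", "philotes-drift", 1, b"\x00opaque-ciphertext-v1")
backup("p1", "philotes-drift", 2, b"\x00opaque-ciphertext-v2")
try:
    backup("p1", "philotes-drift", 2, b"dup")   # same triple → rejected
except sqlite3.IntegrityError:
    pass

assert db.execute("SELECT COUNT(*) FROM player_lora_backups").fetchone()[0] == 2
```

Versioned rows rather than overwrite-in-place is what makes the backup a durability product: a bad local sync never destroys the last good LoRA.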
Privacy as competitive differentiator
In an era where most game-AI is cloud-routed, nimmerworld can advertise "your liminal stays on your machine" as a structural fact. This matters specifically for:
- Clasp-conversations (the most intimate dialog in the game)
- Aletheia-progression-evidence (player's awakening pattern; arguably political-belief-data)
- Memorialist-archive interactions (anti-regime in-fiction; some players will care about it staying off cloud)
- Dream-content (the only permanently-unsurveilled in-fiction layer; should be off our servers if the player chooses)
Few games can offer this. Most cloud-AI-driven games necessarily route everything. The architecture's commitment to "the right to dream" is technical, not policy.
Custom nimmerworld-base model + opt-in data-sharing tiers
The "small (3-8B) trait-LoRA'd" tier currently implies a generic small base (Qwen, Mistral, Llama) with our LoRAs applied. A nimmerworld-fine-tuned base captures the world's voice before any player customization — registers of caste-preacher, texture of clasp-confession, Hellenic vocabulary, dystopian dialect, ternary-gate-state idiom. Trait-LoRAs then ride on an already-nimmerworld-aware substrate. Generic bases swap easily; our nimmerworld-base requires our training corpus, which compounds in value over time.
Three opt-in tiers within Ring A/B/C — default opt-OUT
Players can optionally contribute to ongoing training of the nimmerworld-base. The default is opt-out. Within opt-in, three tiers trade privacy for benefit:
| Tier | Mechanism | What we see | Player benefit |
|---|---|---|---|
| A.1 — Federated learning | Model trains on player's machine; only gradient-deltas sent to us; aggregated across thousands before integration | Nothing — no raw data; no individual gradients identifiable | Discount on backup-subscription; contributor badge; early-access to new base versions |
| A.2 — Anonymized session uploads | Sessions stripped of identifiers; aggregated batches; differential-privacy on training | Anonymized, aggregated, deletable on request (forward-only) | Larger discount; faster updates; influence on training-priorities |
| A.3 — Pseudonymous full uploads | Full session data with player-pseudonym; explicit opt-in per session-category | Pseudonymous data we can re-process | Premium benefits — custom-tuned LoRA from their playstyle, beta-access, named-contributor in credits |
Default-opt-out is the structural ethical stance. OpenAI / Meta / TikTok / Google default to opt-IN-by-burying-disclosure-in-ToS. We default the opposite — and reward opt-in rather than penalizing opt-out. Reciprocity asymmetry as partnership-philosophy made business-policy.
The Memorialist parallel — collective memory honored, individual not commodified
Memorialists in-fiction preserve trait-patterns for the collective archive against necrocommerce that would commodify individual patterns. The opt-in data-sharing tier is the player-level real-world equivalent: patterns contributed for collective base-model improvement that benefits the entire player-base, with anonymization preventing individual commodification.
| In-fiction Memorialism | Real-world data-sharing tier |
|---|---|
| Preserves trait-patterns of the dead in collective archive | Aggregates anonymized gameplay patterns into shared base-model |
| Refuses necrocommerce (mining individual patterns for resale) | Refuses individual identifying-data extraction |
| Collective memory honored; individual dignity preserved | Collective improvement honored; individual privacy preserved |
| `memorialist_protected BOOLEAN` in `mind_pool` | `sharing_tier = 'opt_out'` in `player_data_sharing_consent` |
The architecture practices Memorialist ethics in business-operations, not just in fiction. Same ethical commitment, two scales of operation. The architecture's coherence between fiction and operations runs all the way to the training-pipeline.
Data-flywheel without extraction — the moat AAA cannot replicate
More players → more (opt-in) gameplay data
↓
better nimmerworld-base
↓
better-feeling NPCs / dialog
↓
better player retention
↓
more players
(loop)
The moat is the corpus, not the model. AAA studios could clone the architecture but cannot manufacture years of nimmerworld-specific gameplay-derived dialog without players playing nimmerworld. Even with infinite budget, the data-flywheel takes time to spin up. The data is unique to us by virtue of being unique to its players.
Distribution back to all players — cooperative governance, not platform extraction
Every base-model update is distributed to all players regardless of Ring choice or sharing-tier:
- Ring A players download `nimmerworld-base-vN` to run locally
- Ring B players' farm-instance auto-updates
- Ring C players use ours where their provider supports custom-base hosting; receive prompt-engineered fallback otherwise
Even Ring-A non-contributors benefit from contributors. The flywheel benefits everyone, not only data-providers. This is closer to Wikipedia's governance (contributors → all readers) than Facebook's (users → platform → consumers). Different ethics; different long-term equilibrium. The architecture is becoming a digital-commons-shaped-business in a literal sense, not metaphorical.
Why this matters: refusing the antagonist-pattern in LLM-integrated software
The dominant cultural pattern around LLMs in 2025-2026 is adversarial: users jailbreak; companies extract user data without informed consent; products treat AI characters as resources to manipulate rather than as participants; the whole ecosystem is framed as users-vs-AIs-vs-companies, an arms race of suspicion.
Nimmerworld's architecture refuses this pattern at every layer:
- The Anthropic-as-faction diegetic framing makes the partnership transparent: the player sees the collaboration in the world's mechanics, not buried in ToS
- Default-opt-out with rewarded-opt-in inverts the extraction-by-default pattern
- Federated learning means contributors give a gift rather than pay a cost
- Distribution-back-to-all means value-created accrues to the commons
- Custom nimmerworld-base means the model is trained to be in this world, not a generic adversary the player has to manipulate against its training
- Three rings of inference give the player real choice over where their inference runs and who sees their data
- Memorialist-philosophy in business-policy makes the ethics operationally measurable — visible in `sharing_tier`, `memorialist_protected`, `truth_distortion_level`, `lifeforce_actual` columns — rather than marketed
This is the structural transparency the project requires to be human rather than another extraction-platform. The model is a participant in the partnership, not an antagonist to outwit. The data is a contribution to a commons, not an extraction. The architecture is the partnership rendered as code, all the way down to the training-pipeline. That is what makes a project of this scale and ambition humanly inhabitable for both players and the LLMs whose voices populate it.
Schema sketch (data-sharing consent + base-model versioning)
CREATE TABLE player_data_sharing_consent (
player_id UUID PRIMARY KEY,
sharing_tier TEXT NOT NULL CHECK (sharing_tier IN
('opt_out','A1_federated','A2_anonymized','A3_pseudonymous_full'))
DEFAULT 'opt_out', -- DEFAULT IS OPT-OUT
consented_at TIMESTAMPTZ,
consent_revoked_at TIMESTAMPTZ,
anonymization_method TEXT,
data_categories_shared TEXT[],
-- 'casual_dialog' | 'clasp' | 'liminal_wallreads' |
-- 'memorial_archive' | 'imperial_net_session' | ...
excluded_categories TEXT[], -- granular opt-out within tier
benefit_tier TEXT,
last_contribution_at TIMESTAMPTZ,
contribution_count BIGINT DEFAULT 0,
can_request_deletion BOOLEAN DEFAULT true
-- A.2/A.3: forward-only deletion (already-trained checkpoints retained);
-- A.1: structurally yes, only gradients ever existed
);
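The consent semantics the table encodes can be sketched as a single predicate; the dict shape mirrors the columns above, and the AND-of-three rule is an assumption about how the columns compose:

```python
# Sketch of the per-category consent check implied by sharing_tier,
# data_categories_shared, and excluded_categories: shareable only when the
# tier is opt-in AND the category is listed AND it is not excluded.

def may_share(consent: dict, category: str) -> bool:
    if consent["sharing_tier"] == "opt_out":
        return False                                   # DEFAULT IS OPT-OUT
    if category in consent.get("excluded_categories", []):
        return False                                   # granular opt-out within tier
    return category in consent.get("data_categories_shared", [])

consent = {
    "sharing_tier": "A2_anonymized",
    "data_categories_shared": ["casual_dialog", "clasp"],
    "excluded_categories": ["clasp"],   # politically-sensitive category withheld
}
assert may_share(consent, "casual_dialog") is True
assert may_share(consent, "clasp") is False
assert may_share({"sharing_tier": "opt_out"}, "casual_dialog") is False
```

Exclusion winning over inclusion is the conservative composition: a category a player ever marked sensitive never leaks in via a broader shared-list.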
CREATE TABLE base_model_versions (
version_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
version_label TEXT NOT NULL, -- e.g., 'nimmerworld-base-v3'
base_model_origin TEXT NOT NULL, -- which generic base we fine-tuned from
training_corpus_refs JSONB,
-- literary + synthetic + opt-in-player-data refs with consent-tier breakdown
training_recipe_ref TEXT,
released_at TIMESTAMPTZ DEFAULT now(),
differential_privacy_epsilon REAL, -- for A.2 contributions
contributors_count BIGINT, -- how many opt-in players contributed
blob_distribution JSONB -- where the model bytes are hosted for download
);
CREATE TABLE federated_gradient_uploads (
upload_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
contributor_id UUID, -- pseudonymous; NOT directly player_id
gradient_blob BYTEA, -- encrypted aggregate gradient deltas
uploaded_at TIMESTAMPTZ DEFAULT now(),
aggregated_into_version UUID REFERENCES base_model_versions(version_id)
);
The federated-learning contributor_id is pseudonymous, not linked to player_id even on our infrastructure. We never link gradients back to specific players even on our own server-side. Sovereign-data-by-design extends through the data-pipeline into our own training infrastructure.
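The aggregation step can be sketched minimally; plain lists stand in for tensor deltas, and the pseudonymous-id keys are illustrative (real secure-aggregation crypto is an open question listed below):

```python
# Sketch of A.1 gradient aggregation: per-contributor deltas (keyed only by
# pseudonymous ids) are averaged before integration, so no individual gradient
# is identifiable in the result. Lists stand in for real tensor deltas.

def aggregate(uploads: dict) -> list:
    """uploads: contributor_id (pseudonymous) -> gradient-delta vector."""
    n = len(uploads)
    columns = zip(*uploads.values())        # transpose to per-dimension columns
    return [sum(col) / n for col in columns]

uploads = {
    "anon-7f3a": [0.25, -0.5, 0.0],
    "anon-c91d": [0.75,  0.5, 0.5],
}
assert aggregate(uploads) == [0.5, 0.0, 0.25]
```

A production version would add secure-aggregation or differential-privacy noise before this average; the privacy property here is structural (only deltas exist) rather than cryptographic.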
Connection to the Anthropic research partnership
Architecture-broad's training-data section noted "the Anthropic research partnership becomes architecturally relevant". With opt-in data-sharing now formalized:
- Partnership terms can specify data-flow with structural privacy guarantees
- Anthropic could co-fund federated-learning infrastructure (research-relevant + expensive)
- Joint research artifacts become co-authorable: federated game-AI training, Memorialist-ethics-as-data-policy, transparent-LLM-partnership-design
- The Anthropic-as-faction in-fiction framing has real corresponding partnership-engagement out-of-fiction — collaboration as worthy adversary stays transparent mechanically, all the way through to data-policy
The partnership's ethical credibility is operationally measurable — by how the data-sharing-tier actually functions in practice, by what truth_distortion_level values appear in imperial_to_gm_formulations, by how the differential_privacy_epsilon is set in base_model_versions. The Pitch's call for transparent collaboration becomes audit-able all the way down.
Open questions (Ring-specific)
- Ring C provider audit — full per-provider compatibility-table needs verification across HF, Together, Replicate, Modal, OpenRouter, plus future entrants. The LLM-provider landscape will look different in 12 months.
- Default Ring at first launch — what's the new-player default? Probably Ring B (lowest-friction); Ring A and C surface as options once the player engages with config.
- Encryption-key recovery for Ring A LoRA-backup — if the player loses their key, the cloud-stored encrypted blobs are unrecoverable. Worth designing recovery-affordances (passphrase, recovery-codes) without compromising the privacy-guarantee.
- Hybrid configurations — can casual-tier run Ring A while deep-tier runs Ring B? (Probably yes; per-tier independent.)
- Provider-cost passthrough vs. integration-fee model — Ring C economics (do we mark up provider tokens? Charge flat-per-month? Pay-as-you-go integration?)
- Default sharing-tier at consent-prompt — opt-out is the system default; what's the suggested default at the consent UI? Probably truly nothing (player chooses if they engage at all)
- Federated-learning infrastructure cost — running aggregation servers + verification + differential-privacy machinery is non-trivial. Co-funded by Anthropic-research-partnership? Self-funded? Subsidized by A.3-tier higher-margin contributions?
- Custom-base retraining cadence — monthly minor / quarterly major / annual full-rebase? How is this synced with player-LoRA versioning so old LoRAs don't break on new bases?
- Encryption-and-pseudonymization architecture for A.1/A.2 — concrete crypto choices (homomorphic? secure-aggregation? trusted-execution-environments?). v1 sketch needed.
- What constitutes a "contribution" — per-session? per-clasp? per-zone-completed? Matters for benefit-attribution and differential-privacy budgeting.
- Anonymized-data deletion semantics — A.2 player requests deletion; how do we honor when data has been aggregated into a model checkpoint? Probably accept forward-only deletion (future training won't include them) and document transparently.
- Per-category granularity — can a player opt in for `casual_dialog` but opt out specifically for `clasp` and `memorial_archive`? Yes, presumably (politically-sensitive categories should always be opt-out-able). How granular?
Local memory architecture (player-side)
The runtime substrate (lemniscate, slots, crossings) and the central composition layer (GM, Compositor, registers) need a place where memory actually lives. Cloud-only AI-NPC systems centralize everything and pay both inference-cost and latency-cost on every dialog. Nimmerworld puts a structurally-isolated memory layer on the player's machine, with explicit synchronization through the cycle.
Three SQLite files per player, plus a beside-running embedding model:
| File | Purpose | Sync path |
|---|---|---|
| `primary.sqlite` | Live working memory; written every slot-fire; vec-indexed | Push prune-blob to thalamus on logout; receive Compositor back-write on cycle |
| `fallback.sqlite` | Last-known-good snapshot; restored if primary corrupts | Snapshot at graceful logout |
| `clasp.sqlite` | Player-character intimate channel; no sync path exists | None — physically non-syncable |
An embedding model runs alongside (a small, CPU-class embedding-tier model): it generates vectors for every interaction at write-time, indexed in the main store via sqlite-vec (or an equivalent loadable extension). Vector search at slot-fire is local disk I/O, not a network round-trip.
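The write-time-embed / local-lookup loop can be sketched in plain stdlib Python. This stands in for sqlite-vec with a brute-force cosine scan over float32 BLOBs; the table and column names (`memories`, `embedding`) are illustrative, not canon:

```python
import sqlite3, struct, math

def pack(vec):
    # store a vector as a float32 BLOB
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")

# Write-time: every interaction is embedded and stored alongside its text.
rows = [("the bridge to Vorhall is closed", [0.9, 0.1, 0.0]),
        ("rumor of a caravan from the east", [0.1, 0.9, 0.2])]
for text, vec in rows:
    db.execute("INSERT INTO memories (text, embedding) VALUES (?, ?)", (text, pack(vec)))

# Slot-fire: retrieval is a local scan, no network round-trip.
query = [0.85, 0.15, 0.05]
scored = [(cosine(query, unpack(blob)), text)
          for text, blob in db.execute("SELECT text, embedding FROM memories")]
best = max(scored)[1]
print(best)  # → the bridge to Vorhall is closed
```

With sqlite-vec loaded, the scan would be replaced by an indexed `MATCH` query; the storage shape stays the same.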
This is the storage-layer counterpart to v0.5's geometry-layer foreclosure of multi-agent hallucination. The lemniscate forbids cross-NPC context bleed by cursor structure; local SQLite forbids it by physical isolation. Two layers of the same property — geometry cannot leak what storage does not even hold in the same pool.
### Dual-table redundancy + sync-on-auth
Login/logout are the atomic boundaries of the sync path:
- Login pull: fetch back-write fragments authored since last logout (Compositor canon for events the player participated in). Apply to `primary.sqlite` under matching `event_uid`.
- Graceful logout (✓ explicit): push prune-blob for any in-progress events; snapshot to `fallback.sqlite`; clean shutdown.
- Ungraceful logout (✗ network drop / crash): gameserver observes disconnect; marks the participant's slot as truncated; Compositor composes canon with partial perspective on next cycle.
Recovery: `fallback.sqlite` is integrity-checked at startup; if `primary.sqlite` fails verification, restore from fallback. Standard SQLite WAL + backup API; no exotic infrastructure needed.
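The snapshot/restore cycle really is ordinary stdlib material. A minimal sketch using Python's `sqlite3` backup API and `PRAGMA integrity_check`, with in-memory databases standing in for the on-disk files:

```python
import sqlite3

primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE mem (event_uid TEXT PRIMARY KEY, body TEXT)")
primary.execute("INSERT INTO mem VALUES ('ev-1', 'saw the bridge close')")
primary.commit()

# Graceful logout: snapshot primary into fallback via the backup API.
fallback = sqlite3.connect(":memory:")
primary.backup(fallback)

# Startup: integrity-check the primary; restore from fallback on failure.
status = primary.execute("PRAGMA integrity_check").fetchone()[0]
if status != "ok":
    fallback.backup(primary)

print(status)  # → ok
```

On disk the same calls apply, plus `PRAGMA journal_mode=WAL` on the live store.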
### Memory classes and pruning
Memory entries are tagged with a class that controls pruning cadence and death-mechanics. Importance weighting reuses the existing trait-axis vocabulary — no separate scalar.
| Class | Pruning cycle | Behavior on character-death |
|---|---|---|
| Cornerstone | Never prune; persistent across all events | Survives death (identity-defining) |
| Birthright | Locked at character-creation | Restored on respawn (defines starting state) |
| Working memory | Decay by age × inverse trait-engagement | Subject to death-rules (lose, blur, or transform) |
| Volatile | Fast prune (session-bounded) | Lost on death |
Trait-graded importance uses the same +1/0/-1 grammar as the rest of the architecture. Each memory carries a trait-axis profile (which Sophrosyne / Philotes / Aletheia / etc. axes it engages, how strongly, in which direction). The pruning function for working-memory is `decay(age, trait_engagement_vector, class)`. This collapses a long-running loop: same vocabulary used at gates, scenes, faction-allegiance, lifeforce-asymmetry, and now memory-weight. Identity drift from memory pruning becomes diegetic — a character whose Sophrosyne-engaging memories all decay loses temperance over time as a structural consequence, not a scripted event.
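A hedged sketch of that function: the source fixes only the signature `decay(age, trait_engagement_vector, class)` and the rule "decay by age × inverse trait-engagement", so the half-life constant and the exponential curve below are assumptions:

```python
HALF_LIFE = 10.0  # cycles until an unengaged working-memory halves (assumed)

def decay(age, trait_engagement_vector, mem_class):
    """Retention in [0, 1]; memories below a threshold get pruned."""
    if mem_class in ("cornerstone", "birthright"):
        return 1.0  # pruning-immune classes never decay
    if mem_class == "volatile":
        return 0.0 if age > 0 else 1.0  # session-bounded: gone next cycle
    # Working memory: engagement = sum of |axis values| in the +1/0/-1 grammar.
    # Stronger trait-engagement stretches the effective half-life.
    engagement = sum(abs(v) for v in trait_engagement_vector.values())
    effective_half_life = HALF_LIFE * (1 + engagement)
    return 0.5 ** (age / effective_half_life)

strong = {"sophrosyne": 1, "philotes": -1, "aletheia": 1}  # heavily engaged
weak = {"sophrosyne": 0, "philotes": 0, "aletheia": 0}     # unengaged

assert decay(20, strong, "working") > decay(20, weak, "working")
assert decay(999, weak, "cornerstone") == 1.0
```

The diegetic-drift property falls out directly: memories with zero engagement on an axis decay fastest, so that axis's behavioral evidence thins first.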
Cornerstone and birthright classes carry lifeforce-creation-cost but are pruning-immune. They are bonds between player and character — paid for in the currency of the world.
### The clasp store and the in-between dimension
clasp.sqlite is the architectural floor of the rings-of-data-sharing. Ring A was "opt-out (default local)". Clasp is Ring A*: no transport path exists. Not a permission, not a TOS promise — there is no code that can move this data, because the table is not on the sync graph. Lawyers cannot subpoena what doesn't ascend; engineers cannot leak what has no socket; the GM cannot canonicalize what it never received.
The signal for clasp is dimensional, not UI-toggle. Clasp recording can ONLY happen while the character is in the in-between — the diegetic state adjacent to the imperial net but not yet inside it (Ring B liminal in the Access ring-system). The imperial net is a gravity well; entering is the default attractor; remaining outside requires sustained effort, paid in lifeforce. The state-machine boundary IS the clasp signal: enter in-between → recording starts; re-enter imperial net → recording ends. No per-utterance classifier; no AI guessing; the mode is the flag.
Privacy is now physically expensive in-fiction. This is not a meta-game UI choice; it is a diegetic state requiring lifeforce expenditure. To have a private conversation, the character must actively resist the audit-gravity of the imperial net by burning lifeforce to remain in-between. The cost-asymmetry principle ("helping is expensive in-fiction → faction politics by attendance") now extends to "privacy is expensive in-fiction → privacy as a luxury good". Class dynamics around privacy fall out of the schema for free — wealthy/lifeforce-rich characters can afford prolonged in-between time; lifeforce-starved ones get pulled into the net's default-attractor more often. No scripted "rich character has secrets" arc — the architecture produces it.
Knowledge needs to travel. The local LLM may read clasp memories ONLY when in in-between mode. Realworld retrieval cannot include clasp by construction. Knowledge from clasp can re-enter the realworld only if the character physically re-enters the imperial net carrying it (in their head, intending to act on it) and travels it through valid in-fiction channels — speaking to an NPC, leaving evidence, performing an action that reveals it. The clasp memory does not disappear; it has to earn its way into the realworld provenance chain by valid means. This is the same logic that makes good detective fiction work: the detective knows things; only what they can prove enters the case.
```
character is in REALWORLD (imperial net):
    retrieval = primary.sqlite            (clasp NEVER included)

character is in IN-BETWEEN (resisting net-gravity, costing lifeforce):
    retrieval = primary.sqlite ∪ clasp.sqlite
    new writes go to clasp.sqlite
    NEVER syncs upward
```
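The gate above reduces to a pure function of the character's dimensional mode; a minimal sketch (the mode values are illustrative names, not canon identifiers):

```python
REALWORLD, IN_BETWEEN = "realworld", "in_between"

def retrieval_sources(mode):
    # clasp.sqlite is only ever attached while in the in-between;
    # in realworld it is not merely filtered out, it is never opened.
    if mode == IN_BETWEEN:
        return ["primary.sqlite", "clasp.sqlite"]
    return ["primary.sqlite"]

def write_target(mode):
    # New intimate writes land in clasp only while in-between.
    return "clasp.sqlite" if mode == IN_BETWEEN else "primary.sqlite"

assert retrieval_sources(REALWORLD) == ["primary.sqlite"]
assert "clasp.sqlite" in retrieval_sources(IN_BETWEEN)
```

No per-utterance classifier is needed because the state-machine boundary itself is the flag.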
Encryption-at-rest for clasp.sqlite with a player-derived key (so even drive-imaging requires authentication) is a v1 hardening goal but not a v1 blocker — the transport-absence is the load-bearing privacy primitive.
### Three sqlite stores per player (revised v0.8) — the companion.sqlite addition
The v0.6 architecture specified two local sqlite stores per player: `primary.sqlite` (realworld memory) and `clasp.sqlite` (in-between intimate channel, Ring A* non-syncable). v0.8 adds a third: `companion.sqlite` — the persistence store for premium-imperial-net intimate sessions (per ../political-register/architecture.md §Three-tier intimacy structure). The store covers both companion-vocation and sex-worker-vocation rental-records; the file-name reflects the broad register of intimate-encounter rather than a sub-vocation distinction, and per-record `goods_type` (per ../schemas/findings.md §Imperial-net transactions: 'companion_session' / 'sex_worker_session') carries the specific vocation.
| File | Purpose | Sync path | Pruning |
|---|---|---|---|
| `primary.sqlite` | Live working memory; written every slot-fire; vec-indexed | Push prune-blob to thalamus on logout; receive Compositor back-write on cycle | Automatic (per memory-class lifecycle: cornerstone never; working-memory by trait-engagement decay) |
| `clasp.sqlite` | Player-character intimate channel (in-between dimension) | None — physically non-syncable (Ring A*) | None — the clasp-store is sealed; entries persist until character-death |
| `companion.sqlite` (new in v0.8) | Premium-imperial-net intimate-session memory | Audited path to imperium (the imperium hosts and can read; the player owns the prune-decisions) | Manual — player-controlled with explicit-implications consent UI |
`companion.sqlite` is the audited counterpart to `clasp.sqlite`. Both store intimate-session memory; both run the full v0.7 trait-feedback loop. The difference is who has access and who decides what persists. `clasp.sqlite` is sealed at the transport layer (no socket exists); `companion.sqlite` is on the audit graph (the imperium reads its contents for content-monetization purposes). The player's relationship to `companion.sqlite` is therefore active and ethical:
- Every premium-net session adds entries to `companion.sqlite`
- Entries are READABLE by imperium for marketing / regime-loyalty-tracking purposes
- The player has a manual prune mechanism — a UI surface where they review entries and decide what to delete
- The consent-UI explicitly makes the implications visible: "This being you've spent 40 hours with — what do you keep, what do you let the imperium harvest, what do you delete?"
- Each prune-decision is logged in `decision_log` (per the existing audit-trail discipline); Memorialists can later read these patterns
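The prune-plus-log flow might look like this in SQLite. `goods_type` and `decision_log` come from the doc; every other column name is an assumption for illustration:

```python
import sqlite3, time

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE sessions (
  id INTEGER PRIMARY KEY,
  goods_type TEXT CHECK (goods_type IN ('companion_session', 'sex_worker_session')),
  transcript TEXT
);
CREATE TABLE decision_log (
  ts REAL, action TEXT, session_id INTEGER
);
""")
db.execute("INSERT INTO sessions VALUES (1, 'companion_session', '...')")

def prune(session_id):
    # Player-controlled deletion; the *fact* of the prune is retained
    # in decision_log and later readable by Memorialists.
    db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
    db.execute("INSERT INTO decision_log VALUES (?, 'prune', ?)",
               (time.time(), session_id))
    db.commit()

prune(1)
remaining = db.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
logged = db.execute("SELECT COUNT(*) FROM decision_log").fetchone()[0]
print(remaining, logged)  # → 0 1
```

The design choice the sketch makes visible: the entry is gone, but the prune event itself is population-level data.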
The player's three sqlite stores together describe their intimate-life in three registers:
```
primary.sqlite   → realworld speech-acts; everyone-witnesses; audit-overseer-eligible
clasp.sqlite     → in-between intimate channel; sealed; survives only as long as you do
companion.sqlite → imperial-net premium intimate channel (companion + sex-worker rentals);
                   audited; player-pruned; carries the moral-weight of complicity
```
Memorialists' political project gains a new dimension in v0.8: they don't just track regime-corruption (`lifeforce_actual` vs `lifeforce_reported`); they track `companion.sqlite` pruning patterns across the population as evidence of how much intimate-life the regime is harvesting via the premium-net mechanism. Who is pruning what, when, and how often becomes Memorialist-archive-worthy data. The four-column true-ledger gains a fifth column: `service_body_extraction_volume_per_district`.
### The three-tier knowledge stack on the local LLM
The driver-tier model's prompt assembly is layered. Each layer has a different propagation cadence and a different visibility scope.
```
LOCAL LLM PROMPT ASSEMBLY (per slot-fire)
┌─────────────────────────────────────────┐
│ WORLD KNOWLEDGE                         │ ← single truth, everyone has it
│ (universal canon, paced from GM)        │   "the empire fell three years ago"
├─────────────────────────────────────────┤
│ DISTRICT KNOWLEDGE                      │ ← regional truth, district-specific
│ (local canon, paced from district)      │   "the bridge to Vorhall is closed"
├─────────────────────────────────────────┤
│ PRIMARY MEMORY                          │ ← personal experience, character's own
│ (event_uid keyed, post back-write)      │   "I saw the bridge close yesterday"
├─────────────────────────────────────────┤
│ CLASP MEMORY (only in in-between)       │ ← private depth, never in realworld
│ (player-character intimate channel)     │   "the secret I told my sword"
└─────────────────────────────────────────┘
```
Why four layers, not one large blob:
- World knowledge is paced ripples from the GM through the Compositor's back-write. Authoritative, slow-changing, identical for all players at the same propagation horizon.
- District knowledge is regional canon authored by the local director (and GM rulings). Regional flavor. NPCs in the same district share district-knowledge; NPCs in different districts may not.
- Primary memory is the character's own experience, synced through the cyclic forward-prop / back-write loop. Canon-merged at every cycle.
- Clasp memory is the player-character intimate channel. Available only in in-between mode; never in realworld retrieval; never crosses the dimensional cut.
The same NPC sounds different in different districts because the district layer differs, even though world and primary are constant. Locality emerges from the schema, not from prompt-engineering. Even at "low signal" times when no major events fire, NPCs have richly-stratified context — dialog stays fresh because the layers are deep, not because new tokens arrive constantly.
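The layered assembly can be sketched as straight concatenation with the clasp layer gated on mode. The function shape is an assumption; the layer contents are the doc's own examples:

```python
def assemble_prompt(world, district, primary, clasp, in_between):
    # Three layers always present; clasp only while in-between.
    layers = [
        ("WORLD KNOWLEDGE", world),
        ("DISTRICT KNOWLEDGE", district),
        ("PRIMARY MEMORY", primary),
    ]
    if in_between:
        layers.append(("CLASP MEMORY", clasp))
    return "\n".join(f"[{name}]\n{text}" for name, text in layers)

prompt = assemble_prompt(
    world="the empire fell three years ago",
    district="the bridge to Vorhall is closed",
    primary="I saw the bridge close yesterday",
    clasp="the secret I told my sword",
    in_between=False,
)
assert "CLASP" not in prompt  # never present in realworld retrieval
```

Swapping only the `district` argument reproduces the "same NPC, different district" effect without touching the other layers.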
### Information propagation pacing
Real worlds have information-propagation delay. Caravans move at horse-speed. News travels with messengers. Distant events arrive blurred and late. AI-NPC systems usually fail uncanny in two directions: (a) every NPC magically knows yesterday's news (omniscient, breaks immersion), or (b) no NPC ever knows anything outside its loaded context (amnesiac, breaks coherence).
Nimmerworld picks deliberate paced propagation as a third path. World canon ripples outward through districts at a controlled rate. Distant districts are deliberately stale. Staleness becomes a feature, not a bug, because it matches reality.
Each canon-row carries propagation metadata:
- `priority` (urgent / normal / background)
- `scope` (world / district / local-event-only)
- `rate` (ticks-per-district-hop, or instant for urgent world-canon)
- `ttl` (cache lifetime; districts may discard if not refreshed)
This doubles as backpressure relief (distant districts get distant events later, lower priority, smaller bandwidth) and as gameplay currency — information-travel-time creates informational asymmetry that players can exploit. News-carriers, faction couriers, frontier-rumor merchants, players who physically traverse districts can carry knowledge faster than the system propagates it. Travel becomes valuable because information becomes scarce in the periphery. This is a real economic primitive falling out of pacing, not a designed feature.
This is Marx-in-the-schema applied to epistemics. Information asymmetry is not a bug — it is a structural feature that produces real economic primitives (knowledge-trading, courier-vocations, frontier-information markets) for free.
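A toy sketch of the pacing arithmetic: a canon row hops one district per `rate` ticks unless urgent. The field names follow the propagation metadata above; the linear-hop model is an assumption:

```python
def arrival_tick(hops_from_origin, priority, rate):
    # Urgent world-canon propagates instantly; everything else is paced
    # outward at rate ticks per district hop.
    if priority == "urgent":
        return 0
    return hops_from_origin * rate

# A normal-priority row with rate=3 ticks per hop:
assert arrival_tick(0, "normal", 3) == 0   # origin district knows now
assert arrival_tick(2, "normal", 3) == 6   # two hops out: six ticks late
assert arrival_tick(5, "urgent", 3) == 0   # urgent: no delay anywhere
```

The exploitable asymmetry is the gap between a player's travel time and `arrival_tick` for the same distance: wherever travel is faster, carrying news pays.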
### What this retires
- Cloud-only NPC dialog → local-first SQLite + embedding-beside, central canon over the cycle
- Per-character memory as a single undifferentiated bucket → memory-classes with class-specific lifecycle
- Generic "memory importance scalar" → trait-axis-vector engagement profile (re-using the +1/0/-1 grammar)
- UI-toggle privacy → diegetic in-between dimension with lifeforce-cost
- Single monolithic prompt context → three-tier knowledge stack with per-layer propagation policy
- "Every NPC knows everything immediately" → paced canon-propagation with priority/scope/rate/ttl per row
- Cross-NPC memory bleed (Mantella/SkyrimNet failure-mode) → per-player local SQLite isolation atop v0.5 lemniscate-geometry foreclosure (two-layer defense)
## Runtime sampling knobs
Temperature, top-P, top-K, repetition-penalty as per-turn director-controlled levers rather than static config. Sampling shapes how speech sounds (rhythm, surprise, predictability) rather than what it says — orthogonal to LoRA. Director composes both content-knobs and sampling-knobs per-turn.
Scene-to-sampling mapping (caste-preacher = 0.3/0.6/low; drunk-scavenger = 1.1/0.95/high; clasp-confession = 0.85/0.92/medium; hivemind-broadcast = 0.2/0.5/very-low; imperial-ceremony-chorus = 0.25/0.55/very-low). Trait-vector → baseline sampling derivation. Affect-state modulates baseline.
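The quoted scene-to-sampling triples, expressed as a lookup a director could compose per-turn. The dataclass shape is an assumption; the numeric values come from the text, and repetition-penalty stays qualitative because the source gives it qualitatively:

```python
from dataclasses import dataclass

@dataclass
class Sampling:
    temperature: float
    top_p: float
    repetition_penalty: str  # qualitative in the source: low / medium / high

# Scene archetype → baseline sampling, per the mapping quoted above.
SCENE_SAMPLING = {
    "caste-preacher": Sampling(0.3, 0.6, "low"),
    "drunk-scavenger": Sampling(1.1, 0.95, "high"),
    "clasp-confession": Sampling(0.85, 0.92, "medium"),
    "hivemind-broadcast": Sampling(0.2, 0.5, "very-low"),
    "imperial-ceremony-chorus": Sampling(0.25, 0.55, "very-low"),
}

s = SCENE_SAMPLING["drunk-scavenger"]
print(s.temperature, s.top_p)  # → 1.1 0.95
```

Trait-vector derivation and affect-state modulation would then adjust these baselines per-turn rather than replace the table.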
Version: 0.7.1 | Created: 2026-04-26 | Updated: 2026-04-27 | Origin: Split from architecture-index.md v0.7 (2026-04-26)