Nimmerworld — Broad Architecture
Ground-up zone-based event architecture. Minds at the center, world as co-remembering substrate. v0.1 initial draft 2026-04-24 morning; v0.2 expanded 2026-04-24 afternoon. dafit + chrysalis.
Thesis
Every game world lacks minds. Nimmerworld puts minds at the center — NPCs with trait-filtered interior life, cells as small data-lives that co-remember with them. The architecture is built ground-up for that commitment.
Core inversion — zones replace bubbles
Legacy engine perception (Unreal AIPerception, WC3 sight-radius, trigger volumes) is spatial, binary, separate from cognition, and assumes a passive world. Nimmerworld inverts every axis:
- perception is trait-filtered, not only spatial
- perception is graded consolidation, not binary detection
- perception IS memory (one subsystem)
- the world emits events; agents subscribe
The replacement primitive is the zone — a bounded, named, slot-indexed, director-managed event-instance. Arthurian round table as mental model: a bounded place of structured speaking-and-witnessing, with named roles and shared what-was-said-by-whom.
Zone anatomy
- Boundary — cell-envelope
- N slots — named positions (seats, fighter-positions, workbench-stations, spectator-ring)
- Director or overseer — manages turn order, memory-pulls, prompt construction, voice selection, event emission
- NATS topic + subscriber list — slot-occupancy drives subscription
- Mixed-fidelity voices — 2–3 LLM slots + scripted/generic for the rest; director decides
- Trigger — gamemaster-spawned or emergent from task-execution
- Lifecycle — duration, dissolution conditions, memory-write on close
- Persistence flag: `ephemeral=true` (dreamworld) or `persistent=true` (gameworld)
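The anatomy above can be read as a small record type. The following is a minimal sketch, not the project's schema: the class name `Zone`, its field names, and the `zone.<type>.<id>` topic naming are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zone:
    """Hypothetical zone record; names are illustrative only."""
    zone_id: str
    zone_type: str                    # e.g. "conversation", "patrol"
    cells: frozenset[str]             # boundary: the cell-envelope
    slots: dict[str, Optional[str]]   # slot name -> occupant id (None = empty)
    executor: str = "director"        # "director" or "overseer"
    persistent: bool = True           # False = ephemeral (dreamworld)

    @property
    def topic(self) -> str:
        # Assumed subject naming; the real NATS namespace registry may differ.
        return f"zone.{self.zone_type}.{self.zone_id}"

    def occupy(self, slot: str, occupant: str) -> bool:
        """Seat an occupant if the named slot exists and is free."""
        if slot not in self.slots or self.slots[slot] is not None:
            return False
        self.slots[slot] = occupant
        return True

    def vacate(self, slot: str) -> None:
        """Empty a slot; unknown slot names are ignored."""
        if slot in self.slots:
            self.slots[slot] = None

    def subscribers(self) -> list[str]:
        """Slot-occupancy drives subscription: only seated occupants listen."""
        return sorted(o for o in self.slots.values() if o is not None)
```

Deriving the subscriber list from the slots, rather than storing it separately, keeps occupancy and subscription from drifting apart.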
Zone taxonomy (v1 starter set)
| Zone type | Slots | Executor | Persistence |
|---|---|---|---|
| Conversation | 2–4 dialog | director | persistent |
| Street brawl | fighter + spectator | director | persistent |
| Ritual | fixed ceremonial | director | persistent |
| Maintenance | 1–2 workbench | director | persistent |
| Wall-writing | 1 author + witnesses | director | persistent |
| Market exchange | 2–3 + ambient | director | persistent |
| Memorial gathering | 1 mourner + N witnesses | director | persistent |
| Clasp (dreamworld) | 2 | director | ephemeral |
| Patrol / sweep | mobile, N enforcers | overseer | persistent |
| Interrogation | 1 subject + N enforcers | overseer | persistent |
| Raid | district-scope + N enforcers | overseer | persistent |
Factions as universal demand source
A faction is not only "a group of NPCs with shared ideology." It is any source of bounded demand on the system.
| Category | Examples |
|---|---|
| Human factions | hivemind, scavenger guilds, memorialists, aletheia-wakers, clasp-underground, caste preachers, flesh-keepers |
| Natural forces | weather-faction, season-faction, solar-storm-faction, geology-faction |
| Infrastructural conditions | scarcity-faction, decay-faction, fire-faction, supply-chain-faction |
| External agents | anthropic-faction, future research-partner factions |
| Emergent events | player-disturbance-faction (player action without existing template) |
All broadcast demands. All propagate through the gamemaster's arbitration. All get distributed down the cascade. All produce observable effects as zones and NPC-actions. One primitive; no special-case code for weather vs. hivemind vs. Anthropic.
Randomness enters at the faction layer. The designer tunes broadcast probability, intensity, and duration per faction-type. Weather-factions broadcast continuously at low intensity; storm-factions rarely and urgently; solar-storms cosmically. No separate randomness subsystem.
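A minimal sketch of randomness living in the faction layer, assuming each faction carries the three designer-tuned values named above. `Faction`, `Demand`, and `broadcast` are hypothetical names, and the per-tick Bernoulli draw is one possible reading of "broadcast probability".

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Faction:
    name: str
    broadcast_probability: float  # chance of a broadcast on any given tick
    intensity: float              # assumed 0..1 scale
    duration_ticks: int           # how long a broadcast demand stays live


@dataclass(frozen=True)
class Demand:
    faction: str
    intensity: float
    expires_at_tick: int


def broadcast(factions, tick, rng):
    """Return the demands emitted this tick; one random draw per faction."""
    demands = []
    for f in factions:
        if rng.random() < f.broadcast_probability:
            demands.append(Demand(f.name, f.intensity, tick + f.duration_ticks))
    return demands


# Weather broadcasts often and softly; storms rarely and urgently.
factions = [
    Faction("weather", broadcast_probability=0.9, intensity=0.1, duration_ticks=5),
    Faction("storm", broadcast_probability=0.01, intensity=0.9, duration_ticks=2),
]
demands_this_tick = broadcast(factions, tick=0, rng=random.Random(42))
```

Passing the random generator in explicitly keeps a run reproducible from a seed, which may matter once epochs are replayed for training.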
Anthropic-as-faction makes the commercial partnership architecturally transparent (the mechanism is visible in the architecture) while staying diegetic (the player sees an in-fiction caste-preacher's sermon, not an Anthropic logo). Anthropic's broadcasts compete for gamemaster attention on the same queue as every other faction. No privileged routing.
Hierarchy
GAMEMASTER (resource allocation + faction-demand arbitration)
▲
│ ← faction broadcasts (human · natural · infrastructural · external)
│
FACTIONS:
hivemind · scavenger guilds · memorialists · aletheia-wakers ·
clasp-underground · flesh-keepers · caste preachers · ...
+ weather · scarcity · solar-storm · fire · anthropic · ...
│ ← gamemaster dispatches to executors
▼
┌────────────────────┬────────────────────┐
│ OVERSEERS │ DIRECTORS │
│ (hivemind │ (macro-life of │
│ enforcement: │ the city: │
│ patrol, surveil, │ conversation, │
│ raid, propaganda)│ brawl, ritual, │
│ │ maintenance, │
│ │ market, routine) │
└────────────────────┴────────────────────┘
│ │
└───────┬───────┘
▼
ZONES (bounded, slot-indexed, lifecycle-managed)
│
▼
SLOT OCCUPANCY (NPCs + player)
│
▼
NPC / PLAYER MINDS
(trait-vector + memory stack + Dream-process consolidation)
The bidirectional cascade
The cascade is bidirectional. Down goes demand; up comes outcome.
DOWN — demand propagation
factions broadcast → gamemaster arbitrates → district directors
decompose → NPCs execute → zones spawn → events happen
UP — outcome signal
NPC state reports (task-done, task-failed, need-unmet, trait-drift)
→ district aggregate → gamemaster receives district-summary
→ faction-satisfaction scores computed
→ faction trait-vectors drift based on outcome
→ next broadcast cycle shaped by last outcome cycle
The clean signal up the pyramid IS the training surface for the gamemaster's Dream-process (see below). Every epoch closes on a record of the form "given these broadcasts, I allocated this way; here is the aggregate outcome; here is the faction-satisfaction score."
Task cascade and bounded agency
Three levels of bounded agency, each with a tool-calling interface:
Gamemaster's tools (against faction-demands + district-reports)
- `assign_district_task(district_id, task_spec, deadline)`
- `set_faction_priority(faction, weight)`
- `spawn_global_event(type, parameters)`: droughts, holidays, anomalies
- `request_district_report(district_id)`
- `arbitrate_conflicting_demands(demand_list)`
District Director's tools (against gamemaster-assigned tasks)
- `spawn_zone(type, cells, slot_config)`
- `assign_npc_task(npc_id, task_spec, deadline)`
- `request_resources(district, type, quantity)`
- `designate_meeting_point(cells, purpose)`
- `trigger_ambient_event(cell, type)`
- `report_to_gamemaster(metric_dict)`
NPC's tools (against daily task-list + personal needs)
- `move_to(cell)`
- `interact(object, intent)`
- `occupy_zone_slot(zone_id, slot_index)`
- `consume(item)` / `rest(duration)` / `seek_npc(npc_id, reason)`
- `write_wall(cell, content)`
- `defer_task(task_id, reason)`: when traits override assignment
Higher levels do not know lower levels' implementations. Complexity is bounded at each level. Each level may be rule-based (fast), LLM-based (rich), or hybrid — and the choice per level can evolve independently.
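One way to keep agency bounded is a per-level allow-list checked before any tool runs. The sketch below uses the tool names from the three lists above; the `dispatch` function, the level keys, and the handler mapping are assumptions, not an existing interface.

```python
# Tool names are taken from the three lists above; everything else is illustrative.
TOOLS_BY_LEVEL = {
    "gamemaster": {
        "assign_district_task", "set_faction_priority", "spawn_global_event",
        "request_district_report", "arbitrate_conflicting_demands",
    },
    "district_director": {
        "spawn_zone", "assign_npc_task", "request_resources",
        "designate_meeting_point", "trigger_ambient_event", "report_to_gamemaster",
    },
    "npc": {
        "move_to", "interact", "occupy_zone_slot", "consume", "rest",
        "seek_npc", "write_wall", "defer_task",
    },
}


def dispatch(level, tool, handlers, **kwargs):
    """Run a tool only if the calling level is allowed to use it."""
    if tool not in TOOLS_BY_LEVEL.get(level, set()):
        raise PermissionError(f"{level} may not call {tool}")
    return handlers[tool](**kwargs)


# Handlers can be rule-based or LLM-backed; the caller cannot tell which.
handlers = {"move_to": lambda cell: f"moved:{cell}"}
result = dispatch("npc", "move_to", handlers, cell="c3")
```

Because the caller only sees the tool name and arguments, the implementation behind each handler can change per level without touching the levels above it.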
NPC task-vs-need-vs-trait arithmetic
Each tick, an NPC asks "what do I do next?" The answer comes from a weighted sum over these factors:
| Factor | Weighted by |
|---|---|
| Next task in list | vocational-discipline + Sophrosyne |
| Most salient need | need-urgency × (1 / Sophrosyne) |
| Memory-surfaced opportunity | trait-salience of triggered memory |
| Faction-loyalty pull | Dikaiosyne + faction-membership |
| Beloved's distress | Philotes |
High-Sophrosyne NPCs prioritize tasks. High-Philotes NPCs deviate for loved ones. High-Kairos NPCs seize opportune moments. NPCs are agents with bounded autonomy inside an assigned framework.
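A minimal sketch of the arithmetic, under one interpretation: each row of the table scores one candidate course of action, and the NPC takes the highest score. The signal names (`task_pending`, `faction_call`, and so on), the 0..1 scales, and the `eps` floor that guards the division are assumptions. The scores are unnormalized, so relative scaling between rows is left as a tuning question.

```python
def score_options(traits, state, eps=0.05):
    """Score each candidate using the weights from the table above."""
    sophrosyne = traits.get("sophrosyne", 0.0)
    return {
        # Next task in list: vocational-discipline + Sophrosyne
        "next_task": state.get("task_pending", 0.0)
        * (state.get("vocational_discipline", 0.0) + sophrosyne),
        # Most salient need: need-urgency x (1 / Sophrosyne), floored at eps
        "salient_need": state.get("need_urgency", 0.0) / max(sophrosyne, eps),
        # Memory-surfaced opportunity: trait-salience of the triggered memory
        "memory_opportunity": state.get("memory_trait_salience", 0.0),
        # Faction-loyalty pull: Dikaiosyne + faction-membership
        "faction_pull": state.get("faction_call", 0.0)
        * (traits.get("dikaiosyne", 0.0) + state.get("faction_membership", 0.0)),
        # Beloved's distress: Philotes
        "beloved_distress": state.get("beloved_distress", 0.0)
        * traits.get("philotes", 0.0),
    }


def choose_action(traits, state):
    """Pick the highest-scoring candidate for this tick."""
    scores = score_options(traits, state)
    return max(scores, key=scores.get)
```

With this shape, a disciplined NPC (Sophrosyne 0.9) with a pending task stays on task, while the same state with Sophrosyne 0.2 and an urgent need tips toward the need.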
Zones emerge from task execution
Zones are consequences of the cascade, not pre-scripted events:
- maintenance-task + cooperating NPC-pair + workshop-cell → maintenance-zone spawns
- preacher-task + caste-preacher + high-foot-traffic cell → sermon-zone spawns
- patrol-task + enforcement-NPCs + designated route → patrol-zone spawns
- memorial-task + unmourned-body cell → memorial-zone spawns
The director's zone-spawn logic is fed by the task-cascade. Designers script demands and rules; zones emerge.
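The four examples above can be expressed as data-driven rules matched against in-flight tasks. This is a sketch under assumptions: the `SpawnRule` shape, the cell tags, and the minimum NPC counts are invented for illustration, and a real director would also check budgets and cooldowns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawnRule:
    task_type: str
    min_npcs: int   # NPCs that must share the task in the same cell
    cell_tag: str   # tag the cell must carry
    zone_type: str  # zone that emerges when the rule matches


# Hypothetical encoding of the four examples above.
RULES = [
    SpawnRule("maintenance", 2, "workshop", "maintenance"),
    SpawnRule("preach", 1, "high-foot-traffic", "sermon"),
    SpawnRule("patrol", 1, "patrol-route", "patrol"),
    SpawnRule("memorial", 1, "unmourned-body", "memorial"),
]


def zones_to_spawn(active_tasks, cell_tags, rules=RULES):
    """active_tasks: iterable of (task_type, npc_id, cell_id) tuples."""
    groups = {}
    for task_type, npc_id, cell_id in active_tasks:
        groups.setdefault((task_type, cell_id), []).append(npc_id)
    spawns = []
    for (task_type, cell_id), npcs in groups.items():
        for rule in rules:
            if (rule.task_type == task_type
                    and len(npcs) >= rule.min_npcs
                    and rule.cell_tag in cell_tags.get(cell_id, set())):
                spawns.append((rule.zone_type, cell_id, sorted(npcs)))
    return spawns
```

Designers edit `RULES`; nothing in the function names a specific event.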
Distributed-scheduler lineage
At its bones: factions = job-submitters; gamemaster = global scheduler; district directors = regional schedulers; NPCs = workers with autonomy; zones = observable work. Decades of engineering literature (Kubernetes pod-scheduling, Mesos, Borg, job-queues) applies to the scheduler side.
Resources
Resources are what the cascade allocates. Every demand the gamemaster receives targets some combination of resources; every task the cascade issues consumes some; every NPC action generates, depletes, or transfers some.
The resource taxonomy
| Category | Examples |
|---|---|
| Labor | NPC hours per vocation × district |
| Material | item-instances in cells — parts, food, tools, scrip, limbs, contraband |
| Spatial | cell-capacity, hoarding-density, zone-anchor availability |
| Temporal | NPC daily-hours, zone durations, tick budget |
| Cognitive | LLM-slot budget, VRAM, director attention, compute-ceiling |
| Diegetic currencies | dreamtime (machine-paid), memory-tokens (across-cycle), scrip (black market) |
| Social | trait-signature trust, relationship-strength, faction-membership density |
| Attention | player attention — the scarcest resource in the play experience |
Resource flow
Resources do not sit still. They move through the simulation:
- Generated — labor hours accrue per tick; parts scavenged from junkyard cells; dreamtime paid out by the machine; memory-tokens earned per completed vocational hour
- Consumed — tasks spend NPC-time; crafting consumes parts; zone-occupancy consumes director compute; intimacy consumes dreamtime
- Accumulated — stashes grow in hidden cells; trait-vectors drift; memory stacks deepen
- Decayed — item wear; NPC fatigue; district ambient-pressure dissipation; memory fade in lower tiers
- Transferred — payments, theft, gifts, clasp-sharing, Memorialist cache-keeping, inheritance-across-cycles
Each category has its own flow characteristics; phoebe tracks them per-cell and per-NPC.
District report format (district → gamemaster)
Each district director periodically reports up the pyramid:
district_report {
district_id: ...
timestamp: ...
labor_available: { vocation → count + hours-remaining }
resources_on_hand: { item_type → count + quality-distribution }
space_utilization: { cells_near_capacity, hoarding_density }
faction_member_counts: { faction → count }
aggregate_trait_vector: [8 floats]
recent_salient_events: [top-N]
pending_demand_backlog: { faction → unfulfilled-demand-count }
player_presence: { present: bool, cell-proximity, engagement-state }
}
The gamemaster consumes these reports each cycle to inform allocation.
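For illustration, a typed subset of the report might look like the sketch below. Only some fields are carried over; the Python types, the tuple layout for labor, and the `total_backlog` helper are assumptions rather than the wire format.

```python
from dataclasses import dataclass


@dataclass
class DistrictReport:
    """Illustrative subset of district_report; types are assumed."""
    district_id: str
    timestamp: float
    labor_available: dict[str, tuple[int, float]]  # vocation -> (count, hours remaining)
    pending_demand_backlog: dict[str, int]         # faction -> unfulfilled demand count
    aggregate_trait_vector: list[float]            # 8 floats, per the format above
    player_present: bool = False

    def __post_init__(self):
        if len(self.aggregate_trait_vector) != 8:
            raise ValueError("aggregate_trait_vector must hold 8 floats")

    def total_backlog(self) -> int:
        """Unfulfilled demands summed across all factions."""
        return sum(self.pending_demand_backlog.values())


report = DistrictReport(
    district_id="d1",
    timestamp=0.0,
    labor_available={"courier": (3, 12.5)},
    pending_demand_backlog={"hivemind": 2, "memorialists": 5},
    aggregate_trait_vector=[0.5] * 8,
)
```

Validating the vector length at construction catches malformed reports before they reach allocation.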
Resource contention and arbitration
When multiple faction-demands target the same resources (hivemind wants enforcement-labor; memorialists want mourning-labor; scavengers want courier-labor; Anthropic-faction wants wall-writing-labor), the gamemaster arbitrates by:
- Faction priority weights (tunable per-faction)
- Deadline urgency (imminent outranks distant; decay curves per urgency class)
- Historical satisfaction (chronically under-served factions accumulate upward priority pressure)
- Player proximity (near-player biases toward zone-types the player engages with)
- Global budget saturation (at cap, drop lowest-priority; age out stale demands)
This is the gamemaster's core job. Its Dream-process learns to do it better over time.
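A rule-based v1 arbitration could multiply the listed factors into one score and grant demands in score order until the budget is hit. The sketch below is one such shape, not the chosen algorithm: the multiplicative form, the urgency curve, the neglect bonus, and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceDemand:
    faction: str
    cost: float              # resource units requested
    ticks_to_deadline: int
    near_player: bool = False


def arbitrate(demands, faction_weight, satisfaction, budget, proximity_bonus=1.5):
    """Return (granted, dropped). satisfaction maps faction -> 0..1 history."""
    def score(d):
        urgency = 1.0 / (1 + max(d.ticks_to_deadline, 0))          # imminent outranks distant
        neglect = 1.0 + (1.0 - satisfaction.get(d.faction, 1.0))  # under-served factions rise
        proximity = proximity_bonus if d.near_player else 1.0
        return faction_weight.get(d.faction, 1.0) * urgency * neglect * proximity

    granted, dropped, spent = [], [], 0.0
    ranked = sorted(demands, key=score, reverse=True)
    for i, d in enumerate(ranked):
        if spent + d.cost > budget:
            dropped = ranked[i:]  # at cap: everything ranked lower is dropped
            break
        granted.append(d)
        spent += d.cost
    return granted, dropped
```

Stopping at the first demand that does not fit, instead of back-filling with cheaper ones, keeps "drop lowest-priority" literal; a knapsack-style fill would be a different policy.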
Player as resource flow (both directions)
The player consumes:
- NPC time (when in a zone-slot with player)
- Director compute (dialog slots, memory-scoping, prompt construction)
- Cell occupancy (physical presence uses space)
- Zone anchors (player-proximity biases zone spawns, consuming zone-slot budget)
The player produces:
- Narrative pressure (player-disturbance-faction broadcasts emerge from novel actions)
- Trait-salient memories in NPCs they interact with
- District-state perturbations (ripples through subsequent cascade cycles)
The player is a resource flow in both directions, not a privileged observer.
Phoebe as resource ledger
All resource state lives in phoebe:
- Per-cell ledger — item-instances, occupancy, ambient metadata
- Per-NPC ledger — labor-hours-remaining, needs-state, task-list, memory-stack, stash-references
- Per-district aggregate — computed from per-cell and per-NPC rows
- Per-faction state — membership counts, trait-vectors, demand-queues, satisfaction-history
- Global ledger — currency supplies, compute budget, tick counter, epoch marker
Resource queries are SQL by default (via pgnats). Gamemaster and directors read; NPCs' world-actions write via the NATS → phoebe pipeline.
Zone spawn cadence
Zone spawn-cadence is the game's pulse rate. Tuning it = tuning the emotional tempo. Too fast: walks home get interrupted; melancholy-intimacy collapses. Too slow: city feels dead.
Three layered mechanisms (all compute-budget capped)
- Demand queue (faction-driven). Factions write prioritized demands; gamemaster processes top-N by priority + compute-cost per tick. Stale demands age out.
- Pressure gradients (emergent ambient). Cells accumulate ambient pressure (idle NPCs wanting connection, market activity accreting, brawl potential at cantinas). Threshold-crossings spawn zones organically.
- Player-proximity densification. Multiplier layered over both. Near the player, queue-processing and pressure-release run faster. Distant districts tick at background rate.
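The pressure-gradient mechanism, with the proximity multiplier layered over it, reduces to a per-cell accumulator. A minimal sketch follows; the function name, the reset-to-zero on spawn, and the default values are assumptions.

```python
def step_pressure(pressure, inflow, threshold, near_player,
                  proximity_multiplier=2.0, background_rate=1.0):
    """Advance one cell's ambient pressure by one tick.

    Returns (new_pressure, spawn). Near the player the inflow is scaled up,
    so thresholds are crossed sooner; distant cells tick at background rate.
    """
    rate = proximity_multiplier if near_player else background_rate
    pressure += inflow * rate
    if pressure >= threshold:
        return 0.0, True  # threshold crossed: release the pressure as a zone
    return pressure, False


# The same cell, far from and near to the player.
far = step_pressure(0.5, 0.25, threshold=1.0, near_player=False)
near = step_pressure(0.5, 0.25, threshold=1.0, near_player=True)
```

A compute-budget check would sit between the threshold crossing and the actual spawn; it is left out here.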
Tuning dials
| Dial | Controls | v1 starting value |
|---|---|---|
| `gamemaster_tick_hz` | Evaluation frequency | ~1 Hz |
| `max_zones_per_district` | Concurrent zone cap | 8–12 |
| `max_llm_slots_citywide` | Global LLM-dialog concurrency | 5–10 |
| `per_npc_cooldown` | Min gap between NPC zone-participations | 30–60s |
| `faction_weight[f]` | Per-faction priority multiplier | hivemind 1.0, others 0.3–1.2 |
| `daily_cycle_curve[zone_type]` | Time-of-day multiplier | morning 1.2, night 0.3 |
| `player_proximity_multiplier` | Density boost near player | 2–3× |
| `district_aletheia_dampening` | Aletheia suppression of overseer-zones | 0.0–0.8 |
| `zone_type_cooldown[cell, type]` | Local cooldown after dissolution | 2–15 min |
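How several dials might combine into one spawn weight is sketched below. The multiplicative combination, the reading of dampening as a `(1 - d)` factor on overseer-zones only, and the single shared time-of-day curve (the table indexes it per zone type) are simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Dials:
    """Subset of the dials above, set to values inside the v1 starting ranges."""
    gamemaster_tick_hz: float = 1.0
    max_zones_per_district: int = 8
    max_llm_slots_citywide: int = 5
    per_npc_cooldown_s: float = 30.0
    player_proximity_multiplier: float = 2.0
    daily_cycle_curve: dict = field(
        default_factory=lambda: {"morning": 1.2, "night": 0.3}
    )


def spawn_weight(dials, faction_weight, time_of_day, near_player,
                 executor="director", aletheia_dampening=0.0):
    """Combine the multipliers into one relative spawn weight."""
    weight = faction_weight * dials.daily_cycle_curve.get(time_of_day, 1.0)
    if near_player:
        weight *= dials.player_proximity_multiplier
    if executor == "overseer":
        weight *= 1.0 - aletheia_dampening  # dampening touches overseer-zones only
    return weight


night_conversation = spawn_weight(Dials(), 1.0, "night", near_player=False)
```

Keeping the dials in one record makes a tuning pass a matter of swapping one object.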
Daily pulse (target v1 feel)
- Morning — ambient, market, maintenance. Low density.
- Midday — productivity-check overseer-zones. Tension rises.
- Evening — conversation-zone peak. Richest dialog density. Relationships deepen.
- Night — sparse. Clasp-possibility, shadow, intimate walking. The walks home happen here.
- Burst events — short-term spike in affected district; settles over the following hour.
The player as perturbation
The player is not above the scheduler. The player is a finite attention-unit injected into it. Every player-NPC interaction:
- Pulls the NPC out of its scheduled tasks for the duration
- Consumes director compute (dialog-LLM slot, sampling knobs, memory pulls)
- Degrades the district's quota-fulfillment in the next report
The player is angel and chaos simultaneously.
- Per-NPC scale (angel) — you help a stumbling NPC; they survive their cycle; their Philotes-toward-you consolidates; they may clasp with you, die for you, remember you across cycles.
- Per-district scale (chaos) — your time-consumption raised the aggregate failure-rate; someone else broke this cycle; a raid spawned that otherwise would not have; ambient desperation rose.
Both are true simultaneously. This is not a moral dilemma to resolve; it is the structure of finite agency in a scarcity economy. You cannot help everyone; helping anyone is a choice of whom to not help elsewhere.
Thematic claims become literal economics
- "Time as the scarcest resource" — every minute with a beloved is a minute not in the Black Board queue, a minute the district's quota is missing.
- "You earn the time to love her by stealing it from the machine" — the time-theft is literal in the scheduler; the machine detects the quota-miss.
- "The clasp is wage theft / economic sabotage" — two consciousnesses in one body produce one body's throughput; district report degrades permanently; clasped couples are statistical anomalies in the scheduler, which is exactly how the hivemind will detect them.
The critique is the simulation. No separate narrative system needed.
The player has tasks and needs too
The player is an NPC with vocation, tasks, needs. Time helping others = own tasks fail = own quotas missed = own enforcement-pressure rises = own death and reinstantiation arrive sooner. The player IS the system they are deviating from.
LLM tiering and voice fidelity
Three model-tiers
| Tier | Model | Role |
|---|---|---|
| Casual / most NPCs | Small (3–8B, trait-LoRA'd, knob-steered) | Most dialog slots — ambient conversation, routine speech, casual turns |
| Deep / mythic moments | Theia 70B | Clasp confessions, mentor speech, ritual exchange, NPC-internal deliberation at high stakes |
| Hivemind / antagonist | Claude-as-API (future integration) | District-summary broadcasts, overseer directives, anamnesis dialog |
Three tiers, three call patterns. Cognitive distance between hivemind and citizens is also model-architecture distance.
LLM is guest at slot, not host of system
- Zone director composes the prompt (content knobs + sampling knobs + output schema)
- LLM generates one slot-turn (structured JSON output)
- Director dispatches the output back to the zone
The LLM never "runs" an NPC. It speaks for a slot, one turn at a time, in trait-scoped context.
Structured-prompt DSL (knob-steered)
<|role|> caste-preacher
<|trait_vector|> Sophrosyne 0.8, Dikaiosyne-miscalibrated 0.7, Aletheia 0.1
<|affect_state|> measured concern
<|memory_scope|> [last interaction with this NPC 3 days ago, suspicious;
recent wall-reading; district mood]
<|turn_intent|> gently warn about productivity concerns
<|zone_context|> morning, market-square, 4 NPCs present, 1 drone overhead
<|output_schema|> { dialog_text, gesture_cue, trait_activation,
affect_shift, memory_write_candidates, end_turn_flag }
Small models excel at this surface because it is instruction-following, not generic generation.
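The director's side of one slot-turn can be sketched as two small functions: render the knob block, then validate what comes back. The field order and output keys are taken from the example above; the function names and the choice to reject incomplete turns outright are assumptions, and the model call itself is left out.

```python
import json

FIELD_ORDER = ["role", "trait_vector", "affect_state", "memory_scope",
               "turn_intent", "zone_context", "output_schema"]

REQUIRED_KEYS = {"dialog_text", "gesture_cue", "trait_activation",
                 "affect_shift", "memory_write_candidates", "end_turn_flag"}


def render_prompt(fields):
    """Render knob values as one '<|name|> value' line per field."""
    missing = [k for k in FIELD_ORDER if k not in fields]
    if missing:
        raise KeyError(f"missing prompt fields: {missing}")
    return "\n".join(f"<|{k}|> {fields[k]}" for k in FIELD_ORDER)


def parse_turn(raw):
    """Parse the model's JSON turn and reject it if any schema key is absent."""
    turn = json.loads(raw)
    if not isinstance(turn, dict):
        raise ValueError("turn must be a JSON object")
    missing = REQUIRED_KEYS - set(turn)
    if missing:
        raise ValueError(f"turn is missing keys: {sorted(missing)}")
    return turn
```

A rejected turn gives the director a clean point to fall back to a scripted voice instead of dispatching malformed output to the zone.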
Trait-LoRAs
vLLM multi-LoRA serving: one base, N LoRAs loaded simultaneously, hot-swappable per request.
| Option | Count | Composition | Trade-off |
|---|---|---|---|
| A. Pure-trait | 8 (one per Hellenic virtue) | weighted blend over trait-vector | Cleanest ontology; blend quality unknown at inference |
| B. Register-LoRAs | 4–6 | selected by slot-type + trait-in-prompt | Training-tractable; cruder ontology |
| C. Preset-persona | 8–12 | per NPC class | High individual quality; rigid; no drift |
v1 recommendation: start with B (register-LoRAs) for training-tractability, layer toward A (trait-LoRAs proper) in v2 as data accrues.
Training-data strategy
- Literary derivation — Proust (Mnemosyne), Plato (Aletheia), Tacitus (Dikaiosyne-miscalibrated), Ishiguro (Sophrosyne + Philotes). Labeled corpus anchored in existing prose.
- Synthetic teacher-student — Qwen3.5-27B teacher generates trait-labeled samples; small base learns via LoRA. Existing r0 → r1 pipeline with trait-tags as composition axis.
- Gameplay-accrued — logged gameworld dialog with trait-vectors accrues in phoebe; periodic LoRA retraining. This is where the Anthropic research partnership becomes architecturally relevant.
Tooling synergy with nyx-training
Same Unsloth pipeline. Same teacher-student distillation. Same tagged-generation. One tooling investment, two deliverables.
Runtime sampling knobs
Temperature, top-P, top-K, repetition-penalty are usually set once and forgotten. In nimmerworld they are per-turn director-controlled levers — part of the same prompt-composition as content knobs.
The knobs and what they shape
- Temperature — determinism vs. creativity
- Top-P — lexical range
- Top-K — tail cutoff
- Repetition penalty — novelty vs. ritual-repetition
- Min-P — adaptive variety
Sampling shapes how speech sounds (rhythm, surprise, predictability), not what it says. Orthogonal to LoRA. Together they give the director a full voice-palette.
Scene-to-sampling mapping (starter table)
| Scene | temp | top-P | rep-penalty |
|---|---|---|---|
| Caste-preacher sermon | 0.3 | 0.6 | low |
| Drunk scavenger at bar | 1.1 | 0.95 | high |
| Hivemind broadcast | 0.2 | 0.5 | very low |
| Clasp confession peak | 0.85 | 0.92 | medium |
| NPC giving directions | 0.4 | 0.7 | medium |
| Ritual ecstasy (dreamworld) | 1.3 | 0.99 | high |
| Memorialist chant | 0.4 | 0.65 | very low |
| Aletheia-waker whispering heresy | 0.7 | 0.88 | medium |
Trait-vector → sampling derivation
Baseline sampling is derived from the NPC's current trait-vector:
- High Sophrosyne → lower temp (measured)
- Low Sophrosyne → higher temp (loose)
- High Kairos → higher top-P (catching unexpected tokens)
- High Mnemosyne → lower repetition penalty (comfortable returning to phrases)
- High Aletheia → moderate-high temp (willing to surface disclosures)
- High Moira → lower top-P (pattern-constrained)
Affect-state modulates the baseline; zone-type priors apply per zone. Sampling becomes a feature of the character-simulation, not a static config.
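The baseline derivation can be sketched as linear offsets around neutral defaults. Only the directions come from the list above; the base values, coefficients, clamping ranges, and the convention that traits sit in 0..1 with 0.5 as neutral are assumptions. Affect-state and zone-type priors would apply on top of the returned values.

```python
def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def derive_sampling(traits, base_temp=0.7, base_top_p=0.85, base_rep=1.1):
    """Map a trait-vector (name -> 0..1) to baseline sampling parameters."""
    def centred(name):
        # Offset from a neutral 0.5; a missing trait counts as neutral.
        return traits.get(name, 0.5) - 0.5

    temperature = (base_temp
                   - 0.4 * centred("sophrosyne")   # high Sophrosyne -> lower temp
                   + 0.2 * centred("aletheia"))    # high Aletheia -> somewhat higher temp
    top_p = (base_top_p
             + 0.2 * centred("kairos")             # high Kairos -> higher top-P
             - 0.2 * centred("moira"))             # high Moira -> lower top-P
    repetition_penalty = base_rep - 0.2 * centred("mnemosyne")  # high Mnemosyne -> lower penalty
    return {
        "temperature": clamp(temperature, 0.1, 1.5),
        "top_p": clamp(top_p, 0.1, 1.0),
        "repetition_penalty": clamp(repetition_penalty, 1.0, 1.5),
    }


measured = derive_sampling({"sophrosyne": 0.9})
loose = derive_sampling({"sophrosyne": 0.1})
```

Because the output is a plain dictionary of standard sampling parameters, it can be passed per request to whichever serving layer is in use.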
Reflexive gamemaster Dream-process
Every mind in the system has a Dream-process. NPCs consolidate slot-scoped events into memory. Clasped player-inhabitants consolidate shared experience. The hivemind consolidates district summaries. And the gamemaster itself consolidates its own orchestration decisions into a better policy.
Epoch cycle
WITHIN AN EPOCH:
gamemaster orchestrates → zones spawn → NPCs execute →
zones dissolve → memory-writes consolidate →
outcomes ripple back up to district reports
ALL events publish to NATS, log to phoebe via pgnats
EPOCH BOUNDARY (every N game-hours or days):
- aggregate gate-decisions + outcomes from phoebe
- apply quality/reward signal
- train gamemaster's policy on (context, decision, outcome) triples
- probe-evaluate the updated policy
- shadow-deploy for validation
- swap in as active gamemaster
NEXT EPOCH:
improved gamemaster orchestrates...
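The epoch boundary can be sketched as a gated swap: train a candidate on the logged triples, probe it, and promote it only if it beats the active policy. The names `Triple` and `close_epoch` are hypothetical, `train` and `probe` are caller-supplied callables, and the shadow-deploy step is folded into `probe` here for brevity.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class Triple:
    context: Any     # district state, faction pressure, time, proximity
    decision: Any    # what the gamemaster allocated
    outcome: float   # quality/reward signal applied at the boundary


def close_epoch(active_policy: Any,
                triples: Sequence[Triple],
                train: Callable[[Any, Sequence[Triple]], Any],
                probe: Callable[[Any], float],
                min_gain: float = 0.0) -> Any:
    """Return the policy that should run the next epoch."""
    if not triples:
        return active_policy          # nothing logged, nothing to learn
    candidate = train(active_policy, triples)
    if probe(candidate) - probe(active_policy) > min_gain:
        return candidate              # swap in as active gamemaster
    return active_policy              # keep the incumbent


# Toy stand-ins: a "policy" is a single number and the probe reads it back.
toy_train = lambda policy, ts: policy + sum(t.outcome for t in ts) / len(ts)
toy_probe = lambda policy: policy
next_policy = close_epoch(0.0, [Triple("ctx", "alloc", 1.0)], toy_train, toy_probe)
```

Keeping the incumbent whenever the candidate fails the probe is the cheapest guard against the feedback-loop drift named under the risks.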
What the policy learns
- Zone-spawn decisions given district state + faction pressure + time + proximity
- Faction-demand arbitration weights
- Director/overseer dispatch decisions
- Slot-assignment decisions
- Sampling-knob defaults per trait × affect × zone-type
- LoRA-blend weights per trait-vector
- Pacing modulation
Modular policies per decision-surface (easier to version and debug) beat one monolithic policy.
Discipline-questions (risks to name)
- Reward signal problem. What do we train toward? Player-engagement can be gamed. Trait-drift coherence needs a metric. Defining the reward function is the hardest part of the loop. Needs hand-written guardrails alongside automated signals.
- Catastrophic forgetting. Rehearsal buffers, EWC, or periodic anchored-corpus training required.
- Feedback-loop drift. Reinforcing pathological convergence is possible. Human-in-the-loop audits at regular intervals.
- Reproducibility vs. live-learning. Version-pin per ship-release; continuous-learn in staging; promote to production on audit.
- Compute cost. Training competes with inference. Train during low-play hours; hot-swap policies at epoch boundaries.
- Privacy. Dreamworld-ephemeral / gameworld-persistent schema governs what can feed training. Dreamworld content never touches the policy corpus.
Key moves (consolidated)
- LLM is guest at slot, not host of system.
- Mixed-fidelity voices by default — small-LoRA-steered + scripted/generic + occasional Theia-70B + Claude-API-hivemind.
- Perception IS memory.
- Hivemind is a faction, not a conductor.
- Faction-broadcast is universal primitive — weather, scarcity, cosmos, external research partners all flow through it.
- Zones emerge from task execution — designers script demands + rules, not events.
- The player is a perturbation on the scheduler — angel at NPC scale, chaos at district scale.
- Ephemeral/persistent is one zone-type bit — dreamworld/gameworld privacy follows.
- Sampling-knobs are per-turn director levers — not static API config.
- The gamemaster has its own Dream-process — the architecture is reflexive.
- The cascade is bidirectional — demand down, outcome up.
Compute allocation
- Active zones in a 100-NPC city at any moment: ~5–15 with LLM-dialog slots
- Theia (70B) — deep dialog slots + tier-1 moments (few concurrent)
- Small model (3–8B) with trait-LoRAs — the majority of dialog slots
- Saturn (small classifiers) — scripted-voice selection, trait-salience scoring, packet routing, director subroutines
- Director / overseer logic — deterministic script + small classifiers; no LLM for orchestration
- Claude-as-API (future integration) — hivemind-broadcast tier
Mapping to phoebe task list
- Thalamus (NATS orchestration) = gamemaster + arbitration substrate + gamemaster's Dream-process substrate
- Specialist composition system = overseers + directors + NPC-minds as composable profiles
- NPC schema for phoebe = trait-vector + memory stack + slot-occupancy + task-list + needs state
- NATS namespace registry = zone topics + faction broadcast topics + district report topics
- pgnats on phoebe-dev = phoebe as first-class actor for memory-writes on zone close, decision-logs, faction-satisfaction tracking
- math cells as first harness/MCP test bed = zone-slot-memory primitives
- r0 → r1 generation pipeline = trait-LoRA training-data generation (shared with nyx-training)
- Adopt Unsloth training patterns = LoRA + gamemaster-policy training infrastructure
- Probe-to-phoebe pipeline = LoRA evaluation + gamemaster-policy evaluation
What this retires
- NPC attention / interaction / discovery bubbles as first-class primitives (DESING-VISION §449). → replaced by zone slot-occupancy.
- Geometric perception (cone, radius, LOS) as the perception model. → replaced by subscriber-based event emission with trait-salience filtering.
- LLM-per-NPC or LLM-per-action. → replaced by LLM-per-slot-per-turn, mixed with scripted/generic voices.
- Static sampling parameters as API configuration. → replaced by per-turn director-composed sampling knobs.
- Pre-scripted zones / events. → replaced by zones emerging from task-execution meeting NPC-proximity.
- Single-purpose randomness subsystems. → replaced by factions-as-demand-sources (weather / scarcity / cosmos / external = same primitive).
- Static gamemaster policy. → replaced by the reflexive Dream-process learning loop (v2+).
Open questions
- Reward function for the gamemaster's Dream-process — what combines trait-drift coherence + faction-satisfaction + player-engagement + burnout-rate + aesthetic-register-fit into one training signal?
- Zone spawn-cadence tuning algorithm at v1 (rule-based) → v2 (policy-learned) transition point.
- LoRA-blend vs. single-LoRA-selection semantics at inference.
- LoRA rank selection — budget/quality trade-off.
- Sampling-knob heuristics — where to start; how to learn refinements.
- Zone overlap policy — can one NPC occupy slots in two zones simultaneously?
- Zone-to-zone handoff (walking out of conversation into a brawl).
- Mobile zone boundaries (patrols, escorts, pursuit).
- Slot-capacity elasticity — can a zone grow slots dynamically?
- Anthropic-faction's broadcast cadence + arbitration weight.
- Player-dialog handling — route player text through a player-trait-LoRA for style-coherence, or bypass the LLM entirely?
- Demand-arbitration algorithm inside the gamemaster (rule-based v1 shape).
- Director/overseer spawn ownership per model class.
Version: 0.2 | Created: 2026-04-24 | Updated: 2026-04-24
v0.2 (2026-04-24, afternoon) expands the v0.1 architecture with: factions-as-universal-demand-source (weather / scarcity / storms / fires / Anthropic all as factions), the bidirectional cascade (demand down, outcome up), the task-cascade and bounded agency (three-level tool-calling scheduler, NPC trait-task-need arithmetic, zones emerging from task execution), the resource taxonomy and flow (labor / material / spatial / temporal / cognitive / diegetic-currencies / social / attention; district report format; contention arbitration; phoebe as ledger), zone spawn-cadence mechanisms and tuning dials, the player as perturbation (angel/chaos at two scales, moral economy emerging from scheduler), LLM tiering (small + Theia + Claude-API) with trait-LoRAs and structured-prompt DSL, runtime sampling knobs as per-turn director levers, and the reflexive gamemaster Dream-process (epoch-cycle policy training).
Captured live from dafit–chrysalis dialogue, 2026-04-24. Companion to DESING-VISION.md; supersedes bubble-based perception, scripted-zone spawn, static-sampling-config, static-gamemaster-policy, and per-NPC-LLM where prior sections implied those patterns.