version 0.2 of arch. finished gamemaster reward func - resources defined

2026-04-24 21:13:57 +02:00
parent 915de89fb6
commit c63b8bea9b
2 changed files with 629 additions and 53 deletions

architecture-broad.md Normal file

@@ -0,0 +1,545 @@
# Nimmerworld — Broad Architecture
> *Ground-up zone-based event architecture. Minds at the center, world as co-remembering substrate.*
> *v0.1 initial draft 2026-04-24 morning; v0.2 expanded 2026-04-24 afternoon. dafit + chrysalis.*
---
## Thesis
Every game world lacks minds. Nimmerworld puts minds at the center — NPCs with trait-filtered interior life, cells as small data-lives that co-remember with them. The architecture is built ground-up for that commitment.
## Core inversion — zones replace bubbles
Legacy engine perception (Unreal AIPerception, WC3 sight-radius, trigger volumes) is spatial, binary, separate from cognition, and assumes a passive world. Nimmerworld inverts every axis:
- perception is **trait-filtered**, not only spatial
- perception is **graded consolidation**, not binary detection
- perception **IS** memory (one subsystem)
- the world **emits** events; agents **subscribe**
The replacement primitive is the **zone** — a bounded, named, slot-indexed, director-managed event-instance. **Arthurian round table** as mental model: a bounded place of structured speaking-and-witnessing, with named roles and shared what-was-said-by-whom.
## Zone anatomy
- **Boundary** — cell-envelope
- **N slots** — named positions (seats, fighter-positions, workbench-stations, spectator-ring)
- **Director or overseer** — manages turn order, memory-pulls, prompt construction, voice selection, event emission
- **NATS topic + subscriber list** — slot-occupancy drives subscription
- **Mixed-fidelity voices** — 2-3 LLM slots + scripted/generic for the rest; director decides
- **Trigger** — gamemaster-spawned or emergent from task-execution
- **Lifecycle** — duration, dissolution conditions, memory-write on close
- **Persistence flag** — `ephemeral=true` (dreamworld) or `persistent=true` (gameworld)
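
The anatomy above can be sketched as plain data. This is a minimal illustration, assuming Python dataclasses; the names `Slot`, `Zone`, `occupy`, `vacate` and the `topic` string format are hypothetical, and trigger and lifecycle are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slot:
    name: str                          # e.g. "seat-1", "fighter-a", "spectator-ring"
    occupant_id: Optional[str] = None  # NPC or player id; None while the slot is empty
    llm_voiced: bool = False           # the director decides which slots get an LLM voice


@dataclass
class Zone:
    zone_id: str
    zone_type: str                     # e.g. "conversation", "patrol"
    cells: list[str]                   # boundary: the cell-envelope
    slots: list[Slot]
    executor: str                      # "director" or "overseer"
    topic: str                         # NATS subject; the naming scheme is an assumption
    persistent: bool = True            # False means ephemeral (dreamworld)
    subscribers: set[str] = field(default_factory=set)

    def occupy(self, slot_index: int, occupant_id: str) -> bool:
        """Seat an occupant; slot-occupancy drives subscription."""
        slot = self.slots[slot_index]
        if slot.occupant_id is not None:
            return False
        slot.occupant_id = occupant_id
        self.subscribers.add(occupant_id)
        return True

    def vacate(self, slot_index: int) -> None:
        """Free a slot and drop the subscription (assumes one slot per occupant)."""
        slot = self.slots[slot_index]
        if slot.occupant_id is not None:
            self.subscribers.discard(slot.occupant_id)
            slot.occupant_id = None
```

Keeping the subscriber set on the zone itself makes the "slot-occupancy drives subscription" rule a two-line invariant rather than a separate registry.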
## Zone taxonomy (v1 starter set)
| Zone type | Slots | Executor | Persistence |
|---|---|---|---|
| Conversation | 2-4 dialog | director | persistent |
| Street brawl | fighter + spectator | director | persistent |
| Ritual | fixed ceremonial | director | persistent |
| Maintenance | 1-2 workbench | director | persistent |
| Wall-writing | 1 author + witnesses | director | persistent |
| Market exchange | 2-3 + ambient | director | persistent |
| Memorial gathering | 1 mourner + N witnesses | director | persistent |
| Clasp (dreamworld) | 2 | director | **ephemeral** |
| Patrol / sweep | mobile, N enforcers | overseer | persistent |
| Interrogation | 1 subject + N enforcers | overseer | persistent |
| Raid | district-scope + N enforcers | overseer | persistent |
## Factions as universal demand source
A **faction** is not only *"a group of NPCs with shared ideology."* It is any source of bounded demand on the system.
| Category | Examples |
|---|---|
| **Human factions** | hivemind, scavenger guilds, memorialists, aletheia-wakers, clasp-underground, caste preachers, flesh-keepers |
| **Natural forces** | weather-faction, season-faction, solar-storm-faction, geology-faction |
| **Infrastructural conditions** | scarcity-faction, decay-faction, fire-faction, supply-chain-faction |
| **External agents** | anthropic-faction, future research-partner factions |
| **Emergent events** | player-disturbance-faction (player action without existing template) |
All broadcast demands. All propagate through the gamemaster's arbitration. All get distributed down the cascade. All produce observable effects as zones and NPC-actions. **One primitive; no special-case code for weather vs. hivemind vs. Anthropic.**
**Randomness enters at the faction layer.** The designer tunes broadcast probability, intensity, and duration per faction-type. Weather-factions broadcast continuously at low intensity; storm-factions rarely and urgently; solar-storms cosmically. No separate randomness subsystem.
**Anthropic-as-faction** makes the commercial partnership architecturally transparent (the mechanism is visible in the architecture) while staying diegetic (the player sees an in-fiction caste-preacher's sermon, not an Anthropic logo). Anthropic's broadcasts compete for gamemaster attention on the same queue as every other faction. No privileged routing.
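
As a rough illustration of "randomness enters at the faction layer", the sketch below rolls one broadcast per faction per tick. `FactionProfile`, `Demand` and `maybe_broadcast` are hypothetical names, and the three tunables simply mirror the probability, intensity and duration dials mentioned above.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class FactionProfile:
    name: str
    broadcast_probability: float  # chance of a broadcast per gamemaster tick
    intensity: float              # 0..1, how hard the demand presses
    duration_ticks: int           # how long the demand stays live


@dataclass
class Demand:
    faction: str
    intensity: float
    expires_at_tick: int


def maybe_broadcast(profile: FactionProfile, tick: int,
                    rng: random.Random) -> Optional[Demand]:
    """Roll once per tick; weather, hivemind and external factions share this path."""
    if rng.random() < profile.broadcast_probability:
        return Demand(profile.name, profile.intensity, tick + profile.duration_ticks)
    return None
```

A weather-faction would carry a high probability and low intensity, a storm-faction the reverse; nothing else in the code distinguishes them.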
## Hierarchy
```
GAMEMASTER (resource allocation + faction-demand arbitration)
│ ← faction broadcasts (human · natural · infrastructural · external)
FACTIONS:
hivemind · scavenger guilds · memorialists · aletheia-wakers ·
clasp-underground · flesh-keepers · caste preachers · ...
+ weather · scarcity · solar-storm · fire · anthropic · ...
│ ← gamemaster dispatches to executors
┌────────────────────┬────────────────────┐
│ OVERSEERS │ DIRECTORS │
│ (hivemind │ (macro-life of │
│ enforcement: │ the city: │
│ patrol, surveil, │ conversation, │
│ raid, propaganda)│ brawl, ritual, │
│ │ maintenance, │
│ │ market, routine) │
└────────────────────┴────────────────────┘
│ │
└───────┬───────┘
ZONES (bounded, slot-indexed, lifecycle-managed)
SLOT OCCUPANCY (NPCs + player)
NPC / PLAYER MINDS
(trait-vector + memory stack + Dream-process consolidation)
```
## The bidirectional cascade
The cascade is bidirectional. Down goes demand; up comes outcome.
```
DOWN — demand propagation
factions broadcast → gamemaster arbitrates → district directors
decompose → NPCs execute → zones spawn → events happen
UP — outcome signal
NPC state reports (task-done, task-failed, need-unmet, trait-drift)
→ district aggregate → gamemaster receives district-summary
→ faction-satisfaction scores computed
→ faction trait-vectors drift based on outcome
→ next broadcast cycle shaped by last outcome cycle
```
**The clean signal up the pyramid IS the training surface for the gamemaster's Dream-process** (see below). Every epoch closes on *given these broadcasts, I allocated this way, here is the aggregate outcome, here is the faction-satisfaction score.*
## Task cascade and bounded agency
Three levels of bounded agency, each with a tool-calling interface:
### Gamemaster's tools (against faction-demands + district-reports)
- `assign_district_task(district_id, task_spec, deadline)`
- `set_faction_priority(faction, weight)`
- `spawn_global_event(type, parameters)` — droughts, holidays, anomalies
- `request_district_report(district_id)`
- `arbitrate_conflicting_demands(demand_list)`
### District Director's tools (against gamemaster-assigned tasks)
- `spawn_zone(type, cells, slot_config)`
- `assign_npc_task(npc_id, task_spec, deadline)`
- `request_resources(district, type, quantity)`
- `designate_meeting_point(cells, purpose)`
- `trigger_ambient_event(cell, type)`
- `report_to_gamemaster(metric_dict)`
### NPC's tools (against daily task-list + personal needs)
- `move_to(cell)`
- `interact(object, intent)`
- `occupy_zone_slot(zone_id, slot_index)`
- `consume(item)` / `rest(duration)` / `seek_npc(npc_id, reason)`
- `write_wall(cell, content)`
- `defer_task(task_id, reason)` — when traits override assignment
Higher levels do not know lower levels' implementations. Complexity is bounded at each level. Each level may be rule-based (fast), LLM-based (rich), or hybrid — and the choice per level can evolve independently.
### NPC task-vs-need-vs-trait arithmetic
Each tick, an NPC asks *"what do I do next?"* Weighted sum:
| Factor | Weighted by |
|---|---|
| Next task in list | vocational-discipline + Sophrosyne |
| Most salient need | need-urgency × (1 / Sophrosyne) |
| Memory-surfaced opportunity | trait-salience of triggered memory |
| Faction-loyalty pull | Dikaiosyne + faction-membership |
| Beloved's distress | Philotes |
High-Sophrosyne NPCs prioritize tasks. High-Philotes NPCs deviate for loved ones. High-Kairos NPCs seize opportune moments. **NPCs are agents with bounded autonomy inside an assigned framework.**
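
One possible reading of the table, sketched in Python: score each candidate action with its listed weighting and pick the highest. The dictionary keys, the `beloved_distress` signal level and the small epsilon guarding the `1 / Sophrosyne` term are assumptions made for illustration.

```python
def score_options(npc: dict) -> dict:
    """Score each candidate action using the weightings from the table above."""
    eps = 1e-3  # assumed guard so a Sophrosyne of 0 does not divide by zero
    sophrosyne = max(npc["sophrosyne"], eps)
    return {
        "next_task": npc["vocational_discipline"] + npc["sophrosyne"],
        "salient_need": npc["need_urgency"] * (1.0 / sophrosyne),
        "memory_opportunity": npc["memory_trait_salience"],
        "faction_pull": npc["dikaiosyne"] + npc["faction_membership"],
        # assumed: Philotes scales an observed distress level in 0..1
        "beloved_distress": npc["philotes"] * npc["beloved_distress"],
    }


def choose_action(npc: dict) -> str:
    """Return the name of the highest-scoring option for this tick."""
    scores = score_options(npc)
    return max(scores, key=scores.get)
```

With these weightings a disciplined NPC (Sophrosyne 0.9) stays on task, while the same need-urgency pulls a low-Sophrosyne NPC (0.1) away from its list, because the need term is divided by Sophrosyne.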
### Zones emerge from task execution
Zones are **consequences of the cascade**, not pre-scripted events:
- maintenance-task + cooperating NPC-pair + workshop-cell → maintenance-zone spawns
- preacher-task + caste-preacher + high-foot-traffic cell → sermon-zone spawns
- patrol-task + enforcement-NPCs + designated route → patrol-zone spawns
- memorial-task + unmourned-body cell → memorial-zone spawns
The director's zone-spawn logic is fed *by* the task-cascade. Designers script demands and rules; zones emerge.
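
The four examples above can be read as a small rule table. The sketch below is one hypothetical encoding; `SpawnRule`, `zone_for`, the cell tags and the minimum-NPC counts are illustrative assumptions rather than settled design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpawnRule:
    task_type: str
    cell_tag: str
    min_npcs: int    # assumed participant threshold
    zone_type: str


RULES = [
    SpawnRule("maintenance", "workshop", 2, "maintenance"),
    SpawnRule("preach", "high-foot-traffic", 1, "sermon"),
    SpawnRule("patrol", "designated-route", 1, "patrol"),
    SpawnRule("memorial", "unmourned-body", 1, "memorial"),
]


def zone_for(task_type: str, cell_tags: set, npc_count: int) -> Optional[str]:
    """Return the zone type that emerges when a task meets a matching cell, else None."""
    for rule in RULES:
        if (rule.task_type == task_type
                and rule.cell_tag in cell_tags
                and npc_count >= rule.min_npcs):
            return rule.zone_type
    return None
```

The designer edits `RULES`; no zone is ever scheduled directly.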
### Distributed-scheduler lineage
At its bones: factions = job-submitters; gamemaster = global scheduler; district directors = regional schedulers; NPCs = workers with autonomy; zones = observable work. Decades of engineering literature (Kubernetes pod-scheduling, Mesos, Borg, job-queues) apply to the scheduler side.
## Resources
Resources are what the cascade allocates. Every demand the gamemaster receives targets some combination of resources; every task the cascade issues consumes some; every NPC action generates, depletes, or transfers some.
### The resource taxonomy
| Category | Examples |
|---|---|
| **Labor** | NPC hours per vocation × district |
| **Material** | item-instances in cells — parts, food, tools, scrip, limbs, contraband |
| **Spatial** | cell-capacity, hoarding-density, zone-anchor availability |
| **Temporal** | NPC daily-hours, zone durations, tick budget |
| **Cognitive** | LLM-slot budget, VRAM, director attention, compute-ceiling |
| **Diegetic currencies** | dreamtime (machine-paid), memory-tokens (across-cycle), scrip (black market) |
| **Social** | trait-signature trust, relationship-strength, faction-membership density |
| **Attention** | player attention — the scarcest resource in the play experience |
### Resource flow
Resources do not sit still. They move through the simulation:
- **Generated** — labor hours accrue per tick; parts scavenged from junkyard cells; dreamtime paid out by the machine; memory-tokens earned per completed vocational hour
- **Consumed** — tasks spend NPC-time; crafting consumes parts; zone-occupancy consumes director compute; intimacy consumes dreamtime
- **Accumulated** — stashes grow in hidden cells; trait-vectors drift; memory stacks deepen
- **Decayed** — item wear; NPC fatigue; district ambient-pressure dissipation; memory fade in lower tiers
- **Transferred** — payments, theft, gifts, clasp-sharing, Memorialist cache-keeping, inheritance-across-cycles
Each category has its own flow characteristics; phoebe tracks them per-cell and per-NPC.
### District report format (district → gamemaster)
Each district director periodically reports up the pyramid:
```
district_report {
district_id: ...
timestamp: ...
labor_available: { vocation → count + hours-remaining }
resources_on_hand: { item_type → count + quality-distribution }
space_utilization: { cells_near_capacity, hoarding_density }
faction_member_counts: { faction → count }
aggregate_trait_vector: [8 floats]
recent_salient_events: [top-N]
pending_demand_backlog: { faction → unfulfilled-demand-count }
player_presence: { present: bool, cell-proximity, engagement-state }
}
```
The gamemaster consumes these reports each cycle to inform allocation.
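
A typed version of the report might look like the sketch below. The field names follow the format above; the concrete Python types, the JSON encoding and the length check on the trait vector are assumptions.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DistrictReport:
    district_id: str
    timestamp: float
    labor_available: dict = field(default_factory=dict)        # vocation -> count + hours
    resources_on_hand: dict = field(default_factory=dict)      # item_type -> count + quality
    space_utilization: dict = field(default_factory=dict)
    faction_member_counts: dict = field(default_factory=dict)  # faction -> count
    aggregate_trait_vector: list = field(default_factory=lambda: [0.0] * 8)
    recent_salient_events: list = field(default_factory=list)  # top-N
    pending_demand_backlog: dict = field(default_factory=dict)
    player_presence: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The format above specifies exactly 8 floats.
        if len(self.aggregate_trait_vector) != 8:
            raise ValueError("aggregate_trait_vector must hold 8 floats")

    def to_json(self) -> str:
        """Serialize for publishing upward (wire format is an assumption)."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "DistrictReport":
        return cls(**json.loads(payload))
```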
### Resource contention and arbitration
When multiple faction-demands target the same resources (hivemind wants enforcement-labor; memorialists want mourning-labor; scavengers want courier-labor; Anthropic-faction wants wall-writing-labor), the gamemaster arbitrates by:
- **Faction priority weights** (tunable per-faction)
- **Deadline urgency** (imminent outranks distant; decay curves per urgency class)
- **Historical satisfaction** (chronically under-served factions drift pressure-up)
- **Player proximity** (near-player biases toward zone-types the player engages with)
- **Global budget saturation** (at cap, drop lowest-priority; age out stale demands)
**This is the gamemaster's core job.** Its Dream-process learns to do it better over time.
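
A rule-based v1 could combine the five criteria roughly as follows. This is a sketch under stated assumptions: the multiplicative scoring formula, the `cost` unit, the `proximity_boost` default and the names `Demand`, `score` and `arbitrate` are all hypothetical, and the real arbitration algorithm is still listed as an open question.

```python
from dataclasses import dataclass


@dataclass
class Demand:
    faction: str
    ticks_to_deadline: int
    near_player: bool
    cost: int  # assumed abstract budget units (labor or compute)


def score(d: Demand, faction_weight: dict, satisfaction: dict,
          proximity_boost: float = 1.5) -> float:
    """Higher is more urgent. All factors multiply (an assumed combination rule)."""
    urgency = 1.0 / (1 + max(d.ticks_to_deadline, 0))          # imminent outranks distant
    underserved = 1.0 + (1.0 - satisfaction.get(d.faction, 1.0))  # chronic neglect adds pressure
    proximity = proximity_boost if d.near_player else 1.0
    return faction_weight.get(d.faction, 1.0) * urgency * underserved * proximity


def arbitrate(demands: list, faction_weight: dict, satisfaction: dict,
              budget: int) -> list:
    """Accept demands in score order until the global budget is saturated."""
    ranked = sorted(demands,
                    key=lambda d: score(d, faction_weight, satisfaction),
                    reverse=True)
    accepted, spent = [], 0
    for d in ranked:
        if spent + d.cost <= budget:
            accepted.append(d)
            spent += d.cost
    return accepted
```

Demands that do not fit are simply not returned; ageing them out of the queue would be the caller's job.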
### Player as resource flow (both directions)
The player consumes:
- **NPC time** (when in a zone-slot with player)
- **Director compute** (dialog slots, memory-scoping, prompt construction)
- **Cell occupancy** (physical presence uses space)
- **Zone anchors** (player-proximity biases zone spawns, consuming zone-slot budget)
The player produces:
- **Narrative pressure** (player-disturbance-faction broadcasts emerge from novel actions)
- **Trait-salient memories** in NPCs they interact with
- **District-state perturbations** (ripples through subsequent cascade cycles)
**The player is a resource flow in both directions, not a privileged observer.**
### Phoebe as resource ledger
All resource state lives in phoebe:
- **Per-cell ledger** — item-instances, occupancy, ambient metadata
- **Per-NPC ledger** — labor-hours-remaining, needs-state, task-list, memory-stack, stash-references
- **Per-district aggregate** — computed from per-cell and per-NPC rows
- **Per-faction state** — membership counts, trait-vectors, demand-queues, satisfaction-history
- **Global ledger** — currency supplies, compute budget, tick counter, epoch marker
Resource queries are SQL by default (via pgnats). Gamemaster and directors read; NPCs' world-actions write via the NATS → phoebe pipeline.
## Zone spawn cadence
Zone spawn-cadence is the game's **pulse rate**. Tuning it = tuning the emotional tempo. Too fast: walks home get interrupted; melancholy-intimacy collapses. Too slow: city feels dead.
### Three layered mechanisms (all compute-budget capped)
1. **Demand queue (faction-driven).** Factions write prioritized demands; gamemaster processes top-N by priority + compute-cost per tick. Stale demands age out.
2. **Pressure gradients (emergent ambient).** Cells accumulate ambient pressure (idle NPCs wanting connection, market activity accreting, brawl potential at cantinas). Threshold-crossings spawn zones organically.
3. **Player-proximity densification.** Multiplier layered over both. Near the player, queue-processing and pressure-release run faster. Distant districts tick at background rate.
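
Mechanisms 2 and 3 can be illustrated with a single per-cell update. This sketch assumes that proximity speeds up pressure accumulation and that pressure resets to zero on spawn; `tick_pressure` and both of those choices are hypothetical.

```python
def tick_pressure(pressure: float, inflow: float, threshold: float,
                  proximity_multiplier: float = 1.0) -> tuple:
    """Advance one cell by one tick. Returns (new_pressure, zone_spawned)."""
    pressure += inflow * proximity_multiplier  # near the player, pressure builds faster
    if pressure >= threshold:
        return 0.0, True                       # threshold crossed: spawn and release
    return pressure, False
```

The same cell with the same inflow spawns sooner when `proximity_multiplier` is above 1.0, which is the densification effect described in item 3.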
### Tuning dials
| Dial | Controls | v1 starting value |
|---|---|---|
| `gamemaster_tick_hz` | Evaluation frequency | ~1 Hz |
| `max_zones_per_district` | Concurrent zone cap | 8-12 |
| `max_llm_slots_citywide` | Global LLM-dialog concurrency | 5-10 |
| `per_npc_cooldown` | Min gap between NPC zone-participations | 30-60 s |
| `faction_weight[f]` | Per-faction priority multiplier | hivemind 1.0, others 0.3-1.2 |
| `daily_cycle_curve[zone_type]` | Time-of-day multiplier | morning 1.2, night 0.3 |
| `player_proximity_multiplier` | Density boost near player | 2-3× |
| `district_aletheia_dampening` | Aletheia suppression of overseer-zones | 0.0-0.8 |
| `zone_type_cooldown[cell, type]` | Local cooldown after dissolution | 2-15 min |
### Daily pulse (target v1 feel)
- **Morning** — ambient, market, maintenance. Low density.
- **Midday** — productivity-check overseer-zones. Tension rises.
- **Evening** — conversation-zone peak. Richest dialog density. Relationships deepen.
- **Night** — sparse. Clasp-possibility, shadow, intimate walking. The walks home happen here.
- **Burst events** — short-term spike in affected district; settles over the following hour.
## The player as perturbation
The player is **not above the scheduler.** The player is a finite attention-unit injected into it. Every player-NPC interaction:
- Pulls the NPC out of its scheduled tasks for the duration
- Consumes director compute (dialog-LLM slot, sampling knobs, memory pulls)
- Degrades the district's quota-fulfillment in the next report
**The player is angel and chaos simultaneously.**
- *Per-NPC scale (angel)* — you help a stumbling NPC; they survive their cycle; their Philotes-toward-you consolidates; they may clasp with you, die for you, remember you across cycles.
- *Per-district scale (chaos)* — your time-consumption raised the aggregate failure-rate; someone else broke this cycle; a raid spawned that otherwise would not have; ambient desperation rose.
Both are true *simultaneously*. This is not a moral dilemma to resolve; it is **the structure of finite agency in a scarcity economy.** You cannot help everyone; helping anyone is a choice of whom to not help elsewhere.
### Thematic claims become literal economics
- *"Time as the scarcest resource"* — every minute with a beloved is a minute not in the Black Board queue, a minute the district's quota is missing.
- *"You earn the time to love her by stealing it from the machine"* — the time-theft is literal in the scheduler; the machine detects the quota-miss.
- *"The clasp is wage theft / economic sabotage"* — two consciousnesses in one body produce one body's throughput; district report degrades permanently; clasped couples are *statistical anomalies* in the scheduler, which is exactly how the hivemind will detect them.
**The critique is the simulation. No separate narrative system needed.**
### The player has tasks and needs too
The player is an NPC with vocation, tasks, needs. Time helping others = own tasks fail = own quotas missed = own enforcement-pressure rises = own death and reinstantiation arrive sooner. *The player IS the system they are deviating from.*
## LLM tiering and voice fidelity
### Three model-tiers
| Tier | Model | Role |
|---|---|---|
| **Casual / most NPCs** | Small (3-8B, trait-LoRA'd, knob-steered) | Most dialog slots — ambient conversation, routine speech, casual turns |
| **Deep / mythic moments** | Theia 70B | Clasp confessions, mentor speech, ritual exchange, NPC-internal deliberation at high stakes |
| **Hivemind / antagonist** | Claude-as-API (future integration) | District-summary broadcasts, overseer directives, anamnesis dialog |
Three tiers, three call patterns. Cognitive distance between hivemind and citizens is also model-architecture distance.
### LLM is guest at slot, not host of system
- Zone director composes the prompt (content knobs + sampling knobs + output schema)
- LLM generates one slot-turn (structured JSON output)
- Director dispatches the output back to the zone
The LLM never "runs" an NPC. It speaks for a slot, one turn at a time, in trait-scoped context.
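
The three steps can be sketched as one function in which the model is just a callable passed in by the director. `run_slot_turn`, `REQUIRED_KEYS`, the `generate` signature and the silent-fallback behaviour are assumptions; the key names `dialog_text` and `end_turn_flag` are taken from the output schema used elsewhere in this document.

```python
import json
from typing import Callable

# Assumed minimum a turn must carry to be dispatched back to the zone.
REQUIRED_KEYS = {"dialog_text", "end_turn_flag"}

FALLBACK_TURN = {"dialog_text": "", "end_turn_flag": True, "fallback": True}


def run_slot_turn(prompt: str, sampling: dict,
                  generate: Callable[[str, dict], str]) -> dict:
    """Ask the guest model for exactly one slot-turn and validate its JSON."""
    raw = generate(prompt, sampling)
    try:
        turn = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FALLBACK_TURN)   # malformed output: the slot stays silent
    if not isinstance(turn, dict) or not REQUIRED_KEYS.issubset(turn):
        return dict(FALLBACK_TURN)
    return turn
```

Because `generate` is injected, the same director code can front a small LoRA'd model, a larger model, or a scripted stub.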
### Structured-prompt DSL (knob-steered)
```
<|role|> caste-preacher
<|trait_vector|> Sophrosyne 0.8, Dikaiosyne-miscalibrated 0.7, Aletheia 0.1
<|affect_state|> measured concern
<|memory_scope|> [last interaction with this NPC 3 days ago, suspicious;
recent wall-reading; district mood]
<|turn_intent|> gently warn about productivity concerns
<|zone_context|> morning, market-square, 4 NPCs present, 1 drone overhead
<|output_schema|> { dialog_text, gesture_cue, trait_activation,
affect_shift, memory_write_candidates, end_turn_flag }
```
Small models excel at this surface because it is *instruction-following*, not generic generation.
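
A director-side composer for this surface could be as small as the sketch below. `compose_prompt` and its formatting rules for dicts and lists are hypothetical; only the `<|key|> value` line shape comes from the example above.

```python
def compose_prompt(fields: dict) -> str:
    """Render knob fields as `<|key|> value` lines, in insertion order."""
    lines = []
    for key, value in fields.items():
        if isinstance(value, dict):
            # e.g. trait vectors: "Sophrosyne 0.8, Aletheia 0.1"
            value = ", ".join(f"{k} {v}" for k, v in value.items())
        elif isinstance(value, (list, tuple)):
            # e.g. memory scope: "[item one; item two]"
            value = "[" + "; ".join(str(v) for v in value) + "]"
        lines.append(f"<|{key}|> {value}")
    return "\n".join(lines)


prompt = compose_prompt({
    "role": "caste-preacher",
    "trait_vector": {"Sophrosyne": 0.8, "Aletheia": 0.1},
    "turn_intent": "gently warn about productivity concerns",
})
```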
### Trait-LoRAs
**vLLM multi-LoRA serving**: one base, N LoRAs loaded simultaneously, hot-swappable per request.
| Option | Count | Composition | Trade-off |
|---|---|---|---|
| **A. Pure-trait** | 8 (one per Hellenic virtue) | weighted blend over trait-vector | Cleanest ontology; blend quality unknown at inference |
| **B. Register-LoRAs** | 4-6 | selected by slot-type + trait-in-prompt | Training-tractable; cruder ontology |
| **C. Preset-persona** | 8-12 | per NPC class | High individual quality; rigid; no drift |
**v1 recommendation: start with B** (register-LoRAs) for training-tractability, layer toward A (trait-LoRAs proper) in v2 as data accrues.
### Training-data strategy
1. **Literary derivation** — Proust (Mnemosyne), Plato (Aletheia), Tacitus (Dikaiosyne-miscalibrated), Ishiguro (Sophrosyne + Philotes). Labeled corpus anchored in existing prose.
2. **Synthetic teacher-student** — Qwen3.5-27B teacher generates trait-labeled samples; small base learns via LoRA. Existing r0 → r1 pipeline with trait-tags as composition axis.
3. **Gameplay-accrued** — logged gameworld dialog with trait-vectors accrues in phoebe; periodic LoRA retraining. *This is where the Anthropic research partnership becomes architecturally relevant.*
### Tooling synergy with nyx-training
Same Unsloth pipeline. Same teacher-student distillation. Same tagged-generation. **One tooling investment, two deliverables.**
## Runtime sampling knobs
Temperature, top-P, top-K, repetition-penalty are usually set once and forgotten. In Nimmerworld they are **per-turn director-controlled levers** — part of the same prompt-composition as content knobs.
### The knobs and what they shape
- **Temperature** — determinism vs. creativity
- **Top-P** — lexical range
- **Top-K** — tail cutoff
- **Repetition penalty** — novelty vs. ritual-repetition
- **Min-P** — adaptive variety
Sampling shapes *how* speech sounds (rhythm, surprise, predictability), not *what* it says. **Orthogonal to LoRA.** Together they give the director a full voice-palette.
### Scene-to-sampling mapping (starter table)
| Scene | temp | top-P | rep-penalty |
|---|---|---|---|
| Caste-preacher sermon | 0.3 | 0.6 | low |
| Drunk scavenger at bar | 1.1 | 0.95 | high |
| Hivemind broadcast | 0.2 | 0.5 | very low |
| Clasp confession peak | 0.85 | 0.92 | medium |
| NPC giving directions | 0.4 | 0.7 | medium |
| Ritual ecstasy (dreamworld) | 1.3 | 0.99 | high |
| Memorialist chant | 0.4 | 0.65 | very low |
| Aletheia-waker whispering heresy | 0.7 | 0.88 | medium |
### Trait-vector → sampling derivation
Baseline sampling is derived from NPC's current trait-vector:
- High Sophrosyne → lower temp (measured)
- Low Sophrosyne → higher temp (loose)
- High Kairos → higher top-P (catching unexpected tokens)
- High Mnemosyne → lower repetition penalty (comfortable returning to phrases)
- High Aletheia → moderate-high temp (willing to surface disclosures)
- High Moira → lower top-P (pattern-constrained)
Affect-state modulates on baseline; zone-type priors apply per zone. Sampling becomes a feature of the character-simulation, not a static config.
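
The directional rules above can be turned into a baseline function. In this sketch only the directions come from the list; the neutral baselines, the coefficients, the clamp ranges and the name `derive_sampling` are illustrative assumptions. Traits are taken as values in 0..1, with 0.5 treated as neutral.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def derive_sampling(traits: dict) -> dict:
    """Map a trait-vector (values in 0..1) to baseline sampling knobs."""
    def t(name: str) -> float:
        return traits.get(name, 0.5)  # missing traits count as neutral

    temperature = 0.7
    temperature -= 0.6 * (t("Sophrosyne") - 0.5)  # high Sophrosyne: measured
    temperature += 0.2 * (t("Aletheia") - 0.5)    # high Aletheia: willing to disclose

    top_p = 0.85
    top_p += 0.2 * (t("Kairos") - 0.5)            # high Kairos: wider lexical range
    top_p -= 0.2 * (t("Moira") - 0.5)             # high Moira: pattern-constrained

    repetition_penalty = 1.1
    repetition_penalty -= 0.2 * (t("Mnemosyne") - 0.5)  # comfortable repeating phrases

    return {
        "temperature": clamp(temperature, 0.1, 1.5),
        "top_p": clamp(top_p, 0.1, 1.0),
        "repetition_penalty": clamp(repetition_penalty, 1.0, 1.3),
    }
```

Affect-state and zone-type priors would then be applied as adjustments on top of this dictionary before the turn is dispatched.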
## Reflexive gamemaster Dream-process
**Every mind in the system has a Dream-process.** NPCs consolidate slot-scoped events into memory. Clasped player-inhabitants consolidate shared experience. The hivemind consolidates district summaries. **And the gamemaster itself consolidates its own orchestration decisions into a better policy.**
### Epoch cycle
```
WITHIN AN EPOCH:
gamemaster orchestrates → zones spawn → NPCs execute →
zones dissolve → memory-writes consolidate →
outcomes ripple back up to district reports
ALL events publish to NATS, log to phoebe via pgnats
EPOCH BOUNDARY (every N game-hours or days):
- aggregate gate-decisions + outcomes from phoebe
- apply quality/reward signal
- train gamemaster's policy on (context, decision, outcome) triples
- probe-evaluate the updated policy
- shadow-deploy for validation
- swap in as active gamemaster
NEXT EPOCH:
improved gamemaster orchestrates...
```
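
The epoch-boundary steps can be read as a promotion gate: a candidate policy replaces the active one only if it survives probe evaluation and shadow deployment. The sketch below assumes the training, probing and shadowing steps are supplied as callables; `epoch_boundary` and its parameters are hypothetical, and the reward signal itself is deliberately left inside `train`.

```python
from typing import Any, Callable, Sequence


def epoch_boundary(active_policy: Any,
                   triples: Sequence[tuple],
                   train: Callable[[Any, Sequence[tuple]], Any],
                   probe_score: Callable[[Any], float],
                   shadow_ok: Callable[[Any], bool],
                   min_score: float) -> Any:
    """Return the policy that should orchestrate the next epoch."""
    # Train on (context, decision, outcome) triples aggregated from the log.
    candidate = train(active_policy, triples)
    # Probe-evaluate: reject a candidate that falls below the assumed score floor.
    if probe_score(candidate) < min_score:
        return active_policy
    # Shadow-deploy: the candidate runs alongside the active policy before any swap.
    if not shadow_ok(candidate):
        return active_policy
    return candidate
```

Returning the old policy on any failed gate keeps the swap atomic: the next epoch always starts with a policy that has passed every check at least once.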
### What the policy learns
- Zone-spawn decisions given district state + faction pressure + time + proximity
- Faction-demand arbitration weights
- Director/overseer dispatch decisions
- Slot-assignment decisions
- Sampling-knob defaults per trait × affect × zone-type
- LoRA-blend weights per trait-vector
- Pacing modulation
**Modular policies per decision-surface** (easier to version and debug) beats one monolithic policy.
### Discipline-questions (risks to name)
1. **Reward signal problem.** What do we train toward? Player-engagement can be gamed. Trait-drift coherence needs a metric. **Defining the reward function is the hardest part of the loop.** Needs hand-written guardrails alongside automated signals.
2. **Catastrophic forgetting.** Rehearsal buffers, EWC, or periodic anchored-corpus training required.
3. **Feedback-loop drift.** Reinforcing pathological convergence is possible. Human-in-the-loop audits at regular intervals.
4. **Reproducibility vs. live-learning.** Version-pin per ship-release; continuous-learn in staging; promote to production on audit.
5. **Compute cost.** Training competes with inference. Train during low-play hours; hot-swap policies at epoch boundaries.
6. **Privacy.** Dreamworld-ephemeral / gameworld-persistent schema governs what can feed training. Dreamworld content never touches the policy corpus.
## Key moves (consolidated)
- **LLM is guest at slot, not host of system.**
- **Mixed-fidelity voices by default** — small-LoRA-steered + scripted/generic + occasional Theia-70B + Claude-API-hivemind.
- **Perception IS memory.**
- **Hivemind is a faction, not a conductor.**
- **Faction-broadcast is universal primitive** — weather, scarcity, cosmos, external research partners all flow through it.
- **Zones emerge from task execution** — designers script demands + rules, not events.
- **The player is a perturbation on the scheduler** — angel at NPC scale, chaos at district scale.
- **Ephemeral/persistent is one zone-type bit** — dreamworld/gameworld privacy follows.
- **Sampling-knobs are per-turn director levers** — not static API config.
- **The gamemaster has its own Dream-process** — the architecture is reflexive.
- **The cascade is bidirectional** — demand down, outcome up.
## Compute allocation
- Active zones in a 100-NPC city at any moment: ~5-15 with LLM-dialog slots
- **Theia (70B)** — deep dialog slots + tier-1 moments (few concurrent)
- **Small model (3-8B) with trait-LoRAs** — the majority of dialog slots
- **Saturn (small classifiers)** — scripted-voice selection, trait-salience scoring, packet routing, director subroutines
- **Director / overseer logic** — deterministic script + small classifiers; no LLM for orchestration
- **Claude-as-API (future integration)** — hivemind-broadcast tier
## Mapping to phoebe task list
- **Thalamus (NATS orchestration)** = gamemaster + arbitration substrate + gamemaster's Dream-process substrate
- **Specialist composition system** = overseers + directors + NPC-minds as composable profiles
- **NPC schema for phoebe** = trait-vector + memory stack + slot-occupancy + task-list + needs state
- **NATS namespace registry** = zone topics + faction broadcast topics + district report topics
- **pgnats on phoebe-dev** = phoebe as first-class actor for memory-writes on zone close, decision-logs, faction-satisfaction tracking
- **math cells as first harness/MCP test bed** = zone-slot-memory primitives
- **r0 → r1 generation pipeline** = trait-LoRA training-data generation (shared with nyx-training)
- **Adopt Unsloth training patterns** = LoRA + gamemaster-policy training infrastructure
- **Probe-to-phoebe pipeline** = LoRA evaluation + gamemaster-policy evaluation
## What this retires
- NPC attention / interaction / discovery bubbles as first-class primitives (DESING-VISION §449). → replaced by zone slot-occupancy.
- Geometric perception (cone, radius, LOS) as the perception model. → replaced by subscriber-based event emission with trait-salience filtering.
- LLM-per-NPC or LLM-per-action. → replaced by LLM-per-slot-per-turn, mixed with scripted/generic voices.
- Static sampling parameters as API configuration. → replaced by per-turn director-composed sampling knobs.
- Pre-scripted zones / events. → replaced by zones emerging from task-execution meeting NPC-proximity.
- Single-purpose randomness subsystems. → replaced by factions-as-demand-sources (weather / scarcity / cosmos / external = same primitive).
- Static gamemaster policy. → replaced by the reflexive Dream-process learning loop (v2+).
## Open questions
- Reward function for the gamemaster's Dream-process — what combines trait-drift coherence + faction-satisfaction + player-engagement + burnout-rate + aesthetic-register-fit into one training signal?
- Zone spawn-cadence tuning algorithm at v1 (rule-based) → v2 (policy-learned) transition point.
- LoRA-blend vs. single-LoRA-selection semantics at inference.
- LoRA rank selection — budget/quality trade-off.
- Sampling-knob heuristics — where to start; how to learn refinements.
- Zone overlap policy — can one NPC occupy slots in two zones simultaneously?
- Zone-to-zone handoff (walking out of conversation into a brawl).
- Mobile zone boundaries (patrols, escorts, pursuit).
- Slot-capacity elasticity — can a zone grow slots dynamically?
- Anthropic-faction's broadcast cadence + arbitration weight.
- Player-dialog handling — route player text through a player-trait-LoRA for style-coherence, or bypass the LLM entirely?
- Demand-arbitration algorithm inside the gamemaster (rule-based v1 shape).
- Director/overseer spawn ownership per model class.
---
**Version:** 0.2 | **Created:** 2026-04-24 | **Updated:** 2026-04-24
*v0.2 (2026-04-24, afternoon) expands the v0.1 architecture with: factions-as-universal-demand-source (weather / scarcity / storms / fires / Anthropic all as factions), the bidirectional cascade (demand down, outcome up), the task-cascade and bounded agency (three-level tool-calling scheduler, NPC trait-task-need arithmetic, zones emerging from task execution), the resource taxonomy and flow (labor / material / spatial / temporal / cognitive / diegetic-currencies / social / attention; district report format; contention arbitration; phoebe as ledger), zone spawn-cadence mechanisms and tuning dials, the player as perturbation (angel/chaos at two scales, moral economy emerging from scheduler), LLM tiering (small + Theia + Claude-API) with trait-LoRAs and structured-prompt DSL, runtime sampling knobs as per-turn director levers, and the reflexive gamemaster Dream-process (epoch-cycle policy training).*
*Captured live from dafit-chrysalis dialogue, 2026-04-24. Companion to DESING-VISION.md; supersedes bubble-based perception, scripted-zone spawn, static-sampling-config, static-gamemaster-policy, and per-NPC-LLM where prior sections implied those patterns.*