# Scale and Transport
> *Running the architecture at MMO size: compute-allocation budgets across model-tiers, horizontal-scale primitives (UID-keyed routing, stateless Compositors, ephemeral Director-routines, sharded GMs, pruning at every layer), pgnats-native transport with the JetStream republish + replay refinement, district-distribution fallback.*
>
> *Companion to: `architecture-index.md` (executive summary + global meta-lists), `narrative-composition/architecture.md` (Compositor is the load-bearing horizontal-scale actor), `inference-and-memory/architecture.md` (local-first memory architecture sits on this transport substrate). Sections in this file were split from the monolithic architecture-index.md v0.7 on 2026-04-26.*
## Compute allocation
- Active zones in a 100-NPC city: ~5–15 with LLM-dialog slots
- **Theia-tier (deep model)** — deep slots + tier-1 moments (few concurrent)
- **Driver-tier (small model with trait-LoRAs)** — majority of dialog slots
- **Saturn (small classifiers)** — voice-selection, trait-salience, audit-overseer classification, ternary-gate dynamics
- **Director / overseer logic** — deterministic + small classifiers; no LLM for orchestration
- **Claude-as-API (future)** — hivemind/imperium broadcast tier
- **Outer rails** — graph-pathfinding, cheap, LOD-trivial
- **Pipe / off-shift NPCs** — sparse simulation, event-driven scale-up
- **Interior navmesh** — only currently-occupied interiors active
- **Liminal / imperial-net rendering** — shader-preset swap, no geometry duplication
## Horizontal scale architecture
The architecture must scale to MMO size — many concurrent players across many districts, many concurrent events, many local LLMs firing at axis-rate. Vertically-scaled monolithic AI-NPC systems break under this load. **Nimmerworld is built horizontally-scalable from the ground up**, with the primitives that make horizontal scale work: UID-keyed routing, stateless workers, ephemeral actors per scope, sharded service mesh, pruning-on-completion at every layer.
### UID-keyed routing as the load-bearing primitive
Hierarchical UIDs (`gm_event_uid > district_uid > scene_sub_uid > slot_id`) carry enough information for *any* worker to pick up *any* unit of work without shared in-memory state. UID-as-routing-key is what lets every layer scale independently.
This is the same primitive that Cassandra/DynamoDB/etc. built around (partition keys); we re-derive it for narrative composition because the underlying constraint is the same — many concurrent actors operating on private state with cross-actor coordination via typed events at known boundaries.
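As a concrete illustration of UID-as-routing-key, a minimal Go sketch follows. The `EventUID` fields and the subject layout (mirroring the transport section below) are illustrative assumptions, not the project's actual schema.
```go
// Hypothetical sketch: hierarchical UIDs as routing keys, rendered as NATS
// subjects so any worker can claim any unit of work without shared state.
package routing

import (
	"fmt"
	"strings"
)

type EventUID struct {
	GMEventUID  string // e.g. "7af2"
	DistrictUID string // e.g. "9c1"
	SceneSubUID string // e.g. "3e8"
	SlotID      string // e.g. "slot-04"
}

// Subject renders the UID hierarchy as a subject, mirroring
// nimmerverse.events.<gm_event_uid>.district.<district_uid>.scene.<scene_uid>.
func (u EventUID) Subject() string {
	return fmt.Sprintf("nimmerverse.events.%s.district.%s.scene.%s.slot.%s",
		u.GMEventUID, u.DistrictUID, u.SceneSubUID, u.SlotID)
}

// ParseSubject recovers the UID from a subject, so a stateless worker can
// pick up work knowing only the routing key it was handed.
func ParseSubject(subject string) (EventUID, error) {
	parts := strings.Split(subject, ".")
	if len(parts) != 9 || parts[0] != "nimmerverse" || parts[1] != "events" {
		return EventUID{}, fmt.Errorf("unexpected subject shape: %s", subject)
	}
	return EventUID{
		GMEventUID:  parts[2],
		DistrictUID: parts[4],
		SceneSubUID: parts[6],
		SlotID:      parts[8],
	}, nil
}
```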
### Compositors on demand
Compositor instances are **stateless workers**. Any compositor can pick up any `event_uid` from the transient-waiting-flag — state lives in phoebe + per-player SQLites; workers are pure functions of `(event_uid → composition)`. Spin up more on queue-depth; spin down on drain. This is the autoscaling-worker pattern (Celery / Lambda / Sidekiq) applied to narrative.
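A minimal Go sketch of that worker shape, assuming hypothetical stand-ins: `FetchWaitingEventUID` for the transient-waiting-flag read, `Compose` for the composition step, and `WriteCanon` for the phoebe back-write.
```go
// Hypothetical sketch of the stateless-compositor loop: a worker is a pure
// function of (event_uid -> composition); any instance can pick up any UID.
package compositor

import (
	"context"
	"log"
	"time"
)

type Store interface {
	FetchWaitingEventUID(ctx context.Context) (string, bool, error) // pop one flagged event, if any
	WriteCanon(ctx context.Context, eventUID string, canon []byte) error
}

type ComposeFn func(ctx context.Context, eventUID string) ([]byte, error)

// Run drains the transient-waiting-flag until the context is cancelled.
// Spin up more copies of Run on queue depth; spin them down on drain.
func Run(ctx context.Context, store Store, compose ComposeFn) {
	for {
		select {
		case <-ctx.Done():
			return
		default:
		}
		uid, ok, err := store.FetchWaitingEventUID(ctx)
		if err != nil {
			log.Printf("fetch: %v", err)
			time.Sleep(time.Second)
			continue
		}
		if !ok {
			time.Sleep(250 * time.Millisecond) // queue drained; idle briefly
			continue
		}
		canon, err := compose(ctx, uid)
		if err != nil {
			log.Printf("compose %s: %v", uid, err)
			continue
		}
		if err := store.WriteCanon(ctx, uid, canon); err != nil {
			log.Printf("write canon %s: %v", uid, err)
		}
	}
}
```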
### Directors as ephemeral routines per UID
Each event / event-chain spawns a **director-routine** scoped to that UID. The routine lives only as long as the event lives in the active-register; on completion it prunes. This is the actor model (Erlang / Akka / Orleans) — supervised lightweight processes, thousands concurrent, failure-recovery via supervisor respawn from register-state.
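A minimal Go sketch of the ephemeral-routine lifecycle; the `Registry` type and its methods are hypothetical, and supervisor respawn from register-state is elided.
```go
// Hypothetical sketch of directors as ephemeral routines: one goroutine per
// event UID, tracked in a registry, pruned when the event completes.
package director

import (
	"context"
	"sync"
)

type Registry struct {
	mu      sync.Mutex
	cancels map[string]context.CancelFunc
}

func NewRegistry() *Registry {
	return &Registry{cancels: make(map[string]context.CancelFunc)}
}

// Spawn starts a director routine scoped to eventUID; run carries the
// event-specific orchestration and returns when the event completes.
func (r *Registry) Spawn(parent context.Context, eventUID string, run func(ctx context.Context)) {
	ctx, cancel := context.WithCancel(parent)
	r.mu.Lock()
	r.cancels[eventUID] = cancel
	r.mu.Unlock()

	go func() {
		defer r.Prune(eventUID) // prune on completion, whatever the outcome
		run(ctx)
	}()
}

// Prune releases the routine's context and drops its registry entry.
func (r *Registry) Prune(eventUID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if cancel, ok := r.cancels[eventUID]; ok {
		cancel()
		delete(r.cancels, eventUID)
	}
}
```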
### Sharded GMs
A single GM is a scalability bottleneck; the GM tier therefore shards across:
- **Geography** — GM-per-region/continent/world-zone
- **Theme** — political-narratives / economic-narratives / personal-narratives
- **Tier** — major-arcs vs everyday-events
- **Faction** — each major faction has its own GM-shard
GM-shards share the catalogue, the trait-axis vocabulary, the verifier-flag system. They communicate via NATS pub/sub at GM-tier. *Equilibrium-seeking becomes a distributed consensus across GM-shards* — the same epistemological shift that distributed databases went through (CAP theorem, eventual consistency, partition tolerance).
This is the most architecturally aggressive move and is *not* a free lunch. Multi-GM consensus on equilibrium is real engineering: Paxos/Raft for strong consistency, CRDT-style merges for eventual consistency, or faction-sharding so that each shard owns a disjoint slice of the equilibrium. This is well-trodden ground, but the cost is not zero.
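A hedged sketch of the cross-shard channel, assuming a `nimmerverse.gm.equilibrium` subject and a JSON proposal payload (both illustrative); the consensus step itself is deliberately left open.
```go
// Hypothetical sketch of GM-shard coordination over NATS pub/sub: each shard
// publishes equilibrium proposals on a gm-tier subject and hears its peers.
// Subject name, payload shape, and the actual consensus protocol
// (Raft / CRDT / faction ownership) are all assumptions here.
package gmshard

import (
	"encoding/json"
	"log"

	"github.com/nats-io/nats.go"
)

type EquilibriumProposal struct {
	ShardID string  `json:"shard_id"`
	Axis    string  `json:"axis"`   // e.g. "political"
	Target  float64 `json:"target"` // proposed equilibrium value
}

func ListenAndPropose(nc *nats.Conn, shardID string) error {
	// Receive peers' proposals on the shared gm-tier subject.
	_, err := nc.Subscribe("nimmerverse.gm.equilibrium", func(m *nats.Msg) {
		var p EquilibriumProposal
		if err := json.Unmarshal(m.Data, &p); err != nil || p.ShardID == shardID {
			return // ignore malformed messages and our own echoes
		}
		log.Printf("shard %s saw proposal from %s: %s -> %.2f", shardID, p.ShardID, p.Axis, p.Target)
	})
	if err != nil {
		return err
	}
	// Publish this shard's current proposal.
	data, _ := json.Marshal(EquilibriumProposal{ShardID: shardID, Axis: "political", Target: 0.4})
	return nc.Publish("nimmerverse.gm.equilibrium", data)
}
```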
### Pruning as cleanup
Each layer prunes on completion: directors prune at event-end, register entries prune on Compositor pickup, transient-waiting-flag drains at cycle, even player memory prunes by class. **Garbage collection is a first-class structural concern**, not an afterthought. Most "smart NPC" prototypes accumulate state forever; at MMO scale that becomes unrunnable in months.
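As one example of prune-by-class, a hedged sketch against an already-open per-player SQLite handle; the table name, memory-class names, and retention windows are assumptions for illustration.
```go
// Hypothetical sketch of prune-by-class in a per-player SQLite: each memory
// class gets its own retention window and expired rows are deleted on a cycle.
package memory

import (
	"database/sql"
	"time"
)

// retention maps memory class to how long rows of that class are kept.
var retention = map[string]time.Duration{
	"ambient":  24 * time.Hour,      // throwaway color, short-lived
	"episodic": 30 * 24 * time.Hour, // scene-level memories
	"core":     0,                   // never pruned
}

// PruneMemories runs once per cleanup cycle against an already-open
// per-player SQLite database.
func PruneMemories(db *sql.DB, now time.Time) error {
	for class, ttl := range retention {
		if ttl == 0 {
			continue
		}
		cutoff := now.Add(-ttl).Unix()
		if _, err := db.Exec(
			`DELETE FROM memories WHERE class = ? AND created_at < ?`,
			class, cutoff,
		); err != nil {
			return err
		}
	}
	return nil
}
```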
### Transport — pgnats native, with district-distribution fallback
The Compositor's forward-prop and back-write traffic — and the canon distribution to participants — is the highest-volume path in the system. Two transport options exist.
**Option A — pgnats native serialization (preferred):**
```
COMPOSITOR (writes SQL into phoebe)
│ pgnats: SQL INSERT/UPDATE → NATS subject publish, automatic, transactional
│ subject hierarchy mirrors UID hierarchy:
│ nimmerverse.events.<gm_event_uid>.district.<district_uid>.scene.<scene_uid>
NATS JetStream (durable, replay-capable, at-least-once)
│ subscribers route by subject pattern (wildcards supported)
PLAYER CLIENT (NATS subscriber)
│ receives row-shaped messages, INSERT into local SQLite under matching event_uid
```
**Why preferred:** transactional-outbox pattern *native to the database* — no separate publisher service, no schema-duplication between wire format and storage format, single source of truth. Subject-as-routing using UID hierarchy means players who participated in `gm_event` 7af2 → `district` 9c1 → `scene` 3e8 subscribe to `nimmerverse.events.7af2.>` and receive *only* relevant events. *No application-level routing logic.*
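A hedged Go sketch of the player-client end of Option A, assuming a JSON row-shaped payload and a generic local `canon` table (both illustrative; the actual pgnats wire format is what the evaluation task will pin down).
```go
// Hypothetical sketch: subscribe to the UID-scoped subject and apply
// row-shaped messages into the local, already-open SQLite under the
// matching event_uid. Payload shape and table are assumptions.
package playerclient

import (
	"database/sql"
	"encoding/json"
	"log"

	"github.com/nats-io/nats.go"
)

type CanonRow struct {
	EventUID string          `json:"event_uid"`
	Table    string          `json:"table"`
	Row      json.RawMessage `json:"row"`
}

// SubscribeCanon receives canon for one gm_event the player participated in,
// e.g. SubscribeCanon(nc, db, "7af2") subscribes to nimmerverse.events.7af2.>
func SubscribeCanon(nc *nats.Conn, db *sql.DB, gmEventUID string) (*nats.Subscription, error) {
	subject := "nimmerverse.events." + gmEventUID + ".>"
	return nc.Subscribe(subject, func(m *nats.Msg) {
		var row CanonRow
		if err := json.Unmarshal(m.Data, &row); err != nil {
			log.Printf("bad canon message on %s: %v", m.Subject, err)
			return
		}
		// Store the row verbatim; a local schema mirroring phoebe's keeps
		// the applier generic.
		if _, err := db.Exec(
			`INSERT OR REPLACE INTO canon (event_uid, tbl, row_json) VALUES (?, ?, ?)`,
			row.EventUID, row.Table, string(row.Row)); err != nil {
			log.Printf("apply canon %s: %v", row.EventUID, err)
		}
	})
}
```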
**Option B — district-as-distribution-coordinator (fallback):**
```
COMPOSITOR writes to phoebe (canon authored)
DISTRICT GAMESERVER pulls / receives back-write package
│ runs distribution checks: who participated, who's online, who needs queue-on-login
│ retries; peer-shares with neighboring districts if needed
PARTICIPANTS receive from their district (the authoritative local hub)
```
**Why fallback:** if pgnats can't carry the load (functional bug, scale ceiling, durability gap), district-as-distribution is the natural retreat — district is authoritative within its scope, simpler than full P2P, more flexible than central-push.
**The asymmetry of the bet:**
| | pgnats works | pgnats fails |
|---|---|---|
| Code volume | Few hundred lines of SQL + subject patterns | ~10–20K lines of broker/outbox/subscriber logic |
| Services to operate | phoebe + NATS (already in stack) | + outbox-reader + custom-publisher + custom-applier |
| Schema management | Single source (Postgres DDL) | Postgres + Protobuf/msgpack + version-skew handling |
| Latency to client | Sub-millisecond NATS hop + apply | + serialize step + queue drain + apply |
| Time to ship | Weeks | Months |
**This is the most leveraged engineering decision in the architecture currently open.** The pgnats evaluation task in `nimmerverse_tasks` (under `nimmerverse-core`) is therefore load-bearing; its outcome decides whether transport is 200 lines of SQL or 20,000 lines of Go.
### NATS republish + replay — the pull-from-checkpoint refinement
JetStream's `republish` feature combined with `replay` semantics is a **promising refinement on top of Option A** that turns back-write delivery from push-to-listeners into pull-from-checkpoint. It is worth nailing down concretely in the pgnats evaluation; if it carries our delivery patterns under load, it is a significant win on both performance and complexity.
**How it works:**
- Compositor publishes canon ONCE to the event-keyed subject (e.g., `nimmerverse.events.<gm_event_uid>.scene.<sub_uid>`)
- A `republish` rule on the JetStream stream fans the message out to derived subjects automatically (per-participant, per-faction, per-district — any derived stream-shape we configure)
- Each derived subject has durable JetStream consumers per recipient; recipients replay from their own checkpoint (last-delivered-sequence) when they reconnect
**Why it's faster and simpler:**
- **Compositor never tracks recipients.** Publishes once; NATS republishes to N derived subjects per the configured rule. No application-level fan-out logic; no "for each participant, if online deliver else queue" code-path.
- **Player connectivity is decoupled from delivery.** Offline players don't slow down the publish path; their queue accumulates in JetStream. On reconnect, they replay from checkpoint.
- **Late joiners catch up for free.** New consumer subscribes with replay-from-sequence; gets full history relevant to their subject. No special reconcile code-path.
- **Backpressure is built in.** Slow consumer's queue grows in JetStream — doesn't affect publishers or other consumers. Flow-control, ack-policy, max-bytes/max-msgs limits all native to JetStream.
- **Re-keying is a config change.** Add a new republish rule for a new derived stream-shape (e.g., per-faction canon-feed); no application code changes for the new consumer-tier.
- **Replay = audit-trail for free.** Replay from sequence 0 reconstructs entire canon history. Disaster recovery, debugging, time-travel queries are all free side-effects.
**Architectural effect:**
```
Compositor → publishes canon to event_uid subject (ONCE)
▼ (JetStream republish-rule fan-out — config, not code)
Per-player subjects: nimmerverse.player.<id>.canon
Per-faction subjects: nimmerverse.faction.<name>.canon
Per-district subjects: nimmerverse.district.<id>.canon
Each consumer replays from its own checkpoint when ready
```
The Compositor's mental model collapses to *"I publish to the canonical event-stream and forget"*; the recipient's mental model collapses to *"I replay my own consumer-subject from my last sequence"*. **The delivery problem disappears into NATS.**
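A hedged nats.go sketch of this shape, using a per-district derived subject because the district UID is already a token in the event subject. The stream names, subjects, and the `{{wildcard(n)}}` transform are assumptions that the pgnats evaluation (especially the expressiveness criterion below) has to confirm.
```go
// Hypothetical sketch of republish + replay: publish once to the event
// stream, fan out by config to per-district subjects, replay per consumer
// from its own checkpoint.
package transport

import (
	"log"

	"github.com/nats-io/nats.go"
)

func SetupAndReplay(nc *nats.Conn, districtUID string) error {
	js, err := nc.JetStream()
	if err != nil {
		return err
	}

	// Canonical event stream: the Compositor publishes here ONCE. The
	// republish rule fans each message out to a per-district derived subject.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:     "EVENTS",
		Subjects: []string{"nimmerverse.events.>"},
		RePublish: &nats.RePublish{
			Source:      "nimmerverse.events.*.district.*.scene.>",
			Destination: "nimmerverse.district.{{wildcard(2)}}.canon", // assumed transform syntax
		},
	}); err != nil {
		return err
	}

	// Derived stream capturing the republished per-district subjects, so
	// recipients get durable, replayable queues.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:     "DISTRICT_CANON",
		Subjects: []string{"nimmerverse.district.*.canon"},
	}); err != nil {
		return err
	}

	// Durable consumer per recipient: on reconnect it resumes from its own
	// checkpoint (last acknowledged sequence); a brand-new consumer replays
	// the derived subject from the start.
	_, err = js.Subscribe(
		"nimmerverse.district."+districtUID+".canon",
		func(m *nats.Msg) {
			log.Printf("canon for district %s: %s", districtUID, string(m.Data))
			m.Ack()
		},
		nats.Durable("district-"+districtUID),
		nats.DeliverAll(),
		nats.ManualAck(),
	)
	return err
}
```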
**Specific evaluation criteria** (to add to the pgnats evaluation task):
- Republish rule expressiveness — can we route by UID hierarchy (`events.<gm_uid>.>` → `player.<participant_id>.canon`)?
- Replay performance — what's the cost of a consumer replaying N hours of missed canon on reconnect?
- Durability under broker failure — does the republish-derived subject survive primary-broker loss?
- Schema-evolution behavior — can we add new derived subjects without disrupting existing consumers?
- Cost at scale — disk, memory, file-handles for many durable consumers (one-per-active-player)?
If `republish + replay` carries the load, the back-write transport is *substantially* simpler than the original Option A description and a much harder competitor to beat with Option B.
### What this retires
- Vertically-scaled monolithic AI-NPC backend → horizontally-scaled stateless-workers + ephemeral-actors + sharded-services
- Centralized in-memory event-state → UID-keyed registers + transient-waiting-flag buffer
- Single global GM as bottleneck → sharded GMs with cross-shard equilibrium-consensus
- Manual outbox-reader services → pgnats native serialization (preferred); district-distribution fallback if pgnats can't carry
- "AI-NPC scale" framed as inference budget alone → AI-NPC scale framed as transport + state + composition + sharding, with inference as one of many concurrently-scaling axes
---
**Version:** 0.7.0 | **Created:** 2026-04-26 | **Updated:** 2026-04-26 | **Origin:** Split from architecture-index.md v0.7 (2026-04-26)