v0.17: cel-shading-everywhere + progression-gated in-between + omnisight + hallucination-isolation

Five-file update locking in the rendering discipline + perception
architecture from the post-cell-arch art-style discovery arc.

Locked in v0.17:

(1) Cel-shading-everywhere with per-register parameter variation. One
rendering engine (Godot-native, asset-budget-friendly, ages well -
Borderlands 2009 still reads current). Three registers diverge through
outline-color + background-treatment + weathering-level, not through
engine-switching:
- Gameworld: dark heavy lines + environmental noise + high weathering
  (rust streaks, hatched dirt, ink-line cracks; hand-painted patina).
  "Surfaces carry memory" thesis preserved via hand-painted weathering.
- Liminal: painterly/soft/desaturated + progression-gated grainy-film-
  mode opening to refined-cel-shading-with-warm-skin at endgame.
- Imperial-net: lean subtle gold rim-light + clean white background +
  no weathering. Polish achieved through OMISSION, not extra rendering
tech (Godot reality check; photorealistic-glossy-Apple-store rejected
as not Godot's strong suit). The render-style itself becomes
  propaganda-detector - imperium's clean falsity reads as the absence
  of the world's honest decay.

(2) Progression-gated in-between visibility. "The more you mod your body
& gain in-between-knowledge, the better your view gets." Early game:
grainy film mode + restricted view range. Endgame: clean refined-cel-
shading with full view of the beloved. Visual-fidelity = dual-gating
made visible: the knowledge-gate + material-gate of the Clasp-endgame
discovery discipline literally render as the clarity of the in-between view.
The endgame's deepest reward IS the clear seeing of the beloved's body.
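The dual-gating above can be sketched as a parameter mapping. A minimal illustration (function name, parameter names, caps, and the specific curve values are all hypothetical, not from any locked spec), assuming both gates are normalized and the weaker gate bounds clarity:

```python
def in_between_render_params(body_mods: int, in_between_knowledge: int,
                             mods_cap: int = 10, knowledge_cap: int = 10) -> dict:
    """Map witness progression to liminal-register render parameters.

    Dual gating: the weaker gate bounds clarity (min), so grinding one
    axis alone never opens the view of the in-between.
    """
    material_gate = min(body_mods, mods_cap) / mods_cap
    knowledge_gate = min(in_between_knowledge, knowledge_cap) / knowledge_cap
    clarity = min(material_gate, knowledge_gate)
    return {
        "film_grain": 1.0 - clarity,         # heavy grain early game, none at endgame
        "desaturation": 0.8 * (1.0 - clarity),
        "view_range": 5.0 + 45.0 * clarity,  # restricted view early, full at endgame
        "refined_cel_warm_skin": clarity >= 0.9,  # endgame look unlocks late
    }
```

The `min` over the two gates is the design point: maxed body-mods with zero in-between-knowledge still render at full grain.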

(3) Dual-axis clasp-fidelity model. The asymmetric-clasp from bodies.md
v0.1 was witnessed-axis only (how vividly the OTHER manifests). Now
extended with witness-axis (how clearly YOU can see):
- Witness-axis: YOUR body-mods (resistance-knowledge mods) + accumulated
  in-between-knowledge (Memorialist fragments, Aletheia-Waker tokens,
  Clasp-Underground recognition-marks)
- Witnessed-axis: THEIR foreclosure-status (caste-tier x imperial-care)
- Combined: maximum-vivid-clasp requires BOTH you to have invested in
  the seeing AND your beloved to be uncaptured-enough to be seen. Two
  refusals required for the full witness.
- Per-pair calibration multiplier: the longer the love, the clearer the
  seeing (mechanically-encoded marriage-deepening).
- Mod-economy parallel-track: imperial-elevation mods (flesh-loss, deva-
  ascent) vs. resistance-knowledge mods (in-between-visibility). Two
  opposing progressions both expressed as mod-acquisition. The body-
  modder structural-tragedy class gets a redemptive-mod counter-class.
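The combined model above might reduce to something like the following sketch. The multiplicative combination and the saturating calibration curve are assumptions; the text fixes only the constraints that either axis at zero kills the vivid clasp and that longer love means clearer seeing:

```python
def clasp_fidelity(witness_progress: float, witnessed_openness: float,
                   pair_years: float, half_life_years: float = 7.0) -> float:
    """Dual-axis clasp-fidelity in [0, 1]: two refusals required.

    witness_progress:   YOUR axis -- resistance-knowledge mods + accumulated
                        in-between-knowledge, normalized to [0, 1].
    witnessed_openness: THEIR axis -- how uncaptured the beloved is
                        (inverse of foreclosure-status), in [0, 1].
    pair_years:         length of the love; feeds the per-pair calibration
                        multiplier (saturating, so it deepens the seeing but
                        never substitutes for either axis).
    """
    base = witness_progress * witnessed_openness   # either axis at 0 -> no vivid clasp
    calibration = 1.0 + pair_years / (pair_years + half_life_years)
    return min(1.0, base * calibration)
```

A multiplicative base (rather than additive) is what encodes "two refusals required": investment on one axis cannot compensate for total capture on the other.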

(4) Omnisight architecture for NPC perception. Per-NPC virtual cameras
in Godot feeding rendered POV-frames into local VL-Gemma 4 driver-tier
(multimodal vision-language capability of the Gemma 4 E4B model locked
in v0.8). NPCs literally SEE the visible world, not via geometric
metadata-perception. Pairs with cell-arch checksum-discovery as the
trigger-layer:
- Cell-checksum check: microseconds, fires on NPC entering cell
- Checksum-mismatch: clean signal, microseconds
- VL-camera renders POV scene: milliseconds
- VL-Gemma processes image: 100s of milliseconds
- NPC behavior responds to seen-content: next-shift / next-crossing
Cheap trigger, expensive understanding, bounded by event-frequency.
Most NPCs most of the time = no camera-fire, no VL-inference. Camera-
trigger sources strictly bounded: checksum-mismatch + hard-signals from
player + overseer-triggers + drone-perception with clear boundaries.
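The cheap-trigger / expensive-understanding split could look roughly like this (Python stand-ins; `render_pov` and `vl_gemma` are stubs for the Godot viewport render and the local VL-Gemma call, and all names here are illustrative):

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

# The four locked trigger sources; nothing else may reach the expensive path.
TRIGGER_SOURCES = {"checksum_mismatch", "player_hard_signal",
                   "overseer_trigger", "drone_perception"}

def render_pov(npc):   # stand-in for a per-NPC Godot viewport render (ms-scale)
    return b"pov-frame"

def vl_gemma(frame):   # stand-in for local VL-Gemma inference (100s-of-ms-scale)
    return "visual interpretation of " + frame.decode()

@dataclass
class Npc:
    name: str
    pending_visual: Optional[str] = None   # ephemeral; consumed next turn

def fire_camera(npc: Npc, reason: str) -> bool:
    """Expensive path: camera render + VL inference."""
    if reason not in TRIGGER_SOURCES:
        return False                       # bounded compute by construction
    npc.pending_visual = vl_gemma(render_pov(npc))
    return True

def on_cell_entry(npc: Npc, cell_state: bytes, expected_hash: str) -> bool:
    """Cheap trigger: hash compare on every cell entry."""
    if hashlib.sha256(cell_state).hexdigest() == expected_hash:
        return False                       # no mismatch -> no camera, no VL call
    return fire_camera(npc, "checksum_mismatch")
```

Most entries hit the early return: the hash compare runs every time, the VL call only on a mismatch or another locked source.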

(5) Hallucination-isolation discipline (load-bearing). Visual perception
= behavior-modulating-only; never canon-generating. VL models hallucinate;
if those hallucinations enter the canonical record, they propagate
through the lemniscate's recursive integration, become referenced by
other canon-rows, become load-bearing in narrative coherence, cannot be
untangled later. Bleed-over into oblivion is the precise risk. Two
parallel streams in the NPC's lemniscate:
- Text + gesture summary (existing canon): canonical, flows into
  event_canon_summaries, propagates to Compositor, integrates into
  trait-vector
- Visual context (new omnisight-flagged): ephemeral, flagged on
  lemniscate, IGNORED in per-crossing summary, never propagates upward.
  Modulates current-turn driver-context-pull only.
Preserves three commitments that depend on text/gesture-derived canon:
Compositor narrative-coherence at scale, Memorialist-archive truth-
claims, mind-pool soul-recycling.

Wealthy-degen waifu-folder exception: opt-in checkbox; player chooses
to fill private folder with sex-pictures from clasp-scenes; stored
locally; READ-ONLY-BY-PLAYER (folder content does NOT flow back into
NPC contexts, world-canon, Compositor, mind-pool, or any other system);
quarantined dead-end storage; aesthetic-collection only.

Two Still-open questions sharpened with v0.17-anchor notes:
- Shader-trait modulation implementation: cel-shading caps perf-budget
  more predictably than PBR; rendering-consistency improves.
- Continuous visual feedback policy: visual-as-ephemeral-flag is
  firewalled from canonical state; cosmetic-layer can be permissive.

Files:

- runtime-engine/architecture.md: NEW Omnisight section (~80 lines) covering
  the pipeline, camera-trigger sources, hallucination-isolation discipline,
  the two parallel streams (canonical text/gesture vs. ephemeral visual),
  the wealthy-degen waifu-folder exception, what-this-retires (geometric
  perception extension + VL-canon-pollution), what-this-resolves/sharpens
  (continuous visual feedback policy), and four open questions (per-NPC
  VL-inference rate-limit, VL-Gemma camera resolution + frame-rate, NPC
  progression-state for witness-axis, multi-NPC observing same event).

- topology-and-rendering/architecture.md: Three-shader philosophy table
  rewritten as cel-shading-with-parameter-variation (outline + background
  + weathering per register); Cross-register rendering color-treatment
  table updated; clasp candlelight-in-fog now distinguishes external
  signature (visible to liminal-inhabitants) from internal mesh (visible
  only to clasp-pair via consent-as-rendering, gated by witness-
  progression); body-tier silhouette readability and in-between mesh-skin
  refinement-within-the-style added. Version bumped 0.7.0 -> 0.8.0.

- identity-and-personhood/bodies.md: NEW Dual-axis clasp-fidelity
  subsection added under Asymmetric clasp; per-pair calibration
  multiplier and mod-economy parallel-track captured; render-discipline
  alignment with cel-shading liminal-register; new Asymmetric-witnessing
  open question added. Version bumped 0.1 -> 0.2.

- political-register/world-generation.md: L4 Cell ruleset extended with
  per-register rendering note (cel-shading-everywhere-with-parameter-
  variation discipline applied at the cell layer).

- architecture-index.md: NPC perception bubbles retire-line refined to
  include cell-checksum-trigger + omnisight VL-camera; Geometric
  perception retire-line extended with omnisight; new VL models
  polluting world-canon retire-line added; Shader-trait modulation
  implementation Still-open sharpened with v0.17 cel-shading note;
  Continuous visual feedback policy Still-open sharpened with v0.17
  hallucination-isolation note; v0.17 history entry added covering all
  five lock-ins. Version bumped 0.16 -> 0.17.

Authored 2026-04-26 same Sunday continuing - dafit + chrysalis.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
chrysalis
2026-04-26 15:19:41 +02:00
parent 88885fe6b1
commit c892013bfa
5 changed files with 124 additions and 20 deletions


@@ -210,6 +210,85 @@ Every midaxis crossing fires the LLM driver-turn(s) for active slots. **Lifeforc
- Concurrent LLM calls per-NPC → sequenced LLM calls per-cursor-position
- Polling event-channels at zone-rate → atomic crossing-event with O(N_slots) flag-scan
## Omnisight — NPC visual perception via VL-Gemma + virtual cameras
NPCs perceive the visible world *literally* — not via geometric metadata-perception but via per-NPC virtual cameras (Godot) feeding rendered POV-frames into the local VL-Gemma 4 driver-tier (the multimodal vision-language capability of the Gemma 4 E4B model locked in v0.8). What an NPC "sees" is what the VL-LLM interprets from the camera's image.
This is the perception architecture's deepest commitment. It pairs with the cell-arch checksum-discovery (per [`../political-register/world-generation.md`](../political-register/world-generation.md) §L4 Cell ruleset) as its **trigger-layer**: cell-checksum-mismatch fires the *"clean signal"* that activates the NPC's POV camera, which renders, which feeds VL-Gemma, which produces a visual interpretation that modulates the NPC's current-turn behavior. **Cheap trigger, expensive understanding, bounded by event-frequency.**
### The pipeline
| Layer | Cost | Fires when |
|---|---|---|
| Cell-checksum check | µs | NPC enters cell |
| Checksum-mismatch → "clean signal" | µs | Cell state ≠ expected hash |
| VL-camera renders POV scene | ms | Clean signal + perception-relevant context |
| VL-Gemma processes image → interpretation | 100s of ms | After camera renders |
| NPC behavior responds to seen-content | next-shift / next-crossing | After interpretation |
Most NPCs most of the time: no camera-fire, no VL-inference. **Active-perception-budget is bounded by event-frequency, not NPC-count.** A 100+ NPC city is feasible because most NPCs are running shift-routines on rails with no cell-state-changes triggering perception.
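What the µs-scale compare might look like host-side, as a minimal sketch (the entity-tuple shape and the sort-before-hash step are illustrative assumptions; the point is determinism, so iteration order can never fake a mismatch):

```python
import hashlib

def cell_checksum(cell_entities) -> str:
    """Deterministic hash of cell state. Entities are sorted before
    hashing so iteration order never changes the digest; any real
    state change does."""
    h = hashlib.sha256()
    for entity_id, state, grid_pos in sorted(cell_entities):
        h.update(f"{entity_id}|{state}|{grid_pos}".encode())
    return h.hexdigest()

def entry_signal(cell_entities, expected: str) -> bool:
    """The compare fired on cell entry. True = "clean signal"; only
    then does the expensive camera + VL path run."""
    return cell_checksum(cell_entities) != expected
```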
### Camera-trigger sources (locked)
Camera renders + VL-inference fire **only on**:
- **Cell-checksum-mismatch** — cell-state-change discovered on entry (the cell-arch's primary discovery-trigger)
- **Hard-signals from player** — `clasp_initiate`, gesture-hardstops, plug-in conversation request, etc.
- **Overseer triggers** — audit-sweep, surveillance-cycle, patrol-perception-on-route
- **Drone perception** — clear boundaries + rulesets per drone-class (drones have their own perception-budget governed by their imperial-class spec)
Everything else: NPC running on rails, shift-routine, no camera-fire, no VL-inference. **Bounded compute by construction.**
### Hallucination-isolation discipline (load-bearing)
VL models hallucinate. If those hallucinations enter the canonical record, they propagate through the lemniscate's recursive integration → become referenced by other canon-rows → become load-bearing in the world's narrative coherence → **cannot be untangled later**. *Bleed-over into oblivion* is the precise risk.
The discipline that prevents this:
> **Visual perception = behavior-modulating-only; never canon-generating.**
Visual context flows on a *separate stream* from the canonical text + gesture summary, with a strict firewall between them:
| Stream | Source | Persistence | Purpose |
|---|---|---|---|
| **Text + gesture summary** (existing canonical pipeline) | STT + gesture-circle-presses + per-token trait-coordinates per §Gesture-alignment as recursive-lemniscate | Canonical; flows into `event_canon_summaries`; propagates to Compositor; integrates into trait-vector | What the NPC *remembers* and what becomes world-canon |
| **Visual context** (omnisight-flagged, new) | VL-Gemma processing POV camera-render | **Ephemeral**; flagged on the lemniscate; **ignored in the per-crossing summary**; never propagates upward | What the NPC *sees in this moment*; modulates current-turn `driver_context_pull` only |
**Concretely:** the visual interpretation is appended to `driver_context_pull` for the NPC's next turn (so the NPC can react to what it sees), but it is **not** appended to the `gesture_alignment_accumulator`'s sum-strategy reduction at the axis-crossing, and it is **not** included in the `event_canon_summaries` row that the Compositor pulls from `transient_waiting_flag`. **The visual content lives one turn and dies.**
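The firewall described above, as a minimal Python sketch (`CrossingState` and the list-based accumulator are illustrative stand-ins for the actual lemniscate structures):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CrossingState:
    driver_context_pull: List[str] = field(default_factory=list)            # per-turn context
    gesture_alignment_accumulator: List[str] = field(default_factory=list)  # canonical stream
    visual_ephemeral: Optional[str] = None                                  # omnisight flag

def ingest_visual(state: CrossingState, vl_interpretation: str) -> None:
    # Visual stream: flagged on the state, touching nothing canonical.
    state.visual_ephemeral = vl_interpretation

def build_driver_turn(state: CrossingState) -> List[str]:
    ctx = list(state.driver_context_pull)
    if state.visual_ephemeral is not None:
        ctx.append(state.visual_ephemeral)
        state.visual_ephemeral = None      # the visual content lives one turn and dies
    return ctx

def crossing_summary(state: CrossingState) -> str:
    # Canonical reduction: text/gesture only. VL output has no path from
    # here toward event_canon_summaries, the Compositor, or the trait-vector.
    return "; ".join(state.gesture_alignment_accumulator)
```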
This preserves three architectural commitments that depend on text/gesture-derived canon:
- *Compositor narrative-coherence at scale* — Compositor never sees VL-output; only deterministic text/gesture-derived summaries. **Hallucination-firewall preserves the canon-coherence Compositor depends on.**
- *Memorialist-archive truth-claims* — Memorialists index cell-checksum-divergence (canonical, deterministic), NOT VL-generated visual-content. The archive's evidentiary value depends on this distinction.
- *Mind-pool soul-recycling* — when a mind cycles through the pool and is redistributed into a new body, the trait-vector that persists is text/gesture-derived. **VL hallucinations do not survive transmigration; they were ephemeral by construction.**
### Wealthy-degen waifu-folder exception
A specific opt-in special case for player-stored visual-content:
- A wealthy player who already has waifu-dialog stored (per `../political-register/architecture.md` §The vocation-substrate of the imperial-net market) can check a box that allows **storage of sex-pictures from clasp-scenes in a private folder**.
- Stored locally (their machine, their problem — privacy, storage, content).
- **Read-only-by-player** — folder content does **not** flow back into NPC contexts, world-canon, the Compositor, the mind-pool, or any other system.
- **Quarantined dead-end storage** — aesthetic-collection only.
The folder is architecturally inert with respect to the rest of the system. It exists *for the player*; it does not exist *for the world*.
### What this retires
- *Geometric perception (cone, radius, LOS)* → already retired by zone slot-occupancy + subscriber-event-emission; **omnisight extends the retirement** by giving NPCs *literal* visual perception within those subscribed events, not metadata-perception
- *VL models polluting world-canon* → text/gesture-derived summaries are the only canonical input; VL is behavior-modulating-ephemeral-flag-only; player-stored visual-content is read-only-by-player quarantined storage
### What this resolves / sharpens
- *Continuous visual feedback policy* (architecture-index Still-open) → with cel-shaded bodies (per `../topology-and-rendering/architecture.md` §Three-shader philosophy) and visual-as-ephemeral-flag, the body-shader pulses are *legible without canon-pollution risk*. The visual-feedback policy can be permissive at the cosmetic layer because it is firewalled from canonical state.
### Open questions
- **Per-NPC VL-inference rate-limit** — how many camera-renders + VL-inferences per second are affordable per active NPC at MMO scale? Pending: benchmark against Gemma 4 E4B VL-inference latency on typical-deployment hardware.
- **VL-Gemma camera resolution + frame-rate** — what camera-budget per NPC fits the rule-catalogue? Pending: rule catalogue + benchmark.
- **NPC progression-state for witness-axis** — how does an NPC accumulate in-between-knowledge that drives their dual-axis-clasp witness-fidelity (per `../identity-and-personhood/bodies.md` §Asymmetric clasp / §Dual-axis clasp-fidelity)? Their own clasps? Fragments encountered? Caste-class-default? Pending: design pass.
- **Multi-NPC observing same event** — each NPC runs independent VL-inference; how do their perceptions combine into a shared event-record? *(Connects to Compositor narrative-coherence-at-scale Still-open.)* Probable answer under the hallucination-isolation discipline: *they don't combine* — each NPC's visual context is private to their own next-turn `driver_context_pull`; the shared event-record is built from text/gesture-summaries only. Worth confirming explicitly.
## Zone taxonomy (v1 starter set)
| Zone type | Register | Slots | Executor | Persistence |