# SkyrimNet Architecture: High-Level Model

## What SkyrimNet is

A multi-agent LLM orchestrator that hijacks vanilla Skyrim NPC behavior, replacing static dialogue topics and idle routines with context-aware, LLM-driven scenes. NPCs talk to each other and the player through generated dialogue; their world-affecting actions are picked from a registry of "actions" contributed by SkyrimNet itself and any cooperating mod.

`[verified]` from `SkyrimNet.log:14123-14266` (action library initialization), `Source/Scripts/SkyrimNetApi.psc` (public API), `prompts/gamemaster_action_selector.prompt` (GM orchestrator prompt).

## The two-plugin architecture

SkyrimNet ships alongside a sibling SKSE plugin called **IntelEngine**. They're independent SKSE plugins that share the SQLite-backed persistence layer.

| Plugin | Role | Storage |
|---|---|---|
| **SkyrimNet** | LLM orchestration, dialogue generation, agent pipelines, TTS/STT, action dispatch | `overwrite/SKSE/Plugins/SkyrimNet/data/SkyrimNet-{epoch}-{nnnnnn}.db` |
| **IntelEngine** | Persistent narrative/intelligence layer (third-party "story DM"-style agent) | `overwrite/SKSE/Plugins/IntelEngine/data/IntelEngine-{epoch}-{nnnnnn}.db` |

`[verified]` from disk layout. Databases are sharded per game session; the epoch suffix is the save-game timestamp.
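
To make the shard naming in the table above concrete, here is a minimal sketch that splits a database filename into its parts. It assumes the epoch is a plain run of digits and that `{nnnnnn}` is a six-digit counter; the regex, the `ShardName` type, the `parse_shard_name` helper, and the example filename are all hypothetical and are not taken from the SkyrimNet source.

```python
import re
from typing import NamedTuple, Optional


class ShardName(NamedTuple):
    plugin: str    # "SkyrimNet" or "IntelEngine"
    epoch: int     # per the doc: save-game timestamp
    sequence: int  # assumed meaning of {nnnnnn}: a six-digit counter


# Assumed filename shape: <Plugin>-<digits>-<six digits>.db
_SHARD_RE = re.compile(r"^(SkyrimNet|IntelEngine)-(\d+)-(\d{6})\.db$")


def parse_shard_name(filename: str) -> Optional[ShardName]:
    """Return the parsed parts, or None if the name does not fit the pattern."""
    match = _SHARD_RE.match(filename)
    if match is None:
        return None
    plugin, epoch, sequence = match.groups()
    return ShardName(plugin, int(epoch), int(sequence))


# Hypothetical filename, made up for illustration only.
example = parse_shard_name("SkyrimNet-1700000000-000001.db")
```

Grouping the parsed names by `epoch` would then give one bucket per game session, assuming the sharding described above.
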
## The four code layers

```
┌─────────────────────────────────────────────────┐
│  Closed-source C++ DLL                          │
│  SKSE/Plugins/SkyrimNet.dll                     │
│  - LLM orchestration, agent dispatch            │
│  - Action parser (ParseEmbeddedAction)          │
│  - Decorator implementation                     │
│  - SQLite persistence + vector embeddings       │
└─────────────────────────────────────────────────┘
                        ▲ ▼
┌─────────────────────────────────────────────────┐
│  Open-source Papyrus glue                       │
│  mods/SkyrimNet/Source/Scripts/*.psc            │
│  - SkyrimNetApi.psc (public API surface)        │
│  - SkyrimNetInternal.psc (DLL callbacks)        │
│  - skynet_MainController.psc (quest entry)      │
│  - skynet_Library.psc (shipped action impls)    │
│  - skynet_VoiceInput*.psc (STT integration)     │
└─────────────────────────────────────────────────┘
                        ▲ ▼
┌─────────────────────────────────────────────────┐
│  Open-source .esp content (Spriggit JSON)       │
│  mods/SkyrimNet/plugins/SkyrimNet/              │
│  - 8 custom AI Packages (NPC/Player Dialogue,   │
│    Follow, TalkToPlayer)                        │
│  - Custom Magic Effects (voice input spells)    │
│  - Factions (Whitelist/Blacklist/Following)     │
│  - Keywords (DialogueTarget/FollowTarget)       │
│  - Quests (skynet_MainController, skynet_Mcm)   │
└─────────────────────────────────────────────────┘
                        ▲ ▼
┌─────────────────────────────────────────────────┐
│  Configuration & content (text files)           │
│  mods/SkyrimNet/SKSE/Plugins/SkyrimNet/         │
│  - prompts/ (Inja templates, three-layer)       │
│  - sql/migrations/ (17 schema migrations)       │
│  overwrite/SKSE/Plugins/SkyrimNet/              │
│  - config/   (38 YAML files + defaults_manifest)│
│  - data/     (SQLite per-session DBs)           │
│  - prompts/  (runtime UI overrides)             │
│  Plus contributing mods' config/actions/*.yaml  │
└─────────────────────────────────────────────────┘
```

`[verified]` All layers exist. The closed-source DLL is the only piece we cannot read directly; we infer behavior from logs, headers, Papyrus callbacks, and traces.

## The four agent families

Each agent maps to a "variant" in `OpenRouter.yaml`, which maps to a model/endpoint.
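
The two-step lookup (agent to variant, variant to model/endpoint) can be sketched as below. The variant names and their targets are the ones that appear in the traces in this document; the dictionary layout, the agent keys, the `Variant` type, and the `resolve` helper are hypothetical and do not reflect the real structure of `OpenRouter.yaml`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    target: str                       # model or endpoint alias
    max_tokens: Optional[int] = None  # only set where a trace shows it


# Hypothetical in-memory view of the variant table.
# Names and targets are copied from the traces in this document.
VARIANTS = {
    "AgentDefault": Variant(target="eva"),
    "meta": Variant(target="omega"),
    "gamemaster_evaluation": Variant(target="claude-sonnet-4-5", max_tokens=256),
}

# Assumed agent keys, chosen for illustration only.
AGENT_TO_VARIANT = {
    "dialogue": "AgentDefault",
    "mood_evaluation": "meta",
    "memory_search_query_generation": "meta",
    "gamemaster": "gamemaster_evaluation",
}


def resolve(agent: str) -> Variant:
    """Resolve an agent to its variant, and through it to a target."""
    return VARIANTS[AGENT_TO_VARIANT[agent]]
```

Under this sketch, two agents that share a variant (here the two meta helpers) resolve to the same target, so changing a variant's target would retarget every agent mapped to it.
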
See `agent-pipelines.md` for the full table.

1. **Gamemaster (GM)**: scene-level orchestrator. Decides "should anything happen now, and if so what?" Polls every ~30s in continuous mode + fires on player input. Emits one `ACTION:` line.
2. **Dialogue**: generates the actual NPC speech. Triggered by GM actions like `StartConversation` / `ContinueConversation` or by player dialogue input. Can optionally append an `ACTION:` line for inline action firing.
3. **Meta**: classifiers and helpers (mood eval, memory query generation, dialogue speaker selection). Capped at ~100 tokens per call.
4. **Vision (OmniSight)**: describes the current scene from a screenshot. Uses a local Qwen3-VL model. Fires on `player_text_input` and `player_direct_input_voice` events.

Plus a fifth implicit agent type:

5. **Native Action Selector**: *post-dialogue* classifier that asks "what in-game action does this NPC's spoken line imply?" Two-stage: category → leaf. Distinct from the GM's scene-level action selection.

## End-to-end orchestration trace

For a player text-input event (verified against `all_traces_1776478948530.json`):

```
event_received
├─ papyrus_decorator_cache_warmup
│  ├─ get_player
│  ├─ get_nearby_actors
│  └─ papyrus_decorators_async  ← warm caches before LLM render
├─ scene_capture
│  └─ omnisight_immediate_scene_capture
│     └─ omnisight_capture_image  ← screenshot for vision model
├─ chat_ui_open  ← UI block for input
├─ warmup_player_dialogue
│  └─ many decorator:* spans (decnpc, render_subcomponent, …)
└─ dialogue_manager_handle_player_speech
   ├─ target_selection_llm  ← meta-model: who responds?
   └─ generate_response
      ├─ initiate_eligibility_checks  (Papyrus IsEligible callbacks)
      ├─ build_action_context
      │  ├─ wait_eligibility_results  (≤ 2500ms)
      │  ├─ filter_eligible_actions
      │  └─ build_action_schemas  (JSON schema list for LLM)
      ├─ build_payload
      │  └─ render_template  (Inja render of dialogue_response.prompt)
      ├─ llm_request  (variant=AgentDefault → eva)
      ├─ tts_generation
      │  └─ tts_segment_0…N
      ├─ mood_evaluation  (variant=meta → omega, parallel)
      └─ memory_search_query_generation  (variant=meta → omega, parallel)
```

For a continuous-mode GM tick (also `[verified]` from trace):

```
gamemaster_evaluation_llm
└─ gamemaster_async_llm
   └─ llm_request  (variant=gamemaster_evaluation → claude-sonnet-4-5, max_tokens=256)
      ↓ [parser extracts ACTION: line]
      ↓ if action == StartConversation or ContinueConversation:
      player_dialogue_manager_process_event
      └─ dialogue_manager_handle_perceived_event
         └─ generate_response  (full pipeline above)
```

## Where the bottlenecks are

`[hypothesis]` based on the trace structure and log volumes:

- **GM `max_tokens: 256`** is a hard ceiling. With three contributor mods registering ~105 actions total, the GM has to reason over a large `eligible_actions` list and emit one ACTION line. The two-stage drilldown and category wrapper exist precisely to compress this cognitive load.
- **`wait_eligibility_results` blocks for up to 2500ms.** Slow Papyrus eligibility callbacks shrink the available action set. This is a Skyrim-VM-side performance dependency that no LLM tuning can fix.
- **OmniSight vision** runs locally on a Qwen3-VL model. Image capture + inference adds latency before any text generation can begin.

## Adjacent technologies in the substrate

- **whisper.cpp** for local STT (`SKSE/Plugins/SkyrimNet/libs/whisper.dll` + `ggml*.dll` for CPU/CUDA/Vulkan/OpenCL backends).
- **all-MiniLM-L6-v2** sentence-transformer for semantic embedding of NPC memories (`SKSE/Plugins/SkyrimNet/models/all-MiniLM-L6-v2-tokenizer.json`).
- **ONNX runtime** (`onnxruntime_skyrimnet.dll`): likely VAD or auxiliary model inference.
- **espeak-ng** voice data (`SKSE/Plugins/SkyrimNet/models/espeak-ng-data/`): TTS phoneme tables for Piper/PocketTTS.
- **Spriggit** to git-track the .esp content as JSON.

## Cross-references

- For per-agent firing details see [`agent-pipelines.md`](agent-pipelines.md).
- For the prompt template system see [`prompt-templates.md`](prompt-templates.md).
- For action registration and the `ACTION:` parser see [`action-system.md`](action-system.md).
- For YAML config behavior see [`config-knobs.md`](config-knobs.md).
- For known bugs and what was tried see [`bugs-and-fixes.md`](bugs-and-fixes.md).