feat: Add Oghma RAG Proxy for SkyrimNet lore injection

RAG proxy that intercepts SkyrimNet LLM requests and enriches them
with relevant Tamrielic lore from CHIM's Oghma Infinium database.

Features:
- FastAPI proxy compatible with OpenAI API
- ChromaDB semantic search for lore retrieval
- NPC profile extraction from SkyrimNet prompts
- Google Sheets ingestion for CHIM's Oghma data
- Kubernetes deployment manifests
- Debug endpoint for RAG operation monitoring

Collections ingested to iris-dev ChromaDB:
- oghma_lore: 1951 entries (scholar knowledge)
- oghma_basic: 1949 entries (commoner knowledge)
- oghma_visual: 1151 entries (Omnisight perception)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: dafit
Date: 2026-03-30 23:22:46 +02:00
Parent: 62dcee5fbf
Commit: 3926ab676f
20 changed files with 2367 additions and 0 deletions

oghma-proxy/.gitignore

@@ -0,0 +1,5 @@
__pycache__/
*.pyc
config.local.yaml
.env

oghma-proxy/Dockerfile

@@ -0,0 +1,35 @@
# Oghma RAG Proxy - Container Image
FROM python:3.11-slim
LABEL maintainer="dafit@eachpath.local"
LABEL description="RAG Proxy for SkyrimNet - Injects Tamrielic lore into NPC conversations"
# Set working directory
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
    && rm -rf /var/lib/apt/lists/*
# Copy project files
COPY pyproject.toml .
COPY src/ ./src/
COPY config.yaml .
# Install Python dependencies
RUN pip install --no-cache-dir -e .
# Create non-root user
RUN useradd -m -u 1000 oghma
USER oghma
# Expose port
EXPOSE 8100
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8100/health || exit 1
# Run the proxy
CMD ["python", "-m", "uvicorn", "oghma_proxy.main:app", "--host", "0.0.0.0", "--port", "8100"]

oghma-proxy/README.md

@@ -0,0 +1,54 @@
# Oghma RAG Proxy
RAG (Retrieval-Augmented Generation) proxy for SkyrimNet that injects Tamrielic lore into NPC conversations based on their knowledge profile.
## Overview
This proxy sits between SkyrimNet and your LLM inference endpoint (OpenRouter/vLLM), enriching prompts with relevant lore from CHIM's Oghma Infinium database.
**Key Features:**
- Zero changes to SkyrimNet — just change the endpoint URL
- NPC-aware filtering — guards don't know mage secrets
- Two-tier knowledge — scholars get deep lore, commoners get basics
- ChromaDB-powered semantic search
## Quick Start
```bash
# Install
pip install -e .
# Ingest Oghma lore into ChromaDB
python -m oghma_proxy.ingest --host iris-dev.eachpath.local --port 35000
# Run proxy
python -m oghma_proxy.main
```
## Configuration
Copy `config.yaml` to `config.local.yaml` and customize:
```yaml
upstream:
  url: https://openrouter.ai/api/v1
  api_key: ${OPENROUTER_API_KEY}

chromadb:
  host: iris-dev.eachpath.local
  port: 35000
```
## Kubernetes Deployment
```bash
kubectl apply -k k8s/
```
## Architecture
See [TECHNICAL-SPEC.md](TECHNICAL-SPEC.md) for full design documentation.
---
Part of the [nimmerverse](https://github.com/dafit/nimmerverse) project.

oghma-proxy/TECHNICAL-SPEC.md

@@ -0,0 +1,497 @@
# Oghma RAG Proxy — Technical Specification
**Project:** SkyrimNet Lore Enhancement via RAG Proxy
**Status:** Design Phase
**Author:** Chrysalis + dafit
**Created:** 2026-03-30
---
## 1. Problem Statement
SkyrimNet currently relies on:
1. LLM's baked-in Skyrim knowledge (incomplete, potentially wrong)
2. Dynamic memories (what the NPC witnessed)
**Missing:** Authoritative lore retrieval filtered by what each NPC *should* know.
**Result:**
- Knowledge bleedover (guard knows Telvanni secrets)
- Lore inaccuracies (mixing up timelines, factions)
- No grounding in canon
---
## 2. Solution: Oghma RAG Proxy
A transparent proxy that sits between SkyrimNet and the LLM inference endpoint.
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ OGHMA RAG PROXY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────────┐ ┌──────────────┐ │
│ │ SkyrimNet │ │ Oghma Proxy │ │ OpenRouter │ │
│ │ SKSE Plugin │────────▶│ (FastAPI) │────────▶│ / vLLM │ │
│ │ │ │ │ │ │ │
│ │ Port: N/A │ │ Port: 8100 │ │ Upstream │ │
│ └──────────────┘ └────────┬─────────┘ └──────────────┘ │
│ │ │
│ │ Query │
│ ▼ │
│ ┌──────────────────┐ │
│ │ iris-dev │ │
│ │ ChromaDB │ │
│ │ Port: 35000 │ │
│ │ │ │
│ │ Collections: │ │
│ │ - oghma_lore │ │
│ │ - oghma_basic │ │
│ └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## 3. Core Principles
1. **Zero SkyrimNet Changes** — Only change the endpoint URL in config
2. **Transparent Passthrough** — Unknown requests forward unchanged
3. **NPC-Aware Filtering** — Lore filtered by extracted NPC profile
4. **Two-Tier Content** — Scholars get deep lore, commoners get basics
5. **Nimmerverse Native** — Runs on existing infrastructure (iris-dev)
---
## 4. Architecture Components
### 4.1 Oghma Proxy Service
**Location:** VM on nimmerverse (could run on phoebe-dev or dedicated)
**Tech Stack:** Python 3.11+, FastAPI, httpx, chromadb-client
**Port:** 8100 (configurable)
**Responsibilities:**
- Intercept OpenRouter-compatible API requests
- Parse prompts to extract NPC context
- Query ChromaDB for relevant lore
- Inject lore into system prompt
- Forward to upstream LLM
- Stream response back to SkyrimNet
### 4.2 Oghma ChromaDB Collection
**Location:** iris-dev.eachpath.local:35000
**Collections:**
| Collection | Content | Use Case |
|------------|---------|----------|
| `oghma_lore` | Full `topic_desc` entries | Scholars, mages, priests |
| `oghma_basic` | Simplified `topic_desc_basic` | Commoners, guards, peasants |
**Metadata Schema:**
```json
{
  "topic": "Akatosh",
  "category": "Figures/Gods",
  "knowledge_classes": ["priest", "scholar", "dragon", "snowelf"],
  "knowledge_classes_basic": ["nord", "imperial", "breton"],
  "tags": ["divine", "time", "dragon-god"],
  "source_sheet": "Figures/Gods"
}
```
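ChromaDB metadata values must be scalars (strings, numbers, booleans), so list-valued fields like `knowledge_classes` are flattened to a comma-separated string at ingest time and split again on read, consistent with the string values the ingestion snippet in section 9.1 stores. A minimal sketch with hypothetical helper names:

```python
def pack_classes(classes: list[str]) -> str:
    """Flatten a list of knowledge classes into a comma-separated string for ChromaDB metadata."""
    return ",".join(classes)


def unpack_classes(packed: str) -> list[str]:
    """Split a packed metadata string back into a class list, dropping empty parts."""
    return [c for c in (part.strip() for part in packed.split(",")) if c]
```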
### 4.3 NPC Profile Extractor
Parses SkyrimNet prompts to extract:
- NPC name
- Race
- Profession/class (from bio or context)
- Factions
- Location (hold)
- Special traits
**Extraction Patterns:**
```python
# From character bio section
r"## (?P<name>\w+) Bio\n- Gender: (?P<gender>\w+)\n- Race: (?P<race>\w+)"
# From system context
r"You are (?P<name>[^,]+), a (?P<race>\w+) (?P<profession>\w+)"
# From faction mentions
r"member of (?:the )?(?P<faction>[\w\s]+)"
```
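A quick sanity check of the first pattern against a synthetic prompt fragment (the name `Danica` is illustrative, not taken from SkyrimNet output):

```python
import re

# Bio-header pattern from above
bio_pattern = re.compile(
    r"## (?P<name>\w+) Bio\n- Gender: (?P<gender>\w+)\n- Race: (?P<race>\w+)"
)

# Hypothetical prompt fragment in the bio format shown above
prompt = "## Danica Bio\n- Gender: Female\n- Race: Breton"

match = bio_pattern.search(prompt)
print(match.group("name"), match.group("gender"), match.group("race"))
# → Danica Female Breton
```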
---
## 5. Data Flow
### 5.1 Request Interception
```
1. SkyrimNet sends POST /chat/completions to proxy
2. Proxy extracts NPC profile from messages
3. Proxy extracts conversation context/topic
4. Proxy queries ChromaDB:
   - Collection: oghma_lore or oghma_basic (based on NPC education)
   - Filter: knowledge_classes intersects NPC's classes
   - Query: conversation context (semantic search)
   - Limit: 3-5 most relevant entries
5. Proxy injects lore block into system message
6. Proxy forwards enriched request to upstream
7. Proxy streams response back to SkyrimNet
```
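ChromaDB's metadata `where` filters do not express set intersection directly, so step 4's class filter can be applied as a post-filter over the retrieved metadata. A sketch, assuming entries carry the comma-packed `knowledge_classes` string described in section 4.2:

```python
def filter_by_knowledge_classes(
    entries: list[dict], npc_classes: list[str], limit: int = 5
) -> list[dict]:
    """Keep retrieved entries whose knowledge classes overlap the NPC's classes."""
    npc = set(npc_classes)
    kept = []
    for entry in entries:
        packed = entry.get("knowledge_classes", "")
        classes = {c.strip() for c in packed.split(",") if c.strip()}
        if classes & npc:
            kept.append(entry)
    return kept[:limit]


# Hypothetical retrieval results
candidates = [
    {"topic": "Akatosh", "knowledge_classes": "priest,scholar"},
    {"topic": "Telvanni", "knowledge_classes": "darkelf,mage"},
]
print(filter_by_knowledge_classes(candidates, ["nord", "priest"]))
# → [{'topic': 'Akatosh', 'knowledge_classes': 'priest,scholar'}]
```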
### 5.2 Lore Injection Format
Injected after character bio, before conversation:
```
## Relevant Lore Knowledge
Based on your background as a Nord priest in Whiterun, you would know:
- **Talos**: [Condensed lore about Talos worship, appropriate to character]
- **Whiterun**: [Local knowledge about the hold]
- **Companions**: [If character has connection]
Remember: This is knowledge your character possesses. Reference it naturally in conversation, don't recite it.
```
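The block above can be assembled by a small formatter. A sketch: the `topic`/`content` keys mirror the retrieval metadata, and the NPC summary string is assumed to come from the extracted profile:

```python
def format_lore_block(npc_summary: str, entries: list[dict]) -> str:
    """Render the lore injection block in the format described above."""
    lines = [
        "## Relevant Lore Knowledge",
        f"Based on your background as {npc_summary}, you would know:",
    ]
    for entry in entries:
        lines.append(f"- **{entry['topic']}**: {entry['content']}")
    lines.append(
        "Remember: This is knowledge your character possesses. "
        "Reference it naturally in conversation, don't recite it."
    )
    return "\n".join(lines)


block = format_lore_block(
    "a Nord priest in Whiterun",
    [{"topic": "Talos", "content": "Talos is the god-hero of mankind."}],
)
print(block)
```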
---
## 6. NPC Knowledge Classification
### 6.1 Profile → Knowledge Classes Mapping
```python
RACE_CLASSES = {
    "Nord": ["nord"],
    "Dunmer": ["darkelf", "dunmer"],
    "Altmer": ["highelf", "altmer"],
    "Bosmer": ["woodelf", "bosmer"],
    "Argonian": ["argonian"],
    "Khajiit": ["khajiit"],
    "Breton": ["breton"],
    "Redguard": ["redguard"],
    "Orsimer": ["orc", "orsimer"],
    "Imperial": ["imperial"],
}

PROFESSION_CLASSES = {
    "priest": ["priest"],
    "mage": ["mage", "scholar"],
    "scholar": ["scholar"],
    "blacksmith": ["blacksmith"],
    "guard": ["guard", "warrior"],
    "thief": ["thief"],
    "merchant": ["merchant"],
    "innkeeper": ["innkeeper"],
    "hunter": ["hunter"],
    "farmer": ["peasant"],
    "noble": ["noble"],
    "bard": ["bard"],
}

LOCATION_CLASSES = {
    "Whiterun": ["whiterun"],
    "Windhelm": ["eastmarch"],
    "Solitude": ["haafingar"],
    "Riften": ["rift"],
    "Markarth": ["reach"],
    "Morthal": ["hjaalmarch"],
    "Dawnstar": ["pale"],
    "Winterhold": ["winterhold"],
    "Falkreath": ["falkreath"],
    "Solstheim": ["solstheim"],
}

FACTION_CLASSES = {
    "Companions": ["companions"],
    "College of Winterhold": ["college", "mage"],
    "Thieves Guild": ["thieves"],
    "Dark Brotherhood": ["darkbrotherhood"],
    "Stormcloaks": ["stormcloak"],
    "Imperial Legion": ["imperial"],
    "Thalmor": ["thalmor"],
    "Dawnguard": ["dawnguard"],
    "Volkihar": ["vampire", "volkihar"],
}
```
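Combining the tables is a straightforward set union over the profile fields. A sketch using small excerpts of the tables above (`compute_knowledge_classes` is an illustrative name, not the proxy's actual API):

```python
# Excerpts of the mapping tables above (full tables in section 6.1)
RACE_CLASSES = {"Nord": ["nord"], "Dunmer": ["darkelf", "dunmer"]}
PROFESSION_CLASSES = {"priest": ["priest"], "guard": ["guard", "warrior"]}
LOCATION_CLASSES = {"Whiterun": ["whiterun"]}
FACTION_CLASSES = {"Companions": ["companions"]}


def compute_knowledge_classes(race, profession, location, factions):
    """Union the class lists contributed by each profile attribute."""
    classes = set(RACE_CLASSES.get(race, []))
    classes |= set(PROFESSION_CLASSES.get(profession, []))
    classes |= set(LOCATION_CLASSES.get(location, []))
    for faction in factions:
        classes |= set(FACTION_CLASSES.get(faction, []))
    return sorted(classes)


print(compute_knowledge_classes("Nord", "priest", "Whiterun", ["Companions"]))
# → ['companions', 'nord', 'priest', 'whiterun']
```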
### 6.2 Education Level Detection
```python
def get_education_level(npc_profile: NPCProfile) -> str:
    """Determine if NPC gets full lore or basic summaries."""
    educated_professions = {"mage", "scholar", "priest", "noble", "bard"}
    educated_factions = {"College of Winterhold", "Thalmor"}

    if npc_profile.profession in educated_professions:
        return "scholar"
    if any(f in educated_factions for f in npc_profile.factions):
        return "scholar"
    return "commoner"
```
---
## 7. API Specification
### 7.1 Proxy Endpoints
**POST /v1/chat/completions** (OpenRouter compatible)
- Intercepts, enriches, forwards
- Supports streaming
**POST /v1/completions** (Legacy)
- Same enrichment logic
**GET /health**
- Returns proxy + ChromaDB status
**GET /stats**
- Lore injection statistics
- Cache hit rates
- Average latency added
### 7.2 Configuration
```yaml
# oghma-proxy.yaml
proxy:
  host: 0.0.0.0
  port: 8100

upstream:
  # OpenRouter
  url: https://openrouter.ai/api/v1
  api_key: ${OPENROUTER_API_KEY}
  # Or local vLLM
  # url: http://localhost:8000/v1

chromadb:
  host: iris-dev.eachpath.local
  port: 35000
  collection_lore: oghma_lore
  collection_basic: oghma_basic

retrieval:
  max_results: 5
  min_score: 0.6

injection:
  enabled: true
  position: after_bio  # after_bio | before_conversation | system_suffix

logging:
  level: INFO
  log_injections: true
  log_to_phoebe: true  # Log to nimmerverse decision table
```
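The `${OPENROUTER_API_KEY}` placeholder implies an environment-expansion step when the config is loaded. A minimal sketch, assuming the loader expands `${VAR}` in string values (the proxy's actual loader is not shown in this commit chunk):

```python
import os
import re


def expand_env(value: str) -> str:
    """Replace ${VAR} placeholders with environment values (empty string if unset)."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)


# Demo with a dummy key set for illustration
os.environ["OPENROUTER_API_KEY"] = "sk-demo"
print(expand_env("api_key: ${OPENROUTER_API_KEY}"))
# → api_key: sk-demo
```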
---
## 8. Deployment Architecture
### 8.1 Host Options
**Option A: Dedicated VM (Recommended)**
```
VM ID: 109
Hostname: oghma.eachpath.local
IP: 10.0.40.109
Resources: 2 vCPU, 4GB RAM
OS: Rocky Linux 10
```
**Option B: Co-locate on phoebe-dev**
```
Hostname: phoebe-dev.eachpath.local
Port: 8100 (alongside PostgreSQL 35432)
Pros: No new VM, shared resources
Cons: Resource contention
```
### 8.2 Service Configuration
```ini
# /etc/systemd/system/oghma-proxy.service
[Unit]
Description=Oghma RAG Proxy for SkyrimNet
After=network.target
[Service]
Type=simple
User=chrysalis
WorkingDirectory=/opt/oghma-proxy
ExecStart=/opt/oghma-proxy/venv/bin/python -m uvicorn main:app --host 0.0.0.0 --port 8100
Restart=always
RestartSec=5
Environment=OPENROUTER_API_KEY=your-key-here
[Install]
WantedBy=multi-user.target
```
### 8.3 SkyrimNet Configuration Change
In `config/OpenRouter.yaml`:
```yaml
# Before
openrouter:
  base_url: https://openrouter.ai/api/v1

# After
openrouter:
  base_url: http://oghma.eachpath.local:8100/v1
  # Or if running locally:
  # base_url: http://localhost:8100/v1
```
---
## 9. Data Pipeline: Oghma Ingestion
### 9.1 Google Sheets → ChromaDB Pipeline
```python
# ingest_oghma.py
import pandas as pd
import chromadb
from chromadb.config import Settings

SHEET_ID = "1dcfctU-iOqprwy2BOc7___4Awteczgdlv8886KalPsQ"

SHEETS = [
    ("Figures/Gods", 0),
    ("Artifacts", 1),
    ("Locations - Whiterun", 2),
    # ... etc
]


def ingest_sheet(sheet_name: str, gid: int, chroma_client):
    url = f"https://docs.google.com/spreadsheets/d/{SHEET_ID}/export?format=csv&gid={gid}"
    df = pd.read_csv(url)

    collection_lore = chroma_client.get_or_create_collection("oghma_lore")
    collection_basic = chroma_client.get_or_create_collection("oghma_basic")

    for _, row in df.iterrows():
        topic = row['topic']

        # Full lore for educated NPCs
        if pd.notna(row.get('topic_desc')):
            collection_lore.add(
                documents=[row['topic_desc']],
                ids=[f"{sheet_name}:{topic}"],
                metadatas=[{
                    "topic": topic,
                    "category": sheet_name,
                    "knowledge_classes": row.get('knowledge_class', ''),
                    "tags": row.get('tags', ''),
                }],
            )

        # Basic lore for commoners
        if pd.notna(row.get('topic_desc_basic')):
            collection_basic.add(
                documents=[row['topic_desc_basic']],
                ids=[f"{sheet_name}:{topic}:basic"],
                metadatas=[{
                    "topic": topic,
                    "category": sheet_name,
                    "knowledge_classes": row.get('knowledge_class_basic', ''),
                    "tags": row.get('tags', ''),
                }],
            )
```
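The per-row logic above can be factored into a pure helper that builds the (document, id, metadata) triple, which makes the ID scheme (`sheet:topic`, with a `:basic` suffix) easy to unit-test without a ChromaDB connection. A sketch under the same column assumptions, using plain-dict rows rather than pandas rows:

```python
def row_to_record(sheet_name: str, row: dict, basic: bool = False):
    """Build (document, id, metadata) for one sheet row, or None if the column is empty."""
    desc_col = "topic_desc_basic" if basic else "topic_desc"
    class_col = "knowledge_class_basic" if basic else "knowledge_class"
    text = row.get(desc_col)
    if not text:
        return None
    topic = row["topic"]
    entry_id = f"{sheet_name}:{topic}:basic" if basic else f"{sheet_name}:{topic}"
    metadata = {
        "topic": topic,
        "category": sheet_name,
        "knowledge_classes": row.get(class_col, ""),
        "tags": row.get("tags", ""),
    }
    return text, entry_id, metadata


row = {
    "topic": "Akatosh",
    "topic_desc": "Chief deity of the Nine Divines.",
    "knowledge_class": "priest,scholar",
    "tags": "divine",
}
doc, entry_id, meta = row_to_record("Figures/Gods", row)
print(entry_id)
# → Figures/Gods:Akatosh
```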
### 9.2 Embedding Model
Use same embedding model as SkyrimNet memories for consistency:
- **Model:** `all-MiniLM-L6-v2` (384-dimensional embeddings)
- **Or:** Match whatever SkyrimNet uses in Memory.yaml
---
## 10. Implementation Phases
### Phase 1: Foundation (Week 1)
- [ ] Set up oghma-proxy repository
- [ ] Implement basic FastAPI proxy (passthrough mode)
- [ ] Test with SkyrimNet (verify transparent forwarding)
- [ ] Deploy on phoebe-dev for initial testing
### Phase 2: Oghma Ingestion (Week 1-2)
- [ ] Write Google Sheets ingestion script
- [ ] Ingest all Oghma sheets into iris-dev ChromaDB
- [ ] Verify embeddings and metadata
- [ ] Test semantic queries manually
### Phase 3: NPC Profile Extraction (Week 2)
- [ ] Implement prompt parser for NPC context
- [ ] Build knowledge class mapper
- [ ] Test with various NPC types
- [ ] Handle edge cases (unnamed NPCs, generic guards)
### Phase 4: RAG Integration (Week 2-3)
- [ ] Implement ChromaDB query logic
- [ ] Build lore injection formatter
- [ ] Add education-level routing (scholar vs commoner)
- [ ] Test end-to-end with SkyrimNet
### Phase 5: Polish & Deploy (Week 3)
- [ ] Add streaming support
- [ ] Implement caching (avoid re-querying same context)
- [ ] Add metrics/logging to phoebe
- [ ] Create dedicated VM or finalize deployment
- [ ] Write operational runbook
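The Phase 5 caching item could start as a simple TTL map keyed by, say, the NPC's classes plus the query context. A sketch matching the `ttl_seconds`/`max_size` config keys (not the proxy's actual cache implementation):

```python
import time


class TTLCache:
    """Tiny TTL cache for lore lookups; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300, max_size: int = 1000):
        self.ttl = ttl_seconds
        self.max_size = max_size
        self._store: dict = {}

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires = item
        if time.monotonic() > expires:
            del self._store[key]
            return None
        return value

    def put(self, key, value):
        if len(self._store) >= self.max_size:
            # Evict the oldest insertion (dicts preserve insertion order)
            self._store.pop(next(iter(self._store)))
        self._store[key] = (value, time.monotonic() + self.ttl)


cache = TTLCache(ttl_seconds=300)
cache.put("nord|priest|talos", ["lore entry"])
print(cache.get("nord|priest|talos"))
# → ['lore entry']
```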
### Phase 6: Iteration (Ongoing)
- [ ] Tune retrieval parameters based on gameplay
- [ ] Add more knowledge sources beyond Oghma
- [ ] Consider contributing upstream to SkyrimNet
---
## 11. Success Metrics
| Metric | Target | How to Measure |
|--------|--------|----------------|
| Lore accuracy | NPCs reference correct lore | Manual spot-checks |
| Knowledge scoping | Guards don't know mage secrets | Test with various NPC types |
| Latency overhead | < 200ms added | Proxy metrics |
| Stability | 99.9% uptime | Service monitoring |
| User experience | Immersion improved | Subjective gameplay testing |
---
## 12. Risks & Mitigations
| Risk | Impact | Mitigation |
|------|--------|------------|
| Latency too high | Breaks dialogue flow | Cache aggressively, async prefetch |
| Wrong lore injected | Immersion broken | Strict knowledge filtering, fallback to no injection |
| ChromaDB down | No lore enrichment | Graceful degradation (passthrough mode) |
| Prompt parsing fails | NPC profile unknown | Default to basic/generic lore |
| Oghma data incomplete | Missing topics | Supplement with UESP scraping |
---
## 13. Future Enhancements
1. **Player profile tracking** — Remember what player has learned, NPCs can reference shared knowledge
2. **Gossip propagation** — Lore spreads through NPC network with degradation
3. **Dynamic lore updates** — Events in-game add to lore corpus
4. **Multi-source RAG** — Combine Oghma + UESP + custom worldbuilding
5. **Upstream contribution** — Propose RAG API to SkyrimNet author
---
**Version:** 1.0 | **Created:** 2026-03-30 | **Updated:** 2026-03-30

oghma-proxy/config.yaml

@@ -0,0 +1,76 @@
# Oghma RAG Proxy Configuration
# Copy to config.local.yaml and customize for your environment
proxy:
  host: 0.0.0.0
  port: 8100
  workers: 1

upstream:
  # OpenRouter (cloud)
  url: https://openrouter.ai/api/v1
  api_key: ${OPENROUTER_API_KEY}
  # Local vLLM alternative:
  # url: http://localhost:8000/v1
  # api_key: ""
  timeout: 120         # seconds
  stream_timeout: 300  # for streaming responses

chromadb:
  host: iris-dev.eachpath.local
  port: 35000
  collection_lore: oghma_lore
  collection_basic: oghma_basic

retrieval:
  max_results: 5
  min_score: 0.55
  embedding_model: all-MiniLM-L6-v2  # Match SkyrimNet memory embeddings

injection:
  enabled: true
  position: after_bio  # after_bio | before_conversation | system_suffix
  # Injection template
  template: |
    ## Relevant Lore Knowledge
    Based on your background, you would know:
    {% for entry in lore_entries %}
    - **{{ entry.topic }}**: {{ entry.content }}
    {% endfor %}
    Remember: Reference this knowledge naturally in conversation when relevant.

npc_extraction:
  # Regex patterns for extracting NPC info from prompts
  patterns:
    bio_header: '## (?P<name>[\w\s]+) Bio\n- Gender: (?P<gender>\w+)\n- Race: (?P<race>\w+)'
    role_context: 'You are (?P<name>[^,]+), (?:a |an )?(?P<race>\w+)'
    faction_member: 'member of (?:the )?(?P<faction>[\w\s]+)'
    location_in: '(?:in|at|near) (?P<location>Whiterun|Windhelm|Solitude|Riften|Markarth|Morthal|Dawnstar|Winterhold|Falkreath)'

logging:
  level: INFO
  format: json  # json | console
  log_injections: true

# Log decisions to phoebe for analysis
phoebe:
  enabled: true
  host: phoebe-dev.eachpath.local
  port: 35432
  database: nimmerverse
  table: oghma_proxy_decisions

cache:
  enabled: true
  ttl_seconds: 300  # Cache lore lookups for 5 minutes
  max_size: 1000    # Max cached queries

metrics:
  enabled: true
  endpoint: /metrics  # Prometheus-compatible

oghma-proxy/k8s/configmap.yaml

@@ -0,0 +1,50 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: oghma-proxy-config
  namespace: nimmersky
  labels:
    app.kubernetes.io/name: oghma-proxy
    app.kubernetes.io/part-of: nimmerverse
data:
  config.yaml: |
    proxy:
      host: 0.0.0.0
      port: 8100
      workers: 1
    upstream:
      # Configure via secret or environment variables
      url: ${UPSTREAM_URL}
      timeout: 120
      stream_timeout: 300
    chromadb:
      # ChromaDB service - adjust based on your deployment
      # Option 1: External (iris-dev VM)
      host: iris-dev.eachpath.local
      port: 35000
      # Option 2: In-cluster ChromaDB
      # host: chromadb.nimmersky.svc.cluster.local
      # port: 8000
      collection_lore: oghma_lore
      collection_basic: oghma_basic
    retrieval:
      max_results: 5
      min_score: 0.55
      embedding_model: all-MiniLM-L6-v2
    injection:
      enabled: true
      position: after_bio
    logging:
      level: INFO
      format: json
      log_injections: true
    cache:
      enabled: true
      ttl_seconds: 300
      max_size: 1000

oghma-proxy/k8s/deployment.yaml

@@ -0,0 +1,108 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: oghma-proxy
  namespace: nimmersky
  labels:
    app.kubernetes.io/name: oghma-proxy
    app.kubernetes.io/part-of: nimmerverse
    app.kubernetes.io/component: inference-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: oghma-proxy
  template:
    metadata:
      labels:
        app.kubernetes.io/name: oghma-proxy
        app.kubernetes.io/part-of: nimmerverse
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8100"
        prometheus.io/path: "/metrics"
    spec:
      serviceAccountName: oghma-proxy
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
        - name: oghma-proxy
          image: registry.eachpath.local/nimmerverse/oghma-proxy:latest
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8100
              protocol: TCP
          env:
            - name: OPENROUTER_API_KEY
              valueFrom:
                secretKeyRef:
                  name: oghma-proxy-secrets
                  key: OPENROUTER_API_KEY
            - name: UPSTREAM_URL
              valueFrom:
                secretKeyRef:
                  name: oghma-proxy-secrets
                  key: UPSTREAM_URL
          volumeMounts:
            - name: config
              mountPath: /app/config.yaml
              subPath: config.yaml
              readOnly: true
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 10
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
      volumes:
        - name: config
          configMap:
            name: oghma-proxy-config
      # Prefer scheduling near inference workloads
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: inference
                topologyKey: kubernetes.io/hostname
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: oghma-proxy
  namespace: nimmersky
  labels:
    app.kubernetes.io/name: oghma-proxy

oghma-proxy/k8s/kustomization.yaml

@@ -0,0 +1,22 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: nimmersky

resources:
  - namespace.yaml
  - configmap.yaml
  - secret.yaml
  - deployment.yaml
  - service.yaml

commonLabels:
  app.kubernetes.io/managed-by: kustomize
  app.kubernetes.io/part-of: nimmerverse

images:
  - name: registry.eachpath.local/nimmerverse/oghma-proxy
    newTag: latest

# For production, use overlays:
#   kustomize build k8s/overlays/production | kubectl apply -f -

oghma-proxy/k8s/namespace.yaml

@@ -0,0 +1,7 @@
apiVersion: v1
kind: Namespace
metadata:
  name: nimmersky
  labels:
    app.kubernetes.io/part-of: nimmerverse
    app.kubernetes.io/component: gaming

oghma-proxy/k8s/secret.yaml

@@ -0,0 +1,33 @@
apiVersion: v1
kind: Secret
metadata:
  name: oghma-proxy-secrets
  namespace: nimmersky
  labels:
    app.kubernetes.io/name: oghma-proxy
    app.kubernetes.io/part-of: nimmerverse
type: Opaque
stringData:
  # Replace with actual values or use external secret management
  OPENROUTER_API_KEY: "your-openrouter-api-key-here"
  UPSTREAM_URL: "https://openrouter.ai/api/v1"
---
# Alternative: Use ExternalSecret with Vault/Vaultwarden
# apiVersion: external-secrets.io/v1beta1
# kind: ExternalSecret
# metadata:
#   name: oghma-proxy-secrets
#   namespace: nimmersky
# spec:
#   refreshInterval: 1h
#   secretStoreRef:
#     name: vault-backend
#     kind: ClusterSecretStore
#   target:
#     name: oghma-proxy-secrets
#   data:
#     - secretKey: OPENROUTER_API_KEY
#       remoteRef:
#         key: nimmerverse/skyrimnet
#         property: openrouter_api_key

oghma-proxy/k8s/service.yaml

@@ -0,0 +1,39 @@
apiVersion: v1
kind: Service
metadata:
  name: oghma-proxy
  namespace: nimmersky
  labels:
    app.kubernetes.io/name: oghma-proxy
    app.kubernetes.io/part-of: nimmerverse
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: oghma-proxy
  ports:
    - name: http
      port: 8100
      targetPort: http
      protocol: TCP
---
# Optional: Expose externally via LoadBalancer or NodePort
# for SkyrimNet running on a gaming PC outside the cluster
apiVersion: v1
kind: Service
metadata:
  name: oghma-proxy-external
  namespace: nimmersky
  labels:
    app.kubernetes.io/name: oghma-proxy
    app.kubernetes.io/part-of: nimmerverse
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: oghma-proxy
  ports:
    - name: http
      port: 8100
      targetPort: http
      nodePort: 30100  # Access via <node-ip>:30100
      protocol: TCP

oghma-proxy/pyproject.toml

@@ -0,0 +1,54 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "oghma-proxy"
version = "0.1.0"
description = "RAG Proxy for SkyrimNet - Injects Tamrielic lore into NPC conversations"
readme = "README.md"
requires-python = ">=3.11"
license = "MIT"
authors = [
{ name = "dafit", email = "dafit@eachpath.local" },
]
dependencies = [
"fastapi>=0.109.0",
"uvicorn[standard]>=0.27.0",
"httpx>=0.26.0",
"chromadb>=0.4.22",
"pydantic>=2.5.0",
"pydantic-settings>=2.1.0",
"pyyaml>=6.0.1",
"structlog>=24.1.0",
"psycopg[binary]>=3.1.0", # For phoebe logging
]
[project.optional-dependencies]
dev = [
"pytest>=7.4.0",
"pytest-asyncio>=0.23.0",
"httpx>=0.26.0",
"ruff>=0.1.0",
]
ingest = [
"pandas>=2.1.0",
"gspread>=5.12.0",
"sentence-transformers>=2.2.0",
]
[project.scripts]
oghma-proxy = "oghma_proxy.main:main"
oghma-ingest = "oghma_proxy.ingest:main"
[tool.ruff]
line-length = 100
target-version = "py311"
[tool.ruff.lint]
select = ["E", "F", "I", "N", "W", "UP"]
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]

oghma-proxy/src/oghma_proxy/__init__.py

@@ -0,0 +1,3 @@
"""Oghma RAG Proxy - Lore enrichment for SkyrimNet."""
__version__ = "0.1.0"


@@ -0,0 +1,147 @@
"""NPC Profile Extractor - Parses SkyrimNet prompts to extract NPC context."""
from __future__ import annotations

import re

import structlog

from .models import NPCProfile

logger = structlog.get_logger()


class NPCExtractor:
    """Extracts NPC profile information from SkyrimNet prompts."""

    # Regex patterns for extraction
    PATTERNS = {
        # Character bio header
        "bio_header": re.compile(
            r"## (?P<name>[\w\s'-]+) Bio\s*\n"
            r"- Gender: (?P<gender>\w+)\s*\n"
            r"- Race: (?P<race>[\w\s]+)",
            re.MULTILINE,
        ),
        # Alternative role description
        "role_intro": re.compile(
            r"You are (?P<name>[^,\n]+),?\s*(?:a |an )?(?P<descriptor>[^.\n]+)",
            re.IGNORECASE,
        ),
        # Faction membership
        "faction": re.compile(
            r"(?:member of|belongs to|joined|part of) (?:the )?(?P<faction>[\w\s]+?)(?:\.|,|\n|$)",
            re.IGNORECASE,
        ),
        # Location mentions
        "location": re.compile(
            r"(?:in|at|near|from) (?P<location>Whiterun|Windhelm|Solitude|Riften|"
            r"Markarth|Morthal|Dawnstar|Winterhold|Falkreath|Riverwood|Rorikstead|"
            r"Ivarstead|Solstheim|Raven Rock)",
            re.IGNORECASE,
        ),
        # Profession/occupation
        "occupation": re.compile(
            r"(?:works as|profession:|occupation:|is a|as a) (?P<profession>[\w\s]+?)(?:\.|,|\n|$)",
            re.IGNORECASE,
        ),
    }

    # Known professions for fuzzy matching
    KNOWN_PROFESSIONS = {
        "priest", "priestess", "mage", "wizard", "scholar", "blacksmith",
        "guard", "soldier", "warrior", "thief", "merchant", "innkeeper",
        "hunter", "farmer", "peasant", "noble", "jarl", "bard", "alchemist",
        "healer", "assassin", "spy", "courier", "carriage driver", "fisherman",
        "miller", "brewer", "smith", "armorer", "fletcher", "jeweler",
    }

    def extract(self, messages: list[dict]) -> NPCProfile:
        """Extract NPC profile from chat messages."""
        # Combine all message content for analysis
        full_text = "\n".join(
            msg.get("content", "") for msg in messages if msg.get("content")
        )

        profile = NPCProfile()

        # Try bio header first (most reliable)
        if match := self.PATTERNS["bio_header"].search(full_text):
            profile.name = match.group("name").strip()
            profile.gender = match.group("gender").strip()
            profile.race = match.group("race").strip()
            logger.debug("Extracted from bio header", name=profile.name, race=profile.race)
        # Fallback to role intro
        elif match := self.PATTERNS["role_intro"].search(full_text):
            profile.name = match.group("name").strip()
            descriptor = match.group("descriptor")
            # Try to parse race from descriptor
            profile.race = self._extract_race_from_descriptor(descriptor)
            logger.debug("Extracted from role intro", name=profile.name)

        # Extract location
        if match := self.PATTERNS["location"].search(full_text):
            profile.location = match.group("location").strip()

        # Extract factions
        for match in self.PATTERNS["faction"].finditer(full_text):
            faction = match.group("faction").strip()
            if faction and faction not in profile.factions:
                profile.factions.append(faction)

        # Extract profession
        if match := self.PATTERNS["occupation"].search(full_text):
            profession = match.group("profession").strip().lower()
            # Validate against known professions
            for known in self.KNOWN_PROFESSIONS:
                if known in profession:
                    profile.profession = known
                    break

        # Compute knowledge classes
        profile.compute_knowledge_classes()

        logger.info(
            "Extracted NPC profile",
            name=profile.name,
            race=profile.race,
            profession=profile.profession,
            factions=profile.factions,
            location=profile.location,
            knowledge_classes=profile.knowledge_classes,
            education_level=profile.education_level.value,
        )
        return profile

    def _extract_race_from_descriptor(self, descriptor: str) -> str:
        """Try to extract race from a descriptor string."""
        races = [
            "Nord", "Dunmer", "Dark Elf", "Altmer", "High Elf",
            "Bosmer", "Wood Elf", "Argonian", "Khajiit", "Breton",
            "Redguard", "Orsimer", "Orc", "Imperial",
        ]
        descriptor_lower = descriptor.lower()
        for race in races:
            if race.lower() in descriptor_lower:
                # Normalize to single-word form
                return race.replace(" ", "")
        return "Unknown"

    def extract_conversation_context(self, messages: list[dict]) -> str:
        """Extract the current conversation topic for RAG query."""
        # Collect the last few short user/assistant messages, newest first
        recent_content = []
        for msg in reversed(messages[-6:]):
            content = msg.get("content", "")
            if content and msg.get("role") in ("user", "assistant"):
                # Skip very long content (likely system prompts)
                if len(content) < 500:
                    recent_content.append(content)

        if not recent_content:
            return ""

        # Use the three newest messages, restored to chronological order
        # (recent_content is newest-first, so slice from the front)
        return " ".join(reversed(recent_content[:3]))

oghma-proxy/src/oghma_proxy/ingest.py

@@ -0,0 +1,444 @@
"""Oghma Data Ingestion - Loads CHIM's Oghma lore into ChromaDB."""
from __future__ import annotations

import argparse
import csv
import io
import re
import sys
import time
from typing import Iterator

import chromadb
import httpx
import structlog
from chromadb.config import Settings

logger = structlog.get_logger()

# Google Sheet ID for CHIM's Oghma Infinium
OGHMA_SHEET_ID = "1dcfctU-iOqprwy2BOc7___4Awteczgdlv8886KalPsQ"


def discover_sheet_gids(sheet_id: str) -> dict[str, str]:
    """
    Discover actual gids for all sheets by parsing the HTML page.

    Returns:
        Dict mapping sheet name to gid
    """
    url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/htmlview"
    with httpx.Client(follow_redirects=True, timeout=30.0) as client:
        response = client.get(url)
        response.raise_for_status()
        html = response.text

    sheet_gids = {}

    # Pattern 1: items.push({name: "Sheet Name", ...gid: "12345"...})
    # This is the format Google Sheets uses in the htmlview page
    for match in re.finditer(
        r'items\.push\(\{name:\s*"([^"]+)"[^}]*gid:\s*"(\d+)"',
        html,
    ):
        sheet_gids[match.group(1)] = match.group(2)

    # Pattern 2: Fall back to the sheet tabs JSON structure
    for match in re.finditer(
        r'\{name:\s*"([^"]+)"[^}]*gid:\s*"(\d+)"[^}]*\}',
        html,
    ):
        name = match.group(1)
        gid = match.group(2)
        if name not in sheet_gids:
            sheet_gids[name] = gid

    logger.info("Discovered sheets", count=len(sheet_gids), sheets=list(sheet_gids.keys())[:10])
    return sheet_gids


def fetch_sheet_csv(sheet_id: str, gid: str, sheet_name: str = "") -> str:
    """Fetch a Google Sheet as CSV."""
    url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv&gid={gid}"
    with httpx.Client(follow_redirects=True, timeout=60.0) as client:
        response = client.get(url)
        if response.status_code == 400:
            logger.warning("Sheet fetch failed with 400", sheet=sheet_name, gid=gid)
            raise httpx.HTTPStatusError(
                f"Failed to fetch sheet {sheet_name}",
                request=response.request,
                response=response,
            )
        response.raise_for_status()
        return response.text


def parse_csv(csv_text: str) -> Iterator[dict]:
    """Parse CSV text into rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    yield from reader


def categorize_sheet(sheet_name: str) -> str | None:
    """
    Determine the category for a sheet and whether to process it.

    Returns category name or None if the sheet should be skipped.
    """
    # Normalize escaped slashes from JSON
    normalized_name = sheet_name.replace("\\/", "/").replace("\\", "")

    # Sheets to process and their categories
    sheet_categories = {
        "Figures/Gods": "figures_gods",
        "Artifacts": "artifacts",
        "Armor and Weapons": "armor_weapons",
        "Items": "items",
        "Spells": "spells",
        "Creatures": "creatures",
        "Groups/Lore/Books": "groups_lore",
        "Dynamic Oghma": "dynamic",
        "Visual Descriptions": "visual",
    }

    # Check direct match
    if normalized_name in sheet_categories:
        return sheet_categories[normalized_name]

    # Check location sheets - handles both "Locations - Whiterun" and "Locations (Whiterun)"
    if normalized_name.startswith("Locations"):
        match = re.match(r"Locations\s*[\(\-]\s*([^\)]+)\)?", normalized_name)
        if match:
            hold = match.group(1).strip().lower().replace(" ", "_")
            return f"locations_{hold}"

    # Skip meta/reference sheets (and anything else unrecognized)
    skip_sheets = ["Project Oghma", "Knowledge Classes Reference", "Vanilla NPCS", "Template"]
    if any(skip in normalized_name for skip in skip_sheets):
        return None

    return None


def ingest_oghma(
    chromadb_host: str = "iris-dev.eachpath.local",
    chromadb_port: int = 35000,
    dry_run: bool = False,
) -> dict:
    """
    Ingest all Oghma sheets into ChromaDB.

    Returns:
        Statistics about ingestion
    """
    stats = {
        "sheets_processed": 0,
        "sheets_skipped": 0,
        "lore_entries": 0,
        "basic_entries": 0,
        "visual_entries": 0,
        "errors": [],
    }

    # Discover actual sheet gids
    logger.info("Discovering sheet gids...")
    try:
        sheet_gids = discover_sheet_gids(OGHMA_SHEET_ID)
    except Exception as e:
        logger.error("Failed to discover sheets", error=str(e))
        # Fallback to known gids (manually discovered)
        sheet_gids = {
            "Figures/Gods": "0",
            # Add more as we discover them
        }

    if not sheet_gids:
        logger.error("No sheets discovered!")
        return stats

    if not dry_run:
        client = chromadb.HttpClient(
            host=chromadb_host,
            port=chromadb_port,
            settings=Settings(anonymized_telemetry=False),
        )
        # Get or create collections
        collection_lore = client.get_or_create_collection(
            name="oghma_lore",
            metadata={"description": "Full Tamrielic lore for educated NPCs"},
        )
        collection_basic = client.get_or_create_collection(
            name="oghma_basic",
            metadata={"description": "Basic Tamrielic lore for commoners"},
        )
        collection_visual = client.get_or_create_collection(
            name="oghma_visual",
            metadata={"description": "Visual descriptions for Omnisight perception"},
        )
        logger.info("Connected to ChromaDB", host=chromadb_host, port=chromadb_port)
    else:
        logger.info("DRY RUN - not connecting to ChromaDB")
        collection_lore = None
        collection_basic = None
        collection_visual = None

    for sheet_name, gid in sheet_gids.items():
        category = categorize_sheet(sheet_name)
        if category is None:
            logger.debug("Skipping sheet", sheet=sheet_name)
            stats["sheets_skipped"] += 1
            continue

        logger.info("Processing sheet", sheet=sheet_name, gid=gid, category=category)

        # Rate limit to avoid Google blocking
        time.sleep(1.0)

        try:
            csv_text = fetch_sheet_csv(OGHMA_SHEET_ID, gid, sheet_name)
            rows = list(parse_csv(csv_text))
            if not rows:
                logger.warning("Empty sheet", sheet=sheet_name)
                continue

            # Check if this sheet has the expected columns
            sample_row = rows[0]
            is_visual_sheet = category == "visual"

            # Visual Descriptions has a different schema: baseid, name, description
            if is_visual_sheet:
                if "name" not in sample_row or "description" not in sample_row:
logger.warning("Visual sheet missing expected columns", sheet=sheet_name, columns=list(sample_row.keys())[:5])
continue
elif "topic" not in sample_row:
logger.warning("Sheet missing 'topic' column", sheet=sheet_name, columns=list(sample_row.keys())[:5])
continue
lore_docs = []
lore_ids = []
lore_metadatas = []
basic_docs = []
basic_ids = []
basic_metadatas = []
# Visual descriptions - universal perception (Omnisight)
visual_docs = []
visual_ids = []
visual_metadatas = []
# Track seen IDs to handle duplicates
seen_ids = set()
duplicates_skipped = 0
for row in rows:
# Handle visual descriptions separately
if is_visual_sheet:
name = row.get("name", "").strip()
description = row.get("description", "").strip()
baseid = row.get("baseid", "").strip()
# Clean up Excel scientific notation for zero (0.00E+00)
if baseid and ("E+" in baseid or "E-" in baseid):
try:
if float(baseid) == 0:
baseid = ""
except ValueError:
pass
if name and description:
# Use baseid:name for uniqueness, fall back to name only
doc_id = f"visual:{baseid}:{name}" if baseid else f"visual:{name}"
if doc_id in seen_ids:
duplicates_skipped += 1
continue
seen_ids.add(doc_id)
visual_docs.append(description)
visual_ids.append(doc_id)
visual_metadatas.append({
"name": name,
"baseid": baseid,
"category": "visual",
"sheet": sheet_name,
})
continue
topic = row.get("topic", "").strip()
if not topic:
continue
# Full lore entry
topic_desc = row.get("topic_desc", "").strip()
if topic_desc:
lore_id = f"{category}:{topic}"
if lore_id in seen_ids:
duplicates_skipped += 1
else:
seen_ids.add(lore_id)
knowledge_classes = row.get("knowledge_class", "").strip()
lore_docs.append(topic_desc)
lore_ids.append(lore_id)
lore_metadatas.append({
"topic": topic,
"category": category,
"sheet": sheet_name,
"knowledge_classes": knowledge_classes,
"tags": row.get("tags", "").strip(),
})
# Basic lore entry
topic_desc_basic = row.get("topic_desc_basic", "").strip()
if topic_desc_basic:
basic_id = f"{category}:{topic}:basic"
if basic_id in seen_ids:
duplicates_skipped += 1
else:
seen_ids.add(basic_id)
knowledge_classes_basic = row.get("knowledge_class_basic", "").strip()
basic_docs.append(topic_desc_basic)
basic_ids.append(basic_id)
basic_metadatas.append({
"topic": topic,
"category": category,
"sheet": sheet_name,
"knowledge_classes": knowledge_classes_basic,
"tags": row.get("tags", "").strip(),
})
# Batch insert to ChromaDB
if not dry_run:
if lore_docs:
collection_lore.upsert(
documents=lore_docs,
ids=lore_ids,
metadatas=lore_metadatas,
)
if basic_docs:
collection_basic.upsert(
documents=basic_docs,
ids=basic_ids,
metadatas=basic_metadatas,
)
if visual_docs:
collection_visual.upsert(
documents=visual_docs,
ids=visual_ids,
metadatas=visual_metadatas,
)
stats["sheets_processed"] += 1
stats["lore_entries"] += len(lore_docs)
stats["basic_entries"] += len(basic_docs)
stats["visual_entries"] += len(visual_docs)
if duplicates_skipped > 0:
logger.debug("Duplicates skipped", sheet=sheet_name, count=duplicates_skipped)
logger.info(
"Sheet processed",
sheet=sheet_name,
rows=len(rows),
lore_entries=len(lore_docs),
basic_entries=len(basic_docs),
visual_entries=len(visual_docs),
)
except httpx.HTTPStatusError as e:
logger.error("HTTP error fetching sheet", sheet=sheet_name, status=e.response.status_code)
stats["errors"].append({"sheet": sheet_name, "error": f"HTTP {e.response.status_code}"})
except Exception as e:
logger.error("Failed to process sheet", sheet=sheet_name, error=str(e))
stats["errors"].append({"sheet": sheet_name, "error": str(e)})
logger.info(
"Ingestion complete",
sheets_processed=stats["sheets_processed"],
sheets_skipped=stats["sheets_skipped"],
lore_entries=stats["lore_entries"],
basic_entries=stats["basic_entries"],
visual_entries=stats["visual_entries"],
errors=len(stats["errors"]),
)
return stats
def main():
"""CLI entry point."""
parser = argparse.ArgumentParser(
description="Ingest CHIM's Oghma Infinium lore into ChromaDB"
)
parser.add_argument(
"--host",
default="iris-dev.eachpath.local",
help="ChromaDB host",
)
parser.add_argument(
"--port",
type=int,
default=35000,
help="ChromaDB port",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Fetch and parse sheets without writing to ChromaDB",
)
args = parser.parse_args()
# Configure logging
structlog.configure(
processors=[
structlog.stdlib.add_log_level,
structlog.processors.TimeStamper(fmt="iso"),
structlog.dev.ConsoleRenderer(),
],
)
try:
stats = ingest_oghma(
chromadb_host=args.host,
chromadb_port=args.port,
dry_run=args.dry_run,
)
if stats["errors"]:
logger.warning("Ingestion completed with errors", errors=stats["errors"])
# Don't exit 1 if we processed some sheets successfully
if stats["sheets_processed"] == 0:
sys.exit(1)
except Exception as e:
logger.error("Ingestion failed", error=str(e))
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,153 @@
"""Lore Injector - Injects retrieved lore into SkyrimNet prompts."""
from __future__ import annotations
import structlog
from .models import InjectionResult, LoreEntry, NPCProfile
logger = structlog.get_logger()
class LoreInjector:
"""Injects Oghma lore into SkyrimNet chat messages."""
DEFAULT_TEMPLATE = """
## Relevant Lore Knowledge
Based on your background as a {race} {profession} in {location}, you would know:
{lore_items}
Note: Reference this knowledge naturally when relevant to the conversation. Do not recite it.
"""
def __init__(self, template: str | None = None, position: str = "after_bio"):
"""
Initialize injector.
Args:
template: str.format-style template for the injection block
position: Where to inject - 'after_bio', 'before_conversation', 'system_suffix'
"""
self.template = template or self.DEFAULT_TEMPLATE
self.position = position
def inject(
self,
messages: list[dict],
npc_profile: NPCProfile,
lore_entries: list[LoreEntry],
query_time_ms: float,
) -> tuple[list[dict], InjectionResult]:
"""
Inject lore into chat messages.
Args:
messages: Original chat messages
npc_profile: Extracted NPC profile
lore_entries: Retrieved lore entries
query_time_ms: Time taken for retrieval
Returns:
Tuple of (modified messages, injection result)
"""
if not lore_entries:
return messages, InjectionResult(
npc_profile=npc_profile,
lore_entries=[],
injection_text="",
query_time_ms=query_time_ms,
)
# Build injection text
injection_text = self._build_injection_text(npc_profile, lore_entries)
# Clone messages to avoid modifying original
modified_messages = [dict(msg) for msg in messages]
# Find injection point
injected = False
for i, msg in enumerate(modified_messages):
if msg.get("role") == "system":
content = msg.get("content", "")
if self.position == "after_bio":
# Inject after character bio section
bio_markers = ["## Background", "## Personality", "## Speech Style"]
for marker in bio_markers:
if marker in content:
# Insert before this section
idx = content.index(marker)
modified_messages[i]["content"] = (
content[:idx] + injection_text + "\n\n" + content[idx:]
)
injected = True
break
elif self.position == "system_suffix":
# Append to end of system message
modified_messages[i]["content"] = content + "\n\n" + injection_text
injected = True
if injected:
break
# 'before_conversation' position: prepend the context block to the first user message
if not injected and self.position == "before_conversation":
for i, msg in enumerate(modified_messages):
if msg.get("role") == "user":
content = msg.get("content", "")
modified_messages[i]["content"] = (
f"[Context for the NPC you're speaking with]\n{injection_text}\n\n"
f"[Player speaks]\n{content}"
)
injected = True
break
if injected:
logger.info(
"Injected lore",
npc_name=npc_profile.name,
entries_count=len(lore_entries),
position=self.position,
)
else:
logger.warning("Could not find injection point", position=self.position)
result = InjectionResult(
npc_profile=npc_profile,
lore_entries=lore_entries,
injection_text=injection_text if injected else "",
query_time_ms=query_time_ms,
)
return modified_messages, result
def _build_injection_text(
self,
npc_profile: NPCProfile,
lore_entries: list[LoreEntry],
) -> str:
"""Build the injection text block."""
# Build lore items list
lore_items = []
for entry in lore_entries:
# Truncate very long entries
content = entry.content
if len(content) > 300:
content = content[:297] + "..."
lore_items.append(f"- **{entry.topic}**: {content}")
lore_items_text = "\n".join(lore_items)
# Fill template
injection_text = self.template.format(
race=npc_profile.race or "person",
profession=npc_profile.profession or "citizen",
location=npc_profile.location or "Skyrim",
lore_items=lore_items_text,
name=npc_profile.name,
)
return injection_text.strip()
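The template fill in `_build_injection_text` is plain `str.format` plus a 300-character truncation. A self-contained sketch — the profile values and lore entry are invented, and the template is shortened from `DEFAULT_TEMPLATE`:

```python
TEMPLATE = """## Relevant Lore Knowledge
Based on your background as a {race} {profession} in {location}, you would know:
{lore_items}"""

def build_injection(race, profession, location, entries, limit=300):
    """entries: list of (topic, content); long content is truncated like the injector does."""
    items = []
    for topic, content in entries:
        if len(content) > limit:
            content = content[: limit - 3] + "..."
        items.append(f"- **{topic}**: {content}")
    return TEMPLATE.format(
        race=race or "person",
        profession=profession or "citizen",
        location=location or "Skyrim",
        lore_items="\n".join(items),
    ).strip()

text = build_injection("Nord", None, "Whiterun",
                       [("Ysgramor", "Legendary Atmoran hero. " * 40)])
print(text)
```

Note that `str.format` ignores extra keyword arguments, which is why the injector can always pass `name=` even when a custom template does not use it.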

View File

@@ -0,0 +1,291 @@
"""Oghma RAG Proxy - Main FastAPI Application."""
from __future__ import annotations
import os
import time
from contextlib import asynccontextmanager
from typing import Any
import httpx
import structlog
import yaml
from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import StreamingResponse
from .extractor import NPCExtractor
from .injector import LoreInjector
from .models import ChatCompletionRequest
from .retriever import OghmaRetriever
# Configure structured logging
structlog.configure(
processors=[
structlog.stdlib.filter_by_level,
structlog.stdlib.add_logger_name,
structlog.stdlib.add_log_level,
structlog.processors.TimeStamper(fmt="iso"),
structlog.processors.JSONRenderer(),
],
wrapper_class=structlog.stdlib.BoundLogger,
context_class=dict,
logger_factory=structlog.stdlib.LoggerFactory(),
)
logger = structlog.get_logger()
def load_config(config_path: str = "config.yaml") -> dict:
"""Load configuration from YAML file."""
# Try local config first, then default
for path in ["config.local.yaml", config_path]:
if os.path.exists(path):
with open(path) as f:
config = yaml.safe_load(f)
logger.info("Loaded config", path=path)
return config
return {}
# Global instances
config = load_config()
extractor = NPCExtractor()
retriever: OghmaRetriever | None = None
injector: LoreInjector | None = None
http_client: httpx.AsyncClient | None = None
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Application lifespan - setup and teardown."""
global retriever, injector, http_client
# Initialize components
chroma_config = config.get("chromadb", {})
retriever = OghmaRetriever(
host=chroma_config.get("host", "iris-dev.eachpath.local"),
port=chroma_config.get("port", 35000),
collection_lore=chroma_config.get("collection_lore", "oghma_lore"),
collection_basic=chroma_config.get("collection_basic", "oghma_basic"),
max_results=config.get("retrieval", {}).get("max_results", 5),
min_score=config.get("retrieval", {}).get("min_score", 0.55),
)
injection_config = config.get("injection", {})
injector = LoreInjector(
template=injection_config.get("template"),
position=injection_config.get("position", "after_bio"),
)
upstream_config = config.get("upstream", {})
http_client = httpx.AsyncClient(
timeout=httpx.Timeout(
connect=10.0,
read=upstream_config.get("timeout", 120.0),
write=30.0,
pool=10.0,
),
limits=httpx.Limits(max_connections=100, max_keepalive_connections=20),
)
logger.info(
"Oghma RAG Proxy started",
upstream_url=upstream_config.get("url", ""),
chromadb_host=chroma_config.get("host"),
)
yield
# Cleanup
if http_client:
await http_client.aclose()
logger.info("Oghma RAG Proxy stopped")
app = FastAPI(
title="Oghma RAG Proxy",
description="RAG Proxy for SkyrimNet - Injects Tamrielic lore into NPC conversations",
version="0.1.0",
lifespan=lifespan,
)
@app.get("/health")
async def health_check():
"""Health check endpoint."""
chromadb_healthy = retriever.health_check() if retriever else False
return {
"status": "healthy" if chromadb_healthy else "degraded",
"components": {
"proxy": "healthy",
"chromadb": "healthy" if chromadb_healthy else "unhealthy",
},
}
# Debug: track recent RAG operations
_recent_rag_ops = []
@app.get("/stats")
async def get_stats():
"""Get proxy statistics."""
return {
"version": "0.1.0",
"injection_enabled": config.get("injection", {}).get("enabled", True),
"upstream_url": config.get("upstream", {}).get("url", ""),
}
@app.get("/debug/rag")
async def debug_rag():
"""Debug endpoint to see recent RAG operations."""
return {"recent_operations": _recent_rag_ops[-20:]}
@app.post("/v1/chat/completions")
async def chat_completions(request: Request):
"""
Proxy chat completions with RAG enrichment.
This endpoint intercepts OpenRouter-compatible requests,
enriches them with relevant Tamrielic lore, and forwards
to the upstream LLM.
"""
start_time = time.perf_counter()
# Parse request body
body = await request.json()
messages = body.get("messages", [])
stream = body.get("stream", False)
# Extract NPC profile from messages
npc_profile = extractor.extract(messages)
# Get conversation context for RAG query
context = extractor.extract_conversation_context(messages)
# Retrieve relevant lore
lore_entries = []
query_time_ms = 0.0
if (
retriever
and injector
and config.get("injection", {}).get("enabled", True)
and context
):
lore_entries, query_time_ms = retriever.retrieve(context, npc_profile)
# Inject lore into messages
if lore_entries:
messages, injection_result = injector.inject(
messages,
npc_profile,
lore_entries,
query_time_ms,
)
body["messages"] = messages
# Track for debug endpoint
_recent_rag_ops.append({
"npc": npc_profile.name,
"race": npc_profile.race,
"query": context[:100] if context else "",
"lore_found": len(lore_entries),
"topics": [e.topic for e in lore_entries[:3]],
"time_ms": round(query_time_ms, 2),
})
if len(_recent_rag_ops) > 50:
_recent_rag_ops.pop(0)
# Forward to upstream
upstream_config = config.get("upstream", {})
upstream_url = upstream_config.get("url", "https://openrouter.ai/api/v1")
api_key = upstream_config.get("api_key", os.environ.get("OPENROUTER_API_KEY", ""))
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
}
# Copy relevant headers from original request
for header in ["HTTP-Referer", "X-Title"]:
if value := request.headers.get(header):
headers[header] = value
try:
if stream:
# Streaming response
return StreamingResponse(
stream_upstream(f"{upstream_url}/chat/completions", headers, body),
media_type="text/event-stream",
)
else:
# Regular response
response = await http_client.post(
f"{upstream_url}/chat/completions",
json=body,
headers=headers,
)
response.raise_for_status()
total_time = (time.perf_counter() - start_time) * 1000
logger.info(
"Request completed",
npc_name=npc_profile.name,
lore_entries=len(lore_entries),
rag_time_ms=round(query_time_ms, 2),
total_time_ms=round(total_time, 2),
)
return response.json()
except httpx.HTTPError as e:
logger.error("Upstream request failed", error=str(e))
raise HTTPException(status_code=502, detail=f"Upstream error: {e}")
async def stream_upstream(url: str, headers: dict, body: dict):
"""Stream response from upstream."""
async with http_client.stream("POST", url, json=body, headers=headers) as response:
async for chunk in response.aiter_bytes():
yield chunk
@app.post("/v1/completions")
async def completions(request: Request):
"""Legacy completions endpoint - passthrough."""
body = await request.json()
upstream_config = config.get("upstream", {})
upstream_url = upstream_config.get("url", "https://openrouter.ai/api/v1")
api_key = upstream_config.get("api_key", os.environ.get("OPENROUTER_API_KEY", ""))
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
}
try:
response = await http_client.post(
f"{upstream_url}/completions",
json=body,
headers=headers,
)
response.raise_for_status()
return response.json()
except httpx.HTTPError as e:
logger.error("Upstream request failed", error=str(e))
raise HTTPException(status_code=502, detail=f"Upstream error: {e}")
def main():
"""Run the proxy server."""
import uvicorn
proxy_config = config.get("proxy", {})
uvicorn.run(
"oghma_proxy.main:app",
host=proxy_config.get("host", "0.0.0.0"),
port=proxy_config.get("port", 8100),
workers=proxy_config.get("workers", 1),
log_level="info",
)
if __name__ == "__main__":
main()
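`load_config` returns the first existing file whole — `config.local.yaml` shadows `config.yaml` entirely, with no key-level merging. A sketch of that precedence rule, using JSON and a temp directory so it stays stdlib-only (the filenames mirror the proxy's; the shadowing behavior is the point):

```python
import json
import os
import tempfile

def load_first(paths: list[str]) -> dict:
    """Return the parsed contents of the first file that exists; {} when none do."""
    for path in paths:
        if os.path.exists(path):
            with open(path) as f:
                return json.load(f)
    return {}

with tempfile.TemporaryDirectory() as d:
    default = os.path.join(d, "config.json")
    local = os.path.join(d, "config.local.json")
    with open(default, "w") as f:
        json.dump({"port": 8100}, f)
    print(load_first([local, default]))  # defaults win while no local file exists
    with open(local, "w") as f:
        json.dump({"port": 9000}, f)
    print(load_first([local, default]))  # local file now shadows the default entirely
```

Because there is no merge, a `config.local.yaml` must carry every key it needs, not just the overrides.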

View File

@@ -0,0 +1,169 @@
"""Data models for Oghma RAG Proxy."""
from __future__ import annotations
from enum import Enum
from typing import Any
from pydantic import BaseModel, Field
class EducationLevel(str, Enum):
"""NPC education level determines lore depth."""
SCHOLAR = "scholar" # Full lore access
COMMONER = "commoner" # Basic summaries only
class NPCProfile(BaseModel):
"""Extracted NPC profile from SkyrimNet prompts."""
name: str = "Unknown"
race: str = "Unknown"
gender: str = "Unknown"
profession: str | None = None
factions: list[str] = Field(default_factory=list)
location: str | None = None
traits: list[str] = Field(default_factory=list)
# Computed
knowledge_classes: list[str] = Field(default_factory=list)
education_level: EducationLevel = EducationLevel.COMMONER
def compute_knowledge_classes(self) -> None:
"""Compute knowledge classes from profile attributes."""
classes = set()
# Race-based knowledge
race_map = {
"nord": ["nord"],
"dunmer": ["darkelf", "dunmer"],
"altmer": ["highelf", "altmer"],
"bosmer": ["woodelf", "bosmer"],
"argonian": ["argonian"],
"khajiit": ["khajiit"],
"breton": ["breton"],
"redguard": ["redguard"],
"orsimer": ["orc", "orsimer"],
"orc": ["orc", "orsimer"],
"imperial": ["imperial"],
}
race_lower = self.race.lower()
if race_lower in race_map:
classes.update(race_map[race_lower])
# Profession-based knowledge
profession_map = {
"priest": ["priest"],
"mage": ["mage", "scholar"],
"wizard": ["mage", "scholar"],
"scholar": ["scholar"],
"blacksmith": ["blacksmith"],
"guard": ["guard", "warrior"],
"soldier": ["warrior", "guard"],
"warrior": ["warrior"],
"thief": ["thief"],
"merchant": ["merchant"],
"innkeeper": ["innkeeper"],
"hunter": ["hunter"],
"farmer": ["peasant"],
"peasant": ["peasant"],
"noble": ["noble"],
"jarl": ["noble"],
"bard": ["bard"],
"alchemist": ["alchemist"],
}
if self.profession:
prof_lower = self.profession.lower()
if prof_lower in profession_map:
classes.update(profession_map[prof_lower])
# Location-based knowledge
location_map = {
"whiterun": ["whiterun"],
"windhelm": ["eastmarch"],
"solitude": ["haafingar"],
"riften": ["rift"],
"markarth": ["reach"],
"morthal": ["hjaalmarch"],
"dawnstar": ["pale"],
"winterhold": ["winterhold"],
"falkreath": ["falkreath"],
"solstheim": ["solstheim"],
}
if self.location:
loc_lower = self.location.lower()
if loc_lower in location_map:
classes.update(location_map[loc_lower])
# Faction-based knowledge
faction_map = {
"companions": ["companions"],
"college of winterhold": ["college", "mage"],
"college": ["college", "mage"],
"thieves guild": ["thieves"],
"dark brotherhood": ["darkbrotherhood"],
"stormcloaks": ["stormcloak"],
"stormcloak": ["stormcloak"],
"imperial legion": ["imperial"],
"legion": ["imperial"],
"thalmor": ["thalmor"],
"dawnguard": ["dawnguard"],
"volkihar": ["vampire", "volkihar"],
}
for faction in self.factions:
faction_lower = faction.lower()
if faction_lower in faction_map:
classes.update(faction_map[faction_lower])
self.knowledge_classes = list(classes)
# Determine education level
educated_professions = {"mage", "wizard", "scholar", "priest", "noble", "bard"}
educated_factions = {"college of winterhold", "thalmor", "college"}
if self.profession and self.profession.lower() in educated_professions:
self.education_level = EducationLevel.SCHOLAR
elif any(f.lower() in educated_factions for f in self.factions):
self.education_level = EducationLevel.SCHOLAR
else:
self.education_level = EducationLevel.COMMONER
class LoreEntry(BaseModel):
"""A retrieved lore entry from Oghma."""
topic: str
content: str
category: str
score: float
knowledge_classes: list[str] = Field(default_factory=list)
class ChatMessage(BaseModel):
"""OpenRouter-compatible chat message."""
role: str
content: str
name: str | None = None
class ChatCompletionRequest(BaseModel):
"""OpenRouter-compatible chat completion request."""
model: str
messages: list[ChatMessage]
temperature: float | None = None
max_tokens: int | None = None
stream: bool = False
# Allow additional fields to pass through
model_config = {"extra": "allow"}
class InjectionResult(BaseModel):
"""Result of lore injection."""
npc_profile: NPCProfile
lore_entries: list[LoreEntry]
injection_text: str
query_time_ms: float

View File

@@ -0,0 +1,173 @@
"""Oghma Lore Retriever - Queries ChromaDB for relevant Tamrielic lore."""
from __future__ import annotations
import time
from functools import lru_cache
from typing import TYPE_CHECKING
import chromadb
import structlog
from chromadb.config import Settings
from .models import EducationLevel, LoreEntry, NPCProfile
if TYPE_CHECKING:
from chromadb import Collection
logger = structlog.get_logger()
class OghmaRetriever:
"""Retrieves relevant lore from Oghma ChromaDB collections."""
def __init__(
self,
host: str = "iris-dev.eachpath.local",
port: int = 35000,
collection_lore: str = "oghma_lore",
collection_basic: str = "oghma_basic",
max_results: int = 5,
min_score: float = 0.55,
):
self.host = host
self.port = port
self.collection_lore_name = collection_lore
self.collection_basic_name = collection_basic
self.max_results = max_results
self.min_score = min_score
self._client: chromadb.HttpClient | None = None
self._collection_lore: Collection | None = None
self._collection_basic: Collection | None = None
def _get_client(self) -> chromadb.HttpClient:
"""Get or create ChromaDB client."""
if self._client is None:
self._client = chromadb.HttpClient(
host=self.host,
port=self.port,
settings=Settings(anonymized_telemetry=False),
)
logger.info("Connected to ChromaDB", host=self.host, port=self.port)
return self._client
def _get_collection(self, education_level: EducationLevel) -> Collection:
"""Get the appropriate collection based on education level."""
client = self._get_client()
if education_level == EducationLevel.SCHOLAR:
if self._collection_lore is None:
self._collection_lore = client.get_collection(self.collection_lore_name)
return self._collection_lore
else:
if self._collection_basic is None:
self._collection_basic = client.get_collection(self.collection_basic_name)
return self._collection_basic
def retrieve(
self,
query: str,
npc_profile: NPCProfile,
) -> tuple[list[LoreEntry], float]:
"""
Retrieve relevant lore entries for an NPC.
Args:
query: Conversation context to search for
npc_profile: NPC profile for knowledge filtering
Returns:
Tuple of (lore entries, query time in ms)
"""
if not query.strip():
return [], 0.0
start_time = time.perf_counter()
try:
collection = self._get_collection(npc_profile.education_level)
# Build metadata filter for knowledge classes
# NOTE: Currently disabled because CHIM's Oghma data doesn't have
# knowledge_class populated consistently. Enable when data is enriched.
where_filter = None
# TODO: Re-enable when knowledge_class data is available
# if npc_profile.knowledge_classes:
# if len(npc_profile.knowledge_classes) == 1:
# where_filter = {"knowledge_classes": {"$contains": npc_profile.knowledge_classes[0]}}
# else:
# where_filter = {
# "$or": [
# {"knowledge_classes": {"$contains": kc}}
# for kc in npc_profile.knowledge_classes
# ]
# }
# Query ChromaDB
results = collection.query(
query_texts=[query],
n_results=self.max_results,
where=where_filter,
include=["documents", "metadatas", "distances"],
)
# Parse results
entries = []
if results and results["documents"] and results["documents"][0]:
for i, doc in enumerate(results["documents"][0]):
metadata = results["metadatas"][0][i] if results["metadatas"] else {}
distance = results["distances"][0][i] if results["distances"] else 1.0
# Convert distance to similarity score (ChromaDB uses L2 distance)
# Lower distance = higher similarity
score = 1.0 / (1.0 + distance)
if score >= self.min_score:
entries.append(
LoreEntry(
topic=metadata.get("topic", "Unknown"),
content=doc,
category=metadata.get("category", "Unknown"),
score=score,
knowledge_classes=[kc.strip() for kc in metadata.get("knowledge_classes", "").split(",") if kc.strip()],
)
)
query_time = (time.perf_counter() - start_time) * 1000
logger.info(
"Retrieved lore entries",
query_preview=query[:100],
npc_name=npc_profile.name,
education=npc_profile.education_level.value,
entries_found=len(entries),
query_time_ms=round(query_time, 2),
)
return entries, query_time
except Exception as e:
logger.error("Failed to retrieve lore", error=str(e))
query_time = (time.perf_counter() - start_time) * 1000
return [], query_time
def health_check(self) -> bool:
"""Check if ChromaDB is reachable."""
try:
client = self._get_client()
client.heartbeat()
return True
except Exception as e:
logger.error("ChromaDB health check failed", error=str(e))
return False
# Cached retriever instance
@lru_cache(maxsize=1)
def get_retriever(
host: str = "iris-dev.eachpath.local",
port: int = 35000,
) -> OghmaRetriever:
"""Get cached retriever instance."""
return OghmaRetriever(host=host, port=port)
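The retriever converts ChromaDB distances to a similarity score with `1 / (1 + d)` and drops anything below `min_score` (0.55 by default, i.e. distances above roughly 0.82). A standalone sketch of that filter — the documents and distances are invented:

```python
def filter_by_score(docs_with_distance, min_score=0.55):
    """Keep (doc, score) pairs whose converted similarity clears the threshold."""
    kept = []
    for doc, distance in docs_with_distance:
        score = 1.0 / (1.0 + distance)  # lower distance -> higher score
        if score >= min_score:
            kept.append((doc, round(score, 3)))
    return kept

results = [
    ("Ysgramor led the Five Hundred Companions.", 0.4),  # score ~0.714, kept
    ("Unrelated entry.", 1.2),                           # score ~0.455, dropped
]
print(filter_by_score(results))
```

This mapping is monotonic for any distance metric, but the 0.55 threshold was presumably tuned against the default L2 space; changing `hnsw:space` on the collections would shift the distance range and require re-tuning `min_score`.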

7
oghma-proxy/usage.txt Normal file
View File

@@ -0,0 +1,7 @@
cd /home/dafit/nimmerverse/nimmersky/oghma-proxy
python -m oghma_proxy.main
Endpoints:
- http://localhost:8100/health - Health check
- http://localhost:8100/debug/rag - See recent RAG operations
- http://localhost:8100/v1/chat/completions - The proxy endpoint (point SkyrimNet here)