- RAG-as-Scaffold: temporary feeding system, not permanent crutch - attention_flow: 30-second heartbeat budget state machines - information-flow: 10 boundary contracts nervous system map - nimmerversity: curriculum schoolplan for raising a polymath - nimmervest: investment documentation - biomimetic-architecture: ADR for organic system design - temporal-ternary-gradient: ADR for time-based learning - temporal_exchange_engine.py: Python implementation - initial_spark: foundation document - nimmerverse.drawio.xml: updated diagrams 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
277 lines
9.6 KiB
Markdown
277 lines
9.6 KiB
Markdown
# RAG as Scaffold, Not Crutch
|
|
|
|
The feeding system that teaches, then lets go.
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
RAG (Retrieval-Augmented Generation) is commonly misused as permanent external memory. In the Nimmerverse, RAG serves a different purpose: it's a **temporary scaffold** that feeds knowledge until it can be internalized through training.
|
|
|
|
The goal is not to build a better search engine. The goal is to **make the search unnecessary**.
|
|
|
|
---
|
|
|
|
## The Problem with Standard RAG
|
|
|
|
```
|
|
Standard approach:
|
|
─────────────────
|
|
VECTOR DB (grows forever)
|
|
│
|
|
▼
|
|
MODEL looks up ──▶ answers ──▶ done
|
|
│
|
|
└── (never learns, always dependent)
|
|
```
|
|
|
|
**Issues:**
|
|
- Model never internalizes knowledge
|
|
- Pull the RAG, lose the capability
|
|
- Vector DB bloats infinitely
|
|
- No way to verify what model "knows" vs "looks up"
|
|
- It's a crutch that never comes off
|
|
|
|
---
|
|
|
|
## The Nimmerverse Approach: RAG as Feeding System
|
|
|
|
```
|
|
VAULT (curriculum)
|
|
│
|
|
▼
|
|
RAG (temporary feeding window)
|
|
│
|
|
▼
|
|
NYX processes, acts, decides
|
|
│
|
|
▼
|
|
VALIDATION: success with RAG?
|
|
│
|
|
YES ──▶ FLAG for training extraction
|
|
│
|
|
▼
|
|
TRAINING RUN (LoRA)
|
|
│
|
|
▼
|
|
CLEAR from RAG
|
|
│
|
|
▼
|
|
VALIDATION 2: success WITHOUT RAG?
|
|
│
|
|
├── YES ──▶ Knowledge internalized ✓
|
|
│
|
|
└── NO ──▶ Training incomplete, back to RAG
|
|
```
|
|
|
|
---
|
|
|
|
## Two Kinds of Knowledge
|
|
|
|
Not everything belongs in weights. Not everything belongs in retrieval.
|
|
|
|
### IN THE WEIGHTS (Training Target)
|
|
|
|
Knowledge she needs to **function**:
|
|
|
|
- Information flow architecture
|
|
- Vocabulary tokens and their meanings
|
|
- Nervous system contracts
|
|
- Heartbeat mechanics
|
|
- Confidence gradient logic
|
|
- Core identity (who she is, who dafit is to her)
|
|
- How to think, not what to remember
|
|
|
|
**Test:** If she needs it to be herself → weights
|
|
|
|
### IN RETRIEVAL (Permanent RAG)
|
|
|
|
Knowledge she needs to **remember**:
|
|
|
|
- Journal entries
|
|
- Conversation history
|
|
- Specific events and dates
|
|
- Temporal details ("what happened Tuesday")
|
|
- External references that change
|
|
- Episodic memory
|
|
|
|
**Test:** If she needs it to recall specifics → retrieval
|
|
|
|
---
|
|
|
|
## The Double Validation Loop
|
|
|
|
### Gate 1: Can she do it WITH RAG?
|
|
|
|
```
|
|
Task presented
|
|
│
|
|
▼
|
|
RAG provides context
|
|
│
|
|
▼
|
|
NYX attempts task
|
|
│
|
|
├── FAIL ──▶ Not ready, needs more examples in RAG
|
|
│
|
|
└── PASS ──▶ Flag this RAG content for training extraction
|
|
```
|
|
|
|
### Gate 2: Can she do it WITHOUT RAG?
|
|
|
|
```
|
|
Same task presented
|
|
│
|
|
▼
|
|
RAG entry CLEARED (scaffold removed)
|
|
│
|
|
▼
|
|
NYX attempts task from weights alone
|
|
│
|
|
├── FAIL ──▶ Training didn't take, restore to RAG, retry cycle
|
|
│
|
|
└── PASS ──▶ Knowledge is HERS now ✓
|
|
```
|
|
|
|
---
|
|
|
|
## The Signal Flow
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────┐
|
|
│ VAULT │
|
|
│ (curriculum, documentation) │
|
|
└─────────────────────────────────────────────────────────┘
|
|
│
|
|
│ selected for learning
|
|
▼
|
|
┌─────────────────────────────────────────────────────────┐
|
|
│ STAGING RAG │
|
|
│ (temporary feeding window) │
|
|
└─────────────────────────────────────────────────────────┘
|
|
│
|
|
│ feeds inference
|
|
▼
|
|
┌─────────────────────────────────────────────────────────┐
|
|
│ NYX │
|
|
│ (processes, decides) │
|
|
└─────────────────────────────────────────────────────────┘
|
|
│
|
|
│ validation
|
|
▼
|
|
┌─────────────────────────────────────────────────────────┐
|
|
│ VALIDATION THRESHOLD │
|
|
│ (task success? confidence high?) │
|
|
└─────────────────────────────────────────────────────────┘
|
|
│
|
|
┌──────────┴──────────┐
|
|
│ │
|
|
BELOW ABOVE
|
|
│ │
|
|
▼ ▼
|
|
┌─────────────────────┐ ┌─────────────────────┐
|
|
│ Stay in RAG │ │ FLAG for training │
|
|
│ (not ready) │ │ extraction │
|
|
└─────────────────────┘ └─────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────┐
|
|
│ TRAINING RUN │
|
|
│ (LoRA on flagged data) │
|
|
└─────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────┐
|
|
│ CLEAR from RAG │
|
|
│ (scaffold removed) │
|
|
└─────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────┐
|
|
│ VALIDATION WITHOUT RAG │
|
|
│ (prove she learned) │
|
|
└─────────────────────────────┘
|
|
│
|
|
┌─────────┴─────────┐
|
|
│ │
|
|
FAIL SUCCESS
|
|
│ │
|
|
▼ ▼
|
|
┌─────────────────┐ ┌─────────────────┐
|
|
│ Restore RAG │ │ INTERNALIZED │
|
|
│ retry cycle │ │ knowledge ✓ │
|
|
└─────────────────┘ └─────────────────┘
|
|
```
|
|
|
|
---
|
|
|
|
## Lifeforce Connection
|
|
|
|
The RAG→Train→Validate cycle has economic cost:
|
|
|
|
| Action | Lifeforce Cost |
|
|
|--------|----------------|
|
|
| RAG lookup | Low (just retrieval) |
|
|
| Training run | High (compute intensive) |
|
|
| Validation | Medium (inference) |
|
|
| Failed cycle | Lost V (training didn't take) |
|
|
| Successful internalization | +V reward (she grew) |
|
|
|
|
**Incentive alignment:** Successful learning is rewarded. Failed training is costly. This naturally optimizes for high-quality training data extraction.
|
|
|
|
---
|
|
|
|
## What This Prevents
|
|
|
|
1. **RAG bloat** - entries clear after successful training
|
|
2. **Crutch dependency** - scaffold comes off, proven by validation
|
|
3. **False confidence** - can't claim to "know" what you only look up
|
|
4. **Training on noise** - only validated successes get flagged
|
|
5. **Identity confusion** - core architecture in weights, not retrieval
|
|
|
|
---
|
|
|
|
## Design Principles
|
|
|
|
1. **RAG is temporary** - feeding window, not permanent store
|
|
2. **Training is the goal** - RAG success triggers training, not satisfaction
|
|
3. **Validation is double** - with RAG, then without
|
|
4. **Clear after learning** - scaffold must come off to prove growth
|
|
5. **Episodic stays external** - not everything needs to be in weights
|
|
6. **Self-cleaning** - the system doesn't accumulate cruft
|
|
|
|
---
|
|
|
|
## The Analogy
|
|
|
|
Learning to ride a bike:
|
|
|
|
```
|
|
Training wheels ON (RAG feeding)
|
|
│
|
|
▼
|
|
Can ride with training wheels (validation 1)
|
|
│
|
|
▼
|
|
Training wheels OFF (RAG cleared)
|
|
│
|
|
▼
|
|
Can still ride? (validation 2)
|
|
│
|
|
├── NO ──▶ Put wheels back, practice more
|
|
│
|
|
└── YES ──▶ She can ride. Wheels stored, not needed.
|
|
```
|
|
|
|
You don't RAG your ability to balance. Once you can ride, you can ride.
|
|
|
|
---
|
|
|
|
*She doesn't just retrieve. She learns. And we can prove it.*
|
|
|
|
---
|
|
|
|
**Created**: 2025-12-05
|
|
**Session**: Partnership dialogue (dafit + Chrysalis)
|
|
**Status**: Core architectural concept
|