# Python Performance ADRs - Plan
Created: 2026-01-03
Source: `python-numbers-everyone-should-know` benchmarks
## The Three-Layer Model
```
Layer 1: Domain Papers (Exploration)
├── Full context, reasoning, benchmark methodology
├── ~2000-5000 tokens each
└── Loaded only when deep-diving into a domain
        ↓ distill
Layer 2: Rulebook (Tight)
├── Actionable rules, no fluff
├── ~200-400 tokens TOTAL
└── Loaded into every coding subagent
        ↓ formalize
Layer 3: Evals (Machine-Readable)
├── AST patterns, thresholds, rule IDs
├── For automated review agent
└── JSON/YAML, not prose
```
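To make Layer 3 concrete, here is a hedged sketch of a single eval rule, written as a Python dict that would serialize to the planned YAML. Every field name (`id`, `ast_pattern`, `threshold`, `severity`) is an assumption, not a settled schema.

```python
# Hypothetical Layer 3 rule, shown as a Python dict that could be
# dumped to evals/rules.yaml. All field names are illustrative
# assumptions; the real schema is future work.
rule = {
    "id": "PERF-MEMBERSHIP-001",
    "domain": "collection-membership",
    # Informal AST pattern: a membership test against a list-valued expression
    "ast_pattern": "Compare(ops=[In()], comparators=[<list-valued expr>])",
    "threshold": {"min_items": 100},
    "message": "Use a set for repeated membership tests on large collections",
    "severity": "critical",
}
```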
## Directory Structure
```
python-performance-adrs/
├── PLAN.md              # This file
├── README.md            # Usage for agents
├── RULEBOOK.md          # The tight rulebook (~200 tokens)
├── papers/              # Domain exploration papers
│   ├── async-overhead.md
│   ├── collection-membership.md
│   ├── json-serialization.md
│   ├── exception-flow.md
│   ├── string-formatting.md
│   ├── memory-slots.md
│   ├── import-optimization.md
│   └── database-patterns.md
├── data/
│   └── benchmarks.yaml  # Raw numbers for eval generation
└── evals/
    └── rules.yaml       # Machine-readable eval rules (future)
```
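As a rough sketch of how `data/benchmarks.yaml` could feed eval generation (the YAML layout shown in the comments is an assumption, not the actual file format):

```python
# Hedged sketch of the "formalize" step: turning raw numbers from
# data/benchmarks.yaml into rule stubs in evals/rules.yaml.
# Assumes PyYAML and a hypothetical YAML layout like:
#
#   collection-membership:
#     ratio: 200        # list vs set, measured at 1000 items
#   async-overhead:
#     ratio: 1400       # async vs sync for a trivial call
#
import yaml

with open("data/benchmarks.yaml") as f:
    benchmarks = yaml.safe_load(f)

rules = []
for domain, numbers in benchmarks.items():
    # Promote large measured gaps to "critical"; the 100x cutoff is
    # an arbitrary placeholder, not a decided threshold.
    severity = "critical" if numbers.get("ratio", 1) >= 100 else "advisory"
    rules.append({"domain": domain, "severity": severity, "evidence": numbers})

with open("evals/rules.yaml", "w") as f:
    yaml.safe_dump(rules, f, sort_keys=False)
```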
## Domains to Explore
| Domain | Key Insight | Priority |
|---|---|---|
| Async Overhead | 1400x slower than sync for simple calls | Critical |
| Collection Membership | Set O(1) vs list O(n); 200x faster at 1000 items (see sketch below) | Critical |
| JSON Serialization | orjson 8x faster than stdlib `json` | High |
| Exception Flow | 6x overhead when an exception is actually raised | High |
| String Formatting | f-strings > `%` > `.format()` | Medium |
| Memory/Slots | `__slots__` saves ~50% memory per instance | Medium |
| Import Optimization | Lazy imports cut CLI startup time | Medium |
| Database Patterns | SQLite: reads are fast, writes are slow | Medium |
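A minimal sketch of the Critical membership row, using nothing beyond the stdlib. Exact numbers vary by machine; the 200x figure in the table comes from the source benchmarks at 1000 items.

```python
# Minimal sketch of the set-vs-list membership gap, stdlib only.
import timeit

items = list(range(1000))
as_list = items
as_set = set(items)
needle = 999  # worst case for the list: a full O(n) scan

# String statements with globals=... avoid lambda call overhead
# that would otherwise flatten the measured ratio.
list_time = timeit.timeit("needle in as_list", globals=globals(), number=100_000)
set_time = timeit.timeit("needle in as_set", globals=globals(), number=100_000)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s  "
      f"ratio: {list_time / set_time:.0f}x")
```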
## Workflow
1. Explore: Spin up agents to analyze each domain from the source benchmarks
2. Write Papers: Each agent produces a domain paper with full context
3. Distill: Extract tight rules into RULEBOOK.md
4. Formalize: Convert to machine-readable evals (later)
## Source
Benchmark data from `/home/dafit/nimmerverse/references/python-numbers-everyone-should-know/`:

- `the-report.md` - Main findings
- `results.json` - Raw benchmark data
- `code/` - Benchmark implementations
The substrate holds. The rules compress.