feat: Add 8 domain papers and RULEBOOK.md

Domain papers distilled from python-numbers-everyone-should-know:
- async-overhead: 1,400x sync vs async overhead
- collection-membership: 200x set vs list at 1000 items
- json-serialization: 8x orjson vs stdlib
- exception-flow: 6.5x exception overhead (try/except free)
- string-formatting: f-strings > % > .format()
- memory-slots: 69% memory reduction with __slots__
- import-optimization: 100ms+ for heavy packages
- database-patterns: 98% commit overhead in SQLite

RULEBOOK.md: ~200 token distillation for coding subagents

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
dafit
2026-01-03 14:31:40 +01:00
parent 4def3b46c2
commit 7efd1368d1
9 changed files with 909 additions and 0 deletions

@@ -0,0 +1,93 @@
# JSON Serialization Performance in Python
**Domain Paper: Python Performance ADRs**
**Date:** 2026-01-03
**Source:** Python Numbers Everyone Should Know benchmarks
---
## Executive Summary
Alternative JSON libraries outperform stdlib `json` in these benchmarks: `orjson` serializes **8.5-11.6x faster** and `msgspec` 6-7.7x faster, and both deserialize **2.6-7x faster**. The gap is widest on small payloads and narrows on complex objects, most visibly for deserialization (about 7x down to 2.6x).
---
## Key Findings
### Serialization Performance (dumps)
| Library | Simple Object | Complex Object | Speedup vs stdlib (simple / complex) |
|---------|--------------|----------------|-------------------|
| `json.dumps()` | 708 ns | 2.65 us | 1x (baseline) |
| `orjson.dumps()` | 61 ns | 310 ns | **11.6x / 8.5x** |
| `msgspec.json.encode()` | 92 ns | 445 ns | 7.7x / 6.0x |
| `ujson.dumps()` | 264 ns | 1.64 us | 2.7x / 1.6x |
### Deserialization Performance (loads)
| Library | Simple Object | Complex Object | Speedup vs stdlib (simple / complex) |
|---------|--------------|----------------|-------------------|
| `json.loads()` | 714 ns | 2.22 us | 1x (baseline) |
| `orjson.loads()` | 106 ns | 839 ns | **6.7x / 2.6x** |
| `msgspec.json.decode()` | 101 ns | 850 ns | 7.1x / 2.6x |
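
The speedup columns are the stdlib time divided by the library time. A minimal sketch of how such numbers could be reproduced locally with `timeit` follows. The payloads `SIMPLE` and `COMPLEX` and the helpers `ns_per_call` and `speedup` are hypothetical, since the source does not list the benchmarked objects, and absolute timings will differ by machine and Python version.

```python
import json
import timeit

# Hypothetical payloads; the source does not list the benchmarked objects.
SIMPLE = {"id": 1, "name": "Ada", "active": True}
COMPLEX = {
    "users": [{"id": i, "name": f"user{i}", "tags": ["a", "b"]} for i in range(5)],
    "meta": {"page": 1, "total": 5},
}


def ns_per_call(func, arg, number=20_000, repeat=5):
    """Return the best-of-`repeat` time per call in nanoseconds.

    The lambda wrapper adds a small constant overhead to every measurement.
    """
    timer = timeit.Timer(lambda: func(arg))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


def speedup(baseline_ns, candidate_ns):
    """How many times faster the candidate is than the baseline."""
    return baseline_ns / candidate_ns


if __name__ == "__main__":
    encoders = {"json.dumps": json.dumps}
    try:
        import orjson  # optional third-party dependency

        encoders["orjson.dumps"] = orjson.dumps
    except ImportError:
        pass  # fall back to measuring stdlib json only

    for label, payload in (("simple", SIMPLE), ("complex", COMPLEX)):
        base = ns_per_call(json.dumps, payload)
        for name, fn in encoders.items():
            t = ns_per_call(fn, payload)
            print(f"{name:14s} {label:8s} {t:8.0f} ns  {speedup(base, t):.1f}x")
```

Applied to the table values, `speedup(708, 61)` gives the 11.6x figure reported for `orjson.dumps()` on the simple object.
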
---
## When to Use Each Library
### Use stdlib `json` when:
- Zero dependencies required
- Need custom JSONEncoder subclass
- Compatibility is paramount
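
As an illustration of the `JSONEncoder` case, here is a minimal sketch of a subclass that teaches stdlib `json` about a few extra types. The class name `AppEncoder`, the `record` payload, and the choice to encode `Decimal` as a string are assumptions made for this example, not something the source prescribes.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID


class AppEncoder(json.JSONEncoder):
    """Hypothetical encoder: teaches stdlib json about a few extra types."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        if isinstance(o, UUID):
            return str(o)
        if isinstance(o, Decimal):
            return str(o)  # keep exact digits instead of converting to float
        return super().default(o)  # raises TypeError for unsupported types


record = {
    "id": UUID("12345678-1234-5678-1234-567812345678"),
    "created": datetime(2026, 1, 3, 14, 31, 40, tzinfo=timezone.utc),
    "price": Decimal("19.99"),
}
encoded = json.dumps(record, cls=AppEncoder)
```

Code that already depends on an encoder like this is a reasonable candidate for staying on stdlib `json`.
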
### Use `orjson` when:
- Maximum performance needed
- You can accept bytes output
- You need datetime/UUID support
### Use `msgspec` when:
- You need typed decoding
- You want MessagePack too
- Memory efficiency matters
---
## Practical Rules for Coding Agents
### Rule 1: Default to orjson for new projects
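```python
obj = {"id": 1, "tags": ["a", "b"]}  # example payload

# Instead of:
import json
data = json.dumps(obj)  # returns str

# Prefer:
import orjson
data = orjson.dumps(obj)  # returns bytes; call .decode() if a str is required
```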
### Rule 2: Use stdlib json only when explicitly needed
Acceptable reasons:
- Must avoid external dependencies
- Need custom JSONEncoder subclass
- Working in constrained environment
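
When a dependency may or may not be available, one possible pattern is an optional import with a stdlib fallback. The sketch below assumes two hypothetical helpers, `dumps_bytes` and `loads_any`. It covers only types both libraries handle natively, and the fallback path keeps stdlib performance.

```python
import json

try:
    import orjson  # optional third-party dependency
except ImportError:
    orjson = None


def dumps_bytes(obj):
    """Serialize obj to UTF-8 JSON bytes, preferring orjson when installed."""
    if orjson is not None:
        return orjson.dumps(obj)
    # Compact separators and ensure_ascii=False approximate orjson's output.
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def loads_any(data):
    """Parse JSON from bytes or str with whichever library is available."""
    if orjson is not None:
        return orjson.loads(data)
    return json.loads(data)


payload = dumps_bytes({"id": 1, "name": "Ada"})
```

Both branches return bytes, so callers see one return type whether or not `orjson` is installed.
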
### Rule 3: Profile before optimizing JSON
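At 2-3 microseconds per operation with stdlib `json`, serialization is rarely the bottleneck. Even 10,000 operations per second cost under 3% of one CPU core, so the cost only becomes noticeable at roughly 100,000 operations per second and above.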
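
The budget behind this rule is simple arithmetic: time per call multiplied by call rate. The sketch below uses the complex-object `dumps` timings from the serialization table. The helper names `cpu_fraction` and `calls_to_reach` are hypothetical, and the result is an estimate for a single core that ignores everything except the serialization call itself.

```python
# Per-call dumps timings for the complex object, copied from the table (ns).
DUMPS_NS = {"json": 2650, "orjson": 310, "msgspec": 445, "ujson": 1640}


def cpu_fraction(ns_per_call, calls_per_second):
    """Fraction of one CPU core spent serializing at the given call rate."""
    return ns_per_call * calls_per_second / 1e9


def calls_to_reach(ns_per_call, fraction):
    """Calls per second needed before serialization uses `fraction` of a core."""
    return fraction * 1e9 / ns_per_call


for rate in (1_000, 10_000, 100_000):
    stdlib = cpu_fraction(DUMPS_NS["json"], rate)
    fast = cpu_fraction(DUMPS_NS["orjson"], rate)
    print(f"{rate:>7,} calls/s: json {stdlib:.2%}, orjson {fast:.2%}")
```

With these numbers, stdlib `json` needs roughly 38,000 complex-object calls per second before it uses 10% of one core.
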
---
## Summary Table
| Scenario | Recommendation | Expected Speedup |
|----------|----------------|------------------|
| General use | orjson | 8.5-11.6x serialization, 2.6-6.7x deserialization |
| Typed data | msgspec | 6.0-7.7x serialization + type safety |
| Drop-in replacement | ujson | 1.6-2.7x serialization |
| Zero dependencies | json (stdlib) | Baseline |
---
*Benchmark source: python-numbers-everyone-should-know*