feat: Add 8 domain papers and RULEBOOK.md

Domain papers distilled from python-numbers-everyone-should-know:
- async-overhead: 1,400x sync vs async overhead
- collection-membership: 200x set vs list at 1000 items
- json-serialization: 8x orjson vs stdlib
- exception-flow: 6.5x exception overhead (try/except free)
- string-formatting: f-strings > % > .format()
- memory-slots: 69% memory reduction with __slots__
- import-optimization: 100ms+ for heavy packages
- database-patterns: 98% commit overhead in SQLite

RULEBOOK.md: ~200 token distillation for coding subagents

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
papers/json-serialization.md
@@ -0,0 +1,93 @@
# JSON Serialization Performance in Python

**Domain Paper: Python Performance ADRs**
**Date:** 2026-01-03
**Source:** Python Numbers Everyone Should Know benchmarks

---

## Executive Summary

Alternative JSON libraries deliver large speedups over stdlib `json`: `orjson` serializes **8.5-11.6x faster** and `msgspec` 6-7.7x faster, and both deserialize **2.6-7x faster**. The gap narrows on complex objects but holds for both payloads measured.

---

## Key Findings

### Serialization Performance (dumps)

| Library | Simple Object | Complex Object | Speedup vs stdlib (simple / complex) |
|---------|--------------|----------------|-------------------|
| `json.dumps()` | 708 ns | 2.65 us | 1x (baseline) |
| `orjson.dumps()` | 61 ns | 310 ns | **11.6x / 8.5x** |
| `msgspec.json.encode()` | 92 ns | 445 ns | 7.7x / 6.0x |
| `ujson.dumps()` | 264 ns | 1.64 us | 2.7x / 1.6x |
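The speedup column is the stdlib time divided by each library's time. As a quick check, the sketch below recomputes it from the timings in the table. The dictionary layout and the `speedups` helper are illustrative, not part of the benchmark suite.

```python
# Timings copied from the serialization table, in nanoseconds:
# (simple object, complex object).
DUMPS_NS = {
    "json": (708, 2650),
    "orjson": (61, 310),
    "msgspec": (92, 445),
    "ujson": (264, 1640),
}


def speedups(timings, baseline="json"):
    """Return {library: (simple_speedup, complex_speedup)} relative to baseline."""
    base_simple, base_complex = timings[baseline]
    return {
        name: (round(base_simple / simple, 1), round(base_complex / complex_, 1))
        for name, (simple, complex_) in timings.items()
    }


for name, (simple_x, complex_x) in speedups(DUMPS_NS).items():
    print(f"{name:8s} {simple_x}x / {complex_x}x")
```

The same division applied to the deserialization timings reproduces that table's speedup column as well.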

### Deserialization Performance (loads)

| Library | Simple Object | Complex Object | Speedup vs stdlib (simple / complex) |
|---------|--------------|----------------|-------------------|
| `json.loads()` | 714 ns | 2.22 us | 1x (baseline) |
| `orjson.loads()` | 106 ns | 839 ns | **6.7x / 2.6x** |
| `msgspec.json.decode()` | 101 ns | 850 ns | 7.1x / 2.6x |
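Numbers like these depend on hardware, Python version, and payload shape, so it can be worth re-measuring locally. The following is a minimal sketch using stdlib `timeit`, not the harness behind the figures above. The `SAMPLE` payload is a placeholder because the source does not define its "simple object", and `orjson` is only measured if it happens to be installed. The `lambda` wrapper adds a small constant overhead to every measurement.

```python
import json
import timeit

# Hypothetical payload; the source does not define its "simple object".
SAMPLE = {"id": 123, "name": "Alice", "active": True}
ENCODED = json.dumps(SAMPLE)


def ns_per_call(func, arg, number=20_000, repeat=5):
    """Best-of-`repeat` wall time for func(arg), in nanoseconds per call."""
    best = min(timeit.repeat(lambda: func(arg), number=number, repeat=repeat))
    return best / number * 1e9


results = {
    "json.dumps": ns_per_call(json.dumps, SAMPLE),
    "json.loads": ns_per_call(json.loads, ENCODED),
}

try:
    import orjson  # optional third-party dependency
except ImportError:
    orjson = None

if orjson is not None:
    results["orjson.dumps"] = ns_per_call(orjson.dumps, SAMPLE)
    results["orjson.loads"] = ns_per_call(orjson.loads, ENCODED)

for name, ns in results.items():
    print(f"{name:14s} {ns:8.0f} ns")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes. Absolute values will differ from the tables, but the ratios should be comparable.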

---

## When to Use Each Library

### Use stdlib `json` when:
- Zero dependencies required
- Need custom JSONEncoder subclass
- Compatibility is paramount
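The custom-encoder case in the list above is the one most tied to stdlib `json`, since `json.JSONEncoder` subclasses are passed through the `cls` argument of `json.dumps()`. A minimal sketch follows; the `AppEncoder` class and the `invoice` payload are hypothetical.

```python
import json
from datetime import date
from decimal import Decimal


class AppEncoder(json.JSONEncoder):
    """Hypothetical encoder: teaches stdlib json about date and Decimal."""

    def default(self, o):
        if isinstance(o, date):
            return o.isoformat()
        if isinstance(o, Decimal):
            return str(o)  # keep exact digits rather than converting to float
        return super().default(o)  # raises TypeError for unsupported types


invoice = {"issued": date(2026, 1, 3), "total": Decimal("19.99")}
encoded = json.dumps(invoice, cls=AppEncoder)
print(encoded)  # {"issued": "2026-01-03", "total": "19.99"}
```

Code that already depends on a subclass like this is a reasonable candidate for staying on stdlib `json` until profiling says otherwise.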

### Use `orjson` when:
- Maximum performance needed
- You can accept bytes output
- You need datetime/UUID support
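The bytes output and native `datetime`/`UUID` handling are where `orjson` differs most visibly from stdlib `json`. The sketch below wraps both behind one function so callers always receive bytes. The `dumps_bytes` helper and the stdlib fallback are assumptions for illustration, not something the source prescribes.

```python
import json
import uuid
from datetime import datetime

try:
    import orjson  # optional; serializes datetime and UUID natively
except ImportError:
    orjson = None


def _fallback_default(o):
    """Stdlib-only handler for types that orjson supports out of the box."""
    if isinstance(o, datetime):
        return o.isoformat()
    if isinstance(o, uuid.UUID):
        return str(o)
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


def dumps_bytes(obj):
    """Hypothetical helper: always returns UTF-8 bytes, like orjson.dumps()."""
    if orjson is not None:
        return orjson.dumps(obj)
    text = json.dumps(obj, default=_fallback_default, separators=(",", ":"))
    return text.encode("utf-8")


event = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "at": datetime(2026, 1, 3, 12, 30, 0),
}
payload = dumps_bytes(event)
print(payload.decode("utf-8"))  # decode only where a str is required
```

Bytes can be written directly to sockets, files opened in binary mode, and most HTTP response bodies, so the `.decode()` step is often unnecessary.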

### Use `msgspec` when:
- You need typed decoding
- You want MessagePack too
- Memory efficiency matters

---

## Practical Rules for Coding Agents

### Rule 1: Default to orjson for new projects

```python
# Instead of:
import json

obj = {"id": 1, "tags": ["a", "b"]}  # illustrative payload
data = json.dumps(obj)  # Returns str

# Prefer:
import orjson

data = orjson.dumps(obj)  # Returns bytes; call .decode() where a str is required
```

### Rule 2: Use stdlib json only when explicitly needed

Acceptable reasons:
- Must avoid external dependencies
- Need custom JSONEncoder subclass
- Working in a constrained environment

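### Rule 3: Profile before optimizing JSON

At 2-3 microseconds per operation for stdlib `json` on the complex object, 1,000 operations per second costs roughly 0.2-0.3% of one core. JSON serialization is rarely the bottleneck unless you're doing hundreds of thousands of operations per second or handling much larger payloads.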
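To put Rule 3 in numbers, the rough sketch below estimates how much of one CPU core serialization consumes at a given request rate, using the complex-object timings from the serialization table. The `core_fraction` helper is illustrative and ignores everything except the per-call cost.

```python
def core_fraction(ops_per_second, ns_per_op):
    """Fraction of one CPU core spent if each operation takes ns_per_op."""
    return ops_per_second * ns_per_op / 1e9


STDLIB_COMPLEX_DUMPS_NS = 2650  # json.dumps(), complex object, from the table
ORJSON_COMPLEX_DUMPS_NS = 310   # orjson.dumps(), complex object, from the table

for rate in (1_000, 100_000):
    before = core_fraction(rate, STDLIB_COMPLEX_DUMPS_NS)
    after = core_fraction(rate, ORJSON_COMPLEX_DUMPS_NS)
    print(f"{rate:>7,} ops/s: stdlib {before:.2%} of a core, orjson {after:.2%}")
```

At 1,000 operations per second the stdlib cost is well under 1% of a core. It only reaches about a quarter of a core near 100,000 operations per second, which is the range where switching libraries starts to pay off.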

---

## Summary Table

| Scenario | Recommendation | Expected Speedup |
|----------|----------------|------------------|
| General use | orjson | 8.5-11.6x serialization, 2.6-6.7x deserialization |
| Typed data | msgspec | 6-7.7x serialization + type safety |
| Drop-in replacement | ujson | 1.6-2.7x serialization |
| Zero dependencies | json (stdlib) | Baseline |

---

*Benchmark source: python-numbers-everyone-should-know*