# Database Patterns: SQLite, DiskCache, and MongoDB

Domain: Persistence and data access patterns in Python
Source: python-numbers-everyone-should-know benchmarks
Date: 2026-01-03
## Executive Summary
Reads are cheap, writes are expensive. SQLite commits dominate write latency (192 microseconds with commit vs 3 microseconds without). For read-heavy workloads, SQLite achieves 280K ops/sec by primary key. For write-heavy workloads, consider diskcache (8x faster writes) or batch operations.
## Key Findings

### The Numbers
| Operation | SQLite | DiskCache | MongoDB |
|---|---|---|---|
| Write one object | 192 us (5.2k/s) | 24 us (42k/s) | 119 us (8.4k/s) |
| Read by key/id | 3.6 us (280k/s) | 4.3 us (236k/s) | 121 us (8.2k/s) |
### Finding 1: The Commit Tax
- SQLite write with commit: 192 microseconds
- SQLite write without commit: 3 microseconds
The commit operation accounts for 98.4% of write latency.
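A minimal sketch to reproduce this measurement (the file path is illustrative, and absolute numbers depend on your disk and filesystem):

```python
import os
import sqlite3
import tempfile
import time

# On-disk database so commit actually pays the fsync cost.
path = os.path.join(tempfile.mkdtemp(), "commit_tax.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE items (data TEXT)")

# Insert without commit: only updates SQLite's in-memory page cache.
start = time.perf_counter()
conn.execute("INSERT INTO items (data) VALUES (?)", ("payload",))
insert_us = (time.perf_counter() - start) * 1e6

# Commit: forces the write to durable storage.
start = time.perf_counter()
conn.commit()
commit_us = (time.perf_counter() - start) * 1e6

print(f"insert: {insert_us:.1f} us, commit: {commit_us:.1f} us")
```

On a typical laptop SSD the commit line dwarfs the insert line, which is the entire "commit tax" in one picture.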
### Finding 2: DiskCache Wins for Simple Key-Value
| Operation | SQLite Raw | DiskCache |
|---|---|---|
| Write | 192 us | 24 us |
| Read | 3.6 us | 4.3 us |
DiskCache achieves 8x faster writes with comparable read performance.
### Finding 3: Batching Provides 9x Throughput
- SQLite executemany() with 10 rows and one commit: 215 microseconds total (21.5 us/row)
- 10 individual inserts, each with its own commit: 1,920 microseconds (192 us/row)
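A hedged sketch of the comparison (the temp-file path is illustrative; the full ~9x gap only appears on a real disk, where each commit is a separate fsync):

```python
import os
import sqlite3
import tempfile
import time

conn = sqlite3.connect(os.path.join(tempfile.mkdtemp(), "batch.db"))
conn.execute("CREATE TABLE items (data TEXT)")
rows = [(f"row-{i}",) for i in range(10)]

# Batched: one executemany, one commit (one fsync) for all 10 rows.
start = time.perf_counter()
conn.executemany("INSERT INTO items (data) VALUES (?)", rows)
conn.commit()
batched_us = (time.perf_counter() - start) * 1e6

# Individual: each insert pays the commit cost separately.
start = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO items (data) VALUES (?)", row)
    conn.commit()
individual_us = (time.perf_counter() - start) * 1e6

print(f"batched: {batched_us:.1f} us, individual: {individual_us:.1f} us")
```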
## When to Use Each Storage Option
### SQLite
- Read-heavy workloads (100:1 read/write ratio)
- Need to query inside JSON with json_extract()
- ACID guarantees matter
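A small sketch of querying inside stored JSON with json_extract() (the JSON1 functions are compiled into the SQLite builds bundled with modern CPython; the table and data here are made up for illustration):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [
        (json.dumps({"user": "alice", "score": 42}),),
        (json.dumps({"user": "bob", "score": 7}),),
    ],
)

# json_extract pulls values out of the stored JSON by path,
# so you can filter on nested fields without deserializing in Python.
rows = conn.execute(
    "SELECT json_extract(body, '$.user') FROM docs "
    "WHERE json_extract(body, '$.score') > 10"
).fetchall()
print(rows)  # [('alice',)]
```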
### DiskCache
- Key-value storage with automatic serialization
- Cache patterns (TTL, LRU eviction)
- Agent state persistence
### MongoDB
- Distributed, multi-node deployments
- Complex aggregation pipelines
- Full-text search requirements
## Practical Rules for Coding Agents
### Rule 1: Default to DiskCache for Agent State
```python
from diskcache import Cache

cache = Cache('/tmp/agent-cache')
cache.set('conversation:123', messages)  # 24 microseconds
```
### Rule 2: Batch SQLite Writes
```python
# GOOD: Batch with executemany
conn.executemany('INSERT INTO items (data) VALUES (?)', items_list)
conn.commit()  # One commit for all items
```
### Rule 3: Use Transactions for Multi-Step Operations
```python
conn.execute('BEGIN')
conn.execute('INSERT INTO users ...')
conn.execute('INSERT INTO audit_log ...')
conn.commit()  # One fsync
```
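An equivalent sketch using sqlite3's connection context manager, which commits on success and rolls back if the block raises, so the two inserts stay atomic (table names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("CREATE TABLE audit_log (entry TEXT)")

try:
    # One transaction: both rows are committed together, or neither is.
    with conn:
        conn.execute("INSERT INTO users VALUES (?)", ("alice",))
        conn.execute("INSERT INTO audit_log VALUES (?)", ("created alice",))
except sqlite3.Error:
    pass  # the transaction was rolled back; nothing partial persists

print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # 1
```

Note that `with conn:` manages the transaction only, not the connection's lifetime; it does not close the connection.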
### Rule 4: Lazy Import for CLI Tools
```python
def save_to_db(data):
    import sqlite3  # 1.63ms, paid only when needed
    conn = sqlite3.connect('app.db')
```
## Summary
| Metric | SQLite | DiskCache | MongoDB |
|---|---|---|---|
| Write latency | 192 us | 24 us | 119 us |
| Read latency | 3.6 us | 4.3 us | 121 us |
| Writes/sec | 5.2k | 42k | 8.4k |
| Reads/sec | 280k | 236k | 8.2k |
## The Bottom Line
- **Reads are cheap everywhere** - optimize for write patterns
- **SQLite commits dominate latency** - batch or use transactions
- **DiskCache for key-value** - 8x faster writes, automatic serialization
- **MongoDB for distribution** - not for local performance
The tortoise way: Measure, understand the cost, choose deliberately.
Benchmark source: python-numbers-everyone-should-know