Database Patterns: SQLite, DiskCache, and MongoDB

Domain: Persistence and data access patterns in Python
Source: python-numbers-everyone-should-know benchmarks
Date: 2026-01-03


Executive Summary

Reads are cheap, writes are expensive. SQLite commits dominate write latency (192 microseconds with commit vs 3 microseconds without). For read-heavy workloads, SQLite achieves 280K ops/sec by primary key. For write-heavy workloads, consider diskcache (8x faster writes) or batch operations.


Key Findings

The Numbers

| Operation        | SQLite          | DiskCache       | MongoDB         |
|------------------|-----------------|-----------------|-----------------|
| Write one object | 192 us (5.2k/s) | 24 us (42k/s)   | 119 us (8.4k/s) |
| Read by key/id   | 3.6 us (280k/s) | 4.3 us (236k/s) | 121 us (8.2k/s) |

Finding 1: The Commit Tax

SQLite write with commit: 192 microseconds
SQLite write without commit: 3 microseconds

The commit operation accounts for 98.4% of write latency.
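A minimal way to observe the commit tax locally (a sketch: the temp-file path, table name, and row count are illustrative, and absolute timings depend on the disk and filesystem):

```python
import os
import sqlite3
import tempfile
import time

# Use an on-disk database: the commit tax is largely fsync cost,
# which an in-memory database would hide.
path = os.path.join(tempfile.mkdtemp(), 'bench.db')
conn = sqlite3.connect(path)
conn.execute('CREATE TABLE items (data TEXT)')

# One commit per insert: pays the fsync on every row.
start = time.perf_counter()
for i in range(100):
    conn.execute('INSERT INTO items (data) VALUES (?)', (f'row-{i}',))
    conn.commit()
per_commit = (time.perf_counter() - start) / 100

# Same inserts, one commit at the end.
start = time.perf_counter()
for i in range(100):
    conn.execute('INSERT INTO items (data) VALUES (?)', (f'row-{i}',))
conn.commit()
one_commit = (time.perf_counter() - start) / 100

print(f'{per_commit * 1e6:.0f} us/row committing each row, '
      f'{one_commit * 1e6:.0f} us/row committing once')
```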

Finding 2: DiskCache Wins for Simple Key-Value

| Operation | SQLite Raw | DiskCache |
|-----------|------------|-----------|
| Write     | 192 us     | 24 us     |
| Read      | 3.6 us     | 4.3 us    |

DiskCache achieves 8x faster writes with comparable read performance.

Finding 3: Batching Provides 9x Throughput

SQLite executemany(), 10 rows: 215 microseconds total (21.5 us/row)
10 individual inserts (one commit each): 1,920 microseconds total (192 us/row)


When to Use Each Storage Option

SQLite

  • Read-heavy workloads (100:1 read/write ratio)
  • Need to query inside JSON with json_extract()
  • ACID guarantees matter
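For the json_extract() case, a sketch (table and field names are illustrative; the JSON functions ship with the SQLite builds bundled in recent CPython releases):

```python
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE events (payload TEXT)')
conn.execute('INSERT INTO events VALUES (?)',
             (json.dumps({'user': 'ada', 'action': 'login'}),))
conn.commit()

# Filter and project inside the stored JSON, no Python-side parsing
rows = conn.execute(
    "SELECT json_extract(payload, '$.user') FROM events "
    "WHERE json_extract(payload, '$.action') = 'login'"
).fetchall()
print(rows)  # [('ada',)]
```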

DiskCache

  • Key-value storage with automatic serialization
  • Cache patterns (TTL, LRU eviction)
  • Agent state persistence

MongoDB

  • Distributed, multi-node deployments
  • Complex aggregation pipelines
  • Full-text search requirements
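An aggregation pipeline is itself plain Python data, so its structure can be sketched without a server (the collection and field names here are hypothetical):

```python
# Count logins per user, most active first.
pipeline = [
    {'$match': {'action': 'login'}},
    {'$group': {'_id': '$user', 'count': {'$sum': 1}}},
    {'$sort': {'count': -1}},
]

# Against a live deployment this would run as, e.g.:
#   db.events.aggregate(pipeline)
```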

Practical Rules for Coding Agents

Rule 1: Default to DiskCache for Agent State

```python
from diskcache import Cache

cache = Cache('/tmp/agent-cache')
cache.set('conversation:123', messages)  # ~24 microseconds
```

Rule 2: Batch SQLite Writes

```python
import sqlite3

conn = sqlite3.connect('app.db')

# GOOD: batch with executemany, then commit once
conn.executemany('INSERT INTO items (data) VALUES (?)', items_list)
conn.commit()  # one commit (one fsync) for all items
```

Rule 3: Use Transactions for Multi-Step Operations

```python
conn.execute('BEGIN')
conn.execute('INSERT INTO users ...')
conn.execute('INSERT INTO audit_log ...')
conn.commit()  # one fsync for both inserts
```
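In Python's sqlite3 module, the connection also works as a context manager that commits on success and rolls back on an exception, grouping the statements into one fsync the same way (table names illustrative):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT)')
conn.execute('CREATE TABLE audit_log (entry TEXT)')

# Both inserts commit together on exit; an exception rolls both back.
with conn:
    conn.execute('INSERT INTO users VALUES (?)', ('ada',))
    conn.execute('INSERT INTO audit_log VALUES (?)', ('created user ada',))
```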

Rule 4: Lazy Import for CLI Tools

```python
def save_to_db(data):
    import sqlite3  # ~1.63 ms, paid only when the function runs
    conn = sqlite3.connect('app.db')
```

Summary

| Metric        | SQLite | DiskCache | MongoDB |
|---------------|--------|-----------|---------|
| Write latency | 192 us | 24 us     | 119 us  |
| Read latency  | 3.6 us | 4.3 us    | 121 us  |
| Writes/sec    | 5.2k   | 42k       | 8.4k    |
| Reads/sec     | 280k   | 236k      | 8.2k    |

The Bottom Line

  1. Reads are cheap everywhere - Optimize for write patterns
  2. SQLite commits dominate latency - Batch or use transactions
  3. DiskCache for key-value - 8x faster writes, automatic serialization
  4. MongoDB for distribution - Not for local performance

The tortoise way: Measure, understand the cost, choose deliberately.


Benchmark source: python-numbers-everyone-should-know