feat: Add 8 domain papers and RULEBOOK.md

Domain papers distilled from python-numbers-everyone-should-know:
- async-overhead: 1,400x sync vs async overhead
- collection-membership: 200x set vs list at 1000 items
- json-serialization: 8x orjson vs stdlib
- exception-flow: 6.5x exception overhead (try/except free)
- string-formatting: f-strings > % > .format()
- memory-slots: 69% memory reduction with __slots__
- import-optimization: 100ms+ for heavy packages
- database-patterns: 98% commit overhead in SQLite

RULEBOOK.md: ~200 token distillation for coding subagents

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
104
papers/import-optimization.md
Normal file
@@ -0,0 +1,104 @@
# Import Optimization

**Domain Paper: Python Performance ADRs**
**Date:** 2026-01-03
**Source:** python-numbers-everyone-should-know benchmarks

---

## Executive Summary

Import costs range from **sub-microsecond (cached) to 100+ milliseconds** (large frameworks). For CLI tools and short-lived scripts, import time can dominate total execution.

---

## Benchmark Data (First Import, Fresh Process)

### Built-in Modules

| Module | First Import |
|--------|-------------|
| `sys` | 0.2 us |
| `os` | 0.2 us |
| `math` | 24 us |

### Standard Library

| Module | First Import |
|--------|-------------|
| `datetime` | 72 us |
| `typing` | 2.0 ms |
| `json` | 2.9 ms |
| `dataclasses` | 6.0 ms |
| `logging` | 10.5 ms |
| `asyncio` | 17.7 ms |

### External Packages

| Package | First Import |
|---------|-------------|
| `pydantic` | 15.8 ms |
| `flask` | 47.3 ms |
| `fastapi` | 104.4 ms |

**Key insight:** FastAPI takes about 104 ms just to import. For a CLI tool whose own work takes 50 ms, that import alone roughly triples the total run time.

---

## Lazy Import Patterns

### Pattern 1: Function-Level Import
```python
def process_data(data):
    import pandas as pd  # Only when needed
    return pd.DataFrame(data)
```

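The cost of repeating a function-level import can be checked directly. The sketch below is illustrative only: `decimal` is a stand-in for a heavy package, and `to_decimal` and `cached_import_cost_us` are hypothetical helper names, not part of the benchmark suite. It times the `import` statement after the module is already in `sys.modules`, which is the "cached" case from the Executive Summary.

```python
import sys
import timeit


def to_decimal(text):
    # Function-level import: the first call loads the module; later calls
    # only find it in sys.modules. `decimal` stands in for a heavy package.
    import decimal
    return decimal.Decimal(text)


def cached_import_cost_us(number=100_000):
    """Average cost in microseconds of `import decimal` once it is cached."""
    to_decimal("1.5")  # make sure the module is loaded before timing
    total = timeit.timeit("import decimal", number=number)
    return total / number * 1e6
```

Calling `cached_import_cost_us()` returns a per-statement figure that can be compared with the first-import numbers in the tables. The exact value depends on the machine and the Python version.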
### Pattern 2: TYPE_CHECKING Guard
```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd  # Seen by type checkers only; skipped at runtime

def process(data: "pd.DataFrame"):
    import pandas as pd  # Runtime import, deferred until the first call
    return pd.DataFrame(data)
```

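One way to confirm that the guard keeps a module out of the process until it is needed is to inspect `sys.modules` in a fresh interpreter. The sketch below makes several assumptions: `decimal` stands in for a heavy package, `GUARDED_SOURCE` and `run_guard_demo` are hypothetical names, and the guarded module also uses `from __future__ import annotations` so the annotation needs no quotes. On a typical installation the child process should report the module as absent before the first call and present after it.

```python
import subprocess
import sys

# Source of a small module, run in a fresh interpreter below.
# `decimal` is only a stand-in for a heavy package such as pandas.
GUARDED_SOURCE = '''
from __future__ import annotations  # annotations stay unevaluated strings

import sys
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from decimal import Decimal  # visible to type checkers, skipped at runtime

def parse(text: str) -> Decimal:  # no quotes needed around Decimal
    from decimal import Decimal  # real import, deferred to the first call
    return Decimal(text)

print("decimal" in sys.modules)  # checked before the first call
parse("1.5")
print("decimal" in sys.modules)  # checked after the first call
'''


def run_guard_demo() -> list:
    """Run GUARDED_SOURCE in a new interpreter and return its printed lines."""
    result = subprocess.run(
        [sys.executable, "-c", GUARDED_SOURCE],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()
```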
---

## Practical Rules for Coding Agents

### MUST

1. **Use `TYPE_CHECKING` for type-only imports** when the type is from a heavy package
2. **Use function-level imports for rarely-used code paths**
3. **Never import heavy packages at module level in CLI tools**

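Rule 3 can be sketched for a small command-line tool as follows. Everything here is hypothetical: the `report` program name, the version string, and the `summarize` helper are invented for illustration, and `json` merely stands in for a heavy dependency. The point is that `--version` and `--help` exit inside `parse_args()`, so they never reach the deferred import.

```python
import argparse

__version__ = "0.1.0"  # hypothetical version string


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="report")  # hypothetical CLI name
    parser.add_argument("--version", action="version", version=__version__)
    parser.add_argument("path", help="JSON file to summarize")
    return parser


def summarize(path: str) -> int:
    # Deferred import: only paid when real work happens. `json` stands in
    # for a heavy dependency such as pandas.
    import json

    with open(path, encoding="utf-8") as fh:
        return len(json.load(fh))


def main(argv=None) -> int:
    # --version and --help exit inside parse_args(), before summarize() runs,
    # so the fast paths never trigger the deferred import.
    args = build_parser().parse_args(argv)
    print(summarize(args.path))
    return 0
```

A script entry point would typically call `raise SystemExit(main())`. Keeping the parser free of heavy imports is what makes the fast paths cheap.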
### SHOULD

4. **Use `from __future__ import annotations`** so annotations that refer to `TYPE_CHECKING`-only imports need no quotes
5. **Profile import time for new dependencies:**
   ```bash
   python -c "import time; s=time.perf_counter(); import PACKAGE; print(f'{(time.perf_counter()-s)*1000:.1f}ms')"
   ```

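The one-liner in rule 5 can be wrapped in a small helper when several candidate dependencies need comparing. This is a minimal sketch, not the benchmark harness behind the tables: `first_import_ms` and `report` are hypothetical names, and module names are assumed to be trusted input because they are pasted into code. Each measurement starts a new interpreter, so the module is not already cached. Modules that the interpreter loads at startup (which would explain the 0.2 us figures for `sys` and `os`) will still show only the cached cost.

```python
import subprocess
import sys

# Timing snippet executed in the child interpreter; {module} is filled in below.
_TIMING_SNIPPET = (
    "import time; s = time.perf_counter(); import {module}; "
    "print((time.perf_counter() - s) * 1000)"
)


def first_import_ms(module: str) -> float:
    """Return the first-import time of `module` in milliseconds.

    A new interpreter is started for every call so that the module is
    not already present in sys.modules.
    """
    result = subprocess.run(
        [sys.executable, "-c", _TIMING_SNIPPET.format(module=module)],
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


def report(modules=("json", "logging", "asyncio")) -> None:
    """Print one timing line per module name."""
    for name in modules:
        print(f"{name}: {first_import_ms(name):.2f} ms")
```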
---

## Summary Table

| Scenario | Pattern | Example |
|----------|---------|---------|
| Type hints for heavy types | `TYPE_CHECKING` | pandas, numpy types |
| Rarely-used function | Function-level import | Error handling paths |
| CLI fast path | Defer until needed | `--version`, `--help` |
| Serverless cold start | Minimize top-level imports | Lambda/Cloud Functions |

---

*Import costs are hidden taxes. Pay them lazily.*

---

*Benchmark source: python-numbers-everyone-should-know*