python-performance-adrs/papers/async-overhead.md
dafit 7efd1368d1 feat: Add 8 domain papers and RULEBOOK.md
Domain papers distilled from python-numbers-everyone-should-know:
- async-overhead: 1,400x sync vs async overhead
- collection-membership: 200x set vs list at 1000 items
- json-serialization: 8x orjson vs stdlib
- exception-flow: 6.5x exception overhead (try/except free)
- string-formatting: f-strings > % > .format()
- memory-slots: 69% memory reduction with __slots__
- import-optimization: 100ms+ for heavy packages
- database-patterns: 98% commit overhead in SQLite

RULEBOOK.md: ~200 token distillation for coding subagents

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 14:31:40 +01:00


# Async Overhead in Python: When the Cure is Worse Than the Disease
**Domain Paper: Python Performance ADRs**
**Date:** 2026-01-03
**Source:** Python Numbers Everyone Should Know benchmarks (Python 3.14.2, Apple Silicon)
---
## Executive Summary
Async Python introduces a **1,400x overhead** for simple operations compared to synchronous equivalents. This overhead is fixed regardless of what work the function does. The critical insight: async only makes sense when you're waiting on I/O that takes orders of magnitude longer than this overhead.
**The Core Numbers:**
- Sync function call: **20.3 ns**
- Async equivalent via `run_until_complete`: **28.2 us** (28,200 ns)
- **Ratio: 1,387x slower** (approximately 1,400x)
---
## What Was Benchmarked
### Methodology
The benchmarks measured pure async machinery overhead using CPython 3.14.2 on Apple Silicon. Each operation was run thousands of times with warmup periods, reporting median values.
### Test Functions
```python
# The async function being tested
async def return_value_coro():
    return 42

# The sync equivalent
def sync_function():
    return 42
```
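The sync/async gap can be reproduced with a short `timeit` sketch. This is a minimal reconstruction of the methodology described above, not the paper's actual harness; absolute numbers will differ by machine and Python version.

```python
import asyncio
import timeit

async def return_value_coro():
    return 42

def sync_function():
    return 42

N = 2_000

# Best-of-five batches, one call per iteration (the paper reports medians;
# taking the minimum is a common low-noise alternative).
sync_ns = min(timeit.repeat(sync_function, number=N, repeat=5)) / N * 1e9

loop = asyncio.new_event_loop()
try:
    async_ns = min(timeit.repeat(
        lambda: loop.run_until_complete(return_value_coro()),
        number=N, repeat=5,
    )) / N * 1e9
finally:
    loop.close()

print(f"sync:  {sync_ns:9.1f} ns/call")
print(f"async: {async_ns:9.1f} ns/call  ({async_ns / sync_ns:,.0f}x)")
```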
---
## Key Findings
### Coroutine Creation (Cheap)
| Operation | Time |
|-----------|------|
| Create coroutine object | 47.0 ns |
**Key insight:** Creating a coroutine object is cheap (47 ns). The cost comes when you actually run it.
### Running Coroutines (Expensive)
| Operation | Time |
|-----------|------|
| `run_until_complete(empty)` | 27.6 us |
| `run_until_complete(return value)` | 26.6 us |
| Run nested await | 28.9 us |
**Key insight:** Every `run_until_complete` costs ~27 us regardless of coroutine complexity.
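The asymmetry is easy to see without a profiler: a coroutine object does nothing until something drives it, and the event loop is what you pay for. A minimal illustration (the manual `send` at the end is shown only to make the point; real code should use the loop):

```python
import asyncio

async def return_value_coro():
    return 42

# Cheap: this only allocates a coroutine object; the body has not run yet.
coro = return_value_coro()

# Expensive: asyncio.run spins up a loop, wraps the coroutine in a task,
# runs it, and tears the loop down again.
result = asyncio.run(coro)

# The loop is the costly part: a trivial coroutine can even be driven by hand.
coro2 = return_value_coro()
try:
    coro2.send(None)
    manual = None
except StopIteration as exc:
    manual = exc.value  # a coroutine's return value rides on StopIteration

print(result, manual)  # 42 42
```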
### The Critical Comparison
| Operation | Time | Ratio |
|-----------|------|-------|
| Sync function call | 20.3 ns | 1x |
| Async equivalent | 28.2 us | **1,387x** |
---
## When Async IS Appropriate
### Good Use Cases
1. **Web servers handling concurrent connections** - FastAPI/Starlette: 115-125k req/sec
2. **Concurrent network I/O** - Fetching data from multiple APIs simultaneously
3. **High-latency operations with parallelism** - `asyncio.gather()` for multiple slow API calls
### Bad Use Cases
1. **Wrapping synchronous database drivers** - Use native async drivers or stay sync
2. **CPU-bound computation** - Async doesn't parallelize CPU work (GIL)
3. **Simple scripts with sequential operations** - CLI tools, data processing pipelines
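When a synchronous driver is unavoidable inside an async application, offloading it with `asyncio.to_thread` (Python 3.9+) keeps the event loop responsive without pretending the driver is async. A sketch, with a hypothetical `blocking_query` standing in for a sync driver call:

```python
import asyncio
import time

def blocking_query():
    # Stand-in for a synchronous driver call (hypothetical)
    time.sleep(0.05)
    return "rows"

async def main():
    # Offload the blocking calls to threads so they overlap
    # and the event loop stays free to serve other tasks.
    return await asyncio.gather(
        asyncio.to_thread(blocking_query),
        asyncio.to_thread(blocking_query),
    )

start = time.perf_counter()
rows = asyncio.run(main())
elapsed = time.perf_counter() - start
print(rows, f"{elapsed:.2f}s")  # two 50 ms calls overlap instead of summing
```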
---
## Practical Rules for Coding Agents
### Rule 1: Default to Sync
Write synchronous code unless you have a specific, measurable need for async.
### Rule 2: The 1ms Threshold
Only consider async when individual I/O operations take **>1 millisecond**.
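The threshold falls out of simple amortization arithmetic against the ~27 µs loop cost measured above (the helper below is illustrative):

```python
LOOP_OVERHEAD_US = 27.0  # per run_until_complete, from the benchmarks above

def overhead_fraction(io_ms: float) -> float:
    """Fraction of one call's total time spent on async machinery."""
    io_us = io_ms * 1000
    return LOOP_OVERHEAD_US / (io_us + LOOP_OVERHEAD_US)

# At the 1 ms threshold the machinery is already under 3% of the call;
# for a 10 µs in-memory operation it dominates.
print(f"{overhead_fraction(1.0):.1%}")   # 2.6%
print(f"{overhead_fraction(0.01):.1%}")  # 73.0%
```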
### Rule 3: Batch Over Broadcast
If you need async, gather operations together:
```python
# Good: pay the event-loop overhead once; all fetches run concurrently
results = await asyncio.gather(*[fetch(url) for url in urls])

# Bad: sequential awaits -- no concurrency, total time is the sum of all
# calls (and each call pays ~27 us if it spins up its own loop)
for url in urls:
    result = await fetch(url)
```
### Rule 4: Stay in the Loop
Never call `asyncio.run` or `run_until_complete` from inside an already-running loop; both raise `RuntimeError`. Use a plain `await` instead.
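A sketch of the failure mode and the fix (the `fetch_value` coroutine is illustrative):

```python
import asyncio

async def fetch_value():
    return 42

async def main():
    # Wrong: tries to start a second loop inside the running one.
    try:
        asyncio.run(fetch_value())
        nested_error = None
    except RuntimeError as exc:
        # "asyncio.run() cannot be called from a running event loop"
        nested_error = str(exc)

    # Right: we are already inside the loop, so just await.
    value = await fetch_value()
    return nested_error, value

nested_error, value = asyncio.run(main())
print(nested_error)
print(value)  # 42
```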
### Rule 5: Match Your I/O Library
Use async libraries for async code, sync libraries for sync code.
---
## Summary Table
| Scenario | Recommendation | Reasoning |
|----------|----------------|-----------|
| Simple function returning data | Sync | Async adds 1,400x overhead |
| In-memory operations | Sync | No I/O to wait on |
| Single database query | Sync | Query time < async amortization |
| Multiple independent API calls | Async + gather | Parallelism benefit outweighs overhead |
| Web server (many connections) | Async framework | Concurrent handling essential |
| CLI tool | Sync | Sequential operations, no benefit |
---
*Benchmark source: python-numbers-everyone-should-know (2026-01-01, Python 3.14.2, Apple Silicon)*