# Async Overhead in Python: When the Cure is Worse Than the Disease

**Domain Paper: Python Performance ADRs**

**Date:** 2026-01-03

**Source:** Python Numbers Everyone Should Know benchmarks (Python 3.14.2, Apple Silicon)

---

## Executive Summary

Async Python introduces a **1,400x overhead** for simple operations compared to synchronous equivalents. This overhead is fixed regardless of what work the function does. The critical insight: async only makes sense when you're waiting on I/O that takes orders of magnitude longer than this overhead.

**The Core Numbers:**

- Sync function call: **20.3 ns**
- Async equivalent via `run_until_complete`: **28.2 us** (28,200 ns)
- **Ratio: 1,387x slower** (approximately 1,400x)

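The ratio can be reproduced with a rough `timeit` sketch. This is a simplified stand-in for the original benchmark harness: absolute numbers vary by machine, and `asyncio.run` also creates and tears down a fresh event loop per call, so it measures an upper bound on the `run_until_complete` cost.

```python
import asyncio
import timeit

def sync_function():
    return 42

async def return_value_coro():
    return 42

# Per-call cost of a plain function call vs. driving a coroutine to
# completion through an event loop.
sync_ns = timeit.timeit(sync_function, number=100_000) / 100_000 * 1e9
async_ns = timeit.timeit(lambda: asyncio.run(return_value_coro()),
                         number=1_000) / 1_000 * 1e9

print(f"sync: {sync_ns:.0f} ns/call, async: {async_ns:.0f} ns/call, "
      f"ratio: {async_ns / sync_ns:.0f}x")
```

The exact ratio will differ from the published 1,387x, but the gap of several orders of magnitude is stable across hardware.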
---

## What Was Benchmarked

### Methodology

The benchmarks measured pure async machinery overhead using CPython 3.14.2 on Apple Silicon. Each operation was run thousands of times with warmup periods, reporting median values.

### Test Functions

```python
# The async function being tested
async def return_value_coro():
    return 42

# The sync equivalent
def sync_function():
    return 42
```

---

## Key Findings

### Coroutine Creation (Cheap)

| Operation | Time |
|-----------|------|
| Create coroutine object | 47.0 ns |

**Key insight:** Creating a coroutine object is cheap (47 ns). The cost comes when you actually run it.
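A minimal demonstration of that split (the coroutine name mirrors the benchmark's test function):

```python
import asyncio

async def return_value_coro():
    return 42

# Calling an async function does NOT run it -- it only builds a cheap
# coroutine object.
coro = return_value_coro()
print(type(coro).__name__)  # coroutine

# The expensive part is driving it through an event loop.
print(asyncio.run(coro))  # 42
```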

### Running Coroutines (Expensive)

| Operation | Time |
|-----------|------|
| `run_until_complete(empty)` | 27.6 us |
| `run_until_complete(return value)` | 26.6 us |
| Run nested await | 28.9 us |

**Key insight:** Every `run_until_complete` costs ~27 us regardless of coroutine complexity.

### The Critical Comparison

| Operation | Time | Ratio |
|-----------|------|-------|
| Sync function call | 20.3 ns | 1x |
| Async equivalent | 28.2 us | **1,387x** |

---

## When Async IS Appropriate

### Good Use Cases

1. **Web servers handling concurrent connections** - FastAPI/Starlette: 115-125k req/sec
2. **Concurrent network I/O** - Fetching data from multiple APIs simultaneously
3. **High-latency operations with parallelism** - `asyncio.gather()` for multiple slow API calls

### Bad Use Cases

1. **Wrapping synchronous database drivers** - Use native async drivers or stay sync
2. **CPU-bound computation** - Async doesn't parallelize CPU work (the GIL still applies)
3. **Simple scripts with sequential operations** - CLI tools, data processing pipelines

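The parallelism benefit behind the good cases is easy to see with `asyncio.sleep` standing in for network latency. This is a sketch: `slow_io` is a hypothetical placeholder for a real API call.

```python
import asyncio
import time

async def slow_io(delay: float) -> float:
    # Stand-in for a network call: sleeping yields control to the loop,
    # so other coroutines run while this one waits.
    await asyncio.sleep(delay)
    return delay

async def main() -> float:
    start = time.perf_counter()
    await asyncio.gather(slow_io(0.05), slow_io(0.05), slow_io(0.05))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(f"three 50 ms waits took {elapsed * 1000:.0f} ms total")
```

The three waits overlap, so the total is roughly 50 ms rather than 150 ms; the ~27 us loop-entry cost is paid once and is invisible next to the I/O time.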
---

## Practical Rules for Coding Agents

### Rule 1: Default to Sync

Write synchronous code unless you have a specific, measurable need for async.

### Rule 2: The 1ms Threshold

Only consider async when individual I/O operations take **>1 millisecond**; below that, the ~27 us event-loop overhead becomes a significant fraction of each call.

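The threshold follows from simple amortization arithmetic, using the ~27 us `run_until_complete` cost measured above:

```python
# Fraction of total time spent on async machinery for a single driven
# operation, using the ~27 us loop-entry cost from the benchmark.
LOOP_OVERHEAD_US = 27.0

def overhead_fraction(io_us: float) -> float:
    return LOOP_OVERHEAD_US / (LOOP_OVERHEAD_US + io_us)

print(f"{overhead_fraction(10):.0%}")      # 10 us in-memory op -> 73%
print(f"{overhead_fraction(1_000):.1%}")   # 1 ms I/O call      -> 2.6%
print(f"{overhead_fraction(50_000):.2%}")  # 50 ms API call     -> 0.05%
```

At 1 ms of real I/O the machinery is already down to a few percent; at typical API latencies it rounds to zero.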
### Rule 3: Batch Over Broadcast

If you need async, gather operations together:

```python
# Good: pay the event-loop entry cost once; fetches run concurrently
results = await asyncio.gather(*[fetch(url) for url in urls])

# Bad: each await completes before the next fetch starts -- no concurrency
for url in urls:
    result = await fetch(url)
```

### Rule 4: Stay in the Loop

Avoid calling `run_until_complete` from code that is already executing inside a running event loop; the call raises `RuntimeError`. Await the coroutine directly instead.

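A sketch of the pattern to prefer (`fetch_answer` is a hypothetical coroutine):

```python
import asyncio

async def fetch_answer() -> int:
    return 42

async def main() -> int:
    # Inside a running loop, just await. Calling
    # loop.run_until_complete(fetch_answer()) here would raise
    # RuntimeError ("This event loop is already running").
    return await fetch_answer()

print(asyncio.run(main()))  # 42
```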
### Rule 5: Match Your I/O Library

Use async libraries for async code and sync libraries for sync code; a blocking sync call inside a coroutine stalls the entire event loop.

---

## Summary Table

| Scenario | Recommendation | Reasoning |
|----------|----------------|-----------|
| Simple function returning data | Sync | Async adds 1,400x overhead |
| In-memory operations | Sync | No I/O to wait on |
| Single database query | Sync | Query time < async amortization |
| Multiple independent API calls | Async + gather | Parallelism benefit outweighs overhead |
| Web server (many connections) | Async framework | Concurrent handling essential |
| CLI tool | Sync | Sequential operations, no benefit |


---

*Benchmark source: python-numbers-everyone-should-know (2026-01-01, Python 3.14.2, Apple Silicon)*