feat: Add 8 domain papers and RULEBOOK.md
Domain papers distilled from python-numbers-everyone-should-know:
- async-overhead: 1,400x sync vs async overhead
- collection-membership: 200x set vs list at 1000 items
- json-serialization: 8x orjson vs stdlib
- exception-flow: 6.5x exception overhead (try/except free)
- string-formatting: f-strings > % > .format()
- memory-slots: 69% memory reduction with __slots__
- import-optimization: 100ms+ for heavy packages
- database-patterns: 98% commit overhead in SQLite

RULEBOOK.md: ~200 token distillation for coding subagents

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
111
papers/string-formatting.md
Normal file
@@ -0,0 +1,111 @@
# String Formatting: Domain Exploration

**Date:** 2026-01-03
**Source:** python-numbers-everyone-should-know benchmarks
**Python Version:** 3.14.2 (CPython, ARM64 macOS)

---

## Executive Summary

String formatting performance: **simple concatenation is fastest for trivial joins**, while **f-strings offer the best balance of readability and performance** for interpolation use cases.

---

## Raw Benchmark Results

| Operation | Time (ns) | Throughput |
|-----------|-----------|------------|
| `concat_small` | 39.1 | 25.6M ops/sec |
| `f_string` | 64.9 | 15.4M ops/sec |
| `percent_formatting` | 89.8 | 11.1M ops/sec |
| `format_method` | 103 | 9.7M ops/sec |

### Relative Performance

| Method | vs f-string |
|--------|-------------|
| `concat_small` | 1.66x faster |
| `f_string` | 1.00x (reference) |
| `percent_formatting` | 1.38x slower (0.72x the speed) |
| `format_method` | 1.59x slower (0.63x the speed) |

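
Figures like those in the tables above can be approximated with the standard-library `timeit` module. The sketch below is only an assumption about what each benchmark name measures: the source lists the names and timings, not the statements, so `SETUP`, `STATEMENTS`, and `measure_ns` are hypothetical. Absolute numbers will differ by machine and Python build.

```python
import timeit

# Assumed statements for each benchmark name; the original harness is not shown.
SETUP = "name = 'world'"
STATEMENTS = {
    "concat_small": "'Hello, ' + name",
    "f_string": "f'Hello, {name}!'",
    "percent_formatting": "'Hello, %s!' % name",
    "format_method": "'Hello, {}!'.format(name)",
}


def measure_ns(stmt: str, number: int = 100_000, repeat: int = 5) -> float:
    """Return the best observed time per operation in nanoseconds."""
    timings = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(timings) / number * 1e9


results = {label: measure_ns(stmt) for label, stmt in STATEMENTS.items()}
for label, ns in results.items():
    # Throughput in millions of operations per second.
    print(f"{label:<20} {ns:8.1f} ns  ({1e3 / ns:.1f}M ops/sec)")
```

Taking the minimum of several repeats reduces noise from other processes; it does not make results comparable across machines.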

---

## Why F-Strings Are Fast

F-strings are parsed at **compile time**, not runtime, so the template is never re-parsed when the expression runs:

1. **No method lookup**: F-strings don't call `.format()` at runtime
2. **No tuple creation**: `%` formatting requires a `(name,)` tuple
3. **Specialized bytecode**: `BUILD_STRING` and the per-value format opcode (`FORMAT_SIMPLE`/`FORMAT_WITH_SPEC` on CPython 3.13+, `FORMAT_VALUE` on earlier versions) are optimized

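
The points above can be checked by disassembling both forms with the standard-library `dis` module. This is a minimal sketch: the helper `opnames` and the example expressions are illustrative, and exact opcode names vary between CPython versions.

```python
import dis


def opnames(source: str) -> list[str]:
    """Compile an expression and return the names of its bytecode instructions."""
    code = compile(source, "<example>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]


f_ops = opnames("f'User {name} has {count} items'")
fmt_ops = opnames("'User {} has {} items'.format(name, count)")

# The f-string is assembled by BUILD_STRING with no call instruction;
# the .format() version looks up the method and then calls it.
print(f_ops)
print(fmt_ops)
```

Compiling in `"eval"` mode means `name` and `count` never need to exist, because the expressions are inspected rather than executed.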

---

## When to Use Each Method

### Concatenation Wins
For joining 2-3 short strings with no formatting:
```python
path = base_dir + '/' + filename  # Simpler, faster
```

### % Formatting for Logging
```python
# Deferred evaluation - string built only if debug enabled
logger.debug('Processing %s items', count)

# f-string - string ALWAYS built, even when the record is discarded
logger.debug(f'Processing {count} items')  # Wasteful
```
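
The difference can be observed directly by counting how often an argument is rendered to text. In this sketch the `Expensive` class and the logger name `formatting_demo` are hypothetical, introduced only for illustration.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is rendered to text."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "expensive-repr"


logger = logging.getLogger("formatting_demo")
logger.setLevel(logging.INFO)  # DEBUG records are filtered out

lazy = Expensive()
logger.debug("Processing %s", lazy)  # arguments are never formatted
lazy_renders = lazy.renders

eager = Expensive()
logger.debug(f"Processing {eager}")  # f-string is built before the call
eager_renders = eager.renders

print(lazy_renders, eager_renders)  # prints "0 1"
```

With the level set to `INFO`, the `%`-style call returns before any formatting happens, while the f-string has already called `__str__` by the time `logger.debug` runs.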

### .format() for Dynamic Templates
```python
template = get_template_from_config()  # Returns 'User: {name}'
result = template.format(name=user.name)
```
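
A self-contained variant of the same idea, with a hard-coded template standing in for the configuration lookup. The template text and field names are assumptions made for illustration.

```python
# Hypothetical template, standing in for a value loaded from configuration.
template = "User: {name} ({role})"

# Keyword arguments fill the named placeholders at runtime.
result = template.format(name="ada", role="admin")

# format_map accepts an existing mapping without unpacking it.
fields = {"name": "grace", "role": "viewer"}
result_from_mapping = template.format_map(fields)

# A placeholder with no matching argument raises KeyError.
try:
    template.format(name="ada")
    missing = None
except KeyError as exc:
    missing = exc.args[0]

print(result)
print(result_from_mapping)
print(missing)
```

Because the template is only known at runtime, an f-string cannot be used here; f-strings must appear as literals in the source code.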

---

## Practical Rules for Coding Agents

### Rule 1: Default to F-Strings
```python
# Preferred
message = f'User {user.name} logged in at {timestamp}'
```

### Rule 2: Use Concatenation for Trivial Joins
```python
url = base_url + endpoint  # Fine - simpler and faster
```

### Rule 3: Use join() for Multiple Parts
```python
# Preferred - single pass, O(n) in the total length
result = ''.join([part1, part2, part3, part4])

# Inefficient as the number of parts grows - each + may copy the
# accumulated string, approaching O(n^2) when done in a loop
result = part1 + part2 + part3 + part4
```
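
The rule matters most when the number of parts is not fixed. The sketch below contrasts the two approaches on a generated list; the function names and the `rows` data are illustrative only.

```python
def build_with_join(parts: list[str]) -> str:
    """Join all parts in one pass over the list."""
    return "".join(parts)


def build_with_concat(parts: list[str]) -> str:
    """Append parts one by one; each step may copy the accumulated string."""
    result = ""
    for part in parts:
        result = result + part
    return result


rows = [f"row-{i};" for i in range(1_000)]
joined = build_with_join(rows)
concatenated = build_with_concat(rows)

# Both approaches produce the same string; only the cost differs.
print(joined == concatenated, len(joined))
```

For a handful of parts the difference is negligible, which is why Rule 2 still allows plain `+` for trivial joins.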

### Rule 4: Keep % for Logging
```python
logger.info('Processed %d records in %.2fs', count, elapsed)
```

---

## Summary

| Scenario | Best Choice | Reason |
|----------|-------------|--------|
| Variable interpolation | f-string | 1.6x faster than `.format()` |
| Simple 2-part join | Concatenation | 1.7x faster than f-string |
| Building from many parts | `''.join()` | O(n) vs O(n^2) |
| Logging statements | `%` style | Deferred evaluation |
| Dynamic templates | `.format()` | Template flexibility |

---

*Benchmark source: python-numbers-everyone-should-know*