Code Review Best Practices
Core Philosophy
Review at PR level, not file-by-file. Focus on egregious structural issues, not minutia. Look for cross-file patterns that indicate architectural problems.
Quick Start
Option 1: LLM-Based Whole-PR Review (Fast, High-Level)
Use the llm CLI tool for fast whole-PR reviews with large context models:
bash
# Extract diff and review with Claude Sonnet ./scripts/extract-diff.sh && ./scripts/review-with-llm.sh # Use different models for different perspectives ./scripts/review-with-llm.sh -m opus # Deeper analysis ./scripts/review-with-llm.sh -m gpt-4-turbo # Different insights # Include project guidelines ./scripts/review-with-llm.sh -g docs/guidelines/
Prerequisites: Install Simon Willison's llm tool:
bash
pip install llm && llm keys set anthropic
See scripts/README.md for full setup instructions.
Option 2: Manual Diff Extraction
bash
# ALWAYS filter out lock files first git diff main...HEAD -- . ':!uv.lock' ':!poetry.lock' > /tmp/pr-diff.txt wc -lc /tmp/pr-diff.txt # Most diffs fit after this # If still too big, filter more git diff main...HEAD -- . ':!uv.lock' ':!docs/*' > /tmp/code-diff.txt
What to Look For - Egregious Issues Only
Checklist
- • Code duplication - Same logic in 2+ places?
- • Repeated patterns - Same code structure 3+ times?
- • God functions - Functions over 100 lines?
- • Weak types - Any, object, raw dict/list, missing annotations?
- • Type-deficient patterns - hasattr, getattr instead of proper types?
- • Meaningless tests - Tests that just verify assignment?
- • Dead code - Unused functions, commented code?
What NOT to Flag
- •Variable naming (unless truly confusing)
- •Line length
- •Comment style
- •Whitespace
- •Import order
These are auto-fixable or minor. Focus on structural problems.
Egregious Issues
1. Massive Code Duplication
python
# WRONG - Same validation in two files
# file1.py
def process_user_data(user):
if not user.get("email"):
raise ValueError("Email required")
if not user.get("name"):
raise ValueError("Name required")
# file2.py - DUPLICATE!
def validate_user(user):
if not user.get("email"):
raise ValueError("Email required")
if not user.get("name"):
raise ValueError("Name required")
# CORRECT - Extract once, reuse
def validate_user_fields(user: User) -> User:
if not user.email:
raise ValueError("Email required")
if not user.name:
raise ValueError("Name required")
return user
2. God Functions (100+ lines)
python
# WRONG - 200 line function doing everything
def process_order(order_data):
# 50 lines of validation
# 30 lines of transformation
# 40 lines of API calls
# 30 lines of database updates
# 50 lines of error handling
...
# CORRECT - Extract responsibilities
def process_order(order_data: OrderData) -> ProcessedOrder:
validated = _validate_order(order_data)
transformed = _transform_order(validated)
api_result = _call_payment_api(transformed)
return _save_order(transformed, api_result)
3. Weak Types
python
# WRONG - Any, raw dict/list
def process_data(data: Any) -> dict:
return data.get("value")
def get_users() -> list: # List of what?
return [{"name": "Bob"}]
# CORRECT - Specific types
def process_data(data: UserData) -> UserResponse:
return UserResponse(value=data.value)
def get_users() -> list[User]:
return [User(name="Bob")]
4. hasattr/getattr (Type Deficiency Markers)
python
# WRONG - Using hasattr because type is wrong
def process(obj: Any):
if hasattr(obj, "name"):
return obj.name
return None
# CORRECT - Use Protocol
class Named(Protocol):
name: str
def process(obj: Named) -> str:
return obj.name
5. Related Classes Without Shared Interface
python
# WRONG - Related classes with drifting interfaces
class CacheDirectory:
def list_entries(self, filter=None): ...
def count(self, filter=None): ...
class StatusDirectory:
def get_all_entries(self): ... # Different name!
def total_count(self): ... # Different name!
# CORRECT - ABC enforces consistent interface
class FileDirectoryBase(ABC, Generic[T]):
@abstractmethod
def list_entries(self) -> list[T]: ...
@abstractmethod
def count(self) -> int: ...
6. Meaningless Tests
python
# WRONG - Tests assignment
def test_user_created():
user = User(name="Bob", email="bob@example.com")
assert user is not None # Meaningless
assert user.name == "Bob" # Just testing assignment
# CORRECT - Tests invariants
def test_user_email_must_be_valid():
"""INVARIANT: User email must contain @ and domain."""
with pytest.raises(ValidationError):
User(name="Bob", email="invalid-email")
After Code Review
CRITICAL: After completing a code review, ALWAYS:
- •Present findings - Output the full review report
- •List suggested actions - Number each potential fix
- •Ask for approval - Before creating TODOs or executing
- •Wait for explicit approval - User may reject or modify
Example flow:
code
[Code review findings output] ## Suggested Actions 1. Add format-duration validator to ClipV4PromptSchema 2. Add tests for format-duration validation 3. Move CLIPS_PER_HOUR to config class Which items should I proceed with?
Scripts
The plugin includes scripts for LLM-powered code review:
- •
scripts/extract-diff.sh- Extract diffs in PR, commit, or path mode - •
scripts/review-with-llm.sh- Run code review with llm CLI tool - •
scripts/README.md- Setup instructions and usage examples
Reference Files
- •references/what-to-flag.md - Detailed patterns to flag
Related Skills
- •For pattern discovery, see
/pythonista-patterning - •For testing patterns, see
/pythonista-testing - •For type safety, see
/pythonista-typing