Code Review Best Practices

Core Philosophy

Review at PR level, not file-by-file. Focus on egregious structural issues, not minutia. Look for cross-file patterns that indicate architectural problems.

Quick Start

Option 1: LLM-Based Whole-PR Review (Fast, High-Level)

Use the llm CLI tool for fast whole-PR reviews with large context models:

bash

# Extract diff and review with Claude Sonnet
./scripts/extract-diff.sh && ./scripts/review-with-llm.sh

# Use different models for different perspectives
./scripts/review-with-llm.sh -m opus       # Deeper analysis
./scripts/review-with-llm.sh -m gpt-4-turbo  # Different insights

# Include project guidelines
./scripts/review-with-llm.sh -g docs/guidelines/

Prerequisites: Install Simon Willison's llm tool:

bash

pip install llm && llm keys set anthropic

See scripts/README.md for full setup instructions.

Option 2: Manual Diff Extraction

bash

# ALWAYS filter out lock files first
git diff main...HEAD -- . ':!uv.lock' ':!poetry.lock' > /tmp/pr-diff.txt
wc -lc /tmp/pr-diff.txt  # Most diffs fit after this

# If still too big, filter more
git diff main...HEAD -- . ':!uv.lock' ':!docs/*' > /tmp/code-diff.txt

What to Look For - Egregious Issues Only

Checklist

• Code duplication - Same logic in 2+ places?
• Repeated patterns - Same code structure 3+ times?
• God functions - Functions over 100 lines?
• Weak types - Any, object, raw dict/list, missing annotations?
• Type-deficient patterns - hasattr, getattr instead of proper types?
• Meaningless tests - Tests that just verify assignment?
• Dead code - Unused functions, commented code?

What NOT to Flag

•Variable naming (unless truly confusing)
•Line length
•Comment style
•Whitespace
•Import order

These are auto-fixable or minor. Focus on structural problems.

Egregious Issues

1. Massive Code Duplication

python

# WRONG - Same validation in two files
# file1.py
def process_user_data(user):
    if not user.get("email"):
        raise ValueError("Email required")
    if not user.get("name"):
        raise ValueError("Name required")

# file2.py - DUPLICATE!
def validate_user(user):
    if not user.get("email"):
        raise ValueError("Email required")
    if not user.get("name"):
        raise ValueError("Name required")

# CORRECT - Extract once, reuse
def validate_user_fields(user: User) -> User:
    if not user.email:
        raise ValueError("Email required")
    if not user.name:
        raise ValueError("Name required")
    return user

2. God Functions (100+ lines)

python

# WRONG - 200 line function doing everything
def process_order(order_data):
    # 50 lines of validation
    # 30 lines of transformation
    # 40 lines of API calls
    # 30 lines of database updates
    # 50 lines of error handling
    ...

# CORRECT - Extract responsibilities
def process_order(order_data: OrderData) -> ProcessedOrder:
    validated = _validate_order(order_data)
    transformed = _transform_order(validated)
    api_result = _call_payment_api(transformed)
    return _save_order(transformed, api_result)

3. Weak Types

python

# WRONG - Any, raw dict/list
def process_data(data: Any) -> dict:
    return data.get("value")

def get_users() -> list:  # List of what?
    return [{"name": "Bob"}]

# CORRECT - Specific types
def process_data(data: UserData) -> UserResponse:
    return UserResponse(value=data.value)

def get_users() -> list[User]:
    return [User(name="Bob")]

4. hasattr/getattr (Type Deficiency Markers)

python

# WRONG - Using hasattr because type is wrong
def process(obj: Any):
    if hasattr(obj, "name"):
        return obj.name
    return None

# CORRECT - Use Protocol
class Named(Protocol):
    name: str

def process(obj: Named) -> str:
    return obj.name

5. Related Classes Without Shared Interface

python

# WRONG - Related classes with drifting interfaces
class CacheDirectory:
    def list_entries(self, filter=None): ...
    def count(self, filter=None): ...

class StatusDirectory:
    def get_all_entries(self): ...  # Different name!
    def total_count(self): ...  # Different name!

# CORRECT - ABC enforces consistent interface
class FileDirectoryBase(ABC, Generic[T]):
    @abstractmethod
    def list_entries(self) -> list[T]: ...
    @abstractmethod
    def count(self) -> int: ...

6. Meaningless Tests

python

# WRONG - Tests assignment
def test_user_created():
    user = User(name="Bob", email="bob@example.com")
    assert user is not None  # Meaningless
    assert user.name == "Bob"  # Just testing assignment

# CORRECT - Tests invariants
def test_user_email_must_be_valid():
    """INVARIANT: User email must contain @ and domain."""
    with pytest.raises(ValidationError):
        User(name="Bob", email="invalid-email")

After Code Review

CRITICAL: After completing a code review, ALWAYS:

•Present findings - Output the full review report
•List suggested actions - Number each potential fix
•Ask for approval - Before creating TODOs or executing
•Wait for explicit approval - User may reject or modify

Example flow:

code

[Code review findings output]

## Suggested Actions
1. Add format-duration validator to ClipV4PromptSchema
2. Add tests for format-duration validation
3. Move CLIPS_PER_HOUR to config class

Which items should I proceed with?

Scripts

The plugin includes scripts for LLM-powered code review:

•scripts/extract-diff.sh - Extract diffs in PR, commit, or path mode
•scripts/review-with-llm.sh - Run code review with llm CLI tool
•scripts/README.md - Setup instructions and usage examples

Reference Files

•references/what-to-flag.md - Detailed patterns to flag

Related Skills

•For pattern discovery, see /pythonista-patterning
•For testing patterns, see /pythonista-testing
•For type safety, see /pythonista-typing