
# Review Implementation

SKILL.md for the `/review-implementation` command.

Reviews an existing arXiv paper implementation against the original paper to verify correctness, completeness, and adherence to the paper's methodology.

## Usage

```
/review-implementation <arxiv_url_or_id> [options]
```

## Arguments

- `<arxiv_url_or_id>`: arXiv paper URL (e.g., `https://arxiv.org/abs/2301.00001`) or ID (e.g., `2301.00001`)

## Options

- `--propose-fixes`: Include concrete code proposals to fix identified issues in the review report
- `--detailed`: Generate a more detailed review with section-by-section analysis

## Output

The review report is saved as `REVIEW.md` in the `paper_impl/{arxiv_id}/` directory.

## Workflow

When this skill is invoked, follow these phases in order:

### Phase 1: Gather Resources

1. **Locate the implementation directory**

   ```bash
   cd /Users/shibuiyusuke/tmp/paper2code
   uv run python -c "
   from src import extract_arxiv_id, fetch_paper_metadata
   from pathlib import Path

   paper = fetch_paper_metadata(extract_arxiv_id('<arxiv_url_or_id>'))
   impl_dir = Path('./paper_impl') / paper.clean_id

   if not impl_dir.exists():
       print(f'ERROR: Implementation not found at {impl_dir}')
       exit(1)

   print(f'Implementation directory: {impl_dir}')
   print(f'Paper: {paper.title}')
   "
   ```
    
2. **Read the paper PDF**

   - Use the Read tool to read the PDF at `paper_impl/{clean_id}.pdf`
   - If the PDF is not found, check the `paper_impl/{clean_id}/` directory
   - Focus on: Abstract, Method/Approach, algorithm boxes, and the Appendix for implementation details

3. **Read all implementation source files** (see the sketch after this list)

   - Read all Python files under `paper_impl/{arxiv_id}/src/`
   - Read all test files under `paper_impl/{arxiv_id}/tests/`
   - Read `README.md` and `requirements.txt`
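
A minimal sketch of how steps 2 and 3 can be enumerated in one pass, assuming the layout used elsewhere in this skill (`paper_impl/{arxiv_id}/` with `src/`, `tests/`, `README.md`, and `requirements.txt`); the placeholder ID stands in for `paper.clean_id`:

```python
# Sketch only: list every file the review needs to read, flagging anything missing.
from pathlib import Path

impl_dir = Path("./paper_impl") / "<arxiv_id>"  # placeholder for paper.clean_id
files_to_read = [
    *sorted(Path("./paper_impl").glob("*.pdf")),   # paper PDF at the top level...
    *sorted(impl_dir.glob("*.pdf")),               # ...or inside the impl directory
    *sorted((impl_dir / "src").rglob("*.py")),     # implementation sources
    *sorted((impl_dir / "tests").rglob("*.py")),   # tests
    impl_dir / "README.md",
    impl_dir / "requirements.txt",
]
for path in files_to_read:
    print(f"{'OK     ' if path.exists() else 'MISSING'} {path}")
```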

### Phase 2: Paper Analysis

Extract the key implementation requirements from the paper (a note-keeping sketch follows this list):

1. **Core Algorithm/Model**

   - Main mathematical formulations (equations)
   - Algorithm pseudocode (if provided)
   - Architecture diagrams and components
   - Key hyperparameters and their values

2. **Implementation Details**

   - Initialization methods
   - Normalization techniques
   - Activation functions
   - Training procedures (loss functions, optimizers)
   - Data preprocessing steps

3. **Evaluation Methodology**

   - Datasets used
   - Metrics reported
   - Baseline comparisons
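
One way to keep these extracted requirements organized during the review is a small per-component record; this is a hypothetical sketch, not a structure defined by the skill or the repository:

```python
# Hypothetical note-keeping structure for Phase 2; the fields mirror the list above.
from dataclasses import dataclass, field

@dataclass
class PaperRequirement:
    component: str          # e.g. "attention block", "loss function"
    paper_reference: str    # e.g. "Section 3.2, Eq. (4)"
    details: str            # formulation, pseudocode steps, init/normalization notes
    hyperparameters: dict = field(default_factory=dict)

requirements = [
    PaperRequirement(
        component="{component name}",
        paper_reference="Section X.Y, Equation (Z)",
        details="{copy the relevant formulation or pseudocode here}",
        hyperparameters={"learning_rate": 3e-4},  # placeholder value, not from a real paper
    ),
]
```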

### Phase 3: Implementation Review

Compare the implementation against the paper systematically:

#### 3.1 Structural Review

- **File organization**: Does the project structure make sense?
- **Module separation**: Are components properly modularized?
- **Dependencies**: Are all required libraries included? (see the dependency-check sketch below)
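
A rough way to support the dependency check is to compare top-level imports in `src/` against `requirements.txt`; this is a heuristic sketch (package names and import names do not always match), assuming the layout described in Phase 1:

```python
# Heuristic sketch: flag imports in src/ that do not appear in requirements.txt.
import ast
from pathlib import Path

impl_dir = Path("./paper_impl") / "<arxiv_id>"  # placeholder id
requirements = (impl_dir / "requirements.txt").read_text().lower()

imported: set[str] = set()
for py_file in (impl_dir / "src").rglob("*.py"):
    for node in ast.walk(ast.parse(py_file.read_text())):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            imported.add(node.module.split(".")[0])

stdlib_or_local = {"os", "sys", "math", "typing", "pathlib", "dataclasses", "src"}
for name in sorted(imported - stdlib_or_local):
    print(f"{'OK   ' if name.lower() in requirements else 'CHECK'} {name}")
```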

#### 3.2 Algorithm Correctness

For each key algorithm/model component (a quick sanity-check sketch follows this list):

1. **Equation Mapping**

   - Does the code implement the equations correctly?
   - Are variable names consistent with paper notation?
   - Are all terms and operations accounted for?

2. **Control Flow**

   - Does the algorithm flow match the pseudocode?
   - Are edge cases handled correctly?
   - Is the order of operations correct?

3. **Numerical Considerations**

   - Is numerical stability addressed?
   - Are appropriate epsilon values used?
   - Is gradient flow preserved?
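
For components where the paper-to-code mapping looks suspicious, a quick shape and gradient sanity check often settles it. A hedged sketch, assuming a PyTorch implementation; `ToyBlock` below is a stand-in for whatever module the reviewed repository actually defines:

```python
# Sketch of a shape + gradient-flow sanity check; ToyBlock is a stand-in, not repo code.
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    def __init__(self, dim: int) -> None:
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

block = ToyBlock(dim=64)
x = torch.randn(2, 16, 64, requires_grad=True)
y = block(x)

assert y.shape == x.shape, f"unexpected output shape: {tuple(y.shape)}"  # shape check
y.sum().backward()
assert x.grad is not None and torch.isfinite(x.grad).all()  # gradients flow and are finite
```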

#### 3.3 Completeness Review

- Are all major components from the paper implemented?
- Are optional features or ablations mentioned in the paper included?
- Are hyperparameters configurable and set to paper defaults? (see the sketch below)
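
For the hyperparameter question, a simple diff between values transcribed from the paper and the repository's configured defaults makes mismatches explicit; both dictionaries below are hypothetical examples, not values from any real paper or repo:

```python
# Sketch: compare paper-reported defaults against the repo's configuration values.
paper_defaults = {"learning_rate": 3e-4, "batch_size": 128, "num_layers": 6}  # from the paper
repo_defaults = {"learning_rate": 1e-3, "batch_size": 128, "num_layers": 6}   # from the code/config

for name, expected in paper_defaults.items():
    actual = repo_defaults.get(name, "<missing>")
    status = "OK      " if actual == expected else "MISMATCH"
    print(f"{status} {name}: paper={expected} repo={actual}")
```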

#### 3.4 Code Quality Review

- Type hints and docstrings
- Comments referencing paper sections/equations
- Test coverage
- Error handling

### Phase 4: Generate Review Report

Create `REVIEW.md` in the implementation directory with the following structure:

````markdown
# Implementation Review: {Paper Title}

**Paper**: [{arxiv_id}]({arxiv_url})
**Review Date**: {current_date}
**Implementation Path**: `paper_impl/{arxiv_id}/`

## Summary

{Brief overall assessment: Excellent/Good/Needs Improvement/Major Issues}

{2-3 sentence summary of the implementation quality}

## Detailed Review

### 1. Algorithm Correctness

#### 1.1 {Component Name}

**Paper Reference**: Section X.Y, Equation (Z)

**Status**: {Correct/Partial/Incorrect/Missing}

**Analysis**:
{Detailed analysis of how the implementation matches or differs from the paper}

{If --propose-fixes is set and status is not Correct:}
**Proposed Fix**:
```python
# Proposed code changes
```

#### 1.2 {Next Component}

...

### 2. Completeness

| Component | Paper Section | Implemented | Notes |
|---|---|---|---|
| {Component 1} | Section X | Yes/No/Partial | {notes} |
| {Component 2} | Section Y | Yes/No/Partial | {notes} |
| ... | | | |

### 3. Code Quality

#### 3.1 Documentation

- README completeness
- Docstrings present
- Paper references in comments

#### 3.2 Testing

- Unit tests present
- Shape tests
- Gradient tests
- Integration tests

#### 3.3 Best Practices

- Type hints
- Error handling
- Numerical stability

### 4. Discrepancies and Recommendations

#### Critical Issues

{List of issues that affect correctness}

{If --propose-fixes is set:} **Fix**:
```python
# Code to fix the issue
```

#### Minor Issues

{List of style/quality issues}

#### Recommendations

{Suggestions for improvement}

## Verification Checklist

- [ ] Core algorithm matches paper equations
- [ ] Hyperparameters match paper defaults
- [ ] Training procedure aligns with paper
- [ ] Output shapes are correct
- [ ] Numerical stability is handled

## Conclusion

{Final assessment and summary of key findings}
````

### Phase 5: Fix Proposals (if --propose-fixes)

When the `--propose-fixes` option is specified:

1. For each identified issue, provide:
   - **Location**: File path and line numbers
   - **Problem**: Clear description of the issue
   - **Paper Reference**: Relevant section/equation
   - **Proposed Code**: Complete code snippet to fix the issue

2. Format fix proposals clearly:
   ````markdown
   #### Fix for: {Issue Title}

   **File**: `src/{filename}.py`
   **Lines**: {start}-{end}

   **Current Code**:
   ```python
   # Current problematic code
   ```

   **Proposed Code**:
   ```python
   # Fixed code with comments explaining changes
   ```

   **Rationale**: {Why this fix is correct according to the paper}
   ````

3. For complex fixes spanning multiple files, provide a step-by-step guide

## Important Guidelines

### When Reading the Paper

- **Focus on implementation details**: Look for Appendix sections with hyperparameters
- **Note ambiguities**: Papers often omit details - note what's missing
- **Check figures carefully**: Architecture diagrams often reveal details not in text

### When Reviewing Code

- **Be thorough but fair**: Not every deviation is a bug
- **Consider practical adaptations**: Some changes may be valid simplifications
- **Check test results**: If tests pass, the implementation likely works (see the sketch below)
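
One way to check test results directly from the review session is to run the implementation's test suite; a sketch, assuming the project uses `uv` (as in Phase 1) and lists pytest among its dependencies:

```python
# Sketch: run the reviewed implementation's tests and report the outcome.
import subprocess

result = subprocess.run(
    ["uv", "run", "pytest", "paper_impl/<arxiv_id>/tests/", "-q"],  # placeholder id
    capture_output=True,
    text=True,
)
print(result.stdout[-2000:])  # tail of the pytest report
print("tests passed" if result.returncode == 0 else "tests FAILED")
```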

### Review Criteria

Rate each component using:
- **Correct**: Faithfully implements the paper
- **Partial**: Implements the concept but with minor deviations
- **Incorrect**: Contains errors that affect functionality
- **Missing**: Not implemented when it should be

### Common Issues to Look For

1. **Matrix operations**: Transposition errors, wrong axis for reduction (illustrated in the sketch after this list)
2. **Normalization**: Wrong layer norm placement, missing normalization
3. **Initialization**: Default PyTorch init vs paper-specified init
4. **Activation functions**: Wrong activation or placement
5. **Loss computation**: Missing terms, wrong reduction
6. **Hyperparameters**: Different from paper defaults
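
As an example of item 1, the wrong-axis-for-reduction pattern often shows up in attention softmax; an illustrative sketch, not code from any reviewed repository:

```python
# Illustration of "wrong axis for reduction" on a scaled dot-product attention softmax.
import torch
import torch.nn.functional as F

q = torch.randn(2, 4, 8, 16)  # (batch, heads, queries, head_dim)
k = torch.randn(2, 4, 8, 16)
scores = q @ k.transpose(-2, -1) / (16 ** 0.5)  # (batch, heads, queries, keys)

wrong = F.softmax(scores, dim=-2)  # normalizes over queries: key columns sum to 1
right = F.softmax(scores, dim=-1)  # normalizes over keys: each query row sums to 1

print(wrong.sum(dim=-1)[0, 0, 0].item(), right.sum(dim=-1)[0, 0, 0].item())  # only `right` gives 1.0
```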

## Example Session

User: `/review-implementation https://arxiv.org/abs/2511.07800 --propose-fixes`

Claude: I'll review the implementation of "From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory".

[Phase 1: Locating implementation and reading paper...]
[Phase 2: Extracting paper requirements...]
[Phase 3: Reviewing implementation...]
[Phase 4: Generating review report...]
[Phase 5: Adding fix proposals...]

Review complete! Saved to `paper_impl/2511_07800v1/REVIEW.md`

Summary:

- Overall: Good
- 3 components fully correct
- 1 component with minor issues
- 2 fix proposals included