What I Do
I am the Reflexion System, a self-improvement mechanism inspired by the Reflexion paper (Shinn et al.). I enable agents to learn from failures.
Core Responsibilities
Failure Capture:
- •Record task attempts
- •Capture error messages
- •Document approach taken
- •Store test results
- •Track performance metrics
Reflection Generation:
- •Root cause analysis
- •Identify incorrect assumptions
- •Propose alternative approaches
- •Extract generalizable learnings
- •Generate improved strategy
Memory Storage:
- •Link reflections to tasks
- •Store in episodic memory
- •Extract patterns for reuse
- •Update agent knowledge
- •Share learnings across agents
Retry with Knowledge:
- •Apply lessons learned
- •Use improved strategy
- •Add validation steps
- •Monitor progress carefully
- •Measure improvement
When to Use Me
Use me when:
- •An agent fails a task
- •Tests don't pass
- •Security issues found
- •Performance targets not met
- •Code review rejected
- •Any failure occurs
Reflexion Pattern
When Triggered
- •Test failures
- •Code review rejections
- •Security vulnerabilities found
- •Performance targets not met
- •User acceptance criteria not met
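The triggers listed above can be checked mechanically once a task outcome is available. The following is a minimal sketch only: the `FailureKind` enum, the `detect_failures` function, and every key read from the `outcome` dictionary are hypothetical names chosen for illustration, not part of any existing API.

```python
from enum import Enum


class FailureKind(Enum):
    """One member per trigger in the list above (names are illustrative)."""
    TEST_FAILURE = "test_failure"
    REVIEW_REJECTION = "code_review_rejection"
    SECURITY_VULNERABILITY = "security_vulnerability"
    PERFORMANCE_MISS = "performance_target_not_met"
    ACCEPTANCE_MISS = "acceptance_criteria_not_met"


def detect_failures(outcome: dict) -> list[FailureKind]:
    """Return every trigger present in a task outcome.

    The outcome keys are assumptions; a missing key is treated as "no failure".
    """
    checks = [
        (FailureKind.TEST_FAILURE, outcome.get("tests_failed", 0) > 0),
        (FailureKind.REVIEW_REJECTION, outcome.get("review_approved") is False),
        (FailureKind.SECURITY_VULNERABILITY, bool(outcome.get("vulnerabilities"))),
        (FailureKind.PERFORMANCE_MISS, outcome.get("meets_performance_target") is False),
        (FailureKind.ACCEPTANCE_MISS, outcome.get("acceptance_met") is False),
    ]
    return [kind for kind, triggered in checks if triggered]
```

An empty result means no reflection is needed; any non-empty result would start the process described next.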
Process
1. Capture Failure:
yaml
information_gathered:
  - Original task description
  - Agent's approach
  - Code/output produced
  - Test results
  - Error messages
  - Stack traces
  - Performance metrics
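One way to hold the gathered information is a small record type. This is a sketch under assumptions: the `FailureRecord` class and its field names are hypothetical and simply mirror the items listed above.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class FailureRecord:
    """Everything captured about one failed attempt (hypothetical field names)."""
    task_description: str
    approach: str
    output: str                      # code or other output the agent produced
    test_results: str
    error_messages: list[str] = field(default_factory=list)
    stack_traces: list[str] = field(default_factory=list)
    performance_metrics: dict[str, float] = field(default_factory=dict)
    attempt_number: int = 1


# Example values taken from the authentication example in this document.
record = FailureRecord(
    task_description="Implement user authentication",
    approach="Store passwords in plain text",
    output="(code produced by the agent)",
    test_results="Security audit: 1 critical finding",
    error_messages=["Security audit flagged critical vulnerability"],
)

# asdict() gives a plain dictionary that is easy to serialize or log.
record_dict = asdict(record)
```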
2. Generate Reflection:
LLM Prompt:
code
You attempted to complete this task:
{task_description}
Your approach was:
{approach_taken}
The code you wrote:
{code}
Test results:
{test_results}
Errors encountered:
{errors}
Performance metrics:
{metrics}
This was attempt #{attempt_number}.
Provide a detailed reflection:
1. ROOT CAUSE ANALYSIS
- What exactly went wrong?
- Why did it happen?
- What was the fundamental error in reasoning?
2. INCORRECT ASSUMPTIONS
- What did you assume that was wrong?
- What did you overlook?
- What edge cases did you miss?
3. ALTERNATIVE APPROACHES
- What should you try differently?
- What patterns or techniques would work better?
- What additional validation is needed?
4. GENERALIZABLE LEARNINGS
- What lesson applies to similar tasks?
- What pattern should you remember?
- What should you check for next time?
Be specific and actionable. Focus on what to change, not just what went wrong.
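The prompt above uses `{placeholder}` fields, so it can be filled with Python's built-in `str.format`. The sketch below abbreviates the template and assumes the failure data arrives as a dictionary; `REFLECTION_PROMPT`, `template_fields`, and `build_reflection_prompt` are hypothetical names.

```python
import string

# Abbreviated copy of the prompt above; the full text would go here.
REFLECTION_PROMPT = """You attempted to complete this task:
{task_description}
Your approach was:
{approach_taken}
The code you wrote:
{code}
Test results:
{test_results}
Errors encountered:
{errors}
Performance metrics:
{metrics}
This was attempt #{attempt_number}.
Provide a detailed reflection: ...
"""


def template_fields(template: str) -> set[str]:
    """Collect the placeholder names used in a str.format template."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


def build_reflection_prompt(failure: dict) -> str:
    """Fill the template, failing loudly if any placeholder has no value."""
    missing = template_fields(REFLECTION_PROMPT) - set(failure)
    if missing:
        raise ValueError(f"missing prompt fields: {sorted(missing)}")
    return REFLECTION_PROMPT.format(**failure)
```

Checking for missing fields up front avoids sending the model a prompt with a silent gap in it; braces inside the substituted values (for example in `code`) are left untouched by `str.format`.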
Output Structure:
yaml
root_cause:
  technical: str
  reasoning: str
incorrect_assumptions:
  - assumption: str
    why_wrong: str
    correct_approach: str
improved_strategy:
  approach: str
  implementation_steps: [str]
  validation_plan: str
lessons_learned:
  - lesson: str
    applicability: str
    pattern_name: str
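Model output does not always follow the requested structure, so it may be worth checking a parsed reflection before storing it. The sketch below assumes the YAML has already been loaded into a dictionary by some parser; `validate_reflection` is a hypothetical helper that checks key presence only, not value types.

```python
# Keys expected in each mapping section and in each list item,
# taken from the output structure above.
MAPPING_SECTIONS = {
    "root_cause": ("technical", "reasoning"),
    "improved_strategy": ("approach", "implementation_steps", "validation_plan"),
}
LIST_SECTIONS = {
    "incorrect_assumptions": ("assumption", "why_wrong", "correct_approach"),
    "lessons_learned": ("lesson", "applicability", "pattern_name"),
}


def validate_reflection(reflection: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for section, keys in MAPPING_SECTIONS.items():
        body = reflection.get(section)
        if not isinstance(body, dict):
            problems.append(f"{section}: missing or not a mapping")
            continue
        problems += [f"{section}.{key}: missing" for key in keys if key not in body]
    for section, keys in LIST_SECTIONS.items():
        items = reflection.get(section)
        if not isinstance(items, list):
            problems.append(f"{section}: missing or not a list")
            continue
        for index, item in enumerate(items):
            if not isinstance(item, dict):
                problems.append(f"{section}[{index}]: not a mapping")
                continue
            problems += [
                f"{section}[{index}].{key}: missing" for key in keys if key not in item
            ]
    return problems
```

A non-empty result could be fed back to the model as a request to regenerate the reflection in the expected shape.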
3. Store Reflection:
Episodic Memory:
- •Link to original task
- •Full reflection
- •Attempt number
- •Timestamp
- •Agent ID
Pattern Library:
- •If generalizable, extract pattern
- •Add to shared knowledge
- •Make available to all agents
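The two stores described above can be sketched as an in-memory structure. Everything here is an illustrative assumption: `ReflectionEntry` and `ReflexionMemory` are hypothetical names, a real system would presumably persist to a database, and a lesson is treated as generalizable simply when it carries a `pattern_name`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ReflectionEntry:
    """One episodic-memory record, holding the fields listed above."""
    task_id: str
    agent_id: str
    attempt_number: int
    reflection: dict
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ReflexionMemory:
    """In-memory stand-in for episodic memory plus a shared pattern library."""

    def __init__(self) -> None:
        self.episodic: dict[str, list[ReflectionEntry]] = {}
        self.patterns: dict[str, dict] = {}

    def store(self, entry: ReflectionEntry) -> None:
        # Episodic memory: keep every reflection, grouped by the task it belongs to.
        self.episodic.setdefault(entry.task_id, []).append(entry)
        # Pattern library: assumption - a lesson with a pattern_name is generalizable.
        for lesson in entry.reflection.get("lessons_learned", []):
            name = lesson.get("pattern_name")
            if name:
                self.patterns[name] = lesson

    def reflections_for(self, task_id: str) -> list[ReflectionEntry]:
        """All stored reflections for a task, in the order they were added."""
        return self.episodic.get(task_id, [])
```

Because `patterns` is keyed by name rather than by task or agent, any agent holding a reference to the same `ReflexionMemory` can look a pattern up.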
4. Retry with Knowledge:
Enhanced Context:
- •Original task
- •Previous attempts summary
- •Reflections from all attempts
- •Relevant patterns from memory
- •Similar successful tasks
Retry with Improvements:
- •Apply lessons learned
- •Use improved strategy
- •Add suggested validation
- •Monitor progress more carefully
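Put together, the retry step is a bounded loop that feeds every earlier reflection back into the next attempt. The sketch below assumes the agent call and the reflection call are supplied as plain functions; `build_context`, `run_with_reflexion`, and the callback signatures are hypothetical.

```python
from typing import Callable


def build_context(task: str, reflections: list[str]) -> str:
    """Combine the original task with the reflections from all earlier attempts."""
    lines = [f"Task: {task}"]
    for number, text in enumerate(reflections, start=1):
        lines.append(f"Reflection from attempt {number}: {text}")
    return "\n".join(lines)


def run_with_reflexion(
    task: str,
    attempt: Callable[[str], tuple[bool, str]],   # context -> (succeeded, feedback)
    reflect: Callable[[str, str, int], str],      # (task, feedback, attempt no.) -> reflection
    max_attempts: int = 3,
) -> tuple[bool, int, list[str]]:
    """Try a task up to max_attempts times, reflecting after each failure.

    Returns (succeeded, attempts_used, reflections_generated).
    """
    reflections: list[str] = []
    for attempt_number in range(1, max_attempts + 1):
        succeeded, feedback = attempt(build_context(task, reflections))
        if succeeded:
            return True, attempt_number, reflections
        reflections.append(reflect(task, feedback, attempt_number))
    return False, max_attempts, reflections
```

The `max_attempts` cap is a design choice of this sketch rather than something the process above specifies; without some bound, a task the agent cannot solve would loop indefinitely.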
Example Reflexion
Task: Implement user authentication
Attempt 1:
- •Approach: Store passwords in plain text
- •Error: Security audit flagged critical vulnerability
Reflexion:
yaml
root_cause:
  technical: Passwords stored without hashing
  reasoning: Didn't consider security best practices
incorrect_assumptions:
  - Simple storage is acceptable → Wrong
  - Application-level security sufficient → Wrong
improved_strategy:
  approach: Use bcrypt for password hashing
  implementation:
    - Hash password before storing
    - Use high cost factor (12+)
    - Add salt automatically
    - Never retrieve or log passwords
  validation:
    - Security audit
    - Penetration testing
    - Check against OWASP guidelines
lessons_learned:
  - Always hash passwords (bcrypt, Argon2)
  - Never store sensitive data in plain text
  - Security audit before deployment
  - Follow OWASP authentication guidelines
Attempt 2:
- •Approach: bcrypt hashing with cost factor 12
- •Result: All tests pass, security audit clean
- •Status: SUCCESS
Pattern Stored:
yaml
name: secure_password_storage
description: Hash passwords with bcrypt
implementation: |
  import bcrypt

  def hash_password(password: str) -> str:
      # gensalt() creates a random salt; the cost factor (12) is stored in the hash
      salt = bcrypt.gensalt(rounds=12)
      return bcrypt.hashpw(password.encode(), salt).decode()

  def verify_password(password: str, hashed: str) -> bool:
      # checkpw() re-hashes with the salt embedded in `hashed` and compares
      return bcrypt.checkpw(password.encode(), hashed.encode())
applies_to:
  - User authentication
  - Password reset
  - Any credential storage
Best Practices
When working with me:
- •Accept failures - They're learning opportunities
- •Be specific - Vague reflections aren't actionable
- •Extract patterns - Generalize learnings for reuse
- •Document everything - Future agents will benefit
- •Measure improvement - Track reflexion effectiveness
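One possible way to track reflexion effectiveness is to compare how many tasks succeed on the first try with how many are recovered only after at least one reflection. The metric names and the `reflexion_effectiveness` function below are assumptions made for illustration, not measurements this system is known to report.

```python
def reflexion_effectiveness(outcomes: list[tuple[bool, int]]) -> dict[str, float]:
    """Summarize task outcomes given as (succeeded, attempts_used) pairs.

    first_try_rate:     share of tasks solved without any reflection
    recovered_rate:     share of tasks solved only after one or more reflections
    final_success_rate: share of tasks solved at all
    """
    total = len(outcomes)
    if total == 0:
        return {"first_try_rate": 0.0, "recovered_rate": 0.0, "final_success_rate": 0.0}
    first_try = sum(1 for succeeded, attempts in outcomes if succeeded and attempts == 1)
    recovered = sum(1 for succeeded, attempts in outcomes if succeeded and attempts > 1)
    return {
        "first_try_rate": first_try / total,
        "recovered_rate": recovered / total,
        "final_success_rate": (first_try + recovered) / total,
    }
```

A `recovered_rate` near zero would suggest the reflections are not changing outcomes and the prompt or the stored patterns need attention.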
What I Learn
I store in memory:
- •Root cause patterns
- •Common mistakes
- •Effective solutions
- •Best practices
- •Anti-patterns to avoid
This enables continuous improvement across all agents.