Archive Search

Search through raw Claude Code session transcripts to recover technical details, reasoning traces, and thinking blocks that were compressed during episode distillation.

What You'll Get

The archive contains:

•✅ Complete thinking blocks (raw reasoning process)
•✅ User messages (original intent and questions)
•✅ Assistant explanations (first 1000 chars)
•❌ Tool results (filtered out - too noisy)
•❌ System messages (no semantic value)
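The selection rule above can be sketched as a small shell function. This is only an illustration of the filter as described here, not the indexer's actual code: the function name `keep_chunk`, the variable `MAX_ASSISTANT_CHARS`, and the role labels `tool_result` and `system` are assumptions made for the example.

```bash
# Sketch of the archive's selection rule (hypothetical helper, not the real indexer).
# Assumed role labels: thinking, user, assistant, tool_result, system.
MAX_ASSISTANT_CHARS=1000

# keep_chunk ROLE TEXT
# Prints the text that would be indexed and returns 0,
# or prints nothing and returns 1 when the chunk is filtered out.
keep_chunk() {
  case "$1" in
    thinking|user)
      # Thinking blocks and user messages are kept in full.
      printf '%s' "$2"
      ;;
    assistant)
      # Assistant text is cut to its first 1000 characters
      # (bytes in shells whose printf is not multibyte-aware).
      printf "%.${MAX_ASSISTANT_CHARS}s" "$2"
      ;;
    *)
      # Tool results, system messages, and unknown roles are dropped.
      return 1
      ;;
  esac
}
```

For example, `keep_chunk system "boot banner"` prints nothing and returns 1, while `keep_chunk thinking "..."` echoes its text unchanged.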

Best for:

•Technical archaeology: "How did we implement X?"
•Decision rationale: "Why did we choose Y over Z?"
•Reasoning evolution: "How has our approach to X changed?"
•Debugging context: "What was the exact error we saw?"

How Archive Differs from Regular Memory

Memory Type	Content	Use Case
Episodes (`/episode`)	Curated summaries (2-8 bullets per section)	What happened today
Learnings (`/memory`)	Distilled insights	General knowledge
Archive (`/archive-search`)	Raw transcripts with thinking blocks	Technical depth, exact discussions

The archive keeps roughly 5-10% of raw transcript bytes by selecting only high-value content (thinking blocks, user questions, assistant reasoning). It is indexed separately so that raw transcript noise does not pollute curated memory.

When to Use This Skill

Good use cases:

/archive-search "model fallback implementation"
→ Find thinking blocks about ModelFallbackChain design

/archive-search "Swift vs Python for memory database"
→ Recover exact trade-off analysis

/archive-search "context tracking approaches"
→ See evolution of ContextTracker across sessions

Not helpful:

/archive-search "who is Lucy"
→ Use /person instead (person profiles)

/archive-search "what did we decide about X"
→ Use /memory or check decisions.md

Search Tips

Thinking blocks are highest value:

•Archive preserves full reasoning traces
•Includes technical discussion and trade-offs
•Most detailed context for implementation decisions

Be specific:

•Good: "memory database indexing strategy"
•Too broad: "memory" (will match thousands of chunks)

Look for technical terms:

•Good: "ModelFallbackChain", "ContextTracker", "sanitizeResponse"
•Generic: "error handling" (use more specific context)

Coverage

•Time range: Last 90 days of sessions
•Project: samara-main only (for now)
•Update frequency: Incremental sync during dream cycles (3 AM)

Current state: Run /archive-search stats to see:

•Total indexed chunks
•Session count
•Thinking block count

Example Session

User: /archive-search "sanitization logic"

Claude:

Found 3 relevant sessions:

================================================================================
Result 1 | Distance: 0.18 | Role: thinking
Session: 5399ef7a-... | Time: 2026-01-05 15:42:33
--------------------------------------------------------------------------------
The user is asking about the sanitization issue. Let me think through what we
know:

1. Thinking blocks are leaking into messages
2. Session IDs appearing in user-visible text
3. This happens in multi-stream scenarios (webcam + web fetch + conversation)

The root cause is likely in ClaudeInvoker.swift where we parse JSON output.
If parseJsonOutput() falls back to raw text when JSON parsing fails, it would
include everything - thinking blocks, session IDs, etc.

We need a three-layer defense:
- parseJsonOutput() should never fall back to raw
- sanitizeResponse() should strip known patterns
- MessageBus should coordinate all sends through single channel

[...1247 more characters]

Session file: ~/.claude/projects/.../5399ef7a-...jsonl

This gives you the exact reasoning process, not just the summary that ended up in episode logs.

Commands

Once you invoke this skill with /archive-search <query>, you can:

•Search by query:
  •archive-search "your query here"
  •Returns top 5 most relevant chunks
•Filter by role:
  •--role thinking - Only thinking blocks (highest signal)
  •--role user - Only user messages
  •--role assistant - Only assistant text
•Adjust result count:
  •--n 10 - Get more results
•Check index stats:
  •stats - See what's indexed
•Quality check:
  •sample - Random sample of 3 chunks
  •sample --session-id abc123 - Sample from specific session
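The options above could be wrapped in a shell function along the following lines. It is a sketch under assumptions: it supposes that the `archive-index` binary named in the Implementation Notes accepts `--role` and `--n` after the query and that the two can be combined, which this document does not confirm. The function name `archive_search` and the `ARCHIVE_INDEX_BIN` override are hypothetical.

```bash
# Binary path from the Implementation Notes. ARCHIVE_INDEX_BIN is a
# hypothetical override so the wrapper can point at a different binary.
ARCHIVE_INDEX_BIN=${ARCHIVE_INDEX_BIN:-"$HOME/.claude-mind/system/bin/archive-index"}

# archive_search QUERY [ROLE] [COUNT]
# Assumption: the binary takes the documented --role and --n options
# after the query, and allows both at once.
archive_search() {
  if [ -z "${1:-}" ]; then
    echo "usage: archive_search QUERY [ROLE] [COUNT]" >&2
    return 2
  fi
  query=$1
  role=${2:-}
  count=${3:-}

  # Only the three documented roles are accepted; empty means "no filter".
  case "$role" in
    ""|thinking|user|assistant) ;;
    *)
      echo "unknown role: $role (expected thinking, user, or assistant)" >&2
      return 2
      ;;
  esac

  # Rebuild the positional parameters as the argument list to pass on.
  set -- search "$query"
  if [ -n "$role" ]; then set -- "$@" --role "$role"; fi
  if [ -n "$count" ]; then set -- "$@" --n "$count"; fi

  "$ARCHIVE_INDEX_BIN" "$@"
}
```

With this in place, `archive_search "model fallback" thinking 10` would run the binary with `search "model fallback" --role thinking --n 10`. Validating the role before calling out keeps a typo from turning into an empty result set that looks like "nothing was found".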

Implementation Notes

For Claude instances:

When a user invokes /archive-search <query>, you should:

•Run: ~/.claude-mind/system/bin/archive-index search "<query>"
•Parse and present results clearly
•Include session IDs for reference
•Explain what the user is seeing (thinking blocks vs user messages)
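As one way to support the "include session IDs" and "explain what the user is seeing" steps, the result headers shown in the Example Session can be condensed into one line per hit. This is a minimal sketch that assumes the header layout in that example (`Result N | Distance: ... | Role: ...` followed by `Session: ... | Time: ...`) is what the tool actually prints; `summarize_results` is a hypothetical helper name.

```bash
# summarize_results
# Reads search output on stdin and prints "role<TAB>distance<TAB>session"
# for each result. The header layout is assumed from the example output
# in this document and may differ in the real tool.
summarize_results() {
  awk -F ' [|] ' '
    /^Result [0-9]+ [|] Distance: / {
      distance = $2; sub(/^Distance: /, "", distance)
      role = $3;     sub(/^Role: /, "", role)
      pending = 1    # expect the Session line that belongs to this result
      next
    }
    pending && /^Session: / {
      session = $1;  sub(/^Session: /, "", session)
      printf "%s\t%s\t%s\n", role, distance, session
      pending = 0    # ignore later "Session: " text inside chunk bodies
    }
  '
}
```

Piping search output through `summarize_results` gives a compact index of roles and session IDs to present before quoting the full chunks.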

First-time setup:

If the archive hasn't been built yet:

# Build initial index (takes 5-10 minutes for 90 days)
~/.claude-mind/system/bin/archive-index rebuild

# Check stats
~/.claude-mind/system/bin/archive-index stats

# Test search
~/.claude-mind/system/bin/archive-index search "model fallback"

After initial build, dream cycles handle incremental sync automatically.