AgentSkillsCN

delegation

A unified provider selection scheme for subagent delegation. Compare quickly by task type and decide: Kimi K2.5, GLM, or MiniMax?

SKILL.md
--- frontmatter
name: delegation
description: "Unified provider selection for subagent delegation. Quick decision matrix for choosing between Kimi K2.5, GLM, and MiniMax based on task type."

Unified Delegation Skill

⛔ CRITICAL: No Claude Subagents (MANUAL ENFORCEMENT)

NEVER spawn Claude models (Haiku, Sonnet, Opus) as subagents.

The deny rules in .claude/settings.local.json DO NOT enforce this. Manual verification is required before EVERY Task() call.

See CLAUDE.md for the pre-Task() checklist and fallback protocol.
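Because the deny rules do not enforce the ban, the manual check can be pictured as a small guard that runs before each Task() call. This is a minimal sketch, not the checklist from CLAUDE.md: the function name `assert_allowed_subagent` and the substring matching on model names are assumptions made for illustration.

```python
# Hypothetical pre-Task() guard. The names here are illustrative only and
# are not part of any real API or of the checklist in CLAUDE.md.
FORBIDDEN_SUBAGENT_MODELS = ("haiku", "sonnet", "opus")


def assert_allowed_subagent(model: str) -> str:
    """Return the model name unchanged, or raise if it names a Claude model."""
    lowered = model.lower()
    for banned in FORBIDDEN_SUBAGENT_MODELS:
        if banned in lowered:
            raise ValueError(f"Refusing to spawn Claude model as subagent: {model}")
    return model
```

Calling `assert_allowed_subagent("kimi")` passes the name through, while `assert_allowed_subagent("claude-haiku")` raises `ValueError`.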


Provider Selection Matrix

| Task Type | Best Provider | Why | Fallback 1 | Fallback 2 | Last Resort |
|---|---|---|---|---|---|
| Complex reasoning | Kimi K2.5 | Most intelligent, 256K context | GLM-4.7 | MiniMax | Claude direct |
| Image/vision (batch) | Kimi K2.5 | Built-in vision capability | GLM-4.6v | - | Claude direct |
| Creative/brainstorming | GLM-4.7 | Strong creative problem-solving | Kimi | MiniMax | Claude direct |
| Web research | MiniMax | Fast, reliable, cheap | GLM | Kimi | Claude direct |
| Simple file exploration | MiniMax | Quick turnaround | GLM | Kimi | Claude direct |
| Batch operations | GLM | Good parallelism | MiniMax | Kimi | Claude direct |
| Code review | MiniMax | Fast blind-spot check | Kimi | GLM | Claude direct |
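The matrix above can also be read as a lookup table. The sketch below encodes it as plain data; the task-type keys (`"complex_reasoning"` and so on) and the helper `providers_for` are hypothetical names chosen for illustration, while the provider order per row is copied from the matrix.

```python
# Provider order per task type, copied from the selection matrix.
# The dictionary keys are illustrative labels, not identifiers from any tool.
PROVIDER_MATRIX = {
    "complex_reasoning": ["Kimi K2.5", "GLM-4.7", "MiniMax"],
    "vision_batch": ["Kimi K2.5", "GLM-4.6v"],  # the matrix lists one fallback only
    "creative": ["GLM-4.7", "Kimi", "MiniMax"],
    "web_research": ["MiniMax", "GLM", "Kimi"],
    "file_exploration": ["MiniMax", "GLM", "Kimi"],
    "batch_operations": ["GLM", "MiniMax", "Kimi"],
    "code_review": ["MiniMax", "Kimi", "GLM"],
}
LAST_RESORT = "Claude direct"


def providers_for(task_type: str) -> list[str]:
    """Return the ordered providers to try, ending with the last resort."""
    return PROVIDER_MATRIX[task_type] + [LAST_RESORT]
```

For example, `providers_for("code_review")` yields MiniMax, then Kimi, then GLM, then Claude doing the work directly.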

Automatic Fallback Protocol

When a Task() call fails, follow this chain (NEVER fall back to Haiku/Sonnet/Opus):

| Primary Failed | Try Next | Then Try | Last Resort |
|---|---|---|---|
| Kimi K2.5 | GLM-4.7 | MiniMax | Claude does work directly |
| GLM-4.7 | MiniMax | Kimi K2.5 | Claude does work directly |
| MiniMax | GLM-4.7 | Kimi K2.5 | Claude does work directly |

Example:

python
# ❌ WRONG - Kimi failed, fell back to Haiku
Task(model="kimi", ...)   # → Error
Task(model="haiku", ...)  # → VIOLATION

# ✅ CORRECT - Kimi failed, fell back to GLM (next in the chain)
Task(model="kimi", ...)   # → Error
Task(model="glm", ...)    # → Success

If all cheap models fail:

  1. Do the work yourself (Claude)
  2. Document why all subagents failed
  3. Flag for manual investigation with user
  4. NEVER use Haiku/Sonnet/Opus as fallback
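The chain and the "all cheap models fail" steps can be sketched as a loop. This is an illustration under stated assumptions: `run_task` is a hypothetical callable standing in for whatever actually dispatches a subagent, and the short model keys (`"kimi"`, `"glm"`, `"minimax"`) mirror the example above rather than any confirmed identifier.

```python
from typing import Callable, Optional

# Fallback order per primary provider, taken from the protocol table.
# No Claude model appears in any chain, by design.
FALLBACK_CHAINS = {
    "kimi": ["kimi", "glm", "minimax"],
    "glm": ["glm", "minimax", "kimi"],
    "minimax": ["minimax", "glm", "kimi"],
}


def delegate_with_fallback(
    primary: str,
    prompt: str,
    run_task: Callable[[str, str], str],  # hypothetical dispatcher: (model, prompt) -> result
) -> tuple[Optional[str], Optional[str], list[str]]:
    """Try each provider in order; return (model, result, recorded failures)."""
    failures: list[str] = []
    for model in FALLBACK_CHAINS[primary]:
        try:
            return model, run_task(model, prompt), failures
        except Exception as exc:  # record why this provider failed, then move on
            failures.append(f"{model}: {exc}")
    # Every cheap model failed: the caller (Claude) does the work directly
    # and uses `failures` to document and flag the problem.
    return None, None, failures
```

A return value of `(None, None, failures)` corresponds to the last-resort steps: Claude does the work itself, and the `failures` list supplies the documentation for manual investigation.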

Quick Decision Flow

code
┌─ Is it reasoning/decisions? ──────────────────┐
│  YES → Claude does it directly                │
│  NO  → Delegate to subagent ↓                 │
├───────────────────────────────────────────────┤
│                                               │
│  ┌─ Does it need vision? ───────────────────┐ │
│  │  YES → Kimi K2.5 (or GLM-4.6v fallback)  │ │
│  │  NO  ↓                                   │ │
│  └──────────────────────────────────────────┘ │
│                                               │
│  ┌─ Is it complex/creative? ────────────────┐ │
│  │  Complex → Kimi K2.5                     │ │
│  │  Creative → GLM-4.7                      │ │
│  │  Simple → MiniMax                        │ │
│  └──────────────────────────────────────────┘ │
└───────────────────────────────────────────────┘
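The flow above reduces to three checks in order. A minimal sketch, assuming the caller has already classified the task; the function name `choose_provider` and its parameters are hypothetical.

```python
def choose_provider(is_reasoning_or_decision: bool, needs_vision: bool, kind: str) -> str:
    """Walk the decision flow top to bottom and return the provider to use.

    `kind` is one of "complex", "creative", or "simple".
    """
    if is_reasoning_or_decision:
        return "Claude (direct)"  # reasoning and decisions are not delegated
    if needs_vision:
        return "Kimi K2.5"  # GLM-4.6v is the fallback if Kimi fails
    by_kind = {
        "complex": "Kimi K2.5",
        "creative": "GLM-4.7",
        "simple": "MiniMax",
    }
    return by_kind[kind]
```

For instance, `choose_provider(False, False, "simple")` returns `"MiniMax"`, and any task flagged as a decision stays with Claude regardless of the other arguments.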

Provider Profiles

Kimi K2.5 (Most Capable)

Context: 256K tokens | Vision: Yes | Thinking mode: Yes

Best for:

  • Complex multi-step reasoning
  • Batch image analysis (10+ images)
  • Tasks requiring deep understanding
  • Fallback for failed GLM tasks

Launcher: .\scripts\start-kimi.ps1

API Config:

code
Base URL: https://api.moonshot.cn/anthropic/
Models: kimi-k2.5-thinking, kimi-k2-turbo-preview

GLM-4.7 (Creative)

Context: 128K tokens | Vision: GLM-4.6v variant | Thinking mode: Yes

Best for:

  • Creative brainstorming
  • Mathematical reasoning (95.7% AIME 2025)
  • Parallel batch tasks
  • Tool use orchestration

MCP: .cursor/mcp.json (GLM-4.6v configured)

MiniMax M2.1 (Fast & Cheap)

Context: 128K tokens | Vision: VLM API | Speed: Fastest

Best for:

  • Quick web searches
  • Simple file exploration
  • Structured data extraction
  • Code review for blind spots

Launcher: .\scripts\start-claude-minimax.ps1

MCP: .cursor/mcp.json (MiniMax configured)


Delegation Patterns

Pattern 1: Research → Claude Decides

code
1. Claude receives task requiring research
2. Claude spawns MiniMax: "Find all uses of X in codebase"
3. MiniMax returns findings
4. Claude reasons and implements

Pattern 2: Batch Vision Analysis

code
1. Claude needs to analyze 20 sprites
2. Claude spawns Kimi K2.5: "Analyze quality of each sprite"
3. Kimi returns analysis for all 20
4. Claude makes decisions based on report

Pattern 3: Creative Exploration

code
1. Claude needs alternative approaches
2. Claude spawns GLM-4.7: "Brainstorm 5 solutions for X"
3. GLM returns creative options
4. Claude selects and refines best approach

Pattern 4: Code Review

code
1. Claude writes code
2. Claude spawns MiniMax: "Check for bugs, edge cases, security issues"
3. MiniMax returns concerns
4. Claude addresses or dismisses with reasoning

Parallel Delegation

Launch multiple subagents in a single message:

code
Task(prompt="Research X", subagent_type="general-purpose")  ←─┐
Task(prompt="Research Y", subagent_type="general-purpose")  ←─┼─ Parallel
Task(prompt="Research Z", subagent_type="general-purpose")  ←─┘

Rules:

  • Independent tasks → parallel
  • Dependent tasks → sequential
  • Never chain Claude subagents
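The difference between independent and dependent tasks can be shown with a local analogy in asyncio. Task() is a tool call rather than a Python function, so `run_subagent` below is a hypothetical stand-in that only echoes its prompt; the point is the scheduling shape, not the dispatch mechanism.

```python
import asyncio


async def run_subagent(prompt: str) -> str:
    """Hypothetical stand-in for a subagent call; echoes after a short pause."""
    await asyncio.sleep(0.01)
    return f"result for {prompt!r}"


async def run_parallel(prompts: list[str]) -> list[str]:
    """Independent tasks: start them all at once and wait for every result."""
    return await asyncio.gather(*(run_subagent(p) for p in prompts))


async def run_sequential(prompts: list[str]) -> list[str]:
    """Dependent tasks: each one starts only after the previous one finishes."""
    results = []
    for prompt in prompts:
        results.append(await run_subagent(prompt))
    return results
```

`asyncio.run(run_parallel(["X", "Y", "Z"]))` returns the three results in the order the prompts were given, since `asyncio.gather` preserves input order.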

Background Execution (Token Suspension)

Problem: Claude tokens burn while waiting for subagent results.

Solution: Use run_in_background=true + end turn early.

Pattern: Fire-and-Retrieve

code
1. Claude receives task requiring research
2. Task(prompt="...", run_in_background=true) → returns output_file
3. Claude ends turn: "Research agent dispatched. Say 'continue' for results."
4. User says "continue"
5. TaskOutput(task_id="...", block=true) → retrieves results
6. Claude synthesizes and responds

When to Use Background Execution

| Scenario | Background? | Why |
|---|---|---|
| Research >30 sec | ✅ Yes | Saves expensive Claude wait time |
| Batch image analysis | ✅ Yes | Long-running, user can wait |
| Quick file lookup | ❌ No | Faster to wait inline |
| Claude needs result to continue | ❌ No | Would block anyway |
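The table boils down to two questions: does Claude need the result before it can continue, and is the task expected to run longer than about 30 seconds? A minimal sketch of that rule follows; the function name and the idea of passing in an estimated duration are assumptions for illustration.

```python
BACKGROUND_THRESHOLD_SECONDS = 30  # from the "Research >30 sec" row


def should_run_in_background(estimated_seconds: float, result_needed_to_continue: bool) -> bool:
    """Decide whether to dispatch with run_in_background=true."""
    if result_needed_to_continue:
        return False  # Claude would block on the result anyway
    return estimated_seconds > BACKGROUND_THRESHOLD_SECONDS
```

A 60-second research task that Claude does not immediately depend on goes to the background; a 5-second file lookup stays inline.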

Token Savings Calculation

code
Blocking:     Claude waits 60s = 60s of Opus tokens burned
Background:   Claude ends turn = 0s of Opus tokens burned
              (subagent tokens are 50x cheaper)

Example Usage

code
# Fire (spawn and end turn immediately)
Task(
  prompt="Analyze all 20 sprites in assets/sprites/",
  subagent_type="general-purpose",
  run_in_background=true
)
→ Returns: {task_id: "abc123", output_file: "/path/to/output"}

# ... Claude ends turn, tells user to say "continue" ...

# Retrieve (on next turn)
TaskOutput(task_id="abc123", block=true)
→ Returns: Full subagent analysis

Token Economics

| Provider | Relative Cost | When to Use |
|---|---|---|
| Claude Opus | 50x | Final decisions, complex reasoning |
| Claude Sonnet | 10x | Medium reasoning (avoid as subagent) |
| Kimi K2.5 | 1x | Complex tasks, vision |
| GLM-4.7 | 1x | Creative, batch |
| MiniMax | 1x | Fast, simple |

Key insight: at these ratios, 1 hour of Claude Opus exploration costs about as much as 50 hours of subagent exploration.
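The arithmetic behind that comparison is a simple ratio of the relative costs in the table. A small sketch, treating the table's multipliers as given; the helper name `equivalent_hours` is hypothetical.

```python
# Relative cost multipliers copied from the Token Economics table.
RELATIVE_COST = {
    "Claude Opus": 50,
    "Claude Sonnet": 10,
    "Kimi K2.5": 1,
    "GLM-4.7": 1,
    "MiniMax": 1,
}


def equivalent_hours(hours: float, from_model: str, to_model: str) -> float:
    """Hours of `to_model` work that cost the same as `hours` of `from_model`."""
    return hours * RELATIVE_COST[from_model] / RELATIVE_COST[to_model]
```

With these numbers, `equivalent_hours(1, "Claude Opus", "MiniMax")` gives 50.0, and one hour of Sonnet corresponds to 10 subagent hours.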


Common Mistakes

| Mistake | Impact | Fix |
|---|---|---|
| Claude spawning Haiku | Expensive | Use MiniMax instead |
| Sequential when parallel possible | Slow | Single message, multiple Tasks |
| Kimi for simple lookup | Overkill | Use MiniMax |
| MiniMax for complex reasoning | Poor quality | Use Kimi K2.5 |
| Claude reading 10+ files | Context bloat | Delegate exploration |

Integration with Other Skills

  • /skill kimi-k2.5 - Detailed Kimi setup and patterns
  • /skill minimax-mcp - MiniMax MCP integration details
  • /skill token-efficient-delegation - Full token economics
  • /skill subagent-best-practices - General subagent patterns

[Opus 4.5 - 2026-01-29]