silence-alert

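Create, list, or delete Prometheus alert silences in Alertmanager. Suitable for handling known issues, performing maintenance, or suppressing false positives. Use when the user says "silence alert" or "mute alert".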

SKILL.md
--- frontmatter
name: silence-alert
description: Create, list, or delete Prometheus alert silences in Alertmanager. Use when working on a known issue, performing maintenance, or suppressing false positives. Use when the user says "silence alert" or "mute alert".

Silence Alert

Create, manage, and expire alert silences in Alertmanager. Investigate before silencing.

Inputs

| Input | Type | Default | Purpose |
|---|---|---|---|
| alert_name | string | required | Alert to silence (e.g., HighErrorRate, PodCrashLooping) |
| duration | string | 2h | 1h, 2h, 4h, 24h |
| reason | string | "Investigating issue" | Audit trail |
| environment | string | production | production or stage |
| namespace | string | - | Optional scope |
| action | string | create | create, list, or delete |
| silence_id | string | - | Required for delete |
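
The inputs above could be validated before any tool is called. The following is a minimal sketch, not the skill's actual implementation: the `SilenceInputs` class and its `validate` method are hypothetical names introduced for illustration, and the defaults and allowed values are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ACTIONS = {"create", "list", "delete"}
ALLOWED_ENVIRONMENTS = {"production", "stage"}


@dataclass
class SilenceInputs:
    """Hypothetical container mirroring the Inputs table."""
    alert_name: str                      # required
    duration: str = "2h"
    reason: str = "Investigating issue"
    environment: str = "production"
    namespace: Optional[str] = None      # optional scope
    action: str = "create"
    silence_id: Optional[str] = None     # required for delete

    def validate(self) -> None:
        """Raise ValueError if the inputs contradict the table."""
        if not self.alert_name:
            raise ValueError("alert_name is required")
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action}")
        if self.environment not in ALLOWED_ENVIRONMENTS:
            raise ValueError(f"unknown environment: {self.environment}")
        if self.action == "delete" and not self.silence_id:
            raise ValueError("silence_id is required for delete")
```

For example, `SilenceInputs(alert_name="HighErrorRate").validate()` passes with the defaults, while `action="delete"` without a `silence_id` raises `ValueError`.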

Persona

Load the incident persona, which provides the Alertmanager tools.

Workflow

1. Bootstrap

  • persona_load("incident")
  • knowledge_query(project="automation-analytics-backend", persona="devops", section="gotchas")
  • check_known_issues("alertmanager_create_silence")
  • memory_read("learned/patterns") → alert_silences history

2. Check Current State (create only)

  • alertmanager_alerts(environment) — verify alert is firing
  • Parse: is_firing, match_count
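
The parse step above might look like the sketch below. It assumes that `alertmanager_alerts` returns a list of dicts shaped like the Alertmanager v2 API alert objects (a `labels` map and a `status.state` field); that shape, the helper name `summarize_alert`, and the choice to treat the `active` state as "firing" are assumptions, not something the skill specifies.

```python
from typing import Any, Dict, List, Optional, Tuple


def summarize_alert(
    alerts: List[Dict[str, Any]],
    alert_name: str,
    namespace: Optional[str] = None,
) -> Tuple[bool, int]:
    """Return (is_firing, match_count) for alert_name.

    match_count counts every alert with a matching alertname label
    (and namespace label, if given); is_firing is True when at least
    one of those matches is in the "active" state.
    """
    matches = []
    for alert in alerts:
        labels = alert.get("labels", {})
        if labels.get("alertname") != alert_name:
            continue
        if namespace is not None and labels.get("namespace") != namespace:
            continue
        matches.append(alert)
    is_firing = any(
        a.get("status", {}).get("state") == "active" for a in matches
    )
    return is_firing, len(matches)
```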

3. List Silences

  • alertmanager_silences(environment)
  • Parse: already_silenced for our alert, our_silence, all_silences
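
One way to derive `already_silenced` and `our_silence` from the listing is sketched below. It assumes the tool returns silences in the Alertmanager v2 API shape (each with `status.state` and a `matchers` list of `name`, `value`, `isRegex`, `isEqual`); the helper name `find_active_silence` is hypothetical.

```python
from typing import Any, Dict, List, Optional


def find_active_silence(
    silences: List[Dict[str, Any]], alert_name: str
) -> Optional[Dict[str, Any]]:
    """Return the first active silence that exactly matches alert_name."""
    for silence in silences:
        # Expired and pending silences do not suppress anything right now.
        if silence.get("status", {}).get("state") != "active":
            continue
        for matcher in silence.get("matchers", []):
            if (
                matcher.get("name") == "alertname"
                and matcher.get("value") == alert_name
                and not matcher.get("isRegex", False)
                and matcher.get("isEqual", True)
            ):
                return silence
    return None
```

With this helper, `our_silence = find_active_silence(all_silences, alert_name)` and `already_silenced = our_silence is not None`. Regex matchers are deliberately ignored here, so a silence that covers the alert through a pattern would not be detected.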

4. Create Silence

  • Condition: action=create AND not already_silenced
  • alertmanager_create_silence(alert_name, duration, comment=reason, environment)
  • Parse: success, silence_id
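
For orientation, the request body that a tool like `alertmanager_create_silence` would eventually send to Alertmanager's `POST /api/v2/silences` endpoint can be sketched as follows. How the tool builds this internally is not documented here, so treat this as an illustration: `parse_duration`, `build_silence_payload`, and the `created_by` parameter are hypothetical, and only hour-based durations such as `2h` are handled because those are the values the Inputs table lists.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def parse_duration(text: str) -> timedelta:
    """Parse an hour-based duration such as "2h" or "24h"."""
    match = re.fullmatch(r"(\d+)h", text)
    if not match:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(hours=int(match.group(1)))


def build_silence_payload(
    alert_name: str,
    duration: str,
    comment: str,
    created_by: str,
    namespace: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build a silence body in the Alertmanager v2 API shape."""
    start = now or datetime.now(timezone.utc)
    end = start + parse_duration(duration)
    matchers = [
        {"name": "alertname", "value": alert_name,
         "isRegex": False, "isEqual": True},
    ]
    if namespace:
        # Narrow the silence to one namespace when a scope is given.
        matchers.append(
            {"name": "namespace", "value": namespace,
             "isRegex": False, "isEqual": True}
        )
    return {
        "matchers": matchers,
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }
```

Passing `reason` as the comment keeps the audit trail attached to the silence itself, where anyone inspecting Alertmanager can see it.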

5. Delete Silence

  • Condition: action=delete AND silence_id
  • alertmanager_delete_silence(silence_id, environment)
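
The conditions guarding steps 4 and 5 can be read as a small decision function. The sketch below is one possible reading; `plan_action` and the step names it returns are hypothetical and only summarize the conditions stated above.

```python
from typing import Optional


def plan_action(
    action: str,
    already_silenced: bool = False,
    silence_id: Optional[str] = None,
) -> str:
    """Decide which workflow step to run for the requested action."""
    if action == "create":
        # Step 4 runs only when no active silence already covers the alert.
        return "skip_create" if already_silenced else "create"
    if action == "delete":
        # Step 5 needs a silence_id to know what to delete.
        return "delete" if silence_id else "skip_delete"
    if action == "list":
        return "list"
    raise ValueError(f"unknown action: {action}")
```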

6. Notify Team

  • If created: skill_run("notify_team", message="Alert X silenced for Y")
  • persona_load("incident"): restore the incident persona after the notification

7. Memory

  • memory_session_log("Created/Deleted alert silence", ...)
  • Append to learned/patterns → alert_silences
  • Track recurring_silences for frequently silenced alerts
  • Update state/environments → active_silences

8. Failure Recovery

  • "connection refused" / "no route to host" → vpn_connect()
  • "unauthorized" → check Alertmanager credentials
  • learn_tool_fix("alertmanager_create_silence", ...)
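
The error-to-recovery mapping above can be expressed as a small classifier. This is a sketch only: `classify_failure` and the labels it returns are hypothetical, and the substrings are the ones listed in this section.

```python
def classify_failure(error_text: str) -> str:
    """Map a tool error message to a recovery step name."""
    text = error_text.lower()
    if "connection refused" in text or "no route to host" in text:
        # Network-level failure: reconnect the VPN and retry.
        return "vpn_connect"
    if "unauthorized" in text:
        # Authentication failure: check the Alertmanager credentials.
        return "check_credentials"
    return "unknown"
```

An `unknown` result is the case where recording the eventual fix with `learn_tool_fix` is most useful, since no recovery step is known yet.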

Output

  • create: Silence created, duration, silence_id, extend/remove commands
  • list: All active silences
  • delete: Delete result
  • Alert status: firing or not

Chains To

  • create_jira_issue — track underlying issue
  • notify_team — inform about silence