Verification Skill

Intent

Ensure changed behavior is proven by concrete evidence with explicit pass/fail criteria.

Trigger

Use this skill after code changes or when user asks for confidence in correctness.

Verification Sequence (mandatory order)

Step 1: Define verification scope

Map changed files to required checks:

•type/compile checks
•targeted behavior tests
•broader regression tests
•policy/lint checks if configured

Blocking output:

text

VERIFICATION_SCOPE
- changed_targets:
  - <path>
- required_checks:
  - <typecheck>
  - <test>
  - <lint/policy if needed>

Step 2: Run checks in ordered layers

Required order:

•typecheck or diagnostics
•targeted tests for changed behavior
•broader suite if impact is non-local
•lint/policy checks last

Example:

bash

bun run typecheck
bun test <target>
bun test
bun run lint

Step 3: Classify outcomes

Per-check status:

•pass: command exits clean and output supports success.
•fail: command exits non-zero or clear failing evidence.
•skip: command unavailable or out-of-scope with reason.

Step 4: Produce verdict

Verdict rules:

•PASS: all required checks pass.
•PARTIAL: critical checks pass but one or more non-critical checks fail/skip.
•FAIL: any critical check fails.

Critical checks usually include typecheck and target behavior tests.

Blocking output:

text

VERIFICATION_REPORT
- checks:
  - name: "<check>"
    status: <pass|fail|skip>
    evidence: "<key output line>"
- verdict: <PASS|PARTIAL|FAIL>
- missing_evidence:
  - "<what is still required>"

Diagnostic Guidance

•For compiler failures, prioritize first error line and dependent files.
•For test failures, separate pre-existing failures from new regressions.
•For flaky checks, rerun once and report instability explicitly.

Stop Conditions

•Required verification command does not exist in environment.
•Command output is inconclusive after one rerun.
•Verification cannot be scoped because change set is unknown.

When stopped, emit exact missing command/info needed.

Anti-Patterns (never)

•Claiming success without command evidence.
•Running broad test suites before targeted checks.
•Ignoring failed critical checks because other checks passed.
•Hiding skipped checks in summary.

Example

Input:

text

"Verify the gate fix does not regress runtime tests."

Expected flow:

•Define scope from changed files.
•Run typecheck then focused runtime tests.
•Run broader suite if gate path is shared.
•Emit VERIFICATION_REPORT with explicit verdict.