Run Campaign
Run a GCP eval campaign end-to-end: clean up, rebuild stale images, enqueue, and start workers.
Arguments
- $1 (required): Campaign YAML path relative to eval/, e.g. campaigns/operations/tikv-all-chaos-cloud.yaml
- $2 (optional): Number of workers to start (default: auto, based on quota)
Steps
Execute these steps in order. Use TaskCreate to track progress.
1. Clean Up
- Kill all local eval worker processes: ps aux | grep "eval worker" | grep -v grep | awk '{print $2}' | xargs kill -9
- List and delete all GCP compute instances: gcloud compute instances list, then gcloud compute instances delete <names> --zone=<zone> --quiet
- Release stale work queue items: source $PROJECT_ROOT/.env && uv run eval worker release-stale --remote --timeout 1
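Because gcloud compute instances delete takes a single --zone, instances spread over several zones need one delete command per zone. The following is a minimal sketch of that grouping, assuming the instance list was fetched with gcloud compute instances list --format=json (where each entry carries a name and a zone resource URL). The helper name build_delete_commands and the sample instance names are hypothetical.

```python
from collections import defaultdict


def build_delete_commands(instances):
    """Group instances by zone and return one gcloud delete command per zone."""
    by_zone = defaultdict(list)
    for inst in instances:
        # gcloud reports the zone as a full resource URL; keep only the last segment.
        zone = inst["zone"].rsplit("/", 1)[-1]
        by_zone[zone].append(inst["name"])
    return [
        ["gcloud", "compute", "instances", "delete", *sorted(names),
         f"--zone={zone}", "--quiet"]
        for zone, names in sorted(by_zone.items())
    ]


# Input shaped like `gcloud compute instances list --format=json` output.
# The instance names and project are made up for illustration.
sample = [
    {"name": "trial-a", "zone": "https://www.googleapis.com/compute/v1/projects/p/zones/us-central1-a"},
    {"name": "trial-b", "zone": "https://www.googleapis.com/compute/v1/projects/p/zones/us-central1-b"},
    {"name": "trial-c", "zone": "https://www.googleapis.com/compute/v1/projects/p/zones/us-central1-a"},
]

for cmd in build_delete_commands(sample):
    print(" ".join(cmd))
```

Each returned command is an argument list, so it could be passed to subprocess.run directly; an empty instance list yields no commands.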
2. Check & Rebuild Docker Images
The rebuild decision matrix (from CLOUD.md):
| Image | Dockerfile | Triggers |
|---|---|---|
| operator | subjects/tikv/Dockerfile.operator | Changes to packages/operator-core/, packages/operator-protocols/, subjects/*/observer/ |
| worker | eval/Dockerfile | Changes to eval/src/, eval/Dockerfile, subjects/*/service/ |
| tikv-chaos | subjects/tikv/Dockerfile.tikv-chaos | Changes to TiKV base version or chaos tools |
| ycsb | subjects/tikv/Dockerfile.ycsb | Changes to YCSB workloads or go-ycsb version |
For each image, compare the local Docker image creation timestamp against git commits that touch the trigger paths. Use:
docker images --format "{{.Repository}}:{{.Tag}}\t{{.CreatedSince}}" | grep <image-name>
git log --oneline --since="<image-date>" -- <trigger-paths>
If there are commits newer than the image, rebuild and push:
# Operator
docker build --platform linux/amd64 -t operator-eval -f subjects/tikv/Dockerfile.operator .
docker tag operator-eval us-central1-docker.pkg.dev/operator-486214/eval/operator:latest
docker push us-central1-docker.pkg.dev/operator-486214/eval/operator:latest

# Worker
docker build --platform linux/amd64 -t eval-worker -f eval/Dockerfile .
docker tag eval-worker us-central1-docker.pkg.dev/operator-486214/eval/worker:latest
docker push us-central1-docker.pkg.dev/operator-486214/eval/worker:latest

# tikv-chaos (rarely needed)
docker build --platform linux/amd64 -t tikv-chaos:v8.5.5 -f subjects/tikv/Dockerfile.tikv-chaos subjects/tikv/
docker tag tikv-chaos:v8.5.5 us-central1-docker.pkg.dev/operator-486214/eval/tikv-chaos:v8.5.5
docker push us-central1-docker.pkg.dev/operator-486214/eval/tikv-chaos:v8.5.5

# ycsb (rarely needed)
docker build --platform linux/amd64 -t ycsb -f subjects/tikv/Dockerfile.ycsb subjects/tikv/
docker tag ycsb us-central1-docker.pkg.dev/operator-486214/eval/ycsb:latest
docker push us-central1-docker.pkg.dev/operator-486214/eval/ycsb:latest
All builds MUST use --platform linux/amd64 (GCP VMs are amd64, dev machines may be ARM).
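All builds use the project root as context, except tikv-chaos and ycsb, which use subjects/tikv/. Because the Dockerfile and context paths above are relative to the project root, run the build commands from $PROJECT_ROOT rather than from eval/.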
Build stale images in parallel when possible. Push sequentially after builds complete.
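The staleness comparison in this step can be sketched as follows. This is a minimal illustration, not the project's tooling. It assumes the image timestamp comes from docker image inspect --format '{{.Created}}' <image> and the commit timestamp from git log -1 --format=%cI -- <trigger-paths>, both of which print ISO-8601 strings with a timezone. The function names and sample timestamps are hypothetical.

```python
import re
from datetime import datetime


def parse_timestamp(ts):
    """Parse an ISO-8601 timestamp, tolerating a 'Z' suffix and fractional seconds."""
    ts = ts.strip().replace("Z", "+00:00")
    # Drop fractional seconds: Docker emits nanoseconds, which older
    # datetime.fromisoformat versions reject, and sub-second precision is not needed.
    ts = re.sub(r"\.\d+", "", ts)
    return datetime.fromisoformat(ts)


def stale_images(image_created, last_commit):
    """Return the images whose newest trigger-path commit postdates the image."""
    stale = []
    for image, created in image_created.items():
        commit = last_commit.get(image)  # None if no commit touches the trigger paths
        if commit and parse_timestamp(commit) > parse_timestamp(created):
            stale.append(image)
    return stale


# Made-up timestamps for illustration only.
image_created = {
    "operator": "2024-05-01T10:00:00.123456789Z",
    "worker": "2024-05-03T08:00:00Z",
}
last_commit = {
    "operator": "2024-05-02T09:00:00+02:00",
    "worker": "2024-05-02T09:00:00+02:00",
}
print(stale_images(image_created, last_commit))  # ['operator']
```

Comparing absolute timestamps avoids the rounding in the relative CreatedSince column, which only reports values such as "2 days ago".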
3. Enqueue Campaign
source $PROJECT_ROOT/.env
uv run eval run campaign $1 --cloud=gcp
Note the campaign ID from output.
4. Determine Worker Count & Start Workers
If the user specified a worker count, use that. Otherwise, auto-calculate from GCP quota:
# Get E2 vCPU quota and usage for us-central1
gcloud compute regions describe us-central1 \
  --format="json(quotas)" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for q in data['quotas']:
    if q['metric'] == 'E2_CPUS':
        limit = q['limit']
        usage = q['usage']
        vcpus_per_vm = 4  # e2-standard-4
        available = int((limit - usage) / vcpus_per_vm)
        # Leave 1 VM worth of buffer
        max_workers = max(1, available - 1)
        print(f'limit={int(limit)} used={int(usage)} available_vms={available} recommended_workers={max_workers}')
        break
"
Each trial VM is e2-standard-4 (4 vCPUs). The E2_CPUS quota (typically 24) is usually the binding constraint. After cleanup (step 1), usage should be 0, giving 24/4 - 1 = 5 workers with buffer.
Report the quota situation to the user before starting workers:
- •Show E2_CPUS limit, current usage, and how many workers will be started
- •If the requested count would exceed quota, warn and cap at the safe maximum
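The quota arithmetic and the capping rule above can be expressed as one small function. This is a sketch under the numbers stated in this section (4 vCPUs per VM, one VM of buffer). The name plan_workers and its return shape are hypothetical. Like the quota snippet above, it never recommends fewer than one worker, even when the quota is already exhausted.

```python
def plan_workers(limit, usage, requested=None, vcpus_per_vm=4, buffer_vms=1):
    """Return (worker_count, warning) for the given E2_CPUS limit and usage."""
    available_vms = int((limit - usage) // vcpus_per_vm)
    # Keep one VM of headroom, but never go below one worker
    # (this mirrors max(1, available - 1) in the quota snippet).
    safe_max = max(1, available_vms - buffer_vms)
    if requested is None:
        return safe_max, None
    if requested > safe_max:
        warning = f"requested {requested} workers exceeds quota; capping at {safe_max}"
        return safe_max, warning
    return requested, None


# Worked example from the text: limit 24, usage 0 after cleanup.
print(plan_workers(24, 0))                # (5, None)
print(plan_workers(24, 0, requested=10))  # capped at 5, with a warning
print(plan_workers(24, 8, requested=3))   # (3, None)
```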
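Start each worker as a separate background Bash command with run_in_background: true, one command per worker, so that each worker gets its own output file. The loop below enumerates the worker IDs and the command to run for each; do not run it as a single foreground loop: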
source $PROJECT_ROOT/.env
for i in $(seq 1 ${NUM_WORKERS}); do
  uv run eval worker start --cloud=gcp --id=worker-$i \
    --operator-image=us-central1-docker.pkg.dev/operator-486214/eval/operator:latest
done
After starting, wait ~15 seconds and verify each worker claimed a work item by tailing their output files.
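The verification could be scripted along these lines. This is only a sketch: the worker log format is not specified here, so the marker string "claimed" is a hypothetical placeholder that must be replaced with whatever the worker actually prints when it picks up a work item. The function name and the demo files are likewise made up.

```python
import tempfile
from pathlib import Path

CLAIM_MARKER = "claimed"  # hypothetical: replace with the worker's real log text


def workers_without_claim(log_paths, marker=CLAIM_MARKER, tail_lines=50):
    """Return the log paths whose last `tail_lines` lines lack the marker."""
    missing = []
    for path in log_paths:
        p = Path(path)
        # A missing output file counts as "no claim yet".
        lines = p.read_text(errors="replace").splitlines() if p.exists() else []
        if not any(marker in line for line in lines[-tail_lines:]):
            missing.append(str(path))
    return missing


# Demo with two fabricated output files.
with tempfile.TemporaryDirectory() as tmp:
    ok = Path(tmp) / "worker-1.log"
    ok.write_text("starting\nclaimed work item 17\n")
    idle = Path(tmp) / "worker-2.log"
    idle.write_text("starting\n")
    demo_missing = workers_without_claim([ok, idle])

print([Path(p).name for p in demo_missing])  # ['worker-2.log']
```

Any worker that appears in the returned list is worth inspecting by hand before waiting on the campaign.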
5. Report
Wait for the campaign to complete with live progress:
source $PROJECT_ROOT/.env && uv run eval wait <campaign_id> --remote
Run this as a background Bash command so you can continue working while it runs. It will show live progress and exit with a summary when all trials finish.
If the user needs to check status manually:
- eval show <campaign_id> --remote
- eval worker status --remote
- eval viewer --remote (web UI)
Environment
- $PROJECT_ROOT is the git repo root (parent of eval/)
- .env at project root contains EVAL_DATABASE_URL and ANTHROPIC_API_KEY
- Working directory is eval/