Skill
v1.2.1
Monitor the Claude API for outages and latency spikes with rich Telegram alerts. Status monitoring, latency probes, and automatic recovery notifications.
Claude Watchdog 🐕
Monitor the Anthropic/Claude API for outages and latency spikes. Sends rich alerts to Telegram; status checks consume no agent tokens.
What It Does
Status Monitor (status-check.py)
- Polls status.claude.com every 15 minutes via cron
- Alerts with the incident name, latest update text, and per-component status
- Tags incidents as "(not our model)" if, for example, Haiku is affected but you use Sonnet
- Sends an all-clear on recovery
- Zero token cost
Latency Probe (latency-probe.py)
- Sends a minimal request through OpenClaw's local gateway every 15 minutes
- Measures real end-to-end latency to the Anthropic API
- Maintains a rolling baseline (median of the last 20 samples)
- Alerts with 🟡/🟠/🔴 severity based on spike magnitude
- Sends an all-clear when latency recovers
- Costs about $0.000001 per probe
Setup
Run the interactive setup script:
bash /path/to/skills/claude-watchdog/scripts/setup.sh
You'll need:
- Telegram Bot Token: from @BotFather
- Telegram Chat ID: send a message to your bot, then check https://api.telegram.org/bot<token>/getUpdates
- OpenClaw Gateway Token: run python3 -c "from pathlib import Path; import json; print(json.load(open(Path.home() / '.openclaw/openclaw.json'))['gateway']['auth']['token'])"
- Gateway Port: default 18789
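Reading the chat ID out of the getUpdates response by eye works, but a short script can do it too. The sketch below assumes the usual Telegram Bot API response shape (a `result` list whose entries carry a `chat.id` under `message`, `edited_message`, or `channel_post`). The helper name `extract_chat_ids` and the sample payload are illustrative and not part of the skill.

```python
import json


def extract_chat_ids(payload: dict) -> list[int]:
    """Return the distinct chat IDs found in a getUpdates response, in order."""
    seen: list[int] = []
    for update in payload.get("result", []):
        # An update may carry a message, an edited message, or a channel post.
        for key in ("message", "edited_message", "channel_post"):
            chat_id = update.get(key, {}).get("chat", {}).get("id")
            if chat_id is not None and chat_id not in seen:
                seen.append(chat_id)
    return seen


# Hypothetical sample shaped like a getUpdates response.
sample = json.loads("""
{"ok": true, "result": [
  {"update_id": 1, "message": {"chat": {"id": 123456789, "type": "private"}, "text": "hi"}},
  {"update_id": 2, "message": {"chat": {"id": 123456789, "type": "private"}, "text": "again"}},
  {"update_id": 3, "channel_post": {"chat": {"id": -1001234567890, "type": "channel"}}}
]}
""")

print(extract_chat_ids(sample))
```

If the list comes back empty, the bot has probably not received a message yet; send one and fetch the updates again.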
The setup script writes config, installs cron jobs, and runs an initial check.
To uninstall (removes cron jobs, optionally config/state):
bash /path/to/skills/claude-watchdog/scripts/setup.sh --uninstall
Config
Stored in ~/.openclaw/skills/claude-watchdog/claude-watchdog.env. To reconfigure, either re-run setup.sh or edit this file directly; changes take effect on the next cron run (within 15 minutes).
TELEGRAM_BOT_TOKEN=...
TELEGRAM_CHAT_ID=...
OPENCLAW_GATEWAY_TOKEN=...
OPENCLAW_GATEWAY_PORT=18789
MONITOR_MODEL=sonnet
PROBE_MODEL=openclaw
PROBE_AGENT_ID=main

| Variable | Default | Description |
| --- | --- | --- |
| TELEGRAM_BOT_TOKEN | (required) | Telegram bot token from @BotFather |
| TELEGRAM_CHAT_ID | (required) | Target chat for alerts |
| OPENCLAW_GATEWAY_TOKEN | (required) | Auth token for the local OpenClaw gateway |
| OPENCLAW_GATEWAY_PORT | 18789 | Port the OpenClaw gateway listens on |
| MONITOR_MODEL | sonnet | Model name to match in status incidents (e.g. "sonnet", "haiku") |
| PROBE_MODEL | openclaw | Model alias sent to the gateway for latency probes. openclaw uses the gateway's default model routing |
| PROBE_AGENT_ID | main | Value of the x-openclaw-agent-id header sent with probes |
| FILTER_KEYWORDS | (none) | Comma-separated keywords to filter out of status alerts (e.g. "Skills,Artifacts,Memory"). Empty = receive all alerts |

Scripts also accept these as environment variables (the env file takes priority).
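The precedence described here (defaults, then environment variables, then the env file on top) can be sketched as follows. This is a minimal illustration, not the skill's actual loader: the function names `parse_env_file` and `load_config` are hypothetical, and the parser assumes plain KEY=VALUE lines with optional quotes and `#` comments.

```python
import os
from pathlib import Path

# Defaults taken from the variable table; required keys have no default.
DEFAULTS = {
    "OPENCLAW_GATEWAY_PORT": "18789",
    "MONITOR_MODEL": "sonnet",
    "PROBE_MODEL": "openclaw",
    "PROBE_AGENT_ID": "main",
    "FILTER_KEYWORDS": "",
}
REQUIRED = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID", "OPENCLAW_GATEWAY_TOKEN")


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(path: Path, environ=os.environ) -> dict[str, str]:
    """Merge settings with precedence: defaults < environment < env file."""
    config = dict(DEFAULTS)
    for key in (*DEFAULTS, *REQUIRED):
        if key in environ:
            config[key] = environ[key]
    config.update(parse_env_file(path))  # the env file wins over the environment
    missing = [key for key in REQUIRED if not config.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return config
```

Calling `load_config(Path.home() / ".openclaw/skills/claude-watchdog/claude-watchdog.env")` would then return one dictionary with every setting resolved, or raise if a required token is absent.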
Security Note
The env file contains sensitive tokens (Telegram bot token, gateway token). The setup script sets permissions to 600 (owner-only read/write). If you create or edit the file manually, ensure restricted permissions:
chmod 600 ~/.openclaw/skills/claude-watchdog/claude-watchdog.env
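To verify the permissions from Python rather than by hand, a check along these lines would work on POSIX systems. It is a sketch only; the helper names `is_owner_only` and `restrict` are hypothetical and not part of the skill's scripts.

```python
import os
import stat
from pathlib import Path


def is_owner_only(path: Path) -> bool:
    """True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict(path: Path) -> None:
    """Equivalent of `chmod 600`: owner read/write only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A script could call `is_owner_only` on the env file at startup and either warn or call `restrict` when the check fails.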
Alert Examples
Status incident:
🟠 Anthropic Status: Partially Degraded Service
📌 Elevated error rates on Claude 3.5 Haiku (not our model)
Status: Investigating
Update: "We are investigating increased error rates..."
Components:
🟠 API: partial outage
🔗 https://status.claude.com
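The "(not our model)" tag and the keyword filter can be approximated with simple substring matching on the incident name. The sketch below is an assumption about how such matching might work, not the logic in status-check.py; the function names and the list of model family names are illustrative.

```python
# Assumed list of model family names to look for in incident titles.
KNOWN_MODELS = ("haiku", "sonnet", "opus")


def tag_incident(name: str, monitor_model: str) -> str:
    """Append "(not our model)" when the incident names only other models."""
    lowered = name.lower()
    mentioned = [model for model in KNOWN_MODELS if model in lowered]
    if mentioned and monitor_model.lower() not in mentioned:
        return f"{name} (not our model)"
    return name


def is_filtered(name: str, filter_keywords: str) -> bool:
    """True if any comma-separated keyword appears in the incident name."""
    keywords = [k.strip().lower() for k in filter_keywords.split(",") if k.strip()]
    return any(keyword in name.lower() for keyword in keywords)


print(tag_incident("Elevated error rates on Claude 3.5 Haiku", "sonnet"))
print(is_filtered("Skills upload failures", "Skills,Artifacts,Memory"))
```

An incident that names no model at all is left untagged, since it may well affect the monitored model.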
Latency spike:
🟡 Anthropic API — High Latency Detected
Current: 12.3s
Baseline: 3.1s (median of last 19 samples)
Ratio: 4.0×
Slow responses are expected right now.
Recovery:
✅ Anthropic API — Latency Back to Normal
Current: 2.8s
Baseline: 3.1s
Was: 12.3s when alert fired
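Messages like the ones above can be delivered with the Telegram Bot API's sendMessage method using only the standard library. The sketch below builds the request without sending it; `build_alert_request` is a hypothetical helper, and the token and chat ID shown are placeholders.

```python
import json
import urllib.request


def build_alert_request(bot_token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Telegram sendMessage request."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder credentials; nothing is sent over the network here.
request = build_alert_request("123:ABC", "42", "Current: 2.8s\nBaseline: 3.1s")
print(request.full_url)
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` would actually send the message.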
State & Logs
All state and log files are stored in ~/.openclaw/skills/claude-watchdog/:

| File | Purpose |
| --- | --- |
| claude-watchdog-status.json | Status check state |
| claude-watchdog-latency.json | Latency probe state & samples |
| claude-watchdog-status.log | Status check log |
| claude-watchdog-latency.log | Latency probe log |

Tuning Thresholds
Edit constants at the top of latency-probe.py:

| Constant | Default | Meaning |
| --- | --- | --- |
| ALERT_MULTIPLIER | 2.5 | Alert if latency > N× baseline median |
| ALERT_HARD_FLOOR | 10.0s | Always alert above this absolute threshold |
| RECOVER_MULTIPLIER | 1.5 | Clear alert when below N× baseline |
| BASELINE_WINDOW | 20 | Rolling sample window size |
| BASELINE_MIN_SAMPLES | 5 | Minimum samples before alerting starts |
| PROBE_TIMEOUT | 45s | Give up on probe after this long |

Requirements
- Python 3.10+ (stdlib only, no pip dependencies)
- OpenClaw gateway running locally
- Telegram bot with access to the target chat
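How the tuning constants interact is easiest to see in code. The sketch below is one plausible reading of the Tuning Thresholds table, not the logic in latency-probe.py: the function name `evaluate` and its return values are hypothetical, and it assumes the hard floor applies even before a baseline exists.

```python
from statistics import median

# Values mirror the defaults in the Tuning Thresholds table.
ALERT_MULTIPLIER = 2.5
ALERT_HARD_FLOOR = 10.0  # seconds
RECOVER_MULTIPLIER = 1.5
BASELINE_WINDOW = 20
BASELINE_MIN_SAMPLES = 5


def evaluate(latency: float, samples: list[float], alerting: bool) -> str:
    """Classify one probe result as "ok", "alert", "alerting", or "recover"."""
    window = samples[-BASELINE_WINDOW:]
    # No baseline until enough samples have been collected.
    baseline = median(window) if len(window) >= BASELINE_MIN_SAMPLES else None

    if not alerting:
        if latency > ALERT_HARD_FLOOR:
            return "alert"
        if baseline is not None and latency > ALERT_MULTIPLIER * baseline:
            return "alert"
        return "ok"

    # Already alerting: clear only once latency is back near the baseline.
    if (
        baseline is not None
        and latency <= ALERT_HARD_FLOOR
        and latency < RECOVER_MULTIPLIER * baseline
    ):
        return "recover"
    return "alerting"


history = [3.1] * 19
print(evaluate(12.3, history, alerting=False))
print(evaluate(2.8, history, alerting=True))
```

Using a lower multiplier for recovery than for alerting adds hysteresis, so a latency hovering near the alert threshold does not cause repeated alert and all-clear messages.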