📦 AB Test Eval: A/B Test Evaluation

v2.1.2

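Run A/B tests on any OpenClaw skill, script, hook, or scheduled task with a single command. The skill automatically compares performance, stability, and output differences so you can quickly identify the best variant and keep improving what you ship.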

by @cyrushuang1995-cmyk (Siyuan Huang)
Last updated: 2026/4/11
Security scan
VirusTotal: suspicious
OpenClaw: safe (high confidence)
Assessment recommendation
This skill appears to do what it claims: it creates local eval workspaces, reads a target skill's SKILL.md to generate tests, snapshots versions with cp, and spawns subagents to run arms. Before installing/using it, consider: 1) it will read other skills' SKILL.md files — do those files contain anything you consider sensitive? 2) it can spawn many parallel subagents; use the built-in dry-run and smoke-test options first, and limit concurrency to avoid unexpected resource usage or costs; 3) confi...
Detailed analysis
Purpose and capabilities
Name and description match what the SKILL.md instructs: creating an eval workspace, reading a target skill's SKILL.md, generating eval cases, snapshotting via cp, and orchestrating parallel subagents. Required binaries (mkdir, cp) are appropriate and proportional for workspace creation and making snapshots.
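The workspace-and-snapshot step described above can be sketched in Python as the equivalent of `mkdir -p` followed by `cp -r`. This is only an illustration: the function name `snapshot_skill`, the `snapshots/<arm>` layout, and the arm label are assumptions made for the example, not the skill's actual file layout.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_skill(skill_dir: Path, workspace: Path, arm: str) -> Path:
    """Copy skill_dir into workspace/snapshots/<arm> (hypothetical layout)."""
    dest = workspace / "snapshots" / arm
    # Equivalent of `mkdir -p` for the parent directory.
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Equivalent of `cp -r`; raises FileExistsError if dest already exists,
    # so an existing snapshot is never silently overwritten.
    shutil.copytree(skill_dir, dest)
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        skill = root / "my-skill"
        skill.mkdir()
        (skill / "SKILL.md").write_text("# demo skill\n", encoding="utf-8")
        snap = snapshot_skill(skill, root / "eval-workspace", "arm-a")
        print(sorted(p.name for p in snap.iterdir()))
```

Copying into a fresh directory per arm keeps each variant frozen for the duration of the test, so later edits to the target skill do not leak into a running comparison.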
Instruction scope
Instructions largely stay inside the stated purpose, but they do require reading other skills' SKILL.md files and spawning parallel subagents to run arms. Reading SKILL.md for target components is necessary for generating realistic evals, and the doc explicitly requires user approval before running. These behaviors are expected for an evaluator but do mean the skill will access other skills' contents and may run many subagents (resource use).
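To make the resource concern concrete, here is a minimal sketch of running arms with a concurrency cap and a dry-run mode. It does not use the skill's real subagent mechanism, which this page does not show: `run_arms`, `max_workers`, and `dry_run` are hypothetical names, and plain Python callables stand in for subagents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_arms(
    arms: Dict[str, Callable[[str], str]],
    cases: List[str],
    max_workers: int = 2,
    dry_run: bool = False,
) -> Dict[str, List[str]]:
    """Run every (arm, case) pair with at most max_workers jobs in flight.

    With dry_run=True nothing is executed; the planned jobs are returned
    so the plan can be reviewed before any resources are spent.
    """
    jobs = [(name, case) for name in arms for case in cases]
    results: Dict[str, List[str]] = {name: [] for name in arms}
    if dry_run:
        for name, case in jobs:
            results[name].append(f"PLANNED {case}")
        return results
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [(name, pool.submit(arms[name], case)) for name, case in jobs]
        # Collect in submission order so each arm's results line up with cases.
        for name, future in futures:
            results[name].append(future.result())
    return results


if __name__ == "__main__":
    arms = {"A": str.upper, "B": str.title}  # stand-ins for two variants
    print(run_arms(arms, ["hello world"], dry_run=True))
    print(run_arms(arms, ["hello world"], max_workers=1))
```

Starting with the dry run and a small `max_workers` mirrors the advice above: inspect the plan first, then scale up only once the cost per job is understood.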
Install mechanism
No install spec; instruction-only skills are lowest risk because nothing is downloaded or written at install time.
Credential requirements
No environment variables, credentials, or config paths are requested. The declared requirements (mkdir, cp) are minimal and appropriate.
Persistence and privileges
always:false and normal model invocation. The skill writes evaluation workspaces and history files in a sibling workspace (expected for a tester) but does not request persistent platform-wide privileges or modify other skills' configurations.
Security is layered; review the code before running it.

Runtime dependencies

No special dependencies

Versions

latest · v2.1.2 · 2026/3/31 · suspicious

Install command

Official: npx clawhub@latest install ab-test-eval
Mirror (faster in China): npx clawhub@latest install ab-test-eval --registry https://cn.longxiaskill.com
Data source: ClawHub · Chinese localization: 龙虾技能库