
MetriLLM

v0.2.11

[AI-assisted] Find the best local LLM for your machine. Tests speed, quality and RAM fit, then tells you if a model is worth running on your hardware.

by @thebluehouse75 (TheBlueHouse75)·MIT-0
License
MIT-0
Last updated
2026/4/12
Security scan
VirusTotal
Suspicious
OpenClaw
Safe
high confidence
The skill's requirements and instructions are coherent with its stated purpose (benchmarking local LLMs); nothing requested is disproportionate, though the user should review the npm package and the optional leaderboard sharing before installing.
Assessment advice
This skill is coherent for benchmarking local LLMs, but take two precautions before installing: (1) review the npm package / GitHub repo (https://github.com/MetriLLM/metrillm) or audit the package contents before running npm install -g, since global npm installs execute third-party code on your machine; (2) only use --share if you consent to publishing model names, scores and hardware details (the README says no personal data is sent, but verify what the package actually uploads). Also ensure yo...
Detailed analysis
Purpose and capabilities
The name/description match the instructions: it tells you how to install the metrillm CLI, requires Node 20+ and a local LLM server (Ollama or LM Studio), and runs benchmarking commands. No unrelated credentials, binaries, or config paths are requested.
Instruction scope
Instructions stay within the benchmarking scope (run metrillm bench, view local ~/.metrillm/results/). One caution: the optional --share command uploads results (model name, scores, hardware specs) to metrillm.dev; the SKILL.md states no personal data is sent, but that claim cannot be verified from instructions alone. The skill does not instruct access to unrelated files or env vars.
Installation mechanism
Installation is via npm (npm install -g metrillm) which is a standard delivery for a Node CLI. npm installs are moderate-risk because they execute third-party code on your system; this is proportionate to the stated purpose but you should inspect the package or source repository before global installation.
Credential requirements
No environment variables, credentials, or config paths are required. The only data potentially exported is from the explicit --share action (model, scores, hardware specs), which is reasonable for a community leaderboard.
Persistence and permissions
The skill is not always-enabled and is user-invocable. It does not request persistent elevated privileges or modify other skills. Autonomous invocation is permitted by default (normal), but nothing in the skill attempts to gain extra persistence.
Security comes in layers; review the code before running.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v0.2.11 · 2026/3/3

Fix license: Apache-2.0, not MIT

● Suspicious

Install command

Official: npx clawhub@latest install metrillm
Mirror (CN): npx clawhub@latest install metrillm --registry https://cn.clawhub-mirror.com

Skill documentation

Test any local model and get a clear verdict: is it worth running on your machine?

Prerequisites

  • Node.js 20+ — check with node -v
  • Ollama or LM Studio installed and running
    - Ollama: ollama.com, then ollama serve
    - LM Studio: lmstudio.ai, load a model and start the server
  • MetriLLM CLI — install globally:
npm install -g metrillm
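Before installing, a quick preflight sketch can confirm the prerequisites. This is illustrative, not part of the CLI: the version parsing assumes `node -v` output like `v20.11.1`, and the reachability check assumes Ollama's documented default port, 11434.

```shell
# node_major: extract the major version from a `node -v` string like "v20.11.1"
node_major() {
  # strip the leading "v" and keep the first dotted component
  printf '%s\n' "$1" | sed 's/^v//' | cut -d. -f1
}

ver=$(node -v 2>/dev/null || echo v0)
if [ "$(node_major "$ver")" -ge 20 ]; then
  echo "Node.js OK ($ver)"
else
  echo "Need Node.js 20+ (found: $ver)" >&2
fi

# Optional: check that an Ollama server answers on its default port.
# GET /api/version is Ollama's version endpoint; adjust the port if customized.
curl -fsS http://localhost:11434/api/version >/dev/null 2>&1 \
  && echo "Ollama server reachable" \
  || echo "No Ollama server on :11434 (start it with: ollama serve)"
```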

Usage

List available models

ollama list

Run full benchmark

metrillm bench --model $ARGUMENTS --json

This measures:

  • Performance: tokens/second, time to first token, memory usage
  • Quality: reasoning, math, coding, instruction following, structured output, multilingual
  • Fitness verdict: EXCELLENT / GOOD / MARGINAL / NOT RECOMMENDED
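Putting the two commands together, a minimal wrapper might pick the first model from `ollama list` and hand it to the benchmark. The column layout in the sample below is typical `ollama list` output but is an assumption here, and the `metrillm bench` call is left commented so nothing runs by accident:

```shell
# Sketch: benchmark the first locally installed model.
# `ollama list` prints a header row, then one model per line; the
# NAME-first column layout here is an assumption based on typical output.
sample='NAME            ID            SIZE    MODIFIED
llama3:latest   365c0bd3c000  4.7 GB  2 days ago'

# In practice: model=$(ollama list | awk 'NR==2 {print $1}')
model=$(printf '%s\n' "$sample" | awk 'NR==2 {print $1}')
echo "would benchmark: $model"

# metrillm bench --model "$model" --json
```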

Performance-only benchmark (faster)

metrillm bench --model $ARGUMENTS --perf-only --json

Skips quality evaluation — measures speed and memory only.

View previous results

ls ~/.metrillm/results/

Read any JSON file to see full benchmark details.
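For a quick look without opening the whole file, a grep sketch can pull out a headline number. The field names (`tokensPerSecond`, `ttft`, `memoryUsedGB`) come from the key metrics documented for MetriLLM; the flat JSON layout of the sample is an assumption:

```shell
# Sketch: pull a headline metric out of a result file.
# Field names follow MetriLLM's documented key metrics; the flat
# single-line JSON layout used here is an assumption.
sample='{"model":"llama3:latest","tokensPerSecond":42.5,"ttft":310,"memoryUsedGB":5.2}'

# In practice, run the same grep against a file under ~/.metrillm/results/
tps=$(printf '%s' "$sample" | grep -oE '"tokensPerSecond":[0-9.]+')
echo "$tps"
```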

Share to the public leaderboard

metrillm bench --model $ARGUMENTS --share

Uploads your result to the MetriLLM community leaderboard — an open, community-driven ranking of local LLM performance across real hardware. Compare your results with others and help the community find the best models for every setup. Shared data includes: model name, scores, hardware specs (CPU, RAM, GPU). No personal data is sent.

Interpreting Results

| Verdict         | Score | Meaning                        |
| --------------- | ----- | ------------------------------ |
| EXCELLENT       | >= 80 | Fast and accurate: great fit   |
| GOOD            | >= 60 | Solid: suitable for most tasks |
| MARGINAL        | >= 40 | Usable but with tradeoffs      |
| NOT RECOMMENDED | < 40  | Too slow or inaccurate         |
Key metrics to highlight:
  • tokensPerSecond > 30 = good for interactive use
  • ttft < 500ms = responsive
  • memoryUsedGB vs available RAM = will it fit?
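The verdict thresholds can be sketched as a small shell function. The cutoffs mirror the verdict table in this document; the function itself is illustrative and not part of the CLI, and it assumes integer overall scores:

```shell
# verdict: map an overall 0-100 benchmark score to MetriLLM's fitness verdict.
# Thresholds taken from the verdict table; integer scores assumed.
verdict() {
  if   [ "$1" -ge 80 ]; then echo "EXCELLENT"
  elif [ "$1" -ge 60 ]; then echo "GOOD"
  elif [ "$1" -ge 40 ]; then echo "MARGINAL"
  else                       echo "NOT RECOMMENDED"
  fi
}

verdict 85   # EXCELLENT
verdict 47   # MARGINAL
```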

Tips

  • Use --perf-only for quick tests
  • Close GPU-intensive apps before benchmarking
  • Benchmark duration varies depending on model speed and response length

Open Source

MetriLLM is free and open source (Apache 2.0). Contributions, issues, and feedback are welcome: github.com/MetriLLM/metrillm

Data source: ClawHub · Chinese localization: 龙虾技能库