Package: Ollama Manager — Ollama tooling
v1.3.1 [AI-assisted] Manage Ollama models across your machines — see what's loaded, what's eating disk, what's never used, and what you should pull next. Get AI-powered recommendations.
Version notes
Cross-platform support: macOS, Linux, and Windows. Updated OS metadata, descriptions, and hardware recommendations.
Skill documentation
You're helping someone wrangle their Ollama models. They've got models scattered across machines — some loaded, some sitting cold on disk, some they forgot they pulled six months ago. This skill gives you the tools to see every model, clean up the mess, and figure out what they actually need.
The Ollama problem
Ollama makes it too easy to pull models. ollama pull this, ollama pull that — suddenly you've got 200GB of models across three machines and no idea which ones you actually use. No way to see disk usage across machines. No way to compare which model is faster on which hardware. No "hey, you haven't touched this 40GB model in two weeks, maybe delete it?"
That's what Ollama Manager is for.
Getting started with Ollama Manager
pip install ollama-herd # install the Ollama management toolkit
herd # start the Ollama router (tracks all your Ollama machines)
herd-node # run on each Ollama machine you want to manage
Package: ollama-herd | Repo: github.com/geeks-accelerator/ollama-herd
Connect to your Ollama fleet
The Ollama manager talks to an Ollama Herd router at http://localhost:11435. This router already knows about all your Ollama machines — it tracks heartbeats, loaded Ollama models, disk usage, and Ollama performance history.
See what models you've got
Every model available across all machines
# ollama_all_models — list every Ollama model on every node
curl -s http://localhost:11435/api/tags | python3 -m json.tool
Shows every Ollama model on every machine with sizes and which nodes have them.
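As a sketch of what you can do with that output: the snippet below sums model count and total size from a tags payload. It assumes the stock Ollama /api/tags shape ({"models": [{"name": ..., "size": <bytes>}]}); the herd router's aggregated payload may differ.

```python
# Assumed shape: {"models": [{"name": ..., "size": <bytes>}, ...]}
# (the stock Ollama /api/tags response; the herd aggregate may differ).
def summarize_tags(tags: dict) -> dict:
    """Return model count and total size in GB from a tags payload."""
    models = tags.get("models", [])
    total_bytes = sum(m.get("size", 0) for m in models)
    return {"count": len(models), "total_gb": round(total_bytes / 1e9, 1)}

sample = {"models": [
    {"name": "llama3.1:8b", "size": 4_700_000_000},
    {"name": "qwen2.5:14b", "size": 9_000_000_000},
]}
print(summarize_tags(sample))  # {'count': 2, 'total_gb': 13.7}
```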
What models are actually loaded in GPU memory right now
# ollama_hot_models — Ollama models ready to serve instantly
curl -s http://localhost:11435/api/ps | python3 -m json.tool
These are the "hot" models — ready to serve instantly. Everything else is cold on disk and needs load time.
Per-machine breakdown with disk usage
# ollama_disk_usage — per-node Ollama model sizes
curl -s http://localhost:11435/dashboard/api/model-management | python3 -m json.tool
The real picture: model sizes, last-used timestamps, which machines have which models, and how much disk each is eating.
Figure out which models to keep
Which models actually get used?
sqlite3 ~/.fleet-manager/latency.db "SELECT model, COUNT(*) as requests, SUM(COALESCE(completion_tokens,0)) as tokens_generated, ROUND(AVG(latency_ms)/1000.0, 1) as avg_secs FROM request_traces WHERE status='completed' GROUP BY model ORDER BY requests DESC"
Which models haven't been touched?
sqlite3 ~/.fleet-manager/latency.db "SELECT model, MAX(datetime(timestamp, 'unixepoch', 'localtime')) as last_used, COUNT(*) as total_requests FROM request_traces GROUP BY model ORDER BY last_used ASC"
If a model's last request was weeks ago, it's a candidate for deletion.
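Once the query returns last-used timestamps, a small helper can flag deletion candidates. A minimal sketch, assuming rows of (model, last_used_epoch); the 14-day threshold is arbitrary.

```python
import time

# Rows mirror the query output: (model, last_used_epoch).
def stale_models(rows, days=14, now=None):
    """Return models whose last request is older than `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    return [model for model, last_used in rows if last_used < cutoff]

now = 1_700_000_000
rows = [("llama3.3:70b", now - 30 * 86400),  # idle for a month
        ("qwen2.5:14b", now - 3600)]         # used an hour ago
print(stale_models(rows, days=14, now=now))  # ['llama3.3:70b']
```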
How much disk does each model use?
curl -s http://localhost:11435/dashboard/api/model-management | python3 -c "
import sys, json
data = json.load(sys.stdin)
for node in data:
    print(f\"\\n{node['node_id']}:\")
    ollama_total = 0
    for m in node.get('models', []):
        size = m.get('size_gb', 0)
        ollama_total += size
        print(f\" {m['name']:40s} {size:6.1f} GB\")
    print(f\" {'OLLAMA TOTAL':40s} {ollama_total:6.1f} GB\")
"
Which models are fast and which are slow?
sqlite3 ~/.fleet-manager/latency.db "SELECT model, node_id, ROUND(AVG(latency_ms)/1000.0, 1) as avg_secs, COUNT(*) as n FROM request_traces WHERE status='completed' GROUP BY model, node_id HAVING n > 5 ORDER BY avg_secs"
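Given rows of (model, node_id, avg_secs) from the query above, a helper can pick the fastest node per model. A minimal sketch:

```python
# Rows mirror the query output: (model, node_id, avg_secs).
def fastest_node(rows):
    """Map each model to the (node, avg_secs) with the lowest latency."""
    best = {}
    for model, node_id, avg_secs in rows:
        if model not in best or avg_secs < best[model][1]:
            best[model] = (node_id, avg_secs)
    return best

rows = [("llama3.1:8b", "mac-studio", 1.2),
        ("llama3.1:8b", "linux-box", 0.8),
        ("qwen2.5:14b", "mac-studio", 2.5)]
print(fastest_node(rows))
# {'llama3.1:8b': ('linux-box', 0.8), 'qwen2.5:14b': ('mac-studio', 2.5)}
```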
Get Ollama recommendations
What models should I be running?
# ollama_recommendations — optimal Ollama model mix per node
curl -s http://localhost:11435/dashboard/api/recommendations | python3 -m json.tool
AI-powered recommendations based on your actual hardware — RAM, cores, GPU memory. Tells you which models fit, which are too big, and the optimal model mix for your machines. Includes estimated RAM requirements and benchmark data.
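The fit check behind such recommendations can be approximated: a model roughly fits when its on-disk size plus runtime overhead is under available RAM. The 1.2x overhead factor below is an assumption for illustration, not the router's actual rule.

```python
# The 1.2x overhead factor (KV cache, runtime buffers) is an assumption.
def fits(model_gb: float, free_ram_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: does the model fit in available RAM with headroom?"""
    return model_gb * overhead <= free_ram_gb

print(fits(40.0, 64.0))  # True: a 40 GB model needs roughly 48 GB
print(fits(40.0, 32.0))  # False: too big for 32 GB free
```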
Pull and delete models
Pull a model to a specific machine
# ollama_pull — download an Ollama model to a node
curl -s -X POST http://localhost:11435/dashboard/api/pull \
-H "Content-Type: application/json" \
-d '{"model": "llama3.3:70b", "node_id": "mac-studio"}'
The Ollama router picks the machine with the most free disk and memory if you're not sure which node to target.
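That selection might look like the sketch below: prefer the node with the most free disk, breaking ties on free memory. The field names (node_id, free_disk_gb, free_ram_gb) are assumptions, not the router's documented node schema.

```python
# Field names (node_id, free_disk_gb, free_ram_gb) are assumptions,
# not the router's documented node schema.
def pick_node(nodes):
    """Pick the node with the most free disk, then the most free RAM."""
    best = max(nodes, key=lambda n: (n["free_disk_gb"], n["free_ram_gb"]))
    return best["node_id"]

nodes = [{"node_id": "mac-studio", "free_disk_gb": 120, "free_ram_gb": 64},
         {"node_id": "linux-box", "free_disk_gb": 500, "free_ram_gb": 32}]
print(pick_node(nodes))  # linux-box
```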
Delete a model from a machine
# ollama_delete — remove an Ollama model from a node
curl -s -X POST http://localhost:11435/dashboard/api/delete \
-H "Content-Type: application/json" \
-d '{"model": "old-model:7b", "node_id": "mac-studio"}'
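Per the guardrails at the end of this document, deletes should be gated on explicit confirmation. A minimal sketch of such a gate, showing the model and the disk it frees:

```python
# A deletion should only proceed on an explicit "yes" from the user.
def confirm_delete(model: str, size_gb: float, answer: str):
    """Return the confirmation prompt and whether the user approved."""
    prompt = f"Delete {model}? Frees {size_gb:.1f} GB [yes/no]"
    return prompt, answer.strip().lower() == "yes"

prompt, ok = confirm_delete("old-model:7b", 4.1, "yes")
print(prompt)  # Delete old-model:7b? Frees 4.1 GB [yes/no]
print(ok)      # True
```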
Auto-pull (when enabled)
If a client requests a model that doesn't exist anywhere, the router can automatically pull it to the best machine. Toggle this:
# Check the current setting
curl -s http://localhost:11435/dashboard/api/settings | python3 -c "import sys,json; print(json.load(sys.stdin)['config']['toggles'])"
# Toggle auto-pull off
curl -s -X POST http://localhost:11435/dashboard/api/settings \
-H "Content-Type: application/json" \
-d '{"auto_pull": false}'
Check Ollama fleet health
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool
Automated checks for: model thrashing (models loading/unloading frequently — a sign of memory pressure), disk pressure, and underutilized nodes that could take more models.
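A thrashing check like the one this endpoint runs can be sketched as: a model that loads more than N times within a window is probably being evicted and reloaded under memory pressure. The thresholds below are assumptions.

```python
# Thresholds (1-hour window, more than 3 loads) are assumptions.
def is_thrashing(load_times, window_secs=3600, max_loads=3):
    """True if any `window_secs` window holds more than `max_loads` loads."""
    load_times = sorted(load_times)
    for i, start in enumerate(load_times):
        in_window = [t for t in load_times[i:] if t - start <= window_secs]
        if len(in_window) > max_loads:
            return True
    return False

t0 = 1_700_000_000
print(is_thrashing([t0, t0 + 600, t0 + 1200, t0 + 1800]))  # True: 4 loads in 30 min
print(is_thrashing([t0, t0 + 7200]))                       # False
```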
Ollama Dashboard
Open http://localhost:11435/dashboard and go to the Recommendations tab for a visual Ollama model management interface. One-click pull for recommended Ollama models. The Fleet Overview tab shows which Ollama models are loaded where in real time.
Ollama Guardrails
- Never delete models without explicit user confirmation. Always show which model will be deleted and how much disk it frees.
- Never pull models without user confirmation. Ollama downloads can be 10-100+ GB.
- Never modify files in ~/.fleet-manager/ (contains Ollama data).
- If the Ollama router isn't running, suggest herd or uv run herd to start it.