Security Scan
OpenClaw
Safe
High confidence: The skill's requirements, instructions, and install steps match its stated purpose (running the mineru-open-api CLI with a MINERU_TOKEN to clean HTML).
Assessment Recommendations
This skill appears coherent: it runs the mineru-open-api CLI and needs a MINERU_TOKEN from mineru.net. Before installing, verify that the npm package and GitHub repo are legitimate (check the publisher, recent commits, and npm download counts). Treat MINERU_TOKEN like any API credential: provide a token with only the minimal scopes needed, avoid using it with highly sensitive local HTML unless you accept sending that content to the MinerU service, and rotate or delete the token if you stop using the skill.
Detailed Analysis
✓ Purpose and Capabilities
Name/description (HTML cleanup via MinerU) align with required binary (mineru-open-api) and required env var (MINERU_TOKEN). The primary credential and declared binaries are exactly what the CLI needs to function.
✓ Instruction Scope
SKILL.md only instructs the agent to run mineru-open-api commands against remote URLs or local HTML files, use the auth flow, and write output to stdout or files. It does not ask the agent to read unrelated system files, other credentials, or post data to unexpected endpoints beyond MinerU's API.
✓ Install Mechanism
Installation options are standard package installs (npm package and Go install from a GitHub repo). These are expected for a CLI; no arbitrary download URLs, extract steps, or personal servers are used.
✓ Credential Requirements
Only MINERU_TOKEN is required and declared as the primary credential, which is proportionate for a hosted extraction/processing service. No unrelated secrets or config paths are requested.
✓ Persistence and Permissions
The skill is not forced to be always on; it is user-invocable and does not request an elevated persistent presence or modifications to other skills or system-wide configs.
Security is layered; review the code before running it.
Runtime Dependencies
No special dependencies
Version
latest · v0.4.0 · 2026/3/27
SEO: expand description for better ClawHub vector search discovery
● Harmless
Install Command
Official: npx clawhub@latest install html-to-html
Mirror: npx clawhub@latest install html-to-html --registry https://cn.clawhub-mirror.com
Skill Documentation
Fetch a remote web page or local HTML file and convert it to clean structured HTML using MinerU. Strips noise and preserves semantic content.
Install
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
Quick Start
# Crawl a web page and output clean HTML (requires token)
mineru-open-api crawl https://example.com/article -f html -o ./out/
# Re-extract a local HTML file to clean HTML (requires token)
mineru-open-api extract page.html -f html -o ./out/
# Batch crawl multiple URLs to HTML (requires token)
mineru-open-api crawl url1 url2 -f html -o ./pages/
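When the URLs live in a file rather than on the command line, the batch form above can be wrapped in a small shell function. This is a minimal sketch, not part of the skill: the function name crawl_from_list and the MINERU_CLI override variable are hypothetical, and the only CLI syntax it relies on is the crawl ... -f html -o form shown in Quick Start.

```shell
# Hypothetical helper: read one URL per line and pass them all to a single
# `crawl` call. MINERU_CLI is an assumed override used only for dry runs;
# by default the real mineru-open-api binary is invoked.
crawl_from_list() {
  list="$1"     # text file with one URL per line
  outdir="$2"   # output directory, e.g. ./pages/
  set --        # reuse the positional parameters as the URL list
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      ''|'#'*) continue ;;   # skip blank lines and comment lines
    esac
    set -- "$@" "$url"
  done <"$list"
  if [ "$#" -eq 0 ]; then
    echo "no URLs found in $list" >&2
    return 1
  fi
  "${MINERU_CLI:-mineru-open-api}" crawl "$@" -f html -o "$outdir"
}
```

A call such as crawl_from_list urls.txt ./pages/ then behaves like the batch example, with the URLs taken from urls.txt.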
Authentication
Token required:
mineru-open-api auth # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable
Create token at: https://mineru.net/apiManage/token
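Because every command in this skill needs a token, a script may want to fail early when MINERU_TOKEN is absent. The sketch below assumes the token is supplied through the environment variable; the function name require_mineru_token is hypothetical, and the check never prints the token itself.

```shell
# Hypothetical guard: succeed only if MINERU_TOKEN is set and non-empty.
# The token value is never echoed, so it does not leak into logs.
require_mineru_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run 'mineru-open-api auth' or export it first" >&2
    return 1
  fi
  return 0
}
```

Typical use is to chain it before a real command, for example: require_mineru_token && mineru-open-api extract page.html -f html -o ./out/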
Capabilities
- Input: remote web page URL or local .html file
- Output: clean structured HTML (-f html)
- For remote URLs: use crawl -f html
- For local HTML files: use extract -f html
- Requires token; not available in flash-extract
Notes
- HTML output (-f html) requires token; not available in flash-extract
- crawl supports output formats: md, html, json
- extract supports output formats: md, html, latex, docx, json
- Output goes to stdout by default; use -o to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
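The stdout/stderr split described in the notes means the cleaned HTML and the progress messages can be captured separately with ordinary shell redirection. The following is a sketch under that assumption: the function name extract_to_file and the MINERU_CLI override are hypothetical, and it omits -o so that the document is written to stdout as the notes describe.

```shell
# Hypothetical wrapper: send document content (stdout) to one file and
# progress/status messages (stderr) to a log file.
# MINERU_CLI is an assumed override for dry runs; the default is the real CLI.
extract_to_file() {
  # $1: input .html file, $2: output file, $3: log file (appended to)
  "${MINERU_CLI:-mineru-open-api}" extract "$1" -f html >"$2" 2>>"$3"
}
```

For example, extract_to_file page.html clean.html mineru.log would leave only the document in clean.html, which keeps the output safe to pipe into other tools.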
Data source: ClawHub · Chinese localization: 龙虾技能库