Version
Initial release of Website Scraper Pro.
- Scrape a single web page into clean markdown or deterministic JSON using Crawl4AI.
- Supports JS-aware scraping for client-side rendered pages.
- Deterministic, query-focused narrowing of content without internal AI processing.
- Outputs either markdown or structured JSON including title, links, and metadata.
- Usage is limited to single-page extraction; no site-wide crawling or web search.
Skill documentation
When to use
- The user wants the content of a single web page from a specific URL.
- The user wants clean markdown extracted from an article, docs page, blog post, or landing page.
- The user wants a JS-aware scrape for a page that depends on client-side rendering.
- The user wants deterministic query-focused narrowing of one page without using an AI model inside the skill.
- The user wants structured JSON output with markdown, title, links, and metadata.
When NOT to use
- The user wants a broad web search across multiple sources.
- The user wants a site-wide crawl, recursive crawl, or multi-page extraction workflow.
- The user wants AI summarization, synthesis, or answer generation inside the scraper itself.
- The user wants authenticated browser automation or interactive form submission.
Commands
Scrape a page to markdown
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <url>
Scrape a JS-heavy page
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <url> --js
Scrape a page and narrow by query
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <url> --query "<query>"
Return deterministic JSON
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <url> --format json
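The commands above differ only in their flags, so a caller can assemble them from one place. The following is a minimal sketch, not part of the skill: build_command is a hypothetical helper, the script path is copied from the commands above, and combining several flags in one call is an assumption that this document does not confirm.

```python
from typing import List, Optional

# Path copied from the commands above; adjust it if the skill lives elsewhere.
SCRIPT = "/root/.openclaw/workspace/skills/website-scraper-pro/src/main.py"


def build_command(url: str, js: bool = False, query: Optional[str] = None,
                  fmt: Optional[str] = None) -> List[str]:
    """Return the argv list for one single-page scrape (hypothetical helper)."""
    cmd = ["uv", "run", SCRIPT, url]
    if js:
        cmd.append("--js")             # JS-aware scrape
    if query:
        cmd += ["--query", query]      # deterministic narrowing by query
    if fmt:
        cmd += ["--format", fmt]       # e.g. "json"
    return cmd


if __name__ == "__main__":
    print(build_command("https://example.com", js=True))
```

The returned list can be passed to subprocess.run(cmd, capture_output=True, text=True) so that the URL and query are never interpolated into a shell string.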
Examples
# Default markdown scrape
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com
# JS-aware scrape
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com --js
# Query-focused retrieval
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com --query "documentation examples"
# JSON output
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com --format json
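A caller that receives the JSON from the last example will usually want to read it defensively, because the fields are only present when available. The sketch below assumes the field names listed under Output (title, url, markdown, links, metadata); summarize_result is a hypothetical helper and the sample payload is made up for illustration, so real output may be shaped differently.

```python
import json
from typing import Any, Dict


def summarize_result(raw: str) -> Dict[str, Any]:
    """Parse the scraper's JSON text and tolerate missing optional fields."""
    data = json.loads(raw)
    links = data.get("links") or []    # works whether links is a list or a dict
    return {
        "title": data.get("title") or "",
        "url": data.get("url") or "",
        "markdown_chars": len(data.get("markdown") or ""),
        "link_count": len(links),
        "has_metadata": bool(data.get("metadata")),
    }


# Made-up sample payload; it is not captured output from the skill.
sample = json.dumps({"title": "Example Domain", "url": "https://example.com",
                     "markdown": "# Example Domain\n", "links": []})
print(summarize_result(sample))
```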
Output
- Default output is clean markdown for a single page.
- --query keeps the output deterministic and non-LLM.
- --format json returns deterministic JSON with fields such as title, url, markdown, links, and metadata when available.
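To make "deterministic and non-LLM" concrete, here is one simple way such narrowing could work: keep only the markdown blocks that share a word with the query, in their original order. This is an illustrative keyword-overlap filter and an assumption on my part; the document does not say which method the skill actually uses, and narrow_markdown is a hypothetical name.

```python
import re
from typing import List


def narrow_markdown(markdown: str, query: str) -> str:
    """Keep only blocks that share at least one word with the query."""
    terms = set(re.findall(r"\w+", query.lower()))
    # Blocks are separated by blank lines; empty blocks are dropped.
    blocks: List[str] = [b for b in re.split(r"\n\s*\n", markdown) if b.strip()]
    kept = [b for b in blocks if terms & set(re.findall(r"\w+", b.lower()))]
    return "\n\n".join(kept)


page = "# Guide\n\nInstall the package.\n\nSee the documentation examples.\n\nContact us."
print(narrow_markdown(page, "documentation examples"))
```

The same page and query always produce the same text, which is the property the Output section describes. No model call is involved.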
Notes
- This v1 does not use AI models internally. It is a deterministic retrieval tool only.
- The skill is single-page only. It does not do deep crawling, site maps, schema extraction, or RAG.
- uv run reads the inline # /// script dependency block in main.py and installs crawl4ai in an isolated environment.
- If browser setup is missing, run one-time setup commands such as:
  - uv run --with crawl4ai crawl4ai-setup
  - uv run --with crawl4ai python -m playwright install chromium
- Do NOT use web search for this workflow when a direct URL is available.
- Call uv run src/main.py directly as shown above.
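The inline # /// script block mentioned in the notes follows the PEP 723 inline script metadata format. The actual header of main.py is not shown in this document, so the example below is only a sketch of what such a block might look like: the requires-python line is an assumption, and crawl4ai is the only dependency the notes name. The regular expression is the one given in the PEP's reference implementation, and read_script_metadata is a hypothetical helper.

```python
import re

# Illustrative script header in the PEP 723 format (not the real main.py).
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#   "crawl4ai",
# ]
# ///

print("scraper entry point would go here")
'''

# Regular expression from the PEP 723 reference implementation.
BLOCK_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def read_script_metadata(source: str) -> str:
    """Return the TOML text inside the script block, or "" if there is none."""
    match = BLOCK_RE.search(source)
    if match is None or match.group("type") != "script":
        return ""
    lines = match.group("content").splitlines()
    # Strip the leading "# " (or a bare "#") from every metadata line.
    return "\n".join(line[2:] if line.startswith("# ") else line[1:] for line in lines)


print(read_script_metadata(EXAMPLE_SCRIPT))
```

Because the dependencies live in the script itself, no separate requirements file or manual virtual environment is needed before calling uv run.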