Scrapeless Webunlocker Skill

v1.0.0

Bypass website blocks and scrape web content using the Scrapeless Universal Scraping API.


by @scrapelesshq (scrapeless)·MIT-0

API development · Network tools · Browser automation


License

MIT-0

Free to use, modify, and redistribute; no attribution required.


Runtime dependencies

No special dependencies

Install command


Official: npx clawhub@latest install scrapeless-webunlocker-skill

Mirror: npx clawhub@latest install scrapeless-webunlocker-skill --registry https://cn.longxiaskill.com


Skill documentation

Web Unlocker OpenClaw Skill

Use this skill to bypass website blocks and scrape web content using the Scrapeless Universal Scraping API. It supports JavaScript rendering, CAPTCHA solving, IP rotation, and intelligent request retries.

Authentication: Set X_API_TOKEN in your environment or in a .env file in the repo root.
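The lookup described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: it assumes the variable is named X_API_TOKEN, that the environment takes precedence over the .env file, and that the .env file uses plain KEY=VALUE lines. The helper names `read_dotenv` and `load_token` are hypothetical.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def read_dotenv(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_token(
    env: Mapping[str, str] = os.environ,
    dotenv_path: Path = Path(".env"),
    name: str = "X_API_TOKEN",
) -> Optional[str]:
    """Return the token from the environment, else from the .env file.

    The precedence (environment first) is an assumption of this sketch.
    """
    return env.get(name) or read_dotenv(dotenv_path).get(name)
```

Calling `load_token()` from the repo root returns the token string, or `None` when it is set in neither place.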

Errors: On failure, the script writes a JSON error to stderr and exits with code 1.
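A caller that wraps the script could handle this error contract roughly as below. The sketch assumes the script lives at scripts/webunlocker.py, prints its JSON result to stdout on success, and that any non-zero exit code means failure. The shape of the error JSON is not documented here, so it is kept as an opaque value. `UnlockerError`, `interpret`, and `run_unlocker` are hypothetical names.

```python
import json
import subprocess
import sys


class UnlockerError(RuntimeError):
    """Raised when the script exits non-zero; `details` holds the parsed stderr."""

    def __init__(self, details):
        super().__init__(str(details))
        self.details = details


def interpret(returncode: int, stdout: str, stderr: str):
    """Turn a finished process into a parsed result or an UnlockerError."""
    if returncode == 0:
        return json.loads(stdout)  # assumption: the result JSON goes to stdout
    try:
        details = json.loads(stderr)
    except json.JSONDecodeError:
        # Fall back to the raw text if stderr is not pure JSON.
        details = {"raw": stderr.strip()}
    raise UnlockerError(details)


def run_unlocker(url: str, script: str = "scripts/webunlocker.py", timeout: float = 180.0):
    """Run the script for one URL and return its parsed JSON output."""
    proc = subprocess.run(
        [sys.executable, script, "--url", url],
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return interpret(proc.returncode, proc.stdout, proc.stderr)
```

Keeping `interpret` separate from the subprocess call makes the error handling testable without network access or an API token.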

Usage

Command:

python3 scripts/webunlocker.py --url "https://example.com"

Examples:

# Scrape HTML content
python3 scripts/webunlocker.py --url "https://httpbin.io/get"

# Scrape plain text
python3 scripts/webunlocker.py --url "https://example.com" --response-type plaintext

# Scrape as Markdown
python3 scripts/webunlocker.py --url "https://example.com" --response-type markdown

# Take a screenshot
python3 scripts/webunlocker.py --url "https://example.com" --response-type png

# Capture network requests
python3 scripts/webunlocker.py --url "https://example.com" --response-type network

# Extract specific content types
python3 scripts/webunlocker.py --url "https://example.com" --response-type content --content-types emails,links,images

# Use a specific country proxy
python3 scripts/webunlocker.py --url "https://example.com" --country US

# Use POST method
python3 scripts/webunlocker.py --url "https://httpbin.org/post" --method POST --data '{"key": "value"}'

# Add custom headers
python3 scripts/webunlocker.py --url "https://example.com" --headers '{"User-Agent": "Mozilla/5.0"}'

# Use custom proxy
python3 scripts/webunlocker.py --url "https://example.com" --proxy-url "http://your-proxy-url:port"

# Enable JavaScript rendering
python3 scripts/webunlocker.py --url "https://example.com" --js-render

# Enable JavaScript rendering with headless mode
python3 scripts/webunlocker.py --url "https://example.com" --js-render --headless

# Enable JavaScript rendering and wait for a specific element
python3 scripts/webunlocker.py --url "https://example.com" --js-render --wait-selector "body > div > p:nth-child(3) > a"

# Bypass Cloudflare protection with JavaScript rendering
python3 scripts/webunlocker.py --url "https://example.com" --js-render

# Bypass Cloudflare Turnstile challenge
python3 scripts/webunlocker.py --url "https://2captcha.com/demo/cloudflare-turnstile-challenge" --js-render --headless --response-type markdown

Summary

| Argument | Description | Default |
| --- | --- | --- |
| --url | Target URL | Required |
| --method | HTTP method | GET |
| --redirect | Allow redirects | False |
| --headers | Custom headers as JSON string | None |
| --data | Request data as JSON string | None |
| --response-type | Response type (html, plaintext, markdown, png, jpeg, network, content) | html |
| --content-types | Content types to extract (comma-separated) | None |
| --country | Country code for proxy | ANY |
| --proxy-url | Custom proxy URL | None |
| --js-render | Enable JavaScript rendering | False |
| --headless | Run browser in headless mode | False |
| --wait-selector | Wait for element with this selector to appear | None |
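When the script is driven from another Python program, the arguments above can be assembled into an argument list instead of a hand-quoted shell string. The helper below is a hypothetical sketch: `build_command` is not part of the skill, the script path scripts/webunlocker.py is assumed, and `--redirect` is assumed to be a bare switch like `--js-render` and `--headless`.

```python
import json
from typing import Dict, List, Optional, Sequence


def build_command(
    url: str,
    *,
    method: Optional[str] = None,
    redirect: bool = False,
    headers: Optional[Dict[str, str]] = None,
    data: Optional[dict] = None,
    response_type: Optional[str] = None,
    content_types: Optional[Sequence[str]] = None,
    country: Optional[str] = None,
    proxy_url: Optional[str] = None,
    js_render: bool = False,
    headless: bool = False,
    wait_selector: Optional[str] = None,
    script: str = "scripts/webunlocker.py",
) -> List[str]:
    """Build an argv list; options left unset fall back to the script defaults."""
    cmd = ["python3", script, "--url", url]
    valued = [
        ("--method", method),
        # Headers and data are passed as JSON strings, as in the examples.
        ("--headers", json.dumps(headers) if headers is not None else None),
        ("--data", json.dumps(data) if data is not None else None),
        ("--response-type", response_type),
        ("--content-types", ",".join(content_types) if content_types else None),
        ("--country", country),
        ("--proxy-url", proxy_url),
        ("--wait-selector", wait_selector),
    ]
    for flag, value in valued:
        if value is not None:
            cmd += [flag, value]
    # Boolean switches are emitted only when enabled (assumed store-true flags).
    switches = [("--redirect", redirect), ("--js-render", js_render), ("--headless", headless)]
    cmd += [flag for flag, enabled in switches if enabled]
    return cmd
```

Passing the resulting list to `subprocess.run` without `shell=True` avoids quoting problems with JSON values and CSS selectors.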

Output: All commands return JSON objects with the scraped content or Cloudflare bypass results.

Response Types

HTML

Returns the HTML content of the page as an escaped string.

Plaintext

Returns the plain-text content of the page with all HTML tags removed.

Markdown

Returns the page content formatted as Markdown for better readability.

PNG/JPEG

Returns a base64-encoded string of the page screenshot.
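To turn such a string back into an image file, it has to be base64-decoded. The sketch below assumes the encoded string has already been pulled out of the JSON output (the field that carries it is not documented here) and tolerates an optional data-URI prefix as a precaution. `save_screenshot` is a hypothetical helper.

```python
import base64
from pathlib import Path


def save_screenshot(encoded: str, dest: Path) -> int:
    """Decode a base64 screenshot string, write it to dest, return the byte count."""
    # Strip a "data:image/png;base64," style prefix if one is present.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    raw = base64.b64decode(encoded)
    dest.write_bytes(raw)
    return len(raw)
```

For example, `save_screenshot(encoded, Path("page.png"))` writes the decoded bytes to page.png in the current directory.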

Network

Returns all network requests made during page load, including URLs, methods, status codes, and headers.

Content

Returns specific content types extracted from the page, such as emails, phone numbers, headings, images, audios, videos, links, hashtags, metadata, tables, and favicon.

Notes

⚠️ Timeout Policy:

Page load timeout: 30 seconds

Global execution timeout: 180 seconds

⚠️ Supported CAPTCHAs:

reCAPTCHA v2

Cloudflare Turnstile

Cloudflare Challenge

⚠️ Rate Limits:

A 429 error indicates that the rate limit has been exceeded. Reduce request frequency or upgrade your plan.
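One common way to reduce request frequency after a 429 is exponential backoff with a little jitter. The sketch below is generic and makes no claim about how the script surfaces the status code: `fetch` is a hypothetical callable returning a `(status, body)` pair, and the delay values are illustrative defaults, not documented limits.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry fetch() while it reports HTTP 429, doubling the wait each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = fetch()
        # Return on any non-429 status, or give up after the last attempt.
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Up to 10% jitter keeps parallel callers from retrying in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

If every attempt returns 429, the last `(status, body)` pair is handed back so the caller can decide whether to stop or upgrade the plan.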

⚠️ Billing:

Charges are applied on a per-request basis

Only successful requests will be billed
