Scrapling Web Extractor - Web Scraping and Markdown Converter

Name: Scrapling Web Extractor - Web Scraping and Markdown Converter
Author: yumiu8103-hue

v1.0.0

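Uses Scrapling to fetch one or more public web pages, extract the main content, and convert the HTML to Markdown with html2text. Supports static HTTP, concurrent async, stealth anti-bot (Camoufox/Firefox), and dynamic Playwright Chromium fetch modes, with production-grade automatch.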


Categories: browser automation, API tools, developer tools

License

MIT-0

Last updated: 2026/3/10

Security scan

VirusTotal: benign
OpenClaw: safe (high confidence)

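The skill's code, runtime instructions, and inputs are consistent with a web-page-to-Markdown scraper: it requests no unrelated credentials or system-wide permissions, and its behavior matches its description.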

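Assessment recommendations

The skill is internally consistent, but check the following before installing:

1. The script dynamically imports and depends on the external 'scrapling' package and Playwright; audit or trust these packages before installing.
2. Use stealth mode and proxies legitimately to reach pages behind anti-bot protection, and never to bypass login walls, CAPTCHAs, or paywalls, or to access restricted content (as stated in SKILL.md).
3. Installing Playwright downloads a Chromium binary; make sure you accept this download.
4. Proxy credentials passed at runtime are used to route requests; keep them secure and avoid supplying untrusted credentials.
5. The tool writes Markdown files and the automatch database to the output directory; review and manage these local files as needed.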

Detailed analysis

✓ Purpose and capabilities

Name, description, README, SKILL.md and the included Python script all align: they implement fetching public web pages (static or JS), extracting main content and converting HTML to Markdown. Features like stealth, proxies, Playwright, and automatch are legitimate for robust scraping and are consistent with the stated purpose.

ℹ Instruction scope

SKILL.md and the script limit network calls to user-supplied URLs and an optional proxy. The skill provides flags to enable stealth, proxying, and Playwright rendering; these are powerful but described and constrained (rules state not to bypass logins/paywalls). The code dynamically imports the 'scrapling' package at runtime, so actual fetching behavior depends on that external dependency.

✓ Install mechanism

No install spec is included (instruction-only); the README suggests installing third-party Python packages (scrapling, html2text, Playwright). That is a normal, low-risk pattern for an instruction-only Python skill, but it does mean the fetched packages and Playwright binaries will be installed separately by the user.

✓ Credential requirements

The skill declares no required environment variables or credentials. Proxy credentials can be supplied as runtime flags (appropriate for a scraper). The script's security manifest claims it reads only user-provided URL/file inputs and writes only to the chosen output directory and the Scrapling-managed local DB—no unexpected secrets are requested.

ℹ Persistence and permissions

always is false and the skill is user-invocable. It writes local output files and (per its manifest) a Scrapling automatch SQLite DB; this is reasonable for its functionality but does create persistent local artifacts that a user should be aware of.

Security is layered; review the code before running it.

License terms (MIT-0): free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

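Version

latest: v1.0.0 (2026/3/10)

Initial release.
- 4 fetch modes: http, async, stealth (Camoufox), dynamic (Playwright)
- CSS-selector-based content extraction with auto_save / auto_match
- Proxy support, plus humanize, geoip, and block-webrtc options
- --disable-resources and --block-images for faster fetching
- --retry N with exponential backoff
- Structured JSON output with per-page title, Markdown, and status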


Install command

Official: npx clawhub@latest install web-markdown-scraper

Mirror: npx clawhub@latest install web-markdown-scraper --registry https://cn.clawhub-mirror.com

Skill documentation

Use this skill when the user wants to:

Scrape one or more public webpages (static or JavaScript-rendered)
Convert HTML pages into clean Markdown
Extract article/body text for summarization, analysis, or indexing
Bypass anti-bot protections (Cloudflare, Datadome, etc.) via stealth mode
Scrape many URLs concurrently (async mode)
Track page elements reliably across website redesigns (automatch)
Save the extracted results as .md files

Fetcher Mode Selection Guide

Mode	Fetcher Class	Best For
`http` (default)	`Fetcher`	Fast static pages, RSS, APIs
`async`	`AsyncFetcher`	Batch of 5+ static URLs in parallel
`stealth`	`StealthyFetcher`	Anti-bot sites, Cloudflare, fingerprint checks
`dynamic`	`PlayWrightFetcher`	Heavy SPAs, React/Vue/Angular apps

Decision rule: Start with http. If you get a 403 / CAPTCHA / empty body, switch to stealth. If the content is rendered client-side (empty on first load), use dynamic. Use async when scraping many static URLs at once to save time.
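
The decision rule above can be sketched as a small helper. This is only an illustration: `initial_mode` and `next_mode` are hypothetical names that are not part of the skill, and the caller is assumed to detect a CAPTCHA page or client-side rendering by its own means.

```python
from typing import Optional, Sequence


def initial_mode(urls: Sequence[str]) -> str:
    """Pick a starting --mode: async for batches of 5+ static URLs, else http."""
    return "async" if len(urls) >= 5 else "http"


def next_mode(current: str, status: Optional[int], body: str,
              captcha_seen: bool = False, client_rendered: bool = False) -> str:
    """Suggest which --mode to try next after a fetch attempt."""
    if client_rendered:
        # Content only appears after JavaScript runs, so render it in a browser.
        return "dynamic"
    blocked = status == 403 or captcha_seen or not body.strip()
    if current in ("http", "async") and blocked:
        # 403 / CAPTCHA / empty body: escalate to the stealth fetcher.
        return "stealth"
    return current
```

A caller would start with `initial_mode(urls)` and re-run a failed URL with whatever `next_mode` returns.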

Inputs

URL sources

--url URL — one target URL (repeat flag for multiple: --url A --url B)
--url-file FILE — plain text file with one URL per line
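
As a rough sketch of preparing input for `--url-file`, the helper below reads one URL per line and rejects anything that is not http:// or https://. The names `is_http_url` and `load_urls` are hypothetical, and skipping blank lines is an assumption; the script's own parsing is not shown in this document.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """True only for http:// or https:// URLs that include a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def load_urls(path: str) -> list[str]:
    """Read a plain text file with one URL per line."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # assumption: blank lines are ignored
        if not is_http_url(candidate):
            raise ValueError(f"not an http(s) URL: {candidate!r}")
        urls.append(candidate)
    return urls
```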

Fetcher

--mode http|async|stealth|dynamic — fetcher backend (default: http)

Content extraction

--selector CSS — CSS selector for the main content area (omit = full page)
--preserve-links — keep hyperlinks in the Markdown output
--output-dir DIR — save per-page .md files and a master index.json here

AutoMatch — production resilience

--auto-save — fingerprint & persist selected elements to the local DB on first run
--auto-match — on subsequent runs, find elements by fingerprint even if the site layout has changed (no need to update the CSS selector)

Browser options (stealth / dynamic only)

--headless true|false|virtual — headless mode; virtual uses Xvfb (default: true)
--network-idle — wait until no network activity for ≥500 ms before capturing
--block-images — block image loading (saves bandwidth and proxy quota)
--disable-resources — drop fonts/images/media/stylesheets for ~25% faster loads
--wait-selector CSS — pause until this element appears in the DOM
--wait-selector-state attached|visible|detached|hidden — element state (default: attached)
--timeout MS — global timeout in ms (default: 30000)
--wait MS — extra idle wait after page load in ms
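
Since these options apply to stealth and dynamic modes only, a wrapper could flag them when another mode is selected. The sketch below is a hypothetical pre-flight check (`misplaced_browser_flags` is not part of the skill); whether the script itself rejects or silently ignores such flags is not stated here.

```python
BROWSER_MODES = {"stealth", "dynamic"}

# Flags listed under "Browser options (stealth / dynamic only)".
BROWSER_ONLY_FLAGS = {
    "--headless", "--network-idle", "--block-images", "--disable-resources",
    "--wait-selector", "--wait-selector-state", "--timeout", "--wait",
}


def misplaced_browser_flags(mode: str, args: list[str]) -> list[str]:
    """Return the browser-only flags in args when mode is not a browser mode."""
    if mode in BROWSER_MODES:
        return []
    return [arg for arg in args if arg in BROWSER_ONLY_FLAGS]
```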

StealthyFetcher extras (stealth mode only)

--humanize SECONDS — simulate human-like cursor movement (max duration in seconds)
--geoip — spoof browser timezone, locale, language, and WebRTC IP from proxy geolocation
--block-webrtc — prevent real-IP leaks via WebRTC
--disable-ads — install uBlock Origin in the browser session
--proxy URL — HTTP/SOCKS proxy as a URL string, or JSON:

'{"server":"host:port","username":"u","password":"p"}'
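
One plausible way to accept both proxy forms is to treat a value that starts with `{` as JSON and anything else as a URL string. This is a sketch under that assumption; `parse_proxy` is a hypothetical helper, and the script's actual handling is not shown in this document.

```python
import json
from typing import Union


def parse_proxy(value: str) -> Union[str, dict]:
    """Return a proxy URL string, or a dict when the value is a JSON object."""
    text = value.strip()
    if text.startswith("{"):
        proxy = json.loads(text)  # raises ValueError on malformed JSON
        if "server" not in proxy:
            raise ValueError("JSON proxy needs a 'server' key")
        return proxy
    return text
```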

Reliability

--retry N — retry failed requests up to N times with exponential backoff (max 30 s)
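
Exponential backoff with a 30 s ceiling can be sketched as follows. The 1-second base delay, the absence of jitter, and reading "max 30 s" as a per-delay cap are all assumptions; `backoff_delays` and `with_retry` are hypothetical names, not the script's own functions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays before each retry: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def with_retry(fn: Callable[[], T], retries: int,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn once, then retry up to `retries` times, sleeping between attempts."""
    for delay in [*backoff_delays(retries), None]:
        try:
            return fn()
        except Exception:
            if delay is None:
                raise  # retries exhausted: surface the last error unchanged
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.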

Rules

Only process public http:// or https:// pages.
Never bypass login walls, CAPTCHAs, paywalls, or access controls.
Prefer the main article or body content; avoid polluting the output with navigation, headers, footers, or cookie banners — use --selector to target the content area.
When --auto-save is used, always also pass --selector so Scrapling knows which element fingerprint to record.
On subsequent runs for layout-changed pages, use --auto-match instead of --auto-save. Do not use both flags at once.
Use --mode async for batch jobs with 5+ static URLs for parallel execution.
Combine --disable-resources with --block-images in stealth/dynamic mode when you only need text content — this can cut load times by up to 40%.
Always inspect the top-level ok field and per-result ok fields before using content.
If ok is false, report the exact error string — do not invent or guess content.
When --network-idle is insufficient, use --wait-selector for a specific DOM element to guarantee the content has loaded before capture.
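
Two of the rules above concern the AutoMatch flags and can be checked mechanically before a run. The helper below is a hypothetical sketch (`check_automatch` is not part of the skill) that inspects a planned argument list and reports violations.

```python
def check_automatch(args: list[str]) -> list[str]:
    """Return a list of AutoMatch rule violations for a planned argument list."""
    problems = []
    if "--auto-save" in args and "--auto-match" in args:
        problems.append("do not pass --auto-save and --auto-match together")
    if "--auto-save" in args and "--selector" not in args:
        problems.append("--auto-save requires --selector")
    return problems
```

An empty list means the arguments satisfy both rules.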

Command Patterns

Basic static page

python3 "{baseDir}/scrape_to_markdown.py" --url ""

Static page — target specific content area

python3 "{baseDir}/scrape_to_markdown.py" --url "" --selector "article.main-content"

Stealth mode — bypass anti-bot protection

python3 "{baseDir}/scrape_to_markdown.py" --url "" --mode stealth --network-idle

Stealth + proxy + human fingerprint (maximum stealth)

python3 "{baseDir}/scrape_to_markdown.py" \
  --url "" \
  --mode stealth \
  --proxy "http://user:pass@host:port" \
  --humanize 2.0 \
  --geoip \
  --block-webrtc \
  --network-idle

Dynamic SPA page (Playwright Chromium)

python3 "{baseDir}/scrape_to_markdown.py" \
  --url "" \
  --mode dynamic \
  --wait-selector ".product-list" \
  --network-idle \
  --disable-resources

Async concurrent batch (multiple URLs)

python3 "{baseDir}/scrape_to_markdown.py" \
  --mode async \
  --url "" --url "" --url ""

Batch from file + stealth + save to disk

python3 "{baseDir}/scrape_to_markdown.py" \
  --url-file urls.txt \
  --mode stealth \
  --disable-resources \
  --output-dir outputs

First-run automatch setup (save fingerprint)

python3 "{baseDir}/scrape_to_markdown.py" \
  --url "" \
  --selector ".article-body" \
  --auto-save \
  --output-dir outputs

Subsequent run after site layout change (adaptive match)

python3 "{baseDir}/scrape_to_markdown.py" \
  --url "" \
  --selector ".article-body" \
  --auto-match \
  --output-dir outputs

Full production scrape

python3 "{baseDir}/scrape_to_markdown.py" \
  --url "" \
  --mode stealth \
  --selector "main article" \
  --auto-match \
  --preserve-links \
  --network-idle \
  --disable-resources \
  --timeout 60000 \
  --retry 3 \
  --output-dir outputs
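
For callers that drive the script from Python rather than a shell, the command patterns above could be assembled as an argument list. This is a sketch under assumptions: `build_command` and `run_scraper` are hypothetical helpers, `base_dir` stands in for the `{baseDir}` placeholder, and the script's exit-code behavior is not documented here, so the runner relies on the JSON printed to stdout.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import Optional


def build_command(base_dir: str, urls: list[str], mode: str = "http",
                  selector: Optional[str] = None,
                  output_dir: Optional[str] = None) -> list[str]:
    """Assemble an argv list using only flags documented in the Inputs section."""
    script = str(Path(base_dir) / "scrape_to_markdown.py")
    cmd = [sys.executable, script, "--mode", mode]
    for url in urls:
        cmd += ["--url", url]  # --url is repeated once per target
    if selector:
        cmd += ["--selector", selector]
    if output_dir:
        cmd += ["--output-dir", output_dir]
    return cmd


def run_scraper(cmd: list[str]) -> dict:
    """Run the script and parse the JSON report it prints to stdout."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return json.loads(proc.stdout)  # raises ValueError if stdout is not JSON
```

Passing a list to `subprocess.run` avoids shell quoting issues with selectors such as `main article`.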

Output Handling

JSON is printed to stdout. Always check ok before using content.
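
A minimal sketch of that check, using the field names listed in this section. `split_results` is a hypothetical helper, and it assumes every failed result carries an `error` string as described below.

```python
def split_results(report: dict) -> tuple[bool, list[tuple[str, str]], list[str]]:
    """Separate usable pages from failures in the script's JSON report."""
    all_ok = report.get("ok") is True
    pages = []   # (url, markdown) pairs for results with ok: true
    errors = []  # verbatim error strings for results with ok: false
    for item in report.get("results", []):
        if item.get("ok"):
            pages.append((item["url"], item["markdown"]))
        else:
            errors.append(item["error"])  # report as-is, never guess content
    return all_ok, pages, errors
```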

Top-level fields:

ok — true only if every URL succeeded
total / succeeded / failed — count summary
results — array of per-URL result objects
output_index_file — path to saved index.json (if --output-dir used)

Per-URL result fields (when ok: true):

url — the requested URL
status — HTTP status code (e.g. 200)
title — page <title> text
markdown — extracted content as Markdown ← use this as main content
markdown_length — character count (useful for quality checks)
output_markdown_file — path to saved .md file (if --output-dir used)

On failure (ok: false in a result):

error — exact error message; report this verbatim, do not invent content

Data source: ClawHub (https://clawhub.ai/yumiu8103-hue/web-markdown-scraper) · Chinese localization: 龙虾技能库
