FlowCrawl: Stealth Web Scraper That Bypasses Everything
v1.1.0
Stealth web scraper. Give it any URL and it punches through Cloudflare, bot detection, and WAFs automatically using a 3-tier cascade (plain HTTP → TLS spoof → full JS). No API keys, no proxies, no CDP Chrome. Free from the Flow team. Use when scraping any website, bypassing bot protection, spidering a full site, or extracting clean markdown from any page.
FlowCrawl
Scrape any website. Bypass any bot protection. Free.
Install Scrapling First
    pip install scrapling
Scrapling installs Playwright automatically on first run. That's the only dependency.
Quick Usage
    # Single URL: prints clean markdown to stdout
    python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com

    # Spider the whole site
    python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com --deep

    # Deep crawl with limits, save and combine
    python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com --deep --limit 30 --combine

    # JSON output: pipe into anything
    python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com --json
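The --deep mode can be pictured as a breadth-first crawl that stays on the start URL's host and stops at a hop depth and a page count. The sketch below is an illustration under assumptions, not FlowCrawl's actual code: `crawl`, `internal_links`, `SITE`, and `fake_fetch` are hypothetical names, and the page fetcher is passed in so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def internal_links(base_url, html):
    """Return absolute, fragment-free links that stay on base_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        absolute = urldefrag(urljoin(base_url, href)).url
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            links.append(absolute)
    return links


def crawl(start_url, fetch_html, max_depth=3, max_pages=50):
    """Breadth-first crawl; returns {url: html} in visit order."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        html = fetch_html(url)
        pages[url] = html
        if depth >= max_depth:
            continue  # page is kept, but its links are not followed
        for link in internal_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


# Hypothetical in-memory site standing in for real HTTP responses.
SITE = {
    "https://example.com/": '<a href="/a">A</a> <a href="https://other.test/x">ext</a>',
    "https://example.com/a": '<a href="/b#top">B</a> <a href="/">home</a>',
    "https://example.com/b": '<a href="/c">C</a>',
    "https://example.com/c": "end",
}


def fake_fetch(url):
    return SITE.get(url, "")


for visited in crawl("https://example.com/", fake_fetch, max_depth=2):
    print(visited)
```

With `max_depth=2` the page at `/c` is never fetched, because it sits three hops from the start URL; the external link to `other.test` is skipped at every depth.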
Add Alias (Recommended)
    echo 'alias flowcrawl="python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py"' >> ~/.zshrc
    source ~/.zshrc
Then just:
    flowcrawl https://example.com
How It Works
FlowCrawl uses a 3-tier fetcher cascade. It starts fast and escalates only when blocked:

| Tier | Method | Handles |
|------|--------|---------|
| 1 | Plain HTTP | Most sites, instant |
| 2 | Stealth + TLS spoof | Cloudflare, Imperva, basic WAFs |
| 3 | Full JS execution | SPAs, heavy JS, aggressive bot detection |

It auto-detects blocking (403, 503, "Just a moment...") and escalates silently.
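The escalation logic described above can be sketched roughly as follows. This is a minimal illustration, not FlowCrawl's implementation: `Page`, `looks_blocked`, `fetch_with_cascade`, and the three stub fetchers are hypothetical names, and the stubs return canned responses in place of a real HTTP client, TLS-impersonating client, and headless browser. Only the block signals (403, 503, "Just a moment...") come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Page:
    status: int
    body: str


# A fetcher maps a URL to a Page; each tier supplies one.
Fetcher = Callable[[str], Page]

BLOCK_STATUSES = {403, 503}
BLOCK_MARKERS = ("just a moment...",)  # challenge-page text, matched case-insensitively


def looks_blocked(page: Page) -> bool:
    """Heuristic: a blocking status code or a challenge-page marker in the body."""
    if page.status in BLOCK_STATUSES:
        return True
    body = page.body.lower()
    return any(marker in body for marker in BLOCK_MARKERS)


def fetch_with_cascade(url: str, tiers: List[Tuple[str, Fetcher]]) -> Tuple[str, Page]:
    """Try tiers in order; return the first unblocked result, else the last attempt."""
    if not tiers:
        raise ValueError("at least one tier is required")
    for name, fetch in tiers:
        page = fetch(url)
        if not looks_blocked(page):
            return name, page
    return name, page  # every tier was blocked; hand back the final attempt


# Stub fetchers (assumptions for the demo, no network access).
def plain_http(url: str) -> Page:
    return Page(403, "Forbidden")


def tls_spoof(url: str) -> Page:
    return Page(200, "<title>Just a moment...</title>")


def full_js(url: str) -> Page:
    return Page(200, "<h1>Example Domain</h1>")


TIERS = [("plain", plain_http), ("stealth", tls_spoof), ("js", full_js)]

tier, page = fetch_with_cascade("https://example.com", TIERS)
print(tier, page.status)  # prints: js 200
```

Ordering the tiers from cheapest to most expensive means the browser-based tier runs only for pages where the lighter fetchers hit a block signal.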
All Options

| Flag | Description | Default |
|------|-------------|---------|
| --deep | Spider whole site following internal links | off |
| --depth N | Max hop depth from start URL | 3 |
| --limit N | Max pages to crawl | 50 |
| --combine | Merge all pages into one file | off |
| --format md\|txt | Output format | md |
| --output DIR | Output directory | ./flowcrawl-output |
| --json | Structured JSON output | off |
| --quiet | Suppress progress logs | off |
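To make the flags and defaults concrete, here is one way such a command line could be declared with `argparse`. It is a sketch that mirrors the table, assuming the URL is a single positional argument; whether the real script uses `argparse` or these exact declarations is not known, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with the documented defaults."""
    parser = argparse.ArgumentParser(
        prog="flowcrawl",
        description="Sketch of the FlowCrawl command line (illustrative only).",
    )
    parser.add_argument("url", help="start URL")
    parser.add_argument("--deep", action="store_true",
                        help="spider whole site following internal links")
    parser.add_argument("--depth", type=int, default=3, metavar="N",
                        help="max hop depth from start URL")
    parser.add_argument("--limit", type=int, default=50, metavar="N",
                        help="max pages to crawl")
    parser.add_argument("--combine", action="store_true",
                        help="merge all pages into one file")
    parser.add_argument("--format", choices=("md", "txt"), default="md",
                        help="output format")
    parser.add_argument("--output", default="./flowcrawl-output", metavar="DIR",
                        help="output directory")
    parser.add_argument("--json", action="store_true",
                        help="structured JSON output")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress progress logs")
    return parser


# Parse the "deep crawl with limits" invocation from Quick Usage.
args = build_parser().parse_args(
    ["https://example.com", "--deep", "--limit", "30", "--combine"]
)
print(args.depth, args.limit, args.format, args.output)
# prints: 3 30 md ./flowcrawl-output
```

Flags left off the command line fall back to the defaults in the table, so only `--limit` differs from its default in this invocation.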