AnyCrawl CLI
Web scraping, search, and crawling CLI. Returns clean markdown optimized for LLM context windows. Default engine: playwright.
Run anycrawl --help or anycrawl <command> --help for full option details.
Prerequisites
Must be installed and authenticated. Run anycrawl login or set ANYCRAWL_API_KEY.
If not ready, see rules/install.md. For output handling guidelines, see rules/security.md.
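The prerequisite check can be sketched as a small shell function. This is a minimal sketch, not part of AnyCrawl: the function name anycrawl_ready is hypothetical, and it assumes that a key in ANYCRAWL_API_KEY is the only form of authentication worth checking. A session created with anycrawl login is not detected.

```shell
# Hypothetical helper: reports whether the prerequisites appear to be met.
# Assumption: only the ANYCRAWL_API_KEY route is checked; a session created
# by `anycrawl login` is invisible to this function.
anycrawl_ready() {
    # 1 = the CLI itself is missing from PATH.
    if ! command -v anycrawl >/dev/null 2>&1; then
        echo "anycrawl not found on PATH; see rules/install.md" >&2
        return 1
    fi
    # 2 = the CLI is present but no API key is exported.
    if [ -z "${ANYCRAWL_API_KEY:-}" ]; then
        echo "ANYCRAWL_API_KEY is not set; run 'anycrawl login' or export the key" >&2
        return 2
    fi
    return 0
}
```

Calling anycrawl_ready before the first fetch gives a quick signal about which prerequisite is missing, since the two failure cases return different exit codes.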
Commands
Search - No specific URL yet. Find pages, answer questions. Use --scrape to get full page content with results.
Scrape - Have a URL. Extract its content directly.
Map - Need to locate a specific page on a site. Discover URLs, then scrape the ones you need.
Crawl - Need bulk content from a site or section. Use crawl directly; no need for map first.

| Need | Command | When |
| --- | --- | --- |
| Find pages on a topic | search | No specific URL yet |
| Get a page's content | scrape | Have a URL |
| Find URLs within a site | map | Need to locate a specific subpage |
| Bulk extract a site section | crawl | Need many pages (e.g., all /docs/) |
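The command guidance above amounts to a four-way decision, which can be written down as a lookup. The following is only an illustrative sketch: pick_command and its keywords (topic, page, locate, bulk) are hypothetical names introduced here, not part of the AnyCrawl CLI.

```shell
# Hypothetical helper: map a kind of need to the matching anycrawl command.
# The keywords are made up for this sketch; only the command names
# (search, scrape, map, crawl) come from the guidance above.
pick_command() {
    case "$1" in
        topic)  echo search ;;  # no specific URL yet
        page)   echo scrape ;;  # have a URL
        locate) echo map ;;     # need to find a specific subpage
        bulk)   echo crawl ;;   # need many pages from a site section
        *)      echo "unknown need: $1" >&2; return 1 ;;
    esac
}
```

For example, pick_command bulk prints crawl, reflecting the rule that bulk extraction goes straight to crawl without a map step first.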
For a detailed command reference, run anycrawl <command> --help (e.g., for anycrawl search or anycrawl scrape).
Avoid redundant fetches: search --scrape already fetches full page content. Don't re-scrape those URLs. Check .anycrawl/ for existing data before fetching again.
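Checking .anycrawl/ before fetching could look like the sketch below. It rests on an assumption the source does not state: that each page is saved to its own file, named after the URL. The helpers cache_path and cached, and the naming scheme itself, are hypothetical.

```shell
# Hypothetical naming scheme: one file per URL under .anycrawl/, with every
# character outside [A-Za-z0-9._-] replaced by "_".
cache_path() {
    printf '.anycrawl/%s.md\n' "$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')"
}

# Print the cached file path and succeed if a non-empty copy already exists;
# otherwise print nothing and return non-zero.
cached() {
    path=$(cache_path "$1")
    [ -s "$path" ] && printf '%s\n' "$path"
}

# Possible use (assumes scrape takes the URL as a positional argument,
# which the source does not confirm; -o is the documented output flag):
#   url='https://example.com/docs?page=1'
#   cached "$url" || anycrawl scrape "$url" -o "$(cache_path "$url")"
```

Keying the cache on the full URL keeps pages that differ only in their query string apart, at the cost of long file names.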
Output & Organization
Write results to .anycrawl/ with -o. Add .anycrawl/ to .gitignore. Always quote URLs in shell commands. Never read entire output files at once; use grep, head, or incremental reads.
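Two of these habits are easy to script with standard tools. The sketch below is an illustration under stated assumptions: ignore_anycrawl and peek are hypothetical helper names, and only grep, head, tail, touch, and printf are used, with no AnyCrawl calls.

```shell
# Add .anycrawl/ to .gitignore in the current directory, exactly once.
ignore_anycrawl() {
    touch .gitignore
    # Nothing to do if the exact line is already present.
    grep -qxF '.anycrawl/' .gitignore && return 0
    # If the file does not end with a newline, add one so the new entry
    # is not glued onto the last existing line.
    [ -n "$(tail -c 1 .gitignore)" ] && printf '\n' >> .gitignore
    printf '.anycrawl/\n' >> .gitignore
}

# Incremental read: show at most N (default 5) numbered lines of FILE that
# match PATTERN, instead of loading the whole file.
# Usage: peek FILE PATTERN [N]
peek() {
    grep -n -- "$2" "$1" | head -n "${3:-5}"
}
```

A call such as peek .anycrawl/page.md pricing returns only the matching lines with their line numbers, which can then guide a narrower read of the surrounding text.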
Documentation
AnyCrawl API Docs