Lightpanda Browser

v1.0.0

Lightpanda is a lightweight headless browser written in Zig. For web scraping and content extraction, it is reported to be 9x faster and 16x more memory-efficient than headless Chrome.

by @smseow001 (SMS)·MIT-0

Tags: API development, network tools, browser automation, CI/CD, DevOps

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install lightpanda

Mirror: npx clawhub@latest install lightpanda --registry https://cn.longxiaskill.com

Skill documentation

Lightpanda: Lightweight Headless Browser

Introduction

Lightpanda is a lightweight headless browser written in Zig. It is not a Chromium fork.

Performance comparison:

Metric               Lightpanda   Headless Chrome   Difference
Memory (100 pages)   123MB        2GB               16x less memory
Speed (100 pages)    5s           46s               9x faster

Installation

    # Linux
    curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-x86_64-linux && \
    chmod a+x ./lightpanda

    # macOS
    curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-aarch64-macos && \
    chmod a+x ./lightpanda
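The two install commands differ only in the release asset name, so the choice can be scripted. The following is a minimal sketch, assuming only the two assets named above exist for this purpose; `KNOWN_ASSETS`, `RELEASE_BASE`, and `nightly_url` are hypothetical names introduced here, and platforms not listed in the install commands are rejected rather than guessed.

```python
import platform

# Asset names are taken from the install commands above. Other platforms
# are deliberately left unmapped rather than guessed.
KNOWN_ASSETS = {
    ("Linux", "x86_64"): "lightpanda-x86_64-linux",
    ("Darwin", "arm64"): "lightpanda-aarch64-macos",
}

RELEASE_BASE = "https://github.com/lightpanda-io/browser/releases/download/nightly"


def nightly_url(system=None, machine=None):
    """Return the nightly download URL for a known platform, else raise."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        asset = KNOWN_ASSETS[(system, machine)]
    except KeyError:
        raise RuntimeError(
            f"no known Lightpanda build for {system}/{machine}"
        ) from None
    return f"{RELEASE_BASE}/{asset}"
```

Calling `nightly_url()` with no arguments inspects the current machine; passing `system` and `machine` explicitly is useful when preparing a download for another host.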

Usage

Basic Commands

    # Show the version
    ./lightpanda version

    # Fetch a page as HTML
    ./lightpanda fetch --obey-robots --dump html --log-format pretty --log-level info <URL>

    # Fetch a page as Markdown (recommended)
    ./lightpanda fetch --obey-robots --dump markdown --log-format pretty --log-level info <URL>

    # Wait for the page to load before dumping
    ./lightpanda fetch --obey-robots --dump markdown --wait-ms 3000 <URL>

    # Wait for a specific element
    ./lightpanda fetch --obey-robots --dump markdown --wait-selector ".content" <URL>

Python Integration

    import subprocess


    def fetch_url(url, format="markdown", wait_ms=2000):
        """Fetch a web page with Lightpanda and return its dumped content."""
        # Anything other than "markdown" falls back to HTML output.
        output_format = "markdown" if format == "markdown" else "html"
        cmd = [
            "./lightpanda", "fetch",
            "--obey-robots",
            "--dump", output_format,
            "--wait-ms", str(wait_ms),
            "--log-format", "pretty",
            url,
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.stdout


    # Usage example
    content = fetch_url("https://example.com", "markdown")
    print(content)
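The function above returns stdout whether or not the process succeeded, so a failed fetch looks like an empty page. A minimal sketch of a stricter variant follows. It assumes that the binary exits with a non-zero status on failure, which this page does not confirm. `FetchError` and `fetch_checked` are hypothetical names, and the `run` parameter exists only so the subprocess call can be replaced in tests.

```python
import subprocess


class FetchError(RuntimeError):
    """Raised when the lightpanda process fails, is missing, or times out."""


def fetch_checked(url, binary="./lightpanda", wait_ms=2000, timeout_s=60,
                  run=subprocess.run):
    """Fetch a page as Markdown and raise FetchError instead of returning ''."""
    cmd = [
        binary, "fetch",
        "--obey-robots",
        "--dump", "markdown",
        "--wait-ms", str(wait_ms),
        url,
    ]
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        raise FetchError(f"timed out after {timeout_s}s: {url}") from exc
    except FileNotFoundError as exc:
        raise FetchError(f"binary not found: {binary}") from exc
    # Assumption: a non-zero exit status signals a failed fetch.
    if result.returncode != 0:
        raise FetchError(
            f"exit code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout
```

The `timeout_s` limit is enforced by Python, independently of `--wait-ms`, so a hung process cannot stall a larger pipeline.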

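Use Cases

Scenario                  Description
🌐 Web scraping           Lightweight and fast; well suited to batch scraping
📄 Content extraction     Converts pages to Markdown for easier downstream processing
🔍 Competitor analysis    Scrape page content on a schedule
📰 News aggregation       Scrape article content
📊 Data monitoring        Monitor web pages for changes

Notes

- No Chrome required: Lightpanda is a standalone binary and does not depend on a system browser.
- CDP protocol: Puppeteer and Playwright can connect to it (advanced usage).
- Respect robots.txt: pass --obey-robots by default.
- Output format: --dump markdown is recommended because it is easier to process downstream.

Docker Deployment

    docker run -d --name lightpanda -p 127.0.0.1:9222:9222 lightpanda/browser:nightly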
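After the container starts, a CDP client may try to connect before the port is accepting connections. The sketch below polls the published port with the standard library only. It assumes the container listens on 127.0.0.1:9222 as published by the Docker command above, and that clients address it as a `ws://` endpoint. `wait_for_port` and `cdp_ws_endpoint` are hypothetical helper names, not part of Lightpanda.

```python
import socket
import time


def cdp_ws_endpoint(host="127.0.0.1", port=9222):
    """Build the WebSocket endpoint string a CDP client would be given."""
    return f"ws://{host}:{port}"


def wait_for_port(host="127.0.0.1", port=9222, timeout_s=10.0, interval_s=0.2):
    """Return True once a TCP connection succeeds, False if timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only proves something is listening;
            # it does not verify that the listener speaks CDP.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A typical use is to call `wait_for_port()` right after `docker run` and only then hand `cdp_ws_endpoint()` to a Puppeteer or Playwright connection call.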

Examples

Fetch a page and save it

    ./lightpanda fetch --obey-robots --dump markdown --log-format pretty --log-level info https://news.ycombinator.com > output.md

Batch scraping

    import subprocess
    import time

    urls = [
        "https://example.com/page1",
        "https://example.com/page2",
        "https://example.com/page3",
    ]

    for url in urls:
        print(f"Fetching: {url}")
        result = subprocess.run(
            ["./lightpanda", "fetch", "--obey-robots", "--dump", "markdown",
             "--wait-ms", "2000", url],
            capture_output=True, text=True,
        )
        # Process result.stdout here.
        time.sleep(1)  # polite delay between requests
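The loop above leaves the processing step open. One possible way to fill it in is to write each page to its own Markdown file and collect the URLs that failed, as sketched below. The file-naming scheme in `slug_for`, the `scraped` output directory, and the treatment of a non-zero exit status or empty output as a failure are all assumptions made for illustration.

```python
import re
import subprocess
import time
from pathlib import Path
from urllib.parse import urlparse


def slug_for(url):
    """Derive a filesystem-safe file stem from a URL (hypothetical scheme)."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}".strip("/")
    return re.sub(r"[^A-Za-z0-9._-]+", "_", raw) or "index"


def scrape_all(urls, out_dir="scraped", delay_s=1.0, run=subprocess.run):
    """Fetch each URL as Markdown; return (saved_paths, failed_urls)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    for i, url in enumerate(urls):
        if i:
            time.sleep(delay_s)  # polite delay between requests
        result = run(
            ["./lightpanda", "fetch", "--obey-robots", "--dump", "markdown",
             "--wait-ms", "2000", url],
            capture_output=True, text=True,
        )
        # Assumption: non-zero exit status or empty output means failure.
        if result.returncode == 0 and result.stdout.strip():
            path = out / f"{slug_for(url)}.md"
            path.write_text(result.stdout, encoding="utf-8")
            saved.append(path)
        else:
            failed.append(url)
    return saved, failed
```

Returning the failed URLs separately makes it straightforward to retry them later without fetching the successful pages again.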

Combining with LangChain / document processing

    import subprocess


    def scrape_for_rag(url):
        """Fetch a web page as Markdown for RAG processing."""
        result = subprocess.run(
            ["./lightpanda", "fetch", "--obey-robots", "--dump", "markdown",
             "--wait-ms", "3000", url],
            capture_output=True, text=True,
        )
        return result.stdout
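RAG pipelines usually split a document into chunks before embedding it. The sketch below shows one simple way to do that with the Markdown returned by `scrape_for_rag`, using only the standard library. It is not LangChain's splitter: `split_markdown` and the `max_chars` limit are hypothetical, and splitting on heading lines is just one reasonable heuristic.

```python
import re


def split_markdown(markdown, max_chars=1000):
    """Split Markdown at ATX headings, then cap each chunk at max_chars."""
    # Zero-width split before any line that starts with 1-6 '#' and a space
    # (splitting on empty matches requires Python 3.7 or later).
    sections = re.split(r"^(?=#{1,6} )", markdown, flags=re.MULTILINE)
    chunks = []
    for section in sections:
        section = section.strip()
        while len(section) > max_chars:
            # Prefer to cut at the last paragraph break inside the limit.
            cut = section.rfind("\n\n", 0, max_chars)
            if cut <= 0:
                cut = max_chars  # no paragraph break: hard cut
            chunks.append(section[:cut].strip())
            section = section[cut:].strip()
        if section:
            chunks.append(section)
    return chunks
```

Each returned chunk begins with its heading where one exists, which keeps some context attached to the text when chunks are embedded separately.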
