Agent Browser
v0.1.0
Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "log in to a site", "automate browser actions", or any task requiring programmatic web interaction.
Browser Automation with agent-browser
The CLI drives Chrome/Chromium directly via CDP. Install it with npm i -g agent-browser, brew install agent-browser, or cargo install agent-browser. Then run agent-browser install to download Chrome.
Core Workflow
Every browser automation follows this pattern:
1. Navigate: agent-browser open <url>
2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
3. Interact: Use refs to click, fill, select
4. Re-snapshot: After navigation or DOM changes, get fresh refs

    agent-browser open https://example.com/form
    agent-browser snapshot -i
    # Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

    agent-browser fill @e1 "user@example.com"
    agent-browser fill @e2 "password123"
    agent-browser click @e3
    agent-browser wait --load networkidle
    agent-browser snapshot -i  # Check result
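The re-snapshot step is easy to forget, so one option is to bundle it with the interaction that triggers navigation. The following is a minimal sketch, not part of agent-browser: the function name click_and_resnapshot is hypothetical, and it uses only the click, wait, and snapshot commands shown above.

```shell
# click_and_resnapshot REF
# Hypothetical helper (not provided by agent-browser): click an element,
# wait for the network to go idle, then print a fresh snapshot so that
# later commands use up-to-date refs.
click_and_resnapshot() {
  # "$1" is a ref from the most recent snapshot, e.g. @e3
  agent-browser click "$1" &&
    agent-browser wait --load networkidle &&
    agent-browser snapshot -i
}
```

Calling click_and_resnapshot @e3 after filling the form would submit it and print the refs of the resulting page. Refs from the earlier snapshot should be treated as stale at that point.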
Command Chaining
Commands can be chained with && in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.

    # Chain open + wait + snapshot in one call
    agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

    # Chain multiple interactions
    agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3

    # Navigate and capture
    agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png

When to chain: Use && when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
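To illustrate the "run separately, then parse" case, here is a rough sketch of pulling a ref out of captured snapshot output. It rests on assumptions that are not confirmed by this document: that the snapshot prints one element per line and that each line contains a ref token of the form @e followed by digits. The helper name find_ref is hypothetical.

```shell
# find_ref LABEL
# Hypothetical helper: read snapshot text on stdin and print the ref found
# on the first line that contains LABEL.
# Assumes one element per line and refs shaped like @e1, @e2, ...
find_ref() {
  grep -F -- "$1" |
    sed -n 's/.*\(@e[0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Possible usage (requires agent-browser and an open page):
#   snapshot=$(agent-browser snapshot -i)
#   submit_ref=$(printf '%s\n' "$snapshot" | find_ref "Submit")
#   agent-browser click "$submit_ref"
```

Because the snapshot has to be read before the click can be issued, these steps cannot be joined into a single && chain.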
Handling Authentication
When automating a site that requires login, choose the approach that fits:
Option 1: Import auth from the user's browser (fastest for one-off tasks)

    # Connect to the user's running Chrome (they're already logged in)
    agent-browser --auto-connect state save ./auth.json
    # Use that auth state
    agent-browser --state ./auth.json open https://app.example.com/dashboard

State files contain session tokens in plaintext: add them to .gitignore and delete them when no longer needed. Set AGENT_BROWSER_ENCRYPTION_KEY for encryption at rest.
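The .gitignore advice above can be applied with a few standard shell commands. This is a minimal sketch that assumes it is run from the repository root; the function name protect_state_file is hypothetical, and restricting file permissions is an extra precaution that the original text does not mention.

```shell
# protect_state_file FILE
# Hypothetical helper: make a saved state file readable only by its owner
# and make sure its name is listed in ./.gitignore (added at most once).
protect_state_file() {
  chmod 600 "$1" || return 1
  name=$(basename "$1")
  touch .gitignore
  # -x matches whole lines, -F treats the name as a literal string
  grep -qxF -- "$name" .gitignore || printf '%s\n' "$name" >> .gitignore
}
```

For example, protect_state_file ./auth.json could be run right after agent-browser state save ./auth.json. Deleting the file once the task is done is still the safer choice.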
Option 2: Persistent profile (simplest for recurring tasks)

    # First run: log in manually or via automation
    agent-browser --profile ~/.myapp open https://app.example.com/login
    # ... fill credentials, submit ...

    # All future runs: already authenticated
    agent-browser --profile ~/.myapp open https://app.example.com/dashboard

Option 3: Session name (auto-save/restore cookies + localStorage)

    agent-browser --session-name myapp open https://app.example.com/login
    # ... login flow ...
    agent-browser close  # State auto-saved

    # Next time: state auto-restored
    agent-browser --session-name myapp open https://app.example.com/dashboard

Option 4: Auth vault (credentials stored encrypted, login by name)

    echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin
    agent-browser auth login myapp
Option 5: State file (manual save/load)

    # After logging in:
    agent-browser state save ./auth.json
    # In a future session:
    agent-browser state load ./auth.json
    agent-browser open https://app.example.com/dashboard
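The manual load step can be made conditional, so that the same call works whether or not a state file has been saved yet. The sketch below uses only the state load and open commands shown above; the function name open_with_saved_state is hypothetical.

```shell
# open_with_saved_state STATE_FILE URL
# Hypothetical helper: load a previously saved state file if it exists,
# then navigate. When the file is missing, it navigates without loading
# anything, so the page may redirect to a login screen.
open_with_saved_state() {
  if [ -f "$1" ]; then
    agent-browser state load "$1" || return 1
  fi
  agent-browser open "$2"
}
```

A call such as open_with_saved_state ./auth.json https://app.example.com/dashboard would then behave like the "future session" steps above when ./auth.json is present.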
See references/authentication.md for OAuth, 2FA, cookie-based auth, and token refresh patterns.
Essential Commands

    # Navigation
    agent-browser open <url>                # Navigate (aliases: goto, navigate)
    agent-browser close                     # Close browser

    # Snapshot
    agent-browser snapshot -i               # Interactive elements with refs (recommended)
    agent-browser snapshot -i -C            # Include cursor-interactive elements (divs with onclick, cursor:pointer)
    agent-browser snapshot -s "#selector"   # Scope to CSS selector

    # Interaction (use @refs from snapshot)
    agent-browser click @e1                 # Click element
    agent-browser click @e1 --new-tab       # Click and open in new tab
    agent-browser fill @e2 "text"           # Clear and type text
    agent-browser type @e2 "text"           # Type without clearing
    agent-browser select @e1 "option"       # Select dropdown option
    agent-browser check @e1                 # Check checkbox
    agent-browser press Enter               # Press key
    agent-browser keyboard type "text"      # Type at current focus (no selector)
    agent-browser keyboard inserttext "text"  # Insert without key events
    agent-browser scroll down 500           # Scroll page
    agent-browser scroll down 500 --selector "div.content"  # Scroll within a specific container

    # Get information
    agent-browser get text @e1              # Get element text
    agent-browser get url                   # Get current URL
    agent-browser get title                 # Get page title
    agent-browser get cdp-url               # Get CDP WebSocket URL

    # Wait
    agent-browser wait @e1                  # Wait for element
    agent-browser wait --load networkidle   # Wait