📦 Claude for Safari: Control Safari

v1.0.0

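Control the real Safari directly on macOS through AppleScript and screenshots: read tabs, inject JavaScript, and get screenshot feedback, for browser-level automation with no installation required.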

by @sdlll (SDLLL)
Last updated: 2026/4/18
Security scan
VirusTotal: Suspicious
OpenClaw: Safe (high confidence)
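The skill's requests and runtime instructions match its stated purpose (controlling Safari through AppleScript and screenshots). It is powerful and privacy-sensitive, but its internal logic is consistent, with no unexplained dependencies or installation behavior.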
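Assessment recommendations
This skill does what its name says: it uses AppleScript, executes JavaScript inside pages, and takes screenshots of a real Safari session. Before installing, consider the following: 1) Once "Automation" and "Screen Recording" permissions are granted, the terminal or agent has direct access to your open tabs, logged-in sessions, form data, and screen contents. Enable these permissions only if you trust the agent. 2) At runtime the skill compiles and runs a small helper in /tmp; you can inspect that file if you want to audit it (it is not downloaded from the network). 3) Because the agent receives page text and screenshots, avoid using it on pages with sensitive information (online banking, 2FA codes), or use a separate profile or private window for sensitive accounts. 4) Restrict or review the agent's auto-approval (confirm actions interactively one at a time rather than in bulk), and revoke the Automation and Screen Recording permissions when not in use. To be more cautious, test on non-sensitive pages first. ...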
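Detailed analysis
Purpose and capabilities
The name and description promise control of Safari through AppleScript and screenshots, and SKILL.md provides AppleScript commands, in-page JavaScript execution, and a screenshot workflow. The required resources (osascript, screencapture, and the Swift compiler at runtime) are exactly what a skill of this kind needs.
Instruction scope
The instructions explicitly have the agent list tabs, read page content, run arbitrary JavaScript in the page context, and take screenshots. This is expected behavior for a browser-control skill, but these actions can reach logged-in sessions, cookies, form contents, and any visible page data, all of which are highly sensitive. The skill also compiles a small Swift helper in /tmp to find the Safari window ID (it writes and executes a binary in /tmp). There are no instructions to read unrelated files or environment variables.
Installation mechanism
There is no install spec and no external download. The only runtime write/execute behavior is generating and compiling a temporary Swift helper in /tmp (swiftc). This is reasonable for a native macOS tool, but worth noting because it creates an executable at runtime.
Credential requirements
The skill declares no environment variables, credentials, or configuration paths. The permissions it requests ("Automation" to control Safari, and optionally "Screen Recording") map directly to its functionality, and it requests no unrelated credentials.
Persistence and permissions
The skill is instruction-only and not always enabled. It allows autonomous invocation (the platform default), which, combined with the ability to control Safari, widens its reach: the agent can read active sessions or run JS without any additional system credentials. This is expected behavior for a browser automation skill, but it calls for explicit caution from the user.
Security is layered; review the code before running it.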

Runtime dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/3/8

Initial release: control the real Safari browser on macOS through AppleScript and screenshots. Zero installation.


Install command

Official: npx clawhub@latest install claude-for-safari
Mirror: npx clawhub@latest install claude-for-safari --registry https://cn.longxiaskill.com

Skill documentation

- Safari > Settings > Advanced: check "Show Develop menu in menu bar", then choose Develop > Allow JavaScript from Apple Events

## Core capabilities

### 1. List all open tabs

```bash
osascript -e '
tell application "Safari"
    set output to ""
    repeat with w from 1 to (count of windows)
        repeat with t from 1 to (count of tabs of window w)
            set tabName to name of tab t of window w
            set tabURL to URL of tab t of window w
            set output to output & "W" & w & "T" & t & " | " & tabName & " | " & tabURL & linefeed
        end repeat
    end repeat
    return output
end tell'
```

### 2. Read page content

Read the full text content of the current tab:

```bash
osascript -e '
tell application "Safari"
    do JavaScript "document.body.innerText" in current tab of front window
end tell'
```

Read structured content (title, URL, meta description, headings):

```bash
osascript -e '
tell application "Safari"
    do JavaScript "JSON.stringify({
        title: document.title,
        url: location.href,
        description: document.querySelector(\"meta[name=description]\")?.content || \"\",
        h1: [...document.querySelectorAll(\"h1\")].map(e => e.textContent).join(\" | \"),
        h2: [...document.querySelectorAll(\"h2\")].map(e => e.textContent).join(\" | \")
    })" in current tab of front window
end tell'
```

Read a simplified DOM (similar to browser_read in Chrome ACP):

```bash
osascript -e '
tell application "Safari"
    do JavaScript "
        (function() {
            const walk = (node, depth) => {
                let result = \"\";
                for (const child of node.childNodes) {
                    if (child.nodeType === 3) {
                        const text = child.textContent.trim();
                        if (text) result += text + \"\\n\";
                    } else if (child.nodeType === 1) {
                        const tag = child.tagName.toLowerCase();
                        if ([\"script\",\"style\",\"noscript\",\"svg\"].includes(tag)) continue;
                        const style = getComputedStyle(child);
                        if (style.display === \"none\" || style.visibility === \"hidden\") continue;
                        if ([\"h1\",\"h2\",\"h3\",\"h4\",\"h5\",\"h6\"].includes(tag))
                            result += \"#\".repeat(parseInt(tag[1])) + \" \";
                        if (tag === \"a\") result += \"[\";
                        if (tag === \"img\") result += \"[Image: \" + (child.alt || \"\") + \"]\\n\";
                        else if (tag === \"input\") result += \"[Input \" + child.type + \": \" + (child.value || child.placeholder || \"\") + \"]\\n\";
                        else if (tag === \"button\") result += \"[Button: \" + child.textContent.trim() + \"]\\n\";
                        else result += walk(child, depth + 1);
                        if (tag === \"a\") result += \"](\" + child.href + \")\\n\";
                        if ([\"p\",\"div\",\"li\",\"tr\",\"br\",\"h1\",\"h2\",\"h3\",\"h4\",\"h5\",\"h6\"].includes(tag))
                            result += \"\\n\";
                    }
                }
                return result;
            };
            return walk(document.body, 0).substring(0, 50000);
        })()
    " in current tab of front window
end tell'
```

### 3. Execute JavaScript

Run arbitrary JavaScript in the page context and get the return value:

```bash
osascript -e '
tell application "Safari"
    do JavaScript "YOUR_JS_CODE_HERE" in current tab of front window
end tell'
```

For multi-line scripts, use a heredoc:

```bash
osascript << 'APPLESCRIPT'
tell application "Safari"
    do JavaScript "
        (function() {
            // Multi-line JS here
            return 'result';
        })()
    " in current tab of front window
end tell
APPLESCRIPT
```

### 4. Screenshots

Two methods are available; detect which one can be used at the start of the session:

```bash
# Test whether screen recording permission is granted (enables background screenshots)
/tmp/safari_wid 2>/dev/null && echo "BACKGROUND_SCREENSHOT=true" || echo "BACKGROUND_SCREENSHOT=false"
```

#### Background screenshot (requires Screen Recording permission)

If the terminal app has Screen Recording permission, `screencapture -l` can capture the window without activating Safari:

```bash
# Compile the helper once per session (if it does not exist yet)
if [ ! -f /tmp/safari_wid ]; then
cat > /tmp/safari_wid.swift << 'SWIFT'
import CoreGraphics
import Foundation

// Print the window ID of the first on-screen, layer-0 window owned by Safari.
let options: CGWindowListOption = [.optionOnScreenOnly, .excludeDesktopElements]
guard let windowList = CGWindowListCopyWindowInfo(options, kCGNullWindowID) as? [[String: Any]] else { exit(1) }
for window in windowList {
    guard let owner = window[kCGWindowOwnerName as String] as? String, owner == "Safari",
          let layer = window[kCGWindowLayer as String] as? Int, layer == 0,
          let wid = window[kCGWindowNumber as String] as? Int else { continue }
    print(wid)
    exit(0)
}
exit(1)
SWIFT
swiftc /tmp/safari_wid.swift -o /tmp/safari_wid
fi

# Capture the Safari window in the background (no activation needed)
WID=$(/tmp/safari_wid)
screencapture -l "$WID" -o -x /tmp/safari_screenshot.png
```

To enable: direct the user to System Settings > Privacy & Security > Screen Recording and grant permission to the terminal app (Terminal / iTerm / Warp).

#### Foreground screenshot (no extra permission required)

If Screen Recording permission has not been granted, fall back to region-based capture. This method briefly activates Safari (about 0.5 seconds) and then switches back to the original app:

```bash
# Remember the current frontmost app
FRONT_APP=$(osascript -e 'tell application "System Events" to get name of first process whose frontmost is true')

# Activate Safari and capture the window region
osascript -e 'tell application "Safari" to activate'
sleep 0.3
BOUNDS=$(osascript -e '
tell application "System Events"
    tell process "Safari"
        -- Safari may expose a thin toolbar as window 1; find the largest window
        set bestW to 0
        set bestBounds to ""
        repeat with i from 1 to (count of windows)
            set {x, y} to position of window i
            set {w, h} to size of window i
            if w * h > bestW then
                set bestW to w * h
                set bestBounds to (x as text) & "," & (y as text) & "," & (w as text) & "," & (h as text)
            end if
        end repeat
        return bestBounds
    end tell
end tell')
screencapture -x -R "$BOUNDS" /tmp/safari_screenshot.png

# Switch back to the previous app
osascript -e "tell application \"$FRONT_APP\" to activate"
```

With either method, read the screenshot after capturing it to see what is on screen:

```
Use the Read tool on /tmp/safari_screenshot.png to view the captured image.
```

### 5. Navigation

Open a URL in the current tab:

```bash
osascript -e '
tell application "Safari"
    set URL of current tab of front window to "https://example.com"
end tell'
```

Open a URL in a new tab:

```bash
osascript -e '
tell application "Safari"
    tell front window
        set newTab to make new tab with properties {URL:"https://example.com"}
        set current tab to newTab
    end tell
end tell'
```

Open a URL in a new window:

```bash
osascript -e 'tell application "Safari" to make new document with properties {URL:"https://example.com"}'
```

### 6. Click elements

Click with JavaScript (recommended; works with SPAs and reactive frameworks):

```bash
# The IIFE keeps const declarations out of the global scope,
# so the snippet can be re-run on the same page.
osascript -e '
tell application "Safari"
    do JavaScript "
        (function() {
            const el = document.querySelector(\"button.submit\");
            if (!el) return \"element not found\";
            el.dispatchEvent(new MouseEvent(\"click\", {bubbles: true, cancelable: true}));
            return \"clicked\";
        })()
    " in current tab of front window
end tell'
```

Note: for React/Vue/Angular, use `dispatchEvent(new MouseEvent(..., {bubbles: true}))` rather than `.click()`, which may bypass synthetic event handlers.

### 7. Typing and filling forms

Set an input value through JavaScript:

```bash
osascript -e '
tell application "Safari"
    do JavaScript "
        (function() {
            const input = document.querySelector(\"input[name=search]\");
            const nativeSetter = Object.getOwnPropertyDescriptor(window.HTMLInputElement.prototype, \"value\").set;
            nativeSetter.call(input, \"search text\");
            input.dispatchEvent(new Event(\"input\", {bubbles: true}));
            input.dispatchEvent(new Event(\"change\", {bubbles: true}));
        })()
    " in current tab of front window
end tell'
```

Note: for React controlled components, use the native setter + dispatchEvent pattern above; assigning `.value` directly does not trigger a React state update.

Type through System Events (simulates a real keyboard; useful when JS injection is blocked):

```bash
osascript -e '
tell application "Safari" to activate
delay 0.3
tell application "System Events"
    keystroke "hello world"
end tell'
```

Press special keys:

```bash
osascript -e '
tell application "System Events"
    key code 36 -- Enter/Return
    key code 48 -- Tab
    key code 51 -- Delete/Backspace
    keystroke "a" using command down -- Cmd+A (select all)
    keystroke "c" using command down -- Cmd+C (copy)
end tell'
```

### 8. Scrolling

```bash
# Scroll down 500px
osascript -e 'tell application "Safari" to do JavaScript "window.scrollBy(0, 500)" in current tab of front window'

# Scroll to the top
osascript -e 'tell application "Safari" to do JavaScript "window.scrollTo(0, 0)" in current tab of front window'

# Scroll to the bottom
osascript -e 'tell application "Safari" to do JavaScript "window.scrollTo(0, document.body.scrollHeight)" in current tab of front window'

# Scroll an element into view
osascript -e 'tell application "Safari" to do JavaScript "document.querySelector(\"#target\").scrollIntoView({behavior: \"smooth\"})" in current tab of front window'
```

### 9. Switch tabs

```bash
# Switch to the 2nd tab of the front window
osascript -e 'tell application "Safari" to set current tab of front window to tab 2 of front window'

# Switch to the tab whose URL matches
osascript -e '
tell application "Safari"
    repeat with t from 1 to (count of tabs of front window)
        if URL of tab t of front window contains "github.com" then
            set current tab of front window to tab t of front window
            exit repeat
        end if
    end repeat
end tell'
```

### 10. Wait for page load

```bash
osascript -e '
tell application "Safari"
    -- Wait for the page to finish loading (up to 10 seconds)
    repeat 20 times
        set readyState to do JavaScript "document.readyState" in current tab of front window
        if readyState is "complete" then exit repeat
        delay 0.5
    end repeat
end tell'
```

## Workflow: browsing loop with screenshot feedback

For tasks that need visual confirmation, follow the screenshot loop:

1. Perform an action (navigate, click, scroll, etc.)
2. Wait for the page to load if necessary
3. Take a screenshot (background or foreground), then read the image to see the result
4. Decide the next action based on what you see

## Operating on a specific tab

To operate on a tab other than the current one, use the `tab N of window M` syntax:

```bash
# Read the title of the 3rd tab of window 1
osascript -e 'tell application "Safari" to do JavaScript "document.title" in tab 3 of window 1'

# Execute JS in a specific tab
osascript -e 'tell application "Safari" to do JavaScript "document.body.innerText.substring(0, 1000)" in tab 2 of front window'
```

Note: a background screenshot captures the whole Safari window (that is, whichever tab is active at that moment). To screenshot a specific tab, switch to it through AppleScript first.

## Limitations

- macOS only: AppleScript and screencapture are macOS-specific
- Cannot intercept network requests: only page content and JS execution are accessible
- Cannot access cross-origin iframes: restricted by browser security policy
- Private Browsing windows: AppleScript cannot control private windows
- System Events keystrokes are typed "blind": they go to whichever app currently has focus, so make sure Safari is frontmost before using them
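
The `do JavaScript` examples in this document escape every double quote by hand (`\"`), which is easy to get wrong for longer scripts. As a minimal sketch, and assuming helper names (`as_escape`, `safari_js`) that are hypothetical and not part of the skill itself, the escaping could be factored into a small shell function:

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; these names are not part of the skill.

# Escape a string for embedding in an AppleScript double-quoted literal:
# backslashes are doubled first, then double quotes are backslash-escaped.
as_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Run JavaScript in the current tab of the front Safari window (macOS only).
# Assumes Develop > Allow JavaScript from Apple Events is enabled.
safari_js() {
    osascript -e "tell application \"Safari\" to do JavaScript \"$(as_escape "$1")\" in current tab of front window"
}

# Example usage (on macOS, with Safari open):
#   safari_js 'document.querySelector("h1")?.textContent'
#   safari_js 'document.readyState'
```

With a wrapper like this, the JavaScript can be written with ordinary double quotes inside a single-quoted shell string. JavaScript that itself contains single quotes would still need a heredoc or a shell variable.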
Data source: ClawHub · Chinese localization: 龙虾技能库