Audio

v1.0.0

Treat audio messages as instructions. When the user sends an audio file (WAV/PCM/MP3), transcribe it with iFlytek Speed Transcription and (1) execute tr...


by @smallkeyboy (smallKeyboy)

File processing, instant messaging, WeChat


Last updated

2026/4/26

Security scan

VirusTotal

Suspicious

OpenClaw

Suspicious

high confidence

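The skill claims to "execute" transcribed audio as commands, but the bundled code only transcribes, formats, and uploads the result: the documentation and the runtime behavior disagree, and the skill depends on external transcription/upload skills (a possible data exfiltration risk) without declaring that risk.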

Assessment recommendations

Key things to consider before installing:
- Mismatch between docs and code: the README/skill description says it will "execute" transcribed audio as commands, but scripts/handle_audio.py only transcribes and prints/saves the result; it does not execute arbitrary shell commands. If you expected automated execution, do not assume it exists; conversely, if you worry about remote execution, the code is safer than the docs claim, but the docs could cause an agent to behave dangerously when chained with other skills.
- External...

Detailed analysis

⚠ Purpose and capability

The description and SKILL.md say audio can be transcribed and executed as commands. The shipped script transcribes audio, prepares output, and may save/upload results, but it does NOT actually execute arbitrary commands derived from the transcription. That is a substantive mismatch: the skill advertises a command execution capability that the code does not implement.

⚠ Instruction scope

SKILL.md instructs agents to run the ifly-speed-transcription and uploader scripts from specific workspace paths and describes executing transcriptions as commands. Those instructions grant broad discretion (execute user-provided text as commands), which is dangerous in general. The actual script does not perform shell execution, but the instructions still direct the agent to use other local scripts and to upload potentially sensitive content; this scope is broader than just transcription.

✓ Install mechanism

No install spec or remote downloads: the skill is instruction-only with a local Python script. Nothing is pulled from external URLs during install, so install-time risk is low.

ℹ Credential requirements

The skill declares no credentials or env vars. It does, however, depend on external skills (ifly-speed-transcription and uploader) located under ~/.OpenClaw/workspace. Those helper scripts (not included here) may require credentials or upload targets; this skill will forward transcript data to them, so secret handling/exfiltration risk depends on those other skills.

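✓ Persistence and privileges

No elevated privileges requested: always is false, the skill does not modify other skills' configuration, and it only writes files under the user's workspace directory. It runs other local scripts via subprocess, but does not install a persistent agent or change system-level settings.

Security is layered; review the code before running it.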

Runtime dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/4/26

- Initial release of the Audio Command Handler skill.
- Supports audio message transcription via iFlytek Speed Transcription (WAV/PCM/MP3 files).
- Executes transcribed audio as commands if no accompanying text is provided.
- If both audio and a text command are present, uses the transcription as context for the command.
- Automatically saves and uploads results longer than 58 characters when processing audio + text command scenarios.
- Supports Chinese, English, and 202+ Chinese dialects; audio up to 5 hours.


Install command

Official: npx clawhub@latest install audio-command-handler

Mirror: npx clawhub@latest install audio-command-handler --registry https://cn.longxiaskill.com


Skill documentation

Process audio messages and execute them as commands.

Workflow

Scenario 1: Audio Only (No Text)

User sends an audio file without any text instruction:

Transcribe the audio using the ifly-speed-transcription skill
Use the transcription as the command: execute it as if the user had typed it
Return the result directly: no file upload is needed, regardless of length

Scenario 2: Audio + Text Command

User sends an audio file WITH a text instruction:

Transcribe the audio using the ifly-speed-transcription skill
Execute the text command with the transcription as context/input
Check the result length:

- If ≤ 58 characters: return the result directly
- If > 58 characters: save to file, upload via the uploader skill, return the URL
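The two scenarios reduce to one routing decision plus a length check. The following is a minimal sketch of that logic, not the shipped scripts/handle_audio.py: `respond`, `run_command`, and `upload_text` are hypothetical names, and the two callables are assumed to be supplied by the host agent (one interprets a command, the other saves and uploads text and returns a URL).

```python
from typing import Callable, Optional

RESULT_THRESHOLD = 58  # characters, per the workflow described above


def respond(
    transcript: str,
    text_command: Optional[str],
    run_command: Callable[[str, Optional[str]], str],  # hypothetical: (command, context) -> result
    upload_text: Callable[[str], str],                 # hypothetical: text -> download URL
    threshold: int = RESULT_THRESHOLD,
) -> str:
    """Route an already-transcribed audio message to a reply string."""
    if text_command is None or not text_command.strip():
        # Scenario 1: the transcription itself is the command; reply directly,
        # regardless of how long the result is.
        return run_command(transcript, None)

    # Scenario 2: the text is the command, the transcription is its context.
    result = run_command(text_command, transcript)
    if len(result) <= threshold:
        return result
    # Longer results are saved and uploaded; the reply is the returned URL.
    return upload_text(result)
```

Passing the two callables in keeps the routing testable without audio, network access, or any real command execution.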

Quick Reference

Transcription

python3 ~/.OpenClaw/workspace/skills/ifly-speed-transcription/scripts/transcribe.py /path/to/audio.mp3

Upload

python3 ~/.OpenClaw/workspace/skills/uploader/scripts/upload_media.py /path/to/file.txt
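The two commands above could be wrapped from Python roughly as follows. This is a sketch under assumptions: the directory and file names are taken to be `skills`, `uploader`, and `upload_media.py`; the helper names `build_command` and `run_skill_script` are hypothetical; and because the output format of the helper scripts is not documented here, the wrapper simply returns whatever they print to stdout.

```python
import subprocess
from pathlib import Path
from typing import List

# Script locations as given in the Quick Reference (assumed spelling of the path parts).
SKILLS_DIR = Path.home() / ".OpenClaw" / "workspace" / "skills"
TRANSCRIBE = SKILLS_DIR / "ifly-speed-transcription" / "scripts" / "transcribe.py"
UPLOAD = SKILLS_DIR / "uploader" / "scripts" / "upload_media.py"


def build_command(script: Path, target: str) -> List[str]:
    """Build the argv list; a list (not a shell string) avoids quoting problems."""
    return ["python3", str(script), target]


def run_skill_script(script: Path, target: str, timeout: float = 600.0) -> str:
    """Run a helper script and return its stdout (assumed to carry the result)."""
    completed = subprocess.run(
        build_command(script, target),
        capture_output=True,
        text=True,
        check=True,       # raise CalledProcessError on a non-zero exit code
        timeout=timeout,  # raise TimeoutExpired if the script hangs
    )
    return completed.stdout.strip()
```

For example, `run_skill_script(TRANSCRIBE, "/path/to/audio.mp3")` would run the transcription command shown above. Since the transcript is forwarded to scripts that are not part of this skill, it is worth reviewing them before wiring this in.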

Execution Flow

┌─────────────────┐
│  Audio Message  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Transcribe    │
│ (ifly-speed-    │
│  transcription) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     NO      ┌──────────────┐
│ Has Text Cmd?   │────────────►│ Use Transcrip│
└────────┬────────┘              │ as Command   │
         │ YES                   └──────┬───────┘
         ▼                              │
┌─────────────────┐                     │
│ Execute Text    │                     │
│ Cmd with Trans  │                     │
│ Context         │                     │
└────────┬────────┘                     │
         │                              │
         │                              ▼
         │                    ┌──────────────┐
         │                    │ Return Direct│
         │                    │ to User      │
         │                    │ (no upload)  │
         │                    └──────────────┘
         │
         ▼
┌─────────────────┐
│ Result > 58 ch? │
└────────┬────────┘
         │
         ├───────────────────────────┐
         │ YES                       │ NO
         ▼                           ▼
┌─────────────────┐         ┌──────────────┐
│ Save to File    │         │ Return Direct│
│ Upload via      │         │ to User      │
│ uploader skill  │         └──────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Return URL to   │
│ User            │
└─────────────────┘

Example Scenarios

Example 1: Audio Only

User sends: 🎤 audio file (speech: "帮我查一下明天上海的天气", i.e. "Check tomorrow's weather in Shanghai for me")

Flow:

Transcribe → "帮我查一下明天上海的天气"
Execute as command → check the Shanghai weather for tomorrow
Return the weather information directly (no upload, regardless of length)

Example 2: Audio + Command (Short Result)

User sends: 🎤 audio file + text "帮我总结这段录音" ("Summarize this recording for me")

Flow:

Transcribe audio → get the text content
Execute "帮我总结这段录音" with the transcription as context
If the summary is ≤ 58 chars → return it directly

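Example 3: Audio + Command (Long Result)

User sends: 🎤 audio file + text "帮我根据这段录音写一篇文章" ("Write an article based on this recording")

Flow:

Transcribe audio → get the text content
Execute the command with the transcription as context
Result > 58 chars → save to file, upload
Return: "已生成内容，下载链接：https://..." ("Content generated, download link: https://...")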
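The save step in Example 3 might look like the sketch below. The function name `save_result`, the `audio_result` prefix, and the timestamped filename pattern are all assumptions made for illustration; the skill's documentation only states that the file is written under the workspace directory before upload.

```python
import time
from pathlib import Path


def save_result(text: str, workspace: Path, prefix: str = "audio_result") -> Path:
    """Write a long result to a UTF-8 text file in the workspace and return its path."""
    workspace.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d_%H%M%S")  # hypothetical naming scheme
    path = workspace / f"{prefix}_{stamp}.txt"
    # Explicit UTF-8 so Chinese text survives regardless of the platform default.
    path.write_text(text, encoding="utf-8")
    return path
```

The returned path is what would then be handed to the uploader script.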

Notes

Audio formats: WAV, PCM, MP3 (16kHz, 16-bit, mono recommended)
Max duration: 5 hours
Language support: Chinese, English, 202+ Chinese dialects
Result threshold: 58 characters (configurable per implementation)
File location: Saved to ~/.OpenClaw/workspace/ before upload
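A WAV file can be checked against the recommended format and the duration limit before it is sent for transcription. The sketch below uses only Python's standard `wave` module; `check_wav` is a hypothetical helper, it treats the recommendations as warnings rather than hard errors, and it does not cover MP3 or raw PCM input.

```python
import wave
from typing import List

RECOMMENDED_RATE_HZ = 16000
RECOMMENDED_SAMPLE_BYTES = 2   # 16-bit
RECOMMENDED_CHANNELS = 1       # mono
MAX_SECONDS = 5 * 60 * 60      # 5 hours


def check_wav(path: str) -> List[str]:
    """Return human-readable warnings; an empty list means the file matches."""
    warnings = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        if rate != RECOMMENDED_RATE_HZ:
            warnings.append(f"sample rate is {rate} Hz, 16000 Hz recommended")
        if width != RECOMMENDED_SAMPLE_BYTES:
            warnings.append(f"sample width is {width * 8}-bit, 16-bit recommended")
        if channels != RECOMMENDED_CHANNELS:
            warnings.append(f"{channels} channels, mono recommended")
        # Duration in seconds is the frame count divided by the frame rate.
        if rate and wav.getnframes() / rate > MAX_SECONDS:
            warnings.append("duration exceeds the 5 hour limit")
    return warnings
```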
