Audio
v1.0.0 Treats audio messages as commands. When a user sends an audio file (WAV/PCM/MP3), transcribes it with iFlytek Speed Transcription and (1) executes tr...
Version
- Initial release of the Audio Command Handler skill.
- Supports audio message transcription via iFlytek Speed Transcription (WAV/PCM/MP3 files).
- Executes transcribed audio as commands if no accompanying text is provided.
- If both audio and a text command are present, uses the transcription as context for the command.
- Automatically saves and uploads results longer than 58 characters when processing audio + text command scenarios.
- Supports Chinese, English, and 202+ Chinese dialects; audio up to 5 hours.
Process audio messages and execute them as commands.
Workflow
Scenario 1: Audio Only (No Text)
The user sends an audio file without any text instruction:
- Transcribe the audio using the ifly-speed-transcription skill
- Use the transcription as the command
- Execute it as if the user had typed it
- Return the result directly; no file upload is needed, regardless of length
Scenario 2: Audio + Text Command
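The user sends an audio file with a text instruction:
- Transcribe the audio using the ifly-speed-transcription skill
- Execute the text command with the transcription as context/input
- Check the result length:
  - ≤ 58 characters: return directly
  - > 58 characters: save to file, upload via the uploader skill, return URL
Quick Reference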
Transcription
python3 ~/.OpenClaw/workspace/skills/ifly-speed-transcription/scripts/transcribe.py /path/to/audio.mp3
Upload
python3 ~/.OpenClaw/workspace/skills/uploader/scripts/upload_media.py /path/to/file.txt
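If the two commands above need to be called from Python rather than from a shell, a thin wrapper might look like the sketch below. It assumes the skill directories are named skills and uploader and that the upload script is upload_media.py. It also assumes each script prints its result (the transcript or the URL) to stdout and exits non-zero on failure, which this document does not confirm. The helper names build_cmd and run_script are hypothetical.

```python
import subprocess
from pathlib import Path

# Script locations as given in the quick reference (directory names assumed).
SKILLS_DIR = Path.home() / ".OpenClaw" / "workspace" / "skills"
TRANSCRIBE = SKILLS_DIR / "ifly-speed-transcription" / "scripts" / "transcribe.py"
UPLOAD = SKILLS_DIR / "uploader" / "scripts" / "upload_media.py"


def build_cmd(script: Path, target: str) -> list[str]:
    """Build the argv list for one of the two scripts (hypothetical helper)."""
    return ["python3", str(script), target]


def run_script(script: Path, target: str) -> str:
    """Run a script and return its stripped stdout.

    Assumption: the script writes its result to stdout and signals failure
    with a non-zero exit code, in which case CalledProcessError is raised.
    """
    proc = subprocess.run(
        build_cmd(script, target),
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.strip()
```

Under those assumptions, run_script(TRANSCRIBE, "/path/to/audio.mp3") would return the transcript and run_script(UPLOAD, "/path/to/file.txt") would return the upload URL.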
Execution Flow
┌─────────────────┐
│  Audio Message  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Transcribe    │
│  (ifly-speed-   │
│  transcription) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     NO      ┌──────────────┐
│  Has Text Cmd?  │────────────►│ Use Transcrip│
└────────┬────────┘             │ as Command   │
         │ YES                  └──────┬───────┘
         ▼                             │
┌─────────────────┐                    │
│  Execute Text   │                    │
│  Cmd with Trans │                    │
│  Context        │                    │
└────────┬────────┘                    │
         │                             │
         │                             ▼
         │                      ┌──────────────┐
         │                      │ Return Direct│
         │                      │ to User      │
         │                      │ (no upload)  │
         │                      └──────────────┘
         │
         ▼
┌─────────────────┐
│ Result > 58 ch? │
└────────┬────────┘
         │
         ├─────────────────────────────┐
         │ YES                         │ NO
         ▼                             ▼
┌─────────────────┐             ┌──────────────┐
│  Save to File   │             │ Return Direct│
│  Upload via     │             │ to User      │
│  uploader skill │             └──────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Return URL to  │
│  User           │
└─────────────────┘
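The branching in the flow above can be written as one small function. This is a minimal sketch, not the skill's actual implementation: handle_audio_message and its three callables (transcribe, execute, save_and_upload) are hypothetical names, injected as parameters so the routing can be read and tested without the real transcription or upload scripts.

```python
from typing import Callable, Optional

RESULT_THRESHOLD = 58  # characters; the notes describe this as configurable


def handle_audio_message(
    audio_path: str,
    text_command: Optional[str],
    transcribe: Callable[[str], str],
    execute: Callable[[str, Optional[str]], str],
    save_and_upload: Callable[[str], str],
    threshold: int = RESULT_THRESHOLD,
) -> str:
    """Route an audio message through the two scenarios.

    All three callables are hypothetical stand-ins:
      transcribe(path) -> transcript
      execute(command, context) -> result text
      save_and_upload(result) -> URL
    """
    transcript = transcribe(audio_path)

    if not text_command:
        # Scenario 1: the transcript is the command; never upload.
        return execute(transcript, None)

    # Scenario 2: run the text command with the transcript as context.
    result = execute(text_command, transcript)
    if len(result) > threshold:
        return save_and_upload(result)
    return result
```

Note that len() counts Unicode code points, so each Chinese character counts as one toward the threshold. Whether the skill counts characters the same way is not stated here.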
Example Scenarios
Example 1: Audio Only
User sends: 🎤 audio file (speech: "Check tomorrow's weather in Shanghai for me")
Flow:
- Transcribe → "Check tomorrow's weather in Shanghai for me"
- Execute as command → check Shanghai weather for tomorrow
- Return weather information directly (no upload, regardless of length)
Example 2: Audio + Command (Short Result)
User sends: 🎤 audio file + text "Summarize this recording for me"
Flow:
- Transcribe audio → get text content
- Execute "Summarize this recording for me" with transcription as context
- If summary ≤ 58 chars → return directly
Example 3: Audio + Command (Long Result)
User sends: 🎤 audio file + text "Write an article based on this recording"
Flow:
- Transcribe audio → get text content
- Execute command with transcription as context
- Result > 58 chars → save to file, upload
- Return: "Content generated. Download link: https://..."
Notes
- Audio formats: WAV, PCM, MP3 (16kHz, 16-bit, mono recommended)
- Max duration: 5 hours
- Language support: Chinese, English, 202+ Chinese dialects
- Result threshold: 58 characters (configurable per implementation)
- File location: saved to ~/.OpenClaw/workspace/ before upload
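For WAV input, the recommended format and the duration limit from the notes can be checked locally before transcription. The sketch below uses only Python's standard wave module. check_wav is a hypothetical helper, it handles WAV only (not raw PCM or MP3), and it reports warnings rather than rejecting the file, since the notes describe 16kHz, 16-bit, mono as a recommendation.

```python
import wave

MAX_SECONDS = 5 * 60 * 60  # 5-hour limit from the notes


def check_wav(path: str) -> list[str]:
    """Return warnings for a WAV file; an empty list means it matches the notes."""
    warnings = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        if rate != 16000:
            warnings.append(f"sample rate is {rate} Hz, 16000 Hz recommended")
        if wav.getsampwidth() != 2:  # sample width in bytes; 2 bytes = 16-bit
            bits = wav.getsampwidth() * 8
            warnings.append(f"sample width is {bits}-bit, 16-bit recommended")
        if wav.getnchannels() != 1:
            warnings.append(f"{wav.getnchannels()} channels, mono recommended")
        duration = wav.getnframes() / rate
        if duration > MAX_SECONDS:
            warnings.append(f"duration {duration:.0f}s exceeds the 5-hour limit")
    return warnings
```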