Version
2026-05-09 (v1.2.6): Removed the _backup_tests directory to fix ClawHub security scan flags.
Baidu Intelligent Cloud Speech Synthesis Skill
Triggers
Use this skill when the user mentions:
- "Convert this dialogue to audio using Baidu TTS"
- "Generate male-female dialogue, male voice using Duxiaoyao, female voice using Duxiaomei"
- "Batch process all dialogues in dialogue.txt"
- "Adjust speech rate to 7, pitch to 6"
- "View available voice list"
- "baidu tts", "dialogue to audio", "multi-speaker speech synthesis"
- "baidu speech synthesis", "multi-speaker dialogue", "Baidu TTS"
Chinese triggers (for Chinese users):
- "用百度TTS把这段对话转成音频"
- "生成男女对话,男声用度逍遥,女声用度小美"
- "批量处理 dialogue.txt 里的所有对话"
- "调整语速到7,音调到6"
- "查看可用的音色列表"
Overview
This skill calls the Baidu Intelligent Cloud Speech Synthesis API and supports multi-speaker dialogue synthesis (SSML mode or segment-merge fallback). It offers a wide selection of voices, adjustable speech rate, pitch, and volume, and can automatically convert text dialogues into audio files with character-specific voices.
Installation Dependencies
    # Install Python dependencies
    pip install requests

    # Ensure ffmpeg is installed (required for audio merging)
    # Ubuntu/Debian: sudo apt install ffmpeg
    # macOS: brew install ffmpeg
    # Windows: download from https://ffmpeg.org/download.html

    # Optional: If pydub is needed (alternative merging solution)
    # pip install pydub
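Because audio merging depends on ffmpeg being on the PATH, it can help to check for it before running a synthesis job. The following is a minimal sketch using only the standard library; the helper name `missing_tools` is hypothetical and is not part of the skill's scripts.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; not part of the skill's scripts.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH")
```

An empty result means every listed tool was found, so the segment-merge step should at least be able to start ffmpeg.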
Environment Variables Setup
Choose one of three authentication methods:
Method 1: API Key + Secret Key (auto-token)
    export BAIDU_API_KEY="Your API Key (non-bce-v3 format)"
    export BAIDU_SECRET_KEY="Your Secret Key"
Method 2: Direct access_token (starts with 1.)
    export BAIDU_API_KEY="YOUR_ACCESS_TOKEN"
    # BAIDU_SECRET_KEY not required
Method 3: IAM Key (starts with bce-v3/)
    export BAIDU_API_KEY="YOUR_IAM_KEY_HERE"
    # BAIDU_SECRET_KEY not required
    # Note: Existing bce-v3/ALTAK-... keys may be dedicated to other services (e.g., search).
    # If authentication fails, create a dedicated speech synthesis application to get an API Key + Secret Key.
Required Environment Variables
BAIDU_API_KEY must be set. Whether BAIDU_SECRET_KEY is needed depends on the authentication method:
Method 1: API Key + Secret Key (auto-token)
    BAIDU_API_KEY=Your API Key (non-bce-v3 format)
    BAIDU_SECRET_KEY=Your Secret Key
Method 2: Direct access_token (starts with 1.)
    BAIDU_API_KEY=YOUR_ACCESS_TOKEN
    # BAIDU_SECRET_KEY not required
Method 3: IAM Key (starts with bce-v3/)
    BAIDU_API_KEY=YOUR_IAM_KEY_HERE
    # BAIDU_SECRET_KEY not required
The skill scripts automatically detect the key format and choose the corresponding authentication method. If BAIDU_API_KEY is not set, the user will be prompted.
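The prefix rules above can be sketched as a small classifier. This is an illustrative sketch only: the function name `detect_auth_method` and the returned labels are assumptions, and the skill's actual scripts may implement detection differently (for example, by prompting the user instead of raising an error).

```python
import os


def detect_auth_method(api_key, secret_key=None):
    """Classify a BAIDU_API_KEY value by its prefix.

    Hypothetical helper; the labels returned here are illustrative.
    """
    if not api_key:
        raise ValueError("BAIDU_API_KEY is not set")
    if api_key.startswith("bce-v3/"):
        return "iam_key"        # Method 3: IAM Key, no secret key needed
    if api_key.startswith("1."):
        return "access_token"   # Method 2: direct access_token
    if not secret_key:
        raise ValueError(
            "BAIDU_SECRET_KEY is required for API Key + Secret Key authentication"
        )
    return "api_key_secret"     # Method 1: exchange the key pair for a token


if __name__ == "__main__":
    try:
        method = detect_auth_method(
            os.environ.get("BAIDU_API_KEY"),
            os.environ.get("BAIDU_SECRET_KEY"),
        )
        print("Detected authentication method:", method)
    except ValueError as err:
        print("Configuration problem:", err)
```

Checking the two distinctive prefixes first means anything else is treated as a plain API Key, which is why that branch is the one that requires BAIDU_SECRET_KEY.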
Usage
- Direct script invocation (command line)
    # Specify voice mapping (character name → voice code)
    python scripts/baidu_tts.py \
      --input script.txt \
      --map 小明:1 小红:0 老师:106

    # Batch process all .txt files in a directory
    python scripts/baidu_tts.py \
      --dir ./dialogues \
      --format mp3

    # Adjust parameters
    python scripts/baidu_tts.py \
      --input text.txt \
      --spd 7 --pit 6 --vol 5 \
      --aue 3
- Usage in OpenClaw sessions
When the user triggers the above phrases, the skill will:
1. Check the environment variable configuration
2. Ask for, or automatically identify, the input text or file
3. Generate SSML according to the default or specified voice assignment scheme
4. Call the Baidu API and return the audio file (can be played automatically or saved)
File Structure
baidu-speech-synthesis/
├── SKILL.md                   # This file
├── scripts/
│   ├── baidu_tts.py           # Main API client (token acquisition, SSML requests, segment merging)
│   ├── dialogue_formatter.py  # Dialogue text → SSML conversion and voice mapping
│   └── audio_merger.py        # ffmpeg audio merging tool (segment merge solution)
└── references/
    ├── voice_list.md          # Voice code table, samples, recommended pairings
    ├── ssml_guide.md          # Baidu SSML tags, limitations, examples
    └── api_setup.md           # How to obtain keys, free quota (5 million chars/month), authentication details
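To make the voice-assignment step of the workflow more concrete, here is a rough sketch of how `--map` entries such as `小明:1` might be parsed and applied to a dialogue. Several things here are assumptions rather than documented behavior: the "Speaker: text" line format (with either an ASCII or a full-width colon), the fallback voice code of 0, and the helper names `parse_voice_map` and `split_dialogue`. The skill's own dialogue_formatter.py may work differently.

```python
import re

# Assumed line format: "<speaker>: <text>", ASCII or full-width colon.
LINE_RE = re.compile(r"^([^:：]+)[:：]\s*(.+)$")


def parse_voice_map(entries):
    """Turn ["小明:1", "小红:0"] into {"小明": 1, "小红": 0}."""
    voice_map = {}
    for entry in entries:
        name, sep, code = entry.rpartition(":")
        if not sep or not name or not code.isdigit():
            raise ValueError(f"expected NAME:CODE, got {entry!r}")
        voice_map[name] = int(code)
    return voice_map


def split_dialogue(text, voice_map, default_voice=0):
    """Split dialogue text into (speaker, voice_code, speech) segments.

    Speakers missing from voice_map get default_voice (an assumed fallback).
    """
    segments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between turns
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"cannot find a speaker in line: {line!r}")
        speaker = match.group(1).strip()
        speech = match.group(2).strip()
        segments.append((speaker, voice_map.get(speaker, default_voice), speech))
    return segments


if __name__ == "__main__":
    mapping = parse_voice_map(["小明:1", "小红:0", "老师:106"])
    for segment in split_dialogue("小明: 你好\n小红：你好，小明", mapping):
        print(segment)
```

Each resulting segment carries exactly one voice code, so it can be synthesized on its own and the audio pieces merged afterwards.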
Technical Points
- Intelligent Mode Selection: Automatically detects multi-voice requirements and defaults to segment synthesis mode (the Baidu API only supports single-voice SSML).
- Segment Synthesis Solution: Splits multi-role dialogues into single-voice segments → synthesizes each separately → merges them with ffmpeg (works around the API limitation, compatible with Python 3.13).
- SSML Single-Voice Support: Supports single-voice SSML (tex_type=3) for complex speech expressions of individual characters.
- Automatic Voice Assignment: Default mapping "老王" → Duxiaoyao (3), "张经理" → Duxiaoyu (1), "小李" → Duyaya (4), customizable via --map.
- Error Handling: Friendly prompts for network timeouts, quota exhaustion, audio merge failures, etc.
Notes
- Free Quota: Baidu Speech Synthesis provides a free quota of 5 million characters/month (per the latest 2026 policy), pay-as-you-go beyond that.
- Authentication Methods: Supp