Text to speech using the default macOS "say" command. No need for 3rd party APIs or models. Supports many languages. Also, Trinoids!

v0.0.2

Local text-to-speech using macOS `say` + ffmpeg for Telegram/Matrix voice messages

by @zviratko·MIT-0

API development · Instant messaging · Storage · Deployment · System tools · Video processing

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install macos-say

Mirror: npx clawhub@latest install macos-say --registry https://cn.longxiaskill.com (mirror available)

Skill documentation

Say + FFmpeg TTS Pipeline

Use say (macOS native TTS) + ffmpeg to generate Opus voice messages for Telegram/Matrix.

Why not just say?

- Telegram/Matrix require Opus codec voice messages
- say outputs AIFF/m4a; must convert to .ogg (Opus) before sending
- Telegram accepts OGG/MP3/M4A as voice, but Opus OGG is the native format

Workflow

    say -v "<voice>" -o <dir>/<name>.aiff "<text>"
    ffmpeg -i <dir>/<name>.aiff -acodec libopus <dir>/<name>.ogg -y

Send with the message tool:

{ "action": "send", "channel": "telegram", "media": "<dir>/<name>.ogg", "asVoice": true, "target": "<target>" }

Recommended workspace directory

~/.openclaw/workspace/tmp/audio/

(Whitelist this path in exec permissions for faster approval)
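Keeping each message's files in their own subdirectory of the audio workspace avoids name collisions between concurrent replies. The sketch below is one way to do that, not something the skill itself ships: the function name new_audio_dir and the tts- prefix are illustrative assumptions, and the workspace path is passed in as an argument rather than hard-coded.

```shell
# Hypothetical helper (not part of the skill): create a fresh per-message
# directory under the given audio workspace and print its path.
new_audio_dir() (
  base="$1"
  mkdir -p "$base" || exit 1
  # mktemp -d replaces the XXXXXX suffix and creates the directory atomically.
  mktemp -d "$base/tts-XXXXXX"
)

# Usage: d=$(new_audio_dir "$DIR"); write "$d/voice.aiff" and "$d/voice.ogg",
# then remove "$d" once the message has been sent.
```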

Voice selection

Use say -v '?' to list available voices. Notable ones:

- Trinoids: robotic/electronic voice (popular for bots)
- Samantha: warm US female voice
- Alex: US male voice
- Fred: neutral US male voice
- Karen: Australian female voice

Note: pass just the voice name (e.g. "Trinoids"), not the full en_US suffix.

Example: send a hello voice message

    VOICE="Trinoids"
    TEXT="Hello!"
    DIR="$HOME/.openclaw/workspace/tmp/audio"
    mkdir -p "$DIR"

    say -v "$VOICE" -o "$DIR/hello.aiff" "$TEXT"
    ffmpeg -i "$DIR/hello.aiff" -acodec libopus "$DIR/hello.ogg" -y

    # Then send via message tool with asVoice: true
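For repeated use, the two commands in the example can be wrapped in a single function that checks for both tools and removes the intermediate file. This is a minimal sketch under stated assumptions: the name say_to_ogg and its argument order are made up for illustration, and the extra ffmpeg flags (-nostdin, -loglevel error) are a choice made here to keep the call quiet and non-interactive, not something the skill prescribes.

```shell
# Sketch of a reusable wrapper around say + ffmpeg.
# Function name and argument order are illustrative assumptions.
say_to_ogg() (
  voice="$1"; text="$2"; out="$3"        # out: target path ending in .ogg
  aiff="${out%.ogg}.aiff"                # intermediate file next to the output

  command -v say >/dev/null 2>&1 \
    || { echo "say not found (macOS only)" >&2; exit 1; }
  command -v ffmpeg >/dev/null 2>&1 \
    || { echo "ffmpeg not found" >&2; exit 1; }

  say -v "$voice" -o "$aiff" "$text" || exit 1
  # -nostdin keeps ffmpeg from reading the terminal; -y overwrites old output.
  ffmpeg -nostdin -loglevel error -y -i "$aiff" -acodec libopus "$out" || exit 1

  rm -f "$aiff"                          # drop the intermediate AIFF
  printf '%s\n' "$out"                   # print the path for the message tool
)

# Usage: say_to_ogg "Trinoids" "Hello!" "$DIR/hello.ogg"
```

The function body runs in a subshell, so its variables do not leak into the caller and a failed step simply yields a non-zero status.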

Format notes

- Input to ffmpeg: AIFF (.aiff) works reliably; avoid .m4a with say
- Output: Opus in Ogg container (libopus codec), required for Telegram voice messages
- Telegram sendVoice accepts OGG, MP3, M4A, but native is Opus OGG
- Sample rate: say outputs 24kHz AIFF; ffmpeg re-encodes to Opus at 24kHz

Integration with OpenClaw TTS

OpenClaw's built-in messages.tts only supports: ElevenLabs, Microsoft Edge, MiniMax, OpenAI.

This say+ffmpeg pipeline is a workaround for local-only TTS without API keys or cloud services. It's not auto-triggered by OpenClaw; call it manually via exec + message tool.
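Because the pipeline runs outside OpenClaw's own TTS path, nothing checks the file before it reaches the message tool. One option, offered here as an assumption rather than part of the skill, is to confirm with ffprobe (installed alongside ffmpeg) that the first audio stream really is Opus. The function name is_opus is hypothetical.

```shell
# Hypothetical pre-send check: succeed only if the first audio stream
# of the given file is reported by ffprobe as Opus.
is_opus() (
  codec=$(ffprobe -v error -select_streams a:0 \
            -show_entries stream=codec_name \
            -of default=noprint_wrappers=1:nokey=1 "$1") || exit 1
  [ "$codec" = "opus" ]
)

# Usage: is_opus reply.ogg && echo "ready to send as voice"
```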

Language Detection → Voice Mapping

When responding to a voice message, detect the language from the STT output (Parakeet auto-detects). Then pick the matching say voice using i18n locale codes.
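The first step, turning a detected language into a locale code, can be a plain lookup. The sketch below covers only the six locales this document lists, and it assumes the STT step reports a two-letter ISO 639-1 code, optionally followed by a region (for example cs or cs-CZ); that output format is an assumption, as is the function name lang_to_locale.

```shell
# Map a detected language code to a locale as shown by say -v '?'.
# Assumption: input is a two-letter ISO 639-1 code, optionally followed
# by "-" or "_" and a region, which is ignored.
lang_to_locale() {
  case "$1" in
    cs|cs[-_]*) echo cs_CZ ;;
    en|en[-_]*) echo en_US ;;
    de|de[-_]*) echo de_DE ;;
    fr|fr[-_]*) echo fr_FR ;;
    es|es[-_]*) echo es_ES ;;
    it|it[-_]*) echo it_IT ;;
    *)          return 1 ;;   # unknown language: let the caller decide
  esac
}

# Usage: locale=$(lang_to_locale "cs") || echo "no voice mapping" >&2
```

Returning a failure for unknown codes, instead of defaulting to English, keeps the caller from speaking a reply in a voice that does not match the language.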

Finding voices by language:
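A fixed-string grep over the voice listing is usually enough for this. The sketch assumes each line of say -v '?' output carries the locale code as its own whitespace-delimited token; the function name voices_for_locale is illustrative, and it reads the listing on stdin so the filter can be tried on saved output.

```shell
# Print the listing lines for one locale. Reads say -v '?' output on stdin.
voices_for_locale() {
  # -F: match the locale as a fixed string, not a regex.
  # -w: whole-word match, so "en_US" does not match inside a longer token.
  grep -F -w -- "$1"
}

# Usage on macOS: say -v '?' | voices_for_locale cs_CZ
```

The matching lines keep the full voice name, including any (Premium) or (Enhanced) marker, which is what the priority rules in this section rely on.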

Language → voice selection priority:

1. Use (Premium) if available
2. Fall back to (Enhanced) if available
3. Fall back to base name
4. Never use a voice that doesn't match the language

Language       i18n code   Preferred Voice
Czech          cs_CZ       Zuzana (Premium)
English (US)   en_US       Trinoids (no Premium/Enhanced available)
German         de_DE       Grandma (Premium) if available
French         fr_FR       Grandma (Premium) if available
Spanish        es_ES       Grandma (Premium) if available
Italian        it_IT       Grandma (Premium) if available
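The priority rules can be sketched as a single awk pass over the voice listing. This assumes the listing has the shape "name, locale, # sample sentence" per line and that quality variants are marked by a trailing (Premium) or (Enhanced) in the name; both are assumptions about say -v '?' output, and pick_voice is a hypothetical name. When several voices share a tier, the first one listed wins, so a specific preference such as Trinoids for en_US would still need to be set explicitly.

```shell
# Pick a voice for a locale: Premium first, then Enhanced, then base name.
# Reads say -v '?' output on stdin; fails with no output if nothing matches.
pick_voice() {
  awk -v loc="$1" '
    {
      sub(/[[:space:]]*#.*$/, "")         # drop the "# sample sentence" column
      if ($NF != loc) next                # last remaining field is the locale
      $NF = ""; sub(/[[:space:]]+$/, "")  # keep only the voice name
      if ($0 ~ /\(Premium\)$/) {
        if (premium == "") premium = $0
      } else if ($0 ~ /\(Enhanced\)$/) {
        if (enhanced == "") enhanced = $0
      } else if (base == "") {
        base = $0
      }
    }
    END {
      best = premium
      if (best == "") best = enhanced
      if (best == "") best = base
      if (best == "") exit 1              # never fall back to another language
      print best
    }'
}

# Usage on macOS: VOICE=$(say -v '?' | pick_voice cs_CZ) || exit 1
```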

Key: Always use just the voice name (e.g. "Trinoids", "Zuzana"), not the full locale suffix. The locale suffix in say -v '?' output is for grepping/identification only.

Example workflow:

    LANG_CODE="cs_CZ"   # not LANG, which would change the shell's own locale
    # Takes the first word of the first voice listed for this locale;
    # it does not yet apply the Premium > Enhanced > base priority
    VOICE=$(say -v '?' 2>&1 | grep "$LANG_CODE" | head -1 | awk '{print $1}')
    say -v "$VOICE" -o reply.aiff "Česká odpověď"
    ffmpeg -i reply.aiff -acodec libopus reply.ogg -y

TODOs

- Detect language from STT transcription and auto-select appropriate say voice
- Explore integrating into OpenClaw via custom TTS provider plugin
- Investigate if OpenClaw supports post-processing TTS output via a hook
- Test Matrix channel voice message format compatibility
