MiniMax TTS
v1
MiniMax Text-to-Speech synthesis using the HTTP REST API. Generates high-quality audio from text in 40+ languages with ultra-realistic voices. Use when the user wants to convert text to speech, create voiceovers, generate narrated audio content, or use MiniMax TTS voices. Supports streaming and non-streaming modes, multiple audio formats (mp3, wav, pcm), and voice effects. Triggered by: text to speech, TTS, text to audio, MiniMax TTS, generate voice, voiceover, read this aloud, text to voice
MiniMax TTS
MiniMax Text-to-Speech via HTTP REST API. Supports streaming and non-streaming modes, 40+ languages, and 200+ voices.
API Details
Endpoint: POST https://api.minimax.io/v1/t2a_v2
Alt endpoint (lower latency): POST https://api-uw.minimax.io/v1/t2a_v2
Auth: Bearer token via the MINIMAX_API_KEY env var
Content-Type: application/json

Quick Usage
    uv run python scripts/tts.py --text "Hello world" --voice English_expressive_narrator --model speech-2.8-hd --output hello.mp3
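For readers who want to see how the endpoint, bearer auth, and content type above fit together outside the script, here is a minimal sketch that builds the request without sending it. The function name `build_tts_request` is hypothetical, and the payload field names (`voice_setting`, `audio_setting`, `stream`) are assumptions that should be checked against scripts/tts.py or the MiniMax API reference before use.

```python
import json
import os
import urllib.request

API_URL = "https://api.minimax.io/v1/t2a_v2"  # endpoint from the API details above


def build_tts_request(text, api_key, voice_id="English_expressive_narrator",
                      model="speech-2.8-hd", url=API_URL):
    """Build (but do not send) a POST request for the TTS endpoint."""
    # ASSUMPTION: these payload field names are illustrative, not confirmed
    # by this document; verify them against the real request schema.
    payload = {
        "model": model,
        "text": text,
        "stream": False,
        "voice_setting": {"voice_id": voice_id},
        "audio_setting": {"format": "mp3", "sample_rate": 32000, "bitrate": 128000},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Reads the key from the environment, as the Auth line above describes.
    req = build_tts_request("Hello world", os.environ.get("MINIMAX_API_KEY", ""))
    print(req.get_method(), req.full_url)
```

Passing `url="https://api-uw.minimax.io/v1/t2a_v2"` would target the alternate endpoint instead. Sending the request and decoding the response are deliberately left out, since the response shape is not described here.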
Scripts
scripts/tts.py - Core TTS script. Run with --help for full options.

Models
Model               Description
speech-2.8-hd       Ultra-realistic, supports sound tags
speech-2.8-turbo    Fast, natural flow
speech-2.6-hd       Low latency, enhanced naturalness
speech-2.6-turbo    Fast, affordable
speech-02-hd        Superior rhythm, high similarity
speech-02-turbo     Superior rhythm, multilingual

Output Formats
Formats: mp3 (default), wav, pcm
Sample rates (Hz): 32000 (default), 16000, 24000, 48000
Bitrate (bps): 128000 (default), 64000, 32000

Languages
40+ languages including: English, Chinese (Mandarin/Cantonese), Japanese, Korean, Spanish, French, German, Portuguese, Arabic, Russian, Hindi, Thai, Vietnamese, Turkish, Dutch, Polish, Italian, Indonesian, Malay, Persian, Swedish, Norwegian, Danish, Finnish, Hebrew, Romanian, Greek, Czech, Hungarian, Tamil, Afrikaans, and more.
Voices
Key English voices:
English_expressive_narrator - Default expressive narrator
English_radiant_girl - Radiant female
English_magnetic_voiced_man - Magnetic male voice
English_Aussie_Bloke - Australian male
English_Whispering_girl - Whispering female
English_PlayfulGirl - Playful female
English_Comedian - Comedic voice
English_AnimeCharacter - Female anime narrator
For the full voice list (200+ voices across all languages), see references/voices.md.
Sound Tags (speech-2.8-hd only)
Use XML-like tags for breathing, pauses, and expression:
(sighs) - breathing sound
(laughs) - laughter
(coughs) - coughing
[laughs] - laughing
... or (pause:500) - pause in ms
important - emphasis
A-P-I - spell out letters

Script Usage
    uv run python scripts/tts.py --text "Your text here" [options]
Options:
  --text TEXT               Text to synthesize (required)
  --model MODEL             Model: speech-2.8-hd (default), speech-2.8-turbo, speech-2.6-hd, etc.
  --voice VOICE_ID          Voice ID (default: English_expressive_narrator)
  --speed SPEED             Speed 0.5-2.0 (default: 1.0)
  --pitch PITCH             Pitch -3 to 3 (default: 0)
  --vol VOLUME              Volume 0-10 (default: 1)
  --language_boost LANG     Language boost: auto (default), or a specific language, e.g. en, zh
  --output_format FORMAT    hex (default) or raw (mp3/wav bytes returned directly)
  --format AUDIO_FORMAT     mp3 (default), wav, pcm
  --sample_rate RATE        32000 (default), 16000, 24000, 48000
  --bitrate BITRATE         128000 (default), 64000, 32000
  --stream                  Enable streaming mode (returns chunks as they are generated)
  --output FILE             Output file path (default: minimax_tts_output.mp3)
  --api_url URL             Override API URL
  --api_key KEY             Override API key (reads MINIMAX_API_KEY env if not set)
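The ranges and allowed values listed in the options above can be checked before any request is made. The following is a small illustrative validator, not part of scripts/tts.py. The name `validate_options` is hypothetical, and treating every range bound as inclusive is an assumption.

```python
ALLOWED_FORMATS = ("mp3", "wav", "pcm")
ALLOWED_SAMPLE_RATES = (16000, 24000, 32000, 48000)
ALLOWED_BITRATES = (32000, 64000, 128000)


def validate_options(speed=1.0, pitch=0, vol=1, fmt="mp3",
                     sample_rate=32000, bitrate=128000):
    """Return a list of problems; an empty list means every value is acceptable."""
    problems = []
    # ASSUMPTION: range bounds are inclusive on both ends.
    if not 0.5 <= speed <= 2.0:
        problems.append(f"speed {speed} outside 0.5-2.0")
    if not -3 <= pitch <= 3:
        problems.append(f"pitch {pitch} outside -3 to 3")
    if not 0 <= vol <= 10:
        problems.append(f"vol {vol} outside 0-10")
    if fmt not in ALLOWED_FORMATS:
        problems.append(f"format {fmt!r} not in {ALLOWED_FORMATS}")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"sample_rate {sample_rate} not in {ALLOWED_SAMPLE_RATES}")
    if bitrate not in ALLOWED_BITRATES:
        problems.append(f"bitrate {bitrate} not in {ALLOWED_BITRATES}")
    return problems


if __name__ == "__main__":
    for problem in validate_options(speed=3.0, fmt="ogg"):
        print("invalid option:", problem)
```

Returning a list rather than raising on the first failure lets a caller report every bad value at once.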
Streaming Mode
    uv run python scripts/tts.py --text "Hello, streaming audio." --stream --output stream_output.mp3
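With the hex output format (the default listed under Options), streamed audio would presumably arrive as a sequence of hex-encoded strings that need to be decoded and written in order. The sketch below shows only that decoding step. The helper name `write_hex_chunks` is hypothetical, and how chunks are read off the HTTP response is an assumption left outside the example.

```python
import os
import tempfile


def write_hex_chunks(hex_chunks, path):
    """Decode hex-encoded audio chunks and write them to `path` in arrival order.

    Returns the total number of audio bytes written.
    """
    total = 0
    with open(path, "wb") as fh:
        for chunk in hex_chunks:
            if not chunk:  # skip empty chunks rather than failing on them
                continue
            data = bytes.fromhex(chunk)  # raises ValueError on malformed hex
            fh.write(data)
            total += len(data)
    return total


if __name__ == "__main__":
    # Stand-in chunks for illustration; real chunks would come from the API.
    demo_chunks = ["494433", "", "0400"]
    out_path = os.path.join(tempfile.gettempdir(), "stream_demo.bin")
    print(write_hex_chunks(demo_chunks, out_path), "bytes written")
```

Because `hex_chunks` can be any iterable, a generator that yields chunks as they arrive works without buffering the whole response in memory.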
Examples
    # Basic
    uv run python scripts/tts.py --text "The quick brown fox jumps over the lazy dog."

    # Different voice
    uv run python scripts/tts.py --text "Bonjour le monde" --voice French_Standard_Female --model speech-2.6-turbo

    # Streaming
    uv run python scripts/tts.py --text "This is streaming audio" --stream --output streaming.mp3

    # With sound tags (expressive)
    uv run python scripts/tts.py --text "Hello(sighs)... what a beautiful day(laughs)!" --voice English_expressive_narrator --model speech-2.8-hd
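Since sound tags are documented as working only with speech-2.8-hd, a pre-flight check can catch text that contains tags but targets another model. This is a rough sketch. The helper names are hypothetical, and the tag list covers only the parenthesized tags shown in this document, so the real set may be larger.

```python
import re

SOUND_TAG_MODEL = "speech-2.8-hd"
# ASSUMPTION: only the parenthesized tags named in this document are matched.
KNOWN_TAGS = ("sighs", "laughs", "coughs")
_TAG_RE = re.compile(r"\((" + "|".join(KNOWN_TAGS) + r")\)")


def find_sound_tags(text):
    """Return the known sound tags found in `text`, in order of appearance."""
    return _TAG_RE.findall(text)


def check_sound_tags(text, model):
    """Return a warning string if tags are used with an unsupported model, else None."""
    tags = find_sound_tags(text)
    if tags and model != SOUND_TAG_MODEL:
        return (f"{len(tags)} sound tag(s) found, but model {model!r} "
                f"is not {SOUND_TAG_MODEL!r}")
    return None


if __name__ == "__main__":
    sample = "Hello(sighs)... what a beautiful day(laughs)!"
    print(find_sound_tags(sample))
    print(check_sound_tags(sample, "speech-2.6-turbo"))
```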