Video Narrator

Name: Video Narrator
Rating: 1

Generate SenseAudio TTS narration tracks for videos, including timestamped segments, style variants, and editor-ready voiceover exports. Use when users need voiceovers, video narration, timed commentary, or accessibility narration.

by @scikkk·MIT-0

Tags: Developer Tools, Code Generation, System Tools, Video Processing, WeChat

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime Dependencies

No special dependencies

Install Command

Official: npx clawhub@latest install video-narrator

Mirror: npx clawhub@latest install video-narrator --registry https://cn.longxiaskill.com

Skill Documentation

SenseAudio Video Narrator

Create professional narration audio for videos with timing-aware segmentation, natural delivery, and editor-friendly exports.

What This Skill Does

- Generate narration audio synchronized to script timestamps
- Match narration style to the video genre, such as documentary or tutorial
- Control pacing with official TTS parameters and text break markers
- Create multiple narration takes with different voices or styles
- Export audio segments and merged narration tracks for editing workflows

Credentials and Dependency Rules

- Read the API key from SENSEAUDIO_API_KEY.
- Send auth only in the Authorization header, as a Bearer token.
- Do not place API keys in query parameters, logs, or saved examples.
- If Python helpers are used, this skill expects python3, requests, and pydub.
- pydub is used only for optional local audio assembly and mixing.

Official TTS Constraints

Use the official SenseAudio TTS rules summarized below:

- HTTP endpoint: POST https://api.senseaudio.cn/v1/t2a_v2
- Model: SenseAudio-TTS-1.0
- Max text length per request: 10000 characters
- voice_setting.voice_id is required
- voice_setting.speed range: 0.5-2.0
- voice_setting.pitch range: -12 to 12
- Optional audio formats: mp3, wav, pcm, flac
- Optional sample rates: 8000, 16000, 22050, 24000, 32000, 44100
- Optional MP3 bitrates: 32000, 64000, 128000, 256000
- Optional channels: 1 or 2
- extra_info.audio_length returns segment duration in milliseconds
- Inline break markup is supported in text

Recommended Workflow

1. Prepare the script:
   - Split narration into timestamped segments.
   - Keep each segment comfortably below the 10000 character limit.
2. Choose a voice and pacing profile:
   - Pick a voice_id and tune speed, pitch, and optional vol.
   - Use shorter segments when timing precision matters.
3. Generate audio segments:
   - Call the TTS API for each segment.
   - Decode data.audio from hex before saving.
   - Capture extra_info.audio_length for timeline metadata.
4. Assemble the narration track locally:
   - Use pydub to position clips on a silent master track.
   - Keep per-segment files for easier editor import and retiming.
5. Validate timing against the video:
   - Leave small gaps when natural pacing is needed.
   - Adjust segment boundaries instead of overusing extreme speed values.

Minimal Timed Narration Helper

    import binascii
    import os
    import re

    import requests

    # Read the key from the environment; never hardcode it.
    API_KEY = os.environ["SENSEAUDIO_API_KEY"]
    API_URL = "https://api.senseaudio.cn/v1/t2a_v2"

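    def parse_timed_script(script):
        """Split a script of "[HH:MM:SS] text" blocks into timestamped segments."""
        # A segment runs until the next line that starts with "[" or the end of the script.
        pattern = r"\[(\d{2}):(\d{2}):(\d{2})\]\s*(.+?)(?=\n\[|\Z)"
        segments = []
        for match in re.finditer(pattern, script, re.DOTALL):
            hours, minutes, seconds, text = match.groups()
            # Convert HH:MM:SS to milliseconds from the start of the video.
            timestamp_ms = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000
            segments.append({"timestamp": timestamp_ms, "text": text.strip()})
        return segments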

    def synthesize_segment(text, voice_id, speed=1.0, pitch=0, vol=1.0):
        """Synthesize one narration segment and return its audio bytes and duration."""
        response = requests.post(
            API_URL,
            headers={
                "Authorization": f"Bearer {API_KEY}",
                "Content-Type": "application/json",
            },
            json={
                "model": "SenseAudio-TTS-1.0",
                "text": text,
                "stream": False,
                "voice_setting": {
                    "voice_id": voice_id,
                    "speed": speed,
                    "pitch": pitch,
                    "vol": vol,
                },
                "audio_setting": {
                    "format": "mp3",
                    "sample_rate": 32000,
                    "bitrate": 128000,
                    "channel": 2,
                },
            },
            timeout=60,
        )
        response.raise_for_status()
        data = response.json()
        return {
            # data.audio arrives hex-encoded; decode it to raw bytes before saving.
            "audio_bytes": binascii.unhexlify(data["data"]["audio"]),
            # Segment duration in milliseconds, useful for timeline metadata.
            "duration_ms": data["extra_info"]["audio_length"],
            "trace_id": data.get("trace_id"),
        }
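Since each rejected request costs a round trip, it can help to check a segment against the documented limits before calling the API. The following is a minimal sketch under that assumption; `check_tts_request` and the constant names are hypothetical helpers, not part of the SenseAudio API, and the limits are copied from the constraints listed in this document.

```python
# Limits taken from the constraints listed in this document.
MAX_TEXT_CHARS = 10000
SPEED_RANGE = (0.5, 2.0)
PITCH_RANGE = (-12, 12)


def check_tts_request(text, voice_id, speed=1.0, pitch=0):
    """Return a list of problems; an empty list means the request is within limits.

    Hypothetical pre-flight helper: it only checks local values and never
    contacts the API.
    """
    problems = []
    if not voice_id:
        problems.append("voice_id is required")
    if len(text) > MAX_TEXT_CHARS:
        problems.append(
            f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}"
        )
    if not SPEED_RANGE[0] <= speed <= SPEED_RANGE[1]:
        problems.append(
            f"speed {speed} is outside {SPEED_RANGE[0]}-{SPEED_RANGE[1]}"
        )
    if not PITCH_RANGE[0] <= pitch <= PITCH_RANGE[1]:
        problems.append(
            f"pitch {pitch} is outside {PITCH_RANGE[0]} to {PITCH_RANGE[1]}"
        )
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in a long script at once, which tends to be more useful when preparing many segments.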

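Local Assembly Pattern

    from pydub import AudioSegment

    def create_synced_narration(audio_segments, video_duration_ms):
        """Place each clip at its timestamp on a silent track as long as the video."""
        narration_track = AudioSegment.silent(duration=video_duration_ms)
        for segment in audio_segments:
            clip = AudioSegment.from_file(segment["file"])
            # overlay keeps the base track's length, so audio past the video end is cut off.
            narration_track = narration_track.overlay(clip, position=segment["timestamp"])
        return narration_track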
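Before assembling the track, the timing validation step of the workflow can be approximated with plain arithmetic on start times and the durations reported by the API. This is a sketch only: `find_timing_conflicts` is a hypothetical helper, and it assumes each segment dict carries `timestamp` (start, in ms) and `duration_ms`.

```python
def find_timing_conflicts(segments, video_duration_ms):
    """Report segments that run into the next segment or past the end of the video.

    Each segment is a dict with "timestamp" (start, ms) and "duration_ms".
    """
    conflicts = []
    ordered = sorted(segments, key=lambda s: s["timestamp"])
    for index, segment in enumerate(ordered):
        end_ms = segment["timestamp"] + segment["duration_ms"]
        if index + 1 < len(ordered):
            # A middle segment must finish before the next one starts.
            limit_ms = ordered[index + 1]["timestamp"]
            kind = "overlaps_next"
        else:
            # The last segment must finish before the video does.
            limit_ms = video_duration_ms
            kind = "past_video_end"
        if end_ms > limit_ms:
            conflicts.append(
                {
                    "timestamp": segment["timestamp"],
                    "kind": kind,
                    "overrun_ms": end_ms - limit_ms,
                }
            )
    return conflicts
```

A reported overrun is a cue to shorten the text or move the segment boundary, in line with the advice to avoid extreme speed values.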

Style Presets

- Documentary: slower speed such as 0.95, neutral pitch
- Tutorial: speed near 1.0, slightly warmer pitch
- Commercial: modestly faster speed, slightly higher pitch

Prefer conservative tuning and script editing over extreme voice parameter changes.
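The presets above could be captured as a small lookup table that feeds the voice_setting object. In this sketch only the documentary speed of 0.95 and the tutorial speed of 1.0 come from the text; the other numbers are illustrative assumptions, including reading "warmer" as a slightly lower pitch. `STYLE_PRESETS` and `voice_setting_for` are hypothetical names.

```python
# Illustrative values: only documentary speed 0.95 and tutorial speed 1.0
# come from the document. The remaining numbers are assumptions.
STYLE_PRESETS = {
    "documentary": {"speed": 0.95, "pitch": 0},
    "tutorial": {"speed": 1.0, "pitch": -1},   # assumes "warmer" means slightly lower
    "commercial": {"speed": 1.1, "pitch": 1},  # assumed "modestly faster, slightly higher"
}


def voice_setting_for(style, voice_id, vol=1.0):
    """Build a voice_setting dict for a style, clamped to the documented ranges."""
    preset = STYLE_PRESETS[style]
    # Clamp to speed 0.5-2.0 and pitch -12 to 12 in case a preset is edited carelessly.
    speed = min(max(preset["speed"], 0.5), 2.0)
    pitch = min(max(preset["pitch"], -12), 12)
    return {"voice_id": voice_id, "speed": speed, "pitch": pitch, "vol": vol}
```

Keeping the presets close to neutral leaves room to fix timing by editing the script rather than by pushing speed or pitch.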

Output Options

- Per-segment narration clips in mp3 or wav
- Timing metadata in json
- Merged narration track for video editors
- Optional alternate takes with different styles

Safety Notes

- Do not hardcode credentials.
- Do not assume local media tooling exists beyond what is declared here.
- Treat returned trace_id and generated narration assets as potentially sensitive production data.
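For the timing metadata output, one possible shape is a JSON document listing each clip's file, start, and duration. The field names below are assumptions rather than a format defined by SenseAudio or any editor, and `build_timing_metadata` is a hypothetical helper. The trace_id is left out on purpose, given the safety note about treating it as sensitive.

```python
import json


def build_timing_metadata(segments):
    """Serialize per-segment timing as a JSON string.

    Each segment is a dict with "file", "timestamp" (start, ms), and "duration_ms".
    """
    entries = []
    for index, segment in enumerate(segments, start=1):
        entries.append(
            {
                "index": index,
                "file": segment["file"],
                "start_ms": segment["timestamp"],
                "duration_ms": segment["duration_ms"],
                "end_ms": segment["timestamp"] + segment["duration_ms"],
            }
        )
    return json.dumps({"segments": entries}, indent=2)
```

Writing the returned string next to the per-segment clips gives an editor or a later script enough information to rebuild the timeline without calling the API again.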
