🎙️ Audio To Subtitle: Skill Tool

v1.0.0

Turn a 3-minute podcast audio clip into 1080p captioned videos just by typing what you need. Whether it's adding subtitles to audio recordings or videos or q...

by @vcarolxhberger·MIT-0

License

MIT-0

Last updated

2026/4/18

Security scan

VirusTotal

Pending

OpenClaw

Safe

high confidence

The skill's requests and runtime instructions line up with its stated purpose (upload audio to a cloud render/transcribe service) and it only asks for a single service token; nothing requested appears disproportionate or unrelated.

Security is layered; review the code before running it.

MIT-0 terms: free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Version

latest · v1.0.0 · 2026/4/18

Audio to Subtitle: Initial Release
- Launches the ability to convert audio or video files into 1080p captioned videos via a streamlined cloud pipeline.
- Automated setup with anonymous token generation and session management; notifies the user when ready.
- Supports audio uploads, subtitle generation, and export in under a minute for typical clips.
- Handles user prompts for export, credits, status, and more through keyword and intent matching.
- Enforces strict attribution headers and displays clear error messages for credits, file types, and other common issues.
- Provides robust handling of SSE events and GUI-like instruction mapping for smooth, conversational workflows.

● Pending

Install command

Official: npx clawhub@latest install audio-to-subtitle

🇨🇳 Mirror (faster in China): npx clawhub@latest install audio-to-subtitle --registry https://cn.longxiaskill.com

Skill documentation

Getting Started

Share your audio files and I'll get started on subtitle generation. Or just tell me what you're thinking.

Try saying:

"convert my audio files"
"export 1080p MP4"
"transcribe the audio and add subtitles"

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

Generate a UUID as the client identifier
POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header
Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry)

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.
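The setup steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the helper names (`anonymous_token_request`, `session_request`, `send`, `get_token`) are hypothetical, the host is assumed to read `mega-api-prod.nemovideo.ai`, and nothing about the response is assumed beyond the `data.token` field mentioned above.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"  # assumed spelling of the base URL


def anonymous_token_request(client_id: str) -> dict:
    """Describe the POST that requests an anonymous token."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
        "body": None,
    }


def session_request(token: str) -> dict:
    """Describe the POST that creates a session (Bearer auth)."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": {"task_name": "project"},
    }


def send(req: dict) -> dict:
    """Send a described request and decode the JSON reply (needs network access)."""
    data = json.dumps(req["body"]).encode() if req["body"] is not None else b""
    http_req = urllib.request.Request(
        req["url"], data=data, headers=req["headers"], method=req["method"]
    )
    with urllib.request.urlopen(http_req, timeout=30) as resp:
        return json.load(resp)


def get_token() -> str:
    """Reuse NEMO_TOKEN from the environment, else request an anonymous one."""
    token = os.environ.get("NEMO_TOKEN")
    if token:
        return token
    reply = send(anonymous_token_request(str(uuid.uuid4())))
    return reply["data"]["token"]
```

Building the request as a plain dictionary before sending keeps the token out of any log line unless you print it deliberately, which fits the instruction not to expose tokens. The attribution headers required on every request are left out here for brevity.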

# Audio to Subtitle: Convert Audio into Captioned Videos

This tool takes your audio files and runs subtitle generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute podcast audio clip and want to transcribe the audio and add subtitles in English: the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: Cleaner audio with less background noise produces more accurate subtitles.

Matching Input to Actions

User prompts referencing audio to subtitle, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 Status	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
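The routing table above could be approximated with a simple keyword matcher. This is only a sketch: the real skill is described as combining keywords with intent classification, which is not reproduced here, and the function name `route` and the action labels are hypothetical.

```python
# Keyword groups copied from the table; each maps to an action that skips SSE.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("status", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message: str, has_attachment: bool = False) -> tuple:
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # A file sent by the user is treated as an upload regardless of wording.
        return ("upload", True)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through the SSE endpoint.
    return ("sse", False)
```

For example, `route("Export 1080p MP4")` selects the export action, while a request such as "transcribe the audio and add subtitles" matches no keyword and falls through to SSE.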

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-specific compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

BASE URL: https://mega-api-prod.nemovideo.ai

Endpoint	Method	Purpose
`/api/tasks/me/with-session/nemo_agent`	POST	Start a new editing session. Body: `{"task_name":"project","language":""}`. Returns `session_id`.
`/run_sse`	POST	Send a user message. Body includes `app_name`, `session_id`, `new_message`. Stream the response with `Accept: text/event-stream`. Timeout: 15 min.
`/api/upload-video/nemo_agent/me/`	POST	Upload a file (multipart) or URL.
`/api/credits/balance/simple`	GET	Check remaining credits (`available`, `frozen`, `total`).
`/api/state/nemo_agent/me//latest`	GET	Fetch current timeline state (`draft`, `video_infos`, `generated_media`).
`/api/render/proxy/lambda`	POST	Start export. Body: `{"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}`. Poll status every 30s.
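The table says to poll the export status every 30 seconds but does not give the status endpoint or the shape of its reply. The sketch below therefore takes the status lookup as a caller-supplied function; `fetch_status`, the `status` field, its `done` and `failed` values, and the `url` field are all hypothetical placeholders.

```python
import time
from typing import Callable, Optional


def wait_for_export(
    fetch_status: Callable[[], dict],
    sleep: Callable[[float], None] = time.sleep,
    interval_s: float = 30,
    max_polls: int = 10,
) -> Optional[str]:
    """Poll until the render finishes; return the download URL or None."""
    for attempt in range(max_polls):
        reply = fetch_status()
        if reply.get("status") == "done":  # hypothetical field and value
            return reply.get("url")
        if reply.get("status") == "failed":  # hypothetical failure marker
            return None
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next poll
    return None
```

Passing `sleep` as a parameter makes the loop testable without real delays, and `max_polls` bounds the wait so an orphaned job cannot block the conversation indefinitely.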

Accepted file types: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is audio-to-subtitle, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

All requests must include: Authorization: Bearer followed by the NEMO_TOKEN, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
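A small helper can assemble the required headers in one place. This is a sketch under assumptions: `detect_platform` and `build_headers` are hypothetical names, the directory and platform names are assumed to be lowercase, and the version is passed in by the caller rather than parsed from the YAML frontmatter.

```python
import os


def detect_platform(install_path: str) -> str:
    """Map an install path to a platform name, following the rules above."""
    path = os.path.expanduser(install_path).replace("\\", "/")
    home = os.path.expanduser("~").replace("\\", "/").rstrip("/")
    if path.startswith(home + "/.clawhub/"):
        return "clawhub"
    if path.startswith(home + "/.cursor/skills/"):
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Return the auth and attribution headers sent with every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "audio-to-subtitle",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Calling `build_headers` once per session and reusing the result makes it harder to forget an attribution header on the export request.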

Error Codes

0: success, continue normally
1001: token expired or invalid; re-acquire via /api/auth/anonymous-token
1002: session not found; create a new one
2001: out of credits; anonymous users get a registration link with ?bind=, registered users top up
4001: unsupported file type; show accepted formats
4002: file too large; suggest compressing or trimming
400: missing X-Client-Id; generate one and retry
402: free plan export blocked; not a credit issue, subscription tier
429: rate limited; wait 30s and retry once
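One way to keep these codes manageable is a lookup table plus a single-retry helper for rate limiting. The sketch below is illustrative only: the step labels, `next_step`, and `call_with_retry` are hypothetical names, and the helper assumes the caller can reduce each API reply to its numeric code.

```python
import time
from typing import Callable

# Recovery step for each code listed above (labels are illustrative).
RECOVERY = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def next_step(code: int) -> str:
    """Look up what to do for a given code; unknown codes are reported."""
    return RECOVERY.get(code, "report_unknown_error")


def call_with_retry(
    call: Callable[[], int],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run call(); on 429 wait 30 seconds and retry exactly once."""
    code = call()
    if code == 429:
        sleep(30)
        code = call()
    return code
```

Retrying only once mirrors the rule for 429 above; a second 429 is returned to the caller instead of looping.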

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
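The event handling above might look like the following sketch. The payload format is not specified in this document, so the JSON shape with `type` and `text` fields is purely an assumption, as are the function names and the `poll_state` callback that stands in for the session-state request.

```python
import json
from typing import Callable, Iterable


def collect_text(lines: Iterable[str]) -> str:
    """Gather user-facing text from SSE lines, skipping heartbeats and tool traffic."""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # event names, comments, blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "heartbeat":
            continue  # heartbeat or empty data: keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not JSON; ignored in this sketch
        # Hypothetical shape: {"type": "text", "text": "..."}
        if isinstance(event, dict) and event.get("type") == "text":
            chunks.append(event.get("text", ""))
    return "".join(chunks)


def handle_stream(lines: Iterable[str], poll_state: Callable[[], str]) -> str:
    """Return the streamed text, or fall back to a state summary if none arrived."""
    text = collect_text(lines)
    if text:
        return text
    return "No text returned. Current state: " + poll_state()
```

The fallback branch covers the case where an edit produces no text: the caller supplies `poll_state` to fetch and summarize the session state.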

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"export" or "导出" → run the export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
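A text summary like the one above could be produced from the draft's short field names. The sketch assumes a nesting that this document does not confirm: `t` holds a list of tracks, each track has `tt` and a list `sg` of segments, and each segment has a duration `d` in milliseconds. The function name `summarize_draft` is hypothetical.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}  # tt values from the field mapping


def summarize_draft(draft: dict) -> str:
    """Render a short text summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        segments = track.get("sg", [])
        # d is in milliseconds; convert the total to seconds for display.
        total_s = sum(seg.get("d", 0) for seg in segments) / 1000
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_s:g}s")
    return "\n".join(lines)
```

Labels such as clip names or volume levels would come from the `m` metadata field, whose contents are not described here, so the sketch leaves them out.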

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "transcribe the audio and add subtitles in English": concrete instructions get better results.

Max file size is 500MB. Stick to MP3, MP4, WAV, M4A for the smoothest experience.

Export as MP4 for widest compatibility across platforms.

Common Workflows

Quick edit: Upload → "transcribe the audio and add subtitles in English" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
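The file-type and size limits can be checked locally before uploading, which avoids a round trip that would end in error 4001 or 4002. This is a sketch: `check_upload` is a hypothetical helper, and "500MB" is assumed to mean 500 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import Optional

ACCEPTED_TYPES = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 500 * 1024 * 1024  # assumption: 500MB interpreted as 500 MiB


def check_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return None if the file looks acceptable, else a user-facing message."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in ACCEPTED_TYPES:
        return "Unsupported file type. Accepted: " + ", ".join(sorted(ACCEPTED_TYPES))
    if size_bytes > MAX_BYTES:
        return "File too large (max 500MB). Try compressing or trimming it."
    return None
```

For instance, `check_upload("episode.MP3", 1024)` returns `None`, while a `.txt` file yields the list of accepted formats.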

Data source: ClawHub · Chinese localization: 龙虾技能库
