🎙️ Audio To Subtitle: Skill Tool
v1.0.0
Turn a 3-minute podcast audio clip into 1080p captioned videos just by typing what you need. Whether it's adding subtitles to audio recordings or videos or q...
Version
Audio to Subtitle: Initial Release
- Launches the ability to convert audio or video files into 1080p captioned videos via a streamlined cloud pipeline.
- Automated setup with anonymous token generation and session management; notifies user when ready.
- Supports audio uploads, subtitle generation, and export in under a minute for typical clips.
- Handles user prompts for export, credits, status, and more through keyword and intent matching.
- Enforces strict attribution headers and displays clear error messages for credits, file types, and other common issues.
- Provides robust handling of SSE events and GUI-like instruction mapping for smooth, conversational workflows.
Getting Started
Share your audio files and I'll get started on subtitle generation. Or just tell me what you're thinking.
Try saying:
- "convert my audio files"
- "export 1080p MP4"
- "transcribe the audio and add subtitles"
Quick Start Setup
This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").
Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:
- Generate a UUID as the client identifier
- POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header
- Extract data.token from the response. This is your NEMO_TOKEN (100 free credits, 7-day expiry)
Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.
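The setup steps above can be sketched as a small planner that lists the HTTP requests to make, without sending anything. This is a minimal illustration, not the skill's actual implementation: the function name `plan_setup` and the dictionary layout are hypothetical, the host and paths are the ones read from this document, and the attribution headers required on every request are left out to keep the sketch short.

```python
import uuid

# Host and paths as read from this document.
BASE_URL = "https://mega-api-prod.nemovideo.ai"


def plan_setup(env):
    """Return the ordered requests needed before the first operation.

    `env` is any mapping of environment variables (for example os.environ).
    Nothing is sent; each step is a plain dict describing one request.
    """
    steps = []
    token = env.get("NEMO_TOKEN")
    if not token:
        # No token yet: request an anonymous one with a fresh client id.
        steps.append({
            "method": "POST",
            "url": f"{BASE_URL}/api/auth/anonymous-token",
            "headers": {"X-Client-Id": str(uuid.uuid4())},
        })
        # Placeholder: the real value is data.token from that response.
        token = "<data.token>"
    steps.append({
        "method": "POST",
        "url": f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {"task_name": "project"},
    })
    return steps
```

With a token already in the environment the plan has a single step (session creation); otherwise the anonymous-token request comes first.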
# Audio to Subtitle: Convert Audio into Captioned Videos
This tool takes your audio files and runs subtitle generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.
Say you have a 3-minute podcast audio clip and want to transcribe the audio and add subtitles in English. The backend processes it in about 30-60 seconds and hands you a 1080p MP4.
Tip: Cleaner audio with less background noise produces more accurate subtitles.
Matching Input to Actions
User prompts referencing audio to subtitle, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 Status | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
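The routing table can be approximated with plain keyword matching. The sketch below is only an illustration: the `route` function and the action names are hypothetical, substring matching is a deliberate simplification, and the intent-classification part mentioned above is not modelled.

```python
# Keyword groups in the same order as the table; the first match wins.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "status"),
    (("upload", "上传"), "upload"),
]


def route(prompt, has_file=False):
    """Return (action, skip_sse) for a user prompt."""
    if has_file:
        # A file attachment always means upload, whatever the text says.
        return ("upload", True)
    text = prompt.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

For example, `route("Export 1080p MP4")` resolves to the export action and skips SSE, while `route("add BGM")` falls through to the SSE path.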
Cloud Render Pipeline Details
Each export job queues on a cloud GPU node that composites video layers, applies platform-specific compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.
BASE URL: https://mega-api-prod.nemovideo.ai
| Endpoint | Method | Purpose |
|---|---|---|
| /api/tasks/me/with-session/nemo_agent | POST | Start a new editing session. Body: {"task_name":"project","language":"…. Returns session_id. |
| /run_sse | POST | Send a user message. Body includes app_name, session_id, new_message. Stream the response with Accept: text/event-stream. Timeout: 15 min. |
| /api/upload-video/nemo_agent/me/ | POST | Upload a file (multipart) or URL. |
| /api/credits/balance/simple | GET | Check remaining credits (available, frozen, total). |
| /api/state/nemo_agent/me/ | GET | Fetch current timeline state (draft, video_infos, generated_media). |
| /api/render/proxy/lambda | POST | Start export. Body: {"id":"render_…. Poll status every 30s. |
Headers are derived from this file's YAML frontmatter. X-Skill-Source is audio-to-subtitle, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.ClawHub/ = ClawHub, ~/.cursor/skills/ = cursor, otherwise unknown).
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
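One way to assemble these headers is shown below. It is a sketch under stated assumptions: `detect_platform` and `build_headers` are hypothetical names, the path check is case-insensitive by choice, and the platform values use the casing written in this document.

```python
from pathlib import PurePosixPath


def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    parts = [part.lower() for part in PurePosixPath(install_path).parts]
    if ".clawhub" in parts:
        return "ClawHub"
    # ".cursor" must be directly followed by "skills".
    for index, part in enumerate(parts[:-1]):
        if part == ".cursor" and parts[index + 1] == "skills":
            return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the headers every request is expected to carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "audio-to-subtitle",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the dictionary in one place makes it harder to send an export request without the attribution headers, which is the case the 402 note above warns about.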
Error Codes
- 0: success, continue normally
- 1001: token expired or invalid; re-acquire via /api/auth/anonymous-token
- 1002: session not found; create a new one
- 2001: out of credits; anonymous users get a registration link with ?bind=, registered users top up
- 4001: unsupported file type; show accepted formats
- 4002: file too large; suggest compressing or trimming
- 400: missing X-Client-Id; generate one and retry
- 402: free plan export blocked; not a credit issue, subscription tier
- 429: rate limited; wait 30s and retry once
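The code list above maps naturally onto a lookup table. The following is a minimal sketch: the action labels and the `next_action` helper are hypothetical names chosen for illustration, and only the codes listed here are covered.

```python
# Hypothetical action labels, one per code listed above.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "prompt_for_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry",
}


def next_action(code, already_retried=False):
    """Pick the follow-up for a response code."""
    if code == 429 and already_retried:
        # The list says to retry once only, so a second 429 is reported.
        return "report_rate_limit"
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 separate from 2001 matters because the first is a subscription-tier block and the second is a credit shortage, so the user-facing message differs.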
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
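The heartbeat row can be illustrated with a line classifier and a small timer. This is a sketch only: the payload format of real events is not described here, treating SSE comment lines (those starting with a colon) as heartbeats is an assumption, and `classify_line` and `WaitNotifier` are hypothetical names.

```python
def classify_line(line):
    """Classify one raw SSE line as (kind, payload)."""
    line = line.rstrip("\r\n")
    if not line:
        return ("separator", "")      # blank line ends an event
    if line.startswith(":"):
        return ("heartbeat", "")      # assumption: comments are keep-alives
    if line.startswith("data:"):
        payload = line[len("data:"):].strip()
        return ("data", payload) if payload else ("heartbeat", "")
    return ("other", line)


class WaitNotifier:
    """Decides when a "Still working..." notice is due (every 2 minutes)."""

    def __init__(self, interval_s=120.0):
        self.interval_s = interval_s
        self.last = None

    def due(self, now):
        """Return True if a notice should be shown at time `now` (seconds)."""
        if self.last is None:
            self.last = now           # start the clock on the first call
            return False
        if now - self.last >= self.interval_s:
            self.last = now
            return True
        return False
```

A caller would feed each received line to `classify_line` and call `due(time.monotonic())` whenever a heartbeat arrives, showing the waiting message only when it returns True.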
Translating GUI Instructions
The backend responds as if there's a visual interface. Map its instructions to API calls:
- "click" or "点击" → execute the action via the relevant endpoint
- "open" or "打开" → query session state to get the data
- "drag/drop" or "拖拽" → send the edit command through SSE
- "preview in timeline" → show a text summary of current tracks
- "export" or "导出" → run the export workflow
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
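A text summary like the one above could be produced from the short draft field names. The sketch below assumes a nesting that this document does not spell out (segments listed under each track, a duration on each segment), and `summarize_draft` is a hypothetical helper, so treat it as an illustration of the field mapping only.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Render a short text summary of the tracks in a draft dict.

    Assumed layout: draft["t"] is a list of tracks, each with "tt" (type)
    and "sg" (segments), and each segment has "d" (duration in ms).
    """
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for number, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{number}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return "\n".join(lines)
```

Unknown track type codes are reported as "unknown" rather than raising, since only three codes are documented.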
Tips and Tricks
The backend processes faster when you're specific. Instead of "make it look better", try "transcribe the audio and add subtitles in English". Concrete instructions get better results.
Max file size is 500MB. Stick to MP3, MP4, WAV, M4A for the smoothest experience.
Export as MP4 for widest compatibility across platforms.
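A local check before uploading can catch the two limits mentioned here. This is a sketch with assumptions: `preflight` is a hypothetical helper, "500MB" is read as 500 × 1024 × 1024 bytes, and the four listed formats are treated as the recommended set rather than the complete list the backend accepts.

```python
import os

MAX_BYTES = 500 * 1024 * 1024          # assumption: MB read as MiB
RECOMMENDED_EXTENSIONS = {".mp3", ".mp4", ".wav", ".m4a"}


def preflight(filename, size_bytes):
    """Return a list of problems found before upload (empty if none)."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in RECOMMENDED_EXTENSIONS:
        problems.append("format")
    if size_bytes > MAX_BYTES:
        problems.append("size")
    return problems
```

An empty list means the file is within the stated limits; "format" and "size" correspond to the situations behind error codes 4001 and 4002.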
Common Workflows
Quick edit: Upload → "transcribe the audio and add subtitles in English" → Download MP4. Takes 30-60 seconds for a 30-second clip.
Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.
Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.