Douyin Video Fast Transcription (Skill Tool)

Name: Douyin Video Fast Transcription (Skill Tool)
Author: btboy773

v1.0.0

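Fast Douyin video transcription (optimized edition). The user sends a Douyin link and the skill extracts the spoken script automatically. Features: local Whisper transcription, no API key, zero cost, high privacy. Trigger words: 抖音 (Douyin), 转文字 (transcribe), 提取文案 (extract script), 视频转录 (video transcription)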

Categories: API tools, automation, developer tools

License

MIT-0

Last updated

2026/3/25

Security scan

VirusTotal: Suspicious

OpenClaw: Suspicious (medium confidence)

The skill's stated purpose (local transcription of Douyin videos) matches what it does, but there are notable implementation issues and security risks (missing declared dependency on mcporter and unsafe shell command usage with user-controlled inputs) that you should understand before installing.

Assessment and recommendations

This skill appears to do what it claims (local Whisper transcription), but proceed with caution. Key points:
- The script calls 'mcporter' (douyin-mcp) to parse Douyin links but the SKILL.md didn't list mcporter as a required tool. Make sure you install and trust mcporter before use.
- The script builds shell commands with user-supplied values (share links and extracted URLs) and runs them with shell=True. That is a real command-injection risk if you or others pass crafted inputs. Prefer runnin...

Detailed analysis

ℹ Purpose and capabilities

The name/description (local Whisper transcription of Douyin videos) aligns with the code and instructions: it uses douyin-mcp to get a video URL, ffmpeg to extract audio, and local Whisper to transcribe. Minor inconsistency: the SKILL.md pre-requisites list Python, ffmpeg, and openai-whisper but omit the required 'mcporter' tool (used to call douyin-mcp) which is necessary for URL extraction.

⚠ Instruction scope

The SKILL.md and script instruct running shell commands that include user-provided data (share links and extracted video URLs). The Python script constructs shell commands (via subprocess.run with shell=True) embedding these values without escaping or sanitization, which opens the door to command injection if input is malicious or crafted. Aside from that, instructions stay within the transcription purpose and do not attempt to read unrelated system secrets or send outputs to hidden endpoints.
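
The injection risk described above comes from letting a shell parse a string that contains user input. Below is a minimal Python sketch of the usual mitigation: pass the arguments as a list so that no shell is involved. The helper name `echo_argument` and the sample payload are illustrative assumptions and are not taken from the skill's script.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass `value` to a child process as a single argv entry (no shell)."""
    # With an argument list and the default shell=False, characters such as
    # `;`, `&`, or `$()` inside `value` are never interpreted as shell syntax.
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.rstrip("\r\n")


if __name__ == "__main__":
    hostile = 'https://v.douyin.com/x/"; echo INJECTED; "'
    # The child process receives the string unchanged, as plain data.
    print(echo_argument(hostile))
```

The same pattern applies to any external command: build a list such as `["ffmpeg", "-i", url, ...]` and leave `shell=True` out.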

✓ Install mechanism

No install spec (instruction-only + single helper script) — nothing is downloaded or written automatically by an installer. Dependencies are managed manually (ffmpeg, whisper, mcporter). This is lower risk from an install-source perspective, but the README omission of mcporter is an operational gap.

✓ Credential requirements

The skill declares no credentials or environment variables and does not request unrelated secrets. It writes transcripts into a directory under the user's home (~/.openclaw/workspace/douyin-transcripts), which is reasonable for this function but worth noting as persisted data on disk.

✓ Persistence and privileges

The skill is not marked 'always:true' and uses normal autonomous invocation defaults. It does not attempt to modify other skill configurations or require elevated privileges.

Security is layered; review the code before running it.

MIT-0 terms: free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/3/25

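- Initial release: fast local transcription of Douyin videos.
- Accepts a Douyin link or a video file from the user, extracts the audio automatically, and transcribes it with local Whisper; no API key needed.
- Depends on Python, ffmpeg, and openai-whisper (installed locally, no external requests, privacy-preserving).
- Offers a choice of models (tiny/base/small) to trade transcription speed against quality for short, medium, and long videos.
- Zero cost and usable offline; suited to privacy-sensitive and network-restricted environments.
- Includes a detailed installation guide, usage flow, and troubleshooting for common problems.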

Install command

Official: npx clawhub@latest install douyin-transcribe-fast

Mirror: npx clawhub@latest install douyin-transcribe-fast --registry https://cn.clawhub-mirror.com

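Skill documentation

Local Whisper transcription: no API key, zero cost, high privacy.

Prerequisite check

Before use, make sure the following tools are installed: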

1. Python 3.8+

python --version

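2. FFmpeg (audio processing)

ffmpeg -version

Not installed? On Windows: winget install Gyan.FFmpeg

3. OpenAI Whisper (local transcription)

pip install openai-whisper

Step 1 below also calls mcporter to reach douyin-mcp, so mcporter must be installed too; this page gives no install command for it.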
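
The checks above can also be scripted. Here is a minimal Python sketch that looks each tool up on PATH; the executable names are assumed from the commands on this page, and `mcporter` is included only because step 1 calls it.

```python
import shutil

# Executable names assumed from the commands shown on this page.
REQUIRED_TOOLS = ("python", "ffmpeg", "whisper", "mcporter")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.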

Usage

Method 1: Douyin link

The user sends a Douyin share message, for example:

2.89 03/17 zTl:/ n@d.nq 真正赚钱的人到底怎么用 AI？ https://v.douyin.com/D4SVbwCEY6g/
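
A share message like the one above wraps the link in other text, so the link has to be pulled out first. Here is a minimal Python sketch, assuming the link always has the short `https://v.douyin.com/<id>/` shape seen in the sample; other link formats are not handled, and `extract_share_link` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed pattern: short links of the form https://v.douyin.com/<id>/,
# as in the sample message above.
SHARE_LINK_RE = re.compile(r"https://v\.douyin\.com/[A-Za-z0-9_-]+/?")


def extract_share_link(message: str) -> Optional[str]:
    """Return the first v.douyin.com short link in `message`, or None."""
    match = SHARE_LINK_RE.search(message)
    return match.group(0) if match else None


if __name__ == "__main__":
    sample = (
        "2.89 03/17 zTl:/ n@d.nq 真正赚钱的人到底怎么用 AI？ "
        "https://v.douyin.com/D4SVbwCEY6g/"
    )
    print(extract_share_link(sample))
```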

Steps:

Step 1: Parse the video info

Use douyin-mcp to get the video download URL:

mcporter call douyin-mcp.parse_douyin_video_info share_link="<Douyin link>"
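
Because the share link is user input, it may be worth validating it before it reaches the command line. The Python sketch below accepts only https links on douyin.com hosts and builds the call above as an argument list. Both the validation rules and the helper names are assumptions for illustration; the `key=value` argument form is copied from the command above.

```python
from urllib.parse import urlparse


def is_douyin_link(link: str) -> bool:
    """Accept only https URLs on douyin.com or one of its subdomains."""
    # Reject whitespace and characters that have no place in a share link.
    if any(ch.isspace() or ch in "\"'`;|&$<>" for ch in link):
        return False
    try:
        parsed = urlparse(link)
    except ValueError:
        return False
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and (
        host == "douyin.com" or host.endswith(".douyin.com")
    )


def build_parse_command(share_link: str) -> list:
    """Build the argv for the mcporter call shown above, without a shell."""
    if not is_douyin_link(share_link):
        raise ValueError(f"not a Douyin link: {share_link!r}")
    return [
        "mcporter",
        "call",
        "douyin-mcp.parse_douyin_video_info",
        f"share_link={share_link}",
    ]
```

The resulting list can be handed to `subprocess.run(..., check=True)` with no `shell=True`.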

Step 2: Download the video (audio stream only)

ffmpeg -i "<video URL>" -vn -acodec pcm_s16le -ar 16000 -ac 1 "audio.wav" -y
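
The ffmpeg command above can be expressed as a Python argument list, which keeps the video URL out of any shell parsing. This is a sketch with hypothetical function names; the flags are the ones from the command above, with the global `-y` option moved to the front.

```python
import subprocess


def build_ffmpeg_command(video_url: str, wav_path: str = "audio.wav") -> list:
    """Mirror the ffmpeg command above as an argument list."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file without asking
        "-i", video_url,         # input: the URL returned in step 1
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-ar", "16000",          # 16 kHz sample rate
        "-ac", "1",              # mono
        wav_path,
    ]


def extract_audio(video_url: str, wav_path: str = "audio.wav") -> str:
    """Run ffmpeg and return the path of the WAV file it wrote."""
    subprocess.run(build_ffmpeg_command(video_url, wav_path), check=True)
    return wav_path
```

Whisper works on 16 kHz mono audio internally, which is why the command converts to that format up front.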

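Step 3: Transcribe locally with Whisper

whisper "audio.wav" --model tiny --language Chinese --output_format txt

💡 Optimization tips:
- The tiny model is the fastest (suits short videos)
- The base model balances speed and quality
- The small model gives the best quality of the three (suits long videos)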

Step 4: Return the result

Read the generated txt file and return its contents to the user.
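
Here is a minimal Python sketch of this step. It assumes the whisper CLI wrote a UTF-8 file named after the audio file (for example `audio.txt` for `audio.wav`) into its output directory, which defaults to the current directory; the function names are hypothetical.

```python
from pathlib import Path


def transcript_path(audio_path: str, output_dir: str = ".") -> Path:
    """Path where the whisper CLI is expected to write the txt output."""
    return Path(output_dir) / (Path(audio_path).stem + ".txt")


def read_transcript(audio_path: str, output_dir: str = ".") -> str:
    """Read the transcript and strip surrounding whitespace."""
    path = transcript_path(audio_path, output_dir)
    return path.read_text(encoding="utf-8").strip()
```

If the whisper command was run with a different `--output_dir`, pass the same directory as `output_dir` here.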

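Method 2: Local video file

The user sends a video file; go straight to steps 3-4. Whisper decodes its input through ffmpeg, so the video file can be passed to the whisper command directly.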

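Optimization strategies

🚀 Speed optimization

Strategy	Effect	When to use
Download only the audio stream	Reduces download time by 90%	All videos
Use the tiny model	CPU transcription in 1-2 minutes	Short videos (<3 minutes)
Use the base model	CPU transcription in 3-5 minutes	Medium videos (3-10 minutes)
Skip the video download	Extract the audio URL directly	Douyin web version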

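💰 Cost optimization

Zero API fees: local Whisper is completely free
No cloud transcription service: no Groq or OpenAI API is needed
Privacy: video and audio never leave the local machine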

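🛡️ Stability optimization

No browser dependency: avoids Douyin's anti-scraping measures and login issues
No third-party transcription API: avoids API limits and fees
Usable offline: once installed (and once the chosen model's weights have been downloaded on first use), transcription needs no network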

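Complete workflow

User sends a Douyin link
    ↓
Extract modal_id / video URL (via douyin-mcp)
    ↓
Download the audio stream (ffmpeg, ~1-5MB)
    ↓
Transcribe locally with Whisper (tiny/base/small model)
    ↓
Return the Chinese transcript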
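
The flow above can be sketched in Python as a small pipeline whose stages are supplied by the caller. This is an illustrative structure, not the skill's actual script; the stage names and the stand-in lambdas are assumptions.

```python
from typing import Callable


def run_pipeline(
    share_link: str,
    resolve_video_url: Callable[[str], str],
    extract_audio: Callable[[str], str],
    transcribe: Callable[[str], str],
) -> str:
    """Chain the workflow stages; each stage is supplied by the caller."""
    video_url = resolve_video_url(share_link)  # douyin-mcp stage
    wav_path = extract_audio(video_url)        # ffmpeg stage
    return transcribe(wav_path)                # Whisper stage


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any external tool.
    text = run_pipeline(
        "https://v.douyin.com/D4SVbwCEY6g/",
        resolve_video_url=lambda link: "https://example.invalid/video.mp4",
        extract_audio=lambda url: "audio.wav",
        transcribe=lambda wav: "(transcript placeholder)",
    )
    print(text)
```

Keeping the stages as parameters makes it possible to test the chaining without network access, ffmpeg, or Whisper.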

Total time:

Short videos (<3 minutes): 2-3 minutes
Medium videos (3-10 minutes): 5-8 minutes
Long videos (>10 minutes): 10-15 minutes

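Troubleshooting

Problem	Cause	Fix
douyin-mcp returns 403	Invalid API key	Check the `~/.cursor/mcp.json` configuration
ffmpeg not found	Not installed or not on PATH	Install ffmpeg and add it to the PATH environment variable
whisper not found	Not installed	Run `pip install openai-whisper`
Poor transcription quality	Model too small or audio unclear	Switch to the base or small model
Slow transcription	Insufficient CPU performance	Use the tiny model or upgrade the hardware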

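Model selection guide

Model	Speed	Quality	VRAM/RAM	Recommended for
tiny	⚡ Fastest	⭐⭐	~1GB	Short videos, quick previews
base	🚀 Fast	⭐⭐⭐	~1GB	Everyday use
small	🚗 Medium	⭐⭐⭐⭐	~2GB	High-quality needs
medium	🐢 Slow	⭐⭐⭐⭐⭐	~5GB	Professional use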
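
One way to apply the table is to pick the highest-quality model that fits in the available memory. Here is a minimal Python sketch; the memory figures are the approximate values from the table, and `best_model_for_memory` is a hypothetical helper name.

```python
# Approximate memory needs in GB, copied from the table above.
MODEL_MEMORY_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5}

# Ordered from highest to lowest quality, following the table's star ratings.
QUALITY_ORDER = ("medium", "small", "base", "tiny")


def best_model_for_memory(available_gb: float) -> str:
    """Return the highest-quality model whose listed memory need fits."""
    for name in QUALITY_ORDER:
        if MODEL_MEMORY_GB[name] <= available_gb:
            return name
    raise ValueError(f"no model fits in {available_gb} GB")


if __name__ == "__main__":
    for gb in (1, 2, 8):
        print(gb, "GB:", best_model_for_memory(gb))
```

This ranks on memory alone; on a slow CPU a smaller model may still be the better choice for speed.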

Configuration examples

Windows PowerShell environment variable

$env:PATH = "C:\Users\<username>\AppData\Local\Programs\Python\Python311\Scripts;" +
            "C:\ffmpeg\bin;" +
            $env:PATH

Quick transcription commands

# Download the audio
ffmpeg -i "<video URL>" -vn -acodec pcm_s16le -ar 16000 -ac 1 "audio.wav" -y

# Transcribe (tiny model, fastest)
whisper "audio.wav" --model tiny --language Chinese --output_format txt

# Transcribe (base model, balanced)
whisper "audio.wav" --model base --language Chinese --output_format txt

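Comparison with the original skill

Feature	douyin-transcribe	douyin-transcribe-fast (this version)
Dependency	Groq API key	No API key needed
Cost	Free (Groq)	Completely free
Privacy	Audio is uploaded to Groq	Fully local
Speed	3-5 seconds	2-15 minutes (depending on video length)
Network	Requires network access	Usable offline after installation
Accuracy	⭐⭐⭐⭐⭐	⭐⭐⭐⭐ (small model)
Best for	Fast transcription, large volumes of video	Privacy-sensitive, offline, zero-cost use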

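Best practices

Short videos (<3 minutes): use the tiny model; results arrive in about 2 minutes
Medium videos (3-10 minutes): use the base model to balance speed and quality
Long videos (>10 minutes): use the small model, or process the video in segments
Batch processing: download all the audio first, then transcribe in a batch
Quality first: use the small model for important videos and base for everyday use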
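
The length-based rules above can be written as a small selector. This is a Python sketch using the thresholds from the list; the function name and the `quality_first` flag are assumptions for illustration.

```python
def pick_model(duration_seconds: float, quality_first: bool = False) -> str:
    """Choose a Whisper model name from the video length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    if quality_first:
        return "small"   # important videos: favor quality over speed
    minutes = duration_seconds / 60
    if minutes < 3:
        return "tiny"    # short videos
    if minutes <= 10:
        return "base"    # medium videos
    return "small"       # long videos


if __name__ == "__main__":
    for seconds in (90, 300, 900):
        print(seconds, "s:", pick_model(seconds))
```

The returned name can be passed straight to the `--model` option of the whisper command.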

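Tech stack

douyin-mcp: fetches video info
ffmpeg: audio extraction and processing
OpenAI Whisper: local speech recognition
Python: runtime environment

An optimized skill that makes extracting Douyin video scripts simpler, more private, and cheaper.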

Data source: ClawHub · Chinese localization: 龙虾技能库
