Douyin Transcriber
v1.0.5
Transcribe speech from audio or video files for Douyin/TikTok media, automatically extracting the audio and converting it to text with Docker Whisper ASR.
Installation
Install command: npx clawhub@latest install douyin-transcriber
This skill performs Douyin-related operations and may require a corresponding platform account or API key.
Skill Documentation
Douyin Transcriber
Transcribe audio/video files to text using local Docker Whisper ASR.
Quick Start
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@/path/to/video.mp4"
The container has built-in ffmpeg for automatic audio extraction.
Prerequisites
| Tool | Purpose | Install |
|------|---------|---------|
| Docker | Whisper ASR | Docker Desktop |
| ffmpeg | Audio extraction | winget install Gyan.FFmpeg |
Deploy Whisper ASR:
docker run -d -p PORT:PORT -e ASR_MODEL=small -e ASR_ENGINE=faster_whisper --name whisper-asr onerahmet/openai-whisper-asr-webservice:latest
Workflow
Step 1: Extract Audio from Video
ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y
Parameters:
-ar 16000: 16 kHz sample rate
-ac 1: Mono channel
-c:a pcm_s16le: 16-bit PCM
Step 2: Transcribe
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav"
Optional: specify language
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav" -F "language=zh"
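The upload in Step 2 can also be scripted. The following is a minimal sketch using only the Python standard library, not part of the skill itself. It mirrors the curl commands above by sending the file in the audio_file form field and, optionally, language as a second form field. The helper names encode_multipart and transcribe and the 600-second default timeout are hypothetical choices made for illustration.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Build a multipart/form-data body; returns (body_bytes, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    # Plain text fields first (for example, language=zh).
    for name, value in fields.items():
        parts.append((
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'
        ).encode("utf-8"))
    # Then the file part, followed by the closing boundary.
    parts.append((
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'
    ).encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f'\r\n--{boundary}--\r\n'.encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(asr_url, audio_path, language=None, timeout=600):
    """POST an audio file to the ASR endpoint and return the raw response text."""
    path = Path(audio_path)
    fields = {"language": language} if language else {}
    body, content_type = encode_multipart(
        fields, "audio_file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        asr_url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call would look like transcribe("http://localhost:PORT/asr", "audio.wav", language="zh"), with PORT replaced by the port the container was published on.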
Step 3: Parse Results
Response format:
{
  "text": "Transcribed content...",
  "segments": [
    {"start": 0.0, "end": 2.5, "text": "First sentence"},
    {"start": 2.5, "end": 5.0, "text": "Second sentence"}
  ],
  "language": "zh"
}
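One way to use the parsed result is to turn the segments into subtitles. The sketch below assumes the response has the shape shown above, with each segment carrying start, end, and text keys; a service that returns a different shape would need a different parser. The function names format_timestamp and segments_to_srt are hypothetical.

```python
import json


def format_timestamp(seconds):
    """Convert a time in seconds to an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(response_text):
    """Render the "segments" list of a JSON response as SRT subtitle text."""
    payload = json.loads(response_text)
    blocks = []
    for index, segment in enumerate(payload.get("segments", []), start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}")
    # SRT separates cues with a blank line; return "" when there are no segments.
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```

For the sample response above, the first cue spans 00:00:00,000 to 00:00:02,500 and carries the text "First sentence".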
Model Selection
| Model | Size | 5-min video | Accuracy |
|-------|------|-------------|----------|
| tiny | 75MB | ~30s | Fair |
| base | 142MB | ~1min | Good |
| small | 466MB | ~3min | Better (recommended) |
| medium | 1.5GB | ~8min | Best |
Change the model via the environment variable: -e ASR_MODEL=medium
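The trade-off in the model table can be expressed as a small helper that picks the most accurate model fitting a time budget. This is only a rough sketch: it takes the 5-minute timings from the table and assumes transcription time scales linearly with media length, which the source does not state. The names estimate_seconds and pick_model are hypothetical.

```python
# Approximate seconds to transcribe a 5-minute video, taken from the model table,
# ordered from least to most accurate.
SECONDS_PER_5_MIN = {"tiny": 30, "base": 60, "small": 180, "medium": 480}
REFERENCE_SECONDS = 5 * 60


def estimate_seconds(model, media_seconds):
    """Estimate transcription time, assuming linear scaling with media length."""
    return SECONDS_PER_5_MIN[model] * media_seconds / REFERENCE_SECONDS


def pick_model(media_seconds, budget_seconds):
    """Return the most accurate model whose estimate fits the budget, or None."""
    best = None
    for model in SECONDS_PER_5_MIN:  # dicts preserve insertion order
        if estimate_seconds(model, media_seconds) <= budget_seconds:
            best = model
    return best
```

The returned name can then be passed to the container as the ASR_MODEL value. Actual timings depend on hardware, so the table figures are best treated as relative guidance.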
Supported Formats
Video: mp4, mkv, avi, mov, flv, wmv, webm, m4v
Audio: wav, m4a, mp3, aac, ogg, flac, wma, opus
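A script that drives the workflow might first check the file extension against these lists and, for video input, build the Step 1 ffmpeg command. The sketch below makes that assumption explicit; media_kind and ffmpeg_extract_args are hypothetical helper names, and the ffmpeg arguments are the ones given in Step 1.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {"mp4", "mkv", "avi", "mov", "flv", "wmv", "webm", "m4v"}
AUDIO_EXTENSIONS = {"wav", "m4a", "mp3", "aac", "ogg", "flac", "wma", "opus"}


def media_kind(path):
    """Return "video", "audio", or None for an unsupported extension."""
    extension = Path(path).suffix.lower().lstrip(".")
    if extension in VIDEO_EXTENSIONS:
        return "video"
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    return None


def ffmpeg_extract_args(source, target="audio.wav"):
    """Build the Step 1 command: 16 kHz, mono, 16-bit PCM WAV, overwrite output."""
    return [
        "ffmpeg", "-i", str(source),
        "-ar", "16000",
        "-ac", "1",
        "-c:a", "pcm_s16le",
        str(target), "-y",
    ]
```

With ffmpeg on the PATH, the command can be run with subprocess.run(ffmpeg_extract_args("video.mp4"), check=True).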
Troubleshooting
| Issue | Solution |
|-------|----------|
| Docker not available | Install Docker Desktop |
| Container fails to start | Check port availability |
| Transcription timeout | Use a smaller model or split the audio |
| ffmpeg not found | winget install Gyan.FFmpeg |
Related Modules
douyin-fetcher - Video download
douyin-analyzer - Content analysis
douyin-orchestrator - Workflow coordination