VideoToText — subtitles & summary
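v1.0.0 · A guide to reliably fetching official Bilibili subtitles (handling rate limits and login-only tracks) and generating a Chinese summary through an OpenAI-compatible API. The skill package includes a code/ source mirror and an env template, and can be zipped for offline use with OpenClaw. Applies to Bilibili link parsing, troubleshooting subtitle extraction failures, SESSDATA/Cookie, WBI player subtitles, 字...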
Security scan
OpenClaw
Suspicious
High confidence. The skill's code and SKILL.md are coherent with its stated purpose (pull Bilibili subtitles and call a compatible LLM), but the package metadata omits the sensitive environment variables it actually requires and there are a few implementation details that merit caution before installing or providing secrets.
Assessment recommendations
This skill appears to implement exactly what it says (pull Bilibili subtitles and call an LLM to summarize) but before installing or supplying secrets consider the following:
- Transparency: the registry metadata did not declare the environment variables the code actually expects. Confirm with the publisher which env vars are required (SESSDATA, BILI_JCT, DEDEUSERID, SUMMARY_LLM_API_KEY / OPENAI_API_KEY, SUMMARY_LLM_BASE_URL/CHAT_COMPLETIONS_URL, etc.).
- Secrets: do not provide your main accou...
⚠ Purpose and capabilities
The skill's name and description match the included code: it extracts Bilibili subtitles and posts them to an OpenAI-compatible endpoint for summarization. However the registry metadata declares no required environment variables or credentials, while the code and SKILL.md clearly expect sensitive env vars (SESSDATA, BILI_JCT, DEDEUSERID, OPENAI/SUMMARY_LLM_* keys). The omission in the registry is an inconsistency that reduces transparency.
ℹ Instruction scope
SKILL.md's runtime instructions align with the code: expand short URLs, fetch view/player/subtitle JSON, optionally use user-provided Cookie values, and send subtitle text + title to a configured LLM chat/completions endpoint. The instructions do not direct the agent to read unrelated files, nor to exfiltrate anything to hidden endpoints. Important operational behavior: the skill will transmit full subtitle text (potentially private) to whatever SUMMARY_LLM_BASE_URL / CHAT_COMPLETIONS_URL is configured, using SUMMARY_LLM_API_KEY or OPENAI_API_KEY if set.
✓ Install mechanism
There is no install spec and the skill is instruction-only with a code mirror included. That minimizes opaque installation steps. The included requirements-code.txt lists httpx, pydantic, and yt-dlp (normal for this function). No remote download URLs or archive extraction are present in the manifest.
⚠ Credential requirements
The code legitimately needs Bilibili login tokens (SESSDATA and optional BILI_JCT/DEDEUSERID...) to fetch login-only subtitles and an LLM API key or base URL for summaries. Those are sensitive secrets. The problem: the registry metadata lists no required env vars, so the skill fails the transparency test — it asks for credentials in code/instructions but does not declare them up front. Also settings.py tries to locate a .env file via a path heuristic; depending on where the agent runs that could pick up a different .env than the user expects (verify .env placement).
✓ Persistence and privileges
The skill does not request 'always: true' and does not claim to modify other skills or agent-wide settings. It does perform network calls and may be invoked autonomously (default), which is expected for this type of integration.
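Security is layered; review the code before running.
Runtime dependencies
No special dependencies
Versions
latest · v1.0.0 · 2026/4/5
Initial release of the core module for reliable Bilibili subtitle extraction and summarization.
- First release covering the full workflow for official Bilibili subtitle extraction, troubleshooting, Cookie setup, and track selection
- Supports subtitle extraction with multi-round sampling and track priority ranking, with the validation logic explained in detail
- Includes the OpenAI-compatible summary service call and its fallback, with environment variables documented in detail
- Ships a code/ source mirror, requirements, and reference docs as a complete skill package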
● Pending
Install command
Official: npx clawhub@latest install videototext
Mirror (accelerated): npx clawhub@latest install videototext --registry https://cn.clawhub-mirror.com
Skill documentation
# videototext (Bilibili subtitles + summary)
## Compliance and prerequisites
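- Use only Cookies exported from the user's own account; do not help bypass paywalls or region restrictions, or perform bulk scraping.
- Respect the site's rate limits: the project already throttles Bilibili HTTP requests and retries with backoff. Do not remove the `_throttle` / `_request_with_retry` logic when modifying the code.
- See `reference.md` in the same directory for the full list of environment variables and their defaults.
- The skill package ships a source mirror consistent with the main flow: see `code/` (import instructions are in `code/README.md`). Environment variable names and examples are in `.env.example` at the root of the main repository; copy it to `.env` in the skill package root and fill it in.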
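## End-to-end flow (consistent with the code)
- URL (`app/utils/url_tools.py`; in the package, `code/app/utils/url_tools.py`): expands `b23.tv` short links and parses the `BV` id and the part selector `?p=`.
- Orchestration (full repository, `app/services/orchestrator.py`): tries official subtitles first; if that fails and ASR is enabled, transcribes locally; once the text is ready, calls the summary service. The skill package does not include the orchestrator or ASR, only the core extraction and summary modules.
- Bilibili metadata and subtitles: `app/extractors/bilibili.py` + `app/services/bilibili_subtitle.py` (in the package, `code/app/...`).
- No-subtitle fallback (full repository only): `app/utils/audio_fetch.py` + `app/asr/local_faster_whisper.py` (requires `LOCAL_ASR_ENABLED` and related settings).
- Summary (`app/services/summary.py`): prefers the LLM (OpenAI-compatible Chat Completions) and falls back to local sampling on failure.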
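The URL step above can be sketched as follows. This is a minimal illustration, not the package's `url_tools.py`: the function name `parse_bilibili_url` is hypothetical, the pattern assumes a BV id is `BV` followed by 10 alphanumeric characters, and the input is assumed to be an already-expanded `bilibili.com` URL (expanding `b23.tv` needs a network request and is left out).

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: a BV id is "BV" followed by 10 alphanumeric characters.
BV_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def parse_bilibili_url(url: str) -> tuple[str, int]:
    """Return (bvid, part) from an already-expanded bilibili.com URL.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    match = BV_PATTERN.search(parsed.path)
    if match is None:
        raise ValueError(f"no BV id found in {url!r}")
    # ?p= selects one part of a multi-part video; default to part 1.
    raw_part = parse_qs(parsed.query).get("p", ["1"])[0]
    part = int(raw_part) if raw_part.isdecimal() and int(raw_part) > 0 else 1
    return match.group(0), part
```

An invalid or missing `?p=` falls back to part 1 here; whether the real module does the same is not stated in this document.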
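## Subtitle pipeline: anti-scraping and stability notes
### Cookie strategy (core)
- `SESSDATA` (plus the optional `BILI_JCT`, `DEDEUSERID`, and `DEDEUSERID__CKMD5`) is assembled into the `Cookie` request header by `_build_bilibili_cookie()` in `app/services/bilibili_subtitle.py`.
- Requests are sent with the Cookie first; if that still fails, the same view → player flow is retried without it.
- Subtitles that are only returned when logged in (`need_login_subtitle`): `SESSDATA` alone is often not enough, so configure all four values (see the comments in `.env.example`).
- yt-dlp reuses the same Cookie when fetching metadata and audio: `get_bilibili_sessdata_cookie_header()` (`app/extractors/bilibili.py`).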
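As a rough sketch of how the four values might be combined into one `Cookie` header: the function `build_cookie_header` below is hypothetical and is not the package's `_build_bilibili_cookie()`. The mapping from environment variable names to browser cookie names (`bili_jct`, `DedeUserID`, `DedeUserID__ckMd5`) is an assumption based on the cookie names Bilibili commonly sets.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Assumed mapping: env var name (from this document) -> browser cookie name.
ENV_TO_COOKIE = {
    "SESSDATA": "SESSDATA",
    "BILI_JCT": "bili_jct",
    "DEDEUSERID": "DedeUserID",
    "DEDEUSERID__CKMD5": "DedeUserID__ckMd5",
}


def build_cookie_header(env: Mapping[str, str] = os.environ) -> str | None:
    """Join whichever values are set into a Cookie header value.

    Returns None when nothing is configured, so the caller can send
    the request without a Cookie header at all.
    """
    pairs = []
    for env_name, cookie_name in ENV_TO_COOKIE.items():
        value = env.get(env_name, "").strip()
        if value:
            pairs.append(f"{cookie_name}={value}")
    return "; ".join(pairs) if pairs else None
```

Returning `None` rather than an empty string keeps the "with Cookie first, then without" fallback explicit at the call site.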
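### HTTP behavior
- All requests share the Referer `https://www.bilibili.com/`, a desktop Chrome User-Agent, and a JSON Accept header (`_client_headers`).
- A site-wide minimum request interval, `BILIBILI_MIN_INTERVAL_SECONDS` (default about 0.8 s), is enforced with a global lock.
- Retries: network errors, 412, 429, 5xx, and similar failures use exponential backoff with jitter; the retry count is controlled by `BILIBILI_MAX_RETRIES` and related settings.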
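The throttle and retry behavior can be illustrated with a small standalone sketch. The names `Throttle`, `backoff_delay`, and `is_retryable` are hypothetical, and the `base` and `cap` values are placeholders rather than the project's defaults; the real logic lives in `_throttle` / `_request_with_retry`.

```python
import random
import threading
import time


class Throttle:
    """Enforce a minimum interval between calls, shared across threads."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._last = None  # time of the previous permitted call

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        with self._lock:
            now = self._clock()
            delay = 0.0
            if self._last is not None:
                delay = max(0.0, self._last + self._min_interval - now)
            if delay:
                self._sleep(delay)
            self._last = now + delay
            return delay


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0, rng=random.random) -> float:
    """Exponential backoff with jitter for a 0-based retry attempt."""
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter: pick a delay between 50% and 100% of the ceiling.
    return ceiling * (0.5 + 0.5 * rng())


def is_retryable(status: int) -> bool:
    """412, 429, and 5xx are treated as transient in this sketch."""
    return status in (412, 429) or 500 <= status <= 599
```

Holding the lock while sleeping serializes callers, which is what makes the interval global rather than per thread.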
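### Request order
- `GET https://api.bilibili.com/x/web-interface/view?bvid=`: fetches `aid`, `pages`, title, duration, and so on.
- `GET https://api.bilibili.com/x/player/wbi/v2?aid=&cid=&bvid=`: the preferred source for the subtitle track list (consistent with yt-dlp; login-only tracks usually appear here).
- If `subtitles` is empty, fall back to `GET https://api.bilibili.com/x/player/v2` (some videos only expose tracks through the older endpoint, for example some AI subtitles).
- For each track, GET its `subtitle_url` to fetch the JSON, then join `body[].content` into the text (`payload_to_text`).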
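The last two steps can be sketched without any network access. The version of `payload_to_text` below is a hypothetical reimplementation, not the package's function; joining cues with newlines and skipping blank cues are assumptions. `first_nonempty_tracks` is likewise an illustrative name for the "wbi/v2 first, then v2" fallback.

```python
def payload_to_text(payload: dict) -> str:
    """Join the non-empty body[].content entries, one cue per line."""
    lines = []
    for cue in payload.get("body") or []:
        content = str(cue.get("content", "")).strip()
        if content:
            lines.append(content)
    return "\n".join(lines)


def first_nonempty_tracks(*track_lists: list) -> list:
    """Return the first non-empty track list, in the order given.

    Pass the wbi/v2 result first and the legacy v2 result second to
    mirror the fallback order described above.
    """
    for tracks in track_lists:
        if tracks:
            return tracks
    return []
```

For example, `first_nonempty_tracks(wbi_tracks, legacy_tracks)` only consults the legacy list when the WBI list is empty.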
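### Track selection and quality
- Track ordering: Chinese tracks first; within them, Simplified or generic `zh` ahead of Traditional, then AI-generated Chinese; within a group, `prefer_lang` can be matched (`_ordered_tracks`).
- With `BILIBILI_SUBTITLE_VALIDATE=true`: multi-round sampling is optional; each downloaded result is filtered by duration coverage and by the hit rate of Chinese-character bigrams from the title, to guard against AI subtitles that belong to a different video; AI tracks use a higher threshold (`BILIBILI_SUBTITLE_AI_MIN_TITLE_MATCH`).
- With `BILIBILI_SUBTITLE_VALIDATE=false`: faster, takes the first non-empty track; if `BILIBILI_SUBTITLE_AI_SANITY_WHEN_VALIDATE_OFF` is enabled, AI tracks can still get a title sanity check.
- When the user explicitly picks a language in the front end: the `only_lan` branch is used and the title hit-rate filter is skipped (duration coverage can still apply when validation is on).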
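One plausible reading of the title bigram check is sketched below. The exact definition used in `bilibili_subtitle.py` is not given here, so treat this as an assumption: bigrams are taken within contiguous runs of CJK Unified Ideographs in the title, and the hit rate is the fraction of those bigrams that occur anywhere in the subtitle text. The names `han_bigrams` and `title_hit_rate` are hypothetical.

```python
from __future__ import annotations

import re

# Contiguous runs of CJK Unified Ideographs (basic block only).
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def han_bigrams(text: str) -> set[str]:
    """Collect two-character windows inside each run of Han characters."""
    grams: set[str] = set()
    for run in HAN_RUN.findall(text):
        grams.update(run[i:i + 2] for i in range(len(run) - 1))
    return grams


def title_hit_rate(title: str, subtitle_text: str) -> float | None:
    """Fraction of title bigrams found in the subtitle text.

    Returns None when the title has no Han bigrams, meaning the check
    does not apply (for example, an all-Latin title).
    """
    grams = han_bigrams(title)
    if not grams:
        return None
    hits = sum(1 for gram in grams if gram in subtitle_text)
    return hits / len(grams)
```

A caller would compare the result against a threshold, using a stricter one for AI tracks as described above; the threshold values themselves are configuration and are not assumed here.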
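### Troubleshooting checklist
- Empty track list: check that the Cookie is complete, that the account is logged in, and that the part (`?p=`) is correct.
- Empty text or all validation failing: try toggling `BILIBILI_SUBTITLE_VALIDATE`, adjust the number of sampling rounds and their interval, or have the user specify `lan`.
- Frequent 412/429: increase `BILIBILI_MIN_INTERVAL_SECONDS` and avoid concurrent requests.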
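## Summary pipeline (TextSummaryService)
- Configuration: `SUMMARY_ENABLED`, `SUMMARY_LLM_BASE_URL` or `SUMMARY_LLM_CHAT_COMPLETIONS_URL`, and `SUMMARY_LLM_MODEL`; the API key can be `SUMMARY_LLM_API_KEY` or `OPENAI_API_KEY` (see the aliases in `app/core/settings.py`).
- Endpoint: if `BASE_URL` already ends with `/chat/completions`, nothing is appended; otherwise the endpoint is `{base}/chat/completions`.
- Request body: OpenAI-compatible `chat/completions`, with `temperature` taken from `SUMMARY_LLM_TEMPERATURE`.
- Short target (`max_chars <= 200`): a single paragraph of telegraphic Chinese; long target: `## 总标题` (overall title) plus several `主题:` (topic) blocks (see the prompt in `summary.py`).
- The text sent to the model is capped at about 60000 characters; response parsing accepts `message.content` as a string or as a multi-part list; if there is no usable content, it falls back to `_summarize_fallback`.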
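The endpoint rule and the response parsing can be sketched as two small functions. Both names (`resolve_chat_url`, `extract_content`) are hypothetical and not taken from `summary.py`. Stripping a trailing slash before the suffix check, and reading a `text` field from each list part, are assumptions modeled on the common OpenAI-compatible response shape.

```python
from __future__ import annotations


def resolve_chat_url(base_url: str) -> str:
    """Append /chat/completions unless the URL already ends with it."""
    base = base_url.strip().rstrip("/")
    if base.endswith("/chat/completions"):
        return base
    return f"{base}/chat/completions"


def extract_content(response: dict) -> str | None:
    """Return message.content as text, or None if nothing usable is present.

    Accepts content as a plain string or as a list of parts; a None
    result signals the caller to use its local fallback.
    """
    choices = response.get("choices") or []
    if not choices:
        return None
    content = (choices[0].get("message") or {}).get("content")
    if isinstance(content, str):
        text = content
    elif isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and isinstance(part.get("text"), str):
                parts.append(part["text"])
        text = "".join(parts)
    else:
        return None
    text = text.strip()
    return text or None
```

Treating whitespace-only content as missing keeps an empty model reply from being passed on as a summary.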
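## Regression and self-checks
- After major subtitle changes: run `python scripts/verify_subtitle_canary.py` from the project root (requires `SESSDATA` and related values in `.env`; see `CLAUDE.md`).
- Full test suite: `pytest`.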
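## Packaging for OpenClaw
Zip the entire `videototext` directory (including `SKILL.md`, `reference.md`, `code/`, and `requirements-code.txt`). After unpacking:
- For the agent to read: `SKILL.md`, `reference.md`, and the source under `code/`.
- To run the subset locally: `pip install -r requirements-code.txt`, point `PYTHONPATH` at `code/`, and place `.env` in the same directory as `SKILL.md` (see `code/README.md`).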
## Further reading
- Per-variable environment reference: `reference.md`
Data source: ClawHub · Chinese localization: 龙虾技能库