Automatic Video Note Maker: Automatic Video Note Generation

v1.0.2

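Use this skill when a user provides a video URL and wants complete Markdown learning notes. It downloads the original video, transcribes the audio with qwen-audio/STT, extracts timestamped frames with ffmpeg, reviews and filters key screenshots one by one against the subtitles, and finally generates illustrated learning notes.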

by @darknoah (noah)·MIT-0

Tags: developer tools, code generation, documentation tools, video processing, WeChat

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install video-learning-notes

Mirror (available): npx clawhub@latest install video-learning-notes --registry https://cn.longxiaskill.com

Skill documentation

Video Learning Notes

Overview

Convert a video URL or local video file into a complete Markdown learning note. The note should be structured from the STT subtitle content and include selected key screenshots. Use this skill for requests such as “turn this video into learning notes”, “download this video and transcribe/analyze it”, or similar video-to-learning-note tasks.

Required output

Create a self-contained output directory containing:

- The downloaded original video file.
- transcript.srt generated by qwen-audio/STT.
- Timestamped frames extracted by ffmpeg under frames/.
- Manually selected key screenshots under selected_frames/.
- The final Markdown file, usually named video_learning_notes.md, using relative paths for the source video and images, and citing screenshot timestamps.

Workflow

Create a workspace

Create a dedicated output directory for each video note. Prefer the current task directory or a stable path such as .//. Keep all generated files inside this directory; do not scatter outputs into shared default folders.
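The workspace layout described above can be sketched in Python. This is only an illustration: the function name `create_workspace` and its parameters are hypothetical, and the subfolder names `frames` and `selected_frames` are taken from the required output list.

```python
from pathlib import Path


def create_workspace(base_dir: str, note_name: str) -> Path:
    """Create one dedicated output directory for a single video note.

    `base_dir` and `note_name` are illustrative parameters, not names
    used by the skill itself.
    """
    workspace = Path(base_dir) / note_name
    # frames/ holds every candidate frame; selected_frames/ holds the
    # screenshots that are kept for the final note.
    for sub in ("frames", "selected_frames"):
        (workspace / sub).mkdir(parents=True, exist_ok=True)
    return workspace
```

Because `exist_ok=True` is set, calling the function again for the same note reuses the existing directory instead of failing.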

Download the original video

If the source is an online video, use the yt-dlp-downloader skill/workflow to download the user-provided video URL. Preserve the original or best available quality when possible, and write the video into the current workspace.

Check dependencies when needed before downloading:

which yt-dlp || echo "yt-dlp not installed. Install with: pip install yt-dlp"
which ffmpeg || echo "ffmpeg not installed. Install with: brew install ffmpeg"

Recommended commands:

# Generic: download best quality into the workspace
yt-dlp -P "/path/to/workspace" -o "%(title)s.%(ext)s" "VIDEO_URL"

# YouTube: use browser cookies by default to reduce 403 errors
yt-dlp -P "/path/to/workspace" --cookies-from-browser chrome -o "%(title)s.%(ext)s" "YOUTUBE_URL"

# Download subtitles when available; still run qwen-audio/STT unless the user only wants official subtitles
yt-dlp -P "/path/to/workspace" --write-subs --sub-langs all -o "%(title)s.%(ext)s" "VIDEO_URL"

Platform handling principles:

- YouTube / YouTube Music: use --cookies-from-browser chrome by default. Supported browser cookie sources include chrome, firefox, safari, edge, brave, and opera.
- Bilibili, Twitter/X, TikTok, Douyin, Vimeo, Twitch, and most other platforms: try a direct download first.
- Playlist URLs: ask the user whether to process the entire playlist, one specific video, or a specific range.
- Quality selection: default to the best available quality. If the user specifies a quality, use format selectors such as bestvideo[height<=1080]+bestaudio/best[height<=1080].
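The platform rules above can be expressed as a small command builder. This is a minimal sketch under stated assumptions: `build_ytdlp_command`, `YOUTUBE_HOSTS`, and the host list are hypothetical names chosen for illustration, while the yt-dlp flags (`-P`, `-o`, `--cookies-from-browser`, `-f`) are the ones already shown in the recommended commands.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Hosts treated as YouTube for the cookie rule (an illustrative list).
YOUTUBE_HOSTS = {
    "youtube.com", "www.youtube.com", "m.youtube.com",
    "music.youtube.com", "youtu.be",
}


def build_ytdlp_command(url: str, workspace: str,
                        max_height: Optional[int] = None,
                        browser: str = "chrome") -> List[str]:
    """Return a yt-dlp argv list that follows the platform principles."""
    cmd = ["yt-dlp", "-P", workspace, "-o", "%(title)s.%(ext)s"]
    host = (urlparse(url).hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        # YouTube: use browser cookies by default to reduce 403 errors.
        cmd += ["--cookies-from-browser", browser]
    if max_height is not None:
        # Only constrain the format when the user asked for a quality.
        selector = (f"bestvideo[height<={max_height}]+bestaudio"
                    f"/best[height<={max_height}]")
        cmd += ["-f", selector]
    cmd.append(url)
    return cmd
```

The list can be passed to `subprocess.run(cmd, check=True)`. Building an argv list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.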

After downloading, identify the actual video file path, such as .mp4, .mkv, .mov, .webm, etc. If multiple files are produced, choose the main video as the source for the learning note, while keeping subtitles, thumbnails, and other files as supporting assets.
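One way to locate the main video after a download is sketched below. The "largest file wins" heuristic is an assumption made for this example, not a rule stated by the skill, and `find_main_video` and `VIDEO_EXTENSIONS` are hypothetical names.

```python
from pathlib import Path
from typing import Optional

# Extensions named in the text; extend as needed.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".webm"}


def find_main_video(workspace: str) -> Optional[Path]:
    """Return the largest video file in the workspace, or None if absent.

    Assumption: when several video files exist, the largest one is the
    main video. Subtitles and thumbnails are ignored, not deleted.
    """
    candidates = [
        p for p in Path(workspace).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    ]
    return max(candidates, key=lambda p: p.stat().st_size, default=None)
```

If the heuristic picks the wrong file, for example when a trailer is larger than the lecture, the path should be confirmed with the user instead.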

Troubleshooting:

- HTTP 403 Forbidden: retry with --cookies-from-browser chrome or another browser where the user is logged in.
- Video unavailable, private videos, or geo-restricted videos: ask the user for login access, cookies, or an accessible environment; do not bypass access restrictions.
- Format not available: run yt-dlp -F "VIDEO_URL" to list available formats, then choose one.
- Interrupted downloads: retry; yt-dlp can usually resume partial downloads.
- yt-dlp: command not found: install yt-dlp or ask the user to install it.

If yt-dlp-downloader / yt-dlp is unavailable, or if the video requires login/authentication, stop and ask the user to provide the missing access requirement instead of silently switching to unreliable tools.

Transcribe with qwen-audio

Run qwen-audio/STT on the downloaded video or extracted audio, and save the result as transcript.srt.

For large videos, first use ffmpeg to extract compressed mono audio, then transcribe the smaller audio file:

ffmpeg -y -i input.mp4 -vn -ac 1 -ar 16000 -b:a 32k audio_for_stt.mp3

Preserve timestamp information as much as possible. Prefer SRT format. If STT only produces plain text, create transcript.txt and clearly note in the final output that exact subtitle timing is unavailable.
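SRT timestamps are what allow a screenshot to be matched with the words spoken at that moment. The sketch below parses standard SRT cues into seconds and looks up the cue covering a given time. The function names `to_seconds`, `parse_srt`, and `cue_at` are hypothetical, and the parser assumes well-formed `HH:MM:SS,mmm` timestamps.

```python
import re
from typing import List, Optional, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,250 into seconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text: str) -> List[Cue]:
    """Parse SRT text into a list of cues, skipping malformed blocks."""
    cues: List[Cue] = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        timing = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if timing is None:
            continue  # no timing line in this block
        start, end = lines[timing].split("-->", 1)
        body = " ".join(lines[timing + 1:])
        cues.append((to_seconds(start), to_seconds(end.split()[0]), body))
    return cues


def cue_at(cues: List[Cue], t: float) -> Optional[str]:
    """Return the subtitle text spoken at time t, if any cue covers it."""
    for start, end, body in cues:
        if start <= t <= end:
            return body
    return None
```

Given a frame captured at a known time, `cue_at(parse_srt(srt_text), t)` returns the subtitle to read alongside it when deciding whether the screenshot is worth keeping.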

Extract timestamped candidate frames with ffmpeg

After confirming the video path, use scripts/prepare_video_learning_assets.py. The script generates timestamped candidate screenshots and a manifest file:

python3 "$SKILL_DIR/scripts/prepare_video_learning_assets.py" \
  --video /path/to/video.mp4 \
  --out /path/to/workspace \
  --scene-threshold 0.3

By default, the script extracts frames only from ffmpeg scene changes; it does not take one screenshot every 30 seconds. Use --interval only when regular interval screenshots are explicitly needed.
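For readers curious how scene-change extraction can work in plain ffmpeg, here is a minimal sketch. It is not the implementation of prepare_video_learning_assets.py, whose internals are not shown here. It assumes ffmpeg's `select` filter with the `scene` score and the `fps` filter; the function name `build_frame_command` and the output file pattern are hypothetical.

```python
from typing import List, Optional


def build_frame_command(video: str, out_dir: str,
                        scene_threshold: float = 0.3,
                        interval: Optional[int] = None) -> List[str]:
    """Build an ffmpeg argv list for candidate frame extraction.

    Scene-change mode is the default; interval mode (one frame every
    `interval` seconds) is used only when an interval is given.
    """
    if interval is not None:
        if interval <= 0:
            raise ValueError("interval must be a positive number of seconds")
        video_filter = f"fps=1/{int(interval)}"
    else:
        if not 0.0 < scene_threshold < 1.0:
            raise ValueError("scene_threshold must be between 0 and 1")
        # Keep frames whose scene score exceeds the threshold; showinfo
        # logs each kept frame's pts_time so timestamps can be recovered.
        video_filter = f"select='gt(scene,{scene_threshold})',showinfo"
    return [
        "ffmpeg", "-y", "-i", video,
        "-vf", video_filter,
        # Emit only the selected frames. ffmpeg builds older than 5.1
        # use "-vsync vfr" instead of "-fps_mode vfr".
        "-fps_mode", "vfr",
        f"{out_dir}/frame_%05d.jpg",
    ]
```

A lower `scene_threshold` makes the `gt(scene,...)` test pass more often, which is why it yields more candidate frames.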

For most learning videos, the recommended --scene-threshold range is 0.1–0.3:

- Lower thresholds produce more frames and capture smaller visual changes.
- Higher thresholds produce fewer frames and keep only more obvious scene changes.

After running the script, check the frame count in fram
