Video News Downloader
v1.0.0
Automated daily news video downloader with AI subtitle proofreading. Downloads CBS Evening News and BBC News at Ten from YouTube, extracts and proofreads subtitles using DeepSeek, and serves videos via HTTP with embedded players. Use when: (1) setting up automated daily news video downloads, (2) downloading CBS/BBC news with subtitles, (3) proofreading subtitle files with AI, (4) creating local video streaming servers with web players, (5) managing cron jobs for scheduled video updates.
Video News Downloader with AI Subtitle Proofreading
Complete workflow for downloading daily news videos, processing subtitles, and serving them via HTTP with web players.
Overview
This skill automates:
- Video Download: CBS Evening News + BBC News at Ten from YouTube
- Subtitle Processing: Extract auto-captions and convert to VTT format
- AI Proofreading: Use DeepSeek to fix speech recognition errors
- HTTP Streaming: Serve videos with embedded web players
- Scheduled Updates: Daily cron jobs at configurable times
Quick Start
- Download Latest News
- Proofread Subtitles
Or use DeepSeek directly:
"Proofread the subtitle file /path/to/subtitle.vtt"
- Start HTTP Servers
- Set Up Daily Cron Jobs
Commands
Video Download Script
Download CBS only:
python3 scripts/video_download.py --cbs
Download BBC only:
python3 scripts/video_download.py --bbc
Download both:
python3 scripts/video_download.py --cbs --bbc
With subtitle proofreading:
python3 scripts/video_download.py --cbs --bbc --proofread
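For readers curious what a download step of this kind might look like, here is a minimal sketch that assembles a yt-dlp command line for one video plus its English auto-captions. It is not the actual contents of the download script: the function name `build_ytdlp_command`, the placeholder URL, and the choice of flags are all assumptions made for illustration.

```python
from pathlib import Path


def build_ytdlp_command(url: str, out_dir: str, basename: str) -> list[str]:
    """Return a yt-dlp argument list for one video plus English auto-captions.

    Hypothetical helper: the real script may use different options.
    """
    # yt-dlp fills in %(ext)s, giving e.g. cbs_latest.mp4 and cbs_latest.en.vtt
    output_template = str(Path(out_dir) / f"{basename}.%(ext)s")
    return [
        "yt-dlp",
        "--playlist-items", "1",         # take only the first entry if url is a playlist
        "--merge-output-format", "mp4",  # merge video and audio into an .mp4 container
        "--write-auto-subs",             # request auto-generated captions
        "--sub-langs", "en",             # English captions only
        "--convert-subs", "vtt",         # convert captions to WebVTT
        "-o", output_template,
        url,
    ]


# Placeholder URL; substitute the video or playlist you actually want.
cmd = build_ytdlp_command(
    "https://www.youtube.com/watch?v=VIDEO_ID", "cbs-live-local", "cbs_latest"
)
print(" ".join(cmd))
```

The sketch only builds and prints the command. Passing the list to `subprocess.run(cmd, check=True)` would perform the download, provided yt-dlp is installed.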
Subtitle Proofreading
Proofread single file:
python3 scripts/subtitle_proofreader.py
Auto-proofread all news subtitles:
python3 scripts/subtitle_proofreader.py --all
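Before a VTT file can be proofread, the caption text usually has to be separated from the timing data. The following is a rough sketch of that step, assuming YouTube-style auto-captions without cue identifiers; `vtt_to_text` is a hypothetical helper and is not taken from `subtitle_proofreader.py`.

```python
import re

# Matches cue timing lines such as "00:00:01.000 --> 00:00:03.000 align:start"
TIMING = re.compile(r"^\d{2}:\d{2}(:\d{2})?\.\d{3}\s+-->\s+")
# Matches inline tags such as <c> or <00:00:01.500>
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> list[str]:
    """Extract caption lines from WebVTT text, dropping timings, tags and repeats."""
    lines: list[str] = []
    in_header = True  # skip everything before the first timing line
    for raw in vtt.splitlines():
        line = raw.strip()
        if TIMING.match(line):
            in_header = False
            continue
        if in_header or not line:
            continue
        text = TAG.sub("", line).strip()
        # Auto-captions often repeat the previous line in the next cue.
        if text and (not lines or lines[-1] != text):
            lines.append(text)
    return lines


sample = """WEBVTT
Kind: captions
Language: en

00:00:01.000 --> 00:00:03.000 align:start position:0%
a powerful<00:00:01.500><c> noraster</c>

00:00:03.000 --> 00:00:05.000
a powerful noraster
is moving in
"""
print(vtt_to_text(sample))
```

The resulting list of plain lines could then be handed to the model for correction. Files that use cue identifiers or NOTE blocks would need extra handling that this sketch omits.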
Server Management
Start servers:
bash scripts/setup_server.sh start
Check status:
bash scripts/setup_server.sh status
Stop servers:
bash scripts/setup_server.sh stop
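The troubleshooting notes mention Python's `http.server`, so a per-directory server along the following lines is one plausible way the videos could be served. This is a sketch under that assumption, not the contents of the setup script; `make_server` is a hypothetical name.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str, port: int, host: str = "0.0.0.0") -> ThreadingHTTPServer:
    """Create (but do not start) an HTTP server that serves files from `directory`."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword argument (Python 3.7+).
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


# Port 0 asks the OS for any free port, which is convenient for a quick check.
demo = make_server(".", 0, host="127.0.0.1")
print("bound to port", demo.server_address[1])
demo.server_close()
```

Under these assumptions, `make_server("cbs-live-local", 8093).serve_forever()` would serve the CBS directory on port 8093, and a second instance on 8095 would cover the BBC directory.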
File Structure
/workspace/
├── cbs-live-local/
│   ├── cbs_latest.mp4
│   ├── cbs_latest.en.vtt            # Original subtitle
│   ├── cbs_latest.en.vtt-backup     # Backup
│   ├── cbs_latest-corrected.txt     # DeepSeek corrected text
│   └── cbs_latest-corrections.md    # Error list
│
├── bbc-news-live/
│   ├── bbc_news_latest.mp4
│   ├── bbc_news_latest.en.vtt
│   ├── bbc_news_latest.en.vtt-backup
│   ├── bbc_news_latest-corrected.txt
│   └── bbc_news_latest-corrections.md
│
└── temp/                            # Temporary download files
HTTP Endpoints
| Endpoint | Description |
| --- | --- |
| http://IP:8093/ | CBS Evening News player |
| http://IP:8093/cbs_latest.mp4 | CBS video direct |
| http://IP:8095/ | BBC News at Ten player |
| http://IP:8095/bbc_news_latest.mp4 | BBC video direct |
Cron Jobs
Default Schedule (Beijing Time)
| Time | Task |
| --- | --- |
| 20:00 | Download latest CBS + BBC videos |
| 20:30 | DeepSeek proofread subtitles |
Manual Cron Setup
See references/cron-setup.md for detailed cron configuration.
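As a rough illustration of how the default schedule maps onto crontab syntax, the sketch below renders the two daily entries. It assumes the system clock is set to Beijing time and that the scripts are run from `/workspace`. Both are assumptions, as is the helper name `cron_entry`, and the referenced cron setup document remains the authoritative source.

```python
def cron_entry(hour: int, minute: int, command: str) -> str:
    """Render a crontab line that runs `command` every day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Crontab field order: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


# Default schedule from the table above; the `cd /workspace` prefix is an assumption.
SCHEDULE = [
    (20, 0, "cd /workspace && python3 scripts/video_download.py --cbs --bbc"),
    (20, 30, "cd /workspace && python3 scripts/subtitle_proofreader.py --all"),
]

crontab_lines = [cron_entry(h, m, cmd) for h, m, cmd in SCHEDULE]
print("\n".join(crontab_lines))
```

On a server running in another timezone, the hour fields would need to be shifted accordingly.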
DeepSeek Proofreading
What Gets Fixed
- Speech recognition errors (e.g., "noraster" → "nor'easter")
- Name errors (e.g., "trunk" → "Trump")
- Location name errors
- Professional terminology errors
- Obvious spelling mistakes
Output Files
For each subtitle file, the proofreader generates:
- -backup.vtt - Original subtitle (never modified)
- -corrected.txt - AI-corrected plain text
- -corrections.md - List of corrections made
Troubleshooting
Video Download Fails
- Check that yt-dlp is installed: yt-dlp --version
- Check that the YouTube URL is accessible
- Try a manual download first
Subtitle Extraction Fails
- Some videos don't have auto-captions
- Check whether --list-subs shows available languages
Server Won't Start
- Check that ports 8093/8095 are free: lsof -i :8093
- Check that Python http.server is available
Proofreading Issues
- Ensure the DeepSeek model is available
- Check that the subtitle file exists and is in valid VTT format
See Also
- references/workflow.md - Detailed workflow documentation
- references/cron-setup.md - Cron job configuration guide
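Some of the checks listed under Troubleshooting can be scripted. The following is a small sketch that tests whether yt-dlp is on the PATH and whether anything is already listening on the two server ports. The helper names are hypothetical, and probing `127.0.0.1` is an assumption about where the servers run.

```python
import shutil
import socket


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise
        return s.connect_ex((host, port)) == 0


def preflight(ports=(8093, 8095)) -> dict:
    """Collect the basic checks into one report."""
    report = {"yt-dlp installed": tool_available("yt-dlp")}
    for port in ports:
        report[f"port {port} free"] = not port_in_use(port)
    return report


print(preflight())
```

A port reported as not free before the servers are started suggests another process is holding it, which `lsof -i :8093` can then identify.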