Version
ai-video-remix 0.1.2
- Converted this skill to an instruction-only/documentation format.
- Added explicit notice that runtime source code must be cloned separately; no code is bundled.
- Provided GitHub source and homepage links in metadata.
- Updated Quick Start instructions to clarify cloning and setup process.
- No code or functional changes included in this release.
Skill documentation
This is an instruction-only skill — it provides guidance and reference documentation for the AI Video Remix CLI tool. The runtime source code lives in the GitHub repository and must be cloned separately (see Quick Start below).
Generate styled video compositions from a local ShotAI video library using natural language.
Important: Video Library Requirement
This skill can only search and use videos that have been imported into ShotAI. Videos simply stored on your hard drive are not searchable — they must be added to a ShotAI collection and fully indexed first.
Before using this skill, make sure you have:
- Opened ShotAI and created a collection
- Added video files or folders to the collection
- Waited for indexing to complete (shot detection + semantic analysis; progress is shown in ShotAI)
If the search returns no results or low-quality matches, the most common reason is that the relevant videos have not been imported into ShotAI yet.
Prerequisites
See references/setup.md for full installation instructions, including:
- ShotAI download and setup
- ffmpeg installation
- yt-dlp installation (for auto music)
- Node.js dependencies
Quick Start
Note: This skill does not bundle runtime code. Clone the source repository first.
git clone https://github.com/abu-ShotAI/ai-video-remix.git
cd ai-video-remix
npm install
cp .env.example .env # fill in SHOTAI_URL, SHOTAI_TOKEN, and optionally AGENT_PROVIDER
npx tsx src/skill/cli.ts "帮我做一个旅行混剪"
Pipeline (8 steps)
- Agent: parseIntent. The LLM extracts the theme, selects a composition, and optionally overrides the music style.
- Agent: refineQueries. The LLM rewrites per-slot search terms to match the library content.
- ShotAI: pickShots. Semantic search per slot via the local ShotAI MCP server (localhost only); the best shot is selected.
- Music: resolveMusic. Uses a local MP3 via --bgm (recommended), or optionally downloads from YouTube via yt-dlp.
- ffmpeg: extractClip. Each shot is trimmed to an independent .mp4 clip file (local processing only).
- Agent: annotateClips. The LLM assigns per-clip visual effect params (tone, dramatic, kenBurns, caption).
- File Server. A localhost-only HTTP server (127.0.0.1) serves clips to the Remotion renderer on the same machine.
- Remotion: render. The composition is rendered to the final MP4.
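The eight steps above form a fixed, ordered chain. The sketch below only illustrates that ordering and which steps the list attributes to the LLM. The `PipelineStep` type, the `PIPELINE` array, the `stepsOwnedBy` helper, and the `serveClips` name for the file-server step are hypothetical and are not taken from the repository.

```typescript
// Hypothetical model of the pipeline; step names follow the list above,
// while the types, the array, and the helper are illustrative assumptions.
type Owner = "Agent" | "ShotAI" | "Music" | "ffmpeg" | "File Server" | "Remotion";

interface PipelineStep {
  owner: Owner;
  name: string;
  needsLlm: boolean; // true for the steps the list attributes to the LLM
}

const PIPELINE: readonly PipelineStep[] = [
  { owner: "Agent", name: "parseIntent", needsLlm: true },
  { owner: "Agent", name: "refineQueries", needsLlm: true },
  { owner: "ShotAI", name: "pickShots", needsLlm: false },
  { owner: "Music", name: "resolveMusic", needsLlm: false },
  { owner: "ffmpeg", name: "extractClip", needsLlm: false },
  { owner: "Agent", name: "annotateClips", needsLlm: true },
  { owner: "File Server", name: "serveClips", needsLlm: false }, // name invented for this sketch
  { owner: "Remotion", name: "render", needsLlm: false },
];

// Return the names of all steps handled by one component, in pipeline order.
function stepsOwnedBy(owner: Owner): string[] {
  return PIPELINE.filter((s) => s.owner === owner).map((s) => s.name);
}

// Print the steps in execution order, numbered from 1.
PIPELINE.forEach((s, i) => console.log(`${i + 1}. ${s.owner}: ${s.name}`));
```

Keeping the steps in a single ordered array makes it easy to see that the three LLM-driven steps are interleaved with purely local ones.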
CLI Usage
After cloning the repository and running npm install:
npx tsx src/skill/cli.ts "" [options]
Options:
--composition Override composition (skip LLM selection)
--bgm Local MP3 path (skip YouTube search)
--output Output directory (default: ./output)
--lang Output language: zh Chinese (default) / en English
Affects: video title, per-clip captions & location labels, attribution line
--probe Scan library first, let LLM plan slots from actual content
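To make the option semantics concrete, here is a minimal sketch of how these flags could be parsed into a typed object. It is not the CLI's actual parser: `CliOptions` and `parseCliArgs` are hypothetical names, and the only behavior assumed is what the option list states (defaults of ./output and zh, and --probe as a boolean switch).

```typescript
// Hypothetical parser for the options listed above; the real CLI may differ.
interface CliOptions {
  request: string;       // the natural-language request
  composition?: string;  // --composition
  bgm?: string;          // --bgm
  output: string;        // --output, default ./output
  lang: "zh" | "en";     // --lang, default zh
  probe: boolean;        // --probe
}

function parseCliArgs(argv: string[]): CliOptions {
  const opts: CliOptions = { request: "", output: "./output", lang: "zh", probe: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--probe") {
      opts.probe = true;
    } else if (arg === "--composition" || arg === "--bgm" || arg === "--output" || arg === "--lang") {
      const value = argv[++i]; // these flags take exactly one value
      if (value === undefined) throw new Error(`Missing value for ${arg}`);
      if (arg === "--composition") opts.composition = value;
      else if (arg === "--bgm") opts.bgm = value;
      else if (arg === "--output") opts.output = value;
      else if (value === "zh" || value === "en") opts.lang = value;
      else throw new Error(`Unsupported --lang value: ${value}`);
    } else if (arg.startsWith("--")) {
      throw new Error(`Unknown option: ${arg}`);
    } else {
      opts.request = arg; // first positional argument is the request
    }
  }
  if (!opts.request) throw new Error("A natural-language request is required");
  return opts;
}

// Example: a local music file plus probe mode, other options left at their defaults.
console.log(parseCliArgs(["帮我做一个旅行混剪", "--bgm", "./music.mp3", "--probe"]));
```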
Compositions
| ID | Label | Best For |
|---|---|---|
| CyberpunkCity | 赛博朋克夜景 (Cyberpunk Night Scene) | Neon city, night scenes, sci-fi |
| TravelVlog | 旅行 Vlog (Travel Vlog) | Multi-city travel with location cards |
| MoodDriven | 情绪驱动混剪 (Mood-Driven Remix) | Fast/slow emotion cuts |
| NatureWild | 自然野生动物 (Nature and Wildlife) | BBC nature documentary style |
| SwitzerlandScenic | 瑞士风光 (Swiss Scenery) | Alpine/scenic travel with captions |
| SportsHighlight | 体育集锦 (Sports Highlights) | ESPN-style with goal captions |
Modes
Standard mode (default): The LLM picks a composition and generates search queries from registry templates.
Probe mode (--probe): Scans the library videos first (names, shot samples, mood/scene tags), then the LLM generates custom slots tailored to what actually exists.
Choose probe mode when: library content is unknown, user wants "best of my library", or standard slots return low-quality shots.
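The three conditions above amount to a simple decision rule. The sketch below encodes it; `ModeSignals`, `chooseMode`, and `modeFlags` are hypothetical names introduced for illustration, and nothing here is confirmed to exist in the tool itself.

```typescript
// Hypothetical encoding of the "when to choose probe mode" guidance above.
interface ModeSignals {
  libraryContentKnown: boolean;          // false when the library content is unknown
  wantsBestOfLibrary: boolean;           // user asked for a "best of my library" cut
  standardShotsWereLowQuality: boolean;  // a standard run returned low-quality shots
}

type Mode = "standard" | "probe";

function chooseMode(s: ModeSignals): Mode {
  const probe = !s.libraryContentKnown || s.wantsBestOfLibrary || s.standardShotsWereLowQuality;
  return probe ? "probe" : "standard";
}

// Translate the chosen mode into the CLI flag documented above.
function modeFlags(mode: Mode): string[] {
  return mode === "probe" ? ["--probe"] : [];
}

console.log(modeFlags(chooseMode({
  libraryContentKnown: false,
  wantsBestOfLibrary: false,
  standardShotsWereLowQuality: false,
})));
```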
Environment Variables
See references/config.md for all environment variables and LLM provider setup.
Troubleshooting & Quality Tuning
See references/tuning.md for solutions to:
- Clip boundary flicker / 1–2 frame flash at cuts
- Red flash artifact in CyberpunkCity (GlitchFlicker on short clips)
- Low-quality or off-topic shots
- Music download failures
Recommended .env defaults for best quality:
MIN_SCORE=0.5 # filter short/low-quality shots
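As a rough illustration of what a score threshold does, the sketch below drops candidates under MIN_SCORE and keeps the highest-scoring one. The `ShotCandidate` shape, both function names, and the assumption that scores lie in [0, 1] with higher meaning better are all hypothetical; how the tool actually computes and applies the score is documented in references/tuning.md, not here.

```typescript
// Hypothetical candidate shape; the real ShotAI result fields may differ.
interface ShotCandidate {
  videoPath: string;
  startSec: number;
  endSec: number;
  score: number; // assumed: similarity in [0, 1], higher is better
}

// Read MIN_SCORE from a raw env string, falling back to the recommended 0.5.
function parseMinScore(raw: string | undefined, fallback = 0.5): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) ? n : fallback;
}

// Keep only candidates at or above the threshold, then take the best one.
function pickBestShot(candidates: ShotCandidate[], minScore: number): ShotCandidate | undefined {
  return candidates
    .filter((c) => c.score >= minScore)
    .sort((a, b) => b.score - a.score)[0];
}

// Made-up sample values, for illustration only.
const sample: ShotCandidate[] = [
  { videoPath: "a.mp4", startSec: 0, endSec: 3, score: 0.42 },
  { videoPath: "b.mp4", startSec: 10, endSec: 14, score: 0.81 },
  { videoPath: "c.mp4", startSec: 5, endSec: 9, score: 0.66 },
];
console.log(pickBestShot(sample, parseMinScore("0.5")));
```

Returning `undefined` when nothing clears the threshold keeps an empty result explicit instead of silently substituting a weak match.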
Writing ShotAI Search Queries
ShotAI uses semantic search powered by AI-generated tags and embedding vectors. Query quality is the single biggest factor in shot relevance — invest time here.
Query construction rules
Always write full sentences or rich phrases, never bare keywords.
The search engine understands semantic similarity ("ocean" matches "sea", "waves", "shoreline"), so richer context produces better recall.
| Quality | Example | When to use |
|---|---|---|
| ⭐ Detailed description | "A white seagull with spread wings gliding smoothly over calm blue ocean water, golden sunset light reflecting on the waves" | Best precision — use for hero shots |
| ⭐ Full sentence | "A seagull flying gracefully over the ocean at sunset" | Good balance of precision and recall |
| Short phrase | "seagull flying over ocean" | Acceptable fallback |
| Single keyword | "seagull" | Avoid — low precision, noisy results |
What to include in a query
Describe the visual content of the ideal shot across these dimensions:
- Subject: what or who is in the frame (lone hiker, city traffic at night, athlete celebrating)
- Action: what is happening (walking slowly through fog, speeding through an intersection, jumping with arms raised)
- Environment: location, setting, time of day (rain-soaked Tokyo street, mountain meadow at golden hour, empty stadium under floodlights)
- Mood / atmosphere: emotional tone (melancholic, tense, euphoric, serene)
- Camera feel: implied movement or framing (wide establishing shot, tight close-up, slow pan, handheld shaky)
Not all dimensions are needed every time — include whichever are most distinctive for the shot you want.
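One way to apply these dimensions consistently is to assemble the query from optional fields, skipping whichever are not distinctive for the shot. The helper below is a sketch under that assumption; `ShotDescription` and `buildQuery` are hypothetical names, and the phrase order is a stylistic choice rather than anything ShotAI requires.

```typescript
// Hypothetical helper: compose a rich query phrase from the five dimensions.
interface ShotDescription {
  subject?: string;
  action?: string;
  environment?: string;
  mood?: string;
  camera?: string;
}

function buildQuery(d: ShotDescription): string {
  // Subject and action read naturally as one clause.
  const core = [d.subject, d.action].filter(Boolean).join(" ");
  const parts = [core, d.environment, d.mood ? `${d.mood} mood` : undefined, d.camera];
  // Drop missing or blank dimensions, then join the rest.
  return parts.filter((p): p is string => Boolean(p && p.trim())).join(", ");
}

console.log(buildQuery({
  subject: "lone hiker",
  action: "walking slowly through fog",
  environment: "mountain meadow at golden hour",
  mood: "serene",
  camera: "wide establishing shot",
}));
// prints "lone hiker walking slowly through fog, mountain meadow at golden hour, serene mood, wide establishing shot"
```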
refineQueries step
When the agent runs refineQueries, it rewrites the composition's default slot queries to better match the user's actual library. Apply these principles:
- Start from the slot's semantic intent: what emotional or narrative role does the shot play in the composition?
- Incorporate any context from the user's request: location names, event names, specific subjects mentioned
- Expand synonyms: if the slot says "water", try "river flowing through forest" or "lake reflecting mountains" based on what the library likely contains
- Avoid negations: "not indoors" does not work; instead describe the positive version ("outdoor daylight scene")
- One query per slot: make it specific rather than trying to cover multiple scenarios
Examples: slot query → refined query
Slot default: "city at night"
User request: "帮我做一个东京旅行混剪"
Refined: "Neon-lit Tokyo street at night, pedestrians crossing under glowing signs, rain reflections on pavement"
Slot default: "nature landscape"
User request: "trip to Patagonia last month"
Refined: "Dramatic Patagonia mountain landscape, snow-capped peaks under stormy clouds, vast open wilderness"
Slot default: "athlete in action"
User request: "basketball highlight from last game"
Refined: "Basketball player driving to the hoop, explosive movement, crowd in background blurred"
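Two of the refinement principles, avoiding negations and avoiding bare keywords, can be checked mechanically before a query is sent. The lint function below is a sketch of that idea. `lintQuery`, the three-word minimum, and the negation word list are assumptions chosen for illustration; they are not part of ShotAI or the agent.

```typescript
// Hypothetical pre-flight check for a refined query.
// Matches common English negations as whole words, plus contractions like "isn't".
const NEGATION = /\b(not|no|without|never)\b|n't\b/i;

function lintQuery(query: string): string[] {
  const problems: string[] = [];
  const words = query.trim().split(/\s+/).filter((w) => w.length > 0);
  // Assumed threshold: fewer than three words is treated as a bare keyword.
  if (words.length < 3) {
    problems.push("too short: write a full sentence or rich phrase");
  }
  if (NEGATION.test(query)) {
    problems.push("contains a negation: describe the positive version instead");
  }
  return problems;
}

console.log(lintQuery("not indoors"));            // two problems reported
console.log(lintQuery("outdoor daylight scene")); // no problems reported
```

A check like this cannot judge relevance; it only catches the two patterns the guidance explicitly warns against.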
Adding a New Composition
See references/composition-guide.md to add a new Remotion composition to the registry.
Safety and Fallback
Network & credential scope
- All credentials stay local. SHOTAI_TOKEN is sent only to the local ShotAI MCP server (127.0.0.1). LLM API keys (if configured) are sent only to their respective provider endpoints, never to ShotAI, YouTube, or any other service.
- The clip file server binds to 127.0.0.1 only (default port 8080). It is not accessible from other machines on the network. It serves temporary clip files to the Remotion renderer running on the same machine and shuts down after rendering completes.
- yt-dlp is optional. Use --bgm /path/to/local.mp3 to skip all YouTube network access. When yt-dlp is used, it only downloads a single background music track; no other data is sent to YouTube.
- LLM access is optional. Set AGENT_PROVIDER=none to run in heuristic mode with zero external network calls (aside from the local ShotAI MCP server).
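A caller who wants to enforce the local-only credential rule could refuse to attach SHOTAI_TOKEN unless the configured URL points at a loopback host. The guard below sketches that idea with the standard `URL` class. `isLoopbackUrl` is a hypothetical helper, the example path is invented, and the source does not say whether the tool performs such a check itself.

```typescript
// Hypothetical guard: true only when the URL's host is a loopback address.
function isLoopbackUrl(raw: string): boolean {
  let host: string;
  try {
    host = new URL(raw).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  // "localhost", IPv6 loopback (serialized with brackets), or any 127.x.x.x address.
  return host === "localhost" || host === "[::1]" || /^127(\.\d{1,3}){3}$/.test(host);
}

console.log(isLoopbackUrl("http://127.0.0.1:8080/clip.mp4"));  // path is illustrative
console.log(isLoopbackUrl("https://api.example.com/v1"));
```

Checking the parsed hostname rather than the raw string avoids being fooled by lookalike hosts such as 127.0.0.1.example.com.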
Error handling
- If SHOTAI_URL or SHOTAI_TOKEN is unset, display a warning: "ShotAI MCP server not configured. Set SHOTAI_URL and SHOTAI_TOKEN in the .env file. Download ShotAI at https://www.shotai.io."
- If the ShotAI MCP server returns an error (connection refused, HTTP 4xx/5xx), display the error message and stop; do not fabricate shot results.
- Never fabricate video file paths, shot timestamps, or similarity scores.
- If the music download fails (yt-dlp error or network unreachable), suggest using --bgm to provide a local audio file instead.
- If the Remotion render fails, display the error output and suggest checking the Node.js version (18+) and that all clip files were extracted successfully.
- If the LLM provider is unreachable, fall back to heuristic mode: use the composition's default queries directly without refinement, and skip annotateClips (use the composition's default effect params).
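Two of these rules are easy to express as small pure functions: the configuration warning and the heuristic-mode fallback. The sketch below assumes the environment is passed in as a plain object; `checkShotAiConfig`, `planAgentSteps`, and `AgentPlan` are hypothetical names, and only the warning text and the fallback behavior come from the rules above.

```typescript
type Env = Record<string, string | undefined>;

// Warning text as given in the error-handling rules.
const SHOTAI_WARNING =
  "ShotAI MCP server not configured. Set SHOTAI_URL and SHOTAI_TOKEN in the .env file. " +
  "Download ShotAI at https://www.shotai.io.";

// Return the warning when either variable is missing or empty, else null.
function checkShotAiConfig(env: Env): string | null {
  const missing = !env["SHOTAI_URL"] || !env["SHOTAI_TOKEN"];
  return missing ? SHOTAI_WARNING : null;
}

interface AgentPlan {
  refineQueries: boolean; // false: use the composition's default queries as-is
  annotateClips: boolean; // false: use the composition's default effect params
}

// Heuristic-mode fallback: both optional LLM steps are skipped together.
function planAgentSteps(llmReachable: boolean): AgentPlan {
  return { refineQueries: llmReachable, annotateClips: llmReachable };
}

// Token is missing here, so the warning is returned.
console.log(checkShotAiConfig({ SHOTAI_URL: "http://127.0.0.1:8080" }));
console.log(planAgentSteps(false));
```

Neither function invents results on failure: a missing configuration yields a message and an unreachable LLM yields the documented defaults, in line with the rule against fabricating shot data.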
License
MIT-0 — Free to use, modify, and redistribute. No attribution required. See https://spdx.org/licenses/MIT-0.html