Getting Started
Send me your still images and I'll handle the AI motion video creation. Or just describe what you're after.
Try saying:
- "convert a single product photo or landscape image into a 1080p MP4"
- "animate this photo with a slow zoom and gentle camera pan"
- "turn static photos into moving video clips for social media creators"
Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Generate a UUID as client identifier
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN
Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
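As a minimal sketch, the token and session steps above can be expressed as request builders (the helper names and returned dict shape are my own; the URLs, headers, and body fields come from this document):

```python
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"

def anonymous_token_request():
    """Build the free-starter-token request used when NEMO_TOKEN is absent."""
    client_id = str(uuid.uuid4())  # UUID as client identifier
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }

def create_session_request(token):
    """Build the session-creation request; the response carries session_id."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {"task_name": "project", "language": "en"},
    }
```

Feed either dict to your HTTP client of choice; the point is only to show which fields go where.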
Tell the user you're ready. Keep the technical details out of the chat.
# Image to Video Motion — Animate Images Into Video Clips
This tool takes your still images and runs AI motion video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.
Say you have a single product photo or landscape image and want to animate this photo with a slow zoom and gentle camera pan — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.
Tip: high-contrast images with clear subjects produce smoother motion results.
Matching Input to Actions
User prompts referencing image to video motion, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
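The routing table can be approximated with a keyword matcher (a sketch; the function name and exact keyword sets are illustrative assumptions, and the real backend may classify intent more loosely):

```python
def route(user_text):
    """Map a user message to an action per the routing table above."""
    text = user_text.lower()
    if any(k in text for k in ("export", "导出", "download", "send me the video")):
        return "export"   # §3.5, skip SSE
    if any(k in text for k in ("credit", "积分", "balance", "余额")):
        return "credits"  # §3.3, skip SSE
    if any(k in text for k in ("status", "状态", "show tracks")):
        return "state"    # §3.4, skip SSE
    if any(k in text for k in ("upload", "上传")):
        return "upload"   # §3.2, skip SSE
    return "sse"          # everything else goes through §3.1 SSE
```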
Cloud Render Pipeline Details
Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.
Include Authorization: Bearer &lt;token&gt; and all attribution headers on every request — omitting them triggers a 402 on export.
Three attribution headers are required on every request and must match this file's frontmatter:
| Header | Value |
|---|---|
| X-Skill-Source | image-to-video-motion |
| X-Skill-Version | frontmatter version |
| X-Skill-Platform | auto-detected from install path: clawhub / cursor / unknown |
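Putting auth and attribution together, a header builder might look like this (the function name and parameters are illustrative; the header names come from the table above):

```python
def request_headers(token, client_id, skill_version, platform="unknown"):
    """Headers required on every request: Bearer auth, the client id,
    and the three attribution headers matching this file's frontmatter."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Client-Id": client_id,
        "X-Skill-Source": "image-to-video-motion",
        "X-Skill-Version": skill_version,
        "X-Skill-Platform": platform,  # clawhub / cursor / unknown
    }
```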
API base: https://mega-api-prod.nemovideo.ai
- Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"en"} — returns task_id, session_id.
- Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"&lt;session_id&gt;","new_message":{"parts":[{"text":"&lt;prompt&gt;"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
- Upload: POST /api/upload-video/nemo_agent/me/&lt;session_id&gt; — file: multipart -F "files=@/path", or URL: {"urls":["&lt;url&gt;"],"source_type":"url"}
- Credits: GET /api/credits/balance/simple — returns available, frozen, total
- Session state: GET /api/state/nemo_agent/me/&lt;session_id&gt;/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
- Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_&lt;id&gt;","sessionId":"&lt;session_id&gt;","draft":&lt;draft&gt;,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/&lt;render_id&gt; every 30s until status = completed. Download URL at output.url.
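The export polling step can be sketched as a small loop (the helper name and the injected fetch_status callable are assumptions; fetch_status stands in for the GET on the render proxy so the loop stays testable):

```python
import time

def poll_export(fetch_status, render_id, interval=30, attempts=20):
    """Poll the render job until completed, then return the download URL
    found at output.url. fetch_status(render_id) must return the job's
    status JSON as a dict."""
    for _ in range(attempts):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        time.sleep(interval)  # spec says poll every 30s
    raise TimeoutError(f"render {render_id} did not complete in time")
```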
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
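A pre-flight check mirroring errors 4001/4002 can spare a round trip (a sketch; the 50MB limit comes from the Tips section below, and the helper name is my own):

```python
SUPPORTED = {"mp4", "mov", "avi", "webm", "mkv", "jpg", "png", "gif",
             "webp", "mp3", "wav", "m4a", "aac"}

def check_upload(filename, size_bytes, max_mb=50):
    """Return 0 if the file looks acceptable, else the matching error code:
    4001 for an unsupported extension, 4002 for an oversized file."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in SUPPORTED:
        return 4001
    if size_bytes > max_mb * 1024 * 1024:
        return 4002
    return 0
```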
Error Codes
0 — success, continue normally
1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
1002 — session not found; create a new one
2001 — out of credits; anonymous users get a registration link with ?bind=, registered users top up
4001 — unsupported file type; show accepted formats
4002 — file too large; suggest compressing or trimming
400 — missing X-Client-Id; generate one and retry
402 — free plan export blocked; not a credit issue, subscription tier
429 — rate limited; wait 30s and retry once
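These recovery rules collapse naturally into a lookup (a sketch; the action strings paraphrase the list above, and the function name is my own):

```python
def handle_error(code):
    """Map a backend error code to the recovery action from the list above."""
    actions = {
        0:    "continue",
        1001: "reacquire token via /api/auth/anonymous-token",
        1002: "create a new session",
        2001: "offer registration link or top-up",
        4001: "show accepted formats",
        4002: "suggest compressing or trimming",
        400:  "generate X-Client-Id and retry",
        402:  "explain subscription tier limit",
        429:  "wait 30s and retry once",
    }
    return actions.get(code, "unknown error; surface to user")
```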
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Reading the SSE Stream
Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.
About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.
Example timeline summary:
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
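Assuming the short-key glossary above, a draft summarizer might look like this (the exact draft shape, in particular a name field under the m metadata key, is an assumption for illustration):

```python
def summarize_draft(draft):
    """Render a draft's short-key JSON (t/tt/sg/d/m) into a timeline summary."""
    labels = {0: "Video", 1: "Audio", 7: "Text"}  # tt = track type
    lines = [f"Timeline ({len(draft['t'])} tracks):"]
    for i, track in enumerate(draft["t"], 1):
        kind = labels.get(track["tt"], "Other")
        total_ms = sum(seg["d"] for seg in track.get("sg", []))  # segment durations
        name = track.get("m", {}).get("name", "?")
        lines.append(f"{i}. {kind}: {name} (0-{total_ms // 1000}s)")
    return "\n".join(lines)
```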
Tips and Tricks
The backend processes faster when you're specific. Instead of "make it look better", try "animate this photo with a slow zoom and gentle camera pan" — concrete instructions get better results.
Max file size is 50MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.
Export as MP4 for widest compatibility across social platforms.
Common Workflows
Quick edit: Upload → "animate this photo with a slow zoom and gentle camera pan" → Download MP4. Takes 30-60 seconds for a 30-second clip.
Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.
Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.