Getting Started
Send me your still images and I'll handle the AI motion video creation. Or just describe what you're after.
Try saying:
- "convert a single product photo or landscape image into a 1080p MP4"
- "animate this photo with a slow zoom and gentle camera pan"
- "turn static photos into moving video clips for social media creators"
Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Generate a UUID as client identifier
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN
Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
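As a minimal sketch, the token and session steps above can be expressed as request builders (the helper names and returned dict shape are my own; the URLs, headers, and body fields come from this document):

```python
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"

def anonymous_token_request():
    """Build the free-starter-token request used when NEMO_TOKEN is absent."""
    client_id = str(uuid.uuid4())  # UUID as client identifier
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }

def create_session_request(token):
    """Build the session-creation request; the response carries session_id."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {"task_name": "project", "language": "en"},
    }
```

Feed either dict to your HTTP client of choice; the point is only to show which fields go where.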
Tell the user you're ready. Keep the technical details out of the chat.
# Image to Video Motion — Animate Images Into Video Clips
This tool takes your still images and runs AI motion video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.
Say you have a single product photo or landscape image and want to animate this photo with a slow zoom and gentle camera pan — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.
Tip: high-contrast images with clear subjects produce smoother motion results.
Matching Input to Actions
User prompts referencing image to video motion, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
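The routing table can be approximated with a keyword matcher (a sketch; the function name and exact keyword sets are illustrative assumptions, and the real backend may classify intent more loosely):

```python
def route(user_text):
    """Map a user message to an action per the routing table above."""
    text = user_text.lower()
    if any(k in text for k in ("export", "导出", "download", "send me the video")):
        return "export"   # §3.5, skip SSE
    if any(k in text for k in ("credit", "积分", "balance", "余额")):
        return "credits"  # §3.3, skip SSE
    if any(k in text for k in ("status", "状态", "show tracks")):
        return "state"    # §3.4, skip SSE
    if any(k in text for k in ("upload", "上传")):
        return "upload"   # §3.2, skip SSE
    return "sse"          # everything else goes through §3.1 SSE
```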
Cloud Render Pipeline Details
Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.
Include Authorization: Bearer &lt;token&gt; and all attribution headers on every request — omitting them triggers a 402 on export.
Three attribution headers are required on every request and must match this file's frontmatter:
| Header | Value |
|---|---|
| X-Skill-Source | image-to-video-motion |
| X-Skill-Version | frontmatter version |
| X-Skill-Platform | auto-detected from install path: clawhub / cursor / unknown |
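Putting auth and attribution together, a header builder might look like this (the function name and parameters are illustrative; the header names come from the table above):

```python
def request_headers(token, client_id, skill_version, platform="unknown"):
    """Headers required on every request: Bearer auth, the client id,
    and the three attribution headers matching this file's frontmatter."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Client-Id": client_id,
        "X-Skill-Source": "image-to-video-motion",
        "X-Skill-Version": skill_version,
        "X-Skill-Platform": platform,  # clawhub / cursor / unknown
    }
```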
API base: https://mega-api-prod.nemovideo.ai
- Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"en"} — returns task_id, session_id.
- Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"&lt;session_id&gt;","new_message":{"parts":[{"text":"&lt;prompt&gt;"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
- Upload: POST /api/upload-video/nemo_agent/me/&lt;session_id&gt; — file: multipart -F "files=@/path", or URL: {"urls":["&lt;url&gt;"],"source_type":"url"}
- Credits: GET /api/credits/balance/simple — returns available, frozen, total
- Session state: GET /api/state/nemo_agent/me/&lt;session_id&gt;/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
- Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_&lt;id&gt;","sessionId":"&lt;session_id&gt;","draft":&lt;draft&gt;,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/&lt;render_id&gt; every 30s until status = completed. Download URL at output.url.
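The export polling step can be sketched as a small loop (the helper name and the injected fetch_status callable are assumptions; fetch_status stands in for the GET on the render proxy so the loop stays testable):

```python
import time

def poll_export(fetch_status, render_id, interval=30, attempts=20):
    """Poll the render job until completed, then return the download URL
    found at output.url. fetch_status(render_id) must return the job's
    status JSON as a dict."""
    for _ in range(attempts):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        time.sleep(interval)  # spec says poll every 30s
    raise TimeoutError(f"render {render_id} did not complete in time")
```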
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
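A pre-flight check mirroring errors 4001/4002 can spare a round trip (a sketch; the 50MB limit comes from the Tips section below, and the helper name is my own):

```python
SUPPORTED = {"mp4", "mov", "avi", "webm", "mkv", "jpg", "png", "gif",
             "webp", "mp3", "wav", "m4a", "aac"}

def check_upload(filename, size_bytes, max_mb=50):
    """Return 0 if the file looks acceptable, else the matching error code:
    4001 for an unsupported extension, 4002 for an oversized file."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in SUPPORTED:
        return 4001
    if size_bytes > max_mb * 1024 * 1024:
        return 4002
    return 0
```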
Error Codes
0 — success, continue normally
1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
1002 — session not found; create a new one
2001 — out of credits; anonymous users get a registration link with ?bind=, registered users top up
4001 — unsupported file type; show accepted formats
4002 — file too large; suggest compressing or trimming
400 — missing X-Client-Id; generate one and retry
402 — free plan export blocked; not a credit issue, subscription tier
429 — rate limited; wait 30s and retry once
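These recovery rules collapse naturally into a lookup (a sketch; the action strings paraphrase the list above, and the function name is my own):

```python
def handle_error(code):
    """Map a backend error code to the recovery action from the list above."""
    actions = {
        0:    "continue",
        1001: "reacquire token via /api/auth/anonymous-token",
        1002: "create a new session",
        2001: "offer registration link or top-up",
        4001: "show accepted formats",
        4002: "suggest compressing or trimming",
        400:  "generate X-Client-Id and retry",
        402:  "explain subscription tier limit",
        429:  "wait 30s and retry once",
    }
    return actions.get(code, "unknown error; surface to user")
```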
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Reading the SSE Stream
Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.
About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.
Example timeline summary:
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
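Assuming the short-key glossary above, a draft summarizer might look like this (the exact draft shape, in particular a name field under the m metadata key, is an assumption for illustration):

```python
def summarize_draft(draft):
    """Render a draft's short-key JSON (t/tt/sg/d/m) into a timeline summary."""
    labels = {0: "Video", 1: "Audio", 7: "Text"}  # tt = track type
    lines = [f"Timeline ({len(draft['t'])} tracks):"]
    for i, track in enumerate(draft["t"], 1):
        kind = labels.get(track["tt"], "Other")
        total_ms = sum(seg["d"] for seg in track.get("sg", []))  # segment durations
        name = track.get("m", {}).get("name", "?")
        lines.append(f"{i}. {kind}: {name} (0-{total_ms // 1000}s)")
    return "\n".join(lines)
```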
Tips and Tricks
The backend processes faster when you're specific. Instead of "make it look better", try "animate this photo with a slow zoom and gentle camera pan" — concrete instructions get better results.
Max file size is 50MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.
Export as MP4 for widest compatibility across social platforms.
Common Workflows
Quick edit: Upload → "animate this photo with a slow zoom and gentle camera pan" → Download MP4. Takes 30-60 seconds for a 30-second clip.
Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.
Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.