🎼 ACE Step — Pro Pack on RunComfy

v0.1.0

Generate, inpaint, and outpaint music with ACE Step on RunComfy via the `runcomfy` CLI. ACE Step is StepFun-AI's open-weights music foundation model: tag-driven composition (genre, mood, instruments), multilingual lyrics with section markers, 5 s to 4 min stereo output, $0.0002–0.0003 per second (≈ 27× cheaper than ElevenLabs Music). Four endpoints: ACE Step text-to-audio (the default), ACE Step 1.5 text-to-audio (50+ language lyrics, refined structured-lyric handling), ACE Step audio-inpaint (regenerate a time range inside an existing track), ACE Step audio-outpaint (extend an existing track before or after). Triggers on "ace step", "ace-step", "acestep", "ACE music", "open music model", "cheap AI music", "inpaint audio", "audio inpaint", "extend music", "audio outpaint", "lengthen track", "music with tags", or any explicit ask to generate or edit music with ACE Step.


by @kalvinrv (Kalvin)·MIT-0


License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install ace-step

Mirror (accelerated): npx clawhub@latest install ace-step --registry https://cn.longxiaskill.com (mirror sync in progress)

Localization notes

Install via OpenClaw: openclaw skills install ace-step


Skill documentation

🎼 ACE Step — Pro Pack on RunComfy

Tag-driven music generation, inpainting, and outpainting with StepFun-AI's ACE Step open-weights model. Four CLI-reachable endpoints, $0.0002–0.0003 per second of audio, up to 4 minutes per call.

runcomfy.com · ACE Step base · ACE Step 1.5 · CLI docs

Powered by the RunComfy CLI

# 1. Install (one of the following; see the runcomfy-cli skill for details)
npm i -g @runcomfy/cli              # global install
npx -y @runcomfy/cli --version      # zero-install

# 2. Sign in
runcomfy login                      # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Generate
runcomfy run acestep-ai/ace-step/text-to-audio \
  --input '{"tags": "..."}' \
  --output-dir ./out

CLI deep dive: the runcomfy-cli skill.

Pick the right endpoint

Listed newest first.

ACE Step 1.5 (text-to-audio): acestep-ai/ace-step-1.5/text-to-audio

Latest ACE Step generation. 50+ language vocal support, refined structured-lyric handling, otherwise same shape as base. Costs 1.5× the base rate ($0.0003/s vs $0.0002/s). Pick for: multilingual lyrics, hero-quality vocal tracks, vocal songs that need clean section structure. Avoid for: cost-sensitive batches where the base model is good enough.

ACE Step (text-to-audio): acestep-ai/ace-step/text-to-audio (default; cheap & fast)

Original ACE Step. Tag-driven composition, optional lyrics, 5–240 s stereo. $0.0002/s — ~27× cheaper than ElevenLabs Music. Pick for: high-volume drafts, background music, jingles, game loops, cost-sensitive iteration. Avoid for: maximally polished commercial vocal hooks — try ACE Step 1.5 or ElevenLabs Music for those.

ACE Step (audio-inpaint): acestep-ai/ace-step/audio-inpaint

Regenerate a time range inside an existing track (not mask-based; uses start_time / end_time in seconds, each anchored to the track start or end). Pick for: fixing a bad chorus in the middle, swapping the bridge, replacing a 20 s section without re-rendering the whole song. Avoid for: edits that aren't time-bounded, which don't fit the schema.

ACE Step (audio-outpaint): acestep-ai/ace-step/audio-outpaint

Extend an existing track in either direction: add an intro before, an outro after, or both. Pick for: lengthening a 30 s draft into a 2 min cut, adding a fade-in, building a longer arrangement around an existing hook. Avoid for: extending a track past 4 min total in one call; chain calls instead.
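The endpoint choice above can be sketched as a small lookup. This is only an illustration: the task keywords (draft, multilingual, fix-section, extend) are hypothetical labels invented for this sketch, not CLI options, and the endpoint IDs are the ones listed in this section.

```shell
# Map a task keyword to an ACE Step endpoint ID.
# The keywords are hypothetical labels for this sketch; the IDs come from
# the endpoint list in this document.
ace_endpoint() {
  case "$1" in
    draft)        echo "acestep-ai/ace-step/text-to-audio" ;;      # cheap default
    multilingual) echo "acestep-ai/ace-step-1.5/text-to-audio" ;;  # 50+ language vocals
    fix-section)  echo "acestep-ai/ace-step/audio-inpaint" ;;      # regenerate a time range
    extend)       echo "acestep-ai/ace-step/audio-outpaint" ;;     # add intro / outro
    *)            echo "unknown task: $1" >&2; return 1 ;;
  esac
}
```

The printed ID can then be passed as the first argument to `runcomfy run`, for example `runcomfy run "$(ace_endpoint draft)" --input '{"tags": "..."}' --output-dir ./out`.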

Route 1: ACE Step text-to-audio (default)

Model: acestep-ai/ace-step/text-to-audio (or acestep-ai/ace-step-1.5/text-to-audio for the 1.5 variant)

Schema (both variants, same shape)

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| tags | string | yes | (none) | Comma-separated genre / mood / instrument tags. Drives composition |
| lyrics | string | no | (none) | Vocal content. Use section markers [Verse], [Chorus], [Bridge]. Use [inst] or [instrumental] for no vocals |
| duration | int | no | 60 | Audio length in seconds. 5–240 (max 4 min per call) |
| seed | int | no | -1 | Reproducibility; -1 randomizes |
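As a rough sketch of how the schema above might be enforced before a call, the helper below builds an instrumental payload and rejects durations outside the 5–240 s range. The function name is hypothetical, and it assumes the tags contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Build an instrumental text-to-audio payload: tags, duration, seed.
# Hypothetical helper, not part of the runcomfy CLI.
# Assumption: tags contain no double quotes or backslashes (no JSON escaping).
# The seed is passed through as an integer without range checks.
ace_instrumental_payload() {
  local tags="$1"
  local duration="${2:-60}"   # schema default: 60 s
  local seed="${3:--1}"       # schema default: -1 (random)
  case "$duration" in
    ''|*[!0-9]*) echo "duration must be a whole number of seconds" >&2; return 1 ;;
  esac
  if [ "$duration" -lt 5 ] || [ "$duration" -gt 240 ]; then
    echo "duration must be between 5 and 240 seconds" >&2
    return 1
  fi
  printf '{"tags": "%s", "lyrics": "[inst]", "duration": %d, "seed": %d}\n' \
    "$tags" "$duration" "$seed"
}
```

The output is meant to be used as the `--input` value, for example `--input "$(ace_instrumental_payload "lo-fi hip-hop, mellow" 90 42)"`. For lyrics with quotes or newlines, a proper JSON encoder would be a safer choice than string formatting.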

Pricing: ACE Step $0.0002/s · ACE Step 1.5 $0.0003/s. 60 s ≈ $0.012 / $0.018; 240 s ≈ $0.048 / $0.072.
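The pricing arithmetic is simply duration times the per-second rate. A minimal sketch, using a hypothetical helper name and the two rates quoted above:

```shell
# Estimate the cost of one call in USD: duration (seconds) x per-second rate.
# Hypothetical helper; rates quoted in this document are 0.0002 (ACE Step)
# and 0.0003 (ACE Step 1.5). LC_ALL=C keeps the decimal point a dot.
ace_cost() {
  LC_ALL=C awk -v d="$1" -v r="$2" 'BEGIN { printf "%.4f\n", d * r }'
}
```

For example, `ace_cost 60 0.0002` prints 0.0120 and `ace_cost 240 0.0003` prints 0.0720, matching the figures in the pricing line.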

Invoke

Tag-driven instrumental:

runcomfy run acestep-ai/ace-step/text-to-audio \
  --input '{
    "tags": "lo-fi hip-hop, mellow, vinyl crackle, rhodes piano, soft drums, 75 BPM",
    "lyrics": "[inst]",
    "duration": 90
  }' \
  --output-dir ./out

Full vocal song with structure (use 1.5 for multilingual):

runcomfy run acestep-ai/ace-step-1.5/text-to-audio \
  --input '{
    "tags": "indie pop, anthemic, electric guitar, driving drums, female vocal, 120 BPM",
    "lyrics": "[Verse]\nChalk on the palms, laces double-knotted\nMorning on the ridge, the sun is rising\n[Chorus]\nWe rise, we strike, we never fade out\nWe rise, we strike, we sing it loud\n[Bridge]\nSoft piano breakdown\n[Outro]\nFull band, fade",
    "duration": 60
  }' \
  --output-dir ./out
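Because the seed field in the schema makes a render reproducible, one way to compare variations is to render the same tags under several fixed seeds. The sketch below only prints the commands (a dry run) and makes several assumptions: the helper name is hypothetical, the 30 s duration is an arbitrary draft length, the tags contain no quotes, and writing each seed to its own ./out/seed-N directory is a convention chosen here, not documented CLI behavior.

```shell
# Print one `runcomfy run` command per seed for the same tags (dry run).
# Hypothetical helper; nothing is executed. Only the --input and
# --output-dir flags shown in this document are used.
# Assumptions: tags contain no quotes; per-seed output directories are a
# convention of this sketch.
ace_seed_sweep() {
  local tags="$1" seed
  shift
  for seed in "$@"; do
    printf 'runcomfy run acestep-ai/ace-step/text-to-audio'
    printf " --input '{\"tags\": \"%s\", \"lyrics\": \"[inst]\", \"duration\": 30, \"seed\": %d}'" \
      "$tags" "$seed"
    printf ' --output-dir ./out/seed-%d\n' "$seed"
  done
}
```

Calling `ace_seed_sweep "lo-fi hip-hop, mellow, 75 BPM" 1 2 3` prints three commands to review; piping the output to `sh` would run them.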

Prompting tips

- Tags do the heavy lifting, so be specific: "lo-fi hip-hop, mellow, vinyl crackle, rhodes piano, soft drums, 75 BPM" beats "chill music".
- Include BPM in tags when it matters; ACE respects tempo language.
- Lyrics with section markers: [Verse], [Chorus], [Bridge], [Outro]. Keep meter consistent across lines.
- Instrumental shortcut: "lyrics": "[inst]" or "[instrumental]". Belt-and-suspenders: also say "no vocals" in tags.
- Multilingual vocals: ACE Step 1.5 covers 50+ languages. Write lyrics directly in the target language; tag the language too ("japanese vocal, j-pop").
- Fix the seed for reproducibility ("seed": 42); use -1 to explore variations.
- Cheap draft → polish: iterate on tags with the base model at $0.0002/s before committing to a long render.

Route 2: ACE Step audio-inpaint

Model: acestep-ai/ace-step/audio-inpaint
Catalog: audio-inpaint

Schema

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| audio | string | yes | (none) | HTTPS URL to MP3 / WAV / FLAC. Up to 60 min |
| tags | string | yes | (none) | Comma-separated tags steering the regenerated segment |
| start_time | float | no | (none) | Start of editable segment, in seconds (0–240) |
| start_time_relative_to | enum | no | start | start or end; anchor for start_time e |

Source: ClawHub · Chinese localization: 龙虾技能库