🎤 Aliyun Speech Transcriber: Skill

v0.1.0

Transcribe publicly accessible audio or video URLs with Aliyun speech services. Use when the user wants speech-to-text via Aliyun DashScope, needs tra...

by @chenggongdu·MIT-0
License
MIT-0
Last updated
2026/3/24
Security scan
VirusTotal
Benign
OpenClaw
Safe
medium confidence
The skill's code and instructions are consistent with its stated purpose (transcribing public media URLs via Aliyun DashScope), but there are a few minor operational and security notes to be aware of before installing.
Recommendations
This skill appears to do what it says: submit public media URLs to Aliyun DashScope and return transcripts. Before installing: ensure Node.js is available on the environment (the package instructs running node but 'required binaries' was left empty in metadata); keep your ASR_DASHSCOPE_API_KEY secret (do not hardcode it); only transcribe URLs you control or explicitly trust. Be aware the script will fetch any transcription_url the provider returns and include that content in its output — if an u...
Detailed analysis
Purpose and capability
The skill's name, description, SKILL.md, and included script all align: they submit public media URLs to Aliyun DashScope and return transcript JSON/plain text. The declared required environment variable (ASR_DASHSCOPE_API_KEY with a DASHSCOPE_API_KEY fallback) matches the code. One incongruity: registry metadata lists no required binaries, but the runtime instructions and included file require running 'node scripts/transcribe.js' (i.e., Node.js must be available).
Instruction scope
The SKILL.md directs the agent to run the bundled Node script which only interacts with DashScope endpoints and the transcription result URLs. However, the script will fetch any transcription_url returned by DashScope and include that content in the printed JSON. If DashScope (or a malicious intermediary) returned a URL pointing at an internal endpoint or other unintended resource, the script would fetch and expose that content in the transcript output. The SKILL.md does include a safety rule to only send URLs the user intends to transcribe, but there is an inherent risk in following provider-supplied result URLs without additional validation.
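One way to reduce the risk described above is to check a provider-supplied result URL before fetching it. The following is a minimal sketch, not the bundled script's behavior: `isSafeResultUrl` and the `.aliyuncs.com` allowlist are hypothetical, and the assumption that DashScope result files are served from that domain should be verified before relying on it.

```javascript
// Hypothetical allowlist: assumes result files live on an Aliyun-hosted domain.
const ALLOWED_HOST_SUFFIXES = [".aliyuncs.com"];

// Return true only for HTTPS URLs whose hostname matches the allowlist.
function isSafeResultUrl(rawUrl, allowedSuffixes = ALLOWED_HOST_SUFFIXES) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  // Reject IPv4 and IPv6 literals outright (e.g. cloud metadata endpoints).
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host) || host.startsWith("[")) return false;

  return allowedSuffixes.some((suffix) => host.endsWith(suffix));
}
```

A caller would skip, or report as an error, any transcription_url for which this check returns false instead of fetching it.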
Install mechanism
No install spec (instruction-only with an included script). All code is provided in the package, so nothing is downloaded from unknown external URLs at install time. This is low installation risk.
Credential requirements
Only Aliyun DashScope API key environment variables are required (ASR_DASHSCOPE_API_KEY or DASHSCOPE_API_KEY); optional vars control model, language hints, and polling/timeouts. There are no unrelated credentials or broad access requests.
Persistence and privileges
The skill does not request permanent presence (always:false) and uses normal agent invocation. It does not modify other skills or system configs. This is proportionate for the stated function.
scripts/transcribe.js:33
Environment variable access combined with network send.
Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Versions

latest · v0.1.0 · 2026/3/24

Initial release of Aliyun Speech Transcriber skill.

  • Enables transcription of publicly accessible audio/video URLs via Aliyun DashScope.
  • Supports JSON and plain-text transcript extraction from media URLs.
  • Accepts multiple URLs and integrates with Qiniu-uploaded media.
  • Requires `ASR_DASHSCOPE_API_KEY` environment variable for authentication.
  • Provides configurable model, language hints, polling interval, and timeout options.
  • Returns structured JSON including transcript text and metadata.

● Benign

Install command

Official: npx clawhub@latest install aliyun-speech-transcriber
Mirror (accelerated): npx clawhub@latest install aliyun-speech-transcriber --registry https://cn.clawhub-mirror.com

Skill documentation

Use this skill to turn externally accessible media URLs into transcript results.

Current scope

Current implementation focuses on DashScope file transcription using the paraformer-v2 model, aligned with the existing Java service pattern.
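Because the skill exposes a task ID, a polling interval, and a timeout, the flow is presumably "submit, then poll until the task finishes". A minimal sketch of such a loop follows. `pollUntilDone` and `checkStatus` are hypothetical names, and the `SUCCEEDED` / `FAILED` status strings are assumed from DashScope's asynchronous task convention; this is not taken from the bundled script.

```javascript
// Terminal task states (assumed; verify against the current DashScope API reference).
const TERMINAL = new Set(["SUCCEEDED", "FAILED"]);

// checkStatus: hypothetical async function returning { status, message? } for one task.
// sleep is injectable so the loop can be exercised without real delays.
async function pollUntilDone(checkStatus, options = {}) {
  const {
    pollSeconds = 5,
    timeoutSeconds = 1800,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  const deadline = Date.now() + timeoutSeconds * 1000;

  while (true) {
    const task = await checkStatus();
    if (TERMINAL.has(task.status)) {
      if (task.status === "FAILED") {
        throw new Error(`Transcription task failed: ${task.message || "no message"}`);
      }
      return task; // SUCCEEDED
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutSeconds}s waiting for task`);
    }
    await sleep(pollSeconds * 1000);
  }
}
```

The default interval and timeout mirror the documented defaults for ALIYUN_SPEECH_POLL_SECONDS and ALIYUN_SPEECH_TIMEOUT_SECONDS.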

Required environment variables

  • ASR_DASHSCOPE_API_KEY

Fallback supported:

  • DASHSCOPE_API_KEY

Optional:

  • ALIYUN_SPEECH_MODEL - defaults to paraformer-v2
  • ALIYUN_SPEECH_LANG_HINTS - defaults to zh,en
  • ALIYUN_SPEECH_POLL_SECONDS - defaults to 5
  • ALIYUN_SPEECH_TIMEOUT_SECONDS - defaults to 1800
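
The variables above can be resolved into one settings object with the documented fallback and defaults. This is an illustrative sketch only: `loadConfig` and `positiveNumber` are hypothetical helpers, and the actual script may parse these values differently.

```javascript
// Resolve settings from an environment object (process.env by default).
function loadConfig(env = process.env) {
  // Primary key first, then the documented fallback.
  const apiKey = env.ASR_DASHSCOPE_API_KEY || env.DASHSCOPE_API_KEY;
  if (!apiKey) {
    // Fail fast rather than sending an unauthenticated request.
    throw new Error("Set ASR_DASHSCOPE_API_KEY (or DASHSCOPE_API_KEY) before running.");
  }
  return {
    apiKey,
    model: env.ALIYUN_SPEECH_MODEL || "paraformer-v2",
    languageHints: (env.ALIYUN_SPEECH_LANG_HINTS || "zh,en")
      .split(",")
      .map((hint) => hint.trim())
      .filter(Boolean),
    pollSeconds: positiveNumber(env.ALIYUN_SPEECH_POLL_SECONDS, 5),
    timeoutSeconds: positiveNumber(env.ALIYUN_SPEECH_TIMEOUT_SECONDS, 1800),
  };
}

// Parse a positive number, falling back when the value is missing or invalid.
function positiveNumber(raw, fallback) {
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain object instead of the real process environment.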

Inputs

Pass one or more externally accessible URLs:

node scripts/transcribe.js --file-url "https://example.com/audio.mp3"

Multiple files:

node scripts/transcribe.js --file-url "https://a.com/1.mp3" --file-url "https://a.com/2.mp3"
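
A repeatable flag like this can be collected with a small loop over the argument list. The sketch below is an assumption about how such parsing might look; `parseFileUrls` is a hypothetical helper and not necessarily how scripts/transcribe.js reads its arguments.

```javascript
// Collect every value passed via a repeatable --file-url flag.
function parseFileUrls(argv) {
  const urls = [];
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === "--file-url") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error("--file-url requires a value");
      }
      urls.push(value);
      i++; // skip the value that was just consumed
    }
  }
  if (urls.length === 0) {
    throw new Error("Pass at least one --file-url");
  }
  return urls;
}
```

In a script, this would typically be called as `parseFileUrls(process.argv.slice(2))`.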

Output

The script returns JSON with:

  • success
  • provider
  • engine
  • taskId
  • requestId
  • results
  • text

text is a best-effort plain-text extraction from the final JSON result.
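
To illustrate what "best-effort" extraction could mean, here is a minimal sketch. It assumes the fetched result JSON has the shape `{ transcripts: [ { text, sentences: [ { text } ] } ] }`, which should be checked against an actual DashScope response; `extractText` is a hypothetical helper, not the script's own function.

```javascript
// Assumed result shape: { transcripts: [ { text, sentences: [ { text } ] } ] }.
function extractText(resultJson) {
  const transcripts = Array.isArray(resultJson?.transcripts) ? resultJson.transcripts : [];

  const parts = transcripts.map((transcript) => {
    if (typeof transcript?.text === "string" && transcript.text.trim()) {
      return transcript.text.trim();
    }
    // Fall back to sentence-level text when there is no top-level text.
    const sentences = Array.isArray(transcript?.sentences) ? transcript.sentences : [];
    return sentences
      .map((s) => (typeof s?.text === "string" ? s.text.trim() : ""))
      .filter(Boolean)
      .join(" ");
  });

  // One line per transcript entry; empty entries are dropped.
  return parts.filter(Boolean).join("\n");
}
```

Returning an empty string for unexpected shapes, rather than throwing, keeps the structured results usable even when plain-text extraction fails.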

Chaining from Qiniu

Typical workflow:

  • Use qiniu-upload to upload a local file.
  • Prefer a signed private URL if the domain is not anonymously readable.
  • Pass the returned URL into this skill.

Safety rules

  • Never hardcode Aliyun credentials.
  • Fail fast if neither ASR_DASHSCOPE_API_KEY nor DASHSCOPE_API_KEY is set.
  • Only send URLs the user intends to transcribe.
Data source: ClawHub · Chinese localization: 龙虾技能库