midasheng-audio-text-distance: Speech-Text Retrieval

v1.0.0

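A multilingual audio-text retrieval and classification model based on GLAP general-purpose speech pretraining. It can quickly search, match, and rank audio files by text.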
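As a rough illustration of the ranking step, the sketch below sorts audio files by a per-file similarity score for one text query. The input shape (a mapping from file name to similarity) and the name rank_by_similarity are assumptions for illustration only; the actual response format of the service is not documented here.

```python
def rank_by_similarity(scores, top_k=None):
    """Sort (filename, similarity) pairs from most to least similar.

    `scores` is a hypothetical mapping of file name -> similarity score;
    the real service may return a different structure.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


if __name__ == "__main__":
    # Made-up scores, used only to show the ordering.
    example = {"rain.wav": 0.12, "dog_bark.mp3": 0.87, "speech.flac": 0.45}
    for name, score in rank_by_similarity(example, top_k=2):
        print(f"{score:.2f}  {name}")
```

If the service reports a distance rather than a similarity, the sort order would need to be reversed.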


by @jimbozhang (Junbo Zhang)

AI model access

Use case: use midasheng-audio-text-distance for AI model access.


Last updated

2026/3/19

Security scan

VirusTotal

Harmless

OpenClaw

Safe

high confidence

The skill's code, instructions, and network calls are consistent with its stated purpose (uploading audio to an external GLAP search service); there are no hidden endpoints, unexpected credential requests, or install steps that don't match the description.

Assessment recommendations

This skill appears to do what it says: it uploads audio files to a Xiaomi-hosted GLAP search API and returns similarity/classification results. Before installing or using it, consider: (1) Privacy — audio files are sent to https://llmplus.ai.xiaomi.com with no auth in examples, so do not upload sensitive or proprietary recordings unless you trust the service and its terms; (2) Network usage — the tool requires outbound network access; (3) Sanity check — test with non-sensitive samples first; (4)...

Detailed analysis

✓ Purpose and capability

The skill's name and description claim audio-text retrieval via GLAP and all required artifacts (SKILL.md examples and scripts/audiosearch.py) perform exactly that against the Xiaomi llmplus.ai.xiaomi.com/dasheng/audio/search endpoint. There are no unrelated binaries, config paths, or credentials requested.

✓ Instruction scope

Runtime instructions and the included script only read user-supplied audio files and call the documented remote API (and a metrics endpoint for queue status). They do not read arbitrary system files or environment variables beyond what the user supplies. The SKILL.md and script consistently show network calls to the stated endpoint.

✓ Install mechanism

This is an instruction-only skill with no install spec and a single small Python script; nothing is downloaded or written to disk by an installer, which minimizes install-time risk.

ℹ Credential requirements

No environment variables or credentials are requested (proportionate). However, the skill uploads audio files to a third-party endpoint (llmplus.ai.xiaomi.com) without any authentication in the provided examples, so sensitive audio will be transmitted off-host; users should consider privacy and trust of that endpoint before use.
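One way to act on this privacy note is to restrict uploads to a directory that is known to hold only non-sensitive samples. The sketch below is a local pre-check under that assumption; the names is_within and uploadable are hypothetical and are not part of the skill's script.

```python
from pathlib import Path


def is_within(path, allowed_dir):
    """Return True if `path` resolves to a location inside `allowed_dir`."""
    resolved = Path(path).resolve()
    root = Path(allowed_dir).resolve()
    return resolved == root or root in resolved.parents


def uploadable(paths, allowed_dir):
    """Keep only the paths that live under the allow-listed directory."""
    return [p for p in paths if is_within(p, allowed_dir)]


if __name__ == "__main__":
    candidates = ["samples/bird.wav", "samples/../private/call.mp3"]
    # Only files under ./samples pass the check; nothing is sent anywhere.
    print(uploadable(candidates, "samples"))
```

Resolving both paths first means a ".." segment cannot be used to step outside the allow-listed directory.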

✓ Persistence and privileges

always is false, and the skill does not request persistent system presence or modify other skills or config. It behaves as a normal, non-persistent, user-invoked utility.

Security is layered; review the code before running it.

Runtime dependencies

No special dependencies

Versions

latest: v1.0.0 (2026/3/19)

- Initial release of midasheng-audio-text-distance.
- Enables multilingual audio-text retrieval and classification using the GLAP model.
- Supports searching and matching audio files against text descriptions, classifying audio by text queries, and zero-shot audio event detection in 50+ languages.
- Provides queue status monitoring and guidance on interpreting service response delays.
- Supports multiple common audio formats (mp3, wav, flac, ogg, m4a).
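Since the release notes list the supported formats, a caller could filter local files by extension before sending anything to the service. This is a minimal sketch; the extension set is copied from the list above, and the helper name filter_supported is hypothetical. The service itself may accept or reject files on other criteria.

```python
from pathlib import Path

# Extensions taken from the release notes; the service may behave differently.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}


def filter_supported(paths):
    """Return only the paths whose extension is one of the listed formats."""
    return [p for p in paths if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS]


if __name__ == "__main__":
    files = ["intro.MP3", "notes.txt", "field_recording.flac", "README"]
    print(filter_supported(files))
```

The comparison is case-insensitive, so a file named intro.MP3 is treated the same as intro.mp3.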


Install command

Official: npx clawhub@latest install midasheng-audio-text-distance

Mirror (accelerated): npx clawhub@latest install midasheng-audio-text-distance --registry https://cn.longxiaskill.com (mirror sync in progress)

Localization notes

Installation notes for midasheng-audio-text-distance: install with npx clawhub@latest install midasheng-audio-text-distance
