Speech to Text

Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sends voice notes, audio attachments, meeting clips, podcasts, interviews, or any local audio file (.ogg, .mp3, .wav, .m4a, etc.) and wants a transcript, rough captions, or an English translation without relying on paid APIs first.

by @shu-hari·MIT-0

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official:

npx clawhub@latest install hf-whisper-speech-to-text

Mirror:

npx clawhub@latest install hf-whisper-speech-to-text --registry https://cn.longxiaskill.com

Skill documentation

Speech to Text

Use this skill to turn local audio files into text with a public Whisper-based endpoint.

Quick start

Run:

python3 scripts/transcribe.py /path/to/file.ogg

Return the transcript as plain text. By default, the script also applies lightweight Chinese punctuation and sentence-breaking cleanup.

For machine-readable output:
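
The exact cleanup rules live in scripts/transcribe.py and are not documented here. As a rough illustration only, a lightweight pass of this kind might look like the sketch below. The function name `tidy_zh`, the punctuation mapping, and the sentence-splitting rule are all assumptions, not the script's actual behavior.

```python
import re

# Hypothetical mapping from ASCII punctuation to full-width Chinese punctuation.
_ASCII_TO_FULLWIDTH = {",": "，", "?": "？", "!": "！", ":": "：", ";": "；"}
_CJK = r"\u4e00-\u9fff"  # basic CJK Unified Ideographs range


def tidy_zh(text: str) -> str:
    """Illustrative cleanup: not the rules used by scripts/transcribe.py."""
    # Remove whitespace that sits between two CJK characters.
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[{_CJK}])", "", text)
    # Convert ASCII punctuation that directly follows a CJK character.
    text = re.sub(
        rf"(?<=[{_CJK}])([,?!:;])",
        lambda m: _ASCII_TO_FULLWIDTH[m.group(1)],
        text,
    )
    # Put each sentence on its own line, splitting after 。？！
    sentences = re.split(r"(?<=[。？！])", text)
    return "\n".join(s.strip() for s in sentences if s.strip())
```

Passing `--format raw` skips the script's own cleanup entirely, which is the safer choice when the transcript will be post-processed elsewhere.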

python3 scripts/transcribe.py /path/to/file.ogg --json

To disable cleanup and keep the raw model text:

python3 scripts/transcribe.py /path/to/file.ogg --format raw

To force Chinese punctuation cleanup:

python3 scripts/transcribe.py /path/to/file.ogg --format zh

For English translation instead of same-language transcription:

python3 scripts/transcribe.py /path/to/file.ogg --task translate
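
When the script is driven from other Python code, the flags above can be assembled programmatically. The sketch below is a minimal illustration; `build_command` and its parameter names are hypothetical, and it only uses the flags shown in this section (`--json`, `--format`, `--task`).

```python
from typing import List, Optional


def build_command(
    audio_path: str,
    *,
    as_json: bool = False,
    fmt: Optional[str] = None,   # "raw" or "zh", as shown above
    task: Optional[str] = None,  # "translate" for English translation
) -> List[str]:
    """Build an argument list for scripts/transcribe.py (hypothetical helper)."""
    cmd = ["python3", "scripts/transcribe.py", audio_path]
    if as_json:
        cmd.append("--json")
    if fmt is not None:
        cmd += ["--format", fmt]
    if task is not None:
        cmd += ["--task", task]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`; using a list instead of a shell string avoids quoting problems with file names that contain spaces.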

Workflow

1. Confirm the input is a local audio file.
2. Run scripts/transcribe.py on it.
3. If the transcript looks imperfect, tell the user it came from a public Whisper endpoint and may need cleanup.
4. If helpful, post-process into:
   - cleaned transcript
   - summary
   - action items
   - bilingual output

What the script does

The script:

- uploads the local file to a public Gradio-backed Hugging Face Space
- submits a Whisper transcription job
- waits for completion via the Gradio event stream
- prints the resulting text

Default endpoint:
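
The waiting step can be sketched as a small parser over the server-sent events that Gradio's HTTP API streams back for a submitted job. This is an illustration under assumptions, not the script's code: `final_event_data` is a hypothetical name, and the sketch assumes the stream uses `event:` / `data:` line pairs with `complete` and `error` event names, as in Gradio's curl-style API.

```python
import json
from typing import Iterable, Optional


def final_event_data(lines: Iterable[str]) -> Optional[list]:
    """Return the JSON payload of the first `complete` event.

    Raises RuntimeError if the stream reports an `error` event.
    Returns None if the stream ends without completing.
    """
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if event == "complete":
                return json.loads(payload)
            if event == "error":
                raise RuntimeError(f"Space reported an error: {payload}")
            # Other events (for example heartbeats) are ignored.
    return None
```

Returning None on an unfinished stream lets the caller distinguish a dropped connection from an explicit error reported by the Space.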

https://hf-audio-whisper-large-v3-turbo.hf.space

Override it with:

python3 scripts/transcribe.py input.ogg --space https://your-space.hf.space

or set:

export HF_WHISPER_SPACE=https://your-space.hf.space
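
One plausible way to combine these three sources is sketched below. The precedence shown (command-line flag first, then the environment variable, then the default) is an assumption, and `resolve_space` is a hypothetical helper; check scripts/transcribe.py for the actual order.

```python
import os
from typing import Mapping, Optional

DEFAULT_SPACE = "https://hf-audio-whisper-large-v3-turbo.hf.space"


def resolve_space(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the Space URL: --space flag, then HF_WHISPER_SPACE, then default.

    The precedence is assumed for illustration.
    """
    env = os.environ if env is None else env
    chosen = cli_value or env.get("HF_WHISPER_SPACE") or DEFAULT_SPACE
    # Strip a trailing slash so paths can be appended consistently.
    return chosen.rstrip("/")
```

Passing `env` explicitly keeps the function easy to test without modifying the real process environment.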

Guardrails

- Treat this as a best-effort public/free path, not a privacy-grade path.
- Do not use for highly sensitive audio unless the user explicitly accepts public third-party processing.
- Expect rate limits, queueing, and occasional outages.
- If the public endpoint fails, explain that the free backend is unavailable and offer alternatives.

Output handling

Prefer to return:

- the raw transcript when the user asked to "转文字/听写" (convert to text / dictation)
- a cleaned version when punctuation is poor
- a short note about uncertainty if names, numbers, or jargon may be wrong

Script

scripts/transcribe.py: public Whisper transcription helper

Source: ClawHub · Chinese localization: 龙虾技能库