Audio Transcriber Pro

v1.0.0

Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration.

by @bingze00000·MIT-0

Tags: documentation tools, data and APIs, databases, AI model access, video processing

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install audio-transcriber-pro

Mirror (accelerated): npx clawhub@latest install audio-transcriber-pro --registry https://cn.longxiaskill.com

Skill documentation

Purpose

This skill automates audio-to-text transcription with professional Markdown output, extracting rich technical metadata (speakers, timestamps, language, file size, duration) and generating structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration, working universally across projects without hardcoded paths or API keys.

Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.

When to Use

Invoke this skill when:

User needs to transcribe audio/video files to text
User wants meeting minutes automatically generated from recordings
User requires speaker identification (diarization) in conversations
User needs subtitles/captions (SRT, VTT formats)
User wants executive summaries of long audio content
User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording"
User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WebM)

Workflow

Step 0: Discovery (Auto-detect Transcription Tools)

Objective: Identify available transcription engines without user configuration.

Actions:

Run detection commands to find installed tools:

    # Check for Faster-Whisper (preferred - 4-5x faster)
    if python3 -c "import faster_whisper" 2>/dev/null; then
        TRANSCRIBER="faster-whisper"
        echo "✅ Faster-Whisper detected (optimized)"
    # Fallback to original Whisper
    elif python3 -c "import whisper" 2>/dev/null; then
        TRANSCRIBER="whisper"
        echo "✅ OpenAI Whisper detected"
    else
        TRANSCRIBER="none"
        echo "⚠️ No transcription tool found"
    fi

    # Check for ffmpeg (audio format conversion)
    if command -v ffmpeg &>/dev/null; then
        echo "✅ ffmpeg available (format conversion enabled)"
    else
        echo "ℹ️ ffmpeg not found (limited format support)"
    fi
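If the detection logic needs to be reused (for example, to re-check after installation), it could be wrapped in a function. The following is only a sketch: the name `detect_transcriber` and its optional argument for the Python command are assumptions made for illustration and are not part of the skill.

```shell
# Hypothetical helper (not part of the skill): prints "faster-whisper",
# "whisper", or "none" depending on which module the interpreter can import.
# $1 is the Python command to probe; it defaults to python3.
detect_transcriber() {
    py="${1:-python3}"
    if "$py" -c "import faster_whisper" 2>/dev/null; then
        echo "faster-whisper"
    elif "$py" -c "import whisper" 2>/dev/null; then
        echo "whisper"
    else
        echo "none"
    fi
}
```

A caller would then write `TRANSCRIBER="$(detect_transcriber)"` and branch on the result. Passing the interpreter as an argument also makes it possible to probe a virtual environment's Python instead of the system one.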

If no transcriber found:

Offer automatic installation using the provided script:

    echo "⚠️ No transcription tool found"
    echo ""
    echo "🔧 Auto-install dependencies? (Recommended)"
    read -p "Run installation script? [Y/n]: " AUTO_INSTALL

    if [[ ! "$AUTO_INSTALL" =~ ^[Nn] ]]; then
        # Get skill directory (works for both repo and symlinked installations)
        SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

        # Run installation script
        if [[ -f "$SKILL_DIR/scripts/install-requirements.sh" ]]; then
            bash "$SKILL_DIR/scripts/install-requirements.sh"
        else
            echo "❌ Installation script not found"
            echo ""
            echo "📦 Manual installation:"
            echo "  pip install faster-whisper  # Recommended"
            echo "  pip install openai-whisper  # Alternative"
            echo "  brew install ffmpeg         # Optional (macOS)"
            exit 1
        fi

        # Verify installation succeeded
        if python3 -c "import faster_whisper" 2>/dev/null || python3 -c "import whisper" 2>/dev/null; then
            echo "✅ Installation successful! Proceeding with transcription..."
        else
            echo "❌ Installation failed. Please install manually."
            exit 1
        fi
    else
        echo ""
        echo "📦 Manual installation required:"
        echo ""
        echo "Recommended (fastest):"
        echo "  pip install faster-whisper"
        echo ""
        echo "Alternative (original):"
        echo "  pip install openai-whisper"
        echo ""
        echo "Optional (format conversion):"
        echo "  brew install ffmpeg  # macOS"
        echo "  apt install ffmpeg   # Linux"
        echo ""
        exit 1
    fi

This ensures users can install dependencies with one confirmation, or opt for manual installation if preferred.

If transcriber found:

Proceed to Step 0b (CLI Detection).

Step 1: Validate Audio File

Objective: Verify file exists, check format, and extract metadata.

Actions:

Accept file path or URL from user:

Local file: meeting.mp3
URL: https://example.com/audio.mp3 (download to temp directory)
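The text does not show how a URL is told apart from a local path or how the download happens. One possible approach is sketched below; the function names `is_url` and `resolve_audio_input` are hypothetical, and the use of `curl` and `mktemp -d` is an assumption, not something the skill is confirmed to do.

```shell
# Hypothetical helpers (not part of the skill).

# Succeeds when the argument looks like an http(s) URL.
is_url() {
    case "$1" in
        http://*|https://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints a local path for the given input. URLs are downloaded into a
# fresh temp directory (assumes curl is installed); local paths are
# printed unchanged.
resolve_audio_input() {
    input="$1"
    if is_url "$input"; then
        tmp_dir="$(mktemp -d)" || return 1
        name="${input##*/}"     # last path segment of the URL
        name="${name%%\?*}"     # drop any query string
        [ -n "$name" ] || name="audio"
        curl -fsSL "$input" -o "$tmp_dir/$name" || return 1
        echo "$tmp_dir/$name"
    else
        echo "$input"
    fi
}
```

With these in place, `AUDIO_FILE="$(resolve_audio_input "$1")"` would give the rest of Step 1 a local path in either case. The caller is responsible for removing the temp directory afterwards.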

Verify file exists:

    if [[ ! -f "$AUDIO_FILE" ]]; then
        echo "❌ File not found: $AUDIO_FILE"
        exit 1
    fi

Extract metadata using ffprobe or file utilities:

    # Get file size
    FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1)

    # Get duration and format using ffprobe
    DURATION=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
    FORMAT=$(ffprobe -v error -select_streams a:0 -show_entries \
        stream=codec_name -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)

    # Convert duration to HH:MM:SS (date -r SECONDS is BSD/macOS syntax;
    # the fractional part is dropped because date expects whole seconds)
    DURATION_HMS=$(date -u -r "${DURATION%.*}" +%H:%M:%S 2>/dev/null || echo "Unknown")
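The `date -u -r` conversion treats its argument as seconds only on BSD/macOS. GNU `date` interprets `-r` as a reference file, so on Linux the result will typically fall back to "Unknown". It also wraps around for recordings of 24 hours or more. A portable alternative using plain shell arithmetic might look like the sketch below; `to_hms` is a hypothetical name, not part of the skill.

```shell
# Hypothetical helper (not part of the skill): converts a duration in
# seconds (as printed by ffprobe, e.g. "3661.500000") to HH:MM:SS.
# Prints "Unknown" when the input is empty or not numeric.
to_hms() {
    secs="${1%.*}"                      # drop the fractional part
    case "$secs" in
        ''|*[!0-9]*) echo "Unknown"; return 0 ;;
    esac
    # Strip leading zeros so the value is not parsed as octal.
    while [ "${#secs}" -gt 1 ] && [ "${secs#0}" != "$secs" ]; do
        secs="${secs#0}"
    done
    printf '%02d:%02d:%02d\n' \
        $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}
```

For example, `to_hms 3661.5` prints `01:01:01`, and `DURATION_HMS="$(to_hms "$DURATION")"` would slot in where the `date` call is used.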

Check file size (warn if large for cloud APIs):

    SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1)
    if [[ $SIZE_MB -gt 25 ]]; then
        echo "⚠️ Large file ($FILE_SIZE) - processing may take several minutes"
    fi
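Note that `du` reports disk usage in allocated blocks, which can differ slightly from the file's byte length. If the threshold needs to follow the actual file size, a byte-based check is one option. The sketch below assumes the same 25 MB limit; `file_size_mb` and `warn_if_large` are hypothetical names, not part of the skill.

```shell
# Hypothetical helpers (not part of the skill).

# Prints the file size in MiB, rounded up, based on the byte count.
file_size_mb() {
    bytes=$(wc -c < "$1") || return 1
    echo $(( (bytes + 1048575) / 1048576 ))
}

# Prints a warning when the file exceeds the limit in MiB
# ($2, defaulting to 25). Prints nothing otherwise.
warn_if_large() {
    limit_mb="${2:-25}"
    size_mb=$(file_size_mb "$1") || return 1
    if [ "$size_mb" -gt "$limit_mb" ]; then
        echo "⚠️ Large file (${size_mb} MB) - processing may take several minutes"
    fi
}
```

Calling `warn_if_large "$AUDIO_FILE"` would replace the inline check, and a second argument allows a different limit for a specific backend.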

Validate format (supported: MP3, WAV, M4A, OGG, FLAC, WebM):

    EXTENSION="${AUDIO_FILE##*.}"
    SUPPORTED_FORMATS=("mp3" "wav" "m4a" "ogg" "flac" "webm" "mp4")

    if [[ ! " ${SUPPORTED_FORMATS[@]} " =~ " ${EXTENSION,,} " ]]; then
        echo "⚠️ Unsu

Data source: ClawHub · Chinese localization: 龙虾技能库