Audio Transcriber Pro
v1.0.0
Convert audio recordings into professional Markdown documentation with intelligent summaries using LLM integration.
Purpose
This skill automates audio-to-text transcription with professional Markdown output, extracting rich technical metadata (speakers, timestamps, language, file size, duration) and generating structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration, working universally across projects without hardcoded paths or API keys.
Inspired by tools like Plaud, this skill converts raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.
When to Use
Invoke this skill when:
- User needs to transcribe audio/video files to text
- User wants meeting minutes automatically generated from recordings
- User requires speaker identification (diarization) in conversations
- User needs subtitles/captions (SRT, VTT formats)
- User wants executive summaries of long audio content
- User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording"
- User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WebM)
Workflow
Step 0: Discovery (Auto-detect Transcription Tools)
Objective: Identify available transcription engines without user configuration.
Actions:
Run detection commands to find installed tools:
    # Check for Faster-Whisper (preferred - 4-5x faster)
    if python3 -c "import faster_whisper" 2>/dev/null; then
        TRANSCRIBER="faster-whisper"
        echo "✅ Faster-Whisper detected (optimized)"
    # Fallback to original Whisper
    elif python3 -c "import whisper" 2>/dev/null; then
        TRANSCRIBER="whisper"
        echo "✅ OpenAI Whisper detected"
    else
        TRANSCRIBER="none"
        echo "⚠️ No transcription tool found"
    fi

    # Check for ffmpeg (audio format conversion)
    if command -v ffmpeg &>/dev/null; then
        echo "✅ ffmpeg available (format conversion enabled)"
    else
        echo "ℹ️ ffmpeg not found (limited format support)"
    fi
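When ffmpeg is available, format conversion can be sketched as below. This is a minimal illustration, not part of the skill itself: the helper name `convert_to_wav` is hypothetical, and the 16 kHz mono target is an assumption chosen because Whisper models operate on 16 kHz mono audio.

```shell
# Hypothetical helper: convert any ffmpeg-readable input to 16 kHz mono WAV.
# Usage: convert_to_wav <input-file> <output.wav>
convert_to_wav() {
    local src="$1" dst="$2"
    # Fail early with a clear message instead of an ffmpeg stack of errors
    [ -f "$src" ] || { echo "File not found: $src" >&2; return 1; }
    command -v ffmpeg >/dev/null 2>&1 || { echo "ffmpeg not available" >&2; return 1; }
    # -ar sets the sample rate, -ac the channel count, -y overwrites dst
    ffmpeg -v error -y -i "$src" -ar 16000 -ac 1 "$dst"
}
```

A call such as `convert_to_wav meeting.m4a /tmp/meeting.wav` returns a non-zero status if the input is missing or ffmpeg is not installed, so the caller can fall back to passing the original file to the transcriber.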
If no transcriber found:
Offer automatic installation using the provided script:
    echo "⚠️ No transcription tool found"
    echo ""
    echo "🔧 Auto-install dependencies? (Recommended)"
    read -p "Run installation script? [Y/n]: " AUTO_INSTALL

    if [[ ! "$AUTO_INSTALL" =~ ^[Nn] ]]; then
        # Get skill directory (works for both repo and symlinked installations)
        SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

        # Run installation script
        if [[ -f "$SKILL_DIR/scripts/install-requirements.sh" ]]; then
            bash "$SKILL_DIR/scripts/install-requirements.sh"
        else
            echo "❌ Installation script not found"
            echo ""
            echo "📦 Manual installation:"
            echo "  pip install faster-whisper   # Recommended"
            echo "  pip install openai-whisper   # Alternative"
            echo "  brew install ffmpeg          # Optional (macOS)"
            exit 1
        fi

        # Verify installation succeeded
        if python3 -c "import faster_whisper" 2>/dev/null || python3 -c "import whisper" 2>/dev/null; then
            echo "✅ Installation successful! Proceeding with transcription..."
        else
            echo "❌ Installation failed. Please install manually."
            exit 1
        fi
    else
        echo ""
        echo "📦 Manual installation required:"
        echo ""
        echo "Recommended (fastest):"
        echo "  pip install faster-whisper"
        echo ""
        echo "Alternative (original):"
        echo "  pip install openai-whisper"
        echo ""
        echo "Optional (format conversion):"
        echo "  brew install ffmpeg   # macOS"
        echo "  apt install ffmpeg    # Linux"
        echo ""
        exit 1
    fi
This ensures users can install dependencies with one confirmation, or opt for manual installation if preferred.
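The prompt treats anything other than an answer starting with N or n as consent, so pressing Enter accepts the default. A minimal sketch of that decision rule, using a hypothetical helper name `wants_auto_install` and a `case` pattern in place of the regex:

```shell
# Hypothetical helper: succeed (exit 0) unless the answer starts with N or n.
wants_auto_install() {
    case "$1" in
        [Nn]*) return 1 ;;   # "n", "N", "no", "No", ... decline
        *)     return 0 ;;   # empty input and everything else accept
    esac
}

# Show how a few typical answers are interpreted
for answer in "" "y" "Y" "n" "No"; do
    if wants_auto_install "$answer"; then
        echo "'$answer' -> run installer"
    else
        echo "'$answer' -> manual instructions"
    fi
done
```

Defaulting to yes matches the `[Y/n]` hint in the prompt, where the capital letter marks the default choice.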
If transcriber found:
Proceed to Step 0b (CLI Detection).
Step 1: Validate Audio File
Objective: Verify file exists, check format, and extract metadata.
Actions:
Accept file path or URL from user:
- Local file: meeting.mp3
- URL: https://example.com/audio.mp3 (download to temp directory)
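One way to handle both input kinds is sketched below. The helper name `resolve_audio_input` is hypothetical, and the use of `curl` and `mktemp -d` is an assumption about the environment; the skill does not specify which download tool it uses.

```shell
# Hypothetical helper: print a local path for the given file path or URL.
# URLs are downloaded into a fresh temp directory; local paths pass through.
resolve_audio_input() {
    local input="$1" tmp_dir name
    case "$input" in
        http://*|https://*)
            tmp_dir="$(mktemp -d)" || return 1
            name="${input##*/}"      # last path segment of the URL
            name="${name%%\?*}"      # drop any query string
            [ -n "$name" ] || name="audio"
            # -f fails on HTTP errors, -L follows redirects
            curl -fsSL --output "$tmp_dir/$name" "$input" || return 1
            printf '%s\n' "$tmp_dir/$name"
            ;;
        *)
            printf '%s\n' "$input"
            ;;
    esac
}

# Local paths are returned unchanged; no network access is needed here
AUDIO_FILE="$(resolve_audio_input "meeting.mp3")"
echo "$AUDIO_FILE"
```

Keeping the original file name in the temp directory preserves the extension, which the later format check relies on.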
Verify file exists:
    if [[ ! -f "$AUDIO_FILE" ]]; then
        echo "❌ File not found: $AUDIO_FILE"
        exit 1
    fi
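Extract metadata using ffprobe or file utilities:
    # Get file size
    FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1)

    # Get duration (seconds) and audio codec using ffprobe
    DURATION=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
    FORMAT=$(ffprobe -v error -select_streams a:0 -show_entries \
        stream=codec_name -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)

    # Convert duration to HH:MM:SS. ffprobe reports fractional seconds, and
    # "date -r <seconds>" only works on BSD/macOS, so use integer arithmetic.
    if [[ "$DURATION" =~ ^[0-9]+(\.[0-9]+)?$ ]]; then
        SECS=$((10#${DURATION%.*}))
        DURATION_HMS=$(printf '%02d:%02d:%02d' $((SECS / 3600)) $((SECS % 3600 / 60)) $((SECS % 60)))
    else
        DURATION_HMS="Unknown"
    fi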
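The collected values can then feed the Markdown output. The sketch below shows one possible layout as a table; the helper name `print_metadata_block` and the field labels are hypothetical, since the skill's actual template is not shown here.

```shell
# Hypothetical helper: print collected metadata as a Markdown table.
# Arguments: file name, human-readable size, duration (HH:MM:SS), codec.
print_metadata_block() {
    printf '| Field | Value |\n'
    printf '|-------|-------|\n'
    printf '| File | %s |\n' "$1"
    printf '| Size | %s |\n' "$2"
    printf '| Duration | %s |\n' "$3"
    # Fall back to "Unknown" when ffprobe could not report a codec
    printf '| Format | %s |\n' "${4:-Unknown}"
}

# Example values only; real values come from the metadata step
print_metadata_block "meeting.mp3" "12M" "00:42:10" "mp3"
```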
Check file size (warn if large for cloud APIs):
    SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1)
    if [[ $SIZE_MB -gt 25 ]]; then
        echo "⚠️ Large file ($FILE_SIZE) - processing may take several minutes"
    fi
Validate format (supported: MP3, WAV, M4A, OGG, FLAC, WebM):
    EXTENSION="${AUDIO_FILE##*.}"
    SUPPORTED_FORMATS=("mp3" "wav" "m4a" "ogg" "flac" "webm" "mp4")

    if [[ ! " ${SUPPORTED_FORMATS[*]} " =~ " ${EXTENSION,,} " ]]; then
        echo "⚠️ Unsu